Abstract
Background: This work addresses the automated classification of videos using artificial neural networks. To explore the concepts and measure the results, the UCF101 data set is used, which consists of video clips taken from YouTube for action recognition. The study is carried out with the authors' own resources to determine the feasibility of independent research in the area.
Methods: This work was developed in the Python programming language using the Keras library with TensorFlow as the back end. The objective is to develop a network whose performance is comparable to the state of the art in classifying videos according to the actions performed.
Results: Given the hardware limitations, there is a considerable gap between what could be implemented in this work and the state of the art.
Conclusion: Throughout the work, some aspects in which this limitation influenced the development are presented, but it is shown that the task is feasible and that significant results can be obtained: 98.6% accuracy is achieved on the UCF101 data set, compared with 98% for the best result reported, while using considerably fewer resources. In addition, the importance of transfer learning in achieving significant results, as well as the differing performance of each architecture, is reviewed. Thus, this work may open the door to patent-based outcomes.