This is the C3D model, trained on the Sports1M dataset with a fork of Caffe and migrated to Keras. Details about the network architecture can be found in the following arXiv paper:
Tran, Du, et al. "Learning Spatiotemporal Features With 3D Convolutional Networks." Proceedings of the IEEE International Conference on Computer Vision. 2015.
Download: weights
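As a rough illustration of how input might be prepared for this model, the sketch below splits a decoded video into non-overlapping 16-frame clips with a 112x112 center crop, the clip size described in the paper cited above. The function name `video_to_clips`, the channels-last layout, and the omission of mean subtraction are assumptions made for the example; the converted Keras model may expect a different layout or preprocessing.

```python
import numpy as np

CLIP_LENGTH = 16  # frames per clip, as described in the C3D paper
CROP_SIZE = 112   # spatial crop size, as described in the C3D paper


def video_to_clips(frames, clip_length=CLIP_LENGTH, crop_size=CROP_SIZE):
    """Split a video into non-overlapping, center-cropped clips.

    `frames` is assumed to have shape (num_frames, height, width, channels).
    Returns an array of shape
    (num_clips, clip_length, crop_size, crop_size, channels).
    Trailing frames that do not fill a whole clip are dropped.
    """
    frames = np.asarray(frames)
    num_frames, height, width, channels = frames.shape
    if height < crop_size or width < crop_size:
        raise ValueError("frames are smaller than the requested crop size")

    num_clips = num_frames // clip_length
    top = (height - crop_size) // 2
    left = (width - crop_size) // 2

    # Keep only whole clips and take the spatial center crop.
    cropped = frames[: num_clips * clip_length,
                     top: top + crop_size,
                     left: left + crop_size,
                     :]
    return cropped.reshape(num_clips, clip_length, crop_size, crop_size, channels)
```

If the layout matches, the resulting batch could be passed to the loaded model's `predict` method to obtain one output per clip.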
How can we use the C3D features to model violence detection in videos?