@jin-zhe
Last active March 18, 2024 23:47
An overview of action recognition datasets and their detection classes

Activity Recognition Datasets

An overview of recent action recognition datasets and their detection classes

Concepts & terminologies:

  • Action: Atomic low-level movement such as standing up, sitting down, walking, talking etc.
  • Activity/event: Higher-level occurrence than an action, such as dining, playing, dancing
  • Trimmed video: A short video clip containing event/action/activity of interest
  • Untrimmed video: A video clip of arbitrary length potentially containing durations without activities of interest
  • Localization: Locating an instance of event/action/activity within a video at a spatial or temporal scale
  • Spatial localization: Locating the region/area of an instance of action/activity within a video
  • Temporal localization: Locating the start time and end time of an instance of action/activity within an untrimmed video
  • Spatio-temporal localization: Locating the start and end times of an instance of action/activity and also its spatial location within the frames for the duration of that action
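
To make the localization terms above concrete, here is a minimal Python sketch of how overlap between a predicted and a ground-truth instance might be measured, using intersection-over-union (IoU) along the time axis and over bounding boxes. The `Segment` and `Box` types, their field names, and the example values are assumptions made for illustration; none of the datasets below prescribes this exact representation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """A temporal extent in seconds (hypothetical representation)."""
    start: float
    end: float

    def duration(self) -> float:
        return max(0.0, self.end - self.start)


@dataclass(frozen=True)
class Box:
    """An axis-aligned bounding box in pixels (hypothetical representation)."""
    x1: float
    y1: float
    x2: float
    y2: float

    def area(self) -> float:
        return max(0.0, self.x2 - self.x1) * max(0.0, self.y2 - self.y1)


def temporal_iou(a: Segment, b: Segment) -> float:
    """Overlap of two segments in time, as used for temporal localization."""
    intersection = max(0.0, min(a.end, b.end) - max(a.start, b.start))
    union = a.duration() + b.duration() - intersection
    return intersection / union if union > 0 else 0.0


def spatial_iou(a: Box, b: Box) -> float:
    """Overlap of two boxes in one frame, as used for spatial localization."""
    width = max(0.0, min(a.x2, b.x2) - max(a.x1, b.x1))
    height = max(0.0, min(a.y2, b.y2) - max(a.y1, b.y1))
    intersection = width * height
    union = a.area() + b.area() - intersection
    return intersection / union if union > 0 else 0.0


# Example: the prediction overlaps 5 s of an 8 s ground-truth instance.
ground_truth = Segment(start=12.0, end=20.0)
prediction = Segment(start=15.0, end=25.0)
print(temporal_iou(ground_truth, prediction))
```

A spatio-temporal measure could combine the two, for example by averaging per-frame `spatial_iou` over the frames where the two segments overlap; that choice is left open here.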

Some dataset lists:


The following datasets are listed most recent first.

Moments is a research project in development by the MIT-IBM Watson AI Lab. The project is dedicated to building a very large-scale dataset to help AI systems recognize and understand actions and events in videos. Today, the dataset includes a collection of one million labeled 3-second videos involving people, animals, objects or natural phenomena, each capturing the gist of a dynamic scene.

Released: Jan 2018

Paper: https://arxiv.org/pdf/1801.03150.pdf

Video characteristics: trimmed 3-second videos, single event

Related tasks: Spatial action detection

Classes
  • adult+female+singing
  • adult+female+speaking
  • adult+male+singing
  • adult+male+speaking
  • aiming
  • applauding
  • ascending
  • asking
  • assembling
  • attacking
  • autographing
  • baking
  • balancing
  • barbecuing
  • barking
  • bathing
  • bending
  • bicycling
  • biting
  • blocking
  • blowing
  • boarding
  • boating
  • boiling
  • bouncing
  • bowing
  • bowling
  • boxing
  • breaking
  • brushing
  • bubbling
  • building
  • bulldozing
  • burning
  • burying
  • buttoning
  • buying
  • calling
  • camping
  • carrying
  • carving
  • catching
  • celebrating
  • chasing
  • cheering
  • cheerleading
  • chewing
  • child+singing
  • child+speaking
  • chopping
  • clapping
  • clawing
  • cleaning
  • clearing
  • climbing
  • clinging
  • clipping
  • closing
  • coaching
  • colliding
  • combing
  • combusting
  • competing
  • constructing
  • cooking
  • coughing
  • covering
  • cracking
  • crafting
  • cramming
  • crashing
  • crawling
  • crouching
  • crushing
  • crying
  • cuddling
  • cutting
  • dancing
  • descending
  • destroying
  • digging
  • dining
  • dipping
  • discussing
  • diving
  • dragging
  • draining
  • drawing
  • drenching
  • dressing
  • drilling
  • drinking
  • dripping
  • driving
  • dropping
  • drumming
  • drying
  • dunking
  • dusting
  • eating
  • emptying
  • entering
  • erupting
  • exercising
  • exiting
  • extinguishing
  • falling
  • feeding
  • fencing
  • fighting
  • filling
  • filming
  • fishing
  • flicking
  • flipping
  • floating
  • flooding
  • flowing
  • flying
  • folding
  • frowning
  • frying
  • fueling
  • gambling
  • gardening
  • giggling
  • giving
  • grilling
  • grinning
  • gripping
  • grooming
  • guarding
  • hammering
  • handwriting
  • hanging
  • hiking
  • hitchhiking
  • hitting
  • howling
  • hugging
  • hunting
  • imitating
  • inflating
  • instructing
  • interviewing
  • jogging
  • joining
  • juggling
  • jumping
  • kicking
  • kneeling
  • knitting
  • knocking
  • landing
  • laughing
  • launching
  • leaking
  • leaning
  • leaping
  • lecturing
  • licking
  • lifting
  • loading
  • locking
  • manicuring
  • marching
  • measuring
  • mopping
  • mowing
  • officiating
  • opening
  • operating
  • packaging
  • packing
  • painting
  • parading
  • paying
  • pedaling
  • peeling
  • performing
  • photographing
  • picking
  • piloting
  • pitching
  • placing
  • playing+fun
  • playing+music
  • playing+sports
  • playing+videogames
  • plugging
  • plunging
  • pointing
  • poking
  • pouring
  • praying
  • preaching
  • pressing
  • protesting
  • pulling
  • punching
  • pushing
  • putting
  • queuing
  • racing
  • rafting
  • raining
  • raising
  • reaching
  • reading
  • removing
  • resting
  • riding
  • rinsing
  • rising
  • roaring
  • rocking
  • rolling
  • rowing
  • rubbing
  • running
  • saluting
  • sanding
  • sawing
  • scratching
  • screwing
  • scrubbing
  • selling
  • serving
  • sewing
  • shaking
  • shaving
  • shooting
  • shopping
  • shouting
  • shoveling
  • shredding
  • shrugging
  • signing
  • singing
  • sitting
  • skating
  • sketching
  • skiing
  • skipping
  • slapping
  • sleeping
  • slicing
  • sliding
  • slipping
  • smashing
  • smelling
  • smiling
  • smoking
  • snapping
  • sneezing
  • sniffing
  • snowing
  • snuggling
  • socializing
  • speaking
  • spilling
  • spinning
  • spitting
  • splashing
  • spraying
  • spreading
  • sprinkling
  • sprinting
  • squatting
  • squinting
  • stacking
  • standing
  • starting
  • stealing
  • steering
  • stirring
  • stomping
  • stopping
  • storming
  • stretching
  • stroking
  • studying
  • submerging
  • surfing
  • sweeping
  • swerving
  • swimming
  • swinging
  • talking
  • taping
  • tapping
  • tattooing
  • teaching
  • tearing
  • telephoning
  • throwing
  • tickling
  • towing
  • trimming
  • tripping
  • tuning
  • turning
  • twisting
  • typing
  • unloading
  • unpacking
  • vacuuming
  • waking
  • walking
  • washing
  • watering
  • waving
  • waxing
  • weeding
  • welding
  • wetting
  • whistling
  • winking
  • working
  • wrapping
  • wrestling
  • writing
  • yawning

The AVA dataset densely annotates 80 atomic visual actions in 351k movie clips with actions localized in space and time, resulting in 1.65M action labels with multiple labels per human occurring frequently. The key characteristics of our dataset are: (1) the definition of atomic visual actions, rather than composite actions; (2) precise spatio-temporal annotations with possibly multiple annotations for each person; (3) exhaustive annotation of these atomic actions over 15-minute video clips; (4) using movies to gather a varied set of action representations.
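
As a rough illustration of "multiple labels per human", the sketch below groups per-box action labels by person and timestamp. The row layout (`video_id`, `timestamp_s`, `box`, `action`, `person_id`), the clip name, and the values are hypothetical and chosen only for this example; consult the dataset's own documentation for its real annotation format. The action names are taken from the class list below.

```python
from collections import defaultdict
from typing import NamedTuple, Tuple


class BoxLabel(NamedTuple):
    """One action label attached to one person box (hypothetical layout)."""
    video_id: str
    timestamp_s: int
    box: Tuple[float, float, float, float]  # (x1, y1, x2, y2), normalized to [0, 1]
    action: str
    person_id: int


# Made-up rows: person 0 carries three simultaneous atomic actions.
rows = [
    BoxLabel("clip_a", 902, (0.10, 0.20, 0.45, 0.95), "stand", 0),
    BoxLabel("clip_a", 902, (0.10, 0.20, 0.45, 0.95), "talk to (e.g., self, a person, a group)", 0),
    BoxLabel("clip_a", 902, (0.10, 0.20, 0.45, 0.95), "carry/hold (an object)", 0),
    BoxLabel("clip_a", 902, (0.55, 0.25, 0.90, 0.97), "sit", 1),
    BoxLabel("clip_a", 902, (0.55, 0.25, 0.90, 0.97), "listen to (a person)", 1),
]

# Group labels so each (video, timestamp, person) maps to its full set of actions.
labels_per_person = defaultdict(list)
for row in rows:
    labels_per_person[(row.video_id, row.timestamp_s, row.person_id)].append(row.action)

for key, actions in sorted(labels_per_person.items()):
    print(key, actions)
```

Keeping one row per (box, action) pair and grouping afterwards is one simple way to treat the problem as multi-label classification per person box.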

Released: May 2017

Paper: https://arxiv.org/abs/1705.08421

Video characteristics: untrimmed 15-minute videos, spatio-temporal atomic action labelling, multiple actions per frame

Related tasks: Spatio-temporal action localization

Classes
  • stand
  • sit
  • talk to (e.g., self, a person, a group)
  • watch (a person)
  • listen to (a person)
  • carry/hold (an object)
  • walk
  • bend/bow (at the waist)
  • lie/sleep
  • dance
  • ride (e.g., a bike, a car, a horse)
  • run/jog
  • answer phone
  • watch (e.g., TV)
  • grab (a person)
  • smoke
  • eat
  • fight/hit (a person)
  • sing to (e.g., self, a person, a group)
  • read
  • crouch/kneel
  • touch (an object)
  • hug (a person)
  • martial art
  • open (e.g., a window, a car door)
  • play musical instrument
  • give/serve (an object) to (a person)
  • hand clap
  • lift/pick up
  • get up
  • drink
  • drive (e.g., a car, a truck)
  • kiss (a person)
  • put down
  • write
  • close (e.g., a door, a box)
  • listen (e.g., to music)
  • catch (an object)
  • take (an object) from (a person)
  • hand wave
  • lift (a person)
  • pull (an object)
  • hand shake
  • jump/leap
  • dress/put on clothing
  • push (another person)
  • text on/look at a cellphone
  • fall down
  • throw
  • sail boat
  • work on a computer
  • play with kids
  • hit (an object)
  • crawl
  • enter
  • take a photo
  • climb (e.g., a mountain)
  • push (an object)
  • play with pets
  • point to (an object)
  • cut
  • shoot
  • dig
  • press
  • play board game
  • swim
  • cook
  • clink glass
  • fishing
  • paint
  • row boat
  • extract
  • stir
  • chop
  • brush teeth
  • kick (a person)
  • kick (an object)
  • exit
  • turn (e.g., a screwdriver)
  • shovel

Kinetics is a large-scale, high-quality dataset of YouTube video URLs which include a diverse range of human-focused actions. Our aim in releasing the Kinetics dataset is to help the machine learning community to advance models for video understanding.

The dataset consists of approximately 300,000 video clips, and covers 400 human action classes with at least 400 video clips for each action class. Each clip lasts around 10s and is labeled with a single class. All of the clips have been through multiple rounds of human annotation, and each is taken from a unique YouTube video. The actions cover a broad range of classes including human-object interactions such as playing instruments, as well as human-human interactions such as shaking hands and hugging.

Released: May 2017

Paper: https://arxiv.org/abs/1705.06950

Video characteristics: trimmed video of around 10 seconds, single activity per video

Related tasks: temporal activity localization

Classes
  • abseiling
  • air drumming
  • answering questions
  • applauding
  • applying cream
  • archery
  • arm wrestling
  • arranging flowers
  • assembling computer
  • auctioning
  • baby waking up
  • baking cookies
  • balloon blowing
  • bandaging
  • barbequing
  • bartending
  • beatboxing
  • bee keeping
  • belly dancing
  • bench pressing
  • bending back
  • bending metal
  • biking through snow
  • blasting sand
  • blowing glass
  • blowing leaves
  • blowing nose
  • blowing out candles
  • bobsledding
  • bookbinding
  • bouncing on trampoline
  • bowling
  • braiding hair
  • breading or breadcrumbing
  • breakdancing
  • brush painting
  • brushing hair
  • brushing teeth
  • building cabinet
  • building shed
  • bungee jumping
  • busking
  • canoeing or kayaking
  • capoeira
  • carrying baby
  • cartwheeling
  • carving pumpkin
  • catching fish
  • catching or throwing baseball
  • catching or throwing frisbee
  • catching or throwing softball
  • celebrating
  • changing oil
  • changing wheel
  • checking tires
  • cheerleading
  • chopping wood
  • clapping
  • clay pottery making
  • clean and jerk
  • cleaning floor
  • cleaning gutters
  • cleaning pool
  • cleaning shoes
  • cleaning toilet
  • cleaning windows
  • climbing a rope
  • climbing ladder
  • climbing tree
  • contact juggling
  • cooking chicken
  • cooking egg
  • cooking on campfire
  • cooking sausages
  • counting money
  • country line dancing
  • cracking neck
  • crawling baby
  • crossing river
  • crying
  • curling hair
  • cutting nails
  • cutting pineapple
  • cutting watermelon
  • dancing ballet
  • dancing charleston
  • dancing gangnam style
  • dancing macarena
  • deadlifting
  • decorating the christmas tree
  • digging
  • dining
  • disc golfing
  • diving cliff
  • dodgeball
  • doing aerobics
  • doing laundry
  • doing nails
  • drawing
  • dribbling basketball
  • drinking
  • drinking beer
  • drinking shots
  • driving car
  • driving tractor
  • drop kicking
  • drumming fingers
  • dunking basketball
  • dying hair
  • eating burger
  • eating cake
  • eating carrots
  • eating chips
  • eating doughnuts
  • eating hotdog
  • eating ice cream
  • eating spaghetti
  • eating watermelon
  • egg hunting
  • exercising arm
  • exercising with an exercise ball
  • extinguishing fire
  • faceplanting
  • feeding birds
  • feeding fish
  • feeding goats
  • filling eyebrows
  • finger snapping
  • fixing hair
  • flipping pancake
  • flying kite
  • folding clothes
  • folding napkins
  • folding paper
  • front raises
  • frying vegetables
  • garbage collecting
  • gargling
  • getting a haircut
  • getting a tattoo
  • giving or receiving award
  • golf chipping
  • golf driving
  • golf putting
  • grinding meat
  • grooming dog
  • grooming horse
  • gymnastics tumbling
  • hammer throw
  • headbanging
  • headbutting
  • high jump
  • high kick
  • hitting baseball
  • hockey stop
  • holding snake
  • hopscotch
  • hoverboarding
  • hugging
  • hula hooping
  • hurdling
  • hurling (sport)
  • ice climbing
  • ice fishing
  • ice skating
  • ironing
  • javelin throw
  • jetskiing
  • jogging
  • juggling balls
  • juggling fire
  • juggling soccer ball
  • jumping into pool
  • jumpstyle dancing
  • kicking field goal
  • kicking soccer ball
  • kissing
  • kitesurfing
  • knitting
  • krumping
  • laughing
  • laying bricks
  • long jump
  • lunge
  • making a cake
  • making a sandwich
  • making bed
  • making jewelry
  • making pizza
  • making snowman
  • making sushi
  • making tea
  • marching
  • massaging back
  • massaging feet
  • massaging legs
  • massaging person's head
  • milking cow
  • mopping floor
  • motorcycling
  • moving furniture
  • mowing lawn
  • news anchoring
  • opening bottle
  • opening present
  • paragliding
  • parasailing
  • parkour
  • passing American football (in game)
  • passing American football (not in game)
  • peeling apples
  • peeling potatoes
  • petting animal (not cat)
  • petting cat
  • picking fruit
  • planting trees
  • plastering
  • playing accordion
  • playing badminton
  • playing bagpipes
  • playing basketball
  • playing bass guitar
  • playing cards
  • playing cello
  • playing chess
  • playing clarinet
  • playing controller
  • playing cricket
  • playing cymbals
  • playing didgeridoo
  • playing drums
  • playing flute
  • playing guitar
  • playing harmonica
  • playing harp
  • playing ice hockey
  • playing keyboard
  • playing kickball
  • playing monopoly
  • playing organ
  • playing paintball
  • playing piano
  • playing poker
  • playing recorder
  • playing saxophone
  • playing squash or racquetball
  • playing tennis
  • playing trombone
  • playing trumpet
  • playing ukulele
  • playing violin
  • playing volleyball
  • playing xylophone
  • pole vault
  • presenting weather forecast
  • pull ups
  • pumping fist
  • pumping gas
  • punching bag
  • punching person (boxing)
  • push up
  • pushing car
  • pushing cart
  • pushing wheelchair
  • reading book
  • reading newspaper
  • recording music
  • riding a bike
  • riding camel
  • riding elephant
  • riding mechanical bull
  • riding mountain bike
  • riding mule
  • riding or walking with horse
  • riding scooter
  • riding unicycle
  • ripping paper
  • robot dancing
  • rock climbing
  • rock scissors paper
  • roller skating
  • running on treadmill
  • sailing
  • salsa dancing
  • sanding floor
  • scrambling eggs
  • scuba diving
  • setting table
  • shaking hands
  • shaking head
  • sharpening knives
  • sharpening pencil
  • shaving head
  • shaving legs
  • shearing sheep
  • shining shoes
  • shooting basketball
  • shooting goal (soccer)
  • shot put
  • shoveling snow
  • shredding paper
  • shuffling cards
  • side kick
  • sign language interpreting
  • singing
  • situp
  • skateboarding
  • ski jumping
  • skiing (not slalom or crosscountry)
  • skiing crosscountry
  • skiing slalom
  • skipping rope
  • skydiving
  • slacklining
  • slapping
  • sled dog racing
  • smoking
  • smoking hookah
  • snatch weight lifting
  • sneezing
  • sniffing
  • snorkeling
  • snowboarding
  • snowkiting
  • snowmobiling
  • somersaulting
  • spinning poi
  • spray painting
  • spraying
  • springboard diving
  • squat
  • sticking tongue out
  • stomping grapes
  • stretching arm
  • stretching leg
  • strumming guitar
  • surfing crowd
  • surfing water
  • sweeping floor
  • swimming backstroke
  • swimming breast stroke
  • swimming butterfly stroke
  • swing dancing
  • swinging legs
  • swinging on something
  • sword fighting
  • tai chi
  • taking a shower
  • tango dancing
  • tap dancing
  • tapping guitar
  • tapping pen
  • tasting beer
  • tasting food
  • testifying
  • texting
  • throwing axe
  • throwing ball
  • throwing discus
  • tickling
  • tobogganing
  • tossing coin
  • tossing salad
  • training dog
  • trapezing
  • trimming or shaving beard
  • trimming trees
  • triple jump
  • tying bow tie
  • tying knot (not on a tie)
  • tying tie
  • unboxing
  • unloading truck
  • using computer
  • using remote controller (not gaming)
  • using segway
  • vault
  • waiting in line
  • walking the dog
  • washing dishes
  • washing feet
  • washing hair
  • washing hands
  • water skiing
  • water sliding
  • watering plants
  • waxing back
  • waxing chest
  • waxing eyebrows
  • waxing legs
  • weaving basket
  • welding
  • whistling
  • windsurfing
  • wrapping present
  • wrestling
  • writing
  • yawning
  • yoga
  • zumba

Charades is a dataset composed of 9848 videos of daily indoor activities collected through Amazon Mechanical Turk. 267 different users were presented with a sentence that includes objects and actions from a fixed vocabulary, and they recorded a video acting out the sentence (like in a game of Charades). The dataset contains 66,500 temporal annotations for 157 action classes, 41,104 labels for 46 object classes, and 27,847 textual descriptions of the videos.

Released: April 2016

Paper: https://arxiv.org/abs/1604.01753

Video characteristics: untrimmed videos, multiple activities per video (66,500 temporal annotations over 9848 videos, about 6.8 per video), home activities

Related tasks: Temporal activity localization

Classes
  • Holding some clothes
  • Putting clothes somewhere
  • Taking some clothes from somewhere
  • Throwing clothes somewhere
  • Tidying some clothes
  • Washing some clothes
  • Closing a door
  • Fixing a door
  • Opening a door
  • Putting something on a table
  • Sitting on a table
  • Sitting at a table
  • Tidying up a table
  • Washing a table
  • Working at a table
  • Holding a phone/camera
  • Playing with a phone/camera
  • Putting a phone/camera somewhere
  • Taking a phone/camera from somewhere
  • Talking on a phone/camera
  • Holding a bag
  • Opening a bag
  • Putting a bag somewhere
  • Taking a bag from somewhere
  • Throwing a bag somewhere
  • Closing a book
  • Holding a book
  • Opening a book
  • Putting a book somewhere
  • Smiling at a book
  • Taking a book from somewhere
  • Throwing a book somewhere
  • Watching/Reading/Looking at a book
  • Holding a towel/s
  • Putting a towel/s somewhere
  • Taking a towel/s from somewhere
  • Throwing a towel/s somewhere
  • Tidying up a towel/s
  • Washing something with a towel
  • Closing a box
  • Holding a box
  • Opening a box
  • Putting a box somewhere
  • Taking a box from somewhere
  • Taking something from a box
  • Throwing a box somewhere
  • Closing a laptop
  • Holding a laptop
  • Opening a laptop
  • Putting a laptop somewhere
  • Taking a laptop from somewhere
  • Watching a laptop or something on a laptop
  • Working/Playing on a laptop
  • Holding a shoe/shoes
  • Putting shoes somewhere
  • Putting on shoe/shoes
  • Taking shoes from somewhere
  • Taking off some shoes
  • Throwing shoes somewhere
  • Sitting in a chair
  • Standing on a chair
  • Holding some food
  • Putting some food somewhere
  • Taking food from somewhere
  • Throwing food somewhere
  • Eating a sandwich
  • Making a sandwich
  • Holding a sandwich
  • Putting a sandwich somewhere
  • Taking a sandwich from somewhere
  • Holding a blanket
  • Putting a blanket somewhere
  • Snuggling with a blanket
  • Taking a blanket from somewhere
  • Throwing a blanket somewhere
  • Tidying up a blanket/s
  • Holding a pillow
  • Putting a pillow somewhere
  • Snuggling with a pillow
  • Taking a pillow from somewhere
  • Throwing a pillow somewhere
  • Putting something on a shelf
  • Tidying a shelf or something on a shelf
  • Reaching for and grabbing a picture
  • Holding a picture
  • Laughing at a picture
  • Putting a picture somewhere
  • Taking a picture of something
  • Watching/looking at a picture
  • Closing a window
  • Opening a window
  • Washing a window
  • Watching/Looking outside of a window
  • Holding a mirror
  • Smiling in a mirror
  • Washing a mirror
  • Watching something/someone/themselves in a mirror
  • Walking through a doorway
  • Holding a broom
  • Putting a broom somewhere
  • Taking a broom from somewhere
  • Throwing a broom somewhere
  • Tidying up with a broom
  • Fixing a light
  • Turning on a light
  • Turning off a light
  • Drinking from a cup/glass/bottle
  • Holding a cup/glass/bottle of something
  • Pouring something into a cup/glass/bottle
  • Putting a cup/glass/bottle somewhere
  • Taking a cup/glass/bottle from somewhere
  • Washing a cup/glass/bottle
  • Closing a closet/cabinet
  • Opening a closet/cabinet
  • Tidying up a closet/cabinet
  • Someone is holding a paper/notebook
  • Putting their paper/notebook somewhere
  • Taking paper/notebook from somewhere
  • Holding a dish
  • Putting a dish/es somewhere
  • Taking a dish/es from somewhere
  • Wash a dish/dishes
  • Lying on a sofa/couch
  • Sitting on sofa/couch
  • Lying on the floor
  • Sitting on the floor
  • Throwing something on the floor
  • Tidying something on the floor
  • Holding some medicine
  • Taking/consuming some medicine
  • Putting groceries somewhere
  • Laughing at television
  • Watching television
  • Someone is awakening in bed
  • Lying on a bed
  • Sitting in a bed
  • Fixing a vacuum
  • Holding a vacuum
  • Taking a vacuum from somewhere
  • Washing their hands
  • Fixing a doorknob
  • Grasping onto a doorknob
  • Closing a refrigerator
  • Opening a refrigerator
  • Fixing their hair
  • Working on paper/notebook
  • Someone is awakening somewhere
  • Someone is cooking something
  • Someone is dressing
  • Someone is laughing
  • Someone is running somewhere
  • Someone is going from standing to sitting
  • Someone is smiling
  • Someone is sneezing
  • Someone is standing up from somewhere
  • Someone is undressing
  • Someone is eating something

ActivityNet is a new large-scale video benchmark for human activity understanding. ActivityNet aims at covering a wide range of complex human activities that are of interest to people in their daily living. In its current version, ActivityNet provides samples from 203 activity classes with an average of 137 untrimmed videos per class and 1.41 activity instances per video, for a total of 849 video hours.
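
The figures quoted above can be combined into a quick back-of-the-envelope estimate of the dataset's scale. This sketch assumes each video is counted under exactly one class, so the derived totals are approximations and may differ from the officially reported counts.

```python
# Figures quoted in the description above.
num_classes = 203
avg_videos_per_class = 137
avg_instances_per_video = 1.41
total_video_hours = 849

# Derived estimates (assumption: no video is counted under two classes).
approx_videos = num_classes * avg_videos_per_class
approx_instances = approx_videos * avg_instances_per_video
avg_minutes_per_video = total_video_hours * 60 / approx_videos

print(f"approx. videos: {approx_videos}")
print(f"approx. activity instances: {approx_instances:.0f}")
print(f"approx. minutes per video: {avg_minutes_per_video:.2f}")
```

Under these assumptions the estimate comes to roughly 27,800 videos and 39,200 activity instances, with an average video length of a little under two minutes.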

Released: June 2015

Paper: http://www.cv-foundation.org/openaccess/content_cvpr_2015/html/Heilbron_ActivityNet_A_Large-Scale_2015_CVPR_paper.html

Video characteristics: untrimmed videos, multiple activity instances per video

Related tasks: spatio-temporal activity localization

Classes
  • Applying sunscreen
  • Health-related Self Care
  • Arm wrestling
  • Wrestling
  • Assembling bicycle
  • BMX
  • Biking
  • Baking cookies
  • Baton twirling
  • Beach soccer
  • Playing soccer
  • Beer pong
  • Blow-drying hair
  • Blowing leaves
  • Playing ten pins
  • Bowling
  • Braiding hair
  • Building sandcastles
  • Beach activities
  • Bullfighting
  • Participating in rodeo competitions
  • Calf roping
  • Camel ride
  • Canoeing
  • Boating
  • Capoeira
  • Carving jack-o-lanterns
  • Changing car wheel
  • Cleaning sink
  • Clipping cat claws
  • Croquet
  • Curling
  • Cutting the grass
  • Decorating the Christmas tree
  • Disc dog
  • Doing a powerbomb
  • Doing crunches
  • Working out
  • Drum corps
  • Elliptical trainer
  • Doing fencing
  • Fencing
  • Fixing the roof
  • Exterior repair, improvements, & decoration
  • Fun sliding down
  • Park activities
  • Futsal
  • Gargling mouthwash
  • Grooming dog
  • Hand car wash
  • Hanging wallpaper
  • Having an ice cream
  • Hitting a pinata
  • Hula hoop
  • Hurling
  • Ice fishing
  • Fishing
  • Installing carpet
  • Kite flying
  • Kneeling
  • Knitting
  • Laying tile
  • Longboarding
  • Making a cake
  • Making a lemonade
  • Making an omelette
  • Mooping floor
  • Painting fence
  • Painting furniture
  • Building and repairing furniture
  • Peeling potatoes
  • Plastering
  • Playing beach volleyball
  • Playing volleyball
  • Playing blackjack
  • Attending gambling establishments
  • Playing congas
  • Playing drums
  • Playing ice hockey
  • Playing pool
  • Playing rubik cube
  • Powerbocking
  • Putting in contact lenses
  • Putting on shoes
  • Rafting
  • Raking leaves
  • Removing ice from car
  • Riding bumper cars
  • River tubing
  • Rock-paper-scissors
  • Rollerblading
  • Roof shingle removal
  • Rope skipping
  • Running a marathon
  • Running
  • Scuba diving
  • Sharpening knives
  • Appliance, tool, and toy set-up, repair, & maintenance (by self)
  • Appliances, Tools, and Toys
  • Shuffleboard
  • Skiing
  • Skiing, ice skating, snowboarding
  • Slacklining
  • Snow tubing
  • Snowboarding
  • Spread mulch
  • Sumo
  • Surfing
  • Swimming
  • Swinging at the playground
  • Table soccer
  • Throwing darts
  • Trimming branches or hedges
  • Tug of war
  • Using the monkey bar
  • Using the rowing machine
  • Wakeboarding
  • Waterskiing
  • Waxing skis
  • Welding
  • Drinking coffee
  • Eating and Drinking
  • Food & Drink Prep., Presentation, & Clean-up
  • Eating and drinking Activities
  • Root
  • Zumba
  • Doing aerobics
  • Participating in Sports, Exercise, or Recreation
  • Sports, Exercise, and Recreation
  • Doing kickboxing
  • Participating in martial arts
  • Doing karate
  • Tango
  • Dancing
  • Arts and Entertainment
  • Socializing, Relaxing, and Leisure
  • Putting on makeup
  • Washing, dressing and grooming oneself
  • Grooming
  • Personal Care
  • High jump
  • Playing sports
  • Playing bagpipes
  • Playing musical instruments
  • Cheerleading
  • Wrapping presents
  • Household & personal organization and planning
  • Household Management
  • Household Activities
  • Cricket
  • Clean and jerk
  • Weightlifting
  • Preparing pasta
  • Food and drink preparation
  • Bathing dog
  • Care for animals and pets (not veterinary care)
  • Animals and Pets
  • Discus throw
  • Playing field hockey
  • Playing hockey
  • Grooming horse
  • Walking / exercising / playing with animals
  • Preparing salad
  • Playing harmonica
  • Playing saxophone
  • Chopping wood
  • Heating and cooling
  • Interior Maintenance, Repair, & Decoration
  • Washing face
  • Using the pommel horse
  • Doing gymnastics
  • Javelin throw
  • Spinning
  • Using cardiovascular equipment
  • Ping-pong
  • Playing racquet sports
  • Making a sandwich
  • Brushing hair
  • Playing guitarra
  • Doing step aerobics
  • Drinking beer
  • Playing polo
  • Participating in equestrian sports
  • Snatch
  • Paintball
  • Long jump
  • Cleaning windows
  • Interior cleaning
  • Housework
  • Brushing teeth
  • Playing flauta
  • Tennis serve with ball bouncing
  • Bungee jumping
  • Triple jump
  • Horseback riding
  • Layup drill in basketball
  • Playing basketball
  • Vacuuming floor
  • Cleaning shoes
  • Sewing, repairing, & maintaining textiles
  • Doing nails
  • Shot put
  • Fixing bicycle
  • Vehicle repair and maintenance (by self)
  • Vehicles
  • Washing hands
  • Ironing clothes
  • Laundry
  • Using the balance beam
  • Shoveling snow
  • Exterior cleaning
  • Exterior Maintenance, Repair, & Decoration
  • Tumbling
  • Using parallel bars
  • Getting a tattoo
  • Washing, dressing and grooming
  • Rock climbing
  • Climbing, spelunking, caving
  • Smoking hookah
  • Tobacco and drug use
  • Relaxing and Leisure
  • Shaving
  • Getting a piercing
  • Springboard diving
  • Participating in water sports
  • Playing squash
  • Playing piano
  • Dodgeball
  • Smoking a cigarette
  • Sailing
  • Getting a haircut
  • Playing lacrosse
  • Cumbia
  • Tai chi
  • Painting
  • Interior arrangement, decoration, & repairs
  • Mowing the lawn
  • Lawn, garden, and houseplant care
  • Lawn, Garden, and Houseplants
  • Shaving legs
  • Walking the dog
  • Hammer throw
  • Skateboarding
  • Polishing shoes
  • Ballet
  • Attending arts and entertainment
  • Hand washing clothes
  • Plataform diving
  • Playing violin
  • Breakdancing
  • Windsurfing
  • Hopscotch
  • Playing games
  • Doing motocross
  • Mixing drinks
  • Starting a campfire
  • Belly dance
  • Removing curlers
  • Archery
  • Volleyball
  • Playing water polo
  • Playing racquetball
  • Kayaking
  • Polishing forniture
  • Playing kickball
  • Using uneven bars
  • Washing dishes
  • Kitchen and food clean-up
  • Pole vault
  • Playing accordion
  • Playing badminton

UCF101 is an action recognition data set of realistic action videos, collected from YouTube, having 101 action categories. This data set is an extension of UCF50 data set which has 50 action categories.

With 13320 videos from 101 action categories, UCF101 gives the largest diversity in terms of actions. With large variations in camera motion, object appearance and pose, object scale, viewpoint, cluttered background, illumination conditions, etc., it is the most challenging data set to date. As most of the available action recognition data sets are not realistic and are staged by actors, UCF101 aims to encourage further research into action recognition by learning and exploring new realistic action categories.

The videos in 101 action categories are grouped into 25 groups, where each group can consist of 4-7 videos of an action. The videos from the same group may share some common features, such as similar background, similar viewpoint, etc.
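
Because clips from the same group may share background and viewpoint, a random per-clip split can leak near-duplicates between training and test sets. Below is a minimal sketch of a group-aware split. It assumes file names of the form `v_<Action>_g<group>_c<clip>.avi`; that naming pattern, the helper names, and the sample file names are assumptions for illustration and should be checked against the actual download.

```python
import re
from typing import Iterable, List, Set, Tuple

# Assumed naming pattern, e.g. "v_Archery_g03_c02.avi".
FILENAME_PATTERN = re.compile(
    r"^v_(?P<action>[A-Za-z]+)_g(?P<group>\d+)_c(?P<clip>\d+)\.avi$"
)


def parse_name(name: str) -> Tuple[str, int, int]:
    """Return (action, group, clip) parsed from a file name."""
    match = FILENAME_PATTERN.match(name)
    if match is None:
        raise ValueError(f"unexpected file name: {name}")
    return match["action"], int(match["group"]), int(match["clip"])


def split_by_group(
    names: Iterable[str], test_groups: Set[int]
) -> Tuple[List[str], List[str]]:
    """Send every clip of a held-out group to the test set, the rest to train."""
    train, test = [], []
    for name in names:
        _, group, _ = parse_name(name)
        (test if group in test_groups else train).append(name)
    return train, test


# Hypothetical file names for illustration.
sample = [
    "v_Archery_g01_c01.avi",
    "v_Archery_g01_c02.avi",
    "v_Archery_g02_c01.avi",
    "v_Bowling_g01_c01.avi",
    "v_Bowling_g03_c04.avi",
]
train_names, test_names = split_by_group(sample, test_groups={1})
print(train_names)
print(test_names)
```

Holding out whole groups keeps all clips that share a source context on the same side of the split, which is the property the group structure is meant to make possible.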

The action categories can be divided into five types: 1) Human-Object Interaction, 2) Body-Motion Only, 3) Human-Human Interaction, 4) Playing Musical Instruments, 5) Sports.

Released: November 2012

Paper: http://crcv.ucf.edu/papers/UCF101_CRCV-TR-12-01.pdf

Video characteristics: trimmed videos, single action per video

Related tasks: Temporal activity detection

Classes
  • Apply Eye Makeup
  • Apply Lipstick
  • Archery
  • Baby Crawling
  • Balance Beam
  • Band Marching
  • Baseball Pitch
  • Basketball Shooting
  • Basketball Dunk
  • Bench Press
  • Biking
  • Billiards Shot
  • Blow Dry Hair
  • Blowing Candles
  • Body Weight Squats
  • Bowling
  • Boxing Punching Bag
  • Boxing Speed Bag
  • Breaststroke
  • Brushing Teeth
  • Clean and Jerk
  • Cliff Diving
  • Cricket Bowling
  • Cricket Shot
  • Cutting In Kitchen
  • Diving
  • Drumming
  • Fencing
  • Field Hockey Penalty
  • Floor Gymnastics
  • Frisbee Catch
  • Front Crawl
  • Golf Swing
  • Haircut
  • Hammer Throw
  • Hammering
  • Handstand Pushups
  • Handstand Walking
  • Head Massage
  • High Jump
  • Horse Race
  • Horse Riding
  • Hula Hoop
  • Ice Dancing
  • Javelin Throw
  • Juggling Balls
  • Jump Rope
  • Jumping Jack
  • Kayaking
  • Knitting
  • Long Jump
  • Lunges
  • Military Parade
  • Mixing Batter
  • Mopping Floor
  • Nun chucks
  • Parallel Bars
  • Pizza Tossing
  • Playing Guitar
  • Playing Piano
  • Playing Tabla
  • Playing Violin
  • Playing Cello
  • Playing Daf
  • Playing Dhol
  • Playing Flute
  • Playing Sitar
  • Pole Vault
  • Pommel Horse
  • Pull Ups
  • Punch
  • Push Ups
  • Rafting
  • Rock Climbing Indoor
  • Rope Climbing
  • Rowing
  • Salsa Spins
  • Shaving Beard
  • Shotput
  • Skate Boarding
  • Skiing
  • Skijet
  • Sky Diving
  • Soccer Juggling
  • Soccer Penalty
  • Still Rings
  • Sumo Wrestling
  • Surfing
  • Swing
  • Table Tennis Shot
  • Tai Chi
  • Tennis Swing
  • Throw Discus
  • Trampoline Jumping
  • Typing
  • Uneven Bars
  • Volleyball Spiking
  • Walking with a dog
  • Wall Pushups
  • Writing On Board
  • Yo Yo

With nearly one billion online videos viewed every day, an emerging new frontier in computer vision research is recognition and search in video. While much effort has been devoted to the collection and annotation of large-scale static image datasets containing thousands of image categories, human action datasets lag far behind. Here we introduce HMDB, collected from various sources, mostly from movies, with a small proportion from public databases such as the Prelinger archive, YouTube and Google videos. The dataset contains 6849 clips divided into 51 action categories, each containing a minimum of 101 clips.

Released: March 2011

Paper: http://serre-lab.clps.brown.edu/wp-content/uploads/2012/08/Kuehne_etal_iccv11.pdf

Video characteristics: untrimmed video of at least 1 second, single activity per video

Related tasks: Temporal activity localization

Classes
  • General facial actions
    • smile
    • laugh
    • chew
    • talk
  • Facial actions with object manipulation
    • smoke
    • eat
    • drink
  • General body movements
    • cartwheel
    • clap hands
    • climb
    • climb stairs
    • dive
    • fall on the floor
    • backhand flip
    • handstand
    • jump
    • pull up
    • push up
    • run
    • sit down
    • sit up
    • somersault
    • stand up
    • turn
    • walk
    • wave
  • Body movements with object interaction
    • brush hair
    • catch
    • draw sword
    • dribble
    • golf
    • hit something
    • kick ball
    • pick
    • pour
    • push something
    • ride bike
    • ride horse
    • shoot ball
    • shoot bow
    • shoot gun
    • swing baseball bat
    • sword exercise
    • throw
  • Body movements for human interaction
    • fencing
    • hug
    • kick someone
    • kiss
    • punch
    • shake hands
    • sword fight