
Google's AVA data set raises the bar for identifying human actions in videos

From TechCrunch - October 19, 2017

Today Google announced a new labeled data set of human actions taking place in videos. That may sound obscure, but it's a big deal for anyone working to solve problems in computer vision.

If you've been following along, you've noticed the significant uptick in companies building products and services that act as a second pair of human eyes. Video detectors like Matroid, security systems like Lighthouse and even autonomous cars benefit from an understanding of what's going on inside a video, and that understanding is borne on the back of good labeled data sets for training and benchmarking.

Google's AVA is short for atomic visual actions. In contrast to other data sets, it takes things up a notch by offering multiple labels for bounding boxes within relevant scenes. This adds more detail in complex scenes and makes for a more rigorous challenge for existing models.
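To give a rough idea of what "multiple labels per bounding box" means in practice, here is a minimal, purely illustrative sketch in Python. The field names and values are invented for this example and are not the exact AVA file format; the point is simply that one person's box in one moment of video can carry several action labels at once.

```python
# Rough sketch of multi-label, box-level annotations.
# Column names and values are illustrative, not the exact AVA schema.
annotations = [
    # video_id,      timestamp, box (x1, y1, x2, y2, normalized), action label
    ("video_abc123", 902,       (0.12, 0.20, 0.48, 0.95),         "stand"),
    ("video_abc123", 902,       (0.12, 0.20, 0.48, 0.95),         "talk to person"),
    ("video_abc123", 902,       (0.55, 0.25, 0.90, 0.97),         "listen to person"),
]

# The same bounding box appears twice with different action labels,
# which is what distinguishes this style of annotation from
# one-label-per-clip video data sets.
for video_id, ts, box, action in annotations:
    print(f"{video_id} @ {ts}s, box={box}: {action}")
```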

In its blog post, Google does a great job explaining what makes human actions so difficult to classify. Actions, unlike static objects, unfold over time; simply put, there's more uncertainty to solve for. A picture of someone running could actually just be a picture of someone jumping, but over time, as more and more frames are added, it becomes clear what is really happening. You can imagine how complicated things could get with two people interacting in a scene.
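To make that temporal point concrete, here is a small illustrative sketch (the scores, class names and averaging step are invented for this example, not Google's model): a single frame's classifier scores can be ambiguous between "running" and "jumping," while pooling scores across the clip resolves the action.

```python
import numpy as np

# Hypothetical per-frame classifier scores for two candidate actions.
# Numbers are made up purely to illustrate temporal disambiguation.
frame_scores = np.array([
    # running, jumping
    [0.45, 0.55],   # the first frame alone looks more like jumping
    [0.60, 0.40],
    [0.72, 0.28],
    [0.80, 0.20],
])

labels = ["running", "jumping"]
single_frame = labels[frame_scores[0].argmax()]
whole_clip = labels[frame_scores.mean(axis=0).argmax()]

print("first frame says:", single_frame)   # jumping
print("whole clip says: ", whole_clip)     # running
```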


Continue reading at TechCrunch »