Teaching AI systems to understand what is happening in videos the way a human does is one of the toughest challenges, and one of the biggest potential breakthroughs, in machine learning.
Access to training data is one of the biggest competitive advantages in AI, and by gathering this resource from millions and millions of users, technology giants have been able to advance in various fields.
While Facebook has trained machine vision models on billions of photos collected from Instagram, it has not previously announced a project of similar ambition for understanding video.
Facebook said that by learning from publicly available videos, covering nearly every country and hundreds of languages, its AI systems will not only improve in accuracy but also adapt to a fast-moving world and recognize nuances and visual cues across different cultures and regions.
The project, titled Learning from Videos, is part of Facebook’s broader effort to build machines that learn like humans.
The resulting machine learning models are being used to create new content recommendation systems and moderation tools, but they could do more in the future.
Artificial intelligence that can understand the content of videos could give Facebook unprecedented insight into users' lives, allowing it to analyze their hobbies, interests, preferences in brands and clothing, and countless other personal details.
Facebook already has access to such information through its current ad targeting process, but the ability to analyze video with artificial intelligence adds a rich new source of data.
Although the project is still in its early stages, it is already paying off. Facebook said it has used the technology to improve Instagram Reels recommendations, for example by showing videos of people dancing to the same music.
The system also makes fewer speech recognition errors, which could improve automatic captioning and make it easier to detect hate speech in videos.
Facebook says it takes privacy into account when learning from videos, writing in a blog post that it maintains a strong privacy foundation that uses automated solutions to enforce privacy at scale.
The company added that by embedding this work at the infrastructure level, it can consistently implement privacy requirements across its systems and support efforts such as artificial intelligence, and that this includes implementing technical safeguards throughout the data life cycle.
Understanding what is happening in videos can be a very difficult task for AI systems, as there are many obstacles, such as background noise that makes speech difficult to understand.
Even so, Facebook is already putting what the system has learned to practical use in other areas, less than a year after starting the video learning project.