By simulating how neurons process motion, MovieNet sets a new standard for AI in dynamic scene recognition, offering transformative possibilities from diagnostics to eco-friendly computing.
Imagine an artificial intelligence (AI) model that can watch and understand moving images with the subtlety of a human brain. Now, scientists at Scripps Research have made this a reality by creating MovieNet, an innovative AI that processes videos much like how our brains interpret real-life scenes as they unfold over time.
This brain-inspired AI model, detailed in a study published in the Proceedings of the National Academy of Sciences, can perceive moving scenes by simulating how neurons—or brain cells—make real-time sense of the world. Conventional AI excels at recognizing still images, but MovieNet introduces a method for machine-learning models to recognize complex, changing scenes. This breakthrough could transform fields from medical diagnostics to autonomous driving, where discerning subtle changes over time is crucial. Inspired by hierarchical spatiotemporal filtering mechanisms observed in the brain, MovieNet is also more accurate and environmentally sustainable than conventional AI.
"The brain doesn't just see still frames; it creates an ongoing visual narrative," says senior author Hollis Cline, PhD, the director of the Dorris Neuroscience Center and the Hahn Professor of Neuroscience at Scripps Research. "Static image recognition has come a long way, but the brain's capacity to process flowing scenes—like watching a movie—requires a much more sophisticated form of pattern recognition. Our study shows that neurons' ON and OFF responses encode distinct temporal information: event timing and duration. By incorporating these principles, we've been able to apply similar principles to AI."
To create MovieNet, Cline and first author Masaki Hiramoto, a staff scientist at Scripps Research, examined how the brain processes real-world scenes as short sequences, similar to movie clips. Specifically, the researchers studied how tadpole neurons responded to visual stimuli.
"Tadpoles have a very good visual system, plus we know that they can detect and respond to moving stimuli efficiently," explains Hiramoto.
He and Cline identified neurons that respond to movie-like features—such as shifts in brightness and image rotation—and can recognize objects as they move and change. Located in the brain's visual processing region known as the optic tectum, these neurons assemble parts of a moving image into a coherent sequence. This dynamic processing is further fine-tuned by inhibition, which prevents false signals and sharpens the responses to critical visual features.
Think of this process as similar to a lenticular puzzle: each piece alone may not make sense, but together, they form a complete image in motion. Different neurons process various "puzzle pieces" of a real-life moving image, which the brain then integrates into a continuous scene. Over time, sensory plasticity tunes the neurons to adapt to frequently observed patterns, which further enhances their efficiency.
The researchers also found that the tadpoles' optic tectum neurons distinguished subtle changes in visual stimuli over time, capturing information in dynamic clips roughly 100 to 600 milliseconds long rather than in still frames. These neurons are highly sensitive to patterns of light and shadow, and each neuron's response to a specific part of the visual field helps construct a detailed map of a scene to form a "movie clip."
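As a rough illustration of how ON and OFF responses could mark the timing and duration of an event within one such clip, consider the Python sketch below. It is not MovieNet's code: the function names, the 10-millisecond frame interval, the brightness threshold, and the example trace are all assumptions made for the example.

```python
from typing import List, Tuple


def on_off_events(brightness: List[float], threshold: float = 0.1) -> Tuple[List[int], List[int]]:
    """Return frame indices where brightness rises (ON) or falls (OFF) by more than threshold."""
    on, off = [], []
    for i in range(1, len(brightness)):
        delta = brightness[i] - brightness[i - 1]
        if delta > threshold:
            on.append(i)       # brightness increased: ON response
        elif delta < -threshold:
            off.append(i)      # brightness decreased: OFF response
    return on, off


def event_timing_and_duration(
    brightness: List[float], frame_ms: float = 10.0, threshold: float = 0.1
) -> List[Tuple[float, float]]:
    """Pair each ON event with the next OFF event and return (onset_ms, duration_ms)."""
    on, off = on_off_events(brightness, threshold)
    events = []
    for start in on:
        end = next((j for j in off if j > start), None)
        if end is not None:
            events.append((start * frame_ms, (end - start) * frame_ms))
    return events


# Hypothetical 300 ms trace (30 frames at 10 ms each): a flash that
# switches on at frame 5 and off at frame 20.
trace = [0.0] * 5 + [1.0] * 15 + [0.0] * 10
print(event_timing_and_duration(trace))  # [(50.0, 150.0)]
```

In this toy trace the ON response gives the onset (50 ms) and the gap between the ON and OFF responses gives the duration (150 ms), which falls inside the 100 to 600 millisecond range reported for the tadpole neurons.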
Cline and Hiramoto trained MovieNet to emulate this brain-like processing and encode video clips as a series of small, recognizable visual cues. This permitted the AI model to distinguish subtle differences among dynamic scenes.
To test MovieNet, the researchers showed it video clips of tadpoles swimming under different conditions. MovieNet not only achieved 82.3 percent accuracy in distinguishing normal from abnormal swimming behaviors, but also exceeded the abilities of trained human observers by about 18 percent. It even outperformed existing AI models such as Google's GoogLeNet, which achieved just 72 percent accuracy despite its extensive training and processing resources.
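For readers who want to put the two reported accuracy figures side by side, the short Python sketch below computes the gap both in percentage points and as a relative gain. The helper names are hypothetical. The article does not say which of these two measures the "about 18 percent" human comparison uses, so the sketch covers only the MovieNet and GoogLeNet numbers.

```python
def percentage_point_gap(a: float, b: float) -> float:
    """Absolute difference between two accuracies, in percentage points."""
    return round(a - b, 1)


def relative_gain_percent(a: float, b: float) -> float:
    """Improvement of a over b, expressed as a percentage of b."""
    return round((a - b) / b * 100, 1)


movienet_accuracy = 82.3   # percent, as reported in the article
googlenet_accuracy = 72.0  # percent, as reported in the article

print(percentage_point_gap(movienet_accuracy, googlenet_accuracy))   # 10.3
print(relative_gain_percent(movienet_accuracy, googlenet_accuracy))  # 14.3
```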
"This is where we saw real potential," points out Cline.
The team determined that MovieNet was not only better than current AI models at understanding changing scenes, but also used less data and processing time. The AI's brain-inspired encoding compresses data by identifying key spatiotemporal patterns while eliminating redundant information, much like zipping a file without losing its essential content.
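The idea of dropping redundant information can be illustrated with a generic frame-difference encoding, sketched below in Python. This is a standard delta-coding technique chosen for illustration, not MovieNet's actual encoder. The function names, the threshold, and the 4x4 toy clip are assumptions.

```python
from typing import Dict, List, Tuple

Frame = List[List[float]]
Delta = Dict[Tuple[int, int], float]


def encode_clip(frames: List[Frame], threshold: float = 0.05) -> Tuple[Frame, List[Delta]]:
    """Keep the first frame in full, then only the pixels that change in each later frame."""
    reference = [row[:] for row in frames[0]]
    current = [row[:] for row in frames[0]]  # running reconstruction
    deltas: List[Delta] = []
    for frame in frames[1:]:
        changed: Delta = {}
        for r, row in enumerate(frame):
            for c, value in enumerate(row):
                if abs(value - current[r][c]) > threshold:
                    changed[(r, c)] = value
                    current[r][c] = value
        deltas.append(changed)
    return reference, deltas


def decode_clip(reference: Frame, deltas: List[Delta]) -> List[Frame]:
    """Rebuild every frame from the first frame plus the stored changes."""
    frames = [[row[:] for row in reference]]
    for changed in deltas:
        frame = [row[:] for row in frames[-1]]
        for (r, c), value in changed.items():
            frame[r][c] = value
        frames.append(frame)
    return frames


def make_frame(col: int) -> Frame:
    """Hypothetical 4x4 frame with one bright pixel in the top row."""
    frame = [[0.0] * 4 for _ in range(4)]
    frame[0][col] = 1.0
    return frame


# Toy clip: a bright dot moving left to right across four frames.
clip = [make_frame(c) for c in range(4)]
reference, deltas = encode_clip(clip)
stored = 16 + sum(len(d) for d in deltas)
print(stored, "values stored instead of", 4 * 16)  # 22 values stored instead of 64
```

Because only two pixels change between consecutive frames in this toy clip, the encoding stores 22 values rather than 64 and still reconstructs the clip exactly.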
Beyond its high accuracy, MovieNet is an eco-friendly AI model. Conventional AI processing demands immense energy, leaving a heavy environmental footprint. MovieNet's reduced data requirements offer a greener alternative that conserves energy while performing at a high standard.
"By mimicking the brain, we've managed to make our AI far less demanding, paving the way for models that aren't just powerful but sustainable," says Cline. "This efficiency also opens the door to scaling up AI in fields where conventional methods are costly."
In addition, MovieNet has the potential to reshape medicine. As the technology advances, it could become a valuable tool for identifying subtle changes in early-stage conditions, such as detecting irregular heart rhythms or spotting the first signs of neurodegenerative diseases like Parkinson's. For example, small motor changes related to Parkinson's that are often hard for human eyes to discern could be flagged by the AI early on, providing clinicians valuable time to intervene.
Furthermore, MovieNet's ability to perceive changes in tadpole swimming patterns when tadpoles were exposed to chemicals could lead to more precise drug screening techniques, as scientists could study dynamic cellular responses rather than relying on static snapshots.
"Current methods miss critical changes because they can only analyze images captured at intervals," remarks Hiramoto. "With MovieNet, the ability to analyze continuous changes over time provides an unprecedented level of detail."
Looking ahead, Cline and Hiramoto plan to continue refining MovieNet's ability to adapt to different environments, enhancing its versatility and potential applications.
"Taking inspiration from biology will continue to be a fertile area for advancing AI," says Cline. "By designing models that think like living organisms, we can achieve levels of efficiency that simply aren't possible with conventional approaches."
The study, "Identification of movie encoding neurons enables movie recognition AI," was supported by funding from the National Institutes of Health (RO1EY011261, RO1EY027437, and RO1EY031597), the Hahn Family Foundation, and the Harold L. Dorris Neurosciences Center Endowment Fund.
Source:
- Scripps Research Institute
Journal reference:
- Hiramoto, M., & Cline, H. T. (2024). Identification of movie encoding neurons enables movie recognition AI. Proceedings of the National Academy of Sciences, 121(48), e2412260121. DOI: 10.1073/pnas.2412260121, https://www.pnas.org/doi/10.1073/pnas.2412260121