OpenAI’s Sora: A New AI Tool that Creates Realistic Minute-Long Videos
OpenAI has developed a new AI tool called Sora, named after the Japanese word for “sky.”
Sora is a text-to-video model that can create realistic videos up to a minute long from a written prompt. It can also animate a still image or extend existing footage provided by the user.
The goal of Sora is to teach AI to understand and simulate the physical world in motion, helping people solve real-world problems that require interaction.
Currently, Sora is in the red-teaming phase, where experts simulate real-world use to identify vulnerabilities and weaknesses in the system.
Access to the model is limited, but OpenAI has shared multiple demos in a blog post. The company is also seeking feedback from visual artists, designers, and filmmakers to improve the model for creative professionals.
Sora is a diffusion model: it starts with a video that looks like static noise and gradually removes the noise over many steps, similar to clearing a fuzzy TV picture to reveal a clear, moving video.
It uses a “transformer architecture” and can generate entire videos at once, rather than frame by frame.
Because the model looks at many frames at a time, a subject can stay consistent even when it temporarily moves out of view. Users guide the video’s content by providing text descriptions.
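To make the noise-removal idea concrete, here is a minimal toy sketch in Python. It is not Sora's code and does not reflect OpenAI's implementation: the function names (make_target_clip, toy_denoiser, generate) are hypothetical, and the "denoiser" simply returns a known clean clip, where a real system would use a trained network conditioned on the text prompt. What it does show is the loop described above: every frame starts as random static, and the whole clip is refined together over several steps.

```python
import numpy as np

def make_target_clip(frames=8, size=16):
    """Toy 'clean' video: a bright 4x4 square that moves one pixel right per frame."""
    clip = np.zeros((frames, size, size))
    for t in range(frames):
        clip[t, 6:10, t:t + 4] = 1.0
    return clip

def toy_denoiser(noisy_clip, target):
    """Hypothetical stand-in for a trained model.

    A real denoiser would estimate the clean clip from noisy_clip and a text
    prompt. Here we assume a perfect estimate so the loop is easy to follow.
    """
    return target

def generate(target, steps=10, seed=0):
    """Refine all frames at once, starting from pure noise."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(target.shape)  # every frame begins as static
    errors = []
    for remaining in range(steps, 0, -1):
        estimate = toy_denoiser(x, target)
        # Move a fraction of the way toward the estimate; the final step lands on it.
        x = x + (estimate - x) / remaining
        errors.append(float(np.abs(x - target).mean()))
    return x, errors

clip = make_target_clip()
video, errors = generate(clip)
```

In this toy setup the average error shrinks at every step and the final clip matches the target, because the stand-in denoiser is perfect by assumption. A trained model only approximates the clean video, which is one reason real outputs can still contain mistakes.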
The model builds on past research in DALL·E and GPT models. It uses the recaptioning technique from DALL·E 3 to generate descriptive captions for visual training data, allowing it to faithfully follow user instructions in the generated video.
However, OpenAI acknowledges that the current model has weaknesses. It may struggle to accurately simulate the physics of complex scenes and may not understand cause and effect in specific instances.
It may also confuse spatial details and struggle with precise descriptions of events that occur over time.
OpenAI’s CEO, Sam Altman, used Sora to create a video based on a tweet by Kunal Shah, the CEO of CRED.
In the tweet, Shah suggested a scenario of a bicycle race on the ocean with animals riding the bicycles and a drone camera view.
Altman shared the video on social media, and it quickly gained popularity, accumulating millions of views and many positive comments.
The clip illustrates Sora’s ability to generate realistic and imaginative scenes from text instructions.
Altman had previously announced the launch of Sora and invited people to provide captions for videos they would like to see. Shah’s tweet caught Altman’s attention, and he used Sora to bring the scenario to life.
People who watched the video were impressed by the quality and creativity of the AI-generated content.
They praised the AI model’s ability to accurately interpret the text instructions and create visually stunning scenes. The video has sparked excitement and admiration for the capabilities of AI technology.