ChatGPT Maker OpenAI Can Now Make AI Videos with Sora Text-to-Video Model

OpenAI, best known for its popular ChatGPT chatbot, is now moving into AI-generated video.

OpenAI has introduced Sora, its latest AI model. Sora works much like DALL-E, OpenAI’s image-generation tool: users type in a description of a scene, and Sora produces a high-definition video clip.

Sora can also generate videos from still images and extend existing videos or fill in missing frames.

OpenAI’s Sora aims to rival video-generation AI tools from companies like Meta and Google, the latter of which recently introduced Lumiere.

OpenAI shared brief clips to demonstrate Sora’s capabilities, including woolly mammoths walking through snow, sea life in the ocean, and people walking through a city.

pic.twitter.com/Um5CWI18nS
— OpenAI (@OpenAI) February 15, 2024

Currently, Sora can only create videos lasting one minute or less. OpenAI, supported by Microsoft, aims to achieve multimodality, combining text, image, and video generation, as part of its goal to provide a wider range of AI models.

pic.twitter.com/cjIdgYFaWq
— OpenAI (@OpenAI) February 15, 2024

So far, Sora has been accessible only to a select group of safety testers known as “red teamers.” These testers probe the model for vulnerabilities in areas such as misinformation and bias.

OpenAI hasn’t shared any public demonstrations yet, apart from 10 sample clips on its website. The accompanying technical paper is set to be released later on Thursday.

“Today, Sora is becoming available to red teamers to assess critical areas for harms or risks. We are also granting access to a number of visual artists, designers, and filmmakers to gain feedback on how to advance the model to be most helpful for creative professionals,” OpenAI explains.

The current model has some weaknesses. It might have trouble accurately simulating the physics of a complex scene and understanding certain cause-and-effect scenarios.

For instance, a person might take a bite out of a cookie, but the cookie may not show a bite mark afterward.

The model may also mix up spatial details, such as left and right, and struggle with precise descriptions of events that unfold over time, such as following a specific camera path.

OpenAI announced plans to develop a “detection classifier” that can recognize video clips generated by Sora. It also intends to include specific metadata in the output to help identify AI-generated content.

This metadata resembles what Meta plans to use this year to identify AI-generated images.

pic.twitter.com/ruTEWn87vf
— OpenAI (@OpenAI) February 15, 2024

Sora is a diffusion model that, like the GPT models behind ChatGPT, is built on the Transformer architecture, which Google researchers first introduced in a paper published in 2017.

According to Clarity, a machine learning company, the creation of AI-generated deepfakes has increased 900% year over year.

With chatbots and image generators already widespread, video may become the next big area for generative AI. While this opens up exciting creative possibilities, it also raises concerns about misinformation, especially with major political events taking place around the world.