02-16-24 | 11:33 am

OpenAI steps into realm of generative video tech with Sora

ChatGPT maker’s new AI tool can generate instant videos depicting ‘complex scenes with multiple characters and accurate details’ lasting up to a minute

[Source photo: Chetan Jha/Press Insider]

ChatGPT maker OpenAI on Thursday stepped into the domain of generative video technology by unveiling Sora, a tool that can instantly generate realistic videos from text prompts.

The new artificial intelligence (AI) system can generate videos lasting up to a minute that depict “complex scenes with multiple characters, specific types of motion, and accurate details of the subject and background,” OpenAI said.

Sora will be initially available to “red teamers”, or domain experts in areas such as misinformation, hateful content, and bias, to assess critical areas for harms or risks, the company said.

“We are also granting access to a number of visual artists, designers, and filmmakers to gain feedback on how to advance the model to be most helpful for creative professionals,” it added.

OpenAI is “starting red-teaming and offering access to a limited number of creators,” chief executive Sam Altman said, terming the moment “remarkable”.

The company said it is also building tools to help detect misleading content, such as a detection classifier that can tell when a video was generated by Sora.

To ensure the authenticity and integrity of content generated by Sora, “we plan to include C2PA (Coalition for Content Provenance and Authenticity) metadata in future if we deploy the model in an OpenAI product,” it added.

How does Sora work?

Explaining how the AI system works, OpenAI said, “Sora is a diffusion model, which generates a video by starting off with one that looks like static noise and gradually transforms it by removing the noise over many steps.”

A diffusion model is a type of generative technique used in machine learning to create new data instances that resemble the training data. It works by gradually modifying a random noise pattern into a coherent image or, in the case of Sora, a video.

The process begins with what is essentially a random, meaningless pattern that looks like static on a television screen. This noise does not contain any useful information or resemble the final video in any way.

Over many steps, the model systematically alters this initial noise pattern by using the input text prompt as a guide to shape the noise into a video that matches the described scene. Each step in the process reduces the randomness (noise) and introduces more specific features and details, guided by the patterns the model learned during its training.

As the model progresses through its steps, it gradually eliminates the randomness and replaces it with elements that make up the final video, such as characters, objects, and backgrounds that align with the input text.
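
The step-by-step noise removal described above can be illustrated with a toy sketch in Python. This is not OpenAI's code and does not reflect how Sora is actually implemented. The function names, the short list of numbers standing in for a video, and the shortcut of computing the noise from a known target are all assumptions made for illustration. In a real diffusion model, a trained neural network would predict the noise from the current sample and the text prompt.

```python
import random


def mean_abs_error(sample, target):
    """Average absolute difference between two equal-length lists."""
    return sum(abs(s - t) for s, t in zip(sample, target)) / len(target)


def toy_denoise(target, steps=50, seed=0):
    """Toy illustration: turn pure noise into `target` over `steps` steps.

    `target` is a hypothetical stand-in for the video the prompt describes.
    """
    rng = random.Random(seed)
    # Start from pure Gaussian noise that has nothing to do with the target.
    sample = [rng.gauss(0.0, 1.0) for _ in target]
    history = []
    for step in range(steps):
        # Stand-in for the trained network (assumption): a real model would
        # *predict* this noise; here it is computed from the known target.
        predicted_noise = [s - t for s, t in zip(sample, target)]
        # Remove a growing fraction of the remaining noise, so that the
        # last step (fraction == 1.0) removes all of it.
        fraction = 1.0 / (steps - step)
        sample = [s - fraction * n for s, n in zip(sample, predicted_noise)]
        history.append(mean_abs_error(sample, target))
    return sample, history


if __name__ == "__main__":
    target = [0.2, 0.8, 0.5, 0.1]  # hypothetical "pixel" values
    final, history = toy_denoise(target)
    print("error after first step:", history[0])
    print("error after last step:", history[-1])
```

Running the script prints the remaining error after the first and last steps. The error shrinks at every step, mirroring the article's description of randomness being reduced gradually rather than all at once.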

What else does Sora do?

“Sora is capable of generating entire videos all at once or extending generated videos to make them longer. By giving the model foresight of many frames at a time, we’ve solved a challenging problem of making sure a subject stays the same even when it goes out of view temporarily,” OpenAI said.

The new tool is even capable of generating video from a still image.

“In addition to being able to generate a video solely from text instructions, the model is able to take an existing still image and generate a video from it, animating the image’s contents with accuracy and attention to small detail. The model can also take an existing video and extend it or fill in missing frames,” it added.
