The networker: OpenAI’s new video generation tool could learn a lot from babies

John Naughton

The Guardian

24.02.2024

“First text, then images, now OpenAI has a model for generating videos,” screamed Mashable the other day. The makers of ChatGPT and Dall-E had just announced Sora, a text-to-video diffusion model. Cue excited commentary all over the web about what will doubtless become known as T2V, covering the usual spectrum – from “Does this mark the end of [insert threatened activity here]?” to “meh” and everything in between.

Sora (the name is Japanese for “sky”) is not the first T2V tool, but it looks more sophisticated than earlier efforts like Meta’s Make-a-Video AI. It can turn a brief text description into a detailed, high-definition film clip up to a minute long. For example, the prompt “A cat waking up its sleeping owner, demanding breakfast. The owner tries to ignore the cat, but the cat tries new tactics, and finally, the owner pulls out his secret stash of treats from underneath the pillow to hold off the cat a little longer,” produces a slick video clip that would go viral on any social network.

Cute, eh? Well, up to a point. OpenAI seems uncharacteristically candid about the tool’s limitations. It may, for example, “struggle with accurately simulating the physics of a complex........

© The Guardian
