Op-Ed: Sora text to video — A very useful tool for creative arts and media in general

To hell with the paranoia, let’s see what it can do.

OpenAI said its new platform Sora can generate videos up to a minute long.

Sora from OpenAI is a very well-behaved AI tool. It pays for licensed training materials, and it delivers pretty good videos. It’s so new it hasn’t even had time for the usual negativity to set in.

To borrow a quote from someone: Don’t Panic! Just pay attention.

Take a little time to look at Sora’s videos on the website. The resolution is good, somewhere between 4K and pretty basic HD. The flow of images and the continuity are pretty good. Some videos, notably the “paper darts in a forest” clip, indicate a sprite-like capability for images introduced into a scene. Graphic artists, please also be aware that Sora’s rendering is quite good on multiple surfaces. Not perfect, perhaps, but better than the market mainstream.

As cinematography, it’s what you’d expect from a simple prompt like “driving on a mountain road”. Not fancy, just practical.

This is where the blue sky thinking hits the road. A text prompt can do a lot quickly. It can create a baseline image for exploration. It can put words into action without a massive budget.

Blue sky thinking is fabulous fun, but when it comes to doing the work, there has to be an interpreter, and preferably not a dogmatic, expensive one. Blue sky and production don’t necessarily speak the same language. There’s also the dead weight of archetypal scenes, “period décor”, and other unwholesome impediments to manage.

For example: must a 19th-century scene include every cliché in history, relevant or not? How do you distinguish your scene from the drudgery? This is tricky work. If you’re Ken Burns, you can get classic Americana and scenery on the screen seamlessly. Most people can’t do that. Now imagine that you have a text prompt that can at least give you something to work with in the context of your idea.

If you’re a game developer, you need to manage your sprites (active elements in the game) and scenery. Visualization becomes incredibly complex. A game like World of Warcraft doesn’t just happen. More to the point, what if someone has an idea, or something? A new scene, new terrain, new sprites? Wouldn’t it be nice to at least visualize these things rather than spend years trying to make them?

Basic media production is another grim issue to consider. Doesn’t really matter what you’re producing, from a bland commercial to a full series. How many mock-ups can you actually do? Right now, just a few before you decide to take the easy option and go broke. How about hundreds with text prompts and almost no cost?

This could well be the sketch pad for all visual media in future. The trouble with higher-end graphics in particular is that they’re very difficult to have fun with. Adobe Illustrator is the showcase, but really… You have to nearly get a degree to do a few sketches? The crayons are winning that one, guys. Sora could dovetail well with Adobe’s excellent but irritatingly cumbersome suite.

It’s interesting to note that OpenAI is also enlisting “red teamers”, adversarial critics, to put Sora through its paces. This is the old principle of testing taken to a much more appropriate and far more demanding level. AI isn’t just any old bit of software in media. It’s an integral part of production. You need the critics and faultfinders.

Red teaming can find things that nobody even knew could be issues. It’s a step beyond alpha and beta testing. For a first sortie into the market, Sora is quite a bit better than you’d expect, but it can’t be flawless. Red teaming will find the flaws.

This will also override the insufferable and largely illiterate hype that AI has had to endure over the last couple of years. There’s nothing supernatural about AI tech on any level. It’s a mix of technologies with massive processing capabilities. With a bit of luck it’ll meet the long-existing standards of scientific AI.

It hasn’t, yet, but Sora could be the catalyst that does that. Visual media is a mix of technologies. It delivers a functional package of live information. To put that package together, multiple different related processes have to be delivered.

Does that sound a bit more useful than the cuckoos you’ve been hearing about AI in general? That’s what’s important about Sora. It’s a continuum of integrated information.  

For the creative purists:

Most importantly, this is not a threat to your creative talents. AI is very much overrated in that regard.

Don’t kid yourselves about production. Quick sketching of anything visual without the massive time expenditure and costs has to be a lot better than the software equivalent of stone tablets.

You could spend months or years with a series of visuals trying to make them work. This can now be done in seconds on a yes-or-no basis. It saves you doing millions of angles, too. That should be very easy for AI.

If you can introduce your own visual elements into a Sora video, you have an instant sandbox.

The visual quality of Sora seems trustworthy. That’s better than most, and a big plus in itself.

You can translate the videos into useful stills. How much simpler could it get?

These images generate metrics for developers. Another pretty handy feature. You should be able to scale anything pretty easily.

Sora could be one of the most valuable tools ever in creative work. To hell with the paranoia, let’s see what it can do.


