Ian Sansavera, a software architect at a New York startup called Runway, typed out a short description of what he wanted to see in a video. “A quiet river in the forest,” he wrote.

Less than two minutes later, an experimental internet service generated a short video of a lazy river in a forest. The running water glistened in the sun as it wove through the trees and ferns, rounded a bend and splashed gently against the rocks.

Runway, which plans to open its service to a small group of testers this week, is one of many companies developing technology using artificial intelligence (AI) that will soon allow people to generate videos simply by typing a few words in a box on a computer screen.

These companies represent the next step in an industry race — one involving giants like Microsoft and Google as well as much smaller start-ups — to create new kinds of artificial intelligence systems that some say could be the next big technological innovation, as important as web browsers or the iPhone.

These systems are examples of so-called generative AI, which can instantly create text, images and sounds. Another example is ChatGPT, the online chatbot created by San Francisco startup OpenAI, which stunned the tech industry with its capabilities late last year.

Google and Meta, Facebook’s parent company, unveiled the first video-generating systems last year, but did not make them available to the public because they feared the systems could be used to spread misinformation with new speed and efficiency.

Runway’s chief executive, Cris Valenzuela, said he believes the technology is too important to keep in a research lab, despite the risks it entails. “This is one of the most impressive technologies we’ve built in the last hundred years,” he said. “People need to actually use it.”

The ability to edit and manipulate films and videos is nothing new, of course. Filmmakers have been doing it for over a century. In recent years, researchers and digital artists have used artificial intelligence and software to create and edit videos, often referred to as deepfakes.

But systems like the one created by Runway could eventually replace editing skills with the push of a button.

The system works best when the scene has a bit of action, but not too much – something like “a rainy day in the big city” or “a dog with a cellphone in the park.” Hit “Enter,” and it generates a video in a minute or two.

The technology can reproduce common images, such as a cat sleeping on a rug. It can also combine disparate concepts to generate oddly amusing videos, like a cow at a birthday party.

The videos are only four seconds long, and on closer inspection they are choppy and blurry. Sometimes the images are strange, distorted and disturbing. The system has a habit of fusing animals like dogs and cats with inanimate objects like balls or cellphones. But given the right prompt, it produces videos that hint at where the technology is headed.

“At this point, if I see a high-resolution video, I’m probably going to trust it,” said Phillip Isola, a Massachusetts Institute of Technology professor and AI specialist. “But that’s going to change pretty quickly.”

Like other technologies that use generative AI, Runway’s system learns by analyzing digital data — in this case, photos, videos, and captions describing the content of those images. By training this type of technology on increasingly large amounts of data, researchers are confident they can rapidly improve and extend its abilities. Soon, experts say, these systems will produce professional-looking mini-movies, complete with music and dialogue.

“It used to be that to do something like this, you needed a camera. You needed props. You needed a location. You needed permission. You needed money,” said Susan Bonser, an author and publisher in Pennsylvania who has experimented with early incarnations of generative video technology. “Now, none of that is necessary. You can just sit back and imagine it.”