Google’s Veo 3 AI video generator is a slop monger’s dream


Even at first glance, there’s something off about the body on the street. The white sheet it’s under is a little too clean, and the officers’ movements are totally devoid of purpose. “We need to clear the street,” one of them says with a firm hand gesture, though her lips don’t move. It’s AI, alright. But here’s the kicker: my prompt didn’t include any dialogue.

Veo 3, Google’s new AI video generation model, added that line all by itself. Over the past 24 hours I’ve created a dozen clips depicting news reports, disasters, and goofy cartoon cats with convincing audio, some of which the model invented all by itself. It’s more than a little creepy and far more sophisticated than I had imagined. And while I don’t think it’s going to propel us to a misinformation doomsday just yet, Veo 3 strikes me as an absolute AI slop machine.

Google announced Veo 3 at I/O this week, highlighting its most important new capability: generating sound to go with your AI video. “We’re entering a new era of creation,” Google’s VP of Gemini, Josh Woodward, explained in the keynote, calling it “incredibly realistic.” I wasn’t entirely sold, but then, a few days later, I had Veo 3 generate a video of a news anchor announcing a fire at the Space Needle. All it took was a basic text prompt, a few minutes, and an expensive subscription to Google’s AI Ultra plan. And you know what? Woodward wasn’t exaggerating. It’s realistic as hell.

I tried the news anchor prompt after seeing what Alejandra Caraballo, a clinical instructor at Harvard Law School’s Cyberlaw Clinic, was able to produce. One of her clips features a news anchor announcing the death of US Secretary of Defense Pete Hegseth. He isn’t dead, but the clip is incredibly convincing. A post featuring a string of videos with AI-generated characters protesting the prompts used to create them has 50,000 upvotes on Reddit. The scenes include disasters, a woman in a hospital bed using a breathing tube, and a character being threatened at gunpoint, all with spoken dialogue and realistic background sounds. Real lighthearted stuff!

Maybe I’m being naive, but after playing around with Veo 3 I’m not quite as concerned as I was at first. For starters, the obvious guardrails are in place. You can’t prompt it to create a video of Biden tripping and falling. You can’t have a news anchor announce the assassination of the president, or even generate a video of a T-shirt-and-chain-wearing tech company CEO laughing while dollar bills rain down around him. That’s a start.

That said, you can generate some troubling shit. Without any clever workarounds I prompted Veo 3 to create a video of the Space Needle on fire. Starting with my own photo of Mount Rainier, I generated a video of it erupting with smoke and lava. Coupled with a clip of a news anchor announcing said disaster, I can see how you could seed some mischief real easily with this tool.

Here’s the better news: it doesn’t seem to be a ready-made deepfake machine. I gave it a couple of photos of myself and asked it to generate a video with specific dialogue, and it wouldn’t comply. I also asked it to bring a pair of giant boots in a photo to life and have them walk out of the scene; it managed one boot stomping across the sidewalk with some comical crunching noises in the background.

I had an easier time generating videos when my prompts were less specific, which is how I confirmed something my colleague Andrew Marino pointed out: Veo 3 is great at creating the kind of lowest-common-denominator YouTube content aimed at kids.

If you’ve never been subjected to the bottomless pit of garbage on YouTube Kids, let me enlighten you. Imagine watching the worst 3D rendering of a monster truck driving down a ramp, landing in a vat of colored paint. Next to it, another monster truck drives down another ramp into another vat of paint, this time in a different color. Now watch that again. And again. And again. There are hours of this stuff on YouTube designed to mesmerize toddlers. These videos are usually harmless, just empty calories designed to rack up views that make Cocomelon look like Citizen Kane. In about 10 minutes with Veo 3, I threw together a clip following the same basic formula, complete with jaunty background music. But the clip that’s even more troubling to me is the two cartoon cats on a pier.

I thought it would be funny to have the cats complain to each other that the fish aren’t biting. In just a couple of minutes, I had a clip complete with two cats and some AI-generated dialogue that I never wrote. If it’s this easy to make a 10-second clip, stretching it out to a seven-minute YouTube video would be trivial. In its current form, clips revert to Veo 2 when you try to extend them into longer scenes, which removes the audio. But given the way Google has been relentlessly pushing these tools forward, I can’t imagine it’ll be long before you can edit a full feature-length video with Veo 3.

Honestly, I wonder if this kind of use for AI-generated video is a feature and not a bug. Google showed us some fancy AI-generated video from real filmmakers, including Eliza McNitt, who’s working with Darren Aronofsky on a new film with some AI-generated elements. And sure, AI video could be an interesting tool in the right hands. But I think what we’re most likely to see is a proliferation of the kind of bland imagery that AI is so good at producing, this time in stereo.
