Google’s Veo 3 brings the era of video on command – Crypto News

Tap the video tab in Google’s Gemini assistant and describe, very specifically, what’s in your mind. Describe the sound you want as well. Wait a minute or two, and watch as your video appears.

That’s how easy it has become to create a video out of thin air. No camera, no props, no people. DeepMind has refined its text-to-video tool to the point where it generates beautiful-looking slices of video, complete with sound. But, of course, there’s a catch.

Several, in fact.

First, you need to be on one of the paid tiers of Gemini: Pro or Ultra. Pro costs ₹1,950 a month and gives you access to the Gemini app with 2.5 Pro, limited access to Veo 3, Flow, Whisk, NotebookLM Plus, Gemini in Gmail, Docs, Vids, and more, plus two terabytes (TB) of storage. The Gemini AI Ultra plan costs over ₹21,000 a month and gives access to more products, with fewer limits on using them.

Veo 3, the video-generating tool, can be found in the Gemini app (or in a browser) if you have the Pro plan, but it is limited to three videos a day, with an overall maximum limit as well. The videos are eight seconds long and are output in 720p resolution at 24 frames per second in a 16:9 aspect ratio. Veo 3 is capable of producing 4K videos, depending on the platform, but the limits mentioned are what you get with the Pro tier.
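To put those numbers in perspective, here is a rough back-of-the-envelope sketch in Python. It uses only the figures quoted above for the Pro tier; the function and variable names are made up for illustration, and the pixel width assumes that "720p" at 16:9 means the usual 1280 by 720 frame.

```python
from fractions import Fraction

# Figures quoted in the article for the Pro tier (not taken from any official API).
DURATION_S = 8              # length of one clip, in seconds
FPS = 24                    # frames per second
HEIGHT_PX = 720             # "720p" refers to the frame height
ASPECT = Fraction(16, 9)    # width : height
CLIPS_PER_DAY = 3           # daily limit mentioned for Pro users


def clip_stats(duration_s, fps, height_px, aspect):
    """Return basic size figures for a single clip (hypothetical helper)."""
    width_px = int(height_px * aspect)  # 720 * 16/9 is exactly 1280
    return {
        "width_px": width_px,
        "height_px": height_px,
        "frames": duration_s * fps,
        "pixels_per_frame": width_px * height_px,
    }


stats = clip_stats(DURATION_S, FPS, HEIGHT_PX, ASPECT)
daily_seconds = CLIPS_PER_DAY * DURATION_S

print(stats)
print(f"Footage per day at the Pro limit: {daily_seconds} seconds")
```

In other words, each clip is 192 frames, and a full day's quota adds up to 24 seconds of footage.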

Whatever the technical details, the tiny videos you create with Veo 3 look and sound very good. The visual experience is rich: vivid imagery, smooth movement and, best of all, clear sound. The earlier Veo 2 did not include sound; now that audio is part of the videos, it completes them in a way that shows what could be possible were all limits removed.

The sound is good enough to be quite loud and clear. You won’t miss the purring of a cat or the fizz of soda. You can even have people conversing, though they will need to hurry it up to fit into the eight-second slot. The audio is synchronized. You can also have music if you describe it well enough.

There’s a problem, too.

The Veo 3 version available to Pro users, called Veo 3 Fast, is really like a teaser. You can’t actually do much that’s useful within Gemini without bumping into the limitations. One rather frustrating limit is that adherence to instructions or prompts is by no means flawless at this level. I’ve been playing with Veo since its previous version and have only once or twice managed to have a video created to my specifications. The rest have what you might call goof-ups that make them unusable.

For example, with Veo 2, I had once requested a video of Tom and Jerry, from the beloved cartoons, in which Tom was chasing Jerry around a large piece of cheese at high speed. Jerry was to win, as he usually does, by tricking Tom, in this case by jumping on top of the cheese, leaving Tom running. There was no sound then, so I asked for text that said, “Who moved my mouse”.

The result was hilarious. The cheese chased Tom, who in turn chased Jerry. The text said, “Who who cheese?” I iterated many times with no better success.

You will often find errors like the one I described. I asked for a girl swimming in clear blue water, doing the breaststroke. She appeared, swimming in the strangest manner possible. Her face was underwater and staring at the camera, her arms were pushing the water backwards, and there was no sign of the signature breaststroke movements. If she had, in fact, carried on in that vein, she would have shortly drowned.

Compliance with prompts is stronger on the more professional platforms, and those aren’t cheap. Industry insiders may opt for that access and will know what to do with the videos. For the average user, Veo 3 Fast is a glance at what’s to come, some day not far off. If you get the video right, through a combination of good luck and clever prompting, you could use it on social media, to illustrate something to students, or to send a message such as a birthday wish. It can be fun if you get the result you want within your three daily tries.

All the same, whether Tom chases Jerry, Jerry chases Tom, or the cheese chases both of them, the democratization of video has truly arrived, and what we will have to cope with is figuring out whether seeing is believing.

The New Normal: The world is at an inflexion point. Artificial Intelligence is set to be as massive a revolution as the Internet has been. The option to just stay away from AI will not be available to most people, as all the tech we use takes the AI route. This column series introduces AI to the non-techie in an easy and relatable way, aiming to demystify and help a user to actually put the technology to good use in everyday life.

Mala Bhargava is most often described as a ‘veteran’ writer who has contributed to several publications in India since 1995. Her domain is personal tech, and she writes to simplify and demystify technology for a non-techie audience.
