
Meta made DALL-E for video, and it’s both creepy and amazing

Meta unveiled a crazy artificial intelligence model that turns typed descriptions into video. The system, called Make-A-Video, is the latest in a wave of AI-generated content on the web.

The system accepts short descriptions like “a robot surfing a wave in the ocean” or “clown fish swimming through the coral reef” and generates a short GIF matching the description. There are even three different video styles to choose from: surreal, realistic, and stylized.

Example prompt: “An artist’s brush painting on a canvas, close up.”

According to a Facebook post by Meta CEO Mark Zuckerberg, generating video from text is much harder than generating images because video requires motion:


“It’s much harder to generate video than photos because beyond correctly generating each pixel, the system also has to predict how they’ll change over time. Make-A-Video solves this by adding a layer of unsupervised learning that enables the system to understand motion in the physical world and apply it to traditional text-to-image generation.”

Example prompt: “A young couple walking in a heavy rain.”

Meta’s AI Research team wrote a paper describing how the system works and how it differs from current text-to-image (T2I) methods. Unlike other machine learning models, Meta’s text-to-video (T2V) method doesn’t rely on pre-defined text-video pairs. For example, it doesn’t pair “man walking” with a video of an actual man walking.

If this sounds a lot like DALL-E, the popular T2I application, you wouldn’t be far off. Other T2I applications have rolled out since DALL-E gained popularity; TikTok, for example, released a filter in August called AI Greenscreen that generates painting-style images based on the words you type.

Example prompt: “A fluffy baby sloth with an orange knitted hat trying to figure out a laptop, close up, highly detailed, studio lighting, screen reflecting in its eye.”

AI-generated content has become quite buzzworthy within the last few years. Deepfake technology, which uses machine learning to replace one person’s face with another’s, is even used by visual effects studios for big-budget shows like The Mandalorian.

In July, The Times mistakenly reported on a Ukrainian woman in the midst of the Russia-Ukraine war. The problem: she wasn’t real.

AI probably isn’t a real threat, but projects like DALL-E and Make-A-Video are fun explorations of some of the interesting possibilities.

David Matthews