- OpenAI is the startup behind the viral AI chatbot ChatGPT, but the company has other AI products.
- DALL-E creates images based on detailed text descriptions and Sora creates videos.
- Whisper is a speech-recognition model that can transcribe and translate audio from many languages.
ChatGPT quickly went viral after it was released in November 2022.
The tool has generated controversy and even kicked off a race among large tech companies like Google and Meta to develop their own, more powerful AI tools. OpenAI now has a $13 billion partnership with Microsoft and the tech giant has integrated GPT-4o into Copilot and the Azure AI cloud suite.
However, the startup behind it, OpenAI, has other AI products, too — and it recently made its AI video-generator Sora available to users. Take a look at some of the startup’s other AI products.
DALL-E
Just months before ChatGPT launched, OpenAI removed the waitlist for its generative AI art generator, DALL-E. It quickly grew to over 1.5 million daily users by September 2022, the company wrote in a blog post. The tool — which quickly creates imaginative and detailed artwork via a text prompt — sparked controversy among artists when it came out, who debated what DALL-E and other AI art generators like it could mean for people in creative jobs.
Since DALL-E launched, OpenAI released DALL-E 2 and DALL-E 3. The latest upgrade, DALL-E 3 understands more nuance and detail than previous versions, the company said.
The AI art generator creates original images called “generations” from detailed text prompts input by a person. You can write detailed prompts such as the one above — “astronaut fish swimming in an ocean in outer space, digital art” — and specify an art style or even reference a specific artist like Vincent Van Gogh.
You can also edit “generations” with the tool using one of the credits the program gives you each month, and upload your own photos to create images.
Whisper
Whisper is an automatic speech recognition model that transcribes speech to text and can identify and translate multiple languages to English. The model can transcribe in multiple languages too.
The system was trained on 680,000 hours of multilingual and multitask supervised data collected from the internet, according to OpenAI.
In examples on its product page, Whisper transcribes an almost 30-second long audio of quick-spoken text, a clip of a K-pop song, an audio clip of spoken French, and an audio clip of someone speaking with a strong accent.
Whisper is now used in a number of industries including healthcare. Recently, an Associated Press report revealed that the technology is prone to hallucinations that include comments about race and violent rhetoric, which could pose problems if it’s used in medical settings.
Codex
Codex is an AI system that translates natural language into code. OpenAI says Codex is “most capable” in Python, but is also proficient in over a dozen coding languages like JavaScript and Swift.
The model can interpret simple commands input by a user. OpenAI says Codex is a “general-purpose programming model,” which means it can be used for “essentially any programming task,” although its results can vary. OpenAI said it’s successfully used Codex “for transpilation, explaining code, and refactoring code.”
OpenAI has some examples of how Codex works, including using the model to program a space-themed game and giving a computer spoken commands to edit a Word document.
Sora
OpenAI announced during its “Shipmas” livestream on December 9 that it would launch its AI video generator Sora to the public after making it available to a limited group of artists and creators in February.
Sora can generate up to 20-second videos from written instructions. The tool can also complete a scene and extend existing videos by filling in missing frames.
The company showed off the new product and its various features, including the Explore page, which is a feed of videos shared by the Sora community. It also demonstrated various style presets for the videos like pastel symmetry, film noir, and balloon world.
The company said in a blog post that the product “may struggle to simulate the physics of a complex scene,” as well as with depicting events that happen over time. It may also mix up left and right, the company said.
While the tool already made a strong impression on some in Hollywood, the tool’s product designer said in the demonstration that Sora wasn’t going to create feature films at the click of a button. Rather, the employee said the tool was moreso “an extension of the creator who’s behind it.”
API tools
OpenAI also has a set of tools geared toward developers. Its flagship reasoning models include o1, o1-mini, and the soon-to-be-released o3 and 03-mini models. OpenAI also has GPT models, including GPT-4o and GPT-4o mini. OpenAI offers Chat Completions API, Assistants API, Batch API, and Realtime API. Users can explore models and APIs in OpenAI’s Playground without writing code. According to the company website, three million developers are building with its tools.