OpenAI takes another step closer to getting AI to think like humans with new 'o1' model

The line separating human intelligence from artificial intelligence just got narrower.

OpenAI on Thursday revealed o1, the first in a new series of AI models that are “designed to spend more time thinking before they respond,” the company said in a blog post.

The new model can work through complex tasks and, compared with previous models, solve more difficult problems in science, coding, and math. In essence, it thinks a little more like a human than existing AI chatbots do.

While previous iterations of OpenAI’s models have excelled on standardized tests ranging from the SAT to the Uniform Bar Examination, the company says that o1 goes a step further. It performs “similarly to PhD students on challenging benchmark tasks in physics, chemistry, and biology.”

For example, it beat GPT-4o, a multimodal model OpenAI unveiled in May, by a long shot on a qualifying exam for the International Mathematics Olympiad. GPT-4o correctly solved only 13% of the exam’s problems, while o1 scored 83%, the company said.

The sharp jump in o1’s reasoning capabilities comes, in part, from a technique known as “chain of thought,” in which the model works through a problem step by step before answering. OpenAI said o1 “learns to recognize and correct its mistakes. It learns to break down tricky steps into simpler ones. It learns to try a different approach when the current one isn’t working.”

That’s not to say there aren’t some tradeoffs compared to earlier models. OpenAI noted that while human testers preferred o1’s responses in reasoning-heavy categories like data analysis, coding, and math, GPT-4o still won out in natural language tasks like personal writing.

OpenAI’s primary mission has long been to create artificial general intelligence, or AGI, a still-hypothetical form of AI that mimics human capabilities. Over the summer, while o1 was still in development, the company unveiled a new five-level classification system for tracking its progress toward that goal. Company executives reportedly told employees that o1 was nearing level two, which the company identified as “reasoners” with human-level problem-solving.

Ethan Mollick, a professor at the University of Pennsylvania’s Wharton School who has had access to o1 for over a month, said the model’s gains are perhaps best illustrated by how it solves crossword puzzles. Crossword puzzles are typically difficult for large language models to solve because “they require iterative solving: trying and rejecting many answers that all affect each other,” Mollick wrote in a post on his Substack. Most large language models “can only add a token/word at a time to their answer.”

But when Mollick asked o1 to solve a crossword puzzle, it thought about it for a “full 108 seconds” before responding. He said that its thoughts were both “illuminating” and “pretty impressive” even if they weren’t fully correct.

Other AI experts, however, are less convinced.

Gary Marcus, a New York University professor of cognitive science, told Business Insider that the model is “impressive engineering” but not a giant leap. “I am sure it will be hyped to the sky, as usual, but it’s definitely not close to AGI,” he said.

Since OpenAI unveiled GPT-4 last year, it’s been releasing successive iterations in its quest to invent AGI. In April, GPT-4 Turbo was made available to paid subscribers. One update included the ability to generate responses that are “more conversational.”

The company announced in July that it’s testing an AI search product called SearchGPT with a limited group of users.
