OpenAI has continued to push the frontier of artificial intelligence with its work toward a new model that can reason through complex tasks. The company recently lured five researchers away from Meta's superintelligence division, reportedly with enormous compensation packages. The hires reflect OpenAI's ambitious long-term plan: producing AI agents that can handle all kinds of tasks on their own.
In 2023, OpenAI reached a major milestone with a novel AI system first known internally as “Q*” and later renamed “Strawberry.” This pioneering model merges large language models (LLMs), reinforcement learning (RL), and a technique called test-time computation. OpenAI acknowledges that the models remain imperfect: they still fail at some difficult tasks and occasionally produce falsehoods.
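The article describes test-time computation only at a high level. One common form of the idea is to spend extra compute at inference time by sampling several candidate solutions and keeping the one a scoring function prefers. The sketch below is a minimal, hypothetical best-of-N loop; `generate_candidate` and `score` are stand-ins for a language model and a verifier, neither of which is specified here.

```python
import random

# Minimal best-of-n sketch of test-time computation: instead of taking the
# model's first answer, sample several attempts and keep the one a scoring
# function rates highest. Both functions below are hypothetical stand-ins.

def generate_candidate(question: str) -> str:
    # Placeholder: a real system would sample a chain of thought from an LLM.
    return f"attempt-{random.randint(0, 9)} for: {question}"

def score(question: str, candidate: str) -> float:
    # Placeholder: a real system might use a verifier model or unit tests.
    return random.random()

def best_of_n(question: str, n: int = 8) -> str:
    # More samples at inference time means more compute spent "thinking".
    candidates = [generate_candidate(question) for _ in range(n)]
    return max(candidates, key=lambda c: score(question, c))

print(best_of_n("What is 17 * 24?"))
```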
Here’s a look at OpenAI’s ongoing efforts to sharpen its AI capabilities, and at what makes its latest advances so significant.
Recruitment of Top Researchers
OpenAI’s approach to talent recruitment has already sent shockwaves through the industry. By hiring five leading researchers from Meta’s superintelligence team, OpenAI aims to strengthen its research operation.
“When we showed the evidence [for o1], the company was like, ‘This makes sense, let’s push on it.’” – Hunter Lightman
Such extraordinary compensation packages underscore OpenAI’s determination to assemble a team capable of accelerating its research. The infusion of Meta talent also serves OpenAI’s long-term vision: developing powerful general-purpose AI agents that can function independently across diverse applications.
The move highlights the competitive landscape of AI research, where companies are vying for top talent to drive innovation. OpenAI’s aggressive recruitment strategy gives it a leg up over other tech titans like Google and Microsoft in the ongoing battle for AI dominance.
Advancements in AI Reasoning Models
OpenAI released its first large-scale language model in 2018. Trained on vast amounts of internet text and backed by high-performance GPU clusters, it was a genuine breakthrough, and it laid the foundation for the progress that followed.
In recent years, OpenAI’s MathGen team, led by Hunter Lightman, has played a crucial role in developing state-of-the-art reasoning models. Their efforts culminated in a noteworthy achievement: one of OpenAI’s models reached gold medal–level performance at the International Math Olympiad, showcasing its exceptional mathematical abilities.
“I think these models will become more capable at math, and I think they’ll get more capable in other reasoning areas as well,” – Noam Brown
OpenAI researchers say they are pleased with the trajectory of their reasoning models and expect these improvements to yield more capable AI systems, ones able to tackle multifaceted problems that demand a holistic approach.
The Future of AI Agents
The focus on AI agents represents a major inflection point in OpenAI’s research direction. A newly formed “Agents” team, guided by researcher Daniel Selsam, is wholly dedicated to this paradigm. The goal is to develop AI agents that can autonomously complete tasks, boosting productivity for workers of all trades.
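To make the idea of an agent that “autonomously completes tasks” concrete, the sketch below shows the generic observe-decide-act loop that most agent designs share. It is illustrative only: the article does not describe OpenAI’s agent architecture, and `plan_next_action` and the toy tool are hypothetical.

```python
# A toy version of the generic agent loop: the model repeatedly picks an
# action, the environment executes it, and the result is fed back in until
# the task is done. All names here are hypothetical, not OpenAI's design.

def plan_next_action(task: str, history: list[str]) -> tuple[str, str]:
    # Placeholder for an LLM call that picks a tool and its argument.
    if not history:
        return ("search", task)
    return ("finish", history[-1])

TOOLS = {
    "search": lambda query: f"search results for {query!r}",
}

def run_agent(task: str, max_steps: int = 5) -> str:
    history: list[str] = []
    for _ in range(max_steps):
        action, arg = plan_next_action(task, history)
        if action == "finish":
            return arg                    # the agent decides it is done
        observation = TOOLS[action](arg)  # act in the environment
        history.append(observation)       # feed the result back in
    return "gave up after max_steps"

print(run_agent("find the release year of GPT-1"))
```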
OpenAI’s Codex agent is one attempt at realizing this ambition, created to help software engineers with low-level coding tasks. This functionality has the potential to let engineers push tedious tasks to the side and focus on more advanced areas of the software development life cycle.
“Eventually, you’ll just ask the computer for what you need and it’ll do all of these tasks for you,” – Sam Altman
Even OpenAI admits that its current models are not perfect. Though they have truly advanced the state of the art in reasoning, the models cannot yet consistently and correctly execute complex tasks. A key focus of research is finding new, creative ways to make these powerful agents even more capable.
Lightman stresses that training on tasks that are harder to verify will be a major focus of future research and a key to closing this gap. He notes that “some of the research I’m really excited about right now is figuring out how to train on less verifiable tasks. We have some leads on how to do these things.”
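Lightman’s distinction is easiest to see in terms of reward functions. A math answer can be checked programmatically, while something like essay quality cannot; one common workaround in the field (not attributed to OpenAI here) is to substitute a learned, model-based judge. The sketch below contrasts the two, with `llm_judge` as a hypothetical stand-in for a reward model.

```python
# Two reward signals for RL-style training. The verifiable one can be
# computed exactly; the less verifiable one leans on a (hypothetical)
# model-based judge, which is noisier and easier to game.

def verifiable_reward(candidate_answer: str, ground_truth: str) -> float:
    # Math-style task: the reward is exact and cheap to check.
    return 1.0 if candidate_answer.strip() == ground_truth.strip() else 0.0

def llm_judge(prompt: str, response: str) -> float:
    # Placeholder for a learned judge that scores open-ended work;
    # a real system would call a reward model here.
    return min(1.0, len(response) / 100.0)  # toy heuristic, not a real judge

def less_verifiable_reward(prompt: str, response: str) -> float:
    # Open-ended task: no ground truth, so we trust the judge's score.
    return llm_judge(prompt, response)

print(verifiable_reward("408", "408"))                  # 1.0
print(less_verifiable_reward("Write a haiku.", "..."))  # judge-dependent
```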
Navigating Challenges in AI Development
Even with advances like plugins and function calling, OpenAI’s models continue to struggle with hallucinations and complex multi-step reasoning. These issues highlight the difficulty of building trustworthy AI systems that can reason deeply.
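For readers unfamiliar with the function calling mentioned above: the model is shown a JSON schema for a tool and can respond with a structured call instead of free text, which the application then executes. Below is a minimal sketch using the OpenAI Python SDK’s Chat Completions `tools` parameter; the model name and the `get_weather` tool are assumptions chosen for illustration.

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Describe a tool to the model as a JSON schema; `get_weather` is a
# made-up example tool, and the model name is an assumption.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "What's the weather in Paris?"}],
    tools=tools,
)

# If the model chose to call the tool, it returns structured arguments
# instead of prose; the application runs the function itself.
for call in response.choices[0].message.tool_calls or []:
    print(call.function.name, call.function.arguments)
```

Note that the model never executes anything on its own; the calling application runs the function and feeds the result back, which is one reason long multi-step chains can still go wrong.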
El Kishky, another key researcher at OpenAI, reflects on the evolution of the models: “I could see the model starting to reason. It would notice mistakes and backtrack. It would get frustrated. It really felt like reading the thoughts of a person.” Observations like these have fueled the ongoing effort to build AI systems that better mimic human-like reasoning and adaptability.
OpenAI acknowledges that many of these shortcomings trace back to data. As Lightman puts it, “Like many problems in machine learning, it’s a data problem.” A vibrant community of researchers is working on improving data quality and refining training techniques to lift model performance.
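The article does not say what OpenAI’s data work actually looks like, but one trivially concrete example of “a data problem” is filtering a training corpus. The sketch below applies two generic cleanups, exact deduplication and a length floor, that are standard practice in the field rather than anything attributed to OpenAI.

```python
import hashlib

# A generic data-cleaning pass: drop exact duplicates and very short
# documents. Standard corpus hygiene, not OpenAI's actual pipeline.

def clean_corpus(docs: list[str], min_chars: int = 200) -> list[str]:
    seen: set[str] = set()
    kept = []
    for doc in docs:
        text = doc.strip()
        if len(text) < min_chars:
            continue  # too short to be useful training signal
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if digest in seen:
            continue  # exact duplicate of a document we already kept
        seen.add(digest)
        kept.append(text)
    return kept

corpus = ["short", "A" * 300, "A" * 300, "B" * 250]
print(len(clean_corpus(corpus)))  # 2: one copy of the A-doc, plus the B-doc
```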