Reuters: OpenAI is developing a new model codenamed “Strawberry” that can browse the Internet and reason independently
According to Reuters, OpenAI is working on a new artificial intelligence project codenamed “Strawberry.”
Internal documents show that teams inside OpenAI are working on the project, but how it works and when it will be made public remain unclear. The project aims to enable OpenAI’s models not only to answer questions, but also to browse the Internet autonomously and reliably for in-depth research.
OpenAI hopes to significantly improve its models’ reasoning capabilities, which researchers regard as key to overcoming the limitations of current AI.
The project is a work in progress with no clear timeline for a public release, according to the internal documents and people familiar with the matter.
Details of the Strawberry Project
1. Project Background and Objectives
- Project name : Strawberry
- Goal : Raise the intelligence level of AI models by strengthening their reasoning capabilities, so that they can autonomously conduct in-depth research and carry out long-horizon tasks (LHT).
2. Project Overview
- Enabling deep research : A core goal of the Strawberry project is to enable AI models to not only generate answers, but also autonomously browse the internet to conduct “deep research.” This means that the AI will be able to independently retrieve and analyze information and take action based on its findings.
- Improved reasoning capabilities : Strawberry aims to improve the reasoning capabilities of AI models, enabling them to better handle multi-step problems and long-term tasks. This improvement will enable AI models to excel in complex fields such as scientific discovery and software development.
3. Technical methods
- Post-training : The project involves a specialized form of post-training, in which a model that has already been pre-trained on general data is further adapted to improve its performance in specific ways. Post-training includes, but is not limited to, fine-tuning, for example using human feedback and examples of good and bad answers to adjust the model’s outputs.
- Self-training data generation : According to a person familiar with the matter, Strawberry’s approach resembles the “Self-Taught Reasoner” (STaR) method developed at Stanford University in 2022, in which a model iteratively generates its own training data to improve its reasoning. One of STaR’s creators told Reuters that the technique could in theory be used to push language models beyond human-level intelligence.
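How Strawberry actually works has not been disclosed, so the following is only a minimal sketch of the published STaR idea mentioned above, not OpenAI’s method. It assumes a model can be treated as a function that returns a rationale and an answer, and it keeps only the rationales that led to a correct answer before fine-tuning on them. All names (`Example`, `collect_correct_rationales`, `star_loop`, and the toy model and fine-tuning helpers) are hypothetical, and the “rationalization” step of the original STaR method is omitted for brevity.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class Example:
    question: str
    answer: str


# A "model" here is any function mapping a question to (rationale, answer).
Model = Callable[[str], Tuple[str, str]]
# One kept training item: (question, rationale, answer).
Kept = Tuple[str, str, str]


def collect_correct_rationales(model: Model, dataset: List[Example]) -> List[Kept]:
    """Generate a rationale per question and keep it only if the answer is right."""
    kept: List[Kept] = []
    for ex in dataset:
        rationale, predicted = model(ex.question)
        if predicted == ex.answer:
            kept.append((ex.question, rationale, ex.answer))
    return kept


def star_loop(
    base_model: Model,
    fine_tune: Callable[[Model, List[Kept]], Model],
    dataset: List[Example],
    iterations: int,
) -> Model:
    """Repeat: generate with the current model, filter, fine-tune the base model."""
    model = base_model
    for _ in range(iterations):
        kept = collect_correct_rationales(model, dataset)
        # Assumption: each round restarts fine-tuning from the base model.
        model = fine_tune(base_model, kept)
    return model


# Toy stand-ins (purely illustrative, no real training happens here).
def make_toy_model(known: Dict[str, str]) -> Model:
    def model(question: str) -> Tuple[str, str]:
        if question in known:
            return (f"recalled answer for {question}", known[question])
        return ("no idea", "unknown")
    return model


def toy_fine_tune(base: Model, kept: List[Kept]) -> Model:
    learned = {q: (r, a) for q, r, a in kept}

    def tuned(question: str) -> Tuple[str, str]:
        return learned.get(question) or base(question)
    return tuned


if __name__ == "__main__":
    data = [Example("2+2", "4"), Example("3+3", "6")]
    base = make_toy_model({"2+2": "4", "3+3": "7"})  # second answer is wrong
    print(collect_correct_rationales(base, data))
```

The filter is the essential part of the sketch: only generations whose final answer matches the reference are allowed back into the training data, which is what lets a model learn from its own output without reinforcing its mistakes.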
4. Internal documents and development progress
- Internal Documents : According to internal documents, the Strawberry project is already underway, but a specific release date has not yet been determined.
- “Deep Research” dataset : According to its internal documents, OpenAI is creating, training, and evaluating models on what the company calls a “deep research” dataset.
- Target tasks : Strawberry is intended for tasks that require planning ahead and carrying out a series of actions over an extended period, such as scientific research and software development.
OpenAI specifically wants its models to use those capabilities to conduct research by autonomously browsing the web with the help of “CUAs,” or computer-using agents, which can act on their findings, according to the documents and one of the sources. OpenAI also plans to test its models’ ability to do the work of software and machine learning engineers.
Computer-Using Agent (CUA)
Definition of CUA
- CUA (Computer-Using Agent) : A software agent that can operate a computer system autonomously. Given preset instructions and goals, a CUA can browse the Internet, retrieve and analyze information, and take follow-up actions based on what it finds.
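The definition above describes a loop of deciding, acting, and recording findings. As a rough illustration only, that loop might be sketched as follows. Nothing here reflects OpenAI’s actual design, which the report does not describe: `Action`, `AgentState`, `run_agent`, the `policy` and `fetch` callbacks, and the fake pages are all hypothetical, and the “web” is an in-memory dictionary so the example runs without network access.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Action:
    kind: str      # "visit" to read a page, "finish" to stop with a result
    argument: str  # page name for "visit", final report for "finish"


@dataclass
class AgentState:
    goal: str
    notes: List[str] = field(default_factory=list)  # findings gathered so far


def run_agent(
    goal: str,
    policy: Callable[[AgentState], Action],
    fetch: Callable[[str], str],
    max_steps: int = 5,
) -> str:
    """Ask the policy for an action, execute it, record the result, repeat."""
    state = AgentState(goal)
    for _ in range(max_steps):
        action = policy(state)
        if action.kind == "finish":
            return action.argument
        if action.kind == "visit":
            state.notes.append(fetch(action.argument))
    # A step budget keeps an autonomous loop from running forever.
    return "step budget exhausted"


# Toy environment: a fake "web" and a fixed browsing plan (illustrative only).
FAKE_WEB: Dict[str, str] = {"page-a": "fact A", "page-b": "fact B"}
PAGES: List[str] = ["page-a", "page-b"]


def toy_fetch(page: str) -> str:
    return FAKE_WEB.get(page, "not found")


def toy_policy(state: AgentState) -> Action:
    # Visit each page once, then finish with a summary of the notes.
    if len(state.notes) < len(PAGES):
        return Action("visit", PAGES[len(state.notes)])
    return Action("finish", "; ".join(state.notes))


if __name__ == "__main__":
    print(run_agent("collect facts", toy_policy, toy_fetch))
```

In a real agent the policy would be the AI model itself and `fetch` would drive a browser; the step budget stands in for whatever safeguards such a system would need.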
Possible applications of CUA in Strawberry project
Automatic browsing and research
- Automatically browse the Internet : The Strawberry project aims to let AI models browse the Internet autonomously through a CUA, so that they can not only find and read online information but also analyze it and conduct in-depth research.
- Action decision : A CUA could then act on the research results. For example, if the AI model identified a new scientific research direction while browsing, it could download relevant papers, generate reports, or even start experimental simulations.
Possible applications in software and machine learning engineering
Execute engineering tasks
- Software Engineering : OpenAI plans to test Strawberry models on software development work. For example, a CUA could browse code bases, find and fix bugs, generate new code modules, and even develop complete software applications.
- Machine Learning Engineering : In machine learning, a CUA could help with data preprocessing, model training and optimization, and result analysis, for example by selecting and downloading data sets, adjusting model parameters, evaluating model performance, and optimizing further based on the results.
The following is a translation of the Reuters report
OpenAI, the maker of ChatGPT, is developing a new way to improve its artificial intelligence models, a project codenamed “Strawberry,” according to people familiar with the matter and internal documents reviewed by Reuters.
The project, details of which have not been previously reported, comes as Microsoft-backed OpenAI races to show that its models can deliver advanced reasoning capabilities.
Teams inside OpenAI are working on “Strawberry,” according to a copy of a recent internal OpenAI document seen by Reuters in May. Reuters could not determine the exact date of the document, which details how OpenAI intends to use “Strawberry” to perform research. People familiar with the matter described the plan as a work in progress. Reuters could not determine how close “Strawberry” is to a public release.
Even within OpenAI, how Strawberry works is a closely guarded secret, people familiar with the matter said.
The document describes a project that uses “Strawberry” models with the aim of enabling the company’s AI not only to generate answers to queries, but also to plan ahead well enough to browse the internet autonomously and reliably and conduct what OpenAI calls “deep research.”
According to interviews with more than a dozen AI researchers, this is a goal that current AI models have not yet achieved.
Asked about Strawberry and the details in this report, an OpenAI spokesperson said: “We want our AI models to see and understand the world as we do. Continuous research into new AI capabilities is common practice in the industry, and there is general agreement that the reasoning ability of these systems will improve over time.”
The spokesperson did not respond directly to questions about “Strawberry.”
The “Strawberry” project was originally called Q*, and Reuters reported last year that Q* had been seen as a breakthrough within the company.
Two sources described watching a demonstration earlier this year of what OpenAI employees called Q*, which was able to answer tough scientific and mathematical questions that today’s commercially available models can’t solve.
According to Bloomberg, at an internal all-staff meeting on Tuesday, OpenAI showed a demonstration of a research project that it claims has new human-like reasoning capabilities. An OpenAI spokesperson confirmed the meeting but declined to provide details of its content. Reuters could not determine whether the project on display was “Strawberry.”
Researchers interviewed by Reuters said reasoning is key for AI to achieve human or superhuman intelligence.
While large language models have been able to summarize dense text and write elegant essays faster than any human, the technology often falls short on common sense problems whose solutions seem intuitive to humans, such as identifying logical fallacies and playing tic-tac-toe. When the model encounters these problems, it often “hallucinates” false information.
AI researchers interviewed by Reuters generally agreed that in the context of AI, reasoning involves forming a model that enables the AI to plan ahead, reflect how the physical world works and reliably solve complex, multi-step problems.
Improving the reasoning ability of AI models is seen as key to unlocking everything the models can do, from making major scientific discoveries to planning and building new software applications.
OpenAI CEO Sam Altman said earlier this year that in AI, “the most important area of progress will be the ability to reason.”
Companies like Google, Meta, and Microsoft are also trying different techniques to improve the reasoning capabilities of AI models, as are most academic labs doing AI research. However, researchers are divided over whether large language models (LLMs) can incorporate ideas and long-term planning into their predictions. For example, Yann LeCun, one of the pioneers of modern AI who works at Meta, often says that LLMs do not have human-like reasoning capabilities.
AI Challenges
People familiar with the matter said Strawberry is a key component of OpenAI’s efforts to overcome these challenges. The documents seen by Reuters describe what Strawberry aims to achieve but do not explain how it will be achieved.
According to one source, Strawberry includes a specialized way of “post-training” OpenAI’s generative AI models, that is, adapting a base model to improve its performance in specific ways after it has already been “trained” on a large amount of general data.
The post-training phase of developing a model involves methods such as “fine-tuning,” a process used on nearly all language models in various forms, such as having humans provide feedback on the model’s responses and feeding the model examples of good and bad answers.
Strawberry has similarities to an approach developed at Stanford University in 2022 called Self-Taught Reasoner, or STaR, according to a person familiar with the matter. STaR enables AI models to “bootstrap” to higher levels of intelligence by iteratively creating their own training data, and could theoretically be used to enable language models to surpass human-level intelligence, one of its creators, Stanford professor Noah Goodman, told Reuters.
“I think it’s both exciting and scary… If things continue to go in this direction, we as humans have some serious questions to think about,” said Goodman, who has no connection to OpenAI and no knowledge of Strawberry.
One of the capabilities OpenAI is targeting with “Strawberry” is performing long-horizon tasks (LHT), the document says, meaning complex tasks that require the model to plan ahead and perform a series of actions over an extended period of time, the first source explained.
To that end, OpenAI is creating, training and evaluating models on what the company calls a “deep research” dataset, according to internal OpenAI documents. Reuters was unable to determine what the dataset contains or how long an extended period would be.
OpenAI specifically wants its models to use those capabilities to conduct research by autonomously browsing the web with the help of “CUAs,” or computer-using agents, which can act on their findings, according to the documents and one of the sources. OpenAI also plans to test its models’ ability to do the work of software and machine learning engineers.
Reporting by Anna Tong in San Francisco and Katie Paul in New York; Editing by Ken Li and Claudia Parsons
Original article: https://www.reuters.com/technology/artificial-intelligence/openai-working-new-reasoning-technology-under-code-name-strawberry-2024-07-12/