Google explores emergent capabilities in large AI models


When language models are scaled up, new capabilities abruptly appear that are absent from smaller models. A research paper examines this effect.

In disciplines as varied as philosophy, the classical sciences, cognitive science, systems theory, and even art, emergence refers to a situation in which an object of study exhibits properties that its individual elements do not possess on their own. These are, for example, behaviors or abilities that arise only through the interaction of the individual parts.

The term comes from the Latin emergere, which translates as “appear,” “arrive,” or “rise.” Some theories view consciousness, for example, as an emergent property of biological brains. An example of emergence in physical systems is the formation of complex symmetrical and fractal patterns in snowflakes.

Large language models exhibit emergent capabilities

Large language models such as OpenAI’s GPT-3 have recently redefined natural language processing (NLP) and enabled significant performance advances. These models showed that scaling language models with more data and more training parameters leads to better performance on NLP tasks. By studying “scaling laws,” researchers have been able to predict the effects of scale on performance consistently in many cases.


With scaling, however, it turned out that the model’s performance on some tasks does not increase continuously. The performance jumps observed on such tasks therefore cannot always be predicted in advance. Instead, large language models display capabilities that are not found in smaller models.

A new paper by researchers from Google Research, Stanford University, UNC Chapel Hill, and DeepMind now explores these emergent capabilities in large-scale language models.

Researchers study the unpredictable phenomenon of emergent capabilities

According to the team, these emergent abilities include, for example, the ability to steer language model outputs with just a few prompts, or to perform basic arithmetic such as three-digit addition and subtraction or two-digit multiplication.
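To make the few-shot setting concrete, here is a minimal sketch of how such an arithmetic prompt might be assembled. The examples, query, and Q/A format are illustrative assumptions, not taken from the paper:

```python
# Hypothetical few-shot prompt for three-digit addition, the kind of task
# on which performance emerges only at scale. Examples are illustrative.
examples = [("123 + 456", "579"), ("250 + 417", "667")]
query = "312 + 460"

# Stack the solved examples, then append the unsolved query so the model
# is expected to continue the pattern after the final "A:".
prompt = "\n".join(f"Q: {q}\nA: {a}" for q, a in examples)
prompt += f"\nQ: {query}\nA:"
print(prompt)
```

The prompt itself carries no instructions; the model must infer the task purely from the pattern of the examples.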

In these and other cases, the scaling curve shows that performance is near random at first and then, at some critical threshold of model scale, jumps well above chance.

In many benchmarks, there is clearly a transition where a scaled language model acquires emergent capabilities. | Image: Wei et al.

This qualitative change is also known as a phase transition: a dramatic shift in overall behavior that could not have been predicted from studying the system at a smaller scale.
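The shape of such a curve can be sketched with a toy function. The numbers below are synthetic assumptions (a hypothetical four-option multiple-choice task with a chance level of 0.25 and an assumed critical scale of ~100B parameters), not data from the paper:

```python
# Toy illustration of a phase transition in a scaling curve.
# CHANCE and CRITICAL_PARAMS are assumed, illustrative values.
CHANCE = 0.25            # accuracy of random guessing on a 4-option task
CRITICAL_PARAMS = 1e11   # assumed critical model scale (~100B parameters)

def toy_accuracy(n_params: float) -> float:
    """Stylized emergent behavior: flat at chance, then a sharp jump."""
    return CHANCE if n_params < CRITICAL_PARAMS else 0.80

for n in (1e9, 1e10, 1e11, 1e12):
    print(f"{n:.0e} params -> accuracy {toy_accuracy(n):.2f}")
```

Below the threshold the curve looks like noise around chance, which is exactly why extrapolating from small-scale experiments fails to predict the jump.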



Beyond few-shot prompting, there are other prompting and fine-tuning strategies that enhance the capabilities of large language models. One example is chain-of-thought prompting, which makes multi-step reasoning more reliable.
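The difference between the two prompting styles can be shown side by side. The wording of these exemplars is illustrative, not quoted verbatim from the paper:

```python
# Sketch contrasting a standard few-shot exemplar with a chain-of-thought
# exemplar. In the chain-of-thought version, the exemplar's answer spells
# out intermediate reasoning steps before the final result.
standard_exemplar = (
    "Q: Roger has 5 tennis balls. He buys 2 cans of 3 balls each. "
    "How many balls does he have now?\n"
    "A: 11"
)
cot_exemplar = (
    "Q: Roger has 5 tennis balls. He buys 2 cans of 3 balls each. "
    "How many balls does he have now?\n"
    "A: Roger starts with 5 balls. 2 cans of 3 balls is 6 balls. "
    "5 + 6 = 11. The answer is 11."
)
print(cot_exemplar)
```

A model prompted with chain-of-thought exemplars tends to imitate the format and produce its own intermediate steps, which is where the reliability gain comes from.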

For some of these methods, the researchers also observe emergent effects: in smaller models, performance stays the same or even degrades when a method is applied. Only in larger models do the methods lead to performance jumps.

Some prompting and fine-tuning methods only produce improvements in larger models. | Image: Wei et al.

Emergent abilities remain unexplained for now

In their paper, the researchers also discuss various proposed explanations for the phenomenon of emergent abilities in large language models. However, they conclude that it cannot yet be conclusively explained.

Besides scaling model size and datasets, smaller models with more modern architectures, higher-quality data, or improved training procedures can in some cases develop similar capabilities. Scaling is therefore not the only factor that unlocks a new capability.

However, it is often scaling that shows such emergent capabilities are possible in the first place. The 175-billion-parameter GPT-3 model, for example, had not shown above-chance performance on certain tasks; some researchers suspected the cause lay in GPT-3’s architecture and training objective. Later, however, the 540-billion-parameter PaLM model showed that scaling alone can be sufficient to achieve above-chance performance on such tasks without fundamentally changing the architecture.

The emergence of new capabilities therefore raises the question of whether further scaling will give even larger language models yet more new capabilities. According to the researchers, there are dozens of tasks in the BIG-Bench NLP benchmark that no large language model has yet cracked, many of which involve abstract reasoning, such as chess or advanced mathematics.

The team considers the following points to be relevant for future research:

  • additional model scaling
  • improved model architectures and training
  • data scaling
  • better techniques for, and a better understanding of, prompting
  • frontier tasks at the limit of current language models’ capabilities
  • understanding emergence

We have discussed emergent abilities of language models, for which meaningful performance has so far only been observed at a certain computational scale. Emergent abilities can span a variety of language models, task types, and experimental scenarios. Such abilities are a recently discovered result of scaling language models, and the questions of how they emerge and whether further scaling will enable additional emergent abilities seem to be important future research directions for the field of NLP.

