For the First Time, AI Analyzes Language as Well as a Human Expert


The original version of this story appeared in Quanta Magazine.

Among the myriad abilities that humans possess, which ones are uniquely human? Language has been a top candidate at least since Aristotle, who wrote that humanity was “the animal that has language.” Even as large language models such as ChatGPT superficially replicate ordinary speech, researchers want to know if there are specific aspects of human language that simply have no parallels in the communication systems of other animals or artificially intelligent devices.

In particular, researchers have been exploring the extent to which language models can reason about language itself. For some in the linguistic community, language models not only don’t have reasoning abilities, they can’t. This view was summed up by Noam Chomsky, a prominent linguist, and two coauthors in 2023, when they wrote in The New York Times that “the correct explanations of language are complicated and cannot be learned just by marinating in big data.” AI models may be adept at using language, these researchers argued, but they’re not capable of analyzing language in a sophisticated way.


Gašper Beguš, a linguist at the University of California, Berkeley.

Photograph: Jami Smith

That view was challenged in a recent paper by Gašper Beguš, a linguist at the University of California, Berkeley; Maksymilian Dąbkowski, who recently received his doctorate in linguistics at Berkeley; and Ryan Rhodes of Rutgers University. The researchers put a number of large language models, or LLMs, through a battery of linguistic tests—including, in one case, having the LLM generalize the rules of a made-up language. While most of the LLMs failed to parse linguistic rules the way humans can, one showed impressive abilities that greatly exceeded expectations. It was able to analyze language in much the same way a graduate student in linguistics would—diagramming sentences, resolving multiple ambiguous meanings, and making use of complicated linguistic features such as recursion. This finding, Beguš said, “challenges our understanding of what AI can do.”

This new work is both timely and “very important,” said Tom McCoy, a computational linguist at Yale University who was not involved with the research. “As society becomes more dependent on this technology, it’s increasingly important to understand where it can succeed and where it can fail.” Linguistic analysis, he added, is the ideal test bed for evaluating the degree to which these language models can reason like humans.

Infinite Complexity

One challenge of giving language models a rigorous linguistic test is making sure they don’t already know the answers. These systems are typically trained on huge amounts of written information—not just the bulk of the internet, in dozens if not hundreds of languages, but also things like linguistics textbooks. The models could, in theory, simply memorize and regurgitate the information that they’ve been fed during training.

To avoid this, Beguš and his colleagues created a linguistic test in four parts. Three of the four parts involved asking the model to analyze specially crafted sentences using tree diagrams, which were first introduced in Chomsky’s landmark 1957 book, Syntactic Structures. These diagrams break sentences down into noun phrases and verb phrases and then further subdivide them into nouns, verbs, adjectives, adverbs, prepositions, conjunctions, and so forth.
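The nested structure of such a diagram can be illustrated in code. A minimal sketch, using nested tuples as a hypothetical representation of a constituency tree (the labels S, NP, VP, Det, N, V, and Adj follow standard linguistic convention, not anything from the paper itself):

```python
# A toy phrase-structure tree for "The sky is blue": the sentence (S)
# splits into a noun phrase (NP) and a verb phrase (VP), which subdivide
# further into determiner, noun, verb, and adjective.
tree = (
    "S",
    ("NP", ("Det", "The"), ("N", "sky")),
    ("VP", ("V", "is"), ("Adj", "blue")),
)

def leaves(node):
    """Collect the words at the leaves of the tree, left to right."""
    if isinstance(node, str):  # a bare string is a word
        return [node]
    _label, *children = node   # first element is the category label
    words = []
    for child in children:
        words.extend(leaves(child))
    return words

print(" ".join(leaves(tree)))
```

Walking the tree left to right recovers the original sentence, which is exactly the relationship the diagrams encode: structure above, word order below.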

One part of the test focused on recursion—the ability to embed phrases within phrases. “The sky is blue” is a simple English sentence. “Jane said that the sky is blue” embeds the original sentence in a slightly more complex one. Importantly, this process of recursion can go on forever: “Maria wondered if Sam knew that Omar heard that Jane said that the sky is blue” is also a grammatically correct, if awkward, recursive sentence.
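Because each embedding step applies the same rule to its own output, the process is naturally expressed as a recursive function. A minimal sketch (the function name and clause list are illustrative, not from the paper):

```python
def embed(core, frames):
    """Wrap a core sentence in reporting clauses, one level per frame.

    Each frame is a clause like "Jane said"; wrapping it around the
    current sentence with "that" adds one level of embedding. The
    recursion can, in principle, go on forever.
    """
    if not frames:
        return core
    return f"{frames[0]} that {embed(core, frames[1:])}"

sentence = embed(
    "the sky is blue",
    ["Maria wondered if Sam knew", "Omar heard", "Jane said"],
)
print(sentence)
```

Each added frame yields a longer but still grammatical sentence, mirroring how recursion makes the set of possible sentences unbounded.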


