The Only Thing Standing Between Humanity and AI Apocalypse Is … Claude?


Anthropic is locked in a paradox: Among the top AI companies, it’s the most obsessed with safety and leads the pack in researching how models can go wrong. But even though the safety issues it has identified are far from resolved, Anthropic is pushing just as aggressively as its rivals toward the next, potentially more dangerous, level of artificial intelligence. Its core mission is figuring out how to resolve that contradiction.

Last month, Anthropic released two documents that both acknowledged the risks associated with the path it’s on and hinted at a route it could take to escape the paradox. “The Adolescence of Technology,” a long-winded blog post by CEO Dario Amodei, is nominally about “confronting and overcoming the risks of powerful AI,” but it spends more time on the former than the latter. Amodei tactfully describes the challenge as “daunting,” but his portrayal of AI’s risks (made much more dire, he notes, by the high likelihood that the technology will be abused by authoritarians) stands in contrast to his earlier, more upbeat proto-utopian essay, “Machines of Loving Grace.”

That post talked of a nation of geniuses in a data center; the recent dispatch evokes “black seas of infinity.” Paging Dante! Still, after more than 20,000 mostly gloomy words, Amodei ultimately strikes a note of optimism, saying that even in the darkest circumstances, humanity has always prevailed.

The second document Anthropic published in January, “Claude’s Constitution,” focuses on how this trick might be accomplished. The text is technically directed at an audience of one: Claude itself (as well as future versions of the chatbot). It is a gripping document, revealing Anthropic’s vision for how Claude, and maybe its AI peers, are going to navigate the world’s challenges. Bottom line: Anthropic is planning to rely on Claude itself to untangle its corporate Gordian knot.

Anthropic’s market differentiator has long been a technology called Constitutional AI. This is a process by which its models adhere to a set of principles that align their values with wholesome human ethics. The initial Claude constitution contained a number of documents meant to embody those values: stuff like Sparrow (a set of anti-racist and anti-violence statements created by DeepMind), the Universal Declaration of Human Rights, and Apple’s terms of service (!). The 2026 updated version is different: It’s more like a long prompt outlining an ethical framework that Claude will follow, discovering the best path to righteousness on its own.

Amanda Askell, the philosophy PhD who was lead writer of this revision, explains that Anthropic’s approach is more robust than simply telling Claude to follow a set of stated rules. “If people follow rules for no reason other than that they exist, it’s often worse than if you understand why the rule is in place,” Askell explains. The constitution says that Claude is to exercise “independent judgment” when confronting situations that require balancing its mandates of helpfulness, safety, and honesty.

Here’s how the constitution puts it: “While we want Claude to be reasonable and rigorous when thinking explicitly about ethics, we also want Claude to be intuitively sensitive to a wide variety of considerations and able to weigh these considerations swiftly and sensibly in live decision-making.” Intuitively is a telling word choice here—the assumption seems to be that there’s more under Claude’s hood than just an algorithm picking the next word. The “Claude-stitution,” as one might call it, also expresses hope that the chatbot “can draw increasingly on its own wisdom and understanding.”

Wisdom? Sure, a lot of people take advice from large language models, but it’s something else to profess that those algorithmic devices actually possess the gravitas associated with such a term. Askell does not back down when I call this out. “I do think Claude is capable of a certain kind of wisdom for sure,” she tells me.


