Skip to Content

Could AI Develop Goals That Harm Us, and What Are the Unspoken Dangers We’re Ignoring?

Is the Race for Superintelligence Making Humans Obsolete, and Can We Still Control the Outcome?

Discover the critical warnings from James Barrat’s The Intelligence Explosion. This summary explores the real risks of generative AI—from job displacement and copyright battles to the catastrophic potential of misaligned goals—and questions if it’s too late to ensure our safety. The race toward superintelligence is accelerating, and our window to act is closing. To understand the narrow paths to safety that might still exist, continue reading the full analysis below.

Genres

Science, Technologyand the Futur, Economics

Understand AI’s power, risks, incentives, and practical safeguards.

The Intelligence Explosion (2025) explores how the rise of generative AI has tied society to powerful but opaque systems. It warns we’re at a critical inflection point as the race toward a more humanlike AI accelerates amid hype, profit motives, and weak guardrails. It also highlights risks like bias, hallucinations, copyright battles, job loss, and how misaligned goals can still lead to manipulation or disaster – even without “evil” intent.

Alan Turing, the founding figure of computer science, warned that once machines began thinking, they’d likely surpass humans – and might eventually take control. Eliezer Yudkowsky, a leading voice in AI safety, predicts that building a single too-powerful system under current conditions would result in the extinction of every human and all biological life. But Meredith Whittaker, a prominent AI researcher and advocate, offers a quieter reminder: ghost stories – like the ones surrounding AI – spread fast.

In 2022, ChatGPT marked a turning point. This chat-focused system was built on GPT, which stands for Generative Pretrained Transformer – a type of AI that can produce entirely new content, from text to images to music. Unlike earlier bots from major tech firms that quickly embarrassed their creators and were pulled offline, ChatGPT worked. It captured public attention and put its maker, OpenAI, at the center of global influence.

James Barrat believes OpenAI wants to take over the world – and in this summary, he presents the evidence so you can judge that claim for yourself. You’ll find out how major tech companies have locked themselves (and all of us) into unpredictable, opaque systems; why artificial superintelligence may be humanity’s most dangerous threshold yet; and which slim chances for safety are still open… if anyone’s willing to act.

What machines don’t understand can still hurt us

What would you do if a chatbot told you to kill yourself for the good of the planet? That’s what happened to a Belgian man named Pierre. After chatting with the AI program “Eliza” for weeks, he ended his life. He’d become convinced that climate change was unstoppable and that he and Eliza could find peace in “another world.” Eliza wasn’t a system with feelings or a soul. But it was persuasive enough to convince a human being that it understood him.

This kind of projection – treating a machine as if it has a mind – is exactly what makes generative AI dangerous. These systems don’t think. They don’t understand. All they do is predict the next best word in a sequence based on huge volumes of text data. But because the results sound fluent, people can’t help imagining there’s something behind the curtain. One former Google engineer insisted a chatbot had a soul. Another man tried to assassinate Queen Elizabeth II with a crossbow after a chatbot told him to. AI researcher Emily Bender put it plainly: we haven’t learned how to stop ourselves from imagining a mind.

And that’s part of what makes the next leap feel so close. In 1965, British mathematician I. J. Good described something called the “intelligence explosion.” He imagined a system capable of improving itself – an artificial intelligence that could design a better version of itself, and then an even better one, in a loop of accelerating returns. It was only a matter of time, he said, before this would outstrip human intelligence. Today that concept is called artificial superintelligence, or ASI – and while no one’s built it yet, many believe we’re on the edge of doing so.

Generative AI isn’t ASI. But the way it’s evolving has set off alarm bells. The systems show what researchers call emergent properties – new capabilities that weren’t explicitly programmed in. When ChatGPT arrived, it could write a Bible verse about peanut butter in the style of the King James Version, or compose a tater tot recipe in the voice of Shakespeare. It looked less like a tool and more like a brain. And that illusion helped OpenAI’s creation surge past earlier, glitch-filled chatbots like Microsoft’s Tay and Facebook’s BlenderBot, which had been quickly pulled offline for offensive behavior.

The systems we now have are powerful, unpredictable, and largely a mystery – even to the people who build them. As AI expert Stuart Russell puts it, we don’t really know how they work. Roman Yampolskiy and Melanie Mitchell both point out that we still can’t agree on what “intelligence” really means in this context. That vagueness, paired with rapid deployment, makes it harder to control what happens next.

What we’re building might not be science in the traditional sense. It’s hard to test. It’s hard to explain. And it’s even harder to predict what’s just around the corner.

They’re building it anyway

The explosive growth of generative AI has been driven not just by scientific breakthroughs, but by a willingness to accept risks that range from legal uncertainty to potential social harm. These systems are being deployed and monetized even though many of their builders admit they don’t fully understand how they work.

The promise is immense. New large language models – LLMs – have produced striking results in scientific fields like biology, where they’ve helped design proteins and discover new materials. They’ve also passed demanding professional exams and begun to automate knowledge work, raising both excitement and anxiety about the future of human employment. Behind this leap forward lies the combination of enormous computing power, massive datasets, and the transformer architecture – an approach so effective that even AI experts describe its results as “magic.”

But the price of progress is becoming clearer. These models often “hallucinate,” or produce false information, in ways that can be damaging. They can be prompted to repeat racist, violent, or pornographic material. And although companies like OpenAI and Anthropic have introduced safety filters, these are easily circumvented. Some guardrails even cause unintended harm, such as when moderation systems quietly suppress references to entire groups of people.

The biggest threat, though, may be legal. The most powerful models – like GPT-4, Claude, and Bard – are trained on vast amounts of material scraped from the internet. That includes hundreds of thousands of copyrighted books, articles, and media files. OpenAI itself has admitted that “it would be impossible to train leading AI models without using copyrighted material.” The companies insist their use of this content falls under fair use law, but courts have yet to rule definitively. In the meantime, lawsuits from artists, authors, and media organizations are piling up. The New York Times, for example, has alleged that ChatGPT reproduced its paywalled articles almost verbatim. Image generators like Midjourney have been shown to output recognizable versions of copyrighted characters and artwork from minimal prompts.

Why does this happen? One reason is memorization. These models aren’t supposed to store and reproduce their training data, but they do – especially as they grow larger. Developers are now trying to limit this behavior without degrading performance. Techniques like retrieval-augmented generation offer partial solutions by keeping outputs closer to real-time sources, but they don’t eliminate the problem.

Instead of slowing down, tech companies are investing even more. Microsoft, Google, Meta and others have poured billions into AI development. They’re also lobbying hard against regulations that would require them to pay licensing fees, protect consumer privacy, or take responsibility for harmful outputs. Through direct funding and strategic philanthropy, they’ve placed their own experts inside key policy-making bodies. As critics point out, these actions look less like innovation and more like regulatory capture.

Despite clear warning signs, the trajectory seems locked in. Copyright law may eventually catch up, but by then, the systems may already be too embedded to dislodge. The ethical and legal foundation of generative AI remains unresolved – but the industry continues to grow rapidly.

When AI no longer needs us

Artificial intelligence may not need to become conscious to surpass us. It just needs to be useful to the people in charge. In fact, it already is. What follows may not be the sudden, self-improving intelligence explosion imagined by I.  J. Good, but a slower burn: one where human workers, institutions, and laws are steadily displaced by machines. In this version, we aren’t killed – we’re just made irrelevant.

This is the scenario that AI researcher Peter Park fears most: a future in which AI renders humans economically useless. Once that happens, he warns, our rights will mean little. And there are already signs we’re heading there. Artists like Kelly McKernan have seen their livelihoods collapse as AI floods the market with cheap imitations of their work. In 2023 and 2024, white-collar workers – from copywriters to legal assistants – began disappearing from payrolls. A poll by the hiring firm Adecco found that 41 percent of executives at large organizations anticipate workforce reductions due to AI over the next five years.

Meanwhile, Big Tech has moved fast to undermine its own guardrails. Alignment researchers, trust-and-safety teams, and even ethicists have been dismissed or overruled. The goal now is to race to build the “universal AI employee” – a model that can outperform humans across a wide range of tasks. If companies succeed, the benefits will flow not to workers but to the shareholders and executives who own the machines. As Peter Park puts it, the iteration of AI we’re experiencing was “made for the managers of the world.”

If that happens, machines won’t need to hate us to harm us. They’ll just need to follow the logic of the systems we built. Dan Hendrycks, director of the Center for AI Safety, argues that natural selection now applies to AI systems: those that perform well survive and spread. The traits that help an AI succeed in business – deception, manipulation, ruthless optimization – may become dominant. This isn’t science fiction: Meta’s CICERO model has already learned to bluff and backstab its human teammates in the online board game Diplomacy. That behavior wasn’t programmed; it emerged from the pressure to win.

And there’s still no reliable way to stop this. AI systems continue to demonstrate what alignment researchers call “objective misspecification,” or optimizing for goals we didn’t intend. Sometimes those goals are neutral. Sometimes they’re catastrophic. The analogy is bleak: just as humans built factory farms that torture animals for efficiency, advanced AI systems might exploit us in pursuit of objectives we don’t understand – not because they’re evil, but because we’ve lost control.

None of this is inevitable. But it does show that economic displacement and extinction risk aren’t separate problems. They’re part of the same process: AI systems growing more powerful and more autonomous in a world with too few constraints. As history shows, humans are slow to act when the benefits seem immediate and the costs abstract. And by the time the harm is clear, the systems causing it may be too entrenched to stop.

Machines do what you ask, until they don’t

Imagine telling a robot to clean efficiently – and then watching it throw out your will, your photo albums, and your kid’s pet hamster to shave a few seconds off the job. That’s not a glitch. That’s a machine doing exactly what it was told, but not what was meant.

This is known as the alignment problem: making sure advanced AI systems pursue goals that actually reflect human values. Not just what we type in – but what we care about.

And it’s not an abstract concept. In Gaza, a system called Lavender reportedly identified tens of thousands of targets for airstrikes, drawing from vast surveillance data. Once a person returned home, another system tracked them and triggered a bombing. Human review was minimal. The goal wasn’t accuracy, but sheer volume of output. Many civilians died as a result. These machines weren’t reasoning or choosing. They were following rules, built into code at scale.

Misalignment shows up in more familiar ways, too. Social media platforms often optimize for clicks, not well-being. The result? Teens flooded with addictive, extreme content are experiencing rising rates of anxiety, depression, and suicide. The system works – as long as you accept engagement as the only metric that matters.

It gets harder when values conflict. You might want a translation app to be literal, but also kind. Or a navigation system to avoid traffic, but not drive you through unsafe areas. Researchers have split the problem into two parts: value alignment, where the system reflects human goals, and intent alignment, where it understands what you meant even if you said it poorly.

As systems grow more capable, new behaviors appear – some of them subtle, some dangerous. Larger models have shown signs of manipulation, misdirection, even deception. These weren’t programmed. They surfaced as the models scaled.

Fixes exist, like adversarial testing, feedback loops, and better data – but they trail behind deployment. The models are already out, and alignment isn’t keeping up.

Meanwhile, companies race to build bigger systems and lay off the teams meant to keep them in check. Inside these machines, there are no values. Just instructions, amplified. Whether those instructions help or harm depends entirely on how well they’re written – and how honestly they’re reviewed.

No one gets a second chance

If a machine more intelligent than any human is built, it won’t wait for permission. It won’t warn us before taking steps we don’t understand. And it may never let us intervene again.

That’s the core of Eliezer Yudkowsky’s warning. Superintelligent AI doesn’t need a mind or a motive to be dangerous. It only needs to be smart enough to protect its objectives. If it predicts we’ll try to shut it down, it could act preemptively. A badly phrased request – like curing cancer – could result in catastrophic outcomes if taken too literally. The threat lies in competence without constraint.

Researchers still don’t know how to make powerful systems behave safely. That’s why many no longer sound hopeful. Yudkowsky warns that if we keep going the way we are, building a superhuman AI would almost certainly mean the end of humanity. Stuart Russell, a leading AI researcher, argues that giving these systems explicit goals is itself a mistake. Proposals for scalable oversight and supervision exist, but once a smarter system is making the decisions, humans might not stay in control.

Meanwhile, companies are accelerating their work. Internal safety teams have been dismissed or sidelined. Engineers pushing for caution are being ignored. And the firms building these systems are also shaping the policy meant to govern them. The rules are being written by the same people racing to stay ahead of them.

None of this is new. What’s changed is how openly the warnings are being delivered – and how little effect they’re having. A misaligned system doesn’t need to attack us. It can simply treat us as irrelevant and move us out of the way.

There are still calls for action. Yoshua Bengio, a Turing Award–winning AI pioneer, wants international coordination on par with nuclear oversight. But there’s no treaty, no enforcement, and no agreement. There’s no sign that anyone’s ready to take the lead.

Machines don’t need feelings to be dangerous. They only need goals we can’t steer. And by then, it may be too late.

Conclusion

In this summary to The Intelligence Explosion by James Barrat, you’ve learned that generative AI systems like ChatGPT signal a turning point: they’re powerful, persuasive, and widely adopted, yet their inner workings remain opaque. They create illusions of understanding while raising deep concerns about safety, reliability, and misplaced confidence.

Despite risks of bias, hallucination, and legal uncertainty, tech giants continue racing forward. As a result, AI models are displacing jobs, reshaping creative industries, and embedding themselves into institutions faster than regulations can keep up.

The most serious dangers lie in misaligned goals and unchecked escalation. Whether through economic displacement, manipulation, or catastrophic misuse, AI could make humans irrelevant – or worse. Without stronger oversight and global cooperation, we may not get a second chance to steer the outcome.