
# Who Teaches the Machines What to Know?

> A book, a paper, and a conversation that's been living in my head.

[Read on Substack](https://lawwhatsnext.substack.com/p/who-teaches-the-machines-what-to) · 2026-04-12 · Law What's Next

---

Hey friends 👋

I’ve fallen down a rabbit hole (but help is on its way!).

In early 2025 I picked up Brian Christian’s The Alignment Problem, a book about what happens when we try to encode human values into machine learning systems. It was published in 2020, which in AI terms makes it practically ancient. But it’s one of those books that I don’t think loses relevance. I even wrote about my first impressions here (last year).

Christian’s central provocation is deceptively simple: when we try to tell machines what we want, we discover we can’t clearly articulate what we want ourselves. The machine becomes a mirror, and the reflection isn’t always flattering.

He tells the story of the COMPAS algorithm, a tool used across US courts to predict whether criminal defendants would reoffend. It was sold as objective. Data-driven. Neutral. Except it wasn’t. It appeared to assess Black and White defendants differently, and the biases weren’t bugs in the code; they were features of the historical data the system had been trained on. The data was never neutral. It was a record of the past, including its injustices.

What struck me (and this is probably the curious psychology graduate in me procrastinating) was Christian’s exploration of how reinforcement learning mirrors what we know about human motivation and reward. His argument that pure optimisation without curiosity and exploration leads to brittle, narrow outcomes felt less like a technical observation and more like a description of... well, a lot of professional environments I’ve worked in.

I put the book down thinking that this is the foundational question underneath everything I’ve been thinking about: our agentic future, privacy, data governance, state surveillance, all of it.
The alignment problem isn’t a niche AI safety concern. It’s the question of whether the systems we’re building actually reflect what we care about. And whether we even know what that is.

Then, a few weeks ago, a paper landed that felt to me like it picked up exactly where Christian’s book had left me, and ran a few steps further.

“Architecting Trust in Artificial Epistemic Agents” was published in March 2026 by a team at Google DeepMind and Google Research. It’s a mouthful of a title, but the core argument is this: AI systems are no longer just tools we use to find information. They are becoming active participants in how knowledge is created, curated, and shared. The authors call them “epistemic agents”, entities that autonomously pursue knowledge goals and shape our shared information environment.

If that sounds abstract, it isn’t. Think about how you used AI last week. Did you ask it to summarise something? Research a question? Draft advice? Every time we do that, we’re delegating a piece of our judgment to a system whose reasoning we can’t fully see. The paper argues that this delegation, multiplied across millions of users and compounded by AI agents talking to other AI agents, creates risks we don’t yet have frameworks to manage.

It is of particular relevance to this audience. We are, after all, major beneficiaries of the “knowledge economy” that is presently being reengineered and disrupted.

The paper validates several of Christian’s original concerns (bias in training data, the opacity of model reasoning, the gap between intention and outcome) but it goes significantly further. Three ideas in particular have been occupying my thinking:

Cognitive deskilling. The worry that the more AI thinks for us, the worse we get at thinking for ourselves. Not just individually, but generationally.
If junior lawyers, analysts, and compliance professionals are offloading cognitive work to AI from day one, what happens to the pipeline of professional expertise in ten years? It is something that many across the profession are beginning to consider. We even hosted Lucie Allen (the soon-to-be CEO of BARBRI, the legal education provider) to discuss it on a podcast last year.

Epistemic drift. A scenario where AI agents generate information, other AI agents ingest and amplify it, errors compound recursively, and no human is in the loop to catch it. The paper describes this as a contamination of our “epistemic commons”, the shared sources of ground truth that both humans and machines rely on. When the information environment itself becomes unreliable, what does “verification” even mean?

The content swamp. AI-generated content doesn’t arrive into a neutral information environment. It enters a knowledge ecosystem that is already deeply polarised, algorithmically sorted, and increasingly hostile to nuance. The internet, once celebrated as a democratising force, has since been weaponised for censorship, state-sponsored disinformation, and engagement-driven radicalisation.

I’m telling you all this because last week Alex and I enjoyed a conversation with one of the paper’s authors.

Bilva Chandra is a researcher who has spent a significant part of her career working on AI safety, ethics, and governance, on both the frontier lab and the public interest/policy sides, at OpenAI, RAND, the US AI Safety Institute (now CAISI), and most recently DeepMind. She’s also a Fellow at the AI and Democracy Foundation. In short, she’s been thinking about these problems at the frontier, across institutions, for most of her career.

Alex and I sat down with Bilva for a conversation on Law://WhatsNext to explore what this paper means for people like us: lawyers, privacy professionals, compliance specialists, and anyone advising organisations on how to govern AI responsibly.
Our conversation ranged widely, from the practical challenge of establishing trust in AI systems at enterprise scale to the deeper worry about what happens to professional judgment when cognitive work is increasingly offloaded to machines.

Bilva described the current “slop economy” (a spectrum running from harmless brain rot to sophisticated state-sponsored disinformation) and the sobering reality that technical solutions like watermarking and content provenance standards, while promising, still cover only a sliver of the ecosystem. We explored the paper’s concept of “knowledge sanctuaries” (protected, verifiable sources of ground truth) and whether that idea can survive contact with a society that increasingly struggles to agree on what counts as credible.

Perhaps most provocatively, we asked whether frontier AI labs risk repeating social media’s mistakes: building engagement-driven products that silo users rather than challenge them. Bilva’s answer was characteristically direct: it all comes down to the business model.

We will release the conversation through the podcast tomorrow at 12:00 GMT. If you’re not already subscribed, now’s the time 👀

Our conversation with Bilva will be available here, on Spotify, Apple Podcasts, or wherever you enjoy your podcasts.

The conversation only affirmed my belief that the alignment problem isn’t just a Silicon Valley concern anymore. It’s a governance question, a professional responsibility question, and increasingly a question about what kind of knowledge ecosystem we want to inhabit.

Christian’s book made me care about this. The conversation with Bilva made me realise the window for shaping the answer is narrower than I thought.

Best,
Tom

Thanks for reading Law://WhatsNext! This post is public so feel free to share it.
