Ahmedabad AI Alignment Group

Reading list

The papers, courses and books we keep coming back to in sessions. You don't need to read everything — pick the section that matches where you are.

Orientation

If you're new to all of this, start here. No technical background needed.

Free, structured courses on alignment and governance. The single best on-ramp; several of us went through them ourselves.

A frontier lab's own account of when and why AI could become dangerous, and what they think can be done about it.

One sentence, signed by many of the field's leading researchers and lab heads. Useful as a marker of how mainstream the concern has become.

AISafety.com

Field map

A living map of the AI safety ecosystem — organisations, courses, funders, communities. Good for finding your next step.

Technical alignment

The core research problem: how do you make capable AI systems reliably do what their designers and users intend?

The paper that translated vague worries into concrete research problems — reward hacking, safe exploration, robustness to distribution shift. Still a useful vocabulary.

Transformer Circuits

Anthropic, ongoing

Mechanistic interpretability — actually opening up neural networks to see what's computed inside. Start with "Toy Models of Superposition."

Where the field argues with itself in public. Uneven, but the best place to watch ideas get stress-tested.

Risk & strategy

The bigger-picture arguments — for and against treating advanced AI as a potential catastrophe.

The most careful version of the existential-risk (x-risk) argument, built as an explicit chain of premises with probabilities attached, so you can see exactly where you'd disagree.

A taxonomy of how things could go badly — malicious use, racing dynamics, organisational failure, rogue systems — beyond just the classic misalignment story.

AI 2027

Kokotajlo et al., 2025

A concrete, contested scenario forecast of the next few years. Read it less as prediction and more as a way to make your own expectations explicit.

Governance & India

Policy levers, international coordination, and the view from here.

GovAI research

Centre for the Governance of AI

The central academic hub for AI governance — compute governance, frontier-lab policy, international agreements.

IndiaAI

Govt. of India

The national AI programme — useful for understanding where Indian state capacity and attention currently sit.

Books

For longer-form treatment. Links go to overviews; copies circulate within the group.

The Alignment Problem

Brian Christian, 2020

The best narrative introduction — how alignment research actually emerged, told through the people doing it.

Human Compatible

Stuart Russell, 2019

A co-author of the standard AI textbook argues that the field's standard objective is the problem, and proposes a redesign.

Superintelligence

Nick Bostrom, 2014

The book that put x-risk on the map. Dated in places, but many of the core concepts still frame the debate.

Going deeper

If the reading has you wanting to contribute, these are the doors people in our group have actually used or recommend.

ARENA

Technical bootcamp

An intensive technical curriculum (interpretability, RL, evals) — also excellent for self-study; the materials are open.

Apart Research

Hackathons & sprints

Regular global research sprints and hackathons — a low-cost way to try doing safety research before committing to it.

The most complete listing of safety-relevant roles across research, engineering, policy and operations.