Today, the 1st of February 2026, I attended the meet-and-greet session for the fellowship conducted by AI Safety Collab. This post is part of a small but intentional experiment: to keep a public, honest record of what I am learning, thinking about, and struggling with as I move through this programme.
The opening session was, on the surface, fairly straightforward. We were introduced to how the fellowship is structured, where to access resources, and whom to reach out to for academic, content-related, or logistical support. But beneath that administrative clarity, something more important was already taking shape: a sense of community.
A Community-Oriented Start
What stood out to me immediately was the tone. The group felt enthusiastic, friendly, and genuinely supportive. There was no competitive undercurrent—just people from very different backgrounds, at different stages of their lives, showing up with curiosity and care.
We were split into smaller discussion groups and asked questions that were reflective rather than purely technical:
– How do we experience large language models in practice?
– What assumptions about AI do we think are misleading or outright false?
– How do we usually learn new things?
– What questions about AI safety keep us awake?
– What do we do outside of work?
– What has surprised us most about AI safety so far?
– What can we offer this community, and what are we hoping to receive?
These questions mattered. They quietly signalled that this fellowship is not only about knowledge acquisition, but also about how we think, how we learn, and how we relate to one another.
Some Threads That Stayed With Me
One discussion around large language models stayed with me. I spoke about how LLMs can sometimes go down a rabbit hole—producing fluent, confident answers while slowly drifting away from the original intent or context of a question. That gap between surface coherence and underlying alignment feels central to many AI safety concerns.
Another moment that struck me was a discussion about language. A fellow participant pointed out that safety mechanisms often weaken when models are prompted in lesser-known or low-resource languages. That observation felt quietly profound. Safety, it seems, is not only about scale and power, but also about inclusion, linguistic diversity, and whose contexts are taken seriously by default.
When the conversation moved to what surprised us about AI safety, I found myself referring to Dario Amodei's essay "Machines of Loving Grace". What surprised me was not a single technical insight, but the broader realisation that conversations about AI safety often end up being conversations about humanity: intelligence, care, control, responsibility, and our tendency to outsource moral thinking to systems that cannot carry it.
Why I’m Here, and What I’m Hoping For
When asked what I could offer the community, my answer was simple and honest: writing, reflection, accountability, and presence. I listen well. I think in sentences. I notice patterns. And I am at a point where I am actively pivoting into AI Safety and Governance, not pretending to have everything figured out.
What I hope to get in return is equally simple: a thinking community. A space where uncertainty is allowed, where questions are not rushed into answers, and where learning is collective rather than solitary.
Looking Ahead
This was only the first session, but it already set the emotional and intellectual tone for what’s to come. I am entering this fellowship not as an expert, but as someone who cares deeply about how powerful technologies intersect with human values, governance, literature, language, and everyday life.
I plan to continue documenting this journey here—not as polished conclusions, but as evolving thoughts. If AI safety is ultimately about alignment, then perhaps this kind of slow, reflective alignment with one’s own thinking is also part of the work.
More soon.
Have a lovely day!
Trishna
