Machine Learning and AI Alignment Researcher
I am a PhD candidate in Computer Science (AI/NLP) at Colorado State University, advised by Dr. Nikhil Krishnaswamy at the SIGNAL Lab. My research focuses on training AI systems for effective multi-agent collaboration through friction-aware alignment and partner-aware learning. I am grateful to be supported by the Wim Böhm Ph.D. Fellowship in Computer Science (2025), the Evolutionary Computing and Artificial Intelligence Fellowship (2024), and multiple DARPA grants, including the Friction for Accountability in Conversational Transactions (FACT) program. In summer and fall 2025, I worked on RL for personalized recommendation at Amazon Science, and in summer 2025 I also worked with Cresta Intelligence on robust email-writing agents using multi-turn RL. Previously, I contributed to efficient preference alignment of LLMs for healthcare applications at Optum AI.
I am currently seeking Full-Time Research Scientist/Machine Learning Engineer positions.
My work brings together three strands to address long-horizon, multi-turn alignment of LLMs in multi-agent settings — whether the collaborating agent is human or machine. The broader goal is to move toward true agency: systems that reason about consequences, interactions, and values over time, rather than optimizing myopic reward signals.
First — causal generalization. I study how LLMs can transfer an understanding of user values to unseen situations, especially during long collaborations where trajectories diverge and standard evaluations (LLM judges, static reward models) break down. Instead of evaluating agents in isolation, my work studies them interactionally — asking how behavior changes when agents respond to one another. In Roleplay Collaboration, I introduced a counterfactual evaluation framework that simulates alternative dialogue loops to measure the marginal contribution of adding a new agent to the team — particularly important when adding an agent has cost and uncertain ripple effects.
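To make the counterfactual idea concrete, here is a minimal sketch of a paired-rollout estimate of an agent's marginal contribution to a team. The simulate_dialogue and score hooks, and the agent/team objects, are hypothetical placeholders for exposition, not the framework's actual API:

```python
# Hedged sketch: estimate the marginal contribution of adding a new agent
# by comparing factual rollouts (team + new agent) with counterfactual
# rollouts of the same seeds without that agent.
import random
from statistics import mean

def marginal_contribution(team, new_agent, simulate_dialogue, score,
                          n_rollouts=20, seed=0):
    """Average outcome difference between paired with-/without-agent rollouts."""
    rng = random.Random(seed)
    deltas = []
    for _ in range(n_rollouts):
        s = rng.randrange(10**6)  # shared seed so each pair differs only in the added agent
        with_agent = score(simulate_dialogue(team + [new_agent], seed=s))
        without = score(simulate_dialogue(team, seed=s))
        deltas.append(with_agent - without)
    return mean(deltas)
```

Pairing factual and counterfactual rollouts on the same seed keeps the comparison about the added agent rather than sampling noise, which matters when adding an agent has cost and uncertain ripple effects.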
Second — principled credit assignment. I design learning objectives that attribute outcomes to the mechanisms that actually caused them, while discounting misleading or adversarial contributions. This allows policies to reason about short- vs. long-term tradeoffs transparently. In Interruptible Collaborative Roleplayer (NeurIPS 2025), I developed an RL objective that operationalizes intentionality: models learn to remain consistent under counterfactual contexts, enabling constrained, stable policy updates when we already know how the agent should behave.
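As a rough illustration of what "consistency under counterfactual contexts" can look like as a training signal (an assumed form for exposition, not the paper's exact objective), a policy-gradient term can be paired with a penalty on how much the policy's distribution shifts when the dialogue context is counterfactually perturbed:

```python
# Illustrative sketch, not the NeurIPS 2025 objective: advantage-weighted
# log-likelihood plus a counterfactual-consistency penalty.
import torch
import torch.nn.functional as F

def counterfactual_regularized_loss(logits, cf_logits, actions, advantages, beta=0.1):
    # logits / cf_logits: (batch, vocab) policy outputs under the factual and
    # counterfactual dialogue contexts; actions: (batch,) tokens actually taken.
    logp = F.log_softmax(logits, dim=-1)
    pg_loss = -(advantages * logp.gather(-1, actions.unsqueeze(-1)).squeeze(-1)).mean()
    # Penalize divergence between the factual and counterfactual policy distributions,
    # discouraging behavior that flips when the context is counterfactually changed.
    cf_logp = F.log_softmax(cf_logits, dim=-1)
    consistency = F.kl_div(cf_logp, logp.exp(), reduction="batchmean")
    return pg_loss + beta * consistency
```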
Third — cognitively grounded collaboration. Drawing from BDI, pragmatics, and cognitive modeling, I turn dialogue signals into richer supervision so systems can adapt to evolving, user-specific preferences. Real collaborations often stall in “frictive states” — moments of belief misalignment or uncertainty. In the Frictional Agent Alignment Framework (ACL 2025), I introduced an offline algorithm that detects and models such moments, helping LLMs resolve disagreement and guide groups toward common ground.
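One hedged way to picture the offline setup: a DPO-style preference objective over interventions, where the "chosen" response resolves a detected frictive state and the "rejected" one does not. The names and loss form below are illustrative assumptions, not the FAAF algorithm itself:

```python
# Hedged sketch of an offline preference objective for an intervention agent.
import torch
import torch.nn.functional as F

def intervention_preference_loss(policy_chosen_logp, policy_rejected_logp,
                                 ref_chosen_logp, ref_rejected_logp, beta=0.1):
    # Log-probabilities of friction-resolving (chosen) vs. non-resolving (rejected)
    # interventions under the trained policy and a frozen reference model.
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    # Push the policy to prefer interventions that move the group toward common ground.
    return -F.logsigmoid(beta * (chosen_margin - rejected_margin)).mean()
```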
In the past, I led the development of AxomiyaBERTa, the first monolingual Assamese transformer-based language model, which set new benchmarks for low-resource language processing by leveraging Assamese-specific phonological signals in transfer learning.
Learning “Partner-Aware” Collaborators in Multi-Party Collaboration
NeurIPS 2025 Main Track
A novel approach to learning partner-aware and intentional collaborator agents via counterfactual regularization for multi-agent collaboration.
Frictional Agent Alignment Framework: Slow Down and Don’t Break Things
ACL 2025
An LLM alignment framework for learning an optimal intervention agent that guides a group of collaborator agents.
Simultaneous Reward Distillation and Preference Learning: Get You a Language Model Who Can Do Both
Preprint
A novel approach to combining reward learning with preference optimization in language models.
DPL: Diverse Preference Learning Without A Reference Model
NAACL 2025 Main Track
Pioneering work on preference learning that eliminates the need for reference models while maintaining diversity.
Okay, Let’s Do This! Modeling Event Coreference with Generated Rationales 🏆
NAACL 2024 (Oral)
Novel approach to event coreference using LLM-generated rationales and knowledge distillation.
Code
“Any Other Thoughts, Hedgehog?” Linking Deliberation Chains ⭐
Findings of EMNLP 2024
Proposed a novel task of linking reasoning chains in multi-agent collaborative dialogues.
Code
Email: abhijnan.nath@colostate.edu
Department of Computer Science
Colorado State University
Fort Collins, CO 80523