Machine Learning and AI Alignment Researcher
I recently completed my PhD in Computer Science (AI/ML) at Colorado State University, advised by Dr. Nikhil Krishnaswamy at the SIGNAL Lab. I am currently an Applied ML Research Scientist at Adobe on the Brand Intelligence team. My PhD research, supported by DARPA, NSF, the U.S. Army Research Office, and ARPA-H, has appeared at NeurIPS, ACL, EMNLP, NAACL, and AAMAS. My dissertation work on friction-aware alignment was highlighted in the Microsoft New Future of Work Report 2025 as a key direction for collaborative AI systems.
My research focuses on reinforcement learning for aligning language models to social intelligence and human preferences in multi-turn, multi-agent settings. I work across three themes: causal generalization — how agents transfer values to unseen situations and evaluate each other interactionally (Collaborate, Deliberate, Evaluate: How LLM Alignment Affects Coordinated Multi-Agent Outcomes, AAMAS 2026); principled credit assignment — learning objectives that attribute outcomes to the mechanisms that caused them, from token-level coalitional credit for search and recommendation (OSPO, preprint) to turn-level epistemic credit for multi-turn information-seeking agents (Epistemic Decision Processes) and counterfactual regularization for partner-aware agents (ICR, NeurIPS 2025); and grounded collaboration — turning dialogue signals into supervision for adapting to evolving preferences (FAAF, ACL 2025), with CRAFT as a grounded multi-agent coordination benchmark for evaluating pragmatic communication under partial information. As part of the NSF iSAT program, I led the initial training and evaluation of collaborative AI agents with the goal of delivering real-time learning support in K-12 classrooms across the US. I also led AxomiyaBERTa, the first monolingual Assamese transformer model for low-resource language processing.
CRAFT: Grounded Multi-Agent Coordination Under Partial Information
Preprint
A multi-agent coordination benchmark evaluating pragmatic communication under partial observability, revealing failure modes in frontier LLMs.
Owen-Shapley Policy Optimization for Generative Search LLMs
Preprint
A principled RL framework that redistributes sequence-level rewards via coalitional credit assignment for personalized search and recommendation.
Collaborate, Deliberate, Evaluate: How LLM Alignment Affects Coordinated Multi-Agent Outcomes
AAMAS 2026
A counterfactual evaluation framework for measuring the marginal contribution of adding an agent to a collaborative team.
Learning “Partner-Aware” Collaborators in Multi-Party Collaboration
NeurIPS 2025 Main Track
A novel approach to learning partner-aware and intentional collaborator agents via counterfactual regularization for multi-agent collaboration.
Frictional Agent Alignment Framework: Slow Down and Don’t Break Things
ACL 2025
An LLM alignment framework for learning an optimal intervention agent that guides a group of collaborator agents.
Simultaneous Reward Distillation and Preference Learning: Get You a Language Model Who Can Do Both
Preprint
A novel approach to combining reward learning with preference optimization in language models.
DPL: Diverse Preference Learning Without A Reference Model
NAACL 2025 (Main)
A preference learning method that eliminates the need for a reference model while maintaining diversity.
Okay, Let’s Do This! Modeling Event Coreference with Generated Rationales 🏆
NAACL 2024 (Oral)
A novel approach to event coreference using LLM-generated rationales and knowledge distillation.
Code
“Any Other Thoughts, Hedgehog?” Linking Deliberation Chains ⭐
Findings of EMNLP 2024
A novel task of linking reasoning chains in multi-agent collaborative dialogues.
Code
Applied ML Research Scientist, Adobe Research Email: abhijnann@adobe.com | abhijnan.nath@colostate.edu