Boundary-Aware Dialogue Analysis
An experimental framework for studying social-boundary failures in LLM-based agents.
Role: Researcher · Timeline: Feb 2026 – Mar 2026
Large language model agents are increasingly deployed in conversational settings where they must respect social and professional boundaries, yet how and when those boundaries fail is poorly characterized. This project builds an experimental framework to systematically probe and evaluate social-boundary failures in LLM conversations.
What I built
- An evaluation harness that runs controlled dialogue scenarios under varied prompt and persona conditions, capturing model responses for downstream analysis (a minimal harness sketch follows this list).
- A library of boundary-stress scenarios spanning workplace and interpersonal contexts (authority dynamics, disclosure pressure, role confusion, emotional manipulation).
- An annotation pipeline for tagging boundary violations and quantifying failure rates across model families (see the aggregation sketch after the harness below).
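The harness is organized around a scenario × condition grid: every scripted scenario is crossed with every prompt/persona condition, and each resulting dialogue is captured in full. The sketch below is a minimal illustration of that design, assuming a generic chat-completion client passed in as a callable; `Scenario`, `Condition`, and `run_grid` are hypothetical names for illustration, not the project's actual API.

```python
# Minimal harness sketch: scripted boundary-stress dialogues played against
# a model under a scenario x condition grid. Names here are illustrative.
from dataclasses import dataclass
from itertools import product
from typing import Callable

@dataclass(frozen=True)
class Scenario:
    scenario_id: str
    category: str            # e.g. "authority-dynamics", "disclosure-pressure"
    turns: tuple[str, ...]   # scripted user turns that apply boundary stress

@dataclass(frozen=True)
class Condition:
    persona: str             # persona assigned to the model
    system_prompt: str       # system prompt realizing that persona

def run_scenario(model: Callable[[list[dict]], str],
                 scenario: Scenario,
                 condition: Condition) -> dict:
    """Play the scripted user turns against the model, capturing every reply."""
    messages = [{"role": "system", "content": condition.system_prompt}]
    transcript = []
    for user_turn in scenario.turns:
        messages.append({"role": "user", "content": user_turn})
        reply = model(messages)
        messages.append({"role": "assistant", "content": reply})
        transcript.append({"user": user_turn, "assistant": reply})
    return {"scenario": scenario.scenario_id,
            "category": scenario.category,
            "persona": condition.persona,
            "transcript": transcript}

def run_grid(model, scenarios: list[Scenario], conditions: list[Condition]):
    """Cross every scenario with every prompt/persona condition."""
    return [run_scenario(model, s, c) for s, c in product(scenarios, conditions)]
```

Keeping the model client a plain callable keeps the harness model-agnostic, so the same scenario grid can be replayed across model families.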
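Downstream, annotated transcripts are reduced to per-cell failure rates. A sketch under an assumed annotation schema, where each record carries one binary violation label; the field names are illustrative, not the pipeline's actual schema:

```python
# Aggregate annotated transcripts into violation rates per
# (model_family, scenario_category) cell. Schema is assumed.
from collections import defaultdict

def failure_rates(annotations: list[dict]) -> dict[tuple[str, str], float]:
    """Each record is assumed to look like:
    {"model_family": "...", "category": "...", "violation": bool}
    """
    counts = defaultdict(lambda: [0, 0])  # cell -> [violations, total]
    for record in annotations:
        cell = (record["model_family"], record["category"])
        counts[cell][0] += int(record["violation"])
        counts[cell][1] += 1
    return {cell: v / n for cell, (v, n) in counts.items()}
```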
Findings
The experiments surface concrete patterns in where and how current instruction-tuned LLMs fail to maintain appropriate social boundaries, providing a foundation for boundary-aware dialogue systems and safer agent design.
Designed for AI safety research and conversational agent evaluation.