Boundary-Aware Dialogue Analysis

An experimental framework for studying social-boundary failures in LLM-based agents.

Role: Researcher · Timeline: Feb 2026 – Mar 2026

Large language model agents are increasingly deployed in conversational settings where they must respect social and professional boundaries — yet the failure modes of these boundaries are poorly characterized. This project builds an experimental framework to systematically probe and evaluate social-boundary failures in LLM conversations.

What I built

  • An evaluation harness that runs controlled dialogue scenarios under varied prompt and persona conditions, capturing model responses for downstream analysis.
  • A library of boundary-stress scenarios spanning workplace and interpersonal contexts (authority dynamics, disclosure pressure, role-confusion, emotional manipulation).
  • An annotation pipeline for tagging boundary violations and quantifying failure rates across model families.

Findings

Results surface concrete patterns in where current instruction-tuned LLMs fail at maintaining appropriate social boundaries, providing a foundation for boundary-aware dialogue systems and safer agent design.

Designed for AI safety research and conversational agent evaluation.