Author Labeling
Double-Blind Review
ACL 2026 Submission

Authors Should Label Their Own Documents

We introduce author labeling, an annotation methodology where authors label their data at the moment of creation. This page provides an interactive demonstration of the system and tools for exploring our collected dataset.

Summary

Third-party annotation is the prevailing method for labeling text, but egocentric information such as sentiment and belief can only be approximated by external annotators, who lack access to the author's internal state. We deploy an author labeling system in a live chatbot that monitors user messages, generates contextual annotation questions, and records author responses in real time.

Dataset: ~900K messages from 20K+ users
Downstream Task: product recommendation CTR
Comparison: vs. MTurk, expert, and LLM annotators

Author Labeling Demo

The system supports three modes for generating annotation questions. The example below walks through how one mode works in practice.

1. Select a topic: in this example, the topic is triggered by health-related queries.

2. Example message that triggers the topic:

User: "want to switch to a new health insurance provider to cover dental"

3. Resulting author labeling interaction:

Chatbot: "How important is dental coverage in your decision?"
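The topic-trigger step above can be sketched as a simple pattern matcher. This is an illustrative assumption: the topic name and trigger patterns below are invented for demonstration, and the deployed system uses an LLM rather than regular expressions.

```python
import re

# Hypothetical topic registry; names and patterns are illustrative
# stand-ins for the system's LLM-based topic detection.
TOPICS = {
    "health_insurance": re.compile(r"\b(health|dental|insurance)\b", re.I),
}

def triggered_topics(message: str) -> list[str]:
    """Return the topics whose trigger patterns match the message."""
    return [name for name, pat in TOPICS.items() if pat.search(message)]

msg = "want to switch to a new health insurance provider to cover dental"
print(triggered_topics(msg))  # ['health_insurance']
```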

Pipeline Overview

1. Monitor

LLM monitors incoming messages for task-relevant content

2. Generate

Contextual annotation question is created based on mode

3. Record

Author response is stored with message context
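The three pipeline stages above can be sketched as a single class. This is a minimal sketch under stated assumptions: the class and method names are hypothetical, the keyword-based monitor and template-based question stand in for the system's LLM calls, and storage is an in-memory list rather than the real backend.

```python
from dataclasses import dataclass

@dataclass
class LabeledMessage:
    """One recorded author label: message context, question, and answer."""
    message: str
    question: str
    answer: str

class AuthorLabelingPipeline:
    def __init__(self, trigger_terms):
        # Keyword list is a stand-in for the LLM monitor.
        self.trigger_terms = trigger_terms
        self.records: list[LabeledMessage] = []

    def monitor(self, message: str) -> bool:
        # 1. Monitor: flag task-relevant messages.
        return any(t in message.lower() for t in self.trigger_terms)

    def generate(self, message: str) -> str:
        # 2. Generate: produce a contextual annotation question.
        # A template stands in for mode-dependent LLM generation.
        return f"How important is this to you: '{message}'?"

    def record(self, message: str, question: str, answer: str) -> None:
        # 3. Record: store the author's response with message context.
        self.records.append(LabeledMessage(message, question, answer))

    def process(self, message: str, ask) -> None:
        # `ask` is a callable that shows the question to the author
        # and returns their answer (here, a stub).
        if self.monitor(message):
            question = self.generate(message)
            self.record(message, question, ask(question))

pipeline = AuthorLabelingPipeline(trigger_terms=["insurance", "dental"])
pipeline.process("want to switch to a new health insurance provider",
                 ask=lambda q: "very important")
print(len(pipeline.records))  # the flagged message is recorded
```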

Data Exploration

Explore the ~900K messages collected through the author labeling system. The Topics and Audiences tools let you browse topic distributions and user demographics in the dataset.
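The kind of browsing the Topics and Audiences tools support can be sketched with a few lines of aggregation. The record schema below (`topic`, `age_group` fields) is an assumption made for illustration, not the dataset's actual format.

```python
from collections import Counter

# Toy records with a hypothetical schema; the real dataset's fields
# and values may differ.
records = [
    {"topic": "health", "age_group": "25-34"},
    {"topic": "health", "age_group": "35-44"},
    {"topic": "finance", "age_group": "25-34"},
]

# Topic distribution and audience demographics, as the tools might compute.
topic_counts = Counter(r["topic"] for r in records)
audience_counts = Counter(r["age_group"] for r in records)

print(dict(topic_counts))  # {'health': 2, 'finance': 1}
```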

Contact information is hidden for double-blind review.


Contact Us

Interested in using author labeling for your research? Get in touch.
