Authors Should Label Their Own Documents
We introduce author labeling, an annotation methodology where authors label their data at the moment of creation. This page provides an interactive demonstration of the system and tools for exploring our collected dataset.
Summary
Third-party annotation is the prevailing method for labeling text, but egocentric information such as sentiment and belief can only be approximated by external annotators, who lack access to the author's internal state. We deploy an author labeling system in a live chatbot that monitors user messages, generates contextual annotation questions, and records author responses in real time.
Author Labeling Demo
The system supports three modes for generating annotation questions. Select a mode below to see how each works.
1. Select a topic
Topic triggered by health-related queries
2. Example messages that trigger this topic
3. Resulting author labeling interaction
Pipeline Overview
1. An LLM monitors incoming messages for task-relevant content
2. A contextual annotation question is created based on the mode
3. The author's response is stored with the message context
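The three pipeline stages can be illustrated with a minimal Python sketch. It is not the deployed system: the keyword matcher stands in for the LLM monitor, the fixed question template stands in for contextual question generation, and every name here (`TOPIC_KEYWORDS`, `QUESTION_TEMPLATES`, `AuthorLabel`, `AuthorLabelingPipeline`) is hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Assumption: a keyword set per topic stands in for the LLM monitor.
TOPIC_KEYWORDS = {"health": {"headache", "doctor", "symptom", "medication"}}

# Assumption: one fixed template per topic stands in for contextual generation.
QUESTION_TEMPLATES = {"health": "How worried are you about this health issue?"}


@dataclass
class AuthorLabel:
    """One stored record: the author's answer together with its context."""
    message: str
    topic: str
    question: str
    response: str


@dataclass
class AuthorLabelingPipeline:
    records: List[AuthorLabel] = field(default_factory=list)

    def detect_topic(self, message: str) -> Optional[str]:
        """Stage 1: flag task-relevant messages (keyword match, not an LLM)."""
        words = {w.strip(".,!?").lower() for w in message.split()}
        for topic, keywords in TOPIC_KEYWORDS.items():
            if words & keywords:
                return topic
        return None

    def make_question(self, topic: str) -> str:
        """Stage 2: produce the annotation question for the detected topic."""
        return QUESTION_TEMPLATES[topic]

    def store(self, message: str, topic: str, question: str,
              response: str) -> AuthorLabel:
        """Stage 3: keep the author's response with the message context."""
        record = AuthorLabel(message, topic, question, response)
        self.records.append(record)
        return record


pipeline = AuthorLabelingPipeline()
message = "I have had a headache for three days."
topic = pipeline.detect_topic(message)
if topic is not None:
    question = pipeline.make_question(topic)
    # In a live chatbot the response would come from the author.
    pipeline.store(message, topic, question, "Somewhat worried")
```

Messages that match no topic return `None` from `detect_topic` and trigger no question, so in this sketch the author is only asked to label task-relevant content.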
Data Exploration
Explore the ~900K messages collected through the author labeling system. Use the Topics and Audiences tools in the left panel to browse topic distributions and user demographics from the dataset.
Contact Form Hidden
Contact information is hidden for double-blind review.
Contact Us
Interested in using author labeling for your research? Get in touch.