Silent Stigma

Mapping Emotional and Coping Language at Scale

Analyzing public online discourse to uncover aggregate emotional patterns and coping strategies, without labeling individuals.


How It Works

Public Comments: Collected from mental health advocacy channels
Sentence Embeddings: Transformer-based semantic encoding
Unsupervised Clustering: Density-based pattern discovery
Emotional Landscapes: Dimensionality reduction for visualization
Coping Strategies: Pattern extraction and keyword analysis

Ethical Commitments

This platform analyzes only publicly available comments from YouTube. All processing occurs at the aggregate level. No individual-level classification, tracking, or diagnosis is performed.

The system identifies patterns in language use across large volumes of text. It does not attempt to infer personal characteristics, mental health status, or risk factors for any individual commenter.
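As a rough illustration of what aggregate-level output can look like, the sketch below reduces per-comment records to cluster-level counts and shares. The record fields (`cluster`, `author`, `text`) and the function name `aggregate_export` are assumptions made for this example, not the platform's actual schema.

```python
from collections import Counter


def aggregate_export(records):
    """Summarize cluster sizes without copying any per-comment or per-author field."""
    counts = Counter(r["cluster"] for r in records)
    total = sum(counts.values())
    # Only cluster id, size, and share of all comments are emitted.
    return [
        {"cluster": cluster, "n_comments": n, "share": round(n / total, 3)}
        for cluster, n in sorted(counts.items())
    ]


# Hypothetical input records; the author and text fields never reach the output.
example_records = [
    {"cluster": 0, "author": "user_a", "text": "first comment"},
    {"cluster": 0, "author": "user_b", "text": "second comment"},
    {"cluster": 1, "author": "user_c", "text": "third comment"},
]
summary = aggregate_export(example_records)
```

The output rows carry no identifiers or raw text, so anything built on them can only describe groups of comments, not individual commenters.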

All analysis is designed for research and educational purposes. The platform is not intended for clinical use, diagnostic purposes, or individual assessment.

Data collection respects platform terms of service and focuses exclusively on public comments from mental health advocacy and education channels.

Research Orientation

SilentStigma uses unsupervised machine learning to identify naturally emerging patterns in mental health discourse. The pipeline employs transformer-based sentence embeddings (Sentence-BERT) to encode semantic meaning, then applies density-based clustering (HDBSCAN) to group similar expressions without predefined categories.
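The embedding and clustering steps described above might look roughly like the following sketch. It assumes the `sentence-transformers` and `hdbscan` packages; the model name `all-MiniLM-L6-v2`, the `min_cluster_size` value, and both function names are illustrative assumptions, not the project's confirmed settings.

```python
from collections import Counter


def embed_and_cluster(comments, model_name="all-MiniLM-L6-v2", min_cluster_size=15):
    """Encode comments with a Sentence-BERT model, then group them with HDBSCAN."""
    # Heavy dependencies are imported lazily so the summary helper below
    # can be used without them.
    from sentence_transformers import SentenceTransformer
    import hdbscan

    model = SentenceTransformer(model_name)  # assumed model choice
    embeddings = model.encode(comments, normalize_embeddings=True)
    # Cast to float64, which the hdbscan implementation handles most reliably.
    embeddings = embeddings.astype("float64")
    clusterer = hdbscan.HDBSCAN(min_cluster_size=min_cluster_size, metric="euclidean")
    labels = clusterer.fit_predict(embeddings)
    return embeddings, labels


def summarize_labels(labels):
    """Report cluster sizes; HDBSCAN assigns the label -1 to noise points."""
    counts = Counter(int(label) for label in labels)
    n_noise = counts.pop(-1, 0)
    return {"n_clusters": len(counts), "n_noise": n_noise, "sizes": dict(counts)}
```

Calling `summarize_labels(labels)` on the output of `embed_and_cluster(comments)` gives a quick check of how many groups emerged and how many comments were left unassigned. No number of clusters is passed in; HDBSCAN infers it from the density of the embedding space.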

Dimensionality reduction via UMAP enables visualization of high-dimensional semantic spaces in two dimensions, revealing the landscape of discourse patterns. Pattern extraction combines keyword analysis (KeyBERT) with curated lexicons to identify coping strategies, emotional language, and stigma indicators.
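A minimal sketch of the projection and pattern-extraction steps follows, assuming the `umap-learn` and `keybert` packages. The lexicon entries, parameter values, and function names are hypothetical placeholders; the project's curated lexicons are not reproduced here.

```python
import re
from collections import Counter

# Hypothetical lexicon entries, for illustration only.
COPING_LEXICON = {
    "social_support": ["talk to", "friend", "family"],
    "professional_help": ["therapy", "therapist", "counselor"],
    "self_care": ["exercise", "sleep", "journaling"],
}


def project_to_2d(embeddings, n_neighbors=15, min_dist=0.1, seed=42):
    """Reduce sentence embeddings to two dimensions for plotting."""
    import umap  # provided by the umap-learn package

    reducer = umap.UMAP(
        n_components=2,
        n_neighbors=n_neighbors,
        min_dist=min_dist,
        metric="cosine",
        random_state=seed,
    )
    return reducer.fit_transform(embeddings)


def top_keywords(cluster_text, top_n=5):
    """Extract representative keyphrases from the joined text of one cluster."""
    from keybert import KeyBERT

    model = KeyBERT()
    return model.extract_keywords(
        cluster_text,
        keyphrase_ngram_range=(1, 2),
        stop_words="english",
        top_n=top_n,
    )


def count_lexicon_hits(comments, lexicon):
    """Count, per category, how many comments contain at least one lexicon term."""
    hits = Counter()
    for text in comments:
        lowered = text.lower()
        for category, terms in lexicon.items():
            # Whole-word match so "therapy" does not match inside other words.
            if any(re.search(r"\b" + re.escape(term) + r"\b", lowered) for term in terms):
                hits[category] += 1
    return dict(hits)
```

The lexicon counts are totals across a set of comments, in keeping with the aggregate-level design. Fixing `random_state` makes the UMAP layout repeatable between runs.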

The methodology prioritizes transparency and reproducibility. All configuration parameters are versioned, and the pipeline can be run end to end from raw comments to final visualizations. Because clustering is unsupervised, no categories are defined in advance; the groupings that emerge reflect the data rather than a fixed taxonomy.
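One way to make versioned configuration concrete is to derive a short fingerprint from the parameter values and store it alongside every output. The sketch below is an assumption about how this could be done; the field names, default values, and the `config_fingerprint` helper are hypothetical and not taken from the project.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class PipelineConfig:
    # All defaults are illustrative placeholders.
    embedding_model: str = "all-MiniLM-L6-v2"
    min_cluster_size: int = 15
    umap_n_neighbors: int = 15
    umap_min_dist: float = 0.1
    random_seed: int = 42


def config_fingerprint(config):
    """Return a short, stable hash of the configuration values."""
    # Sorting keys keeps the hash independent of field order.
    payload = json.dumps(asdict(config), sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()[:12]


default_config = PipelineConfig()
run_id = config_fingerprint(default_config)
```

Two runs with identical parameters share a fingerprint, and any parameter change produces a different one, so an exported figure or table can be traced back to the settings that produced it.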

This approach is designed for computational social scientists, mental health communication researchers, and methodologists interested in unsupervised NLP pipelines for discourse analysis.