Analyzing public online discourse to uncover aggregate emotional patterns and coping strategies — without labeling individuals.
This platform analyzes only publicly available comments from YouTube. All processing occurs at the aggregate level. No individual-level classification, tracking, or diagnosis is performed.
The system identifies patterns in language use across large volumes of text. It does not attempt to infer personal characteristics, mental health status, or risk factors for any individual commenter.
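As a rough illustration of what aggregate-only processing can look like, the sketch below drops identifying fields before analysis and returns only corpus-level totals. The field names (`author`, `author_channel_id`, `comment_id`, `text`) and both function names are assumptions made for this example; the source does not describe the actual comment schema.

```python
from collections import Counter

# Hypothetical field names; the real comment schema is not given in the source.
IDENTIFYING_FIELDS = {"author", "author_channel_id", "comment_id"}


def strip_identifiers(comment: dict) -> dict:
    """Return a copy of the comment without fields that point to a person."""
    return {k: v for k, v in comment.items() if k not in IDENTIFYING_FIELDS}


def aggregate_term_counts(comments: list[dict], terms: set[str]) -> Counter:
    """Count how many comments mention each term, across the whole corpus.

    Only corpus-level totals are returned; nothing is keyed by commenter.
    Tokenization is a plain whitespace split, so punctuation is not handled.
    """
    counts: Counter = Counter()
    for comment in comments:
        words = set(strip_identifiers(comment)["text"].lower().split())
        counts.update(words & terms)
    return counts
```

Because the result is a single counter over the whole corpus, there is no per-commenter record to classify or track.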
All analysis is designed for research and educational purposes. The platform is not intended for clinical use, diagnostic purposes, or individual assessment.
Data collection respects YouTube's terms of service and focuses exclusively on public comments from mental health advocacy and education channels.
SilentStigma uses unsupervised machine learning to identify naturally emerging patterns in mental health discourse. The pipeline employs transformer-based sentence embeddings (Sentence-BERT) to encode semantic meaning, then applies density-based clustering (HDBSCAN) to group similar expressions without predefined categories.
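The embedding and clustering steps might be sketched as follows. This is a minimal example under stated assumptions, not the project's actual code: the model name `all-MiniLM-L6-v2`, the `min_cluster_size` default, and the function names are all illustrative choices, and it relies on the `sentence-transformers` and `hdbscan` packages being installed.

```python
from collections import Counter


def embed_and_cluster(texts, model_name="all-MiniLM-L6-v2", min_cluster_size=15):
    """Encode texts with a Sentence-BERT model, then cluster with HDBSCAN.

    Returns one integer label per text; HDBSCAN assigns -1 to noise points.
    The model name and min_cluster_size are assumed values, not the project's.
    """
    # Imported lazily so summarize_clusters works without the heavy dependencies.
    import numpy as np
    import hdbscan
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer(model_name)
    # Unit-length vectors make Euclidean distance monotonic in cosine distance.
    embeddings = model.encode(texts, normalize_embeddings=True)
    embeddings = np.asarray(embeddings, dtype=np.float64)

    clusterer = hdbscan.HDBSCAN(min_cluster_size=min_cluster_size, metric="euclidean")
    return clusterer.fit_predict(embeddings).tolist()


def summarize_clusters(labels):
    """Return cluster sizes and the share of texts left unclustered as noise."""
    sizes = Counter(label for label in labels if label != -1)
    noise_share = labels.count(-1) / len(labels) if labels else 0.0
    return dict(sizes), noise_share
```

HDBSCAN does not force every comment into a group, so reporting the noise share alongside cluster sizes helps show how much of the corpus the discovered patterns actually cover.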
Dimensionality reduction via UMAP enables visualization of high-dimensional semantic spaces in two dimensions, revealing the landscape of discourse patterns. Pattern extraction combines keyword analysis (KeyBERT) with curated lexicons to identify coping strategies, emotional language, and stigma indicators.
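A hedged sketch of the projection and pattern-extraction steps is shown here. The lexicon entries are placeholders invented for illustration (the curated lists are not given in the source), the UMAP parameters are common defaults rather than the project's settings, and treating a cluster's comments as one joined document for KeyBERT is an assumption. The helper names are hypothetical; `umap-learn` and `keybert` must be installed for the first two functions.

```python
import re
from collections import Counter

# Placeholder lexicon entries; the project's curated lists are not in the source.
LEXICONS = {
    "coping": {"therapy", "journaling", "meditation"},
    "emotion": {"anxious", "hopeless", "relieved"},
    "stigma": {"weak", "lazy", "dramatic"},
}


def project_2d(embeddings, n_neighbors=15, min_dist=0.1, seed=42):
    """Reduce embeddings to two dimensions with UMAP for plotting."""
    import umap  # provided by the umap-learn package

    reducer = umap.UMAP(
        n_components=2,
        n_neighbors=n_neighbors,
        min_dist=min_dist,
        metric="cosine",
        random_state=seed,  # fixed seed so the layout is repeatable
    )
    return reducer.fit_transform(embeddings)


def cluster_keywords(texts, top_n=5):
    """Extract keyphrases for one cluster, treated as a single joined document."""
    from keybert import KeyBERT

    joined = " ".join(texts)
    # Returns a list of (keyphrase, score) tuples.
    return KeyBERT().extract_keywords(
        joined, keyphrase_ngram_range=(1, 2), stop_words="english", top_n=top_n
    )


def lexicon_counts(texts, lexicons=LEXICONS):
    """Count, per category, how many texts contain at least one lexicon term."""
    counts = Counter()
    for text in texts:
        tokens = set(re.findall(r"[a-z']+", text.lower()))
        for category, terms in lexicons.items():
            if tokens & terms:
                counts[category] += 1
    return dict(counts)
```

Running `lexicon_counts` on the comments of a single cluster gives a per-category tally that can be reported next to that cluster's keyphrases.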
The methodology prioritizes transparency and reproducibility. All configuration parameters are versioned, and the pipeline can be run end-to-end from raw comments to final visualizations. Clustering uses no predefined categories, so the groupings emerge from the data rather than from labels chosen in advance; curated lexicons are applied during pattern extraction.
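One way versioned configuration could be made concrete is a frozen settings object with a content fingerprint that is stored alongside every output. The sketch below is illustrative only: the class name, every field name, and every default value are assumptions, not the project's actual configuration.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class PipelineConfig:
    """Immutable run settings. All fields and defaults here are hypothetical."""

    config_version: str = "0.1.0"
    embedding_model: str = "all-MiniLM-L6-v2"
    min_cluster_size: int = 15
    umap_n_neighbors: int = 15
    umap_min_dist: float = 0.1
    random_seed: int = 42

    def fingerprint(self) -> str:
        """Short hash of the settings, suitable for tagging output files."""
        # sort_keys gives a stable serialization, so equal configs hash equally.
        payload = json.dumps(asdict(self), sort_keys=True)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:12]
```

Tagging each figure or table with the fingerprint makes it possible to trace a result back to the exact parameters that produced it.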
This approach is designed for computational social scientists, mental health communication researchers, and methodologists interested in unsupervised NLP pipelines for discourse analysis.