Exploring Mental Stress Expressions in Online Communities: A Subreddit Analysis

Tran Anh Tuan, Nguyen Huu Nghia, Tran Dai An, Dao Thi Thanh Loan


Objectives: This study explores trends, sentiments, and visualizations of mental stress expressions in online communities, focusing on discussions within subreddits on the social media platform Reddit.

Methods/Analysis: The research applies natural language processing (NLP) methods, sentiment analysis, and topic modeling to examine how mental stress is expressed in posts across diverse subreddits. Engagement metrics, such as Redditors' scores and the number of comments, are also analyzed to identify distinctive information and patterns of interest.

Findings: The research identifies prevalent trends, sentiments, and themes related to mental stress in online conversations before and after January 2020. The findings provide insight into topics of interest, shared experiences of stress, coping mechanisms, and the significant role of virtual communities in offering support and understanding.

Novelty/Improvement: The novelty lies in applying advanced text analysis techniques to the dynamics of mental stress expressions in online communities: sentiment analysis that uses majority voting to combine different machine learning techniques, and topic modeling with semantic networks. Beyond describing current patterns, the research examines temporal variations in stress-related posts and their correlation with engagement metrics, offering a distinct perspective on mental health discussions in the digital age.
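The majority voting idea mentioned above can be illustrated with a minimal sketch. It assumes that each post has already been labeled by several sentiment classifiers; the label names, the tie-breaking rule (falling back to "neutral"), and the sample predictions are hypothetical and are not taken from the study.

```python
from collections import Counter


def majority_vote(labels, tie_label="neutral"):
    """Return the most frequent label, or tie_label when the top two counts are equal."""
    counts = Counter(labels).most_common()
    if not counts:
        raise ValueError("labels must not be empty")
    # A tie between the two most frequent labels has no clear winner,
    # so fall back to the (assumed) tie label.
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return tie_label
    return counts[0][0]


# Hypothetical per-post predictions from three sentiment classifiers.
predictions = {
    "post_1": ["negative", "negative", "neutral"],
    "post_2": ["positive", "negative", "neutral"],
    "post_3": ["positive", "positive", "positive"],
}

# Combine the individual predictions into one label per post.
combined = {post: majority_vote(votes) for post, votes in predictions.items()}
print(combined)
```

With an odd number of classifiers and three possible labels, ties can still occur when every classifier disagrees, which is why the sketch makes the fallback label an explicit parameter.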


DOI: 10.28991/HEF-2024-05-02-01



Keywords: Mental Stress; Sentiment Analysis; Public Opinion; Subreddits; Natural Language Processing.




Copyright (c) 2024 Tran Anh Tuan, Nguyen Huu Nghia, Tran Dai An, Dao Thi Thanh Loan