This project applies sentiment analysis to abortion-related discussions in two Reddit communities: r/prochoice and r/prolife. The goal is to characterize the attitudes, trends, and themes that prevail in each community.
The study used the following tools and methods:
- Python Reddit API Wrapper (PRAW): Used to interact with Reddit's API, facilitating data retrieval from subreddits, posts, and comments.
- Valence Aware Dictionary and Sentiment Reasoner (VADER): Used for sentiment analysis. VADER is a lexicon- and rule-based model that handles the emoticons, slang, and informal language common on social media.
- Latent Dirichlet Allocation (LDA): Utilized for topic modeling to identify underlying themes in a collection of documents.
- BERTopic: Used for topic modeling. BERTopic clusters transformer-based document embeddings to surface prevalent themes; its name derives from BERT (Bidirectional Encoder Representations from Transformers).
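As a rough illustration of how VADER output is commonly turned into sentiment labels, the sketch below maps a compound score to positive, negative, or neutral. The function names are hypothetical, the ±0.05 cut-off is the convention suggested in the VADER documentation rather than a value confirmed for this project, and the use of the `vaderSentiment` package (instead of, say, NLTK's bundled copy) is an assumption.

```python
from typing import Callable, Dict, Iterable, List


def label_from_compound(compound: float, threshold: float = 0.05) -> str:
    """Turn a VADER compound score in [-1, 1] into a coarse label.

    The +/-0.05 cut-off is the conventional default; the thresholds
    actually used in this project are an assumption.
    """
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"


def make_vader_scorer() -> Callable[[str], float]:
    """Build a text -> compound-score function backed by vaderSentiment.

    The import is lazy so the labelling helper works without the package.
    """
    from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

    analyzer = SentimentIntensityAnalyzer()
    return lambda text: analyzer.polarity_scores(text)["compound"]


def label_comments(
    texts: Iterable[str], scorer: Callable[[str], float]
) -> List[Dict[str, object]]:
    """Score each comment and attach its sentiment label."""
    rows = []
    for text in texts:
        compound = scorer(text)
        rows.append(
            {
                "comment": text,
                "compound": compound,
                "label": label_from_compound(compound),
            }
        )
    return rows
```

With the package installed, `label_comments(comments, make_vader_scorer())` would label a list of comment strings. Passing the scorer in as an argument keeps the labelling logic testable without the model.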
Data was collected from r/prochoice and r/prolife for the period January 2022 to March 2023 and divided into quarterly segments for the timeline sentiment analysis. The key fields are the comment text, date, post upvotes, and comment upvotes.
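A minimal sketch of how one comment might be flattened into these fields and assigned to a quarter is shown below. It assumes comment objects expose `body`, `created_utc`, and `score`, which are the attribute names PRAW uses on its comment objects; the helper names and the `YYYY-Qn` label format are hypothetical, and a stand-in object replaces a live API call so the example runs without credentials.

```python
from datetime import datetime, timezone
from types import SimpleNamespace


def quarter_label(created_utc: float) -> str:
    """Map a Unix timestamp (seconds, UTC) to a label such as '2022-Q1'."""
    dt = datetime.fromtimestamp(created_utc, tz=timezone.utc)
    quarter = (dt.month - 1) // 3 + 1
    return f"{dt.year}-Q{quarter}"


def comment_to_record(comment, post_score: int) -> dict:
    """Flatten one comment-like object into the fields listed above.

    `comment` only needs `.body`, `.created_utc`, and `.score`.
    """
    created = datetime.fromtimestamp(comment.created_utc, tz=timezone.utc)
    return {
        "comment": comment.body,
        "date": created.date().isoformat(),
        "quarter": quarter_label(comment.created_utc),
        "post_upvotes": post_score,
        "comment_upvotes": comment.score,
    }


# Stand-in for a PRAW comment; the values are invented for illustration.
example = SimpleNamespace(
    body="Example comment text",
    created_utc=datetime(2022, 6, 30, 12, 0, tzinfo=timezone.utc).timestamp(),
    score=7,
)
record = comment_to_record(example, post_score=120)
print(record["quarter"], record["date"])
```

Under this scheme, the January 2022 to March 2023 window spans five quarterly segments, from 2022-Q1 through 2023-Q1.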
The analysis was conducted in four phases:
- Data Gathering and Cross-Validation: Retrieved relevant data from the two Reddit communities, removed redundant entries, and validated the fetched data.
- Sentiment Analysis and Visualization: Applied the VADER model to score sentiment and visualized the resulting sentiment distribution.
- Timeline Sentiment Analysis: Compared sentiment distributions across the quarterly timeframes to track how sentiment evolved.
- Topic Modelling: Identified prominent themes with an LDA model, then refined them with BERTopic for more precise theme assignment.
The key findings were:
- Sentiment Distribution: Both communities predominantly expressed negative sentiments, with r/prolife showing an increase in positive comments over time.
- Attitude Comparison: r/prolife exhibited a slightly higher percentage of negative comments than r/prochoice.
- Changes in Views: Comments from r/prolife initially displayed more negative sentiments but transitioned to more positive sentiments over time.
- Frequently Used Words: Word clouds revealed prevalent keywords in both communities' comments, reflecting their core discussions.
- Common Themes: BERTopic identified themes such as 'human women life' in pro-choice comments and 'abortion pro life' in pro-life comments.
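The sentiment distribution and trend comparisons above rest on per-quarter label shares. One way such shares could be tabulated is sketched below; the function name and the `(quarter, label)` input shape are hypothetical, and the rows are made up purely to exercise the code rather than drawn from the study.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Tuple


def sentiment_share_by_quarter(
    rows: Iterable[Tuple[str, str]]
) -> Dict[str, Dict[str, float]]:
    """Return, per quarter, the percentage of comments with each label.

    `rows` yields (quarter, label) pairs, e.g. ("2022-Q1", "negative").
    """
    counts = defaultdict(Counter)
    for quarter, label in rows:
        counts[quarter][label] += 1

    shares = {}
    for quarter in sorted(counts):  # 'YYYY-Qn' labels sort chronologically
        total = sum(counts[quarter].values())
        shares[quarter] = {
            label: round(100 * n / total, 1)
            for label, n in counts[quarter].items()
        }
    return shares


# Invented rows for illustration only; not data from the study.
toy_rows = [
    ("2022-Q1", "negative"), ("2022-Q1", "negative"),
    ("2022-Q1", "positive"), ("2022-Q1", "neutral"),
    ("2023-Q1", "negative"), ("2023-Q1", "positive"),
]
shares = sentiment_share_by_quarter(toy_rows)
for quarter, dist in shares.items():
    print(quarter, dist)
```

Running this once per subreddit would give two tables that can be compared quarter by quarter, which is the kind of comparison the attitude and change-in-views findings describe.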
The study highlights the disparities in opinions between the r/prochoice and r/prolife communities regarding abortion. Despite limitations such as potential bias and external factors influencing sentiments, the analysis provides insights into prevailing attitudes, trends, and themes within these communities.