



Analysis of tweets during the Delhi riots.


To collect data from an API or other non-static source, store and preprocess it, and carry out a preliminary analysis.


Conduct a social media analysis of the Delhi Riots, covering sentiment analysis and user profiling.


Following the complete lockdown of Indian-administered Kashmir on August 5, 2019, through the abrogation of Article 370 of the Indian constitution, which gave autonomy to the region, the government of India passed a controversial bill called the "Citizenship Amendment Bill" on December 11, 2019. The bill aimed to provide citizenship to non-Muslim minorities through naturalization. These two events were opposed nationally and internationally. Fueling religious radicalization, they led to riots in major cities in the country, most notably Delhi, the capital of India. These riots are referred to as the Delhi Riots.

EU DisinfoLab, a Brussels-based NGO focused on tackling disinformation campaigns targeting the EU, released a report on November 26, 2019, titled "Uncovered: 265 coordinated fake local media outlets serving Indian interests". This report raised further questions about the authenticity of online content, including content on social media. Governments and lobbyists have been using social media to sway public perception.



To achieve the goal set out above, tweets will be extracted for the hashtag #delhiriots. Unique users and keywords will then be identified. This approach is illustrated in the figure below:
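As a rough illustration of the user and keyword identification step, the sketch below counts unique users and the most frequent words in a handful of tweets. The sample records, the field names `username` and `tweet`, and the small stop-word list are assumptions made for this example; the actual dataset and preprocessing may differ.

```python
import re
from collections import Counter

# Hypothetical sample records; the real dataset's field names may differ.
tweets = [
    {"username": "user_a", "tweet": "Violence reported in north east Delhi #delhiriots"},
    {"username": "user_b", "tweet": "Stay safe everyone. #DelhiRiots https://example.com/x"},
    {"username": "user_a", "tweet": "Police deployed across Delhi #delhiriots"},
]

# Tiny illustrative stop-word list (an assumption, not the project's list).
STOP_WORDS = {"in", "the", "a", "an", "and", "of", "to", "across", "everyone"}


def tokenize(text):
    """Lowercase the text, drop URLs, mentions and hashtags, return word tokens."""
    cleaned = re.sub(r"https?://\S+|[@#]\w+", " ", text.lower())
    return [w for w in re.findall(r"[a-z]+", cleaned) if w not in STOP_WORDS]


def unique_users(records):
    """Return the sorted list of distinct usernames."""
    return sorted({r["username"] for r in records})


def top_keywords(records, n=5):
    """Return the n most frequent tokens across all tweets as (word, count) pairs."""
    counts = Counter(word for r in records for word in tokenize(r["tweet"]))
    return counts.most_common(n)
```

Dropping the hashtag itself keeps the search term from dominating the keyword counts; whether that is desirable depends on the analysis.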

Data Identification

Platform selection

The criteria for platform selection are as follows:

  • text-rich data
  • availability of an API or tool for data collection
  • amount of discussion

To conduct this analysis, the following social media platforms are considered:

  • Facebook
  • Twitter
  • Reddit

For this objective, Twitter is chosen, and the data is acquired using twint, an open-source library that fetches public Twitter data without rate limits. The figure below shows the rationale behind the platform and tool selection.

Tool selection

To collect data for this task, twint is used. Twint is an advanced Twitter scraping tool that imposes no rate limits and requires no authentication. The project is 2 years old; however, contributors still participate actively.
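A hashtag search with twint might be set up roughly as in the sketch below. The helper names `build_search_settings` and `run_search` and the output file name are hypothetical; the configuration attributes (`Search`, `Store_csv`, `Output`, `Hide_output`, `Since`, `Until`) follow twint's `Config` object as commonly documented, but this is not necessarily the exact configuration used in this project.

```python
def build_search_settings(hashtag, output_path, since=None, until=None):
    """Collect twint Config attribute names and values for a hashtag search."""
    settings = {
        "Search": hashtag,      # search query, here a hashtag
        "Store_csv": True,      # write results as CSV
        "Output": output_path,  # destination file
        "Hide_output": True,    # do not print every tweet to the console
    }
    # Optional date bounds, passed as "YYYY-MM-DD" strings.
    if since:
        settings["Since"] = since
    if until:
        settings["Until"] = until
    return settings


def run_search(settings):
    """Apply the settings to a twint Config and run the search."""
    import twint  # third-party; imported lazily so the helper above works without it

    config = twint.Config()
    for name, value in settings.items():
        setattr(config, name, value)
    twint.run.Search(config)
```

Calling `run_search(build_search_settings("#delhiriots", "delhiriots.csv"))` would then write the matching tweets to a CSV file, assuming twint is installed and can still reach Twitter.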


Dataset on Kaggle

