Social Media App:

I built this app as part of learning full-stack development.

Technology Used:

Frontend:

ReactJs
DraftJs (Rich Text Editor)

Backend:

Language:

Java

Framework:

Spring Boot

Testing:

JUnit5
Mockito (For mocking and stubbing)
MockMvc (For testing API endpoints)
MockServer (For mocking other service API calls)

Database

PostgreSQL
JPA with Hibernate as ORM

Machine Learning Backend:

Language:

Python

Framework:

Django

APP

What can this App do:

Users can create posts in rich text with code snippets
Users can comment on posts in rich text
Users can create booklets combining multiple posts
Users can follow each other
The app can identify political posts using machine learning

If you understand Bengali, you can watch the video; otherwise, read the explanation below.

I created this app in 1 to 1.5 months as a pet project for learning. I had planned to add more features, such as notifications, friend suggestions using a graph database, and search using Elasticsearch. However, I decided to stop here because the project was no longer giving me much benefit for learning new technologies.

To run the app locally, you need Docker installed on your system. You also need git-lfs to pull the serialized machine learning models. Then run the following commands:

sudo apt-get install git-lfs

mkdir social-media-app
cd social-media-app
git clone git@github.com:kamrul1157024/Social-Media-App.git
git clone git@github.com:kamrul1157024/social-app-server.git
git clone git@github.com:kamrul1157024/social-app-client.git
git clone git@github.com:kamrul1157024/text-analyzer.git
cd Social-Media-App
docker-compose build
docker-compose up -d

Then go to http://localhost:3080

How political post-detection works:

In the app, users can post only in Bengali, Banglish (Bengali written in the Latin script, mostly following the Avro convention), and English.

The idea is to detect the language first and then run the model for that particular language to detect whether the post contains political content.

Let us say,

p(B) = Probability of the post being in the Bangla language

p(E) = Probability of the post being in the English language

p(BE) = Probability of the post being in the Banglish language

Then the probability of the post being political is

p(P) = p(P ∩ B) + p(P ∩ E) + p(P ∩ BE)

or, p(P) = p(B) x p(P|B) + p(E) x p(P|E) + p(BE) x p(P|BE)

So to detect political posts, p(B), p(E), and p(BE) are needed. To get these values, the post is passed to the language detector, which uses a multinomial Naive Bayes model to estimate these probabilities.
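To make the language-detection step concrete, here is a minimal pure-Python sketch of a multinomial Naive Bayes classifier over word counts with add-one smoothing. It is not the project's trained model: the class name `TinyMultinomialNB`, the six training sentences, and the choice of whole words as features are all assumptions made for illustration.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple


class TinyMultinomialNB:
    """Toy multinomial Naive Bayes over word counts (add-one smoothing)."""

    def fit(self, samples: List[Tuple[str, str]]) -> "TinyMultinomialNB":
        self.word_counts: Dict[str, Counter] = {}
        self.doc_counts: Counter = Counter()
        for text, label in samples:
            self.doc_counts[label] += 1
            self.word_counts.setdefault(label, Counter()).update(text.lower().split())
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict_proba(self, text: str) -> Dict[str, float]:
        total_docs = sum(self.doc_counts.values())
        log_scores: Dict[str, float] = {}
        for label, counts in self.word_counts.items():
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(self.doc_counts[label] / total_docs)  # class prior
            for word in text.lower().split():
                if word in self.vocab:  # words never seen in training are ignored
                    score += math.log((counts[word] + 1) / denom)
            log_scores[label] = score
        # Turn log scores into probabilities that sum to 1.
        peak = max(log_scores.values())
        exp_scores = {k: math.exp(v - peak) for k, v in log_scores.items()}
        norm = sum(exp_scores.values())
        return {k: v / norm for k, v in exp_scores.items()}


# Made-up training sentences: B = Bangla, E = English, BE = Banglish.
TRAINING = [
    ("আমি ভাত খাই", "B"),
    ("তুমি কেমন আছো", "B"),
    ("i eat rice", "E"),
    ("how are you", "E"),
    ("ami bhat khai", "BE"),
    ("tumi kemon acho", "BE"),
]

detector = TinyMultinomialNB().fit(TRAINING)
probs = detector.predict_proba("tumi bhat khai")
print(probs)  # p(B), p(E), p(BE) for this post
```

The returned dictionary plays the role of p(B), p(E), and p(BE) in the formula above.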

Next, the probability of the post being political given that it is in Bengali, p(P|B), is needed. The Bengali political-post model provides it.

The same applies to English.

To detect political posts in Banglish, the Banglish text is first converted to Bengali. Because people do not type Banglish in strict Avro format, a converter dictionary first maps typical Banglish spellings to their Avro form, and Avro rules are then used to parse the result into Bengali.
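The two-stage conversion described above can be sketched as follows. This is only an illustration in Python: the dictionary entries, the function names, and the stand-in parser `demo_avro_parse` are hypothetical, and a real Avro rule-based parser would replace the stand-in.

```python
from typing import Callable, Dict

# Hypothetical sample entries: informal Banglish spelling -> Avro-style spelling.
INFORMAL_TO_AVRO: Dict[str, str] = {
    "amr": "amar",
    "tmi": "tumi",
}


def normalize_banglish(text: str, dictionary: Dict[str, str] = INFORMAL_TO_AVRO) -> str:
    """Replace informal Banglish words with their Avro-style spelling, word by word."""
    return " ".join(dictionary.get(word.lower(), word) for word in text.split())


def banglish_to_bengali(text: str, avro_parse: Callable[[str], str]) -> str:
    """Normalize first, then hand the result to an Avro rule-based parser."""
    return avro_parse(normalize_banglish(text))


# Stand-in for a real Avro parser; it only knows the two words used below.
_DEMO_AVRO: Dict[str, str] = {"amar": "আমার", "tumi": "তুমি"}


def demo_avro_parse(text: str) -> str:
    return " ".join(_DEMO_AVRO.get(word, word) for word in text.split())


converted = banglish_to_bengali("tmi amr", demo_avro_parse)
print(converted)
```

Keeping the dictionary step separate from the rule-based step means new informal spellings can be supported by adding entries, without touching the parser.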

Because each language model's output is weighted by the probability of that language, this approach should also work with mixed-language posts.
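As a worked example of the formula p(P) = p(B) x p(P|B) + p(E) x p(P|E) + p(BE) x p(P|BE), the snippet below combines the two sets of probabilities. The numbers are invented for illustration, and the function name `political_probability` is hypothetical.

```python
from typing import Dict


def political_probability(
    language_probs: Dict[str, float],
    political_given_language: Dict[str, float],
) -> float:
    """Law of total probability: sum over languages of p(L) * p(P|L)."""
    return sum(
        language_probs[lang] * political_given_language[lang]
        for lang in language_probs
    )


# Invented values for a mostly-Bangla post with some Banglish mixed in.
language_probs = {"B": 0.7, "E": 0.1, "BE": 0.2}
political_given_language = {"B": 0.9, "E": 0.2, "BE": 0.5}

p_political = political_probability(language_probs, political_given_language)
print(round(p_political, 2))  # 0.7*0.9 + 0.1*0.2 + 0.2*0.5 = 0.75
```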

Training the model:

Language Detector: [colab-notebook]

The total number of words for each model is kept almost the same so that it does not create any kind of bias.

Also, there is no Banglish dataset, so the reverse process is used here to generate one from Bengali text, as shown in the picture.

Because the dataset is balanced, the quality of the model can be described by precision.

The confusion matrices here are plotted against the number of sentences, not the number of words. English sentences are long and contain many words, so the English set has the fewest sentences, which is why the lowest count appears in the bottom-right corner. As the confusion matrix shows, the model detects languages with very few false positives and false negatives.

As all of the models achieve high accuracy, I chose the one with the lowest execution time, which was Multinomial Naive Bayes.

Bengali political post: [colab-notebook]

The Prothom Alo dataset from Kaggle is used for training.

The dataset is highly skewed, so precision cannot be used to choose the model.

Because the dataset is highly skewed towards negative examples, most of the models give a high false negative rate. Gaussian Naive Bayes seems to perform well here. It has a higher false positive rate than the others, so the user experience may suffer a bit, but it is better to have an empty house than an ill tenant.
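To illustrate why precision alone is misleading on a skewed dataset, the sketch below computes the usual rates from confusion-matrix counts. The counts are made up (1000 posts, 50 of them political) and do not come from the notebooks; the function name `rates` is hypothetical.

```python
from typing import Dict


def rates(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Derive summary rates from confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "false_positive_rate": fp / (fp + tn) if (fp + tn) else 0.0,
        "false_negative_rate": fn / (fn + tp) if (fn + tp) else 0.0,
    }


# A cautious model: never wrong when it flags a post, but flags very few.
cautious = rates(tp=5, fp=0, tn=950, fn=45)
# An eager model: flags more non-political posts, but misses few political ones.
eager = rates(tp=45, fp=95, tn=855, fn=5)

for name, result in (("cautious", cautious), ("eager", eager)):
    print(name, {k: round(v, 3) for k, v in result.items()})
```

With these invented counts, the cautious model has perfect precision and higher accuracy, yet it lets 90% of political posts through. The eager model annoys some users with false positives but misses only 10%, which is the trade-off described above.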

English political post: [colab-notebook]

For training, the BBC dataset from Kaggle is used.

The process is the same as for Bengali. Here Multinomial Naive Bayes and SVC with a sigmoid kernel perform well. I chose SVC over Multinomial Naive Bayes because it has a lower false positive rate, which should improve the user experience.

Interaction between App Server and Machine learning Server:

The Banglish-to-Bangla conversion is handled by the app server; the machine learning server is responsible only for running the machine learning models.

When a user adds a post, the app server makes a REST API call to the machine learning server to identify the language. If the machine learning server identifies the language as Banglish, the app server converts the post to Bangla. It then sends another request to the machine learning server to identify whether the post is political. If the post is political, the app server responds with an error.
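The request flow above can be sketched as a small orchestration function. The real app server is written in Java with Spring Boot; this Python version only illustrates the order of the calls. All names here (`ensure_not_political`, `PoliticalPostError`, the language labels, and the fake functions that stand in for the two REST calls and the converter) are hypothetical.

```python
from typing import Callable, List


class PoliticalPostError(Exception):
    """Raised when the classifier flags a post as political."""


def ensure_not_political(
    text: str,
    detect_language: Callable[[str], str],
    to_bengali: Callable[[str], str],
    is_political: Callable[[str, str], bool],
) -> None:
    language = detect_language(text)       # first request to the ML server
    if language == "banglish":
        text = to_bengali(text)            # conversion happens on the app server
        language = "bengali"
    if is_political(text, language):       # second request to the ML server
        raise PoliticalPostError("political posts are not allowed")


# Fakes standing in for the ML server and the converter, to show the call order.
calls: List[str] = []


def fake_detect_language(text: str) -> str:
    calls.append("detect")
    return "banglish"


def fake_to_bengali(text: str) -> str:
    calls.append("convert")
    return "আমি ভোট দেব"


def fake_is_political(text: str, language: str) -> bool:
    calls.append("classify")
    return language == "bengali" and "ভোট" in text


try:
    ensure_not_political(
        "ami vote dibo", fake_detect_language, fake_to_bengali, fake_is_political
    )
    rejected = False
except PoliticalPostError:
    rejected = True

print(calls, rejected)
```

Passing the three steps in as callables keeps the flow testable without a running machine learning server.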
