05 Jan

Today is the 19th day of war between Russia and Ukraine. Many countries are supporting Ukraine by imposing economic sanctions on Russia. There are a lot of tweets about the Ukraine and Russia war in which people share updates from the ground, how they feel about it, and which side they support. So if you want to analyze people’s sentiments about the Ukraine and Russia war, this article is for you. In this article, I will take you through the task of Ukraine and Russia war Twitter Sentiment Analysis using Python.

Ukraine Russia War Twitter Sentiment Analysis using Python

The dataset that I am using for the task of Twitter sentiment analysis on the Ukraine and Russia War is downloaded from Kaggle. This dataset was initially collected from Twitter and is updated regularly. You can download this dataset from here. Now let’s import the necessary Python libraries and the dataset to get started with this task:

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
from nltk.sentiment.vader import SentimentIntensityAnalyzer
from wordcloud import WordCloud, STOPWORDS, ImageColorGenerator
import nltk
import re
from nltk.corpus import stopwords
import string
data = pd.read_csv("filename.csv")
print(data.head())
             id  conversation_id               created_at       date     time  \
0  1.502530e+18     1.502260e+18  2022-03-12 06:03:14 UTC  3/12/2022  6:03:14   
1  1.502530e+18     1.502530e+18  2022-03-12 06:03:14 UTC  3/12/2022  6:03:14   
2  1.502530e+18     1.502530e+18  2022-03-12 06:03:13 UTC  3/12/2022  6:03:13   
3  1.502530e+18     1.502210e+18  2022-03-12 06:03:12 UTC  3/12/2022  6:03:12   
4  1.502530e+18     1.500440e+18  2022-03-12 06:03:12 UTC  3/12/2022  6:03:12   

   timezone       user_id         username  \
0         0  2.019880e+07         redcelia   
1         0  2.275356e+08          eee_eff   
2         0  8.431317e+07      mistify_007   
3         0  9.898620e+17  reallivinghuman   
4         0  1.164940e+18           rpcsas   

                                        name place  ... geo source user_rt_id  \
0    Johnson Out🇺🇦 🇪🇺🇮🇹🇦🇫💙😷 #NeverVoteTory   NaN  ... NaN    NaN        NaN   
1  Wearing Masks still saves lives 🇺🇦🇲🇨🏥🌹🌹   NaN  ... NaN    NaN        NaN   
2                                Brian🤸‍♀️   NaN  ... NaN    NaN        NaN   
3                                    Basha   NaN  ... NaN    NaN        NaN   
4                                   RonJon   NaN  ... NaN    NaN        NaN   

  user_rt retweet_id                                           reply_to  \
0     NaN        NaN  [{'screen_name': 'RussianEmbassy', 'name': 'Ru...   
1     NaN        NaN                                                 []   
2     NaN        NaN                                                 []   
3     NaN        NaN  [{'screen_name': 'RussianEmbassy', 'name': 'Ru...   
4     NaN        NaN  [{'screen_name': 'IsraeliPM', 'name': 'Prime M...   

   retweet_date  translate trans_src trans_dest  
0           NaN        NaN       NaN        NaN  
1           NaN        NaN       NaN        NaN  
2           NaN        NaN       NaN        NaN  
3           NaN        NaN       NaN        NaN  
4           NaN        NaN       NaN        NaN  

[5 rows x 36 columns]

Let’s have a quick look at all the column names of the dataset:

print(data.columns)
Index(['id', 'conversation_id', 'created_at', 'date', 'time', 'timezone',
       'user_id', 'username', 'name', 'place', 'tweet', 'language', 'mentions',
       'urls', 'photos', 'replies_count', 'retweets_count', 'likes_count',
       'hashtags', 'cashtags', 'link', 'retweet', 'quote_url', 'video',
       'thumbnail', 'near', 'geo', 'source', 'user_rt_id', 'user_rt',
       'retweet_id', 'reply_to', 'retweet_date', 'translate', 'trans_src',
       'trans_dest'],
      dtype='object')

We only need three columns for this task (username, tweet, and language), so I will select only these columns and move forward:

data = data[["username", "tweet", "language"]]

Let’s check whether any of these columns contains null values:

data.isnull().sum()
username    0
tweet       0
language    0
dtype: int64

So none of the columns has null values. Let’s have a quick look at how many tweets are posted in each language:

data["language"].value_counts()
en     8812
pt      251
und     198
it      155
in      122
ru       85
hi       55
ja       52
es       40
ta       23
tr       19
ca       18
fr       16
et       16
tl       15
nl       14
de       13
pl       13
fi        9
ar        9
zh        9
sv        6
uk        6
te        6
mr        5
cs        4
el        4
gu        4
no        3
th        3
kn        3
ro        3
ur        2
or        2
eu        2
ko        2
ht        2
sl        2
bn        1
cy        1
ne        1
Name: language, dtype: int64
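To put a number on the counts above, the share of English tweets can be computed directly from them. This is only a minimal sketch: it copies the counts from the output into a plain dictionary (the name `language_counts` is illustrative, not part of the dataset). With the real DataFrame, `data["language"].value_counts(normalize=True)` should give the same proportions.

```python
# Tweet counts per language code, copied from the value_counts() output above.
language_counts = {
    "en": 8812, "pt": 251, "und": 198, "it": 155, "in": 122, "ru": 85,
    "hi": 55, "ja": 52, "es": 40, "ta": 23, "tr": 19, "ca": 18, "fr": 16,
    "et": 16, "tl": 15, "nl": 14, "de": 13, "pl": 13, "fi": 9, "ar": 9,
    "zh": 9, "sv": 6, "uk": 6, "te": 6, "mr": 5, "cs": 4, "el": 4, "gu": 4,
    "no": 3, "th": 3, "kn": 3, "ro": 3, "ur": 2, "or": 2, "eu": 2, "ko": 2,
    "ht": 2, "sl": 2, "bn": 1, "cy": 1, "ne": 1,
}

# Fraction of all tweets that are tagged as English.
total = sum(language_counts.values())
english_share = language_counts["en"] / total
print(f"English tweets: {english_share:.1%} of {total}")
```

Because the VADER lexicon used later is built for English, it may be reasonable to keep only the English rows before scoring, for example with `data = data[data["language"] == "en"]`. The rest of this walkthrough does not do that and scores all tweets.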

So most of the tweets are in English. Let’s prepare this data for the task of sentiment analysis. Here I will remove links, punctuation, numbers, and English stopwords from the tweets, and then stem the remaining words:

nltk.download('stopwords')
stemmer = nltk.SnowballStemmer("english")
stopword = set(stopwords.words('english'))

def clean(text):
    text = str(text).lower()
    # Drop bracketed text, URLs, and HTML-like tags
    text = re.sub(r'\[.*?\]', '', text)
    text = re.sub(r'https?://\S+|www\.\S+', '', text)
    text = re.sub(r'<.*?>+', '', text)
    # Drop punctuation, newlines, and words that contain digits
    text = re.sub('[%s]' % re.escape(string.punctuation), '', text)
    text = re.sub(r'\n', '', text)
    text = re.sub(r'\w*\d\w*', '', text)
    # Remove English stopwords, then stem the remaining words
    text = [word for word in text.split(' ') if word not in stopword]
    text = " ".join(text)
    text = [stemmer.stem(word) for word in text.split(' ')]
    text = " ".join(text)
    return text

data["tweet"] = data["tweet"].apply(clean)
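To see what the regular-expression steps do to a single tweet, here is a small sketch that applies only those steps. Stopword removal and stemming are left out so that it runs without any NLTK data. The function name `strip_noise` and the sample tweet are made up for illustration.

```python
import re
import string

def strip_noise(text):
    """Apply the same regex steps as clean(), without stopwords or stemming."""
    text = str(text).lower()
    text = re.sub(r'\[.*?\]', '', text)                # bracketed text
    text = re.sub(r'https?://\S+|www\.\S+', '', text)  # links
    text = re.sub(r'<.*?>+', '', text)                 # HTML-like tags
    text = re.sub('[%s]' % re.escape(string.punctuation), '', text)
    text = re.sub(r'\n', '', text)                     # newlines
    text = re.sub(r'\w*\d\w*', '', text)               # words containing digits
    return text

# Hypothetical tweet, not taken from the dataset.
sample = "Breaking: 3 cities hit! https://t.co/abc123 [video] #Ukraine"
cleaned = strip_noise(sample)
print(cleaned.split())     # words that survive the cleaning
print(cleaned.split(' '))  # splitting on a single space keeps empty strings
```

Removing a link or a number leaves the spaces around it behind, so splitting on a single space produces some empty strings. They do not carry any sentiment, but calling `split()` without an argument is one way to drop them if they get in the way.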

Now let’s have a look at the wordcloud of the tweets, which will show the most frequently used words in the tweets by people sharing their feelings and updates about the Ukraine and Russia war:

text = " ".join(i for i in data.tweet)
stopwords = set(STOPWORDS)
wordcloud = WordCloud(stopwords=stopwords, background_color="white").generate(text)
plt.figure( figsize=(15,10))
plt.imshow(wordcloud, interpolation='bilinear')
plt.axis("off")
plt.show()
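A word cloud shows frequencies only through font size. If you also want the actual counts, one option is to count the words directly. The sketch below assumes a helper named `top_words` (not part of any library) and uses three made-up strings that imitate cleaned, stemmed tweets.

```python
from collections import Counter

def top_words(texts, n=5):
    """Return the n most frequent whitespace-separated words across texts."""
    counts = Counter()
    for text in texts:
        counts.update(text.split())
    return counts.most_common(n)

# Hypothetical cleaned tweets, for illustration only.
sample_tweets = ["russia ukrain war", "ukrain peac talk", "war ukrain sanction"]
print(top_words(sample_tweets, n=2))
```

With the real dataset, something like `top_words(data["tweet"], n=20)` should list the words that appear largest in the word cloud, apart from the ones that WordCloud filters out through its own stopword list.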

Now I will add three more columns to this dataset, named Positive, Negative, and Neutral, by calculating the sentiment scores of the tweets:

nltk.download('vader_lexicon')
sentiments = SentimentIntensityAnalyzer()
data["Positive"] = [sentiments.polarity_scores(i)["pos"] for i in data["tweet"]]
data["Negative"] = [sentiments.polarity_scores(i)["neg"] for i in data["tweet"]]
data["Neutral"] = [sentiments.polarity_scores(i)["neu"] for i in data["tweet"]]
data = data[["tweet", "Positive", "Negative", "Neutral"]]
print(data.head())
                                               tweet  Positive  Negative  \
0  russianembassi ft mfarussia jeffdsach csdcolum...     0.077     0.284   
1  kidnap without charg access lawyer putin russi...     0.000     0.000   
2  much western civil everyon feel compel find cr...     0.144     0.259   
3  russianembassi love place ill visit sure next ...     0.291     0.126   
4  israelipm iaeaorg didnt know state israel advi...     0.000     0.000   

   Neutral  
0    0.639  
1    1.000  
2    0.596  
3    0.583  
4    1.000  
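The three scores can be reduced to one label per tweet by comparing Positive with Negative, which is the same comparison this article uses to pick tweets for the positive and negative word clouds. The sketch below is one possible way to do it: the function `label_sentiment` is hypothetical, and calling a tie "neutral" is my own assumption. The score pairs are copied from the five rows printed above.

```python
from collections import Counter

def label_sentiment(positive, negative):
    """Label a tweet by whichever of the two scores is larger."""
    if positive > negative:
        return "positive"
    if negative > positive:
        return "negative"
    return "neutral"  # assumption: equal scores count as neutral

# (Positive, Negative) pairs from the five rows shown above.
scores = [(0.077, 0.284), (0.000, 0.000), (0.144, 0.259),
          (0.291, 0.126), (0.000, 0.000)]

labels = [label_sentiment(pos, neg) for pos, neg in scores]
print(labels)
print(Counter(labels))
```

For these five rows, two tweets lean negative, one leans positive, and two have no positive or negative words that VADER recognizes. Tweets of that last kind end up in neither of the two word clouds.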

Now let’s have a look at the most frequent words used by people with positive sentiments, that is, in tweets whose Positive score is higher than their Negative score:

positive =' '.join([i for i in data['tweet'][data['Positive'] > data["Negative"]]])
stopwords = set(STOPWORDS)
wordcloud = WordCloud(stopwords=stopwords, background_color="white").generate(positive)
plt.figure( figsize=(15,10))
plt.imshow(wordcloud, interpolation='bilinear')
plt.axis("off")
plt.show()

Now let’s have a look at the most frequent words used by people with negative sentiments, that is, in tweets whose Negative score is higher than their Positive score:

negative =' '.join([i for i in data['tweet'][data['Negative'] > data["Positive"]]])
stopwords = set(STOPWORDS)
wordcloud = WordCloud(stopwords=stopwords, background_color="white").generate(negative)
plt.figure( figsize=(15,10))
plt.imshow(wordcloud, interpolation='bilinear')
plt.axis("off")
plt.show()

So this is how you can analyze the sentiments of people over the Ukraine and Russia war. I hope this war gets over soon and things get back to normal.

Summary

There are a lot of tweets about the Ukraine and Russia war in which people share updates from the ground, how they feel about it, and which side they support. I used those tweets for the task of Twitter sentiment analysis on the Ukraine and Russia war. I hope you liked this article. Feel free to ask your questions in the comments section below.
