Blog 16: Sentiment analysis of the ‘vaccination’ sub-corpus (en, part.2)

3 February 2022


Research Notebook

'Mixology' is an open research project that aims to extract opinions in times of crisis, here from a corpus collected via the Twitter API from December 12 to 31, 2021.

The second part of the sentiment analysis of the English ‘vaccination’ sub-corpus focused on the sets of tweets formed by the categories obtained through topic modelling with the LDA and CTM algorithms (Latent Dirichlet Allocation and Correlated Topic Model, see Blog 9). The sentiment analysis was carried out using the Mixology Lexicon, which combines six dictionaries regularly used in sentiment analysis and the Mixology Covid Lexicon, built as part of this research to adapt the terms to the field.
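As a rough illustration of lexicon-based scoring, here is a minimal Python sketch. It is not the project's actual pipeline: the two toy dictionaries, the rule that domain entries override general ones, and the names `merge_lexicons`, `score_tweet` and `label` are all assumptions made for this example. The real Mixology Lexicon is not reproduced here.

```python
import re

# Toy stand-ins for the source dictionaries (hypothetical entries,
# not taken from the Mixology Lexicon).
GENERAL_LEXICON = {"good": 1, "safe": 1, "positive": 1, "bad": -1, "fear": -1}
# Hypothetical domain entries: in a Covid context, "positive" often
# refers to a test result, so this sketch scores it negatively.
DOMAIN_LEXICON = {"booster": 1, "positive": -1}

TOKEN_RE = re.compile(r"[a-z']+")


def merge_lexicons(*lexicons):
    """Merge dictionaries; later ones override earlier ones (an assumption)."""
    merged = {}
    for lexicon in lexicons:
        merged.update(lexicon)
    return merged


def score_tweet(text, lexicon):
    """Sum the polarity of every token found in the lexicon."""
    tokens = TOKEN_RE.findall(text.lower())
    return sum(lexicon.get(token, 0) for token in tokens)


def label(score):
    """Map a numeric score to a sentiment label."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


lexicon = merge_lexicons(GENERAL_LEXICON, DOMAIN_LEXICON)
print(label(score_tweet("The booster is safe and good", lexicon)))
print(label(score_tweet("I tested positive, bad luck", lexicon)))
```

The second tweet shows why a domain-adapted dictionary matters: with the general entries alone, "positive" would cancel out "bad".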

Analysis of all tweets shows a dominant positive sentiment in every country observed. The lead is narrow, however: the lowest positive share is observed in France and Luxembourg (49.71%) and the highest in the United Kingdom (54.1%). Overall, these results show few differences between countries.
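For readers who want to reproduce this kind of per-country figure, the following Python sketch computes the share of positive tweets by country. It assumes each tweet already carries a country code and a sentiment label, and that the denominator is all labelled tweets for that country. Whether the published percentages include neutral tweets in the denominator is not stated here, so treat that choice as an assumption. The rows are toy data with placeholder country codes, not project results.

```python
from collections import Counter, defaultdict


def positive_share_by_country(rows):
    """rows: iterable of (country, label) pairs.

    Returns {country: percentage of tweets labelled 'positive'}.
    """
    counts = defaultdict(Counter)
    for country, sentiment in rows:
        counts[country][sentiment] += 1
    return {
        country: round(100 * tally["positive"] / sum(tally.values()), 2)
        for country, tally in counts.items()
    }


# Toy rows with placeholder country codes (not real data).
toy_rows = [
    ("AA", "positive"), ("AA", "negative"), ("AA", "positive"),
    ("BB", "negative"), ("BB", "positive"),
]
shares = positive_share_by_country(toy_rows)
print(shares)
```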



These results need to be qualified, however, once the analysis focuses on themes. For this analysis, the thirteen themes that appeared most significant were selected:

  1. Antivax
  2. Booster
  3. Government
  4. Mandate
  5. Mask
  6. Media
  7. mRNA
  8. Omicron
  9. Pass
  10. Science
  11. Side effects
  12. Unvaccinated
  13. Vaccin* + Child*
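The notation of the last theme suggests a wildcard query in which both stems must appear in a tweet. That reading is an assumption, as are the helper names `wildcard_to_regex` and `matches_theme`. Under it, assigning a tweet to a theme could be sketched in Python as follows:

```python
import re


def wildcard_to_regex(term):
    """Turn 'vaccin*' into a case-insensitive prefix pattern.

    Terms without a trailing '*' are matched as whole words.
    """
    if term.endswith("*"):
        return re.compile(r"\b" + re.escape(term[:-1]) + r"\w*", re.IGNORECASE)
    return re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)


def matches_theme(text, terms):
    """A tweet belongs to a theme when every term of the theme matches."""
    return all(wildcard_to_regex(term).search(text) for term in terms)


# Hypothetical encoding of theme 13: both stems are required.
VACCINATION_OF_CHILDREN = ["vaccin*", "child*"]

print(matches_theme("Vaccinating children is under debate", VACCINATION_OF_CHILDREN))
print(matches_theme("Vaccines for adults only", VACCINATION_OF_CHILDREN))
```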

The results show that the negative trend mainly concerns the themes of anti-vax (which can refer to people opposing all vaccines or only Covid vaccines) and vaccine side effects. Sentiment is very mixed on the themes of government, compulsory vaccination and unvaccinated people. Positive sentiment prevails in eight themes: booster, mask wearing, media, mRNA, the Omicron variant, the health pass, science (except in Luxembourg, where the sample is too small to be considered representative of the opinions of Luxembourgers on Twitter) and the vaccination of children.

The analysis also highlights a polarization of the debates at the political level. Compulsory vaccination is perceived more negatively in Belgium, where the issue has entered the parliamentary debate. Anti-vax and unvaccinated people are perceived more negatively in Austria and France. Nevertheless, the trends are fairly similar from one country to another.


Comparative chart: https://ohmybox.info/rframe/corpus-1-vaccin-en-sentiment-lexicon/


These observations show that the debates around vaccination do not concern health aspects alone. They will need to be refined during the analysis of the second sub-corpus, which focuses specifically on the political measures taken to combat the pandemic. They will also be compared with the observations made in the general English corpus (where country labelling has not always been possible and whose geographical area is a little wider, but still within Western Europe) and with the two sub-corpora in French. Until then, a lot of analytical work remains to compare and improve the results.


# # #

Read more

Blog 21: Politicians, experts, and journalists

Blog 20: For vaccination, against restrictions

Blog 19: Comparative Sentiment Analysis

Blog 18: A health and political crisis

Blog 17: Anatomy of the “political/sanitary measures” sub-corpus (en)

Blog 16: Sentiment analysis of the ‘vaccination’ sub-corpus (en, part.2)

Blog 15: Comparative sentiment analysis of the ‘vaccination’ sub-corpus (en, part.1)

Blog 14: An adapted dictionary for the Covid crisis and sentiment analysis

Blog 13: Building a stop words list

Blog 12: Main Dictionaries for Sentiment Analysis

Blog 11: Statistical description of the corpus #RStats

Blog 10: Sentiment analysis or the assessment of subjectivity

Blog 9: Topic modeling of the ‘vaccination’ corpus (English)

Blog 8: Linguistic and quantitative processing of the ‘vaccination’ corpus (English, part.2)

Blog 7: Linguistic and quantitative processing of the ‘vaccination’ corpus (English, part.1)

Blog 6: Collecting the corpus and preparing the lexical analysis

Blog 5: The textclean package

Blog 4: Refining the queries

Blog 3: The rtweet package

Blog 2: Collecting the corpus

Blog 1: An open research project

The challenges of research on media use in times of crisis