Res Publica

Sentiment analysis of Italian political texts

Roberto Reale

reale.me/respublica

The future offers very little hope for those who expect that our new mechanical slaves will offer us a world in which we may rest from thinking.
Norbert Wiener, God & Golem, Inc., 1964

AI and Politics

Major Topics

  • influence
  • governance
  • transparency
  • analytics

Sentiment analysis

The use of computational methods to systematically identify, extract, quantify, and study affective states and subjective information.

Tools

  • natural language processing
  • text analysis
  • computational linguistics
  • biometrics

The Manifesto Project

The Manifesto Project provides the scientific community with parties’ policy positions derived from a content analysis of parties’ electoral manifestos.

It covers over 1000 parties from 1945 until today in over 50 countries on five continents.

Collections

  • Countries: Democratic countries, mostly member countries of the OECD.
  • Elections: Parliamentary (lower house) elections since the first democratic election in a country.
  • Parties: Programs of parties that gained at least one seat in parliament.
  • Documents: An authoritative document enacted and published by a party before an election that outlines a party’s policy plan for the time after the election and covers a broad range of policy issues.

Training and Rules

The coding (or annotation) is conducted by country experts.

The country expert coders are mostly political scientists or political science students, and are native speakers.

Structure of the Main Dataset

Each row in the dataset represents one electoral program.

The variables party and date jointly uniquely identify every row in the dataset.
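The composite key can be checked mechanically. The sketch below uses pandas on a tiny invented table; the party codes, dates and the `label` column are hypothetical placeholders, not values from the real dataset.

```python
import pandas as pd

# Invented rows for illustration only; real codes and dates come from the dataset.
df = pd.DataFrame(
    {
        "party": [101, 101, 202],
        "date": [201302, 201803, 201803],
        "label": ["A", "A", "B"],
    }
)


def is_unique_key(frame, columns):
    """Return True if no two rows share the same values in `columns`."""
    return not frame.duplicated(subset=columns).any()


key_ok = is_unique_key(df, ["party", "date"])   # the pair identifies each row
party_only_ok = is_unique_key(df, ["party"])    # party alone repeats across elections
print(key_ok, party_only_ok)
```

On the real data, the same check would be run on the loaded main dataset instead of the toy frame.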

The dataset covers 4282 manifestos issued at 715 elections in 56 countries.

Res Publica

Data Sets

  • Italian Parliament
  • Manifesto texts of parties

BOW Vectorization

  • Segmentation into semantic units
  • Tokenization into Bag-of-Words vectors (scikit-learn)
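A minimal sketch of the vectorization step, assuming the segments are already available as plain strings. The three Italian segments are invented examples, and the default `CountVectorizer` settings (lowercasing, tokens of two or more characters) are an assumption, not necessarily the configuration used in the project.

```python
from sklearn.feature_extraction.text import CountVectorizer

# Invented semantic units; in practice these come from the segmentation step.
segments = [
    "maggiori investimenti nella scuola pubblica",
    "meno tasse per le imprese",
    "maggiori investimenti per le imprese",
]

vectorizer = CountVectorizer()
X = vectorizer.fit_transform(segments)  # sparse matrix: one row per segment

# vocabulary_ maps each token to its column index in X.
vocab = vectorizer.vocabulary_
print(X.shape)
print(X.toarray()[2, vocab["imprese"]])
```

Each row of `X` is the Bag-of-Words vector for one segment: word order is discarded and only token counts remain.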

Classification Model

\[ p(y = k \mid \mathbf{x}, \mathbf{W}) = \frac{e^{z_k}}{\sum_{j=1}^K e^{z_j}}, \quad z_k = \mathbf{w}_k^\intercal\mathbf{x} \]
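The class probabilities can be computed directly in NumPy. This is a sketch, assuming the rows of `W` are the per-class weight vectors; the numbers are arbitrary, and subtracting the maximum score is a standard numerical-stability step that leaves the result unchanged.

```python
import numpy as np


def softmax_probabilities(W, x):
    """Return p(y = k | x, W) for every class k, with z_k = w_k . x."""
    z = W @ x
    z = z - z.max()          # shift for numerical stability; ratios are unaffected
    e = np.exp(z)
    return e / e.sum()       # normalize over all K classes


# Arbitrary example: K = 3 classes, 2 features.
W = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
x = np.array([2.0, 1.0])
p = softmax_probabilities(W, x)
print(p, p.sum())
```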

Classification Model

\[ L(\mathbf{W}, \mathbf{x}, \gamma) = -\log \frac{e^{z_k}}{\sum_{j=1}^K e^{z_j}} + \gamma \|\mathbf{W}\|_F \]
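The loss for a single example can be evaluated as follows. This is a sketch under the assumption that `k` is the index of the true class and that the penalty is the plain (not squared) Frobenius norm, as written in the formula above; the function name and inputs are illustrative.

```python
import numpy as np


def regularized_loss(W, x, k, gamma):
    """Negative log-probability of the true class k plus gamma * ||W||_F."""
    z = W @ x
    # log of the softmax denominator, computed stably (log-sum-exp)
    log_norm = z.max() + np.log(np.exp(z - z.max()).sum())
    nll = -(z[k] - log_norm)
    return nll + gamma * np.linalg.norm(W, "fro")


x = np.array([2.0, 1.0])

# With all-zero weights every class is equally likely, so the loss is log(K).
loss_uniform = regularized_loss(np.zeros((3, 2)), x, k=0, gamma=0.1)

W = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
loss_plain = regularized_loss(W, x, k=2, gamma=0.0)
loss_penalized = regularized_loss(W, x, k=2, gamma=0.1)
print(loss_uniform, loss_plain, loss_penalized)
```

Larger values of `gamma` push the optimizer toward smaller weights, trading training fit for robustness.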

Sentiment Index

\[ \frac{\mathbf{s}^\intercal\mathbf{x}}{\|\mathbf{s}\| \, \|\mathbf{x}\|} \]
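The index is a cosine similarity between two vectors. In the sketch below, `s` is assumed to be a vector of per-word sentiment weights over the vocabulary and `x` a Bag-of-Words vector; the slides do not define `s`, so this reading and the example numbers are assumptions. Returning 0.0 for a zero vector is also a choice made here, since the formula is undefined in that case.

```python
import numpy as np


def sentiment_index(s, x):
    """Cosine similarity between sentiment weights s and text vector x."""
    denom = np.linalg.norm(s) * np.linalg.norm(x)
    if denom == 0.0:
        return 0.0  # assumption: treat an empty text or empty lexicon as neutral
    return float(s @ x / denom)


# Hypothetical 3-word vocabulary: one positive, one negative, one neutral word.
s = np.array([1.0, -1.0, 0.0])
x = np.array([2.0, 0.0, 1.0])
idx = sentiment_index(s, x)
print(idx)
```

The value lies in [-1, 1] and does not depend on text length, because both vectors are normalized.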

A proof-of-concept

A web app has been developed, as a fork and evolution of the fipi project.

It predicts the political views expressed in texts and newspaper articles.
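A prediction step of this kind can be sketched end to end with scikit-learn. The training texts and the labels `welfare` and `economy` are invented for illustration. Note also that `LogisticRegression` penalizes the squared L2 norm of the weights by default, which is close to but not identical to the Frobenius penalty shown earlier, and that with two classes it reduces to binary logistic regression.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Invented toy training set; real training data would be coded manifesto segments.
train_texts = [
    "scuola pubblica sanita",
    "sanita scuola ospedali",
    "tasse imprese mercato",
    "mercato tasse investimenti",
]
train_labels = ["welfare", "welfare", "economy", "economy"]

# Bag-of-Words features followed by a logistic regression classifier.
model = make_pipeline(CountVectorizer(), LogisticRegression())
model.fit(train_texts, train_labels)

new_text = ["scuola sanita"]
predicted = model.predict(new_text)[0]
probabilities = model.predict_proba(new_text)[0]
print(predicted, dict(zip(model.classes_, probabilities)))
```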

It downloads, parses and analyzes political articles from six major Italian newspapers across the whole political spectrum, including Il Fatto Quotidiano, il Giornale, Libero, la Repubblica and Il Sole 24 Ore.
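The parsing step might look like the sketch below, which uses bs4 on an inline HTML string so that it runs without network access. The page structure (an `article` element containing `p` tags) is an assumption; each newspaper site has its own markup, and the helper name `extract_article_text` is hypothetical.

```python
from bs4 import BeautifulSoup

# Invented page; a real run would fetch the HTML from a newspaper URL first.
html = """<html><head><title>Titolo di prova</title></head>
<body><nav>Menu</nav>
<article><h1>Titolo di prova</h1>
<p>Primo paragrafo.</p><p>Secondo paragrafo.</p></article>
</body></html>"""


def extract_article_text(page_html):
    """Return the paragraph text of the main article, joined by spaces."""
    soup = BeautifulSoup(page_html, "html.parser")
    # Fall back to the whole page when there is no <article> element.
    container = soup.find("article") or soup
    paragraphs = [p.get_text(strip=True) for p in container.find_all("p")]
    return " ".join(paragraphs)


text = extract_article_text(html)
print(text)
```

The extracted string can then be passed to the vectorizer and classifier like any other text.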

A proof-of-concept

Based on Python (flask, scipy, scikit-learn, pandas and bs4), Docker and AWS Elastic Beanstalk.

Code available on GitHub

To be continued