Twitter Sentiment Visualization

This project studies ways to estimate and visualize sentiment for short, incomplete text snippets. Sentiment is defined as "an attitude, thought, or judgment prompted by feeling." Our specific goal is a visualization that presents basic emotional properties embodied in the text, together with a measure of the confidence in our estimates.

Web Application

Visualizing Tweets

Figure: Examples of the visual features assigned to a circle to represent its tweet's estimated sentiment: colour (blue for unpleasant, green for pleasant); brightness (brighter for more aroused); size and transparency (larger and more opaque for more confidence in the sentiment estimate).

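Collections of tweets are visualized in several ways: by sentiment, by topic, by frequent terms, and so on. Individual tweets are drawn as circles. Each circle's colour, brightness, size, and transparency encode a different detail about the sentiment of its tweet: colour shows pleasure (blue for unpleasant, green for pleasant), brightness shows arousal (brighter for more aroused), and size and transparency show confidence in the sentiment estimate (larger and more opaque for more confidence).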
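The mapping from sentiment to circle appearance can be sketched as follows. This is only an illustration: the function name circle_style, the midpoint of 5 used to split blue from green, the assumption that confidence lies between 0 and 1, and the radius and opacity ranges are all hypothetical choices, not values taken from the application.

```python
from dataclasses import dataclass


@dataclass
class CircleStyle:
    hue: str           # "blue" (unpleasant) or "green" (pleasant)
    brightness: float  # 0.0 (calm) to 1.0 (highly aroused)
    radius: float      # larger for more confidence
    opacity: float     # more opaque for more confidence


def clamp01(x: float) -> float:
    """Clamp a value to the range [0, 1]."""
    return max(0.0, min(1.0, x))


def circle_style(valence: float, arousal: float, confidence: float,
                 min_radius: float = 4.0, max_radius: float = 12.0) -> CircleStyle:
    # Assumption: the scale midpoint (5 on the 1..9 scale) separates
    # unpleasant from pleasant tweets.
    hue = "green" if valence >= 5.0 else "blue"
    # Rescale arousal from the 1..9 rating scale to 0..1.
    brightness = clamp01((arousal - 1.0) / 8.0)
    # Assumption: confidence is already normalized to 0..1.
    c = clamp01(confidence)
    radius = min_radius + c * (max_radius - min_radius)
    opacity = 0.25 + 0.75 * c  # hypothetical floor so faint circles stay visible
    return CircleStyle(hue, brightness, radius, opacity)
```

Keeping size and opacity tied to the same confidence value mirrors the description above, where both features grow together as the estimate becomes more reliable.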

Representing Emotion

In psychology, emotional models have been proposed to define and compare emotional states. These models often use emotional dimensions to position emotions on a 2D plane. The simplest models represent pleasure along a single horizontal axis, with highly unpleasant on one end, highly pleasant on the other, and different levels of pleasure in between. More complex models use more than one dimension. For example, Russell proposed using valence (or pleasure) and arousal (or activation) to build an emotional circumplex of affect. Russell applied multidimensional scaling to position 28 emotional states, producing the model shown to the left with valence running along the horizontal axis and arousal along the vertical axis. The intermediate terms excited–depressed and distressed–relaxed are polar opposites formed by intermediate states of valence and arousal. Similar models have been proposed by Watson and Tellegen (with positive and negative valence axes), Thayer (with tension and energy axes), and Larsen and Diener (with pleasure and activation axes similar to Russell's).

Estimating Sentiment

Computational methods to estimate sentiment include machine learning algorithms like naive Bayes classifiers, support vector machines, and maximum entropy approaches, or combinations of common-sense reasoning and affective ontologies (e.g., sentic computing) to perform a concept-level analysis of natural language text.

One requirement for these traditional approaches is enough high-quality text to allow for accurate natural language evaluation. Some researchers argue that this is not necessarily available in short text snippets like tweets, instant messages, or SMS texts. Several researchers have suggested an alternative method: the use of dictionaries that report the sentiment of a set of words along one or more emotional dimensions. Examples of sentiment dictionaries include POMS and POMS-ex (Profile of Mood States) and ANEW (Affective Norms for English Words).

We chose to build our own dictionary, using a combination of terms from ANEW, an extended ANEW dictionary that was recently built by researchers at McMaster and Ghent Universities, and a happiness dictionary built by researchers at the University of Vermont.

Our sentiment dictionary provides measures of valence and arousal for approximately 10,680 English words. Each word is rated on a nine-point scale ranging from 1 to 9. Words included in the dictionary were selected from previous research that identified them as good candidates to convey emotion. For example, to construct the ANEW dictionary, volunteers were asked to read a text corpus and provide a rating along each dimension for each occurrence of an ANEW-recognized word. Ratings for the same word are combined into a mean rating and a standard deviation of the ratings for each dimension.

For example, for the word house, ANEW reports a mean valence v of 7.26 with a standard deviation of 1.72, a mean arousal a of 4.56 with a standard deviation of 2.41, and a frequency fq of 591 ratings.
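A dictionary entry of this kind can be represented with a small record type. The sketch below is only illustrative: the names Rating, SENTIMENT, and lookup are hypothetical, and the single entry uses the values quoted above for house.

```python
from typing import NamedTuple, Optional


class Rating(NamedTuple):
    v_mean: float  # mean valence (pleasure), 1..9 scale
    v_sd: float    # standard deviation of valence ratings
    a_mean: float  # mean arousal, 1..9 scale
    a_sd: float    # standard deviation of arousal ratings
    fq: int        # number of ratings collected for the word


# Hypothetical in-memory dictionary holding the one entry quoted in the text.
SENTIMENT = {
    "house": Rating(v_mean=7.26, v_sd=1.72, a_mean=4.56, a_sd=2.41, fq=591),
}


def lookup(word: str) -> Optional[Rating]:
    """Return the rating for a word, or None if it is not in the dictionary."""
    return SENTIMENT.get(word.lower())
```

Storing the standard deviation next to each mean matters later, because the spread of the ratings is what determines how much weight a word receives.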

We decided to use ANEW's approach of valence and arousal ratings to estimate the pleasure and arousal for a tweet's text. We performed post-processing to obtain full valence and arousal means and standard deviations for terms from the extended ANEW and happiness dictionaries. One consequence of this is that not all the terms included in these sources are present in our own sentiment dictionary. Given our dictionary, we estimate an overall valence and arousal for each tweet by combining the ratings of the dictionary words that occur in it.

For example, consider a tweet returned by a query for tweets containing the two keywords "health care". Two of its words, health and win, occur in the sentiment dictionary.

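For these n = 2 words, the dictionary reports a mean and standard deviation for both valence and arousal. For valence, health has μ_v,1 = 6.81 and σ_v,1 = 1.88, and win has μ_v,2 = 8.38 and σ_v,2 = 0.92.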

To combine the means for health and win, we assume that the individual ratings reported for each word form a normal distribution. Intuitively, if a word has a higher standard deviation, for example, a higher σ_v,i for valence, the valence ratings for the word were spread across a wider range of values. If σ_v,i were lower, ratings for the word clustered closer to the mean. Based on this, we use the probability density function of a normal distribution to estimate the density of the word's ratings exactly at the mean. These densities are applied as weights when we combine the means, so a word with more tightly clustered ratings contributes more. Using this weighting, we compute an overall mean valence Μ_v and an overall mean arousal Μ_a for the tweet.

Notice that if we'd simply used an arithmetic mean to compute the overall mean valence Μ_v, we would have reported Μ_v = (6.81 + 8.38) / 2 = 7.59. However, the standard deviation of valence for health (σ_v,1 = 1.88) is higher than the standard deviation for win (σ_v,2 = 0.92). Because of this, we weight win's mean valence μ_v,2 = 8.38 higher than health's μ_v,1 = 6.81. The result is an overall mean Μ_v = 7.86 that falls closer to win's mean valence. A similar result can be seen for overall mean arousal Μ_a.
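The weighting described above can be checked numerically. The sketch below assumes the weights are normalized by their sum, which the text does not state explicitly but which reproduces the reported Μ_v = 7.86. The function names are hypothetical. The density of a normal distribution at its own mean is 1 / (σ √(2π)), so each weight is inversely proportional to the word's standard deviation.

```python
import math


def density_at_mean(sd: float) -> float:
    """Normal probability density evaluated at the distribution's own mean."""
    return 1.0 / (sd * math.sqrt(2.0 * math.pi))


def overall_mean(ratings: list[tuple[float, float]]) -> float:
    """Weighted average of (mean, standard deviation) pairs.

    Assumption: weights are normalized by their sum so the result stays
    on the original 1..9 rating scale.
    """
    weights = [density_at_mean(sd) for _, sd in ratings]
    total = sum(weights)
    return sum(w * mean for w, (mean, _) in zip(weights, ratings)) / total


# Valence (mean, standard deviation) for health and win, as quoted in the text.
valence = [(6.81, 1.88), (8.38, 0.92)]
print(round(overall_mean(valence), 2))  # 7.86
```

When every word has the same standard deviation the weights are equal, and the result reduces to the ordinary arithmetic mean.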

Visualization Techniques

Tweets are presented using several different visualization techniques. Each technique is designed to highlight different aspects of the tweets and their sentiment.

Sentiment Tab

The sentiment tab visualizes where tweets lie in an emotional scatterplot with pleasure and arousal on its horizontal and vertical axes. The spatial distribution of the tweets summarizes their overall sentiment.

Hover the mouse cursor over a tweet to reveal its body. Words in the sentiment dictionary are highlighted in bold italics. Clicking on a tweet generates a detail dialog with the overall pleasure and arousal for the tweet, as well as each dictionary term's mean and standard deviation of pleasure, mean and standard deviation of arousal, and frequency. Recall that the dictionary's pleasure and arousal estimates were made on a nine-point scale ranging from 1 to 9.

Topics Tab

The topics tab identifies tweets that discuss a common topic or theme. Each topic is visualized as a rectangular group of tweets, with keywords at the top to summarize the topic and a count at the bottom giving the number of tweets in the cluster.

Tweets within each cluster are laid out so that the distance between them shows their text similarity: closer for stronger similarity. Topic cluster rectangles are positioned in the same way: closer for more similar topics. Tweets that are not part of any topic are visualized as singletons on the right.
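The text does not say how text similarity is measured. As one possible illustration, the sketch below uses cosine similarity between simple word-count vectors and turns it into a layout distance. This is a stand-in technique chosen for the example, not the application's documented method, and all function names are hypothetical.

```python
import math
from collections import Counter


def term_vector(text: str) -> Counter:
    """Count whitespace-separated, lowercased terms in a tweet."""
    return Counter(text.lower().split())


def cosine_similarity(a: str, b: str) -> float:
    """Cosine of the angle between two term-count vectors (0 = no shared terms)."""
    va, vb = term_vector(a), term_vector(b)
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm = (math.sqrt(sum(c * c for c in va.values()))
            * math.sqrt(sum(c * c for c in vb.values())))
    return dot / norm if norm else 0.0


def layout_distance(a: str, b: str) -> float:
    """Smaller distance for more similar tweets, so they are drawn closer together."""
    return 1.0 - cosine_similarity(a, b)
```

Any similarity measure with a bounded range could be substituted here; the only property the layout relies on is that stronger similarity maps to a shorter distance.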

As with the sentiment tab, hovering the mouse over a tweet or clicking a tweet reveals more details about its content and its estimated sentiment.

Heatmap Tab

The heatmap visualizes the number of tweets within different sentiment regions. It highlights "hot" red regions with many tweets, and "cold" blue regions with only a few tweets.

The emotional scatterplot is subdivided into an 8 × 8 grid of bins representing one-unit steps in pleasure and arousal. The number of tweets falling within each bin is counted and visualized using colour: red for bins with more tweets than average, and blue for bins with fewer tweets than average. White bins contain no tweets. Stronger, more saturated colours lie farther from the average.
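The binning step can be sketched as follows. Two details are assumptions made for the example: the average is taken over non-empty bins only, and a bin whose count equals the average is labelled "neutral". The function names are hypothetical.

```python
def bin_index(score: float, lo: float = 1.0, bins: int = 8) -> int:
    """Map a 1..9 score to a one-unit bin index 0..7 (a score of 9 falls in the last bin)."""
    return max(0, min(bins - 1, int(score - lo)))


def heatmap_counts(points: list[tuple[float, float]], bins: int = 8) -> list[list[int]]:
    """Count (valence, arousal) points per bin; rows are arousal, columns are valence."""
    grid = [[0] * bins for _ in range(bins)]
    for valence, arousal in points:
        grid[bin_index(arousal, bins=bins)][bin_index(valence, bins=bins)] += 1
    return grid


def classify_bins(grid: list[list[int]]) -> list[list[str]]:
    """Label each bin white (empty), red (above average), or blue (below average)."""
    occupied = [c for row in grid for c in row if c > 0]
    # Assumption: the average is computed over non-empty bins only.
    avg = sum(occupied) / len(occupied) if occupied else 0.0

    def label(count: int) -> str:
        if count == 0:
            return "white"
        if count > avg:
            return "red"
        if count < avg:
            return "blue"
        return "neutral"  # hypothetical label for bins exactly at the average

    return [[label(c) for c in row] for row in grid]
```

A renderer could then scale colour saturation by how far each count lies from the average, matching the description of stronger colours for bins farther from the average.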

Hovering the mouse over a heatmap bin reveals the number of tweets that lie in the bin.
