nc-state logo
Tableau
Christopher G. Healey

Introduction

This module will introduce you to Tableau, a tool designed to manipulate and visualize large datasets. We will work through a simple example of loading data, visualizing it in different ways, combining the individual visualizations into a dashboard, and publishing the dashboard online so anyone with a web browser can view the results.

Setup

In order to work with Tableau, you will need a copy of the software. You will also need to register for a Tableau Public Account, which allows you to upload your visualizations for hosting on Tableau's servers. You should have received instructions on how to download and activate Tableau Desktop 10. If you haven't done that, make sure you have a working copy of Tableau Deskop 10 running on your machine. You will also need to create a Tableau Public account by visiting the Tableau Public Account page to create an account to host your visualizations.

Basic Tableau

In order to visualize, we need to start with a dataset. We'll be using a modified version of a Excel activity dataset originally provided by Peter Aldhous as part of his Tableau demo. Much of the material we use here is taken from this demo, then updated for Tableau 8 and extended to apply some of the visualization principles we learned during the summer visualization lectures.

Once you've downloaded the activity dataset to your computer, open it in Excel and spend a few minutes familiarizing yourself with its different fields.

Loading Data

To start, we need to load the activity dataset into Tableau Public.

  1. Launch Tableau Public.
  2. Choose "Excel" from the "Connect To a file" option list.
  3. Select the "USDA_activity_dataset.xls" file that you downloaded, and choose to "Open" it.
  4. Tableau will begin loading and parsing the file. When it's finished, choose the "Go to Worksheet" button that appears as a small popup in the bottom-left of the screen.

Dimensions and Measures

If you loaded the dataset correctly, Tableau should present a window that looks similar to this.

Tableau's initial worksheet after loading the activity dataset.
tableau-start

Notice that Tableau has divided the data in the Excel file into Dimensions and Measures. Tableau calls these data roles. It's very important to understand the distinction between these two concepts.

Tableau treats fields that contain categorical or qualitative information as a dimension. So, fields like "State" (containing U.S. state names), "County" (containing county names within a state), and "Voting in 2008" (containing "D", "R", or "Null" based on how a county voted) are all marked as dimensions by Tableau.

Measures are fields containing numeric or quantitative information. Most of the fields in the activity dataset are measures, for example, "% Smokers", "Adult obesity rate", and "Child food insecurity". More formally, a measure is a field that is dependent on one or more dimensions. For example, if I wanted to visualize "% Smokers", I normally want to categorize the results, say, as "% Smokers by State". Here, "% Smokers" is a measure that's been made dependent on the "State" dimension. If you don't provide a dimension, Tableau will simply aggregate the measure over the entire dataset.

A map showing average % smokers by state (note that because Tableau is weighting each county's average equally, these results are most likely inaccurate).
tableau-smoker-by-state

The most common reason to distinguish between dimensions and measures is because of how they act in a visualization. In general, placing a dimension on the "Rows" or "Columns" shelf creates an axis of individual categories headings. You can try this: drag "State" to the "Rows" shelf, and you should see a vertical axis with 51 positions: "Alabama", "Alaska", "Arizona" down to "Wyoming".

If you place a measure on the "Rows" or "Columns" shelf, Tableau normally creates an axis showing a continuous scale over the measure's range of values. You can try this as well: drag "% Smokers" to the "Columns" shelf. Now, you should see a horizontal bar chart showing the sum of "% Smokers" for each of the individual states. Right-clicking on the "% Smokers" field in the "Columns" shelf, choosing "Measure", then selecting "Average" from the submenu changes the aggregation method to average the percentage of smokers in each state's counties.

A horizontal bar graph of the average percentage of smokers by state.
tableau-state-avg-smoker

Field Type

Each field also has a small icon to its left. This identifies its current data type: "ABC" for strings, "#" for numbers, and a globe icon for fields that provide geographic information.

Discrete and Continuous

Finally, Tableau views each field as either discrete or continuous. Discrete fields are coloured blue, and continuous fields are coloured green. Normally, dimensions are discrete and measures are continuous, although this is not always true.

Dot Map

To start, we'll generate a dot map of the adult obesity rate for each county.

  1. Clear the worksheet by choosing "Clear → Sheet" from the "Worksheet" menu.
  2. Double-click the "County" dimension (if the "State, County" Dimension hierarchy is closed, open it by clicking on the triangle to its left to expose the "County" field).
  3. You should see a map with one blue dot in the center of each county throughout the United States.
  4. Next, find the "Adult obesity rate" measure, and drag it onto the "Color" button in the "Marks" section. You should see a map with each county colored a shade of blue based in the county's adult obesity rate.
  5. By default, Tableau aggregates multiple data points for a given region (in our case, for a given county) by summing. We want to average the adult obesity rate, so right-click on the "AVG(Adult obesity)" pill in the Marks section, and change "Measure(Sum)" by choosing Average in its submenu.
  6. Click on the "Automatic" pop-up menu at the top of the "Marks" section, and choose "Circle" as the mark type. The map will change to again show a circle centered in each county, but now coloured by adult obesity rate.
  7. Next, click on the "Color" button in the "Marks" section, and choose "Edit Colors..."
  8. Click on the "Automatic" pop-up "Palette" menu, and choose the "Red-Blue Diverging" colour scale. Click on the "Reversed" checkbox to select it, then click "OK".
  9. The final map shows red and blue circles: blue for counties with an adult obesity rate below the median, and red for counties with a rate above the median.
A map showing average obesity rate by county using a red–blue colour scale: red for values above the median, blue for values below the median, and saturation for values farther from the median.
tableau-obesity-red-blue

Scatterplots

Next, we'll create some scatterplots to compare adult obesity rate to other, possibly related measures.

  1. Choose "New Worksheet" from the "Worksheet" menu to create a new, empty worksheet to hold the scatterplot visualizations.
  2. Drag the "Adult obesity rate" measure to the "Rows" shelf.
  3. Drag the "% Smokers", "% Eating Few Fruits And Vegetables", and "% Who Do Not Exercise" measures onto the "Columns" shelf, one after another.
  4. Notice that each scatterplot only contains a single circle representing the sum (over all data samples) of adult obesity rate to the average of percentage smokers, percentage eating few fruits and vegetables, and percentage who do not exercise. To drill down to the county level, click on the "All" option in the "Marks" section, then drag the "County" dimension to the "Marks" section, and drop it the blank region below the buttons (that is, do not drop it on any of the buttons).
  5. Change all four attributes from Sum aggregation to Average aggregation by right-clicking their pills and selecting "Measure (Sum) → Average".
  6. Click on the "Automatic" pop-up menu at the top of the "Marks" section, and choose "Circle" to show filled rather than empty circles.
  7. Click on the "Color" button in the "Marks" section, and select a shade of purple for the circle colour. Change "Transparency" to 50%, then click the "Color" button again to close its options.
  8. In the "Analysis" menu choose "Trend Lines → Show All Trend Lines".
  9. Finally, right-click on each trend line, choose "Edut Trend Lines..." and selection the checkbox "Show Confidence Bands".
Three scatterplots showing comparing average obesity rate by county to percentage of smokers, percentage eating few fruits and vegetables, and percentage who do not exercise.
tableau-scatterplot

Dual-Axis Line Graph

Often we want to compare variables across a sequence like time or a categorical dimension like county or state. Here, we'll build a line graph to represent Adult Obesity and % Smokers, to see if there's any visual correlation between the two.

  1. Choose "New Worksheet" from the "Worksheet" menu to create a new, empty worksheet to hold the line graphs.
  2. Drag the "State" dimension to the "Columns" shelf.
  3. Drag the "% Smokers" and "Adult obesity rate" measures to the "Rows" shelf, one after another.
  4. Notice that this produces two independent bar charts, one for "Adult obesity rate" and one for "% Smokers". To convert them to line charts, select "All" in the "Marks" section, click on the "Automatic" pop-up menu at the top of the card, and choose "Line" to show two line graphs.
  5. Next, we want to combine the two line graphs into a single, dual-axis graph. To do this, right-click on the lower graph's vertical axis, "Adult obesity rate", and choose "Dual Axis" from the pop-up menu.
Dual-axis line graphs comparing sum of obesity rate by state to sum of percentage of smoker.
tableau-dual-line-01

At this point, we have a basic dual-axis graph, but there are a number of formatting modifications we can make to improve on Tableau's defaults.

  1. Since we're aggregating across all counties in a state, we want the average obesity and smoker rates, not the sum of all counties. Right click the "SUM(% Smokers)" pill in the Rows shelf, select the "Measure (Sum)" entry, and choose "Average" from submenu. Do the same thing for the "SUM(Adult obesity rate)" pill.
  2. Finally, we'd like the left axis to show values with a "%" sign at the end. To do this, right-click the "Avg % Smokers" label and choose "Format...". The list of dimensions and measure on the leftmost column will change to a "Format AVG(% Smokers)" panel. In the "Numbers:" pop-up, choose "Percentage", set the "Decimal places:" to 0, then click the "X" at the top of the Format panel to close it.
Dual-axis line graphs comparing sum of obesity rate by state to sum of percentage of smoker, but with incorrect percentages on the vertical axis.
tableau-dual-line-02

Although this adds a "%" to the values on the vertical axis, it also exposes a problem: percentages run from 0% to 3500%, because the "% Smokers" column is coded on a range 0–100, and not 0.0–1.0. You can confirm this by choosing the "Data" menu and selecting "Connect to me(USDA_activity_dataset) → View Data", then looking at the "AVG(% Smokers)" column.

To address this, we will create a new calculated field that spans the 0.0–1.0 range we want.

  1. Choose the menu option "Analysis → Create calculated field..." to open the calculated field dialog.
  2. Name the field "% Smokers Fraction" in the text field at the top of the dialog.
  3. Click in the main body of the dialog (below the line beneath the field's title), and enter the formula "[% Smokers] / 100.0". Notice that Tableau will offer variable completion suggestions as you do this.
  4. If you've entered the formula correctly, Tableau should report "The calculation is valid." at the bottom of the dialog. Click "OK" to save the formula.
  5. The new calculated field "% Smokers Fraction" should now appear in the list of Measures. It will have a small "=" sign to the left of the "#" sign to indicate it is a calculated field.

We could remove the original "AVG(% Smokers)" from the Rows shelf and replace it with the new "AVG(% Smokers Fraction)", but rather than go through all the work of rebuilding the dual-axis graph, we'll instead edit the "AVG(% Smokers)" pill in-place on the shelf.

  1. Right-click the "AVG(% Smokers)" pill and choose "Edit in shelf".
  2. Change the text to "AVG[(% Smokers Fraction)]".
  3. Hit Ctrl+Return to accept the new pill definition.
  4. Click outside the pill to return to the worksheet.
Dual-axis line graphs comparing sum of obesity rate by state to sum of percentage of smoker with correct percentages on the vertical axis.
tableau-dual-line-03

The graph will now show an orange "Avg. % Smokers Fraction" line graph and a blue "Avg. Adult obesity rate" line graph, overlaid on top of one another.

Dashboard

Tableau allows individual visualizations to be combined into a common view called a dashboard. We'll use a dashboard to combine our map and scatterplot visualizations.

  1. Choose "New Dashboard" from the "Dashboard" menu.
  2. Double-click the "Sheet 1" label in the tab at the bottom of the screen, and rename the sheet to "Map View".
  3. Double-click the "Sheet 2" label in the tab at the bottom of the screen, and rename the sheet to "Scatterplot View".
  4. Double-click the "Sheet 3" label in the tab at the bottom of the screen, and rename the sheet to "Dual Line View".
  5. Click on the "Dashboard 1" tab, and your three visualization sheets should be listed in the "Sheets" list to the left of the dashboard.
  6. Drag the map visualization sheet onto the dashboard region (the region that says "Drop sheets here").
  7. Your map visualization should appear inside the dashboard.
  8. Drag the scatterplot visualization sheet and drop it near the bottom of the map visualization in the dashboard.
  9. The scatterplot visualization should appear below the map visualization in the dashboard.
  10. Drag the dual line visualization sheet and drop it near the bottom of the scatterplot visualization in the dashboard.
  11. The dual line visualization should appear below the scatterplot visualization in the dashboard.
  12. To size the visualization, click on the size option popup beneath the "Size" title, and within the new dialog, click "Fixed Size" and change it to "Automatic".
  13. Finally, to give a bit more room to each visualization, right-click on each of the titles in the three visualizations and choose "Hide Title".
A Tableau dashboard containing the map and the scatterplot visualizations.
tableau-dashboard-wo-state

Action Filter

Although the current visualization is a good start, showing all the data for the entire country makes it cluttered and difficult to see more detailed patterns. It would be useful to allow viewers to interactively drill down and see data for individual states. Tableau provides action filters to allow this type of interaction.

  1. Choose "New Worksheet" from the "Worksheet" menu to create a new, empty worksheet to hold the State action filter.
  2. Drag the "Sheet 4" tab that's created to the left of the "Dashboard 1" tab, double-click the "Sheet 4" tab, and rename it to "State Filter".
  3. Drag the "State" dimension to the "Rows" shelf.
  4. Drag the "State" dimension to the "Text" button in the "Marks" section, to label each state with its name.
  5. Right-click the "State" dimension in the "Rows" shelf, and uncheck "Show Header". Since a state's name and its header are identical, the header is unnecessary.
  6. Double-click the "State Filter" title at the top of the sheet. An "Edit Title" dialog will, allowing you to change the title from "<Sheet Name>" to something like "Select a State". Click "OK" when you've chosen your sheet title.
A list of state names to allow viewers to filter data by state.
tableau-state-filter

Next, we need to add this sheet to the Dashboard, and tell Tableau to use its current selection as a filter for the map and scatterplot visualizations.

  1. Click the "Dashboard 1" tab to show the dashboard in the main window.
  2. Drag the State Filter worksheet you just created and drop it to the right of the map, below the Adult obesity rate and Measure Names legends.
  3. Select the State Filter worksheet in the dashboard (a frame will appear around it when you do this), then right-click the frame and select "Use as Filter".
  4. Finally, right-click the frame, select "Fit", and choose "Fit Width" to allow the state list to use the entire width provided by the dashboard.
A Tableau dashboard containing the map and scatterplot visualizations, and a state-level filter.
tableau-dashboard

Now, the state list acts as a dynamic filter to the data being shown in the other visualizations. Click a state, and only the data for that state will be shown in the map and scatterplot visualizations. To show all the data, click the currently selected state to unselect it and restore the visualizations to their default formats.

Publishing

The final step is to upload your work to the web. This serves two purposes. First, it saves your sheets and dashboards so you can load and modify them later. Second, it provides a method to share your dashboard through a web browser.

To publish to Tableau Public from Tableau Desktop, you first need to create an extract of the data that feeds your dashboard. This will be uploaded to Tableau Public along with your worksheets and dashboard. To create the extract, choose "Connect to me (USDA_Activity_dataset) → Extact Data..." from the "Data" menu. An "Extract Data" dialog will appear. Click the "Extract" button, choose a name and location for the extract file, and click "Save". The extract is created and bound to your Tableau workbook, so if you move or delete the extract, Tableau will complain when it tries to re-open your workbook. At that point, you will be offered options to locate the extract, remove it, deactivate it, or regenerate it.

Once the extract is created, we can save both the dashboard and its associated data extract to Tableau Public.

  1. Open the "Server" menu and choose "Tableau Public → Save to Tableau Public As..."
  2. Enter the user id and password you chose when you created your Tableau Public Account.
  3. Choose a name for your workbook, then click "Save"
  4. Tableau will show a progress bar as its uploads the data, then it will open your default browser and display your dashboard.
  5. Click the "Share" button on the bottom of the dashboard, and a popup will appear with an "Embed Code" field containing the Javascript needed to embed your dashboard in a web page, and a "Link" field containing the URL to view your dashboard in a web browser.
  6. Select the "Link" URL and Copy (Ctrl+C) it to the clipboard.
  7. Open a new tab in your web browser, then paste the link you copied from Tableau Public and load it in a new web browser tab. You should see your dashboard, with the interactive state filter, displayed in the web browser.

If you run Tableau at a later date and want to continue working on your visualizations, choose "Tableau Public → Open From Tableau Public..." in the "Server" menu to load your workbook from Tableau's servers. This will restore your workspace to its previously saved state.

This tutorial only begins to explain the power and flexibility of Tableau. You can use the Tableau textbook to explore Tableau in more detail. You can also come to talk to me or Andrea if you're having problems, and we can help you find instructions to construct the visualizations you want to build.

Also, remember that Tableau is a tool to allow you to build visualizations. Like any tool, Tableau doesn't necessarily guarantee that the visualizations you produce are good visualizations. Try to apply the design principles we discussed in the visualization lecture, to make sure your visualizations are clear, effective, and well structured. The default visualizations that Tableau provides are unlikely to offer this, since Tableau doesn't have your domain expertise and understanding of the data's context to decide how best to present it.