Charts in Tableau
Course Overview
(Music) Hi everyone. My name is Adam Crahen, and welcome to my course, Tableau Desktop
Playbook: Building Common Chart Types. I'm the head of Data Visualization Engineering here
at Pluralsight. I am also a 2018 Tableau Zen Master and a cofounder of the data visualization
blog, thedataduo.com. Tableau is one of the world's most popular data visualization tools and is
consistently ranked as a leader in Gartner's Magic Quadrant for BI and Analytics platforms.
Tableau helps people see and understand their data, but you need to know the why and the how
behind data and analytics in order to be effective. In this course, we are going to build or enhance
your data visualization toolkit and equip you with the knowledge you need to be successful in the
world of analytics. Some of the major topics we will cover include presenting distributions and
comparisons of data using bar charts, presenting data over time with line charts, and learning
how to unlock additional context with statistical, non-standard, and advanced charts. By the end
of this course, you will know how to read, build, and format over 35 different charts. Along the
way, you will learn foundational and advanced concepts that are essential when working with
Tableau. These concepts will give you the knowledge to know what Tableau is going to draw on
the canvas before you even drag and drop a pill, and it will empower you to be successful when
communicating about data with your users. Before beginning the course, you should be familiar
with Tableau Desktop's primary purpose, have a general knowledge of navigating through the
tool, and know how to connect to your data. Having a basic analytics acumen would be helpful,
but if you are a beginner, you can still be successful with this course. I hope you'll join me on
this journey to learn about data visualization with the Tableau Desktop Playbook: Building
Common Chart Types course, at Pluralsight.
Hi, this is Adam Crahen. In this module, you will learn how to quickly display data with variations of the
text table in Tableau Desktop. We'll start as simple as it comes, by building a standard text table. And
then, we will transform that into chart types such as a highlight table and a heat map. Finally, we'll end
the module by transforming our standard text table into an interactive dot plot. For each chart type, we
will learn what they're used for and how to read them. We will look at some real-life examples, and then
we will walk through how to build that chart in Tableau. If you have multiple monitors or devices
available, this would be a good time to start your two-screen experience so you can follow along with the
demo to become more familiar with building these charts. When learning Tableau, I think you will always
learn more if you're behind the steering wheel.
Text Table
Before we can build a text table, we need to understand what it is. It is an arrangement of columns and
rows that organizes and positions our data. Columns or rows can contain headers, which describe the
values in each cell. Other common names for text tables are simply the table, a crosstab, or a pivot table,
depending on how it is structured. We use text tables in our everyday lives and we may not even realize
it. Have you ever checked the hourly weather on your phone? Or looked at the box score of a football
game? Or maybe checked the performance of your favorite data stock? These are simple but effective text
tables, and we use them all the time. Let's take a second to understand the pros and cons of building a text
table, starting with the pros. Text tables are easy to comprehend when clearly labeled with good headers.
We begin using tables in elementary school, so the barrier to entry is low. And even multidimensional
tables have a basic simplicity to them. Tables allow for precision. For example, you might want to show
dollar amounts down to the penny, which is not always something you would want to do in a
visualization. And now the cons. It is really hard to find important data, especially in a large table of
figures. It is almost impossible to see the shape of your data or to identify patterns and trends. And while
you can always add more rows to a table, think about how it would scale if someone had a few million
rows of data to consume. Would this be the best visualization to communicate about a dataset of this size?
Probably not. And finally, there is just no visual appeal. You will see throughout this course that there is a
balance between functionality and visual appeal and you do not always have to sacrifice one for the other.
That being said, let's jump into Tableau and learn about how to build every data analyst's least favorite
chart. However, as long as your audience keeps saying just show me the numbers, we may as well learn
how to build a good text table. As you can see, I have Tableau Desktop up and running. For this course,
we'll be working in Tableau 2018.3 and using the Sample - Superstore data source that comes packaged
with every Tableau installation. If you are following along, start out by clicking on Sample - Superstore in
the Saved Data Sources section. Before we start building anything, let's familiarize ourselves with Show
Me. Show Me is the place to go when you aren't sure where to start. Tableau can build effective charts
with minimal amounts of information or effort on your part. Simply hover over any of the chart
thumbnails to see the type of data Tableau needs to render the chart type. Navigating to the top left of
Show Me, we see that a text table requires one or more dimensions or one or more measures. Let's click
on Region in our Dimensions pane and see how Show Me changes. Notice that the text table is the only
thumbnail that lights up, indicating it is the only chart type that can be built using Show Me when a single
dimension is selected. With Region still selected, hold down your Ctrl or Command key and select Sub-
Category from the Dimensions pane and Sales from the Measures pane. Notice how Show Me has
changed. We now see a variety of chart types supported by one or more dimensions and one or more
measure. Notice how horizontal bars has a bold orange border around its thumbnail. This is indicating the
recommended chart type with the fields you've selected. Instead of taking Tableau's recommendation, let's
go ahead and click on text table. Tableau has provided us a simple text table with a few items to nitpick
over. One of the first items we need to consider is readability. There is a high likelihood that this
particular text table is simply too wide as immediately indicated by the horizontal scroll bar. Text tables
are much easier to consume if all of the data is visible at once. And if scrolling is necessary, it will always
be easier to consume with a vertical scroll rather than a horizontal scroll. To quickly fix this, we can
navigate to the ribbon and click Swap Rows and Columns. Or if you're the hotkey type of person, many of
the buttons in the ribbon display the hotkey combinations. As you can see, I am on a Mac and the
combination to Swap Rows and Columns is Ctrl+Command+W. For example's sake, let's assume this
entire text table would have ample space on a dashboard to be displayed in full, sans scroll bar. Navigate
to the fit drop-down and select Entire View. This selection will ensure that all of our rows and columns
will be on full display regardless of the worksheet size on a dashboard. At this point, there isn't much left
we can do to improve our standard text table. We can see the field label for Sub-Category is being cut off,
and we can simply drag and drop to widen the field. Our dimension and measure alignment has been
configured in the best way possible by default. Dimensions are left aligned and measures are right
aligned, the preferred way to align a standard text table. Tableau has provided us with a light gray row
banding, which helps us distinguish one row from another in a way that is less distracting than traditional
borders, which you might be used to seeing in a tool such as Excel. At this point, the most common
request you will encounter is the need to see row or column totals, and we can easily turn those on in
Tableau. To turn on totals, simply navigate to Analysis, Totals, and select either Show Row Grand Totals
or Show Column Grand Totals. I'll go ahead and turn on both. Now that our totals are enabled, we have a
couple of additional options we can configure if desired. If you prefer the totals to be displayed as the first
row or first column, simply navigate to Analysis, Totals, and select either Row Totals to Left or Column
Totals to Top. There isn't a best practice when it comes to the position of the totals, so this is going to be
up to you and your users to decide the best location. There are also going to be times when summing up
all the values is not how you want Tableau to total. Instead of summing, which is the default aggregation
for row and column totals, we can choose one of Average, Minimum, or Maximum. To change the
aggregation for your totals, navigate to Analysis, Totals, Total All Using, and choose your aggregation.
Let's go ahead and temporarily switch our totals to use average. Notice the numbers have dropped
dramatically as we are now seeing the average value across or down each dimension, rather than the sum
of the parts. With that, let's quickly revert our totals back to sum. Finally, I'm going to give this tab a
creative name of Text Table so you can refer to it in the exercise files. Now let's move on to our next
chart type, the highlight table.
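The totals behavior from this clip can be sketched outside Tableau. The Python below is a minimal illustration, not Tableau's implementation: the records are made-up numbers rather than Superstore values, and the names crosstab and totals are hypothetical. It assumes that switching Total All Using to Average averages the cell values shown in each row or column.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical sales records (region, sub_category, sales); not Superstore figures.
RECORDS = [
    ("East", "Phones", 100.0),
    ("East", "Phones", 50.0),
    ("East", "Paper", 20.0),
    ("West", "Phones", 200.0),
    ("West", "Paper", 30.0),
]


def crosstab(records):
    """Sum sales into cells keyed by (sub_category, region)."""
    cells = defaultdict(float)
    for region, sub_category, sales in records:
        cells[(sub_category, region)] += sales
    return dict(cells)


def totals(cells, total_using=sum):
    """Return (row_totals, column_totals) computed from the displayed cell values."""
    rows = defaultdict(list)
    cols = defaultdict(list)
    for (row, col), value in cells.items():
        rows[row].append(value)
        cols[col].append(value)
    row_totals = {row: total_using(values) for row, values in rows.items()}
    col_totals = {col: total_using(values) for col, values in cols.items()}
    return row_totals, col_totals


cells = crosstab(RECORDS)
sum_rows, sum_cols = totals(cells, total_using=sum)
avg_rows, avg_cols = totals(cells, total_using=mean)
```

With these sample records, the Phones row totals 350 when summed and 175 when averaged, which mirrors the drop we saw when switching the totals to average.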
Highlight Table
Before we can build a highlight table, we need to know what it is. The main purpose is to compare
categorical data using color. The color identifies pattern or correlation in the data, and you can use this
chart type with or without labels. Most of the rest of the world refers to these charts as heat maps, but we
will refer to them as highlight tables here to be consistent with the terminology used in the Show Me
pane. Let's take a look at a famous example of a highlight table. This is one of my favorites, and it was
published by The Wall Street Journal Graphics team. The chart shows the number of people infected by
measles in the United States from 1928 to 2012. Let's learn how to read this chart. The x axis, or
horizontal axis, represents time where each column is a year. The y axis, or vertical axis, represents states
where each row is a state sorted alphabetically. That means that each square on this chart represents a
number of people infected with measles for a given state in one year. If you think about the layout of this
chart, it is a simple text table under the hood where we've replaced the labels with colors. And that simple
change allows us to identify patterns of measles infections. The designers also did something important
here that we should take note of. They provided extra context. We should look for opportunities to do this in
everything that we build. The measles vaccine was introduced in 1963 and they represented this event
with a vertical black line on the chart. Using the color legend, we can see that infections were rampant
and common in almost every state, and then all of a sudden the vaccine was introduced, and within a
small period of time, the number of measles infections dwindled in comparison to the beginning of the
century. Let's think about that for a moment. We just painted a pretty vivid picture about measles
infections in the United States. You probably understood what the chart meant in less time than I could
describe it verbally. And that is the power of data visualization. We are able to understand patterns and
see things much quicker than we could by looking at raw data in a table. Imagine looking at 80+ years of
data for 50 states. Do you think you would have arrived at the same conclusions in the same amount of
time? So let's take a second to understand the pros and cons of a highlight table, starting with the pros.
Highlight tables are easy to comprehend, meaning a low barrier to entry for your audience, similar to a
text table. Where a highlight table really shines is by helping us identify those patterns, or correlation,
using color. We are able to see and understand our data very quickly in a visually appealing way.
Additionally, the chart type can scale for a large amount of data as we just saw in our example. And now
for the cons. These charts excel with two dimensions, but become harder to use when additional
dimensions are added. These charts are not good for pointing out small variations in your data. The color
legend takes all data into account, so if there is an extreme outlier, the entire chart, with the exception of
the outlier, might be the same color. In this case, you have to decide as the designer how to handle that. A
potential solution might be using a fixed axis on the color legend or filtering the data, both of which come
with side effects. Now let's jump into Tableau and learn how to build a highlight table. Wait a minute, I
thought we were building a highlight table and not a text table again. Well, remember when I said that a
highlight table is a text table under the hood? Now we're going to prove that. Let's go ahead and right-
click on the sheet name of our text table and select Duplicate. This action will give us a brand-new sheet,
which is exactly the same as our text table in every way. Let's open up our friend Show Me and see the
suggestion for a highlight table. We can see that a highlight table requires one or more dimensions and
exactly one measure. If we click on the thumbnail, Tableau has converted our standard text table into
what it considers the best representation of a highlight table. What I want you to notice as you use Show
Me is how and where Tableau places fields. Once you start to understand how Tableau likes to build
charts, you will begin to feel more comfortable building your own from scratch rather than starting with
Show Me. We can see that there are only two configuration changes when we move from a text table to a
highlight table. SUM(Sales) has been replicated onto the color property of the marks card, and the mark
type has been changed to Square instead of Text. By using Show Me, Tableau has overwritten some of
our previous configurations, which are actually better than the default output. I'm going to revert back to
our text table, drag and drop Sales onto the Color property, and this will tell Tableau to analyze the
aggregation of sales for each mark and color them appropriately based on the given range of sales values.
Alternatively, we can Ctrl+drag or Command+drag our measure from the Text property onto the Color property.
Ctrl+drag or Command+drag simply creates a copy of the selected field, which can then be dropped onto a
variety of places in the view. Whether you use Ctrl or Command depends on your operating system. At this point, to
convert our text table into a highlight table, open the mark type drop-down and select Square. A few
items to note at this point. Tableau has center aligned our column headers, but our field labels are still
right aligned. Using a highlight table, we want the focus to be on color rather than the labels, so it is best
to center align along with the column headers. To do this, click on the Label property of the marks card,
click the Alignment drop-down, and click Align Center under the Horizontal category. I'm sure at this
point the bright white total column and row are nagging at your brain. Why aren't those colored as well?
Tableau is smart enough to know that our totals greatly outweigh the table values, which would severely
alter the color scale. If desired, we can include totals in our color selection, but the impact of the highlight
table will be taken away. Allow me to demonstrate this behavior. Navigate to the Color property on the
marks card, click the icon, and select Edit Colors. Also, you could do this by navigating to the color
legend, clicking on the drop-down, and select Edit Colors. Both actions will take you to the same menu.
From here, if you'd like to add totals into consideration, simply click Include Totals and select OK. I think
now you can quickly see why we don't want to include totals in our color configuration. Instead of our
color scale ranging from $503 to $101,000, we are now ranging from $503 to $2.3 million. The total
values just simply overpower every other value in our table. If you recall from the text table clip, we can
configure our totals to use an alternate aggregation other than sum, and if you must have totals on color in
your highlight table, it is recommended to change the aggregation method to something other than sum.
Notice how our table changes when I move to Average as our total aggregation. The color scale comes
back down to Earth and our highlight table becomes more insightful. Now I'm just going to turn off these
totals. I'm going to name this sheet Highlight Table, and I'm going to duplicate it and name that sheet
Highlight Table (2) so I can demonstrate another option. There are many common interpretations of a
highlight table, which all share a common theme. We are using color to highlight big and small values,
and one of the most common variations you will find is simply what we are looking at with labels hidden.
And we can declutter this view to bring out the colors even more. To do this, just drag Sales off of the
Label property. If the colors feel a little overwhelming, there's no shame in adding a subtle border around each
cell. Click on the Color property of the marks card. Select the Border drop-down and choose a color that
closely matches the background of your view. When working with a white background like we are today,
I will usually choose an off-white shade for my borders. When I build heat maps, I try my best to square
the cells. This allows me to compare colors more easily than I can with the wide rectangles. To convert our
cells into perfect squares, we need to take off the Entire View fit configuration and instead use Standard.
To ensure we are working with perfect squares, navigate to Format, Cell Size, and click Square Cell. You
can see our colors are now packed into a tighter area, which should lighten our mental load when trying to
compare colors. The only item worth changing at this point is the orientation of our column headers. It is
usually preferred to lay out headers horizontally, but in some scenarios we are forced to display headers
vertically. Right-click on any of the column headers and select Rotate Label. You can drag and drop to
resize as necessary. When displaying labels vertically, we want to make sure they line up with the first
cell, so right-click, again, on any of the column headers, click Format, and change the Vertical alignment
to bottom rather than middle. We can now see the final representation of our highlight table, and it is
impossible to miss insights like phones having high sales across all regions, or envelopes being one of the
lowest sub-categories in terms of total sales. I'm sure we don't expect envelopes to bring in the same
amount of revenue as phones, but are those sales profitable? We'll answer questions like these in the next
clip where we'll explore variations of the heat map.
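The effect of Include Totals on the color scale in this clip can be illustrated with a little arithmetic. The Python below is a rough sketch that assumes a sequential palette is applied by linear interpolation between the smallest and largest values in scope; Tableau's actual color mapping may differ. The function name color_position and the $50,000 cell are hypothetical, while the scale endpoints are the ones quoted in the demo.

```python
def color_position(value, scale_min, scale_max):
    """Return where a value falls on a sequential color ramp, from 0.0 to 1.0."""
    if scale_max == scale_min:
        return 0.0
    fraction = (value - scale_min) / (scale_max - scale_min)
    # Clamp so values outside the scale still map onto the ramp.
    return min(1.0, max(0.0, fraction))


CELL = 50_000  # hypothetical cell value, for illustration only

# Scale built from the table cells only ($503 to $101,000).
without_totals = color_position(CELL, 503, 101_000)

# Scale stretched by the grand total ($503 to $2.3 million).
with_totals = color_position(CELL, 503, 2_300_000)
```

A hypothetical $50,000 cell sits near the middle of the ramp when the scale ends at $101,000, but at roughly 2% of the ramp once the $2.3 million grand total sets the upper bound, so nearly every cell ends up at the light end of the palette.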
Heat Map
Before we can build a heat map, we need to know what it is. As I mentioned in the highlight table clip,
most of the world would consider a highlight table to be a heat map. But Tableau considers a heat map to
use size, to identify patterns or correlations in the data, and optionally color can be used to identify
another pattern or correlation, as we will demonstrate here shortly in the demo. Let's take a look at a
Tableau heat map I created. This chart is a visual representation of the lyrics to One More Light by the
band Linkin Park. Let's learn how to read this chart. The y axis represents a line of lyrics, and you can
see this clearly as I have both the text and the circles on the same line. The x axis is the word order of
each line, so this means that each dot is a word in a line of lyrics. Size and color are double encoded in
this view, which means they represent the same value. In this case that value is the number of times a
word is said in the song. I chose to display this data using a heat map because size shows the
repetitiveness of each word within the song. In this case, the chorus is sung three times, so we can expect
the larger, more pink dots to represent the words in the chorus. Let me animate this view to help you
understand what I was trying to display. As you can see, as I hover over the dots that represent words, all
occurrences of that word in the song are highlighted. This is a pretty artsy example of this chart, but we
will build something more likely to be used in a business setting shortly. Let's take a second to understand
the pros and cons, starting with the pros. Heat maps are easy to comprehend, again meaning a low barrier
to entry for your audience, as they are similar to a highlight table. Where a heat map really shines is by
helping us identify patterns or correlation using size and optionally color. We are able to see and
understand our data very quickly in a visually appealing way. Additionally, the chart type can scale for a
large amount of data as we just saw in our example. And now the cons. These charts excel with two
dimensions, but become harder to use when additional dimensions are added. These charts are not good
for pointing out small variations in your data. The size legend takes all data into account, so if there is an
extreme outlier, the entire chart, with the exception of the outlier, might be the same size. In this case, you
have to decide as a designer how to handle that. A potential solution may be to use a fixed axis on the size
legend or filtering the data, both of which come with side effects. Now, let's jump into Tableau and learn
about how to build a heat map. As we did in the previous clip, let's right-click our Highlight Table (2)
sheet and select Duplicate. This will create an exact copy of our Highlight Table (2) in a new sheet. If we
were to access Show Me and hover over the heat map thumbnail, we would see that we can build a heat
map with the level of detail already in our view. We can click the icon in Show Me to see how Tableau
wants us to build the heat map, or we can do it ourselves by simply moving SUM(Sales) from Color to
Size in the marks card. We could do this by dragging and dropping the pill already on the marks card on
Size, or we could click the little icon to the left of the pill and select Size. If you want to control the size
of your marks, click on the Size property and move the slider as you see fit. This is the simplest, most
basic way to control size. Now notice, there are hash marks on the slider. These are Tableau's
recommended size settings. If you drag the slider past the hash marks, your marks may overlap and mess
up your view. Okay, let's go ahead and revert that change. And if you haven't picked up on this by now,
let me say clearly, the Undo button is your friend. We can also control size by using the size legend.
Navigate to your size legend, click the drop-down, and select Edit Sizes. From here, we can alter items
such as the size range, making the smallest and largest marks bigger or smaller. We can also reverse the
order if you want to bring attention to your smallest values rather than your largest. And lastly, we can
configure the start and end points for size evaluation. Maybe you prefer the size scale to start at 0 rather
than the minimum amount in the view. And if you just want to get back to the default, you can always
click Reset and hit OK. If you remember, in Show Me, the configuration was suggesting one or two
measures. The heat map will provide us additional insights if we add another measure to the color
property. Let's drag and drop Profit Ratio onto color and see if any insights stand out. Our squares remain
sized by the sum of sales, but are now being enhanced with profit ratio on color. If you recall our
highlight table, we saw high sales in the Phones sub-category, and by adding Profit Ratio as an additional
component, we can see that those are profitable sales. We can see that Paper is a low-revenue sub-
category, as indicated by the small size, but it is actually the most profitable, as indicated by color, while
Tables are high-revenue, but very unprofitable. There isn't much else to configure to consider our heat
map a success. As always, feel free to play around with your Fit window, as what works for one chart
type doesn't always work for the next. When working with size, I like to give my marks a little extra room
to breathe. So in this case, I would suggest not squaring your cells as we did with the highlight table. As
usual, you can drag and drop until you are content. I hope you will continue to join me on the next clip
where we will combine all of these concepts and start to move away from table-based charts and into
interactive data visualizations, beginning with the dot plot.
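The Edit Sizes options from this clip (size range, reversed order, and starting the scale at 0) can be sketched as a simple mapping from a value to a mark size. The Python below is an assumption-laden illustration: it uses a linear mapping, and the function name mark_size, its parameters, and the default size range are hypothetical rather than anything Tableau exposes.

```python
def mark_size(value, data_min, data_max, smallest=1.0, largest=10.0,
              start_at_zero=False, reverse=False):
    """Map a value to a mark size between `smallest` and `largest`."""
    # Optionally start the scale at 0 instead of the minimum value in the view.
    low = 0.0 if start_at_zero else data_min
    if data_max == low:
        return smallest
    fraction = (value - low) / (data_max - low)
    fraction = min(1.0, max(0.0, fraction))
    # Reversing draws attention to the smallest values instead of the largest.
    if reverse:
        fraction = 1.0 - fraction
    return smallest + fraction * (largest - smallest)
```

For example, with values ranging from 50 to 100, the value 50 gets the smallest size by default, but lands mid-range at 5.5 when the scale starts at 0, which is why the start point changes how different the marks look from one another.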
Dot Plot
Before we can build the dot plot, we need to know what it is. Dots are positioned by their value order on
an axis, which is a space-efficient method for laying out ranks across multiple categories, and it's a good
way to see the entire distribution. These charts are also sometimes called strip plots. Let's take a look at a
Tableau dot plot I created. This chart is a visual representation of a mountain bike ride I took on
September 2, 2017. I have some KPIs, the actual map of the ride, and a dot plot on the left of my chart.
Let's learn how to read this chart. The y axis represents time, and the x axis represents speed in miles per
hour. This means that each dot represents my speed at a point in time during my mountain bike ride. Let
me animate this view to help you understand what I was trying to display. As you can see, as I hover over
the dots I am presented with all kinds of context. There are visual cues along the axis for the time of day,
the speed label is displayed on the screen, and the point is highlighted on the map of where I actually was
on my ride. Now I might be the only person in the world that cares about this chart, but we will build
something more likely to be used in a business setting in our demo. Let's take a second to understand the
pros and cons of a dot plot, starting with the pros. Dot plots are easy to comprehend, meaning there'll be a
low barrier to entry for your audience, and they are really good at showing individual values in a
distribution. Dot plots use position to compare values on that axis, and of course they are visually
appealing. And now for the cons. Dot plots have the potential for data overlap. It's important to know that
there may be data behind other points. This is called over plotting. Additionally, as we begin to make our
way into interactive data visualizations, you need to always consider your audience. Is this the right chart
type for their needs? Can they understand what is being displayed? And know that you may have to act as
a translator in the beginning. Now, let's jump over to Tableau and learn how to make an interactive dot
plot. As you may have guessed, our dot plot will share many of the same concepts as our heat map. So
once again, right-click on the Heat map sheet and select Duplicate to create a copy. In the end, our dot
plot is going to be very similar to our heat map, but we will be bringing in a third visual component we
haven't talked about yet, position. Rather than simply using color or size to show how values are different,
we will plot those values on an axis and show how far apart they are from each other. The first step in
converting our heat map to a dot plot is breaking our pivot table. Let's go ahead and swap our rows and
columns by navigating to the ribbon. And, let's also grab Sub-Category from Columns and drag and drop
to the Detail property of the marks card. At this point, we are looking at a very uncommon chart, the tree
map bar chart. We'll save this one for another day. To finish our base dot plot, simply drag SUM(Sales)
from the marks card to the Columns shelf, and change our mark type to Circle instead of Square. After
resizing and realigning a few components, we can now use our dot plot to look for any outlier sub-
categories. We can see phones and chairs with extremely high revenue in the east and west regions. If we
want to provide more clarity around sub-categories and how they compare, we can turn on Tableau's data
highlighter. Navigate to the Sub-Category dimension on the marks card, right-click and click Show
Highlighter. This will open a new card under our color legend. Take a look at Copiers. What is going on
in the south region? Interactive experiences like these push our users to keep asking the data why, which
should be the goal of almost every visualization. Now that our dot plot is functional and providing new
insights, there are only a few configurations to consider changing. Other completely optional changes that
many will make are removing axis rulers and grid lines. This is a debatable change, but it is one that I will
often do simply to force all the focus onto the data. If you choose to remove these items, navigate to
Format, Lines, set Zero Lines, Axis Rulers, and Axis Ticks to None at the worksheet level, and click on
Column and set Grid Lines to None. We are left with a clean representation of our data, with nothing
extra potentially clouding our perception. Other common themes you will encounter with a dot plot are
reference lines of all types and sizes. This course won't go into why you should or shouldn't use specific
types of reference lines, but it is important to be aware of their presence. If you want to explore adding
reference lines to your dot plot, we can navigate to the Analytics pane and toy around with a variety of
reference line types simply by dragging and dropping them onto your view. Once a reference line is
added, it can always be modified by right-clicking on the related axis and selecting the required reference
line from the Edit Reference Line menu. This is something you may want to configure as Tableau's
defaults might not be right for your analysis. For our dot plot, let's make sure Tableau does not recalculate
the reference lines when we use our data highlighter. After we settle on a reference line, we can consider
our dot plot complete. Using the Median with Quartiles reference line allows me to see the midpoint and
gives me a framework to detect outliers. This is much better than the naked eye, and we are constantly
trying to provide data experiences designed to eliminate gut feeling analysis. Lastly, it is highly likely that
some of your dots were overlapped, and we can help highlight those instances using the opacity scale in
the Color properties. Click on the Color property of the marks card and simply slide the Opacity scale to
something like 80%. Notice our overlapping marks are now more distinguishable than before. You may
decide to turn off the borders we added from the highlight table step. Sometimes when marks are small,
all you can see is the border, so removing them helps clear up the view. Let's go ahead and rename this
sheet Dot Plot. And now, I'm going to add one more bonus chart in here. As we discussed, one of the
main downsides to a dot plot is that if your data is dense, you might be over plotting points on top of each
other. Let's make a copy of our dot plot and simulate this over plotting by adding Product Name to the
marks card. Wow, look at that. We have a lot of over plotting. We are going to apply a technique called
jittering. We can use this as a secondary axis to randomly position our data points and make them more
visible. To do this, double-click on the Rows shelf, type RANDOM() with its open and closed
parentheses, and press Return. This RANDOM() function is hidden in Tableau and does not appear anywhere
in the calculated field editor. Go ahead and look for it; it's not there. Notice, Tableau thinks we want to
aggregate this value as a sum, which we do not. Let's right-click on this pill and select Dimension. Now,
random randomly generates a value between 0 and 1 for each mark every time the view loads. Notice our
points have moved along the y axis and are easier to see. Let's reduce the size of our circles. Let's hide the
y axis, right-click random again, and select Show Header. Now I would be remiss if I didn't say there is a
huge tradeoff when applying this technique. The jittering has allowed us to see more of our data, but the
cost is that the y axis position means absolutely nothing. As designers, we need to clearly communicate
this somehow through a caption. Let's click on Worksheet, Show Caption. Tableau's captions are usually
not meant to be read by a human being. Let's clean this up. Click on the caption and add a note that makes
sense. Now the jittering technique is not common, but it's very powerful. What I want you to remember is
something that Uncle Ben from Spider-Man said, "With great power comes great responsibility." Make
sure you communicate clearly about how your visuals are designed and make sure your audience can
understand what you're displaying. And with that, our dot plots are now complete and pretty awesome.
This chart wraps up our text table variations.
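The jittering idea can also be sketched outside of Tableau. The Python below is a minimal illustration, not Tableau's implementation: the jitter helper, the seed, and the sales values are all assumptions made up for this example. It gives every mark a random vertical position between 0 and 1 while leaving the measured value untouched, which is exactly why the y axis position carries no meaning.

```python
import random


def jitter(values, seed=None):
    """Pair each value with a random y position in [0, 1).

    Hypothetical helper for illustration; in the view, RANDOM() plays
    this role. The x value (the real data) is never changed.
    """
    rng = random.Random(seed)
    return [(x, rng.random()) for x in values]


# Made-up sales values with several points that would overlap exactly.
sales = [120.0, 120.0, 120.0, 455.5, 455.5, 980.25]
points = jitter(sales, seed=42)

for x, y in points:
    print(f"sales={x:>7.2f}  jitter={y:.3f}")
```

Marks that share the same sales value now sit at different heights, so they can be told apart, but the height itself is noise and should be described as such in a caption.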
Summary
As we wrap up this module, you should now know how to read, build, and format text tables, highlight
tables, heat maps, dot plots, and our bonus chart of the module, jitter plot. We also discussed some
tradeoffs we will face as designers of data visualization and potential solutions. I encourage you to
download the Tableau workbook from the exercise files and build these charts. I hope you'll join me in
the next module where we'll move on to the king of all charts, the bar chart.
Presenting Distributions and Comparisons of Data Using Bar Charts
Introduction
Hi, this is Adam Crahen. Thanks for continuing along on our journey to build common chart types in
Tableau Desktop. In this module, you will learn how to present distributions and comparisons of data
using bar charts. If you spend any real time in Tableau, you will end up building many bar charts, so
buckle up. First, we will learn about the concept of discrete versus continuous and how it's applied in
Tableau. Then we will learn about the many varieties of bar charts. We will learn how to build horizontal
and vertical bars, a couple of varieties of stacked bars, side-by-side bars, we'll learn about presenting
distributions using histograms, and finally, we'll learn how to use a dual axis to make a lollipop chart.
As a reminder for each chart type, we will learn what they're used for and how to read them. We will take
a look at some real-life examples, and then we will walk through how to build these charts in Tableau. If
you have multiple monitors or devices available, this is your cue to start your two-screen experience so
you can follow along with the demo to become more familiar with building these charts.
Before we build anything, we are going to discuss what I think is the most important concept to
understand when working with Tableau, and that is the concept of discrete versus continuous. So what
exactly is discrete? Discrete is defined as being a separate entity or being individually distinct. Blue pills
in Tableau are discrete. It is important to know that a blue pill does not always mean it is a dimension.
While that is usually the case, both dimensions and measures can be discrete, and when we use a discrete
pill on the Column or the Row shelf, Tableau will draw headers. And now, what do we mean when we
say continuous? It is defined as being an unbroken whole or continuing without interruption. Green pills in
Tableau are continuous, and it is important to know that a green pill does not always mean that the pill is
a measure. While that is usually the case, both dimensions and measures can be continuous. When using a
continuous pill on the Row or the Column shelf, Tableau will draw axes. Now, with that brief
introduction to this concept, let's jump into Tableau and learn how it comes into play as we drag and drop.
As a reminder, we are using the Sample - Superstore dataset that ships with every Tableau installation.
Before we begin, let's take a second to understand how Tableau actually builds charts as we drag and drop
pills. Our data is organized into dimensions and measures. Dimensions are ways we can slice data and
measures are values that can be aggregated for each dimension. Let's demonstrate this by double-clicking
Sales from the Measures pane. Double-clicking has the same effect as us dragging Sales from the
Measures pane to the Row shelf. Tableau aggregated the sales amounts across the entire dataset and drew
one bar on our view. Notice the pill on the Row shelf is green. This means the SUM of Sales value is
continuous. Dragging a continuous pill on to the Rows or Column shelf tells Tableau to draw an axis.
Now let's double-click on Category from the Dimensions pane. Notice, Tableau has placed a blue pill on
the Column shelf. Blue pills are discrete. Dragging a discrete pill onto the Columns or Row shelf tells
Tableau to draw headers, which slice the view of our data into different panes. Test this by clicking on
one of the headers. Here you can see I clicked on Office Supplies, and that is the only header highlighted
blue, as well as the bar. It is really important to note that dimensions are not always discrete and measures
are not always continuous. Almost any pill can be changed between discrete and continuous. Allow me to
demonstrate. If I right-click on the SUM of Sales pill on the Row shelf, notice I can select Discrete from
the drop-down menu. Now let's take a second to let this all soak in. On the Row shelf, we have an
aggregated measure, SUM of Sales, and it is discrete. This configuration now told Tableau to draw
headers for the discrete values of the SUM of Sales. If I drag Category from Columns to Rows, notice
how this makes more sense. Sales is now a header, not a mark. Let's hit the undo button twice and revert
these changes. Now let's demonstrate how a dimension can be continuous. Let's remove Category from
the Column shelf. I am on a Mac, so I'm going to hold down Option and drag Order Date to the Column
shelf and Tableau gives me the option to select a variation of this field. I am going to select the Month of
Order Date that is blue, or discrete. We are in the bar chart section, so let's change our mark type back to
Bar. As we can see, Tableau drew headers and sliced our data by each discrete month. Again, we can tell
our data is sliced if we click on a header. Just that pane of data will be selected and highlighted blue.
Now, let's right-click on the Month of Order Date pill on Columns and select Continuous. Now let this
soak in. We calculated the discrete month of order date, and then turned it into a continuous field, which
tells Tableau to draw an axis. Notice that Tableau is now smart enough to convert those month names
from the headers into integers corresponding to the month number on our axis. You can also see Tableau
adds a little padding, as it has added 0 and 13 to our axis. If I click on the axis, you can see it highlight
blue, indicating all of the data is within a single continuous pane. In my opinion, the distinction between
discrete versus continuous and dimension versus measure is the most important thing to understand while
working with Tableau. If you can master this, you are well on your way to knowing what Tableau is
going to do before you drop a pill on the canvas.
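The way a dimension slices a measure can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the rows, the field values, and the sum_by helper are made up and are not part of Tableau or the Superstore data. With no dimension in the view, the measure is aggregated into a single total (one bar); adding a discrete dimension produces one total per member (one header each).

```python
from collections import defaultdict

# Made-up rows standing in for a few Superstore records.
rows = [
    {"Category": "Furniture", "Sales": 200.0},
    {"Category": "Office Supplies", "Sales": 50.0},
    {"Category": "Technology", "Sales": 300.0},
    {"Category": "Office Supplies", "Sales": 25.0},
    {"Category": "Technology", "Sales": 100.0},
]


def sum_by(rows, measure, dimension=None):
    """Aggregate a measure, optionally sliced by a dimension.

    Hypothetical helper: with no dimension, everything lands in "All".
    """
    totals = defaultdict(float)
    for row in rows:
        key = row[dimension] if dimension else "All"
        totals[key] += row[measure]
    return dict(totals)


overall = sum_by(rows, "Sales")                  # one bar
by_category = sum_by(rows, "Sales", "Category")  # one bar per header
print(overall)
print(by_category)
```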
Bar Chart
Before we can build a bar chart, we need to know what it is. We use bar charts to compare the size of
things or the magnitude by encoding the length from 0 to a value on an axis. These comparisons can be
relative, which is the ability to just see if something is larger or smaller, or they can be absolute, which is
the ability to see fine differences. These charts are also sometimes called column charts. Here is an
example I created a few years back using data from Forbes. This chart shows the top 10 technical skills in
the job market based on explosive growth. The length of the bar represents the growth rate over a five-
year time period, and it looks like you're watching the right course because Tableau is listed at number 3.
Here is another Tableau example, and this one comes from two-time Tableau Zen Master, Matt
Chambers. Matt has some of the most amazing Tableau Public work out there. This view shows the entire
history of the NFL in a single dashboard. It is made up of eight Tableau worksheets, split by NFL
division, and it does not include any numbers at all, which is a conscious design choice by Matt. What he
was trying to convey here was a team's win-loss differential. In other words, if a team won more games
than they lost, their bar would be above 0, and vice versa. The length of the bar represents this win-loss
differential. He used color smartly as well. If a team had a losing season, the season is grayed out, and if
they had a winning season, the bars are colored by one of their team colors, unless they happen to win the
Super Bowl and then that bar is colored gold. Let me animate a zoomed-in version of my team, the
Buffalo Bills, to demonstrate. Now they have been bad a really long time, so I'm going to focus on the
Super Bowl years. As I hover over the Buffalo Bills' glory years, you can see the wins versus losses and
the total differential, which is the number that is being displayed by the bar. This is a really cool view of
this data, and you can check it out on Matt Chambers's Tableau Public profile. Now let's take a second to
understand the pros and cons, starting with the pros. Bar charts are easy to comprehend, as most people
start to learn about them in elementary school. They are relatively easy to build in just a few clicks, and
they do a good job of allowing for comparisons of categorical data. There aren't a ton of cons.
Sometimes as analysts, we tire of building another bar chart, but they are often the best choice. Bar charts
can be misused though to give false impressions, and we will discuss this further in our demo. Views with
too many bars can start to look cluttered. And also, bars can show data over time, but they are less
effective at this when compared to a line chart. Now, let's jump into Tableau and learn about how to build
the king of all charts, the bar chart. Let's get started building one of the most common chart types to
visualize data, the bar chart. Let's say we want to see sales by category. Let's double-click on Sales and
then double-click on Category. If we make our view Entire View to fill up all of the available space, we
now have a vertical bar chart. Some formatting changes we can make here are to hide the field labels for
columns. Tableau leaves these on by default and it annoys me to no end. Notice, Category is hanging out
in the middle of our view not adding any value. Let's right-click directly on the word and select Hide
Field Labels for Columns. You can also hide these labels by navigating to Analysis, Table Layout, and
clicking on Show Field Labels for Columns to uncheck this option. Instead of field labels, I prefer to use
good titling to describe my chart. Let's adjust our title. Let's call this Sales by Category. This nice title
also makes our axis title redundant. We only have one measure in this view, so let's turn that off by right-
clicking on the axis and selecting Edit Axis. Clear the title text and close the dialog box. Do these bars
look kind of fat to you? Let's check our size setting, and here we can see that Tableau has dialed the size
up larger than its recommended setting, which is indicated by the hash mark. Let's dial that down a bit.
Ah, now our data can breathe. Let's rename this sheet Vertical Bar Chart. I want to take this opportunity
to hammer home an important note. Please repeat after me. I, state your name, will never truncate the axis
of a bar chart. As I mentioned, bar charts encode the length of a value from 0. Notice, sales across our
three categories doesn't vary that much. They are all pretty close. Watch what happens if I truncate the
axis. Right-click on the axis, select Edit Axis, and uncheck Include zero. Now look at our chart. Whoa,
office supplies is doing terrible and technology is shooting through the roof! These are the kinds of quick
assumptions people make when we make mistakes like this in our visualizations. There is a saying that if
you see something exciting in your data, it is probably wrong, and this is a great example. Now let's revert
that change, and for my sake, let's once again say out loud, I, state your name, will never truncate the axis
of a bar chart. Okay, let's duplicate this sheet and rename the copy Horizontal Bar Chart. Now let's say we
want to add an additional dimension, like sub-category. Let's double-click on Sub-Category and Tableau
places the dimension on our Column shelf. When you have a lot of data and Tableau forces the headers to
rotate on their sides, I recommend you always use a horizontal bar chart. It is much easier to read when
the text is positioned normally. And in Tableau this is a really simple fix. We just need to swap our
columns and row shelves by navigating up to the Swap Rows and Columns button. And with one click,
look how nice it is to not have to turn your head sideways to read our chart. Let's do some formatting
cleanup. Looks like those pesky field labels are back because our dimensions are now on rows. Right-
click directly on any of that text and hide them. It looks like Tableau has added some additional chart
junk that we can do without. Let's navigate to Format, Borders. Let's turn off our column dividers and
reduce the color on our row dividers. Once the Format pane is open, you can click on the little icons at the
top to adjust formatting of fonts, alignment, shading, borders, and lines. There is also a drop-down so you
can custom format a specific field in your view. Notice when I click this that only the three fields in my
view are displayed. Let's click on Lines and at the Sheet level, let's turn off the zero line. We can see our
grid lines are set to None, but if we click on Columns, we can see that they're still on. Turn those off, too.
Now we are left with a nice clean view without the default chart junk that gets added. And if you haven't
heard that term before, chart junk refers to unnecessary pixels on the screen that can distract from the
data. Now you might be thinking that this looks great, but there is always something we can do to
improve a view and help people digest it quickly. Let's sort this data so we can more easily compare the
lengths of the bars. To sort in Tableau, you need to click on a dimension in your view. In this case, I'd like
to sort the bars in descending order under each category. This means I need to right-click on Sub-
Category and select Sort. Let's sort this by a Field. Notice, Tableau automatically thinks I want to sort by
Sales since that is the only measure in my view, but I can use the drop-down to select fields that are not in
my view as well. You may have noticed that Tableau is updating the view in the background as we make
our selections. Let's select Descending for the order and leave our aggregation on Sum. Close the dialog
box, and voila, we have our sub-categories sorted in descending order for each category. Let's update our
title to match what is being displayed. Sales by Category and Sub-Category. And finally, let's wrap up
this clip with a note about labeling data points. Usually, it is best to use labels and annotations sparingly
so they can point out important information and insights. If everything is labeled, then nothing is
important, so keep this in mind. If you need to label every bar, then you should hide your axis. It would
be redundant to leave it when the data point is directly labeled. If you aren't labeling every point, first,
kudos to you, and secondly, keep the axis visible so users can determine the value of the bar.
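The warning about truncating the axis comes down to simple arithmetic, sketched here in Python. The category totals and the length_ratio helper are hypothetical values and names chosen for illustration, not figures from the Superstore data. A bar's drawn length is its value minus the point where the axis starts, so moving that start away from zero changes the ratio between bars even though the data is unchanged.

```python
def length_ratio(a, b, axis_start=0.0):
    """Ratio of the drawn lengths of two bars for a given axis start."""
    return (a - axis_start) / (b - axis_start)


# Hypothetical totals that are fairly close together.
technology = 835_000
office_supplies = 720_000

for axis_start in (0, 700_000):
    ratio = length_ratio(technology, office_supplies, axis_start)
    print(f"axis starts at {axis_start:>7}: Technology bar is {ratio:.2f}x Office Supplies")
```

With these made-up numbers, the Technology bar is about 1.16 times as long as Office Supplies on a zero-based axis, but 6.75 times as long once the axis starts at 700,000.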
Stacked Bar
Before we can build a stacked bar chart, let's take a second to understand what it is. The main advantage
of a stacked bar is to display part-to-whole relationships. We can see how members of one dimension
make up the overall total or even across another dimension. Also, these charts can be configured in a few
different ways. We can stack the component values and compare the absolute overall value across a
category, or we can create a 100% stacked bar chart. This allows us to compare the relative percentages of
multiple data series. Or we can even use a divergent stacked bar, which is used when working with survey
data on a Likert scale. These bars are actually 100% stacked bar charts, but they are positioned to show
which way respondents lean. Here's one of my favorite examples of a stacked bar chart. This one comes
from my daughter, Penny, and she was learning about this chart type in the first grade and was 6 at the
time that we built this in Tableau. We decided to create a Google form and ask people what their favorite
color was. I Tweeted out the survey and the data community responded with over 200 responses. She was
amazed every time we refreshed the data and a new vote rolled in. So let's talk about what we're seeing
here. The stacked bar allows us to make a part-to-whole comparison while still comparing the overall
amounts across the category. In this case, each little bar is a vote for a favorite color. You can hover over
the mark and it will tell you details, like the date and the time the vote was submitted. When we stack all
these votes up, we can now easily see which color had the most votes. Notice that she learned some of the
same lessons I've been discussing in previous clips. We labeled every bar stack with a reference line so
we did not include a redundant axis. The chart junk like borders, axis rulers, axis ticks have all been
removed. Grid lines have been lightened and we are left with just a nice, clean chart. Let's take a second
to understand the pros and cons, starting with the pros. Stacked bar charts are great at showing the part-to-
whole relationship, maybe even better than a pie chart because they allow for easy comparison across
categories. And now the cons. They are really hard to read when there are a lot of components or slices.
This usually requires a lot of user interaction to overcome. Only one component within a bar has the same
baseline, so it is really hard to compare the other components. We are going to get into the realm here
where we need to start to think about our audience's data literacy. We aren't always around when the users
of our work need help, and we need to consider their capacity to be successful when working with our
creation. Because of all these downsides, much more thought needs to go into the UX, or the user
experience, when we are designing these charts. We probably want to build a way to sort components to
make comparisons easier. We definitely need clear titles, and maybe we would even provide instructions.
To be honest, I usually do not choose this chart type in a business setting if I can avoid it, but I understand
sometimes you can't. Now let's jump into Tableau and learn about how to build stacked bars. I will also
show you some better options and ways to deal with all the cons we just discussed. Let's build a stacked
bar chart in Tableau. If we double-click on Region, Tableau will place a blue, or a discrete, pill on the
Row shelf. If we drag Sales to the Column shelf, Tableau will place a green, or a continuous, pill on the
Column shelf. So as a result, Tableau has created an axis for Sales and sliced our sales values by region.
Now, let's turn this bar chart into a stacked bar. I am going to drag Ship Mode onto Color on the marks
card. So here, we have our stacked bar. Let's set this to Entire View and let's hide that pesky Field label.
Right-click directly on the word Region and select Hide Field Labels for Rows. Now let's talk about this
chart. As we discussed, stacked bars show us a part-to-whole relationship. We can see the sales for each
region and how that is broken down by Ship Mode. If we look at our color legend, we can see that
Tableau sorted our Ship Mode alphabetically by default. We can just click and drag our color legend
members into an order that makes sense, like speed of Ship Mode. Let's move Same Day shipping speed
above First Class. Now that we have our sort order the way we want it, let's start to talk about the cons
when using this chart. We can easily compare the sizes of the Standard Class bars because they all have a
common baseline at 0 on the x axis. Notice how all of the other ship modes are hard to compare with
much accuracy since they don't begin at the same baseline. For example, can you tell if sales with First
Class shipping in the East region are greater or less than sales with First Class shipping in the West
region? It's pretty hard to make those comparisons without interacting with the data. Let me show you
another option for this view that isn't a stacked bar chart. I am going to make a copy of the Ship Mode pill
on the marks card by holding down the Command key and dragging it to the Column shelf. Notice that
even though I dropped the pill on the right of Sales, Tableau moved it to the left. Tableau will always
place discrete pills before continuous ones on the Column or Row shelves. Now this view demonstrates
how having a common baseline is important when making absolute or fine comparisons. We can easily
see that sales with First Class shipping in the West region is greater than sales with First Class shipping in
the East region. I personally prefer this view, as it allows for better baseline comparisons and more
accurate axes. But let's revert our change and keep working on the stacked bar. A good way to solve this
problem is to create an experience for our users that will allow them to move any slice to the baseline.
Let's do this with a parameter. Right-click in the whitespace of the Data pane and select Create a
Parameter. Let's give this a creative name, like Sort. I am going to select the String data type and create a
list, and what we are going to do here is create a value that a user can change that will be inserted into a
calculated field. We will sort our Ship Mode by this new parameter injected field. Click on Add from
Field and select Ship Mode from the list. We can see here that Tableau added our dimension members
alphabetically. Let's drag Same Day shipping to the top of the list, so again, we have sorted that list by
shipping speed. Now, in order for a user to use our parameter, we need to show it. Right-click on the Sort
parameter and select Show Parameter Control. Notice, there is a new drop-down menu on the right side of
our screen. This is where the user will be able to add their input to how the view is rendered. Let's change
the value to Same Day shipping. Notice that nothing has changed in our view at all. We still need to set
up how this parameter can be used in a calculated field. Right-click on the whitespace in our Data pane
and select Create a Calculated Field. Let's call this one Sort. Our formula will be if the Sort parameter
equals the Ship Mode then 1 else 2 end. What we are doing here is telling Tableau to put the value 1 on all
rows for whatever dimension we select in our parameter and the value of 2 on all the other rows. Now
let's hit OK. Now we need to use our new sort field. I am going to right-click on Ship Mode from the
marks card. Select Sort and change it to use a field. We will want to sort in descending order this time,
and the field we want to pick is our new sort field that we just created. And we can choose our
aggregation. We want to choose Min here. If we Sum this field, it will multiply our sort values from our
calculation by the number of rows in each region and Ship Mode. Selecting Min will return the individual
values, 1 or 2, that we specified. Now hit OK and you should now see that Same Day shipping is the first
slice in our bar chart. Let me click through the parameters so you can see how sorting of Ship Mode can
now be controlled by the user. Now that was just a little bit of work to compare the baselines. It would
have been easier to just put Ship Mode on Columns, as I demonstrated, but some people insist on the
value of the part-to-whole relationship. Let's give our chart a good title, like Sales by Region and Ship
Mode. Now let's duplicate this sheet. We are going to modify this chart to be a 100% stacked bar chart.
This will allow us to compare the relative percentages across the regions. First, let's swap our axes to
show that stacked bars can be horizontal or vertical. Let's again hide our field labels for columns by
right-clicking on the label and hiding it. Now, click on SUM of Sales. Navigate to Add Table Calculation and
select Percent of Total. Notice our sales axis changed to percentages. Click on the SUM of Sales pill
again and select Edit Table Calculation. We can see here that our table calculation is being computed
using Table (across). Now table calculations can be confusing, so let me simplify what this configuration
is doing. If we click on Specific Dimensions, we can see that both Region and Ship Mode are selected.
What this means is that these dimensions are both included in this calculation. If we take a peek at the
worksheet summary in the bottom left corner, we can see that the entire chart adds up to 100%. What we
want to do here is deselect Region. Deselecting a dimension in a table calc window tells Tableau to restart
that percentage calculation in each region. Notice our bars now all extend to 100% on our axis. If we peek
down at that chart summary again in the bottom left corner, we can see our chart now totals 400%, and
this is what we want. Let's close the table calc window. We can make a copy of our table calc doing a
Command+drag and dropping it on Label. Now two decimals is overly precise, so let's format down to
whole percentages. If your data really needs extra precision, I recommend just rounding to one decimal.
Let's right-click on the percentage table calc and select Format. On the Pane menu, change the default
number formatting to Percentage with 0 decimals. Let's do some additional formatting. Let's hide our
axis, as we just labeled every mark. The axis is now redundant. Right-click on the axis and select Show
Header to hide it. I am going to format our label as well. By clicking on Label on the marks card, I'm
going to change the font to Tableau Bold, the size to 8, and I'm going to select Match Mark Color. Let's
give our view a good title, like Percentage of Sales for each Ship Mode within each Region. Now, let's
take a second to understand this view. Each bar totals 100% of the region's sales. The components are
divided by Ship Mode, and now we can compare the relative percentages across the regions. For example,
sales with standard shipping accounts for 64% in the Central region, while sales with standard shipping
accounts for 60% of sales in the East region. Notice our sort parameter still works in this view if I click
through the options, and this is nice if I want to compare fine differences in size for a particular ship
mode, but it really isn't needed since all of our marks are now labeled nicely. As always, drag and drop
until you are content with the formatting. The effort you put into formatting is never wasted and generally
results in a better user experience.
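The effect of deselecting Region in the table calculation can be sketched in Python. This is an illustration under assumptions, not how Tableau computes it internally: the mark values and the percent_of_total helper with its restart_by_region flag are made up. When the percentage is computed across the whole chart, all marks together add up to 100%; when it restarts in each region, every region adds up to 100% on its own, so the chart total equals the number of regions.

```python
from collections import defaultdict

# Made-up SUM(Sales) values for each (Region, Ship Mode) mark.
marks = {
    ("East", "Standard Class"): 60.0,
    ("East", "First Class"): 40.0,
    ("West", "Standard Class"): 150.0,
    ("West", "First Class"): 50.0,
}


def percent_of_total(marks, restart_by_region):
    """Return each mark's share of its total.

    restart_by_region=False: one total for the whole chart.
    restart_by_region=True: a separate total for every region.
    """
    totals = defaultdict(float)
    for (region, _), value in marks.items():
        totals[region if restart_by_region else "All"] += value
    return {
        key: value / totals[key[0] if restart_by_region else "All"]
        for key, value in marks.items()
    }


whole_chart = percent_of_total(marks, restart_by_region=False)
per_region = percent_of_total(marks, restart_by_region=True)
print(round(sum(whole_chart.values()), 6))  # shares across the entire chart
print(round(sum(per_region.values()), 6))   # shares restarted in each region
```

With two regions in this sample, the restarted shares sum to 2.0, the same pattern as the 400% total seen with the four regions in the demo.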
Side-by-side Bars
Before we can build a side-by-side bar chart, let's take a second to understand what it is. These charts
have very similar uses to a standard bar chart, and as a reminder, those are used to compare the size or
magnitude of things. The difference is, this chart type allows you to compare multiple series of data.
These charts are more commonly known as paired bar or paired column charts. Here's an example that I
created. This chart is showing the working age population of China, represented by the blue bars, and the
United States, represented by the black bars. The bars are plotted side by side within discrete years to
allow for a clean comparison within each year. It is a little harder to compare across the years, but it is
possible with just two components. However, if we had three or four countries in this chart, it would be
much harder to do so. To highlight my point, I added a third country into the view. Notice how even the
presence of the third orange bar for India increases the cognitive load as we try to interpret the data. You
have to work harder to focus on just one or two things at a time so we can understand the chart. Don't do
this to your users. Imagine how much worse it would be if we added in another country. Let's take a
second to understand the pros and cons of side-by-side bars, starting with the pros. Side-by-side bar charts
allow for comparisons of multiple data series across the category, and bars are easy to comprehend. And
now the cons. As we just demonstrated, the effectiveness of this chart is seriously degraded with three or
more components in your data. These charts should really only be used when it is a simple side-by-side
comparison of a few categories. If the chart is complicated, like the last example, people using it are
going to have a bad experience. Much of the work we do requires building trust with our stakeholders.
We do that through creating good experiences and solid analysis. To be honest, this is another one of
those chart types I stay away from. Now with that, let's jump into Tableau and learn about how to build
side-by-side bars. I will also show you some better options and ways to deal with the cons we just
discussed. Here we are back in Tableau Desktop. We just talked about how keeping the number of
components down makes side-by-side charts more usable. So let's create a chart that shows the percentage
of orders within each year that are profitable. The first thing we want to do is to calculate whether or not
an order is profitable. And remember, we already talked about how important it is for an analyst to know
the shape of the underlying data when visualizing it. In this case, we may have multiple rows per order
due to multiple products being ordered, so we are going to have to create a level of detail calculation to
calculate the total profit for each order. Right-click in the whitespace of the Data pane and select Create
Calculated Field. Let's call this calculation Profitable Orders. Let's use a fixed LOD at the order ID level
and sum the profit. This calculation sums the profit measures across all of the rows for each order ID.
Now let's add greater than 0 outside of the LOD. This will change the data type to a Boolean value, or in
other words, true or false. This type of calculation is a really efficient way to show data in Tableau. Using
an if statement is way less efficient. Let's drag this field onto the Column shelf and notice we have the
two values, True or False. Let's make a copy of this pill. I am going to Command+drag this pill from
Columns to Color. Now let's Option+drag Order ID to Rows. What we want to do here is to select the
count distinct of our order ID. We want to know how many orders are profitable. If you had selected the
count aggregation, you would be counting all of the rows of data, which is not correct. Let me
demonstrate by changing the aggregation of the pill on Rows to Count. Click on the pill, select Measure,
and change to Count. Notice the number jumped up quite a bit. Remember, we have more than one row
for each order. Using a count aggregation is again just going to count all of the rows for each order. What
we really want to do is to count the distinct order IDs. I see this mistake all the time in a business setting,
so make sure you know what you're counting. Okay, let's revert that change and get back to the right
aggregation. Now let's make this a true side-by-side bar chart by adding another dimension. Let's take a
look at this chart by year. Option+drag Order Date to the first position on Columns. Notice when you
hover over the Columns shelf with a pill, Tableau lets you know where the pill will be dropped by
popping a little orange caret on the shelf. Here is our side-by-side bar chart. Let's set this chart to Entire
View to take up all of the available space. And again, let's hide those column field labels. Right-click
directly on the text and hide them. Now we can see the breakdown of profitable orders within each year.
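The difference between Count and Count Distinct is easy to see outside Tableau as well. Here is a minimal Python sketch, assuming a handful of made-up order lines (the order IDs and products are hypothetical, not Superstore data), where each order can span several rows:

```python
# Hypothetical order lines: one row per product on an order.
rows = [
    {"order_id": "A-1", "product": "Chair"},
    {"order_id": "A-1", "product": "Desk"},
    {"order_id": "A-2", "product": "Lamp"},
    {"order_id": "A-3", "product": "Pen"},
    {"order_id": "A-3", "product": "Paper"},
    {"order_id": "A-3", "product": "Stapler"},
]

# Counting rows is roughly what a Count aggregation on Order ID does.
row_count = len(rows)

# Counting unique IDs is roughly what Count (Distinct) does.
distinct_orders = len({r["order_id"] for r in rows})

print(row_count, distinct_orders)
```

With three orders spread over six rows, the row count is double the number of orders, which is the same jump we saw on the axis.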
Let's change our measure on the Row shelf to a percentage. Click on the pill, navigate to Quick Table
Calculation, and select Percent of Total. Now, the bars show the percent of total for the entire chart. I'd
like to break down the percentage within each year. Let's click on our pill again and select Edit Table
Calculation. Let's click on Specific Dimensions and uncheck the Year of Order Date field. This
configuration tells Tableau to restart the percentage calculation for every year. Let's do some additional
formatting. Let's change our title to Order Profitability by Year. Let's right-click on our axis and clear the
title. We can now alias our Boolean values. Instead of showing true or false, we can make it say
something else. To do this, right-click on the values in the color legend and select Edit Alias. Let's change
True to Profitable, and I'm going to copy the text of Profitable to my clipboard and hit OK. Now, right-
click on False, select Edit Alias, paste in Profitable, and let's update that to say Unprofitable. Let's change
the colors too. Profitable is good, but orange has a negative connotation to it. Let's make Profitable gray
and Unprofitable red. We haven't talked much about color choice yet, but it is very important. We want to
always keep our colors subtle and use them effectively to tell the story we want to convey. And of course, I'm
always looking for a way to clean up chart junk. Let's click on Format, Borders, and turn off our row
borders. Let's click the Lines button and turn off zero lines. Now this is starting to look like a nice chart.
Finally, to hammer home what the weakness of this chart type is, let's navigate over to the Dimensions
pane and right-click on Profitable Orders. Select Replace References and click on Region, and then hit
OK. And now we have a hot rainbow mess. It is really hard to make any sense of a chart like this. So let's
keep this in mind and keep the number of components in these charts low. Let's revert that change and be
satisfied with our side-by-side bar.
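To recap the logic behind this chart, here is a rough Python sketch of the same steps: total the profit per order, turn it into true or false, count distinct orders, and restart the percentage within each year. The rows are invented for illustration, and the code only mimics what the fixed LOD and the table calculation do; it is not how Tableau computes them internally.

```python
from collections import defaultdict

# Hypothetical rows: (order_id, order_year, profit), one row per product line.
rows = [
    ("A-1", 2017, 50.0),
    ("A-1", 2017, -20.0),
    ("A-2", 2017, -5.0),
    ("B-1", 2018, 10.0),
    ("B-2", 2018, 30.0),
    ("B-2", 2018, -40.0),
    ("B-3", 2018, 15.0),
]

# Step 1: total profit per order, similar to a fixed LOD at the Order ID level.
profit_by_order = defaultdict(float)
year_by_order = {}
for order_id, year, profit in rows:
    profit_by_order[order_id] += profit
    year_by_order[order_id] = year

# Step 2: the "greater than 0" comparison turns each total into a Boolean.
profitable = {oid: total > 0 for oid, total in profit_by_order.items()}

# Step 3: count distinct orders for each (year, profitable?) combination.
counts = defaultdict(int)
for oid, is_profitable in profitable.items():
    counts[(year_by_order[oid], is_profitable)] += 1

# Step 4: percent of total, restarting within each year.
orders_per_year = defaultdict(int)
for (year, _), n in counts.items():
    orders_per_year[year] += n
percent = {key: n / orders_per_year[key[0]] for key, n in counts.items()}

for key in sorted(percent):
    print(key, round(percent[key], 3))
```

Because the comparison happens after the per-order total, an order with one losing line item can still come out as profitable overall.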
Histogram
Before we can build a histogram, let's take a second to understand what it is. Histograms are the standard
way to show a statistical distribution. They show the values in a dataset and how often they occur. These
charts display the shape, or skew, of the data and can highlight the lack of uniformity or equality in the
data. These charts are different from bar charts in that histograms are used to show bins of continuous
data, whereas a bar chart is used to plot categorical variables. Here's an example I created using some data
from The New York Times. It is showing the distribution of ages of first-time mothers in the year 1980.
Along the x axis, we have discrete bins of age. In this case, each bin or bar represents one year. The y axis
and length of the bar represent the percentage of first-time mothers for each year. If you have had any
training in statistics, you may recall that a normal distribution looks like a bell curve. This histogram has
a skewed right distribution where the mean age is about 22 years old. Now, in contrast, let's look at the
chart showing the distribution of ages of first-time mothers in the year 2016. The x axis represents the
same discrete bins of age and the y axis and length of the bar still represent the percentage of first-time
mothers for 2016. Also, notice I've used the same y axis to make the comparison between the two charts
equal. In Tableau, you would need to use a reference line or a fixed axis to make charts consistent. The
histogram on the right is a bimodal distribution, or in other words there are two peaks. The mean age in
2016 has increased to about 26 years old. Let's take a second to understand the pros and cons of
histograms, starting with the pros. Histograms allow us to see the values and counts of our data, and this
is one of my favorite charts to build. When I am presented with a new dataset, I usually start with an
exploratory histogram to get a better idea about the value ranges in my dataset. I can see if there are any
outliers or missing data. These are especially useful charts for a dataset with large value ranges. And now
for the cons. Histograms give us a good idea about what the shape of our data looks like, but they are not
great at comparing data across a category. Even when multiple charts are placed side by side, it is still
difficult to extract actual values. These charts show the counts of values within a bin, not the actual value.
Still, these charts are incredibly useful and should be in every analyst's toolkit. Now let's jump into
Tableau and learn about how to build one of my favorites, the histogram. Now that we know histograms
allow us to see the shape of the distribution, let's build a couple. If we reference the Show Me pane and
hover over the histogram option, we can see that these charts are only supported when a bin field is used.
Now let's close Show Me. Let's talk about what bins are. It is a way to divide the range of our data by
equal buckets, or in other words, to divide the entire range of our values into a series of intervals. Bins are
usually specified as consecutive, non-overlapping intervals of a variable. Let's try and explain this using
bins that Tableau ships with Superstore, and then we will build some of our own. First, let's set our
worksheet to take up all of the available space. Next, navigate to the Dimensions pane and look for the
Profit (bin). Notice there is a histogram icon to the left of the name. Let's drag this pill to Columns and
let's drag Number of Records to Rows. Additionally, Tableau has set this to work with a parameter. So
let's navigate to the Parameter pane and right-click on Profit Bin Size. Select Show Parameter Control.
Let's hide our field labels again by right-clicking on the text and selecting Hide Field Labels for Columns.
And then, let's rename our worksheet to Product Profitability. Now let's take a second to understand what
we are looking at. Tableau has created this bin for us, which buckets our profit value into bins of $200. If
we hover over the $200 bin, we can see that there are 272 records in this bin. This means that 272 records
in the dataset have a profit value between $200 and $399. Let's right-click on this mark and click View
Data. Click on Full Data. If we scroll to the right, we can see these values are all in the $200 profit bin.
Let's look at the actual profit value on these records. Let's scroll to the right some more, and now we can
see the raw values. Let's click on the Profit header to sort this data by profit value. We can see that the
minimum value is $200, and if we click on the header again, Tableau will sort in descending order for us.
We can see our max value is 399. So hopefully the concept of binning our values is starting to make
sense. If we close the View Data dialog box and take a look at the shape of the profit data, we can see this
is close to a normal distribution. This data has a wide range, which is resulting in long tails on either end,
and this makes the view hard to use. Let's right-click on the view and select Show View Toolbar. Now
when we hover over our view, the view toolbar will appear in the top left corner. If we click on the
triangle and use the Zoom Area tool, we can zoom in on the distribution. Now, we could also modify the
bin size. These bins use a parameter. We can change the parameter to, say, $100, and notice how the size
of our bins has changed from $200 to $100, which results in more bars. Let's click the Reset Axis button
at the top of the screen, which is represented by the pin, and try to build our own histograms. Let's create
a new worksheet. I want to build two new histograms. First, I want to know what the distribution of the
number of distinct products people buy on their first order looks like. Secondly, I want to know what the
distribution of first order sales looks like. I want to parameterize the second view so that I can change the
value of the sales bins. To build these views, we are going to use some LODs, so I am first going to build
a text table to help you better understand the calculations. First, drag Customer Name on to the Filter
shelf. Let's filter to one customer to make this easier. Look for Sean Miller. We are going to use his order
history to demonstrate the LODs we are using. Let's double-click on Customer Name and place it on
Rows. Let's Option+drag Order Date to Rows and select the Discrete value of Order Date. Let's drag
Product Name to Rows. Let's put Sales on the marks card. Right-click on Sales and select Discrete. And
now, move it to the Row shelf. Here we can see some of the raw data for Sean Miller. Now, let's create a
new calculated field. Navigate to Analysis and select Create Calculated Field. Call this new field First
Order Date. I am going to comment this calculation by adding two slashes and entering calculates the first
order date for each customer. We are going to use a fixed LOD, and we are going to fix the calculation at
the customer name level and calculate the minimum order date. Now let's Option+drag this new field on
Rows right next to Order Date and select the Discrete value. As you can see, Sean's first order was on
March 18, 2015, and Tableau has entered that value on every single row. Now, let's find out how many
products Sean bought on this first order. We can see that he bought seven products on 3/18. Let's right-
click on First Order Date and select Create Calculated Field. When you create a calculated field this way,
Tableau will create a new field with that dimension already populated in the dialog box. Let's call this
First Order Products. We want to create a fixed LOD at the customer name level and do a distinct count
of product name if the order date equals the first order date. Now drag this new field onto the marks card
and make it discrete. Drag this next to Product Name. And notice, Tableau entered 7 on every row for
Sean Miller. We can even drag Product Name off of the Row shelf and now we will still see that we have
seven products. Let's duplicate first order products from the Measures pane and edit the calculation. Let's
rename this to First Order Sales. Let's change the aggregation to sum and replace Product Name with
Sales. You can double-click on Product Name and then start typing Sales to replace it. Now this
calculation will determine the amount of sales over all of the rows for the first order date. Click OK. Now
drag First Order Sales onto Text on the marks card. At this point, let's remove the discrete value for SUM
of Sales and Order Date from the Row shelf. We are now left with Sean's first order date, the number of
products he purchased, and the amount of his first order. These are the values we are going to use to
create our histograms. Let's clear the sheet. Let's right-click on First Order Products; select Create Bins.
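Before we bin anything, the three fixed LOD fields we just built (First Order Date, First Order Products, and First Order Sales) can be sketched in plain Python. This is only an illustration under assumed data: the customers, dates, and amounts below are hypothetical, and the loops stand in for what the LOD expressions return per customer.

```python
from collections import defaultdict
from datetime import date

# Hypothetical order lines: (customer, order_date, product, sales).
rows = [
    ("Ann", date(2015, 3, 18), "Chair", 120.0),
    ("Ann", date(2015, 3, 18), "Desk", 300.0),
    ("Ann", date(2016, 7, 2), "Lamp", 40.0),
    ("Bob", date(2017, 1, 5), "Pen", 3.0),
    ("Bob", date(2017, 1, 5), "Pen", 3.0),
    ("Bob", date(2018, 2, 9), "Paper", 8.0),
]

# First Order Date: the minimum order date for each customer.
first_order_date = {}
for customer, order_date, _, _ in rows:
    current = first_order_date.get(customer)
    if current is None or order_date < current:
        first_order_date[customer] = order_date

# First Order Products and First Order Sales: only rows that fall on the
# customer's first order date contribute.
first_products = defaultdict(set)
first_sales = defaultdict(float)
for customer, order_date, product, sales in rows:
    if order_date == first_order_date[customer]:
        first_products[customer].add(product)
        first_sales[customer] += sales

# Distinct count of products on the first order.
first_order_products = {c: len(p) for c, p in first_products.items()}

print(first_order_products, dict(first_sales))
```

Each customer ends up with exactly one value per field, which is why Tableau repeated the same number on every row of the text table.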
Tableau has suggested the size of the bin based on our data. We can see the minimum is 1 and the
maximum is 9. So let's change our bin size to be equal to a whole number, 1. Hit OK, now find our new
bin field in the Dimensions pane and drop it on Columns. Here you can see our nine bins. Now let's
Option+drag Customer Name to Rows. We want to count the distinct number of customers that purchased
each number of products on their first order. And now we have a bit of formatting to do. Let's change our
measure to be a percent by right-clicking on the pill on Rows and selecting Quick Table Calculation. Pick
Percent of Total. Let's hide that field label on Columns again and set our sheet to Entire View. We can
remove the axis title by right-clicking on the axis, selecting Edit Axis, and clearing the title. Let's also
update the title of our worksheet to Percentage of the Number of Products Purchased on First Orders. The
last bit of formatting might be to dial up the size on the bars. When creating a histogram, reducing the
space between the bars helps to show the shape of the data. Here we have our final histogram where the
distribution is right skewed. Now, let's see what the shape of First Order Sales looks like. Let's create a
new worksheet. Right-click on First Order Sales and select Create Bins. Now we can see the Min is 1 and
the Max is $23,661, which was Sean Miller, if you remember from our example. Tableau is suggesting a
bin size of 770. I think we may want to tinker with the bin size a bit. Click on the drop-down next to 770
and select Create a New Parameter. Let's update the current value to be 500 and format the parameter as
Currency, removing the decimals. Now click OK. Let's set our worksheet to Entire View. Find our new
First Order Sales bin in the Dimensions pane and drag to Columns. Again, we want to Option+drag
Customer Name to Rows and select count distinct. We again want to count the distinct customers in each
bin for First Order Sales. Let's format this histogram just like the last one. We can change our measure to
be a percentage. Again, we are going to hide those field labels by right-clicking on the text and selecting
Hide. We can remove the axis title by right-clicking on the axis, selecting Edit Axis, clearing the title, and
closing the box. Now let's update the title of this worksheet to be Percentage of First Order Sales by
Increments of. Now, we can make a dynamic title by selecting Insert here and injecting the value of our
parameter in the title. When we close the dialog, notice $500 from our parameter value is showing in the
title. Go ahead and update the parameter to $1,000 and notice how the title updates. How about $250?
We also have a null indicator in the bottom right corner. We can right-click on that and select Hide the
Indicator. Here we can see that we have an enormous tail due to the giant first order that Sean Miller
placed. Now, some people make the mistake of excluding missing values. Let me demonstrate. Right-
click on the bin field on the Columns shelf and uncheck Show Missing Values. Notice that all of the null
space that was there before is now missing. While it makes the chart easier to read, notice how the gap
between the bars is no longer consistent. People may make the wrong assumptions about the data if it's
presented this way. There is a $13,000 jump between our last two bars. Let's revert that change and keep
this in mind for the future. The purpose of the histogram is to show the shape of the data, so don't alter the
shape with a bad configuration of your chart.
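If it helps to see binning as arithmetic, here is a small Python sketch of fixed-width bins. It assumes each value is labeled by the lower edge of its bin, which matches what we saw in the demo (values from $200 up to just under $400 landed in the $200 bin). The profit values are made up, and the treatment of negative values is an assumption of this sketch rather than a statement about Tableau's internals.

```python
import math
from collections import Counter

def bin_floor(value, size):
    """Return the lower edge of the fixed-width bin that holds value."""
    return math.floor(value / size) * size

# Hypothetical profit values; not taken from Superstore.
profits = [-150.0, 20.0, 199.99, 200.0, 399.0, 400.0, 1250.0]
bin_size = 200

counts = Counter(bin_floor(p, bin_size) for p in profits)

# Keep the empty bins between the smallest and largest bin so the gaps
# stay visible, similar in spirit to leaving Show Missing Values on.
edges = range(min(counts), max(counts) + bin_size, bin_size)
histogram = [(edge, counts.get(edge, 0)) for edge in edges]

for edge, n in histogram:
    print(edge, n)
```

Dropping the zero-count bins from `histogram` would place the 400 and 1200 bins next to each other, which is the same misleading picture we got when we unchecked Show Missing Values.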
Lollipop Chart
Before we build a lollipop chart, we need to know what it is. Lollipop charts are essentially a bar chart
with a dot on the end. We use lollipop charts to compare the size or the magnitude of things by encoding
the length from 0 to a value on an axis. These charts put more emphasis on the data value than a standard
bar chart. Here's an example that was created by two-time Tableau Zen Master, Pooja Gandhi. Here, she
was visualizing the population of Bermuda by decade. We can see that the lollipop is essentially a thin bar
chart with a point at the end. Pooja has hidden the axis and provided context to the scale of the chart by
labeling the minimum and the maximum values. This was done to address one of the cons of this chart,
which we will discuss. So let's take a second to understand the pros and cons of a lollipop chart, starting
with the pros. Lollipop charts are easy to comprehend and they're easy to build in just a few clicks. They
do a good job of allowing comparisons of categorical data. And now the cons. The point at the end of the
lollipop chart can cause some trouble. First of all, the dot overstates the value. When we draw a mark
using a circle or another shape, Tableau draws that point at the center of the shape. So the shape that we
use for the end of our bar usually extends past the point it's visualizing. This can cause trouble when
trying to make fine or absolute comparisons, and this problem gets worse when the dot is sized too large.
Now let's jump into Tableau and learn how to build a lollipop chart. This is an exact copy of the chart we
built in our horizontal bar clip. We are going to create a lollipop chart. A lollipop chart is basically a bar
chart with a dot at the end. Make a copy of Sales on Columns and drag it back onto Columns. Let's do
things in a different order this time. Right-click on the second pill and select Dual Axis. Notice Tableau
thinks we want to make a dot plot. Let's right-click on the top axis and select Synchronize Axis. Right-
click again and select Show Header to hide it. Now, on the middle marks card, change the mark type to
Bar. Now we just need to dial down the size to a very thin bar. And that's all there is to a lollipop chart.
We will use the dual axis through the remainder of this course, so you will definitely see it in action
again.
Summary
As we wrap up this module, you should now know how to read, build, and format multiple variations of
bar charts, stacked bars, side-by-side bars, histograms, and even lollipop charts. We also discussed some
important concepts, like discrete versus continuous and how this distinction is important to understand so
you know what Tableau will do when you drag and drop a pill onto the canvas. We talked about
removing chart junk a lot so we can focus on the data, and we talked about the shape of distributions and
identifying outliers. We started to introduce the concept of good color choice and creating a better user
experience for our users. We even covered writing some basic logical, table, and LOD calculations. I
encourage you to download the Tableau workbook from the exercise files and build these charts. I hope
you'll join me in the next module where we'll learn how to present data over time using line charts.
Presenting Data Over Time With Line Charts
Introduction
Hi, this is Adam Crahen. In this module, you will learn how to present data over time using line charts in
Tableau. We'll start by learning about variations of charts to display data over time. We'll talk about line
charts, area charts, dual axis, step and jump lines, and we'll round out our module with sparklines. We'll
also continue to talk about key data visualization concepts like discrete versus continuous and date part
versus date value. We'll talk about how these concepts apply to displaying data over time. As a reminder,
for each chart type, we will learn what they're used for and how to read them. We will take a look at some
real-life examples, and then we will walk through how to build the chart in Tableau. If you have multiple
monitors or devices available, this is your cue to restart your two-screen experience so you can follow
along with the demo to become more familiar with building these charts.
In this clip, before we build anything, we are going to discuss another important concept, especially when
working with dates, and that is the concept of date part versus date value. So, what is a date part? Like the
name suggests, it is a distinct part of a date, like month, day or year. The result is no longer a date value,
but instead, we are working with an integer data type, which represents a date part. For example, given
the date May 10, 2019, the date part for day would equal 10, the date part for month would equal 5, and
the date part for year would equal 2019. So what is a date value then? It is a truncated date, but we are
still working with an actual date data type, unlike an integer from the date part. For example, if we were
given the same date, May 10, 2019, day would equal 5/10/2019, month would equal May 1, 2019, or the
first day of the month, and year would equal January 1, 2019, or the first day of the year. Now with that
brief introduction to this concept, let's jump into Tableau and learn how this concept comes into play as
we drag and drop. Before we jump right into the building process, I want to expand on how discrete and
continuous play into date parts and date values. To do this, let's drag Order Date onto Rows. Notice
Tableau automatically put the discrete year of Order Date on the Row shelf. This resulted in Tableau
drawing headers for each year. Let's click on the pill. There are two sections in this drop-down menu
under date fields. One represents date parts, and one represents date values. Let me explain. Date parts are
individual parts of a date. Notice Tableau provides examples of what our result will look like for year,
quarter, month, and day. There are a few more options under the More menu, like Week Number and
Weekday. Let's change our date part to Month. Notice Tableau drew a header for each month. If we
double-click on the pill on the Rows shelf, we can inspect the function Tableau's applying to our field.
Notice this is a DATEPART function, extracting month from Order Date. The concept of date part versus
date value is really important to grasp because we have multiple years of data in our view. To
demonstrate, right-click on the mark for January. Select View Data. In this dialog, select Full Data and
scroll to the right. Notice the actual order dates are from multiple years. Now let's close this dialog. While
date parts default to discrete, they can be changed to continuous like any other pill. Let's demonstrate this
by clicking on our pill and selecting Continuous. Notice Tableau has drawn an axis because we are now
using a continuous setting. Our values are still date parts. Tableau is plotting the marks on the number
that represents each month. Now let's change to use a date value instead of a date part. Click on the pill and
select the Month date value. Notice Tableau shows an example of the result, and month is qualified by
each year. Let's click this option. Tableau still drew an axis because this is currently a continuous field,
but we are using date values. Notice our axis changed from the whole numbers to years, and we have
many more marks because we now have a mark for every month and year. Again, let's double-click on the
pill on the Rows shelf. Notice Tableau is now using a DATETRUNC function on our Order Date.
DATETRUNC is used to truncate dates, but the result remains a date, not an individual date part. For
example, let's say we have the date November 17th. If we truncate month, the day value is replaced by the
first of the month, and it would result in being November 1st. If we truncate November 17th to a quarter,
the month and day would be adjusted, and the result would be October 1st, which is the first day of Q4. If
we truncate to the year, the month and day are again adjusted, but the result would be January 1st of the
same year. Let's set our pill back to discrete. Let's change the format of the field. Click on the pill on the
Rows shelf and select Format. On the Header tab, change the date value to show the standard short date.
Notice the values of our pill are still dates, truncated to the first of each month. When working with dates,
it is really important to understand the difference between discrete versus continuous and date part versus
date value.
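For anyone who thinks in code, the date part versus date value distinction can be sketched with Python's standard datetime module. The two helper functions below are hypothetical stand-ins written for this illustration; they imitate the behavior described for DATEPART and DATETRUNC but are not Tableau functions.

```python
from datetime import date

def date_part(part, d):
    """Return one integer piece of a date, like a date part."""
    return {"year": d.year, "month": d.month, "day": d.day}[part]

def date_trunc(part, d):
    """Return a date truncated to the start of the given period."""
    if part == "year":
        return date(d.year, 1, 1)
    if part == "quarter":
        first_month = 3 * ((d.month - 1) // 3) + 1
        return date(d.year, first_month, 1)
    if part == "month":
        return date(d.year, d.month, 1)
    if part == "day":
        return d
    raise ValueError(f"unsupported part: {part}")

d = date(2019, 5, 10)
print(date_part("month", d))    # 5
print(date_trunc("month", d))   # 2019-05-01
print(date_trunc("year", d))    # 2019-01-01
```

The date part comes back as a plain integer, so every May from every year collapses into 5, while the truncated value is still a date that keeps its year.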
Line Chart
Before we can build a line chart, let's take a second to understand what it is. Line charts display
information in a series of data points that are connected by straight segments. They are the standard way
in which we show changing data or trends over a time series. These charts should only be used for ordinal
data, or in other words, data that has an order like time or rank. Line charts should not be used to show
categorical comparisons because the connection of data points implies an order. These charts are
sometimes known as line graphs. This example comes from two-time Tableau Zen Master Pooja Gandhi.
Pooja is one of the most followed Tableau authors in the world, and this is one of my favorite pieces of her
work. This line chart is meant to show the trend of the temperature on Earth from the 1850s to the 2010s.
The time series data is presented as a temperature anomaly, which is relative to the period of 1961
through 1990. Now you don't have to be a scientist to understand what Pooja was intending to
communicate. The temperature on Earth has never been warmer than it is now, and that point is clear. Pooja
did a wonderful job using annotations to call out important weather events to provide context to our
reader. There's also some cool interaction when you hover over the globes at the top of the screen. For
more inspiration, definitely check out Pooja's Tableau Public profile. This example is one of the earliest
examples of a line chart and comes from William Playfair, who is credited as the inventor of the bar chart,
the line chart, and the area chart. This particular graph shows two lines, one representing imports and one
representing exports, and tells the story of trade between England and Denmark and Norway. He used
some shading to highlight the area in between the lines to show when the balance was in favor of
Denmark and Norway, and then when it shifted to England's favor. These old examples are amazing to
me to see how they incorporated language through the use of excellent titles and captions, annotating
important pieces of information. Let's take a second to understand the pros and cons of line charts,
starting with the pros. Line charts are the standard way for displaying data over time or other ordinal data.
Examples of ordinal data might be showing statistics for a professional athlete by season or game. They
are great at showing the rise and fall of our data, and can even show us small variations. Line charts scale
incredibly well. The example from Pooja had over 160 years of data in one chart. And now the cons. Line
charts should not be used to connect categorical data. Connecting points implies an order. Line charts can
be hard to read when there are multiple lines, but there are techniques like highlighting that can be
employed to help with situations like these. Now let's jump into Tableau and learn about how to build line
charts. Here we are back in Tableau getting ready to build our line charts. Drag Order Date onto Columns
and Sales onto Rows. Here we have a line chart in two clicks. We are currently plotting sales by discrete
years. Let's set our view to Entire View, and once again get rid of that field label. Right-click on it and
select Hide. Now let's change to the date part Month. This chart is combining sales for all years into each
month. To prove this, once again, drag Order Date onto Color of the marks card. And now we have four
lines, one for each year. Adding a field to the marks card changes the level of detail in our view. This
chart allows us to compare year over year changes in each month. For example, we crushed it in
November of 2018 compared to the previous three years. We could make a copy of the pill on the marks
card and drop it on rows. This view will allow us to plot each year in its own pane with its own axis and
shared headers. Now, let's plot some data with continuous dates. Click on the Month of Order Date pill on
the Columns and change it to the Month date value. Notice now we have four lines that are broken. This
is because we have discrete date parts breaking our line. Let's remove the Year of Order Date from Rows.
Notice our lines now line up, but they are still broken. This is because we have discrete year on Color.
Let's remove the pill from the marks card, and here we have our continuous line. This chart shows the
change or trend in sales over all four years, but it is not as easy to compare year over year as in
the last view. Your audience and requirements will drive which chart you build. Let's change our view to
use the Week Number date value. Let's apply some formatting here. Turn down the size of it, and now
let's work on our axis. Right-click on the axis and remove the title. We are showing dates over time, so
the title is a bit redundant. Let's right-click on the axis and select Format. On the Axis tab, let's format the
date to use mmm yyyy. Look at how much nicer that axis looks. Now this does come with one minor
annoyance. If we hover over one of our marks, we can see the week value has also been formatted. One
would think you could click on the Pane tab here and change the date formatting back to the standard
short date. However, this doesn't work. The axis formatting overrides the pane. This option has never
worked, and it's extremely annoying. However, we can fix this. Double-click on Week on the Columns
shelf and copy the text from the pill. Navigate to Analysis, Create Calculated Field. Call this field Week
and paste in the formula. Notice it is truncating our date down to the start of the week. This field will
show in the Dimensions pane. We can right-click on it and say Convert to Continuous. Notice that this
field shows as a date time, and we can tell by the icon next to the name. This means if we use this in our
tooltip, Tableau will try to add a timestamp, but we know there is no timestamp because we just truncated
the date. You can click directly on the icon and change it to a Date data type. If you right-click and open
the calc again, we can see that Tableau wrapped the DATETRUNC in Date. It is important to understand
how Tableau modifies things as we click around the canvas. Close the dialog box. Now Option+drag
Week to Tooltip on the marks card. Select the Week date value variation. Notice Tableau has converted
this field to an attribute as indicated by ATTR on the marks card. An attribute returns one value for each
mark. If there happens to be more than one underlying value, Tableau would return a star. However, since
we already calculated the same field, Tableau will return the week date. There is only one value for each
week. Hover over a mark and notice our dates are now back. You can update your tooltip to remove the
formatted one. Finally, let's navigate to Format, Borders, and ditch the column divider. Click on the
lines icon at the sheet level and remove all zero lines. If we click on the Rows tab, we can then turn on the
axis ruler. This adds a ruler to our y axis. We can still slice sales by other dimensions. For example, drag
Category onto Rows. Duplicate that pill and drop it on Color of the marks card. Let's edit our sales axis
and clear the title. We can hide the field labels on Rows. And then finally, let's give our sheet a new
name. How about Category Sales by Week? Continue to experiment with discrete versus continuous and
date part versus date values as it will really help you while working with dates in the future.
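To make the difference in marks concrete, here is a short Python sketch that totals sales two ways: by month as a date part and by month as a date value. The rows are hypothetical and the grouping only approximates what Tableau does when the pill on Columns changes.

```python
from collections import defaultdict
from datetime import date

# Hypothetical (order_date, sales) rows spanning two years.
rows = [
    (date(2017, 1, 5), 100.0),
    (date(2017, 1, 20), 50.0),
    (date(2017, 11, 3), 200.0),
    (date(2018, 1, 9), 75.0),
    (date(2018, 11, 28), 400.0),
]

by_month_part = defaultdict(float)   # key: month number, years combined
by_month_value = defaultdict(float)  # key: first day of the month, years kept apart
for order_date, sales in rows:
    by_month_part[order_date.month] += sales
    by_month_value[order_date.replace(day=1)] += sales

print(dict(by_month_part))
for month_start in sorted(by_month_value):
    print(month_start, by_month_value[month_start])
```

The date part grouping yields one total per month number across all years, while the date value grouping yields one total per month and year, which is why the continuous view had so many more marks.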
Area Chart
Before we can build an area chart, let's take a second to understand what it is. Area charts are based on
line charts in that they are drawn by plotting points on a graph and connecting those points with straight
lines. But the difference is that the area between that line and the x axis is filled. This fill shows the
magnitude of data similar to a bar chart, so axes should never be truncated for area charts. Similar to line
charts, area charts should not be used to show categorical data because the connection of points implies order.
This example comes from Curtis Harris, who was the winner of the 2016 Tableau Iron Viz contest. Curtis
is an awesome designer among other things, and this was one of his earlier feeder contest submissions.
The contest was focused on data gathered from Wikipedia. Curtis took that data and personalized his
dashboard to tell the story of your birthday. It's designed for you to enter your birthday in a parameter,
and then the dashboard will populate the data in your birth year and birth month, and it'll tell you if
anybody important was born on your birthday. The hero of this chart is the giant area chart on the bottom
right. This particular chart is stacked, so we are seeing the cumulative shape of the data, sliced by month.
The data is sorted with January being the lightest color of orange, and December being the darkest area at
the bottom of the chart. This chart does a good job of showing how the data stacks up. But to compare the
months, we'd need to unstack the data and use a common baseline. Interestingly enough, Curtis made over
his own chart a few years later, and he didn't use an area chart, but I am still drawn to this one anyway.
For more inspiration and great design, check out Curtis's Tableau Public profile. Let's take a second to
understand the pros and cons of area charts, starting with the pros. Area charts have similar pros to line
charts. One additional pro is that they show the magnitude of time series data using fill. Area should be
used with time series or other ordinal data. Area charts do scale well when the number of areas is low
and not stacked. And now the cons. Like line charts, area charts should not be used to connect categorical
data. When not stacking area charts, data might be hidden behind another area. You have to use opacity
on the filled area to allow data to be shown through other fills. Sometimes sorting can be an effective part
of the solution. Stacking areas is one way to ensure data is not hidden and does allow us to see the
cumulative total. However, it introduces a new problem where the slices of data are no longer
representative of the values they are in line with on the axis. And this is because they are stacked. This is
a similar problem when using stacked bar charts. Area charts can be used in a number of ways to display
data, and as a general rule of thumb, you just need to use some caution to not hide or misrepresent your
data. Now let's jump into Tableau and learn how to build area charts. And here we are back in Tableau,
getting ready to build our area chart, which is very similar to a line chart. Let's prove that by making a
copy of our line worksheet. Let's rename this sheet Area. Let's navigate to the marks card, and change the
mark type from Automatic to Area. And we're done. Not quite. Let's create a new worksheet and name
this one Area 2. Option+drag Order Date to Columns, and select the Day date value. Let's drag Sales to
Rows, and here we have a line chart by day, but it is pretty hard to see. You may be asked in a business
setting to create a running total chart. Let's do that by clicking on the pill on the marks card and clicking
on Quick Table Calculation and Running Total. Now let's change our mark type to Area on the marks
card. You may have noticed that the color muted a little bit. Tableau automatically changes opacity to
60% when using an Area mark type because of data overlapping. Here we have an area showing the
running total of sales. Now let's drag Segment onto Color. Now we have three areas, one for each
segment, that are running total charts. This is a stacked area chart. We can see the overall running total of
sales, and it is sliced by the three segments that make it up. Now you might think this is a great looking
chart, but there is a problem. When we stack our data like this in an area chart, the plotted mark is usually
not representative of the axis. To demonstrate this, let's turn on a little-known feature called drop lines.
Right-click on the chart and select Drop Lines. Right-click on the canvas again and select Drop Lines,
Edit Drop Lines, and turn Labels to Automatic. Now let me click on a point in the corporate area. Look at
the value in the tooltip, and now follow the drop line to the y axis and look at the value that is labeled, and
look at the scale. Notice how it lines up. This is the inherent problem with stacking marks. The actual
values are generally overstated if you are to look at the axis. In my opinion, if you are stacking areas or
bars for that matter, you should hide the axes and provide context on the scale of the chart through labels.
Let's clean up this chart. Let's right-click on our date axis, select Edit Axis, and clear the title. Let's give
this chart a new title like Running Sum of Sales by Segment. Now let's duplicate this worksheet and name
it Area 3. Navigate to Analysis, Stack Marks, and change it to Off instead of Automatic. This is a setting
that causes our marks to stack up. Now we have a common baseline for all of our area charts, and the
values will match the axis, but notice it looks like the consumer area, which is blue, is blocking all the
other data. Let's click on Color on the marks card and change the Opacity to 100%. Notice how we can
barely see anything. This was just a test to show you how important opacity is. Let's revert that change.
We still need to address the consumer area blocking our other data. Let's click on the members of the
color legend to highlight our data. The reason our data is partially hidden is because of the way segment
is sorted. Let's drag the members on the color legend into a better order. This looks better, but we still
have some obscured data in the beginning of our chart where Home Office is blocking the other
segments. We could try to click on Color on the marks card and add a white border to our marks color to
try and improve it showing through the area. Now let's click on a corporate point and see if our dropline
looks better. It sure does, so there you have it. In data vis, there are always tradeoffs with choosing a type
of chart. And now you're prepared for dealing with the ones that come with area charts.
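If you want to see the stacking problem in numbers, here is a rough Python sketch under stated assumptions. The daily sales figures are hypothetical, not from the course dataset, and the stacking order simply follows the order of the dictionary, whereas in Tableau it depends on how the segment is sorted. It computes a running total per segment, then the axis position of the top edge of each band when the areas are stacked.

```python
from itertools import accumulate

# Hypothetical daily sales per segment (illustrative numbers only).
daily_sales = {
    "Consumer": [100, 50, 75],
    "Corporate": [40, 60, 20],
    "Home Office": [10, 30, 25],
}

# Running total per segment, similar in spirit to a running total table calculation.
running = {seg: list(accumulate(vals)) for seg, vals in daily_sales.items()}


def stacked_tops(series_by_segment):
    """Return the axis position of each band's top edge when areas are stacked."""
    tops = {}
    baseline = None
    for seg, vals in series_by_segment.items():
        if baseline is None:
            baseline = [0] * len(vals)
        # Each band sits on top of everything stacked before it.
        baseline = [b + v for b, v in zip(baseline, vals)]
        tops[seg] = baseline
    return tops


tops = stacked_tops(running)
for seg in running:
    print(seg, "actual:", running[seg][-1], "position on axis:", tops[seg][-1])
```

Only the first band sits on the common baseline, so only its position on the axis equals its own value. Every band above it is plotted higher than its actual value, which is what the drop line exercise shows, and turning stacking off puts every area back on the same baseline.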
Dual Axis
Let's take a second to understand what a dual axis chart is. Dual axis charts are a way to combine two
charts into a shared axis in Tableau. This provides us access to a second marks card, which opens up the
ability to combine multiple mark types and formatting in a single view, and all of this expands the
possible chart types we can now build in Tableau well beyond the Show Me pane. Here is an example of
a dual axis chart I designed. This chart shows the amount of pounds Americans consume per capita of
chicken, beef, and pork by year. I have combined a line chart and an area chart using a dual axis. Now my
line looks a little funky because I purposely designed it this way to use thickness to show periods of time
when the per capita consumption of chicken increased while both beef and pork were declining year over
year. It's kind of a unique way to add some analysis into this view. Here is another example I created.
Remember that I like the Buffalo Bills? I know, I can't help it. I grew up there. This chart shows Jim
Kelly's quarterback rating for his entire career. This is actually a scatterplot where the points are plotted
on the x axis in the order of the game played. The y axis represents his QB rating. I use some
pre-attentive attributes like color and size to highlight playoff games in large, red dots. On the dual axis, I
have created a 16-game moving average of his quarterback rating. I picked 16 games because that's the
length of a season. This allows us to see how his performance in each game compares to the average. The
moving part of the calculation helps smooth anomalies like really good or really bad games. As you can
see, most of the playoff games were below his average quarterback rating, including all four Super Bowl
losses, which are annotated on the chart. Dual axis charts just open up the possibilities to your
imagination. Like here, I created a frequency chart because I wanted the visual to look like a soundwave
since I was dealing with music data. The x axis represents how many words were used on average by an
artist in the top 100 songs. The length of the bars represents how many songs the artist had in the top 100.
This chart combines circles and bars. It actually uses two sets of dual axes to accomplish the entire look
of the chart. You can download all three of my examples from my Tableau Public profile and take a look
at how they are built. Using a dual axis in Tableau opens up the types of charts we can build. It would be
impossible for me to list the pros and cons of every chart type, so instead, I would say, take the time to
understand the pros and cons of building each chart, and be careful when deciding to combine them into a
single view. Some things just shouldn't be combined. Now let's jump into Tableau and build a couple dual
axis charts. Here we are back in Tableau, and this might be one of my favorite clips in the entire course.
Using a dual axis in Tableau gives us many more options and really opens the path to being creative to
construct awesome charts. Let's work through some examples. The starting point for these examples can
be downloaded from the starter workbook in the course exercise files. First, let's start with a copy of the
line chart we built in an earlier clip. In a view like this, I often like to plot a line on top of an area chart. I
use the area chart to provide subtle context to display the magnitude of the data, and then use a line chart
to make everything pop. To get started, let's make a copy of our Week of Order Date pill on Columns and
drop it back on Columns. Now we have two pills on our Columns shelf and two copies of our chart.
Notice we now have three marks cards. We have one titled All. Any changes made here will be applied to
the entire worksheet. And then we have two Week of Order Date marks cards with a line mark type
indicator. This is because when we added our second continuous pill to Columns, we drew a new axis.
The first marks card represents the first pill on Columns, and the marks card at the bottom is for the
second pill on Columns. Let's expand the bottom marks card and change the mark type to Area. Notice
how the mark type indicator on the marks card has changed to Area and our view is updated. Now, let's
drag Color off the Area marks card, and let's update the color to be almost white, or just a couple shades
away from whatever our background color is. Let's also turn off the borders for the Area mark. Notice
how our area is now very subtle. Right-click on the second pill on Columns and select Dual Axis. Tableau
has moved these charts into the same pane. Notice we have a new axis at the top of our view. This is the
axis for the second pill on the Columns shelf. It is really important to make sure we synchronize our axes
when using a dual axis. Right-click on the top axis and select Synchronize Axis. Now you might be saying
why does that line look a little funky? This is because our area chart is on the top of the line. Tableau
needs to define the z order, or the order in which to layer these charts. To simplify this, if we look at the
Columns shelf, the pill on the right is on top of the pill on the left. Simply drag the pill on the right to the
left, and notice how the line has popped. Let's revert that change. We could also right-click on the top axis
and select Move Marks to the back. Now, right-click on the top header once more and click on Show
Header to hide it. We don't need to have two axes showing the same information. And now we have a
really nice, subtle effect on our line chart. Let's try another example. Click on the Dual Axis 2 worksheet.
Here we have two lines showing sales for 2018 and 2015. We need to build a chart that highlights the
data between these lines. Let's make a copy of the Sales pill on Rows and drop it back on Rows. Now we
have two rows with the same chart. Change the bottom marks card to an Area mark type. Notice how
Tableau has stacked the marks in our area chart, and the axis is independent of the line chart axis above.
We do not want to stack these marks. Navigate to Analysis, Stack Marks, and select Off. Now you can
see here that by using the area chart's weakness of hiding data, we see the orange slice that we want to fill
in the chart above. The bottom looks gray because the orange and blue are overlapping, and the area chart
currently has Opacity set at 60%. Let's make a few changes. Remove the pills on the Area marks card.
Now we have just one area. Click on Color and turn Opacity to 100%. Navigate to the Dimensions pane
and make a copy of our Order Date field by right-clicking on it and selecting Duplicate. Find the Order
Date copy, and Option+drag it to Color on the marks card. Select the date part Year. The reason we made
a copy of the pill is so that we can switch the colors for the years. Notice we now have two color legends.
On the copied legend, click the drop-down and select Edit Colors. In this dialog, you can double-click on
the squares under Select Data Item to get more options. Now this window looks a little bit different on
Windows versus a Mac. Notice we have multiple options to click the perfect color for our chart. We can
even enter a specific hex code. I am going to click on the crayon box and select snow. Click OK, and if
we hit Apply, watch our blue area behind this dialog disappear. Now the orange we are left with is a little
annoying. Let's double-click on the orange square for 2018. I'm going to use the medicine dropper to
select the exact gray from the word GROWN in my title. Hover over the word GROWN and click when
we have the perfect gray color identified. Now, hit OK to select that color, and then hit OK to close the
color options. Now, let's combine the views. Right-click on the second row pill and select Dual Axis.
Right-click on the axis that appeared on the right side of the screen and select Synchronize Axis. Right-
click on that axis again and select Move marks to the back. And right-click on that axis one more time
and select Show Header to hide it. And now, we have successfully highlighted the data in between the
lines thanks to our area charts. The dual axis can be used for so many things. I feel like I could make a
course on just this topic.
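The moving average from the quarterback rating example earlier in this clip can be sketched in a few lines of Python. This is only an illustration of the idea, not the calculation from that workbook. It assumes a trailing window made of the current game plus the games before it, and it assumes that early games with fewer prior games simply average whatever is available. The ratings are hypothetical, and a window of 3 is used so the numbers are easy to follow.

```python
def moving_average(values, window=16):
    """Trailing mean over the current value and up to window - 1 previous values."""
    result = []
    for i in range(len(values)):
        start = max(0, i - window + 1)
        chunk = values[start:i + 1]
        result.append(sum(chunk) / len(chunk))
    return result


# Hypothetical game-by-game ratings (not real data).
ratings = [80.0, 100.0, 60.0, 120.0, 90.0]
smoothed = moving_average(ratings, window=3)

for game, (raw, avg) in enumerate(zip(ratings, smoothed), start=1):
    print(f"game {game}: rating {raw:.1f}, moving average {avg:.1f}")
```

The smoothed series moves less than the raw one, which is why a single really good or really bad game does not throw off the comparison against the average.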
Step and Jump Lines
Let's talk about step lines and jump lines. Both of these are based on line charts. The main difference from
a basic line chart is that line charts portray a consistent rate of change between data points by connecting
them with a straight line. However, that is not always the case. So enter step and jump lines, which use a
modified path. Step lines draw an elbow to connect points, emphasizing the magnitude of change, while
jump lines actually break the line, which emphasizes the duration of change. Both of these use path
modification to highlight significant change between data points. Now this is not the prettiest chart, but it
is a fun example. I taught at my daughter's school during an event called Imagine Adventure Day. I think
over 200 kids came to take part in this repeating session throughout the day. They came into the
classroom, filled out a Google survey, and picked their favorite color, and then we opened Tableau and
visualized the live survey data. The chart shown here is a jump line of votes from that day. To make it fun
for the kids, we colored the jump lines so we could see if there are any patterns in the colors that were
being voted for. I guess I should say, I am generally not a fan of rainbow charts. You can almost see the
number of sessions I taught that day because each longer horizontal line represents the passing of time.
The long red line was lunchtime. Notice how the horizontal lines really emphasize that duration. Let's
take a second to understand the pros and cons of step lines and jump lines, starting with the pros. Both of
these charts are essentially line charts, so the same pros apply. The main difference is how modifying the
path between connected points can better handle an inconsistent rate of change in our data. Emphasize
magnitude with step lines, and emphasize duration with jump lines. And now the cons. These should not
be used to connect categorical data, and you need to take into account your audience's data literacy. These
are not standard chart types, and not everybody knows how to read them. Now let's jump into Tableau
and learn how to build step and jump lines. Here we are back in Tableau, and let's jump right into
building a step line first. Let's Option+drag Order Date to the Filters shelf. Select the discrete Month/Year
variation of the date. Click Next and scroll down, and let's just filter to December 2018. We want to focus
on just a little bit of data for this chart. Drag Category to Columns, and Option+drag Order Date to
Columns. Select the Day date value. Also, let's drop Sales on Rows. Let's make a copy of Category from
the Columns shelf and drop it on Color on the marks card. Okay, let's start with some normal chart clean
up. Set our view to Entire View. Hide the field labels by clicking on the text and selecting Hide. Right-
click on the date axis, select Edit Axis, and clear the title. Navigate to Format, Lines, and let's clear the
zero lines. Now this already looks a lot better. Let's update our title to Change in Sales by Category for
December of 2018. Okay, so we have a normal line chart here. Let's pause for a second and discuss. Sales
do not occur in each of the three categories every day in this dataset. The longer diagonal lines are where
data is spread across many days. We could highlight these larger jumps between days using a step line.
Click on Path, change the line type from linear to step. Step lines plot data at the same value until there is
a change, and then a vertical line emphasizes the magnitude. For example, under Technology, if I hover
over December 18th, we can see that we had sales of $149. The horizontal line shows us that no sales
were made over several days until December 21st where we had $353. Notice, Tableau drew the elbow
connector between these points. Now let's see how jump lines compare. Duplicate this sheet, and let's
rename it Jump Line. Click on Path and change the line type from step line to jump line. Now let's discuss
what we're seeing here. Tableau continues to draw the mark until the data changes, but then Tableau
breaks the line as data jumps to a new level. This chart emphasizes the duration of time we were on the
same level, but totally de-emphasizes the magnitude by not connecting the points. It's actually hard to tell
which line connects to which. If we hover over the same data point for December 18th, we can see
Tableau drew that line for three days, but then did not connect to December 21st, our next data point.
Finally, now that we learned about dual axis charts, let me show you what I think is the winning
combination. Duplicate this sheet, and let's rename it Step Jump Line. Let's make a copy of Sales on the
Rows shelf and drop it back on Rows. Expand the top Sales marks card and drag Category off of the
marks card. Let's click on Path and change our line type back to step. Let's click on Size and dial down
the size a little bit. Click on Color and change to a lighter gray. Now right-click on the second pill on the
Rows shelf and select Dual Axis. Right-click the right axis and select Synchronize Axis. Right-click again
on that axis and select Show Header to hide the axis. Now here we have the combination step jump line.
We have used color to point out longer durations of time spent at the same sales levels, but regained the
ability to see which points connect to each other and the magnitude of that change with the subtle step
line in back. This is just another example of how we can be creative using dual axis charts.
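One way to think about the two path types is as lists of line segments. The Python sketch below is a simplified model, not how Tableau draws marks internally. The first two points use the December 18th and December 21st values from this clip, and the third point is hypothetical. The function names are made up for illustration.

```python
def step_segments(points):
    """Hold each value until the next point, then draw a vertical elbow."""
    segments = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        segments.append(((x0, y0), (x1, y0)))  # horizontal: value is held
        segments.append(((x1, y0), (x1, y1)))  # vertical: magnitude of change
    return segments


def jump_segments(points):
    """Hold each value until the next point, then break the line."""
    # The last point has no following point, so this sketch draws nothing for it.
    return [((x0, y0), (x1, y0)) for (x0, y0), (x1, _) in zip(points, points[1:])]


# (day of December, sales). The third point is made up.
points = [(18, 149), (21, 353), (22, 200)]

print(step_segments(points))
print(jump_segments(points))
```

Both versions share the same horizontal segments. The step line adds the vertical connectors, which is why combining the two on a dual axis brings back the connection between points while keeping the emphasis on duration.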
Sparklines
Before we can build a sparkline, let's take a second to understand what it is. Sparklines are very small line
charts, typically drawn without axes or coordinates. They can be small enough that they are embedded
within a line of text. They present the general shape of variation over time in a simple and highly
condensed way. Sparklines are often meant to highlight variation by using truncated axes. This example is
hard not to show. This is the iOS Stock app. The charts here depict the change in stock price for one day.
The opening price is indicated by a dotted line at 0. The line shows the performance of the stock
throughout the day. If the stock price is above the 0 line, the entire chart will be colored green. If the
stock price dips below the open price, the entire chart will be displayed red, like in the case of DOMO.
Let's take a second to understand the pros and cons of sparklines, starting with the pros. Sparklines are
meant to be extremely space efficient. They can even be as small as one line of text and could be placed
in line within analysis. They are meant to be used where the conversation is happening around the data.
They do a good job of showing the general shape of data and highlight variation over time, and they scale
very well and can compact a lot of data into a small space. And now the cons. Because the lines are so
small, it is hard to extract any real detail. Their intended purpose is more so to inform at a glance, and this
is along the same lines, but generally there are no axes displayed, and there may be truncated axes to
highlight variation. Now, let's jump into Tableau and learn how to build sparklines. Here we are back in
Tableau, and we are getting ready to build some sparklines. Since sparklines are just tiny line charts, let's
start with a duplicated copy of the line chart we built earlier. Let's drag Category off the marks card, and
let's drop Subcategory on top of Category on the Rows shelf. This will replace the pill. Now, notice here
we have 17 line charts in different rows, one for each sub-category. The data is pretty flat across the board with the
exception of a few spikes in copiers and machines that, of course, we would suspect since they would
have higher sales than something like paper. Let's update the title since we changed to sub-category. As
discussed, the point of the sparkline is to show the shape of the data, or to highlight variation. In most
cases, these charts will use a truncated axis to highlight that variation between data points. Let's right-
click on one of the tiny y axes and select Edit Axis. Under the Range options, change from Automatic to
Independent axis ranges for each row or column. Now close the dialog box. Notice how some of the
values that are visible on the y axes are now different depending on the scale of the data within each row.
Our lines now show us the shape of the data relative to each sub-category. For example, who bought
$9,000 worth of bookcases in September 2016? Now, right-click on the y axis again and select Show Header
to hide them. Now to show how sparklines can be condensed into a small space, set our view to Fit
Height. Now drag the right side of the chart to the left, condensing the spread of the points. And now we
can see that our lines are a little too thick. Let's dial the size all the way down. I'm going to click on Color
on the marks card and change to a dark gray. And notice our date axis isn't adding much value anymore.
We have four years of data in that small space. You may either choose to hide the axis completely or we
can right-click on the axis and select Format. Change our custom format to be yyyy to just display the
year, then edit the font, and let's make it size 6 and hit Enter to apply that change. There are a
lot of fun things you can do with sparklines. You could create a dual axis using dates and plot different
points on top of the line. You might want to show the most recent month, and then all of the similar
months in past years. You could also compare these points by adding a reference line. Sparklines are
usually accompanied with some text, so let's add some points of context. Navigate to Analysis and create
a calculated field. Call this field Min. We are going to create a window calculation, which looks at all of
the data in a pane and returns a specific value. In this case, let's do the window_min of the sum of Sales.
Now, let's right-click on Min and go to Default Properties. Change our number format to Currency with
no decimals. We want to look across each row for the minimum value of sales. Now let's duplicate our
Min field and edit the calculation. Update this one to use max. Now let's duplicate that field one more
time and edit the calculation again. Rename this one to Total. Let's update the formula to total sum of
Sales. Now, holding down Command, click on these three pills from the Dimensions pane and drag them
to the marks card. Let's change all three to discrete. Click on the drop-down and select Discrete. Order the
pills to be Total, then Min, then Max on the marks card. Then highlight all of them holding down
Command, continue to hold down Command and drag all three from the marks card to Rows. Let's navigate
to Format, Alignment, and change the default pane and header to be right aligned. We did this since we
added dollar amounts in the table. We should always right align dollar amounts when shown in a text
table. Click on the font icon and dial it down to size 7. Then click on the borders icon and turn our header
border back on and make sure our row and column dividers are also turned back on. We may need to drag
our columns to adjust the size a little bit. And finally, this is probably the only time I will say this in this
course, navigate to Analysis, Table Layout, and turn on field labels for rows. And here's our sparkline,
and it's packing almost 2700 data points in this tiny space.
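The three context fields we added (written in Tableau as something like WINDOW_MIN(SUM([Sales])), WINDOW_MAX(SUM([Sales])), and TOTAL(SUM([Sales]))) can be pictured with a small Python sketch. This is only an approximation of the idea: it treats each row of sparklines as one pane and summarizes the aggregated values inside it. The sub-category names are from the view, but the sales figures and the function name pane_summary are hypothetical.

```python
# Hypothetical aggregated sales per period for two rows of the sparkline table.
period_sales = {
    "Bookcases": [1200, 9000, 800],
    "Paper": [300, 450, 500],
}


def pane_summary(values):
    """Summarize one row: the total plus the smallest and largest plotted points."""
    return {"Total": sum(values), "Min": min(values), "Max": max(values)}


summary = {name: pane_summary(vals) for name, vals in period_sales.items()}

for name, stats in summary.items():
    print(f'{name}: total ${stats["Total"]:,}, min ${stats["Min"]:,}, max ${stats["Max"]:,}')
```

Each row gets its own three numbers, which is the same reason we look across each row for the minimum in the view: the text next to a sparkline should describe only that line.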
Summary
As we wrap up this module, you should now have a solid understanding of how to present data over time,
especially how to read, build, and format multiple variations of line charts, area charts, dual axis charts,
step and jump lines, and we rounded out the module with everyone's favorite, sparklines. We continued to
discuss key concepts like understanding how discrete versus continuous pills control what Tableau draws
on our canvas, and we took this to the next level by digging into the difference between date parts and
date values, which is a critical component when working with data over time. We continued exploring the
how and the why of the charts that we built through the use of tools like drop lines and calculations, and,
of course, we continued our pursuit of good chart formatting. I encourage you to download the Tableau
workbook from the exercise files and rebuild these charts as practice makes perfect. I hope you'll join me
in the next module where we'll unlock additional context with statistical and more non-standard charts.
Introduction
Hi, this is Adam Crahen. In this module, we will learn how to unlock additional context with statistical
and nonstandard charts. In this module, we'll be covering how to build chart types, such as the scatter
plot, a connected scatter plot, box and whisker plots, a bullet chart, and context bars. We'll also be diving
into statistical concepts like correlation and summarizing distributions. As a reminder, for each chart type,
we will learn what they're used for and how to read them. We'll take a look at some real-life examples,
and then we will walk through how to build the chart in Tableau. If you have multiple monitors or devices
available, this is your reminder to restart your two-screen experience so you can follow along with the
demo and become more familiar with building these charts.
Scatter Plot
Before we can build a scatter plot, let's take a second to understand what it is. Scatter plots are the
standard way in which we show the relationship or correlation of two continuous variables. Each variable
has its own axis, and points are plotted on the position that corresponds to both. Correlation can be tricky.
While two variables may be correlated, it does not mean that one is causal, or in other words, that one
variable caused the other. We need to be clear about this when we communicate data to our readers. This
example is called Anscombe's quartet, which was designed by Francis Anscombe, who was a statistician.
This famous diagram shows four datasets that have nearly identical descriptive statistics, yet they appear
very differently when they are visualized in a scatter plot. This example shows us the importance of
visualizing data before we analyze it, and it points out the effect outliers have on statistical properties.
These charts all have the same mean, sample variances, correlation, linear regression lines, and
coefficient of determination of the linear regression. Essentially, the devil is in the details, including data
visualization. Let's take a second to understand the pros and cons of scatter plots, starting with the pros.
Scatter plots allow us to see the range of our dataset and show us the correlation between the X and Y
points. We can apply techniques like clustering to look for common properties of points. And, scatter
plots can handle a lot of data, so they scale pretty well. And now the cons. While they can handle a lot of
data, over plotting can be a problem when it is dense. Techniques like highlighting, using opacity, and
zooming into the chart to examine areas with a lot of points can help remedy these issues to some degree.
Scatter plots are definitely one of those charts that people struggle with understanding. You need to take
care to make sure the user of this chart is aligned with your intentions. Also, unless people are told, they
will assume causality from correlation in these charts. Make sure to caveat these charts with correlation
does not equal causation. Now, let's jump into Tableau and learn how to build a scatter plot. Here we are
back in Tableau getting ready to build our scatter plot. For this analysis, I'm interested in comparing sales
and profit ratio by customer. Let's start by setting our worksheet to Entire View. Double-click on Sales,
which places the pill on the Rows shelf, and let's drag Profit Ratio to Columns. Our two continuous pills
told Tableau to draw two axes. Notice we already have one mark in our view. Tableau has aggregated the
values across our entire dataset. We want a point for each customer, so drag Customer Name onto Detail
on the marks card. Now Tableau has drawn a mark for every customer aggregating their total sales and
profit ratio across all their orders. These values are placed in the position that corresponds to both axes.
For example, if we hover over this top mark, we can see our buddy from the histogram clip, Sean Miller,
and he has $25,000 in sales. But he has a -8% profit ratio, meaning we aren't making any money on that
guy. Now, you have some choices when it comes to mark type. Tableau defaults these to rings. I much
prefer the use of circles in scatter plots. However, you can really choose any shape that makes sense for
your audience. You can even load custom shapes under your My Tableau Repository in your Documents
folder, and that could be used to plot points. For example, navigate to the Dimensions pane, and let's drag
Sub-Category on top of Customer Name on the marks card to replace it. Click the icon on the left of the
Sub-Category and change it to use Shape. Now we have 17 marks using shapes, and some are repeated as
we can see by the shape legend. I never liked the use of shapes like this. I think it's really hard to decipher
and I feel it really affects the user experience. However, sometimes using a shape could be good. Let's
pretend that our sub-categories are names from some of the 2018 Zen Masters. We could plot their faces
on this chart. I am going to click on the Shape button on the marks card to edit shape. I am going to
navigate to a custom shape folder I created with all of our faces. I am going to click the Assign Palette
button to change the shapes to our faces. Now let me adjust a few faces here, and then I hit OK. Now, it
looks like I need to dial the size up a bit. Let me hover over my face, and it looks like I have a 10% profit
ratio and the third highest sales. Okay, now that was a fun example. We have flexibility in Tableau to
create amazing visuals, but it all comes down to what your audience will find useful. Let's go ahead and
revert our changes until we see the Customer Name reappear on the marks card. Okay, and we are back to
comparing all of our customers. Let's change the mark type to Circle. I really do like the filled marks
better, but we have some overplotting going on. Let's click on Color on the marks card and dial Opacity
to about 60%. That's a little better. Sometimes adding a border can help identify where the marks start and
end. Click on Color and change our border to the lowest setting underneath white. That helps, but we
could always zoom in using the View toolbar if needed. Now I want to try out Tableau's clustering
capabilities. Let's click on the Analytics pane and drag Cluster onto our view and drop it on the Create
Clusters box that appears. Notice Tableau found three clusters, and we can tell this by looking at the
legend. Tableau is using a k-means clustering algorithm behind the scenes to statistically segment our
data into these clusters based on our two variables. We can change our clusters by adding more variables
to our view. For example, I am pretty curious if adding Average Discount on the Size mark would unlock
some additional analysis. Are these customers in red really unprofitable because we're providing
discounts? Let's drag Discount onto Size on the marks card. The average aggregation was automatically
selected because that field is set to use average by default. Notice that many of our larger circles, which
represent a larger average discount percentage, are on those customers with a negative profit ratio. Now,
let's edit our clusters. Click on the pill on the marks card and select Edit Clusters. Let's drag discount from
the Measures pane into the list of variables. And now, with more information, Tableau has identified just
two clusters. There's a little bit of formatting left that I would do. I would navigate to Format, Lines, and
on the Rows options, I would turn off the zero lines. I actually like 0 on columns in this view because it
provides context. Left of the line, we are losing money, right of the line, we are making money. Now let's
click on the Sheet line options and turn on our axis rulers. Let's adjust our title to Customer Analysis,
Sales vs. Profit Ratio. And now we know how to build a scatter plot.
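For anyone curious about the k-means idea mentioned in this clip, here is a minimal Python sketch of the basic algorithm. It is not Tableau's implementation. It assumes the inputs are already scaled to comparable ranges, that k is given, and that the first k points are used as starting centroids. The sample points are hypothetical.

```python
def kmeans(points, k, iterations=20):
    """Minimal Lloyd's k-means on (x, y) tuples; returns a cluster index per point."""
    centroids = list(points[:k])  # naive seeding: first k points (an assumption)
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(
                range(k),
                key=lambda c: (p[0] - centroids[c][0]) ** 2
                + (p[1] - centroids[c][1]) ** 2,
            )
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = (
                    sum(m[0] for m in members) / len(members),
                    sum(m[1] for m in members) / len(members),
                )
    return labels

# Hypothetical, already-scaled (sales, profit ratio) points.
sample = [(0.1, 0.2), (0.9, 0.8), (0.15, 0.25), (0.85, 0.9)]
labels = kmeans(sample, k=2)
```

Adding a third variable such as average discount would simply add a third coordinate to each point and to the distance calculation, which is why the clusters can change when more variables are included.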
Connected Scatter Plot
Before we build a connected scatter plot, let's take a second to understand what it is. Essentially, it is the
same as a normal scatter plot with the addition of connected points as the name implies. The plotted
points show us the correlation of two variables. Each variable has its own axis and the points are plotted
on the position that corresponds to both axes. Connected scatter plots take the visualization a step further
by connecting the points by some order, like time, for example. And this helps us identify the patterns.
This is definitely an advanced chart type, but when it works well, it works, and when it doesn't, there
probably is a better way. Still, I think you will enjoy some of the tricks you'll learn in this clip, even if you
never build one. This is an example of a connected scatter plot I designed. It is actually four of them. The
data comes from the Fraser Institute, who published a dataset that includes the world economic freedom
rankings. It included several indicators of economic freedom and the size of the government for multiple
years. So under the hood, this is a scatter plot, which plots the points of the four indicators as shown on
the X axis and the size of the government on the Y axis. Larger points represent more recent years. There
are many countries plotted in this view, so you can highlight a country using the parameter in the top
corner. In this case, we are focusing on Venezuela, which is represented by the black line. So in each of
the four indicators, you can see the pattern of change, which begins at 1970 with the thin end of the line,
and ends at 2015, represented by the thicker end. The path order of the line is based on the order of year.
All four of these charts end with the smallest value on the X axis. This means that Venezuela's more
recent rankings are an indication of a decline in economic freedom. If we had just plotted the points, the
pattern of decline would not have been as easy to decipher as it becomes using this connected scatter plot.
Let's take a second to understand the pros and cons of connected scatter plots, starting with the pros.
Connected scatter plots can display the ranges of our data and correlation between our X and Y points,
just like a scatter plot. However, the addition of connecting the points provides us context that would
normally go unnoticed. These connections can be extremely useful in identifying patterns. Another
example use case would be to analyze user activity on a platform. Connected scatter plots can identify
decay patterns on this activity that would otherwise go unnoticed. And now the cons. Like normal scatter
plots, overplotting can be a problem with dense data. Highlighting, using opacity, and zooming help to
remedy these situations. Audience data literacy is huge with these charts. This is an advanced chart
showing how two variables move over time together. I've heard these described as "hey, that's a nice
squiggle" or "did your kids draw that?" However, they can be very useful, but only if your audience is ready
to use them. Now let's jump into Tableau and build a connected scatter plot. Here we are back in Tableau,
and we are going to build our connected scatter plot. But first, we are going to use a different data source
for this example. Click on the New Data Source button at the top of the screen. Connect to the World
Indicators dataset in the Saved Data Sources section. Let's set our sheet to Entire View. Now navigate to
the Measures pane, drag Infant Mortality Rate to Rows and Health Expenditures as a percent of GDP to
Columns. Now this data has one row per country per year. We want to drag Country onto detail on the
marks card. And now we will have a mark for every country. Let's add Region to color. The points that
are plotted on our screen still have multiple rows of data behind them for each year. Let's drag Year to
Detail on the marks card. And now we have a point for every country for every year. And at this point,
this looks like a hot mess. Let's do some cleanup. First, let's change our mark type to Circle. Then, let's
click on Color and adjust our Opacity to 70% to help with our overplotting. Let's also dial down the size
of our marks a little bit. And this is starting to look a lot better. Now we can right-click on the null
indicator on the right side of the screen and select Hide. And so here we have a scatter plot. Now we need
to connect it. Change the mark type to Line. Notice everything looks like rainbow spaghetti. In order to
connect these points, we need to tell Tableau how to connect them. And that is what path is for on the
marks card. Click on the icon next to the Year pill on the marks card and change from Detail to Path. And
here is our connected scatter plot. Now, this might not look interesting yet because we can't tell which end
of the line is the beginning. To resolve this, click on Year on the marks card and change this pill to
Continuous. Now let's take that pill and make a copy and drop it on Size. The reason we are doing this is
so that the thin lines will represent the start of the line and the thick line will represent the end of the line.
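As a rough illustration of what Path and a continuous Year on Size are doing together, here is a small Python sketch. It assumes a simple linear mapping from year to line width. The build_paths helper, the width range, and the sample rows are all hypothetical.

```python
def build_paths(rows, min_width=1.0, max_width=6.0):
    """Group (country, year, x, y) rows into year-ordered paths with a width per point."""
    years = [year for _, year, _, _ in rows]
    first, last = min(years), max(years)
    span = (last - first) or 1  # avoid dividing by zero for a single year
    paths = {}
    # Sorting by country, then year, mirrors using Year as the path order.
    for country, year, x, y in sorted(rows, key=lambda r: (r[0], r[1])):
        # Continuous year -> linear width: thin at the start, thick at the end.
        width = min_width + (max_width - min_width) * (year - first) / span
        paths.setdefault(country, []).append((year, x, y, width))
    return paths

# Hypothetical rows: (country, year, x value, y value), deliberately out of order.
rows = [
    ("Country A", 2012, 15.0, 6.0),
    ("Country A", 2000, 6.0, 12.0),
    ("Country A", 2006, 10.0, 9.0),
]
paths = build_paths(rows)
```

The points come back ordered by year regardless of the order of the input rows, and the width grows steadily from the earliest year to the latest.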
If we had left this pill discrete, we would've had individual values for each year in the dataset on size. By
changing it to Continuous, we have a continuous range for our size legend. We may need to adjust our
size again at this point. What we are seeing here is that most countries in this dataset started with a
higher infant mortality rate and it seems to be dropping, as indicated by following the path of the line to
the thicker end, which represents a later year. Right-click on Country on the marks card and select Show
Highlighter. Type Liberia in the highlighter on the right side of the screen. If we hover over the first
point, we can see that in the year 2000, the infant mortality rate was about 12%, and the percentage of
GDP spent on healthcare was close to 6%. If we follow the path of the connected points, we can see that
in 2012, the infant mortality rate dropped to below 6% and the percent of GDP spent on healthcare has
increased to over 15%. We could make a copy of the Region pill from the marks card and drop it on
Columns. This will allow us to see where the rates were in each region. And it sure is nice to see that
infant mortality rate decline. Now let's revert that change. Let's give this sheet a nice title, like Declining
Infant Mortality Rates. Now, since this is such a powerful story, there's one more way we could take a
look at something like a connected scatter plot. Make a copy of this sheet and rename it to Connected
Scatter Plot Animated. That's right, let's see this story in action. Let's change our mark type back to
Circle. Notice how without the lines now, we just can't discern any pattern at all like we just saw. It's just
a lot of points on the screen. Let's drag the Year pill to move it from the marks card to the Pages shelf.
Notice we have a new Pages shelf control card up here. Let's change the speed to fast and then check the
box to show history. Now click on Show history and let's configure this. Let's show the history for all
marks. Let's change from just marks to both trails and marks. And let's set our Fade higher, we really
want to fade our trails out. Finally, let's format the line to be thin dashed lines. Now, let's dial the size up
on our marks just a little bit, and now click Play. Now, it's really cool to see the data in action and move
like this, and it is important to humanize data whenever we can. This data is literally showing how the
deaths of babies are on the decline. This is an awesome story, and I think we've done it justice with our
connected scatter plot and pages animation that, in the end, looks like a connected scatter plot. One final
note, the Pages shelf does not work on Tableau Server or Tableau Public. There are more technical
solutions for animation, but this trick I just showed is for desktop only for now.
Box and Whisker Plot
Before we build a box and whisker plot, let's take a second to understand what it is. Box and whiskers are
used to visually depict distributions through quartiles. We can see things like the range of our data, the
median, the degree of dispersion or spread, and skewness in our data. They also identify outliers. These
charts are more commonly called box plots. We are going to spend a little extra time learning how to read
this chart because they're not widely used outside of statistics and many people are unfamiliar with them.
Box plots visually display the quartiles of our data, and it all starts with the median. Fifty percent of our
data will be below the median, and 50% of our data will be above the median. The median is the center of
our dataset. The box represents our inner quartiles. Fifty percent of the data in the distribution is inside
the box, 25% is above it, and 25% is below it. The different colors inside identify the range of
the data within the inner quartiles on either side of the median. The whiskers represent the data spread to
our outer quartiles, and they also represent the smallest and largest values of our distribution, excluding
outliers. The points outside of the box and whiskers are our outliers. Let's take a second to
understand the pros and cons of a box and whisker plot, starting with the pros. Box and whisker plots
provide a clear summary of distributions and give us an idea for the middle of our data, how the data is
segmented into quartiles, the ranges of our data, etc. They do a good job of identifying outliers, and they
can handle a lot of data. And now the cons. The exact values and details are not retained in the
distribution results. The plot is a simple summary of the distribution. Use a box plot in combination with
another statistical graph method like the histogram for a more detailed analysis of the data. Audience data
literacy is a major issue with this chart. It downright scares people, which is why I spent so long with the
extra slide on how to read the chart. If you build one of these, I would anticipate needing to explain it to
anyone who might need to use it. Now, let's jump into Tableau and learn how to build a box and whisker
plot. Here we are back in Tableau, getting ready to build our box plots. For this chart, we are also going to
use the world indicators data source that ships with every installation of Tableau. We are already
connected to it from our connected scatter plot clip. We are going to create a new view to analyze mobile
phone usage distributions across each region. Let's get started. Let's set our view to Entire View. Drag
Year to the Filters shelf. Select the discrete Year. Click Next. Scroll down and select 2012. We are going
to look at data for a single year. Let's navigate to our Dimensions pane and double-click on Region to add
it to the Rows shelf. Let's navigate to the Measures pane and drag Mobile Phone Usage to the Columns
shelf. And let's duplicate that Region pill and drop it on Color on the marks card. Now we want to create a
point for each country within each region. So let's put country on the marks card. And so now we have a
rainbow stacked bar. The easiest way to create a box plot is by using the Show Me pane. Let's open it up,
click on box and whisker, and close Show Me. Now let's talk about what Tableau did here. For starters, it
changed our view from Entire View back to Standard. So let's go ahead and change it back to Entire
View. It changed our mark type to Circle instead of Bars to create the points. It swapped our axes on us.
We originally put Region on Rows. Let's right-click on our axis and select Edit Reference Line. This is
how Tableau drew the box and whiskers. Fun fact, they refer to the chart as a box plot here. There are
some formatting things you can do in here. You can extend the whiskers to the maximum extent of the
data or leave it on its current setting to identify outliers outside of the whiskers. You can hide all points
except outliers by checking this box. There are a number of styles in here that you can choose from for
your color palettes. And you can style the border and whisker size along with the colors individually. I am
just going to click on the fill and dial the opacity to 50% to allow our points to show through the box a
little better. Now let's swap our axes back the way we had them so we don't have to turn our head
sideways to read our region headers. Let's right-click on field label for rows and hide it. Let's duplicate
our Region pill and drop it on Color on the marks card. Let's dial the size of our marks up so we can
actually see them. Let's right-click on the null indicator and select Hide. And then let's navigate to
Format, Lines, and at the Sheet level, turn on our axis rulers. And finally, let's adjust our title to
Distributions of 2012 mobile phone usage by region. And here we have our box plot. And you might
recognize the orange record from Asia because that was used in our example when I demonstrated how to
read a box plot. So let's talk about what we're seeing here. Now we have more context. We are looking at
the distribution of mobile phone usage in each region. Each row is a region and each point is a country
within the region. If we take a look at Asia, we can see our median, it's where the color changes inside the
box, and we can see our four quartiles. If we hover over the reference line of the box and whisker, we
can see the values of our median and the four quartiles. We can also see that we have two outliers. And if
I hover over them, we can see that both of them are from China. There is also an outlier in the Americas.
If I go down and hover over that point, we can see that it's Cuba. Now, the only optional thing I would do
here is if you wanted to clearly see each point within a region without them overlapping, we can jitter this
plot. This is a technique I showed back in the dot plot clip. Let's double-click on Rows and type
RANDOM(), with an open and closed parenthesis. Let's click on the pill and change it to a dimension. This technique
creates an axis on rows and it assigns a random value to each mark between 0 and 1. This value will
change each time the data refreshes. Let me simulate this by right-clicking on our data source and
selecting Refresh. Notice how all the positions changed. Let's hide our axis because the position is
meaningless. This is just a technique to allow us to see more of our data. This can be really useful when
you have a lot of overplotting. But in this case, I prefer not jittering the chart. Let's remove Random from
Rows and be satisfied with our box and whisker plot. These charts can seem complex at times, but they
are really simple summaries of the distribution. And once we learn how to read them, we are well on our
way to actually building them.
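If it helps to see the summary as arithmetic, here is a minimal Python sketch of the numbers behind a box plot. It assumes the common convention that whiskers stop at the last data point within 1.5 times the interquartile range, and it uses the "inclusive" quartile method from Python's statistics module. Quartile conventions differ between tools, so the exact values may not match what Tableau draws. The sample values are hypothetical.

```python
import statistics

def box_plot_summary(values):
    """Return quartiles, whisker ends, and outliers for a list of numbers."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    # Assumption: points beyond 1.5 * IQR from the box count as outliers.
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    inside = [v for v in values if low_fence <= v <= high_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": min(inside),
        "whisker_high": max(inside),
        "outliers": sorted(v for v in values if v < low_fence or v > high_fence),
    }

# Hypothetical values with one obvious outlier.
summary = box_plot_summary([1, 2, 3, 4, 5, 6, 7, 8, 30])
```

The box runs from q1 to q3 with the color change at the median, the whiskers stop at whisker_low and whisker_high, and anything listed under outliers is drawn as a separate point.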
Bullet Chart
Before we can build a bullet chart, let's take a second to understand what it is. Bullet charts are used to
measure performance against the context of a target or a performance range. Actual performance is
displayed by using a bar to visualize the magnitude of the measure while targets are drawn using
reference lines or Gantt bars on a dual axis. These charts were invented by Stephen Few to display a rich
story clearly and in a small space. Here's an example I created for a data project that looked at the
percentage of usable toilets in Indian schools in the year 2016. The data was broken down by state and
gender. This data highlighted some pretty disturbing trends. First, I was a little concerned that toilets in
schools are unusable in the first place, and second, in every state but one, the percentage of usable toilets
for girls was drastically lower than it was for boys. In this view, the percentage for boys is plotted by the
green bars, and I use the white bullets to show the percentage for girls as a comparison. I colored the text
in the titles to match the mark color. In the one state where the percentage for girls matched the
percentage for boys, the bullet is colored
gold. These charts use multiple measures, so I think it is important to offer sorting options. In the view we
are looking at, the sort is set by the percentage difference. The higher the row, the closer the girls were to
boys in that state. I also offered a view to sort the bullets to see which state had the highest girl
percentages, and another view to sort the percentage that was usable for boys. Let's take a second to
understand the pros and cons of bullet charts, starting with the pros. Bullet charts allow us to display
performance versus targets or a range of targets. They display magnitude with the added context of those
targets. They are rich, yet compact visuals, and they are built off the bar chart, which is one of the easiest
charts for people to understand. And now the cons. Displaying target distributions behind the bars and
plotting a bullet on top of the bars can often confuse people. In my experience, this chart is not consumed
as easily by non-data people as one would think. I would consider your audience's data literacy and
expect to have to explain the target part of the visualization. I find that, as usual, being intentional about
the language you display on the chart can aid the reader in using these. Now let's jump into Tableau and
learn how to build a bullet chart. Here we are back in Tableau, and we're getting ready to build our bullet
chart. We are, again, using the Superstore dataset. In this view, we need to build a chart that will allow us
to see how sales are tracking in 2018 versus our target goal. So let's get started. As usual, let's set our
sheet to Entire View. Let's drag Order Date to the Filters shelf. Click on the discrete Year, click Next, and
select 2018. We only want to look at one year of data in our view. Click OK. Now the main point of a
bullet chart is to visualize an actual value versus a target. We don't have a target, so let's create one. Let's
create a new parameter so we can allow our users to enter their expected values and update the view.
Right-click in the whitespace of the data pane and select Create Parameter. Let's name this parameter
Target %. Let's keep the Data type Float because we are going to enter percentages as decimals, let's
update the Current value to .8, and change our Display format to a Percentage with 0 decimals. I am
going to select a range here so we can create some controls around our view. I don't want anyone entering
anything below 10%, so enter .1 for Min. Similarly, I don't want anyone entering more than 150%, so let's
enter 1.5 on the Maximum line. And also, let's allow our users to only change the target in 5% increments,
so enter .05 for the Step size. Click OK. Let's right-click on our parameter and select Show
Parameter Control. Now that we've shown our parameter control, we need to use it in a calculation. Let's
navigate to Analysis and click on Create Calculated Field. Let's name this calc Target. Now for some real-
life business logic. Our executives would like us to add 25% to any targets for the technology category, so
our logic would be if min Category does not equal Technology, then sum Sales times the Target %, else
sum of Sales times the Target % + .25. Because we are aggregating our sales values, Tableau will give us
an error if we do not aggregate our dimension. We did this using min because it is more efficient than
using an attribute. Each sub-category belongs to exactly one category, so the min of the field
will return the same value as the dimension itself. Click OK. Now, the easiest way to build a bullet chart
that includes reference lines is to use the Show Me pane. First, let's drag Sub-Category to Rows, Sales to
Columns, and Target to Rows. Now, this isn't the view we are intending to build. Click on the Show Me
pane, click on the bullet graph, and close Show Me. Now let's talk about what Tableau did to our view.
First of all, it set our view to Standard, let's change that back to Entire View. Next, it moved Target to the
marks card. Let's right-click on our axis and hover over Edit Reference Line. Notice Tableau created two
reference lines for us. Let's start by looking at Average Target. Here, Tableau has created a reference line
using the Per Cell scope. This is what allows Tableau to add a reference line to each row. For example, if
we switch that to Entire Table, notice that Tableau has only drawn one reference line behind our view.
Let's switch that back to Per Cell. Now, under the Line section, Tableau is using our target measure to
draw the line. The aggregation here doesn't really matter because we predefined that in our calculation
already. Leaving this on Average is fine. Tableau is not labeling the reference line, and the rest of these
options are just formatting the line and not filling the areas above or below the lines with color. Let's click
OK. If I hover over one of these target lines, I can see the value of what our target calc produced. Now,
let's take a look at the second reference line. Let's right-click on our axis and hover over Edit Reference
Line and select the 60%, 80% of Average Target. Here, Tableau is using a reference distribution. And the
scope is set to Per Cell, so the values are calculated for every row. If we click on the drop-down for Value
under Computation, we can see some of the options we have for computing this distribution. I am going
to change our percentages to 50 and 75. Notice Tableau is using our target measure and using a total
aggregation. Notice our values have changed because I updated the percentages. Tableau is not labeling
this reference distribution, no line is being visualized. Instead, Tableau has applied a fill to our
distribution. We have additional color options if you click on fill. Let's click OK. If I hover over the end
of one of the fill areas, you can see the value for the 75% distribution. Notice it also highlights the 50%
line. Now, if I hover over that line, I can see the 50% distribution. Let's talk about the visual now. If we
hover over Phones, we can see we have over $105,000 in sales. We can see we are nearly at 75% of our
target because the blue bar is almost at the end of our filled area, which is $106,000. The black bar is our
yearly target at $142,000. So we are about 25% away from the target. This chart is rich in data and does
not take up that much space. Let's clean up this chart. Let's hide our field label for rows by right-clicking
on it and selecting Hide. Let's update our title to Sub-Category Sales vs. Target. Now, there is another
way to build a bullet chart that does not use a reference line and it provides us more control over the
tooltips that are displayed to our users. Let's duplicate this sheet and rename it to Bullet 2. Let's right-click
on the axis and select Remove All Reference Lines. Let's drag Target from the marks card to Columns
because we are going to create a dual axis. Notice we have two new marks cards to work with. Let's click
on the Target marks card and change the mark type to Gantt Bar. Let's change the color of the bar to
orange. And let's click on the Sales marks card and change the bar color to gray. Now right-click on the
Target pill on the Columns shelf and select Dual Axis. Tableau has gone rogue on us again, let's get rid of
measure names from the marks card on both Target and Sales. Now change the mark type on Sales back
to Bars. Let's right-click on our top axis and select Synchronize Axis. Right-click again and uncheck Show
Header to hide the axis. Let's do some quick cleanup. Navigate to Format, Lines, at the Sheet level, turn
off zero lines and turn on axis rulers. At the Columns level, turn off the grid lines. Now click on Borders,
and at the Sheet level, lighten our row dividers and turn off column dividers. Now this version is starting
to look much cleaner to me. In my experience, people struggle with reference distributions unless
someone explains it to them. To me, this view is much clearer. Some quick upgrades to this chart would
be the title. Let's change the color of the word Target to match the orange of our Gantt bar. I am going to
use the medicine dropper to pick the exact color of orange. And then, let's inject our parameter into the
title. To do this, let's click on Insert and select our target parameter. And then I'm going to dial down the
size a little bit. Now, if we click on the Target marks card, we can adjust the size of the target bar. This
control is nice. If you use the reference line method, Tableau will draw the reference line to take up the
entire space between the row dividers. Here we can let the mark breathe a little. And the last upgrade we
can make is to provide some dynamic sorting. Your users may want to sort by sales or the target. Let's
right-click in the whitespace of our data pane again and select Create Parameter. We already have a
parameter named Sort in this data source, so let's call this one Sort Bullet. We're going to want to keep
Float for the Data type. And then let's create a list. We are going to have four options. Let's enter value 1,
display as Sales, asc. Value 2, display as Sales, desc. Value 3, display as Target, asc. Value 4, display as
Target, desc. Click OK, now right-click on the parameter and show the control. Now we need to use this
parameter in a calculation. Navigate to Analysis and select Create Calculated Field. Let's call this field
Sort bullet. We want to write a CASE statement on our Sort bullet parameter. We can tell we referenced the parameter because it
is purple. And now we need to specify what we want to happen when the parameter is changed. When the
value is 1, we want to sum Sales. When the value is 2, we want -sum Sales to sort in descending order.
When the value is 3, we want to use our Target field. And when the value is 4, we want to use minus
our Target field to sort in descending order. Notice we had to enter an aggregation for Sales because we
already aggregated the values inside of the Target calc. Click OK, and now we need to use our new calc.
Right-click on Sub-Category, select Sort, change the option to Field, and select our new Sort bullet calc.
You don't need to change the sort order, we handled that already within our calculation. Close the dialog.
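The sort calculation just described can be sketched in Python like this. It is only an illustration of the negation trick, not the Tableau calculation itself. The sort_key and sorted_names helpers and the sample rows are hypothetical.

```python
def sort_key(option, sales, target):
    """Mirror the CASE logic: 1 and 2 sort by sales, 3 and 4 by target; 2 and 4 negate."""
    if option == 1:
        return sales
    if option == 2:
        return -sales
    if option == 3:
        return target
    if option == 4:
        return -target
    raise ValueError(f"unknown sort option: {option}")

# Hypothetical rows: (sub-category, sales, target)
rows = [
    ("Sub-category A", 50.0, 90.0),
    ("Sub-category B", 80.0, 60.0),
    ("Sub-category C", 20.0, 70.0),
]

def sorted_names(option):
    """Always sort ascending; the sign of the key decides the visible direction."""
    ordered = sorted(rows, key=lambda r: sort_key(option, r[1], r[2]))
    return [name for name, _, _ in ordered]
```

Because the key is negated for the descending options, the sort itself never has to change direction, which is why the sort order in the dialog can be left alone.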
Now if I change to sort by Sales, we are sorting by the gray bars in either ascending or descending order.
And if we click on the Target values in our parameter, we are sorting by the orange Gantt bar in either
ascending or descending order. Finally, it would be great to communicate our sort to our users. Navigate
to Worksheet, select Show Caption. Now let's inject our Sort parameter into the caption. Let's update the
value of the caption to the words Sort by and then click Insert and select our sort parameter. Now the text
will dynamically update in both our title and caption when we change the values of our parameters. And
in the end, I think this view is clearer, cleaner, more dynamic, and adds a lot of context for the reader.
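Looking back at the Target calculation and the reference distribution from earlier in this clip, here is a minimal Python sketch of the arithmetic. The spoken formula is ambiguous about where the extra .25 applies, so this assumes it is added to the target percentage before multiplying by sales. The function names and sample values are hypothetical.

```python
def target_for(category, sales, target_pct, technology_uplift=0.25):
    """Target for one row: sales times the target percentage, with an uplift for Technology."""
    if category != "Technology":
        return sales * target_pct
    # Assumption: the uplift is added to the percentage before multiplying.
    return sales * (target_pct + technology_uplift)

def reference_bands(target, fractions=(0.5, 0.75)):
    """Values of the shaded reference distribution as fractions of the target."""
    return [target * fraction for fraction in fractions]

# Hypothetical rows using the 80% parameter value from the clip.
furniture_target = target_for("Furniture", 1000.0, 0.8)
technology_target = target_for("Technology", 1000.0, 0.8)
furniture_bands = reference_bands(furniture_target)
```

The target value is what the reference line or the Gantt bar marks, and the bands are where the 50% and 75% fills end on each row.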
Context Bars
Before we can build context bars, let's take a second to understand what they are. Context bars are bar-
over-bar plots that provide context of a target or a goal. These are usually seen in KPI, or key
performance indicator, dashboards. They are a non-standard chart in Tableau and an alternative to a
bullet chart. Here is another example from two-time Tableau Zen Master, Pooja Gandhi. In this example,
she visualized some of my Twitter data. This is just a portion of her total vis. You can check it out on her
Tableau Public profile. She created a place to enter a Twitter handle and it would filter my tweets to see
how many times I mentioned that person. In this case, I entered Pooja's handle, drexelpooja, in the
parameter. Now, the light blue bars you see behind the four bars are the context bars. When the data was
not filtered, the color bars took up that space. When the data was filtered, the bars are left in the
background to provide context of how many total tweets mentioned Pooja. Clearly it was a lot. An
additional benefit is that it keeps the axis locked at the original level. If those bars weren't there, Tableau
would have resized the axes to fit the data and comparisons between the different handles would've been
hard to make. This type of context is a really nice upgrade to any data visualization you build. We can use
this in a lot of ways and are only limited by our creativity. We are going to build something completely
different, but it's all the same idea of using a dual axis to unlock context. Let's take a second to understand
the pros and cons of context bars, starting with the pros. Context bars unlock context, as the name
suggests. They provide additional insight that wouldn't otherwise be available using a single axis. These
upgrades are subtle, but powerful for our users and they often help them understand the story better. And
now for the cons. It takes some extra effort to add experiences like this in our dashboards. But in the end,
they are worth it. The best compliment I ever received for my work was along the lines of I love that
visualization because it doesn't look like Tableau. Using your creativity in design and formatting to
improve someone's experience will not be a wasted effort. Now, let's jump into Tableau and learn how to
build context bars. Here we are back in Tableau, getting ready to build our context bars. We are once
again using our Superstore dataset. On this sheet, I want to create a view that provides a KPI snapshot for
our executives. KPI is a key performance indicator. This view will be comprised of a bar-over-bar, and it
will show the percentage of our actual 2018 sales versus our target. Let's get started. Let's drag Order
Date to the Filters shelf. Click on the discrete Year, click Next, and select 2018. Now let's click OK. We
just want to filter for one year of data in our view. Let's double-click on Category to place the field on
Rows. Now, we want to calculate the progress towards our goals for each category. Navigate to Analysis
and select Create Calculated Field. Let's call this field KPI Progress. The formula will be sum of Sales
divided by Target. Remember from the bullet chart clip, when we created the target calc, that field
contains aggregations inside of it. Click OK and drag this new field to Columns. Notice what we have
done here is normalized our progress by using a percentage across our categories. This will allow us to
show our progress on the same scale. Let's right-click on KPI Progress on the Columns shelf and click
Format. Change the number to be formatted as a percentage with no decimals. You can leave your
precision with a decimal if you need it, but ask yourself this first. What would an executive do differently
if sales were 74.5% of the goal versus 75% of the goal? The answer is nothing. Let's make a copy of the
Category pill from Rows and drag it to Color on the marks card. Navigate to the Measures pane. We can
drag Sales and Target to detail on the marks card so they can be used in the tooltips. Now, this represents
our current progress towards our goals. Let's add some context here. Double-click on the Columns shelf
and type min and put 1 in parentheses, then hit Return. Notice Tableau drew a bar to the number 1. We
created a point at number 1 on every row. We can now use this point to provide context. Let's click on the
min 1 marks card that appeared and click the icon next to Category. Change it from Color to Detail. Let's
click on Color and lighten the shade of gray. We want to have something that's very close to our
background color. And then let's turn off the borders. What we are going to do here is plot our KPI
progress on top of this bar, which would represent 100% of our goal. Let's right-click on the min 1 pill
and select Dual Axis. Notice Tableau thinks we want to draw a dot plot again. Click on the All marks
card and change our marks back to Bars. Now right-click on the top axis and select Synchronize Axis. At
this point, our gray bars are on top of our KPI bars. Right-click on the axis and select Move marks to
back. Now let's right-click on the axis again, select Edit Axis, click on Fixed axis, and change the range to
be 0 to 1. Again, 1 represents 100% of our goal. Tableau automatically adds some padding on the
continuous axis, and we just fixed it so that we will always see 0% to 100% on the axis. Right-click on the top
axis and select Show Header to hide. Right-click on the bottom axis, select Edit Axis, clear the title, and
close the box. Here we can drag our axis down to reduce the space that was taken up by our title. Now
let's do some quick cleanup. Navigate to Format, Lines, and at the Sheet level, turn off zero lines and turn
on axis rulers. At the Columns level, turn off the grid lines. Click on Borders. And at the Sheet level, turn
off column dividers. Now, you could stop here. The gray bar represents 100% of our goal, and the colored
KPI progress bars are on top. The remaining gray that is visible provides context to how much is left to
get to our 100% goal. Now, if you want to get a little fancy, we can continue on. Let's drag the size of our
chart down a little bit, which will make our bars bigger. Click on the All marks card and dial the size
down a little bit. On the KPI marks card, duplicate Category and drop it on Label. Click on Label on the
marks card. Click on Alignment. Under Horizontal, choose left, and under Vertical, choose top. Click on
Font, select Match Mark Color, and then Bold. Click on the three dots and edit the label. After the word
Category, hit Return four times. Click on Allow labels to overlap other marks and close the label dialog.
Now let's talk about what we just did. By turning on the option to allow labels to overlap other marks,
Tableau has to display our text. Tableau wants to draw our text on the bar, but we added some spaces
below our Category title to move it above our bar, but still visible within our pane. This only works
because we dialed the size down on our bar. Now let's repeat this on the min 1 bar. Click on the min 1
marks card. Click the icon next to Target and change it to Label. Make a copy of the KPI pill on Columns
and move it to Label on the min 1 marks card. Click on Label on the marks card and then click on
Alignment. Under Horizontal, choose right, and under the Vertical, choose bottom. Click on the three
dots to edit the labels. Let's update our text to use our KPI Progress field, then the words "of Target Sales:", and then add
our Target field. Now, before our text, hit Return three times to move our text down. Let's highlight all of
our text and change the color to a lighter gray. And then let's update our font size to size 7 and hit Return
for the change to take effect. Click on Allow labels to overlap other marks and close the dialog. Now we
can see our label below the bars. Now we need to change the formatting of our dollar values. Let's right-
click on target and select Format. Change the default formatting to Currency with no decimals. Right-
click on KPI progress and select Format. Change the default formatting to Percentage with no decimals.
We can even drag our axis to the left to make our view much smaller. This could take up a very small
space on a dashboard. Now hide everything. Right-click on Category on Rows and select Show Header to
hide the dimension. Right-click on KPI progress on Columns and select Show Header to hide the axis.
Navigate to Format, Lines, and at the Sheet level, turn off the axis rulers. Finally, rename our title
Category Sales KPI, Progress to Goal. I am going to reduce the text size and bold part of our title.
Changing the font weights in our title is a really nice formatting tip. Now this is a non-standard chart that
will make your executives happy and have your coworkers asking, how did you build that in Tableau?
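To make the arithmetic behind the context bars concrete, here is a minimal Python sketch of the KPI Progress idea (sum of Sales divided by Target) and of how the gray min(1) bar leaves the remainder visible. The sample rows, the target values, and the function names are hypothetical stand-ins, not the Superstore data or the Target calc from the bullet chart clip.

```python
def kpi_progress(sales_rows, targets):
    """Return {category: SUM(sales) / target}, mirroring the KPI Progress calc."""
    totals = {}
    for category, amount in sales_rows:
        totals[category] = totals.get(category, 0.0) + amount
    return {category: totals[category] / targets[category] for category in totals}


def context_bar(progress):
    """Split a progress ratio into (colored part, gray remainder) on a fixed 0-1 axis."""
    shown = min(max(progress, 0.0), 1.0)  # a fixed 0-1 axis cannot show more than 100%
    return shown, 1.0 - shown


# Hypothetical sample data: (category, sales amount) rows and one target per category.
sales_rows = [("Furniture", 300.0), ("Furniture", 450.0), ("Technology", 500.0)]
targets = {"Furniture": 1000.0, "Technology": 400.0}

progress = kpi_progress(sales_rows, targets)
for category, value in progress.items():
    colored, gray = context_bar(value)
    print(f"{category}: {value:.0%} of target, gray remaining: {gray:.0%}")
```

Because every category is divided by its own target, all rows share the same 0 to 1 scale, which is what lets a single min(1) bar act as the 100% backdrop for each of them.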
Summary
As we wrap up this module, you should now have a solid foundation for using some of Tableau's
statistical and non-standard charts. Specifically, how to read, build, and format scatter plots, connected
scatter plots, box and whisker plots, bullet charts, and context bars. We discussed statistical concepts like
correlation and summarizing distributions. We talked about ways to unlock context for users through non-
standard charts using dual axes, animation, and design creativity. We even dove into the world of
parametrized calculations to provide better experiences for our users. We continued exploring the how
and the why of the charts we built, and of course, we continued our pursuit of good chart formatting. I
encourage you to download the Tableau workbook from the exercise files and rebuild these charts, as
practice makes perfect. I hope you'll join me in the next module where we'll round out your viz toolkit
with advanced charts.
Introduction
Hi, this is Adam Crahen. In this module, we're looking to round out your viz toolkit with some advanced
charts. We'll be covering how to build chart types such as the Gantt chart, the barcode chart (otherwise known
as a strip plot), the slope chart, the Pareto chart, and the donut chart. We will be using concepts we've already
learned, like dual axis techniques and some table calculations. As a reminder, for each chart type, we will
learn what they're used for and how to read them. We'll take a look at some real-life examples, and then
we will walk through how to build that chart in Tableau. If you have multiple monitors or devices
available, this is, again, your cue to restart your two-screen experience so you can follow along with the
demo to become more familiar with building these charts.
Gantt Bars
Before we can build a Gantt chart, let's take a second to understand what it is. A Gantt chart is a type of
bar chart that illustrates the duration of events. Horizontal bars are sized by duration and shown over a
continuous axis. These charts are typically used in project management to show tasks and dependencies,
but they can also be used to show the duration of any event. Optionally, Gantt bars can be used to create a
chart called a strip plot or a barcode plot, which is used to show distributions similar to a dot plot. Here's
an example of a Gantt chart I built where the data has been anonymized. This particular view shows an
analysis of jobs running on a server over a 24-hour time period. Each row represents a specific job. Some
of the jobs are recurring and some of the jobs are daily. And each horizontal bar represents an occurrence
of one of those jobs running. The white portions of the bar indicate how long a job sat in queue. The
colored portions of the bar indicate how long the job actually took to run. If that part is blue, it was
successful, and if it was red, the job failed. This view is packed with information. I've muted the left of
this chart so we can focus on the three long-running jobs. Notice how the long-running times impact the
queue of other jobs. We can see that all the other jobs running around that time were sitting in queue. And
we can see that by the wave of white bars in front of each of the jobs. We could expect some of these
recurring jobs to be lost due to the queue time. If we focus on the job at the top, we can see it is failing
over and over and over again, as indicated by all the red in a single row. We can also see how it sat in
queue for extended times in the early morning hours. If we are to describe the performance of this job as
it performed in the last 24 hours, we could see that in the early afternoon, this job started to fail and then
it ran long, failed some more, ran long, started to run successfully, sat in queue for a long time, we lost
some jobs, and then it started running successfully at its normal cadence. This is a classic use case for a
Gantt chart. Look at all the insight we're able to unlock just from visualizing a process. Here's an example
of the strip plot, or the barcode chart, I mentioned earlier. This view also uses Gantt bars, but it does not
size them to analyze process duration. This chart would be used to look at an entire distribution. Here we
are looking at a visualization I created which analyzes the usage of profanity and deaths in Quentin
Tarantino films. The x axis represents minutes into a film. Each row is a grouping of profane words or
deaths. The white Gantt bars are an occurrence of profanity and the red ones indicate a death. I blurred
out the word groupings for this course. I created the groupings of words so I could show a summary view,
and also so I could turn off the profanity to publish a view like this. The main view does not display the
text. You can, however, click on a radio button to turn the profanity on. And since you're probably hoping
I do that, here you go. This is the view with profanity on, but I still blurred out those words. It is on my
Tableau public profile if you're interested in exploring more. Let's take a second to understand the pros
and cons of Gantt charts, starting with the pros. Gantt charts are great for analyzing durations or
managing projects with dependencies. They're data-rich views that can provide a lot of insight, as we
discussed with our server example. A lot of data can be plotted in a Gantt chart, so they also scale pretty
well. And now the cons. The marks can get pretty small if looking at a lot of data over time. While the
view can handle it, it may be harder to extract insight without zooming or filtering. I would also consider
your audience's data literacy as usual. Most people have not heard of this type of chart, and I would
expect to have to provide explanations or demo functionality for your users. Now let's jump into Tableau
and learn how to build a Gantt chart. Here we are back in Tableau getting ready to build our Gantt chart.
For this example, let's start with opening the Gantt Bars tab from the Starter Workbook in our exercise
files. Notice this workbook has been preconfigured with Sub-Category and Year of Order Date on the
Filters shelf. These filters are visible on the sheet and are configured to be single-value lists with the All
option turned off. Additionally, the title has already been created for you and is injecting the selected sub-
category and year values into your title as you change the filters. We are going to create a Gantt chart to
analyze the time it takes for us to ship products by each manufacturer within a sub-category. To get
started, let's navigate to the marks card and change our mark type to Gantt Bars. Let's Option+drag Order
Date to Columns and select the continuous value for Day. Let's drag Manufacturer to Rows and Ship
Mode to Rows. Now drag Order Date to Detail on the marks card, drag Sub-Category to Detail on the
marks card, Option+drag Ship Date to Detail on the marks card, and select the continuous Day variation,
and drag Ship Mode to Color on the marks card. Now at this point, what we're looking at is every day that
an order was placed by manufacturer and ship mode. To make this more insightful, we need to calculate
the time between order date and ship date. Navigate to Analysis and select Create Calculated Field. Let's
call this new field Days to Ship. We are going to use the DATEDIFF function to calculate the number of
days between the order date and the ship date. Option+drag this new field to Size on the Marks card.
Select the AVERAGE aggregation. And this step is very important. We may have multiple records per
order ID because the finest grain of our data is product name. We need to choose Average so that we are
getting the correct aggregation across all the rows for our dimensions. Now our bars are sized by the
average amount of time it took to ship the orders. Let's click on Fellowes to highlight just that data in our
view. Here we can see an instance where the Same Day orders did not ship on the same day even though
our customers paid for that service level. Here we can see a few instances where Second Class orders
shipped faster than First Class. This view is packed with insights like these. Let's click on a few more
sub-categories in our view to see what the shipping times look like there. Finally, let's right-click on our
axis, select Edit Axis, clear the title, and click OK. We don't need that since it's obvious we are looking at
data over time because of our axis and because we have a good title. Now, let's create a new worksheet.
Let's name this one Barcode. We are going to create a barcode chart or a strip plot of all our orders. This
is a good way to look at the distributions and frequency in our data. Let's get started. Drag Sub-Category
onto Rows. Let's Option+drag Order Date to Columns, and select the Continuous variation of our Order
Date. Now this data includes a timestamp for each order, so this means that by selecting the variation that
we did, we have selected a pretty fine grain from our dataset. Tableau has automatically chosen Gantt
Bars for our mark type based on the data in the view. Let's look for some additional insights. Option+drag
Days to Ship on Color, select the Average aggregation again, and now we can see that the order dates
with lighter marks shipped faster on average. Let's do some quick cleanup. Let's hide our field labels for
Rows by right-clicking on the text and selecting Hide. Let's right-click on our axis, select Edit Axis, clear
the title, and click OK. Let's give this sheet a good title like Average Days to Ship by Sub-Category. And
now we have built two distinctly different views using Gantt bars.
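The Days to Ship logic can be sketched outside Tableau as well. This is a minimal Python illustration, assuming a day-level date difference like the DATEDIFF step described above, and assuming marks are grouped by manufacturer, ship mode, and order date. The rows and the manufacturer name are hypothetical. It also shows why Average is the safer aggregation when one order spans several product rows.

```python
from collections import defaultdict
from datetime import date


def days_to_ship(order_date, ship_date):
    """Whole days between order and ship date, like a day-level date difference."""
    return (ship_date - order_date).days


def average_days_to_ship(rows):
    """Average days to ship per (manufacturer, ship mode, order date) mark."""
    grouped = defaultdict(list)
    for manufacturer, ship_mode, order_date, ship_date in rows:
        key = (manufacturer, ship_mode, order_date)
        grouped[key].append(days_to_ship(order_date, ship_date))
    return {key: sum(values) / len(values) for key, values in grouped.items()}


# Hypothetical rows: the first two are two products from the same order.
rows = [
    ("Acme", "Same Day", date(2018, 3, 1), date(2018, 3, 2)),
    ("Acme", "Same Day", date(2018, 3, 1), date(2018, 3, 2)),
    ("Acme", "First Class", date(2018, 3, 5), date(2018, 3, 8)),
]

result = average_days_to_ship(rows)
for (manufacturer, ship_mode, order_date), avg in result.items():
    print(manufacturer, ship_mode, order_date.isoformat(), avg)
```

For the Same Day order, summing the two product rows would report 2 days, while the average correctly reports 1 day. That is the double counting the Average aggregation avoids.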
Slope Chart
Before we can build a slope chart, let's take a second to understand what it is. Slope charts are an
extension of a line chart and are a good way to display data over time as long as the data can be simplified
without missing a significant part of the story. They use lines for each member of a dimension and
connect two different values to measure change over time. If you remember back to algebra, slope is
defined as the ratio of vertical change to horizontal change between two points. These charts are also sometimes known as slope
graphs. This is an example of a slope chart I created. The data came from The Guardian where they were
predicting the finishing place of all 20 teams in the Premier League. The Premier League is a professional
soccer, or football, league. Each line in this view is one of the teams. Their place on the left side of this
chart was The Guardian's predicted finishing place in the league's standings. And their place on the right
side of our chart was their actual finishing place. This chart gets its name because the slope of the line
shows whether something increased, decreased, or stayed the same. I used color here to help tell the
story. I wanted to focus on the fact that they only got two right. The color of the title and the color of the
two horizontal bars showed the teams that were predicted correctly. If the line is blue and sloping
upwards, the team finished higher than predicted. If the line is sloping downwards and colored red, the
team finished lower than predicted. Now there is more to this story. If we just look at the top six, for
example, they really only missed on Chelsea, which caused the next three teams to be missed as well. So
while they literally only got two predictions correct, they predicted the relative order of finishing place
more accurately than I am giving them credit for. Let's take a second to understand the pros and cons of a
slope chart, starting with the pros. Slope charts are an extension of a line chart that use a simplified
approach to show change over time between two points. And now the cons. Slope charts are not very
effective when dealing with more than two points. One would consider a normal line chart or a sparkline
for data with more points in time. The variables plotted on either axis need to be on the same scale for the
slope chart to be meaningful. This is also not a parallel coordinates chart, so make sure that the axes are
the same. Now this is not a very common chart type to see for most people. However, they can be
incredibly useful. But I would anticipate having to spend extra time explaining your choices for using this
chart type to your users. Now let's jump into Tableau and let's build a slope chart. Here we are back in
Tableau getting ready to build our slope chart. As we discussed, slope charts are an extension of a line
chart, so let's prove that. Let's drag Sub-Category to Detail on the marks card. Let's drag Sales to Rows.
Let's Option+drag Order Date on the Columns shelf and select the discrete value for Year, and hit OK.
Here we have a line chart. Make a copy of the pill on Columns and drop it on the Filters shelf. Again,
select the discrete value for Year, uncheck 2015 and 2016 so we just have two years of data in our view,
and hit OK. We can drag our headers to be a little wider here, and here we have a slope chart. Let's right-
click on Sub-Category from the marks card and select Show Highlighter. Now on the right side of our
screen, we can hover over our dimension members and see the slope highlighted in our view. So this
demonstrated that it is an extension of a line chart. It is essentially just two points on a line and you can
call it a slope chart. However, let's clear our sheet and create a more complex view that looks at sales
ranking by sub-category. Again, let's start by dragging Sub-Category to Detail on the marks card. Now in
order to compare sales values across years, we need to create some calculations. Navigate to Analysis and
select Create Calculated Field. Let's name this calc 2017 Sales. The formula will be IF YEAR of Order
Date equals 2017 then Sales. This will return the values from sales, but only when the order was in the
year 2017. Now click OK. Let's duplicate that calculation. Right-click on the pill and select Edit
Calculation. Now we want to create the 2018 version. Rename this to 2018 Sales, and update the Year
value in the calc to 2018. Now drag 2017 Sales onto Rows. Let's drag the 2018 Sales onto the axis of our
view. Notice by dropping another measure on the axis, we have created a new card on our canvas called
Measure Values. Tableau has also added a dimension to the Columns shelf called Measure Names, and
put the same pill on the Filters shelf to filter out all of the other measures in our Measure pane out of our
view. Let's change the mark type to Line, and now we are seeing the start of our slope chart. But what we
have in our screen is visually the same as what we built at the beginning of this clip. But now we are
going to enhance this view. From the Measure Values pane, right-click on the 2017 Sales pill and select
Quick Table Calculation and then select Rank. Click on the pill again, hover over Compute Using, and
select Sub-Category. What we are doing here is calculating the rank of sales for each sub-category in the
year 2017. Now we need to repeat this for 2018. From the Measure Values pane, right-click on 2018
Sales, and select Quick Table Calculation and select Rank. Click on the pill again and hover over
Compute Using, and select Sub-Category. Now what we are doing here is calculating the rank of sales
for each sub-category in the year 2018. Now we have a slope chart showing the change in sales ranks
across the two years. Notice, our number one ranking is shown at the bottom of the screen. When making
a slope graph or a slope chart, it is more intuitive to have rank number one at the top of the screen. So
right-click on the axis, select Edit Axis, and check the box for Reversed. Close the dialog. Now we have
reversed the order of our axis and our number one rank is at the top of our view. Let's drag our view to be
a bit wider. Now we want to create better headers. Tableau is trying to describe our table calculations in
the Header pane. Right-click directly on the header and select Edit Alias. Let's change this first one to
2017 Rank. Right-click on the other header and edit the alias again and change it to 2018 Rank. Now we
want to label our view better. Make a copy of the 2017 Rank from Measure Values and drop it on Label
of our marks card. Notice, Tableau is trying to label every point including the 2018 ones. Let's click on
Label on the marks card. We just want to label the start of our line with our 2017 Rank. Click on
Alignment, select left under Horizontal, and middle under Vertical. Now let's click on Line Ends under
marks to Label. Notice, new options appear at the bottom with checkboxes. Uncheck Label end of line,
which will remove the labels from the 2018 side of our line. Now click on Font and change the size to 7
pixels; hit Return to apply that change. Now let's label the right side of our chart. To do that, we need
another marks card. Make a copy of the Measure Values pill on Rows and drop it back on Rows. We now
have a new marks card to work with. The bottom marks card corresponds to the pill on the right of our
Rows shelf. Let's expand that marks card. Now let's duplicate our 2018 Rank pill from Measure Values
and drop it on top of the 2017 pill, which is on Label on the marks card. We now want the 2018 value for
the right side of our line. Also, click the icon next to Sub-Category on the marks card and change it from
Detail to Label as well. Notice all the labels are appearing on the left. That's because of how we
configured the last marks card, so let's click on Label, click on Alignment, and now select right under
Horizontal. Now uncheck start of line, and check the box to Label the end of the line. Now we just need
to fix the text of our labels. Click on the three dots to edit the label, and let's change it so it is Rank, bar,
Sub-Category, and you can close this dialog. Now we just need to plot these two charts on top of each
other. Right-click on the bottom axis and select Dual Axis. Right-click on the right axis and select
Synchronize Axis. The rest of this chart is formatting. Right-click on the left axis and select Show
Header to hide it. Notice that this hid both the left and right axes. Now let's navigate to Format, Borders, and at the
Sheet level, turn off the row dividers and the column dividers. Click on Lines, and at the Sheet level, turn
off zero lines, which is at the top of our screen because we reversed our axis, and then click on Rows and
turn off the grid lines. Slope charts don't need any of that chart junk if formatted and labeled properly.
Here's another trick. It would be nice if our measure headers were at the top of our view since we
typically read a chart top-down. Navigate to Analysis, Table Layout, Advanced, and uncheck the box for
showing the innermost level at the bottom of a view when there is a vertical axis. Now click OK. Notice,
our Measure Value name headers are now at the top of the screen. Now the last thing that would be nice
is to use color as an indicator of change in rank. Double-click on the 2017 pill in the Measure Values
pane, and copy the text. Navigate to Analysis, Create Calculated Field, and let's just call this one Color.
Now we are going to use the formula that we copied and paste parts of this calculation in. We want our
formula to read if the 2017 ranking of sales is greater than the 2018 ranking of sales, then return the
number 1. Then we're going to write the word else, copy our first two lines of our formula, paste it after
the word else, and update our greater than sign to a less than sign, and change the value to return to be
number 2, and then we'll write else 3. Expand the All marks card, drag our new calc onto Color. Right-
click on the pill and hover over Compute Using, and select Sub-Category. Now our field is being
computed the same way as our other table calculations, but it is using a continuous scale. Let's right-click
on that pill once more and change it to Discrete. Now we can edit the colors individually. One indicates
where sales rank increased year over year, so let's choose the color blue. Two indicates where sales rank
decreased year over year, so let's choose red. And 3 indicates no change in rank, so let's use this brown-
gray color down here and click OK. Now let's update our worksheet title to Sub-Category Sales Ranking.
And here we have our slope chart comparing the sales ranking for sub-category over two years. Now this
was a little bit of work to build a simplistic view, but the effort was well worth the time. And it isn't easy
to build effective communications, and we used a number of tricks to build this non-standard chart.
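The ranking and color logic from this clip can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sales figures are hypothetical, the rank is computed so the highest sales gets rank 1 with ties sharing a rank, and the 1, 2, 3 codes mirror the Color calc described above.

```python
def rank_descending(sales_by_member):
    """Rank members so the highest sales value gets rank 1 (ties share a rank)."""
    ordered = sorted(sales_by_member.values(), reverse=True)
    return {member: ordered.index(value) + 1 for member, value in sales_by_member.items()}


def color_code(rank_2017, rank_2018):
    """1 = moved up the ranking, 2 = moved down, 3 = no change."""
    if rank_2017 > rank_2018:
        return 1  # a smaller rank number in 2018 means a better position
    elif rank_2017 < rank_2018:
        return 2
    return 3


# Hypothetical sales by sub-category for the two years being compared.
sales_2017 = {"Chairs": 900.0, "Phones": 800.0, "Tables": 500.0}
sales_2018 = {"Chairs": 850.0, "Phones": 1000.0, "Tables": 400.0}

ranks_2017 = rank_descending(sales_2017)
ranks_2018 = rank_descending(sales_2018)
colors = {name: color_code(ranks_2017[name], ranks_2018[name]) for name in sales_2017}

for name in sales_2017:
    print(f"{ranks_2017[name]} -> {ranks_2018[name]} | {name} (color {colors[name]})")
```

Note that a rank number going down, for example from 2 to 1, is an improvement. That is why the first branch compares the 2017 rank as greater than the 2018 rank, and why the axis is reversed so rank 1 sits at the top.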
Pareto Chart
Before we build a Pareto chart, let's take a second to understand what it is. This chart is used to visualize
the Pareto Principle, or what is more commonly known as the 80/20 Rule. The Pareto Principle states that
80% of effects come from 20% of the causes. The visualization is a dual axis chart comprised of a line
and another mark type that can vary between area, bars, or a dot. Pareto charts are essentially a variant of
a histogram that is sorted by percentage of some total. Here's a Pareto chart created by my friend Curtis
Harris. This particular chart is looking at the words used in the top 100 songs of all time. Curtis stripped
out the top 100 most common words like I, we, you, etc., and then visualized what percentage of the rest
of the words made up the total lyrics. This was the result. So imagine that each individual word has its
own column on the x axis. We would calculate the percent of total for that first word and then take a
running sum of those percents of totals, moving to the right for each word. This running percent of total is
what gives the visualization the smooth curve. We would draw a reference line at 80% on the y axis,
which in this case is 80% of the total lyrics used in the top 100 songs. And where our moving
calculation intersects the 80% line is the percent of words that make up the 80% of all lyrics. In this
example, Curtis drew lines at 80% of lyrics and 20% of words to highlight the common 80/20 Rule. This
example is available on Curtis's Tableau public profile, so definitely check it out and even download it to
reconstruct how he built it. Let's take a second to understand the pros and cons of Pareto charts, starting
with the pros. Pareto charts visualize the Pareto Principle, or the 80/20 Rule. And now the cons. These
charts are not too simple to build in Tableau. There isn't a ton of actionable insight that can come from
these charts, except maybe confirmation of the need to focus on your top products as a business strategy.
It is hard to explain this chart to users without oversimplifying it and explaining it as the 80/20 Rule.
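If it helps to see the math without the 80/20 shorthand, here is a minimal Python sketch of the two running calculations behind a Pareto chart: the running percent of items on one axis and the running percent of total on the other. The product names and sales values are hypothetical, and the function names are only for illustration.

```python
def pareto_curve(sales_by_product):
    """Return (running % of products, running % of total sales) points, largest seller first."""
    ordered = sorted(sales_by_product.values(), reverse=True)
    total = sum(ordered)
    count = len(ordered)
    points = []
    running = 0.0
    for position, amount in enumerate(ordered, start=1):
        running += amount
        points.append((position / count, running / total))
    return points


def share_of_products_for(points, threshold=0.8):
    """Smallest share of products whose running share of sales reaches the threshold."""
    for pct_products, pct_sales in points:
        if pct_sales >= threshold:
            return pct_products
    return 1.0


# Hypothetical sales by product.
sales = {"A": 600.0, "B": 250.0, "C": 80.0, "D": 40.0, "E": 30.0}

points = pareto_curve(sales)
share = share_of_products_for(points, threshold=0.8)
print(f"{share:.0%} of products make up at least 80% of sales")
```

In this made-up data, 2 of the 5 products reach the 80% line, so the intersection lands at 40% of products rather than a neat 20%. The 80/20 Rule is a tendency, not a guarantee.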
These charts are comprised of several moving calculations. Now let's jump into Tableau and learn how to
build a Pareto chart. Okay, here we are back in Tableau, getting ready to build our Pareto chart. This is
not going to be an easy one. We need to visualize what percent of our products make up 80% of our sales,
and also we need to be able to filter this data. Let's start by setting our view to Entire View. Let's work on
our filter. Drag Sub-Category onto the Filters shelf, select All, and click OK. Right-click on the pill and
select Show Filter. Notice, Tableau has added our filter to the right side of the screen. Let's click on the
drop-down of the filter and change this to a Single Value List. Next, Option+drag Sub-Category to the
marks card, and select the Attr variation. This is an attribute. This field is going to be used in our title
later. We are selecting the attribute variation because it only returns a single value and does not affect the
table calculations we are going to be writing here shortly. Now drag Product Name to the marks card. We
will need to sort our Product Name in descending order of sales. Right-click on the pill and select Sort.
Choose the Sort By Field, select Descending order under Sort By, pick Sales from the drop-down, and
make sure Sum is our Aggregation. Now let's create our columns for this chart. Option+drag Product
Name to Columns, select the CNTD, or Count Distinct, aggregation. We want one column for each of the
product names on our marks card. Now right-click on the pill on the Columns shelf and hover over
Quick Table Calculation, and choose Running Total. Now right-click on the pill again and select Edit
Table Calculation. Check the box to Add a secondary calculation. Choose Percent of Total for the second
calculation, and now click on Specific Dimensions, and check the box for Product Name on both
calculations. This is going to create columns for each of our products that represent a running percent of
total. You can close the Table Calc dialog. Let's right-click on our axis, select Edit Axis, and rename the
title to Percent of Products, and then close the dialog. We have a few more of these table calcs, so don't
get too excited. Now let's work on the Rows shelf. Drag Sales to the Rows shelf. Notice how we can now
see our shape of descending sales by product name due to the sorting we applied earlier. Let's change our
mark type to Area. Click on Color on the marks card, and let's change the color of our area to the second
last option under the white hues. We aren't done with the Rows just yet. Duplicate Sales on Rows and
drop it back on Rows. Right-click on the second pill, hover over Add Quick Table Calculation, and select
Running Total. Now right-click on the pill again and select Edit Table Calculation. Check the box to Add
a secondary calculation. Choose Percent of Total for the second calculation, now click on Specific
Dimensions, and check the box for Product Name on both calculations. Close the dialog. Now you can
start to see our Pareto curve taking shape. Change the mark type to a Line. Click on Size on the marks
card and dial it down just a little bit. Click on Color on the marks card and change to orange. Now right-
click on the right axis, select Edit Axis, and change the title to Percent of Total Sales. Right-click on that
axis again and select Dual Axis. Notice our colors changed. Tableau always wants to add measure names
to color. Drag Measure Names off of both of our marks cards and notice our colors have now returned to
normal. Now this is one of the rare times where we are not going to sync our dual axis. However, since
they are not the same, I feel it's important to let people know which axis matches which mark. Right-click
on the right axis and select Format. At the Axis level, change the Font Color to match the orange we
selected for our line. Let's do some quick cleanup. Navigate to Format, Lines, and at the Sheet level, turn
off the grid lines and the zero lines as they aren't contributing any value to this chart. Now we want to
give some reference as to what percentage makes up 80% of our sales. Right-click on the right axis and
select Add Reference Line. Select the Entire Table. Now we could set this up with a parameter and make
this dynamic, but for now, let's just hard code our line. Click on the drop-down for Aggregation and select
Constant. Change our value to .8 for 80%, turn off the Label, and now let's format the line. It is equally
important to know which axis this line will represent since they are not synchronized. Click on the drop-
down and let's make this a chunky dotted line and make the color orange to match our axis; click OK. We
don't need to label this since the line is drawn directly to 80% on our axis and the orange indicates that it
represents a percentage and not a dollar amount. Okay, the next part is hard. We need to be able to filter
our chart. We want to calculate the percent of products that intersects with our 80% of sales line. We will
also use this value in our title. Navigate to Analysis and select Create Calculated Field. Let's name this
field Pareto Title. I'm going to copy and paste this formula from a text editor into our formula dialog and
then describe what it's doing. We are basically calculating the point at which our percent of products
intersects the 80% sales line. So first, we calculate the running percent of total for sales. When that is
greater than or equal to 80%, we will return the value of the pill on our Columns shelf, which is the
running percent of total on the distinct count of product names. Then we wrap the calculation in
WINDOW_MIN to extract just the first value that matches our greater than or equal 80% of sales criteria.
Let's expand the All marks card and let's drag this new field onto Detail. Right-click on the pill and hover
over Compute Using and select Product Name. Now right-click on our Percent of Products axis and select
Add Reference Line. Select Entire Table for Scope, click on the Value drop-down, and select our Pareto
Title calc. Choose Min for aggregation and clear the label. Let's give this the same type of chunky line,
but leave the color black. Uncheck the box at the bottom to recalculate the line when we highlight data,
and click OK. Notice our brand-new line intersects the curve right at 80% of sales. Finally, let's edit our
title to be dynamic to use the value we just calculated from our reference line, as well as the selected sub-
category. Let's insert the Pareto Title calc, and then write the words, of Products account for 80% of
Sales. Below that on a new line, we'll have Sub-Category:, and insert our attribute of Sub-Category. I'm
going to make the 80% of Sales text orange to match the axis in our view, and I'm also going to dial down
the size on the Sub-Category part. Notice, we need to fix the formatting on that percentage. On the All
marks card, right-click on the Pareto Title and select Format. At the Pane level, modify the default
formatting to Percentage with no decimals. Now this is a fully functional dynamic chart. Let's click
through the sub-categories on the filter to see if the percentage of products changes. Notice that as we do
this, all of our reference lines are moving and the intersection is always at the 80% of sales reference line,
our titles are updating dynamically, and here we have our Pareto chart. Thanks for powering through that
one. It is not an easy chart to build.
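For anyone who wants to check the logic outside of Tableau, here is a minimal Python sketch of what the Pareto Title calculation is described as doing: sort products by descending sales, compute the running percent of total sales and the running percent of products, and take the smallest percent of products at which sales reach 80%. This is not the formula pasted in the demo. The function names, the sample data, and the threshold parameter are all assumptions made for illustration.

```python
from itertools import accumulate


def pareto_points(sales_by_product):
    """Return (percent_of_products, percent_of_sales) pairs.

    Products are sorted by descending sales first, mirroring the sort
    applied to Product Name in the walkthrough.
    """
    values = sorted(sales_by_product.values(), reverse=True)
    total = sum(values)
    if total == 0:
        return []  # nothing to plot for empty or all-zero input
    count = len(values)
    running_sales = accumulate(values)  # running total of sales
    return [
        ((index + 1) / count, running / total)
        for index, running in enumerate(running_sales)
    ]


def pareto_title_value(sales_by_product, threshold=0.8):
    """Smallest percent of products whose running sales share reaches threshold.

    Taking min() over the matching points plays the role that WINDOW_MIN
    plays in the narrated calculation.
    """
    matches = [
        pct_products
        for pct_products, pct_sales in pareto_points(sales_by_product)
        if pct_sales >= threshold
    ]
    return min(matches) if matches else None


# Hypothetical sales figures, not taken from the course data set.
sample_sales = {"A": 50, "B": 35, "C": 8, "D": 4, "E": 3}
print(pareto_title_value(sample_sales))  # 0.4, i.e. 40% of products
```

With this sample, the running share of sales is 50%, 85%, 93%, 97%, and 100%, so the 80% line is first crossed at the second of five products, which gives 40% of products.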
Donut Chart
Before we can build a donut chart, let's take a second to understand what it is. Donut charts are used to
show a part-to-whole relationship. They are essentially a pie chart with the hole cut out of the middle and
resemble the shape of a donut. The hole makes space for more information about the data, like totals, and
helps us better communicate the part-to-whole relationship versus a pie chart. Here's an example of a
donut chart I created. My friend Jeremy has been tracking his biscuit consumption for the better part of a
year and he made the mistake of giving the data to me. He tracked all kinds of things, but one thing in
particular he tracked was which restaurant he was visiting. Usually, it was Chick-fil-A, but sometimes it
was Bojangles. Now I had a good time working with his data and I want you to focus on the donut chart
on the bottom right of this dashboard. And it's a little bit of a joke because I stuck a giant biscuit in the
middle. But besides that, the chart isn't bad. I'm showing the percentage of visits broken down by
restaurant. I only have two slices, and they're nicely labeled and the colors make sense. Notice I also used
a barcode chart there as well, which we talked about back in our Gantt bar clip. This is on my Tableau
Public profile if you're worried about Jeremy's consumption and if you want to explore it some more.
Here's another example of a donut chart, and this one is a little bit more serious. Here we are looking at
how much fruit we have, broken down by type. The nice part about a donut chart is that we can make
good use of the space in the center by plotting some number that is representative of the total. For
example, here we have 303,000 pieces of fruit. The number goes into the center and represents the hole.
The three slices are colored and labeled by type. The labels here are the percentages by type, but we now
have a sense of the scale because of the label in the center. The slices are also sorted here in descending
order. If we are looking at a clock, it would start at 12 and move around the clock clockwise. Let's take a
second to understand the pros and cons of donut charts, starting with the pros. Kids start to learn about pie
charts in elementary school. A donut is the same thing as a pie chart, so our users should be used to seeing
these charts. These charts show a part-to-whole relationship when used effectively. And now the cons. I
have to say that pie charts can be evil. I don't know what happens after elementary school, but these
charts get abused. Because this is a part-to-whole chart, it needs to add up to 100%. There should not be
too many slices. I recommend that you do not go over four slices. The slices need to be sorted by
size. Colors and labels should be used effectively. Don't make them 3D, that's even worse. All too often,
these rules get abused and you see it all the time, which is why many people in this profession do not use
pie charts anymore. However, in Tableau, there are some tricks used to make donut charts that can help
unlock creative uses of the dual axis in the future. Before we move on, please raise your hand and
repeat after me. I, state your name, will never make a bad pie chart. Now that you've taken the oath, let's
jump into Tableau and learn how to make a donut chart. Here we are back in Tableau, getting ready to
build our donut chart. We need to create a view that takes our total sales and breaks it down by segment.
First, let's set our view to Entire View and change our mark type to Pie. Let's drag Segment onto Color on
the marks card. Notice, Tableau is starting to draw a pie chart, but right now the slices are equally sized.
When we change our mark type to Pie, the Angle option appears on the marks card. This is how Tableau
knows how you want to size the marks. Drag Sales onto Angle and see how they change. It's good
practice to sort our slices. They appear to be in descending order now, but our data may change down the
line. Right-click on Segment, select Sort, Sort By a Field, select Descending order, and make sure our
Field is Sales with a Sum aggregation. Now let's label this pie nicely. Make a copy of the Segment pill on
the marks card and drop it on Label. Make a copy of Sales on the marks card and drop it on Label as well.
Let's display the percent of total. Right-click on the Sales pill, which is on Label, hover over Quick Table
Calculation, and select Percent of Total. Let's format those percentages to be whole numbers. Right-click
on the pill again and select Format. At the Pane level, change the default formatting to Percentage with no
decimals. Because sometimes slices can be small, I like to label pie and donut charts with the labels in a
single row. Because you took the pie chart oath, I know you won't have more than four slices, so labeling
shouldn't be too much of an issue. However, let's go ahead and fix that. Click on Label, change the labels
to be Segment | %. One thing I love to get rid of on pie charts is the legend. You have to look away
from the chart to decode a legend, so changing the font color on our labels to match the mark color will
solve for this. Here we have our pie chart and we are almost done. To create the cutout for our donut, we
need access to another marks card. Double-click in the Columns shelf and type MIN(1).
Notice, Tableau moved our pie onto an axis at the number 1. Make a copy of the pill on
Columns and drop it back on Columns. Now we have two pie charts side by side and access to an
additional marks card. Expand the bottom marks card. Right-click in the whitespace and select Clear
Shelf. Notice, we now have a gray circle on the right side. This is going to be our hole for the donut. Let's
drag Sales back onto Label, click on Label, and let's edit the text. We can directly type in the label editor.
Let's add Total Sales below the value. Let's change the font size on Sales to number 10 and make it bold.
And let's change the font color back to Automatic. Now here is the trick to make the donut. Click on the
top marks card. Dial size up to the second hash mark. Go back to the bottom marks card, dial size up to be
between the hash marks. Right-click on the bottom axis and select Dual Axis. Now right-click on that axis
again and select Synchronize Axis. Notice, this put our gray circle on top of our pie chart. Now click on
the bottom marks card, click on Color, and change the color to be white to match our background. The
rest of this chart is all formatting. Navigate to Format, Borders, and at the Sheet level, turn off row
dividers and column dividers. Click on Lines, and at the Sheet level, turn off zero lines. Click on Columns
and turn off grid lines. Now right-click on the bottom axis and uncheck Show Header to hide both axes.
Let's rename the title to Sales by Segment. The MIN(1) trick we used here can be used to make all kinds of
other charts on a dual axis. I only like showing this chart because I think it can be a gateway to using that
trick for other creative charts. Now remember your oath. You promised to never make a bad pie chart,
right?
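As a quick way to sanity-check the numbers behind the labels, here is a minimal Python sketch of the data side of this chart: sort the segments by descending sales, compute each segment's percent of total, and format it as a whole-number percentage. The label format (segment name, a vertical bar as separator, then the percentage) is one reading of the labeling step, and the four-slice limit comes from the recommendation earlier in this clip. The function name and the sample figures are assumptions for illustration and are not taken from the course data set.

```python
def donut_labels(sales_by_segment, max_slices=4):
    """Return labels like 'Name | 50%', sorted by descending sales.

    Raises ValueError if there are more slices than max_slices or if
    the total is zero, since a part-to-whole chart needs a whole.
    """
    if len(sales_by_segment) > max_slices:
        raise ValueError(f"too many slices: keep it to {max_slices} or fewer")
    total = sum(sales_by_segment.values())
    if total == 0:
        raise ValueError("total sales is zero, so there is no whole to divide")
    ordered = sorted(
        sales_by_segment.items(), key=lambda item: item[1], reverse=True
    )
    # The .0% format multiplies by 100 and drops the decimals.
    return [f"{name} | {value / total:.0%}" for name, value in ordered]


# Hypothetical sales by segment, listed out of order on purpose.
sample_segments = {"Corporate": 30, "Home Office": 20, "Consumer": 50}
for label in donut_labels(sample_segments):
    print(label)
```

Because each percentage is rounded separately, the labels on real data will not always add up to exactly 100%.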
Summary
As we wrap up this module, we have added a few more visualizations to your toolkit. Specifically, you
should now know how to read, build, and format Gantt charts, barcode charts, slope charts, Pareto charts,
and donut charts. We continued to use techniques we learned throughout this course, like dual axis
techniques and table calculations. I encourage you to download the Tableau workbook from the exercise
files and rebuild these charts as practice makes perfect. And now to wrap up our course, your viz toolkit
now includes how to read, build, and format over 35 different charts. We covered foundational concepts
like the difference between a dimension and a measure, discrete versus continuous, and date part versus date
value. We covered design best practices like basic formatting, use of language in our visualizations,
thinking about our user experience, and dual axis capabilities. We had a lot of exposure to calculations:
basic, table, and level of detail. We even had some exposure to statistical concepts like correlations and
distributions. And I hope with that, this course is helping you connect the many dots that make up
Tableau, and I want to thank you for joining me on this journey.
Course author
Adam Crahen
Adam Crahen is the Head of Data Visualization Engineering for Pluralsight, a 2018 Tableau Zen Master
and the co-founder of thedataduo.com. Adam is a creative problem solver specializing in data...