Introduction to Data Visualization with Tableau
Introduction
Before you create a data visualization, this write-up will help you understand what data visualization is, how to create an effective one, and which tools you need. Once you understand the concept, you will install and set up Tableau Public, a data visualization software tool, to get started.
Learning Objectives
Understand what data visualization is and why it matters
Learn about different types of data visualizations
Learn the best practices for creating an effective visualization
Read about the Tableau software tool
Install Tableau Public
What is Data Visualization?
Let’s first learn about the concept of data visualization and why it is so important!
Watch the video above to learn about data visualization, including:
What is data visualization?
What makes data visualization so crucial in data science? How is it being used?
What are some of the different types of data visualizations?
Some tips to create effective visualizations (more in Step 2)
Data visualization is a handy tool for analyzing data, but it has both advantages and disadvantages.
Advantages
One benefit of data visualization is that it helps people process information faster and more easily than text does. People are good at recognizing images and picking out patterns in visual displays, so a visualization lets us identify trends and outliers we might not notice in the raw data. It also gives a simpler way to communicate information, so the audience can better understand the story being told. Visualizations are especially useful when handling large amounts of data, such as standardized test scores of all U.S. students or global data about a lethal disease.
Data visualization also makes information more interactive and engaging by providing elements that can be manipulated. A good example is a filter that shows only the data within a certain range of dates, as in the image shown below. Interactive features like this support effective data storytelling and increase user engagement. They also let people view the data from different perspectives and understand it both at a high level and in detail.
Disadvantages
One thing to be careful about when creating data visualizations is that the way the data is presented can make the result biased or inaccurate. A visualization may highlight only the portion of the data that its designer considers important. The rest of the information is then left out of the presentation, which can lead to biased insights. As a result, viewers can miss core messages in the data and misinterpret it.
Data visualizations can also lead people to see false correlations and to assume causation. Depending on how the data is displayed and on the integrity of the data, a visualization can suggest a relationship between variables that does not actually exist. False correlations are especially harmful in data-driven decision-making, where they can lead to conclusions that connect variables that are completely unrelated to each other.
A classic example of how correlation does not imply causation is ice cream sales versus the number of shark attacks in the US every year. Based on the dual-line graph below, those two variables are highly correlated. However, their correlation doesn’t mean that ice cream sales cause shark attacks. A more plausible explanation is that when it’s hotter outside, people both eat more ice cream and go into the ocean more often, which increases ice cream sales and shark attacks at the same time.
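The effect of a shared cause like temperature can be sketched in a few lines of Python. The numbers below are made up purely for illustration (they are not real sales or shark attack figures), and all variable names are hypothetical. The sketch builds two series that each depend on temperature, then compares their raw correlation with their partial correlation once temperature is held fixed.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def partial_correlation(xs, ys, zs):
    """Correlation of xs and ys after removing the linear effect of zs."""
    r_xy = pearson(xs, ys)
    r_xz = pearson(xs, zs)
    r_yz = pearson(ys, zs)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Hypothetical monthly values, invented for this illustration only.
temperature = [5, 8, 12, 17, 22, 27, 30, 29, 24, 18, 11, 6]
noise_sales = [3, -2, 1, -3, 2, -1, 1, -2, 3, -1, 2, -3]
noise_attacks = [-1, 1, 1, -1, 0, 1, -1, 0, 1, -1, 0, 0]

# Both series are driven by temperature; neither depends on the other.
# The values are in arbitrary units.
ice_cream_sales = [10 * t + e for t, e in zip(temperature, noise_sales)]
shark_attacks = [0.5 * t + e for t, e in zip(temperature, noise_attacks)]

raw_r = pearson(ice_cream_sales, shark_attacks)
partial_r = partial_correlation(ice_cream_sales, shark_attacks, temperature)

print(f"raw correlation: {raw_r:.2f}")
print(f"correlation with temperature held fixed: {partial_r:.2f}")
```

With these made-up numbers, the raw correlation is close to 1, while the correlation with temperature held fixed is close to 0. The two series look strongly related only because both follow the temperature.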
Data Visualization Best Practices
You have learned how a data visualization can mislead its audience when the data is misrepresented. To avoid those pitfalls, there are several best practices to keep in mind.
Understand your audience and purpose
When creating a visualization, keep in mind the message you wish to communicate through the information you are providing. The message of your story needs to be clear: is this visualization for a performance review, a call to action, a behavior analysis, or something else? After deciding what you want the visualization to accomplish for your target audience, you can tailor your charts to their expectations and knowledge.
Choose the right data visualization
It is important to choose the right type of visualization that best conveys your message and makes your data meaningful. The list below shows a few types of visualizations and what they’re suitable for.
Bar charts are effective for comparing different categories of data and are one of the most commonly used data visualizations.
Line graphs connect individual data points and are useful for seeing how values change over time, allowing you to compare those changes relative to each other.
Maps are for visualizing location-specific data and identifying spatial relationships in a geographical context.
Make your data visualization as understandable as possible
Every data visualization should offer the audience some value in order to be insightful, and it should be easy to understand. Give your audience enough context to read the visualization and grasp its message. Ways to add context include:
Adding easy-to-read legends and labels
Scaling the chart with equal intervals on each axis
Creating a title that summarizes the chart
Organizing the chart logically to easily compare data
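As one concrete illustration of the "equal intervals" item above, here is a minimal Python sketch that computes evenly spaced axis ticks covering a range of data. The function name `equal_interval_ticks` and the example values are hypothetical and are not part of Tableau, which chooses axis ticks for you by default. The sketch only shows what an evenly scaled axis looks like.

```python
import math


def equal_interval_ticks(data_min, data_max, step):
    """Return evenly spaced tick values that cover [data_min, data_max]."""
    if step <= 0:
        raise ValueError("step must be positive")
    # Snap the ends outward to the nearest multiples of step.
    start = math.floor(data_min / step) * step
    end = math.ceil(data_max / step) * step
    count = round((end - start) / step)
    return [start + i * step for i in range(count + 1)]


# Data ranging from 3 to 47, with a tick every 10 units.
ticks = equal_interval_ticks(3, 47, 10)
print(ticks)  # [0, 10, 20, 30, 40, 50]
```

Because every gap between neighboring ticks is the same, equal distances on the axis always represent equal differences in the data.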
Use clear color cues
Color cues can communicate a lot without using any words. Distinct, intuitive colors that make sense to viewers draw attention to the visualization, accentuate the data, and help people process the information faster.
When choosing a color scheme, avoid mixing many unrelated colors or using rainbow palettes, since they tend to complicate your design and make it less effective. Instead, choose 1-2 tones for your visualizations to keep them simple and easy to understand.
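To make the "1-2 tones" advice concrete, here is a minimal Python sketch that generates several shades of a single hue using the standard library's `colorsys` module. The function name `single_hue_palette`, the default saturation, and the lightness range are assumptions chosen for illustration. Tableau ships with its own built-in palettes, so this is only a way to see what a single-tone palette looks like.

```python
import colorsys


def single_hue_palette(hue, n, saturation=0.6):
    """Return n hex colors that share one hue, ordered from light to dark."""
    if n < 1:
        raise ValueError("n must be at least 1")
    colors = []
    for i in range(n):
        # Lightness runs from 0.85 (light) down to 0.30 (dark).
        fraction = i / (n - 1) if n > 1 else 0.0
        lightness = 0.85 - 0.55 * fraction
        r, g, b = colorsys.hls_to_rgb(hue, lightness, saturation)
        colors.append(
            "#{:02x}{:02x}{:02x}".format(round(r * 255), round(g * 255), round(b * 255))
        )
    return colors


# Four shades of one blue hue (0.6 on colorsys's 0-1 hue scale).
blues = single_hue_palette(hue=0.6, n=4)
print(blues)
```

The resulting hex codes can be pasted into any tool that accepts custom colors. A second tone, if needed, can come from calling the same function with a different hue.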
Data Visualization Tools and Getting to Know Tableau
You may be familiar with creating visualizations using Python or other programming languages and their libraries, whether for exploratory data analysis or to present a dashboard. However, creating visualizations this way requires knowledge of coding. With the increasing popularity of data visualization, numerous tools have become available for creating visualizations without code, including Tableau, Microsoft Power BI, and Qlik Sense. These tools range from simple to complex, have unique features, and may suit different industries and businesses.
In this task, we will focus on Tableau and install it. Tableau is a visual analytics platform that simplifies data-driven problem-solving by providing graphical representations of data. It is known for being relatively easy for beginners to use and for producing visually appealing visualizations, which makes it a great first tool for learning how to create effective data visualizations.
Click on the article to learn more about Tableau. Keep these questions in mind while reading the article:
What is Tableau?
What are some useful features of Tableau?
What editions of Tableau are there?
How does Tableau work?
Installing Tableau Public and Setting Up Data
Now that you have gone through the essential knowledge about data visualization and the software tool Tableau, let’s start installing Tableau Public and setting up the data source you will use for the next quest!
Head over to this link to install Tableau Public, the free version. Click on Download Tableau Public, and you will be taken to register for an account. Create a new account and make sure to verify it.
Once you have installed Tableau Public, you will see the site shown below.
On this site, go to the Sample Data page and download the Netflix Movies and TV Shows dataset Excel file found under the Entertainment section. This will be the dataset you will use in the next write-up to create your data visualizations.
As you read in the article, note that any dataset you use to create visualizations in Tableau Public is available to the public.
NOTE: If you are not using Mac or Windows, you can also use Tableau Public in your browser by heading to this link, creating a new account, and selecting “Web Authoring” under the Create tab at the top of that page. Then click here to go to the Sample Data page and download the Netflix dataset.
Conclusion
Go back to Tableau Public. You will notice that you can connect to a data source in several ways, such as an Excel file, text file, or JSON file. Click on the Excel file option, then find the Netflix Movies and TV Shows dataset you downloaded in your directory and open it.
After connecting to the data source, you should be on the data source page with the dataset successfully connected.