Share Data Through the Art of Visualization week 1


This design thinking example showed how important it is to:

-Understand the needs of users
-Generate new ideas for data visualizations
-Make incremental improvements to data visualizations over time

Are there multiple datasets?

For cases dealing with more than one set of data, consider a line or pie chart for accurate representation of your data. A line chart connects the data points of each data set with a continuous line, showing how numbers have changed over time. A pie chart is good for dividing a whole into multiple categories or parts. An example of this is when you are measuring the quarterly sales figures of your company. Below are examples of this data plotted on both a line and pie chart.

Then, explain your data viz even further with

a subtitle. A subtitle supports the headline by adding more context and description. Use a font style that matches the rest of the chart's elements and place the subtitle directly underneath the headline.

Space is the area between,

around and in the objects. There should always be space in data visualizations, just not too much or too little. For example, the space between the bars of a bar graph like this one should be smaller than the width of the bars themselves. This will draw the viewer's attention to the bar and the data it represents instead of the empty space. Finally, there's movement. Movement is used to create a sense of flow or action in a visualization. One of my favorite examples is the data viz, the Wealth and Health of Nations. This viz showcases a correlation between the financial health and physical health of nations.

You can also make it easier for people to see and hear content by separating foreground from background. Using bright colors that

contrast against the background can help those with poor visibility, whether permanent or temporary, clearly see the information conveyed.
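
One way to put a number on "contrast against the background" is the contrast ratio defined in the WCAG 2.x accessibility guidelines. The notes do not mention WCAG, so treat this as an outside reference; the function names are illustrative. A minimal sketch:

```python
def _linearize(channel):
    """Convert one 0-255 sRGB channel to linear light (WCAG 2.x definition)."""
    c = channel / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(rgb):
    """Relative luminance of an (R, G, B) color, from 0.0 (black) to 1.0 (white)."""
    r, g, b = (_linearize(v) for v in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(foreground, background):
    """Contrast ratio between two colors, from 1.0 (identical) to 21.0."""
    lighter, darker = sorted(
        (relative_luminance(foreground), relative_luminance(background)),
        reverse=True,
    )
    return (lighter + 0.05) / (darker + 0.05)


# Black text on a white background gives the maximum possible ratio.
print(round(contrast_ratio((0, 0, 0), (255, 255, 255)), 1))
```

WCAG asks for a ratio of at least 4.5:1 for normal-size text, which can serve as a rough target when picking label and background colors.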

Causation can only be determined

from an appropriately designed experiment. Sometimes when two variables are correlated, the relationship is coincidental or a third factor is causing them both to change.

Column charts use

size to contrast and compare two or more values, using height or length to represent the specific values.

Design thinking is a process used to

solve complex problems in a user-centric way. User-centricity means considering the user and their needs first.

Reviewing each of these visual examples, where do you notice that they fit in relation to your type of data? One way to answer this is by evaluating patterns in data. Meaningful patterns can take many forms, such as:

-Change: This is a trend or instance of observations that become different over time. A great way to measure change in data is through a line or column chart.
-Clustering: A collection of data points with similar or different values. This is best represented through a distribution graph.
-Relativity: These are observations considered in relation or in proportion to something else. You have probably seen examples of relativity data in a pie chart.
-Ranking: This is a position in a scale of achievement or status. Data that requires ranking is best represented by a column chart.
-Correlation: This shows a mutual relationship or connection between two or more things. A scatter plot is an excellent way to represent this type of data pattern.
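
The pattern-to-chart pairings above can be kept as a small lookup table. This is only a sketch of the list as written; the names `PATTERN_TO_CHART` and `suggest_charts` are made up for illustration.

```python
# Hypothetical lookup table built from the pattern descriptions above.
PATTERN_TO_CHART = {
    "change": ["line chart", "column chart"],
    "clustering": ["distribution graph"],
    "relativity": ["pie chart"],
    "ranking": ["column chart"],
    "correlation": ["scatter plot"],
}


def suggest_charts(pattern):
    """Return the chart types the notes associate with a data pattern."""
    key = pattern.strip().lower()
    if key not in PATTERN_TO_CHART:
        raise ValueError(f"Unknown pattern: {pattern!r}")
    return PATTERN_TO_CHART[key]


print(suggest_charts("Correlation"))
```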

Kaiser Fung's Junk Charts Trifecta Checkup: This approach is a useful set of questions that can help consumers of data visualization critique what they are consuming and determine how effective it is. The Checkup has three questions:

-What is the practical question?
-What does the data say?
-What does the visual say?
Note: This checklist helps you think about your data viz from the perspective of your audience and decide if your visual is communicating your data effectively to them or not. In addition to these frameworks, there are some other building blocks that can help you construct your data visualizations.

Define.

Define the problem that you're going to solve in the context of the pains and challenges you uncovered in your empathy discovery. This leads to a problem/challenge statement.

Empathize.

Work to understand your audience: who they are, what their pains, problems, and attitudes are, and what they want to accomplish at an emotional level beyond your product.

A histogram resembles

a bar graph, but it's a chart that shows how often data values fall into certain ranges. This histogram shows a lot of data and how it's distributed across a narrow range, from negative one to positive one. Each bin or bucket, as the bar is called, contains a certain number of values that fall into one small part of the range. If you don't need to show that much data, other histograms would be more effective, like this one about the length of dinosaurs. Here the bins or buckets of data values are segmented. You can show each value that falls into each part of the range.

If you want to show a comparison of the different age groups of visitors to a website,

a line graph with a line for each age group, plus one for total users would work well.

Line graphs are

a type of visualization that can help your audience understand shifts or changes in your data. They're usually used to track changes through a period of time, but they can be paired with other factors too. In this line graph, we're using two lines to compare the popularity of cats and dogs over a period of time. With two different line colors, we can immediately tell that dogs are more popular than cats. We'll talk more about using colors and patterns to make visualizations more accessible to audiences later too. Even as a line moves up and down, there's a general trend upwards and the line for dogs always stays higher than the line for cats

Alternative text provides a textual alternative to non-text content. It allows the content and function of the image to be

accessible to those with visual or certain cognitive disabilities. Here's an example that shows additional text describing the chart. And speaking of text, you can make data from charts and diagrams available in a text-based format through an export to Sheets or Excel.

Shapes in visualizations should

always be two-dimensional. This is because three-dimensional objects in a visualization can complicate the visual and confuse the audience. Shapes are also a great way to add eye-catching contrast, especially size contrast to your data story. This circle used for a pie chart lets someone quickly understand the data in a familiar format. Shapes with symmetry are usually more familiar to people, so there's less work for the audience to do when viewing symmetrical data viz. But the asymmetrical shapes in this map are still instantly recognizable as countries. It's good to note that the data you're sharing with your audience will usually inform the types of shapes you want to use in your data viz.

A static image lets you

control all elements of the story you want to tell. When you start incorporating movement and interactivity, the story is controlled by whoever is controlling the interactivity, whether that's you or possibly your audience if you've turned control over to them.

If you use design thinking when planning and creating your data viz, you'll be making

decisions based on the needs of the people who will be viewing them. This way your audience will be engaged and enlightened by how you visualize your findings

In the empathize phase you think about the

emotions and needs of the target audience of your data viz, whether it's stakeholders, team members or the general public. Here you should avoid areas where people might face obstacles interacting with your visualizations. For example, let's say you've been working on an analysis for a pharmaceutical company about how patients have been responding to a new treatment. You're getting ready to visualize the data, so you should think about the audience, which will include stakeholders like pharmacists, doctors and other medical professionals. Maybe you're thinking of using a color scheme that you like, but you realize that these colors might be a challenge to some people. The colors might be too bright or dramatic, which might not be right for the seriousness of the data. Or the colors might not have enough contrast for people who have color vision deficiencies. By adjusting the colors, you'll be empathizing with the needs of your audience. If there's someone on your team who is vision impaired, you want to find a way to explain the data verbally as well.

Pie charts show

how much each part of something makes up the whole. This pie chart shows us all the activities that make up someone's day. Half of it's spent working, which is shown by the amount of space that the blue section takes up. From a quick scan, you can easily tell which activities make up a good chunk of the day in this pie chart and which ones take up less time.
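
A pie chart's slices are just each part's share of the total. The sketch below uses a made-up 24-hour day in which half the time is spent working, matching the description above; the other activities and hours are assumptions for illustration.

```python
def shares_of_whole(parts):
    """Return each part's percentage of the total."""
    total = sum(parts.values())
    if total <= 0:
        raise ValueError("The parts must sum to a positive total")
    return {name: 100 * value / total for name, value in parts.items()}


# Hypothetical hours per activity in one day (sums to 24).
hours = {"work": 12, "sleep": 8, "meals": 2, "leisure": 2}
shares = shares_of_whole(hours)

for name, percent in shares.items():
    print(f"{name}: {percent:.1f}% of the day")
```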

When you bring design thinking into your work, you're trying to

identify alternative strategies for your visualizations that might not be clear right away. You have to challenge your own thinking and explore different ways of approaching the problems and finding solutions

Let's talk about labels. Earlier, we mentioned Dona Wong, a visual journalist who's well known for sharing guidelines on making data viz more effective. She makes a very strong case for using

labels directly on the data instead of relying on legends. This is because lots of charts use different visual properties like colors or shapes to represent different values of data.

Next, we have colors, and colors are, well, colors. Of course, in the eyes of artists and analysts, colors can be

much more complex. Colors can be described by their hue, intensity, and value. The hue of a color is basically its name: red, green, blue and so on. Intensity is how bright or dull a color is, and finally, there's value. The value is how light or dark the colors are in a visualization. In more scientific terms, value indicates how much light is being reflected. Dark values with some black added are called shades of color, like these shades of green. Light values with white added are called tints, like these tints of blue. In this map, there are shades and tints of gray. The value of these colors helps us understand the population data in the map, and varying the color's value can be a very effective way to draw our audience's attention to specific areas.
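
Tints and shades can be sketched as mixing a color with white or black. The linear RGB mix below is one simple approach assumed for illustration, not the only way design tools compute them, and the function names are made up.

```python
def tint(rgb, amount):
    """Mix a color toward white; amount is 0.0 (unchanged) to 1.0 (white)."""
    return tuple(round(c + (255 - c) * amount) for c in rgb)


def shade(rgb, amount):
    """Mix a color toward black; amount is 0.0 (unchanged) to 1.0 (black)."""
    return tuple(round(c * (1 - amount)) for c in rgb)


blue = (0, 0, 255)
green = (0, 128, 0)

print("tint of blue:", tint(blue, 0.2))
print("shade of green:", shade(green, 0.5))
```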

Maps help

organize data geographically. The great thing about maps is they can hold a lot of location-based information and they're easy for your audience to interpret. This example shows survey data about people's happiness in Europe. The borderlines are well-defined and the colors added make it even easier to tell the countries apart. Understanding the data represented here, which we'll come back to again later, can happen pretty quickly.

Correlation charts can show

relationships among data, but they should be used with caution because they might lead viewers to think that the data shows causation. Causation, or a cause-effect relationship, occurs when an action directly leads to an outcome. Correlation and causation are often mixed up because humans like to find patterns even when they don't exist. If two variables look like they're associated in some way, we might assume that one is dependent on the other. That implies causation, even if the variables are completely independent. If we put that data into a visualization, then it would be misleading. But correlation charts that do show causation can be effective. For example, this correlation chart has one line of data showing the average traffic for Google searches on Tuesdays in Brazil. The other line shows search traffic for a specific date, June 15th. The data is automatically correlated because both lines are representing the same basic information. But the chart also shows one big difference. When a football match (or soccer match, for Americans) began on June 15th, the search traffic showed a significant drop. This implies causation. Football is a very popular and important sport for Brazilians, and the data in this chart verifies that.

One of your biggest considerations when creating a data visualization is where

you'd like your audience to focus. Showing too much can be distracting and leave your audience confused. In some cases, restricting data can be a good thing. On the other hand, showing too little can make your visualization unclear and less meaningful. As a general rule, as long as it's not misleading, you should visually represent only the data that your audience needs in order to understand your findings.

The data viz we've been discussing here have three essential elements.

-The first is clear meaning. Good visualizations clearly communicate their intended insight.
-The second is a sophisticated use of contrast, which helps separate the most important data from the rest using visual context that our brains naturally look for.
-The third essential element for effective visuals is refined execution. Visuals with refined execution include deep attention to detail, using visual elements like lines, shapes, colors, value, space and movement.

Design thinking has five phases you can use when creating data visualizations:

empathize, define, ideate, prototype, and test. In the spirit of design thinking, these phases don't have to follow a set order. Instead, think of them as an overview of actions that can help you produce a user-centered design in your visualizations.

There are a few ways you can incorporate accessibility in your data visualization. You'll just have to

think a little differently. It helps to label data directly instead of relying exclusively on legends, which require color interpretation and more effort by the viewer to understand. This can also make it a faster read for those with or without disabilities.

A line chart is used to

track changes over short and long periods of time. When smaller changes exist, line charts are better to use than bar graphs. Line charts can also be used to compare changes over the same period of time for more than one group.

If your data needs to be ranked, like when ordering the number of responses to survey questions, you should first think about

what you want to highlight in your visualization. Bar charts with horizontal bars effectively show data that are ranked, with bars arranged in ascending or descending order. A bar chart should always be ranked by value, unless there's a natural order to the data like age or time, for example. This simple bar chart shows metals like gold and platinum ranked by density. An audience would be able to clearly see the ranking and quickly determine which metals had the highest density, even if this dataset included a lot more metals.
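
Ranking by value before plotting amounts to a sort. The sketch below ranks a few metals by density and prints a rough text version of a horizontal bar chart; the densities are approximate reference values in g/cm³ and are not taken from the chart described above.

```python
# Approximate densities in g/cm^3 (illustrative reference values).
densities = {
    "silver": 10.49,
    "gold": 19.3,
    "copper": 8.96,
    "platinum": 21.45,
}

# Sort from highest to lowest density so the bars appear in ranked order.
ranked = sorted(densities.items(), key=lambda item: item[1], reverse=True)

for name, density in ranked:
    bar = "#" * round(density)  # one '#' per g/cm^3, rounded
    print(f"{name:<9}{bar} {density}")
```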

Direct labeling like this keeps

your audience's attention fixed on your graphic and helps them identify data quickly. Legends, by contrast, force the audience to do more work because a legend is positioned away from the chart's data.

In your data analysis, remember to:

-Critically analyze any correlations that you find
-Examine the data's context to determine if a causation makes sense (and can be supported by all of the data)
-Understand the limitations of the tools that you use for analysis

Design thinking for data visualization involves five phases:

-Empathize: Thinking about the emotions and needs of the target audience for the data visualization
-Define: Figuring out exactly what your audience needs from the data
-Ideate: Generating ideas for data visualization
-Prototype: Putting visualizations together for testing and feedback
-Test: Showing prototype visualizations to people before stakeholders see them

Are you measuring changes over time?

A line chart is usually adequate for plotting trends over time. However, when the changes are larger, a bar chart is the better option. If, for example, you are measuring the number of visitors to NYC over the past 6 months, the data would look like this:

Ideate.

Come up with a bunch of different ways to solve the problem.

choices you'll make as a data analyst when creating visualizations.

Each of your choices should help make sure that your visuals are meaningful and effective. Another choice you'll need to make is whether you want your visualizations to be static or dynamic

Test.

Get your prototypes in front of users and see what they say.

Does your data have only one numeric variable?

If you have data that has one continuous numerical variable, then a histogram or density plot is the best method of plotting it. Depending on your type of data, a bar chart can even be appropriate in this case. For example, if you have data pertaining to the height of a group of students, you will want to use a histogram to visualize how many students there are in each height range.
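
The student-height example comes down to counting how many values fall in each range. A minimal sketch with made-up heights and an assumed bin width of 10 cm:

```python
def bin_counts(values, start, width, n_bins):
    """Count how many values fall into each of n_bins equal-width ranges.

    Bin i covers start + i * width up to (but not including) the next bin.
    Values outside the covered range are ignored.
    """
    counts = [0] * n_bins
    for value in values:
        index = int((value - start) // width)
        if 0 <= index < n_bins:
            counts[index] += 1
    return counts


# Hypothetical student heights in centimeters.
heights_cm = [151, 158, 160, 162, 165, 167, 168, 171, 174, 183]
counts = bin_counts(heights_cm, start=150, width=10, n_bins=4)

for i, count in enumerate(counts):
    low = 150 + i * 10
    print(f"{low}-{low + 9} cm: {'#' * count}")
```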

Which of the following are necessary to consider while making an effective visualization?

In order to make an effective visualization, you must consider the type of data you're visualizing, the needs of your audience, and the design thinking process. An effective visualization can be made in any visualization software. Going forward, you can use your knowledge of creating data visualizations in the chart editor to explore more types of data visualizations. This will help you better present your data and findings to peers and stakeholders.

What are the four elements of effective data visualization according to David McCandless?

The four elements of effective data visualization are:
-the information (data)
-the story (concept)
-the goal (function)
-the visual form (metaphor)
A successful data visualization must have all four elements.

Prototype.

Turn those ideas into the lowest possible fidelity test that you can execute and still get clean data. This stage usually cuts the ideas down as problems are discovered.

Do relationships between the data need to be shown?

When you have two variables for one set of data, it is important to point out how one affects the other. Variables that pair well together are best plotted on a scatter plot. However, if there are too many data points, the relationship between variables can be obscured so a heat map can be a better representation in that case. If you are measuring the population of people across all 50 states in the United States, your data points would consist of millions so you would use a heat map. If you are simply trying to show the relationship between the number of hours spent studying and its effects on grades, your data would look like this
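
When a scatter plot has too many points, a heat map replaces individual points with counts per grid cell. The sketch below shows that binning step only; the data, cell sizes, and the name `grid_counts` are assumptions for illustration.

```python
from collections import Counter


def grid_counts(points, x_width, y_width):
    """Count how many (x, y) points fall into each grid cell."""
    cells = Counter()
    for x, y in points:
        cell = (int(x // x_width), int(y // y_width))
        cells[cell] += 1
    return cells


# Hypothetical (hours studied, grade) pairs.
points = [(1, 52), (1.5, 55), (2, 61), (2.5, 64), (4, 78), (4.5, 75), (5, 88)]

# Each cell spans 2 hours by 10 grade points; a heat map would color
# each cell by its count.
cells = grid_counts(points, x_width=2, y_width=10)
for cell, count in sorted(cells.items()):
    print(cell, count)
```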

If outcomes are categorized on the x-axis by distinct numeric values (or ranges of numeric values), the distribution becomes

a histogram. If data is collected from a customer rewards program, an analyst could categorize how many customers consume between one and ten cups of coffee per week. The histogram would have ten columns, one for each number of cups, and the height of each column would indicate the number of customers drinking that many cups of coffee per week.
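
For the coffee example, each of the ten columns is a count of customers at one distinct value. A short sketch with made-up rewards-program data:

```python
from collections import Counter

# Hypothetical cups of coffee per week, one entry per customer.
cups_per_week = [1, 2, 2, 3, 3, 3, 5, 7, 7, 10]

counts = Counter(cups_per_week)

# One column per value from 1 to 10, including values nobody reported.
columns = [counts.get(cups, 0) for cups in range(1, 11)]

for cups, height in zip(range(1, 11), columns):
    print(f"{cups:>2} cups: {'#' * height}")
```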

Things to remember: If there is a correlation between two variables,

a pattern will be seen when the variables are plotted on a scatterplot. There are three ways to describe the correlation between variables.
-Positive correlation: As x increases, y increases.
-Negative correlation: As x increases, y decreases.
-No correlation: As x increases, y stays about the same or has no clear pattern.

A legend or key identifies the meaning of various elements in a data visualization and can be used as an

alternative to labeling data directly.

Dynamic visualizations

are interactive or change over time. The interactive nature of these graphics means that users have some control over what they see. This can be helpful if stakeholders want to adjust what they're able to view.

An annotation

briefly explains data or helps focus the audience on a particular aspect of the data in a visualization.

Tableau is a

business intelligence and analytics platform that helps people see, understand, and make decisions with data.

The pie chart is a

circular graph that is divided into segments representing proportions corresponding to the quantity it represents, especially when dealing with parts of a whole.

Heatmaps also use

color to compare categories in a data set. They are mainly used to show relationships between two variables and use a system of color-coding to represent different values.

Lines in visualizations can be

curved or straight, thick or thin, vertical, horizontal, or diagonal. They can add visual form to your data and help build a structure for your visualization.

Then there are charts that show parts of a whole. This is known as

data composition, and it's achieved by combining the individual parts of a visualization and displaying them together as a whole. Stacked bars, donuts, stacked areas, pie charts and tree maps can all do this.

Static visualizations

do not change over time unless they're edited. They can be useful when you want to control your data and your data story. Any visualization printed on paper is automatically static. Charts and graphs created in spreadsheets are often static too. For example, the owner of this spreadsheet might have to change the data in order for the visualization to update.

The define phase helps you to

find your audience's needs, their problems, and your insights. This goes hand in hand with the empathize phase as you'll use what you learned in that phase to help you spell out exactly what your audience needs from your visualization. You could use this phase to think about which data to show in your visualization. Maybe this data viz will also be presented to patients who are part of your company's study. While you'll need to meet your objectives, there might be data that could make these people uncomfortable. You can think of ways to position that data to make it more digestible. Or if you're presenting to different audiences, you can adjust your visualizations to meet each group's needs by seeking input from members of the group or colleagues who've worked with that group before.

In the ideate phase, you start to

generate your data viz ideas. You'll use all of your findings from the empathize and define phases to brainstorm potential data viz solutions. This might involve creating drafts of your visualization with different color combinations or maybe experimenting with different shapes. Creating as many examples as possible will help you refine your ideas. The key here is to always remember your audience when coming up with ideas and strategies. You want to think about how you can position your visualizations to meet the needs and expectations of your audience.

Adding descriptive wording can really help your audience interpret and understand the data in the right way. Your audience will be less likely to have questions about what you're sharing if you add

headlines, subtitles, and labels.

One of the easiest ways to highlight key data in your data viz is through

headlines. A headline is a line of words printed in large letters at the top of the visualization to communicate what data is being presented. It's the attention-grabber that makes your audience want to read more. The typography and placement of the headline are important too. It's best to keep it simple. Make it bold or a few sizes larger than the rest of the text and place it directly above the chart, aligned to the left.

Correlation

in statistics is the measure of the degree to which two variables move in relationship to each other. An example of correlation is the idea that "As the temperature goes up, ice cream sales also go up." It is important to remember that correlation doesn't mean that one event causes another. But, it does indicate that they have a pattern with or a relationship to each other. If one variable goes up and the other variable also goes up, it is a positive correlation. If one variable goes up and the other variable goes down, it is a negative or inverse correlation. If one variable goes up and the other variable stays about the same, there is no correlation.
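
A common way to measure how two variables move together is the Pearson correlation coefficient, which ranges from -1 to 1. The notes do not name a specific measure, so this is one assumed choice. The data below is made up, and the 0.3 cutoff for calling something "no correlation" is an arbitrary threshold for illustration.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("Need two sequences of the same length (at least 2)")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("Correlation is undefined when a variable is constant")
    return cov / sqrt(var_x * var_y)


def describe_correlation(r, threshold=0.3):
    """Label r as positive, negative, or none (threshold is an assumption)."""
    if r >= threshold:
        return "positive"
    if r <= -threshold:
        return "negative"
    return "none"


# Hypothetical daily temperatures (Celsius) and ice cream sales.
temps = [20, 25, 30, 35]
sales = [100, 150, 200, 250]

r = pearson_r(temps, sales)
print(describe_correlation(r))
```

A strong value of r still says nothing about whether one variable causes the other.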

For comparing data over time we showed you how

line graphs could be effective. Bar graphs and stacked bar graphs, along with area charts, can also be good ways to visualize how data changes over time.

The elements we'll check out are

line, shape, color, space and movement. Now, these aren't the only elements to consider, but these particular ones can add value to your data viz by making them more visually effective and compelling.

Other dynamic visualizations upload

new data automatically. These bar graphs continually update data by the minute and second. Other data visuals can do the same by day, week or month. If you need to, you can show trends in real-time.

When you're comparing distinct objects, like in our example about mobile versus computer usage,

ordered bar graphs, grouped bar graphs, and ordered column charts are useful.

Let's say you want to highlight the differences among the age groups to compare them more directly. For that you might use a

positive-negative bar chart like this.

The final two phases are prototype and test. Here you'll start

putting your charts, dashboards or other visualizations together.

Causation

refers to the idea that an event leads to a specific outcome. For example, when lightning strikes, we hear the thunder (sound wave) caused by the air heating and cooling from the lightning strike. Lightning causes thunder.

Scatter plots show

relationships between different variables. Scatter plots are typically used for two variables for a set of data, although additional variables can be displayed.

Now to show relationships in your data, you might want to use

scatter plots, bubble charts, column/line charts and heatmaps.

The viz is

set up to show how the search entries change day to day. The bubbles represent the most popular topic on each day in a given part of the US. As new stories come up, the data changes to reflect the topic of those stories. If we wanted the data for weekly or monthly news cycles, we could change the interactive feature to show changes by week or month. Another situation is when you need to show how your data is distributed.

Bar graphs use

size contrast to compare two or more values. The horizontal line of a bar graph, usually placed at the bottom, is called the x-axis. In bar graphs with vertical bars, the x-axis is used to represent categories, time periods, or other variables. The vertical line of a bar graph, usually placed to the left, is called the y-axis. The y-axis usually has a scale of values for the variables. In this example, the time of day is compared to someone's level of motivation throughout the whole workday. Bar graphs are a great way to clarify trends. Here, it's clear this person's motivation is low at the beginning of the day and gets higher and higher by the end of the workday. This type of visualization makes it very easy to identify patterns.

A distribution graph displays

the spread of various outcomes in a dataset.

