Data Visualization Midterm

What are the four basic data relationships that can help you with choosing the right graphical presentation for your data? (R-C-D-C)

(R-C-D-C) Relationship - Scatter plot (good for linear regression analysis) and bubble chart (good to show three or more variables) Composition - Pie chart (overused) and stacked column chart (can get messy) Distribution - Histogram (good for numeric data grouped into bins) and map chart (distribution overview of geographical areas) Comparison - Column chart and line graph (the ones I use the most)

Histogram

- A bar graph depicting a frequency distribution - The standard way to show a statistical distribution
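
A minimal sketch of how the frequency distribution behind a histogram can be built, assuming equal-width bins. The function name `bin_counts` and the sample scores are made up for illustration.

```python
def bin_counts(values, n_bins):
    """Count how many values fall into each of n_bins equal-width bins."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1  # fall back to 1 when all values are equal
    counts = [0] * n_bins
    for v in values:
        # The maximum value goes into the last bin rather than a new one.
        i = min(int((v - lo) / width), n_bins - 1)
        counts[i] += 1
    edges = [lo + k * width for k in range(n_bins + 1)]
    return edges, counts

# Hypothetical sample data
scores = [52, 55, 61, 64, 67, 70, 71, 73, 78, 84, 90, 92]
edges, counts = bin_counts(scores, 4)
for left, right, c in zip(edges, edges[1:], counts):
    # One text "bar" per bin; its length is the frequency.
    print(f"{left:5.1f} to {right:5.1f}: {'#' * c}")
```

The bar heights are the counts; changing `n_bins` changes the apparent shape of the distribution, so it is worth trying more than one value.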

Population Pyramid

- A bar graph representing the distribution of population by age and sex.

Tableau: Dashboards

- A collection of several related visualizations shown on a single page, usually tied together through interactivity - Useful for: > Increasing the analytical power of a visualization by showing multiple perspectives in the same location

Change Blindness

- A perceptual phenomenon that occurs when a change in a visual stimulus is introduced and the observer does not notice it

Connected Scatter Plots

- A scatter plot with dots linked to each other with lines - Usually used to show how the relationship between two variables has changed over time

Barcode (rug) Charts

- A simple line plotted for every point, usually with some transparency or a color scale to deal with multiple points of the same value - Used to show distribution

Frequency Polygon

- A variation on the histogram using a line connecting all bin totals rather than a bar for each (understanding the shape of distribution)

What is Data Visualization

- A way to convey information through graphical representations of data - The use of visual representations of data to support perception and amplify cognition > Turning data into a landscape you can explore with your eyes

What can you do when you have missing data?

- Add N (sample size) for large consistent missing data. This way you note the absence in a subtle way. - Use designs that make it obvious that data gaps are context, and not the main focus. - Treat the data as a category (not applicable) - Interpolate: fill the gaps in a smart way, like using a weighted moving average

Bubble Plots

- Allows one to add another dimension to a scatter plot - Bubble Size; Color - CORRELATION, RELATIONSHIP

Tableau: Measures

- Any field containing numeric (quantitative) information - Dependent variable - Usually green

Sequential Color Scheme

- Color is ordered from low to high

Alert Color Scheme

- Color used to get reader's attention

Highlight Color Scheme

- Color used to highlight something

Visualization Excellence (CPE)

- Complex ideas communicated with clarity, precision, and efficiency - Is nearly always multivariate

Categorical Color Scheme

- Contrasting colors for individual comparison

Dot (strip) Plots

- Each value is plotted as a point - Shows distribution of actual values - Good for showing individual values in a distribution

Persuasion Techniques (ERI)

- Emphasis - Adding or removing reference points - Isolation

Tables

- Essentially the source of all charts - Use when: > Need to look up individual values > Require precise values - Unidirectional and Bidirectional table designs

Deception Techniques (FEO-E)

- Falsification - Exaggeration - Omission - Equivocation

Why does Change Blindness occur?

- Focused attention and limited resources - Expectations / past experiences - Age - The way objects are presented

Data Wrangling

- Gather data from inside and outside the firewall - Join all your data into a single table - Understand your sources and their limitations - Clean up data

Tableau: Regular Calculations

- Computed by the data source; the result set is then returned to Tableau

Describing Color & How many to use (Color has H..L..S...)

- Hue - Luminance - Saturation (intensity) - Use NO MORE than 5-7 colors at once

What is scenario analysis?

- Know what data you have available. Know your audience. What insights do you want derived. Should you use a table or a graph?

Data Types

- Nominal (to categorize and label) - Ordinal (Attributes can be ordered) - Interval (Meaningful gap between data values) - Ratio (Full numerical expressive power; absolute zero exists)

Bin the Data: Unit Chart

- Organize the dots into bins

Gestalt's Principles

- Patterns that transcend the visual stimuli that produced them - Proximity (least effective) - Similarity (least effective) - Connection (2nd most effective) - Enclosure (most effective) - Continuity

Bertin's Visual Channels

- Position - Size - (grey) value - Texture - Color - Orientation - Shape

Scatter Plots

- Present patterns in large sets of data, primarily for correlation and distribution analysis - The more data you include in a scatter plot, the better comparisons you can make

Tableau: Dimensions

- Qualitative data - Independent variable

Tableau: Table Calculations

- Secondary calculations that are performed on the returned results of the view - Fast way to create advanced calculations even without knowing the underlying syntax

Small Multiples

- Series of the same small graph repeated in one visual - Tufte says that they are a great tool to visualize large quantities of data and with a high number of dimensions

Preattentive Attributes of Visual Perception

- Shape - Size - Color (BAD FOR QUANT. COMPARISON) - Length & width - Texture - Elongation - Orientation - Spatial grouping - Order - Spatial position - Enclosure - Convex/Concave **ALSO: Motion, medium, & context

Sources of Illusion (LSDD)

- Single light source and context - We use several perspective cues to make sense of things: lines, sizes, depth, distance

Degrees of Pop-Out (COSC-M)

- Strongest effects are: > Color > Orientation > Size > Contrast > Motion or blinking

Box and Whisker Plots

- Summarize multiple distributions by showing the median (center) and range of the data
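
A box and whisker plot is usually drawn from a five-number summary. The sketch below computes one per group, assuming the "inclusive" quartile method from Python's `statistics` module; the function name `five_number_summary` and the two groups are hypothetical.

```python
import statistics

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) for one group of values."""
    # Assumption: quartiles use the "inclusive" method; other conventions
    # give slightly different Q1 and Q3 on small samples.
    q1, med, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return min(values), q1, med, q3, max(values)

# Hypothetical groups to compare side by side
groups = {
    "A": [1, 2, 3, 4, 5, 6, 7, 8, 9],
    "B": [2, 4, 4, 5, 6, 6, 7, 12, 20],
}
for name, values in groups.items():
    print(name, five_number_summary(values))
```

The box spans Q1 to Q3 with a line at the median; how far the whiskers extend (full range or a multiple of the interquartile range) varies by tool.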

How do you determine whether something is pop out?

- The degree of difference of the target from the nontargets (should be the only one that is distinctive) - The degree of difference of the nontargets from each other (should appear more similar)

Diverging Color Scheme

- Two sequential colors with a neutral midpoint
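
One way to picture a diverging scheme is as two linear color ramps that meet at a neutral midpoint. This is a rough sketch under that assumption; the function names and the specific blue, near-white, and red RGB values are arbitrary choices, not part of the definition.

```python
def lerp_color(c1, c2, t):
    """Linearly blend two RGB tuples; t=0 gives c1, t=1 gives c2."""
    return tuple(round(a + (b - a) * t) for a, b in zip(c1, c2))

def diverging_color(value, lo, mid, hi,
                    low_rgb=(33, 102, 172),    # blue end (arbitrary choice)
                    mid_rgb=(247, 247, 247),   # near-white neutral midpoint
                    high_rgb=(178, 24, 43)):   # red end (arbitrary choice)
    """Map a value to a color; assumes lo < mid < hi."""
    value = max(lo, min(hi, value))  # clamp to the scale's range
    if value <= mid:
        # Lower half: sequential ramp from the low color to neutral.
        return lerp_color(low_rgb, mid_rgb, (value - lo) / (mid - lo))
    # Upper half: sequential ramp from neutral to the high color.
    return lerp_color(mid_rgb, high_rgb, (value - mid) / (hi - mid))

# Example: profit/loss centered on zero
for v in (-10, -5, 0, 5, 10):
    print(v, diverging_color(v, lo=-10, mid=0, hi=10))
```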

Preattentive Processing and Ease of Search (2 systems)

- Two thinking systems: 1) System 1 (pre-attentive) > Generates feelings, impressions, intuitions, and intentions for System 2 2) System 2 (attentive) > Can be engaged as needed to solve more complex problems

Interpolate

- Understand why data has missing values, and then replace those values with a weighted average that is around a given point
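
A minimal sketch of filling gaps with a weighted average of nearby points. It assumes missing values are marked as `None` and that closer neighbors get more weight (1 / distance); the function name `fill_gaps` and the `window` parameter are hypothetical.

```python
def fill_gaps(series, window=2):
    """Replace None entries with a distance-weighted average of known neighbors."""
    filled = list(series)
    for i, v in enumerate(series):
        if v is not None:
            continue
        num = den = 0.0
        for j in range(max(0, i - window), min(len(series), i + window + 1)):
            if series[j] is not None:
                w = 1.0 / abs(i - j)  # j != i here because series[i] is None
                num += w * series[j]
                den += w
        if den:  # leave the gap as None when no neighbor is known
            filled[i] = num / den
    return filled

print(fill_gaps([10, 12, None, 18, 20]))
```

Only original observations are used as neighbors, so one filled value never feeds into another. Filled points should still be marked as estimates in the final chart.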

Tableau: Calculated Fields

- Used to create new dimensions such as segments, or new measures such as ratios - Can also be used with any data type, a multitude of functions and aggregations, as well as logical operators - Why use? > Segment data > Prove a concept > Filter out unwanted results

Violin Plots

- Used to visualize the distribution of the data and its probability density - Similar to a box plot but more effective with complex distributions

Multi-Axis Charts

- Used when a single chart just cannot tell the whole story - A good way of showing the relationship between an amount (columns) and a rate (line)

When should you use motion or blink?

- When color and shape channels are fully utilized, consider using motion or blink - but make it subtle.

Types of Relationships

1) Correlational > two things perform in a synchronized manner 2) Reverse Causation > Cause and effect are reversed 3) Third-Variable Problem > An unobserved variable that accounts for a correlation between 2 variables

Tufte's Principles (x5)

1) Have graphical integrity 2) Mind the lie factor > The representation of numbers, as physically measured on the graphic, should be directly proportional to the numerical quantities represented 3) Maximize data-ink ratio > Closer to 1.0, the better the graph is 4) NO chart junk 5) LESS IS MORE: A visual project is good if it communicates a lot with a little (BIG MINIMALIST) > AKA graphical excellence **Minimalism and efficiency
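
Tufte defines the lie factor as the size of the effect shown in the graphic divided by the size of the effect in the data, and the data-ink ratio as data-ink divided by total ink. The sketch below just encodes those two ratios; the function names are made up, and the numbers are the often-cited fuel-economy example (18 to 27.5 mpg drawn as lines 0.6 and 5.3 inches long).

```python
def effect_size(first, second):
    """Relative change from the first value to the second."""
    return abs(second - first) / abs(first)

def lie_factor(graphic_first, graphic_second, data_first, data_second):
    """Effect shown in the graphic divided by effect in the data (ideal: 1.0)."""
    return effect_size(graphic_first, graphic_second) / effect_size(data_first, data_second)

def data_ink_ratio(data_ink, total_ink):
    """Share of a graphic's ink that presents data (closer to 1.0 is better)."""
    return data_ink / total_ink

# Line lengths in inches vs. the mpg values they represent
print(round(lie_factor(0.6, 5.3, 18, 27.5), 1))
```

A lie factor well above or below 1 means the graphic overstates or understates the change in the data.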

How do you filter?

1) Dragging a field onto the Filters shelf, 2) Using an interactive filter (so the user can interact with the view), 3) Selecting a mark in the view to exclude. All of these types of filters put a pill on the Filters shelf. To remove a filter, drag the pill off the shelf.

Cairo's Principles (BIF-TE)

3) Beautiful > Attractive and pleasing things work better > Balanced mix 4) Insightful > Clear the path, accessible information 2) Functional > Choose graphics to help people encode information correctly 1) Truthful > Present things concisely, simply, clearly, and elegantly 5) Enlightening > Change people's minds for the better > Choose topics ethically and wisely

What visual attributes can you use for ordered data?

Area and color intensity

Causes of illusions?

Brain relies on past experiences and making assumptions to overcompensate and make sense of things - Focused attention / Limited resources - Single Light Source - Context - Expectations - Age - Distractions

How can you tell if something has a pre-attentive attribute?

Check the response time to find a target among others - if pre-attentive, the time taken to find the target should be equally fast no matter how many distracting non-targets there are.

What are two least effective visual attributes?

Color and shape (often used for categories)

Statistics vs. Data Visualization?

Data visualization is the use of visual representations of data to support perception and amplify cognition Statistics is a branch of mathematics dealing with the collection, analysis, interpretation, presentation, and organization of data

Deviation comparison

Display the degree to which one or more sets of quantitative values differ in relation to a primary set of values or target. Variance from the plan
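
A small sketch of the numbers a deviation chart plots: the absolute and relative difference between actual values and a plan. The function name `deviation_from_plan` and the quarterly figures are hypothetical.

```python
def deviation_from_plan(actual, plan):
    """Return {category: (absolute difference, relative difference)}."""
    return {
        key: (actual[key] - plan[key], (actual[key] - plan[key]) / plan[key])
        for key in plan
    }

# Hypothetical quarterly sales vs. target
plan = {"Q1": 100, "Q2": 100, "Q3": 120}
actual = {"Q1": 120, "Q2": 90, "Q3": 120}
for quarter, (diff, pct) in deviation_from_plan(actual, plan).items():
    print(f"{quarter}: {diff:+d} ({pct:+.0%})")
```

Plotting the differences around a zero baseline, rather than the raw values, puts the emphasis on the variance itself.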

Correlation

Display the relationship between two paired sets of quantitative values to demonstrate whether they are related, and if so direction (positive or negative) and the strength of the relationship (strong or weak).

Distribution

Display the way in which one or more sets of quantitative values are distributed across their full quantitative range from the lowest to highest and everything in between. Outlier detection

Time series comparison

Displays quantitative values among multiple, sequential points in time. One axis of the graph provides the time scale, with labels for each interval of time (years, quarters, etc.).

divergent vs. convergent thinking

Divergent - CREATIVE: the ability to generate unusual, yet nonetheless appropriate, responses to problems or questions (MANY answers) Convergent - LOGIC/KNOWLEDGE: Less creative: the ability to produce a response that is based primarily on knowledge and logic.

How do you select the right graph?

EXAMINE the RELATIONSHIP to choose a graphical form that can effectively communicate your message. Match the quantitative message you wish to communicate to the structural design.

Comparison chart

Each value is discrete and relates to a separate categorical subdivision without any connection, so it's important to emphasize their distinctness. Or: Nominal item comparison. Or: Ranking

What is an effective visualization?

Effective visualizations reveal patterns and communicate ideas using the power of perception to offload cognition.

What are Gestalt's 4 principles?

In design, visual hierarchy is the arrangement or positioning of different design elements to give them greater or lesser importance. The various gestalt principles heavily influence visual hierarchy. Proximity, Similarity, Enclosure, Connection, Continuity

What are the STRONGEST pre-attentive attributes? (COSC-M)

In general the strongest effects are based on: -Color -Orientation -Size -Contrast -Motion or blinking Large color differences have more popout effect than small ones (e.g., black-white)

How does pre-attentive processing work?

Parallel processing; multiple visual elements perceived simultaneously Quick; Operates automatically, with little or no effort and no sense of voluntary control Unconscious; generates impressions, intuitions, intentions, and feelings Your perception "at a glance"

What are the two most effective visual attributes?

Position and length

What visual attributes can you use for quantitative data?

Position, Length (most efficient) Slope and angle (less efficient)

What is the purpose of filtering?

Purposes? •minimizing the size of the data for efficiency purposes, •cleaning up underlying data, •removing irrelevant dimension members, and •setting ranges for what you want to analyze. • More? Interaction with the end-user!

What did we do for the in-class design sprint?

Running Design Sprint •Mapping - What's the question? •Sketching - How to sketch ideas? •Deciding - How to decide (as a team)?

What's an example of conjunction of features?

Searching for the red squares is slow because they are identified by a conjunction of shape and color

How long should you spend on each phase?

Sketching is the longest (divergent thinking, many ideas)

When should you start proto-typing?

Start prototyping with a tool only when: •Your sketches reasonably match your "What am I trying to say or show?" statement. •Your sketches are becoming refinements of one idea. •You find yourself focusing on specifics, e.g., designing the charts, focusing on color, titles, and labels. •You feel that you don't have any more ideas.

What does the Marks card do?

The Marks Card is key for visual analysis - it adds context and detail to the marks in the view.

hollow mask illusion

The hollow-face illusion is a well-known example of depth inversion occurring with a real object under a wide variety of viewing conditions. We have the compelling and obligatory impression of a convex face with the nose sticking out

What is the core objective of running a Design Sprint?

The purpose of a design sprint is to test ideas fast through rapid prototyping and group creation. It's a structured way to solve a problem fast and design a new tool or product in under a week.

What is a design sprint?

The sprint is a five-day process for answering critical questions through design, prototyping, and testing ideas

Why is a table calculation called that?

They compute the result based on a virtual table that includes only the numbers on the view. Has a pre-defined calculation.

Why shouldn't you use too much color?

Too Much Color BAD. Use no more than 3-6 at once •Short-term Memory = "small chunks of information" •Too many colors require 1) reusing the same or similar colors and 2) frequent reference to the legend

What are the fundamental phases of a Design Sprint process? USD-PV

Understand -- the team shares insights, brings everyone up to speed, and starts to understand the problem Sketch -- team members collaborate amongst themselves to come up with as many sketches and solutions as they can Decide -- The ideas produced are narrowed down, the team may vote, a decider chooses winner(s), and a visual storyboard starts taking shape. Prototype -- A visual prototype of the product is made Validate -- The prototype is tested to see whether the idea adds any value or is usable.

Waterfall chart

Waterfall Charts are used to visually illustrate how a starting value of something (say, a beginning monthly balance in a checking account) becomes a final value (such as the balance in the account at the end of the month) through a series of intermediate additions (deposits, transfers in) and subtractions. - Used for composition
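
The running-total logic behind a waterfall chart can be sketched as follows: each change becomes a floating bar that spans from the total before it to the total after it. The function name `waterfall_steps` and the checking-account figures are hypothetical.

```python
def waterfall_steps(start, changes):
    """Turn a start value and signed changes into floating-bar positions."""
    steps = []
    running = start
    for label, delta in changes:
        before, running = running, running + delta
        # Each bar spans from the total before the change to the total after.
        steps.append((label, min(before, running), max(before, running), delta))
    return steps, running

# Hypothetical month in a checking account
steps, final = waterfall_steps(1000, [("Deposit", 500), ("Rent", -800), ("Transfer in", 200)])
for label, bottom, top, delta in steps:
    print(f"{label:12s} bar from {bottom} to {top} ({delta:+d})")
print("Ending balance:", final)
```

The start and end values are drawn as full bars from zero; the intermediate bars float, usually colored by the sign of the change.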

What are the components of graphs?

- Scales along axis - Grid lines - Legends - Components representing quantitative values, e.g.,:

Anscombe's Quartet

Four datasets that have nearly identical simple descriptive statistics (mean, variance, correlation, linear regression line), yet appear very different when graphed.
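
The point can be checked numerically. The sketch below uses the published quartet values and computes a few summary statistics per dataset with plain Python; the helper name `summarize` is made up for illustration.

```python
from statistics import mean, variance

X123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
QUARTET = {
    "I":   (X123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II":  (X123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV":  ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
            [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}

def summarize(x, y):
    """Mean and variance of y, Pearson r, and the least-squares line y = a + b*x."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    slope = sxy / sxx
    return {
        "mean_y": my,
        "var_y": variance(y),
        "r": sxy / (sxx * syy) ** 0.5,
        "slope": slope,
        "intercept": my - slope * mx,
    }

for name, (x, y) in QUARTET.items():
    stats = summarize(x, y)
    print(name, {k: round(v, 2) for k, v in stats.items()})
```

The printed summaries agree to about two decimal places, which is why plotting the points is the only way to see how different the four datasets are.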

Name a few pre-attentive attributes:

•Size •Shape •Length & Width •Texture •Elongation •Orientation •Spatial Grouping •Order •Spatial Position •Enclosure •Convex/Concave •Color - Hue - Value/Intensity/Lightness

Getting to the point (Yau)

•Ask the data questions. •What is the structure of the data? •What is the mean and median? •Correlations, relationships, distributions, outliers? •Develop and explore context-specific questions. •Start with the visualization basics. •Focus: Resist the temptation to add so many things that it obscures the original purpose of the visual. •Iterate until you find your unique STORY.

What are visualization goals?

•Consolidate complex stats (make data accessible) to process lots of data simultaneously (see the forest along with the trees) •Taps into brain's "pre-attentive visual processing" - the way we perceive visual distinctiveness (see the hidden wolf in the forest).

What are the four pill types?

•Discrete dimension •Continuous dimension (Numeric) •Discrete measure (Non-Numeric) •Continuous measure

What are visualization goals?

•Record information - Blueprints, photographs, seismographs •Explore/Analyze data or Reveal patterns to - Find the unknown or support reasoning or assess hypotheses - Answer questions (or discover them) - Make decisions - See data in context and discover errors in data - Expand memory •Communicate ideas to others - Sharing and Persuasion - Storytelling and Inspiration -Collaboration and Revision

