Info Viz Midterm
2: Discriminability
How many values (bins, levels) can we distinguish for any given channel? ‒ Must be sufficient for the number of attribute levels to show ‒ Line width: few bins ‒ Rule: the number of available bins should match the number of bins we want to be able to see from the data.
Which of the following is the ideal model for visual encoding?
Hue, Saturation, Luminance
3: HSV/HSL
Hue, Saturation, Value (HSV) ‒ Sometimes Value is replaced by Brightness to become HSB • Hue, Saturation, Lightness (HSL) • Not ideal, but a practically decent color space available for visual encoding
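As a rough illustration of how the same color reads in HSV versus HSL, the sketch below uses Python's standard `colorsys` module. Note that `colorsys` calls the second model HLS and returns its components in hue, lightness, saturation order. The helper name `describe_color` is made up for this example.

```python
import colorsys

def describe_color(r, g, b):
    """Return HSV and HSL readings for an RGB color with channels in 0..1."""
    h, s_v, v = colorsys.rgb_to_hsv(r, g, b)
    # colorsys returns HLS order: hue, lightness, saturation.
    _, l, s_l = colorsys.rgb_to_hls(r, g, b)
    return {
        "hue_deg": round(h * 360),
        "hsv": (round(s_v, 2), round(v, 2)),  # (saturation, value)
        "hsl": (round(s_l, 2), round(l, 2)),  # (saturation, lightness)
    }

pure_red = describe_color(1.0, 0.0, 0.0)
pale_red = describe_color(1.0, 0.5, 0.5)
```

Both colors share the same hue, but the pale red is half-saturated in HSV and fully saturated (at higher lightness) in HSL, so "saturation" does not mean the same thing in the two models.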
Pre-attentive attributes invoke which kind of memory first.
Iconic
Which of the following is not particularly about identifying and removing clutter?
Integrating Text and Graphics
Which of the following is the most important feature of a visualization to answer different questions and possibly generate new questions/answers?
Interactivity
1: Network/Graph
Items (nodes), links, attributes - Nodes (vertices) connected by links (edges) - Users on Facebook, Instagram, TikTok, etc. - Computers on UI System
1: Tables
Items, attributes. Single table - One item per row - Each column is an attribute - Cell holds the value for an item-attribute pair. Multidimensional tables - Indexing based on multiple keys - Genes, patients - Zip codes resident
1: Geometry
Items, positions - Shape of items - Explicit spatial positions/regions - Boundary between computer graphics and visualization
1: Ordinal
Less/greater than defined ex) Grade level, position in race
1: Collections (clusters, lists, sets)
Lists - How we group items · Sets · Lists · Clusters
4: Tableau works better with long or wide data?
Long data
1: Quantitative
Meaningful magnitude, math is possible ex) Height, weight
3: Cognitive Load
Mental load that's required to learn new information
A dataset with multiple tables in it is called
Multi-dimensional Table
3: Color Deficiency, Luminance
Need luminance for edge detection ‒ fine-grained detail only visible through luminance contrast ‒ legible text requires luminance contrast!
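One way to put a number on "luminance contrast" is the relative-luminance and contrast-ratio formula from the WCAG 2.x accessibility guidelines. That formula is not part of these notes; it is swapped in here only as a hedged illustration, and the function names are hypothetical.

```python
def relative_luminance(rgb):
    """Relative luminance of an sRGB color given as 0-255 integers (WCAG 2.x formula)."""
    def linearize(channel):
        c = channel / 255
        return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4
    r, g, b = (linearize(ch) for ch in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b

def contrast_ratio(color_a, color_b):
    """Contrast ratio between two colors; 1 means no luminance contrast at all."""
    lum_a = relative_luminance(color_a)
    lum_b = relative_luminance(color_b)
    lighter, darker = max(lum_a, lum_b), min(lum_a, lum_b)
    return (lighter + 0.05) / (darker + 0.05)

# Black text on white: the maximum possible contrast.
black_on_white = contrast_ratio((0, 0, 0), (255, 255, 255))
# Red text on green: very different hues, but similar luminance.
red_on_green = contrast_ratio((255, 0, 0), (0, 128, 0))
```

The second pair differs strongly in hue yet scores far lower than black on white, which is the point of the card: hue difference alone does not make text legible.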
1: Categorical
Nominal, compare equality, no implicit order ex) Eye color, zipcode
3: Similarity
Objects that are of similar color, shape, size, or orientation are perceived as related or belonging to part of a group. ‒ Figure 3.3, you naturally associate the blue circles together on the left or the grey squares together on the right. ‒ Figure 3.4, similarity of color is a cue for our eyes to read across the rows. This eliminates the need for additional elements such as borders to help direct our attention.
3: Color channels for Ordered attributes
Ordered attributes (ordinal, quantitative) ‒ Luminance preferred • But we can easily distinguish only about 5 different non-contiguous values ‒ Saturation works as well but with care
The university administration is looking at student performance in course X to decide whether to increase the number of sections for that class. The data includes an attribute "Grade", which is the letter grade for each student in the data. Grade is _______ attribute.
Ordinal
3: Bezold Effect
Outlines effect
Which of the following is not a property of the Data Interpreter? Cleaning Excel reports; Tidy formatting of the table data; Pivoting tables
Pivoting tables
3: Pre-attentive Processing
Pre-attentive processing is a subset of Gestalt theory. Pre-attentive attributes like size, color, position, etc. can be leveraged to help direct your audience's attention to where you want them to focus it ex) shape, enclosure, saturation, line width, color, size, markings, orientation, closure, density
3: Gestalt Principles of Visual Perception
Proximity Similarity Enclosure Closure Continuity Connection
Information Viz, in general, is about
visual representation of information/data to help people understand complex phenomenon
Which of the following items is not about channel effectiveness?
Expressiveness
1: Trees
- Special case, no cycles - Often have roots and are directed
Attributes can take different types of values. We discussed X types of attribute values in the class. X is
3
A regular bar chart as follows will always have
1 Key (categorical), 1 Value (quantitative)
4: Story 3 Acts
1. Set up: Context 2. Conflict: Issue 3. Resolution: Call to Action
Which of the following is not a spatiotemporal scenario? A colony of penguins in southern hemisphere; Weather radar; Migration of birds north to south across different seasons; Ball movement in a basketball ga
A colony of penguins in the southern hemisphere
3: Color Space
A color space is a specific organization of colors. ‒ It supports reproducible representations of color, whether such representation entails an analog or a digital representation ‒ Examples: RGB color model, CMYK color model
2: Channel Effectiveness
Accuracy: how precisely/accurately can we tell the difference between encoded items? Discriminability: how many unique steps/values can we perceive? Separability: Is our ability to use this channel affected by another one? ‒ Shape affected by color choices Popout: Can things jump out using this channel? ‒ How easy it is to spot some values from the rest
Which of the following statements are correct about choropleth maps? i) Choropleth maps can have the issue of raw vs. normalized data. ii) The size of a region and its color interfere with our perception. iii) It is recommended to show one attribute at a time
All (i, ii, and iii)
Which of the following is not a con of dot maps?
Avoids region size issue of choropleth maps
Which of the following chart type (Viz Idiom) is a better alternative for Radar Plot?
Bar Chart
3: Color channels for Categorical attributes
Categorical attributes ‒ Hue is a very effective channel for categorical attributes and to show groupings. • Pay attention to the number of colors used: the fewer, the better
1: Attribute Types
Categorical, quantitative, ordinal
Which of Gestalt's principles explains how the brain processes the graph well even without a y-axis or an x-axis line?
Continuity
1: Spatial and Fields
Continuous Items, positions - Attribute values associated with cells - Cell contains value from continuous domain - Ex. Temperature, pressure, wind velocity
3: Tufte's Data Density
Data density of a graphic = number of entries in data matrix (table) / area of data graphic • Higher data density is usually preferred. • Edward Tufte refers to maximizing the data-ink ratio, saying "the larger the share of a graphic's ink devoted to data, the better (other relevant matters being equal)." ‒ This can also be referred to as maximizing the signal-to-noise ratio • where the signal is the information we want to communicate, and • the noise are those elements that either don't add to, or in some cases detract from, the message we are trying to impart to our audience
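The data density ratio can be worked through with a few lines of Python. This is only a sketch of the formula on the card: the function name, the choice of square inches as the area unit, and the example numbers are all assumptions made for illustration.

```python
def data_density(n_rows, n_cols, width, height):
    """Entries in the data matrix divided by the area of the graphic."""
    entries = n_rows * n_cols   # size of the data matrix (table)
    area = width * height       # area of the graphic, e.g. in square inches
    if area <= 0:
        raise ValueError("graphic area must be positive")
    return entries / area

# A 4 x 2 table drawn on a 10 x 8 inch chart: 8 entries over 80 square inches.
sparse_chart = data_density(4, 2, 10, 8)
# A 50 x 4 table on the same chart: 200 entries over 80 square inches.
dense_chart = data_density(50, 4, 10, 8)
```

Here `sparse_chart` is 0.1 and `dense_chart` is 2.5 entries per unit of area, so the second graphic has the higher data density.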
3: Decompose color into 3 channels
Decompose into three channels ‒ ordered can show magnitude • luminance: how bright (B/W) • saturation: how colorful ‒ categorical can show identity • hue: what color
3: Signal vs. Load
Edward Tufte refers to maximizing the data-ink ratio, saying "the larger the share of a graphic's ink devoted to data, the better (other relevant matters being equal)." This can also be referred to as maximizing the signal-to-noise ratio • The signal is the information we want to communicate, and • the noise are those elements that either don't add to, or in some cases detract from, the message we are trying to impart to our audience
3: Explanatory Analysis
Explanatory analysis focuses on specific things you want to explain, a specific story you want to tell about that data ‒ 99.9 % of the time, we should resist the urge to show the exploratory analysis and focus on the main story (explanatory) of our data
3: Exploratory Analysis
Exploratory analysis is what you do to understand the data and figure out what might be noteworthy or interesting to highlight to others. ‒ It is an introductory step to look for patterns, trends, a story in your data. ‒ Once you have something valuable and interesting to share, we move to the Explanatory space.
A Key (an independent attribute) can be quantitative, ordinal, or categorical.
False
Color and Shape are more easily separable than Color and Position.
False
Humans perceive the area of a region as well as they perceive the length of a bar/line.
False
Summary statistics show all the important characteristics of data/numbers. (t/f)
False
When a data file is constantly updated with new data, it is called static data.
False
When do we prefer a continuous sequential single-hue palette over a segmented single-hue palette?
Focused on the overall trend/story and not the precise estimates of values
4: Full Outer Joins
Full outer join returns all records from both tables
2: Accuracy
Fundamental theory (Stevens' psychophysical power law): - length is accurate: linear (N = 1) - other channels are magnified or compressed - the exponent (N) characterizes by how much. Factors affecting accuracy: - alignment - distractors - distance - common scale/alignment
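The role of the exponent can be sketched numerically with Stevens' power law, where perceived magnitude grows as the physical magnitude raised to the power N. The exponents below are approximate, commonly quoted values and are assumptions, not figures taken from these notes; the names are hypothetical.

```python
# Approximate, commonly quoted exponents (assumed for illustration only).
EXPONENTS = {"length": 1.0, "area": 0.7, "electric shock": 3.5}

def perceived_ratio(actual_ratio, channel):
    """How large a change feels when the stimulus is multiplied by actual_ratio."""
    return actual_ratio ** EXPONENTS[channel]

# Doubling the stimulus in each channel:
doubled = {channel: perceived_ratio(2, channel) for channel in EXPONENTS}
```

With N = 1 a doubled length feels doubled; with N below 1 (area) the change feels smaller than it is (compressed); with N above 1 (electric shock) it feels larger than it is (magnified).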
1: Link
Relationship between items
Which of the following is not an example of a Radial Orientation?
Scatterplot Matrix
Design a color scheme that shows as much detail as possible about patterns in employment rates across states in the USA map. Which of the following is a suitable palette option for this task?
Segmented Sequential single hue palette
Which of the following is not necessarily part of learning Info Viz?
Senses of hearing and touch
2: Separable vs. Integral
Separable: can judge each channel individually Integral: two channels are viewed holistically
You are given data with the following data types: Positions, Attributes, and Grids. What is the most suitable visual representation for this data?
Spatial (fields)
Which of the following is not Gestalt's Principle of Visual Perception?
Spatial Awareness
4: Spatial Data
Spatial data is data about ‒ the spatial arrangement (position: latitude & longitude) and ‒ shape of objects • state borders on a map, shape of a brain region, movement of wind, etc.
4: Robert McGee
Storytelling
1: Dataset Types
Tables, network/graph, trees, spatial and fields, geometry, clusters lists sets
2: Effectiveness Principle
The importance of the information should match the salience (prominence) of the channel. Some channels are better than others - length is better than curve
3: Continuity
The principle of continuity is similar to closure: when looking at objects, our eyes seek the smoothest path and naturally create continuity in what we see even where it may not explicitly exist. ‒ By way of example, in Figure 3.9, if I take the objects (1) and pull them apart, most people will expect to see what is shown next (2), whereas it could as easily be what is shown after that (3). We've removed the vertical y-axis line from the graph in Figure 3.10 altogether. ‒ Your eyes actually still see that the bars are lined up at the same point because of the consistent white space (the smoothest path) between the labels on the left and the data on the right
2: Visual Encoding
The way in which data is mapped to visual structures Typically by mapping - data items to visual marks and - data attributes to visual channels
Which of the following visual channel is usually not used for symbol maps?
Tilt/Angle
It is acceptable to show a low-luminance graph on a high-luminance background.
True
Line graphs can only have one ordinal Key and one Quantitative Value.
True
Which of the following is not a major use case of visualizations?
Using Viz for a problem which has a fully automated solution
2: Expressiveness Principle
Visual information should express all and only the information in the data, in the right format. Match channel type to data type - ordered data (ordinal) should not appear as unordered (categorical)
3: Proximity
We tend to think of objects that are physically close together as belonging to part of a group. ‒ Figure 3.1: you naturally see the dots as three distinct groups because of their relative proximity to each other
3: Connection
We tend to think of objects that are physically connected as part of a group. ‒ The connective property typically has a stronger associative value than similar color, size, or shape. ‒ Note when looking at Figure 3.11, your eyes probably pair the shapes connected by lines (rather than similar color, size, or shape): that's the connection principle in action
3: White Space
White space is incredibly important in visualizations
Data Abstraction is
about putting a structure on data that is useful for Viz design.
2: Channels
are visual variables we can use to represent attributes of these objects (items). Change the appearance of marks based on attributes ‒ position, shape, color, size, volume, tilt
2: Grouping
containment; connection; proximity - same spatial region; similarity - same values as other categorical channels
Which of the following is correct about designing for color deficiency? i) Use shape for encoding ii) Use luminance effectively iii) Focus on encoding with hue mostly
i and ii
Which of the following statements is correct? i) Marks are basic geometric objects that represent data items or links ii) Channels are visual variables that are used to represent data attributes iii) Visual Encoding is the process of deciding only the visual channels for data.
i and ii
Which of the following statements is correct? i) If you pivot two or more columns, it doesn't matter how many more than two; the output will always be two new columns. ii) Going from long to wide data means we will have more number of columns. iii) Tableau works better with wide data
i and ii
According to Theory, which of the following statements is true? i) Human perception of length is (almost) perfect. ii) Human perception of area is almost as good as it is for lengths. iii) Human perception of Electrical shock is underestimated.
i only
Which of the following statements is correct about visual channels? i) Different types of channels must be used for Ordered and Categorical attributes. ii) The effectiveness principle is about making sure the right channel type is used for the given data type. iii) Spatial channels are not good for Human perception.
i only
Which of the following statements is correct? (i) Exploratory analysis is the focus of our course. (ii) Summarizing the data for the visualization should not be our default step. (iii) Knowing your audience is an integral part of Viz Design.
ii and iii
Tufte recommends higher data density in a graphic because (i) it's aesthetically pleasing (ii) more data is always better (iii) it leads to a better signal-to-noise ratio
iii only
3: Visualization Guidelines
importance of context; show the data - do not over-aggregate; identify and reduce clutter; focus your audience's attention - pre-attentive processing; integrate the graphs and text; start with gray
1: Item
individual entity, discrete ‒ e.g. patient, car, stock, city ‒ a row in the data table
Attribute is
something that is measured or observed about an item
2: Judgements- Relative vs. Absolute
perceptual system mostly operates with relative judgements, not absolute ‒ that's why accuracy increases with common frame/scale and alignment
1: Attribute
property that is measured, observed, logged... ‒ e.g. height, blood pressure for patient, horsepower, make for car ‒ A column in the data table
Text annotations and explainers on graphs are particularly useful when the audience is
public
2: Marks
the basic visual objects/units that represent data objects (items, links) visually ‒ basic geometric elements: points, lines, areas, volume
Visual Reasoning is usually faster and more reliable than mental reasoning because
visual representations replace cognition with perception, which is quite fast for humans
3: Long-Term Memory
‒ After short-term memory, info either gets lost forever or goes to long-term memory. ‒ Long-term memory is built up over a lifetime and is vitally important for pattern recognition and general cognitive processing
3: Short-Term Memory
‒ Limited, can keep about four pieces of visual info at a given time ‒ A busy graph runs the risk of losing audience's attention. ‒ Focus on larger, coherent chunks of info to fit them into finite space in our audience's working memory
3: Iconic Memory
‒ Super fast, as we look around us. ‒ Information stays for a second before moving to short-term memory. ‒ Iconic memory is tuned to a set of pre-attentive attributes, exploit it for visual design
4: Tidy Data
• Each attribute (variable) should be in one column • Each different observation of that attribute should be in a different row • If you have multiple tables, they should include a column in each table that allows them to be joined
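A wide table that spreads one attribute across several columns breaks the first tidy-data rule, and pivoting it to long form fixes that. The sketch below is a plain-Python stand-in for a pivot step; the function, the field names `Name` and `Value`, and the sample numbers are hypothetical and not taken from the course or from any tool.

```python
def wide_to_long(rows, id_field, name_field="Name", value_field="Value"):
    """Pivot every column except id_field into (name, value) pairs, one per row."""
    long_rows = []
    for row in rows:
        for column, value in row.items():
            if column == id_field:
                continue  # the identifier is repeated on every output row
            long_rows.append({
                id_field: row[id_field],
                name_field: column,
                value_field: value,
            })
    return long_rows

# Wide: one column per year (made-up numbers).
wide = [
    {"State": "IL", "2020": 5, "2021": 7, "2022": 6},
    {"State": "WI", "2020": 3, "2021": 4, "2022": 8},
]
long_table = wide_to_long(wide, "State")
```

However many year columns are pivoted, the output gains exactly two new columns (the name column and the value column), and the table gets longer instead of wider.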
4: Continuous
• Forming an unbroken whole, without interruption • Example: response time in seconds. We could record 1.64 seconds or 1.642378765 seconds
4: Discrete
• Individually separate and distinct • Example: a household could have 3 or 6 children, but not 4.72!
4: Left and Right Joins
• Left join returns all rows from the left table and only matching rows from the right table • Right join changes the direction of the join: it returns all rows from the right table and only matching rows from the left table
3: Color channels for Ordered: Sequential
• Ordered attribute that is sequential ‒ E.g. years, age, height, etc • Use saturation or luminance
3: Color channels for Ordered: Diverging
• Ordered attribute with diverging values ‒ E.g. temperature, profit, etc. • Needed when data has meaningful midpoint ‒ Use neutral color for midpoint • White, grey ‒ Use saturated colors for endpoints.
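A diverging scale of this kind can be sketched as two linear ramps that meet at a neutral midpoint. The RGB endpoints below (a blue, a near-white, a red) and the function names are arbitrary choices for illustration, and linear interpolation in RGB is a simplification, not a perceptually uniform method.

```python
def lerp_color(start, end, t):
    """Linearly interpolate between two RGB tuples; t runs from 0 to 1."""
    return tuple(round(s + t * (e - s)) for s, e in zip(start, end))

def diverging_color(value, vmin, midpoint, vmax,
                    low=(33, 102, 172), neutral=(247, 247, 247), high=(178, 24, 43)):
    """Map a value to a color: saturated at the endpoints, neutral at the midpoint."""
    if not vmin < midpoint < vmax:
        raise ValueError("expected vmin < midpoint < vmax")
    value = max(vmin, min(vmax, value))  # clamp out-of-range values
    if value < midpoint:
        t = (midpoint - value) / (midpoint - vmin)
        return lerp_color(neutral, low, t)
    t = (value - midpoint) / (vmax - midpoint)
    return lerp_color(neutral, high, t)

# Profit from -10 (loss) to +20 (gain), with break-even at 0 as the midpoint.
break_even = diverging_color(0, -10, 0, 20)
worst_loss = diverging_color(-10, -10, 0, 20)
best_gain = diverging_color(20, -10, 0, 20)
```

The midpoint does not need to sit halfway between the minimum and maximum; each side gets its own ramp, so break-even stays neutral even when gains run larger than losses.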
3: RGB
• RGB good for hardware display • But not suitable for visualization ‒ RGB does not match human perceptual channels ‒ Color intensity does not change uniformly
4: Relationships
• Relationships are a dynamic, flexible way to combine data from multiple tables for analysis. • A relationship describes how two tables relate to each other, based on common fields, but doesn't merge the tables together. ‒ When a relationship is created between tables, the tables remain separate, maintaining their individual level of detail and domains. • Think of a relationship as a contract between two tables
4: Inner Join
• Returns only rows that occur in both tables using the common Product column • Inner join is the default join in Tableau
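The join types in these notes (inner, left, right, full outer) differ only in which unmatched rows they keep. The sketch below makes that concrete with plain lists of dictionaries; it is not how Tableau implements joins, the table contents are invented, and unmatched rows simply lack the other table's fields where a real tool would show nulls.

```python
def join(left, right, key, how="inner"):
    """Join two lists of dict rows on key. how: 'inner', 'left', 'right' or 'full'."""
    right_by_key = {}
    for row in right:
        right_by_key.setdefault(row[key], []).append(row)

    result, matched_keys = [], set()
    for left_row in left:
        matches = right_by_key.get(left_row[key], [])
        if matches:
            matched_keys.add(left_row[key])
            result.extend({**left_row, **right_row} for right_row in matches)
        elif how in ("left", "full"):
            result.append(dict(left_row))  # keep unmatched left rows
    if how in ("right", "full"):
        for k, rows in right_by_key.items():
            if k not in matched_keys:
                result.extend(dict(r) for r in rows)  # keep unmatched right rows
    return result

sales = [{"Product": "A", "Units": 3}, {"Product": "B", "Units": 5}]
prices = [{"Product": "B", "Price": 2.0}, {"Product": "C", "Price": 9.0}]

inner = join(sales, prices, "Product", "inner")
left_join = join(sales, prices, "Product", "left")
right_join = join(sales, prices, "Product", "right")
full = join(sales, prices, "Product", "full")
```

Only product B appears in both tables, so the inner join returns one row, the left and right joins return two rows each, and the full outer join returns three.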
3: Interaction Techniques by Intent
• Select: mark something as interesting • Explore: show me something else • Reconfigure: show me a different arrangement • Encode: show me a different representation • Abstract/Elaborate: show me more or less detail • Filter: show me something conditionally • Connect: show me related items
3: Interaction Techniques
• Selection/Hovering • Link and Brush • Dynamic Filtering • Re-Encoding • Pan+Zoom • Sorting • Change Parameters
4: Thematic Maps
• Show spatial variability of attribute ("theme") ‒ combine geographic / reference map with (simple, flat) tabular data ‒ join together • region: interlocking area marks (provinces, countries with outline shapes) ― also could have point marks (cities, locations with 2D lat/lon coords) • region: categorical key attribute in table ― use to look up value attributes • major idioms ‒ choropleth ‒ symbol maps ‒ cartograms ‒ dot maps ‒ density maps
3: CIELAB
• The CIELAB color space, also referred to as L*a*b*, is a color space defined by the International Commission on Illumination ‒ It expresses color as three values: L* for perceptual lightness and a* and b* for the four unique colors of human vision: red, green, blue and yellow. • Better for edge detection • Poor for visual encoding
4: Union
• Union is another method for combining two or more tables by appending rows of data from one table to another. • Ideally, the tables that you union have the same number of fields, and those fields have matching names and data types.
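Where a join adds columns, a union stacks rows. A minimal sketch, assuming tables are lists of dictionaries: this version is stricter than the "ideally" in the notes and refuses tables whose field names differ, which is a design choice made for the example and not a description of any tool's behavior.

```python
def union(*tables):
    """Append the rows of several tables that share the same field names."""
    rows = [row for table in tables for row in table]
    field_sets = {frozenset(row) for row in rows}  # distinct sets of field names
    if len(field_sets) > 1:
        raise ValueError("tables must share the same field names")
    return rows

# Two hypothetical quarterly extracts with identical fields.
q1 = [{"Product": "A", "Units": 3}]
q2 = [{"Product": "B", "Units": 5}, {"Product": "A", "Units": 4}]
combined = union(q1, q2)
```

The result has the same columns as the inputs and as many rows as the inputs have in total.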
3: Luminance (L*), Saturation (S) and Hue (H)
• Very good for visual encoding • But it is not standard graphics/tools color space yet
3: Color Deficiency
• perceptual processing before optic nerve ‒ one achromatic luminance channel (L*) • Edge detection through luminance contrast ‒ 2 chroma channels • red-green (a*) & yellow-blue axis (b*) • "colorblind": degraded acuity, one axis ‒ 8% of men are red/green color deficient ‒ blue/yellow is rare • avoid encoding by hue alone