UIUC CS 498 Data Visualization Midterm

Suppose you know you have decimal number data that ranges in value from 0 to 10. If you separate the range into five equal bins, what histogram would result from the data: 1.1, 1.2, 2.1, 2.2, 2.3, 3.1, 6.1, 8.1, 8.2? 2, 3, 1, 1, 2 2, 3, 1, 0, 3 2, 4, 0, 1, 2 2, 3, 1, 3, 0

2, 4, 0, 1, 2 The five equal bins are from 0-2, 2-4, 4-6, 6-8 and 8-10. (Fortunately we don't have to worry about values on the bin boundaries.) The histogram counts the number of values in each bin. Bin 0-2 counts two values: 1.1 and 1.2. Bin 2-4 counts four values: 2.1, 2.2, 2.3 and 3.1. Bin 4-6 counts no values. Bin 6-8 counts one value: 6.1. Bin 8-10 counts two values: 8.1 and 8.2.
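The binning can be checked with a short sketch. The helper name `histogram` is made up for this illustration, and it assumes every value lies within the stated range and that a value on the top edge goes into the last bin.

```python
def histogram(values, low, high, num_bins):
    """Count how many values fall into each of num_bins equal-width bins."""
    width = (high - low) / num_bins
    counts = [0] * num_bins
    for v in values:
        # Bin i covers [low + i*width, low + (i+1)*width).
        # Assumption: a value equal to `high` is clamped into the last bin.
        index = min(int((v - low) / width), num_bins - 1)
        counts[index] += 1
    return counts


data = [1.1, 1.2, 2.1, 2.2, 2.3, 3.1, 6.1, 8.1, 8.2]
counts = histogram(data, 0, 10, 5)
print(counts)  # [2, 4, 0, 1, 2]
```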

How many items can human working memory (short-term memory) typically hold? 3-7 items 30-70 items 300-700 items 3,000-7,000 items

3-7 items Our working memory can only hold 3-7 items at a time, though a single item in our working memory can be a collection of items in our long-term memory.

If you have a table with five fields and ten records, and you pivot three of the fields, how many records does the resulting table have? 7 records 10 records 13 records 30 records

30 records When you pivot three fields, you replace each record in the previous table with three records in the new table. Each of those three new records drops the three pivoted fields and instead has one new field indicating the previous record's field name and a second new field indicating the previous record's field value. Since each record has been replaced by three new records, 10 records become 30 records.
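A minimal sketch of the pivot, assuming records are plain dictionaries. The function `pivot`, the field names, and the sample values are all invented for illustration and are not taken from the course material.

```python
def pivot(records, pivot_fields, name_field="Field Name", value_field="Field Value"):
    """Replace each record with one record per pivoted field."""
    pivoted = []
    for record in records:
        # Fields that are not pivoted are copied into every new record.
        kept = {k: v for k, v in record.items() if k not in pivot_fields}
        for field in pivot_fields:
            new_record = dict(kept)
            new_record[name_field] = field           # the old field's name
            new_record[value_field] = record[field]  # the old field's value
            pivoted.append(new_record)
    return pivoted


# Ten hypothetical records with five fields each.
records = [
    {"Country": f"C{i}", "Region": "R", "Y1980": i, "Y1990": i + 1, "Y2000": i + 2}
    for i in range(10)
]
long_table = pivot(records, ["Y1980", "Y1990", "Y2000"])
print(len(long_table))     # 30
print(len(long_table[0]))  # 4
```

Each new record keeps the two unpivoted fields and gains the name and value fields, so the table goes from ten records of five fields to thirty records of four fields.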

Why is filtering, which removes information from the display, an important part of the information visualization process? Filtering reduces the number of records processed which makes the visualization respond faster. Filtering removes data points which makes the visualization render faster. Removing information from a chart does not make the chart more effective. A chart can sometimes contain too many visual elements that can be overwhelming, obscuring and distracting.

A chart can sometimes contain too many visual elements that can be overwhelming, obscuring and distracting. Indeed, filtering can improve focus on a particular subset of the data of interest, and this is an important part of the information visualization process.

Which of these best demonstrates a cartogram? A geographical map where the size of countries is proportional to their population. A geographical map where the color of countries is assigned according to their population. A visualization where the population of countries is shown as the area of disks, which are packed in an arbitrary order. A visualization where the names of the countries are scaled by their population and packed in an arbitrary order.

A geographical map where the size of countries is proportional to their population. Correct! A cartogram is a distortion of a map of regions by data values associated with those regions.

Which best characterizes a social network? A graph with fewer low degree nodes and many high degree nodes A graph with many low degree nodes and fewer high degree nodes A graph with only low degree nodes A graph with similar numbers of low degree and high degree nodes

A graph with many low degree nodes and fewer high degree nodes

What is the main advantage to data visualization of using a lens for zooming? The lens zooms only a portion of the display which is faster than zooming the entire display. The lens zooms only a portion of the data which is faster than zooming all of the data. A lens separates nearby data items to help resolve detail while retaining the spatial context of the whole dataset. A lens provides easy access to changing the magnification of the zoom.

A lens separates nearby data items to help resolve detail while retaining the spatial context of the whole dataset. The lens has a visible location in the display area of the chart which provides important context for the area being zoomed.

Suppose you are given the following sales dataset sampling the daily sales at four different dates of the year. Which chart below best communicates the sales data? A scatter plot of four disjoint datapoints plotted with a vertical quantitative continuous sales axis and a horizontal quantitative continuous date axis. A bar chart where the height of four evenly spaced bars (one for each date) is plotted on a quantitative continuous vertical axis. A bar chart in three-dimensional perspective, where the height of four evenly spaced bars (one for each date) is plotted on a quantitative continuous vertical axis, plotted such that the line of sight positions the earlier data in front and the later data further away. A line chart connecting vertices plotted with a vertical quantitative continuous sales axis and a horizontal quantitative continuous date axis.

A line chart connecting vertices plotted with a vertical quantitative continuous sales axis and a horizontal quantitative continuous date axis. Both sales and dates are quantitative continuous. The dates are not evenly spaced, so plotting dates along a quantitative continuous axis accurately spaces them according to the duration between the sales data points. The lines connecting the data points reasonably communicate that sale dollar amounts would be continuous between the dates of the data points, and are approximated by linear interpolation.

Which of the following would effectively visualize these four fields: Year, Country Name, Region, Population? A table of bar charts. A scatterplot of tables A table of tables A table of scatterplots

A table of bar charts. Correct. For example, the rows could be [Region][Country Name] and the columns could be [Year][Population] which would produce a table pane for each Country Name and Year, and this pane would hold a single horizontal bar indicating the population of that country that year.

On which of these colors does the human eye have the most difficulty focusing? Blue Green Yellow Red

Blue Because of the chromatic aberration of the eye's lens, the blue end of the optical spectrum of light tends to focus off the retina. If you have sharp details that need to be displayed in a shade of blue, try to avoid pure blue hues.

A light gray box drawn on top of a dark gray background will make the light gray box appear ______________. The same as it appears on a white background Darker Brighter

Brighter The dark gray background will make the light gray box appear even brighter because the human visual system's lateral inhibition will detect and accentuate the difference.

Given a plot of life expectancy based on country and birth year, you look up your country and birth year, find the displayed life expectancy, and conclude you will probably live that long. This is an example of _________________. Abductive reasoning Inductive reasoning Deductive reasoning Subductive reasoning

Deductive reasoning This is an example of deductive reasoning because we are drawing the conclusion implied by the given data.

When creating an overview visualization of a large dataset, it is most important to: Use many different colors to make it appealing to draw the viewer in to investigate further Display only an important subset of the datapoints so as to not overwhelm the user Pack as many details as possible into the display to be as efficient and informative as possible Display all of the data using a simple representation and axes that spread the data out as much as possible

Display all of the data using a simple representation and axes that spread the data out as much as possible The goal of an overview is to allow the user to get their head around all of the data, without overwhelming the user with details.

Which of these is the least important criterion when visually ordering the elements of a chart? Displaying plotted measure values in order from smaller to larger to understand their extremes. Listing ordinal field values in order to make them easier to find in a list. Clustering data based on similarity of one or more fields. Displaying field values in database record order to facilitate interactivity through more rapid access times.

Displaying field values in database record order to facilitate interactivity through more rapid access times. The speed advantage for this ordering is likely negligible, and the user may infer some importance to the ordering.

Which of the following field dragging operations would VizQL infer should result in a scatterplot? Dragging population to the rows and per-capita income to the columns Dragging per-capita income to the rows and year to the columns. Dragging year to the rows and country to the columns. Dragging country to the rows and population to the columns.

Dragging population to the rows and per-capita income to the columns Both population and per-capita income are quantitative measures and so would result in a scatterplot. Initially this scatterplot would likely consist of one point corresponding to an aggregation across all of the dimensions to find the world population and the world per-capita income, both over all time. Disaggregating a dimension, e.g. by dragging its field to the detail shelf, separates the single point into a constellation of data points, each representing a separate measurement along that dimension.

When visualizing data, you should keep your eyes focused on one point for the entire duration of the visualization. False, because your visual system will play tricks on your perception of the data. True, because your visual system will better detect any changes to datapoints during the visualization.

False, because your visual system will play tricks on your perception of the data. As we showed in the slides, focusing on a single point causes a temporal inhibition in the light sensors and can play tricks on your perception.

Which one of the 3-D depth cues below indicates surface orientation? Occlusion Stereopsis Illumination Shadowing

Illumination Occlusion and shadowing only indicate the surface closest to the observer (or light source), and stereopsis provides relative cues of distance across an image, whereas the illumination of a surface changes based on how the surface is facing the light source (and for specular reflection, the viewer).

Which one of the following is NOT an important part of the "process and provenance" of interactive dynamics of visualization as outlined by Heer and Shneiderman when documenting your visualization? Guiding someone else through a visualization story. Indicating how to change the view e.g. from one chart type to another. Recording previous charts the user has tried. Sharing a visualization online so others can see.

Indicating how to change the view e.g. from one chart type to another. This is an aspect of view manipulation and is indeed not part of the "process and provenance" used for documenting your visualization experience.

In the Data Visualization Framework, what does the Mapping Layer do? It ensures the data is in the proper format for visualization. It is the geographical map layer underneath a layer containing a chart of locations. It maps user interaction into chart actions. It associates geometry with data.

It associates geometry with data. Correct! The mapping layer associates appropriate geometry with corresponding data channels.

What does "brushing" mean in a dashboard visualization? User selection of colors for chart elements to make them more semantically meaningful. Filtering every other data item from view. Selecting data points by sweeping a large circular cursor over them. Manipulating a filter in one chart to see its effect in other charts.

Manipulating a filter in one chart to see its effect in other charts. Indeed, the manipulation often feels like painting one section of the chart and seeing its effects on another, through an "action" that relates the selections and filters between multiple charts.

Which one of the 3-D depth cues below is the strongest? Occlusion Shadowing Lighting Stereopsis

Occlusion Occlusion is the strongest cue, because if a point on object A and object B project to the same point on the image plane, the fact that you see object A and not object B at that point provides incontrovertible evidence of a depth ordering that A is closer than B.

Which one of the following is NOT an element of vector graphics? Pixels Fills Strokes Vertices

Pixels While vector graphics are sometimes converted to a rectilinear raster graphics array of pixel color values for display on a raster device, some applications, such as a plotter or a laser display, work directly from the vector graphics specification.

Which of these is the best example of providing details on demand for the purpose of information visualization? Interviewing the intended dashboard user to determine which information is important to display. Placing the mouse pointer over a datapoint brings up a popup window with more information about the datapoint. Clicking on a datapoint to remove it from the chart to make room for a label. Progressively adding more detail to an overview visualization at the rate of one item every 10 seconds.

Placing the mouse pointer over a datapoint brings up a popup window with more information about the datapoint. This was the example used to demonstrate details on demand. Another good example would be selecting a data point to fill a neighboring window with further data about the data point, or clicking on a datapoint to go to a second screen showing further data on it.

Which of these mappings is the best choice if you want to visualize a data value but you don't know if the value corresponds to a shirt's price, size, or color? Length Color Area Position

Position Correct! Position is the most effective perceptual mapping for quantitative (price), ordered (size), or nominal (color) values.

Which clause of an SQL query corresponds to fields dragged onto Tableau's filter shelf? The "Where" clause. The "From" clause. The "Order" clause. No clause, and the fields would follow the "Select" statement as the fields that should be queried so they can be filtered from the result.

The "Where" clause. In SQL, the "Where" clause indicates a filter for the query.

Which one of these Tableau functions is not an aggregation of a measure that projects the values of the measure along one or more dimensions into a single value? The attribute function ATTR() The sum function SUM() The minimum function MIN() The function COUNT() that counts the number of values of a measure

The attribute function ATTR() The attribute function checks to make sure that only one value for a measure is reported by the query. (However, that one value could be reported multiple times.) The attribute function is not projecting multiple values into a single value as an aggregation would, because if there were multiple different values present, the attribute function would result in the non-numeric asterisk "*" character.
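The contrast with a true aggregation can be sketched as follows. The function `attr` below is an illustrative stand-in, not Tableau's implementation, and returning `None` for an empty input is an assumption of this sketch.

```python
def attr(values):
    """Return the single shared value, or "*" if the values differ."""
    distinct = set(values)
    if not distinct:
        return None  # Assumption: no values at all yields no result.
    if len(distinct) == 1:
        return distinct.pop()  # One value, possibly reported many times.
    return "*"  # Several different values: nothing is combined.


print(attr([5, 5, 5]))  # 5
print(attr([5, 6]))     # *
print(sum([5, 6]))      # 11
```

Unlike `sum`, which always folds the values into one number, `attr` only passes through a value that is already unique.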

You're given two circles of the same size. The left one is surrounded by smaller circles and the right one is surrounded by larger circles. Which circle appears larger? The left one. The right one. Neither.

The left one. The perceptual processing of the human visual system is designed not to ignore differences but to accentuate them, so the circle surrounded by smaller circles appears larger by contrast.

Which of these choices is the most perceptually accurate way to map a quantitative value? The length of a bar in a bar chart The gray level of a bar in a bar chart The area of a bar in a bar chart The volume of a box in a 3-D bar chart

The length of a bar in a bar chart Correct! Length is more perceptually accurate at mapping quantitative values than the other choices.

Which best characterizes the betweenness centrality of a node? The inverse of the distance to the farthest node The number of nodes connected to a node by some path The portion of all the shortest paths between any two nodes that pass through the node The total distance to all other nodes

The portion of all the shortest paths between any two nodes that pass through the node Correct! The betweenness centrality (BC) is high for a node when it is visited often along the shortest path between other nodes, and can be used to simplify a graph (by removing edges between low BC nodes) or to reveal communities (by removing high BC nodes).
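As a rough sketch, betweenness centrality on a small unweighted, undirected graph can be computed with Brandes' accumulation scheme. The adjacency-dictionary format, the function name `betweenness`, and the normalization by the number of node pairs are choices made for this illustration.

```python
from collections import deque


def betweenness(graph):
    """Betweenness centrality for an undirected, unweighted graph.

    graph maps each node to a list of its neighbors.
    """
    bc = {v: 0.0 for v in graph}
    for s in graph:
        order = []                       # nodes in order of discovery
        preds = {v: [] for v in graph}   # predecessors on shortest paths
        sigma = {v: 0 for v in graph}    # number of shortest paths from s
        dist = {v: -1 for v in graph}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:                     # breadth-first search from s
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in graph}
        for w in reversed(order):        # accumulate from farthest to nearest
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    n = len(graph)
    pairs = (n - 1) * (n - 2) / 2        # pairs of other nodes
    # Each unordered pair was counted from both endpoints, hence the / 2.
    return {v: (c / 2) / pairs if pairs else 0.0 for v, c in bc.items()}


# A star: every shortest path between two leaves passes through the hub.
star = {"hub": ["a", "b", "c", "d"], "a": ["hub"], "b": ["hub"], "c": ["hub"], "d": ["hub"]}
scores = betweenness(star)
print(scores["hub"], scores["a"])  # 1.0 0.0
```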

What is crossfiltering? Filtering the cross product of two fields. Filtering one field but observing the effects in a second correlated field. A selection region created by dragging the mouse diagonally in the shape of a "red cross" thickened plus sign instead of the usual triangle, designed to capture more datapoints along the field axes and less along the diagonals where the fields depend more on each other. The same filter is used for multiple charts.

The same filter is used for multiple charts. The benefit is that controlling the filter through one chart has its effects realized in a second chart.

Recall that we had a Tableau table of regions that displayed the calculated field [Regional Total] which was defined as "{Include [Year] : SUM([Value])}." The dimension [Year] was filtered to include only years from a particular decade. Thus the AVG([Regional Total]) reported the average over a decade of years of the regional total of the value. What would be reported by MAX( {Include [Country] : AVG([Value]) } )? The values would be averaged over all countries in the region for each year, and the largest year's value would be returned. The values would be averaged over the decade for each country in the world, and the country's value that is the largest would be returned. The values would be averaged for each year for each country in the region, and the value for the country in a year that is the largest would be returned. The values would be averaged over a decade for each country in the region, and the country's decade-average value that is the largest would be returned.

The values would be averaged over a decade for each country in the region, and the country's decade-average value that is the largest would be returned. The simple aggregation AVG([Value]) would have returned the average value for all countries in the region for each year. The LOD expression "Include [Country]" means that the country dimension is no longer part of the AVG([Value]) aggregation, so a separate AVG() aggregation is computed for each country that computes its average value over a decade, and then the maximum of those decade averages computed for each country is returned.
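The two-stage aggregation can be mimicked outside Tableau. In this sketch the rows stand for one region after the decade filter has been applied; the countries, years, and values are invented, and `max_of_country_averages` is a hypothetical helper name.

```python
from collections import defaultdict

# (country, year, value) rows for a single region, already filtered to the decade.
rows = [
    ("A", 1990, 10.0), ("A", 1991, 20.0),
    ("B", 1990, 40.0), ("B", 1991, 20.0),
    ("C", 1990, 5.0), ("C", 1991, 15.0),
]


def max_of_country_averages(rows):
    """Average the values per country, then take the largest average."""
    by_country = defaultdict(list)
    for country, _year, value in rows:
        by_country[country].append(value)
    # Inner step: one average per country over the filtered years.
    country_avg = {c: sum(vals) / len(vals) for c, vals in by_country.items()}
    # Outer step: the maximum of those per-country averages.
    return max(country_avg.values()), country_avg


result, per_country = max_of_country_averages(rows)
print(per_country)  # {'A': 15.0, 'B': 30.0, 'C': 10.0}
print(result)       # 30.0
```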

Which best characterizes a tree map? The visualization of a tree of quantitative values as a hierarchy of rectangles where rectangle area corresponds to data value The simplification of a tree to its essential backbone by removing nodes with low betweenness centrality. The regions on a geographical choropleth map shaded green to indicate forests. The visualization of a tree where the root node is placed at the center of a circle and the leaf nodes are positioned on the circle.

The visualization of a tree of quantitative values as a hierarchy of rectangles where rectangle area corresponds to data value Correct! The rectangles corresponding to sibling nodes fit inside the rectangle corresponding to their parent node.

Which of these best characterizes what principal component analysis (PCA) is used for? To find in which dimension of a high dimensional dataset the data varies the most To find the best way to run an elementary school To find low dimensional structure in a high dimensional dataset To increase the dimensionality of a low dimension dataset to reveal hidden structure

To find low dimensional structure in a high dimensional dataset Correct! PCA uses the eigenvectors corresponding to the largest eigenvalues of the covariance matrix of a high dimensional dataset to show the directions in the high dimensional space of the data where the data is varying the most.

Which of these best characterizes what multidimensional scaling (MDS) is used for? To lay out nodes in a planar graph so that the edges do not cross To find the directions in a high dimensional dataset where the data varies the most To find the shortest route among available flight connections from one airport to another To lay out nodes in a complete graph (every node connected to every other node by an edge) according to desired edge lengths

To lay out nodes in a complete graph (every node connected to every other node by an edge) according to desired edge lengths Correct! MDS is used to lay out points when you are given the desired distances between the points.

What is the primary goal of the layout of a visualization dashboard? Size, to ensure that the charts use every bit of space available to maximize the visual information conveyed by the dashboard. Aesthetics, to make the viewer become more engaged with the dashboard and its underlying data. To visually organize the charts so the viewer can better find and understand the data. To minimize the mouse interaction distance between common user input options, such as selection, menus and buttons.

To visually organize the charts so the viewer can better find and understand the data. The goal of a visualization dashboard is understanding the data through multiple charts. An important feature alongside the layout of the dashboard is a user's ability to navigate the dashboard.

In which one of the situations below would it be most appropriate to treat the date data as a categorical variable instead of a quantitative continuous variable? Sales on January 1st of each year in order from 2000 to 2020. Total annual sales for the year in order of best year to worst year. Sales at randomly chosen dates throughout the year, in date order. Total sales for the decade in order of decade.

Total annual sales for the year in order of best year to worst year. Because the order is not the date order, it would be inaccurate to treat measures over the dates as continuous and one would not want to interpolate between them.

In what order does a data visualization graphics pipeline process information? Pixel processing, then rasterization, then vertex processing Vertex processing, then rasterization, then pixel processing Vertex processing, then pixel processing, then rasterization Rasterization, then vertex processing, then pixel processing Pixel processing, then vertex processing, then rasterization Rasterization, then pixel processing, then vertex processing

Vertex processing, then rasterization, then pixel processing The graphics pipeline accepts vector graphics primitives described as vertices, so it processes vertices first. Rasterization converts vector graphics primitives into the pixel locations used to display them on a display screen. Pixel processing is used to further process the pixels output from rasterization, e.g., to compute their individual colors.

How could zooming be considered filtering? Zooming is a filter on the range of values of the row and column fields of a scatterplot. Zooming is not in any way filtering. Zooming a scatterplot happens when you drag a new field to the size shelf. Zooming a scatterplot happens when you drag a new field to the filter shelf and select a strict subset of the field's values.

Zooming is a filter on the range of values of the row and column fields of a scatterplot. A scatterplot uses the row/column field values as the x and y positions, so limiting their range would map a subset of the elements to the chart display area and effectively zoom the scatterplot.
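A small sketch of zooming expressed as a range filter. The function `zoom` and the sample points are hypothetical, and the inclusive range bounds are an assumption of this illustration.

```python
def zoom(points, x_range, y_range):
    """Keep only the points whose x and y values fall inside both ranges."""
    x_min, x_max = x_range
    y_min, y_max = y_range
    return [
        (x, y)
        for x, y in points
        if x_min <= x <= x_max and y_min <= y <= y_max
    ]


points = [(1, 1), (2, 5), (4, 4), (8, 9), (5, 6)]
visible = zoom(points, x_range=(2, 5), y_range=(4, 6))
print(visible)  # [(2, 5), (4, 4), (5, 6)]
```

Rescaling the axes to the narrower ranges then spreads the surviving points over the full chart area, which is what the viewer experiences as a zoom.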

Which of the following is a measure, as opposed to a dimension, in the WDI database? year indicator code population country code

population Indeed population, specifically total population, is a measure. It is a quantitative value that is reported per country per year.

Suppose we have data in the following table. Country Name Year France 1980 France 1990 EastGermany 1980 WestGermany 1980 Germany 1990 What would be the result of the cross product operation [Country Name] x [Year]? {France1980, France1990, EastGermany1980, EastGermany1990, WestGermany1980, WestGermany1990, Germany1980, Germany1990} {France19801990, EastGermany1980, WestGermany1980, Germany1990} {France, EastGermany, WestGermany, Germany, 1980, 1990} {France1980, France1990, EastGermany1980, WestGermany1980, Germany1990}

{France1980, France1990, EastGermany1980, EastGermany1990, WestGermany1980, WestGermany1990, Germany1980, Germany1990} This is indeed what results from the cross product: every element of one field combined with every element of the other field, regardless of whether or not a particular combination appears in a record in the table.

Suppose we have data in the following table. Country Name Year France 1980 France 1990 EastGermany 1980 WestGermany 1980 Germany 1990 What would be the result of the nest operation [Country Name] / [Year]? {France1980, France1990, EastGermany1980, EastGermany1990, WestGermany1980, WestGermany1990, Germany1980, Germany1990} {France19801990, EastGermany1980, WestGermany1980, Germany1990} {France, EastGermany, WestGermany, Germany, 1980, 1990} {France1980, France1990, EastGermany1980, WestGermany1980, Germany1990}

{France1980, France1990, EastGermany1980, WestGermany1980, Germany1990} The nest operation combines the values of a pair of fields, but only the field values that co-exist in the same record.
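The difference between the cross product and the nest operation on this table can be sketched as follows. The helper `distinct` and the string concatenation used to label each combination are conveniences of this illustration, not VizQL syntax.

```python
from itertools import product

# The five records of the table: (Country Name, Year).
records = [
    ("France", 1980),
    ("France", 1990),
    ("EastGermany", 1980),
    ("WestGermany", 1980),
    ("Germany", 1990),
]


def distinct(values):
    """Remove duplicates while keeping first-seen order."""
    return list(dict.fromkeys(values))


countries = distinct(country for country, _ in records)
years = distinct(year for _, year in records)

# Cross product: every country paired with every year, present in a record or not.
cross = [f"{c}{y}" for c, y in product(countries, years)]

# Nest: only the pairs that actually co-occur in some record.
nest = distinct(f"{c}{y}" for c, y in records)

print(len(cross), cross)
print(len(nest), nest)
```

The cross product yields 4 x 2 = 8 combinations, including pairs such as EastGermany1990 that never appear in the table, while the nest yields only the 5 pairs found in the records.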

