Exam #1

What is a weak correlation?

0 to 0.4

What is a moderate correlation?

0.4 to 0.6

What is a strong correlation?

0.6 to 1
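
The three bands above can be sketched as a small lookup. This is only an illustration: the function name is hypothetical, and two choices are assumptions not stated on the cards, namely that the bands apply to the absolute value of r and that the shared boundaries (0.4 and 0.6) fall into the higher band.

```python
def correlation_strength(r: float) -> str:
    """Label r using the weak / moderate / strong bands from the cards."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and 1")
    magnitude = abs(r)  # assumption: strength ignores the sign (direction)
    if magnitude < 0.4:
        return "weak"
    if magnitude < 0.6:  # assumption: exactly 0.4 counts as moderate
        return "moderate"
    return "strong"      # assumption: exactly 0.6 counts as strong


print(correlation_strength(0.1))   # weak
print(correlation_strength(0.5))   # moderate
print(correlation_strength(-0.8))  # strong
```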

What range does r (correlation) take?

-1 to 1

What are the four main steps to the visualization process?

1. Define the data you have 2. What do you want to know about your data? 3. What visualization methods should you use? 4. What do you see using the visuals and does the data make sense?

What indicates a strong correlation?

0.8 and -0.8

What are some of the reasons we visualize data?

Answer questions (or discover them)

What is the most commonly used chart?

Bar Chart

What are the three types of Coordinate Systems?

Cartesian, Polar, Geographical

What is the Ordinal Scale?

Categories where order matters. Ex. horrible, bad, okay, good

What is scale?

Dictates where the shapes are placed and how objects are shaded

What are the types of correlation?

Direction, Magnitude, Other relationships

Quantitative data can be...

Discrete & continuous

What is a Categorical Scale?

Discrete placement in bins. Ex. A,B,C

What is the chart similar to a pie chart called?

Donut chart

Examples of "Out of the Box" visualization tools...

Excel, Google Charts, Microsoft Power BI, Tableau, Geospatial Tools

What are some of the reasons we visualize data?

Expand memory

What is the Logarithmic Scale?

Focuses on percent change. You multiply or divide to move up or down the scale. Ex. 1, 10, 100, ...
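
A minimal sketch of why multiplying moves you along a logarithmic scale in equal steps. It assumes a base-10 scale, which matches the 1, 10, 100 example but is not the only possible base.

```python
import math

# Each value is 10 times the previous one.
values = [1, 10, 100, 1000]

# Position of each value on a base-10 logarithmic axis.
positions = [math.log10(v) for v in values]

# Distance between neighbouring positions on the axis.
steps = [b - a for a, b in zip(positions, positions[1:])]

print(positions)  # evenly spaced positions
print(steps)      # every step has the same size
```

Equal ratios between values become equal distances on the axis, which is why this scale suits percent change.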

Pros of Google Charts...

Free to create charts. Includes interactive, animated and geospatial data graphics. Integrates well with google apps suite. Can easily access data from different computers

What are the easiest online solutions to geospatial mapping tools?

Google, Yahoo, and Microsoft maps

What is Length?

How long the shapes are. The length of the bars in a bar graph provides the visual cue: the longer the bar, the greater the absolute value. Start the axis at zero, because people visually compare the distance from zero to the end of the bar; otherwise they misread the information.
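
The effect of a non-zero baseline can be shown with a little arithmetic. The helper below is a hypothetical illustration, and the numbers are made up for the example.

```python
def apparent_ratio(a: float, b: float, baseline: float = 0.0) -> float:
    """Ratio of the drawn bar lengths for values a and b when the axis starts at baseline."""
    if a <= baseline or b <= baseline:
        raise ValueError("both values must be above the baseline")
    return (b - baseline) / (a - baseline)


# Two made-up values: 60 is 20% larger than 50.
print(apparent_ratio(50, 60))               # true ratio with the axis at zero
print(apparent_ratio(50, 60, baseline=40))  # bars of length 10 and 20: looks twice as large
```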

Continuous Data

Infinite number of possible intermediate values. Ex. 1.5 lbs

Cons of Tableau...

Initial data preparation is required (Tableau has launched Tableau Prep, a separate product for preparing data). Expensive. In the free public version, any data you upload to the servers becomes publicly available.

Pros of Tableau...

Integrates a wide range of data sources and file types. Careful thought given to design and aesthetics. Allows for interactive, spatial, animated, and dashboard displays. Powerful community collaboration.

What is Color Saturation?

Intensity of a color hue; the density of a given color. Ex. gradients such as light and dark red. Color can be used to show categories and to highlight certain aspects of a data visualization.

Name the types of Scales...

Linear, Categorical, Percent, Logarithmic, Ordinal, Time, Numerical

What are some of the reasons we visualize data?

Make data accessible

What are some of the reasons we visualize data?

Make decisions and persuade others to make decisions

1. What data do you have?

May be primary or secondary data we have collected. The time needed to analyze data is short compared to the time needed to gather it.

Cons of EXCEL...

NOT interactive, requires customization to adhere to design standards. Not that great aesthetically for presentations, and may not process large datasets (1GB)

Why is Python called Python?

Named after the British comedy Series "Monty Python's Flying Circus". Van Rossum needed something short, unique, and slightly mysterious

Cons of Microsoft Power BI...

Not many options to configure visuals, problem with large data sets

Quantitative data

Numerical data that can be aggregated and measured

Scatter Plot

Often used to visualize the relationship between two variables.

What are some of the reasons we visualize data?

Persuade using evidence through narrative

What are the NINE visual cues?

Position, Length, Angle, Direction, Shapes, Area, Volume, Color Saturation, Color Hue

Discrete Data

Predefined at exact points, no "in between". Ex. 1 person

Examples of "Programming" visualizations tools...

R, Python, JavaScript

What is the Percent Scale?

Representing parts of a whole. Ex. 0%,25%

Cons of Google Charts...

Requires customization to adhere to standard designs, can't process large data sets

What is Angle?

Rotation between vectors. Used in pie charts, commonly to represent parts of a whole. Donut charts do not use angles since the center of the circle is cut out; arc lengths are the visual cue instead.

Iconic Memory

Short term or working memory. People can keep up to 4 chunks of visual information

What chart is used for categories and time?

Stacked bar chart

Cons of JavaScript...

Steep learning curve. Requires skills in working with HTML and JSON

Data visualization

The graphical representation of information and data

Where should you provide context through?

Through familiar colors and images, informative titles, and familiar objects and concepts.

Pros of Microsoft Power BI...

Tightly integrated with other Microsoft tools (Excel, Azure, cloud services, SQL Server). Highly intuitive user interface. More affordable than Tableau. Can import data from a wide range of sources.

What are some of the reasons we visualize data?

To find patterns and see data in context

What is a Time Scale?

Units of months, days, or hours.

What is a Numerical Scale?

Uses numbers

What is the Linear Scale?

Values are evenly spaced: you add or subtract a fixed amount to move up or down the scale. Ex. 1, 2, 3, 4, 5...

What are the data visualization components?

Visual Cues, Coordinate Systems, Scale, Context

Edward Tufte's definition of data visualization

Visualization is complex ideas communicated with clarity, precision, and efficiency

Nathan Yau's definition of data visualization

Visualization is often framed as a medium for storytelling. The numbers are the source material, and the graphs are how you describe the source

Long-term Memory

Visuals can more quickly help us recall things from our verbal memory

Pros of JavaScript...

Web-based scripting language. Freely available and allow users to create sophisticated web-based visualizations

What is Position?

Where in space the data is. Commonly used in scatter plots: you compare values based on where others are placed in the coordinate system. Easy to notice outliers and clustering. One disadvantage is that unlabeled data points can be hard to grasp right away.

The goal of data visualization is to

aid our understanding of data by leveraging the human visual system's highly tuned ability to see patterns, spot trends, and identify outliers

Bubble Chart

Allows you to compare three variables at once: x, y, and an area variable. Bubbles should be sized based on area.
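
Sizing by area rather than radius can be sketched as follows. The function name and the `max_radius` parameter are hypothetical; the point is only that the radius must grow with the square root of the value for the area to stay proportional.

```python
import math


def bubble_radius(value: float, max_value: float, max_radius: float = 40.0) -> float:
    """Radius that makes the bubble's AREA proportional to value."""
    if value < 0 or max_value <= 0:
        raise ValueError("value must be non-negative and max_value positive")
    # Area grows with radius squared, so take the square root of the share.
    return max_radius * math.sqrt(value / max_value)


# A value one quarter of the maximum gets half the radius, hence a quarter of the area.
print(bubble_radius(25, 100))   # 20.0
print(bubble_radius(100, 100))  # 40.0
```

Scaling the radius directly by the value would make a bubble with twice the value look four times as large.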

3. What visualization methods should you use?

Bar chart, pie chart, or something else. Choose a chart that shows a comparison, relationship, distribution, or composition.

What charts are easy to use for non-technical audiences?

Bar charts and pie charts

A Cartesian coordinate system is made up of...

bar charts & line graphs

Histograms are similar to...

bar graphs and continuous density plots

When creating a bar chart, you should be aware of what?

Bar height and width. Start the axis at zero. Bars can be vertical or horizontal.

What is ArcGIS?

Built for desktop mapping. Has a user interface, so no coding is required. Used by professional cartographers and graphics departments.

Smoothing and Estimation

can be useful for better understanding of trends and predictive purposes. Can help fit trend lines.
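
One simple smoothing technique is a moving average, sketched below. The cards do not name a specific method, so this choice and the made-up data are assumptions for illustration.

```python
def moving_average(values, window=3):
    """Average each run of `window` consecutive values to smooth out noise."""
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


noisy = [3, 9, 3, 9, 3]           # made-up readings that jump up and down
print(moving_average(noisy, 3))   # [5.0, 7.0, 5.0]
```

The smoothed series swings less than the raw one, which makes an underlying trend easier to see.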

Pros of Python...

can handle large amounts of data without crashing. Useful for analyses and heavy computation. Clean and easy to read syntax

Moderate negative correlation points are...

clustered together with some space between them moving down in a rightward motion

Strong Negative correlation points are...

clustered very close together in a downward right position

When creating pie charts, you should be aware of what?

Color-blind viewers. Include percentages.

Continuous Temporal Data

Constantly changing. Shown with a line graph (time-series chart), a step chart, or smoothing and estimation.

Continuous Density Plots

Continuous instead of binned. We see how all the data is distributed, like a statistics graph.

Cons of "R" programming tool...

Default chart outputs require design refinements, such as adding missing titles. Use R to create graphs and then edit them in Adobe Illustrator or Inkscape. R is good for exploration but not as good for presenting (the explanatory aspect).

Temporal Data can be...

discrete and continuous

No correlation points...

do not follow a pattern. A.K.A scatter plot.

Negative correlation points move...

down and to the right

What is a Histogram?

encodes data using height/length as the visual cue. Uses bin sizes to represent data. Bins should be big enough to see the variability in data.
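
Binning can be sketched as counting how many values fall into each equal-width interval. The function and the sample data are hypothetical; real tools offer more options for choosing bin edges.

```python
def histogram_counts(values, bin_width, start=0):
    """Count values per bin; each key is the left edge of a bin of width bin_width."""
    if bin_width <= 0:
        raise ValueError("bin_width must be positive")
    counts = {}
    for v in values:
        index = int((v - start) // bin_width)  # which bin the value falls into
        left_edge = start + index * bin_width
        counts[left_edge] = counts.get(left_edge, 0) + 1
    return dict(sorted(counts.items()))


data = [1, 2, 2, 7, 8, 12]                   # made-up observations
print(histogram_counts(data, bin_width=5))   # {0: 3, 5: 2, 10: 1}
print(histogram_counts(data, bin_width=20))  # {0: 6} -- too wide, variability is hidden
```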

Step Chart

Data that stays fairly constant for a while and then jumps. Good for tax rates and interest rates.
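
The "constant, then jump" behaviour can be sketched as a lookup of the most recent change. The years and rates below are invented for illustration and do not describe any real rate history.

```python
from bisect import bisect_right


def step_value(change_points, values, x):
    """Value in force at x: the one set at the latest change point <= x."""
    i = bisect_right(change_points, x) - 1
    if i < 0:
        raise ValueError("x is before the first change point")
    return values[i]


years = [2018, 2020, 2023]  # hypothetical years in which the rate changed
rates = [2.0, 0.5, 4.0]     # hypothetical rate in force from each of those years

print(step_value(years, rates, 2019))  # 2.0 (unchanged since 2018)
print(step_value(years, rates, 2020))  # 0.5 (jumps in 2020)
print(step_value(years, rates, 2025))  # 4.0
```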

Categorical Data

For groups or categories. Mutually exclusive labels without any numerical value. Ex. name, gender, product types

Pros of "R" programming tool...

free open-source statistical programming language. Can write your own functions and packages to make graphics the way you want

A Geographic coordinate system is made up of...

geospatial graphs

2. What do you want to know about your data?

Get a sense from stakeholders of what needs answering. The more specific the question, the more you get out of the analysis.

Cons of Python...

A great starting point for data exploration, but not very good aesthetically, so you might need other software such as Adobe Illustrator or Inkscape for presenting.

Visualizing Relationships between variables

how can you tell as something goes up, does another thing go up, go down, and is it a causal or correlative relationship?

What is Area & Volume?

how much 2-d/3-d space. bigger objects represent greater values. Make sure the scaling is correct. Keep in mind how many dimensions are present.

Gantt Chart

Mainly used for project planning. Bars span the columns corresponding to the time period.

Correlation

Means one thing tends to change a certain way as another thing changes.

Aggregating Data From Different Sources

Negative employee counts may be data-entry errors. Consider ordering by firm or by number of employees.

What is a tree map?

Not as common. Good for organizing hierarchies. Color and area are used as visual cues.

Weak negative correlation points are...

Not clustered at all. Very sporadically placed on the graph, but still moving downward to the right.

Ordinal data

Similar to categorical data but has a clear order. Ex. level of education, satisfaction level, salary bands

Other relationships

outlier, clustering, non-linear

A Polar coordinate system is made up of...

pie charts

What visual cue does a scatter plot use?

Position

Direction correlation

positive or negative correlation

Coefficient of correlation

Quantifies how tightly coupled the values of two variables are with respect to each other.
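
A minimal sketch of computing the coefficient by hand, assuming the card refers to Pearson's r (the cards do not name the formula). The function name and data are for illustration only.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of the same length (at least 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # How the two variables move together around their means.
    co_movement = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("r is undefined when a variable is constant")
    return co_movement / math.sqrt(spread_x * spread_y)


xs = [1, 2, 3, 4]
print(pearson_r(xs, [2, 4, 6, 8]))  # perfectly coupled, moving up together
print(pearson_r(xs, [8, 6, 4, 2]))  # perfectly coupled, moving in opposite directions
```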

Structuring Data

reviews are textual data and are based on opinion. There is no format to follow. You can quantify data through star reviews.

What is a Box Plot?

shows range, median, and quartiles of data. Uses position & height/length visual cues. Less specific than histograms or density plots.
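
The numbers a box plot draws can be sketched with the standard library. Quartile conventions differ between tools; the `"inclusive"` method used here is one assumption among several valid ones, and the data is made up.

```python
import statistics


def five_number_summary(values):
    """Minimum, quartiles, median, and maximum: the values a box plot displays."""
    data = sorted(values)
    # n=4 splits the data into quartiles; the middle cut point is the median.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return {"min": data[0], "q1": q1, "median": median, "q3": q3, "max": data[-1]}


print(five_number_summary([1, 2, 3, 4, 5, 6, 7, 8, 9]))
# {'min': 1, 'q1': 3.0, 'median': 5.0, 'q3': 7.0, 'max': 9}
```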

What is Direction?

slope of a vector in space. commonly used in line graphs. Direction helps with noticing trends and with time. Slope is used to signal sharp changes

Cleaning Data

Some parts of the table do not spell Texas the same way. Consider organizing names alphabetically and deleting duplicate data. There are missing observations. Consider organizing by recency.
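
A small sketch of two of these cleaning steps: standardizing the spelling of Texas and dropping duplicates. The alias table and the rows are hypothetical examples, not data from the course.

```python
# Hypothetical spelling variants mapped to one standard form.
STATE_ALIASES = {"tx": "Texas", "tex.": "Texas", "texas": "Texas"}


def clean_rows(rows):
    """Standardize state spellings, drop duplicate rows, and sort by name."""
    seen = set()
    cleaned = []
    for name, state in rows:
        key = state.strip().lower()
        record = (name.strip(), STATE_ALIASES.get(key, state.strip()))
        if record not in seen:  # keep only the first copy of each row
            seen.add(record)
            cleaned.append(record)
    return sorted(cleaned)


raw = [("Bob", "TX"), ("Ann", " texas "), ("Bob", "Texas"), ("Ann", "Tex.")]
print(clean_rows(raw))  # [('Ann', 'Texas'), ('Bob', 'Texas')]
```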

Magnitude correlation

strong or weak correlation

Pros of EXCEL...

supports processing of data, compatible with word and PowerPoint, relatively easy to learn, widely used

What is Shape?

Symbols used to denote categories and objects. Shapes are readily recognized visually. Ex. Nathan's hot dog eating contest

What are the types of distribution?

symmetric, left skewed, right skewed

Exploratory Data

testing a hypothesis (visual confirmation) and mining for patterns, trends, and anomalies (visual exploration)

Positive correlation points move...

up and to the right

Scatter Plot Matrix

useful to see relationships among multiple variables. Allows comparison across multiple dimensions.

What is Color Hue?

usually referred to as color. refers to the different colors like blue and green. Be mindful of color blindness. For executive presentations maybe you can use colors and shapes simultaneously

Explanatory Data

Usually simple everyday visualizations (line charts, bar charts, pies, and scatter plots) conveying a single message

Validation Data

valid values and ranges
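
Checking for valid values and ranges can be sketched as a simple filter. The field name, the limits, and the firm records are all hypothetical.

```python
def find_invalid(records, field, minimum, maximum):
    """Return the records whose field is missing or outside [minimum, maximum]."""
    invalid = []
    for record in records:
        value = record.get(field)
        if value is None or not (minimum <= value <= maximum):
            invalid.append(record)
    return invalid


# Hypothetical firm records: a negative headcount cannot be right.
firms = [
    {"firm": "A", "employees": 120},
    {"firm": "B", "employees": -5},
    {"firm": "C", "employees": None},
]
bad = find_invalid(firms, "employees", 0, 1_000_000)
print([r["firm"] for r in bad])  # ['B', 'C']
```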

Discrete Temporal Data

Values from specific points or blocks in time. Shown with a bar graph, stacked bar graph, Gantt chart, or points.

Line graphs are good for...

visualizing the evolution of several quantities over time.

Mean and Median are used to refer to...

what is normal or average
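
The two measures can disagree when the data contains an extreme value, as this short sketch with made-up numbers shows.

```python
import statistics

# Made-up salaries (in thousands) with one extreme value.
salaries = [30, 32, 35, 38, 40, 400]

mean = statistics.mean(salaries)      # pulled upward by the 400
median = statistics.median(salaries)  # middle of the sorted values

print(round(mean, 2))  # 95.83
print(median)          # 36.5
```

Here the median is closer to what most of the values look like, while the mean is dominated by the single outlier.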

4. What do you see, and does it make sense?

what we hypothesize or maybe opposite of that. You may need to go to step two once again since new questions arise
