MIS 0855- Exam 1

testable, falsifiable, grounded in rationale

"iPhone users download more apps each month than Android users" "There are NO vampires living in Louisiana" "Students who attend class more often get better grades"

Data Visualization

A visual representation of data

Variety-Big data

Many different sources of data are combined together.

Theory

A scientific idea that is supported by evidence and accepted as true, even though it has not been definitively proven

Data Types

String: contains text; Integer: contains a whole number; Floating Point: contains a number with decimals; Boolean: only two possible options (usually true/false or 1/0); Date/Time: relates to dates and times
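As a rough illustration, the five data types above map onto Python's built-in types roughly as follows. The variable names and values are made up for this sketch and are not from the course material.

```python
from datetime import datetime

# Hypothetical values, one per data type listed above
student_name = "Alex"                      # String: text
classes_attended = 12                      # Integer: whole number
exam_score = 87.5                          # Floating point: number with decimals
passed = True                              # Boolean: one of two options
exam_date = datetime(2024, 2, 15, 9, 30)   # Date/Time: a date and a time

def type_names():
    """Return the Python type name of each example value, in order."""
    values = [student_name, classes_attended, exam_score, passed, exam_date]
    return [type(v).__name__ for v in values]
```

Calling type_names() lists the type Python assigns to each value, which is the same kind of label a data dictionary would record.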

Knowledge

application of data and information (e.g., knowing that students who come to class more often get better exam scores)

confirmation bias

Confirmation bias is a type of cognitive bias that involves favoring information that confirms previously existing beliefs or biases (e.g., the belief that left-handed people are more creative than right-handed people).

Basic elements of data visualizations

content, context (captions, legends, keys), construction (aspect ratio, color, size)

Information

data that is processed to be useful (mean, median, etc.)
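A minimal sketch of turning raw data into information, using Python's standard statistics module. The scores are hypothetical and only serve to show the step from recorded observations to a useful summary.

```python
from statistics import mean, median

# Hypothetical raw data: exam scores recorded for five students
scores = [70, 85, 90, 65, 90]

def summarize(values):
    """Process raw data into information: the mean and the median."""
    return {"mean": mean(values), "median": median(values)}

summary = summarize(scores)
```

The list of scores is data; the mean and median in summary are information, because they have been processed into something a reader can use directly.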

Infographic

a data visualization with text and images that together tell a story (often contains several data visualizations)

Intentional Bias

data that is deliberately skewed, e.g., fake reviews written to hurt someone

Why is it good to have data dictionaries and metadata?

It improves clarity and communication, reduces errors, and keeps everybody on the same page.

Visualizing comparative areas (two squares)

The one on the left: the box for b is 1/4 the size of a even though the data says b is half the size of a, so the box for b should be bigger to accurately represent the data. The other picture accurately represents the proportions.
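The arithmetic behind this card can be sketched as follows. The notes do not say how the misleading figure was drawn; this sketch assumes the common mistake of halving the side length, which quarters the area. All values and the function name are hypothetical.

```python
import math

def side_for_value(value, reference_value, reference_side):
    """Side length of a square whose AREA is proportional to value."""
    return reference_side * math.sqrt(value / reference_value)

# Hypothetical data: b is half of a, and a is drawn as a 10 x 10 square
a, b = 100, 50
side_a = 10.0

# Misleading version: scale the side by the data ratio
wrong_side_b = side_a * (b / a)
wrong_area_ratio = (wrong_side_b ** 2) / (side_a ** 2)

# Accurate version: scale the area by the data ratio
right_side_b = side_for_value(b, a, side_a)
right_area_ratio = (right_side_b ** 2) / (side_a ** 2)
```

Under this assumption the misleading box covers a quarter of the area of a, while the accurate box covers half, matching the data.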

How can you make graphics that stand on their own?

see the tips and tools that we discussed. Use context, patterns, picture superiority

Data Science

study of the generalizable extraction of knowledge from data - basically means that you are taking data and getting insight from it.

A hypothesis must be

testable, falsifiable, grounded in a rationale

Survivorship Bias

the logical error of concentrating on the people or things that made it past some selection process and overlooking those that did not, typically because of their lack of visibility. We tend to focus only on successful individuals for reference.

Direction of causality

two variables are related, and one may cause the other, but you often don't know which one caused which. (Correlation does not imply causation.)

Spurious Correlations

when two things are associated with each other even though a connection between them doesn't really make sense (e.g., dads buying beer and diapers)

Picture Superiority Effect

the effect where pictures and images are easier to recognize and remember than words

How do you use color, size, scale, etc. to create visualization

Use color, size, scale, and ratio. Axes should include 0. Why? To accurately display the difference in the data.
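A small sketch of why the axis should include 0: it compares how tall two bars look relative to each other when the axis starts at 0 versus at a higher value. The function name and the numbers are hypothetical.

```python
def bar_height_ratio(value_a, value_b, axis_start=0):
    """Ratio of the drawn bar heights when the axis starts at axis_start."""
    return (value_a - axis_start) / (value_b - axis_start)

# Hypothetical values: 52 vs 50, a real difference of 4%
honest = bar_height_ratio(52, 50, axis_start=0)
truncated = bar_height_ratio(52, 50, axis_start=48)
```

With the axis starting at 0 the first bar is drawn about 4% taller, matching the data. With the axis starting at 48 it is drawn twice as tall, which exaggerates the difference.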

to tell a powerful story with data

1. My understanding of the business problem 2. How will I measure the business impact? 3. What's the available data? 4. The initial solution hypothesis 5. The solution 6. The business impact of the solution

Avoid when telling story with data

1. Technical terminology 2. Step-by-step description 3. Complex statistics

What makes data so important today?

1. The massive availability of digital data suggests (and it has been increasingly shown) that there are huge opportunities to create valuable insights from data at every corner of the economy and society. 2. Newly developed techniques and tools for data analytics allow us to harness the potential of digital data in new ways. 3. Your competitor is going to harness data

Metadata

1. each piece of data can be described by: Variable name (a dataset column label), Variable description (in a data dictionary), Data type (in a data dictionary), Value (the datum itself). 2. Metadata is data that describe other data. They are often stored as a data dictionary attached to a dataset. 3. Without explicit or inferred metadata, a dataset becomes useless.
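As an illustration of the points above, a data dictionary can be sketched as a plain Python mapping from variable names to descriptions and data types, which can then be used to check a dataset. The dataset, column names, and the validate helper are all hypothetical.

```python
# Hypothetical dataset: each row is one student
dataset = [
    {"attended": 12, "score": 87.5},
    {"attended": 7, "score": 71.0},
]

# Hypothetical data dictionary: metadata describing each variable
data_dictionary = {
    "attended": {"description": "Number of classes attended", "type": int},
    "score": {"description": "Exam score in percent", "type": float},
}

def validate(rows, dictionary):
    """Return (row index, column) pairs that are undocumented or mistyped."""
    problems = []
    for i, row in enumerate(rows):
        for column, value in row.items():
            meta = dictionary.get(column)
            if meta is None or not isinstance(value, meta["type"]):
                problems.append((i, column))
    return problems
```

Running validate(dataset, data_dictionary) returns an empty list here. A value such as "twelve" in the attended column would be flagged, which is one way metadata helps reduce errors.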

Biases in (Big) Data

Always remember that data and datasets "are creations of human design." Different types of biases: 1. Survivorship Bias 2. Intentional Bias 3. Confirmation Bias

Think about a song on Spotify: you have the audio file, song name, duration, plays, artist name. Which of the above is NOT metadata?

Audio file

Before data; Today data

Before: data had to be collected and often produced for each analytical purpose separately. This made data very expensive. Today: digital data are everywhere. Digital systems create data as the side effect of their operation. There is often a very low cost to data collection - massive datasets are just a few clicks away!

Different types of visualizations

Charts, graphs, scatterplots, heat maps, etc

Data

Data are usually understood as 'raw unorganized facts' - recorded observations about the world.

Velocity-Big data

Data can change very quickly. (However much 'Big Data' you have, the data does not speak by itself - new kinds of data sources have also new problems.)

Standard View

Data-Information-Knowledge Hierarchy

Calculated Fields in Tableau

Look at Tableau and the assignment (basic visualization: what happens if you drag this somewhere or do this, which visualization to use). Categorical values: a pie chart usually shows percentages, a bar chart usually compares numbers. Continuous values: a line graph shows the relationship between variables or something over time; histograms.

Open data

Open data are datasets that are made available by organizations (see below) to be used freely for analysis and modification by anyone. There can be various motivations for offering open data. Government: speed up economic development, legitimacy (the data were paid for by taxpayers). Companies: open innovation, PR, corporate citizenship. Academics: the ideal of science, replicability of results, publicity. Non-profits: publicity, cultural ideals. Data are a non-rival economic good: their consumption by one party does not preclude others from using them. This makes it difficult to capture value from data by offering them to third parties. However, organizations often do not want to share data due to competition and possible liabilities, and preparing open datasets can be laborious.

Volume-Big data (the so-called 3Vs of Big Data give us an overall idea of the nature of new data resources)

The amount of data has grown explosively.

Telling stories with data

There are lots of articles on "telling stories with data" (just Google it) that pretty much tell the same story: "Find the compelling narrative. Along with giving an account of the facts and establishing the connections between them, don't be boring." "Think about your audience. What does the audience know about the topic?" "Be objective and offer balance. A visualization should be devoid of bias. Even if it is arguing to influence, it should be based upon what the data says, not what you want it to say." "Don't Censor. Don't be selective about the data you include or exclude, unless you're confident you're giving your audience the best representation of what the data 'says'." "Finally, Edit, Edit, Edit. Also, take care to really try to explain the data, not just decorate it."

