Business Analytics- Quants
Who was incorrect in saying "my scientific studies have afforded me great gratification; and I am convinced that it will not be long before the whole world acknowledges the results of my work?"
Gregor Mendel
These have only two values, and for statistical analysis purposes it's often best to measure them as the presence or absence of something with values of 1 and 0. An example could be whether you are male or female (which could be recorded as 0 for no "femaleness," and 1 for being female), or whether you are a US citizen or not.
Binary Variables
Analysis of variance; a statistical test of whether the means of more than two groups are all equal.
ANOVA
If you want to see relationships among data points, use a
Scatterplot
There are at least six types of quantitative analytical stories. Which of the following is not one of those of types?
Analytical Story
the extensive use of data, statistical and quantitative analysis, explanatory and predictive models, and fact based management to drive decisions and add value
Analytics
If you want to compare a set of frequencies or values, typically for one variable, use a
Bar chart
These have several possible categories as values, such as eye color, flavors of ice cream, or which state or province you live in. Because they can't easily be converted into numbers where increases or decreases mean anything, there is a special class of statistics for categorical data.
Categorical (also called nominal) variables
The relationship between an event (the cause) and a second event (the effect), where the second event is understood as a consequence of the first. In common usage, causality is also the relationship between a set of factors (causes) and a phenomenon (the effect).
Causality
A statistical test that determines how well sample data fits a specified type of distribution.
Chi square (goodness-of-fit) test
is a main task of exploratory data mining, and a common technique for statistical data analysis used in many fields.
Clustering or Cluster analysis
The extent to which two or more variables are related to one another. The degree of relatedness is expressed as a correlation coefficient, which ranges from 1.0 to 1.0
Correlation
for example, perhaps the most widely used analytical software tool in the world (though most people think of it as a spreadsheet tool), can do some statistical analysis (and visual analytics) as well as reporting, but it's not the most robust statistical software if you have a lot of data or a complex statistical model to build.
Excel
tries to answer the questions of why something happened by conducting experiments.
Experimental design
The variable whose value is unknown that you would like to predict or explain.
Dependent variable
involve gathering, organizing, tabulating, and depicting data and then describing the characteristics about what is being studied.
Descriptive analytics
A statistical procedure that takes a large number of variables or objects and uncovers the underlying relationships among them.
Factor analysis
an effective method for reviewing previous findings.
Doing an Internet search for key terms related to your analysis
3 Stages of quantitative analysis
Framing the problem, Solving the problem, Communicating and acting on the results
The analytical thinking example, The Simon Hannes Insider Trading Case, is an example of how data analysis can be used to identify
Fraud
The most important thing in the problem recognition stage
Fully understand the problem and why it matters
If you want to understand or depict data across geography, use a
Geographical map
is particularly well suited to visual displays of information.
Here's What happened story
the early story you tell about your analysis—are simply educated guesses about what variables really matter in your model.
Hypothesis
A systematic approach to assessing a tentative belief (claim) about reality. It involves confronting the belief or claim with evidence and deciding, in light of this evidence,
Hypothesis Testing
Barbara McClintock, an American scientist, was named a 1983 Nobel Laureate for her discovery of genetic transposition. Her description of her work process on page 137 is an example of the creative stage called
Immersionn
Regarding the four stages of creative analytical thinking, the stage that focuses on internalization of the problem into the subconscious mind, with unusual connections likely to be made below the level of consciousness is called
Incubation
A variable whose value is known and used to help predict or explain a dependent variable.
Independent variable
Regarding the four stages of creative analytical thinking, the stage that focuses on the big breakthrough in understanding how the problem can be solved through quantitative analysis is called
Insight
The process where the software fits models to the data in an automated and rapid fashion to find the best fit is called
Machine Learning
is particularly well suited to organizations like retailers (that have a lot of stores) and banks (that have a lot of branches). That makes it easy to try things out in some locations and use others as controls.
Mad scientist story
measures of central tendency
Mean and Median
These variables have numbers with standard units, such as weight in pounds or kilograms, or height in inches or centimeters. The higher the number, the more of that variable is present. Numerical variables, then, are well suited to common statistical approaches like correlation and regression analysis.
Numerical (interval and ratio) variables
another prescriptive technique, attempts to identify the ideal level of a particular variable in its relationship to another
Optimization
These variables have numbers assigned to them, and the higher the number, the more of the variable is present. However, the difference between 1 and 2 may not be the same as the difference between 5 and 6. Atypical example of ordinal variables is the Likert item—named after the sociologist Rensis Likert— that typically involves survey responses such as strongly disagree, somewhat disagree, neither disagree nor agree, somewhat agree, strongly agree.
Ordinal Variables
is all about anticipating what will happen in the future
Prediction story
go beyond merely describing the characteristics of the data and the relationships among the variables (factors that can assume a range of different values); they use data from the past to predict the future.
Predictive analytics
Regarding the four stages of creative analytical thinking, the stage that focuses on doing the groundwork on the problem is called
Preparation
including methods such as experimental design and optimization, go even further. Like a prescription from a doctor, they suggest a course of action.
Prescriptive analytics
to show you how quantitative analysis works—even if you do not have a quantitative background—and how you can use it to make better decisions
Primary Goal of Keeping Up With the Quants
6 steps of quantitative analysis
Problem recognition, Data Collection Modeling, Data Analysis, Results Presentation and Action, Review of Previous Findings
are often useful tools for exploratory research—the earliest stage of analytics.
Qualitative Analytics
refers to the systematic empirical investigation of phenomena via statistical, mathematical, or computational techniques. Structured data is collected from a large number of representative cases and analyzed statistically.
Quantitative Analytics
The most popular measure of how well an estimated regression line fits the sample data on which it is based. It also indicates the amount of variability of the dependent variable accounted for by the regression line.
R^2
Any statistical method that seeks to establish an equation that allows the unknown value of one dependent variable to be estimated from the known value of one or more independent variables.
Regression
Choose the correct category: The numbers presented to you should be relevant to the question to which they are applied, and representative of the group or entity they supposedly represent. If the numbers do not give some answer to the question, they are merely meaningless.
Relevance
Which of the following is not one of the common steps involved in the problem recognition stage as it applies to stakeholders?
Selecting chart types
Among all the sample results that are possible when the null hypothesis is true, the (arbitrary) maximum proportion of these results that is considered sufficiently unusual to reject the null hypothesis is called the significance level.
Significance level or alpha
The use of quantitative analysis shown in the movie Moneyball is an example of how data analysis is used in the _______ industry.
Sports
If you want to show the rise and fall of one variable in relation to another (typically time), use a
Stack Graph
If you want to analyze text frequencies, use a
Tag Cloud
The essence of creative data analysis is
finding a pattern among the variables in the data.
Who said "Genius is ninety-nine percent perspiration and one percent inspiration."
Thomas Edison
To see the parts of a whole and how they relate to each other, use a
Tree Map
Analytics can be classified as qualitative or quantitative according to the process employed and the type of data that are collected and analyzed.
True
The analytical thinking example, The Suspicious Husband, is an example of
Type 1 error
The most successful analysts are those who can "tell a story with data."
Yep
Choose the correct category: If the numbers are relevant but not accurate, you need to discard them. The accuracy of numbers can be evaluated by questioning who and how they made them. Numbers that do not pass your credibility tests are useless.
accuracy
Choose the correct category: Even when accurate, numbers can often be misleading if there is represented. Especially people who have an ulterior agenda are apt to mislead with numbers intentionally.
correct interpretation
If you are only trying to relate a couple of things that can be measured numerically, you will probably want to use some type of __________________. This is one of the simplest statistical analyses you can perform. Basically it assesses whether two variables—take the weight and height of a series of people, for example—vary together.
correlation analysis
Which of following is not a concept that any executive needs to understand?
creating visualizations
Analytics can be classified as _________ according to their methods and purpose
descriptive, predictive, or prescriptive
The type of products that a customer has bought from us in the past year is the best guide to what e-mailed offers he or she will respond positively to in the future
good example of testable hypotheses.
Which of the following is not what quantitative analysts should expect of business decision makers?
ignore things you don't understand and move on
Regarding the four stages of creative analytical thinking, the stage that focuses on Intense engagement in solving the problem and the data at hand; a long struggle to find a solution takes place is called
immersion
is a purposefully simplified representation of the phenomenon or problem.
model
Wine quality = 12.145 (a constant) + .0238 vintage age + 0.616 average growing season temperature - 0.00386 harvest rainfall + 0.00117 winter rainfall
multiple linear aggression
According to the text, many Internet-based organizations—Google, Facebook, Amazon, eBay, and others—are using so-called big data from online transactions not only to support decisions but to create _________________.
new product offerings and features for customers
Good quantitative thinkers (and organizations that want to nourish them) should always demand __________ when someone presents ideas, hunches, theories, and casual observations
numbers
A key aspect of thinking quantitatively is understanding the laws of
probability and randomness
When performing a hypothesis test, the ______ gives the probability of data occurrence under the assumption that H0 is true.
p-value
Which of the following is not a best practice to becoming a quantitative analyst
social media
A test statistic that tests whether the means of two groups are equal, or whether the mean of one group has a specified value.
t-test or student's test
occurs when the null hypothesis is true, but it is rejected. In traditional hypothesis testing, one rejects the null hypothesis if the p-value is smaller than the significance level
type 1 error or alpha_error
According to the book, what is a quick and efficient way to find the concepts related to numbers that you don't know?
use a search engine