ISDS 361A Chapter 1_PP
Basic Statistical Concepts in Inferential Statistics: (Random) Sample
A (random) subset chosen from the population
Basic Statistical Concepts in Inferential Statistics: Parameter (Variable)
A descriptive measure of the population that is of interest, e.g. the mean (unknown, so it is denoted by a Greek letter)
Basic Statistical Concepts in Inferential Statistics: Statistic
A descriptive measure that is calculated from the sample, e.g. the sample mean (Use regular letter)
What's Selection bias
One subset of experimental units in the population has either no chance, less of a chance, or more of a chance of being selected than another subset
Basic Statistical Concepts in Inferential Statistics: Population
A set of items (experimental units) under study
Which type of arithmetic operations are possible and which aren't?
All arithmetic operations are possible for interval data but not for nominal and ordinal data types
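A minimal sketch of which summaries make sense for each data type. The data values and the numeric ranks assigned to the credit ratings are hypothetical and chosen only for illustration.

```python
from statistics import mean, mode

# Hypothetical data, one list per measurement scale.
schools = ["Business", "Humanities", "Business", "Education"]   # nominal
ratings = ["Good", "Poor", "Excellent", "Fair", "Good"]          # ordinal
exam_scores = [72, 85, 90, 64, 85]                               # interval

# Nominal: only counting is meaningful, so the mode is a valid summary.
most_common_school = mode(schools)            # "Business"

# Ordinal: order is meaningful, so sort by rank and take the middle value.
# The rank numbers only encode order; their differences mean nothing.
rank = {"Poor": 0, "Fair": 1, "Good": 2, "Excellent": 3}
ordered = sorted(ratings, key=rank.get)
median_rating = ordered[len(ordered) // 2]    # "Good"

# Interval: all arithmetic is meaningful, so the mean is a valid summary.
average_score = mean(exam_scores)             # 79.2

print(most_common_school, median_rating, average_score)
```

Averaging the school labels or the rating labels would not produce a meaningful number, which is the point of the card above.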
Learning Goals
Data; Statistics; Descriptive vs. Inferential Statistics; Types of Descriptive Statistics; Elements of Inferential Statistics; Data Collection Methods; Inference errors from nonrandom samples
Sources of Statistical Data
Data can be extracted from a public source (Ex: Wall Street Journal, Orange County Business Journal). A designed experiment can be performed (Ex: test cavity prevention by dividing subjects into groups). A survey can be taken (Ex: presidential poll by phone or mail, TV program ratings by Nielsen). Observation studies can be made (Ex: observe output of workers on morning/evening shifts).
What is Cross-Sectional data
Data collected at the same or approximately the same point in time. Example: student grades, heights of 100 people
How do consumers feel about using the Internet for online shopping? To find out, a customer-experience software company commissioned a nationwide survey of 1859 U.S. adults who had conducted at least one online transaction in the past year. The findings, reported on BusinessWeek.com (2006), revealed that 1655 respondents or 89% experienced technical problems with an online transaction. Identify the data-collection method. Identify the target population. Are the sample data representative of the population?
Data source: survey. Population: all U.S. online shoppers with at least one online transaction in the past year. Are the sample data representative of the population? Complete information is not given, so one wonders whether there may be self-selection bias and non-response bias.
What are the two Types of Statistics
Descriptive and Inferential
Can you identify the type of statistics? Say, from a survey of a random sample of 5000 CSUF students, it has been learned that 80% of those sampled are very excited about studying statistics.
Descriptive statistics
Using Non-random Samples intentionally leads to
Results that are skewed on purpose; this is an unethical statistical practice
Why is statistics important? (2)
Do university students from different parts of the world perceive business ethics differently? How reliable are the quarterly forecasts for your firms? etc...
What is Randomized sample
Every item in the population has an equal chance of being in any particular sample. Silly example: How do you test a pot of soup for saltiness?
What is Sampling
Examining part of the whole, because examining the entire population is impractical, prohibitive, or costly
What is data, in general and in reality?
In general, data are facts and figures. In reality, data sets are often very large and are often stored in large computer databases.
Some shoppers may have been excluded from the survey for several reasons: did not see the survey at all, did not have time to respond, etc. On the other hand, some people may be eager to respond because they had the most difficulty with the online shopping experience.
So, in the end, we may have a nonrandom sample.
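The effect of eager responders can be sketched with a small simulation. Everything numeric here is an assumption made for illustration: a population of 100,000 shoppers, a true problem rate of 50%, and response rates of 80% for shoppers who had a problem versus 20% for those who did not. None of these figures come from the survey described above.

```python
import random

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: 100,000 shoppers, about half of whom had a problem.
population = [rng.random() < 0.5 for _ in range(100_000)]
true_rate = sum(population) / len(population)

# Assumed response behavior: shoppers who had a problem answer the survey
# 80% of the time, everyone else only 20% of the time.
def responds(had_problem):
    return rng.random() < (0.8 if had_problem else 0.2)

respondents = [p for p in population if responds(p)]
observed_rate = sum(respondents) / len(respondents)

print(f"true rate:     {true_rate:.2f}")
print(f"observed rate: {observed_rate:.2f}")
```

Under these assumed response rates, the rate observed among respondents lands far above the true rate, even though nobody reported anything false. The sample is simply not random.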
What are Measurement errors
Inaccuracies in getting/recording data; ambiguous questions on questionnaires, etc.
Can you identify the type of statistics? Using a survey of a random sample of 5000 California residents, a UCLA Economist told a local TV station that over 55% of Californians have a positive view about the future of the U.S. economy.
Inferential statistics
What is Statistics
A way to get information from data. In this course, we emphasize the use of STATISTICS for business and economic decision making.
Using Non-random Samples Unintentionally leads to
Unjustified or false conclusions
Purpose of Inferential Statistics
Making inferences about a parameter of a population based on information obtained from a statistic of the sample (with a certain degree of confidence)
Basic Statistical Concepts in Inferential Statistics
Population; Parameter (Variable); (Random) Sample; Statistic
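The four concepts can be lined up in a short sketch. The population here is made up (10,000 random ages between 12 and 60); in practice the population is rarely available in full, which is why the parameter is usually unknown.

```python
import random
from statistics import mean

rng = random.Random(1)  # fixed seed so the sketch is repeatable

# Population: hypothetical ages of 10,000 viewers (made-up numbers).
population = [rng.randint(12, 60) for _ in range(10_000)]

# Parameter: the population mean, usually unknown and written as mu.
mu = mean(population)

# (Random) sample: 500 units drawn at random from the population.
sample = rng.sample(population, 500)

# Statistic: the sample mean, computed from the sample only.
x_bar = mean(sample)

print(f"parameter mu = {mu:.2f}, statistic x_bar = {x_bar:.2f}")
```

The statistic varies from sample to sample, while the parameter stays fixed. Inference is the step of saying something about mu from x_bar.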
What is Interval Data
Quantitative or numerical in nature. Example: SAT scores, grades on an exam (%), income, height and weight, etc.
What is Ordinal Data
Same as nominal data, but there is an ordering or ranking to them that is meaningful. Example: Credit ratings of individuals can be classified as Excellent, Good, Fair, Poor.
What matters when Sampling?
Sample size matters, NOT the size of the population. A random sample of 100 students represents the student body as well as a random sample of 100 voters represents the entire electorate in the USA.
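A rough simulation can illustrate this claim. The population sizes, the 60% share of "yes" answers, and the helper name spread_of_sample_proportion are all assumptions made for this sketch. It measures how much the sample proportion varies across repeated random samples of 100 units.

```python
import random
from statistics import pstdev

rng = random.Random(2)  # fixed seed so the sketch is repeatable

def spread_of_sample_proportion(pop_size, n=100, share=0.6, reps=2000):
    """Standard deviation of the sample proportion over many random samples.

    Units 0 .. share*pop_size - 1 are treated as "yes" answers (an assumption
    made only so the population never has to be stored in memory).
    """
    cutoff = int(share * pop_size)
    proportions = []
    for _ in range(reps):
        sample = rng.sample(range(pop_size), n)
        proportions.append(sum(unit < cutoff for unit in sample) / n)
    return pstdev(proportions)

small = spread_of_sample_proportion(10_000)        # e.g. a student body
large = spread_of_sample_proportion(100_000_000)   # e.g. an electorate
print(f"n=100, population 10,000:      {small:.3f}")
print(f"n=100, population 100,000,000: {large:.3f}")
```

The two spreads should come out nearly identical, close to the textbook value sqrt(0.6 * 0.4 / 100), which depends on the sample size n but not on the population size.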
Types of Non-Random Sampling Errors
Selection Bias; Non-response Bias; Measurement Errors
What plays a role in making sense of complex data?
Statistics
An airline company is interested in the opinions of their frequent flyer customers about their proposed new routes. Specifically, they want to know what proportion of them plan to use one of their new hubs in the next 6 months. They take a random sample of 10,000 from the database of all frequent flyers and send them an e-mail message with a request to fill out a survey in exchange for 1500 miles. Describe the "population". Describe the variable of interest and possibly the "parameter" of interest. Describe the "sample". Describe the "statistics". Describe the "inference".
Population: all frequent flyer customers of the airline. Parameter: the proportion of frequent flyer customers who plan to use one of the new hubs in the next 6 months. Sample: the 10,000 frequent flyer customers who have been randomly selected. Statistic: the proportion of the 10,000 sampled customers who plan to use one of the new hubs in the next 6 months.
According to ABC Consulting (a made-up company that does not exist), the average age of viewers of "American Idol" is 23 years. But the producer of the show thinks that the average age is higher than 23. To test her hypothesis, the producer of the show samples 500 Idol viewers and determines the age of each. Describe the "population". Describe the variable of interest and possibly the "parameter" of interest. Describe the "sample". Describe the "statistics". Describe the "inference".
Population: all viewers of the American Idol TV show. Parameter: the average age of the TV show viewers. Sample: the 500 Idol viewers who have been randomly selected. Statistic: the average age of the sampled viewers. Inference: how to infer the average age of all viewers using the sampled data, with a certain degree of confidence.
A local TV company with customers in 15 towns is considering offering high-speed internet service on its cable lines. Before starting the new service, they want to find out whether customers would pay the $50 per month that they plan to charge. A graduate of a business school who works for the company has prepared several alternative plans for assessing customer demand. For each, indicate what (if any) biases might result. Put a big advertisement in the newspaper asking people to give their opinions on the company website. Randomly select one of the towns and contact every cable subscriber by phone. Send a survey to each customer and ask them to fill it out and return it. Randomly select 20 customers from each town, send them a survey, and follow up with a phone call if they do not return the survey within a week.
The problem of voluntary response: only those who both see the advertisement and feel strongly enough will respond. One town may not be typical of all towns, so the sample is not representative. Will have selection bias. This is good and unbiased.
What is Time Series Data
Time series data are collected over several time periods. Example: U.S. average price per gallon of gasoline between 2007 and 2012. Graphs of time series data are frequently found in business and economic publications.
What is the Goal of Data Collection
To obtain a "representative sample" that exhibits the characteristics of the entire population.
Why is statistics important
Using STATISTICS we draw conclusions (extract useful information!) from data. With the extracted information, statistics helps managers make valid business decisions in response to such questions as: What is the effect of advertising on sales? What is the relationship between shelf location and cereal sales? Do aggressive high-growth mutual funds really have higher returns than more conservative funds?
What's Non-response Bias
When data is unavailable or unattainable for certain experimental units in the population
What is Inferential Statistics
goes beyond the data at our disposal. More formally, it refers to drawing conclusions about a large set of data - called the population - based on a smaller set of sample data.
What are Scales of Data (measurement)? What are the 3 major types of measurement scales (arithmetic operations)?
They indicate the type of statistical analyses that are most appropriate. The 3 major types of measurement scales: Nominal, Ordinal, Interval
What's Nominal Data
Qualitative or categorical; labels are used to denote the classes/categories. Example: Students of a university are classified by the school in which they are enrolled using a label such as Business, Humanities, Education, and so on.
What is Descriptive Statistics
refers to the summary of important aspects of a data set. This includes collecting data, organizing the data, and then presenting the data in the forms of charts (figures), tables and numerical measures.
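A small sketch of the organize, present, and summarize steps, using ten hypothetical exam scores. The data are invented for illustration.

```python
from collections import Counter
from statistics import mean, median, stdev

# Collect: hypothetical exam scores for ten students.
scores = [64, 72, 85, 85, 90, 78, 85, 59, 72, 90]

# Organize: frequency table of each score.
frequency = Counter(scores)

# Summarize: numerical measures of center and spread.
summary = {
    "n": len(scores),
    "mean": mean(scores),
    "median": median(scores),
    "stdev": round(stdev(scores), 2),
}

# Present: a crude text bar chart, one row per distinct score.
for score, count in sorted(frequency.items()):
    print(f"{score:>3} | {'*' * count}")
print(summary)
```

Nothing here says anything about students outside these ten. That step would be inferential statistics.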
What is the most common approach to achieving the goal of data collection?
taking random samples where each experimental unit in the population theoretically has the same chance of being selected for the sample.
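The "same chance of being selected" property can be checked empirically. This sketch assumes a toy population of 10 units and uses Python's random.sample to draw simple random samples of size 3 many times.

```python
import random
from collections import Counter

rng = random.Random(3)  # fixed seed so the sketch is repeatable

units = list(range(10))     # hypothetical population of 10 experimental units
sample_size = 3
draws = 20_000

# Draw many simple random samples and count how often each unit is selected.
inclusion = Counter()
for _ in range(draws):
    inclusion.update(rng.sample(units, sample_size))

# Each unit's share of samples should be near sample_size / len(units) = 0.3.
for unit in units:
    print(unit, round(inclusion[unit] / draws, 3))
```

If some unit showed up far more or far less often than the others, the sampling method would have selection bias.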
If the sample is biased in the sense that there are non-random errors in it, then
the conclusion is SUSPECT!!
What is Business Analytics
the scientific process of transforming data into insights for making better business decisions.