Ch. 1 Statistics

Is the NUMERICAL value a Para/Stat? The avg annual salary for 35 of a company's 1200 accountants is $68,000.

Statistic. $68,000 is a numerical description of a sample of annual salaries.

What is a population?

-A collection of ALL outcomes, responses, measurements, or counts that are of interest. -Represents ALL possible measurements or outcomes that are of interest to us in a particular study. -An ENTIRE set of individuals or objects.

Sample

-A subset, or part of a population. -It refers to a portion of the population that is representative of the population from which it was selected.

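Completely randomized design

-Subjects are assigned to different Tx groups through random selection. -In some experiments, the experimenter may need to use blocks (groups of subjects with similar characteristics). -A commonly used design that does this is the randomized block design.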

What is a Parameter to a Population?

A PARAmeter is a numerical description of a POPulation characteristic.

What is a Statistic to a Sample?

A STATistic is a numerical description of a SAMPLE characteristic.
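
A minimal sketch of the parameter/statistic distinction, reusing the sizes from the accountants card (35 of 1200). The salary values are made up for illustration, and the names `population`, `sample`, `parameter`, and `statistic` are hypothetical; only the two sizes come from the card.

```python
import random
from statistics import mean

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: annual salaries of all 1200 accountants (made-up values).
population = [rng.randint(50_000, 90_000) for _ in range(1200)]

# Sample: 35 accountants chosen at random, without replacement.
sample = rng.sample(population, 35)

parameter = mean(population)  # describes the whole POPulation -> PARAmeter
statistic = mean(sample)      # describes only the sample -> STATistic
print(f"parameter = {parameter:.0f}, statistic = {statistic:.0f}")
```

The two numbers are usually close but not equal, which is why a statistic is treated as an estimate of the parameter.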

Sampling:

A census costs too much, so a sample is often used instead: a count or measure of PART of a population. Sampling is more commonly used in statistical studies.

Define an experimental study:

A researcher deliberately applies a -treatment- before observing responses. -A control group receives no Tx and may be given a placebo. -A treatment group does receive a Tx.

Define an observational study:

A researcher doesn't influence the responses. -Observes and measures charac's of interest of part of a population but doesn't change existing conditions.

Which branch of statistics, and what conclusions can be drawn from the study using inferential statistics. A study shows that senior citizens who live in Florida have better memories than senior citizens who do NOT live in Florida.

An inference is that senior citizens in Florida have better memories than senior citizens not living in Florida.

Define the descriptive branch of statistics:

Branch of statistics that involves the organization, summarization, and display of data.

Define the inferential branch of statistics:

Branch of statistics that involves using a sample to draw conclusions about a population. A basic tool in the study of inferential statistics is probability.

Let's say a population falls into naturally occurring subgroups like zones in a county, zip codes, or different branches of a bank. What can be used?

Cluster Sample -Use when the population falls into naturally occurring subgroups, each with similar characteristics. -Divide the population into these groups (clusters), and select all of the members in 1/more (but not all) of the clusters.
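
A rough sketch of cluster sampling, assuming a tiny made-up population of bank customers grouped by branch. The branch names, member labels, and the helper `cluster_sample` are all hypothetical; the point is only that whole clusters are picked at random and every member of a picked cluster is kept.

```python
import random

rng = random.Random(1)  # fixed seed for a repeatable sketch

# Hypothetical population grouped into naturally occurring clusters (bank branches).
clusters = {
    "Branch A": ["a1", "a2", "a3"],
    "Branch B": ["b1", "b2"],
    "Branch C": ["c1", "c2", "c3", "c4"],
    "Branch D": ["d1", "d2"],
}

def cluster_sample(clusters, k, rng):
    """Pick k whole clusters at random and keep EVERY member of each one."""
    chosen = rng.sample(sorted(clusters), k)
    members = [m for name in chosen for m in clusters[name]]
    return chosen, members

chosen, sample = cluster_sample(clusters, 2, rng)
print(chosen, sample)
```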

ID the sampling technique: A Pg study in Cebu, Philippines, randomly selects 33 communities from the Cebu metro area, then interviews all Pg women in these communities.

Cluster sampling is used because each community is considered a cluster and every Pg woman in a selected community is surveyed. A potential source of bias is that selected communities may not be representative of the entire area.

What is the third guideline when designing a statistical study?

Collect the data

Define qualitative data (charac of data):

Consists of attributes, labels, or nonnumerical entries.

Define quantitative data (charac of data):

Consists of numerical measurements or counts.

What are 3 key elements of a well-designed experiment?

Control, Randomization, Replication. Mnemonic: CONTROL those RANDOM thoughts of REPLICATING yourself.

ID the sampling technique: Questioning students as they leave a university library, a researcher asks 358 students about their drinking habits.

Convenience sampling is used because the students are chosen due to their convenience of location. Bias may enter into the sample because the students may not be representative of the population of students.

ID the sampling technique: 25 students are randomly selected from each grade level at a high school and surveyed about their study habits.

Stratified sampling, because the students are divided into strata (grade levels) and a sample is randomly selected from each grade level.

Count or measure of an entire population:

Census: a count or measure of an ENTIRE population.

That Nominal nub needs to qual only!

Data at this level are qual only and are categorized using names, labels, or qualities. No mathematical computations can be made. That Nominal nub needs to qual only! Put him in the category that involves calling him names, labeling him a nonqual, and reminding him his quality is NO GOOD. Remember, nominal data simply represents a LABEL.

What is the fourth guideline when designing a statistical study?

Describe the data, using descriptive statistics techniques.

Which branch of statistics, and what conclusions can be drawn from the study using inferential statistics. In a sample of Wall Street analysts, the percentage who incorrectly forecasted high-tech earnings in a recent year was 44%.

Descriptive statistics - "The percentage of Wall Street analysts who incorrectly forecasted high-tech earnings in a recent year was 44%." A possible inference drawn from the study is that the stock market is difficult to forecast, even for professionals.

Which branch of statistics, and what conclusions can be drawn from the study using inferential statistics. A large sample of men, aged 48, was studied for 18 years. For unmarried men, 70% were alive at age 65. For married men, 90% were alive at age 65.

Descriptive statistics - "For unmarried men, 70% were alive at age 65," and "for married men, 90% were alive at age 65." A possible inference is that being married is associated with a longer life for men.

What is the second guideline when designing a statistical study?

Develop a detailed plan for collecting the data. If you use a sample, make sure the sample is representative of the population.

Systematic Sample:

Each member of a population is assigned a number and the members are ordered in some way. A starting number is randomly selected, and then members are picked at regular intervals from the starting number.
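
One way to sketch systematic sampling, assuming the population is already numbered 1 to 100 and the interval is 10. The helper `systematic_sample` and the choice of a random start inside the first interval are illustrative assumptions, not something the card specifies.

```python
import random

def systematic_sample(members, step, rng):
    """Pick a random start within the first interval, then every step-th member."""
    start = rng.randrange(step)  # index 0 .. step-1
    return members[start::step]

rng = random.Random(2)           # fixed seed for a repeatable sketch
members = list(range(1, 101))    # hypothetical population, numbered 1..100
sample = systematic_sample(members, 10, rng)
print(sample)
```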

ID the sampling technique: Opinion of students on stem cell research. You assign each student a number and generate random numbers. You then question each student whose number is randomly selected.

Each sample of the same size has an equal chance of being selected and each student has an equal chance of being selected, so this is a simple random sample.

What is a confounding variable?

Experimental results can be ruined by a variety of factors. One is a confounding variable, which occurs when the experimenter can't tell the difference between the effects of different factors on the variable. (The experimenter is confounded.)

Think about the table with the Yankees' World series victories (yrs). What data level does this constitute?

First, the years represent quant data. You can find differences (subtraction) between specific dates, like finding out how many years it's been since they won. It doesn't make any sense to say one year is a multiple of another, which puts this data at the interval level.

What is the Hawthorne Effect?

The Hawthorne effect occurs when subjects change their behavior simply because they know they are participating in an experiment.

Think about the table about the 2012 American League home run totals (by team).

Here you CAN find differences AND write ratios. From the data, you can see that Bmore hit 39 more home runs than Detroit hit. This data is at the ratio level.

What is the sixth guideline when designing a statistical study?

Identify any possible errors

What is the first guideline when designing a statistical study?

Identify the variable(s) of interest (the focus) and the population of the study.

What is the fifth guideline when designing a statistical study?

Interpret the data and make decisions about the population using inferential statistics.

NOIR LOM? Items on a physician's intake form. Find the LOM of each: Temp, Allergies, Weight, Pain level.

Temp - Interval. Allergies - Nominal. Weight - Ratio. Pain level - Ordinal.

LOM NOIR - The years of birth for the runners in the Boston marathon

Interval - because meaningful differences between entries can be calculated, but a zero entry is not an inherent zero.

What's the significant difference between Ordinal and Interval LOM?

Interval data can also be ordered, and meaningful differences between data entries CAN BE calculated. Here, a zero entry simply represents a position on a scale; the entry itself is NOT an inherent/essential zero. An inherent zero means NONE, like having zero dollars in your bank account: it represents NO MONEY. A temp of 0 degrees C doesn't represent a condition where no heat is present. It's simply a position on the Celsius scale - it is NOT an inherent zero.

Avg monthly Temps (F) for Denver, Co. - NOIR LOM? Jan 30.7 Feb 32.5 ...

Interval. The data are quant, can be put in order, and you CAN FIND DIFFERENCES BETWEEN VALUES, but 0 degrees F is not an inherent zero.

NOIR LOM? Years a TV show on ABC won the Emmy for best comedy series: 1955, 1981, 2012

Interval. Data can be ordered, meaningful differences can be calculated, but it doesn't make sense to say one year is a multiple of another.

Experimental design where subjects are paired up according to a similarity be it age, geographical location, or physical charac.

Matched-pairs design. One subject in each pair receives a Tx; the other receives a placebo or a different Tx.
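
A small sketch of a matched-pairs assignment, assuming subjects are paired on age. The subject names, ages, and the helper `matched_pairs` are made up; pairing neighbours in an age-sorted list is just one simple way to form similar pairs.

```python
import random

def matched_pairs(subjects, rng):
    """Pair neighbours by age, then randomly decide who in each pair gets the Tx."""
    ordered = sorted(subjects, key=lambda s: s[1])  # sort by age
    pairs = []
    for i in range(0, len(ordered) - 1, 2):
        pair = [ordered[i], ordered[i + 1]]
        rng.shuffle(pair)  # coin flip within the pair
        pairs.append({"treatment": pair[0], "placebo": pair[1]})
    return pairs

rng = random.Random(7)  # fixed seed for a repeatable sketch
# Hypothetical subjects as (name, age).
subjects = [("Ana", 34), ("Ben", 61), ("Cy", 35), ("Dee", 60),
            ("Eli", 22), ("Flo", 23), ("Gus", 48), ("Hal", 47)]
pairs = matched_pairs(subjects, rng)
for p in pairs:
    print(p)
```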

Data consists of MOC information observed by Count Chocula where does it come from?

Measurements or responses, Observations, Counts

Another charac of data includes a form of measurement while drinking NOIR

Nominal, Ordinal, Interval, Ratio

TV Shows - NOIR LOM? Comedy, Drama, etc

Nominal, with no meaningful calculations possible - a show televised by the network can only be put into one of the 8 categories shown.

NOIR LOM? Jersey numbers for players on a soccer team: 5, 9, 88, 8, etc...

Nominal. No mathematical computations can be made; the numbers only serve as labels for the players.

What is a biased sample?

One that is not representative of the population from which it's drawn. A sample consisting of only 18-22 y/o college students would not represent ALL of the 18-22 y/o population in the country.

LOM NOIR - List of badge numbers of police officers at a precinct.

Ordinal - badge numbers can be ordered and often indicate seniority of service, but no meaningful math can be performed.

LOM NOIR - The top 10 grossing films released in a year

Ordinal - because data can be arranged in order, but the differences between data entries make no sense.

Ordinary ordinal measurements are qual or quan data. Define.

Ordinal data measurements are not ordinary, they are qual or quan arranged in order, ranked, but differences between data entries are NOT MEANINGFUL. Ordinary military people are normally placed in order and ranked, but differences between them and their data entries (where they're from) are NOT meaningful.

Motion picture association of America ratings - NOIR LOM? G, PG, etc.

Ordinal. The data are qual and can be put in order: a PG rating has a stronger restriction than a G rating.

NOIR LOM? Top 5 fiction books on the NYT best sellers list on 12/23/12: 1. Threat Vector 2. Gone Girl 3. The Forgotten

Ordinal. Data can be arranged in order, but differences between data entries are not meaningful.

Is the NUMERICAL value a Para/Stat? In a recent study of math majors at a university, 10 students were minoring in physics.

Parameter. The 10 students minoring in physics is a numerical description of all math majors at a university.

Is the NUMERICAL value a Para/Stat? 62 of the 97 passengers aboard the Hindenburg airship survived its explosion.

Parameter. The 62 surviving passengers out of 97 total passengers is a numerical description of all of the passengers of the Hindenburg that survived.

Is the NUMERICAL value a Para/Stat? At a college, 90% of the members of the Board of Trustees approved the contract of the new president.

Parameter. The 90% of members that approved the contract of the new president is a numerical description of all board of Trustees members.

Is the NUMERICAL value a Para/Stat? In 2012, MLB teams spent a total of over $2 billion on players' salaries.

Parameter. The value of over $2 billion is a numerical description of the total player salary for all of the players in MLB.

P/S? Employees of a given company

Population

P/S? The grains of sand on the beaches of the world.

Population

P/S? The number of widgets manufactured by a company that plans to be in business forever.

Population

2 Types of data sets used when studying statistics

Population Sample

Is the NUMERICAL value a Para/Stat? The freshman class at a university has an average SAT math score of 514

Population Parameter

P/S? The political party of every president.

Population because it is a collection of ALL the US presidents' political parties.

P/S? Number of airplanes owned by an airline.

Population.

P/S? Potential consumers in a target market.

Population.

P/S? Final score of each golfer in a tournament.

Population. Because it is a collection of ALL the golfers' scores in the tournament.

P/S? Revenue of each of the 30 companies in the Dow Jones Industrial Average.

Population. It's a collection of the revenues of the 30 companies in the DJIA. It represents ALL possible measurements or outcomes of interest in this particular study.

Which is the population and sample? A. Ages of adults in the US who own cell phones B. Ages of adults in the U.S. who own Samsung cell phones

Population: A Sample: B

Which is the population and sample? A. Parties of registered voters in Warren County B. Parties of Warren County voters who respond to an online survey.

Population: A Sample: B

Find the population and sample: Survey of 55 US law firms found the average hourly billing rate was $425.

Population: Collection of the average hourly billing rates of all US law firms Sample: Collection of the average hourly billing rates of the 55 US law firms surveyed.

Find the population and sample: Survey of 202 pilots found that 20% admit that they have made a serious error due to sleepiness.

Population: Collection of the effect of sleepiness on all pilots Sample: Collection of the effect of sleepiness on the 202 pilots surveyed.

Find the population and sample: Survey of 12,082 US adults found that 45.5% received the flu vaccine for a recent flu season.

Population: Collection of the immunization status of all adults in the US. Sample: Collection of the immunization status of the 12,082 US adults surveyed.

ID the population and sample: A Study of dietary habits of 20,000 men was conducted to find a link between high intakes of dairy products and prostate cancer.

Population: Collection of the prostate conditions of all men Sample: Collection of the prostate conditions of the 20,000 men in study.

Find the population and sample: Survey of 2311 US adults found that 84 % have seen a health care provider at least once in the past year.

Population: Collection of the responses of all US adults Sample: Collection of the responses of the 2311 US adults that were sampled

Find the population and sample: Survey of 1503 US adults found that 78% favor government policies requiring better fuel efficiency for vehicles.

Population: Collection of the responses of all US adults Sample: Collection of the responses of the 1503 US adults that were sampled

Find the population and sample: A survey of 1015 US adults found that 32% have had to put off medical care for themselves or their family in the past year due to the cost.

Population: Collection of the responses of all adults in the US Sample: Collection of the responses of the 1015 US adults surveyed

Find the population and sample: To gather information about starting salaries at companies listed in the Standard & Poor's 500, a researcher contacts 65 of the 500 companies.

Population: Collection of the starting salaries at all 500 companies listed in the Standard & Poor's 500. Sample: Collection of the starting salaries of the 65 companies listed in the Standard & Poor's 500 that were contacted by the researcher.

Ordinal LOM charac:

Qual or Quant where you can: -Put data in categories -Arrange data in order

Interval LOM charac:

Quant where you can: -Put data in categories -Arrange data in order -Subtract data values

Ratio LOM charac:

Quant where you can: -Put data in categories -Arrange data in order -Subtract data values -Determine whether one data value is a multiple of another

Nominal LOM characteristic:

Qual where you can: -Put data in categories

Determine whether the following information is qualitative or quantitative: List of debit card PIN numbers

Qual, because debit pin numbers are labels and it does not make sense to find differences between numbers

Determine whether the following information is qualitative or quantitative: Eye colors of models

Qual, because eye colors are attributes

Determine whether the following information is qualitative or quantitative: Responses on an opinion poll

Qual, because the poll responses are attributes

Determine between qual/quant, and LOM of the data: Top Salespeople. The regions representing the top salespeople in a corporation for the past six years. Southeast Northwest Northeast Southwest

Qual. Nominal. No math can occur and data are categorized by region.

Determine between qual/quant, and LOM of the data: Top 5 music albums for 2012. 1. Adele - 21 2. Michael Buble - Xmas 3. Drake - Take Care 4. Taylor Swift - Red

Qual. Ordinal. Data can be arranged in order but the differences between data entries make no sense.

Determine between qual/quant, and LOM of the data: Football. Top 5 teams in the final college football poll released in Jan 2013. 1. Alabama 2. Oregon 3. Ohio State

Qualitative. Ordinal. Data can be arranged in order, but the differences between data entries are NOT meaningful.

Determine whether the following information is qualitative or quantitative: The ages of a sample of 350 employees of a software company.

Quant, because ages are numerical measurements

Determine whether the following information is qualitative or quantitative: Heights of hot air balloons

Quant, because balloon heights are numerical measurements

Determine whether the following information is qualitative or quantitative: The final scores of a video game

Quant, because final scores are numerical measurements

Determine whether the following information is qualitative or quantitative: Weights of infants at a hospital

Quant, because infant weights are numerical measurements

Determine whether the following information is qualitative or quantitative: The revenues of the companies on the Fortune 500 list.

Quant, because revenues are numerical measurements

A process of randomly assigning subjects to different treatment groups:

Randomization
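
A minimal sketch of randomization, assuming 12 made-up subjects and two groups named "treatment" and "control". The helper `randomize` is hypothetical; shuffling and then dealing subjects out in turn is one common way to get random, equal-sized groups.

```python
import random

def randomize(subjects, groups, rng):
    """Shuffle the subjects, then deal them out to the groups in turn."""
    shuffled = subjects[:]          # copy so the original list is untouched
    rng.shuffle(shuffled)
    assignment = {g: [] for g in groups}
    for i, subject in enumerate(shuffled):
        assignment[groups[i % len(groups)]].append(subject)
    return assignment

rng = random.Random(6)  # fixed seed for a repeatable sketch
subjects = [f"subject_{i}" for i in range(1, 13)]  # hypothetical subjects
assignment = randomize(subjects, ["treatment", "control"], rng)
for group, members in assignment.items():
    print(group, members)
```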

NOIR LOM? Gender profiles of the 112th Congress: Number - 100, 200, 300, etc. Gender - Men, women

Number - Ratio. Gender - Nominal.

NOIR LOM? How serious of a Problem is Global Warming? Percents - 5, 10, 15, etc. Response - Very serious, Somewhat serious, Not too serious, etc...

Percents - Ratio. Response - Ordinal.

LOM NOIR - Horsepower of racing car engines

Ratio - because one data entry can be expressed as a multiple of another.

Avg monthly precipitation (in) in Orlando, Fl - NOIR LOM? Jan 2.35 Feb 2.38 ...

Ratio. The data are quant, can be put in order, you can find differences between values, AND find RATIOS OF VALUES: 7.58/3.77 ≈ 2. Here, there's about TWICE AS MUCH rain in June as in March.

Replication improves the validity of the experimental results. What is it?

Repetition of an experiment under the same or similar conditions.

ID the sampling technique: Chosen at random, 580 customers at a car dealership are contacted and asked their opinions of the service they received.

SRS. Because each customer has an equal chance of being contacted, and all samples of 580 customers have an equal chance of being selected.

Is the NUMERICAL value a Para/Stat? A recent survey of approximately 400,000 employers reported that the average salary for marketing majors is $53,400.

Sample Statistic

Is the NUMERICAL value a Para/Stat? In a random check of 400 retail stores, the FDA found that 34% of the stores were not storing fish at the proper temperature.

Sample Statistic

ID the sampling technique: Opinion of students on stem cell research. You select students who are in your biology class.

Sample is taken from something readily available - convenience sample. Sample may be biased because biology students can be more familiar with stem cell research than others.

Convenience sample

A sample that consists only of members of the population that are easy to get. It often leads to bias and is not recommended.

The number of subjects in a study

Sample size, also known as n

Simple random sample:

Sample where every possible -sample- of the same size has the same chance of being selected. Every member of the population also has the same chance of being chosen.
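
A short sketch of drawing a simple random sample with Python's standard `random.sample`, which picks a subset without replacement. The 200 student labels and the sample size of 20 are made up for illustration.

```python
import random

rng = random.Random(3)  # fixed seed for a repeatable sketch

# Hypothetical population: 200 numbered students.
students = [f"student_{i}" for i in range(1, 201)]

# Draw 20 distinct students at random, without replacement.
sample = rng.sample(students, 20)
print(sample)
```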

P/S? Cholesterol levels of 20 patients in a hospital with 100 patients.

Sample. Because the collection of the 20 patients is a SUBSET of the population of 100 patients at the hospital.

P/S? Survey of 500 spectators from a stadium with 42,000 spectators.

Sample. The collection of the 500 spectators is a subset of the population of 42,000 spectators at the stadium.

If it's acceptable for the same member of a population to be selected more than once:

Sampling process is said to be -with replacement-
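
A small sketch contrasting the two sampling processes, assuming a made-up population of five labels. `random.choices` draws with replacement (repeats allowed) and `random.sample` draws without replacement (each member at most once); both are standard-library calls.

```python
import random

rng = random.Random(5)  # fixed seed for a repeatable sketch
population = ["A", "B", "C", "D", "E"]  # hypothetical members

# WITH replacement: a member can be selected more than once,
# so the sample may even be larger than the population.
with_replacement = rng.choices(population, k=8)

# WITHOUT replacement: each member can be selected at most once.
without_replacement = rng.sample(population, k=4)

print(with_replacement)
print(without_replacement)
```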

Ratio LOM.

Similar to Interval, and here that little zero IS a hero (an inherent zero)! Here, a ratio of two data entries can be formed so that one data entry can be meaningfully expressed as a multiple of another. Think about the twice as much concept. 2 bucks is twice as much as 1 buck. 2 degrees C is NOT twice as warm as 1 degree C (This data would be at the interval level).
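
A quick arithmetic sketch of why "twice as much" works for dollars but not for degrees C. Converting to kelvins (a scale with an inherent zero) is an illustration added here, not something the card mentions; the helper `celsius_to_kelvin` is hypothetical.

```python
def celsius_to_kelvin(c):
    """Shift Celsius onto the Kelvin scale, whose zero is an inherent zero."""
    return c + 273.15

# Ratio level: dollars have an inherent zero, so the ratio is meaningful.
dollars_ratio = 2 / 1

# Interval level: 2 C / 1 C looks like 2, but on a scale with an
# inherent zero the two temperatures are nearly identical.
kelvin_ratio = celsius_to_kelvin(2) / celsius_to_kelvin(1)

print(dollars_ratio)           # 2.0
print(round(kelvin_ratio, 4))  # about 1.0036
```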

ID the sampling technique: Using random digit dialing, researchers call 1400 people and ask what obstacles (such as childcare) keep them from exercising.

Simple random sampling because each telephone number has an equal chance of being dialed, and all samples of 1400 phone numbers have an equal chance of being selected. The sample may be biased because telephone sampling only samples those individuals who have telephones, who are available, and who are willing to respond.

ID the sampling technique: Using random digit dialing, researchers ask 1003 US adults their plans on working during retirement.

Simple random sampling is used because random telephone numbers were generated and called. A potential source of bias is that telephone sampling only samples individuals who have telephones, are available, and are willing to participate.

There are several ways of collecting data - S/S This one gets all 'Mathy' on you, reminds you just bought a LEGIT Lenovo laptop, and you don't even have to worry about life/death situations while saving time and money.

Simulation: -Use of a mathematical or physical model to reproduce the conditions of a situation or process. -Data collection often uses computers -Allows you to study situations impractical or even dangerous to create in real life, often saving time and money.

Is the NUMERICAL value a Para/Stat? In a survey of 300 computer users, 8% said their computers had malfunctions that needed to be repaired by service technicians.

Statistic. 8% is a numerical description of a sample of computer users.

Is the NUMERICAL value a Para/Stat? A Survey of 1000 US adults found that 40% think that the Internet is the best way to get news and information.

Statistic. The 40% is a numerical description of a sample of US adults.

Is the NUMERICAL value a Para/Stat? A survey of 733 small business owners found that 17% have a current job opening.

Statistic. The value 17% is a numerical description of a sample of small business owners.

Is the NUMERICAL value a Para/Stat? A survey of 1004 US adults found that 52% think that China's emergence as a world power is a major threat to the well-being of the US.

Statistic. The value 52% is a numerical description of a sample of US adults.

What is the science of collecting, organizing, analyzing, and interpreting data in order to make decisions?

Statistics

ID the sampling technique: Soybeans are planted on a 48-acre field. The field is divided into one-acre subplots. A sample is taken from each subplot to estimate the harvest.

Stratified sampling is used because a sample is taken from each one-acre subplot.

ID the sampling technique: Opinion of students on stem cell research. You divide the student population with respect to majors and randomly select and question some students in each major.

Students are divided into strata (majors) and a sample is selected from each major - Stratified sample

There are several ways of collecting data - S/S People investigating by interviewing, checking Facebook, calling their BFF, or mailing a payment.

Survey: -Investigation of 1/more charac's of a population -Carried out by people asking questions -MC types are done by interview, internet, phone, or mail -When designing DON'T use leading questions, results in bias.

ID the sampling technique: A journalist goes to a campground to ask people how they feel about air pollution

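Convenience sampling, because the journalist questions people who are readily available at one location. The sample may be biased because people at a campground may not be representative of the population.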

What is another factor that can affect experimental results?

The Placebo Effect: When subjects react favorably when no Tx has been prescribed.

What can be used to help control the placebo effect?

The blinding technique - where subjects don't know if they are receiving a Tx or placebo. In a double-blind experiment, neither the experimenter nor the subjects know if the subjects are receiving a Tx or placebo.

Describe the data and LOM: HR (bpm) of an athlete during an exercise session.

The data is the collection of HRs, at the Ratio level because the data can be ordered, meaningful differences can be calculated, the data can be written as a ratio, and the data set contains an inherent zero.

Describe the data and LOM: BODY TEMPS! (F) of an athlete during an exercise session.

The data is the collection of body temps and is at the interval LOM because the data can be ordered and meaningful differences can be calculated, but it doesn't make sense to write a ratio using the temps.

What dictates the best way to collect data?

The focus of the study.

For a deeper understanding of a population, consider a market researcher for a soft drink company who might want to determine the sweetness preferences of Americans between the ages of 15 and 25. What is the population?

The population in this example includes ALL Americans in this age group.

What must a researcher be sure of when using sampling techniques?

To collect unbiased data, he/she needs to make sure that the sample is representative of the population. This makes it possible to draw appropriate inferences about the population from the study.

What's a good idea in terms of size when dealing with these experimental units?

Use the same number of subjects

Random sample:

When -every member- of A POPULATION has an -equal chance- of being selected. Population: -A collection of ALL outcomes, responses, measurements, or counts that are of interest. -Represents ALL possible measurements or outcomes that are of interest to us in a particular study. -An ENTIRE set of individuals or objects, which may be finite or infinite.

When should you use a stratified sample?

When it's important for the sample to have members from each SEGMENT of the POPULATION. Here, members are divided into two or more subsets called -strata- that share a SIMILAR CHARAC like age, gender, ethnicity, or even political preference. A sample is then randomly selected from each stratum, so each segment of the population is represented. Like using socioeconomic status as a divider.
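
A rough sketch of stratified sampling, assuming 30 made-up students split evenly across three majors and 3 students drawn per major. The helper `stratified_sample`, the major names, and the equal per-stratum size are illustrative assumptions.

```python
import random
from collections import defaultdict

def stratified_sample(population, key, per_stratum, rng):
    """Group members into strata by key, then randomly sample from EVERY stratum."""
    strata = defaultdict(list)
    for member in population:
        strata[key(member)].append(member)
    sample = []
    for name in sorted(strata):
        sample.extend(rng.sample(strata[name], per_stratum))
    return sample

rng = random.Random(4)  # fixed seed for a repeatable sketch
majors = ["biology", "history", "math"]
# Hypothetical population as (student, major).
population = [(f"s{i}", majors[i % 3]) for i in range(30)]
sample = stratified_sample(population, key=lambda s: s[1], per_stratum=3, rng=rng)
print(sample)
```

Unlike a cluster sample, which keeps every member of only some groups, this keeps some members of every group.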

If it's NOT acceptable to have the same member selected more than once from a population:

Without replacement

