Data Chapter 1
Bivariate
Two variables. Ex: scatterplots, correlation coefficients.
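A minimal sketch of one bivariate measure, the Pearson correlation coefficient, written out by hand. The height and weight numbers and the function name pearson_r are made up for illustration and are not from the chapter.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation between two equally long lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (numerator of r).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each variable.
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical data: two variables recorded for the same five subjects.
heights_in = [60, 62, 65, 68, 70]
weights_lb = [110, 120, 140, 155, 170]
r = pearson_r(heights_in, weights_lb)
print(round(r, 3))
```

A value of r close to 1 means the two variables rise together almost linearly.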
Variable
A characteristic of a subject.
Population
All members of a specified group (not necessarily people). Consists of all items of interest in a statistical problem.
Nominal Scale
Also called categorical. Represents the least sophisticated level of measurement. Used for qualitative variables. All we can do with this data is categorize or group it. Ex: 30 companies make up the DJIA, and we can group them by the stock exchange they trade on.
Qualitative Variable
Also called a categorical variable. Uses labels or names to identify the distinguishing characteristic of each observation. Ex: The census asks people to identify their gender; other examples include race, profession, type of business, and the manufacturer of a car.
Quantitative Variable
Also called a numerical variable. Assumes meaningful numerical values. Can be either discrete or continuous.
Discrete Variable
Assumes a countable number of values. Ex: 90 points are scored in a basketball game, or you have 3 children. You can't have 90.2 points or 3.4 children. Money also counts as discrete: $20.37 is fine because it is a whole number of cents.
Data Set
Collection of data values as a whole.
Observation
Each data value. Ex: Weight is a variable, but the actual recorded weight is the observation.
Sample
A subset of a particular population. We usually rely on sample data to draw conclusions: we analyze sample data and calculate a sample statistic to make inferences about the unknown population parameter.
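A minimal sketch of the sample statistic versus population parameter idea. The population here is simulated (hypothetical heights in cm), so its mean can be computed directly; in a real problem that parameter is unknown and only the sample mean is available.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable

# Simulated population: 10,000 hypothetical heights in cm.
population = [random.gauss(170, 10) for _ in range(10_000)]
population_mean = statistics.mean(population)  # the parameter (normally unknown)

# Draw a sample of 50 subjects without replacement.
sample = random.sample(population, 50)
sample_mean = statistics.mean(sample)  # the statistic we actually compute

print(f"population mean: {population_mean:.2f}")
print(f"sample mean:     {sample_mean:.2f}")
```

The sample mean will not equal the population mean exactly, but it is usually close enough to support an inference about it.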
Continuous Variable
Characterized by uncountable values within a certain interval. Ex: weight, height, time. A weight can be 100.7 lbs.
Multivariate
More than 2 variables. Ex: multiple regression.
4 Data Measurements
Nominal scale, ordinal scale, interval scale, ratio scale.
Univariate
One variable (very rare). Ex: Weight on its own is a single variable.
Cross-sectional data
Data collected by recording a characteristic of many subjects at the SAME POINT IN TIME, or without regard to differences in time. Ex: the recorded scores of students in a class; the current price of gasoline in different states in the US.
Inferential Statistics
Drawing conclusions about a large set of data, called a population, based on a smaller set of sample data.
Descriptive Statistics
The summary of important aspects of a data set. Includes collecting data, organizing the data, and then presenting the data in the form of charts and tables.
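A minimal sketch of summarizing a data set with a few descriptive statistics, using Python's standard statistics module. The test scores are made up for illustration.

```python
import statistics

# Hypothetical data set: test scores for five students.
scores = [70, 85, 90, 85, 100]

mean_score = statistics.mean(scores)      # average value
median_score = statistics.median(scores)  # middle value when sorted
spread = statistics.stdev(scores)         # sample standard deviation

print(f"mean={mean_score}, median={median_score}, stdev={spread:.2f}")
```

These numbers describe only the data in hand; they make no claim about a larger population.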
Ordinal Scale
Stronger than the nominal scale. Used for qualitative variables. We can both categorize and rank the data with respect to some characteristic or trait. The only weakness is that we can't interpret the differences between ranked values, because the actual numbers used are arbitrary; differences between categories are meaningless. Ex: Classifying service at a hotel as excellent, good, fair, or poor.
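A minimal sketch of working with ordinal data, using the hotel-service labels from the example. The rank numbers in RANK and the list of reviews are assumptions for illustration; the numbers only encode order, so sorting and picking a middle rating are fine, but subtracting or averaging them is not meaningful.

```python
# Hypothetical rank codes: only their order matters, not their spacing.
RANK = {"poor": 1, "fair": 2, "good": 3, "excellent": 4}

# Hypothetical hotel service ratings.
reviews = ["good", "poor", "excellent", "fair", "good"]

# Ranking is allowed on an ordinal scale.
ranked = sorted(reviews, key=RANK.get)

# The middle rating of an odd-length ranked list is also meaningful.
median_rating = ranked[len(ranked) // 2]

print(ranked)
print(median_rating)
```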
Ratio Scale
Strongest level of measurement. Has all the characteristics of interval data but also has a true zero point. Used for measuring many types of data in business analysis. Quantitative data.
Interval Scale
We can categorize and rank the data, and we are also assured that the differences between scale values are meaningful. The only drawback is that the value of zero is arbitrarily chosen. Quantitative data. Ex: Temperature. The difference between 50 and 60 degrees is the same as the difference between 80 and 90 degrees.
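A minimal sketch of why an arbitrary zero matters, using the temperature example. It assumes the readings are in Fahrenheit and converts them to Celsius with a helper named f_to_c (a name chosen here for illustration). Differences stay equal after changing units, but ratios do not, which is what separates interval data from ratio data.

```python
def f_to_c(f):
    """Convert degrees Fahrenheit to degrees Celsius."""
    return (f - 32) * 5 / 9

# Differences are meaningful: equal gaps in Fahrenheit stay equal in Celsius.
diff_low_f = 60 - 50
diff_high_f = 90 - 80
diff_low_c = f_to_c(60) - f_to_c(50)
diff_high_c = f_to_c(90) - f_to_c(80)

# Ratios are not meaningful: "twice as hot" depends on where zero sits.
ratio_f = 80 / 40
ratio_c = f_to_c(80) / f_to_c(40)

print(diff_low_f == diff_high_f)
print(round(diff_low_c, 3), round(diff_high_c, 3))
print(ratio_f, round(ratio_c, 3))
```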
Subject (Individual)
An item under study.
Time series data
Data collected by recording a characteristic of a subject over several time periods, at SPACED POINTS IN TIME. Ex: monthly sales of cars at a dealership in 2010.
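A minimal sketch of time series data, assuming made-up monthly car sales for one dealership. Because the observations are ordered in time, a month-over-month change is a natural calculation; the same calculation would make no sense for cross-sectional data, where the order of subjects is arbitrary.

```python
# Hypothetical time series: (month, cars sold) for a single dealership.
monthly_sales = [("2010-01", 42), ("2010-02", 38), ("2010-03", 51)]

# Pair each month with the one before it and compute the change in sales.
changes = [
    (month, sold - prev_sold)
    for (_, prev_sold), (month, sold) in zip(monthly_sales, monthly_sales[1:])
]

print(changes)
```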
