Statistics introduction and Variables
Rule
1. If you can meaningfully add the values (EVERY value), the variable is quantitative. For example, a G.P.A. of 3.3 and a G.P.A. of 4.0 can be added together (3.3 + 4.0 = 7.3), so G.P.A. is quantitative. 2. If you can't add the values (EVERY value), the variable is categorical. For example, you can't add cat + dog, or Republican + Democrat.
descriptive statistics
Organizing, describing, and summarizing data. These methods do not involve generalizing beyond the data at hand.
parameter
a number that is a property of the population
statistic
A number that represents a property of the sample; it summarizes the data collected from a sample.
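The difference between a parameter and a statistic can be sketched in Python. The population values, the sample size of 4, and the seed are made up for illustration only.

```python
import random
from statistics import mean

# Hypothetical population: exam scores for all 10 students in a class.
population = [62, 70, 75, 78, 81, 84, 88, 90, 93, 99]

# Parameter: a number computed from the whole population.
population_mean = mean(population)

# Draw a sample of 4 students; the fixed seed only makes the run repeatable.
rng = random.Random(0)
sample = rng.sample(population, 4)

# Statistic: the same kind of number, computed from the sample only.
sample_mean = mean(sample)

print("parameter (population mean):", population_mean)
print("statistic (sample mean):", sample_mean)
```

The statistic will usually differ somewhat from the parameter, and a different sample would give a different statistic.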
Sample
A set of data collected and/or selected from a statistical population by a defined procedure. The elements of a sample are known as sample points, sampling units, or observations.
independent variable
A variable whose variation does not depend on that of another variable, and that causes or influences an associated factor or phenomenon called the dependent variable. In an experiment, it is manipulated by the experimenter.
categorical variables
Another name for qualitative variables. Their values have no natural sense of ordering, so they are measured on a nominal scale. Hair color (Black, Brown, Gray, Red, Yellow) is a qualitative variable, as is name (Adam, Becky, Christina, Dave . . .). Qualitative variables can be coded to appear numeric, but the numbers are meaningless, as in male = 1, female = 2.
Quantitative data
Are always numbers. They are the result of counting or measuring attributes of a population.
quantitative discrete data
All data that are the result of counting
quantitative continuous data
All data that are the result of measuring
What separates continuous random variables from countably infinite discrete ones is that
continuous variables are uncountably infinite; they have too many possible values to list out or to count and/or they can be measured to a high level of precision (such as the level of smog in the air in Los Angeles on a given day, measured in parts per million).
Qualitative variables/categorical
Express a qualitative attribute such as hair color, eye color, religion, favorite movie, or gender. Values of a qualitative variable do not imply a numerical ordering. Values of the variable "religion" differ qualitatively; no ordering of religions is implied.
representative sample
A sample that reflects the characteristics of the population it was drawn from.
sampling
Selecting a subset of the population by a defined procedure and studying it to gain information about the population.
constants
Values, such as π, that do not vary.
probability
the extent to which an event is likely to occur, measured by the ratio of the favorable cases to the whole number of cases possible.
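As a minimal sketch of this ratio, the Python example below counts favorable cases for one roll of a die. It assumes all six outcomes are equally likely, which is what the "favorable cases over possible cases" definition relies on; the die and the event "roll is even" are illustrative choices.

```python
from fractions import Fraction

# All possible cases for one roll of a fair six-sided die (assumed equally likely).
outcomes = [1, 2, 3, 4, 5, 6]

# Favorable cases: the roll is an even number.
favorable = [x for x in outcomes if x % 2 == 0]

# Probability = favorable cases / whole number of possible cases.
probability = Fraction(len(favorable), len(outcomes))

print(probability)  # 1/2
```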
statistics
the science dealing with the collection, analysis, interpretation, and presentation of data
population
The total "set" of observations that can be made. All elements, individuals, or units that meet the selection criteria for a group to be studied, and from which a representative sample is taken for detailed examination.
Continuous Variables
1. Variables that can take on any value in a certain range. Time and distance are continuous. 2. No measured variable is truly continuous; however, a discrete variable that is measured with enough precision, and so takes on enough distinct values, can often be considered continuous for practical purposes. One example is time to the nearest millisecond: it can take values such as 1.001, 1.003, 2.009, 45.008. 3. Commonly used distributions for continuous random variables are the normal distribution and the t-distribution.
data
1. Information that has been collected from an experiment, a survey, or a historical record. 2. Data are the actual values of the variable. They may be numbers or they may be words.
Quantitative variables/Numerical variables
1. Measured in terms of numbers. 2. Have values that describe a measurable quantity as a number, like "how many" or "how much": height, weight, and shoe size. 3. Take on values with equal units, such as weight in pounds. 4. Variables that are not qualitative.
Categorical data are summarized by
1. Percentages, such as the percentage of Republicans and of Democrats. 2. Two-way tables, which summarize the information from two categorical variables at once, such as gender and political party, and make it easy to calculate the percentage of individuals in each combination of categories. 3. Pie charts. 4. Bar graphs. 5. Frequency tables with relative frequency.
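A frequency table and a two-way table can be sketched with the Python standard library. The survey responses below are invented for illustration, and the variable names are hypothetical.

```python
from collections import Counter

# Hypothetical survey responses: (gender, party) for each respondent.
responses = [
    ("Female", "Democrat"),
    ("Female", "Republican"),
    ("Male", "Democrat"),
    ("Male", "Republican"),
    ("Female", "Democrat"),
    ("Male", "Republican"),
    ("Female", "Democrat"),
    ("Male", "Democrat"),
]
n = len(responses)

# Frequency table for one categorical variable (party), with relative frequency.
party_counts = Counter(party for _, party in responses)
party_percent = {party: 100 * count / n for party, count in party_counts.items()}

# Two-way table: counts and percentages for each (gender, party) combination.
two_way = Counter(responses)
two_way_percent = {combo: 100 * count / n for combo, count in two_way.items()}

for party, count in party_counts.items():
    print(party, count, f"{party_percent[party]:.1f}%")
for (gender, party), count in two_way.items():
    print(gender, party, count, f"{two_way_percent[(gender, party)]:.1f}%")
```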
Discrete Variable
1. Possible scores are discrete points on the scale: a household could have three children or six children, but not 4.53 children. 2. Variables that can take on only a finite or countably infinite number of values. 3. All qualitative variables are discrete. 4. Some quantitative variables are discrete, such as performance rated as 1, 2, 3, 4, or 5, or temperature rounded to the nearest degree. 5. A discrete variable is finite if its list of possible values has a fixed (finite) number of elements in it (for example, the number of smokers in a random sample of 100 voters has to be between 0 and 100). One very common finite random variable is the binomial. 6. A discrete variable is countably infinite if its possible values can be specifically listed out but have no specific end. For example, the number of accidents occurring at a certain intersection over a 10-year period can take on possible values 0, 1, 2, . . . (you know they end somewhere but you can't say where, so you list them all).
summarize a numerical data set
1. Center of the data set: the average, or mean. 2. Median. 3. Histograms: comparing means and medians. 4. Standard deviation. 5. Variance. 6. Range. 7. Relative standing, percentiles. 8. Five-number summary. 9. Interquartile range. 10. Boxplots: good at depicting differences between distributions. 11. Time chart / line graph. 12. Stem and leaf plots: display shapes of distributions for small to moderate amounts of data. 13. Frequency polygons. 14. Scatter plots: used to show the relationship between two variables. 15. Bar charts. 16. Dot plots.
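Several of the numeric summaries listed above can be computed with Python's standard statistics module. This is a sketch on made-up data; note that textbooks use slightly different conventions for quartiles, and statistics.quantiles uses its "exclusive" method by default.

```python
import statistics

# Hypothetical data set, for illustration only.
data = [2, 4, 4, 5, 7, 9, 10, 12, 15]

mean = statistics.mean(data)          # center: the average
median = statistics.median(data)      # center: the middle value
variance = statistics.variance(data)  # sample variance (divides by n - 1)
std_dev = statistics.stdev(data)      # sample standard deviation
data_range = max(data) - min(data)    # range

# Quartiles (default "exclusive" method), interquartile range,
# and the five-number summary: min, Q1, median, Q3, max.
q1, q2, q3 = statistics.quantiles(data, n=4)
iqr = q3 - q1
five_number = (min(data), q1, q2, q3, max(data))

print("mean:", round(mean, 2), "median:", median)
print("variance:", round(variance, 2), "std dev:", round(std_dev, 2))
print("range:", data_range, "IQR:", iqr)
print("five-number summary:", five_number)
```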
dependent variable
A factor or phenomenon that is changed by the effect of an associated factor or phenomenon called the independent variable.
inferential statistics
The formal methods used for drawing conclusions from "good" data are called inferential statistics. Statistical inference uses probability to determine how confident we can be that our conclusions are correct.
universe
The total of all populations
nominal scale
One of the four levels of measurement. No ordering is implied, and addition/subtraction and multiplication/division would be inappropriate for a variable on a nominal scale, for example {Female, Male}. Occasionally, numeric values are nominal: for instance, if a variable were coded as Female = 1, Male = 2, the set {1, 2} is still nominal.
variable {capital letters such as X and Y}
Properties or characteristics of some event, object, or person that can take on different values or amounts, and that can be measured or counted.