Elementary Statistics: Chapters 1, 2, 3
Validity
"Accuracy" of measurement ~ is the instrument measuring what it is supposed to measure?
Reliability
"Consistency" of measurement ~ is the instrument measuring the same each time? (bathroom scale)
Frequency Distribution
A count of how often each score, item, or category being measured occurs. n = sample size, x = groups (score values), f = frequency, P = proportion (f / n)
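A minimal sketch of building a frequency distribution, using the n, x, f, P notation from the card. The function name and the scores are made up for illustration.

```python
from collections import Counter

def frequency_distribution(scores):
    """Return one (x, f, P) row per distinct score, sorted by x."""
    n = len(scores)              # n = sample size
    counts = Counter(scores)     # f = how often each x occurs
    return [(x, f, f / n) for x, f in sorted(counts.items())]

scores = [1, 2, 3, 3, 3, 4, 4, 5]    # hypothetical scores
for x, f, p in frequency_distribution(scores):
    print(f"x={x}  f={f}  P={p:.3f}")
```

The proportions always add up to 1 (100%), which is a quick way to check the table.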
Standard Error of the Estimate
A measure of prediction error: a form of the sample standard deviation appropriate to the bivariate world (the spread of the observed Y values around the regression line)
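A rough sketch of one common way to compute the standard error of the estimate: fit the least-squares line, then take the square root of the squared prediction errors divided by n - 2. Dividing by n - 2 is an assumption here; some introductory texts use a different divisor, so check the course formula. The data are made up.

```python
from math import sqrt

def standard_error_of_estimate(xs, ys):
    """sqrt(sum of squared prediction errors / (n - 2)) around the fitted line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Least-squares slope (b) and intercept (a) of the regression line
    b = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
    a = mean_y - b * mean_x
    # Squared distance of each observed Y from its predicted Y
    ss_residual = sum((y - (b * x + a)) ** 2 for x, y in zip(xs, ys))
    return sqrt(ss_residual / (n - 2))

xs = [1, 2, 3, 4, 5]     # hypothetical predictor scores
ys = [2, 4, 5, 4, 5]     # hypothetical observed scores
print(round(standard_error_of_estimate(xs, ys), 3))
```

The closer the points sit to the line, the smaller this value; a perfect linear relationship gives 0.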
The Scientific Method
Identification of the problem; formation of hypotheses (statements of relationships/differences, or solutions to the problem); collection, organization, and analysis of data; formation of conclusions
Pearson r
Pearson Product-Moment Correlation Coefficient (Pearson r) *Scale ranges from -1.00 to +1.00 *Positive correlations strengthen as they approach +1.00 *Negative correlations strengthen as they approach -1.00
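A minimal sketch of Pearson r computed from deviation scores, assuming paired lists of equal length. The function name and the data are illustrative only.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Sum of cross-products divided by the square root of (SS_x * SS_y)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return sp / sqrt(ss_x * ss_y)

hours = [1, 2, 3, 4]         # hypothetical x scores
grades = [2, 4, 6, 8]        # rises in step with x
print(pearson_r(hours, grades))          # at the positive end of the scale
print(pearson_r(hours, grades[::-1]))    # reversed y: at the negative end
```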
mu
Population mean
ρ (rho)
Population measure of bivariate relationship
sigma
Population standard deviation
sigma (squared)
Population variance. It is a measure of dispersion
Median
The middle score or position; the 50th percentile (50%le)
Variable
A characteristic under study or investigation that assumes different values for different elements
Constant
A characteristic under study that has a fixed value
Spurious Correlation
A correlation that is artificially high or low as the result of some confounding variable (words of caution)
Coefficient of Non-Determination
Percentage of variance the two variables do not share (1 - r squared); remember the circles: the percent that does not overlap
Measures of Dispersion
(Measures of spread) Measure the degree to which data points (scores) are spread out from each other
Spearman's Rho
(Correlation coefficient) Used at the ordinal level, i.e., with ranked data
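A minimal sketch of Spearman's rho using the rank-difference shortcut, rho = 1 - 6(sum of d squared) / (n(n squared - 1)). It assumes there are no tied scores; the helper names and data are made up for illustration.

```python
def ranks(values):
    """Rank scores from 1 (smallest) upward. Assumes no ties."""
    order = sorted(values)
    return [order.index(v) + 1 for v in values]

def spearman_rho(xs, ys):
    """Rank-difference formula; valid only when there are no tied ranks."""
    n = len(xs)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(xs), ranks(ys)))
    return 1 - (6 * d_squared) / (n * (n ** 2 - 1))

judge_a = [10, 20, 30, 40, 50]   # hypothetical scores from one judge
judge_b = [1, 3, 2, 5, 4]        # hypothetical scores from another
print(round(spearman_rho(judge_a, judge_b), 2))
```

Only the rank order of the scores matters here, which is why the measure suits ordinal data.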
True Experimental Research
*Must use a control group *Must manipulate the independent variable *Must have randomization
Discrete Variable
A (quantitative) variable whose values are countable
Scatterplot
A plot of paired observations. Shows whether a correlation is positive or negative
Biased Sample
A sample in which some members of the population are less likely to be included than others. No ability to generalize, e.g., a sample of volunteers
Statistic
A summary measure calculated from sample data.
Simple Linear Regression
A statistical technique used to predict Y from X; prediction becomes useful once the correlation is strong (high) enough
Population
All or a complete set of individuals (elements) under investigation
Confounding Variable
Any variable(s) highly related to the independent variable (IV, the treatment) or the dependent variable that may affect the accuracy of our results (e.g., IQ)
Mean
Arithmetic average
Inferential Statistics
Collection of methods that help make decisions about a population based on sample results
Cross-Section Data
Data collected on different elements at the same point in time or for the same period of time
Raw Data
Data recorded in the sequence in which they are collected and before they are processed.
Operational Definitions
Define conceptual definitions in terms of how they are measured (scale)
Parameters
Describe population characteristics. A parameter is a statistical value, such as the mean, mode, median, range, variance, or standard deviation, calculated for a population data set
Interval Scale
Equal distance between points on the scale (like a thermometer). No true zero: zero is not the absence of anything (0 degrees is just very cold)
Areas Under the Curve
Equal segments (also called just areas) that, when added up, equal 100%
Correlation Matrix
Found in literature. Lists variables across the top and side.
Statistics
Group of methods used to collect, analyze, present, and interpret data and to make decisions
Positive Correlations
X and Y increase or decrease together. The strongest positive correlation we could have is +1.00. Example: an increase in study time (x) goes with an increase in grades (y), OR a decrease in sleep (x) goes with a decrease in energy (y)
Normal Distribution
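In a normal distribution the mean, median, and mode are identical, and the curve is bell-shaped and symmetrical (also called the Gaussian distribution, symmetrical distribution, or bell curve). Contrast: house prices (50k, 75k, 100k, 10,000,000) include an "extreme" value that pulls the mean far above the median, so that distribution is skewed, not normal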
Variance
Indicates the degree to which scores are dispersed around the mean. The standard deviation is obtained by taking the positive square root of the variance. The variance calculated for population data is denoted by σ² (read as sigma squared).
Correlation Research
Is always tentative
Range
The simplest measure of dispersion to calculate. It is obtained by taking the difference between the largest and smallest values in a data set.
Quasi-Experimental Research
*Lacks randomization *Manipulates an independent variable *Uses a control group
Regression Line
Line of best fit or best prediction
Measure of Central Tendency
Measures that describe the center of a distribution. The Mean, Median, Mode are three of the measures of central tendency
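A short sketch of the three measures using Python's built-in statistics module. The scores are made up for illustration.

```python
import statistics

scores = [1, 2, 3, 3, 3, 4, 4, 5]    # hypothetical scores

mean = statistics.mean(scores)       # arithmetic average: 25 / 8
median = statistics.median(scores)   # middle position (average of the two middle scores)
mode = statistics.mode(scores)       # most frequently occurring score

print(mean, median, mode)
```

In this small data set the three measures land close together, which is typical when scores are roughly symmetrical.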
Mode
Most frequently occurring score, e.g., in 1,2,3,3,3,4,4,5 the mode is 3
Z Score
The most widely used standard score: the distance from the mean to any score in the distribution, in standard deviation units. Z scores have a mean of 0 and a standard deviation of 1, and nearly all fall between -3 and +3. RULES for finding the percentile rank: If Z is positive, we add 50% to the proportion between z and the mean. If Z is negative, we simply use the proportion beyond z (the tail area)
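A sketch of the z score and the two percentile rules, assuming the scores are normally distributed. It uses statistics.NormalDist (Python 3.8+) in place of a printed z table; the function names and the example mean and standard deviation are made up.

```python
from statistics import NormalDist

def z_score(x, mean, sd):
    """Distance of x from the mean in standard deviation units."""
    return (x - mean) / sd

def percentile_rank(z):
    """Apply the table rules: 50% + 'between' for positive z, 'beyond' for negative z."""
    between = NormalDist().cdf(abs(z)) - 0.5   # proportion between the mean and z
    beyond = 0.5 - between                     # proportion in the tail beyond z
    proportion = 0.5 + between if z >= 0 else beyond
    return 100 * proportion

z = z_score(115, mean=100, sd=15)    # hypothetical score, mean, and sd
print(z, round(percentile_rank(z), 1))
print(round(percentile_rank(-z), 1))
```

A score exactly at the mean has z = 0 and falls at the 50th percentile.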
Ratio Scale
Most widely used. Equal distance between points on the scale. Has a TRUE ZERO (pulse, blood pressure, weight, etc.)
Language of Science
The natural, physical, and behavioral sciences: an "approach to gathering knowledge." Functions: development of theory and testing of hypotheses. "We have method"
Curvilinearity
A non-linear relationship; Pearson r is meaningless here because it only measures linear relationships (words of caution)
Nominal Scale
Numbers represent categories or classifications and are used only to identify, e.g., social security numbers, home addresses, cell numbers. No arithmetic operations can be performed with them
Operationalization
Operational definitions define conceptual definitions in terms of how they are measured. Example: we "operationalize" headache pain by defining it via a scale
Three measures of dispersion
Range, variance, standard deviation. Both the variance and the standard deviation measure the degree to which our data points are dispersed around the mean
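A short sketch of the three measures using Python's built-in statistics module, showing the population versions (sigma squared, sigma) next to the sample versions (s squared, s). The scores are made up for illustration.

```python
import statistics

scores = [2, 4, 4, 4, 5, 5, 7, 9]    # hypothetical scores, mean = 5

data_range = max(scores) - min(scores)     # largest value minus smallest value

pop_var = statistics.pvariance(scores)     # sigma squared: SS / N
pop_sd = statistics.pstdev(scores)         # sigma: square root of sigma squared

sample_var = statistics.variance(scores)   # s squared: SS / (n - 1)
sample_sd = statistics.stdev(scores)       # s: square root of s squared

print(data_range, pop_var, pop_sd)
print(round(sample_var, 3), round(sample_sd, 3))
```

The sample versions come out slightly larger because they divide by n - 1 instead of N.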
Ordinal Scale
Rankings/standings; unequal distance between points on the scale (percentile rank, %le). Uses names or #'s where only the comparison is relevant
Randomization
Spreads the risk of the effects of any confounding variable(s) equally across all groups, reducing their influence on the results
Covariance
A non-standardized measure of the relationship between two variables (sister measurement to correlation)
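A minimal sketch of the sample covariance (sum of cross-products divided by n - 1), with made-up data. It also shows what "non-standardized" means in practice: rescaling one variable changes the covariance.

```python
def covariance(xs, ys):
    """Sample covariance: sum of cross-products of deviations / (n - 1)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sp / (n - 1)

xs = [1, 2, 3, 4]                    # hypothetical scores
ys = [2, 4, 6, 8]
ys_rescaled = [y * 10 for y in ys]   # same relationship, different units

print(round(covariance(xs, ys), 2))
print(round(covariance(xs, ys_rescaled), 2))   # ten times larger
```

Correlation removes this dependence on units by dividing out the two standard deviations, which is why it always stays between -1 and +1.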
Replication
Repeating the essence of a research study, usually with different participants in different situations, to see whether the basic findings extend to other participants and circumstances
x-bar
Sample mean
r
Sample measure of bivariate relationship
s
Sample standard deviation
s (squared)
Sample variance
Descriptive Statistics
Set of procedures with which we organize, summarize, and present data. This is done in 5 ways: 1. Frequency distributions 2. Measures of central tendency 3. Measures of dispersion 4. Measures of position 5. Measures of relationship
Standardization
Set of procedures where we transform raw scores (grades, weight, height) into another score with a known mean and standard deviation, for the purpose of determining "Percentile Rank" (%le)
Data
Sets of numbers we collect that represent some measurement of, or information about, individuals in a population
Y' = bx + a
Simple linear regression. Y' is read as "Y prime." This formula gives the regression line. b = slope, a = intercept, Y' = predicted variable (dependent variable), x = predictor variable (independent variable)
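A minimal sketch of finding b and a by least squares and then using the line to predict. The function names and the data are made up for illustration.

```python
def fit_line(xs, ys):
    """Return (b, a): the least-squares slope and intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    b = sp / ss_x                # slope
    a = mean_y - b * mean_x      # intercept
    return b, a

def predict(x, b, a):
    """Predicted Y for a given x, using b and a from fit_line."""
    return b * x + a

xs = [1, 2, 3, 4, 5]     # hypothetical predictor scores
ys = [2, 4, 5, 4, 5]     # hypothetical observed scores
b, a = fit_line(xs, ys)
print(round(b, 2), round(a, 2), round(predict(6, b, a), 2))
```

The fitted line always passes through the point (mean of x, mean of Y).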
Continuous Variables
Variables whose values cannot be counted; they can assume any numerical value between two numbers
Summation Notation
A mathematical notation that concisely expresses the act of adding particular data points. It uses the Greek symbol Σ (capital sigma), which we read as "sum": it means "add 'em up"
Negative Skew
Tail points in negative direction
Positive Skew
Tail points in the positive direction
Random Assignment
The act of randomly drawing participants' names (as from a bin) and assigning them to their respective groups in experimental research, where we manipulate the IV and measure its effect on the DV
Standard Normal Distribution
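Its characteristics are identical to those of the normal distribution: a bell-shaped curve and a symmetrical distribution, with a midpoint that is the mode, the median, and the mean. It is expressed in z scores, with a mean of 0 and a standard deviation of 1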
Correlation
The extent to which two variables change together in a systematic way, or "co-vary" (the relationship or association between two variables). This relationship is linear. Does not imply causation!
Samples
The selection of a few elements from the population (subsets of population)
Measurement
The value of a variable for an element; also, the act of assigning numbers to observations using a set of rules, which we define as measurement scales (Nominal, Ordinal, Interval, Ratio)
Independent Variable
The variable we manipulate to measure for its effect on the dependent variable (treatment)
Dependent Variable
The variable we measure (the "data"): the observed outcome of interest; the effect, in terms of cause and effect
Phi Coefficient
Used at the nominal level, when both variables are dichotomous (two categories each)
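Point-Biserial Correlation
Used when one variable is nominal with two categories (dichotomous) and the other is measured on an interval or ratio scale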
Empirical research (Scientific Research)
Uses observation, measurement and data analysis to help answer specific questions
Outliers or Extreme values
Values that are very small or very large relative to the majority of the values in a data set. Can affect the magnitude of r (words of caution)
Qualitative or Categorical Variables
Variables that cannot be measured numerically but can be divided into different categories (Nominal scale)
Coefficient of Determination
Percentage of variance the two variables share (r squared); remember the circles: the percent that overlaps
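A tiny sketch of how the coefficient of determination and the coefficient of non-determination both follow from r. The function name and the r values are made up for illustration.

```python
def shared_and_unshared(r):
    """Return (r squared, 1 - r squared) for a correlation r."""
    determination = r ** 2                  # proportion of variance shared (overlap)
    non_determination = 1 - determination   # proportion of variance not shared
    return determination, non_determination

for r in (0.5, -0.5, 0.9):                  # hypothetical correlations
    shared, unshared = shared_and_unshared(r)
    print(f"r = {r:+.2f}: shared = {shared:.0%}, not shared = {unshared:.0%}")
```

Because r is squared, a correlation of -0.5 shares exactly as much variance as a correlation of +0.5.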
Measure of position
Where a score ranks or falls in relation to all the other scores
Science
An approach to gathering knowledge
Negative Correlations
X and Y move in opposite directions: as one increases, the other decreases. The strongest negative correlation we could have is -1.00. Example: an increase in alcohol consumption (x) goes with a decrease in dexterity (y)
Research Design/Method
A plan for collecting and analyzing data that allows us to answer specific questions
Standard Deviation
The most widely used measure of dispersion. Tells how closely the values of a data set are clustered around the mean
Sum of Squares
Take each score's difference from the mean and square it. Then add up these squared deviations. The result is the sum of the squared deviations (shortened to "Sum of Squares," or SS).
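The steps in this card can be sketched directly in code. The function name and the scores are made up for illustration.

```python
def sum_of_squares(scores):
    """SS: add up the squared deviations of each score from the mean."""
    mean = sum(scores) / len(scores)
    deviations = [x - mean for x in scores]     # each difference from the mean
    squared = [d ** 2 for d in deviations]      # square each difference
    return sum(squared)                         # add them up

scores = [2, 4, 4, 4, 5, 5, 7, 9]    # hypothetical scores, mean = 5
ss = sum_of_squares(scores)
print(ss)
print(ss / len(scores))              # SS / N gives the population variance
```

Squaring matters because the unsquared deviations from the mean always add up to zero.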
Statistics "The Language of Science"
Statistics serves the sciences (i.e., natural, physical, and behavioral). Science is an "approach to gathering knowledge"; its functions are the development of theory and the testing of hypotheses
Random Selection/Sample/Probability Sample
Sampling whereby each member of the population we're investigating has an equal likelihood of being selected
Independent (predictor) Variable
x