SWU 321 Ch 1-3 Arizona State University
What are the characteristics of a frequency distribution for nominal level data?
FD are constructed directly from raw data with no particular order
Ratio level data
Ranked data with equal distances between values and an absolute zero
What is meant by a sample?
Gathering data from smaller subset of a population
Variability
Gives an indicator of the amount of variation among the attributes of a variable in a data set. Also called dispersion
Reversing the z score
Gives you the percentiles
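A minimal sketch of going from a z score to a percentile and back. It assumes the variable is normally distributed and uses `NormalDist` from Python's standard `statistics` module (Python 3.8+); the helper names `percentile_from_z` and `z_from_percentile` are illustrative only.

```python
from statistics import NormalDist

# Assumption: scores follow the standard normal distribution (mean 0, SD 1).
standard_normal = NormalDist()

def percentile_from_z(z):
    """Percentage of cases expected to fall below the given z score."""
    return standard_normal.cdf(z) * 100

def z_from_percentile(percentile):
    """z score below which the given percentage of cases is expected to fall."""
    return standard_normal.inv_cdf(percentile / 100)

print(round(percentile_from_z(0.0), 2))   # a score at the mean
print(round(percentile_from_z(1.0), 2))   # one standard deviation above the mean
print(round(z_from_percentile(50.0), 2))  # back from a percentile to a z score
```

If the distribution is badly skewed, these percentiles will not match the actual data, so the conversion is only as good as the normality assumption.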
Constant
Has the same attribute for every case within a study; can vary among studies
Why use percentages instead of frequencies?
It allows groups of various sizes to be compared equally (out of 100)
Covariance
It is not ethical or possible to design studies that can be used to demonstrate that one variable influences the other; we can see it in two or more variables, but cannot say causal relationship, so COVARY is the word. Can help to predict the attribute of one variable by knowing the attribute of the other
What are the 3 types of means?
1) Arithmetic: Add all numbers and divide by the number of attributes; outliers can distort it 2) Trimmed mean: The bottom and top five percent of attributes are left out of the calculation 3) Weighted: Used when computing the average of unequally weighted attributes; it evens out unequal attributes for a fair comparison
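The three means can be sketched in a few lines of Python. The data and the helper names `trimmed_mean` and `weighted_mean` are made up for illustration. Because the example has only ten cases, it trims 10% from each end instead of 5% so that one case is actually dropped at each end.

```python
from statistics import mean

scores = [2, 4, 4, 5, 5, 6, 6, 7, 8, 40]  # hypothetical data; 40 is an outlier

def trimmed_mean(values, proportion=0.05):
    """Drop the lowest and highest `proportion` of cases, then average the rest."""
    ordered = sorted(values)
    k = int(len(ordered) * proportion)  # number of cases dropped at each end
    return mean(ordered[k:len(ordered) - k])

def weighted_mean(values, weights):
    """Average in which each value counts in proportion to its weight."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

arithmetic = mean(scores)                         # pulled upward by the outlier
trimmed = trimmed_mean(scores, proportion=0.10)   # outlier removed before averaging

# Two groups of unequal size: mean of 70 from 30 clients, mean of 90 from 10 clients.
weighted = weighted_mean([70, 90], [30, 10])

print(arithmetic, trimmed, weighted)
```

Simply averaging the two group means would give 80; weighting by group size gives a value closer to the larger group's mean.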
What are the two other uses of variability and central tendency?
1) Describe the overall distribution of a variable within a sample or population 2) They make other types of statistical analysis possible
What are the types of statistical analysis that variability and central tendency make possible?
1) Estimate the D of a V within the population 2) Compare D from one sample w/another 3) Compare D of sample to the D of the population it came from 4) Compare V's of 2 data sets 5) Compare simultaneously several V's within research sample or population
Operationalization
1) How will the variables be measured? 2) How will we determine type of intervention? 3) support with professional evidence
Which measure of CT is used with nominal? Ordinal? Interval/ratio?
1) Nominal: mode 2) Ordinal: Mode best choice but ordinal measurements produced by group freq dist use the median 3) Interval/ratio: Median or mean, but it is best to use ethical judgement to ensure accurate representation of the data
What are the two most common ways statistical analysis may be categorized?
1) Number of variables 2) Primary purpose
Conceptualization
1) Selecting the most important variables to study 2) Stating exactly what is meant by each variable 3) why they are important 4) Explain concepts and variables
What are the steps for determining variance?
1) Subtract the mean from each attribute (deviation from the mean) 2) Square each difference 3) Add up the squared differences 4) Divide the sum of squared differences by the total number of attributes (subtract 1 from that number for sample data)
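The variance steps can be sketched as a short Python function. The data are hypothetical, and the `sample` flag is an illustrative way of choosing between dividing by the number of cases and by that number minus 1.

```python
def variance(values, sample=True):
    """Variance computed step by step, as in the card above."""
    m = sum(values) / len(values)
    deviations = [x - m for x in values]          # subtract the mean from each attribute
    squared = [d ** 2 for d in deviations]        # square each difference
    total = sum(squared)                          # sum of squared differences
    divisor = len(values) - 1 if sample else len(values)
    return total / divisor

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical attributes

print(variance(data, sample=False))  # treat the data as a whole population
print(variance(data, sample=True))   # treat the data as a sample
```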
Why use grouped freq dist?
1) Too many or unequal spreads of attributes 2) Easier to visualize and comprehend when grouped (10-15/16-20) 3) Too many attributes to list each one (continuous V and there are a lot of cases)
What is meant by number of variables?
1) Univariate 2) bivariate 3) multivariate
How do research hypotheses evolve?
1) Findings from someone else's research 2) Own experience 3) Literature review
What determines what kind of graph to use to display data?
1) level of measurement 2) Need for clarity
What two things need to be present to determine the z score?
1) mean 2) standard deviation
What is meaningful grouping?
1) A type of grouping that reduces the number of attributes to a smaller number that can be easily displayed and understood without compromising measurement precision 2) Attributes grouped together should be similar to one another and share meaningful characteristics 3) Each attribute must fall into exactly one group; groupings should be mutually exclusive and exhaustive.
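A rough sketch of building a grouped frequency distribution in Python. The hours and the intervals are hypothetical, and `grouped_frequencies` is an illustrative helper. It raises an error if a value fits no interval, which is one simple way to check that the groups are exhaustive; the intervals here do not overlap, so they are also mutually exclusive.

```python
hours = [10, 12, 16, 18, 19, 20, 21, 22, 25, 26, 30, 30]  # hypothetical hours worked
intervals = [(10, 15), (16, 20), (21, 25), (26, 30)]      # non-overlapping groups

def grouped_frequencies(values, groups):
    """Count how many values fall into each (low, high) group, inclusive."""
    table = {f"{low}-{high}": 0 for low, high in groups}
    for value in values:
        for low, high in groups:
            if low <= value <= high:
                table[f"{low}-{high}"] += 1
                break
        else:
            # No group matched, so the groups are not exhaustive for this data.
            raise ValueError(f"{value} does not fall into any group")
    return table

grouped = grouped_frequencies(hours, intervals)
for label, frequency in grouped.items():
    print(label, frequency)
```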
What needs to be present for Att's to be normally distributed?
1) when graphed, it will be bell curved and symmetrical 2) M, M, M similar 3) Most attributes fall between -1 and +1 standard deviation from the mean. (A few may fall above 3
Each z score corresponds to:
1) a raw score and 2) a percentile rank
What are the steps to figure out SD?
1) List the A for the V in column A 2) Compute the mean of the A in column A 3) List the mean in column B 4) Subtract the mean from each A in column A and list the differences in column C 5) Square each difference in column C and list it in column D 6) Compute the sum of column D 7) Divide the sum by the total number of Att (minus 1 for sample data) 8) Compute the square root
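The column-by-column procedure above can be mirrored directly in Python. The data are hypothetical, and the variable names simply follow the column labels used in the card.

```python
import math

column_a = [2, 4, 4, 4, 5, 5, 7, 9]                     # attributes of the variable
mean_a = sum(column_a) / len(column_a)                  # mean of column A
column_b = [mean_a] * len(column_a)                     # the mean, repeated per row
column_c = [a - b for a, b in zip(column_a, column_b)]  # attribute minus mean
column_d = [c ** 2 for c in column_c]                   # squared differences
sum_d = sum(column_d)                                   # sum of column D

# Divide by N for population data, by N - 1 for sample data, then take the root.
population_sd = math.sqrt(sum_d / len(column_a))
sample_sd = math.sqrt(sum_d / (len(column_a) - 1))

print(population_sd, round(sample_sd, 3))
```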
What is meant by bivariate and what are they used for?
2 variables are measured; use of statistical tests to determine if there is statistical support for research hypothesis about the relationship between TWO variables
What is correlation?
A relationship among variables in which certain attributes of one variable reflect a pattern of relationships with another
Research hypothesis
A statement describing what the researcher suspects the relationship among variables will be
Binary
A type of dichotomous variable where the researcher assigns attributes of 0 or 1 for the absence or presence of a characteristic. (Own car? 1 = yes, 0 = no)
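A small illustration of binary coding in Python, assuming presence is coded 1 and absence is coded 0. The answers are made up. One convenience of this coding is that the sum of the codes is the number of "yes" cases and their mean is the proportion of "yes" cases.

```python
answers = ["yes", "no", "yes", "yes"]  # hypothetical answers to "Own car?"

# Code presence as 1 and absence as 0.
coded = [1 if answer == "yes" else 0 for answer in answers]

owners = sum(coded)               # how many cases have the characteristic
proportion = owners / len(coded)  # share of cases that have it

print(coded, owners, proportion)
```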
What is a z score?
A way of converting raw scores into standard deviation units, making it possible to compare A's from 2 samples or populations
What is descriptive analysis?
A way to summarize and communicate the most important, salient data. Summarize the measurement of variables into graphs, charts, and descriptive summaries
Positive skew
Bulk of data is at the low end of the scale
Negative skew
Bulk of data is concentrated at the high end of the scale
What is ordinal plus
Able to use final score as numerical data and allows for more mathematical statistical analysis. Distances between numbers are equal
Dummy variable
Allows for transformation into ratio level data
Secondary data
Existing data, originally collected as primary data for another purpose, that are reanalyzed to answer new questions
What are the two types of non causal relationships and when are they used?
Association and correlation; used when we are unable to introduce or manipulate the attributes of one variable to examine the effects of the other
What are the characteristics of a frequency distribution for ordinal level data and higher?
Attributes have a logical sequence based on the quantity of the variable
Mesokurtic? Characteristics of?
Bell shaped, normally distributed curve. Measurements closest to the mean are most common. Tapers consistently and gradually. Mode, median, and mean all in about the same place
What is a one-tailed or directional hypothesis?
Can predict variables are related and predict the direction the relationship will take
Continuous variable
Can take on all numerical values, including fractions and decimals
Data
Collected measurements from a research study; before analysis
Cumulative percentage fd?
Combines features of the cumulative freq and percentage distribution tables - displays column for cumulative % for each attribute
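As a sketch, one table can hold the absolute frequency, cumulative frequency, percentage, and cumulative percentage for each attribute. The ratings below are hypothetical and `frequency_table` is an illustrative helper, not a standard function.

```python
from collections import Counter

ratings = [1, 2, 2, 3, 3, 3, 4, 4, 5, 5]  # hypothetical ordinal ratings

def frequency_table(values):
    """Rows of (attribute, f, cumulative f, %, cumulative %), lowest attribute first."""
    counts = Counter(values)
    total = len(values)
    rows = []
    running = 0
    for attribute in sorted(counts):
        f = counts[attribute]
        running += f  # running total of cases so far
        rows.append((attribute, f, running, 100 * f / total, 100 * running / total))
    return rows

table = frequency_table(ratings)
for row in table:
    print(row)
```

The last row's cumulative percentage should always be 100, which is a quick check that no case was lost.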
What are the features of z scores?
Compared to other scores, converted into percentiles, compared to their respective distributions
Frequency distribution?
Consolidate data sets constructed from arrays
What is an absolute frequency distribution (Simple FD)?
Count number of times an attribute occurs and list them (any level of measurement)
Practical uses of z score?
Curving on tests and standardized measurements - IQ test, SAT's, normal distributions
What is meant by primary purpose of analysis?
Data grouped into two categories: descriptive and inferential statistics
Kurtosis
Degree to which a D is peaked and how much A cluster around the center
What is a parameter?
Descriptive summaries of the attributes of the variables that were measured. (A 20% sample of student data, when summarized: statistics) Form the basis of more complicated analysis. Parameters summarize data from populations and statistics summarize data from samples
Attributes
Different categories a variable can assume (words or numbers)
Data set
Different sets of data within the same study
Variable
Differs among and in measurements - primary focus in statistics
Percentage frequency D
Displays an absolute percentage column on right side; percentage of total number of distribution
Median
Divides an array of Att's into two equal halves. Ordinal, interval, and ratio
Leptokurtic
High degree of A cluster around the center (Mean)
What is percentile rank?
Indicate the % of cases within a certain group whose attributes fall below or above a certain score, this enables us to put attributes in perspective relative to other attributes within the same group
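A minimal sketch of a percentile rank computed directly from a group's scores, here defined as the percentage of cases that fall below a given score. Other definitions (for example, counting ties as half) exist; the scores are hypothetical and `percentile_rank` is an illustrative helper.

```python
def percentile_rank(values, score):
    """Percentage of cases in `values` that fall below `score`."""
    below = sum(1 for value in values if value < score)
    return 100 * below / len(values)

exam_scores = [55, 60, 65, 70, 75, 80, 85, 90, 95, 100]  # hypothetical group

rank_of_85 = percentile_rank(exam_scores, 85)
print(rank_of_85)  # percentage of the group scoring below 85
```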
Mean
Interval and ratio
Polygon
Interval/ratio;
How is information different from data?
It is the interpretation of that data from a research study
Percentages are useful with _______ amounts of data, but not with _________.
Large; small
Reliability
Level of consistency the test produces
What is nominal data?
Lowest level, categorical data, Must have two or more attributes or would be a constant, clearly exhaustive and mutually exclusive, numbers do not hold any value
Confounding variable?
May contribute to variations in the dependent variable
What MOCT is most commonly used in social work and what makes it vary?
Mean: normally distributed Median: skewed
What is meant by univariate and what is it used for?
Means one variable; Helps to visualize certain trends in the measurements of a single variable that might otherwise be hard to discern or comprehend
What is multiple indicator?
Measure a single variable multiple ways to decrease human error, bias, etc.
What MOCT is best when there are outliers?
Median
Platykurtic
Flatter polygon; A heavy at the tails
Interval
No absolute zero, rank order and equally spaced; criterion/continuum, attributes indicate how far apart values are from each other. Difference between intervals matter and can be measured. Ranked data with equal distances
Can SD be used with skewed data? Why?
No, the distances between standard deviations are not even
Pie charts? Level of measurement
Nominal
Explain ordinal data
Nominal plus quantitative quality, social class, education, order a race finishes, intervals not equal, values ranked in hierarchy, No "true" numerical values, rank in order, No mathematical quality
What are the levels of measurement?
Nominal, ordinal, interval and ratio
List possible questions for nominal, ordinal, and ratio levels of measurement if you want to know the hours the individual worked the prior week
Nominal: Did you work last week? Ordinal: 10-15 16-20 21-25 26-30 Ratio: How many hours did you work last week? _____
How is standard Normal Distribution relevant?
Normally distributed V's do not have to fit standard normal D's perfectly, but do need to come close to be normally distributed
Frequency
Number of times an attribute occurs within a data set
________ on polygons are also the __________ of Att's that fall within respective distances from the ________ of a normally distributed __________ or __________ level of measurement.
Numbers / percentages / mean / interval / ratio
Causal
One variables attributes cause or contribute to the other variable
What are the 3 forms of research hypothesis?
One-tailed/directional, two-tailed/nondirectional, and the Null
Dichotomous variable
Only two attributes: (Yes No) (Male Female)
Array?
Ordering every attribute of a variable that occurred within the raw data from low to high: automatically generated with software
Cumulative FD and type of data
Ordinal or higher; counts a running total of how many
What is association?
Predicts certain attributes of one variable will be found with another
Validity
Present only when the measurement is considered both reliable and consistent; it measures what it is supposed to
Why use a 5 number summary and what are they?
Provides a more complete image of the list measurements of a V. It reports the minimum A, 25th percentile, 50th percentile (median), 75th percentile, and the maximum Att
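A sketch of a five-number summary using `quantiles` from Python's standard `statistics` module (Python 3.8+). The data are hypothetical. Quartiles can be computed by several conventions that give slightly different answers; the `"inclusive"` method used here is one choice, not necessarily the one used in class.

```python
from statistics import quantiles

def five_number_summary(values):
    """Return (minimum, 25th percentile, median, 75th percentile, maximum)."""
    q1, q2, q3 = quantiles(values, n=4, method="inclusive")
    return (min(values), q1, q2, q3, max(values))

visits = [3, 7, 8, 5, 12, 14, 21, 13, 18]  # hypothetical number of client visits

summary = five_number_summary(visits)
print(summary)
```

These five values are also the ones a box plot draws.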
How to calculate z score
Raw score minus mean divided by the standard deviation
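The formula translates directly into code. In this sketch the two tests and their means and standard deviations are hypothetical; the point is that z scores put raw scores from different distributions on the same scale.

```python
def z_score(raw, mean, sd):
    """Number of standard deviations a raw score falls above (+) or below (-) the mean."""
    return (raw - mean) / sd

# Hypothetical: the same student on two tests with different means and SDs.
z_test_a = z_score(raw=85, mean=75, sd=10)
z_test_b = z_score(raw=60, mean=50, sd=5)

print(z_test_a, z_test_b)
```

Although 85 is the higher raw score, the student did better on test B relative to the other test takers, because 60 sits two standard deviations above that test's mean.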
What is meant by data reduction?
Reduce large amounts of data into simple and more understandable info without distorting or losing overall meaning
Datum
Singular form of data
Explain multivariate:
Sort the interrelatedness of three or more variables; sort complex relationships among variables that are not simple cause and effect; examine relationships between and among variables simultaneously
Variance
Sum of squared deviations from the mean and divided by number of cases
asymptotic
Tails of D get closer and closer to zero, but never touch the axis
Standard Deviation
The square root of the variance; indicates how much the attributes in a data set vary around the mean
Mode
The attribute that occurs most frequently; most unrestricted and can be used with all LOM, but does not describe interval or ratio very well
Interquartile Range
The data from the 25th percentile to the 75th or the middle 50%
Skewness
The degree to which a D of V is not symmetrical (polygon)
Range
The distance between the maximum and minimum numbers plus one. Easily distorted by outliers
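A short comparison of the range and the interquartile range on made-up data, with and without an outlier. The range follows the card's definition (maximum minus minimum, plus one); the quartiles use the `"inclusive"` method of `statistics.quantiles`, which is one of several conventions.

```python
from statistics import quantiles

def range_plus_one(values):
    """Range as defined in the card: maximum minus minimum, plus one."""
    return max(values) - min(values) + 1

def interquartile_range(values):
    """Distance between the 25th and 75th percentiles (the middle 50%)."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return q3 - q1

typical = [1, 2, 3, 4, 5, 6, 7, 8, 9]        # hypothetical data
with_outlier = [1, 2, 3, 4, 5, 6, 7, 8, 90]  # same data, one extreme value

print(range_plus_one(typical), range_plus_one(with_outlier))
print(interquartile_range(typical), interquartile_range(with_outlier))
```

A single extreme value changes the range dramatically while leaving the interquartile range untouched in this example.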
Standard deviation and 5 characteristics
The square root of the variance. 1)Frequent in quantitative research. 2)useful for describing the variability of research. 3)Use with interval/ratio data. 4)Uses all case attributes. 5)Tells the extent the attributes cluster around the mean. determine where a given attribute falls relative to other Att's (percentile)
Null hypothesis
The variables are unrelated
Central Tendency?
The way data is bunched around the center of the D
Standard Normal Distribution? Features?
Theoretical D based on the normal curve. Perfectly symmetrical, rarely happens, divided into 6 equal units. Asymptotic, bell shaped, M, M, M all fall approx same place
Why use a box plot?
To display the central tendency and dispersion of the A's of a V. Also displays outliers.
Bimodal
Two attributes occur most frequently
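As an illustration, `multimode` from Python's standard `statistics` module (Python 3.8+) returns every attribute tied for most frequent, so a bimodal distribution yields two values. The data below are hypothetical; note that the mode works for nominal attributes as well as numbers.

```python
from statistics import multimode

household_sizes = [1, 2, 2, 3, 3, 4]               # two attributes tie for most frequent
eye_colors = ["brown", "blue", "brown", "green"]   # nominal data with a single mode

size_modes = multimode(household_sizes)
color_modes = multimode(eye_colors)

print(size_modes)   # two modes: the distribution is bimodal
print(color_modes)  # one mode
```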
How to combat human error?
Use multiple data sets to measure one variable (multiple indicators)
Inferential statistics
Used when data are collected from a sample rather than a population. Sample statistics are estimates of population parameters
Bar graphs and line diagrams? Level of measurement and features?
Useful for displaying freq list for nominal level / Bars of equal width that do not touch; one for each attribute. Height reflects freq and order is arbitrary
Mean deviation
Uses all attributes; the average amount the attributes differ/deviate from the mean. Add the absolute values of all deviations from the mean and divide by the total number of attributes
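A minimal Python sketch of the mean deviation, using hypothetical data. Absolute values are needed because the raw deviations from the mean always sum to zero.

```python
def mean_deviation(values):
    """Average absolute distance of the attributes from their mean."""
    m = sum(values) / len(values)
    absolute_deviations = [abs(x - m) for x in values]
    return sum(absolute_deviations) / len(values)

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical attributes with a mean of 5

md = mean_deviation(data)
print(md)
```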
If we were measuring self-esteem, what would be the variable, indicator, attribute, and data source?
Variable: self esteem; Indicator: Self-esteem questionnaire (ISE); Attribute: Scores on questionnaire; Data source: Clients complete the ISE on posttest-pretest
What is a two-tailed or nondirectional hypothesis?
Predicts that variables are related but does not predict the direction of the relationship
Non causal
Variables do not cause each other but have an identifiable pattern
What is the relevance of a null hypothesis?
We try to prove that the null hypothesis is false because you cannot prove definitively that the other two are absolutely true.
Discrete variable
Whole numbers; Only finite number of attributes (SAT scores)
What is the horizontal line of a graph? Called the _________. Displays _________
X axis / abscissa / attributes
What is the vertical line of a graph? Called the ________. Displays ___________
Y axis / ordinate / frequencies
Any z score is the _________ of ________ _________ that an attribute falls from the mean. They can be _______ or _______ the mean
number / standard deviations / above / below
Histogram
ordinal or higher / bars touch / ordinal - equal width / interval or ratio - width varies