AM2- stats
What p-value is considered significant?
0.05
What percentage of values fall within ±1 standard deviation of the mean in a normal distribution?
68%
What percentage of values fall within ±2 standard deviations of the mean in a normal distribution?
95%
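The percentages for ±1 and ±2 standard deviations can be checked from the normal distribution itself. A minimal sketch using only Python's standard library; the helper name `fraction_within` is an illustrative assumption, not something from the course notes:

```python
import math

def fraction_within(k: float) -> float:
    """Fraction of a normal distribution within +/- k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))

print(f"{fraction_within(1):.1%}")  # roughly 68%
print(f"{fraction_within(2):.1%}")  # roughly 95%
```

The ±2 figure is slightly above 95%; the exact 95% cut-off is nearer ±1.96 standard deviations.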
What does precise mean?
The measurement is repeatable
What is the standard error of the mean?
A measure of how close the mean of a sample is to the true mean of the population
What is a confidence interval?
A range of values in which a specified probability of the means of repeated samples would be expected to fall
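As a rough illustration, a 95% confidence interval for a mean is often approximated as the sample mean ± 1.96 standard errors. The sketch below assumes that normal approximation (for small samples a t-distribution multiplier would give a somewhat wider interval); the data and the function name `mean_ci95` are made up for the example:

```python
import math
import statistics

def mean_ci95(sample):
    """Return (mean, lower, upper) using the approximation mean +/- 1.96 * SEM."""
    m = statistics.mean(sample)
    sem = statistics.stdev(sample) / math.sqrt(len(sample))  # standard error of the mean
    return m, m - 1.96 * sem, m + 1.96 * sem

weights_kg = [4.1, 3.8, 4.5, 4.0, 4.3, 3.9, 4.2, 4.4]  # hypothetical measurements
m, lower, upper = mean_ci95(weights_kg)
print(f"mean {m:.2f}, approximate 95% CI {lower:.2f} to {upper:.2f}")
```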
What is a biologically relevant effect?
An effect considered by expert judgement as important and meaningful for human, animal, plant or environmental health
Why are samples used in statistics?
Because it's rarely possible to measure the whole population
How can you increase precision?
By increasing the sample size
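One way to see why a larger sample helps: the standard error of the mean is the standard deviation divided by the square root of the sample size, so quadrupling the sample halves it. A small sketch, with a hypothetical standard deviation of 10 chosen only for illustration:

```python
import math

def standard_error(sd: float, n: int) -> float:
    """Standard error of the mean for a sample of size n with standard deviation sd."""
    return sd / math.sqrt(n)

sd = 10.0  # hypothetical standard deviation
for n in (10, 40, 160):
    # Each fourfold increase in n halves the standard error.
    print(n, round(standard_error(sd, n), 2))
```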
What are the two kinds of quantitative data?
Continuous and discrete
What can scatterplots be used to show?
Correlation between two factors
What is ordinal data?
Data that can be placed in a sequence
What is nominal data?
Data that can be placed in categories with no order
What is discrete data?
Data that can only take particular values
What is continuous data?
Data that can take any value within a range
What is stratified sampling?
Dividing the population into subgroups then selecting a sample from each of these groups
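A minimal sketch of stratified sampling, assuming simple random selection within each subgroup. The function name, the `per_group` parameter, and the animal data are all hypothetical:

```python
import random

def stratified_sample(population, key, per_group, seed=0):
    """Split population into subgroups by key(item), then draw per_group items from each."""
    rng = random.Random(seed)  # fixed seed so the example is repeatable
    groups = {}
    for item in population:
        groups.setdefault(key(item), []).append(item)
    sample = []
    for members in groups.values():
        sample.extend(rng.sample(members, min(per_group, len(members))))
    return sample

# Hypothetical population: 50 dogs and 30 cats, identified by species and number.
animals = [("dog", i) for i in range(50)] + [("cat", i) for i in range(30)]
picked = stratified_sample(animals, key=lambda a: a[0], per_group=5)
print(picked)
```

Drawing the same number from each subgroup is only one design choice; sampling in proportion to subgroup size is another common option.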
Where the independent variable is categorical (group, gender etc.) and the dependent variable is continuous, use:
For two categories, a t-test or a Wilcoxon/Mann-Whitney test. For more than two categories, a general linear model.
When would you use Student's T-test?
If the data is approximately normally distributed
When would you use Wilcoxon or Mann-Whitney Rank tests?
If the data is not approximately normally distributed
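Rank tests compare the ordering of values rather than the values themselves. The sketch below computes only the Mann-Whitney U statistics by counting pairs; it does not compute a p-value, and the data and function name are made up for illustration:

```python
def mann_whitney_u(a, b):
    """Return (U_a, U_b). Each pair (x from a, y from b) scores 1 for a if x > y, 0.5 if tied."""
    u_a = 0.0
    for x in a:
        for y in b:
            if x > y:
                u_a += 1.0
            elif x == y:
                u_a += 0.5
    u_b = len(a) * len(b) - u_a  # the two U values always sum to len(a) * len(b)
    return u_a, u_b

treated = [12, 15, 11, 19, 14]  # hypothetical values
control = [8, 9, 13, 7, 10]
u_treated, u_control = mann_whitney_u(treated, control)
print(u_treated, u_control)
```

A very small U for one group means its values are nearly all lower than the other group's values.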
When can the range of a set be misleading?
If there are outliers or unusual data points
When would you use dot plots?
If you want to compare two (or more) sets of data, so you can plot them side by side.
What does standard deviation tell you?
It's a measure of variation which gives an indication of the average spread of the data about the mean
When is it OK/not OK to jitter data?
It's ok for categorical data but be careful with jittering continuous variables
Where both the dependent and independent variables are continuous, use:
Linear regression (or a general linear model) for parametric data. Spearman rank test for non-parametric data.
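For the non-parametric case, Spearman's coefficient is the ordinary (Pearson) correlation applied to the ranks of the data. A sketch under that definition, giving tied values the average of their ranks; it returns the coefficient only, not a p-value, and all names and data are illustrative:

```python
import math

def ranks(values):
    """Rank values from 1 to n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))

dose = [1, 2, 3, 4, 5]          # hypothetical data
response = [1, 4, 9, 16, 25]    # curved but always increasing
print(spearman(dose, response))
```

Because only the ordering matters, a relationship that is curved but always increasing still gives a coefficient of 1.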
What does the P value mean?
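P is the probability (0-1) of seeing a difference at least as large as the one observed if the null hypothesis were true, i.e. if the samples came from the same population and differed only by chance.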
Where observations are paired, use:
A paired t-test for two groups. A general linear model for more than two.
What are the two main kinds of data?
Quantitative and categorical
How can data be displayed in graphs?
Simple scatterplots, grouped frequency table, histogram, box and whisker plot, bar chart, dot plot
What is jittering?
Shifting overlapping data points slightly sideways so that every point is visible
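A minimal sketch of jittering, assuming a small uniform random offset is added to each point's horizontal position before plotting. The function name, the `width` parameter, and the group codes are hypothetical:

```python
import random

def jitter(positions, width=0.1, seed=0):
    """Return positions shifted sideways by a random amount of at most +/- width."""
    rng = random.Random(seed)  # fixed seed so the plot is reproducible
    return [p + rng.uniform(-width, width) for p in positions]

# Hypothetical categorical groups coded as 1 and 2 on the x-axis.
group_codes = [1, 1, 1, 1, 2, 2, 2, 2]
jittered = jitter(group_codes)
print(jittered)
```

Only the plotting position changes; the measured values on the other axis are left untouched.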
What is the mean?
The average value of the data set
What is the effect size?
The difference between the means of two groups
What is the range of a set of data?
The difference between the smallest and largest data point
What is the interquartile range?
The difference between the upper quartile and the lower quartile.
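A short sketch comparing the range and the interquartile range on made-up data with one outlier. It uses `statistics.quantiles` with its default method; other quartile conventions can give slightly different values:

```python
import statistics

def data_range(values):
    """Difference between the largest and smallest data point."""
    return max(values) - min(values)

def iqr(values):
    """Upper quartile minus lower quartile (default 'exclusive' quantile method)."""
    q1, _median, q3 = statistics.quantiles(values, n=4)
    return q3 - q1

values = [2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 40]  # hypothetical data; 40 is an outlier
print(data_range(values))  # dominated by the single outlier
print(iqr(values))         # barely affected by it
```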
What are reference ranges based on in medicine?
The fact that 95% of the population will lie within two standard deviations of the mean
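As an illustration, a reference range of this kind can be sketched as the mean ± 2 standard deviations of values from healthy individuals. This assumes the values are approximately normally distributed; the data and function name are invented:

```python
import statistics

def reference_range(values):
    """Return (low, high) as mean -/+ 2 sample standard deviations."""
    m = statistics.mean(values)
    sd = statistics.stdev(values)
    return m - 2 * sd, m + 2 * sd

healthy = [10, 12, 14, 16, 18]  # hypothetical measurements from healthy animals
low, high = reference_range(healthy)
print(f"reference range: {low:.1f} to {high:.1f}")
```

Note that this uses the standard deviation, not the standard error, because the range describes individuals rather than the mean.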
What are the dependent and independent variables determined by?
The hypothesis
What is evidence based veterinary medicine?
The idea that veterinary medicine should be based on solid scientific evidence
What is the median?
The middle value of an ordered set of data
What is the mode of a data set?
The most common value
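The mean, median and mode can all be computed with Python's `statistics` module. The data below are made up, with one large value included to show how it affects the mean more than the median:

```python
import statistics

litter_sizes = [3, 5, 5, 6, 8, 9, 20]  # hypothetical data; 20 is unusually large

mean_value = statistics.mean(litter_sizes)      # 8: pulled upward by the value 20
median_value = statistics.median(litter_sizes)  # 6: the middle of the ordered values
mode_value = statistics.mode(litter_sizes)      # 5: the most common value

print(mean_value, median_value, mode_value)
```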
What is stats?
The process of establishing the probability of samples coming from the same populations
What is the t-statistic measuring?
The ratio between the size of the effect and the amount of variation between individuals
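A sketch of that ratio for two independent groups, assuming the pooled-variance form of Student's t-test (equal variances in both groups). The function name and data are illustrative only:

```python
import math
import statistics

def student_t(a, b):
    """Pooled-variance t statistic: difference in means divided by its standard error."""
    na, nb = len(a), len(b)
    va, vb = statistics.variance(a), statistics.variance(b)
    # Pool the two sample variances, weighting each by its degrees of freedom.
    pooled = ((na - 1) * va + (nb - 1) * vb) / (na + nb - 2)
    standard_error = math.sqrt(pooled * (1 / na + 1 / nb))
    effect = statistics.mean(a) - statistics.mean(b)
    return effect / standard_error

group_a = [1, 2, 3, 4, 5]  # hypothetical data
group_b = [3, 4, 5, 6, 7]
print(student_t(group_a, group_b))
```

The numerator is the effect size and the denominator reflects the variation between individuals; this form of the test has `na + nb - 2` degrees of freedom.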
What does the term population mean in statistics?
The set of individuals/objects that we are interested in knowing about.
What's the difference between standard deviation and standard error of the mean?
The standard deviation is a measure of the variation within the population. The standard error of the mean is a measure of how close the mean of your sample is to the true mean of the population.
What is the most common reason for incorrectly accepting the null hypothesis and rejecting the study hypothesis?
The study size is too small
What is the independent variable?
The variable that is changed
What is a dependent variable?
The variable that is measured; it changes depending on the independent variable
Define the 'inter-individual variation'
The variation around the specific group's mean
What does a p-value of less than 0.05 mean?
If the samples were from the same population, there would be less than a 1 in 20 chance of seeing a difference at least this large.
What's the point of graphs?
They can be used to: identify the general shape of the data; identify unusual or outlying points; compare the shape of two independent data sets; identify relationships between two variables in a dataset.
Why does the population need to be as clearly defined as possible?
To be able to take a valid sample, and to be able to make inferences from the sample which are valid for the target population.
When is a bar chart used?
To display discrete or categorical data
False positives are what kind of error?
Type I
False negatives are what kind of error?
Type II error
What is categorical data?
Types of data which may be divided into groups.
What is a type I error?
When you decide that the two samples are from different populations, when they're not
What are type II errors?
When you decide that the two samples are from the same population, when they're not
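A small simulation can illustrate type I errors. In the sketch below both samples are always drawn from the same population, so every "significant" result is a false positive; with a 0.05 threshold, roughly 1 in 20 trials should be one. It assumes a pooled-variance t-test and uses 2.101 as the two-sided 5% critical value for 18 degrees of freedom; the function names and settings are hypothetical:

```python
import math
import random
import statistics

T_CRIT_DF18 = 2.101  # two-sided 5% critical value of Student's t, 18 degrees of freedom

def pooled_t(a, b):
    """Pooled-variance t statistic for two independent samples."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * statistics.variance(a) + (nb - 1) * statistics.variance(b)) / (na + nb - 2)
    return (statistics.mean(a) - statistics.mean(b)) / math.sqrt(pooled * (1 / na + 1 / nb))

def false_positive_rate(trials=2000, n=10, seed=1):
    """Fraction of trials declared significant when both samples share one population."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        a = [rng.gauss(0, 1) for _ in range(n)]
        b = [rng.gauss(0, 1) for _ in range(n)]  # same population as a
        if abs(pooled_t(a, b)) > T_CRIT_DF18:
            hits += 1
    return hits / trials

rate = false_positive_rate()
print(f"false positive rate: {rate:.3f}")
```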
When would you transform data?
When you've got continuous data that isn't normally distributed
What are degrees of freedom?
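the number of independent values that are free to vary when estimating a parameter (e.g. n - 1 for a sample variance); used as a correction for small sample sizes, where estimates of the true mean of the population are likely to be inaccurate.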
What is the normal distribution?
a function that represents the distribution of many random variables as a symmetrical bell-shaped graph.
What is a sample?
a subset of the population
What is a linear model?
an equation of a straight line through the data, of the form y = mx + b; it can summarize general patterns
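A minimal sketch of fitting y = mx + b, assuming ordinary least squares (the line that minimises the squared vertical distances to the points). The function name and data are made up for illustration:

```python
def fit_line(xs, ys):
    """Return (m, b) for the least-squares line y = m*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    m = sxy / sxx            # slope
    b = mean_y - m * mean_x  # intercept: the line passes through the point of means
    return m, b

xs = [1, 2, 3, 4]  # hypothetical data lying exactly on y = 2x + 1
ys = [3, 5, 7, 9]
m, b = fit_line(xs, ys)
print(m, b)
```

With real data the points will not lie exactly on the line, and the slope m summarises the general trend.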
What are the types of categorical data?
nominal and ordinal
What is quantitative data?
numerical data
If the 95% confidence intervals don't overlap, the two samples are likely to be drawn from ...
populations with different means
What is a null hypothesis?
the hypothesis that there is no significant difference between specified populations, any observed difference being due to sampling or experimental error.
Define the term unbiased
the parameters (mean, standard deviation) estimated from a sample will be close to the true values and not systematically shifted in any way