Module 2: Fundamentals of Probability, Random Variables, Distributions, and Collecting Data
True or false: The Belmont Report was designed for social science applications.
False
A research experiment regarding which disorder prompted increased federal protection for human subjects?
Syphilis
The probability mass function 𝑓𝑋 of a discrete random variable 𝑋 has the properties:
0 ≤ 𝑓𝑋(𝑥) ≤ 1 for every 𝑥, and ∑𝑥 𝑓𝑋(𝑥) = 1 (summing over all 𝑥)
Consider the sample space of 𝑛 tosses of a coin. What is the size of this space?
2^n
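As a quick check of the 2^n count, the sample space can be enumerated directly for a small 𝑛. This is only an illustrative sketch; the function name `coin_toss_sample_space` is made up for this example.

```python
from itertools import product

def coin_toss_sample_space(n):
    """Return every possible sequence of n coin tosses as a tuple of 'H'/'T'."""
    return list(product("HT", repeat=n))

space = coin_toss_sample_space(3)
print(len(space))  # 8, which equals 2**3
```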
Assume that you have a continuous random variable which is uniformly distributed in the range: [3,8]. What is the probability that the random variable takes on a value less than or equal to 7?
4/5
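The 4/5 comes from the uniform CDF, (𝑥 − 𝑎)/(𝑏 − 𝑎) = (7 − 3)/(8 − 3). A minimal sketch of that calculation, with `uniform_cdf` as a hypothetical helper name:

```python
def uniform_cdf(x, a, b):
    """P(X <= x) for X uniformly distributed on [a, b]."""
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    return (x - a) / (b - a)

p = uniform_cdf(7, 3, 8)
print(p)  # 0.8, i.e. 4/5
```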
What is "BeautifulSoup"?
A package in Python which parses the HTML in websites to help users find information
What is an API?
A programming interface, typically constructed by the developers of an application, that among other things helps users obtain certain structured data from the application more easily
What is Amazon MTurk (Mechanical Turk)?
An online marketplace where users can pay other users to complete simple tasks
What is the principle by which we can simply add the numbers of combinations on the left-hand side to obtain the number of combinations on the right-hand side?
Each summand on the left hand side enumerates events which are disjoint
True or False: Google Maps' API gives users direct access to historical data.
False
True or False: The main goal of the IRB is to ensure that researchers set aside appropriate funding for their projects.
False
True or false: Organizations - such as Amazon - are allowed to collect data and publish without being subject to human subject protection rules.
False
True or false: The binomial distribution describes the number of successes in 𝑛 Bernoulli (binary outcome) trials, with the additional constraint that in each trial the probability of success has to be equal to the probability of failure.
False
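The statement is false because the binomial distribution allows any success probability 𝑝 between 0 and 1, not just 𝑝 = 1/2. The sketch below evaluates the binomial PMF for an assumed example with 𝑛 = 10 and 𝑝 = 0.3 and checks that the probabilities still sum to 1; `binomial_pmf` is an illustrative name.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(exactly k successes in n independent trials, each with success probability p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, p = 10, 0.3  # assumed example values; p differs from 1 - p
total = sum(binomial_pmf(k, n, p) for k in range(n + 1))
print(abs(total - 1.0) < 1e-12)  # True
```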
True or false: The conditions for a probability density function to be valid include that the density at each point is less than 1 and that the function must integrate to 1.
False
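The statement is false because a density may exceed 1 at a point; it only has to be non-negative and integrate to 1. One assumed example is the uniform distribution on [0, 0.5], whose density is 2 on that interval. The helper names below are hypothetical, and the integral is a simple midpoint-rule approximation.

```python
def uniform_pdf(x, a, b):
    """Density of the uniform distribution on [a, b]."""
    return 1.0 / (b - a) if a <= x <= b else 0.0

def midpoint_integral(f, lo, hi, steps=100_000):
    """Approximate the integral of f over [lo, hi] with the midpoint rule."""
    width = (hi - lo) / steps
    return sum(f(lo + (i + 0.5) * width) for i in range(steps)) * width

density = uniform_pdf(0.25, 0.0, 0.5)  # 2.0, which is greater than 1
area = midpoint_integral(lambda x: uniform_pdf(x, 0.0, 0.5), 0.0, 0.5)
print(density, abs(area - 1.0) < 1e-9)  # 2.0 True
```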
probability of the entire sample space
P(S) = 1
What are the core principles of the Belmont Report?
Respect for persons, beneficence, justice
What issues may come up when looking for data?
The data is free, the data is partially confidential (access is restricted), you need to get an agreement to access the data, you need to comply with requirements for data security
Joint probability density functions (joint PDF) for continuous random variables exhibit which of the following properties?
The value of the joint PDF at any particular point is non-negative; the joint PDF integrates to 1 over the entire domain
True or False: API users do not necessarily need to be completely familiar with Python.
True
discrete random variable
a random variable that can take one of a finite (or countably infinite) number of distinct values
continuous random variable
a random variable that may assume any numerical value in an interval or collection of intervals
In a simple sample space...
all outcomes are equally likely and unbiased
What is the Wayback Machine?
an archive of web pages
What is web scraping?
crawling a web page for information
In order to get the joint probability that her headache returns within three hours, the process is to take the () of the () over the regions where 𝑥 is between () and 𝑦 is between ()
double integral; joint probability density function; 0 and 3; 0 and 3
mutual exclusivity
empty intersection - no overlap
exhaustive
includes all possible outcomes
sampling 𝑘 items from 𝑛 items without replacement
n!/(n-k)! OR 𝑛(𝑛−1)...(𝑛−𝑘+1)
sampling 𝑘 items from 𝑛 items with replacement
n^k
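Both counting formulas refer to ordered samples and can be checked by brute-force enumeration for small numbers. The sketch below assumes 𝑛 = 4 items and 𝑘 = 2 draws.

```python
from itertools import permutations, product
from math import factorial

items = ["a", "b", "c", "d"]  # n = 4 (assumed example)
k = 2

# Ordered samples without replacement: n! / (n - k)!
without_replacement = list(permutations(items, k))
# Ordered samples with replacement: n ** k
with_replacement = list(product(items, repeat=k))

print(len(without_replacement), factorial(4) // factorial(4 - k))  # 12 12
print(len(with_replacement), 4 ** k)  # 16 16
```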
Given the CDF of a continuous random variable, which of the following processes allows you to get the PDF of that random variable?
take the derivative of the CDF
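To illustrate, the sketch below differentiates a CDF numerically and compares the result with the known PDF. It assumes an exponential distribution with rate 1 (CDF 1 − e^(−𝑥), PDF e^(−𝑥)) purely as an example; the function names are hypothetical.

```python
import math

def exp_cdf(x, rate=1.0):
    """CDF of an exponential distribution with the given rate."""
    return 1.0 - math.exp(-rate * x) if x >= 0 else 0.0

def numerical_pdf(cdf, x, h=1e-5):
    """Approximate the PDF at x as the central-difference derivative of the CDF."""
    return (cdf(x + h) - cdf(x - h)) / (2 * h)

x = 1.5
approx = numerical_pdf(exp_cdf, x)
exact = math.exp(-x)  # known PDF of the rate-1 exponential at x
print(abs(approx - exact) < 1e-6)  # True
```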
The hypergeometric distribution describes the number of "successes", commonly denoted (), in a sample of size () drawn from a population of size N whose initial probability of success is k/N, () replacement
x, n, without
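The distribution described on this card is the hypergeometric distribution, whose PMF is C(k, x)·C(N − k, n − x) / C(N, n). A minimal sketch using the card's symbols, with assumed example values N = 20, k = 7, n = 5:

```python
from math import comb

def hypergeom_pmf(x, N, k, n):
    """P(x successes in a sample of n drawn without replacement
    from a population of N that contains k successes)."""
    return comb(k, x) * comb(N - k, n - x) / comb(N, n)

N, k, n = 20, 7, 5  # assumed example values
total = sum(hypergeom_pmf(x, N, k, n) for x in range(0, min(n, k) + 1))
print(abs(total - 1.0) < 1e-12)  # True
```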
What addition rule holds when events 𝐴 and 𝐵 are disjoint?
𝑃(𝐴∪𝐵)=𝑃(𝐴)+𝑃(𝐵)
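The rule can be checked on a small example. The sketch below assumes one roll of a fair six-sided die and uses exact fractions; the event sets and the `prob` helper are illustrative.

```python
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}  # one roll of a fair die (assumed example)

def prob(event):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event & sample_space), len(sample_space))

A = {1, 2}
B = {5, 6}  # disjoint from A
C = {2, 3}  # overlaps A at the outcome 2

print(prob(A | B) == prob(A) + prob(B))  # True: A and B are disjoint
print(prob(A | C) == prob(A) + prob(C))  # False: A and C overlap
```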