Chapter 3 Quiz
Classification begins with
Decision Boundaries Training Data: existing data that have been manually evaluated and assigned to a class Test Data: existing data used to evaluate the model Decision Trees: used to divide data into smaller groups (pruning removes branches to avoid overfitting the model) Decision Boundaries: mark the split between one class and another
Prescriptive Analytics Examples
Decision Support Systems: rule-based systems that gather data and recommend actions based on the input (use rules to guide the accountant, rules derived from past behavior) Machine Learning and AI: adapt to new external data to recommend a course of action
Diagnostic Analytics Examples
Profiling: identifies the "typical" behavior of an individual, group, or population by compiling summary statistics about the data (done primarily using structured data) Clustering: helps identify groups of individuals that share common underlying characteristics Similarity Matching: used to identify similar individuals based on what is known about them Co-occurrence grouping: discovers associations between individuals based on common events Benford's Law: compares actual to expected values
Predictive Analytics Examples
Regression: estimates or predicts the numerical value of a dependent variable based on the slope and intersect of a line and the value of an independent variable (helps predict expected outcomes) Classification: predicts a class or category for a new observation based on the manual identification of classes from previous observations Link Prediction: predicts a relationship between two data items
Descriptive Analytics Examples
Summary Statistics: describe a set of data in terms of their location, shape, range, and size Data Reduction/Filtering: reduce the amount of observations to focus on relevant items Fuzzy Matching: locates approximate matches (useful for identifying relationships in imperfect data)
Diagnostic Analytics
procedures that explore the current data to determine why something has happened the way it has, typically comparing the data to a benchmark; how individual data relates to the general population
Prescriptive Analytics
procedures that model data to enable recommendations for what should be done in the future
Descriptive Analytics
procedures that summarize existing data to determine what has happened in the past
Predictive Analytics
procedures used to generate a model that can be used to determine what is likely to happen in the future