ISDS Test 4
How did Visa improve customer service while also improving detection of fraud?
By creating *more accurate* fraud identification systems, Visa was able to *decrease* the number of false positives and *reduce* the customer concerns and complaints that went along with them.
Data Mining Process: CRISP-DM
The CRISP-DM process is the *most comprehensive, common, and standardized data mining process*.
Which broad area of data mining applications analyzes data, forming rules to distinguish between defined classes?
Classification
Decision Trees Created by Logic Rules
Slides 51 and 52
Data Mining
Starts with a loosely defined discovery statement; *uses all existing data to discover novel patterns and relationships*; *looks for data sets as big as possible*; several million to a few billion are large for data mining studies
Definition of Data Mining
The *nontrivial* process of identifying valid, novel, potentially useful, and ultimately understandable patterns in data stored in structured databases.
How can data mining be used to fight terrorism?
The application case discusses use of data mining to detect money laundering and other forms of terrorist financing. Using data mining on *data about imports and exports* or finding an *observed price deviation* can help to detect *tax avoidance/evasion, money laundering, or terrorist financing*. Other applications could be to track the behavior and movement of potential terrorists, as well as text mining emails, blogs, and social media threads.
Four major types of DM patterns
1. Association 2. Prediction 3. Cluster (segmentation) 4. Sequential (or time series) relationships
Association
*Associations* find commonly co-occurring groupings of things. For example, an association *tells you what products your customers are most likely to purchase at the same time.* *Market basket analysis examples*: -Beer and diapers -Comprehensive automobile insurance and health insurance -Online books, online music and podcasts
Clusters (segmentation)
*Clusters (segmentation)* identify natural groupings of things based on their known characteristics, such as assigning customers to different segments based on their demographics and past purchase behaviors. Examples: -*Market segmentation of customers* -*Establishing new tax brackets* -Harry Potter: the Hogwarts Sorting Hat -Seating guests at a wedding
Data Mining Goes to Hollywood: Predicting Financial Success of Movies
*Goal*: Predicting the financial success of Hollywood movies before the start of their production process. *How*: Use of advanced predictive analytics methods. *Results*: Promising.
Association Rule Mining and Market Basket Analysis
*Input*: Simple point-of-sale transaction data. *Output*: The most frequent affinities among items. *Example*: According to the transaction data, "Customers who bought a laptop computer and antivirus software also bought an extended service plan 70 percent of the time." *How do you use such a pattern/knowledge?* -Put the items next to each other -Promote the items as a package -Place items far apart from each other!
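The "70 percent of the time" figure in a rule like the one above is usually called the rule's confidence, and the share of all transactions that contain every item in the rule is its support. The sketch below is a minimal illustration of both measures. The baskets and item names are hypothetical and are not taken from the application case.

```python
# Hypothetical point-of-sale baskets; item names are illustrative only.
transactions = [
    {"laptop", "antivirus", "service_plan"},
    {"laptop", "antivirus", "service_plan"},
    {"laptop", "antivirus"},
    {"laptop", "mouse"},
    {"antivirus", "service_plan"},
]

def support(itemset, baskets):
    """Fraction of baskets that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= basket for basket in baskets) / len(baskets)

def confidence(antecedent, consequent, baskets):
    """Of the baskets containing the antecedent, the fraction that also contain the consequent."""
    both = set(antecedent) | set(consequent)
    return support(both, baskets) / support(antecedent, baskets)

# Rule: {laptop, antivirus} -> {service_plan}
rule_support = support({"laptop", "antivirus", "service_plan"}, transactions)
rule_confidence = confidence({"laptop", "antivirus"}, {"service_plan"}, transactions)
print(f"support={rule_support:.2f}, confidence={rule_confidence:.2f}")
```

With these made-up baskets, two of the three laptop-plus-antivirus baskets also contain the service plan, so the confidence is about 0.67, and the rule's support is 0.40.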
Predictions
*Predictions* tell the nature of future occurrences of certain events based on what has happened in the past, such as... a) *predicting the winner of the Super Bowl (classification) or* b) *forecasting the absolute temperature of a given day* (regression). c) *Prediction problems where the variables have numeric values are most accurately defined as regressions*.
Sequential (or time series) relationships
*Sequential (or time series) relationships* discover time-ordered events, such as predicting that an existing banking customer who already has a checking account will open a savings account followed by an investment account within a year.
Statistics
*Starts with a well-defined proposition and hypothesis*; collects sample data to test the hypothesis; looks for the right size of data or takes a data sample; *a few hundred or a thousand data points are large enough for a statistician*
Association Rule Mining and Market Basket Analysis
Because of its successful application to retail business problems, association rule mining is commonly called *market basket analysis*. *Often used as an example to describe DM to ordinary people, such as the famous "relationship between diapers and beers!"* *Definition*: Identify strong relationships among different products (or services) that are usually purchased together (show up in the same basket together, either a physical basket at a store or a virtual basket at an e-commerce Web site).
What are the top challenges for law enforcement agencies and departments like Miami-Dade Police Department? Can you think of other challenges (not mentioned in this case) that can benefit from data mining?
*The primary problems were rising crime, impatient city leaders, and budget pressures*. Challenges for many agencies revolve around being able to provide the best possible service within a limited budget. This means that agencies must be able to be efficient in their use of time and resources, as well as ensuring that their results are positive. These issues are believed to be consistent across many departments and jurisdictions. In addition, other areas may struggle with specific questions about the use of funds, and the cost-benefit of different types of enforcement or possible prevention programs.
Cluster Analysis for Data Mining
-*Used for automatic identification of natural groupings of things* -*Learns the clusters of things from past data, then assigns new instances* -In marketing, it is also known as *segmentation*
Association Rule Mining
-A very popular DM method in business -Finds interesting relationships (affinities) between variables (items or events) -Also known as market basket analysis -Helps understand the purchase behavior of a buyer in the retail business -Often used as an example to describe DM to ordinary people, such as the famous "relationship between diapers and beers!" -Finds pairs of products that commonly appear together in a shopping cart
Banking & Other Financial Data Mining Applications
-Automate the loan application process -Optimize cash reserves with forecasting
Insurance Data Mining Applications
-Identify and prevent fraudulent claim activities -Determine optimal rate plans
CRM Applications Include
-Maximize return on marketing campaigns -Improve customer retention (churn analysis) -Maximize customer value (cross-, up-selling) -Identify and treat most valued customers
Retailing and Logistics Data Mining Applications
-Optimize inventory levels at different locations -Improve the store layout and sales promotions
Brokerage and Securities Trading Data Mining Applications
-Predict changes in certain bond prices -Forecast the direction of stock fluctuations
Manufacturing and Maintenance Data Mining Applications
-Predict/prevent machinery failures -Discover novel patterns to improve product quality
Most common standard processes, in the order of effectiveness:
1. CRISP-DM (Cross-Industry Standard Process for Data Mining) 2. SEMMA (Sample, Explore, Modify, Model, and Assess) 3. KDD (Knowledge Discovery in Databases)
Data Mining Truths
1. If using a mining analogy, "knowledge mining" would be a more appropriate term than "data mining." 2. The cost of data storage has plummeted recently, making data mining feasible for more firms. 3. Understanding customers better has helped Amazon and others become more successful. The understanding comes primarily from analyzing the vast amounts of data routinely collected. 4. Parallel processing is sometimes used for data mining because of the massive data amounts and search efforts involved. 5. The number of users of free/open source data mining software now exceeds that of users of commercial software versions.
Why is Data Mining gaining attention?
1. More intense *competition* at the *global scale*. 2. *Recognition of the value in data sources*. 3. *Availability of quality data* on customers, vendors, transactions, Web, etc. a. *A large portion of "understanding the customer" can come from analyzing the vast amount of data that a company routinely collects*. b. *This has helped Amazon and many other successful businesses*. 4. The exponential increase in data processing and storage capabilities, and the *decrease in cost*. a. *The cost of data storage has plummeted recently, making data mining feasible for more firms*.
Data Mining Myths
1. provides instant solutions and crystal-ball predictions 2. is not yet viable for business applications 3. requires a separate, dedicated database 4. can only be done by those with advanced degrees 5. is only for large firms that have lots of customer data 6. is another name for the good-old statistics
Association Rule Mining
A *very popular* DM method in business. *Finds interesting relationships (affinities) between variables (items or events)*.
What do you think are the promises and major challenges for data miners in contributing to medical and biological research endeavors?
According to the American Cancer Society, half of all men and one-third of all women in the United States will develop cancer during their lifetimes; approximately 1.68 million new cancer cases will be diagnosed in 2017. Cancer is the second most common cause of death in the United States and in the world, exceeded only by cardiovascular disease. *Data mining shows tremendous promise for helping to understand cancer, leading to better treatment and saved lives. Data mining is not meant to replace medical professionals and researchers, but to complement their invaluable efforts to provide data-driven new research directions and to ultimately save more lives. Without the cooperation and feedback from the medical experts, data mining results are not of much use. The patterns and relationships found via data mining methods should be evaluated by medical professionals who have years of experience in the problem domain to decide whether they are logical, actionable, and novel to warrant new research directions*.
Decision Trees
Analysis procedure which classifies observations into distinct groups based upon the values of predictor/input variables
Which broad area of data mining applications partitions a collection of objects into natural groupings with similar features?
Clustering (example: market segmentation). Clustering partitions a collection of things into segments whose members share *similar characteristics*.
CRISP-DM phases:
Composed of six consecutive phases. Step 1: Business Understanding; Step 2: Data Understanding; Step 3: Data Preparation (data pre-processing); Step 4: Model Building; Step 5: Testing and Evaluation; Step 6: Deployment (use for prediction). Steps 1-3 account for 85% of project time.
Cluster Analysis for Data Mining
Creates groups that have *maximum* similarity among members within each group and *minimum* similarity among members across the groups. *Market Segmentation* is an analysis that divides customers into groups based upon demographics so that you can target those groups with different advertising campaigns. -*Example*: Marketing to senior citizens is done differently from marketing to millennials.
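The notes do not name a specific clustering algorithm. As one common choice, the sketch below uses a one-dimensional k-means to split customers into two age segments, in the spirit of the senior citizens versus millennials example. The ages, the starting centers, and the function name `k_means_1d` are all assumptions made for illustration.

```python
def k_means_1d(values, centers, iterations=10):
    """Assign each value to its nearest center, then move each center to its cluster mean."""
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # An empty cluster keeps its previous center.
        centers = [sum(c) / len(c) if c else centers[i] for i, c in enumerate(clusters)]
    return centers, clusters

# Hypothetical customer ages and assumed starting centers.
ages = [22, 25, 27, 31, 64, 68, 71, 75]
centers, clusters = k_means_1d(ages, centers=[20.0, 80.0])
print("segment centers:", centers)
print("segments:", clusters)
```

Members inside each resulting segment are close to one another in age, while the two segments sit far apart, which is the "maximum similarity within, minimum similarity across" idea in miniature.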
k-Fold Cross Validation (rotation estimation)
Data is split into k mutually exclusive subsets, and k training/testing experiments are conducted.
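A minimal sketch of how the k experiments can be laid out: each fold serves as the test set once while the remaining folds form the training set. The helper name `k_fold_indices` is hypothetical, and the folds here are contiguous and unshuffled for readability; real studies usually shuffle the data first.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs; every sample is used for testing exactly once."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so fold sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size

folds = list(k_fold_indices(n_samples=10, k=5))
for train_idx, test_idx in folds:
    print("train:", train_idx, "test:", test_idx)
```

The accuracy measured in the k experiments is then typically averaged to give a single estimate.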
Data Mining Characteristics and Objectives
Data is the most critical ingredient for DM, and it may include soft/unstructured data.
Data Mining and Privacy
Data that is collected, stored, and analyzed in data mining is often *private and personal*. One way to accomplish privacy and protection of individuals' rights when data mining is by *de-identification* of the customer records prior to applying data mining applications, so that the records cannot be traced to an individual. Third party providers of publicly available datasets protect the anonymity of the individuals in the data set primarily *by removing identifiers such as names and social security numbers*.
What was the challenge Dell was facing that led to their analytics journey?
Dell noticed that customers were spending a significant amount of time evaluating products before they contacted sales. Dell wanted to ensure that this evaluation of products was positive for the company, and wanted to make sure that they were providing accurate information in a format that was expedient for customers. The problem was that the company had a huge variety of information available, and figuring out how to understand that information required additional research.
Decision Trees
Employs a divide-and-conquer method. Recursively divides a training set until each division consists of examples from one class.
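The divide-and-conquer idea can be sketched as a short recursive function. The notes do not say how the splitting attribute is chosen, so this sketch assumes a simple rule: pick the attribute whose split leaves the fewest misclassified training rows. The loan data, attribute names, and function names are hypothetical.

```python
from collections import Counter

def build_tree(rows, attributes, target):
    """Recursively split rows until each division holds a single class."""
    labels = [r[target] for r in rows]
    majority = Counter(labels).most_common(1)[0][0]
    # Stop when the division is pure or no attributes are left to split on.
    if len(set(labels)) == 1 or not attributes:
        return majority

    def errors_after_split(attr):
        # Assumed criterion: rows misclassified if each branch predicts its majority class.
        groups = {}
        for r in rows:
            groups.setdefault(r[attr], []).append(r[target])
        return sum(len(g) - Counter(g).most_common(1)[0][1] for g in groups.values())

    best = min(attributes, key=errors_after_split)
    remaining = [a for a in attributes if a != best]
    branches = {}
    for value in {r[best] for r in rows}:
        subset = [r for r in rows if r[best] == value]
        branches[value] = build_tree(subset, remaining, target)
    return {"attribute": best, "branches": branches, "default": majority}

def classify(tree, row):
    """Follow branches until a leaf (a class label) is reached."""
    while isinstance(tree, dict):
        tree = tree["branches"].get(row[tree["attribute"]], tree["default"])
    return tree

# Hypothetical loan-approval training set.
rows = [
    {"income": "high", "owns_home": "yes", "approve": "yes"},
    {"income": "high", "owns_home": "no", "approve": "yes"},
    {"income": "low", "owns_home": "yes", "approve": "yes"},
    {"income": "low", "owns_home": "no", "approve": "no"},
]
tree = build_tree(rows, ["income", "owns_home"], "approve")
print(classify(tree, {"income": "low", "owns_home": "no"}))
```

Production decision tree algorithms typically use measures such as information gain or the Gini index to choose splits; the error-count rule here is only a stand-in to keep the example short.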
How can data mining be used for ultimately curing illnesses like cancer?
Even though cancer research has traditionally been clinical and biological in nature, in recent years data-driven analytic studies have become a common complement. In medical domains where data- and analytics-driven research have been applied successfully, novel research directions have been identified to further advance the clinical and biological studies. Data mining algorithms that predict cancer survivability with high predictive power are valuable but *cannot replace the medical professionals*. Using data mining techniques, medical researchers are able to identify novel patterns, paving the road toward a cancer-free society. *Data mining methods are capable of extracting patterns and relationships hidden deep in large and complex medical databases*.
Chapter Overview
Generally speaking, data mining is a way to develop intelligence (i.e., *actionable information* or knowledge) from data that an organization collects, organizes, and stores. A wide range of data mining techniques are being used by organizations to gain a *better understanding of their customers* and their own operations and to *solve complex organizational problems*.
Predictive Accuracy
Hit rate
Why do law enforcement agencies and departments like Miami-Dade Police Department embrace advanced analytics and data mining?
Law enforcement agencies have embraced advanced analytics and data mining because it allows them to address many of the needs that they have in their departments. Specifically, it allows them to be more efficient in their use of money and resources. They are able to do this by being more selective in the types of activities that they engage in. *Additionally, data may be used to help them look at outstanding crimes (cold cases), and to find new avenues that may be explored to solve these previously unsolved crimes*.
Opening Vignette
Miami-Dade Police Department Is Using Predictive Analytics to Foresee and Fight Crime
Are Statistics and Data Mining the Same?
No
Data Mining extracts patterns from data
Pattern: a mathematical (numeric and/or symbolic) relationship among data items. Hidden patterns are identified.
Single/Simple Split
Simple split (or holdout or test sample estimation): Split the data into 2 mutually exclusive sets: training (~70%) and testing (~30%). For neural networks, the data is split into three subsets: training (~60%), validation (~20%), and testing (~20%).
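A minimal sketch of both splits described above, using only the standard library. The function name `split_records`, the fixed seed, and the integer stand-in records are assumptions for illustration.

```python
import random

def split_records(records, fractions, seed=42):
    """Shuffle a copy of records and cut it into mutually exclusive parts by fraction."""
    shuffled = records[:]
    random.Random(seed).shuffle(shuffled)
    parts, start = [], 0
    for fraction in fractions[:-1]:
        size = int(round(len(shuffled) * fraction))
        parts.append(shuffled[start:start + size])
        start += size
    parts.append(shuffled[start:])  # the last part takes whatever remains
    return parts

data = list(range(100))  # stand-in for 100 records

# Two-way holdout: ~70% training, ~30% testing.
train, test = split_records(data, [0.7, 0.3])

# Three-way split for neural networks: ~60% / ~20% / ~20%.
nn_train, nn_val, nn_test = split_records(data, [0.6, 0.2, 0.2])

print(len(train), len(test))
print(len(nn_train), len(nn_val), len(nn_test))
```

Shuffling before cutting keeps the sets from inheriting any ordering in the original data, and the fixed seed makes the split reproducible.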
Application Case 4.6
Slide 79
What did Influence Health do?
The company implemented an analytics system that looked at health records to help identify when potential customers would need health services. By better understanding the probability of need, the company was better able to market its services to customers. *The goal of the system was increasing customers' use of its services. Customers are more often comparing services from a variety of healthcare service providers before selecting one*.
Classification
The data is *qualitative*. The output variable is *categorical* (nominal or ordinal) in nature. *Nominal data* has finite non-ordered values. *Examples*: yes/no (Y/N); gender (M/F); ethnic groups (a choice from a list of groups). *Ordinal data* has finite ordered values. *Examples*: Store location is good, fair, bad; customer credit rating is 0-Bad, 1-Fair, 2-Excellent.
How can data mining help companies in the healthcare industry (in ways other than the ones mentioned in this case)?
There are many possibilities. One may be a way to evaluate and predict services that may be needed by a population. This enhanced understanding may help drive preventive care in the future. *For example, Influence Health was able to evaluate over 195 million patient records in only two days and extract complex insights into patient disease encounter with greater speed and accuracy than ever before*.
What solution did Dell develop and implement? What were the results?
To solve this problem, Dell created a single data mart that contained information from a wide variety of sources. *In the Dell case study, engineers, working closely with marketing, used lean software development strategies and numerous technologies to create a highly scalable, singular data mart. This data mart became the singular repository for information that was used for making decisions in the company. This decision had many positive results, including saving significant amounts in operational costs as well as driving increased revenues*.
Cluster Analysis for Data Mining
Used for automatic identification of *natural groupings* of things. Also known as *segmentation*. Cluster analysis has been used extensively for *fraud detection* (both credit card and e-commerce fraud) and *market segmentation of customers* in CRM. *Place customers into groups having very similar characteristics*.
What challenges were Visa and the rest of the credit card industry facing?
Visa was facing twin challenges. *The focus of the predictive analytics system in the Visa credit card case was two-fold: on more accurately detecting and handling fraudulent claims and reducing customer concerns and complaints that went along with these claims*. The first challenge was the growing rates of credit card fraud, while the second challenge was inaccurate fraud identification systems that created customer issues. For example, customers were on a dream vacation or a critical business trip and tried to use their Visa credit card for a large and unexpected purchase of goods or services. This flagged the transaction as possible fraud in the Visa fraud risk tools. Visa would then deny the transaction and freeze the credit card. This is a false positive result. Customers were very unhappy when this happened to them.
Decision Tree Algorithm
When a problem has many attributes that influence the classification of different patterns, decision trees may be a useful approach.
Why is it important for many Hollywood professionals to predict the financial success of movies?
a. *It is hard to predict box-office receipts for a movie*. b. *Predictive models in the early stages of movie production are effective for minimizing investments in flops*. The movie industry is the "land of hunches and wild guesses" due to the difficulty associated with forecasting product demand, making the movie business in Hollywood a risky endeavor. If Hollywood could better predict financial success, this would mitigate some of the financial risk.
Did Target go too far? Did it do anything illegal? What do you think Target should have done? What do you think Target should do next (quit these types of practices)?
a. *Lawful to store and analyze transaction & customer data. The company did not use any private or personal data. Legally speaking, there was no violation of any laws*. b. *Disturbing to identify teen pregnancy, terminal disease, divorce, or bankruptcy*. Target might have made a tactical mistake, but they certainly didn't do anything illegal. They did not use any information that violates customer privacy; rather, they used transactional data that almost every other retail chain is collecting and storing (and perhaps analyzing) about their customers. Indeed, even the father apologized when realizing his daughter was actually pregnant. *The fact is, we live in a world of massive data, and we are all as consumers leaving traces of our buying behavior for anyone to see*.
What do you think about data mining and its implication for privacy? What is the threshold between discovery of knowledge and infringement of privacy?
a. *Target sent maternity ads to a teen because Target's analytic model suggested she was pregnant based on her buying habits*. b. *Tradeoff between knowledge discovery and privacy rights*. c. *Risk of offending customers and hurting the bottom line*. There is a tradeoff between knowledge discovery and privacy rights. Retailers should be sensitive about this when targeting their advertising based on data mining results, especially regarding topics that could be embarrassing to their customers. Otherwise, they risk offending these customers, which could hurt their bottom line. This news story ran in Forbes and the New York Times, among other notable publications.
How can data mining be used for predicting financial success of movies before the start of their production process?
a. *To determine how much to invest in the movie production* b. *To evaluate tradeoffs to maximize success of movie production*. c. *Classification problem for prediction >Dependent variable - Class no. 1-9 (flop to blockbuster) >Select combination of independent variables, e.g., MPAA rating, competition, actors (star value), genre, special effects, sequel, etc.* The way the textbook authors, Sharda and Delen, did it was they applied individual and ensemble prediction models, and were able to identify significant variables influencing financial success. They also showed that by using sensitivity analysis, decision makers can predict with fairly high accuracy how much value a specific actor (or a specific release date, or the addition of more technical effects, etc.) brings to the financial success of a film, making the underlying system an invaluable decision aid.
A representative application of association rule mining includes popular uses in _ and in _.
business; medicine
In classification problems, the primary source for accuracy estimation is the __
confusion matrix
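A minimal sketch of a two-class confusion matrix and the overall accuracy derived from it. The labels are made up, loosely echoing the fraud-detection cases in these notes, where a legitimate transaction flagged as fraud counts as a false positive. The function name and data are assumptions for illustration.

```python
from collections import Counter

def confusion_matrix(actual, predicted, positive="fraud"):
    """Count true/false positives and negatives for one positive class."""
    counts = Counter()
    for a, p in zip(actual, predicted):
        if a == positive and p == positive:
            counts["TP"] += 1   # fraud correctly flagged
        elif a != positive and p == positive:
            counts["FP"] += 1   # legitimate transaction wrongly flagged
        elif a == positive and p != positive:
            counts["FN"] += 1   # fraud that was missed
        else:
            counts["TN"] += 1   # legitimate transaction correctly passed
    return counts

# Hypothetical actual vs. predicted labels for eight transactions.
actual = ["fraud", "ok", "ok", "fraud", "ok", "ok", "ok", "fraud"]
predicted = ["fraud", "ok", "fraud", "ok", "ok", "ok", "ok", "fraud"]

cm = confusion_matrix(actual, predicted)
accuracy = (cm["TP"] + cm["TN"]) / len(actual)
print(dict(cm), "accuracy:", accuracy)
```

Here six of the eight predictions land on the diagonal (TP and TN), giving an accuracy of 0.75.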
Scalability
construct a prediction model efficiently given a large amount of data
Striking it rich requires __
creative thinking
The data miner is often an __
end user
Only a few years ago, Data Mining became__?
exciting technology
Customer Relationship Management (CRM)
extends traditional marketing by creating one-on-one relationships with customers.
The CRISP-DM process is ____
highly repetitive and experimental
DM has become ____ for a vast majority of organizations
imperative and common practice
Another name for data mining
knowledge mining
Apriori Algorithm
The most commonly used algorithm to discover association rules. -Given a set of itemsets, the algorithm attempts to find subsets that are common to at least a minimum number of the itemsets. -Uses a bottom-up approach -Widely used for data mining
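A compact sketch of the bottom-up search: frequent single items are found first, then joined into larger candidate itemsets, and a candidate is kept only if all of its smaller subsets were themselves frequent. This is a simplified teaching version rather than an optimized implementation; the baskets and the `min_count` threshold are hypothetical.

```python
from itertools import combinations

def apriori(baskets, min_count):
    """Return {itemset: count} for every itemset found in at least min_count baskets."""
    baskets = [frozenset(b) for b in baskets]
    # Level 1 candidates: every single item.
    candidates = {frozenset([item]) for b in baskets for item in b}
    frequent = {}
    k = 1
    while candidates:
        counts = {c: sum(c <= b for b in baskets) for c in candidates}
        level = {c: n for c, n in counts.items() if n >= min_count}
        frequent.update(level)
        k += 1
        # Join step: combine frequent (k-1)-itemsets into k-itemsets.
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: keep a candidate only if all its (k-1)-subsets are frequent.
        candidates = {c for c in joined
                      if all(frozenset(s) in level for s in combinations(c, k - 1))}
    return frequent

# Hypothetical shopping baskets.
baskets = [
    {"beer", "diapers", "chips"},
    {"beer", "diapers"},
    {"beer", "chips"},
    {"diapers", "milk"},
    {"beer", "diapers", "milk"},
]
frequent_itemsets = apriori(baskets, min_count=3)
for itemset, count in sorted(frequent_itemsets.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), count)
```

With a minimum count of 3, only beer, diapers, and the pair {beer, diapers} survive; chips and milk are pruned at the first level, so no larger itemset containing them is ever counted.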
Classification
most frequently used DM method for real-world problems; Learn from past data, classify new data
Robustness
overcome noisy data to make somewhat accurate predictions
Because of the large amounts of data and massive search efforts, it is sometimes necessary to use __ for data mining.
parallel processing
DM extracts _ from data
patterns
Novel
previously unknown patterns are discovered
Potentially Useful
results should lead to some business benefit.
Many of the techniques used in data mining have their roots in traditional __ and __.
statistical analysis; artificial intelligence (AI)
DM is a __ for companies to compete with the giants of Amazon, Capital One, and Marriott?
strategic weapon
Valid
the discovered patterns should hold true on new data.
Although the term data mining is relatively new...
the ideas behind it are not new.
Data Mining Tools - Know the facts:
-Use mathematical techniques for extracting hidden patterns for predictive purposes. -Use patterns in data to develop mathematical rules for predicting outcomes for future observations. -Are commonly used to identify customer buying patterns to increase sales and for fraud detection, among other things. -In data mining, classification models help in prediction. -A data mining study is specific to addressing a well-defined business task, and different business tasks require different sets of data.