Ch. 4 & 5 test: Data Mining
Market-basket Analysis
...
Data Mining Patterns
1. Associations 2. Predictions 3. Clusters 4. Sequential relationships
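Associations, the first pattern type listed above, are usually scored with two measures: support (how often an itemset appears) and confidence (how often the consequent appears when the antecedent does). The sketch below is a minimal illustration only; the transactions and the function names `support` and `confidence` are made up for this example and do not come from any particular tool.

```python
# Hypothetical shopping baskets, invented purely for illustration.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(antecedent, consequent, transactions):
    """Estimate of P(consequent | antecedent).

    Assumes the antecedent occurs in at least one transaction;
    otherwise the division below would fail.
    """
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)

# Rule under test: "customers who buy bread also buy milk".
bread_milk_support = support({"bread", "milk"}, transactions)
bread_to_milk_conf = confidence({"bread"}, {"milk"}, transactions)
```

With these made-up baskets, bread and milk appear together in 2 of 4 transactions, and 2 of the 3 bread baskets also contain milk.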
Data Mining Process
1. Business Understanding 2. Data Understanding 3. Data Preparation 4. Model Building 5. Testing and Evaluation (of the model) 6. Deployment
Sensitivity Analysis
The use of previously trained prediction models to understand how specific input parameters affect the end results.
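One common way to do this is to change one input at a time while holding the others fixed and record how the model output moves. The sketch below assumes a stand-in model with made-up coefficients in place of a real trained model; `trained_model`, `sensitivity`, and the input names `income` and `debt` are hypothetical.

```python
def trained_model(income, debt):
    """Stand-in for a previously trained model; the coefficients are invented."""
    return 0.5 * income - 2.0 * debt + 10.0

def sensitivity(model, baseline, name, delta):
    """Change in model output when one input moves by delta, others held fixed."""
    perturbed = dict(baseline)
    perturbed[name] = baseline[name] + delta
    return model(**perturbed) - model(**baseline)

# Hypothetical baseline case to perturb.
baseline = {"income": 40.0, "debt": 5.0}

# Effect on the prediction of a one-unit increase in each input.
effects = {name: sensitivity(trained_model, baseline, name, 1.0) for name in baseline}
```

Because the stand-in model is linear, each effect equals that input's coefficient; with a real, non-linear model the effect would generally depend on the baseline chosen.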
Applications of Data Mining
a. identify successful therapies for illnesses and discover new drugs b. reduce fraudulent behavior (insurance claims and credit card usage) c. identify customer buying patterns d. reclaim profitable customers e. aid in market-basket analysis f. better target customers/clients
Artificial Intelligence
The science of designing and programming computer systems to do intelligent things and to simulate human thought processes such as reasoning and understanding language.
Characteristics of Data Mining
1. Data has previously been cleansed and consolidated into data warehouses. 2. Data mining takes place through a client/server architecture or a Web-based information systems architecture. 3. Sophisticated tools are constantly being developed to help remove the information ore from corporate files or archived public records. 4. The miner is often the end user, who is able to answer ad hoc queries with little or no programming skill. 5. Intelligent interpretation of the end results is necessary to reach an effective resolution. 6. Data mining tools are readily combined with spreadsheets and other software development tools. 7. Parallel processing may be required to mine large amounts of data.
Data Mining
1. The process through which previously unknown knowledge is discovered. 2. Discovering knowledge from large amounts of data.
Machine Learning
Artificial intelligence techniques that make it possible for machine performance to improve based on feedback from past performance. (Used in games like chess/checkers; based on prior actions)
CRISP-DM
The Cross-Industry Standard Process for Data Mining: a non-proprietary standard methodology in which a need is recognized and a solution is then created to resolve it.
Aspects of data mining
Data mining is a non-trivial process of identifying valid, novel, potentially useful, and understandable patterns in data.
SEMMA
Sample, Explore, Modify, Model, Assess: a data mining process created by the SAS Institute, well suited to exploratory analysis of a pertinent sample of data.