DATA MINING
OLAP (On-line Analytical Processing)
-Analysis technique with functionalities such as summarization, consolidation, and aggregation as well as the ability to view information from different angles.
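The aggregation and "different angles" ideas above can be illustrated with a small sketch. It is not tied to any OLAP product; the `SALES` fact table, the `DIMENSIONS` mapping, and the `roll_up` helper are all hypothetical names invented for this example.

```python
from collections import defaultdict

# Hypothetical fact table: (region, product, quarter, sales amount)
SALES = [
    ("East", "Laptop", "Q1", 100),
    ("East", "Phone", "Q1", 50),
    ("West", "Laptop", "Q1", 70),
    ("East", "Laptop", "Q2", 120),
    ("West", "Phone", "Q2", 30),
]

# Position of each dimension inside a fact row (assumed layout)
DIMENSIONS = {"region": 0, "product": 1, "quarter": 2}


def roll_up(rows, *dims):
    """Sum the sales measure, grouped by the chosen dimensions."""
    idx = [DIMENSIONS[d] for d in dims]
    totals = defaultdict(int)
    for row in rows:
        key = tuple(row[i] for i in idx)
        totals[key] += row[3]
    return dict(totals)


# The same data viewed from different angles
by_region = roll_up(SALES, "region")
by_product_quarter = roll_up(SALES, "product", "quarter")
grand_total = roll_up(SALES)  # no dimensions: full consolidation
```

Each call groups the same rows by a different set of dimensions, which is roughly what an OLAP tool does when a user changes the view of a data cube.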
Data Mining Tasks
-Classification
-Regression
-Clustering
-Summarization
-Link Analysis
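To make the first two tasks concrete, here is a minimal sketch contrasting classification (predicting a discrete label) with regression (predicting a number). The data, the nearest-neighbour rule, and the function names are assumptions chosen for illustration; they are only one of many ways to perform each task.

```python
def nearest_neighbor_label(train, x):
    """Classification: return the label of the closest training example."""
    return min(train, key=lambda pair: abs(pair[0] - x))[1]


def fit_line(points):
    """Regression: least-squares fit of y = slope * x + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical data: hours studied vs. outcome / exam score
labelled = [(1, "fail"), (2, "fail"), (6, "pass"), (8, "pass")]
scores = [(1, 52.0), (2, 54.0), (3, 56.0), (4, 58.0)]

predicted_label = nearest_neighbor_label(labelled, 7)  # a discrete class
slope, intercept = fit_line(scores)                    # a numeric model
```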
Difference between data warehouse and data mining
-Data mining does not require a data warehouse (DW), but a warehouse may be the best foundation for mining.
-If multiple analyses are run in sequence, the data must be held constant, as they are in a DW; in an operational database, data change often.
-The data in a DW are integrated and stable.
Components of Data Mining
-Database, Data Warehouse, World Wide Web, or other information repository
-Database and Data Warehouse Server
-Knowledge Base
-Data Mining Engine
-Pattern Evaluation Module
-User Interface
Data Mining Technologies
-Statistics
-Neural networks, genetic algorithms, fuzzy logic
-Decision trees
Roots of Data Mining
-Data mining was originally called statistical analysis, and its pioneers were statistical software companies such as SAS and SPSS.
-Traditional statistical techniques were later augmented by newer methods such as fuzzy logic, heuristics, and neural networks.
Data Mining ("Knowledge Mining From Data" to "Knowledge Mining")
-Extracting or mining knowledge from large amounts of data.
-Process of discovering interesting knowledge from large amounts of data stored in databases, data warehouses, or other information repositories.
-Process of using raw data to infer important business relationships.
Data Warehouse
-A data repository architecture that has emerged to support decision making.
-Repository of multiple heterogeneous data sources organized under a unified schema at a single site in order to facilitate management decision making.
-Includes data cleaning, data integration, and OLAP.
Knowledge Discovery from Data (KDD) steps:
-Data Cleaning
-Data Integration
-Data Selection
-Data Transformation
-Data Mining
-Pattern Evaluation
-Knowledge Presentation
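The seven steps can be sketched as a chain of small functions. This is a toy illustration under stated assumptions, not a standard implementation: the `STORE_A` and `STORE_B` records, the pair-counting "mining" step, and the 0.5 support threshold are all invented for the example.

```python
from collections import Counter
from itertools import combinations

# Hypothetical raw sources: two transaction logs with messy records
STORE_A = [
    {"id": 1, "items": "Bread, Milk"},
    {"id": 2, "items": "bread,butter"},
    {"id": 3, "items": None},  # missing value
]
STORE_B = [
    {"id": 4, "items": "milk, bread, eggs"},
    {"id": 5, "items": "Butter, Bread"},
]


def clean(records):
    """Data cleaning: drop records whose item list is missing."""
    return [r for r in records if r["items"]]


def integrate(*sources):
    """Data integration: combine several sources into one collection."""
    return [r for source in sources for r in source]


def select(records):
    """Data selection: keep only the attribute relevant to the analysis."""
    return [r["items"] for r in records]


def transform(item_strings):
    """Data transformation: normalize text into sets of lowercase items."""
    return [
        frozenset(name.strip().lower() for name in text.split(","))
        for text in item_strings
    ]


def mine(baskets):
    """Data mining: count how often each pair of items occurs together."""
    counts = Counter()
    for basket in baskets:
        counts.update(combinations(sorted(basket), 2))
    return counts


def evaluate(counts, n_baskets, min_support=0.5):
    """Pattern evaluation: keep pairs whose support meets the threshold."""
    return {
        pair: count / n_baskets
        for pair, count in counts.items()
        if count / n_baskets >= min_support
    }


def present(patterns):
    """Knowledge presentation: render the patterns as readable lines."""
    return [f"{a} + {b}: support {s:.2f}" for (a, b), s in sorted(patterns.items())]


baskets = transform(select(integrate(clean(STORE_A), clean(STORE_B))))
patterns = evaluate(mine(baskets), len(baskets))
report = present(patterns)
```

Note that the data mining step is only one function in the chain; the steps before it prepare the data and the steps after it filter and communicate the results.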
Data Mining vs. KDD (Knowledge Discovery in Databases)
-Data Mining: use of algorithms to extract the information and patterns derived by the KDD process.
-Knowledge Discovery in Databases (KDD): process of finding useful information and patterns in data.