Data Mining Test # 1

Data transformation techniques

Min-max normalization, z-score normalization, decimal scaling normalization.
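
The three normalization techniques named above can be sketched in a few lines of Python. This is a minimal illustration using the usual textbook definitions, not a reference implementation; the function names and the `new_min`/`new_max` parameters are assumptions chosen for this example, and the z-score uses the population standard deviation.

```python
from statistics import mean, pstdev


def min_max_normalize(values, new_min=0.0, new_max=1.0):
    """Linearly rescale values into the range [new_min, new_max]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical, so map them to the lower bound.
        return [new_min for _ in values]
    return [(v - lo) / (hi - lo) * (new_max - new_min) + new_min for v in values]


def z_score_normalize(values):
    """Center values on the mean and scale by the population standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def decimal_scaling_normalize(values):
    """Divide by 10**j, where j is the smallest integer giving max(|v|) < 1."""
    max_abs = max(abs(v) for v in values)
    j = 0
    while max_abs / (10 ** j) >= 1:
        j += 1
    return [v / (10 ** j) for v in values]


if __name__ == "__main__":
    data = [-986, 200, 917]
    print(min_max_normalize(data))
    print(z_score_normalize(data))
    print(decimal_scaling_normalize(data))
```

Min-max normalization is sensitive to outliers because the extremes define the range, while z-score normalization is a common choice when the minimum and maximum are unknown or unreliable.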

How to use the binning method to handle noisy data?

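Equal-width binning - divides the value range into intervals of equal size. Equal-depth (equal-frequency) binning - divides the sorted values into bins that each hold roughly the same number of values. The values in each bin can then be smoothed, for example by replacing them with the bin mean.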
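
As a rough sketch of how binning smooths noisy data, the Python below partitions a list of values with each method and then replaces every value with the mean of its bin. The function names, the choice of bin-mean smoothing, and the sample data are assumptions made for illustration; smoothing by bin medians or bin boundaries would follow the same pattern.

```python
def equal_width_bins(values, k):
    """Split the value range [min, max] into k intervals of equal width."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    bins = [[] for _ in range(k)]
    for v in sorted(values):
        if width == 0:
            idx = 0  # all values identical: everything lands in the first bin
        else:
            # The maximum value would index one past the end, so clamp it.
            idx = min(int((v - lo) / width), k - 1)
        bins[idx].append(v)
    return bins


def equal_depth_bins(values, k):
    """Split the sorted values into k bins with (nearly) equal counts."""
    ordered = sorted(values)
    n = len(ordered)
    bins, start = [], 0
    for i in range(k):
        # The first n % k bins take one extra value when n is not divisible by k.
        size = n // k + (1 if i < n % k else 0)
        bins.append(ordered[start:start + size])
        start += size
    return bins


def smooth_by_bin_means(bins):
    """Replace every value in a bin with that bin's mean."""
    return [[sum(b) / len(b)] * len(b) if b else [] for b in bins]


if __name__ == "__main__":
    data = [4, 8, 15, 21, 21, 24, 25, 28, 34]  # illustrative sample values
    depth = equal_depth_bins(data, 3)
    print(depth)                       # [[4, 8, 15], [21, 21, 24], [25, 28, 34]]
    print(smooth_by_bin_means(depth))  # [[9.0, 9.0, 9.0], [22.0, 22.0, 22.0], [29.0, 29.0, 29.0]]
    print(equal_width_bins(data, 3))   # [[4, 8], [15, 21, 21], [24, 25, 28, 34]]
```

Equal-width bins can end up with very uneven counts when the data is skewed, which is the main reason to prefer equal-depth bins in that case.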

The major steps of the KDD (knowledge discovery in databases) or data mining process

Input data - collection of data objects and their attributes. Data preprocessing - data extraction, cleaning, and transformation, which comprise the majority of the work of building a data warehouse. Data mining. Postprocessing. Information.

Differences between Data mining and traditional statistical methods

Statistics is the traditional field that deals with the quantification, collection, analysis, and interpretation of data, and with drawing conclusions from it. Data mining is an interdisciplinary field that draws on computer science (databases, artificial intelligence, machine learning, graphical and visualization models), statistics, and engineering (pattern recognition, neural networks).

The major tasks of data preprocessing

Data cleaning - fill in missing values, identify outliers and smooth out noisy data, correct inconsistent data, and resolve redundancy caused by data integration. Missing data - ignore the tuple: usually done when the class label is missing (assuming the task is classification); not effective when the percentage of missing values per attribute varies considerably. Alternatively, fill in the missing value automatically with a global constant or the attribute mean. Noisy data - random error or variance in a measured variable. Incorrect attribute values can have several causes, and other data problems also require data cleaning.
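
The missing-data strategies above can be illustrated with a short Python sketch. It assumes missing values are represented as `None` and that records are dictionaries with a `label` key; both conventions, and the function names, are hypothetical choices for this example.

```python
def drop_unlabeled(rows, label_key="label"):
    """Ignore the tuple: discard records whose class label is missing."""
    return [row for row in rows if row.get(label_key) is not None]


def fill_missing(values, strategy="mean", constant=None):
    """Fill None entries with the attribute mean or with a global constant."""
    if strategy == "mean":
        present = [v for v in values if v is not None]
        if not present:
            raise ValueError("cannot compute a mean when every value is missing")
        fill = sum(present) / len(present)
    elif strategy == "constant":
        fill = constant
    else:
        raise ValueError(f"unknown strategy: {strategy!r}")
    return [fill if v is None else v for v in values]


if __name__ == "__main__":
    rows = [
        {"age": 30, "label": "yes"},
        {"age": None, "label": "no"},
        {"age": 50, "label": None},  # dropped: no class label
    ]
    labeled = drop_unlabeled(rows)
    ages = [row["age"] for row in labeled]
    print(fill_missing(ages))                         # mean of the present values
    print(fill_missing(ages, "constant", constant=-1))
```

Filling with the attribute mean keeps every record but shrinks the attribute's variance, so it is worth checking how many values were imputed before relying on the result.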

Data Mining

Non-trivial extraction of implicit, previously unknown, and potentially useful information from data. Exploration and analysis, by automatic or semi-automatic means, of large quantities of data in order to discover meaningful patterns that support future decisions.

Data mining tasks

Predictive methods - use some variables to predict unknown or future values of other variables. Descriptive methods - find human-interpretable patterns or rules that describe the data.

Differences between Data mining and Database query processing

Query tools - tools that help analyze the data in a database. They provide query building, query editing, searching, finding, reporting, and summarizing functionality. Data mining - extraction of previously unknown and interesting information from raw data, using statistical models to look for hidden patterns. Data miners are interested in finding useful relationships between different data elements.

Data preprocessing

Transforming the raw input data into an appropriate format for subsequent analysis.

