Data mining test study guide


discrimination

a comparison of the general features of the target class data objects against the general features of objects from one or multiple contrasting classes

characterization

a summarization of the general characteristics or features of a target class of data

data qualities

accuracy, completeness, consistency, believability, timeliness, interpretability

two reasons why data mining is popular now and wasn't as popular 20 years ago

advancement in technology (computerization of society); powerful data collection and storage tools

data mining

an essential process where intelligent methods are applied to extract data patterns

what are the data mining functionalities

characterization, discrimination, classification, regression, clustering analysis, association, correlation

database

a collection of interrelated data that is managed by a specialized system known as a database management system (DBMS)

five-number summary of a distribution

consists of the median (Q2), the quartiles Q1 and Q3, and the smallest and largest individual observations
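
A minimal sketch of computing the five-number summary. The function name `five_number_summary` and the sample values are made up for illustration, and the quartiles use the median-of-halves convention, which is only one of several conventions in use.

```python
from statistics import median

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the median-of-halves convention."""
    data = sorted(values)
    n = len(data)
    if n == 0:
        raise ValueError("need at least one observation")
    half = n // 2
    lower = data[:half]           # observations below the median position
    upper = data[half + n % 2:]   # observations above the median position
    q1 = median(lower) if lower else data[0]
    q3 = median(upper) if upper else data[-1]
    return data[0], q1, median(data), q3, data[-1]

# Illustrative sample, not taken from any real data set.
summary = five_number_summary([7, 15, 36, 39, 40, 41])
```

Other quartile conventions (for example, interpolation) can give slightly different Q1 and Q3 on small samples.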

What are the steps of the knowledge discovery process (KDD)?

data cleaning, data integration, data selection, data transformation, data mining, data evaluation, knowledge presentation

association

determining which values are related to each other

what are some procedures for handling missing values

removing the records that contain missing values from the data set, which discards valuable data and leaves an incomplete data set
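
A minimal sketch of two ways to handle missing values: dropping the affected records, and filling the gap with the attribute mean. Mean filling is offered here as an assumed alternative, and the records, field names, and helper names are hypothetical.

```python
from statistics import mean

# Illustrative records; None marks a missing value.
records = [
    {"id": 1, "age": 25},
    {"id": 2, "age": None},
    {"id": 3, "age": 35},
]

def drop_missing(rows, field):
    """Keep only the rows where `field` has a value (loses data)."""
    return [r for r in rows if r[field] is not None]

def fill_with_mean(rows, field):
    """Replace missing values of `field` with the mean of the known values."""
    known = [r[field] for r in rows if r[field] is not None]
    fill = mean(known)
    return [dict(r, **{field: fill if r[field] is None else r[field]}) for r in rows]

dropped = drop_missing(records, "age")
filled = fill_with_mean(records, "age")
```

Dropping shrinks the data set from three records to two, while filling keeps all three at the cost of an estimated value.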

interpretability

whether the data are easy to understand or not

accuracy

whether the values are right or wrong, i.e., accurate or not

consistency

whether a value is consistent or inconsistent with other values

completeness

whether the values are recorded or not, i.e., available or not

believability

whether the values are trustworthy or not; can we trust the data and the data source

timeliness

whether the values are kept up to date and can be processed on time (updatability)

confidence in an association rule

the level of certainty of the association, i.e., the probability that the rule holds when its condition is met

support in an association rule

the percentage of records in the target data set in which the association actually occurs
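
A minimal sketch of how support and confidence can be computed for a rule such as bread => milk. The transactions and the function names `support` and `confidence` are made up for illustration.

```python
# Illustrative market-basket transactions, each a set of items.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

def support(itemset, txns):
    """Fraction of transactions that contain every item in `itemset`."""
    return sum(1 for t in txns if itemset <= t) / len(txns)

def confidence(antecedent, consequent, txns):
    """Fraction of transactions containing `antecedent` that also contain `consequent`."""
    return support(antecedent | consequent, txns) / support(antecedent, txns)

# Rule: bread => milk
rule_support = support({"bread", "milk"}, transactions)
rule_confidence = confidence({"bread"}, {"milk"}, transactions)
```

Here bread and milk appear together in 2 of 4 transactions, and 2 of the 3 transactions containing bread also contain milk.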

correlation

measuring how strongly two attributes are related, i.e., whether the values of one tend to rise or fall together with the values of the other

data warehouse

a repository of information collected from multiple sources and stored under a unified schema at a single site

attributes

represent characteristics of data objects

data object

represent entities

what is minimum confidence threshold

the minimum value that a rule's confidence is required to achieve; defined the same way as the minimum support threshold

outlier analysis

studying values that are separated from the rest of their class in order to explain why they occurred
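
A minimal sketch of flagging candidate outliers before studying them. It uses the 1.5 x IQR rule, which is one common heuristic chosen here as an assumption and not something the card prescribes. The readings are made up.

```python
from statistics import quantiles

# Illustrative readings; 95 is far from the rest.
readings = [10, 12, 11, 13, 12, 11, 95]

# quantiles(..., n=4) returns the three cut points Q1, Q2, Q3.
q1, _, q3 = quantiles(readings, n=4)
iqr = q3 - q1

# Values beyond 1.5 * IQR from the quartiles are flagged for closer study.
low_fence = q1 - 1.5 * iqr
high_fence = q3 + 1.5 * iqr
outliers = [x for x in readings if x < low_fence or x > high_fence]
```

Flagging is only the first step; the analysis itself is working out why the flagged values occurred.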

target data set

the class of data under study

what is minimum support threshold

the minimum value that a support is required to achieve
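
A minimal sketch of applying a minimum support threshold to single items. The threshold value `MIN_SUPPORT` and the transactions are hypothetical.

```python
from collections import Counter

# Illustrative transactions.
transactions = [
    ["bread", "milk"],
    ["bread", "butter"],
    ["bread", "milk", "butter"],
    ["milk"],
]

MIN_SUPPORT = 0.6  # hypothetical threshold chosen for this example

# Count each item at most once per transaction.
counts = Counter(item for t in transactions for item in set(t))

# Keep only the items whose support reaches the threshold.
frequent = {
    item: count / len(transactions)
    for item, count in counts.items()
    if count / len(transactions) >= MIN_SUPPORT
}
```

Butter appears in only half of the transactions, so it falls below the threshold and is dropped.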

What is data mining

the process of discovering interesting patterns and knowledge from large amounts of data

classification

the process of finding a model that describes and distinguishes data classes or concepts

data evaluation

to identify the truly interesting patterns representing knowledge based on interestingness measures

why do we pre-process data

to reduce redundancy from the same data appearing repeatedly in the data set; to save time during the data analysis phase of data mining; to handle incomplete data sets; to find possible data to replace the missing data; to clean noisy data from the data set

data cleaning

to remove noise and inconsistent data

clustering

used to generate class labels for a group of data

regression

used to predict missing or unavailable numerical data values rather than class labels
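
A minimal sketch of regression as numeric prediction, using a least-squares straight line. Simple linear regression is just one assumed form of regression, and the function name `fit_line` and the data points are made up.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Illustrative points that lie exactly on y = 2x + 1.
xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]

slope, intercept = fit_line(xs, ys)

# Predict the unavailable numeric value at x = 5.
predicted = slope * 5 + intercept
```

The output is a number, not a class label, which is what separates regression from classification.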

knowledge presentation

where visualization and knowledge representation techniques are used to present mined knowledge to users

data transformation

where data are transformed and consolidated into forms appropriate for mining by performing summary or aggregation operations
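
A minimal sketch of an aggregation step: rolling daily sales up into monthly totals. The dates and amounts are made up for illustration.

```python
from collections import defaultdict

# Illustrative (date, amount) records in ISO date format.
daily_sales = [
    ("2024-01-03", 120.0),
    ("2024-01-17", 80.0),
    ("2024-02-05", 200.0),
]

# Aggregate by the "YYYY-MM" prefix of each date.
monthly = defaultdict(float)
for date, amount in daily_sales:
    monthly[date[:7]] += amount

monthly_totals = dict(monthly)
```

The mining step then works on two monthly rows instead of three daily ones.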

data selection

where data relevant to the analysis task are retrieved from the database

data integration

where multiple data sources may be combined
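
A minimal sketch of combining two sources on a shared key. The source names `crm` and `billing`, the key `customer_id`, and the helper `integrate` are all hypothetical.

```python
# Two illustrative sources that describe overlapping customers.
crm = [
    {"customer_id": 1, "name": "Ana"},
    {"customer_id": 2, "name": "Ben"},
]
billing = [
    {"customer_id": 1, "balance": 40.0},
    {"customer_id": 3, "balance": 15.0},
]

def integrate(left, right, key):
    """Merge rows from both sources that share the same key value."""
    merged = {}
    for row in left + right:
        merged.setdefault(row[key], {}).update(row)
    return [merged[k] for k in sorted(merged)]

combined = integrate(crm, billing, "customer_id")
```

Customers that appear in only one source end up with missing attributes, which is one reason integration is usually paired with data cleaning.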

