Data Mining Midterm

A major feature of a data warehouse is that ____. A. old data is removed periodically to improve performance B. typical users include clerks and database professionals C. it focuses on the day-to-day operations of an organization D. it is time-variant

D

Intuitively, the drill-down OLAP operation corresponds to concept ____ in a concept hierarchy. A. cooperation B. ascension C. forecasting D. specialization

D

The major dimensions of a multidimensional view of data mining are: A. data, knowledge, utilization, applications B. data, knowledge, technologies, and applications C. data, knowledge, applications D. data, utilization, concurrency, modernization

b

Characterization and discrimination; the mining of frequent patterns, associations, and correlations; classification and regression; clustering analysis, and outlier analysis are all examples of data mining ____________

functionalities

A pattern is ________ if it is valid on test data with some degree of certainty, novel, potentially useful, and easily understood by humans

interesting

13, 15, 16, 16, 19, 20, 20, 21, 22, 22, 25, 25, 25, 25, 30, 33, 33, 35, 35, 35, 35, 36, 40, 45, 46, 52, 70 What is the first quartile? A. 19 B. 20 C. 21 D. 22

b

13, 15, 16, 16, 19, 20, 20, 21, 22, 22, 25, 25, 25, 25, 30, 33, 33, 35, 35, 35, 35, 36, 40, 45, 46, 52, 70 What is the median? A. 24 B. 25 C. 22 D. 27

b
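
The quartile and median answers for this data set can be checked with a short sketch. It assumes Python's standard `statistics` module and its default "exclusive" quantile method; other conventions (for example linear interpolation over n - 1 intervals) can give slightly different quartiles, so treat this as one reasonable convention rather than the only one.

```python
import statistics

# Data set from the questions above (already sorted, 27 values).
data = [13, 15, 16, 16, 19, 20, 20, 21, 22, 22, 25, 25, 25, 25,
        30, 33, 33, 35, 35, 35, 35, 36, 40, 45, 46, 52, 70]

# statistics.quantiles defaults to the "exclusive" method, which places the
# k-th quartile at position k * (n + 1) / 4 in the sorted data (1-based).
q1, q2, q3 = statistics.quantiles(data, n=4)

# With 27 values, the median is the 14th value in sorted order.
median = statistics.median(data)

print(q1, median, q3)
```

Under this convention the three cut points fall exactly on the 7th, 14th, and 21st values, so no interpolation is needed.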

Attribute-oriented induction is an alternative to the _____ approach for data generalization. A. three tier architecture B. concept hierarchies C. back-end tools D. data cube

D

________ is the process of discovering interesting patterns from massive amounts of data

data mining

True or false: Interesting patterns represent knowledge.

True

Smoothing, attribute construction, aggregation, normalization, and discretization are examples of data ____________ strategies

transformation

An _____________ association rule is a rule that association analysis deems strong but that is of no value to the problem at hand or is misleading

uninteresting

Measures of pattern interestingness are either objective or ________

subjective

Consider two objects represented by the tuples (22, 1, 42, 10) and (20, 0, 36, 8). What is the Manhattan distance?

11

How many cuboids are there in a 5-dimensional data cube if there are no hierarchies associated with any dimension?

32

How many cuboids are there in a 9-dimensional data cube if there are no hierarchies associated with any dimension?

512
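
The two cuboid counts follow from the fact that each cuboid is one group-by, that is, one subset of the dimensions. The sketch below is illustrative; the function names `count_cuboids` and `enumerate_cuboids` are hypothetical helpers, not part of any library.

```python
from itertools import combinations
from math import prod


def count_cuboids(levels_per_dimension):
    """Total cuboids = product of (L_i + 1) over all dimensions, where L_i is
    the number of hierarchy levels of dimension i and the +1 accounts for the
    virtual top level "all". With no hierarchies every L_i is 1, giving 2**n."""
    return prod(levels + 1 for levels in levels_per_dimension)


def enumerate_cuboids(dimensions):
    """List every group-by (subset of dimensions), from the apex () to the base."""
    return [subset
            for size in range(len(dimensions) + 1)
            for subset in combinations(dimensions, size)]


print(count_cuboids([1] * 5))  # 5 dimensions, no hierarchies
print(count_cuboids([1] * 9))  # 9 dimensions, no hierarchies
print(len(enumerate_cuboids(["A", "B", "C", "D", "E"])))
```

Enumerating the subsets explicitly is only practical for small cubes; the closed-form count is what shows how quickly full materialization grows.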

Consider two objects represented by the tuples (22, 1, 42, 10) and (20, 0, 36, 8). What is the Supremum distance? (round to 2 decimal places)

6

Consider two objects represented by the tuples (22, 1, 42, 10) and (20, 0, 36, 8). What is the Minkowski distance with h = 3? (round to 2 decimal places)

6.15

Consider two objects represented by the tuples (22, 1, 42, 10) and (20, 0, 36, 8). What is the Euclidean distance? (round to 2 decimal places)

6.71
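
The four distance answers for these two tuples can be reproduced with one small sketch. Manhattan and Euclidean distance are the Minkowski distance with h = 1 and h = 2; the 6.15 answer is consistent with order h = 3, which is assumed here. The function names are illustrative.

```python
def minkowski(x, y, h):
    """Minkowski distance of order h: (sum |x_i - y_i|**h) ** (1/h)."""
    return sum(abs(a - b) ** h for a, b in zip(x, y)) ** (1 / h)


def supremum(x, y):
    """Supremum (Chebyshev, L-infinity) distance: the largest coordinate gap."""
    return max(abs(a - b) for a, b in zip(x, y))


x = (22, 1, 42, 10)
y = (20, 0, 36, 8)

print(round(minkowski(x, y, 1), 2))  # Manhattan (h = 1)
print(round(minkowski(x, y, 2), 2))  # Euclidean (h = 2)
print(round(minkowski(x, y, 3), 2))  # Minkowski with h = 3 (assumed order)
print(supremum(x, y))                # Supremum (limit as h grows)
```

The coordinate gaps are 2, 1, 6, and 2, so as h grows the largest gap (6) dominates and the Minkowski distance approaches the supremum distance.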

How many steps are there involved in data mining when viewed as KDD?

7

13, 15, 16, 16, 19, 20, 20, 21, 22, 22, 25, 25, 25, 25, 30, 33, 33, 35, 35, 35, 35, 36, 40, 45, 46, 52, 70 What is the mean? A. 25 B. 29.96 C. 32.56 D. 30.40

B

13, 15, 16, 16, 19, 20, 20, 21, 22, 22, 25, 25, 25, 25, 30, 33, 33, 35, 35, 35, 35, 36, 40, 45, 46, 52, 70 What is the mode? A. 25 B. 35 C. 25 and 35 D. Does not exist

C
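
The mean and mode answers can be checked the same way. This is a minimal sketch using Python's standard `statistics` module (`multimode` requires Python 3.8 or later).

```python
import statistics

# Data set from the questions above (27 values).
data = [13, 15, 16, 16, 19, 20, 20, 21, 22, 22, 25, 25, 25, 25,
        30, 33, 33, 35, 35, 35, 35, 36, 40, 45, 46, 52, 70]

# Arithmetic mean: the values sum to 809, divided by 27 observations.
mean = statistics.mean(data)

# multimode returns every value tied for the highest frequency,
# which is what makes a bimodal data set visible.
modes = statistics.multimode(data)

print(round(mean, 2), modes)
```

Both 25 and 35 occur four times, so the data set is bimodal.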

In attribute-oriented induction, data relevant to the task at hand is collected and then generalization is performed by either attribute generalization or ___. A. full materialization B. concept description C. attribute removal D. partial materialization

C

13, 15, 16, 16, 19, 20, 20, 21, 22, 22, 25, 25, 25, 25, 30, 33, 33, 35, 35, 35, 35, 36, 40, 45, 46, 52, 70 What is the 3rd quartile? A. 35 B. 36 C. 40 D. 45

a

13, 15, 16, 16, 19, 20, 20, 21, 22, 22, 25, 25, 25, 25, 30, 33, 33, 35, 35, 35, 35, 36, 40, 45, 46, 52, 70 What is the minimum? A. 13 B. 15 C. 16 D. 17

a

A ____ is a repository for long-term storage of data from multiple sources organized so as to facilitate management decision making. A. Data warehouse B. data mart C. transactional database D. object oriented programming language

a

Consider a data cube measure obtained by applying the average() function. The measure is ___. A. algebraic B. analytic C. holistic D. atomic

a
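
A measure is algebraic when it can be computed from a bounded number of distributive sub-aggregates. For average(), those are sum() and count(). The sketch below uses two made-up partitions purely for illustration.

```python
# Two hypothetical partitions (chunks) of a measure column; the values are
# made up for illustration only.
partitions = [[4, 8, 6], [10, 2]]

# Distributive sub-aggregates: computed per partition, then combined.
partial = [(sum(p), len(p)) for p in partitions]
total_sum = sum(s for s, _ in partial)
total_count = sum(c for _, c in partial)

# avg() is algebraic: it is derived from two distributive results.
avg_from_partials = total_sum / total_count

# Reference value: the average computed over all values at once.
all_values = [v for p in partitions for v in p]
avg_direct = sum(all_values) / len(all_values)

print(avg_from_partials, avg_direct)
```

A holistic measure such as median() has no such fixed-size summary, which is why it cannot be combined across partitions this way.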

Data mining functionalities are used to specify kinds of patterns or _______ to be found in data mining tasks. A. knowledge B. transactions C. relations D. history

a

In the ___ method, the process to design and construct a data warehouse is sequential, moving onto each phase only if the previous phase is complete. A. waterfall B. top-down C. spiral D. bottom-up

a

The 2-D data cube below represents information on freshwater resources per country (in cubic kilometers). The cube contains the dimensions location and resource. The concept hierarchy for location is defined as the total order "country < continent".

Location    Resource
Brazil      8,233
USA         3,069
Canada      2,902
China       2,840
Colombia    2,132

Which operation materializes the view provided below?

Location    Resource
Canada      2,902
China       2,840

A. dice B. drill-up C. drill-through D. rotate

a
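
The selection in this question can be sketched as a filter on the location dimension. The dictionary representation and the helper name `select_locations` are assumptions made for illustration; a real OLAP server would express this differently.

```python
# The cube from the question, as a mapping from location to resource (km^3).
cube = {
    "Brazil": 8233,
    "USA": 3069,
    "Canada": 2902,
    "China": 2840,
    "Colombia": 2132,
}


def select_locations(cube, keep):
    """Return the sub-cube whose location value is one of `keep`."""
    return {loc: res for loc, res in cube.items() if loc in keep}


sub_cube = select_locations(cube, {"Canada", "China"})
print(sub_cube)
```

The result keeps the original granularity (countries) and the original values; only the set of cells shrinks, which distinguishes a selection from an aggregation such as roll-up.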

Multidimensional data mining is also called _______ multidimensional data mining A. Exploratory B. Meaningful C. Modern D. Useful

a

The ____ OLAP operation performs aggregation on a data cube, either by climbing up a concept hierarchy for a dimension or by dimension reduction. A. roll-up B. rotate C. drill-down D. slice

a

A data warehouse is a ____ collection of data in support of management's decision-making process. A. day-to-day operations-oriented, integrated, time-variant, and nonvolatile B. subject-oriented, integrated, time-variant, and nonvolatile C. subject-oriented, integrated, time-variant, and volatile D. subject-oriented, integrated, time-invariant, and nonvolatile

b

The bottom layer in a three-tier data warehouse architecture typically consists of _____. A. analysis and API tools B. a relational database system C. a server implemented using a ROLAP or MOLAP model D. a client layer

b

13, 15, 16, 16, 19, 20, 20, 21, 22, 22, 25, 25, 25, 25, 30, 33, 33, 35, 35, 35, 35, 36, 40, 45, 46, 52, 70 What is the midrange? A. 30 B. 41 C. 41.5 D. 42.5

c
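
The midrange is the average of the smallest and largest values, so it depends only on the two extremes. A minimal check:

```python
# Data set from the questions above (27 values).
data = [13, 15, 16, 16, 19, 20, 20, 21, 22, 22, 25, 25, 25, 25,
        30, 33, 33, 35, 35, 35, 35, 36, 40, 45, 46, 52, 70]

minimum, maximum = min(data), max(data)

# Midrange: midpoint between the extremes.
midrange = (minimum + maximum) / 2

print(minimum, maximum, midrange)
```

Because it uses only the minimum and maximum, the midrange is very sensitive to outliers such as the value 70 here.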

Among the data warehouse applications, ______ applications support OLAP operations such as roll-up, drill-down, and slice. A. star schema B. data mining C. analytical processing D. information processing

c

Data warehouse systems provide multidimensional data analysis capabilities, collectively referred to as ____. A. TCP B. UDP C. OLAP D. relational database

c

The ___ OLAP operation is realized by either stepping down a concept hierarchy for a dimension or introducing additional dimensions. A. dice B. rotate C. drill-down D. drill-up

c

Measures of _________ tendency indicate where most of the values in our data set fall

central

13, 15, 16, 16, 19, 20, 20, 21, 22, 22, 25, 25, 25, 25, 30, 33, 33, 35, 35, 35, 35, 36, 40, 45, 46, 52, 70 What is the maximum? A. 45 B. 46 C. 52 D. 70

d

A form of dimensional modeling used in online analytical processing systems is the _____. A. relational diagram B. entity-relationship model C. object-oriented data model D. star schema

d

An advantage of the spiral method to design and construct a data warehouse is that ____. A. it requires fewer resources B. the process moves onto each phase only if the previous phase is complete C. risks are managed late in the process D. modifications can be done quickly

d

The 2-D data cube below represents information on freshwater resources per country (in cubic kilometers). The cube contains the dimensions location and resource. The concept hierarchy for location is defined as the total order "country < continent".

Location    Resource
Brazil      8,233
USA         3,069
Canada      2,902
China       2,840
Colombia    2,132

Which operation materializes the view provided below?

Location         Resource
South America    10,365
North America    5,971
Asia             2,840

A. pivot B. drill-down C. slice D. roll-up

d
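
Climbing the location hierarchy from country to continent can be sketched as a group-and-sum. The country-to-continent mapping and the helper name `roll_up` are supplied here for illustration; they are not given in the question.

```python
from collections import defaultdict

# The cube from the question, as a mapping from country to resource (km^3).
cube = {
    "Brazil": 8233,
    "USA": 3069,
    "Canada": 2902,
    "China": 2840,
    "Colombia": 2132,
}

# One level of the concept hierarchy "country < continent" (assumed mapping).
continent_of = {
    "Brazil": "South America",
    "Colombia": "South America",
    "USA": "North America",
    "Canada": "North America",
    "China": "Asia",
}


def roll_up(cube, parent_of):
    """Aggregate each member's value into its parent in the hierarchy."""
    totals = defaultdict(int)
    for member, value in cube.items():
        totals[parent_of[member]] += value
    return dict(totals)


by_continent = roll_up(cube, continent_of)
print(by_continent)
```

Brazil and Colombia sum to the South America cell, and USA and Canada to the North America cell, matching the view in the question.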

Some measures of ________ are variance, standard deviation, and interquartile range.

dispersion
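
For the age data used in the earlier questions, these measures can be computed as sketched below. Two assumptions are labeled in the comments: quartiles use the `statistics` module's default "exclusive" method, and variance uses the population form (dividing by N); a sample variance (dividing by N - 1) would be slightly larger.

```python
import statistics

# Data set from the earlier questions (27 values).
data = [13, 15, 16, 16, 19, 20, 20, 21, 22, 22, 25, 25, 25, 25,
        30, 33, 33, 35, 35, 35, 35, 36, 40, 45, 46, 52, 70]

# Interquartile range: Q3 - Q1 (default "exclusive" quantile method assumed).
q1, _, q3 = statistics.quantiles(data, n=4)
iqr = q3 - q1

# Population variance and standard deviation (divide by N, assumed convention).
variance = statistics.pvariance(data)
std_dev = statistics.pstdev(data)

print(iqr, round(variance, 2), round(std_dev, 2))
```

The standard deviation is the square root of the variance, so it is expressed in the same units as the data, which usually makes it easier to interpret.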

CIST MPK is a useful acronym for the 7 steps of data mining when viewed as KDD: data cleansing, data integration, data selection, data transformation, data mining, pattern evaluation, and ________________. What is the last step? (It starts with K.)

knowledge presentation

The common measures of central tendency are mean, median, and ______

mode

