Data Analytics Chapters 1 and 8


What values will the following statements display? for i in range(1, 11): print(i)

1 to 10
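
As a quick check of this answer, the loop can be run as written. The `values` list is an addition for illustration only; it collects what the loop prints.

```python
# range(1, 11) starts at 1 and stops before 11, so the loop covers 1 through 10.
values = []  # illustrative helper, not part of the original question
for i in range(1, 11):
    print(i)
    values.append(i)
```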

Dendrogram

A chart for displaying cluster groupings

Which of the following attributes are commonly used to determine data quality?

Accuracy, Completeness, Conformity, and Consistency

To determine if an email is spam, data programmers would use:

Classification

One of the earliest data-mining and analytic tools was:

Excel

Data mining only applies to numeric data

False

Hot blistering is the process of representing categorical data with numeric values

False
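
Assuming this card is alluding to the technique usually called one-hot encoding, a minimal sketch of that idea follows. The `one_hot` helper and the color categories are hypothetical names chosen for illustration.

```python
def one_hot(value, categories):
    """Return a list with 1 at the position of value and 0 everywhere else."""
    return [1 if value == category else 0 for category in categories]

colors = ["red", "green", "blue"]  # hypothetical category list
encoded = [one_hot(color, colors) for color in ["green", "blue", "red"]]
print(encoded)
```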

Like other strongly typed programming languages, such as Java and C#, in Python you must declare a variable before you use it

False
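
A short sketch of why the statement is false: assignment creates a variable, and no separate declaration exists. The variable names here are illustrative.

```python
count = 10                         # the first assignment creates the variable
kind_before = type(count).__name__

count = "ten"                      # the same name can later hold another type
kind_after = type(count).__name__

try:
    print(undefined_name)          # a name that was never assigned
except NameError as error:
    message = str(error)           # Python reports the missing name at run time
```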

There are three forms of machine learning: supervised, unsupervised, and hybrid

False

To perform data-mining operations, Python scripts make extensive use of datatable objects. You can think of a datatable as a two-dimensional table that holds values

False

To use a Python module, within your script, you use the module statement

False
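
The statement is false because Python loads modules with the `import` statement. A minimal sketch using two standard-library modules, chosen here only as examples:

```python
import math                    # import a whole module
from statistics import mean    # import one name from a module

root = math.sqrt(16)
average = mean([2, 4, 6])
print(root, average)
```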

A great source for sample data sets is:

Kaggle

Sklearn

Library that defines Python data structures and functions that support machine-learning and data-mining operations such as clustering, classification, and regression

Composition Charts

Represent how one or more values relate to a whole
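
The numbers behind a composition chart are each value's share of the total. A minimal sketch, using made-up regional sales figures:

```python
sales = {"north": 50, "south": 30, "west": 20}  # made-up values for illustration

total = sum(sales.values())
shares = {region: amount / total for region, amount in sales.items()}

for region, share in shares.items():
    print(f"{region}: {share:.0%}")
```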

Distribution Charts

Represent the frequency of values within a data set
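
The counts a distribution chart plots can be sketched with `collections.Counter`. The scores below are made up, and the text bars stand in for a real chart.

```python
from collections import Counter

scores = [70, 80, 80, 90, 90, 90]  # made-up data set

frequency = Counter(scores)        # maps each value to how often it occurs
for value in sorted(frequency):
    print(value, "#" * frequency[value])
```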

Conformity

The degree to which the data values align with the company's business rules, such as "The company will measure and store sensor values on 1-second intervals."

Classification

The process of assigning data to matching groups (categories), such as a tumor being benign or malignant, email being valid or spam, or a transaction being legitimate or fraudulent
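
One simple way to assign a sample to a category is to copy the label of its closest known example (a 1-nearest-neighbor rule). This is only a sketch: the measurements, feature meanings, and the `classify` helper are hypothetical.

```python
import math

# Hypothetical labeled examples: (size, irregularity score) -> category
training = [
    ((5.0, 0.1), "benign"),
    ((6.0, 0.2), "benign"),
    ((20.0, 0.8), "malignant"),
    ((25.0, 0.9), "malignant"),
]

def classify(sample):
    """Return the label of the training point closest to sample."""
    nearest = min(training, key=lambda item: math.dist(item[0], sample))
    return nearest[1]

print(classify((7.0, 0.15)))
print(classify((22.0, 0.85)))
```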

Association

The process of identifying key relationships between variables. One of the best-known data-association problems is market-basket analysis, which examines items in a customer's shopping cart to determine if the presence of one item in the cart (called the antecedent) influences the addition of a second item (called the consequent)
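
Market-basket analysis can be sketched by counting baskets. The baskets below are made up, and `support` and `confidence` are the customary names for these two ratios rather than terms taken from this card.

```python
# Made-up shopping baskets for illustration
baskets = [
    {"bread", "butter"},
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"milk"},
]
antecedent, consequent = "bread", "butter"

with_antecedent = [b for b in baskets if antecedent in b]
with_both = [b for b in with_antecedent if consequent in b]

# Share of all baskets that contain both items
support = len(with_both) / len(baskets)
# Share of antecedent baskets that also contain the consequent
confidence = len(with_both) / len(with_antecedent)
print(support, confidence)
```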

Data Mining

The process of identifying patterns in data

Clustering

The process of grouping related data-set items
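
A minimal sketch of one clustering approach, a one-dimensional k-means with fixed starting centers. The data, the starting centers, and the `kmeans_1d` helper are assumptions for illustration.

```python
def kmeans_1d(values, centers, rounds=10):
    """Group values around the nearest center, then move each center to its group's mean."""
    for _ in range(rounds):
        clusters = [[] for _ in centers]
        for v in values:
            idx = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[idx].append(v)
        # Keep the old center if a cluster ended up empty
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return clusters, centers

values = [1.0, 1.5, 2.0, 9.0, 9.5, 10.0]           # made-up data
clusters, centers = kmeans_1d(values, centers=[0.0, 5.0])
print(clusters, centers)
```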

Machine Learning

The use of data pattern-recognition algorithms, which allow a program to solve problems such as clustering, categorization, predictive analysis, and data association, without the need for explicit step-by-step programming instructions to tell the algorithm how to perform tasks

Business Intelligence

The use of tools (data mining, machine learning, and visualization) to convert data into actionable insights and recommendations.

One of the first uses of data collection and analytics was the 1890 census

True

Programmers make extensive use of R and Python to create machine-learning applications

True

Python is a case-dependent programming language that considers upper and lowercase characters as different

True
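
A two-line sketch of this point: names that differ only in letter case are separate variables.

```python
total = 1   # lowercase name
Total = 2   # different variable, because the case differs

print(total, Total)
```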

Python is an interpreted language, as opposed to a compiled language; the Python interpreter executes one statement at a time

True

Python is one of the world's most popular programming languages and is used to create solutions that range from websites to data mining, machine learning, visualization, and more

True

Unlike other programming languages that use braces { } to group related statements, Python instead relies on statement indentation to group statements

True
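
A small sketch of indentation-based grouping. The `describe` function is a hypothetical example; each indentation level marks which block a statement belongs to.

```python
def describe(n):
    if n < 0:
        label = "negative"      # belongs to the if block
    else:
        label = "non-negative"  # belongs to the else block
    return label                # back at function level, runs in both cases

results = [describe(-3), describe(4)]
print(results)
```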

Predictive Analytics

Analysis that tries to forecast what will happen in the future

Orange and RapidMiner are examples of:

Visual programming environments

