Quiz 1: Data Analytics
What values will the following statements display? for i in range(1, 11) print(i)
1 to 10
Which of the following attributes are commonly used to determine data quality? (select all that apply)
1. Accuracy 2. Completeness 3. Consistency 4. Conformity
____________ is the use of tools (data mining, machine learning, and visualization) to convert data into actionable insights and recommendations.
Business intelligence
To determine if an email is spam, data programmers would use:
Classification
______________ is the process of assigning data to matching groups (categories), such as a tumor being benign or malignant, email being valid or spam, or a transaction being legitimate or fraudulent.
Classification
_______________ is the processing of grouping related data-set items.
Clustering
_____________ charts represent how one or more values relate to a whole.
Composition Charts
______ the degree to which the data values align with the company's business rules, such as "The company will measure and store sensor values on 1-second intervals."
Conformity
True or False? Data mining only applies to numeric data.
False
True or False? Hot blistering is the process of representing categorical data with numeric values.
False
True or False? Like other strongly typed programming languages, such as Java and C#, in Python you must declare a variable before you use it.
False
True or False? There are three forms of machine learning: supervised, unsupervised, and hybrid.
False
True or False? To perform data-mining operations, Python scripts make extensive use of datatable objects. You can think of a datatable as a two-dimension table that holds values.
False
True or False? To use a Python module, within your script, you use the module statement
False
___________ analytics try to forecast what will happen in the future.
Predictive
True or False? One of the first uses of data collection and analytics was the 1890 census.
True
True or False? Programmers make extensive use of R and Python to create machine-learning applications.
True
True or False? Python is a case-dependent programming language that considers upper and lowercase characters as different
True
True or False? Python is an interpreted language, as opposed to a compiled language, for which the Python interpreter executes one statement at a time.
True
True or False? Unlike other programming languages that use braces { } to group related statements, Python instead relies on statement indentation to group statements.
True
_____ is the process of identifying key relationships between variables. One of the best-known data-association problems is market-basket analysis, which examines items in a customer's shopping cart to determine if the presence of one item in the cart (called the antecedent) influences the addition of a second item (called the consequent).
Assosiation
________ charts represent the frequency of values within a data set.
Distribution Charts
A great source for sample data sets is:
Kaggle
________________ is defined as the use of data pattern-recognition algorithms, which allow a program to solve problems, such as clustering, categorization, predictive analysis, and data association without the need for explicit step-by-step programming instructions to tell the algorithm how to perform tasks.
Machine Learning
______ is the process of identifying patterns in data.
Pattern Recognition
True or False? Python is one of the world's most popular programming languages and is used to create solutions that range from websites, data mining, machine learning, visualization, and more.
True...
The _______________ library defines Python data structures and functions that support machine-learning and data-mining operations such as clustering, classification, and regression.
sklearn
Orange and Rapid Miner are examples of:
visual programming environments