Concept 4: Ch 22: Data Mining as a Research Tool
false
( t or f ) use and processing of data collected by wearable technologies are always subject to HIPAA laws
true
( t or f ) Exploration begins with exploring and preparing data for the data mining process
false
( t or f ) Meta-learning is a subset of artificial intelligence that permits computers to learn either inductively or deductively. Inductive machine learning is the process of reasoning and making generalizations or extracting patterns and rules from huge datasets.
true
( t or f ) algorithms are typically computer-based recipes or methods with which data mining models are developed
true
( t or f ) the acronym SEMMA (sample, explore, modify, model, assess) refers to the core process of conducting data mining
true
( t or f ) Brushing is a technique in which the user manually chooses specific data points or observations or subsets of data on an interactive data display.
false
( t or f ) Decision trees are a statistical technique based on using numerous algorithms to predict an independent variable
true
( t or f ) to develop a successful data mining process, organizations must first have the data needed to create meaningful information
big data
a collection of data that is huge in size and yet growing exponentially with time
dataset
a collection of interrelated data
boosting
a means of increasing the power of the models generated by weighting the combinations of predictions from those models into a predicted classification
data mining
a nurse is beginning the analytical and logical process to forecast a pattern from data that have been concealed. What is this process called?
different methods and models
a nurse researcher is using averaging in predictive data mining to synthesize the predictions. The predictions arise from which of the following?
identifying variables of interest
a nurse researcher is starting a drill-down analysis. She would begin with which of the following?
exploratory data analysis (EDA)
approach that uses mainly graphical techniques to gain insight into a dataset
multidimensional databases
combines data from numerous data sources and is optimized for online applications
meta-learning
combines predictions from several models
Algorithm
computer-based methods in which data mining models are developed
1. problem identification 2. exploration 3. pattern discovery 4. knowledge deployment
data mining process (4)
de-identified
data mining should use which type of patient information?
pattern discovery
different models are applied to the same data to choose the best model for the dataset being analyzed. ends with a highly predictive pattern-identifying model
through a retrospective temporal event analysis
how can data mining in health care help improve patient outcomes?
private health information of patients
in health care, which of the following is data mining dependent on?
Six Sigma
seeks to improve process outputs by removing causes of defects and minimizing variability in the process
decision tree
sets of decisions represented in a tree-shaped pattern
exploration
exploring and preparing the data for the data mining
stacking
how does meta-learning synthesize predicted classifications to generate a final, best-predicted classification?
data mining
method that helps visualize relationships in the data and mechanizes the process of discovering predictive information in massive databases
machine learning
permits computers to learn either inductively or deductively
inductive machine learning
process of REASONING and making GENERALIZATIONS from huge datasets?
modeling
refers to creating a representation or model, such as making 3D models
digital phenotyping
refers to the use of individual data collected by wearable technologies to identify health issues and, more commonly, mental health status
neural networks
represent nonlinear predictive models. they go through a learning process on existing data so they can predict, recognize patterns, or classify data
neural network
which data mining technique is a way to bridge the gap between computers and humans?
drill down
A means of viewing data warehouse information by going down to lower levels of the database to focus on information that is pertinent to the user's needs at the moment.
Exploration
A nurse is preparing data for the data mining process and is identifying important variables. What phase of data mining is this?
Patient glucose laboratory results
A nurse practitioner is using a registry in the electronic health record to identify patients at risk for diabetes. Which data can the registry identify to assist with this data mining for diabetes?
Data reduction
A nurse researcher clusters data so that the large datasets are broken up into more manageable, smaller datasets. What is this process known as?
By de-identifying patient data
A nurse researcher is using aggregate patient data for data mining. How is patient confidentiality maintained?
deep learning
A subset of machine learning that is able to explore many layers of a neural network simultaneously rather than through linear processing
stacking
Synthesizing predicted classifications to generate a final best-predicted classification is a process referred to as __________.
nonlinear, predictive
Neural networks are _______ (2) models that learn through training
scoring
applying a model to new data
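A minimal sketch of scoring, assuming a model has already been built and is now applied to records it has not seen. The rule, the field name `fasting_glucose_mg_dl`, the 100 mg/dL cutoff, and the records are all hypothetical illustrations, not clinical guidance and not taken from the chapter.

```python
# Hypothetical "model": a single rule produced by an earlier mining step.
# The 100 mg/dL cutoff is an illustrative assumption, not clinical guidance.
def glucose_rule(record):
    if record["fasting_glucose_mg_dl"] >= 100:
        return "flag for follow-up"
    return "not flagged"

# New, de-identified records the model has never seen (made-up values).
new_records = [
    {"record_id": "A", "fasting_glucose_mg_dl": 92},
    {"record_id": "B", "fasting_glucose_mg_dl": 118},
]

# Scoring: apply the existing model to each new record.
scores = {r["record_id"]: glucose_rule(r) for r in new_records}
```

The model itself does not change during scoring; only the data it is applied to are new.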
unstructured data
data that are not contained in a database; data residing in text files, which can represent more than 75% of an organization's data; data that are not organized or lack structure
bagging
use of voting and averaging in predictive data mining to synthesize the predictions from many models or methods or the use of the same type of model on different data.
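A rough sketch of the two combination rules named in the bagging card: voting for class labels and averaging for numeric predictions. It uses the same toy model on different resampled data. The function names, the toy "mean" model, the seed, and the data are assumptions made for illustration only.

```python
import random
from collections import Counter
from statistics import mean

def bootstrap_sample(rows, rng):
    """Draw len(rows) rows with replacement (one resampled dataset)."""
    return [rng.choice(rows) for _ in rows]

def fit_mean_model(rows):
    """Toy model: always predicts the mean outcome of its training rows."""
    return mean(y for _, y in rows)

def majority_vote(labels):
    """Combine class predictions from several models by voting."""
    return Counter(labels).most_common(1)[0][0]

def bagged_average(rows, n_models=25, seed=0):
    """Fit the same toy model on many resamples and average the predictions."""
    rng = random.Random(seed)
    predictions = [fit_mean_model(bootstrap_sample(rows, rng))
                   for _ in range(n_models)]
    return mean(predictions)

data = [(1, 10.0), (2, 12.0), (3, 11.0), (4, 13.0)]  # made-up (x, y) pairs
avg_prediction = bagged_average(data)
vote = majority_vote(["at risk", "not at risk", "at risk"])
```

Voting suits categorical predictions; averaging suits numeric ones.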
analytical software
what helps facilitate data mining?
in a data mining model
where is the information obtained from a statistical analysis of data stored?
text files
where is unstructured big data stored?
Six Sigma
which data mining model includes data-driven methods used to assess quality control issues and eliminate waste?
Online analytic processing (OLAP)
which data mining technique is a fast analysis of shared data?
the detection of medical insurance fraud
which of the following can knowledge discovery and data mining directly affect in the healthcare setting?
online analytic processing (OLAP)
A fast analysis of shared data stored in a multidimensional database that allows the user to easily and selectively extract and view data from different points of view.
complex
In the healthcare setting, knowledge discovery and data mining are often:
1. Findable 2. Accessible 3. Interoperable
The Big Data to Knowledge (BD2K) initiative of the National Institutes of Health (NIH) supports the creation of data sets with which the following characteristics? (3)
stacking
synthesize predicted classifications to generate a final, best-predicted classification
knowledge deployment
takes the pattern and model identified in the pattern discovery phase and applies them to new data to test whether they can achieve the desired outcome
brushing
technique in which the user manually chooses specific data points or subsets of data on an interactive data display
Problem identification
the problem must be defined and everyone must understand the objectives and requirements of the data mining process they are initiating
data mining
the process of using software to sort through data to discover patterns and ascertain relationships
digital phenotyping
the processing of individual data collected by wearable technologies to suggest that an individual has health issues, especially issues related to individual mental health status
classification
the technique of dividing a dataset into mutually exclusive groups
BOOSTING
the term __________ refers to increasing the power of the models generated by weighting the combinations of predictions from those models into a predicted classification
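A small sketch of the weighting idea in the boosting card: each model's prediction counts in proportion to a weight, and the class with the largest weighted total becomes the predicted classification. This shows only the weighted combination step, not a full boosting algorithm (which would also learn the weights during training). The labels and weights are hypothetical.

```python
from collections import defaultdict

def weighted_vote(predictions, weights):
    """Return the label whose supporting models carry the most total weight."""
    totals = defaultdict(float)
    for label, weight in zip(predictions, weights):
        totals[label] += weight
    return max(totals, key=totals.get)

# Three models' predictions for one record (made-up labels).
predictions = ["readmit", "no readmit", "no readmit"]

# Hypothetical weights: the first model is trusted more than the other two.
weights = [0.7, 0.2, 0.2]

final = weighted_vote(predictions, weights)
unweighted = weighted_vote(predictions, [1, 1, 1])
```

With equal weights the majority wins; with the assumed weights the single stronger model outweighs the other two.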
classification and regression tree (CART) data mining method
this set of rules is designed to predict which records will have a specified outcome