RMI 4226 Final Exam

If a predictive model makes 60 percent positive predictions in a situation in which without the model, only 40 percent of positive predictions would be made by chance, which one of the following is the model's leverage? Select one: A. 0.20 B. 0.67 C. 1.50 D. 2.00

A. 0.20
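
As a worked check of the answer above: leverage is taken here to be the difference between the model's positive-prediction rate and the chance rate, while lift is the ratio of the two (which is where the 1.50 distractor comes from). This is a minimal sketch, and the function names are illustrative rather than taken from the course material.

```python
def leverage(model_rate: float, chance_rate: float) -> float:
    """Leverage: the difference between the model's rate and the chance rate."""
    return model_rate - chance_rate


def lift(model_rate: float, chance_rate: float) -> float:
    """Lift: the ratio of the model's rate to the chance rate."""
    return model_rate / chance_rate


# Figures from the question: 60 percent with the model, 40 percent by chance.
lev = round(leverage(0.60, 0.40), 2)  # 0.2
lft = round(lift(0.60, 0.40), 2)      # 1.5
print(lev, lft)
```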

Insurers can benefit from Select one: A. A framework they can use to approach problems through data analysis. B. Analyzing data very closely to locate the most applicable findings. C. Discarding traditional data analysis and using only new data science techniques. D. Applying the theories of physics to complex insurance data.

A. A framework they can use to approach problems through data analysis

Linear regression predicts Select one: A. A numerical value for a target variable. B. One class for a target variable. C. A support vector machine for a target variable. D. Multiple classes for a target variable.

A. A numerical value for a target variable

Which one of the following is particularly useful for analyzing small social networks? Select one: A. A sociogram B. A matrix C. Sentiment analysis D. A network variable

A. A sociogram

Which one of the following statements is correct? Select one: A. An algorithm is a set of steps used to solve a problem or complete a process. B. Strictly speaking, an algorithm is no different from a model, and they can be used interchangeably. C. An algorithm is an attempt to represent the state of something. D. Before selecting the business objective of its model, an insurer should select an algorithm.

A. An algorithm is a set of steps used to solve a problem or complete a process

An online retailer uses data on products that a customer has purchased to recommend additional products to the customer. Which one of the following data analysis techniques is the retailer using? Select one: A. Association rule learning B. Correlation matrix C. Regression analysis D. Supervised learning

A. Association rule learning

Which one of the following traditional data analysis techniques would be used when an insurer wants to determine which characteristics lead to an increase in the severity of workers compensation claims but does not know which variables it must analyze to do so? Select one: A. Cluster analysis B. Linear regression C. Correlation matrix D. Classification tree

A. Cluster analysis

Stevens Insurance Agency is using social network analysis to predict the success of a target marketing effort for alumni from local universities. A logistic regression that models both local variables and network variables is being used. Which one of the following would be considered a network variable? Select one: A. Connections with alumni who are policyholders B. Distance of home from the university C. Number of children D. Number of years since graduation

A. Connections with alumni who are policyholders

In text mining, a collection of documents is called a(n) Select one: A. Corpus. B. File. C. Token. D. Attribute.

A. Corpus

Which one of the following functions of a data management program would allow accounting transactions to automatically update an organization's financial statements? Select one: A. Data integration B. Data access C. Data governance D. Data preparation

A. Data integration

If an attribute has high information gain, it Select one: A. Decreases entropy. B. Is unpredictable. C. Represents a class label. D. Is segmented.

A. Decreases entropy
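
The link between information gain and entropy can be sketched as follows. The labels and the split are made-up illustration data, and the helper names are assumptions; the point is only that a split with positive information gain leaves the child segments with lower weighted entropy than the parent.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(parent, children):
    """Parent entropy minus the size-weighted entropy of the child segments."""
    total = len(parent)
    weighted = sum(len(child) / total * entropy(child) for child in children)
    return entropy(parent) - weighted


# Hypothetical data: 10 instances split on one attribute into two segments.
parent = ["yes"] * 5 + ["no"] * 5
children = [["yes"] * 4 + ["no"], ["yes"] + ["no"] * 4]
gain = information_gain(parent, children)
print(round(entropy(parent), 3), round(gain, 3))
```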

Under the General Data Protection Regulation (GDPR), a data controller's role is to Select one: A. Define how and for what purpose personal data should be processed. B. Represent the business aspects of data governance. C. Manage the flow of data for the rest of the organization. D. Define the metrics used to measure an organization's overall data quality.

A. Define how and for what purpose personal data should be processed.

In a social networking scenario, which one of the following counts how many people are connected to a person? Select one: A. Degree B. Link C. Closeness D. Betweenness

A. Degree
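
A small sketch of degree, assuming the network is stored as a dictionary that maps each person to the set of people they are directly connected to. The names and connections are invented for illustration.

```python
# Hypothetical undirected social network as an adjacency mapping.
network = {
    "Ana": {"Ben", "Cal", "Dee"},
    "Ben": {"Ana"},
    "Cal": {"Ana", "Dee"},
    "Dee": {"Ana", "Cal"},
}


def degree(network, person):
    """Degree: the number of people directly connected to `person`."""
    return len(network[person])


degrees = {person: degree(network, person) for person in network}
print(degrees)
```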

Attributes used to develop a classification tree should be analyzed for their Select one: A. Information gain. B. Leaf nodes. C. Cost. D. Rules.

A. Information gain

The purpose of data analytics for insurers is to Select one: A. Make data-driven decisions and strategy B. Eliminate the need for human analysis. C. Acquire and use all the new types of technology. D. Automate most organizational processes.

A. Make data-driven decisions and strategy

Which one of the following is the term for the most similar instances in a data model? Select one: A. Nearest neighbors B. Class labels C. Attributes D. Centrality measures

A. Nearest neighbors

Which one of the following is a legal and regulatory concern in obtaining mass information from social media? Select one: A. Privacy B. Fraud C. Veracity D. Volume

A. Privacy

Which one of the following is a simple data analysis technique that could be used to show the relationship between two attributes? Select one: A. Scatter plot B. Correlation matrix C. Generalized linear model D. Cluster analysis

A. Scatter plot

Grant Insurance is working with its data scientists, who recommend cluster analysis for the project that the risk managers have described. Which one of the following is true regarding this? Select one: A. Several iterations of cluster analysis can be applied to provide more granular information. B. Grant's risk managers will provide the target variable to gain the best results from cluster analysis. C. If outliers are found in a hierarchical clustering, they should be disregarded as to relevance. D. The data scientists can produce a dendrogram which represents data by circles within circles.

A. Several iterations of cluster analysis can be applied to provide more granular information

In text mining, a computer is trained to ignore Select one: A. Stopwords. B. Keywords. C. Term frequency. D. Abbreviated words.

A. Stopwords
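
Stopword removal can be sketched as a simple filter during preprocessing. The stopword list below is a tiny hand-picked assumption (real lists are much longer), and splitting on whitespace is a simplification of tokenization.

```python
# Hypothetical, deliberately short stopword list for illustration only.
STOPWORDS = {"the", "a", "an", "and", "of", "to", "was", "in"}


def remove_stopwords(text):
    """Lowercase the text, split on whitespace, and drop stopwords."""
    tokens = text.lower().split()
    return [token for token in tokens if token not in STOPWORDS]


kept = remove_stopwords("The claimant slipped in the lobby and was injured")
print(kept)  # ['claimant', 'slipped', 'lobby', 'injured']
```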

In a data mining context, similarity is usually measured as Select one: A. The distance between two instances' data points. B. The value of the attributes that make up an instance. C. An acceptable risk. D. A scale of variables.

A. The distance between two instances' data points
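
As an illustration of similarity as distance, the sketch below uses Euclidean distance, which is one common choice; the source does not name a specific metric. The instances and their two numeric attributes are hypothetical.

```python
import math


def euclidean_distance(a, b):
    """Straight-line distance between two instances with numeric attributes."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Hypothetical instances described by two numeric attributes each.
new_claim = (0.0, 0.0)
claim_a = (3.0, 4.0)
claim_b = (1.0, 1.0)

dist_a = euclidean_distance(new_claim, claim_a)  # 5.0
dist_b = euclidean_distance(new_claim, claim_b)

# The smaller the distance, the more similar the instances.
most_similar = "claim_b" if dist_b < dist_a else "claim_a"
print(dist_a, most_similar)
```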

The first step in the data mining process is to Select one: A. Understand what a business wants to achieve. B. Select a data mining technique. C. Prepare the data that will be used. D. Collect the data that will be used.

A. Understand what a business wants to achieve

Which one of the following is a characteristic that differentiates big data from traditional data? Select one: A. Velocity B. Structure C. Privacy D. Fraud

A. Velocity

An insurer wants to predict which of its auto insurance customers will be likely to accept a new type of telematics. The target variable for a linear discriminant for this prediction is Select one: A. Whether or not the customer is likely to accept. B. Customers' duration with the insurer. C. Instance space. D. The type of policy the customer currently has.

A. Whether or not the customer is likely to accept

A predictive model is applied to a clothing manufacturer's data of 1,000 employees, 50 of whom had workplace injuries in the past year. The table below shows how often the model correctly and incorrectly predicts for each employee "yes, will have an accident" or "no, will not have an accident."
              Predicted No   Predicted Yes   Total (1,000 employees)
Actual No          945              5             950
Actual Yes          10             40              50
Based on the preceding numbers, these statements can be made: There are 40 true positives (TP) for which the model correctly predicted yes. There are 945 true negatives (TN) for which the model correctly predicted no. There are 5 false positives (FP) for which the model incorrectly predicted yes (and the actual answer is no). There are 10 false negatives (FN) for which the model incorrectly predicted no (and the actual answer is yes). What is the F-score of the workplace injury predictive model? Select one: A. .800 B. .842 C. .889 D. .985

B. .842
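
The four answer options correspond to four different metrics computed from the same confusion matrix. A minimal sketch of the arithmetic, using only the TP, TN, FP, and FN counts given in the question (the variable names are illustrative):

```python
# Counts from the workplace injury confusion matrix in the question.
tp, tn, fp, fn = 40, 945, 5, 10

precision = tp / (tp + fp)                   # 40 / 45
recall = tp / (tp + fn)                      # 40 / 50
f_score = 2 * (precision * recall) / (precision + recall)
accuracy = (tp + tn) / (tp + tn + fp + fn)   # 985 / 1,000

print(round(recall, 3), round(f_score, 3), round(precision, 3), round(accuracy, 3))
```

Rounded to three places, recall is .800, the F-score is .842, precision is .889, and accuracy is .985.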

Which one of the following is an attribute that considers a node's interactions within a network? Select one: A. An egonet B. A network variable C. A local variable D. A bigraph

B. A network variable

A privacy impact assessment (PIA) is Select one: A. Proprietary software used to detect malware. B. A tool used to identify and assess privacy risks. C. An example of metadata that defines key data attributes. D. A collaborative tool that facilitates workflows.

B. A tool used to identify and assess privacy risks

Galliano Insurance Agency knows that it is most likely to retain the customers that it insures for multiple lines of coverage. The agency is always trying to identify new insurance products. To help identify new products, it needs a data mining technique that will explore data to find groups with common and previously unknown characteristics. Which one of the following data mining techniques should Galliano Insurance Agency use? Select one: A. Classification B. Cluster analysis C. Association rule learning D. Regression analysis

B. Cluster analysis

Nancy, the general liability claims manager, is concerned about a significant rise in claim frequency in the state of New Jersey during the past 18 months. She cannot identify the cause of the increase and has asked James, a data analyst, to help. James decides to develop a model to analyze the dataset of New Jersey claims, and see if any previously unknown grouping can be identified for further analysis. Which one of the following data analysis techniques is James using? Select one: A. Classification tree B. Cluster analysis C. Correlation matrix D. Linear regression analysis

B. Cluster analysis

Which one of the following is true regarding data quality? Select one: A. Claims representatives primarily use quality data for pricing decisions. B. Data quality is a relative, not an absolute, concept C. Quality data retains its quality regardless of how it is used. D. It is reasonable to assume that external data comes from reliable sources.

B. Data quality is a relative, not an absolute, concept

Data governance provides Select one: A. A road map that details where data is located. B. Definitions, standards and procedures for how data is used. C. The internal data entry processes needed to capture accounting transactions. D. A dynamic view of data without needing to move it between systems.

B. Definitions, standards and procedures for how data is used

Millstone Insurance knows that its most profitable customers are those that have been policyholders for more than three years. Millstone wants to know more about the attributes of these long-term customers. The data science team used a modeling technique that grouped the long-term customers according to similarities. The groups with similar attributes were represented by circles within circles. Which one of the following techniques did the data science team use? Select one: A. K nearest neighbor clustering B. Hierarchical clustering C. Bubble plotting D. Class probability estimation

B. Hierarchical clustering

A neural network consists of Select one: A. Clustering, classification, and linear layers. B. Input, hidden, and output layers. C. Input, clustering, and neuron layers. D. Node, neural, and output layers.

B. Input, hidden, and output layers

The part of a classification tree that indicates the classification of the target model is the Select one: A. Rule. B. Leaf node. C. Branch. D. Root node.

B. Leaf node

Which one of the following can be applied over time to refine a model to better predict results? Select one: A. Statistics B. Machine learning C. Regression D. Association rule learning

B. Machine learning

Which one of the following correctly describes a new evolution of big data, sometimes referred to as big data 2.0 in insurance and risk management? Select one: A. Insurers are exploring the possibility of underwriting personal lines over the Internet. B. Organizations can strategically use data from the Internet of Things and vehicle telematics. C. Organizations have just started conducting business on the Internet. D. Insurers are using the Internet to market their products to customers.

B. Organizations can strategically use data from the Internet of Things and vehicle telematics

If an insurer wants to determine the numerical value for a known target variable, it is most likely to use Select one: A. A classification tree. B. Regression. C. Cluster analysis. D. Association rule learning.

B. Regression

The process of determining the opinion or emotion behind a selection of text is Select one: A. Data mining B. Sentiment analysis C. Link analysis D. Text mining

B. Sentiment analysis

Which one of the following is a way to measure opinion in text? Select one: A. Recall B. Sentiment analysis C. Precision D. Cluster analysis

B. Sentiment analysis

Malware is defined as Select one: A. Software technology used to encrypt data. B. Software designed to cause damage C. A hardware-based security breach. D. A tool for managing data security.

B. Software designed to cause damage

Which one of the following is an element of a data security program? Select one: A. Implementing a data governance program. B. Storing data back-ups off site. C. Increasing the overall efficiency of data systems. D. Installing agile project management.

B. Storing data back-ups off site

Johanna is claim manager for Goshen Mutual. She is working with the data science team to develop a way to determine the characteristics of slip and fall on premises claims that end up going to litigation. Johanna would like to use this information to help her assign new claims to her staff. The data science team is applying the k nearest neighbor technique to the data on slip and fall claims that have gone to litigation. The model reveals that 6 of a new claim's nearest neighbors in the dataset did go to litigation and 2 of the nearest neighbors did not go to litigation. Using the voting method, which one of the following can Johanna predict about the new claim? Select one: A. The new claim has a 60% probability of going to litigation. B. The new claim is likely to go to litigation. C. The new claim has a 30% probability of going to litigation. D. The new claim will most likely not go to litigation.

B. The new claim is likely to go to litigation
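
The voting method can be sketched as a simple majority count over the neighbors' labels. The label strings are illustrative; the 6-to-2 split comes from the question. Note that 6 of 8 neighbors is 75 percent, which matches neither the 60 percent nor the 30 percent option.

```python
from collections import Counter


def majority_vote(neighbor_labels):
    """Return the label held by the largest number of nearest neighbors."""
    return Counter(neighbor_labels).most_common(1)[0][0]


# From the question: 6 neighbors went to litigation, 2 did not.
neighbors = ["litigation"] * 6 + ["no litigation"] * 2

prediction = majority_vote(neighbors)
share = neighbors.count("litigation") / len(neighbors)
print(prediction, share)
```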

When examining a model's results, insurance and risk management professionals should defer to Select one: A. The predictive model. B. Their professional experience. C. The model's F-score. D. The model's performance on the training data.

B. Their professional experience

Data science is especially useful for Select one: A. Structured data. B. Unstructured data. C. Internal data. D. Databases.

B. Unstructured data

A predictive model was developed for Shelton Manufacturing to determine the likelihood of current and future employees suffering from hearing loss. The predictive model was applied to Shelton Manufacturing data of 200 employees, 10 of whom developed hearing loss in the past year. Based on the numbers shown in the performance metric below, what is the accuracy of the hearing loss predictive model?
              Predicted No   Predicted Yes   Total (200 employees)
Actual No          178             12             190
Actual Yes           2              8              10
Select one: A. 0.40 B. 0.80 C. 0.93 D. 0.95

C. 0.93

Which one of the following is a way to present the results of a text mining model so that performance metrics can be calculated? Select one: A. An algorithm B. A corpus C. A confusion matrix D. An F-score

C. A confusion matrix

Which one of the following best describes why a weighted average gives a more accurate estimate than a simple majority combining function when predicting the value of a target variable by its nearest neighbors? Select one: A. A majority combining function plots data points in a two-dimensional space, while a weighted average plots data points in a three-dimensional space. B. A majority combining function combines two functions, while a weighted average combines two averages. C. A majority combining function gives equal weight to all of the nearest neighbors, while a weighted average weights the nearest neighbors' contributions by their distance. D. A majority combining function gives the nearest neighbors' similarity weights, while a weighted average gives the nearest neighbors' contributions.

C. A majority combining function gives equal weight to all of the nearest neighbors, while a weighted average weights the nearest neighbors' contributions by their distance
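
The difference between the two combining functions shows up when one neighbor is much closer than the others. In this sketch each neighbor's vote is weighted by 1/distance, which is one common weighting scheme and an assumption here; the neighbor data is invented, and distances are assumed to be nonzero.

```python
from collections import defaultdict


def unweighted_vote(neighbors):
    """Majority combining function: every neighbor counts equally."""
    counts = defaultdict(int)
    for label, _distance in neighbors:
        counts[label] += 1
    return max(counts, key=counts.get)


def weighted_vote(neighbors):
    """Weighted combining function: closer neighbors count more (weight = 1/distance)."""
    scores = defaultdict(float)
    for label, distance in neighbors:
        scores[label] += 1.0 / distance
    return max(scores, key=scores.get)


# Hypothetical (label, distance) pairs: one very close "yes", two distant "no".
neighbors = [("yes", 0.5), ("no", 4.0), ("no", 5.0)]

print(unweighted_vote(neighbors), weighted_vote(neighbors))
```

With these numbers the equal-weight vote returns "no" (two votes to one), while the distance-weighted vote returns "yes" because the single close neighbor outweighs the two distant ones.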

During the process of training a predictive model, overfitting occurs when Select one: A. The data has known values for the target variable. B. The model applies generalization. C. A model is overly tailored to the training data. D. Data is held out from the training data.

C. A model is overly tailored to the training data

Hierarchical clustering groups data Select one: A. Into two large circles. B. By identifying outliers. C. According to similarities. D. According to differences.

C. According to similarities

The insurance professionals who have traditionally analyzed data and made predictions based on their analyses are Select one: A. Computer programmers. B. Data scientists. C. Actuaries D. Claims professionals.

C. Actuaries

To gain a competitive advantage, maintain profitability, and satisfy customers an organization must Select one: A. Have an effective risk management program. B. Adopt current accounting rules. C. Be able to trust its data. D. Pay attention to the marketplace.

C. Be able to trust its data

There are two types of associated risk for data privacy: individual risk and general risk. General data privacy risk Select one: A. Involves legal and regulatory requirements. B. Varies by the type of business or industry. C. Can be categorized as operational or reputational. D. Is of specific concern to the European Union.

C. Can be categorized as operational or reputational

The two types of descriptions for clusters are Select one: A. K-means and k-nearest neighbor. B. Supervised and unsupervised. C. Characteristic and differential. D. Hierarchical and dendrogram.

C. Characteristic and differential

Which one of the following is a data mining technique an insurer applies when it knows what information it wants to predict? Select one: A. Machine learning B. Cluster analysis C. Classification D. Association rule learning

C. Classification

In the context of a predictive model, a true positive results when the model Select one: A. Incorrectly predicts a positive. B. Incorrectly predicts a negative. C. Correctly predicts a positive. D. Correctly predicts a negative.

C. Correctly predicts a positive

Internal data entry processes that capture accounting transactions, customer data or other operational transactions are called Select one: A. Data quality. B. Data governance. C. Data capture. D. Data integration.

C. Data capture

Which one of the following is a basic process in any data security program? Select one: A. Establish a data governance committee (DGC). B. Perform random sampling of data for accuracy C. Develop and enforce stronger password protocols. D. Establish metrics for timeliness of data refresh in systems.

C. Develop and enforce stronger password protocols

Which one of the following is a way that insurers and risk managers can use data science to improve their results through data-driven decision making? Select one: A. Determining prior year losses at a particular location B. Providing human analysis of data C. Discovering new relationships in data D. Using industry data in addition to the organization's own data

C. Discovering new relationships in data

Tree-based probabilities are calculated by Select one: A. Subtracting the number of times the model incorrectly predicted the value from the number of correct times at each leaf node. B. Dividing the number of times the model correctly predicted the value by the number of incorrect times at each leaf node. C. Dividing the number of times the model correctly predicted the value by the total predictions at each leaf node. D. Subtracting the number of times the model correctly predicted the value from the total predictions at each leaf node.

C. Dividing the number of times the model correctly predicted the value by the total predictions at each leaf node
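
As a small numeric illustration of the answer above, the leaf counts below are hypothetical; the calculation is simply correct predictions divided by total predictions at each leaf node.

```python
def leaf_probability(correct: int, total: int) -> float:
    """Tree-based class probability at a leaf: correct predictions / total predictions."""
    return correct / total


# Hypothetical leaf nodes: (correct predictions, total predictions).
leaves = {"leaf_1": (18, 20), "leaf_2": (3, 12)}

probabilities = {name: leaf_probability(c, t) for name, (c, t) in leaves.items()}
print(probabilities)
```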

Technology that can particularly assist adjusters in evaluating claims after catastrophes is Select one: A. Telematics. B. Wearables. C. Drones. D. Sensors.

C. Drones

Which one of the following is an example of a data governance tool? Select one: A. Metadata B. Data integration C. External policy D. Risk management

C. External policy

Reynolds Insurance provides workers compensation insurance for small to medium-sized companies, mostly local and regional manufacturers. For data science to be relevant and useful to Reynolds, which one of the following types of technology would be its best investment? Select one: A. Analyzing data to compare older and younger workers' preferences for longer shifts but shorter work weeks. B. Placing telematics in employees' personal automobiles to better identify driving behaviors. C. Harnessing the Internet of Things data to analyze data from wearables such as hardhats and steel-toed boots. D. Classifying employees by various demographics to reveal the likelihood of lawsuits against their organization.

C. Harnessing the Internet of Things data to analyze data from wearables such as hardhats and steel-toed boots

Which one of the following defines individual risk? Select one: A. Individual risk is reputational in nature. B. Individual risk is defined by the data governance committee. C. Individual risk varies according to the type of business. D. Individual risk may be categorized as operational.

C. Individual risk varies according to the type of business

Bellingham Insurance has traditionally used classification trees and linear regression models to price personal auto insurance based on individual characteristics. The insurer is considering using neural networks to price personal auto insurance in the future. Which one of the following is a major advantage of a neural network? Select one: A. Its methodology and results can be evaluated easily. B. It has little to no risk of overfitting. C. It develops rules to make predictions as it performs various mathematical functions. D. It can work with both numerical and nonnumerical values.

C. It develops rules to make predictions as it performs various mathematical functions

The claim manager for Bellingham Mutual wants to know whether, based on its characteristics, a workers compensation claim should be classified as fraudulent or not fraudulent. The manager also wants to know the probability that a claim that is classified as fraudulent will actually be fraudulent. When the target variable is binary, which one of the following types of regression analysis should be used? Select one: A. Multiple regression B. Least squares regression C. Logistic regression D. Circular regression

C. Logistic regression
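
Logistic regression suits this case because it yields both a probability and a binary class. The sketch below shows the logistic function turning a linear score into a probability of fraud. The features, weights, intercept, and 0.5 threshold are all hypothetical and not fitted to any data.

```python
import math


def logistic(z: float) -> float:
    """Logistic (sigmoid) function: maps any real score to a value between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def fraud_probability(features, weights, intercept):
    """Probability that the binary target is 'fraudulent' under a linear score."""
    z = intercept + sum(w * x for w, x in zip(weights, features))
    return logistic(z)


# Hypothetical claim attributes and coefficients, for illustration only.
p = fraud_probability(features=[2.0, 1.0], weights=[0.8, 0.5], intercept=-1.0)
label = "fraudulent" if p >= 0.5 else "not fraudulent"
print(round(p, 3), label)
```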

The hidden layer of a neural network Select one: A. Classifies target variables according to binary categories. B. Uses only linear functions to make predictions about numerical variables. C. Performs various mathematical functions to match inputs to outputs. D. Uses nodes to provide predictive values based on nonnumerical input data.

C. Performs various mathematical functions to match inputs to outputs

Wycliffe Insurance is very concerned about data quality and has many safeguards in place to ensure the data it collects and stores is managed appropriately. New claims data is entered with the date of its arrival to the department. Then the claims representative's activities are also entered with the date and time whenever the file is updated. The organization has chosen this data formatting to reflect the required degree of accuracy that has proven many times to be beneficial when the data is used in settlement negotiations or arbitration hearings. The dimension of stored data quality used in this case by Wycliffe is Select one: A. Flexibility. B. Organizational consistency. C. Precision. D. Granularity.

C. Precision

Which one of the following is the first step in the text mining process? Select one: A. Create structured data from unstructured data (words) B. Create a model using data mining techniques C. Retrieve and prepare text with preprocessing techniques D. Evaluate the model

C. Retrieve and prepare text with preprocessing techniques

In order for neural networks to predict the success of a project, Select one: A. It is necessary to use linear and logistic regression to evaluate the data. B. Additional layers must be added to the network to analyze different project phases. C. The factors that caused the success or failure of previous projects must be understood. D. They must overfit the historical data to provide accuracy for the analysis of new projects.

C. The factors that caused the success or failure of previous projects must be understood

In the algorithm k nearest neighbor (k-NN), the "k" refers to Select one: A. The weighted average. B. The distance between two instances. C. The number of neighbors used. D. The majority.

C. The number of neighbors used

A predictive model is applied to a clothing manufacturer's data of 1,000 employees, 50 of whom had workplace injuries in the past year. The table below shows how often the model correctly and incorrectly predicts for each employee "yes, will have an accident" or "no, will not have an accident."
              Predicted No   Predicted Yes   Total (1,000 employees)
Actual No          945              5             950
Actual Yes          10             40              50
Based on the preceding numbers, these statements can be made: There are 40 true positives (TP) for which the model correctly predicted yes. There are 945 true negatives (TN) for which the model correctly predicted no. There are 5 false positives (FP) for which the model incorrectly predicted yes (and the actual answer is no). There are 10 false negatives (FN) for which the model incorrectly predicted no (and the actual answer is yes). What is the accuracy of the workplace injury predictive model? Select one: A. .800 B. .842 C. .889 D. .985

D. .985

Which one of the following is a data governance committee (DGC) responsibility? Select one: A. A data governance committee plays a key role in project management for data projects. B. A data governance committee is charged with monitoring the volume of big data within an organization. C. A data governance committee both retrieves and prepares metadata for use by an organization. D. A data governance committee ensures there are few conflicts or redundancies in data standards and practices.

D. A data governance committee ensures there are few conflicts or redundancies in data standards and practices

In link prediction, a model attempts to predict Select one: A. How many people are connected. B. The extent to which a person connects others. C. The similarity between people. D. A pair of instances.

D. A pair of instances

The target value in a linear regression Select one: A. Forms an optimized linear discriminant. B. Changes disproportionately with the attribute values. C. Is classified according to the attribute values. D. Changes proportionately with the attribute values.

D. Changes proportionately with the attribute values

Greatview Insurance wants to predict which auto liability claims will most likely go to litigation, so it can assign them to experienced adjusters early in the process. There are certain known indicators of litigation that Greatview wants to use in the data mining process. Which one of the following data mining techniques would Greatview's analyst most likely use? Select one: A. Cluster analysis B. Regression analysis C. Association rule learning D. Classification

D. Classification

The lifeblood of every organizational function is Select one: A. Risk management. B. Employees. C. Regulation. D. Data.

D. Data

In terms of data governance, IT employees hold the role of Select one: A. Rule developers. B. Compliance regulators. C. Data stewards. D. Data custodians.

D. Data custodians

The important first step in a decision-making model is to Select one: A. Purchase the technology. B. Prepare the data. C. Assign a data scientist. D. Define the problem.

D. Define the problem

A predictive model is applied to a clothing manufacturer's data of 1,000 employees, 50 of whom had workplace injuries in the past year. The table below shows how often the model correctly and incorrectly predicts for each employee "yes, will have an accident" or "no, will not have an accident."
              Predicted No   Predicted Yes   Total (1,000 employees)
Actual No          945              5             950
Actual Yes          10             40              50
Based on the preceding numbers, these statements can be made: There are 40 true positives (TP) for which the model correctly predicted yes. There are 945 true negatives (TN) for which the model correctly predicted no. There are 5 false positives (FP) for which the model incorrectly predicted yes (and the actual answer is no). There are 10 false negatives (FN) for which the model incorrectly predicted no (and the actual answer is yes). The formula 2 × [(precision × recall) ÷ (precision + recall)], which in this case is 2 × [(.889 × .80) ÷ (.889 + .80)] = .842, measures the workplace injury predictive model's Select one: A. Precision. B. Recall. C. Accuracy. D. F-score.

D. F-score

Classification tree models can be continually improved through Select one: A. Tree-based class probability estimation. B. Unsupervised learning. C. Overfitting. D. Machine learning.

D. Machine learning

Martin is a commercial auto underwriter who wants to determine if a linear relationship exists among the drivers' years of experience, the number of miles driven per week, and the frequency of accidents. The target variable is the number of accidents. Which one of the following types of regression analysis is Martin using? Select one: A. Cluster analysis B. Logistic regression analysis C. Least squares regression analysis D. Multiple regression analysis

D. Multiple regression analysis

Which one of the following measures only the positive results of a model? Select one: A. F-score B. Recall C. Accuracy D. Precision

D. Precision

Conducting unsupervised learning before supervised learning may Select one: A. Determine unknown values in the past or present. B. Define the terms that actuaries and data scientists use. C. Segment the dataset based on informative attributes. D. Provide the information needed to define an appropriate target for supervised learning.

D. Provide the information needed to define an appropriate target for supervised learning

Which one of the following is the basis of link prediction? Select one: A. Centrality B. Neutrality C. Influence D. Similarity

D. Similarity

A project team at Goshen Mutual has been working on developing a new product for the personal insurance market. The team is using geodemographic data to determine which territories would be the best to introduce the product. Geodemographic data would be categorized as Select one: A. Structured internal data. B. Unstructured internal data. C. Unstructured external data. D. Structured external data.

D. Structured external data

Which one of the following statements is correct regarding the personal data and privacy positions of the European Union (EU) and the U.S.? Select one: A. The U.S. has a stronger cultural expectation of privacy than the EU. B. U.S. companies are required to comply with the EU's General Data Protection Regulation (GDPR) only if they have employees in the EU. C. Class-action lawsuits over privacy are commonplace in the EU, but rare in the U.S. D. The EU has one all-encompassing data protection framework and the U.S. has several more targeted privacy laws.

D. The EU has one all-encompassing data protection framework and the U.S. has several more targeted privacy laws

In terms of data quality principles, validity is defined as Select one: A. The extent that each dataset contains all elements necessary for business needs. B. The process of tracing data from its source to its destination. C. The true value of data relative to the business information being analyzed. D. The accuracy of data within predefined and accepted parameters or values.

D. The accuracy of data within predefined and accepted parameters or values

All of the following are fundamental concepts of data science, EXCEPT: Select one: A. After information is gleaned, the user must have a process to evaluate the accuracy of any conclusions drawn. B. Information technology can be applied to big data to reveal characteristics of people or events of interest. C. Close analysis may result in substantive findings that may not necessarily lead to actionable conclusions. D. The data mining project need not provide actions that lead to better business results in order to be considered worthwhile.

D. The data mining project need not provide actions that lead to better business results in order to be considered worthwhile

Which one of the following is a drawback of neural networks? Select one: A. The input layer is too limited in types of data. B. They are unable to work with other data analysis techniques. C. They cannot be used to analyze big data. D. The hidden layer is often too opaque to data scientists.

D. The hidden layer is often too opaque to data scientists

The data quality principle of reasonability refers to Select one: A. The systematic process of tracing data. B. The appropriateness of current data. C. The comprehensive nature of data. D. The materiality or relevance of data.

D. The materiality or relevance of data

In predictive modeling terminology, a target variable is Select one: A. The representation of a data point described by a set of attributes within a predictive model's dataset. B. A variable that describes a characteristic of an instance within a predictive model. C. A model used to study and find relationships within data. D. The predefined attribute whose value is being predicted in a predictive model.

D. The predefined attribute whose value is being predicted in a predictive model

Which one of the following is a fundamental concept of data science? Select one: A. Random exploration is the best approach to discover useful knowledge from data. B. Analyzing data as closely as possible is likely to produce the most generally applicable results. C. Information technology cannot be applied to big data, and experimental approaches are used. D. The selection of data mining approaches must be considered in the context in which the results will be applied.

D. The selection of data mining approaches must be considered in the context in which the results will be applied

The descriptive approach is applied Select one: A. To process information received from the Internet of Things. B. When an insurer or risk manager is deciding what type of computer technology to purchase. C. Repeatedly to provide information for data-driven decision making. D. When an insurer or risk manager has a specific problem.

D. When an insurer or risk manager has a specific problem

Carla is the risk manager for Shelton Manufacturing. She wants to develop a classification tree to help her determine whether an employee is likely to be able to return to work after an accident. Carla knows that whether or not the employee had a previous injury is the attribute that provides the greatest information gain in determining if an employee will return to work or not. Other attributes include the age of the employee, physical demands of occupation, whether the accident involved a back injury or not, and the availability of light duty work. Which one of the following would be the root node in the classification tree? Select one: A. Whether or not light duty work is available B. Whether or not the employee returned to work C. Whether it is a back injury or not D. Whether or not the employee had a previous injury

D. Whether or not the employee had a previous injury
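
The reasoning behind this answer can be sketched in code: the root node of a classification tree is the attribute with the greatest information gain, meaning the largest drop in entropy of the target after splitting on it. The records below are hypothetical and deliberately constructed so that `previous_injury` separates the outcomes perfectly. They are not data from Carla's scenario, and only three of the listed attributes are included.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())


def information_gain(records, attribute, target):
    """Entropy of the target minus the weighted entropy after splitting."""
    parent = entropy([r[target] for r in records])
    weighted = 0.0
    for value in {r[attribute] for r in records}:
        subset = [r[target] for r in records if r[attribute] == value]
        weighted += len(subset) / len(records) * entropy(subset)
    return parent - weighted


# Made-up employee records; "returned" is the target variable.
columns = ("previous_injury", "back_injury", "light_duty", "returned")
raw = [
    (True, True, True, False),
    (True, False, True, False),
    (True, True, False, False),
    (True, False, False, False),
    (False, True, True, True),
    (False, False, True, True),
    (False, True, True, True),
    (False, False, False, True),
]
records = [dict(zip(columns, row)) for row in raw]

attributes = ["previous_injury", "back_injury", "light_duty"]
gains = {a: information_gain(records, a, "returned") for a in attributes}
root = max(gains, key=gains.get)
print(root, gains)
```

Note that the target itself (whether the employee returned to work) is never a candidate for the root node; it is what the leaves of the tree predict.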

Stevens Insurance developed a predictive model that predicts the likelihood that personal automobile policyholders will not renew their policies. The model is based on data on 500 policyholders. The data includes the policyholder name, age, number of vehicles insured, length of time insured with Stevens, and whether the policy renewed or not. Which one of the following would be considered the target variable in the model? Select one: A. Length of time insured with Stevens B. Number of vehicles insured C. Policyholder name D. Whether the policy renewed or not

D. Whether the policy renewed or not

