CPCU 550


identifying the range of possible consequences that could result from the risk and determining the likelihood of their occurrence What are the three basic categories of accident causes? 1) poor m_____ 2) s______ policy 3) personal or e_________ factors Which one of the following statements about root cause analysis (RCA) is true? A. Root cause analysis is aimed at reducing the harmful effects of a loss once the loss has occurred. B. Weather conditions and earthquakes can be considered root causes for some losses. C. The first step in root cause analysis is analyzing causal factors. D. A root cause must produce effective recommendations for prevention of future accidents.

"predict and prevent" philosophy; poor management, safety policy, and personal or environmental factors; D. A root cause must produce effective recommendations for prevention of future accidents.

What's a challenge for insurers using telematics and usage-based insurance?

(Safe drivers, privacy, context) Insureds who agree to participate are likely already safe drivers looking to obtain lower premiums, while others don't want their movements tracked. This selective reluctance to participate could prevent telematics data from being predictive about the driving population as a whole. Another challenge involves the need to evaluate the data in context. For example, a driver in a congested area will brake harder and more frequently than a driver in a rural area. Hard braking could be considered a bad habit in a rural, congestion-free area, but in a city it may be necessary to avoid accidents. Without context about where or why a particular practice is taking place, the data may not be as useful.

computer processing or output that simulates human reasoning/knowledge, similar to how the human brain processes data, but based on predictions from a set of rules/complex calculations
feature that uses AI to engage in dialogue with humans and provide simple responses
digital entity that's able to provide more complex answers to questions and guide customers to proper representatives.

Artificial intelligence- computer processing or output that simulates human reasoning/knowledge, similar to how the human brain processes data, but based on predictions from a set of rules/complex calculations
Chatbots- feature that uses AI to engage in dialogue with humans and provide simple responses
Conversational AI- digital entity that's able to provide more complex answers to questions and guide customers to proper representatives.

Katie is a data analyst who is analyzing a dataset of 100,000 workers compensation claims for her employer. The data shows that on average, 10% of first aid claims convert to indemnity within the first year. Through the use of a predictive model, she learns that for insureds in specific geographic regions, 15% of first aid claims convert to indemnity. Katie calculates the leverage to be A. 0.015 B. 0.05 C. 0.25 D. 1.5 Among a wide variety of sensors used by Mega Manufacturing, Inc., are units that measure the range of motion of a given piece of machinery. Which one of the following specialized sensors would be used for this purpose? A. Transducers B. Digital twins C. Accelerometers D. Actuators

B. 0.05 C. Accelerometers
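Katie's leverage figure can be reproduced with simple arithmetic. This is a minimal sketch, assuming (as the answer key implies) that leverage here is the segment's conversion rate minus the overall conversion rate; the function name is mine, not from the course.

```python
def leverage(segment_rate: float, overall_rate: float) -> float:
    """Lift a segment shows over the book-wide baseline (hypothetical definition)."""
    return segment_rate - overall_rate

overall = 0.10   # 10% of first aid claims convert to indemnity overall
segment = 0.15   # 15% convert in the specific geographic regions
print(round(leverage(segment, overall), 2))  # 0.05 (answer B)
```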

Park Slope Baking recently began manufacturing a line of gluten-free products. Caroline is the products liability underwriter for the account and is not sure how concerned she should be about the new product line. She knows that gluten-free products have become more popular recently, but she does not know if there have been any products liability problems. Which one of the following is an appropriate technique that data scientists can use to help Caroline discover emerging risks with gluten-free products? A. Predictive modeling B. Cluster analysis C. Telematics D. Classification tree American Insurance Company uses cluster analysis to determine relationships in data related to underwriting. In cluster analysis, "k-means" represents A. An instance. B. The nearest neighbor. C. The centroid. D. An algorithm.

B. Cluster analysis D. An algorithm.

Ensuring fairness in predictive models used by insurers requires that the models be based on A. Data dredging. B. Data mining. C. Non-discriminatory factors. D. Correlation. Internal data entry processes that capture accounting transactions, customer data or other operational transactions are called A. Data integration. B. Data governance. C. Data quality. D. Data capture. With the growth of algorithms, artificial intelligence and machine learning, Greater American Insurance Co. must ensure that the use of new technologies results in models that are fair and ethical. To do this, they are most likely to rely on a combination of unfair trade practices acts and A. Federal insurance law. B. The NAIC's Principles of Artificial Intelligence. C. Data from the Internet of Things. D. Consumer feedback.

C. Non-discriminatory factors. D. Data capture. B. The NAIC's Principles of Artificial Intelligence.

In her role at the Federal Emergency Management Agency (FEMA), Sabrina monitors a variety of technologies to help provide advance warning of pending catastrophes. Which one of the following types of disaster would she be alerted to by an accelerometer? A. Wildfire B. Earthquake C. Tsunami D. Hurricane Which one of the following forms of technology is used in both risk prediction and prevention with self-driving vehicles? A. Drones B. Computer vision C. Radio frequency identification tags D. Accelerometer

B. Earthquake B. Computer vision

Francesca is developing a predictive model using behavioral biometrics for insurers to predict customer retention. She measured the model's precision, recall, F-score, and accuracy metrics to determine the model's performance. Which one of the following of Francesca's metrics measures only positive results? A. Accuracy B. Precision C. F-score D. Recall Tania is a workers compensation claims manager at Millstone Insurance. She has noticed that the use of opioid medications is increasing significantly. Which one of the following data analytics approaches could be used to analyze the characteristics of medical providers and claimants involved in excessive opioid use? A. Telematics B. Data mining C. Predictive modeling D. Wearable sensors

B. Precision B. Data mining
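The distinction Francesca's question tests (precision looks only at predicted positives) can be shown by computing all four metrics from one confusion matrix. A minimal sketch; the tp/fp/fn/tn counts are hypothetical, chosen for illustration.

```python
def metrics(tp: int, fp: int, fn: int, tn: int):
    precision = tp / (tp + fp)                   # uses only predicted positives
    recall = tp / (tp + fn)                      # uses only actual positives
    f_score = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / (tp + fp + fn + tn)   # the only one that uses negatives
    return precision, recall, f_score, accuracy

p, r, f, a = metrics(tp=40, fp=10, fn=20, tn=30)
print(f"precision={p:.2f} recall={r:.2f} f-score={f:.2f} accuracy={a:.2f}")
```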

Which one of the following statements is correct with respect to empirical probability distributions? A. Event categories (bins) are designed so that only the most probable events are included. B. They provide a mutually exclusive, collectively exhaustive list of outcomes. C. The sum of all empirical probabilities in a distribution can be any number. D. Any given event can fall into one or more categories of event (bins). Many types of ratemaking analysis require an insurer to examine the information in two separate company databases: a policy database and a A Premium database. B Operational database. C Claims database. D Statistical database. What are the three factors of the domino theory?

B. They provide a mutually exclusive, collectively exhaustive list of outcomes. C. Claims database. Fault of person, hazard, injury
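The "mutually exclusive, collectively exhaustive" property of an empirical distribution can be illustrated with a tiny example: each observation falls into exactly one bin, and the empirical probabilities sum to 1. The claim amounts and bin edges below are hypothetical.

```python
from collections import Counter

# Hypothetical claim amounts.
observations = [500, 1200, 700, 3000, 1200, 800, 400, 2500, 900, 1100]

def bin_of(amount):
    """Assign each amount to exactly one bin (mutually exclusive,
    collectively exhaustive)."""
    if amount < 1000:
        return "< 1,000"
    if amount < 2000:
        return "1,000-1,999"
    return ">= 2,000"

counts = Counter(bin_of(x) for x in observations)
dist = {label: n / len(observations) for label, n in counts.items()}
print(dist, "sum =", sum(dist.values()))
```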

In processing big data, Insurance Company reviews and analyzes both internal and external, structured and unstructured data. Which one of the following is an example of external unstructured data? A. Customer zip code information B. Trending social media topics C. Customer demographics including age and ethnicity D. Premium cost information from other insurers As a data management consideration, the term "velocity" refers to the increasing speed at which data arrives. It also includes A. The balance between structured and unstructured data. B. The rate of change in types of data. C. The accuracy of data. D. The processing power required to manage the data.

B. Trending social media topics B. The rate of change in types of data.

Mega Manufacturing Corp. uses a number of sensor networks throughout its operations, some of which use data aggregation to reduce network traffic. Because data aggregation networks carry the risk of hackers generating false values, Mega Manufacturing should A. Observe the principle of least privilege. B. Use encryption and message authentication protocols. C. Use only secure sensors. D. Avoid the use of wireless devices on these networks. Jonathan is a project manager for Murray Builders. As part of the jobsite safety program, they use pressure sensors, current flow sensors, position sensors, and motion sensors on a regular basis. Most of the sensors used by Murray Builders fall into which one of the following categories? A. Mechanical B. Thermal C. Biochemical D. Radiant

B. Use encryption and message authentication protocols. A. Mechanical

If the data used in a predictive model has too much complexity, it won't be accurate when data beyond the training data is applied to it. This process is known as A. Generalization. B. Cross-validation. C. Overfitting. D. Matrix confusion. Goshen Mutual is a personal lines insurer. It has decided to use link prediction models to target market new customers through social media connections with current customers. The model indicates that information that current customer Kathy posts on a platform spread widely through social media connections. Which one of the following statements is true about Kathy? A. Kathy has a high degree of closeness. B. Kathy has a minimal degree of betweenness. C. Kathy has a minimal degree of closeness. D. Kathy has a high degree of betweenness.

C. Overfitting. D. Kathy has a high degree of betweenness.
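Overfitting and holdout evaluation can be sketched with a toy model that memorizes its training data, like the answer to Sebastian's scenario further below. The data-generating rule (y = 2x plus noise) and the noise level are assumptions for illustration only.

```python
import random

random.seed(0)
# Hypothetical data: y = 2x plus noise, split into training and holdout halves.
data = [(i / 10, 2 * (i / 10) + random.gauss(0, 1.0)) for i in range(100)]
train, holdout = data[::2], data[1::2]

def memorizer(x):
    """Overly complex model: returns the y of the nearest memorized
    training point, so it fits the training noise exactly."""
    return min(train, key=lambda p: abs(p[0] - x))[1]

def mse(model, points):
    return sum((model(x) - y) ** 2 for x, y in points) / len(points)

# Zero error on the training data (it memorized it), noticeably worse
# on data beyond the training set -- the signature of overfitting.
print("train MSE:  ", round(mse(memorizer, train), 3))
print("holdout MSE:", round(mse(memorizer, holdout), 3))
```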

Raul is an insurance adjuster handling many claims. Because of security concerns, his organization has issued strict guidelines for maintaining his policyholders' and claimants' personally identifiable information (PII). One claimant recently discovered that his PII, only disclosed because of his recent accident, had been compromised. Raul was not found to be remiss in his security efforts. Which other party is most likely responsible? A. An agency in the process of premium collection B. The data scientist compiling the policy information for analysis C. The intermediary who first took the claimant's loss information D. The producer working solely as a consultant

C. The intermediary who first took the claimant's loss information

Which one of the following describes the law of large numbers? A. It states that, in order to be able to predict the relative probability of future events, those events must be both frequent, and independent of one another. B. It states that the more times a particular event has occurred in the past, the greater the likelihood of that same event occurring in the future. C. It states that as the number of similar but independent exposure units increases, the relative accuracy of predictions about future outcomes also increases. D. It states that events that have occurred in the past under identical conditions and resulting from unchanging causal forces will increase at a predictable rate into the future.

C. It states that as the number of similar but independent exposure units increases, the relative accuracy of predictions about future outcomes also increases.
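The law of large numbers can be demonstrated with simulated coin flips: as the number of independent trials grows, the observed frequency of heads approaches the true probability of 0.5.

```python
import random

random.seed(42)
for n in (100, 10_000, 1_000_000):
    # Each trial is an independent fair coin flip.
    heads = sum(random.random() < 0.5 for _ in range(n))
    print(f"n={n:>9,}: observed P(heads) = {heads / n:.4f}")
```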

What are some of the challenges of sensor networks? To reduce network traffic, some sensor networks employ a technique called data aggregation. What does this entail? How could computer vision be hacked?

Sensor networks, built around low-cost and generally minimized-function sensors, are different from traditional networks and require different security protocols to effectively protect them. Challenges include wide geographic distribution and the need for defenses that contemplate how attackers attempt to exploit the unique nature of a sensor network. These networks are designed with the assumption that each sensor node (gathering encrypted data) is unsecured; however, collected data from all nodes is aggregated at a secure base station.
Data aggregation entails compiling individual data systems and data in a way that could result in the totality of the information being classified, classified at a higher level, or of beneficial use to an adversary.
Hackers can manipulate computer vision technology by inputting modified images that cause the computer to misinterpret them. A coordinated attack could disrupt the machine learning, leading to incorrect and dangerous results. Mitigating such risks begins with building machine learning algorithms that can detect and properly classify adversarial images.

steps of the claims process

Claims process:
· Acknowledge claim and assign it to a rep
· Identify policy, confirm coverage
· Contact insured or insured's rep
· Investigate and document claim
· Determine cause of loss and loss amount
· Conclude claim

A supervised learning technique that uses a structure similar to a tree to segment data according to known attributes to determine the value of a categorical target variable. Designed to find a defined target variable such as "accident" or "no accident." A statistical technique that is used to estimate relationships between variables. A statistical method to predict the numerical value of a target variable based on the values of explanatory variables. Comparisons are made between differences in the actual target variable values and the model's forecasted values A model that determines previously unknown groupings of data. These are all methods typically applied to (structured/unstructured data)

Classification trees; regression analysis; linear regression; cluster analysis. These are all usually applied to STRUCTURED data with defined fields.

Information technology (IT) can help insurers increase operational efficiency. Which one of the following would enable field claim representatives at a storm site to access policy information, transmit claims information, and settle losses or make partial payments immediately so that displaced families could return to their homes as soon as possible? A. Cloud computing and storage B. Telematics C. Internet of Things D. Low cost mobile technology Which one of the following explains why a computer recursively applies a model? A. To analyze claims data from previous years B. To determine the probability of a target variable C. To analyze different splits in the values of attributes D. To identify attributes that can be used

D. Low cost mobile technology C. To analyze different splits in the values of attributes

Golmer Insurance is weighing the decision to purchase new information technology (IT) and its management is weighing the costs of efficiency against the savings and benefits the purchase can provide. In this effort, which one of the following will Golmer most likely conclude? A. Business volume cannot be expected to increase simply from incorporating new IT upgrades. B. While enhanced IT capabilities increase operational efficiency in underwriting, they have little impact on other functional departments. C. When evaluating the benefits of new IT, intangible benefits, such as customer satisfaction with the improved technology, are not quantifiable, and so should not be considered. D. Overall savings can result from the retirement of outdated IT, especially from no longer having to maintain the systems.

D. Overall savings can result from the retirement of outdated IT, especially from no longer having to maintain the systems.

Shota is a risk manager who is investigating an industrial injury. The most apparent cause of the injury is lack of attention by the worker, but Shota suspects there are other factors that played a part in the incident. Which one of the following techniques would be most appropriate for Shota to use to identify the causes of the injury? A. Domino theory B. Energy transfer theory C. Job safety analysis D. Root cause analysis

D. Root cause analysis

Aleski is analyzing the relationship between data points in claims data to develop a predictive model. He's comparing claims frequency and severity between new and long-term insureds and comparing attributes to predict the likelihood of large claims. Aleski is looking for A Distance between data points B The class label of the target variable C The nearest neighbor to the data D Similarities between data points In analyzing past claims, Insurance Company finds that a significant number of auto liability claims have been under-reserved. After analyzing these claims to find patterns, they classify and assign new claims based on a series of target variables such as "bodily injury/no bodily injury." They will then analyze the results to see which of these attributes are better predictors of claim severity. This is... A Classification tree analysis B K-means analysis C Cluster analysis D Generalized linear model analysis

D. Similarities between data points. A Classification tree analysis

Daniel is a data scientist who recognizes the four fundamental concepts of data science: information technology can be applied to big data to reveal characteristics; systematic processes can be used to discover useful knowledge from data; analyzing data too closely can result in worthless findings; and data mining approaches and results must consider how results will be applied. Which one of the following concepts of data science would Daniel apply to analyze a dataset that seems to indicate that an insurer's workers compensation rates are too high? A. Information technology can be applied to big data to reveal characteristics. B. Analyzing data too closely can result in worthless findings. C. Data mining approaches and results must consider how results will be applied. D. Systematic processes can be used to discover useful knowledge from data.

D. Systematic processes can be used to discover useful knowledge from data.

Sebastian is a corporate risk manager who is developing a model to predict insureds who are more likely to make fraudulent claims. He has identified attributes that contribute to fraud and had been testing the model using holdout data. Historical data indicates that 5% of insureds file fraudulent claims, but the model predicts that 90% of insureds are likely to commit fraud. Sebastian concludes that A. The data had underfit the model. B. The model had underfit the data. C. The data had overfit the model. D. The model had overfit the data.

D. The model had overfit the data.

Lidar technology has numerous applications for catastrophe management, both before and after a natural disaster. Lidar can be used for all of the following, EXCEPT: A. To collect data on air pressure, temperature, wind turbulence, and location to assist in rescue and remediation efforts B. To help identify regions, neighborhoods, or individual structures that may need to be evacuated before a flood C. To help determine the optimal location for emergency communication equipment after a disaster disrupts cell or internet service D. To continuously monitor earth movement to determine when and where an earthquake will occur

D. To continuously monitor earth movement to determine when and where an earthquake will occur

In general, the easiest opportunities to target a market come from A. Canvassing all customers in a specialty segment. B. Developing relationships with specialty insurers. C. Researching special insurance policies. D. Using special approaches to market segments. Shun is a risk manager who is conducting a probability analysis of the number of auto claims per insured over a 10-year period. The number of claims range from zero to five per insured. Shun plots the results as a(n) A. Discrete probability distribution. B. Continuous probability distribution. C. Theoretical probability. D. Empirical probability.

D. Using special approaches to market segments. A. Discrete probability distribution.

analysis of large amounts of data to find new relationships and patterns that will assist in developing business solutions T/F: it would be appropriate to use a predictive model before knowing what relationships to look for skills, technologies, applications and practices used to improve decision-making insights and reinforce information integrity analyze large databases of business intelligence to supply managers with suggested courses of action (outputs) for business problems. These outputs are limited to ones that are best supported by the data. Can improve speed and quality of decisions

Data mining. FALSE- a predictive model would not be an appropriate tool to use before the cluster analysis. Business intelligence. Decision-support systems (DSS).

involves analyzing large amounts of data to find relationships and patterns that will help in developing business solutions. a deliberate search for any relationships between data—even those that are insignificant. Besides sex, marital status, race, religion, national origin, and credit reports, what are some forms of data that consumers often deem unfair for use in risk selection and pricing?

Data mining. Data dredging. Some controversial forms of data include zip code; biometric and genetic information; purchase histories; telematics; online activities; social media posts; activities tracked by wearables; and information on injuries, disabilities, and medical conditions.

What are some examples of UW-related unfair trade practices? Where can consumers submit complaints about insurers' unfair trade practices? What are the two penalties a DOI could apply on an insurer who violated the law?

Discriminating unfairly when selecting loss exposures; misclassifying exposures; canceling/non-renewing; using non-filed and non-approved rates; failing to apply implemented UW factors; failing to use proper policy forms; failing to follow state-specific rules.
Consumers can submit complaints about an insurer's unfair trade practices to the department of insurance (DOI) of the state where the activity in question occurred. If the DOI finds a complaint valid, the insurance commissioner may issue a cease-and-desist order barring the insurer from continuing the activity. The DOI may then hold a hearing, and if the insurer is successful in its defense, the DOI will remove the cease-and-desist order. If the DOI finds that the insurer violated the law, it may impose one or both of two types of penalty:
1) Fine per violation- the amount will vary based on whether activities were conducted flagrantly with conscious disregard of the law
2) License suspension or revocation- this penalty may be imposed if the insurer's management knew an unfair trade practice was occurring
If an insurer disagrees with the DOI's findings, it can generally file for judicial review. If the court agrees with the findings, the court's decision is final.

approach to accident causation that views accidents as energy released that affects objects in amounts or at rates that objects can't tolerate How can accidents be prevented using the energy transfer theory? break down each activity into individual sequential steps (usually repetitive) determines potential hazards if each action is not performed. Define controls and responsibilities

Energy transfer theory. Prevention: maintain a safe distance; ensure objects moving at high energy can reduce energy and slow down/stop. Job safety analysis (JSA).

T/F: Algorithm has a strict meaning and can only be descriptive Algorithm vs model

FALSE- Algorithms (predictive or descriptive) can take a number of forms, such as mathematical equations, classification trees, and clustering techniques. After an insurer selects the business objectives of its model and the data to be analyzed, it chooses an algorithm. Strictly speaking, an algorithm is different from a model, which is an attempt to represent the state of something. However, in the world of data analytics, the terms are often used interchangeably.

T/F: predictive models can't determine unknown past values why might a descriptive model be used prior to predictive model?

FALSE- Predictive models CAN determine unknown values in the past or present. Information gained from descriptive models can be used to build predictive models. For example, an auto insurer might use a descriptive model to learn what similarities exist among third-party bodily injury claimants with large claims; then, if it wants to know how many of its third-party bodily injury claims will exceed $25,000, it might use a predictive model.

Three layers of neural networks I____________, H_____________, O_______________ assisted and automated intelligence are _________ systems as opposed to augmented and autonomous as adaptive systems how could augmented and autonomous intelligence change insurance?

INPUT, HIDDEN, OUTPUT- an input layer, a hidden layer with nonlinear functions, and an output layer; this structure is used for complex problems. FIXED. An effective augmented intelligence system would be able to identify an insured's liability exposures and could provide risk-rating recommendations. It may also be able to learn based on the judgments of human underwriters. Autonomous intelligence might be able to accept insurance applications, provide quotes, and process and pay claims.
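The three-layer structure (input, hidden with nonlinear functions, output) can be sketched as a single forward pass. All weights here are hypothetical, fixed by hand purely for illustration.

```python
import math

def sigmoid(z):
    """A common nonlinear activation function."""
    return 1 / (1 + math.exp(-z))

def forward(inputs, hidden_w, output_w):
    # Hidden layer: nonlinear function applied to weighted inputs.
    hidden = [sigmoid(sum(w * x for w, x in zip(ws, inputs))) for ws in hidden_w]
    # Output layer: combines the hidden activations into one result.
    return sigmoid(sum(w * h for w, h in zip(output_w, hidden)))

inputs = [0.5, -1.2]                  # input layer: raw feature values
hidden_w = [[0.8, -0.4], [0.3, 0.9]]  # weights for two hidden neurons
output_w = [1.1, -0.7]                # weights for one output neuron
print(round(forward(inputs, hidden_w, output_w), 3))
```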

What are some of the cost-benefits of business intelligence? Water sensors, temperature sensors, motion activators, and smoke detectors would likely be the wireless sensor networks used by someone who was a...

In adopting cost-benefit considerations for new tech:
-savings from eliminating expenses for old tech
-compatibility with other IT systems
-reduced employee workload once the learning curve has been overcome
-potential disruption to customers and their long-term satisfaction
-expected increase in business volume
Property manager

uses the results of other analysis techniques to identify the predominant determinants of an accident is called things that directly result in one event causing another In flipping a coin, each of the two possible outcomes, heads or tails, has an equal probability of 50%. Because on a particular flip of a coin, only one outcome is possible, these outcomes are A. Empirical. B. Mutually exclusive. C. Collectively exhaustive. D. Skewed.

Root cause analysis Causal factors B. Mutually exclusive.

Classification, regression and cluster analysis are applied to (structured/unstructured) data a challenge of unsupervised learning a challenge of supervised learning Why can doing unsupervised learning first be helpful?

STRUCTURED data for classification, regression and cluster analysis. Unsupervised: it can reveal meaningless correlations (for instance, that policyholders named John are more likely to have accidents). Supervised: there must be data about the target; for example, if no data exists about drivers under 30, there is nothing to study. Conducting unsupervised learning first may provide the information needed to define an appropriate target for supervised learning.

What does "k" stand for in K-means? What does "k" stand for in k nearest neighbors? A prediction of the connection between data items. number of possible connections one could have measure of the distance from people (friends) have to a central person the extent to which a person connects others

In the k-means clustering algorithm, the letter k stands for the number of clusters created. The k value in the k-NN algorithm defines how many neighbors will be checked to determine the classification of a specific query point. Link prediction. Centrality. Closeness. Betweenness.
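The role of k in k-means (the number of clusters created) can be shown with a tiny one-dimensional sketch. The claim amounts are hypothetical, chosen so that k=2 separates small claims from large ones.

```python
import random

def k_means(points, k, iters=20, seed=0):
    """Minimal 1-D k-means: k is the number of clusters/centroids."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k random points
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:  # assign each point to its nearest centroid
            clusters[min(range(k), key=lambda i: abs(p - centroids[i]))].append(p)
        # move each centroid to the mean of its cluster
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return sorted(centroids)

claims = [200, 250, 300, 5_000, 5_500, 6_000]
print(k_means(claims, k=2))  # two centroids: small claims vs. large claims
```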

A network of objects that transmit data to computers how should cyber risk be treated since telematics and computer vision don't have a long history? best way to prevent IoT Cyber risk

Internet of things (IoT). Associated cyber risks are often better treated first through risk prevention rather than risk transfer. Keep a structured inventory plan; also restrict access to personal data, limit data collection to what's needed, and require robust passwords/disallow auto connections.

(product targeting, marketing, market, market segmentation, target marketing) 1) an entire collection of sellers and buyers of products and/or services 2) the process of dividing a market into approachable subgroups based on their characteristics, their needs, and the marketing actions they're most likely to respond to. 3) the act of promoting products or services through planning, pricing, and messaging in an attempt to make a sale. 4) a subset of marketing. It's the act of focusing marketing efforts on a specific subgroup of prospects or customers. 5) a subset of target marketing. It's the act of finding or creating unique products, services, and/or marketing strategies to specifically address the needs of the target audience. It optimizes the results from market segmentation and target marketing

1) A market is an entire collection of sellers and buyers of products and/or services. 2) Market segmentation is the process of dividing a market into approachable subgroups based on their characteristics, their needs, and the marketing actions they're most likely to respond to. 3) Marketing is the act of promoting products or services through planning, pricing, and messaging in an attempt to make a sale. 4) Target marketing is a subset of marketing. It's the act of focusing marketing efforts on a specific subgroup of prospects or customers. 5) Product targeting is a subset of target marketing. It's the act of finding or creating unique products, services, and/or marketing strategies to specifically address the needs of the target audience. It optimizes the results from market segmentation and target marketing.

Four kinds of smart sensors (thermal, radiant, mechanical, biochemical) 1) pressure sensors, flow sensors, motion detectors 2) home diagnostic tests, wearable fitness monitor, diabetes tests 3) smoke detectors, heat sensors, computer hardware sensors 4) optical sensors, radar, RFID

1) mechanical- pressure sensors, flow sensors, motion detectors
2) biochemical- home diagnostic tests, wearable fitness monitors, diabetes tests
3) thermal- smoke detectors, heat sensors, computer hardware sensors
4) radiant- optical sensors, radar, RFID

a tool used to identify and assess privacy risks throughout the development life cycle of a program or system. A ___ should identify whether the information being collected complies with privacy-related legal and regulatory requirements. T/F: All data collected from data brokers can be trusted requires businesses to protect the personal data and privacy of EU citizens for transactions that occur within EU member states. It also regulates the export of personal data outside the EU.

A privacy impact assessment (PIA) is a tool used to identify and assess privacy risks throughout the development life cycle of a program or system. A PIA should identify whether the information being collected complies with privacy-related legal and regulatory requirements. FALSE- Be wary of data brokers who may market data to businesses that was collected unethically or illegally. General Data Protection Regulation (GDPR)

The National Association of Insurance Commissioners (NAIC) issued its Principles on Artificial Intelligence to guide insurers in developing models. Which one of the following is the overarching message of these principles? A. Do no harm B. Eliminate unintentional bias C. Avoid discriminatory data D. Be as prescriptive as possible RWB Insurance Company (RWB) is based in New York. Which one of the following should be used by RWB to ensure that information collected by underwriters complies with privacy-related legal and regulatory requirements, and that associated risks are being properly addressed? A. An off-site file back-up system B. The General Data Protection Regulation Guidelines C. A privacy impact assessment D. Firewalls

A. Do no harm C. A privacy impact assessment

Carolina has been given an assignment to calculate the probability of future auto claims based on historical auto claims data in each state. Carolina's prediction is an example of A. Empirical probability. B. Discrete probability distribution. C. Continuous probability distribution. D. Theoretical probability. Risk manager Miguel uses telematics to help his employer predict and prevent losses. He is implementing a proactive risk management program to detect leaks and provide early warnings. Miguel is most likely to use A. Connected-device ecosystems. B. Auto telematics. C. Biometric devices. D. IoT water sensors.

A. Empirical probability. D. IoT water sensors.

A worker for Build-Rite Construction Company removed the pressure gauge and warning sticker from an air compressor. The pressure gauge automatically shuts off the compressor when the pressure gets too high. The sticker warned that an explosion could result if the pressure was too high. Without the pressure gauge in place, the air compressor exploded. The explosion killed one worker and severely injured another. If Build-Rite performs a root cause analysis (RCA) of this fatal accident, which one of the following might be determined to be a root cause? A. Inadequate training B. Removal of the pressure gauge C. Explosion of the air compressor D. Removal of the warning sticker

A. Inadequate training

Which one of the following is an algorithm used to group data into clusters of claims? A. K-means B. Network analysis C. Unsupervised learning D. Centroid Lainie, a data manager for Insurance Company, is tasked with assessing whether the company should adopt a new technological solution. To do this, she first analyzes both the positive and negative results that are likely to occur as a result of making the change. This is called a(n) A. Business intelligence study. B. Underwriting assessment. C. Pros and cons review. D. Cost-benefit analysis. Risk manager Carla is investigating an incident in which a worker was injured. In addition to looking at the accident and the injury that occurred, she is evaluating the worker's ancestry, his unsafe acts and his personal faults. Carla is investigating this incident using A. Energy transfer theory. B. Domino theory. C. Job safety analysis. D. Root cause analysis.

A. K-means D. Cost-benefit analysis. B. Domino theory.
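A minimal k-means sketch in pure Python, clustering one-dimensional claim amounts into two groups; the claim data, the starting centroids, and the function name are all made up for illustration:

```python
def k_means_1d(values, centroids, iterations=10):
    """Group 1-D values into clusters around the given centroids."""
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        # Assign each value to its nearest centroid
        clusters = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Move each centroid to the mean of its assigned cluster
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids, clusters

# Illustrative claim amounts: two obvious groups, small and large
claims = [500, 700, 650, 9000, 8700, 9500]
centroids, clusters = k_means_1d(claims, centroids=[0, 10000])
```

Real datasets have many attributes rather than one, but the assign-then-recompute loop is the same idea.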

Which one of the following steps undertaken by an analyst during a data review can be particularly helpful in detecting data anomalies? A. Perform exploratory analysis B. Review prior data C. Determine data or metadata definitions D. Identify questionable data values As an actuary for Greater American Insurance Co., Stuart has a variety of methods to check data for reasonableness and consistency prior to undertaking large data-related projects. Which one of the following actions involves the most detailed examination of the data? A. Assessment B. Survey C. Review D. Audit

A. Perform exploratory analysis D. Audit

Midwestern Construction Company uses wearable technology to alert managers when an employee has entered a hazardous area. Which one of the following sensor types is most likely to be used in that situation? A. Proximity sensor B. Pressure sensor C. Motion sensor D. Position sensor Jacqueline and her team manage the inventory for a large shipping organization. To date, they have used barcodes to track and manage individual assets. Which one of the following explains why radio frequency identification (RFID) tags might be a better choice? A. RFID tracking happens in real time. B. There may not be enough unique barcodes for a large inventory. C. Most products now come with RFID tags already attached. D. RFID technology is less susceptible to hacking.

A. Proximity sensor A. RFID tracking happens in real time.

Adam analyzes data for a large P&C insurer. He has been given a project to mine data to group its insureds into rankings based on loss statistics. Adam is most likely to use which one of the following types of data model creation to complete this project? A. Unsupervised learning B. Predictive modeling C. Supervised learning D. Descriptive modeling Alva is a data analyst for an insurer and uses techniques such as classification trees, linear regression, cluster analysis, and linear models. When Alva needs to determine the relationship between attributes and a target variable, she creates an algorithm using A. Cluster analysis. B. Classification trees. C. Linear models. D. Linear regression.

A. Unsupervised learning D. Linear regression.

Performance metrics A measure of how often the model predicts the CORRECT OUTCOME. The formula is (TP + TN) ÷ (TP + TN + FP + FN). A measure of how many of the model's POSITIVE predictions are truly positive. The formula is TP ÷ (TP + FP). A measure of how well the model catches ACTUAL POSITIVE results. The formula is TP ÷ (TP + FN). A popular way of evaluating a predictive model because it considers both precision and recall. The formula is 2 × ([Precision × Recall] ÷ [Precision + Recall])

Accuracy: (TP + TN) ÷ (TP + TN + FP + FN). Precision: TP ÷ (TP + FP). Recall: TP ÷ (TP + FN). F-score: 2 × ([Precision × Recall] ÷ [Precision + Recall])
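The four formulas can be sketched in Python from confusion-matrix counts; the function name and the example counts are illustrative assumptions:

```python
def performance_metrics(tp, tn, fp, fn):
    """Compute the four standard metrics from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)   # how often the model is right
    precision = tp / (tp + fp)                   # of predicted positives, how many are real
    recall = tp / (tp + fn)                      # of actual positives, how many are caught
    f_score = 2 * (precision * recall) / (precision + recall)
    return accuracy, precision, recall, f_score

# Example: 40 true positives, 45 true negatives, 5 false positives, 10 false negatives
acc, prec, rec, f1 = performance_metrics(40, 45, 5, 10)
```

Note that accuracy alone can mislead when positives are rare, which is why the F-score blends precision and recall.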

How are similarity weights determined?

The three nearest neighbors could also be weighted by their distance to the new claim to predict whether that claim will be fraudulent. Assume that Claim A has a distance of 12, Claim B has a distance of 13, and Claim C has a distance of 20. Their contributions to the average distance from the new claim can be weighted by their individual distance from it. To calculate the contributions, each distance amount is squared, and the reciprocal of the square is calculated, resulting in a similarity weight. The similarity weights are then weighted relative to each other so that they total 1.00, resulting in a contribution amount. To calculate a probability estimate for each claim, a score of 1.00 is assigned to Yes, and a score of 0.00 is assigned to No, with these numbers multiplied by the respective contribution amounts.
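The worked example above (distances of 12, 13, and 20) can be reproduced in a short Python sketch; the claim labels and the Yes/No fraud pattern are assumptions for illustration:

```python
# Distance of each nearest neighbor from the new claim
distances = {"A": 12, "B": 13, "C": 20}

# Similarity weight: reciprocal of the squared distance
raw = {claim: 1 / d ** 2 for claim, d in distances.items()}

# Weight relative to each other so contributions total 1.00
total = sum(raw.values())
contribution = {claim: w / total for claim, w in raw.items()}

# Probability estimate: Yes = 1.00, No = 0.00; assume A and B were
# fraudulent (Yes) and C was legitimate (No)
labels = {"A": 1.0, "B": 1.0, "C": 0.0}
probability = sum(labels[c] * contribution[c] for c in distances)
```

Because the closest neighbors get the largest weights, Claim A (distance 12) contributes more to the estimate than Claim C (distance 20).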

What are the two main tent poles of ethical models (F and T) What makes an insurer predictive model fair and transparent?

The two main tent poles of ethical models are fairness and transparency. A fair model allows an insurer to make risk and pricing decisions based on data that's both accurate and truly predictive of the expected future cost of coverage. A transparent model allows an insurer to demonstrate the rationale for risk-related decisions and illustrates how consumer data is collected and used. It counters the perception of insurers' algorithms making decisions based on results from so-called black boxes that generate mysterious calculations from consumer data.

There are two types of associated risk with data which are... Advantages of effective data management?

There are two types of associated risk with data: individual risks, which vary according to the type of business and industry, and general risks, which may be categorized as operational or reputational. - Increased overall efficiency - Enhanced on-demand access to data - Improved decision-making

individual policies or a subdivision of policies explanatory information about records. typically include a policy identifier, relevant dates, premium, exposure, and risk characteristics database defined according to records and fields can this database make separate records during the policy period? database where each record generally represents a transaction tied to a specific claim

records fields policy database YES- Separate records can also be created for changes to the risk during the policy period: one record reflecting before and one reflecting after a change was put into effect, such as a midterm adjustment to an existing homeowners policy. claims database

insurer takes possession of damaged property for which it has paid a total loss and recovers a portion of the loss payment by selling the damaged property insurer recovers the amount paid on a claim from any party other than the insured who caused the loss or is otherwise legally liable seamless transaction, documentation, and payment at time/place of sale What are some special considerations for collecting claims data?

salvage subrogation point-of-sale Bill reimbursement, medical audit, personally identifiable information

Consider the example of a business being able to move its commercial autos based on a text alert about an impending storm. What technology enables this prediction and the resulting risk management decision? What elevates smart sensors over ordinary sensors?

sensors Smart sensors may even trigger remedial actions, such as deactivating machinery that is about to overheat, to control risk and prevent losses. Sensors can also be connected into networks, gathering information from multiple locations.

the measure of how alike two data objects are. It can be extremely useful when looking for the relationships between data points (instances). What can be used to measure similarity? Measuring distance between instances determines the most similar instances in a data model, which are called... value of target variable in a model based on majority of nearest neighbors in a model

similarity distance between points- treat the difference in each variable as the legs of a right triangle and find the hypotenuse (the Euclidean distance). For 2+ variables, plot in more than two dimensions nearest neighbors. K-nearest neighbors is an algorithm in which "k" equals the number of nearest neighbors plotted on a graph. class label
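The right-angle/hypotenuse idea is the Euclidean distance, which generalizes to any number of attributes; a minimal sketch (the points are made up):

```python
import math

def euclidean(p, q):
    """Distance between two instances described by numeric attributes."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

# Two attributes: the classic 3-4-5 right triangle
d2 = euclidean((0, 0), (3, 4))        # hypotenuse of legs 3 and 4

# Three attributes: same idea, just plotted in three dimensions
d3 = euclidean((1, 2, 3), (4, 6, 3))
```

A k-nearest-neighbors model would compute this distance from a new instance to every known instance, then keep the k smallest.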

sense environment, process data, and communicate with other products and operations, generating big data that provides an analytical basis for the identification of risks and the prediction of likely future events network consisting of individual sensors placed at various locations to exchange data. Property managers can use technology that uses radio frequency to identify objects. communicates with antenna and transceiver. Helpful in supply chain operations for moving products sensor similar to radar that uses infrared light to detect nearby objects. In catastrophe management- can help with less-than-ideal lighting, capture/produce elevation data, collect area readings for rescue/remediation, determine optimal location for emergency comms equipment

smart products Wireless sensor networks (WSNs) Radio frequency identification (RFID) Lidar

validity vs reasonability

Validity- Refers to relevance or suitability for a particular application; the same data can be valid in one analysis but not in another. In the context of our wildfire example, a dataset that included auto claims might not be considered valid, as it probably doesn't have much predictive value for homeowners claims. Reasonability- Refers to the data's materiality, taking into consideration applicable business conditions and whether, at a basic level, the data makes sense. If the injured parties in a set of auto accident claims, for instance, all appear to be over 100 years old, the data may not be reasonable. dataset vs values

data management considerations associated with big data (Five V's) (Value, velocity, variety, volume, veracity) enormous amounts of data available that is ever-expanding, so a DM plan needs to constantly evolve high volume of unstructured data AND structured data. Handle them in different ways constantly increasing speed at which data arrives or changes completeness and accuracy of data. Unstructured data is likely to have less of this than structured data derived from results of data analysis to help orgs make better business decisions. Big data has potential to add value, but it must be obtained/analyzed w/techniques that provide meaningful results

Volume- enormous amounts of data available that is ever-expanding, so a DM plan needs to constantly evolve Variety- high volume of unstructured data AND structured data. Handle them in different ways Velocity- constantly increasing speed at which data arrives or changes Veracity- completeness and accuracy of data. Unstructured data is likely to have less veracity than structured data Value- derived from results of data analysis to help orgs make better business decisions. Big data has potential to add value, but it must be obtained/analyzed w/techniques that provide meaningful results

Which questions (timeliness, completeness, accuracy, reasonability, lineage, validity) do these answer? What percentage of zip codes in the selected data are valid? What percentage of zip codes truly capture the loss exposures' location and accurately reflect the risk being insured? Do any of the zip codes apply to postal boxes? How large is the dataset? Does it include zip codes from all across the state or only a certain area? Does the collection make sense? Do all of the entries include an acceptable number of digits? Are the zip codes up to date? What protections are in place to ensure the source of data is reliable?

What percentage of zip codes in the selected data are valid? (validity) What percentage of zip codes truly capture the loss exposures' location and accurately reflect the risk being insured? Do any of the zip codes apply to postal boxes? (accuracy) How large is the dataset? Does it include zip codes from all across the state or only a certain area? (completeness) Does the collection make sense? Do all of the entries include an acceptable number of digits? (reasonability) Are the zip codes up to date? (timeliness) What protections are in place to ensure the source of data is reliable? (lineage)

integration of strategic vehicle management solutions with innovative technologies. Provides info through series of layers: (communications layer, service layer, sensing layer) 1) uses a variety of sensors, cameras, and data-collection capabilities to make (or help the driver make) necessary corrections and provide information to others 2) which provides data transmission to and from drivers and managers using wireless protocols that ensure necessary capabilities 3) employs applications using data processing, cloud computing, and storage and analysis of large amounts of the data captured by vehicle sensors and provided by drivers What are some positive results the three layers of smart transportation could provide by working together?

smart transportation 1) sensing layer 2) communications layer 3) service layer Positive results include improved remote diagnostics, prompt driver response from real-time analysis of driving habits or physical condition, fuel and vehicle repair savings because of implemented corrections and preventive maintenance alerts, and customizable products and services (such as comparisons of nearby hotels and restaurants) to make rides easier for drivers and more enjoyable for passengers.

categories in which the data is organized what are some examples of nontraditional internal data sources? what are some examples of external data sources?

fields audio analysis, AI area scans other organizations, government data

software providing basic control for device's hardware "principle of least privilege" meaning for data collection

firmware should collect only the data needed for the intended purpose. Permission for access to the devices or data is restricted to the bare minimum required to fulfill the task.

-Systematic processes can be used to discover useful knowledge from data. -Information technology can be applied to big data to reveal the characteristics of groups of people or events of interest. -Analyzing data too closely can result in interesting findings that are not generally useful. -Data mining approaches and results must be thoughtfully considered in the context in which the results will be applied. four fundamental concepts of d____ ___________ why are premiums not immediately due?

four fundamental concepts of data science To give producers some protection against insureds' late payments, premiums are usually not due from the producer to the insurer until 30 or 45 days after the policy's effective date. This delay also allows the producer to invest the premiums collected until they're due.

A statistical technique that increases the flexibility of a linear model by linking it with a nonlinear function. GLM is used for more simple/complex data classification GLM THREE components: (systematic, link function, random) 1) probability distribution of response variable 2) linear combination of explanatory variables 3) relates results of random and systematic components a technique for unsupervised learning. It is commonly used when an insurer knows the general problem it wants to solve but not the variables it must analyze to do so.

generalized linear model used for more complex data classification RANDOM = probability distribution of the response variable; SYSTEMATIC = linear combination of the explanatory variables; LINK = relates the random and systematic components. Random component (probability distribution of response variable), systematic component (linear combination of explanatory variables), link function (relates results of random and systematic components) cluster analysis
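Logistic regression is a familiar GLM: the random component is a binomial response, the systematic component is a linear combination of explanatory variables, and the logit link relates the two. A hand-rolled sketch, with made-up coefficients for illustration:

```python
import math

def glm_predict(intercept, coefs, x):
    """Predict a probability with a logit-link GLM (logistic regression)."""
    # Systematic component: linear combination of explanatory variables
    eta = intercept + sum(c * xi for c, xi in zip(coefs, x))
    # Inverse of the logit link maps eta onto (0, 1), the scale of the
    # random component (a binomial response)
    return 1 / (1 + math.exp(-eta))

# Illustrative only: probability of a claim given two risk characteristics
p = glm_predict(-2.0, [0.8, 1.5], [1.0, 0.5])
```

In practice an insurer would fit the intercept and coefficients to historical data rather than choose them by hand.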

Examples of smart products M________ sensors can be used for surveillance and security. P__________ sensors convert pressure or tension into a measurement of electrical resistance. C_________ sensors are used to protect electronic systems and batteries from heat buildup. P__________ sensors are used to activate components only when they are in the optimal location for a particular process to continue. P_________ sensors, slightly different from position sensors, respond when an object reaches an area within range of the sensor. For example, proximity sensors in wearables can detect when a person has entered a hazardous area and warn the person or a manager. These would be used for which kind of role?

motion pressure current position proximity Construction and engineering managers = motion, pressure, current, position, proximity

when no more than one outcome can occur at a time when at LEAST one of the outcomes will occur what do you have when the two concepts are combined...

mutually exclusive collectively exhaustive. When the two concepts are combined, you have what's referred to as the MECE (mutually exclusive, collectively exhaustive) principle.

For classification tree: representation of data attribute pathway in classification tree terminal value used to classify instance based on its attributes

node arrow leaf node

when model reflects training data too closely for it to be effective on other data Data that is used to ready a predictive model and that therefore must have known values for the target variable of the model. data that wasn't used for training data development to prevent overfitting ability of model to apply itself to data outside training data the process of splitting available data into multiple folds, or subsets, and then using different folds for training and holdout testing. The result is that all the data is used in both ways

overfit training data holdout generalization cross-validation

historical data is blended with multiple variables to construct models of anticipated future outcomes obtain information through language recognition method that is helpful when a specific problem needs to be solved, after which no further analysis is needed method that is reusable for providing info for data-driven decision making

predictive modeling text mining descriptive analytics predictive analytics

technique for forecasting events on assumption that they're governed by unchanging probability distribution a set of probability estimates from a particular set of circumstances which includes the probability of EACH possible outcome Probability that is based on theoretical principles rather than on actual experience. All the information you need to determine a theoretical probability is right in front of you (coin toss, dice roll) associated with historical data. Based on actual experience through historical data or from observation of facts. ONLY estimates. Samples must be large and representative A mathematical principle stating that as the number of similar but independent exposure units increases, the relative accuracy of predictions about future outcomes (losses) also increases.

probability analysis probability distribution theoretical probability empirical probability law of large numbers
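The gap between theoretical and empirical probability, and the law of large numbers, can be illustrated with a simulated coin toss; the seed is arbitrary, chosen only so the run is repeatable:

```python
import random

random.seed(7)  # arbitrary seed for a repeatable run

theoretical = 0.5  # fair coin: known in advance, no experience needed

def empirical_heads(n_tosses):
    """Empirical probability: estimated from observed outcomes only."""
    heads = sum(1 for _ in range(n_tosses) if random.random() < 0.5)
    return heads / n_tosses

small_sample = empirical_heads(10)       # may land far from 0.5
large_sample = empirical_heads(100_000)  # close to 0.5 (law of large numbers)
```

This is why the card stresses that empirical samples must be large and representative: with only ten tosses the estimate can be badly off.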

wireless sensor networks (water sensors, temp sensors, motion sensors, smoke detectors) and drones could be used by which kind of role? RFID and GPS would likely be used by which kind of role? Internet of Things (IoT), wearables and Lidar could be used for safety and autonomy by which role? WSNs, sensors, accelerometers, and thermal sensors could be used by which role? wearables and drones would be used by which role?

property managers = wireless sensor networks (WSNs) and drones supply chain managers= RFID and GPS transportation managers= Internet of Things (IoT), wearables and Lidar catastrophe managers = WSNs, sensors, accelerometers, and thermal sensors workplace safety managers = wearables and drones

What is sometimes the downside of unsupervised learning? variable that describes characteristic of instance within a model (x variables) representation of data point described by set of attributes within model's dataset (occurrences in dataset) operational sequence used to solve mathematical problems and create computer programs

unsupervised learning doesn't have a defined target variable, which means it can sometimes reveal meaningless correlations attribute instance algorithm

big data is greater in which three areas than traditional data? (Value, velocity, variety, volume, veracity) What might happen when internal and external data about a claimant are combined that could be unintentionally harmful? Why might voice analysis not be a totally accurate indicator?

volume, variety, velocity An insurer might access individual items of data about a claimant from both internal and external sources. While the data from each source by itself may not violate the claimant's privacy rights, combining all the data could lead to conclusions that harm the claimant in a way that wasn't anticipated. While certain vocal indicators can suggest dishonesty, they may be the result of stress or anxiety.

what are the actuary review steps? How does a credit score differ from an insurance score? Why might credit and insurance scores not be totally reliable?

1. Determine data or metadata definitions 2. Identify questionable data values 3. Review prior data 4. Perform exploratory analysis Unlike credit scores, insurance scores are not intended to measure creditworthiness, but rather to predict the probability of a loss, and numerous studies have established the correlation between credit scores and loss potential. However, these scores don't account for the circumstances that may have caused someone's score to decline. An insured's score could be harmed by circumstances such as identity theft or predatory lending, or even a simple billing error. In addition, these scores can reflect racial bias and other unfairly discriminatory practices. When that's the case, large-scale use of credit scores in data models can then perpetuate these biases. As issues of equity take greater precedence, scrutiny of insurers' practices and pressure for regulators to respond also grow. Keeping abreast of developments in areas like predictive modeling algorithms, artificial intelligence, and machine learning can be a challenge for regulators, but it is essential because laws and regulations in these areas may lose relevance as these technologies advance. However, the NAIC has issued its Principles on Artificial Intelligence, which, along with unfair trade practices acts, are meant to guide insurers in developing models that are fair and ethical; accountable; compliant; transparent; and safe, secure, and robust.

In Actuarial Standard of Practice No. 23, the Actuarial Standards Board defines these two processes as: A) A formal and systematic examination of data for the purpose of testing its accuracy and completeness. B) An examination of the obvious characteristics of data to determine if such data appear reasonable and consistent for purposes of the assignment. A review is not as detailed as an audit of data Determine data definitions, identify questionable data values, review prior data, perform exploratory analysis ^^ these are the steps for an actuary's _______ review allows users to generate a dynamic view of data without moving it or needing temporary or intermediary storage of it. ^^^ is a subset of data _____________

Audit A formal and systematic examination of data for the purpose of testing its accuracy and completeness. Review An examination of the obvious characteristics of data to determine if such data appear reasonable and consistent for purposes of the assignment. A review is not as detailed as an audit of data. Actuary steps for a data review: Determine data definitions, identify questionable data values, review prior data, perform exploratory analysis data virtualization Data virtualization is a subset of data integration

Four types of AI (augmented, assisted, autonomous, automated) 1) Executes simple/routine tasks. Uses rules-based software to complete repetitive tasks that don't require human involvement. Doesn't learn from decisions; works best in well-defined tasks. 2) Supports human work. The human makes the decision, but AI provides data to support the decision-making process; assists humans in tasks/decision making. 3) Works collaboratively with humans to perform tasks/make decisions. Often used to help people do new things they couldn't do otherwise. Adaptive; advanced analytics provide insights/recommendations, but the human still gets the final call. 4) Machines act on their own to complete tasks and make decisions without human involvement. Very few true autonomous intelligence systems are in use today, and widespread adoption appears to be years away. Learns to improve calculations over its lifetime.

Automated Intelligence Assisted Intelligence Augmented Intelligence Autonomous Intelligence

Maggie is an underwriter for a company that insures delivery services. She notices an uptick in the number of accidents caused by employees of a particular insured where the insured vehicle "rear-ends" the claimant's vehicle. She investigates by reviewing data. Which one of the following findings is likely to lead to a data-driven decision on how to reverse the trend she discovered? A Maggie learns that the insured had provided generous bonuses to all employees just before the uptick in accidents. B Maggie learns that an initiative to make deliveries in less time coincides with the uptick in accidents. C Maggie learns that the lead repair mechanic was replaced just before the uptick in accidents. D Maggie learns that many of the insured's drivers were taking their cars home at the end of the workday during the period of the uptick.

B Maggie learns that an initiative to make deliveries in less time coincides with the uptick in accidents. Maggie learning that an initiative to make deliveries in less time coincides with the uptick in accidents is revealing. The implication is that drivers are in a hurry and going too fast when it is time to stop.

Pipes, Inc., is a large commercial plumbing company with hundreds of service vehicles. For safety purposes, it employs a variety of smart transportation technologies. Which one of the following layers of smart transportation technology would it use to analyze the large amounts of data collected from all its vehicles? A The communications layer B The service layer C The sensing layer D The reporting layer Luke is a data analyst who is analyzing a dataset of 100,000 workers compensation claims for his employer. The data shows that on average, 10% of first aid claims convert to indemnity within the first year. Through the use of a predictive model, he learns that for insureds in specific geographic regions, 15% of first aid claims convert to indemnity. Luke calculates the lift to be A 0.015 B 1.5 C 66.7 D 0.15

B The service layer Correct. The service layer employs applications using data processing, cloud computing and storage, and analysis of large amounts of vehicle data captured by sensors and provided by drivers. B 1.5
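Luke's lift calculation above reduces to the model's predicted rate divided by the overall baseline rate; a one-line Python sketch (the function name is an assumption):

```python
def lift(model_rate, baseline_rate):
    """Ratio of the model's rate for a segment to the overall baseline rate."""
    return model_rate / baseline_rate

# Luke's example: 15% of first aid claims convert to indemnity in the
# flagged regions vs. 10% across all 100,000 claims
luke_lift = lift(0.15, 0.10)
```

A lift above 1.0 means the model's segment is riskier than average; a lift of 1.0 means the model adds no information beyond the baseline.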

An insurer's governance, risk, and compliance programs create rules, processes, and controls that support the insurer's operating policies and strategic goals. The greatest benefit of these programs is A Availability. B Transparency. C Efficiency. D Inexpensiveness

B Transparency. They create transparency that offers the insurer's management and stakeholders a macro view of all the organization's daily activities and helps them identify potential credit, market, or operational risk exposures so that they can react quickly and appropriately.

Mega Manufacturing Corp. employs a number of new technologies to improve its operations, safety and security. Which one of the following would be the most likely application of computer vision technology? A Collision-avoidance features in warehouse machinery such as forklifts B Protecting the data integrity in its data aggregation networks C Facial recognition software to help keep unauthorized personnel from entering its premises D Sensors that detect smoke or overheating in its machinery Data quality can be adequately managed with meaningful metrics developed to align with... A Unstructured data. B Measurable business expectations. C Projected outcomes. D Timely results. T/F: For assessing liability risks for an emerging technology, a predictive model may not be useful as a first step.

C Facial recognition software to help keep unauthorized personnel from entering its premises ->Computer vision permits automation of tasks that would normally require human eyesight and decision making, such as facial recognition. B Measurable business expectations. TRUE

Miu is developing a model to predict agents who are more likely to manipulate life insurance applications in their favor. She has identified attributes that contribute to the behavior and is testing the model using holdout data. Her historical data implies that 50% of agents file manipulated applications, but the model predicts that 2% of agents are likely to engage in the behavior. Miu concludes that A The data had overfit the model. B The model had underfit the data. C The model had overfit the data. D The data had underfit the model Which one of the following is an appropriate way for a modeler to balance the priorities of accuracy and explainability? A Based on how the model and its results will be used B Based on how a previous iteration of the model performed C Based on the ease of interpreting the model results D Based on the abilities of the modeling technique

C The model had overfit the data. Overfitting is the process of fitting a model too closely to the training data for the model to be effective on other data. A Based on how the model and its results will be used

Mega Manufacturing Corp. inadvertently violates the provisions of the General Data Protection Regulation (GDPR). What is the maximum fine that might be assessed as a flat amount (not a percentage of revenues)? A. 5 million euros B. 10 million euros C. 20 million euros D. 40 million euros There are two types of associated risk for data privacy, individual and general risk. General data privacy risk A. Can be categorized operational or reputational. B. Involves legal and regulatory requirements. C. Is of specific concern to the European Union. D. Varies by the type of business or industry.

C. 20 million euros A. Can be categorized operational or reputational.

Sofia uses insurer operational data such as policy and premium data, accounting data, claims data and notes, and billing data in her analyses. Which one of the following types of data does Sofia refer to when she needs to analyze loss adjustment expenses, allocated loss adjustment expenses, and unallocated loss adjustment expenses? A. Claim data B. Premium data C. Accounting data D. Billing data As a data scientist, Ida must master the necessary skills for her profession: mathematics and statistics, domain knowledge, computer programming, and machine learning. Which one of the following of Ida's skills comes from a combination of two other skills? A. Computer programming B. Mathematics and statistics C. Domain knowledge D. Machine learning

C. Accounting data D. Machine learning

Diana is a claims manager whose company has recently begun using AI to assist with claims predictions and decision-making. The software is able to learn, so its recommendations improve over time, but it does not make any decisions itself; it helps humans make better decisions. Which one of the following terms best describes this technology? A. Autonomous intelligence B. Assisted intelligence C. Augmented intelligence D. Automated intelligence Arty has created a predictive model for bodily injury claimants that are likely to magnify their symptoms. He has received a new claim and this instance has four nearest neighbors, three of whom have claimants who magnified their symptoms and one that was legitimate. Arty refers to this data point as A 4-NN B NN-4 C NN-k D k-NN Sound risk management decisions are predicated on A Effective decision-making. B Quality data C Regulations and compliance. D Operational efficiencies.

C. Augmented intelligence A. 4-NN. B Quality data
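The k-NN idea behind Arty's 4-NN example can be sketched in a few lines of Python: classify a new claim by majority vote of its k closest historical claims. The claim features and labels below are invented for illustration, chosen so the new claim's four nearest neighbors split three "magnified" to one "legitimate", as in the question.

```python
from collections import Counter
import math

def knn_predict(train, new_point, k):
    """Classify new_point by majority vote of its k nearest neighbors.

    train is a list of (features, label) pairs; distance is Euclidean.
    """
    by_distance = sorted(train, key=lambda item: math.dist(item[0], new_point))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

# Hypothetical claims: features = (treatment_cost, days_to_report)
claims = [
    ((9000, 30), "magnified"),
    ((8500, 25), "magnified"),
    ((8800, 28), "magnified"),
    ((2000, 2), "legitimate"),
    ((9100, 27), "legitimate"),
]
# 4-NN vote on the new claim: 3 magnified vs. 1 legitimate
print(knn_predict(claims, (8900, 26), k=4))  # → magnified
```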

Which one of the following explains how classification tree analysis can be used to improve the claims handling process? A. Classification tree analysis is an unsupervised learning technique that can explore unknown attributes. B. Classification tree analysis uses the process map to recommend improvements for the overall workflow. C. Classification tree analysis can be used to assign new claims according to target variables. D. Classification tree analysis can be used to determine the probability of outcomes for new claims. Commercial lines insurance markets can be segmented demographically based on A. Occupation. B. Income levels. C. Type of business. D. Family life cycle.

C. Classification tree analysis can be used to assign new claims according to target variables. C. Type of business.

which term do these questions apply to? (accuracy, completeness, lineage, reasonability, timeliness, validity) how comprehensive the dataset is relative to what it claims to represent and the extent to which it can be used for its intended purpose? Have all stakeholders agreed to a common time frame for collecting and uploading data? Is the data correct and the form in which the data is presented is unambiguous and consistent? Is the data at a level of consistency that mirrors business conditions within an acceptable range? Has an exhaustive effort revealed verifiable reasons for inconsistent and unexpected results from the data? What controls or changes can prevent this from occurring in the future? Is this within the acceptable range to reflect business expectations?

Completeness Completeness is defined by how comprehensive the dataset is relative to what it claims to represent and the extent to which it can be used for its intended purpose. Timeliness Have all stakeholders agreed to a common time frame for collecting and uploading data? Timeliness refers to a dataset's relevance relative to its intended purpose. That is, is it sufficiently up to date for its results to be considered currently applicable? Accuracy A particular dataset's accuracy, although nominally determined by whether the data is correct, is also measured by whether the form in which the data is presented is unambiguous and consistent (for example, the format in which date of birth is represented). Reasonability Is the data at a level of consistency that mirrors business conditions within an acceptable range? A dataset is reasonable if it nominally makes sense. For example, if some of the data fields in a listing of U.S. zip codes contained letters, it would not be considered reasonable. Data lineage Has an exhaustive effort revealed verifiable reasons for inconsistent and unexpected results from the data? What controls or changes can prevent this from occurring in the future? A dataset's lineage is, to the extent made possible by internal data governance policies, a history of its origin, the dates on which any changes occurred to it, and the nature of those changes. It is essentially the dataset's life story. Validity Is this within the acceptable range to reflect business expectations? Data is considered valid if it is, in fact, measuring what it purports to, is correctly stored and formatted, and conforms with any applicable internal data governance standards.

technology that simulates human vision. This is accomplished through the development and use of algorithms that can automatically provide visual understanding. It's used in self-checkout, medical imaging, etc. insights into data use and processing gained by combining AI and machine learning. Most human-like AI, mimicking brain's ability to process info and make adjustments. Can work with incomplete data and train self to perform new tasks. Subset of machine learning

Computer vision is a technology that simulates human vision. It involves detecting, extracting, and analyzing images to give that object context and allow a machine to respond to it as a human would. This is accomplished through the development and use of algorithms that can automatically provide visual understanding. <- used in self-checkout, medical imaging, etc. Deep learning- insights into data use and processing gained by combining AI and machine learning. Most human-like AI, mimicking brain's ability to process info and make adjustments. Can work with incomplete data and train self to perform new tasks. Subset of machine learning

Amani is an underwriter for a lawyers professional liability insurer. Her employer markets its products by administering the insurance programs for various bar associations. This can best be described as an example of A Demographic segmentation. B A targeted campaign. C Product targeting. D Affinity marketing. The actuaries at Greater American Insurance Co. understand that the predictive models they create must be not only fair and accurate but readily explainable. Explainability is demonstrated when a model's predictions can be easily interpreted and the modeling system A Can be repeated to produce identical results. B Can be easily adapted to other lines of insurance. C Follows best practices used by other insurers. D Avoids data dredging.

D Affinity marketing. Affinity marketing = a type of group marketing that targets various groups based on profession, association, interests, hobbies, and attitudes. A Can be repeated to produce identical results.

Fatima is a data scientist for an insurer. While analyzing claims data, she finds that insureds with a specific make of automobile have a higher incidence of theft than other insureds. Fatima would describe the relationship between automobile make and theft losses as A. Low gain, high entropy. B. Low gain, low entropy. C. High gain, high entropy. D. High gain, low entropy. Glaston Insurance wants to profitably grow its commercial auto book of business. It has identified the traits of its most profitable commercial auto accounts and plotted them on a graph. It will then use the model to identify prospective accounts with common traits. Glaston Insurance is using data modeling based on A. Connectiveness. B. Classification and location. C. Similarity and distance. D. Betweenness.

D. High gain, low entropy. C. Similarity and distance.
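Fatima's "high gain, low entropy" answer can be made concrete with a small Python sketch (the theft data below is invented for illustration): entropy measures how mixed a set of class labels is, and information gain is the drop in entropy achieved by splitting on an attribute such as automobile make.

```python
from collections import Counter
import math

def entropy(labels):
    """Shannon entropy (bits) of a list of class labels; 0 = perfectly pure."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(parent_labels, subsets):
    """Entropy reduction from splitting parent_labels into the given subsets."""
    n = len(parent_labels)
    weighted = sum(len(s) / n * entropy(s) for s in subsets)
    return entropy(parent_labels) - weighted

# Hypothetical: theft ("T") vs. no-theft ("N") claims, split by automobile make
parent = ["T"] * 5 + ["N"] * 5          # evenly mixed: entropy = 1.0 bit
make_a = ["T", "T", "T", "T", "N"]      # mostly theft
make_b = ["N", "N", "N", "N", "T"]      # mostly no theft
print(entropy(parent))                               # 1.0
print(information_gain(parent, [make_a, make_b]))    # ≈ 0.28: high gain, purer children
```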

T/F: Lidar can only work after a natural disaster How would lidar be used before and after a flood? When would it be best to use lidar?

FALSE= Lidar can work both before AND after a natural disaster It can capture and produce accurate elevation data, enhancing the mapping of flood-prone areas before a flood and helping to determine the flood levels that will be reached based on current water levels in surrounding rivers. This can help identify regions, neighborhoods, or individual structures that may need to be evacuated before a flood. After a flood, the additional data lidar collects on air pressure, temperature, wind turbulence, and location provides information that assists in prioritizing rescue and remediation efforts. This data can also be used to analyze and compare various flood events. Lidar can help determine the optimal location for emergency communications equipment after a disaster disrupts cell or internet service. It can also provide street-by-street, or even structure-by-structure, analysis of locations with the highest risk, which can help rescue and remediation resources be deployed efficiently. Lidar is best used under less-than-ideal lighting conditions caused by excess clouds, intense sunshine, or shadows.

Depending on the extent of their services, producers may handle a wide variety of insured and insurer data. What are some of the activities producer may handle? why should sensor application be narrow instead of broad?

Producers may be the initial contact with the customer and therefore collect and assess the information on the insurance application. producers may issue policies, collect premiums, provide customer service, handle small or routine claims, and provide varying levels of consulting. LIMIT HACKING REACH A focused approach will limit possible losses if a breach occurs in the network's security. For example, if the sensor network's main purpose is to monitor air quality in a mine shaft, it should not also be used to monitor other factors that have no impact on the mine shaft's operation (such as the number of employees who walk past the sensor).

Which characteristic of data quality refers to the materiality of the data, along with the applicable business conditions and whether, at a basic level, the data makes sense to collect and analyze? (timeliness, reasonability, lineage, accuracy, consistency, validity, completeness) What is the order of legal proceedings? T/F: Actuaries are required to review AND audit data

Reasonability Cease-and-desist, hearing, ruling FALSE: While actuaries are required to review data, they are not required to audit it.

T/F: claims that involve multiple coverages or causes of loss may be represented by separate records or through indicator fields. What is a challenge of combining policy data and claims data?

TRUE Claims data is often aggregated to the claimant level in datasets supplied to business users, such as actuaries. Policy deductibles and limits are typically applied at the OCCURRENCE level, not the claimant level. Data is initially collected when an application is accepted and processed and is collected thereafter when pertinent transactions (such as endorsements or renewals) are completed. Claims data changes as more claims for a given policy or accident year are reported, claims are paid, reserves are revised, or new information is acquired on the claim. As a result, data obtained at the first notice of loss can be much more incomplete than data on more mature claims.

T/F: Precision is usually a better measure of a model's success than accuracy. Precision vs recall Shortcut to remembering f-score

TRUE, precision is usually a better measure of a model's success than accuracy. Precision measures how many of the model's positive predictions are actually correct, while recall measures how well the model finds all instances of the correct class. Precision and recall often have an inverse relationship, meaning that improving one metric usually worsens the other. Precision can be seen as a measure of quality, and recall as a measure of quantity. f-score = 2 × ([Precision × Recall] ÷ [Precision + Recall]) Shortcut: write the 2 first, then follow the order of operations, Multiply, Divide, Add (MDA): 2 × [(P × R) ÷ (P + R)]
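The precision, recall, and f-score definitions above can be checked with a short Python sketch; the actual and predicted labels are made up for illustration.

```python
def precision_recall_f1(actual, predicted, positive="fraud"):
    """Compute precision (quality), recall (quantity), and their harmonic mean."""
    tp = sum(1 for a, p in zip(actual, predicted) if a == p == positive)
    fp = sum(1 for a, p in zip(actual, predicted) if p == positive and a != positive)
    fn = sum(1 for a, p in zip(actual, predicted) if a == positive and p != positive)
    precision = tp / (tp + fp)          # of the claims flagged, how many were right?
    recall = tp / (tp + fn)             # of the real fraud claims, how many were found?
    f1 = 2 * (precision * recall) / (precision + recall)   # the f-score shortcut
    return precision, recall, f1

actual    = ["fraud", "fraud", "fraud", "ok", "ok", "ok"]
predicted = ["fraud", "fraud", "ok",    "ok", "ok", "fraud"]
print(precision_recall_f1(actual, predicted))  # each works out to 2/3 here
```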

The ________-_________ Code of Conduct helps professionals working with data navigate ethical quandaries. It states that, in some instances, accuracy and explainability can be competing interests, causing the modeler to prioritize one over the other based on how the model and its results will be used. T/F: Insurers aren't allowed to discriminate against different groups of customers when someone is apprehended for a small offense but the authorities look to find something larger which will carry a heavier penalty

The Oxford-Munich Code of Conduct helps professionals working with data navigate ethical quandaries. It states that, in some instances, accuracy and explainability can be competing interests, causing the modeler to prioritize one over the other based on how the model and its results will be used. FALSE- Not only is it legal for insurers to discriminate against different groups of insureds or applicants, it's often necessary. In insurance, discrimination means distinguishing, differentiating, or categorizing insureds. Fair discrimination helps ensure that premiums charged are commensurate with the risk. For example, an auto insurer may charge a higher premium for an insured with a history of speeding tickets. Pretextual stops

specialized sensors include (actuators, digital twin, accelerometers, transducers) device converts one form of energy into another mechanical device turning energy into motion or otherwise a change in position using a signal and energy source device measuring acceleration, motion and tilt separate profile of physical object that helps identify risks with the object. Will send alerts

Transducers- device that converts one form of energy into another Actuators- mechanical device turning energy into motion or otherwise a change in position using a signal and energy source Accelerometers- device measuring acceleration, motion, and tilt Digital twin- separate profile of a physical object that helps identify risks with the object. Will send alerts

actuaries vs data scientists difference info related to context of information data scientists is working with Results produced through data science are useful only if they are relevant to the business context.

actuaries- use math methods to analyze various insurance data for several purposes like developing rates or setting reserves vs data scientists- explore previously underused sources of data. Use data programming languages domain knowledge

put these in order -> machine learning, deep learning, artificial intelligence What differs AI (usually) from machine learning What differs deep learning from machine learning? What's a drawback to these programs??

artificial intelligence, machine learning, deep learning AI to ML= machine learning adapts and makes changes ML to DL= deep learning systems have the advantage of being able to work with incomplete data as well as train themselves to perform new tasks, like recognizing speech and images. One drawback to deep learning systems is that the processes and algorithms they use are often highly complex and difficult to explain. This can reduce transparency for customers and regulators.

a finite number of possible outcomes. Data can only take on certain values an infinite number of possible outcomes that can be divided into BINS. Normally described in terms of the probability that a value falls within a certain range Modern data analytics is revolutionizing accident prediction and prevention in two significant ways. What are they? 1) Facilitates collection of large amounts of info from ______ 2) Enables use of ____________________________________to quickly/precisely estimate accident probabilities

discrete probability distributions continuous probability distributions EXAMPLE: At the animal shelter, after counting the cats, you'll weigh them. The counts are discrete values while their weights are continuous. 1) Facilitates collection of large amounts of info from SENSORS 2) predictive models and machine learning to quickly/precisely estimate accident probabilities
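The animal-shelter example can be sketched in Python: counts are discrete (they can only take whole-number values), while weights are continuous and must be grouped into bins to describe the probability a value falls within a range. The counts and weights below are hypothetical.

```python
from collections import Counter

# Discrete: cats counted per shelter room can only be whole numbers
room_counts = [3, 5, 3, 2, 5, 3]
print(Counter(room_counts))  # probability mass sits on specific values

# Continuous: weights (kg) can take any value, so we describe them with bins
weights = [3.1, 4.7, 3.8, 5.2, 4.4, 3.9]
bins = Counter(int(w) for w in weights)  # bin 3 covers [3.0, 4.0), and so on
print(dict(sorted(bins.items())))        # → {3: 3, 4: 2, 5: 1}
```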

three attributes of market segmentation: (geographic, demographic, behavioristic) ·_- purchased history, website activity. Find attributes of loyal customers to improve retention and satisfaction · - understand needs of area · - age, gender, education, occupation, ethnicity, income, family size, etc. combined to create even further subgroups a type of group marketing based on various attributes to sell to groups who have similar interests/needs type of auto insurance in which the premium is based on policyholder's driving behavior. Depends on various behaviors

behavioristic geographic demographic affinity marketing usage-based insurance

A k-means cluster's central point is called a... increasing claims costs for which the reserves are inadequate duration of 1+ years between open and closure Diane works with the insurer's data science team to develop an approach to analyzing long-tail claims. Because they don't yet understand the factors that have caused the adverse development, the team recommends applying which data modeling technique? what would the order be for the following procedures: -generalized linear model -supervised learning -unsupervised learning for k-means

centroid adverse development Long-tail claims they would use k-means clustering As a result, they may discover an outlier cluster of claims based on claims size and the ratio of ultimate loss to the 18-month estimate Work from unsupervised learning for k-means to generalized linear model based on target variable, now in supervised learning

What technique is enabling insurers to improve the accuracy of loss estimates for long-tail claims and more effectively detect fraudulent claims? And what does cluster analysis use to make its conclusions? What happens after the initial clusters are created?

cluster analysis The cluster analysis uses k-means, which indicates the number of clusters within which to group the data, to organize data into clusters of claims closest in distance (and therefore similar) to each group's centroid. After the initial clusters have been created, each one can be further analyzed for additional common attributes. So, a cluster of high-cost claims could produce more clusters according to attributes of litigation or subrogation. This clustering process would continue until the data scientist has a good understanding of the relationships between significant variables.
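A minimal 1-D k-means sketch in Python shows the centroid mechanics described above: assign each claim to its nearest centroid, move each centroid to the mean of its cluster, and repeat. The claim sizes are invented, with a routine cluster and a high-cost outlier cluster of the kind the team hopes to discover.

```python
import statistics

def kmeans_1d(values, centroids, iterations=10):
    """Tiny 1-D k-means: assign each value to its nearest centroid,
    then move each centroid to the mean of its cluster, and repeat."""
    for _ in range(iterations):
        clusters = {c: [] for c in centroids}
        for v in values:
            nearest = min(centroids, key=lambda c: abs(v - c))
            clusters[nearest].append(v)
        centroids = [statistics.mean(members) if members else c
                     for c, members in clusters.items()]
    return sorted(centroids)

# Hypothetical claim sizes (thousands): routine claims plus a high-cost outlier group
claims = [2, 3, 4, 3, 2, 95, 110, 102]
print(kmeans_1d(claims, centroids=[0, 50]))  # centroids settle at 2.8 and ≈102.3
```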

steps of predictive modeling

collect historical data, build model with variables in training data, test model with holdout data, evaluate model, reassess as needed

claim that has 1+ attributes about it that make it cost more than the average claim computer teaches itself to make better models based on results of previous results/new data Once claims are examined against attributes, the attributes can be ranked according to information gain. Then the predictive model works with machine learning to recursively build a classification tree. recursively here means... process of extracting hidden patterns from data used for things like research and fraud detection Obtaining information through language recognition.

complex claim machine learning successively applying a model data mining text mining

automated checkout lanes; medical imaging; automotive safety, particularly related to automated vehicles; surveillance; and traffic control are examples of... a video-streaming service that tracks the shows and movies individuals watch and combines that data with the viewing habits of other subscribers to recommend new shows or movies to the entire group is an example of...

computer vision machine learning

shows predicted and actual results of model. Reveals amount and types of errors made using the model model predicted "yes" but result was "no" model predicted "no" but result was "yes" T/F: Once a model is used, it can be trusted indefinitely to provide meaningful results

confusion matrix False positive False negative FALSE: Once model is used, it isn't done. It should be reevaluated as more data becomes available. Regardless of initial accuracy, however, no predictive model will make accurate predictions indefinitely. When significant economic, technological, or cultural changes occur, predictive models should be updated and retrained.
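The confusion-matrix cells, including the false positives and false negatives defined above, can be tallied with a small Python sketch; the actual/predicted labels are hypothetical.

```python
def confusion_matrix(actual, predicted):
    """2x2 matrix of counts: true/false positives and negatives."""
    cells = {"TP": 0, "FP": 0, "FN": 0, "TN": 0}
    for a, p in zip(actual, predicted):
        if p and a:
            cells["TP"] += 1      # predicted yes, actually yes
        elif p and not a:
            cells["FP"] += 1      # false positive: predicted yes, actually no
        elif not p and a:
            cells["FN"] += 1      # false negative: predicted no, actually yes
        else:
            cells["TN"] += 1      # predicted no, actually no
    return cells

actual    = [True, True, False, False, True, False]
predicted = [True, False, True, False, True, False]
print(confusion_matrix(actual, predicted))  # {'TP': 2, 'FP': 1, 'FN': 1, 'TN': 2}
```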

five factors form chain of events that successfully lead to resulting accident/injury When is domino theory best applied? what are the factors of the domino theory? T/F: If even just one of these links is removed, it can, in theory, prevent the injury Which link should definitely be removed if none else?

domino theory best applied to situations within human control. · Ancestry and social environment · Fault of person · Unsafe act or hazard · Accident itself · Resulting injury Because each of the first four links of the domino theory leads directly to the next, removing any of them should, in theory, prevent the resulting injury from occurring. Removal of the third domino, the unsafe act and/or mechanical or physical hazard, is usually the best way to break the accident sequence and prevent injury or illness.

a tactic used to reduce network traffic which includes the process of combining and organizing data from multiple sources into a single format for analysis and decision-making. what is the risk of data aggregation? how can this risk be prevented?

data aggregation FALSE VALUES and DECRYPTION A fundamental risk here is that a hacker can exploit the sensor nodes and generate false values, compromising the aggregation downstream. A secondary risk is that, to execute the aggregation, data must be decrypted, creating a particularly attractive target point for hackers. ALGORITHMS and AUTHENTICATION Mitigation of this cyber risk requires incorporation of base station algorithms designed to detect forged or false data. Sensor networks employing data aggregation should use encryption and message authentication protocols designed specifically to ensure data integrity. Constant, real-time monitoring of sensor-generated traffic on the network is necessary, with any notable anomalies being investigated for possible evidence of outside manipulation.

data management program into an organization's strategic plan: (data access, integration, governance, preparation and capture, quality) starting point, set of rules and decisions for managing data. Provides guidelines for how/why data should be used and ensures compliance internal data entry processes to capture transactions, customer data, other ops info, outside data sources knowing where data is and how it can be retrieved involves processes that ensure data is accurate and usable for its intended purpose bringing together data from multiple sources across an organization to provide a complete, accurate, and up-to-date dataset for BI, data analysis and other applications and business processes.

data management program into an organization's strategic plan: Data governance- starting point, set of rules and decisions for managing data. Provides guidelines for how/why data should be used and ensures compliance Data preparation and capture- internal data entry processes to capture transactions, customer data, other ops info, outside data sources Data access- knowing where data is and how it can be retrieved Data quality- involves processes that ensure data is accurate and usable for its intended purpose Data integration- bringing together data from multiple sources across an organization to provide a complete, accurate, and up-to-date dataset for BI, data analysis and other applications and business processes.

how well data fits predictive model needs and expectations Elements (timeliness, reasonability, lineage, accuracy, consistency, validity, completeness) relevance or suitability to particular application how well data represents true values or measurements and information being analyzed. Must make sense IN CONTEXT measured by whether dataset delivers all variables it purports to (ex: housing data for all months) data's materiality, takes into consideration applicable business conditions and if it makes sense at a basic level appropriateness of fresh data to meet particular need; being within a certain timeframe systematic process of tracing data from its source to its destination and being able to ID errors or other mishaps that unexpectedly cause data to be inconsistent measures extent to which datasets stored in multiple locations match one another. Data stays consistent moving from one application to another

data quality Validity- relevance or suitability to particular application Accuracy- how well data represents true values or measurements and information being analyzed. Must make sense IN CONTEXT Completeness- (aka comprehensiveness) measured by whether dataset delivers all variables it purports to. (ex: housing data for all months) Reasonability- data's materiality, takes into consideration applicable business conditions and if it makes sense at a basic level Timeliness- appropriateness of fresh data to meet particular need; being within a certain timeframe Lineage- systematic process of tracing data from its source to its destination and being able to ID errors or other mishaps that unexpectedly happen which cause data to be inconsistent or inaccurate Consistency- measures extent to which datasets stored in multiple locations match one another. Data stays consistent and stable moving from one application to another

measure of predictive power in 1+ attributes. Variable usefulness measure of disorder in the dataset. How unpredictable it is. the percentage of positive (accurate) predictions made by the model divided by the percentage of positive predictions that would be made in the absence of the model. the percentage of positive (accurate) predictions made by the model minus the percentage of positive predictions that would be made in the absence of the model.

information gain entropy Lift Leverage
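The lift and leverage definitions above reduce to one line each; the 30% model hit rate versus a 10% baseline below is a made-up illustration.

```python
def lift_and_leverage(model_rate, baseline_rate):
    """Lift = ratio of the model's positive-prediction rate to the baseline rate;
    leverage = the difference between the two."""
    return model_rate / baseline_rate, model_rate - baseline_rate

# Hypothetical: the model flags fraud correctly 30% of the time,
# versus a 10% fraud rate with no model at all
lift, leverage = lift_and_leverage(0.30, 0.10)
print(lift, leverage)  # lift ≈ 3.0 (3x better than chance), leverage ≈ 0.2
```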

Marketing and distribution, underwriting and claims form the three parts of the insurance v__________ ______________ design and use of techniques to process large amounts of data from variety of sources and to ultimately provide knowledge based on data gather and analyze relevant verifiable data then evaluate results to guide business strategies

insurance value chain data science data-driven decision making

data organized into base with defined fields- like loss records, premium history data not organized in predetermined fields- text messages, audio/video files, phone calls ALAE stands for... ULAE stands for... PII stands for...

structured data unstructured data - allocated loss adjustment expense - unallocated loss adjustment expenses personally identifiable information (PII) - protected under HIPAA; disclosure is restricted

A type of model creation, derived from the field of machine learning, in which the target variable is defined. A type of model creation, derived from the field of machine learning, that does not have a defined target variable. when an insurer wants to know whether auto policyholders under the age of 30 are more likely to have an accident than those over 30 when there is no defined target (insurer may simply want to know whether policyholders fall into natural groupings. There is no intended answer to this query) used to study data and gain insight into it. For example, an auto insurer might want to know what similarities exist among third-party bodily injury claimants with large claims. An auto insurer wants to know how many of its third-party bodily injury claims will exceed $25,000

supervised learning unsupervised learning supervised learning unsupervised learning descriptive modeling predictive modeling

learning that is suited for classification and regression tasks, such as weather forecasting, pricing changes, sentiment analysis, and spam detection learning is more commonly used for exploratory data analysis and clustering tasks, such as anomaly detection, big data visualization, or customer segmentation. Supervised/unsupervised learning would be used to identify buyer groups that purchase related products together to provide suggestions for other items to recommend to similar customers. Supervised/unsupervised learning would be used to predict flight times based on specific parameters, such as weather conditions, airport traffic, peak flight hours, and more

supervised learning unsupervised learning unsupervised supervised

