Section VI: Big Data Analytics
Sensors
A sensor is a device that measures the physical environment, records data, and responds to it. Examples of sensors include the following:
- Clothing sensors
- Heat sensors
- Machine sensors
Artificial intelligence (AI)
Artificial intelligence is computer programming that simulates human reasoning. An AI-enabled computer is able to analyze data to make assumptions and support business decisions.
Data Mining - Association rule learning
Association rule learning is a data-mining technique that looks to discover new relationships within a data set. With association rule learning, the insurer does not know the characteristics of the data beforehand; the purpose is to explore and discover new relationships in data that can be useful in making business decisions.
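The exploratory nature of association rule learning can be sketched with the two standard rule measures, support and confidence. The coverage names and thresholds below are hypothetical, chosen only to illustrate how pairwise rules are discovered without knowing the relationships beforehand.

```python
from itertools import combinations

# Toy transaction data: each record lists the coverages one customer bought.
# Coverage names are illustrative, not from any real book of business.
transactions = [
    {"auto", "home", "umbrella"},
    {"auto", "home"},
    {"auto", "renters"},
    {"auto", "home", "umbrella"},
    {"home", "umbrella"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(antecedent, consequent, transactions):
    """Of transactions containing the antecedent, the fraction that also
    contain the consequent."""
    return support(antecedent | consequent, transactions) / support(antecedent, transactions)

# Discover pairwise rules that clear (arbitrary) support and confidence floors.
items = sorted(set().union(*transactions))
rules = []
for a, b in combinations(items, 2):
    for ant, con in (({a}, {b}), ({b}, {a})):
        if (support(ant | con, transactions) >= 0.4
                and confidence(ant, con, transactions) >= 0.6):
            rules.append((sorted(ant)[0], sorted(con)[0]))
# rules now includes pairs such as ("umbrella", "auto"): customers who buy
# umbrella coverage also tend to hold auto coverage in this toy data.
```

Real association rule mining (e.g., the Apriori algorithm) extends this same idea to itemsets of any size while pruning the search space.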
Big Data vs. Traditional Data
Big data is an aggregation of information too large to be analyzed by traditional data-analysis methods. Big data differs from traditional data in the following ways:
- Accuracy of data (less accurate than traditional data)
- Speed of data
- Types of data
- Value of data
- Volume of data
Big data
Big data is described as "extremely large data sets." These data sets are too large to be analyzed by traditional and manual methods; the large data sets are analyzed using computers to reveal patterns and trends.
CRISP-DM
CRISP-DM is a six-step process:
Step 1: Understand the business need for data mining.
Step 2: Understand the data.
Step 3: Prepare the data.
Step 4: Begin modeling.
Step 5: Evaluate data.
Step 6: Use the model.
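The six steps can be sketched as a single pipeline. Everything here is illustrative: the claim amounts, the "model" (a simple average), and the flagging threshold are placeholders standing in for a real business need, real data, and a real modeling technique.

```python
# A minimal, hypothetical walk-through of the six CRISP-DM steps.

def crisp_dm(raw_claims):
    # Step 1: Understand the business need -- flag abnormally large claims.
    threshold_multiplier = 2.0

    # Step 2: Understand the data -- confirm what was collected.
    assert all(c is None or isinstance(c, (int, float)) for c in raw_claims)

    # Step 3: Prepare the data -- drop missing values.
    claims = [c for c in raw_claims if c is not None]

    # Step 4: Begin modeling -- fit a trivial model (the mean claim amount).
    mean_claim = sum(claims) / len(claims)

    # Step 5: Evaluate -- check the model output against the business need.
    flagged = [c for c in claims if c > threshold_multiplier * mean_claim]

    # Step 6: Use the model -- return results for the business decision.
    return mean_claim, flagged

mean_claim, flagged = crisp_dm([1000, 1200, None, 900, 5000])
# mean_claim is 2025.0; the 5000 claim exceeds 2 x the mean and is flagged.
```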
Data Mining - Classification
Classification is the process of creating groups within a data set, based on known characteristics of the data. This technique is used when the insurer knows what information it wants to predict.
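A minimal classification sketch: because the characteristics (labels) are known beforehand, the insurer can train on labeled examples and assign a new risk to the closest group. The features, labels, and nearest-centroid technique below are illustrative assumptions, not a prescribed method.

```python
# Nearest-centroid classifier on hypothetical driver data:
# (annual mileage in thousands, prior claims) with known risk labels.

def centroid(points):
    """Coordinate-wise average of a list of points."""
    return tuple(sum(coord) / len(points) for coord in zip(*points))

def classify(point, labeled_points):
    """Assign point to the class whose centroid is closest."""
    centroids = {label: centroid(pts) for label, pts in labeled_points.items()}

    def sq_dist(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))

    return min(centroids, key=lambda label: sq_dist(point, centroids[label]))

training = {
    "low_risk": [(8, 0), (10, 0), (9, 1)],
    "high_risk": [(25, 2), (30, 3), (28, 2)],
}
label = classify((12, 1), training)  # -> "low_risk"
```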
Data Mining - Cluster analysis
Cluster analysis is a computer-based statistical method used to create groups within a data set by connecting data through various characteristics, forming relationships within the data that were previously unknown. This is particularly useful to uncover customer needs that could lead to new insurance products.
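The contrast with classification can be shown with a minimal k-means sketch: no labels are known in advance, and the groups emerge from the data itself. The premium amounts and the two-cluster setup are illustrative assumptions.

```python
# Two-cluster k-means on 1-D data (hypothetical annual premiums).

def kmeans_1d(values, iters=10):
    """Split values into two clusters around iteratively refined centers."""
    centers = [min(values), max(values)]  # initialize from the data range
    clusters = ([], [])
    for _ in range(iters):
        clusters = ([], [])
        for v in values:
            nearest = 0 if abs(v - centers[0]) <= abs(v - centers[1]) else 1
            clusters[nearest].append(v)
        centers = [sum(c) / len(c) for c in clusters]
    return centers, clusters

premiums = [500, 520, 480, 2000, 2100, 1950]
centers, clusters = kmeans_1d(premiums)
# Two previously unlabeled groups emerge: low-premium and high-premium customers.
```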
Data Mining
Data mining is the process of analyzing large amounts of data to find new connections, relationships, and patterns within the data.
Data science
Data science is the study of designing and using new techniques to process large amounts of data. This is a new field of study brought about by the growing need for businesses to manage and use big data to solve business problems.
Machine learning
Machine learning is a form of artificial intelligence where the computer continually teaches itself to learn from history and new data, helping the computer make better decisions.
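The "continually learns from history and new data" idea can be sketched with an online model that refines its estimate every time a new observation arrives. The running-mean predictor and the claim amounts are a deliberately trivial illustration, not a real insurer's model.

```python
# Hypothetical online learner: the prediction improves as data arrives.

class RunningMeanPredictor:
    def __init__(self):
        self.n = 0
        self.mean = 0.0

    def update(self, value):
        """Fold one new observation into the model incrementally."""
        self.n += 1
        self.mean += (value - self.mean) / self.n

    def predict(self):
        return self.mean

model = RunningMeanPredictor()
for claim in [1000, 1500, 2000]:   # historical data
    model.update(claim)
model.update(3500)                  # new incoming data refines the estimate
# model.predict() is now 2000.0, the mean of all four observations.
```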
Big Data 1.0
Most organizations are still in the first stage of big data, referred to as "Big Data 1.0." In this stage, organizations are just starting to utilize the internet by conducting business online and collecting publicly available data. Insurers in the Big Data 1.0 phase are using online insurance applications, collecting publicly available underwriting information, and marketing through online platforms.
Data Mining - Regression analysis
Regression analysis is a data-mining technique that calculates the probability of an outcome based on characteristics of a data set. This technique is used when the insurer knows what information it wants to predict.
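A one-variable least-squares fit shows the core mechanic: relate a known characteristic to an observed outcome, then predict for a new risk. The roof-age data is invented for illustration; when the outcome is strictly a probability, insurers often use logistic rather than linear regression.

```python
# Ordinary least-squares line on hypothetical (roof age, claim rate) data.

def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return slope, intercept

roof_age = [1, 5, 10, 20]            # years
claim_rate = [0.02, 0.04, 0.07, 0.12]  # observed annual claim frequency
slope, intercept = fit_line(roof_age, claim_rate)

# Predict the claim rate for a new risk with a 15-year-old roof.
predicted = intercept + slope * 15
```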
Structured data
Structured data is data organized into databases with distinct fields. This data is linked to other databases. For example, an analyst reviewing the insurer's in-force book of business will pull claim history from one database and pull premium information from another database, combining the two structured data sets into one report.
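The analyst's workflow in that example can be sketched as a join of two structured tables on a shared key. The policy numbers, field names, and values below are hypothetical.

```python
# Two "databases" with distinct fields, keyed by policy number.
claims_db = {
    "POL-001": {"claims": 2, "paid": 4500},
    "POL-002": {"claims": 0, "paid": 0},
}
premium_db = {
    "POL-001": {"annual_premium": 1200},
    "POL-002": {"annual_premium": 900},
}

# Combine the two structured data sets into one report, joined on the key.
report = {
    policy: {**claims_db[policy], **premium_db[policy]}
    for policy in claims_db.keys() & premium_db.keys()
}
```

Because structured data lives in fields with known meanings, this kind of join is straightforward; the same operation on unstructured data (free text, images) has no obvious equivalent.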
Telematics
Telematics refers to devices insurers use to collect driving data about an insured. The device connects to the insured's car and collects real-time data about the insured's driving, such as hard braking patterns, the tendency toward fast acceleration, and distance driven. Insurers analyze this data to determine the correlation between driving patterns and the likelihood of an accident.
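One simple analysis of such data is counting hard-braking events from a stream of speed readings. The speeds (one reading per second) and the drop threshold below are illustrative assumptions about what a telematics device might report.

```python
# Count hard-braking events in hypothetical telematics speed samples (mph).

def hard_brake_events(speeds, drop_threshold=10):
    """Count consecutive-sample speed drops that exceed drop_threshold."""
    return sum(
        1 for prev, cur in zip(speeds, speeds[1:])
        if prev - cur > drop_threshold
    )

trip = [35, 36, 37, 25, 24, 40, 41, 20, 19]
events = hard_brake_events(trip)   # two hard brakes: 37->25 and 41->20
```

An insurer would aggregate counts like this over many trips and correlate them with claim frequency.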
Text mining
Text mining is the process of using text recognition to gather new data and information. For example, text mining can be used to analyze handwritten notes on an inspection report to identify common exposures between new insureds.
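Once handwritten notes are converted to text, a first-pass analysis can be as simple as counting exposure-related terms across reports. The notes and the keyword list below are invented for illustration.

```python
import re
from collections import Counter

# Hypothetical transcribed inspection notes.
notes = [
    "Frayed wiring observed near the panel; wiring needs replacement.",
    "Old wiring in basement. Space heater in storage area.",
    "Space heater left running; blocked exit noted.",
]

# Illustrative list of exposure terms to look for.
exposure_keywords = {"wiring", "heater", "exit"}

# Tally keyword occurrences across all notes.
counts = Counter(
    word
    for note in notes
    for word in re.findall(r"[a-z]+", note.lower())
    if word in exposure_keywords
)
# counts.most_common(1) -> [("wiring", 3)]: wiring is the most common exposure.
```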
Descriptive method
The descriptive approach is used to analyze and solve one specific problem. It is not intended to be used beyond the specific problem.
Internet of things (IoT)
The internet of things is a network of objects connected to the internet. This enables everyday objects to send data to, and receive data from, computers.
Predictive method
The predictive approach is applied continuously to data. It is intended to be used on new incoming data on a repeating basis to continually make data-driven decisions.
Data-driven decision-making process
The process of reaching a data-driven decision is very similar to CRISP-DM. There are six steps to reaching a data-driven decision:
Step 1: Understand the problem.
Step 2: Collect data.
Step 3: Fix inaccurate data.
Step 4: Choose a technique.
Step 5: Review output.
Step 6: Make a decision.
Unstructured data
Unstructured data is data that is not organized into a specific format; it often consists of text and images that cannot be easily grouped or stored in a database.
Big Data 2.0
We are currently entering the second stage of big data, referred to as "Big Data 2.0." In the Big Data 2.0 stage, data scientists are able to instantaneously extract, aggregate, and analyze information from various data sources. In this stage, companies are able to process and analyze information from numerous data sources such as homes, automobiles, and sensors.