Data mining process
What can we do with data mining (its uses)?
Exploratory data analysis; predictive modeling, which is divided into two categories, classification and regression; descriptive modeling; discovering patterns and rules; deviation detection.
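As an illustration of the last use listed, deviation detection, here is a minimal sketch that flags values lying far from the mean. The z-score rule, the threshold of 2.0, the function name find_deviations and the sample numbers are all assumptions chosen for illustration; the notes do not prescribe a particular method.

```python
from statistics import mean, stdev

def find_deviations(values, threshold=2.0):
    """Return the values whose z-score magnitude exceeds the threshold."""
    mu = mean(values)
    sigma = stdev(values)  # sample standard deviation
    if sigma == 0:
        return []  # all values identical, nothing deviates
    return [v for v in values if abs(v - mu) / sigma > threshold]

# Hypothetical daily transaction totals; 980 is a planted anomaly.
daily_totals = [102, 98, 105, 97, 101, 99, 103, 980, 100, 96]
outliers = find_deviations(daily_totals)
print(outliers)
```

A single extreme value inflates the standard deviation, so on small samples a lower threshold (or a median-based rule) is often needed to catch it.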
Explain the data analytics process
Gather data; model the problem; run the analysis; interpret the results; knowledge; deployment; enhance, apply broadly, and learn.
What is unsupervised learning?
Unsupervised learning is the machine learning task of inferring a function to describe hidden structure from unlabeled data. Since the examples given to the learner are unlabeled, there is no error or reward signal to evaluate a potential solution. This distinguishes unsupervised learning from supervised learning and reinforcement learning.
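To make "hidden structure from unlabeled data" concrete, here is a minimal sketch of one unsupervised method, k-means clustering, on one-dimensional data. The choice of k-means, the function name k_means_1d, the starting centers and the sample points are assumptions for illustration; the definition above does not name a specific algorithm.

```python
def k_means_1d(points, centers, iterations=10):
    """Cluster 1-D points around the given starting centers (plain k-means)."""
    centers = list(centers)
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            clusters[nearest].append(p)
        # Update step: move each center to the mean of its cluster.
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers, clusters

# Hypothetical unlabeled measurements: no target column is supplied.
points = [1.0, 1.5, 0.5, 9.0, 9.5, 8.5]
centers, clusters = k_means_1d(points, centers=[0.0, 5.0])
print(centers, clusters)
```

Note that the code never sees a correct answer: the two groups emerge only from the distances between the points.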
What is supervised learning?
Supervised learning is the machine learning task of inferring a function from labeled training data. In essence, the machine is trained on data for which the target value is already known.
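A minimal sketch of learning from labeled data follows, using a one-nearest-neighbor rule: a new case receives the label of the most similar training case. The nearest-neighbor method, the function name nearest_neighbor_predict and the sample rows are assumptions for illustration, not something the notes specify.

```python
def nearest_neighbor_predict(training_rows, query):
    """Return the label of the training row closest to the query point."""
    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    _, label = min(training_rows,
                   key=lambda row: squared_distance(row[0], query))
    return label

# Hypothetical labeled training data: (height_cm, weight_kg) with a known label.
training_rows = [
    ((20.0, 4.0), "cat"),
    ((25.0, 5.0), "cat"),
    ((60.0, 30.0), "dog"),
    ((70.0, 35.0), "dog"),
]
prediction = nearest_neighbor_predict(training_rows, (65.0, 32.0))
print(prediction)
```

The known labels in training_rows are what make this supervised: without them there would be nothing to copy onto the new case.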
What is data mining?
Data mining is the process of analyzing data to extract information that the raw data alone does not offer.
What are the sources of data for data mining?
Abundant data, such as the web, e-commerce, transactions, and stocks. Science, for example sensing, biometrics, simulation, and bioinformatics. Society, such as YouTube, the news, and digital cameras. Data mining itself, in the semi-automated analysis of massive data sets.
What are the methods of supervised learning?
Classification: this method identifies labels and groups. The input data is sorted into categories, e.g. female/male or car/dog. Prediction: this method predicts a numerical target variable. In the input data, each row is a case and each column is a variable.
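The prediction case can be sketched with a least-squares line fit, where each row is a case and each column is a variable. The use of simple linear regression, the function name fit_line and the sample figures are assumptions for illustration. A classifier would return a category instead of a number.

```python
def fit_line(rows):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(rows)
    mean_x = sum(x for x, _ in rows) / n
    mean_y = sum(y for _, y in rows) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in rows)
    sxx = sum((x - mean_x) ** 2 for x, _ in rows)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data: each row is a case, columns are
# (floor_area_m2, price_thousands); the second column is the numeric target.
rows = [(50, 150), (60, 180), (80, 240), (100, 300)]
slope, intercept = fit_line(rows)
predicted_price = slope * 70 + intercept  # prediction for an unseen case
print(slope, intercept, predicted_price)
```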
What are the steps in data mining?
1. Define the problem. 2. Obtain the data. 3. Clean and pre-process the data. 4. Specify the task to be performed on the data. 5. Use one or more algorithms. 6. Implementation. 7. Assess the results. 8. Deploy the model in production mode.
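The steps above can be sketched end to end on a toy problem. Everything here is an assumption for illustration: the problem (flagging large orders), the data, the threshold rule used as the algorithm, and the helper names clean, train_threshold and accuracy. Deployment is left out.

```python
def clean(rows):
    """Clean and pre-process: drop cases with a missing value."""
    return [r for r in rows if None not in r]

def train_threshold(rows):
    """Use an algorithm: learn a cut-off halfway between the two class means."""
    pos = [x for x, label in rows if label == 1]
    neg = [x for x, label in rows if label == 0]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2

def accuracy(rows, threshold):
    """Assess the results: share of held-out cases labeled correctly."""
    hits = sum((x > threshold) == bool(label) for x, label in rows)
    return hits / len(rows)

# Define the problem and obtain data: hypothetical (order_amount, is_large) cases.
raw = [(10, 0), (12, 0), (None, 1), (48, 1), (52, 1), (11, 0), (50, 1), (9, 0)]
data = clean(raw)
# Specify the task: classification, with some cases held out for assessment.
train, test = data[:4], data[4:]
threshold = train_threshold(train)
score = accuracy(test, threshold)
print(threshold, score)
```

Holding out the last cases keeps the assessment honest, since the model is scored on data it did not see during training.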
What are the key features of data mining?
It focuses on large data sets and databases for analysis: data mining is accomplished by building models, and a model uses an algorithm to act on a set of data. Discovery is automatic: the notion of automatic discovery refers to the execution of data mining models. Prediction is based on likely outcomes: many forms of data mining are predictive, and predictions have an associated probability. It generates automatic pattern predictions based on trends and behavior.