Machine Learning
Unsupervised learning techniques
- Affinity Analysis - Clustering - Clustering: k-means - Nearest-neighbor mapping - self-organizing maps - singular value decomposition
Reinforcement learning techniques
- Artificial Neural networks - Learning automata - Markov Decision Process - Q-learning
Supervised learning techniques
- Bayesian statistics - decision trees - forecasting - neural networks - Random Forests - regression analysis - support vector machines
4 types of machine learning
1. Supervised 2. Semi-supervised 3. Unsupervised 4. Reinforcement
Regression analysis
A statistical method for estimating the relationship between an outcome variable and one or more independent variables; for example, predicting sales from past sales and variables such as population or income
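As a rough illustration of the idea, the sketch below fits a straight line by ordinary least squares with a single independent variable. The function name `fit_line` and the population/sales figures are made up for the example; they are not from the source.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums around the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data: population (thousands) vs. past sales (units).
population = [10, 20, 30, 40]
sales = [25, 45, 65, 85]

slope, intercept = fit_line(population, sales)
predicted_sales = slope * 50 + intercept  # prediction for a population of 50
```

With more than one independent variable the same idea applies, but the fit is usually done with a linear algebra library rather than by hand.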
Unsupervised learning
Category of data-mining techniques in which an algorithm explains relationships without an outcome variable to guide the process. (Market basket analysis, anomaly/intrusion detection, identifying like things)
Supervised learning
Category of data-mining techniques in which an algorithm learns how to predict or classify an outcome variable of interest. (Risk assessment, personalizing interaction, fraud detection, customer segmentation, image, speech, and text recognition)
Artificial Neural networks
Computer systems intended to mimic the human brain: layers of interconnected nodes (artificial neurons) whose connection weights are adjusted during training
Markov Decision Process
Consists of a set of states, set of actions, a probability function for the next state given the current state and action, and an immediate reward function given the current state, current action, and next state.
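The four components can be written down directly as data. The following is a minimal sketch only: the two-state "battery" problem, the names `transition`, `reward`, and `step`, and every number in it are hypothetical.

```python
import random

# Hypothetical two-state MDP; all names and numbers are illustrative.
states = ["low", "high"]
actions = ["wait", "charge"]

# Probability function: transition[(state, action)] -> {next_state: probability}
transition = {
    ("low", "wait"): {"low": 1.0},
    ("low", "charge"): {"high": 0.8, "low": 0.2},
    ("high", "wait"): {"high": 0.5, "low": 0.5},
    ("high", "charge"): {"high": 1.0},
}

def reward(state, action, next_state):
    """Immediate reward given current state, action, and next state."""
    if action == "charge":
        return -1.0  # charging has a cost
    return 2.0 if next_state == "high" else 0.0

def step(state, action, rng):
    """Sample the next state from the probability function and return its reward."""
    probs = transition[(state, action)]
    next_state = rng.choices(list(probs), weights=list(probs.values()))[0]
    return next_state, reward(state, action, next_state)

rng = random.Random(0)
next_state, r = step("low", "charge", rng)
```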
Decision trees
Diagrams where answers to yes or no questions lead decision makers to address additional questions until they reach the end of the tree.
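One simple way to picture this in code is a nested dictionary that is walked from the root to a leaf. The loan-screening questions, the outcomes, and the `decide` helper below are invented for illustration.

```python
# Hypothetical loan-screening tree; questions and outcomes are illustrative.
tree = {
    "question": "income_over_50k",
    "yes": {"question": "has_defaulted", "yes": "reject", "no": "approve"},
    "no": "reject",
}

def decide(node, answers):
    """Follow yes/no answers down the tree until a leaf (a string) is reached."""
    while isinstance(node, dict):
        node = node["yes"] if answers[node["question"]] else node["no"]
    return node

outcome = decide(tree, {"income_over_50k": True, "has_defaulted": False})
```

In machine learning the tree itself is learned from training data; this sketch only shows how a finished tree is used.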
Clustering: K-Means
For a given K, finds K clusters by alternating two steps until the assignments stop changing: assigning each point to its nearest cluster center, and moving each cluster center to the center of gravity (centroid) of the points assigned to it.
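The iteration can be sketched in plain Python as below. This is an assumption-laden sketch, not a production implementation: the initial centers are passed in by the caller, points are 2-D tuples, and the sample data is made up.

```python
import math

def kmeans(points, centers, max_iter=100):
    """Cluster (x, y) tuples, starting from the given initial centers."""
    centers = list(centers)
    assignment = None
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest center.
        new_assignment = [
            min(range(len(centers)), key=lambda k: math.dist(p, centers[k]))
            for p in points
        ]
        if new_assignment == assignment:
            break  # assignments stopped changing, so we have converged
        assignment = new_assignment
        # Update step: move each center to the centroid of its points.
        for k in range(len(centers)):
            members = [p for p, a in zip(points, assignment) if a == k]
            if members:  # an empty cluster keeps its previous center
                centers[k] = tuple(sum(c) / len(members) for c in zip(*members))
    return centers, assignment

# Hypothetical data: two well-separated groups, K = 2.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centers, assignment = kmeans(points, centers=[(0, 0), (10, 10)])
```

The result depends on the starting centers, which is why practical implementations usually try several initializations.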
Support Vector Machine
Supervised learning tool that seeks a dividing hyperplane in any number of dimensions; it can be used for classification or regression
Random Forests
An ensemble of decision trees. Very good performance (speed, accuracy) when abundant data is available. Ways to randomise the trees: use bootstrapping/bagging to initialise each tree with different data; use only a subset of variables at each node; use a random optimisation criterion at each node; project features onto a different random manifold at each node.
Nearest-Neighbor
a method that gives a new point the value or label of the closest known point; in image resampling, for example, it uses the nearest pixel value to estimate a new pixel value
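For the pixel case, a minimal sketch might look like the following. It assumes the image is a 2-D list of values and uses a floor-based index mapping, which is one common convention among several; `resize_nearest` is a made-up name.

```python
def resize_nearest(image, new_h, new_w):
    """Resample a 2-D list of pixel values using nearest-neighbor lookup."""
    old_h, old_w = len(image), len(image[0])
    return [
        # Each output pixel copies the source pixel its position maps back to.
        [image[r * old_h // new_h][c * old_w // new_w] for c in range(new_w)]
        for r in range(new_h)
    ]

small = [[1, 2],
         [3, 4]]
big = resize_nearest(small, 4, 4)
```

No new pixel values are invented: every output value is copied from an existing pixel, which is why the result looks blocky when enlarging.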
Reinforcement learning
a type of Machine Learning, and thereby also a branch of Artificial Intelligence. It allows machines and software agents to automatically determine the ideal behavior within a specific context in order to maximize their performance. (gaming, robotics, navigation)
Q-learning
algorithm that learns the quality (Q-value) of actions, telling an agent what action to take under what circumstances. It does not require a model of the environment, and it can handle problems with stochastic transitions and rewards without requiring adaptations.
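A tabular version of the update can be sketched as follows. The four-state corridor environment, the epsilon-greedy exploration, and the values of `alpha`, `gamma`, and `epsilon` are all assumptions chosen for the example.

```python
import random
from collections import defaultdict

actions = ["left", "right"]

def q_update(q, state, action, reward, next_state, alpha=0.5, gamma=0.9):
    """One tabular Q-learning update toward reward + gamma * best next value."""
    best_next = max(q[(next_state, a)] for a in actions)
    target = reward + gamma * best_next
    q[(state, action)] += alpha * (target - q[(state, action)])

def env_step(state, action):
    """Hypothetical corridor with states 0..3; reaching state 3 pays 1."""
    next_state = min(state + 1, 3) if action == "right" else max(state - 1, 0)
    return next_state, (1.0 if next_state == 3 else 0.0)

def train(episodes=200, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    q = defaultdict(float)  # unseen (state, action) pairs start at 0
    for _ in range(episodes):
        state = 0
        while state != 3:
            if rng.random() < epsilon:
                action = rng.choice(actions)  # explore
            else:
                action = max(actions, key=lambda a: q[(state, a)])  # exploit
            next_state, reward = env_step(state, action)
            q_update(q, state, action, reward, next_state)
            state = next_state
    return q

q = train()
policy = {s: max(actions, key=lambda a: q[(s, a)]) for s in range(3)}
```

Note that the agent only calls `env_step` and observes the outcome; it never reads the transition rules, which is what "does not require a model of the environment" means here.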
Affinity Analysis
data analysis and data mining technique that discovers co-occurrence relationships among activities performed by (or recorded about) specific individuals or groups.
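A very small market-basket style sketch: count how often each pair of items appears together, then express one pair's count as a fraction of all baskets (its support). The baskets and the `pair_counts` helper are invented for the example.

```python
from collections import Counter
from itertools import combinations

def pair_counts(baskets):
    """Count how many baskets contain each unordered pair of items."""
    counts = Counter()
    for basket in baskets:
        # sorted(set(...)) removes duplicates and fixes the order within a pair
        for pair in combinations(sorted(set(basket)), 2):
            counts[pair] += 1
    return counts

# Hypothetical purchase records, one list per customer visit.
baskets = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["butter", "milk"],
    ["bread", "butter"],
    ["milk", "bread"],
]

counts = pair_counts(baskets)
support = counts[("bread", "milk")] / len(baskets)  # share of baskets with both
```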
Neural networks
interconnected neural cells. With experience, networks can learn, as feedback strengthens or inhibits connections that produce certain results. Computer simulations of neural networks show analogous learning.
Learning automata
an adaptive decision-making unit that repeatedly chooses an action from a finite set, receives feedback from a random environment, and updates its action probabilities to favor the actions that are rewarded most often.
Deep learning
machine learning that uses neural networks with many layers to identify relationships in data by modeling processes of the human brain
Forecasting
method for predicting how variables will change in the future
Clustering
organizing items into related groups (clusters) so that items in the same group are more similar to each other than to items in other groups
Natural language processing
processing that allows the computer to understand and react to statements and commands made in a "natural" language, such as English
Self-organizing maps
produces a low-dimensional (typically two-dimensional), discretized representation of the input space of the training samples
Bayesian statistics
statistics that involve a formula (Bayes' theorem) for calculating the probability of a hypothesis being true, taking into account relevant prior knowledge and the observed evidence
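The formula in question is Bayes' theorem, which combines a prior belief with new evidence. The sketch below applies it to a single yes/no hypothesis; the function name and the screening-test numbers are assumptions made up for the example.

```python
def posterior(prior, p_evidence_given_h, p_evidence_given_not_h):
    """Bayes' theorem: P(H | E) = P(E | H) * P(H) / P(E)."""
    # Total probability of seeing the evidence, whether or not H is true.
    evidence = p_evidence_given_h * prior + p_evidence_given_not_h * (1 - prior)
    return p_evidence_given_h * prior / evidence

# Hypothetical screening test: 1% of cases are true, the test catches 90%
# of them, and it wrongly flags 5% of the cases that are not true.
p = posterior(prior=0.01, p_evidence_given_h=0.90, p_evidence_given_not_h=0.05)
```

Here `p` comes out at roughly 0.15: even after a positive test, the low prior keeps the probability of the hypothesis well below one half.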
Machine Learning
the extraction of knowledge from data based on algorithms created from training data
Cognitive Computing
the use of artificial intelligence techniques and access to vast amounts of data to simulate human problem solving in complex situations with ambiguity, changing data, and even conflicting information
Semi-supervised learning
training data includes only a few desired outputs (labels) alongside many unlabeled examples (Image recognition/classification, speech recognition, webpage classification)
Singular Value Decomposition
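factorization of a matrix into three matrices (U, Σ, Vᵀ); keeping only the largest singular values gives a low-rank approximation, so it is used to reduce the number of terms in a matrix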