Introduction to Machine Learning
What is database mining?
Large datasets from growth of automation and web. For example: web click data, medical records, biology, engineering
What feedback can we get from unsupervised learning?
With unsupervised learning there is no feedback based on the prediction results. There is no real data we can predict on.
What are examples of Machine Learning?
Facebook photos, tagging, recommendations, marking junk emails as spam, etc. Machine learning is conducted through learning the algorithms.
Classification problems
Another way of supervised learning is through classification problems. Predicting whether something is classified this way based on the characteristics and size.
Regression problems
A way of supervised learning is also considered as a regression problem, which is predicting continuous valued output (price).
What is an example of classification?
An example can be a various real data of tumor sizes and which size is more susceptible to being malignant or not. Rationale: This is called classification, which has discrete valued output (0 or 1), or more 0,1,2,3 depending different types.
"A computer program is said to learn from experience E with respect to some task T and some performance measure P, if its performance on T, as measured by P, improves with experience E." Suppose your email program watches which emails you do or do not mark as spam, and based on that learns how to better filter spam. What is the task T in this setting? Classifying emails as spam or not spam. Watching you label emails as spam or not spam. The number (or fraction) of emails correctly classified as spam/not spam. None of the above—this is not a machine learning problem.
Classifying emails as spam or not spam is task T. Rationale: Watching you label emails as spam or not spam is task E. The number (or fraction) of emails correctly classified as spam/not spam is task P.
What is an example of clustering unsupervised learning?
Clustering - Take a collection of 1,000,000 different genes, and find a way to automatically group these genes into groups that are somehow similar or related by different variables, such as lifespan, location, roles, and so on.
What method can be used to predict the pricing on houses on a dataset?
Dataset prediction can be decided through a straight line of data or quadratic line through more data.
What's another example of a classification problem?
Given a patient with a tumor, we have to predict whether the tumor is malignant or benign.
Regression problem example
Given data about the size of houses on the real estate market, try to predict their price. Price as a function of size is a continuous output, so this is a regression problem.
Supervised Learning
In this learning, we're already given a data set and already know what the correct output should look like. Having the idea that there is a relationship between the input and the output.
What is machine learning based off of?
Machine learning is based off of AI (Artificial Intelligence). It is grown off the field of AI.
What is Arthur Samuel's definition of machine learning? (1959)
Machine learning is the field of study that gives computers the ability to learn without being explicitly programmed.
What is an example of non-clustering unsupervised learning?
Non-clustering - The "Cocktail Party Algorithm", allows you to find structure in a chaotic environment. (i.e. identifying individual voices and music from a mesh of sounds at a cocktail party).
What's an example of supervised learning?
Predicting price on house. Predicting the price of the house based on the square footage.
People cannot always write scripts and code for the machine to learn. What's a way for machines to process properly?
The only way to fix and find scripts/code/dataset is to let the machine learn on this own.
Classification problem example:
We could turn this example into a classification problem by instead of making our output about whether the house "sells for more or less than the asking price." Here we are classifying the houses based on price into two discrete categories.
What is Tom Mitchell's definition of machine learning? (1998)
Well-posed learning problem: A computer program is said to learn from experience E with respect to some task T and some performance measure P, if its performance on T, as measured by P, improves with experience E.
What's another example of a regression problem? Besides predicting prices and house sizes?
Given a picture of a person, we have to predict their age on the basis of a given picture.
What occurs in classification problems?
We are instead trying to predict results in a discrete output. In other words, we are trying to map input variables into discrete categories.
What occurs in regression problems?
In regression problems, we are trying to predict results within a continuous output, meaning that we are trying to map input variables into some continuous function.
Supervised learning is categorized into what problems?
Supervised learning is categorized into regression and classification problems.
How would you explain supervised learning?
Supervised learning is when you give the algorithm the right dataset. where the right answers were given. Every example determines the right price. Producing more of these right answers. These are predicted from actual right answers and right data.
Supervised learning problem You're running a company, and you want to develop learning algorithms to address each of two problems. Problem 1: You have a large inventory of identical items. You want to predict how many of these items will sell over the next 3 months. Problem 2: You'd like software to examine individual customer accounts, and for each account decide if it has been hacked/compromised. Should you treat these as classification or as regression problems? - Treat both as classification problems. - Treat problem 1 as a classification problem, problem 2 as a - regression problem. - Treat problem 1 as a regression problem, problem 2 as a classification problem. - Treat both as regression problems.
Treat problem 1 as a regression problem, problem 2 as a classification problem. (Correct )
What is Unsupervised Learning?
Unsupervised learning allows us to approach problems with little or no idea what our results should look like. We can derive this structure by clustering the data based on relationships among the variables in the data.