L:11 AI & ML
Key Points about Supervised Learning
- Each data point is tagged with the correct label. - Predicts future outcomes based on past data. - Requires both input and output variables. - Involves training and testing the model.
Weak AI: Artificial Narrow Intelligence (ANI)
- Requires training to perform a specific task. - Drives most of the AI that surrounds us today. - E.g., Apple's Siri, Uber's self-driving cars, Facebook's facial recognition, and Google Translate.
To train the model, the data set is split into two segments:
- training data - testing data.
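The split described above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed method: the function name `train_test_split`, the 80/20 ratio, and the fixed seed are all assumptions chosen for the example.

```python
import random


def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle a copy of rows and split it into (training, testing) lists."""
    shuffled = list(rows)
    # A fixed seed (an assumption here) makes the split reproducible.
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    # The first n_test shuffled rows become the test set, the rest train.
    return shuffled[n_test:], shuffled[:n_test]


data = list(range(10))
train, test = train_test_split(data)
print(len(train), len(test))
```

Shuffling before splitting keeps the two segments from inheriting any ordering in the original data, and the test rows are never seen during training.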
Test set
A set of unseen data used only to assess the performance of a fully-specified classifier.
Evaluate "Tune" the model
A confusion matrix is used to evaluate the trained model, i.e., to tell how well the trained model performs.
Choose the model
Select the model that will be best for the type of your data. The goal is to train the best performing model possible using the pre-processed data.
Artificial Intelligence (AI)
Systems or machines that mimic human intelligence to perform tasks and can iteratively improve themselves based on the information they gather
Features
The input variables that we give to our machine learning models
True positives (TP)
These are cases in which we predicted TRUE, and our predicted output is correct.
True negatives (TN)
We predicted FALSE, and our predicted output is correct.
False negatives (FN)
We predicted FALSE, but the actual output is TRUE.
False positives (FP)
We predicted TRUE, but the actual output is FALSE.
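The four outcomes defined above can be counted directly from paired lists of actual and predicted values. The sketch below assumes boolean labels; the function name `confusion_counts` and the sample data are made up for illustration.

```python
def confusion_counts(actual, predicted):
    """Count TP, TN, FP, FN for two equal-length lists of booleans."""
    counts = {"TP": 0, "TN": 0, "FP": 0, "FN": 0}
    for a, p in zip(actual, predicted):
        if p and a:
            counts["TP"] += 1  # predicted TRUE, actually TRUE
        elif not p and not a:
            counts["TN"] += 1  # predicted FALSE, actually FALSE
        elif p and not a:
            counts["FP"] += 1  # predicted TRUE, actually FALSE
        else:
            counts["FN"] += 1  # predicted FALSE, actually TRUE
    return counts


# Hypothetical labels, for illustration only.
actual = [True, True, False, False, True, False]
predicted = [True, False, False, True, True, False]
counts = confusion_counts(actual, predicted)
print(counts)
```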
Training set
The data from which the computer learns how to process information and tune the parameters of the classifier.
Clustering
A problem in which a set of inputs is to be divided into groups. Example: identifying sub-groups of patients with chronic cough (there are no ICD codes for chronic cough).
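As one possible illustration of dividing inputs into groups, here is a small one-dimensional k-means sketch. K-means is not named in these notes; it is swapped in as a common clustering technique, and the function name, starting centers, and data are assumptions for the example.

```python
def kmeans_1d(values, centers, iterations=10):
    """Group numbers around the nearest of the given starting centers."""
    centers = list(centers)
    groups = [[] for _ in centers]
    for _ in range(iterations):
        groups = [[] for _ in centers]
        # Assignment step: each value joins the group of its nearest center.
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            groups[nearest].append(v)
        # Update step: move each center to the mean of its group.
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers, groups


# Hypothetical measurements with two visible sub-groups.
values = [1, 2, 3, 10, 11, 12]
centers, groups = kmeans_1d(values, centers=[0, 5])
print(centers, groups)
```

No labels are supplied anywhere: the groups emerge only from how close the values are to each other.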
Classification
A problem in which the target variable is categorical. ▪ Examples: classifying an email as "spam" or "not spam"; classifying a patient's X-ray as showing "pneumonia" or "no pneumonia". ▪ Examples of classification algorithms: K-Nearest Neighbor, Support Vector Machine.
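A minimal K-Nearest Neighbor sketch shows how a categorical target can be predicted from labeled examples. The two numeric features, the value `k=3`, and the sample points are hypothetical; real spam detection would use far richer features.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Return the most common label among the k training points nearest to query."""
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Hypothetical 2-feature emails, e.g. (number of links, number of capitalized words).
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
labels = ["not spam", "not spam", "not spam", "spam", "spam", "spam"]

print(knn_predict(points, labels, (2, 2)))
print(knn_predict(points, labels, (8.5, 8.5)))
```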
Regression
A problem in which the target variable is continuous. ▪ Example: predicting the price of a house from features of the house such as its size. ▪ Examples of regression algorithms: Linear Regression, Decision Trees/Random Forest.
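For a single feature, linear regression can be sketched with the ordinary least-squares formulas. The function name `fit_line` and the house sizes and prices below are invented for illustration, with the prices chosen to lie exactly on a line.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: house size (square meters) and price (thousands).
sizes = [50, 80, 100, 120]
prices = [150, 240, 300, 360]

slope, intercept = fit_line(sizes, prices)
predicted_price = slope * 90 + intercept  # continuous prediction for a 90 m^2 house
print(slope, intercept, predicted_price)
```

Unlike classification, the prediction is a number on a continuous scale rather than one of a fixed set of labels.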
Machine Learning (ML)
A sub-field of AI that focuses on building systems that can learn without being explicitly programmed. It uses data and algorithms to imitate the way humans learn, gradually improving its accuracy based on the data it consumes.
Deep Learning
A subset of machine learning that learns representations of data using a hierarchy of multiple layers, loosely modeled on the neural networks of the human brain.
AI In Saudi Arabia
was established in 2019
The Challenges of Adopting Artificial Intelligence
▪ Building Trust ▪ Lack of skilled professionals ▪ Data Security ▪ Software Malfunction ▪ Data Scarcity ▪ Algorithm Bias
Deep Learning Application
▪ Cancer tumor detection ▪ Image coloring ▪ Object detection
Supervised learning is divided into two categories
▪ Classification ▪ Regression
pre-processing techniques include:
▪ Conversion of data ▪ Ignoring the missing values ▪ Filling the missing value ▪ Outliers' detection
Why AI Now?
▪ More Computing Power ▪ Availability of More data ▪ Better Algorithms
AI In Saudi Arabia developed several AI-based programs:
▪ Tawakkalna ▪ Tabaud ▪ Boroog ▪ Ehsan
Types of artificial intelligence
▪ Weak AI ▪ Strong AI
Outliers' detection
Detecting erroneous data points in the data set that deviate drastically from the other observations. For example: a value entered with a typing mistake.
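One simple way to flag values that deviate drastically from the rest is a z-score check, sketched below. The notes do not name a specific method, so the z-score rule, the threshold of 2, and the sample values are assumptions for illustration.

```python
from statistics import mean, pstdev


def find_outliers(values, threshold=2.0):
    """Return the values lying more than threshold standard deviations from the mean."""
    center = mean(values)
    spread = pstdev(values)
    if spread == 0:
        return []  # all values identical, nothing deviates
    return [v for v in values if abs(v - center) / spread > threshold]


# Hypothetical measurements where 120 was probably a mistyped 12.
readings = [10, 12, 11, 13, 12, 11, 10, 120]
outliers = find_outliers(readings)
print(outliers)
```

A flagged value is only a candidate error; whether to correct or remove it still depends on the data.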
Feature Extraction
The process of choosing relevant features for your machine learning model based on the type of problem you are trying to solve. Example: for a model that decides which students need extra tutorial sessions, choosing the student attributes that are relevant to that decision.
Gathering data
The process of gathering data depends on the type of project. The data set could be collected from databases, files, or sensors. The collected data cannot be used directly for analysis.
AI drives the Kingdom's advancement in artificial intelligence innovations:
• orchestrating AI research • developing AI solutions • enhancing AI education
Key Points about Unsupervised Learning
- The AI system is presented with unlabeled, uncategorized data, and the system's algorithms act on the data without prior training to discover patterns. - The system identifies hidden features in the input data provided. - An example of unsupervised learning is "Clustering".
A confusion matrix has 4 parameters
- True positives - True negatives - False positives - False negatives
Strong AI: Artificial General Intelligence (AGI)
- A theoretical form of AI in which a machine would have intelligence equal to that of humans. - Would have a self-aware consciousness with the ability to solve problems, learn, and plan for the future. - Could surpass the intelligence and ability of the human brain. - Does not exist yet.
Machine Learning Processes
1- Gathering data 2- Data Pre-processing 3- Choose the model 4- Evaluate the model 5- Deploy the model
ML Types
1- Supervised Learning 2- Unsupervised Learning
Accuracy
= (TP + TN) / (TP + TN + FP + FN)
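The accuracy formula translates directly into code. The counts used below are made-up numbers, and the function name `accuracy` is an assumption for the example.

```python
def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that were correct."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("no predictions to score")
    return (tp + tn) / total


# Hypothetical confusion-matrix counts for 100 predictions.
score = accuracy(tp=50, tn=40, fp=5, fn=5)
print(score)  # 90 correct out of 100
```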
Deploy the model
After the model evaluation phase is done, the resulting "trained" model can be put to use.
Unsupervised Learning
Discovering patterns in unlabeled data. Example: clustering similar documents based on their text content.
Supervised Learning
Learning with a labeled training set. Example: an email spam detector trained on a set of already labeled emails.
Conversion of data
ML models handle only numeric features, so categorical and ordinal data must be converted into numeric form.
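Two common ways to make this conversion are an integer mapping for ordinal data and one-hot encoding for categorical data. Neither technique is named in these notes; both are shown as assumptions, and the category names and the helper `one_hot` are hypothetical.

```python
def one_hot(values):
    """Encode each categorical value as a list of 0/1 flags, one per category."""
    categories = sorted(set(values))
    encoded = [[1 if v == c else 0 for c in categories] for v in values]
    return categories, encoded


# Ordinal data has a natural order, so an integer mapping preserves it.
size_order = {"small": 0, "medium": 1, "large": 2}  # hypothetical ordering
sizes = ["small", "large", "medium"]
encoded_sizes = [size_order[s] for s in sizes]

# Categorical data has no order, so each category gets its own 0/1 column.
colors = ["red", "green", "red"]
categories, encoded_colors = one_hot(colors)

print(encoded_sizes)
print(categories, encoded_colors)
```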
Data Pre-processing
The most important step, which helps build more accurate machine learning models. It is the process of cleaning the raw data and converting it into a clean data set.
Ignoring the missing values
Removing the rows or columns of data that have missing values.
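Both variants, dropping rows and dropping columns, can be sketched on a small list of records. The sketch assumes missing values are stored as `None`; the field names and function names are hypothetical.

```python
def drop_missing_rows(rows):
    """Keep only the rows in which every field has a value."""
    return [r for r in rows if all(v is not None for v in r.values())]


def drop_missing_columns(rows):
    """Keep only the columns that have a value in every row."""
    if not rows:
        return []
    keep = [k for k in rows[0] if all(r.get(k) is not None for r in rows)]
    return [{k: r[k] for k in keep} for r in rows]


# Hypothetical records; None marks a missing value.
rows = [
    {"id": 1, "age": 25, "income": 3000},
    {"id": 2, "age": None, "income": 4200},
    {"id": 3, "age": 41, "income": 3900},
]

print(drop_missing_rows(rows))
print(drop_missing_columns(rows))
```

Dropping rows loses one record here, while dropping columns loses the whole `age` feature, so the better choice depends on how much data is missing and where.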
Artificial Intelligence (AI)
A set of computer science techniques that allow computer software to learn from experience, adapt to new inputs, and complete tasks that resemble human intelligence.
Filling the missing value
Filling in the missing data manually, e.g., with the mean, the median, or the most frequent value.
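Filling can be sketched as computing one replacement value from the entries that are present and substituting it wherever a value is missing. The sketch assumes missing values are stored as `None`; the function name `fill_missing` and the sample ages are hypothetical.

```python
from statistics import mean, median


def fill_missing(values, strategy=mean):
    """Replace None entries with strategy(present values), e.g. mean or median."""
    present = [v for v in values if v is not None]
    fill_value = strategy(present)
    return [fill_value if v is None else v for v in values]


# Hypothetical column with two missing entries.
ages = [20, None, 30, 30, None, 60]

filled_with_mean = fill_missing(ages, mean)
filled_with_median = fill_missing(ages, median)
print(filled_with_mean)
print(filled_with_median)
```

Because `strategy` is just a function, other choices such as `max` or `statistics.mode` can be passed in the same way.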