3. Feature engineering
What is the goal of feature engineering?
- Convert unstructured data into input for a learning algorithm
- Expose the structure of the concept to the learning algorithm
- Work well with the structure of the model
- Balance the number of features, the complexity of the concept, the complexity of the model, and the amount of data
Four types of supervised feature selection
- Filter
- Wrapper
- Embedded
- Hybrid
Benefits of feature engineering
- Improving the accuracy of the ML model
- Solving the overfitting problem
- Speeding up computation
- Making the ML process more understandable
What are the 2 ways of feature extraction?
- Manual feature extraction
- Automated feature extraction
Compare the filter and wrapper methods.
- Wrapper methods are computationally more expensive than filter methods due to the repeated learning steps and cross-validation
- However, wrapper methods are more accurate than filter methods
The objective of variable selection is threefold
- Improving the prediction performance of predictors
- Providing faster and more cost-effective predictors
- Providing a better understanding of the underlying process that generated the data
To keep 'relevant features only', we remove the features that are?
- Non-informative
- Non-discriminative
- Redundant
Supervised feature selection before training the model
- Statistical method: removing features with low variance
- Filter method: univariate feature selection
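A minimal scikit-learn sketch of both ideas; the iris dataset and the thresholds are illustrative assumptions:

```python
# Filter-style selection before any model is trained.
from sklearn.datasets import load_iris
from sklearn.feature_selection import VarianceThreshold, SelectKBest, f_classif

X, y = load_iris(return_X_y=True)

# Statistical method: drop features whose variance falls below a threshold.
X_high_var = VarianceThreshold(threshold=0.2).fit_transform(X)

# Filter method: univariate selection, keeping the k features with the
# highest ANOVA F-score against the target.
X_best = SelectKBest(score_func=f_classif, k=2).fit_transform(X, y)

print(X_high_var.shape, X_best.shape)
```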
Before applying the wrapper feature selection, we must specify?
- What model type and which learning algorithms will be used
- How to evaluate model accuracy
Supervised feature selection while training the model
- Wrapper method: recursive feature elimination
- Embedded method: L1-based feature selection
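A minimal sketch of both, assuming scikit-learn; the estimator and hyperparameters are illustrative, not prescribed:

```python
# Selection that happens while a model is being trained.
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import RFE, SelectFromModel
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
X = StandardScaler().fit_transform(X)  # L1 and RFE are scale-sensitive

# Wrapper method: recursive feature elimination around a classifier.
rfe = RFE(LogisticRegression(max_iter=5000), n_features_to_select=5).fit(X, y)
print("RFE kept:", rfe.support_.sum(), "features")

# Embedded method: L1 regularization drives weak coefficients to zero;
# SelectFromModel keeps only the features with non-zero weights.
l1 = LogisticRegression(penalty="l1", solver="liblinear", C=0.1).fit(X, y)
print("L1 kept:", SelectFromModel(l1, prefit=True).transform(X).shape[1], "features")
```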
Filter supervised feature selection methodology
1. Create groups of the features per different criteria
2. Create a benchmark for each group
3. Test correlation inside the group against the benchmark
4. Keep only features less correlated to each other than to the group benchmark
The backward elimination technique begins by considering (BLANK) of the features and removes the least significant feature at each step
All
Feature
An individual measurable property or characteristic of a phenomenon being observed
Exhaustive feature selection is one of the best feature selection methods; it evaluates each feature subset in (BLANK) fashion
Brute-force
How can Information gain be used in feature selection?
Calculating the information gain of each variable with respect to the target variable
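A sketch of that ranking, assuming scikit-learn's mutual-information estimator as the measure of information gain:

```python
# Rank each variable by its information gain (mutual information)
# with respect to the target variable.
from sklearn.datasets import load_iris
from sklearn.feature_selection import mutual_info_classif

X, y = load_iris(return_X_y=True)
gains = mutual_info_classif(X, y, random_state=0)
for idx, gain in sorted(enumerate(gains), key=lambda p: p[1], reverse=True):
    print(f"feature {idx}: information gain ~ {gain:.3f}")
```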
The chi-square test is a technique to determine the relationship between (BLANK) variables
Categorical
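A sketch of a chi-square filter, assuming scikit-learn; note that chi2 requires non-negative feature values:

```python
# Chi-square filter: score each (non-negative) feature against the
# categorical target and keep the k highest-scoring ones.
from sklearn.datasets import load_iris
from sklearn.feature_selection import SelectKBest, chi2

X, y = load_iris(return_X_y=True)  # all feature values are non-negative
selector = SelectKBest(score_func=chi2, k=2).fit(X, y)
print("chi2 scores:", selector.scores_)
print("selected feature indices:", selector.get_support(indices=True))
```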
The performance of the wrapper method depends on the (BLANK)
Classifier
Features should be (BLANK) with the target but (BLANK) among themselves
Correlated, uncorrelated
If the correlation coefficient crosses a certain threshold value, we can?
Drop one of the features
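A sketch of that rule with pandas; the 0.9 threshold is an arbitrary assumption:

```python
# For each pair of features whose absolute correlation exceeds the
# threshold, drop one of the two.
import numpy as np
import pandas as pd
from sklearn.datasets import load_breast_cancer

data = load_breast_cancer()
df = pd.DataFrame(data.data, columns=data.feature_names)

corr = df.corr().abs()
# Keep only the upper triangle so each pair is considered once.
upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
to_drop = [col for col in upper.columns if (upper[col] > 0.9).any()]
df_reduced = df.drop(columns=to_drop)
print(f"dropped {len(to_drop)} of {df.shape[1]} features")
```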
Forward selection is an iterative process, which begins with a/n (BLANK) set of features
Empty
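A sketch covering this card and the backward-elimination card above, assuming scikit-learn's SequentialFeatureSelector; the estimator and target size are illustrative:

```python
# Forward selection grows from an empty feature set; backward
# elimination shrinks from the full set, dropping the least
# significant feature at each step.
from sklearn.datasets import load_iris
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
knn = KNeighborsClassifier(n_neighbors=3)

forward = SequentialFeatureSelector(
    knn, n_features_to_select=2, direction="forward", cv=5).fit(X, y)
backward = SequentialFeatureSelector(
    knn, n_features_to_select=2, direction="backward", cv=5).fit(X, y)
print("forward kept:", forward.get_support(indices=True))
print("backward kept:", backward.get_support(indices=True))
```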
In wrapper supervised feature selection, a predictive model is used to (BLANK) and (BLANK)
Evaluate a combination of features, assign model performance scores
T or F: The wrapper method is less expensive than the filter method
False. Wrapper is computationally expensive
(BLANK) is the science and art of extracting information from raw data
Feature engineering
(BLANK) yields better results than applying ML directly to the raw data
Feature extraction
(BLANK) returns the rank of each variable on the (BLANK) criterion in descending order. Then we can select the variables with the largest scores
Fisher's score, Fisher's
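A NumPy sketch under the common definition of Fisher's criterion (between-class scatter over within-class scatter); this is an illustrative implementation, not a library call:

```python
# Fisher's score per feature: how far apart the class means are,
# relative to the variance within each class.
import numpy as np
from sklearn.datasets import load_iris

X, y = load_iris(return_X_y=True)
overall_mean = X.mean(axis=0)

between = np.zeros(X.shape[1])
within = np.zeros(X.shape[1])
for c in np.unique(y):
    Xc = X[y == c]
    between += len(Xc) * (Xc.mean(axis=0) - overall_mean) ** 2
    within += len(Xc) * Xc.var(axis=0)

fisher = between / within
# Rank variables on Fisher's criterion in descending order.
print("ranked feature indices:", np.argsort(fisher)[::-1])
```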
Recursive feature elimination is a recursive (BLANK) approach, where features are selected by recursively considering smaller and smaller subsets of features.
Greedy optimization
The wrapper method is not recommended for a (BLANK) number of features
High
(BLANK) determines the reduction in entropy while transforming the dataset
Information gain
The aim of feature selection is (BLANK)
Maximize relevance and minimize redundancy
(BLANK) is one example of wrapper and embedded feature selection
Random forest
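A sketch of the embedded use, assuming scikit-learn; the forest size and threshold are arbitrary assumptions:

```python
# A random forest is trained as usual, and its feature importances
# double as an embedded selection criterion.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectFromModel

X, y = load_breast_cancer(return_X_y=True)
rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)
selector = SelectFromModel(rf, threshold="median", prefit=True)
print("kept", selector.transform(X).shape[1], "of", X.shape[1], "features")
```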
What is a drawback of the low variance filter?
Relationships between features, or between a feature and the target variable, are not taken into account
Feature engineering is a (BLANK) problem
Representation
The wrapper methodology considers the (BLANK) sets as a search problem, where different combinations are prepared, evaluated, and compared to other combinations
Selection of feature
High correlation between two features means?
They have similar trends and are likely to carry similar information
In the missing value ratio method, a predefined (BLANK) may be set. For features with a low ratio of missing values, an (BLANK) technique may need to be applied
Threshold, imputation
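A sketch of both steps with pandas and scikit-learn; the toy data and the 0.4 threshold are arbitrary assumptions:

```python
# Drop features whose missing-value ratio exceeds the threshold,
# then impute the remaining low-ratio features instead of dropping them.
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer

df = pd.DataFrame({
    "a": [1.0, np.nan, 3.0, 4.0, np.nan],     # 40% missing
    "b": [np.nan, np.nan, np.nan, 4.0, 5.0],  # 60% missing
    "c": [1.0, 2.0, 3.0, np.nan, 5.0],        # 20% missing
})

ratio = df.isna().mean()
df = df.drop(columns=ratio[ratio > 0.4].index)
df[:] = SimpleImputer(strategy="mean").fit_transform(df)
print(df)
```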
T or F: Manual feature extraction can be impractical for huge datasets and may require a good understanding of the background or domain.
True
T or F: Filter supervised feature selection does not depend on the learning algorithm.
True
T or F: The best subset of features is selected based on the results of the classifier.
True
T or F: Too many variables lead to slow computation, which in turn requires more memory and hardware.
True
T or F: Wrapper methods perform better than filter methods.
True
Feature selection is also called (BLANK) or (BLANK) or (BLANK)
Variable selection, attribute selection, dimensionality reduction
Embedded feature selection performs better than wrapper and filter methods. Why?
Because it makes a collective decision
The classifier's performance will usually (BLANK) for a large number of features
degrade
The required number of samples to achieve the same accuracy grows (BLANK) with the number of variables
exponentially
In the embedded method, there are ensemble learning and (BLANK) learning methods for feature selection
hybrid
Manual feature extraction requires (BLANK) and (BLANK) the features that are relevant for a given problem and implementing a way to extract those features
identifying, describing
Removing redundant data variables helps to (BLANK)
improve accuracy
As a dimensionality reduction technique, feature selection aims to choose a small subset of relevant features from the original features by removing (BLANK)
irrelevant, redundant, or noisy features
Feature selection can lead to better ?
learning performance, higher learning accuracy, lower computational cost, and better model interpretability
Embedded feature selection is computationally (BLANK) than wrapper methods. However, this method has a drawback: the selection is specific to the learning model
less intensive
Missing value ratio removes the features which have high ratio of (BLANK)
missing values
What is the advantage of the filter method?
It needs little computational time and does not overfit the data
Too many variables might result in (BLANK), which means the model is not able to generalize the pattern
overfitting
Inclusion of a relevant variable has a (BLANK) effect on model accuracy
positive
The main priority in hybrid feature selection
Select the methods, then follow their processes
Automated feature extraction uses (BLANK) or (BLANK) to extract features automatically from signals or images without the need for human intervention
specialized algorithms or deep networks
In the filter method, features are selected using (BLANK)
statistical measures
Feature extraction
The process of transforming raw data into numerical features that can be processed while preserving the information in the original dataset
The process of creating hybrid feature selection methods depends on (BLANK)
what you choose to combine