Summaries for machine learning course (3)
Feature scaling:
When different features take on very different ranges of values, gradient descent can run slowly. By rescaling the features so that they all take on a comparable range of values, gradient descent can run much faster.
How?
Common scaling methods:
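As a minimal sketch (the feature values below are illustrative, not from the course), the three most common approaches — dividing by the maximum, mean normalization, and z-score normalization — can be written with NumPy as follows:

```python
import numpy as np

# Toy design matrix: each column is one feature, and the two columns
# have very different ranges (e.g. house size in sq. ft. vs. bedrooms).
X = np.array([[2104.0, 5.0],
              [1416.0, 3.0],
              [ 852.0, 2.0]])

# 1) Max scaling: divide each feature by its maximum -> values in (0, 1].
X_max = X / X.max(axis=0)

# 2) Mean normalization: center on the mean, divide by the range (max - min).
X_mean = (X - X.mean(axis=0)) / (X.max(axis=0) - X.min(axis=0))

# 3) Z-score normalization: subtract the mean, divide by the standard deviation.
X_zscore = (X - X.mean(axis=0)) / X.std(axis=0)
```

After any of these transformations, every column ends up in a comparable range, which is what lets gradient descent take larger, better-conditioned steps.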
For feature scaling, if there are multiple features, should I use the same scaling method for all of them?
When you have multiple features in your dataset, it is generally recommended to use the same scaling method for all features to maintain consistency and comparability between them. Here's why and how to approach it:
Why Use the Same Scaling Method?
- Uniformity: Different scaling methods can transform features into ranges or distributions that may not align well, potentially confusing machine learning models.
- Model Sensitivity: Many models (e.g., distance-based models like k-Nearest Neighbors (k-NN), Support Vector Machines (SVM), or Principal Component Analysis (PCA)) assume that features are on the same scale. Using different scaling methods might bias the model toward features with a wider range.
- Interpretability: Having a consistent scaling method ensures that all transformed features are comparable, which aids in model interpretation and debugging.
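A minimal sketch of this advice, assuming scikit-learn is available: fit a single scaler on the training data, apply it to every feature, and reuse the same fitted transformation on the test data.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Toy data: two features on very different scales (illustrative values).
X_train = np.array([[2104.0, 5.0],
                    [1416.0, 3.0],
                    [ 852.0, 2.0]])
X_test = np.array([[1200.0, 3.0]])

# One scaler, fit on the training data only, applied uniformly to all
# features, then reused unchanged on the test data.
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)
```

Fitting once and reusing the same scaler keeps every feature on the same footing and avoids leaking test-set statistics into training.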
Exceptions
- Heterogeneous Features: If your features are of different types (e.g., one is a count and another is a percentage), consider whether scaling them differently makes sense based on domain knowledge. For example:
- A feature measured in dollars might use logarithmic scaling to reduce skewness.
- A binary feature might not need scaling at all.
- Sparse Data: If some features are sparse (e.g., encoded categorical variables), scaling might affect sparsity and could require a tailored approach, such as preserving the 0 values while scaling non-zero values (see the sketch after this list).
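As a hedged illustration of these exceptions (the column names below are hypothetical, not from the course): a skewed monetary feature can be log-transformed, a binary feature left untouched, and a sparse feature scaled with something like scikit-learn's MaxAbsScaler, which keeps zero entries at zero.

```python
import numpy as np
from scipy import sparse
from sklearn.preprocessing import MaxAbsScaler

# Hypothetical features: a skewed dollar amount, a binary flag, and a
# sparse count column.
dollars = np.array([120.0, 5400.0, 89.0, 250000.0])
is_member = np.array([0, 1, 1, 0])  # binary: no scaling needed
counts = sparse.csr_matrix(np.array([[0.0], [3.0], [0.0], [12.0]]))

# Log-transform the skewed monetary feature to reduce skewness.
dollars_scaled = np.log1p(dollars)

# MaxAbsScaler divides by the maximum absolute value, so zeros stay zero
# and the sparsity pattern of the matrix is preserved.
counts_scaled = MaxAbsScaler().fit_transform(counts)
```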