Machine learning algorithms:
Supervised learning: used most in real-world applications
Unsupervised learning
Recommender systems
Reinforcement learning
(Knowing the tools vs. knowing how to apply them)
1. Supervised learning:
X to Y
input to output/label
learns from being given "right answers"
Algorithms:
1.1 Regression: predict house price vs. size
1.1.1 Terminology:
x = input variable / feature
y = output variable / target
m = number of training examples
(x, y) = single training example
Model: f_w,b(x) = w*x + b
Cost function: J(w,b) = (1/2m) * sum over i of (f_w,b(x^(i)) - y^(i))^2; the extra 1/2 is added so the derivative looks neater.
Goal: find w and b to minimize J(w,b) (a small code sketch of the cost follows below).
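A minimal NumPy sketch of this cost function, using made-up toy numbers (house sizes and prices) for illustration:

```python
import numpy as np

def compute_cost(x, y, w, b):
    """Squared error cost J(w, b) for the model f(x) = w*x + b."""
    m = x.shape[0]
    predictions = w * x + b                     # f_w,b(x^(i)) for all m examples
    return np.sum((predictions - y) ** 2) / (2 * m)

x_train = np.array([1.0, 2.0])                  # size in 1000 sqft (toy data)
y_train = np.array([300.0, 500.0])              # price in $1000s (toy data)
print(compute_cost(x_train, y_train, w=200.0, b=100.0))   # 0.0: this (w, b) fits the data exactly
print(compute_cost(x_train, y_train, w=100.0, b=100.0))   # > 0: worse fit, higher cost
```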
1.1.2 Train the model with gradient descent
Gradient descent is an optimization algorithm that minimizes a function by iteratively moving in the direction of steepest descent, i.e. the negative of the gradient. It is widely used in machine learning and deep learning to minimize the cost (loss) function and optimize the parameters of a model.
J will not always be bowl-shaped (convex), so it may have more than one local minimum.
learning rate: alpha (a small positive number, typically between 0 and 1)
derivative term: the partial derivative of J with respect to the parameter being updated
w and b are updated as follows (both simultaneously):
    w = w - alpha * dJ(w,b)/dw
    b = b - alpha * dJ(w,b)/db
Repeat until the algorithm converges. Convergence means reaching a local minimum where the parameters w and b no longer change much with each additional step.
Remember that alpha, the learning rate, is always positive.
The derivative term is equal to the slope in this one-parameter example.
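A minimal sketch of one such update step, assuming the squared error cost and a single-feature linear model; the key detail is that both derivatives are computed before either parameter changes (simultaneous update):

```python
import numpy as np

def gradient_step(x, y, w, b, alpha):
    """One simultaneous gradient descent update for f(x) = w*x + b."""
    m = x.shape[0]
    err = (w * x + b) - y            # f(x^(i)) - y^(i) for every example
    dj_dw = np.sum(err * x) / m      # partial derivative of J with respect to w
    dj_db = np.sum(err) / m          # partial derivative of J with respect to b
    tmp_w = w - alpha * dj_dw        # both derivatives use the OLD w and b ...
    tmp_b = b - alpha * dj_db
    return tmp_w, tmp_b              # ... then both parameters are updated together
```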
1.1.3 How do you choose a learning rate?
If the learning rate is too small, gradient descent may be slow.
If the learning rate is too large, gradient descent may:
--overshoot and never reach the minimum.
--fail to converge, or even diverge.
Gradient descent can still reach a local minimum with a fixed learning rate, because as it gets near a local minimum (see the toy example below):
--- the derivative (the slope in the example) becomes smaller
--- the update steps become smaller
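A toy illustration of those last two points (my own example, not from the lecture): minimizing J(w) = w^2, whose derivative is 2w, with a fixed learning rate.

```python
w, alpha = 10.0, 0.1
for i in range(5):
    step = alpha * (2 * w)        # step size = alpha * derivative; shrinks as w approaches the minimum
    w = w - step
    print(f"iteration {i}: step = {step:.3f}, w = {w:.3f}")
# steps: 2.000, 1.600, 1.280, 1.024, 0.819 -> smaller each time, even though alpha is fixed
```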
1.1.4 Example: Gradient descent for linear regression

The squared error cost function has a single global minimum, because of its bowl shape; such a function is also called a convex function.
Some other cost functions may have more than one local minimum.
1.1.5 Example: Batch gradient descent for a linear regression model. "Batch" means every update step uses the entire training set; other variants of gradient descent use subsets of the data.
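Putting the pieces together, a small end-to-end sketch of batch gradient descent for linear regression; the toy data, learning rate, and iteration count are my own choices for illustration:

```python
import numpy as np

def batch_gradient_descent(x, y, alpha, num_iters):
    """Fit f(x) = w*x + b by batch gradient descent on the squared error cost."""
    m = x.shape[0]
    w, b = 0.0, 0.0
    for _ in range(num_iters):
        err = (w * x + b) - y
        dj_dw = np.sum(err * x) / m         # every step uses ALL m examples ("batch")
        dj_db = np.sum(err) / m
        w, b = w - alpha * dj_dw, b - alpha * dj_db
    return w, b

x_train = np.array([1.0, 1.7, 2.0, 2.5, 3.0, 3.2])              # size in 1000 sqft (toy data)
y_train = np.array([250.0, 300.0, 480.0, 430.0, 630.0, 730.0])  # price in $1000s (toy data)
w, b = batch_gradient_descent(x_train, y_train, alpha=0.01, num_iters=10000)
print(f"w = {w:.1f}, b = {b:.1f}")
print(f"predicted price for a 1200 sqft house: {w * 1.2 + b:.1f} (thousand dollars)")
```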
1.2 Classification: predict categories, e.g. breast cancer detection: benign vs. malignant
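Classification algorithms come later in the course; purely as an illustration of the task, here is a sketch using scikit-learn's built-in breast cancer dataset and logistic regression (scikit-learn is not part of these notes, just a convenient library for the example):

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Benign vs. malignant tumors: a binary classification (two-category) problem.
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = make_pipeline(StandardScaler(), LogisticRegression()).fit(X_train, y_train)
print("test accuracy:", clf.score(X_test, y_test))   # fraction of tumors classified correctly
```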
2. Unsupervised learning: find something interesting in unlabeled data. The data comes only with inputs x, not output labels y; the algorithm has to find structure in the data on its own.
vs. supervised learning, which learns from data labeled with the right answers.
Algorithms:
2.1 Clustering: group similar data points together, e.g. Google News, DNA microarray (see the k-means sketch after this list)
2.2 Anomaly detection: find unusual data points, e.g. fraud detection
2.3 Dimensionality reduction: compress data using fewer numbers
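A minimal sketch of the clustering idea using scikit-learn's KMeans; the 2-D toy points are made up so the two groups are obvious:

```python
import numpy as np
from sklearn.cluster import KMeans

# Six unlabeled 2-D points forming two obvious groups; k-means finds them without labels.
X = np.array([[1.0, 1.1], [0.9, 0.8], [1.2, 1.0],
              [8.0, 8.2], [7.9, 7.7], [8.3, 8.1]])

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print("cluster assignments:", kmeans.labels_)        # e.g. [0 0 0 1 1 1]
print("cluster centers:\n", kmeans.cluster_centers_)
```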
Reference:
Code/class notes: https://github.com/greyhatguy007/Machine-Learning-Specialization-Coursera/tree/main/C1%20-%20Supervised%20Machine%20Learning%20-%20Regression%20and%20Classification