machine learning - Understanding code w.r.t. Logistic Regression using gradient descent


I am following Siraj Raval's videos on logistic regression using gradient descent:

1) Longer video: https://www.youtube.com/watch?v=xdm6er7ztlk&t=2686s

2) Shorter video: https://www.youtube.com/watch?v=xrjcoz3afyy&list=pl2-dafemk2a7mu0bskscgmjemeddu_h4d

In the videos he talks about using gradient descent to reduce the error for a set number of iterations so that the function converges (the slope becomes zero), and he illustrates the process via code. The following are the 2 main functions from the code:

    from numpy import array

    def step_gradient(b_current, m_current, points, learningrate):
        b_gradient = 0
        m_gradient = 0
        n = float(len(points))
        for i in range(0, len(points)):
            x = points[i, 0]
            y = points[i, 1]
            b_gradient += -(2/n) * (y - ((m_current * x) + b_current))
            m_gradient += -(2/n) * x * (y - ((m_current * x) + b_current))
        new_b = b_current - (learningrate * b_gradient)
        new_m = m_current - (learningrate * m_gradient)
        return [new_b, new_m]

    def gradient_descent_runner(points, starting_b, starting_m, learning_rate, num_iterations):
        b = starting_b
        m = starting_m
        for i in range(num_iterations):
            b, m = step_gradient(b, m, array(points), learning_rate)
        return [b, m]

    # the above functions are called as below:
    learning_rate = 0.0001
    initial_b = 0  # initial y-intercept guess
    initial_m = 0  # initial slope guess
    num_iterations = 1000
    [b, m] = gradient_descent_runner(points, initial_b, initial_m, learning_rate, num_iterations)
    # code taken from Siraj Raval's GitHub page
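To see the update rule from the code above converge end-to-end, here is a minimal self-contained sketch. The noisy linear data, the seed, and the iteration count are my own assumptions for illustration, not from the video:

```python
import numpy as np

def step_gradient(b_current, m_current, points, learning_rate):
    # one gradient-descent update for y = m*x + b under mean squared error
    b_gradient = 0.0
    m_gradient = 0.0
    n = float(len(points))
    for i in range(len(points)):
        x, y = points[i, 0], points[i, 1]
        b_gradient += -(2 / n) * (y - (m_current * x + b_current))
        m_gradient += -(2 / n) * x * (y - (m_current * x + b_current))
    return [b_current - learning_rate * b_gradient,
            m_current - learning_rate * m_gradient]

# synthetic points scattered around the line y = 2x + 1 (my own test data)
rng = np.random.default_rng(0)
xs = rng.uniform(0, 10, 100)
ys = 2 * xs + 1 + rng.normal(0, 0.1, 100)
points = np.column_stack([xs, ys])

b, m = 0.0, 0.0
for _ in range(10000):
    b, m = step_gradient(b, m, points, 0.001)
print(b, m)  # should approach b ≈ 1, m ≈ 2
```

Note that b and m keep receiving updates on every iteration, but once the fit is good the gradients are tiny, so the later updates barely change anything.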

Why do the values of b & m continue to update over all the iterations? After some number of iterations the function should converge, i.e. we find the values of b & m that give slope = 0.

So why do we continue iterating after that point and keep updating b & m? That way, aren't we losing the 'correct' b & m values? How is the learning rate helping the convergence process if we keep updating the values after converging? In short, why is there no check for convergence, and how does this still work?
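You could of course add such a check yourself. Here is a sketch of the same loop with early stopping when the gradient is numerically close to zero; the function name and the tolerance value are my own choices, not part of the video's code:

```python
import numpy as np

def gradient_descent_with_stop(points, b, m, learning_rate, max_iterations, tol=1e-8):
    # stops as soon as both partial derivatives are below tol,
    # instead of always spending the full iteration budget
    n = float(len(points))
    for it in range(max_iterations):
        b_grad = 0.0
        m_grad = 0.0
        for i in range(len(points)):
            x, y = points[i, 0], points[i, 1]
            b_grad += -(2 / n) * (y - (m * x + b))
            m_grad += -(2 / n) * x * (y - (m * x + b))
        if abs(b_grad) < tol and abs(m_grad) < tol:
            return b, m, it          # converged: gradient is effectively zero
        b -= learning_rate * b_grad
        m -= learning_rate * m_grad
    return b, m, max_iterations      # budget exhausted before reaching the tolerance

# tiny exact-fit example: points lying exactly on y = 3x
points = np.array([[1.0, 3.0], [2.0, 6.0], [3.0, 9.0]])
b, m, iters = gradient_descent_with_stop(points, 0.0, 0.0, 0.05, 100000)
```

On noiseless data like this the loop stops well before the budget is used up; on real data the gradient only shrinks toward zero, which is why a fixed iteration count plus a small learning rate is a common, simple alternative.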

In practice, you will not reach a slope of 0 exactly. Think of the loss function as a bowl. If the learning rate is too high, it is possible to overshoot the lowest point of the bowl. Conversely, if the learning rate is too low, learning becomes slow and you won't reach the lowest point of the bowl before the iterations are done.

That's why, in machine learning, the learning rate is an important hyperparameter to tune.
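To make the bowl picture concrete, here is a small numerical sketch of my own (not from the video) minimizing f(x) = x², whose gradient is 2x, with three different learning rates:

```python
def minimize_parabola(learning_rate, steps, x=10.0):
    # gradient descent on f(x) = x^2; the minimum is at x = 0
    for _ in range(steps):
        x = x - learning_rate * 2 * x
    return x

print(minimize_parabola(0.1, 100))     # well-chosen: ends very close to 0
print(minimize_parabola(0.0001, 100))  # too low: has barely moved after 100 steps
print(minimize_parabola(1.1, 100))     # too high: overshoots past 0 and diverges
```

Each update multiplies x by (1 - 2·learning_rate), so a moderate rate shrinks x toward the bottom of the bowl, a tiny rate shrinks it too slowly to get there in the given steps, and a rate above 1 makes |x| grow on every step.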

