
Batch Gradient Descent

Supervised

Difficulty: 5 | Problem written by ankita

Problem reported in interviews at

Amazon
Apple
Facebook
Google
Netflix

Gradient descent is an optimization algorithm used to find a local minimum of a differentiable function. In machine learning, it minimizes a model's cost function by iteratively updating the model's weights.

At each step, the gradient descent algorithm multiplies the gradient by a learning rate and subtracts the result from the current point to determine the next point in the process of reaching a local minimum.
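
In symbols, one common way to write a single update step (using \(\alpha\) for the learning rate) is:

\(W_{new} = W_{old} - \alpha\cdot\nabla_{W}Loss(W_{old})\)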

In batch gradient descent, the error is calculated for each example in the training dataset and the parameters are updated only after all training examples have been evaluated once. 

In this problem, we expect you to implement the batch gradient descent algorithm manually. The cost/loss function is the squared-error loss for linear regression:

\(Loss = \tfrac{1}{2}\sum_{i=1}^{n}(y_{i}-f(x_{i}))^{2}\)
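
For reference, with predictions coming from the linear model \(y_{p} = XW\) (where X already includes the column of ones described below), differentiating this loss with respect to W gives

\(\nabla_{W}Loss = X^{T}\cdot(XW - y) = X^{T}\cdot(y_{p} - y)\)

which is exactly the derivative used in step 3 of the algorithm below.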

You have to initialize all the weights to zero in order to match the expected output.

Input:
x: an array of training examples
y: an array of output corresponding to each training example
lr: the learning rate for the algorithm
iter: number of iterations the algorithm will perform

Output:
A list of the updated weights after every iteration. Do not include the initial W, which is an array of zeros.

The first element of W should be w0, i.e., if Y = wX + w0, then W = [w0, elements of w]
To obtain this W, prepend a column of ones to X (i.e., stack a column of ones at the beginning of X)
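
For example, one way to prepend the column of ones with numpy (a small sketch; the toy values and variable names are only illustrative):

    import numpy as np

    x = [[0.2, 0.5], [0.7, 0.1]]              # toy training examples
    X = np.array(x, dtype=float)
    # Prepend a column of ones so the first weight acts as the intercept w0.
    X = np.hstack([np.ones((X.shape[0], 1)), X])
    print(X)
    # [[1.  0.2 0.5]
    #  [1.  0.7 0.1]]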

Algorithm:

1. Initialize W to zeros

2. Calculate predicted y_p as X*W

3. Now, use this y_p and the actual y to calculate the derivative of the loss function with respect to W:

   Derivative of loss with respect to W = \(X^{T}\cdot(y_{p} - y)\)

   Here, \(\cdot\) refers to the dot product and \(X^{T}\) is the transpose of X.

4. Update the weights using the formula: W = W - learning_rate * (derivative of the loss function with respect to W).

5. Repeat steps 2-4 for the number of iterations specified in the function.
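
A minimal numpy sketch of the five steps above is given here for reference (the function name batch_gradient_descent and its exact signature are assumptions for illustration, not a required interface):

    import numpy as np

    def batch_gradient_descent(x, y, lr, iter):
        # Illustrative sketch: returns the list of W after each iteration.
        X = np.array(x, dtype=float)
        y = np.array(y, dtype=float).reshape(-1, 1)
        # Prepend a column of ones so that W[0] plays the role of w0 (intercept).
        X = np.hstack([np.ones((X.shape[0], 1)), X])
        # Step 1: initialize W to zeros, one weight per column of the augmented X.
        W = np.zeros((X.shape[1], 1))
        weights = []
        for _ in range(iter):
            # Step 2: predicted outputs y_p = X*W.
            y_p = X.dot(W)
            # Step 3: derivative of the loss with respect to W.
            grad = X.T.dot(y_p - y)
            # Step 4: update the weights.
            W = W - lr * grad
            # Record the updated weights; the initial all-zero W is never stored.
            weights.append(W.copy())
        return weights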

Sample Input:
x: [[0.99225237, 0.5395178, 0.79842092, 0.02775931, 0.39621768], [0.98536108, 0.66982212, 0.64544386, 0.571907, 0.65500807], [0.10700636, 0.72182486, 0.90849513, 0.04115586, 0.12004296], [0.88837122, 0.55088876, 0.07233497, 0.18170872, 0.75790715], [0.82160936, 0.29830165, 0.04743069, 0.56591258, 0.57792857], [0.7741544, 0.79494405, 0.43776452, 0.84449786, 0.13524823], [0.49676629, 0.68343065, 0.93262425, 0.48795023, 0.5276216], [0.72568726, 0.37546075, 0.84436292, 0.48876059, 0.41054587], [0.65780426, 0.32164874, 0.87530504, 0.58380788, 0.65859991], [0.30150944, 0.68509148, 0.3827384, 0.83898851, 0.7266716]]
y: [[0.81657103], [0.36262651], [0.08230336], [0.67738928], [0.03994914], [0.01392763], [0.59334186], [0.23453339], [0.30637318], [0.13115408]]
lr: 0.001
iter: 5

Expected Output:
[array([[0.00325817], [0.00252778], [0.00182098], [0.00208755], [0.00108395], [0.00181574]]), array([[0.00642997], [0.00499589], [0.00359319], [0.00412289], [0.00212727], [0.00358801]]), array([[0.00951767], [0.00740588], [0.00531792], [0.00610739], [0.00313103], [0.00531795]]), array([[0.01252345], [0.00975928], [0.0069964 ], [0.00804236], [0.00409626], [0.00700665]]), array([[0.01544946], [0.01205756], [0.00862985], [0.00992911], [0.00502399], [0.0086552 ]])]
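
Assuming the hypothetical batch_gradient_descent sketch above, and taking x and y from the Sample Input, the run could be reproduced roughly like this; the first array printed should start with approximately 0.00325817, matching the Expected Output:

    W_history = batch_gradient_descent(x, y, lr=0.001, iter=5)
    for W in W_history:
        print(W.ravel())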

Note: numpy has already been imported as np (import numpy as np).
