### Batch Gradient Descent


There are various optimization algorithms used to find a local minimum of a differentiable function. Gradient descent is one such algorithm: it minimizes the cost function by iteratively updating the weights of the model.

At each step, gradient descent multiplies the gradient by a learning rate to determine the next point on the way to a local minimum.

In **batch** gradient descent, the error is computed for every example in the training dataset, and the parameters are updated only after all training examples have been evaluated once.

In this problem, we expect you to implement the batch gradient descent algorithm manually. The cost/loss function is the squared loss for linear regression:

\(Loss = \tfrac{1}{2}\sum_{i=1}^{n}(y_{i}-f(x_{i}))^{2}\)
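As a quick sanity check, the loss for two made-up examples, using the \(\tfrac{1}{2}\sum\) form whose gradient works out to \(X^{T}\cdot(y_{p}-y)\):

```python
import numpy as np

y   = np.array([[1.0], [2.0]])   # actual outputs (made-up values)
y_p = np.array([[0.5], [2.5]])   # predicted outputs (made-up values)

# Loss = 1/2 * sum of squared residuals = 0.5 * (0.25 + 0.25) = 0.25
loss = 0.5 * np.sum((y - y_p) ** 2)
```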

**You have to initialize all the weights to zero to meet the desired output.**

**Input:**

x: an array of training examples

y: an array of output corresponding to each training example

lr: the learning rate for the algorithm

iter: number of iterations the algorithm will perform

**Output:**

A list of the updated weights after every iteration. Do not include the initial W, which is an array of zeros.

The first element of W should be \(w_{0}\), i.e., if \(Y = wX + w_{0}\) then \(W = [w_{0}, \text{elements of } w]\)

To obtain this W, stack a column of ones at the beginning of X.
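For instance, the ones column can be prepended with `np.hstack` (a sketch; the matrix values are illustrative):

```python
import numpy as np

X = np.array([[2.0, 3.0],
              [4.0, 5.0]])

# Prepend a column of ones so the first weight acts as the intercept w_0
X_aug = np.hstack([np.ones((X.shape[0], 1)), X])
print(X_aug)  # [[1. 2. 3.]
              #  [1. 4. 5.]]
```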

**Algorithm:**

1. Initialize W to zeros

2. Calculate predicted y_p as X*W

3. Now, use this y_p and actual y to calculate the derivative of the loss function with respect to W:

**Derivative of loss with respect to W = X^{T} · (y_p - y)**

Here, · refers to the dot product.

4. Update the weights using the formula: W = W - learning_rate * (derivative of the loss function with respect to W).

5. Continue the above step for the number of iterations specified in the function.
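The steps above can be sketched as follows (the function name `batch_gradient_descent` is our choice, not specified by the problem):

```python
import numpy as np

def batch_gradient_descent(x, y, lr, iter):
    # Prepend a column of ones to X so the first weight is the intercept w_0
    X = np.hstack([np.ones((len(x), 1)), np.asarray(x)])
    y = np.asarray(y)
    W = np.zeros((X.shape[1], 1))   # step 1: initialize W to zeros
    history = []
    for _ in range(iter):
        y_p = X @ W                 # step 2: predicted y as X*W
        grad = X.T @ (y_p - y)      # step 3: derivative of the loss w.r.t. W
        W = W - lr * grad           # step 4: gradient update
        history.append(W.copy())    # record W after each iteration
    return history
```

Note that `history` stores a copy of W after each full pass, matching the required output format (the initial all-zeros W is excluded).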

##### Sample Input:

`x: [[0.99225237, 0.5395178, 0.79842092, 0.02775931, 0.39621768], [0.98536108, 0.66982212, 0.64544386, 0.571907, 0.65500807], [0.10700636, 0.72182486, 0.90849513, 0.04115586, 0.12004296], [0.88837122, 0.55088876, 0.07233497, 0.18170872, 0.75790715], [0.82160936, 0.29830165, 0.04743069, 0.56591258, 0.57792857], [0.7741544, 0.79494405, 0.43776452, 0.84449786, 0.13524823], [0.49676629, 0.68343065, 0.93262425, 0.48795023, 0.5276216], [0.72568726, 0.37546075, 0.84436292, 0.48876059, 0.41054587], [0.65780426, 0.32164874, 0.87530504, 0.58380788, 0.65859991], [0.30150944, 0.68509148, 0.3827384, 0.83898851, 0.7266716]]`

`y: [[0.81657103], [0.36262651], [0.08230336], [0.67738928], [0.03994914], [0.01392763], [0.59334186], [0.23453339], [0.30637318], [0.13115408]]`

`lr: 0.001`

`iter: 5`

##### Expected Output:

```
[array([[0.00325817],
[0.00252778],
[0.00182098],
[0.00208755],
[0.00108395],
[0.00181574]]), array([[0.00642997],
[0.00499589],
[0.00359319],
[0.00412289],
[0.00212727],
[0.00358801]]), array([[0.00951767],
[0.00740588],
[0.00531792],
[0.00610739],
[0.00313103],
[0.00531795]]), array([[0.01252345],
[0.00975928],
[0.0069964 ],
[0.00804236],
[0.00409626],
[0.00700665]]), array([[0.01544946],
[0.01205756],
[0.00862985],
[0.00992911],
[0.00502399],
[0.0086552 ]])]
```

**Note:** numpy has already been imported as np (`import numpy as np`).