Why is m = X.shape[1]?

It should be m = X.shape[0], since the first axis indexes the number of data points.

m=X.shape[1]

self.w-=learning_rate*dw/m
self.b-=learning_rate*db/m
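For reference, here is a minimal sketch (assuming the usual rows-as-examples layout and hypothetical gradient values) showing what each axis of X counts and which one the averaging should use:

```python
import numpy as np

# Hypothetical dataset: 750 examples, each with 2 features
X = np.random.rand(750, 2)

m_examples = X.shape[0]   # 750: number of data points
n_features = X.shape[1]   # 2: number of input features

# Averaging an accumulated gradient over the batch should divide
# by the number of examples, not the number of features
dw = np.random.rand(n_features)  # hypothetical accumulated gradient
dw_avg = dw / m_examples         # mean gradient per example
```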


There seems to be another bug in the 0324_ScalarBAckPropagation notebook. The FirstFFNetwork class has m = X.shape[0] instead of m = X.shape[1]. Since X.shape[0] is 750 in this case, rather than X.shape[1] which is 2, the weight gradients become very small when divided by m.

As a result, the lesson had to use a high learning rate of 5 for the model to converge. With X.shape[1], a small learning rate of 0.01 would suffice.

Surprisingly, the same FirstFFNetwork class in the 0318_FeedforwardNetwork notebook does not have this bug; it correctly has m = X.shape[1].
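The learning-rate gap follows from the scale difference: dividing the gradients by 750 instead of 2 shrinks the effective step by a factor of 375, which is roughly the ratio between the two working learning rates quoted above (a back-of-the-envelope check, not from the notebook itself):

```python
# Effective update is learning_rate * dw / m, so the two setups match when
# lr_big / 750 == lr_small / 2
lr_big = 5
scale_ratio = 750 / 2            # gradients are 375x smaller with the wrong m
lr_small = lr_big / scale_ratio  # ~0.013, close to the 0.01 mentioned above
print(lr_small)
```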

Hi @parsar0,

The errata is in the `0318_FeedforwardNetwork` notebook. It should have been `X.shape[0]`, since that signifies *m*, the number of training examples. On the other hand, `X.shape[1]` is the number of input features.