REG : Binarisation

`X_binarised_train = X_train.apply(pd.cut, bins=2, labels=[ ])`

Sir, while binarising the data with the above code, how do we decide the order of the labels, i.e. whether to pass [1,0] or [0,1]?

What are the criteria for deciding this?

Can you please explain in detail?

Thank you.

It depends on the user and on which labels are appropriate for their use case.

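By default, `pd.cut` orders the bins in ascending order of value. With `bins=2`, the range of each column (from its minimum to its maximum) is split into two equal-width intervals, so the split is by value range, not by number of rows. Values in the lower interval are assigned the first label and values in the upper interval the second label. So `labels=[0,1]` gives 0 to the smaller values and 1 to the larger ones, and `labels=[1,0]` does the opposite.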
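Here is a minimal sketch of how the label order maps onto the bins. The Series values are made up for illustration (they are not from the course data), and the variable names are my own:

```python
import pandas as pd

# Illustrative values only: range is 1..10, so the cut point is the midpoint 5.5
s = pd.Series([1, 2, 3, 10])

# First label goes to the lower interval, second label to the upper interval
low_is_zero = pd.cut(s, bins=2, labels=[0, 1])

# Reversing the list reverses the assignment
low_is_one = pd.cut(s, bins=2, labels=[1, 0])

print(low_is_zero.tolist())  # [0, 0, 0, 1]
print(low_is_one.tolist())   # [1, 1, 1, 0]
```

Note that three of the four values land in the lower bin, because the bins have equal width rather than equal counts.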

In the example below, I wanted to split my data into two halves, with the smaller numbers getting the label ‘a’ and the larger numbers getting the label ‘b’.
Screenshot 2020-12-18 at 10.38.26 PM
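For reference, the same idea can be sketched with the `apply` call from the question. This is not the code from the screenshot; the DataFrame, its column names (`height`, `weight`) and its values are hypothetical stand-ins for `X_train`:

```python
import pandas as pd

# Hypothetical stand-in for X_train; columns and values are made up
X_train = pd.DataFrame({
    "height": [150, 160, 170, 180],
    "weight": [50, 55, 90, 95],
})

# apply() passes each column to pd.cut along with the extra keyword arguments,
# so every column is cut at the midpoint of its own range
X_binarised_train = X_train.apply(pd.cut, bins=2, labels=["a", "b"])

print(X_binarised_train)
```

Each column is binarised independently, so the threshold for `height` (165 here) has nothing to do with the threshold for `weight` (72.5 here).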

Thank you for the explanation Sir.