Doubt in Pandas: apply() function

# create a function
def gender_to_numeric(x):
    if x == 'M':
        return 1
    if x == 'F':
        return 0

# apply the function to the gender column and create a new column
users['gender_n'] = users['gender'].apply(gender_to_numeric)

# percentage of male users in each occupation
a = users.groupby('occupation').gender_n.sum() / users.occupation.value_counts() * 100

# sort to the most male
a.sort_values(ascending=False)

In this code, I am not able to understand this line:

users['gender_n'] = users['gender'].apply(gender_to_numeric)

What is the significance of this variable, and what does it actually do? Later, when it is used, only gender_n is written. I am not able to understand this.

With the apply function, you can apply a function to every value of a pandas Series.

users['gender_n'] = users['gender'].apply(gender_to_numeric)

Here, we take the gender column, apply the function gender_to_numeric to each of its values, and save the results in a new column called gender_n.
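To make this concrete, here is a minimal sketch on a small made-up table (the rows are invented for illustration and are not the original users data). It shows apply calling gender_to_numeric once per value. It also shows that the new column can be read either as users['gender_n'] or as users.gender_n, which is why only gender_n appears in the groupby line.

```python
import pandas as pd

# Hypothetical sample data, not the original dataset
users = pd.DataFrame({
    "occupation": ["engineer", "engineer", "artist", "artist"],
    "gender": ["M", "F", "F", "F"],
})

def gender_to_numeric(x):
    if x == "M":
        return 1
    if x == "F":
        return 0

# apply calls gender_to_numeric once for each value in the gender column
users["gender_n"] = users["gender"].apply(gender_to_numeric)

# Bracket access and attribute access refer to the same column
same = users["gender_n"].equals(users.gender_n)

# Percentage of male users per occupation
male_pct = users.groupby("occupation").gender_n.sum() / users.occupation.value_counts() * 100
```

Because male users are encoded as 1 and female users as 0, summing gender_n within an occupation counts the men in it, and dividing by the number of users in that occupation gives the share.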

For more details on the apply function, please refer to the pandas documentation.

I just want to understand users['gender_n'].
As you said, this means we created another column named 'gender_n'. Will this column be added permanently to our original data?

Yes. A new column called gender_n is added to the users DataFrame itself.
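A small sketch of what "permanently" means here, using invented data. The assignment changes the users object itself, so the column is available in every later line. A copy taken beforehand (the hypothetical name original below) is unaffected, and the file the data was loaded from is likewise not changed unless you write the DataFrame back out, for example with to_csv.

```python
import pandas as pd

# Hypothetical sample data, not the original dataset
users = pd.DataFrame({"gender": ["M", "F", "M"]})

# Independent copy taken before the new column is added
original = users.copy()

# A lambda stands in for gender_to_numeric to keep the sketch short
users["gender_n"] = users["gender"].apply(lambda x: 1 if x == "M" else 0)

# The column now exists on users, but not on the earlier copy
has_new = "gender_n" in users.columns
in_original = "gender_n" in original.columns
```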

Ok sir, thank you very much.