Doubt in Pandas DataFrame - Week 10 Notebook

Hello TAs,
Please refer to the screenshot for my question.
What is the difference between the commented line and the uncommented line, and why does sir write the code the way the commented line does when both lines can give the same output? DataFrames are new to me, so although sir explained the commented line very clearly, I could not follow it fully. I request you to please elaborate and explain it again.

First of all note that:

  • There are different values for method, e.g. ‘Radial Velocity’, ‘Transit’, ‘Astrometry’, etc.
  • There are 1035 rows in total. Some of these rows have method ‘Radial Velocity’, others have ‘Transit’, and so on.

Now coming to the code

d[m] = df['method'].count()

Here we are counting the total number of entries in the df column ‘method’, which is 1035 (equal to the number of rows, since count() only skips missing values and this column has none).

Note that we are addressing the whole column named ‘method’ and are not specifying any particular method such as Radial Velocity. The right-hand side does not depend on m at all, so every key in d gets the same value, 1035.
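As a rough illustration of the point above, here is a small sketch. The five-row DataFrame is made-up toy data standing in for the 1035-row one in the notebook, and the loop over unique() is an assumption about how m is produced in sir's code.

```python
import pandas as pd

# Hypothetical toy data, NOT the notebook's dataset (which has 1035 rows).
df = pd.DataFrame({
    "method": ["Radial Velocity", "Transit", "Transit", "Astrometry", "Transit"],
    "year": [2008, 2010, 2011, 2013, 2012],
})

d = {}
for m in df["method"].unique():
    # The right-hand side never mentions m, so it counts the whole column
    # every time and each key ends up with the same number.
    d[m] = df["method"].count()

for method, n in d.items():
    print(method, n)
```

Every method is paired with the total row count of the toy DataFrame, which mirrors seeing 1035 for every method in the notebook.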

On the other hand, consider this code

d[m] = df[df.method == m]['method'].count()

Here, first, the comparison df.method == m produces a True/False value for every row, and df[...] keeps only the rows where it is True, i.e. the rows that have the specific method m. The number of rows for any one method will be fewer than 1035, so this step shrinks the dataframe down to just the rows with method m.

Next, ['method'].count() simply counts the rows that remain (which are only the ones with method m).
Compare the results shown in the two screenshots.
Remember:
df['method'] selects the whole ‘method’ column, i.e. rows of every method.
df[df.method == m] selects only the rows whose method is m.
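To make the filtering step concrete, here is a minimal sketch on made-up toy data (again not the notebook's dataset). The names mask, subset, per_method and counts are only for illustration. The last line uses value_counts(), a standard pandas method that gives the same per-method counts in one call; it is shown as an alternative, not as what sir's notebook does.

```python
import pandas as pd

# Hypothetical toy data, NOT the notebook's dataset.
df = pd.DataFrame({
    "method": ["Radial Velocity", "Transit", "Transit", "Astrometry", "Transit"],
    "year": [2008, 2010, 2011, 2013, 2012],
})

# Step 1: the comparison gives one True/False per row.
mask = df["method"] == "Transit"
# Step 2: indexing with that mask keeps only the True rows.
subset = df[mask]
print(len(subset), "rows have method Transit")

# The same idea inside a loop, one count per method.
per_method = {}
for m in df["method"].unique():
    per_method[m] = df[df["method"] == m]["method"].count()

# Alternative: value_counts() computes all per-method counts at once.
counts = df["method"].value_counts().to_dict()
```

Here per_method and counts hold the same numbers, and those numbers add up to the total row count of the DataFrame.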

Thank you so much for such a detailed explanation. It is really helpful.