Doubt with np.argmax - Week 8

image

How does this work? How does it give the innings where Rahul scored higher and not the ones where Sachin scored higher? How do I get the value for Sachin scoring higher?

I don’t know the structure of snr, so I have made some assumptions based on your question:

  • It’s a two-dimensional structure with 225 rows and 2 columns.
  • Column 0 represents Sachin’s score and column 1 represents Rahul’s score.
    So every row has two entries: entry 0 for Sachin and entry 1 for Rahul.

If the above assumptions are incorrect, you can stop here and ignore the rest of this reply (it’s garbage).
If the assumptions are correct, read on.

np.argmax(snr, axis=1) will return either 0 (the index of Sachin’s score) or 1 (the index of Rahul’s score) for every row.
It’s a clever way of getting a return value of 1 (by placing Rahul’s score at index 1) for every innings in which Rahul scored higher. ‘is_rahul_higher’ will hold the 0s and 1s resulting from each comparison.
So summing ‘is_rahul_higher’ gives the total number of innings in which he scored higher, and dividing by 225 gives the fraction.
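The steps above can be sketched on a small made-up array. This is only an illustration under the same assumptions as this reply: snr_demo and its five rows are hypothetical stand-ins for the real snr, with column 0 for Sachin and column 1 for Rahul.

```python
import numpy as np

# Hypothetical stand-in for snr: column 0 = Sachin, column 1 = Rahul.
snr_demo = np.array([
    [100, 78],
    [11, 62],
    [8, 85],
    [71, 24],
    [104, 17],
])

# Index of the larger score in each row: 0 for Sachin, 1 for Rahul.
is_rahul_higher = np.argmax(snr_demo, axis=1)

# Summing the 0s and 1s counts the rows where Rahul scored higher;
# dividing by the number of rows turns the count into a fraction.
rahul_count = np.sum(is_rahul_higher)
fraction_rahul = rahul_count / snr_demo.shape[0]

print(is_rahul_higher, rahul_count, fraction_rahul)
```

Note that np.argmax returns the first index when two values are equal, so a tied innings would be reported as 0 (Sachin) rather than 1.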

Note that we cannot use this approach directly to get the result for Sachin, as his score is at index 0. We first need to subtract Rahul’s sum from 225 (or simply subtract the fraction obtained earlier for Rahul from 1).
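A minimal sketch of getting Sachin’s figure, again on a hypothetical five-row snr_demo rather than the real 225-row snr. It shows the subtraction described above and, as an alternative, comparing the argmax result with 0 directly.

```python
import numpy as np

# Hypothetical stand-in for snr: column 0 = Sachin, column 1 = Rahul.
snr_demo = np.array([[100, 78], [11, 62], [8, 85], [71, 24], [104, 17]])
is_rahul_higher = np.argmax(snr_demo, axis=1)
n_innings = snr_demo.shape[0]

# Subtract Rahul's count from the total number of innings.
sachin_count = n_innings - np.sum(is_rahul_higher)

# Or subtract Rahul's fraction from 1.
fraction_sachin = 1 - np.mean(is_rahul_higher)

# Alternative: count the rows where the argmax index is 0.
fraction_sachin_direct = np.mean(is_rahul_higher == 0)

print(sachin_count, fraction_sachin, fraction_sachin_direct)
```

Because np.argmax reports index 0 on a tie, all three values count tied innings in Sachin’s favour.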

So we have to derive this from the current structure; there is no straightforward formula for it. Going by your explanation, if we have three columns, it will work only for the second column. For the third column, we would have to sum and divide by 2. Is that right?

Yes.

Nice, I hadn’t thought of this. You are right.

Or, for the case of several player columns, count the frequency of each unique element (column index). Each count then gives the number of innings in which the player at that index scored higher than the others.

import numpy as np

is_higher = np.argmax(snr, axis=1)  # column index of the highest score in each row
np.unique(is_higher, return_counts=True)  # returns each unique index and its frequency in is_higher
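As a rough illustration of the counting approach with three players, here is a sketch on a made-up array. The scores array and its values are hypothetical; only np.argmax, np.unique and np.bincount are real NumPy calls.

```python
import numpy as np

# Hypothetical scores: columns 0, 1 and 2 are three different players.
scores = np.array([
    [50, 20, 10],
    [5, 60, 30],
    [15, 25, 90],
    [40, 80, 70],
    [12, 9, 33],
])

# Column index of the top scorer in each innings.
is_higher = np.argmax(scores, axis=1)

# Each unique index together with how often it occurs.
indices, counts = np.unique(is_higher, return_counts=True)
fractions = counts / scores.shape[0]

# np.bincount also reports a 0 for a player who never top-scored,
# whereas np.unique simply omits that index.
counts_all = np.bincount(is_higher, minlength=scores.shape[1])

print(indices, counts, fractions, counts_all)
```

With three or more columns, a plain np.sum(is_higher) adds 2 for every innings topped by the player at index 2, so it no longer equals any single player’s count. Counting each index, or comparing with a specific index such as is_higher == 2, avoids that.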

Thanks for your detailed reply!
