Could you please explain the principle behind the solution to problem 8 in Week-8 (i.e. 1. How many more runs does Sachin score on average after having scored x runs). The assumption made is not very convincing.
x_arr = np.arange(0, 101, 5)  # [0, 5, ..., 100]
indices = sachin >= x_arr.reshape(-1, 1)  # build 21x225 table (x_arr as a column so it broadcasts against the 225 scores)
As I understand it, an indices table is built. For a given x (let's say x=5), row 1 of indices is an array of 225 elements, where each element is True if the corresponding score is >= 5 and False otherwise.
(Similarly, x=0 corresponds to row 0 of indices, and x=100 corresponds to row 20.)
Once the table is there, you can 1) pick a row of indices (depending on the x value), 2) use it as a boolean index into sachin to pick up the relevant scores (at least 0, 5, 10, etc.), 3) calculate the additional runs beyond x by subtracting x, and 4) finally take the mean.
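The four steps above can be sketched as follows. This is only an illustration: the scores are made-up values rather than the real 225 innings, and the helper name mean_additional_runs is hypothetical, not something from the course solution.

```python
import numpy as np

# Hypothetical scores for illustration only; the real `sachin` array has 225 innings.
sachin = np.array([0, 4, 12, 37, 5, 100, 58, 9])

x_arr = np.arange(0, 101, 5)               # thresholds [0, 5, ..., 100], shape (21,)
indices = sachin >= x_arr.reshape(-1, 1)   # boolean table, shape (21, 8) here

def mean_additional_runs(scores, mask_row, x):
    """Mean of (score - x) over the innings selected by mask_row."""
    return (scores[mask_row] - x).mean()

# One value per threshold: pick the row, index into the scores,
# subtract x, and take the mean.
more_runs = np.array([
    mean_additional_runs(sachin, indices[i], x) for i, x in enumerate(x_arr)
])

print(dict(zip(x_arr.tolist(), more_runs.round(2).tolist())))
```

Note that a threshold that no innings reaches would give an empty selection and a nan mean; the made-up data avoids this because it contains a score of 100.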
The assumption that “once he crosses that barrier of scoring few runs when x=5, his number of runs that will likely add to his score would be greater than his average” is what I do not understand.
Is this a proposition? Or rather, with what confidence level can we say that the statement inside the quotes is true?
Ok, I think I understand your question better, but I don’t have a good answer. Here is my understanding:
- Question: How many more runs does Sachin score on average after having scored x runs?
As the solution approach in the videos shows, this question is not about making a hypothesis (that Sachin scores so many runs if …) whose acceptance or rejection would need some likelihood/confidence level calculation. It is simply calculating average score Sachin scores/scored after passing x runs from the given innings/data (which is deterministic; no probability or likelihood is involved).
Having said that, I kind of agree that you can debate how the question is phrased.
Lastly, if this question had to be seen as being given sample data and making a prediction about population parameters (the population being the innings spanning his entire career) with some confidence level, what would be your null hypothesis and sample statistic (this material is covered in week 20)?
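If one did want to treat the data as a sample, one possible way (my assumption, not necessarily the week 20 method) is a percentile bootstrap on the difference between the mean additional runs after reaching x=5 and the overall mean score. The scores below are made up, and diff_stat is a hypothetical helper name.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sample of innings scores, for illustration only.
sachin = np.array([0, 4, 12, 37, 5, 100, 58, 9, 2, 71,
                   18, 45, 23, 6, 88, 31, 14, 50, 27, 64])

def diff_stat(scores, x=5):
    """Mean additional runs after reaching x, minus the overall mean score."""
    survived = scores[scores >= x]
    if survived.size == 0:          # guard: no innings reached x in this resample
        return np.nan
    return (survived - x).mean() - scores.mean()

# Assumed null hypothesis for this sketch: the difference is zero.
observed = diff_stat(sachin)

# Resample innings with replacement and recompute the statistic each time.
boot = np.array([
    diff_stat(rng.choice(sachin, size=sachin.size, replace=True))
    for _ in range(2000)
])

# Approximate 95% percentile interval for the difference.
low, high = np.nanpercentile(boot, [2.5, 97.5])
print(f"observed difference: {observed:.2f}, 95% interval: [{low:.2f}, {high:.2f}]")
```

Under these assumptions, an interval that excludes zero would be evidence against the null hypothesis at roughly the 5% level; an interval that includes zero would not support the quoted claim for this sample.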
Ok @sanjayk. I can understand your reformulation of the question now (“calculating average score Sachin scores/scored after passing x runs from the given innings/data”). The meaning is clearer now.
Though I could not do this part myself and had to rely on Prof Pratyush's video, I will attempt it again now that my understanding is clearer.
Thanks a lot