Week 18 Exercise 2, not getting the desired graph

The desired graph is shown below:

And here are my graphs (you might have to zoom in to see them properly :slight_smile:):

As you can see, my Var(X) vs. mean graph does not match the desired one at all, although I am getting the correct output for Var(X) vs. sd. I have tried a lot but cannot find the reason why my graph is different. Kindly help me.

Thanks in advance :slight_smile:

Hi @priyam145,
Glad to see you trying out these exercises :slight_smile:
Just to check whether what I figured out is correct: can you please rethink why you wrote loc=i in np.random.normal() when generating samples_variable_mean?
I think it should also be loc=0, because the centre of the distribution should not change from one iteration to the next.

In case I have missed the reason for this, please let me know.

Hi @Ishvinder,

The task is to plot the variance of the sample means against the population mean, and so give a graphical answer to the question of whether the variance of the sample means depends on the population mean.


As the 1st graph in the question shows, the range of population means to be used is (-5, 5).

Hence I wrote:

samples_variable_mean = np.random.normal(loc=i, scale=1, size=(1000, 10))

According to the formula:

                          Var(X) = sigma^2/n

We can see that Var(X), the variance of the sample means, depends only on sigma and n. I have kept sigma constant (scale=1) and n is fixed. Hence, the graph of Var(X) vs. population mean should be a line parallel to the x-axis:

      y=c ,   where c=sigma^2/n
      sigma=1, n=10 (n = size of each sample)
      Hence,  y = 0.1 (the desired graph given in the question post, which I am not getting)
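The reasoning above can be checked numerically. This is only a minimal sketch, not the notebook's actual code: it assumes integer population means from -5 to 5, uses NumPy's default_rng with a fixed seed for repeatability, and the names population_means and variances are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed (an assumption) so the run is repeatable

population_means = np.arange(-5, 6)  # -5, -4, ..., 5
variances = []
for mu in population_means:
    # 1000 samples, each of size n = 10, drawn from N(mu, 1)
    samples = rng.normal(loc=mu, scale=1, size=(1000, 10))
    sample_means = samples.mean(axis=1)         # one mean per sample
    variances.append(sample_means.var(ddof=1))  # variance of the 1000 sample means

variances = np.array(variances)
print(np.round(variances, 3))  # every entry should sit close to sigma^2 / n = 0.1
```

Each value only estimates 0.1 from a finite number of samples, so small differences between population means are expected.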

Can you share your notebook? It might help me understand better :slight_smile:

Yeah sure.
Link

I am also getting the same plot for graph 1.

Hey @kavyajeetbora,

I finally found the solution to this. Actually, the graph we have is already close to a straight line. Set the y-range to (0, 0.15) and you will see a fairly constant plot, similar to a straight line. If you let matplotlib/seaborn select the range automatically, it zooms in so far that you cannot see any relation.
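As a rough illustration of the y-range fix, here is a sketch using matplotlib. The data generation, the axis labels and the output file name var_vs_mean.png are assumptions made for the example, not taken from the exercise notebook.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the script runs anywhere; drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, assumed for repeatability
population_means = np.arange(-5, 6)
variances = np.array([
    rng.normal(loc=mu, scale=1, size=(1000, 10)).mean(axis=1).var(ddof=1)
    for mu in population_means
])

fig, ax = plt.subplots()
ax.plot(population_means, variances, marker="o")
ax.axhline(0.1, linestyle="--", color="gray")  # theoretical value sigma^2 / n
ax.set_ylim(0, 0.15)  # fixed y-range instead of the automatic zoom
ax.set_xlabel("Population mean")
ax.set_ylabel("Var(X)")
fig.savefig("var_vs_mean.png")  # hypothetical file name
```

With the y-axis pinned to (0, 0.15), the small random fluctuations around 0.1 no longer fill the whole plot.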

Now, you might ask why we don't get an exact straight line, since that is what the maths shows. This is because np.var() uses the population variance formula by default, and we should use the sample variance formula when calculating the variance of a sample. Set the argument ddof=1, so your call should look like np.var(… , ddof=1). You may not be familiar with the difference between population and sample variance yet, so just take this explanation for now; it is covered later in the course. For the moment, just know that using the population variance formula introduces an error.
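To make the ddof argument concrete, here is a small sketch on made-up data (the array x is purely illustrative). np.var divides the sum of squared deviations by n - ddof, so ddof=0 gives the population formula and ddof=1 gives the sample formula.

```python
import numpy as np

# Made-up data: mean is 5, sum of squared deviations is 32
x = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
n = x.size

pop_var = np.var(x)             # divides by n      -> 32 / 8
sample_var = np.var(x, ddof=1)  # divides by n - 1  -> 32 / 7

print(pop_var, sample_var)
print(sample_var / pop_var, n / (n - 1))  # the two estimates differ by the factor n / (n - 1)
```

Note that the factor n / (n - 1) shrinks as n grows: for 1000 values it is 1000/999, so there the correction is small and any remaining wobble in the plot comes from random sampling.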

Thanks @priyam145

Is this related to Bessel's correction (dividing by n-1 instead of n when computing the variance), where n is the sample size?

@kavyajeetbora

Yes, it is Bessel's correction 👍
