This video explains that if we have a population of 10000 and take a sample of 100 from it, then the total number of ways of doing that is about 6.5 * 10^241, which is quite a big number.

This must have been evaluated using the combination formula as

(10000 * 9999 * 9998 * … * 9901) / (100!)

This is not a large population size (compared to the millions of people in actual scenarios), yet neither my computer nor a Google search could evaluate C(10000, 100).

What does it mean? Does it require a more powerful computer like a supercomputer? Or is there something else that I need to be aware of?

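One of the primary reasons is that general-purpose computers use a 32- or 64-bit architecture, which limits the range of values a built-in numeric data type can hold. A 64-bit integer tops out near 9.2 * 10^18, and although a 64-bit float reaches about 1.8 * 10^308, intermediate values such as 10000! go far beyond that, so a direct evaluation overflows. It does not need a supercomputer, though: arbitrary-precision integers, or working with logarithms, can handle values of the order of 10^241 on an ordinary machine. Hope this helps!!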
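To make the point about data types concrete, here is a minimal Python sketch (Python is my own choice, and it assumes Python 3.8+ for `math.comb`). Python integers are arbitrary precision, so the exact count can be computed directly. The second part gets the order of magnitude from log-gamma instead, which is one way to stay within ordinary floats in languages with fixed-width numbers.

```python
import math

N, n = 10_000, 100  # population size and sample size from the question

# Exact value: Python's int is arbitrary precision, so nothing overflows.
exact = math.comb(N, n)
print("number of decimal digits:", len(str(exact)))
print("approximate value:", f"{exact:.2e}")

# Log-space alternative: log C(N, n) = lgamma(N+1) - lgamma(n+1) - lgamma(N-n+1).
# Only the logarithm is stored, so fixed-width floats are enough.
log10_value = (
    math.lgamma(N + 1) - math.lgamma(n + 1) - math.lgamma(N - n + 1)
) / math.log(10)
print("log10 of C(N, n):", round(log10_value, 3))
```

The log-space version is the one that carries over to much larger populations, since the logarithm of the count stays small even when the count itself does not fit in a float.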

So does that mean there exist other techniques to determine the chance of observing a trend throughout the population of 10000 people if this trend is observed in our collected sample of 100 people? Or does it actually require heavy machines to compute? In the real world these numbers tend to be even bigger, so how do people do it then?

Yes, more complex computations require high-performance computing devices such as GPUs and TPUs. Libraries such as numpy help solve complex computations efficiently, and there are other libraries such as PyTorch and TensorFlow too. However, I am not aware of alternate techniques for this specific task of observing the trend in a population. You may want to check these libraries. Hope this helps!!

Suppose there is an upcoming election in a city of 10,000 people and we are interested in knowing the preference of its people between the left-wing and right-wing parties.

Given this objective, you would have to reach every individual in the city before reaching a conclusion.

I believe 10,000 is a very big number, as reaching 10,000 persons is a time-consuming and cost-intensive process, and we probably can’t even ask each individual about their preference for various obvious reasons.

So, what would we do?

We would take a sample of 100 citizens such that it represents the population (knowledge of sampling techniques is required).

Please notice, we could select these 100 citizens in (10000 C 100) ways. Now ponder…

What if we happened to choose 100 citizens who all favour left-wing over right-wing? Such a sample is one of the (10000 C 100) possible selections, and it would lead us to a wrong conclusion.
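As a rough illustration of how likely such a one-sided sample is, here is a small Python sketch. The figure of 5,500 left-wing supporters is purely hypothetical (the thread gives no actual split), and the function name `prob_all_left` is my own. The chance is the number of all-left samples divided by the number of all possible samples.

```python
import math

def prob_all_left(population: int, left_supporters: int, sample_size: int) -> float:
    """Chance that a simple random sample contains only left-wing supporters."""
    # Samples drawn only from left-wing supporters ...
    all_left_samples = math.comb(left_supporters, sample_size)
    # ... divided by every possible sample from the whole population.
    all_samples = math.comb(population, sample_size)
    # Python divides the two huge integers exactly and then rounds to a float.
    return all_left_samples / all_samples

# Hypothetical split: 5,500 of the 10,000 citizens favour left-wing.
p = prob_all_left(population=10_000, left_supporters=5_500, sample_size=100)
print(p)
```

Under this assumed split the probability is tiny, so a sample that extreme is possible but very unlikely if the 100 citizens are picked at random.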

I would like to enumerate two problems that we are facing:

We can’t work with the population because of cost, time and other constraints.

We can’t rely on just one sample because that could be misleading.

Therefore we need some scientific technique to solve these two problems and reach a conclusion, and I think this is where we need probability theory and inferential statistics.
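To see why one sample alone can mislead, here is a small simulation sketch in Python. Everything in it is an assumption for illustration: the 5,500 / 4,500 split, the 1,000 repetitions, and the fixed seed. It draws many samples of 100 from the same city and looks at how much the sample proportion moves around.

```python
import random

rng = random.Random(0)  # fixed seed so the run is repeatable

POPULATION_SIZE = 10_000
LEFT_SUPPORTERS = 5_500   # hypothetical true split, not from the thread
SAMPLE_SIZE = 100
NUM_SAMPLES = 1_000       # how many times we repeat the survey

# 1 = favours left-wing, 0 = favours right-wing
population = [1] * LEFT_SUPPORTERS + [0] * (POPULATION_SIZE - LEFT_SUPPORTERS)

proportions = []
for _ in range(NUM_SAMPLES):
    sample = rng.sample(population, SAMPLE_SIZE)  # sampling without replacement
    proportions.append(sum(sample) / SAMPLE_SIZE)

mean_prop = sum(proportions) / NUM_SAMPLES
# Samples in which left-wing fails to get a majority, despite leading in the city.
misleading = sum(1 for p in proportions if p <= 0.5)

print("smallest sample proportion:", min(proportions))
print("largest sample proportion:", max(proportions))
print("average over all samples:", mean_prop)
print("samples pointing the wrong way:", misleading, "of", NUM_SAMPLES)
```

Individual samples scatter around the true proportion, which is exactly the spread that inferential statistics tries to quantify from a single sample.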

Just to add my understanding of "why do we need probability theory".
Whenever we take a sample out of a population (which we do because we lack the resources, such as money and time, to work on the entire population), we make inferences about the population. But we can never be 100% sure that the inferences we draw from that sample will fully comply with the population. Therefore, after analysing a sample, we always talk about the ‘chance’ of any event related to that population.

This ‘chance’ term comes in because there are certain factors which are out of our control, and these factors are called randomness in the population. To account for this randomness and quantify how much chance we have that our inferences are correct, probability theory is required!!!