Probability is the foundation of everything we do in Statistics. There are two types of probability: discrete and continuous. Discrete probability is when each distinct value has its own separate probability (represented by bars), while continuous probability is when the probability changes smoothly across a continuous range of values (represented by a smooth curve). These are two types of probability models and are therefore treated differently. The difference is easiest to see in images.
Discrete Probabilities of Dice Rolls.
Continuous Probability Distribution of Pounds of Waste Produced.
Both graphs are examples of what are called probability models. A probability model has a sample space (denoted by an S), which is the set of all possible outcomes, and probabilities are assigned to the sample points in that sample space. In the graphs above, the values we want a probability for are on the horizontal axis (typically called the x axis). In the discrete graph, the height of each bar is the probability of that value occurring. In a continuous distribution, such as the Standard Normal Distribution, the vertical axis is the probability density rather than the probability itself. The probability of an event E (a range of values on the x axis), denoted P(E), is the area between the curve and the x axis over that range.
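To make the "probability is an area" idea concrete, here is a minimal sketch using Python's standard library `statistics.NormalDist`. The helper name `prob_between` and the chosen ranges are my own assumptions for illustration. It shows that, for a continuous distribution, the height of the curve is not itself a probability, and that a single exact value has zero area.

```python
from statistics import NormalDist

# Standard Normal Distribution: mean 0, standard deviation 1.
std_normal = NormalDist(mu=0.0, sigma=1.0)


def prob_between(dist, low, high):
    """P(low < X < high): area under the curve between low and high."""
    return dist.cdf(high) - dist.cdf(low)


# Area under the curve within one standard deviation of the mean (about 0.68).
area_within_one_sd = prob_between(std_normal, -1.0, 1.0)

# Height of the curve at x = 0: a density, not a probability.
height_at_zero = std_normal.pdf(0.0)

# A single exact value spans no width, so its area (probability) is 0.
point_probability = prob_between(std_normal, 0.0, 0.0)

print(area_within_one_sd, height_at_zero, point_probability)
```

The area is computed as a difference of cumulative probabilities, which is the usual way to get the probability of a range without integrating the curve by hand.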
There are basic rules which all probability models must follow:
1) The probability of a single event must be between 0 and 1, $0 \leq P(E) \leq 1$. This means that the probability of an event occurring can't exceed 100% and can't be less than 0%. It can be exactly 0% or 100%, or anything in between, just not outside that range.
2) The probabilities of all possible events must add up to exactly 1, $\sum_{i=1}^{n} P(E_i)=1$, where the events $E_i$ do not overlap and together cover the whole sample space. This means that if you take every single possible event and add all of their probabilities together, the answer should be exactly 100%, no more, no less.
3) The probability of an event occurring is given by the sum of the probabilities of all sample points in the event, $P(E)=\sum_{i=1}^{n} P(p_i)$, where $p_1, \dots, p_n$ are the sample points in E. This means that if you want the probability of working less than 40 hours next week, and your hours are tracked in increments of 0.25 hr, you should add up the probability of working each possible increment below 40 hrs.
4) The probability of an event E not occurring is given by the probability of it occurring subtracted from 1, $P(E^{c})=1-P(E)$. Here, the "exponent" of c denotes the complement of E, or the part of the sample space which is not covered by the event E. This means that if you want the probability of working any number of hours next week other than 40 hours, you would find the probability of working exactly 40 hours and subtract it from 100%.
If these rules hold, then it is a statistical model. If these rules do NOT hold, however, then it is definitely NOT statistical. It may or may not be a model, but if it is, the model is NOT a statistical model.
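The four rules above can be checked directly for a small discrete model. The sketch below assumes a fair six-sided die as the model. The function names (`is_probability_model`, `prob`, `prob_complement`) and the deliberately broken `bad` model are hypothetical, chosen only for illustration. Exact fractions are used so the "exactly 1" check is not affected by floating-point rounding.

```python
from fractions import Fraction

# Assumed example model: a fair six-sided die, each face with probability 1/6.
die = {face: Fraction(1, 6) for face in range(1, 7)}


def is_probability_model(model):
    """Rules 1 and 2: every probability is in [0, 1] and they sum to exactly 1."""
    each_in_range = all(0 <= p <= 1 for p in model.values())
    return each_in_range and sum(model.values()) == 1


def prob(model, event):
    """Rule 3: add up the probabilities of the sample points in the event."""
    return sum(model[point] for point in event)


def prob_complement(model, event):
    """Rule 4: P(E^c) = 1 - P(E)."""
    return 1 - prob(model, event)


p_even = prob(die, {2, 4, 6})            # probability of rolling an even number
p_not_six = prob_complement(die, {6})    # probability of rolling anything but a 6

# A made-up assignment whose probabilities add up to 5/4, so rule 2 fails.
bad = {"heads": Fraction(3, 4), "tails": Fraction(1, 2)}

print(is_probability_model(die), p_even, p_not_six, is_probability_model(bad))
```

Under these assumptions the die passes the check, while `bad` fails because its probabilities sum to more than 1.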
These are the basics of probability models and discrete and continuous probability distributions. That's it for this time. Next time, I'll cover examples of discrete and continuous probabilities. Until then, stay curious.
K. "Alan" Eister has his Bachelor of Science in Chemistry. He is also a tutor for Varsity Tutors. If you feel you need any further help on any of the topics covered, you can sign up for tutoring sessions here.