Types of Statistical Error: Basic Statistics Lecture Series Lecture #12

As promised last time, I will cover types of statistical error this time.  The type and magnitude of the possible error are important to convey with any hypothesis test.  This also happens to be why, in science, it is said that nothing can ever truly be proven, only disproven.



First, it is important to understand that error typing is an integral part of hypothesis testing and of no other part of statistics, similar to the human brain and the person it's in.  The human brain cannot fit into any other species, and humans cannot live without it.  The same concept applies to these types of errors and hypothesis testing: error typing cannot fit anywhere else, and it is necessary for the success of hypothesis testing.

So what specifically is error typing?  It describes the chances that the conclusion of a hypothesis test is incorrect, namely the chance that the null hypothesis is rejected when it's true (Type I Error, false positive) and the chance of failing to reject the null hypothesis when it's false (Type II Error, false negative).  In statistical hypothesis testing, the alternative hypothesis is always the positive result.  In statistical analysis, a positive result doesn't mean the same as it does in common language; it means that the true value is different from the proposed mean.
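To make the two definitions concrete, here is a minimal simulation sketch.  It assumes a right-tailed z-test with a known population standard deviation, and every number in it (the proposed mean of 50, the "true" alternative mean of 53, σ = 10, n = 30) is a made-up value chosen only for illustration.  When the null hypothesis really is true, the fraction of rejections should land near α; when it is false, the fraction of non-rejections is the Type II error rate.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean


def rejection_rate(true_mean, mu_0=50.0, sigma=10.0, n=30, alpha=0.05,
                   trials=5000, seed=0):
    """Fraction of simulated right-tailed z-tests that reject H0: mu = mu_0.

    All default values are hypothetical and chosen only for illustration.
    """
    rng = random.Random(seed)                  # fixed seed so runs are repeatable
    z_crit = NormalDist().inv_cdf(1 - alpha)   # cutoff for a right-tailed test
    standard_error = sigma / sqrt(n)
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        z_calc = (fmean(sample) - mu_0) / standard_error
        if z_calc > z_crit:
            rejections += 1
    return rejections / trials


# Null hypothesis true: every rejection is a Type I error (false positive).
type_i_rate = rejection_rate(true_mean=50.0)

# Null hypothesis false: every failure to reject is a Type II error (false negative).
type_ii_rate = 1 - rejection_rate(true_mean=53.0)

print(f"Simulated Type I error rate:  {type_i_rate:.3f}")
print(f"Simulated Type II error rate: {type_ii_rate:.3f}")
```

The Type I rate stays close to the chosen α no matter what, while the Type II rate changes if you move the assumed true mean closer to or further from the proposed mean.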

[Image caption: "That's no baby, sir."]

How would we find how big the errors would be?  For Type I errors, that's easy.  The chance is simply the level of α we've chosen to compare the p-value to.  Type II errors are a little bit more complicated to calculate, because β depends on what the true mean actually is, so we have to pick a specific alternative mean $\mu_a$ to calculate it for.  The tool is the same one we've already used, though.  Remember from the hypothesis testing posts that the value for z is given by $z_{calc}=\frac{\overline{x}-\mu}{\frac{\sigma}{\sqrt{n}}}$.  For a right-tailed test, the null hypothesis is not rejected whenever the sample mean falls below the cutoff $\overline{x}_{crit}=\mu+z_{\alpha}\frac{\sigma}{\sqrt{n}}$, so β is the area to the left of $z_{\beta}=\frac{\overline{x}_{crit}-\mu_a}{\frac{\sigma}{\sqrt{n}}}$ under the standard normal curve.  Note that β is not the same thing as the p-value, which is calculated assuming the null hypothesis is true.  Here's a table which contains the types of errors and their calculations:

| Statistical Conclusion | True Statement: Null Hypothesis True | True Statement: Alternative Hypothesis True |
|---|---|---|
| Null Hypothesis Rejected | Type I Error, α | Correct Decision |
| Null Hypothesis Not Rejected | Correct Decision | Type II Error, β, the area to the left of $z_{\beta}=\frac{\overline{x}_{crit}-\mu_a}{\frac{\sigma}{\sqrt{n}}}$ |
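
As a rough sketch of how the two error sizes could be computed, the snippet below works out β for a right-tailed z-test once a specific alternative mean is assumed.  The function name `type_ii_error` and all of the numbers (proposed mean 100, alternative mean 105, σ = 15, n = 36) are hypothetical and chosen only for illustration; a left-tailed or two-tailed test would need a different cutoff.

```python
from math import sqrt
from statistics import NormalDist


def type_ii_error(mu_0, mu_a, sigma, n, alpha=0.05):
    """Beta for a right-tailed z-test of H0: mu = mu_0 when the true mean is mu_a.

    Assumes the population standard deviation sigma is known.
    """
    std_normal = NormalDist()                  # standard normal: mean 0, sd 1
    standard_error = sigma / sqrt(n)
    z_alpha = std_normal.inv_cdf(1 - alpha)    # critical z for the chosen alpha
    x_crit = mu_0 + z_alpha * standard_error   # sample-mean cutoff for rejecting H0
    z_beta = (x_crit - mu_a) / standard_error  # the cutoff, seen from the true mean
    return std_normal.cdf(z_beta)              # chance the sample mean falls below it


alpha = 0.05  # Type I error: simply the significance level we chose
beta = type_ii_error(mu_0=100, mu_a=105, sigma=15, n=36, alpha=alpha)
print(f"alpha = {alpha}, beta = {beta:.3f}, power = {1 - beta:.3f}")
```

Raising the sample size or moving the alternative mean further from the proposed mean makes β smaller, while lowering α makes β larger.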

So those are the types of error and how to find them.  If you have any questions, please leave a comment.  Next time, I'll start on regression analysis.  Until then, stay curious.

K. "Alan" Eister has his Bachelor of Science in Chemistry. He is also a tutor for Varsity Tutors.  If you feel you need any further help on any of the topics covered, you can sign up for tutoring sessions here.
