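The chi-square test is a statistical method used to determine whether there is a significant association between two categorical variables, or whether observed category counts match an expected distribution (a goodness-of-fit test). It's commonly used in hypothesis testing to evaluate how likely it is that the gap between observed and expected frequencies is due to chance alone.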
The chi-square statistic is calculated using the following formula:
\[\chi^2 = \sum_{i=1}^{n} \frac{(O_i - E_i)^2}{E_i}\]
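Where \(O_i\) is the observed frequency in category \(i\), \(E_i\) is the expected frequency in category \(i\), and \(n\) is the number of categories.
This formula measures the discrepancy between the observed data and what we would expect under the null hypothesis (for example, no association between the variables). Larger values of \(\chi^2\) mean the observed counts sit further from the expected ones.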
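As a minimal sketch, the formula can be written as a short Python function. The name `chi_square_statistic` and the coin-flip counts in the usage lines are illustrative assumptions, not part of the original example.

```python
def chi_square_statistic(observed, expected):
    """Return the sum of (O_i - E_i)^2 / E_i over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    if any(e <= 0 for e in expected):
        raise ValueError("expected counts must be positive")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Made-up illustration: 100 coin flips, 55 heads and 45 tails,
# compared with the 50/50 split expected from a fair coin.
coin_statistic = chi_square_statistic([55, 45], [50, 50])
print(coin_statistic)
```

Each category contributes its own squared gap, scaled by the expected count, so a difference of 5 matters more when 10 were expected than when 1,000 were.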
Let's consider a simple example where we're testing if a die is fair. We roll the die 60 times and get the following results:
Face Value | Observed (O) | Expected (E) |
---|---|---|
1 | 5 | 10 |
2 | 8 | 10 |
3 | 9 | 10 |
4 | 12 | 10 |
5 | 15 | 10 |
6 | 11 | 10 |
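The Expected column follows from the fair-die hypothesis: each face has probability 1/6, so 60 rolls give 60 × 1/6 = 10 expected occurrences per face. The sketch below rebuilds that column from the observed counts. The variable names are illustrative, and `Fraction` is used only to keep the arithmetic exact.

```python
from fractions import Fraction

# Observed counts from the table, keyed by face value.
observed = {1: 5, 2: 8, 3: 9, 4: 12, 5: 15, 6: 11}

total_rolls = sum(observed.values())
fair_probability = Fraction(1, len(observed))  # 1/6 for each face of a fair die

# Expected count per face = total rolls * probability of that face.
expected = {face: total_rolls * fair_probability for face in observed}

# Deviations O - E always sum to zero, which is why only
# (number of categories - 1) of them are free to vary.
deviations = {face: observed[face] - expected[face] for face in observed}

for face in observed:
    print(face, observed[face], expected[face], deviations[face])
```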
Calculating the chi-square statistic:
\[\chi^2 = \frac{(5-10)^2}{10} + \frac{(8-10)^2}{10} + \frac{(9-10)^2}{10} + \frac{(12-10)^2}{10} + \frac{(15-10)^2}{10} + \frac{(11-10)^2}{10} = \frac{25 + 4 + 1 + 4 + 25 + 1}{10} = \frac{60}{10} = 6.0\]
With 5 degrees of freedom (6 categories - 1), we can look up the p-value in a chi-square distribution table or use a calculator. In this case, the p-value is approximately 0.31.
Figure: bar chart of the observed frequencies (blue bars) compared to the expected frequency (red dashed line) for each face value of the die.
Since the p-value (0.31) is greater than the common significance level of 0.05, we fail to reject the null hypothesis. This means we don't have strong evidence to conclude that the die is unfair based on these results. It does not prove the die is fair; it only means that 60 rolls did not show a departure large enough to rule out chance.
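The table lookup and the comparison against 0.05 can also be done in code. The sketch below uses only the standard library and relies on the closed-form finite sums for the chi-square upper-tail probability, which hold when the degrees of freedom are a whole number. The helper name `chi_square_sf` is an assumption for illustration. A statistics library would normally be used instead.

```python
import math


def chi_square_sf(x, df):
    """Upper-tail probability P(X >= x) for a chi-square variable with integer df."""
    if df < 1:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2)) + 2*phi(sqrt(x)) * sum_r chi^(2r-1) / (1*3*...*(2r-1))
    root = math.sqrt(x)
    term, total = root, 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    normal_density = math.exp(-half) / math.sqrt(2.0 * math.pi)
    return math.erfc(math.sqrt(half)) + 2.0 * normal_density * total


observed = [5, 8, 9, 12, 15, 11]
expected = [10] * 6
alpha = 0.05  # common significance level

statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
degrees_of_freedom = len(observed) - 1
p_value = chi_square_sf(statistic, degrees_of_freedom)
reject_null = p_value < alpha

print(f"chi-square = {statistic:.3f}, df = {degrees_of_freedom}, p = {p_value:.3f}")
print("reject the null hypothesis" if reject_null else "fail to reject the null hypothesis")
```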