# Significance / Significant

## Definition(s)

• The confirmation, with a given Confidence Level, of a prior hypothesis, using a Statistical Estimate. The result is said to be Statistically Significant if all values within the Confidence Interval for the desired Confidence Level (typically 95%) are consistent with the hypothesis being true, and inconsistent with it being false. For example, if the hypothesis is that fewer than 300,000 Documents are Relevant, and a Statistical Estimate shows that 290,000 Documents are Relevant, plus or minus 5,000 Documents, we say that the result is Significant, because every value in the Confidence Interval (285,000 to 295,000) is below 300,000. On the other hand, if the Statistical Estimate shows that 290,000 Documents are Relevant, plus or minus 15,000 Documents, we say that the result is not Significant, because the Confidence Interval includes values (i.e., the values between 300,000 and 305,000) that contradict the hypothesis. 1
• Statistically significant means that the observed results are unlikely to have occurred by chance. The term is used in statistical decisions to judge whether a difference, for example, is large enough that it is unlikely to have happened by chance under the sampling distribution. In statistics generally, significance refers to whether the outcome is so unlikely under the null hypothesis (no real difference) that we reject the null hypothesis and accept the alternative. For example, we select a random sample of students from each of two schools and we measure their reading comprehension. The null hypothesis is that there is no difference between schools on reading comprehension. The motivated hypothesis is that there is a difference. If the difference between the mean (average) reading comprehension of these two samples is so large that it would be unlikely to arise by chance if the null hypothesis were true, then we say that the difference is significant, and that the two schools differ in their reading comprehension. It is a misnomer to speak about a significant random sample. Significance refers to this kind of hypothesis test, not the size of the sample. 2 3
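
Both definitions can be illustrated with a short Python sketch. It is only an illustration under stated assumptions, not a method taken from either glossary. The function names (`is_significant_upper_bound`, `permutation_p_value`), the reading scores, and the choice of a permutation test for the two-school example are all hypothetical; other tests, such as a t-test, could serve equally well.

```python
import random
from statistics import mean


def is_significant_upper_bound(estimate, margin, threshold):
    """First definition: the hypothesis is 'true value < threshold'.

    The result is significant only if every value in the interval
    [estimate - margin, estimate + margin] is below the threshold,
    which reduces to checking the interval's upper end.
    """
    return estimate + margin < threshold


def permutation_p_value(sample_a, sample_b, n_permutations=10_000, seed=0):
    """Second definition: how often does a difference in means at least
    as large as the observed one arise by chance alone?

    Under the null hypothesis (no real difference) the group labels are
    interchangeable, so we shuffle the pooled scores and re-split them.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    observed = abs(mean(sample_a) - mean(sample_b))
    pooled = list(sample_a) + list(sample_b)
    n_a = len(sample_a)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            at_least_as_extreme += 1
    # The +1 terms count the observed split itself, so p is never zero.
    return (at_least_as_extreme + 1) / (n_permutations + 1)


# The two document-count examples from the first definition.
print(is_significant_upper_bound(290_000, 5_000, 300_000))   # True
print(is_significant_upper_bound(290_000, 15_000, 300_000))  # False

# Hypothetical reading-comprehension scores for two schools.
school_a = [70, 71, 72, 73, 74, 75, 76, 77, 78, 79]
school_b = [50, 51, 52, 53, 54, 55, 56, 57, 58, 59]
p = permutation_p_value(school_a, school_b, n_permutations=2_000)
print(p, "significant at the 5% level" if p < 0.05 else "not significant")
```

The 5% cutoff used here corresponds to the 95% Confidence Level mentioned in the first definition. Note that the p-value depends on how large the difference is relative to the variation in the scores, which is why significance is a property of the test result and not of the sample.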

## Notes

1. Maura R. Grossman and Gordon V. Cormack, EDRM page & The Grossman-Cormack Glossary of Technology-Assisted Review, with Foreword by John M. Facciola, U.S. Magistrate Judge, 2013 Fed. Cts. L. Rev. 7 (January 2013).
2. Herb Roitblat, Search 2020: The Glossary.
3. Herb Roitblat, Predictive Coding Glossary.