Data dredging, also known as data snooping or p-hacking, is the misuse of data analysis to find patterns in data that can be presented as statistically significant, thus dramatically increasing and understating the risk of false positives. This is done by performing many statistical tests on the data and only reporting those that come back with significant results. Thus data dredging is also often a misused or misapplied form of data mining.
The process of data dredging involves testing multiple hypotheses using a single data set by exhaustively searchingโperhaps for combinations of variables that might show a correlation, and perhaps for groups of cases or observations that show differences in their mean or in their breakdown by some other variable.