• An Unsupervised Learning method in which Documents are segregated into categories or groups so that the Documents in any group are more similar to one another than to those in other groups. Clustering involves no human intervention, and the resulting categories may or may not reflect distinctions that are valuable for the purpose of a search or review effort. 1
  • Grouping documents or other objects by similarity. Two documents in the same cluster are more similar to each other than two documents drawn from different clusters. 2 3
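
To make the definitions above concrete, here is a minimal sketch of one way documents could be grouped by similarity without human labels. It is an illustration under stated assumptions, not the method any cited glossary prescribes: the function names (`vectorize`, `cosine`, `cluster`), the bag-of-words representation, the greedy single-pass strategy, and the `threshold` value are all hypothetical choices made for this example.

```python
import math
from collections import Counter


def vectorize(text):
    """Represent a document as a bag-of-words term-count vector (an assumption)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two term-count vectors; 0.0 if either is empty."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def cluster(docs, threshold=0.3):
    """Greedy single-pass clustering (hypothetical, for illustration only).

    Each document joins the first cluster whose first member is at least
    `threshold` similar to it; otherwise it starts a new cluster.
    Returns a list of clusters, each a list of document indices.
    """
    vectors = [vectorize(d) for d in docs]
    clusters = []
    for i, vec in enumerate(vectors):
        for members in clusters:
            if cosine(vec, vectors[members[0]]) >= threshold:
                members.append(i)
                break
        else:
            # No existing cluster was similar enough.
            clusters.append([i])
    return clusters


docs = [
    "merger agreement draft for review",
    "revised merger agreement draft",
    "lunch menu for friday",
    "friday lunch menu options",
]
groups = cluster(docs)
print(groups)
```

No category labels are supplied at any point; the groups emerge only from word overlap. That is also why, as the first definition notes, the resulting clusters may or may not match distinctions that matter for a particular search or review effort.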


  1. Maura R. Grossman and Gordon V. Cormack, EDRM page & The Grossman-Cormack Glossary of Technology-Assisted Review, with Foreword by John M. Facciola, U.S. Magistrate Judge, 2013 Fed. Cts. L. Rev. 7 (January 2013).
  2. Herb Roitblat, Search 2020: The Glossary.
  3. Herb Roitblat, Predictive Coding Glossary.