Machine Learning


  • The use of a computer Algorithm to organize or Classify Documents by analyzing their Features. In the context of Technology-Assisted Review, Supervised Learning Algorithms (e.g., Support Vector Machines, Logistic Regression, Nearest Neighbor, and Bayesian Classifiers) are used to infer Relevance or Non-Relevance of Documents based on the Coding of Documents in a Training Set. In Electronic Discovery generally, Unsupervised Learning Algorithms are used for Clustering, Near-Duplicate Detection, and Concept Search. 1
  • A process for using computer algorithms and methods to implement a decision, prediction, or categorization process. Machine learning processes typically apply information derived from examples to predict, categorize, or decide about previously unseen objects. Machine learning methods have largely been derived from the science of pattern recognition, brain simulation, learning theory, and decision theory. Machine learning is closely related to statistical modeling. 2
  • A branch of computer science that deals with designing computer programs to extract information from examples. For example, properties that distinguish between responsive and nonresponsive documents may be extracted from example documents in each category. The goal is to predict the correct category for future untagged examples based on the knowledge extracted from the previously classified examples. Example approaches include neural networks, support vector machines, Bayesian classifiers and others. 3
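
The supervised-learning idea in the definitions above (infer the category of unseen documents from a coded training set) can be sketched with a small Bayesian classifier. This is only an illustrative sketch, not the method of any cited source or review tool: it assumes a multinomial naive Bayes model with add-one smoothing and whitespace tokenization, and the names `tokenize`, `train`, `classify`, and `TRAINING_SET`, as well as the sample documents, are hypothetical.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Assumption: lowercase whitespace tokens stand in for document "features".
    return text.lower().split()


def train(training_set):
    """Count documents and words per label from (text, label) pairs."""
    doc_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for text, label in training_set:
        doc_counts[label] += 1
        for word in tokenize(text):
            word_counts[label][word] += 1
            vocabulary.add(word)
    return doc_counts, word_counts, vocabulary


def classify(model, text):
    """Return the label with the highest log-probability for an unseen document."""
    doc_counts, word_counts, vocabulary = model
    total_docs = sum(doc_counts.values())
    scores = {}
    for label in doc_counts:
        # Log prior: share of training documents coded with this label.
        score = math.log(doc_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in tokenize(text):
            if word not in vocabulary:
                continue  # words never seen in training carry no evidence
            # Add-one (Laplace) smoothing avoids zero probabilities.
            count = word_counts[label][word] + 1
            score += math.log(count / (total_words + len(vocabulary)))
        scores[label] = score
    return max(scores, key=scores.get)


# Hypothetical coded training set (illustrative data only).
TRAINING_SET = [
    ("merger agreement draft attached for review", "responsive"),
    ("revised merger terms and purchase price", "responsive"),
    ("lunch menu for the office party", "nonresponsive"),
    ("parking garage closed this weekend", "nonresponsive"),
]

model = train(TRAINING_SET)
prediction = classify(model, "please review the merger price")
print(prediction)
```

With this toy training set the unseen document is assigned to the "responsive" category, because its words occur more often in the responsive examples. A real review workflow would use far larger training sets and richer features, and would measure the classifier's accuracy before relying on it.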


  1. Maura R. Grossman and Gordon V. Cormack, EDRM page & The Grossman-Cormack Glossary of Technology-Assisted Review, with Foreword by John M. Facciola, U.S. Magistrate Judge, 2013 Fed. Cts. L. Rev. 7 (January 2013).
  2. Herb Roitblat, Search 2020: The Glossary.
  3. Herb Roitblat, Predictive Coding Glossary.