EDRM Search Guide Glossary

The EDRM Search Guide Glossary is part of the EDRM Search Guide.  The EDRM Search Guide focuses on the search, retrieval and production of ESI within the larger e-discovery process described in the EDRM Model.

Boolean Search

  • A search technique that uses Boolean logic operators, such as AND, OR, NOT, within N (w/5), and NOT within N (not w/5), to connect individual keywords or phrases within a single query. 1 2
  • A Keyword Search in which the Keywords are combined using operators such as “AND,” “OR,” and “[BUT] NOT.” The result of a Boolean Search is precisely determined by the words contained in the Documents. (See also Bag of Words method.) 3
  • The term “Boolean” refers to a system of logic developed by the 19th-century English mathematician George Boole. In Boolean searching, an “and” operator between two words results in a search for documents containing both of the words. An “or” operator between two words creates a search for documents containing either of the target words. A “not” operator between two words returns documents that contain the first word but not the second. 4 5 6
  • A search type using Boolean logic operators between search terms that indicate a relationship between them. An “AND” operator between two words or other values (for example, “pear AND apple”) means one is searching for documents containing both of the words or values, not just one of them. An “OR” operator between two words or other values (for example, “pear OR apple”) means one is searching for documents containing either of the words. 7
  • Mathematical query language developed by English mathematician George Boole in the 19th century. Boolean searching of text is based on the underlying logic functions of various true/false statements. Common Boolean operators are “and,” “but not,” and “within.” 8
  • A search for information using “AND,” “OR” and “NOT” commands, such as “Tom but not Jones” or “bankruptcy and trustee.” 9
  • The use of the terms “AND,” “OR” and “NOT” in conducting searches. Used to widen or narrow the scope of a search. 10
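
The operators described above map naturally onto set operations over document IDs. The following is a minimal sketch, assuming a tiny in-memory collection: the sample documents and the helper names `tokens`, `docs_with`, and `within` are hypothetical, and the exact semantics of w/N vary between search platforms.

```python
import re

# Hypothetical sample documents; the IDs and text are illustrative only.
DOCS = {
    1: "The bankruptcy trustee filed a report.",
    2: "Tom met Jones to discuss the bankruptcy.",
    3: "Tom reviewed the trustee agreement.",
}

def tokens(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def docs_with(term):
    """Return the set of document IDs whose text contains the term."""
    term = term.lower()
    return {doc_id for doc_id, text in DOCS.items() if term in tokens(text)}

def within(term_a, term_b, n):
    """Return IDs of documents where the two terms occur at most n words apart.

    Assumption: distance is the difference in word positions, in either order.
    """
    term_a, term_b = term_a.lower(), term_b.lower()
    hits = set()
    for doc_id, text in DOCS.items():
        words = tokens(text)
        pos_a = [i for i, w in enumerate(words) if w == term_a]
        pos_b = [i for i, w in enumerate(words) if w == term_b]
        if any(abs(a - b) <= n for a in pos_a for b in pos_b):
            hits.add(doc_id)
    return hits

both = docs_with("bankruptcy") & docs_with("trustee")    # bankruptcy AND trustee
either = docs_with("bankruptcy") | docs_with("trustee")  # bankruptcy OR trustee
tom_not_jones = docs_with("tom") - docs_with("jones")    # Tom BUT NOT Jones
near = within("tom", "trustee", 5)                       # tom w/5 trustee
```

AND narrows the result to the intersection of the two term sets, OR widens it to their union, and NOT subtracts the second set from the first.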

Concept Search

  • A search technique that finds words that are similar in concept to a query word. A concept search will return documents that relate to the same concept as the query word, regardless of whether the query word itself appears in the returned documents. Concept searches can be implemented as a simple thesaurus match, or by using sophisticated statistical analysis methods. Effectiveness of concept search in an e-discovery project depends greatly on the type of algorithm used and its implementation.
  • An industry-specific term generally used to describe Keyword Expansion techniques, which allow search methods to return Documents beyond those that would be returned by a simple Keyword or Boolean Search. Methods range from simple techniques such as Stemming, Thesaurus Expansion, and Ontology search, through statistical Algorithms such as Latent Semantic Indexing.
  • Maps relationships between each word and every other word in large sets of documents and then associates words based on the context in which they are used. Two techniques can be used to perform concept searches: a manually constructed thesaurus, which relates certain words to others, or semantic indexing, a fully automated method that shows associations among words based, in part, on statistical analysis of the occurrence and proximity of certain words to others. 1
  • Also called “thesaurus” or “related” searching; sometimes called “synonym searching.” Searches that provide other words similar or close in meaning to the primary word. 2
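
The simplest variant mentioned above, a thesaurus match, can be sketched as query expansion: the query word is replaced by itself plus its related terms before matching. This is only an illustration under stated assumptions; the thesaurus entries, the sample documents, and the function names `expand` and `concept_search` are hypothetical, and statistical methods such as Latent Semantic Indexing work quite differently.

```python
import re

# Hypothetical, hand-built thesaurus; real systems use far larger resources.
THESAURUS = {
    "car": {"automobile", "vehicle"},
    "lawyer": {"attorney", "counsel"},
}

# Hypothetical sample documents.
DOCS = {
    1: "The attorney reviewed the contract.",
    2: "The car was parked outside.",
    3: "An automobile was sold last year.",
}

def expand(term):
    """Return the query term together with its thesaurus entries."""
    term = term.lower()
    return {term} | THESAURUS.get(term, set())

def concept_search(term):
    """Return IDs of documents containing the term or any related term."""
    wanted = expand(term)
    return {
        doc_id
        for doc_id, text in DOCS.items()
        if wanted & set(re.findall(r"[a-z0-9]+", text.lower()))
    }
```

In this sketch, a search for "lawyer" returns document 1 even though the word "lawyer" never appears in it, which is the behavior the definitions describe.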

ESI / Electronically Stored Information

  • Electronically Stored Information, or ESI, is information that is stored electronically on numerous types of media, regardless of the original format in which it was created.
  • Electronically Stored Information: this is an all-inclusive term referring to conventional electronic documents (e.g. spreadsheets and word processing documents) and, in addition, the contents of databases, mobile phone messages, digital recordings (e.g. of voicemail) and transcripts of instant messages. All of this material needs to be considered for disclosure. 1
  • Used in Federal Rule of Civil Procedure 34(a)(1)(A) to refer to discoverable information “stored in any medium from which the information can be obtained either directly or, if necessary, after translation by the responding party into a reasonably usable form.” Although Rule 34(a)(1)(A) references “Documents or Electronically Stored Information,” individual units of review and production are commonly referred to as Documents, regardless of the medium. 2

Fuzzy Search

  • Fuzzy search finds variations of a word, such as misspellings. Typically, it computes some form of distance or similarity score between the specified word and the words in the corpus.
  • A search technique that identifies ESI based on terms close to another term, with closeness defined as a typographical difference and/or change. For example, snitch, switch, and swanky can all match swatch, depending on how many incorrect letters are allowed within the search threshold.
  • A search that locates words that closely match the spelling of the primary word. 1
  • A full-text search procedure that looks for exact matches as well as similarities to the search criteria, in order to compensate for spelling or OCR errors. 2
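
One common way to measure the "distance" these definitions refer to is Levenshtein edit distance, which counts single-character insertions, deletions, and substitutions. The sketch below assumes that measure; the glossary does not prescribe a specific algorithm, and the function names `edit_distance` and `fuzzy_match` are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        curr = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            curr.append(min(
                prev[j] + 1,         # delete ch_a
                curr[j - 1] + 1,     # insert ch_b
                prev[j - 1] + cost,  # substitute, or keep if equal
            ))
        prev = curr
    return prev[-1]

def fuzzy_match(query, vocabulary, max_distance):
    """Return vocabulary words within max_distance edits of the query, sorted."""
    return sorted(w for w in vocabulary if edit_distance(query, w) <= max_distance)

VOCABULARY = ["snitch", "switch", "swanky", "swatch"]
```

With the query "swatch", a threshold of 1 matches only "swatch" and "switch"; a threshold of 2 also admits "snitch", and a threshold of 3 admits "swanky". Raising the threshold widens recall at the cost of more false hits.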

Inverted Index

An index that maps a keyword to the list of documents that contain the keyword.
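
As a rough illustration, an inverted index can be held in a dictionary that maps each keyword to the set of document IDs containing it. The sample documents and the names `build_inverted_index` and `lookup` are hypothetical, and the tokenization (lowercased alphanumeric runs) is an assumption; production search engines add steps such as stemming and stop-word handling.

```python
import re
from collections import defaultdict

# Hypothetical sample documents.
DOCS = {
    1: "Pear and apple orchard",
    2: "Apple pie recipe",
    3: "Pear tart",
}

def build_inverted_index(docs):
    """Map each lowercased word to the set of document IDs that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(doc_id)
    return index

INDEX = build_inverted_index(DOCS)

def lookup(keyword):
    """Return the sorted list of document IDs that contain the keyword."""
    return sorted(INDEX.get(keyword.lower(), set()))
```

A keyword search then becomes a single dictionary lookup rather than a scan of every document, which is why indexes speed up retrieval.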

Keyword Index / Indexing

  • Indexing is a process that inventories the total content of a file and builds a searchable electronic index. This index typically maps each keyword to all the documents that contain it. Search indexes are designed to facilitate and expedite the retrieval of information. Search engines use both common and proprietary technology to build indexes and service search queries.
  • A technique that examines the ESI and builds a searchable electronic index. This index typically maps from a keyword to all the documents that contain the keyword.

Keyword Search

  • A common search technique that uses query words (“keywords”) and looks for them in ESI, using an index. A keyword search is a basic search technique that involves searching for one or more words within a collection of documents and returns only those documents that contain the search terms entered. The documents returned by the search engine are called the search results. Keywords often form a basic building block for constructing other more complex compound searches. Such compound searches use other search elements such as Boolean logic.
  • A very common search technique that uses query words (“keywords”) and looks for them in ESI, using an index.
  • A search in which all Documents that contain one or more specific Keywords are returned. 1
  • A method of searching for documents that possess keywords specified by a user. 2
  • A search using a full text search filter. A client search term list is applied to a full text index to find responsive files. 3
  • A search for documents containing one or more words that are specified by a user. 4

Phrase Search

  • A search consisting of multiple keywords separated by spaces to form a single phrase. For a document to match this search, the entire phrase as entered must be contained within the document.
  • The search phrase “Massachusetts Mutual” would locate text where the words are side by side. 1
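
The difference between a phrase search and a plain keyword search is that the words must appear side by side and in order. A minimal sketch, assuming case-insensitive matching on word tokens (an assumption, since the glossary does not specify case handling) and using the hypothetical helper names `tokens` and `contains_phrase`:

```python
import re

def tokens(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def contains_phrase(text, phrase):
    """Return True if the words of the phrase appear consecutively, in order."""
    words = tokens(text)
    target = tokens(phrase)
    if not target:
        return False
    n = len(target)
    return any(words[i:i + n] == target for i in range(len(words) - n + 1))
```

For example, `contains_phrase("Massachusetts Mutual issued the policy.", "Massachusetts Mutual")` is true, while a document reading "A mutual fund based in Massachusetts." contains both words but does not match the phrase.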

Privileged Documents

A set of documents that a Producing Party is not required to provide because they are protected by a privilege, such as Attorney-Client Privilege. The existence of such documents should be recorded in the Privilege Log.

Producing Party

A party that owns the complete collection of ESI and is responsible for producing the portion of that ESI deemed relevant to a legal case or legal enquiry.