Processing - Culling, Prioritizing and Triage

From EDRM

Culling and searching occur throughout the discovery process and are closely related components of any solid, defensible processing strategy. Simply defined, culling is the process of programmatically removing content that is irrelevant, while searching is the process of identifying content that is most likely relevant and will require review. Implemented effectively, the two together reduce and focus the universe of reviewable content, saving clients time and money for higher-value downstream activities.

More than 90% of collected electronic content can be non-responsive. When developing a culling and searching strategy, the objective should always be to identify the most relevant content first and move it downstream to the review team. To accomplish this, the discovery team needs to develop a general understanding of the collection. This understanding includes answers to questions such as:

  • How much data will we be receiving? How many custodians? What is the average amount per custodian?
  • What types of files (e.g. emails, word processing documents, spreadsheets) will we be receiving?
  • How will the data be delivered (e.g. on hard drives or DVDs)?
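
As an illustration, the volume and file-type questions above could be answered from a simple inventory of the collection. The following is a minimal sketch, assuming a hypothetical inventory of (custodian, filename, size in bytes) records; the function and field names are illustrative and do not come from any particular tool.

```python
from collections import Counter, defaultdict
from pathlib import PurePath


def summarize_collection(items):
    """Summarize an inventory of (custodian, filename, size_bytes) tuples."""
    bytes_per_custodian = defaultdict(int)
    files_by_extension = Counter()
    for custodian, filename, size_bytes in items:
        bytes_per_custodian[custodian] += size_bytes
        # Files without an extension are grouped under "(none)".
        extension = PurePath(filename).suffix.lower() or "(none)"
        files_by_extension[extension] += 1
    total = sum(bytes_per_custodian.values())
    custodians = len(bytes_per_custodian)
    return {
        "total_bytes": total,
        "custodians": custodians,
        "avg_bytes_per_custodian": total / custodians if custodians else 0,
        "files_by_extension": dict(files_by_extension),
    }


# Hypothetical inventory, for illustration only.
inventory = [
    ("alice", "budget.xlsx", 2000),
    ("alice", "memo.doc", 1000),
    ("bob", "mail.pst", 5000),
]
summary = summarize_collection(inventory)
```

Note that this sketch counts file types by extension only, which is a rough first pass rather than a reliable identification.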

There are myriad approaches and technologies for culling and searching. The key is to find the tools that best work for your unique requirements in a particular matter. Prior to selecting a particular culling and/or search technique or tool, it is important to understand what objectives the review team is trying to accomplish.

Understanding Background to Set Priorities

The amount of data being created is almost immeasurable and is increasing all the time. A triage and prioritization strategy is therefore necessary in order to move ahead in a timely and efficient manner.

As discussed in the Collection section, proper analysis of the key issues and collections can save considerable time and money later in the discovery process. The collection of information was originally based on factors such as:

  • Key custodians or issues
  • Preparation for key legal dates (depositions, hearings, filings)
  • Ease of collection

In triage and prioritization we can apply those same basic factors and then become more granular, addressing the more technical issues that arise from more detailed management of electronic data.

Planning

Even after collection and culling, the volumes and types of data can be overwhelming. As with any other large project, it is important to follow a methodology that breaks down the work ahead. The following process helps set priorities and make full use of processing throughput:

  1. Understand key legal issues to identify and process the most critical data as the top priority.
  2. Plan on processing data in line with key legal deadlines. Identify potential issues and address potential changes with the court or the other side.
  3. Prioritize by key positions, departments, and then by specific department members.
  4. Analyze data types and accessibility (e.g. backup or legacy types).
  5. Consider review team factors such as availability of experts, legal subject expertise, financial analysis, and foreign language.
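
The first four steps above can be sketched as a sort order over the data sets awaiting processing. This is only one possible reading of the list, assuming a hypothetical DataSet record whose fields stand in for the judgments the discovery team would make. Review team factors (step 5) are left out because they depend on staffing rather than on the data itself.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class DataSet:
    custodian: str
    issue_critical: bool  # touches a key legal issue (step 1)
    deadline: date        # nearest legal deadline it supports (step 2)
    custodian_rank: int   # 1 = key position, larger = lower priority (step 3)
    accessible: bool      # False for backup or legacy data (step 4)


def processing_order(data_sets):
    """Return the data sets in a suggested processing order."""
    return sorted(
        data_sets,
        key=lambda d: (
            not d.issue_critical,  # critical data first
            d.deadline,            # then earliest deadline
            d.custodian_rank,      # then key positions
            not d.accessible,      # readily accessible data before backups
        ),
    )


# Hypothetical data sets, for illustration only.
queue = [
    DataSet("analyst", False, date(2007, 2, 1), 3, True),
    DataSet("controller", True, date(2007, 3, 1), 2, False),
    DataSet("cfo", True, date(2007, 3, 1), 1, True),
]
ordered = [d.custodian for d in processing_order(queue)]
```

The relative weight of each factor is a design choice. A team facing an imminent deadline might reasonably sort on the deadline before anything else.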

Even with excellent culling techniques, the data needs further evaluation in order to understand it better, eliminate more of it where possible, identify future challenges, and decide which data to process. A critical analysis of the types of files in the collection can help make sense of even the largest collections. Understanding the file types gives insight into which applications the users relied on, as well as into the likely speed and challenges of the review process.

File Analysis

File analysis is a process in which an application is used to produce statistics on the types of files in the collection. The application can be run as an in-house tool, by a forensics expert, or by an electronic discovery vendor who may be processing the data. Some tools rely on the file extension, but more sophisticated tools analyze the file’s header information to determine the file type regardless of the extension. A file can be renamed with a different extension, for example to document.old, in an attempt to hide critical information. Tools that identify file type by file header are useful when renamed files are suspected.
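
To illustrate header-based identification, the sketch below compares the leading bytes of a file against a handful of well-known signatures and flags files whose extension does not match. The signature table is deliberately tiny, and the labels, the extension mapping, and the function names are assumptions made for this example. Production tools recognize far more formats.

```python
from pathlib import Path

# A few well-known leading-byte signatures ("magic numbers").
SIGNATURES = [
    (b"%PDF", "pdf"),
    (b"PK\x03\x04", "zip-container"),  # also used by .docx and .xlsx
    (b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1", "ole2-container"),  # legacy .doc, .xls
    (b"\x89PNG\r\n\x1a\n", "png"),
    (b"\xff\xd8\xff", "jpeg"),
]

# Illustrative mapping from extension to the label expected for it.
EXPECTED = {
    ".pdf": "pdf",
    ".zip": "zip-container",
    ".docx": "zip-container",
    ".xlsx": "zip-container",
    ".doc": "ole2-container",
    ".xls": "ole2-container",
    ".png": "png",
    ".jpg": "jpeg",
}


def sniff(header: bytes) -> str:
    """Identify a file type from its first bytes, ignoring the file name."""
    for magic, label in SIGNATURES:
        if header.startswith(magic):
            return label
    return "unknown"


def looks_renamed(name: str, header: bytes) -> bool:
    """True when the header is recognized but disagrees with the extension."""
    detected = sniff(header)
    expected = EXPECTED.get(Path(name).suffix.lower())
    return detected != "unknown" and detected != expected


def read_header(path, length=16):
    """Read the first bytes of a file on disk."""
    with open(path, "rb") as handle:
        return handle.read(length)
```

For example, looks_renamed("document.old", b"%PDF-1.4") returns True because the content is a PDF while the extension suggests otherwise. A file whose header is not in the table is reported as "unknown" rather than flagged.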

File types help determine which files will be processed for review and which will not. Some file types are generated by users in the normal course of business, while others are not created by custodians at all, such as the files that come with a computer’s operating system. In most cases, the custodian-created files are the ones that move on to be processed and reviewed.

It is important to keep as much of a custodian’s data together as possible so that it can be managed as a whole throughout processing and review. At this point the data falls into three groups: data that can be processed, data that will not be processed, and data with special handling needs.

Special handling needs include files that require non-standard applications to process or view. Companies may develop their own internal applications or use specialized applications such as accounting or computer-aided design (CAD) programs, and files from either can be critical in a matter. Certain files or certain custodians might require further forensic analysis depending on the nature of the matter. This would entail a forensics professional working with the collected data or going back to the original media to avoid spoliation.
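
The three groups described above can be sketched as a simple routing step that keeps each custodian's files together. The extension sets here are illustrative assumptions only. In practice the lists would be agreed for the matter, and identification would preferably rely on file headers rather than extensions.

```python
from pathlib import PurePath

# Illustrative extension lists, not a recommended standard.
SYSTEM_EXTENSIONS = {".dll", ".exe", ".sys"}  # not created by custodians
SPECIAL_EXTENSIONS = {".dwg", ".mdb"}         # e.g. CAD drawings, databases


def triage(filename):
    """Assign a file to one of the three groups."""
    extension = PurePath(filename).suffix.lower()
    if extension in SYSTEM_EXTENSIONS:
        return "do_not_process"
    if extension in SPECIAL_EXTENSIONS:
        return "special_handling"
    return "process"


def triage_by_custodian(items):
    """Group (custodian, filename) pairs by custodian, then by triage group."""
    buckets = {}
    for custodian, filename in items:
        groups = buckets.setdefault(
            custodian,
            {"process": [], "do_not_process": [], "special_handling": []},
        )
        groups[triage(filename)].append(filename)
    return buckets


# Hypothetical file list, for illustration only.
files = [
    ("alice", "memo.doc"),
    ("alice", "kernel32.dll"),
    ("alice", "plant_layout.dwg"),
    ("bob", "ledger.mdb"),
]
result = triage_by_custodian(files)
```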

Review Team Factors

Knowing in advance that certain file types will prove difficult can help avoid delaying other information in the process. Files that contain complex data, such as relational databases, can be routed out of the standard process so that they can be analyzed for truly responsive data and put into a format suitable for production.

Prioritization should also take into account the people who will be reviewing the information. Review team factors such as legal expertise, domain knowledge (e.g. scientists or accountants), and foreign language skills need to be considered in the prioritization process.

Using Key Words for Prioritization

In some cases the collected data will serve as a central repository that is searched again and again to retrieve key documents. If the collection is being used in this way, keyword searching can help prioritize the data. With this approach it is important to have flexible options for either retrieving the same documents again or eliminating any duplicates, depending on the organization and needs of the matter.
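
As a rough illustration of keyword-based prioritization with an optional duplicate filter, the sketch below ranks documents by the number of whole-word keyword hits and, when asked, drops documents whose text is identical to one already seen. The scoring rule, the use of a SHA-256 hash of the text as the duplicate test, and all names are assumptions made for this example, not a description of any specific product.

```python
import hashlib
import re


def prioritize(documents, keywords, drop_duplicates=True):
    """Rank (doc_id, text) pairs by keyword hits, highest first.

    Returns a list of (doc_id, hits). Documents with equal hits keep
    their original relative order.
    """
    patterns = [
        re.compile(r"\b" + re.escape(word) + r"\b", re.IGNORECASE)
        for word in keywords
    ]
    seen = set()
    scored = []
    for doc_id, text in documents:
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if drop_duplicates and digest in seen:
            continue  # identical text was already ranked
        seen.add(digest)
        hits = sum(len(pattern.findall(text)) for pattern in patterns)
        scored.append((doc_id, hits))
    # sorted() is stable, so ties stay in input order.
    return sorted(scored, key=lambda pair: -pair[1])


# Hypothetical documents, for illustration only.
docs = [
    ("d1", "Quarterly forecast attached."),
    ("d2", "The merger forecast; merger terms follow."),
    ("d3", "Quarterly forecast attached."),
    ("d4", "Lunch on Friday?"),
]
ranked = prioritize(docs, ["merger", "forecast"])
ranked_with_duplicates = prioritize(docs, ["merger", "forecast"], drop_duplicates=False)
```

The drop_duplicates flag reflects the flexibility described above: the same repository can be queried with duplicates suppressed for review, or with them retained when each custodian's copy matters.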
