An Algorithm Wants Your Job | Practical Law

Predictive coding programs are poised to become standard in e-discovery practice in the near future. As more courts weigh in on predictive coding, it is increasingly clear that soon there no longer will be a question of whether predictive coding can be used. Instead, counsel should focus on how and when this technology should be applied.

Practical Law Legal Update 2-567-8645 (Approx. 4 pages)

by Practical Law Litigation
Law stated as of 13 May 2014
USA (National/Federal)
The discovery phase of litigation inevitably includes a degree of searching and reviewing clients' and opposing parties' electronically stored information (ESI), including e-mails, e-mail attachments, Excel spreadsheets and documents generated through a word processing system such as Microsoft Word. For an overview of e-discovery generally, see Practice Note, E-Discovery in the US: Overview.
Traditionally, the gold standard for identifying potentially responsive ESI has been to electronically apply keyword search terms to the universe of ESI using Boolean logic and proximity connectors (for example, "stock /2 option"). After the search terms are applied, attorneys review every document that contains the search terms to determine whether it is relevant or privileged. Because the legal industry is not an early adopter of technological advances, this traditional method of human review has lingered even in the face of staggering volumes of ESI. As many junior attorneys can attest, reviewing high volumes of potentially relevant ESI for information that actually is relevant can resemble searching for a needle in a haystack.
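To make the traditional approach concrete, the following is a minimal Python sketch of how a proximity search such as "stock /2 option" might be applied to a small set of documents. It assumes that "/2" means the two terms appear within two words of each other in either order; the exact meaning of a proximity connector varies by review platform. The function names, document identifiers and sample text are all hypothetical and are not drawn from any particular e-discovery product.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())

def within(tokens, first, second, max_gap):
    """Return True if `first` and `second` occur within `max_gap`
    word positions of each other, in either order (an assumed reading
    of the "/n" proximity connector)."""
    first_positions = [i for i, t in enumerate(tokens) if t == first]
    second_positions = [i for i, t in enumerate(tokens) if t == second]
    return any(
        abs(i - j) <= max_gap
        for i in first_positions
        for j in second_positions
    )

# Hypothetical sample documents standing in for the universe of ESI.
documents = {
    "email-001": "Please confirm the stock option grant before Friday.",
    "email-002": "The option to renew the lease expires in May; stock levels are fine.",
    "memo-003": "Employees may exercise any option on stock after vesting.",
}

# Apply the search "stock /2 option" to every document.
hits = [
    doc_id
    for doc_id, text in documents.items()
    if within(tokenize(text), "stock", "option", 2)
]
print(hits)
```

Every document the search returns still goes to an attorney for relevance and privilege review, which is why this method scales poorly as ESI volumes grow.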
Relatively recently, though, a number of companies have developed advanced algorithms to electronically identify and cull potentially relevant ESI. These more advanced computer-assisted methods of review include predictive coding, which is the use of a software program to identify documents that are relevant to a particular case or issue and then rank those documents based on the level of potential relevance. Predictive coding generally entails several parts, including:
  • A machine learning process in which humans train the program by identifying a "seed set" of relevant documents. Attorneys code each document in the seed set for relevance, privilege and any other specific issues. The program then analyzes the seed set to understand the types of documents that are relevant to the case. By applying its algorithms to the seed set documents' content and coding, the program learns to identify relevant documents and offers preliminary coding decisions. This is an iterative process that may repeat several times until the program's predictive coding is sufficiently accurate when compared to the attorneys' coding.
  • A combination of different algorithmic tools. The actual range of potential methodologies and algorithms to perform the training process is sweeping. While every predictive coding tool has its unique algorithms and features, they tend to use similar techniques and processes. Some of the more common methodologies include:
  • Organization of data. The documents may be organized by one or more of the following methods:
    • relevance ranking;
    • clustering; and
    • sorting documents by issue.
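The training and ranking steps described above can be illustrated with a short Python sketch. It is not any vendor's method: commercial predictive coding tools use their own algorithms, and this example substitutes simple naive-Bayes-style word weights as an assumption. The seed set, document identifiers, function names and the "score above zero means relevant" cutoff are all hypothetical.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())

def train(seed_set):
    """Learn a weight for each word from attorney-coded seed documents.

    seed_set is a list of (text, is_relevant) pairs. A positive weight
    means the word appeared more often in documents coded relevant.
    """
    relevant, other = Counter(), Counter()
    for text, is_relevant in seed_set:
        (relevant if is_relevant else other).update(tokenize(text))
    vocabulary = set(relevant) | set(other)
    # Add-one smoothing keeps unseen words from producing log(0).
    rel_total = sum(relevant.values()) + len(vocabulary)
    oth_total = sum(other.values()) + len(vocabulary)
    return {
        word: math.log((relevant[word] + 1) / rel_total)
        - math.log((other[word] + 1) / oth_total)
        for word in vocabulary
    }

def score(weights, text):
    """Sum the weights of known words; higher suggests more likely relevant."""
    return sum(weights.get(word, 0.0) for word in tokenize(text))

def rank(weights, documents):
    """Return document ids ordered from most to least likely relevant."""
    return sorted(
        documents,
        key=lambda doc_id: score(weights, documents[doc_id]),
        reverse=True,
    )

def agreement(weights, coded_sample):
    """Fraction of attorney-coded documents where the program's call
    (score above zero, an assumed cutoff) matches the attorney's coding."""
    matches = sum(
        (score(weights, text) > 0) == is_relevant
        for text, is_relevant in coded_sample
    )
    return matches / len(coded_sample)

# Hypothetical seed set coded by attorneys for relevance.
seed_set = [
    ("board approved the stock option grant", True),
    ("backdated option grant for executives", True),
    ("lunch menu for the office party", False),
    ("parking garage closed for repairs", False),
]
weights = train(seed_set)

# Relevance ranking of documents nobody has reviewed yet.
unreviewed = {
    "doc-A": "office party parking instructions",
    "doc-B": "option grant schedule for the board",
    "doc-C": "quarterly newsletter",
}
ranking = rank(weights, unreviewed)
print(ranking)

# Compare the program's calls with attorney coding on a check sample.
check_sample = [
    ("stock option memo to the board", True),
    ("office garage repairs notice", False),
]
print(agreement(weights, check_sample))
```

In an iterative workflow, a low agreement figure would prompt attorneys to code more documents, add them to the seed set and retrain, repeating until the program's coding tracks the attorneys' coding closely enough for the matter at hand.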
For an explanation of the key terms applicable to e-discovery generally and predictive coding specifically, see Article, E-Discovery Glossary.
Predictive coding technology initially caught on slowly. However, that hesitance has started to dissolve as courts recently have begun to bless the use of predictive coding. In addition, the potential advantages of using predictive coding in all aspects of case preparation, from early case assessment through trial, have become more widely publicized.
Computer analytics such as predictive coding programs are poised to become standard practice in e-discovery in the near future. It is possible that some courts eventually may consider them mandatory for large cases. The real uncertainty is not whether predictive coding will be used, but how and when this technology should be applied. Consequently, counsel would be well served to educate themselves on predictive coding now.
Practical Law's Practice Note, Long Live Predictive Coding examines the basic technology behind predictive coding, common predictive coding tools, how to best use predictive coding in all aspects of litigation, case law addressing predictive coding and the advantages and disadvantages of using this technology.
In addition, Practical Law has many resources to guide counsel through e-discovery generally. In particular, the E-Discovery Toolkit provides continuously maintained resources designed to help counsel and litigants meet their e-discovery obligations under the Federal Rules of Civil Procedure and developing case law. The resources in this Toolkit also provide practical tips for maintaining, providing and producing ESI in a cost-effective and timely manner.