Predictive Coding: Will E-Discovery Swallow The Judicial System?

In an earlier article, we discussed the significance of Magistrate Judge Andrew J. Peck’s (SDNY) opinion in Da Silva Moore v. Publicis Groupe (2/24/12), a highly publicized decision that approved of the use of computer-assisted review in place of “eyes on” document review.

Eric Seggebruch, the Regional Manager for eDiscovery at Recommind, Inc., testified before Judge Peck as an expert witness during a Da Silva Moore discovery hearing. Seggebruch has authored a helpful article titled “Electronic Discovery Utilizing Predictive Coding” that provides both technical and practical insights concerning predictive coding and its likely future in the legal marketplace.

At its heart, the debate over electronically stored information (ESI) revolves around the concept of proportionality. By way of example, Da Silva Moore is an employment discrimination case with a universe of some three million records subject to review for document production purposes. Proportionality asks whether the costs involved in identifying potentially relevant documents are justified by what is at stake in the underlying litigation.

Nearly one year after Judge Peck’s decision in Da Silva Moore, the attorneys in that case reportedly continue to submit extensive (and presumably costly) briefs on ESI discovery issues. It is for this reason that the title of this article asks whether e-discovery will swallow the judiciary. Leaving aside the staggering costs to the parties in litigation, the judiciary may lack the resources to address these issues, considering the time and intensity with which these battles are fought.

In evaluating the efficacy of predictive coding, Seggebruch tells us that there are two critical terms of art: “recall” and “precision.” “Precision” asks how many documents one has to look at to find a relevant document; in other words, it is the share of the documents reviewed that turn out to be relevant. By way of example, if you review one hundred documents and find fifty relevant documents, you have achieved 50% precision. “Recall” may be the more important of the two. It is the share of all the relevant documents that the search actually finds. If a search of one hundred documents brings back twenty-five relevant documents but misses another twenty-five relevant documents, then “recall” is only 50%. The rate of “recall” in any document production, whether predictive coding or “key word” searches are used, is critical to the integrity of the process.
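
The arithmetic behind these two measures can be sketched in a few lines of Python. This is an illustration only: the function names are hypothetical, and the counts are simply the ones used in the examples above, not figures from Seggebruch's article or from any vendor's software.

```python
def precision(relevant_found: int, documents_reviewed: int) -> float:
    """Share of the documents reviewed that turned out to be relevant."""
    if documents_reviewed == 0:
        raise ValueError("no documents were reviewed")
    return relevant_found / documents_reviewed


def recall(relevant_found: int, relevant_missed: int) -> float:
    """Share of all relevant documents that the search actually found."""
    total_relevant = relevant_found + relevant_missed
    if total_relevant == 0:
        raise ValueError("there are no relevant documents to find")
    return relevant_found / total_relevant


# Example from the text: 100 documents reviewed, 50 of them relevant.
print(f"precision: {precision(50, 100):.0%}")  # precision: 50%

# Example from the text: 25 relevant documents found, 25 missed.
print(f"recall: {recall(25, 25):.0%}")  # recall: 50%
```

Note that recall depends on the number of relevant documents that were missed, a figure that in practice is not known and has to be estimated, typically by sampling the documents that were not produced.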

With increased acceptance of predictive coding over time, the “key word” paradigm, with which most lawyers and judges are familiar, will most likely change. According to Seggebruch, scientific studies have shown that “key word” document analyses are less efficacious than predictive coding. However, opposing counsel cannot complain about the level of “recall” obtained from a “key word” analysis if they had significant involvement in selecting the “key words” used in the search.

As an indication of how quickly the technology in this field is moving, in some cases lawyers are now demanding ESI discovery “do overs.” These lawyers concede that when their adversary made its initial ESI production early in the case, it adhered to the best technology then available. They argue, however, that since that initial production, new ESI techniques, such as predictive coding, have become available that could provide better results. To date, courts that have considered the “do over” petitions have either rejected them out of hand or required the requesting party to assume the costs.

Computers Replacing Lawyers In Reviewing Documents?

Those of us who work on document-intensive litigation should take note of Magistrate Judge Andrew J. Peck’s (SDNY) opinion released on February 24, 2012 in Monique Da Silva Moore, et al. v. Publicis Groupe and MSL Group, Case 11 Civ. 1279 (ALC)(AJP). Judge Peck’s decision may be the first federal court opinion approving the use of computer-assisted review in place of “eyes on” document review. Citing recent studies, Judge Peck states “while some lawyers still consider manual review to be the ‘gold standard,’ that is a myth, as statistics clearly show that computerized searches are at least as accurate, if not more so, than manual review…. While this Court recognizes that computer-assisted review is not perfect, the Federal Rules of Civil Procedure do not require perfection.”

In a thoughtful guest blog post (from which the photo is reproduced here), Matthew Nelson discusses the significance (or not) of both Judge Peck’s case and a second case in the Northern District of Illinois, the Hon. Nan R. Nolan presiding. In that case, Kleen Products LLC v. Packaging Corporation of America et al., the plaintiffs are seeking a court order requiring defendants, among other things, to use predictive coding technology in responding to their discovery requests.

Computer-assisted review, or, as it is sometimes called, predictive coding, uses a sample set or “seed set” of documents that is reviewed for responsiveness. The “seed set” can then be made available to opposing counsel to approve the responsive/non-responsive determinations made. Interestingly, at least in this case, the court noted that “All of this review to create the seed set was done by senior attorneys (not paralegals, staff attorneys or junior associates).” The seed set is then fed into a program that creates a logic (based on the seed set determinations) and extrapolates to the universe (the negotiated set of data). Predictive coding, in essence, attempts to take the place of burdensome, expensive and time-consuming document review.
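
For readers curious about what “creating a logic” from a seed set might look like, the following Python sketch shows the general idea on a toy scale. It is an assumption-laden illustration, not the method used in Da Silva Moore or by any vendor: the word-counting (naive Bayes style) scoring is a simple stand-in technique, and the function names and sample documents are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    """Split a document into lowercase words (a deliberately crude tokenizer)."""
    return text.lower().split()


def train(seed_set):
    """Count words in the seed set, separately for each reviewer determination.

    seed_set is a list of (text, is_responsive) pairs coded by attorneys.
    """
    counts = {True: Counter(), False: Counter()}
    for text, is_responsive in seed_set:
        counts[is_responsive].update(tokenize(text))
    return counts


def score(counts, text):
    """Return a log-odds score; positive leans responsive, negative leans not."""
    vocab = set(counts[True]) | set(counts[False])
    totals = {label: sum(c.values()) for label, c in counts.items()}
    total = 0.0
    for word in tokenize(text):
        if word not in vocab:
            continue  # words never seen in the seed set carry no signal here
        # Add-one smoothing keeps unseen word/label pairs from producing zeros.
        p_resp = (counts[True][word] + 1) / (totals[True] + len(vocab))
        p_non = (counts[False][word] + 1) / (totals[False] + len(vocab))
        total += math.log(p_resp / p_non)
    return total


# Hypothetical seed set: documents already coded by reviewing attorneys.
seed_set = [
    ("promotion denied after maternity leave", True),
    ("pay gap complaint raised with human resources", True),
    ("lunch menu for the office party", False),
    ("parking garage closed for repairs", False),
]
model = train(seed_set)

# Hypothetical "universe": documents no one has reviewed yet.
universe = [
    "complaint about promotion and pay",
    "office parking and lunch schedule",
]

# Rank the unreviewed documents from most to least likely responsive.
ranked = sorted(universe, key=lambda doc: score(model, doc), reverse=True)
for doc in ranked:
    print(f"{score(model, doc):+.2f}  {doc}")
```

The sketch also shows why the quality of the seed set matters so much: the program can only extrapolate from the determinations it is given, so errors made in coding the seed set are carried through to the entire universe of documents.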

As the opinion suggests, predictive coding will not work in all cases. According to Judge Peck, “What the Bar should take away from this Opinion is that computer-assisted review is an available tool and should be seriously considered for use in large-data-volume cases where it may save the producing party (or both parties) significant amounts of legal fees in document review.” While the court discussed possible objections under the FRCP, FRE 702 and Daubert, it did not sufficiently address what happens when one party wants to use predictive coding and the other party objects. In this case, to protect privileged documents that could conceivably be swept in by the computer logic, the parties entered into a clawback agreement, which was entered as a court order. Unfortunately, in government investigations, parties do not always have the opportunity to have a court enter such an order. Predictive coding should therefore be used cautiously in governmental investigations, perhaps with some “eyes on” document review still required.

Predictive coding could provide substantial benefits to clients. On the other hand, law firms whose business models depend on leveraging large teams of associates and staff attorneys to conduct document review will increasingly have to explain to their clients why such costly efforts are necessary. Technology may allow medium-sized firms to compete more effectively with large firms in cases with substantial discovery. In short, predictive coding makes good sense for the courts, the clients and the Bar.