What We Learned: Legal Tech NY 2013

Last week, I made my annual pilgrimage to the Big Apple to wade among thousands of legal industry professionals, the majority of whom are involved in some phase of the discovery process. And as is normally the case, the three-day event became a blur.

There are just too many people to see, too many technology platforms to demo, too many sessions to attend. However, Legal Tech does provide that one time of year to focus on the wide array of technology and ancillary service offerings that are integral to our profession. Moreover, it provides a great opportunity to keep abreast of national and global trends in technology and its application to the legal practice.

This year was particularly fruitful, so I’ve come back with 10 observations that are related to the issue du jour: Technology-Assisted Review (TAR). I will delve into more detail in multiple blog posts over the next few weeks, but for now, here are my thoughts and summary on a panel that took on some of the bigger issues.

Part 1

TAR really took center stage at this year’s Legal Tech. Last year’s discussions of predictive coding technologies (now generically referred to as TAR, correctly or incorrectly) largely focused on the uncertainties of computer review and “black box” technology. This year, it was clear that 2012 was the year in which TAR, in all its different varieties, was embraced by legal practitioners. This made the panels much less theoretical and much more practical.

Indeed, the panel entitled “Case Studies and Lessons Learned from the Practical Use of Technology-Assisted Review” offered a guided tour of how each of the four panelists uses TAR in his or her area of specialty.

10 observations from the panel discussions

  1. There is no one-size-fits-all technology or methodology. Sorry, there is no ‘easy’ button. While all panelists had regularly used TAR, no two of them used the same technology or even the same approach to using it. However, the panelists were consistent in that each had clearly defined processes that he or she followed for each matter. We wholeheartedly agree that there are many tools with different features and benefits, and that the key to successfully and defensibly utilizing technology is to have customized processes.

  2. You can’t separate the law from technology. As technology continues to advance at warp speed, there’s still no substitute for “good lawyering.” To effectively use and defend the use of TAR, the attorney should follow the same principles that are part of any successful legal strategy. The first is to talk to your client. It is imperative that the attorney spend time with the client and ask questions designed to identify relevant information and example documents that can be used in creating seed sets. Ask about acronyms, where the client stores data and whom the client communicated with on the other side, so that you can develop your collection and review strategy.

  3. There is no case that is too small. The potential benefits of using TAR for large data populations are generally well accepted. However, the panelists agreed that TAR is still helpful for smaller data populations, in cases with as few as 15,000 documents. In fact, the obstacle to using the technology on cases under 15,000 documents is not that the technology can’t assist in the review; rather, it is that if doing so requires you to seek out a new vendor with TAR technology, it may not be worth the additional time and effort. Conversely, if you already have a relationship with a technology vendor and have processes built around the applicable TAR technology, TAR is especially helpful in resolving smaller disputes, as the costs can be reduced dramatically and the risks of using the technology are much lower.

  4. You do not have to be a technology expert to use TAR. The panelists were asked how many times they had to explain the mathematics behind the algorithms used to train a predictive coding tool. All but one had never been asked to explain the underlying technology used in the review. The exception was a panelist who happened to have developed her own proprietary TAR module and had explained it for other purposes.

  5. Process, process, process. Create a process. Document your process. You should be able to clearly present the steps taken to identify responsive documents, and that process should establish good faith and reasonableness. The processes described to “train” the computer in TAR methods differed from panelist to panelist, but each described a methodical process that he or she followed to (1) create the “seed set” and (2) validate the results.

  6. Utilize the entire tool box to create a “seed set” to train the computer. This gets into the weeds a little, and I plan to post separately on this vital aspect of predictive coding, but the crux of the matter is that key terms and concept clustering are still used in many TAR platforms.

    Key terms as a method to create seed sets. One panelist “uses key terms for inclusion but never for exclusion.” So while she will populate a seed set with key term hits, she will not exclude documents without hits from the opportunity to be brought into the seed set through other methodologies (e.g., random sampling or concept clusters).

    Clustering as a method to create seed sets. One advantage of the clustering approach is that you are not limiting the scope of the universe by using key terms, which are typically inadequate if they are the only methodology employed for identifying responsive documents.

    Note: In our experience, not all concept/content clustering technology is alike. In fact, some is virtually useless because of the methodology used to create the clusters. Many programs “cluster” documents that seemingly have no substantive relation to one another, and certainly not with enough reliability to create “seed sets” for TAR. On the other hand, concept clusters that are narrowly defined to contain only highly similar documents can be very helpful in creating useful seed sets and in eliminating documents that have no value to the case or to training. With the right concept clustering technology, sampling the documents in a “cluster” is similar to the old practice of going through a warehouse of banker’s boxes full of documents: looking at the outside of each box for a label (or at any available indices of the boxes), opening the boxes up and sampling the documents. By viewing the folder names and glancing through the documents, the reviewer could very quickly make a reasonable determination as to the contents of a box and decide whether the box should be “in” or “out.” There would be no need to look at every document to make this determination.
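To make the “inclusion but never exclusion” idea concrete, here is a minimal sketch in Python. It is only an illustration, not any panelist’s tool or any vendor’s API: the `docs` dictionary, the function name `build_seed_candidates` and the plain substring matching are all assumptions made for the example. Key term hits always go to the reviewer, and documents without hits remain eligible through a random sample.

```python
import random


def build_seed_candidates(docs, key_terms, random_sample_size, seed=0):
    """Pick documents to put in front of a reviewer for seed-set coding.

    docs: dict mapping a document id to its text (hypothetical structure).
    Returns (hits, sampled): ids matched by a key term, and ids drawn at
    random from the documents that matched no key term.
    """
    terms = [t.lower() for t in key_terms]

    # Inclusion: every document that hits a key term is a candidate.
    hits = {
        doc_id
        for doc_id, text in docs.items()
        if any(term in text.lower() for term in terms)
    }

    # Never exclusion: documents without a hit are still eligible,
    # here through a simple reproducible random sample.
    remaining = sorted(set(docs) - hits)
    rng = random.Random(seed)
    sampled = set(rng.sample(remaining, min(random_sample_size, len(remaining))))

    return hits, sampled
```

In a real platform the random sample could be supplemented or replaced by concept clusters; the point of the sketch is only that the key term filter adds documents to the candidate pool and never removes any.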

  7. There is no magic number with respect to how many documents should be reviewed to “train” the computer. The key is not the number but the richness and representative nature of the seed set. The goal of creating any seed set is to find as many representative documents in the population as possible, so that the computer can apply analytics. This is often not all done at the outset; rather, it’s an iterative process in which you continue to “train” the computer as you find more and more representative documents (e.g. “active learning”).

  8. Human reviewers are critical to the TAR process. This is the case for two main reasons:

    • Training a predictive coding tool requires attorneys with significant experience (preferably litigation) and knowledge of the client, case and substance, as the decisions made to train the tool have a much larger impact than an individual reviewer’s decision on an individual document.

    • Reviewing the documents predicted as “responsive.” The only unanimous point of agreement on the panel was that once the predictive coding technology has identified the likely responsive documents, a 100% review, document by document, is recommended for the documents that would be produced. There are two primary reasons for reviewing the predicted relevant documents: (1) privilege and (2) knowledge of your production. The panelists agreed that, to date, TAR technologies have not been as successful in identifying responsive PRIVILEGED documents; therefore, this is an important function for a human reviewer to carry out. All agreed that when you are producing documents, the attorney should be aware of the documents being turned over. The first time attorneys see a document should not be during depositions of their clients.

      That being said, the panelists noted a few situations that might warrant sampling the proposed results instead of a 100% review of the predicted responsive set: second-request situations and third-party subpoenas.

  9. Effective utilization of TAR saves significant time and money, and is defensible. One of the panelists described a case that was originally reviewed in linear fashion, using 20 to 30 attorneys over a six-month period. By circumstance, several years later the court ordered a re-review of the data for different objectives. By using TAR, it took one attorney one-and-a-half weeks to complete the work of five associates. Depending on the tool selected and the methodology deployed, TAR offers a tremendous opportunity to cull a lot of non-relevant material and to eliminate much of the attorney review time otherwise spent sorting through the mountains of non-responsive documents typically found in any given case (usually only 10% or less of the documents in a document review are responsive). By utilizing TAR, it is possible to increase the responsive rate of any review to 50% or above, which permits the attorney reviewers to perform more in-depth and substantive analysis without wasting time and money reviewing spam or other clearly non-relevant material.

  10. Validate your results. Do your own validation/null set sampling. Be prepared to show that a reasonable process was undertaken to identify the documents that were not reviewed on a document-by-document basis. This is no different than with any other data reduction methodology (e.g., key term development, sampling, testing and refinement), but it is always a crucial step in tying up the loose ends of your process.
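As a rough illustration of null set sampling, the Python sketch below draws a random sample from the documents that were not reviewed and uses the reviewer’s coding of that sample to estimate how many responsive documents may remain behind. The function names, the list-of-booleans input and the simple point estimate are assumptions made for this example; none of it comes from the panel or from a particular TAR product, and a real validation protocol would also address sample size and confidence levels.

```python
import random


def draw_null_set_sample(null_set_ids, sample_size, seed=0):
    """Pick a reproducible random sample of ids from the unreviewed (null) set."""
    rng = random.Random(seed)
    ids = sorted(null_set_ids)
    return rng.sample(ids, min(sample_size, len(ids)))


def estimate_missed_responsive(sample_codes, null_set_size):
    """Estimate the responsive documents left in the null set.

    sample_codes: list of booleans, True where the reviewer coded the
    sampled document as responsive.
    Returns (rate, projected_count): the share of the sample that was
    responsive, and that share projected onto the whole null set.
    """
    if not sample_codes:
        raise ValueError("sample is empty")
    rate = sum(sample_codes) / len(sample_codes)
    return rate, round(rate * null_set_size)
```

For instance, if a reviewer finds 2 responsive documents in a sample of 100 drawn from a null set of 40,000 documents, the sketch returns a rate of 2% and a projection of roughly 800 responsive documents left unreviewed. Recording the sample, the coding decisions and the resulting estimate is one way to document that the process was reasonable.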

I’ll post the follow-up entries in my LTNY series here in the upcoming weeks.

Legal Tech NY 2013 Panel
"Case Studies and Lessons Learned from the Practical Use of Technology-Assisted Review"

Thomas Lidbury, partner, Drinker Biddle & Reath
Alan Winchester, partner, Harris Beach
Maura Grossman, counsel, Wachtell, Lipton, Rosen & Katz
Jennifer Keadle Mason, managing partner, Mintzer, Sarowitz, Zeris, Ledva & Meyers
