In July, I posted on “computer assisted review” (CAR). After that, I had the pleasure of meeting with Ben Legatt of Shepherd Data Services. Shepherd Data offers a “CAR product,” and Ben offered to discuss how it works. The following is an edited version of my talk with Ben about some of the mechanics of CAR.
[Full Disclosure: Shepherd Data is a sponsor of Minnesota Litigator. This post, however, is NOT PAID ADVERTISING. Shepherd Data neither paid for this post nor requested that Minnesota Litigator write this post.]
ML: Help me understand how “CAR,” or computer assisted review, works.
Ben Legatt: Essentially, CAR software takes the entire universe of documents, or just the documents that you actually want to analyze, and treats them all as “a bag of words,” you might say. It doesn’t care what language the documents are in. It doesn’t actually understand the words at all. What it does is look at the relationship of words to other words in your data set and the proximity of different words to each other.
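To make the “bag of words” idea concrete, here is a minimal sketch in Python. It is an illustration only, not the code behind Shepherd Data's product or any other CAR tool. The function names (`bag_of_words`, `cooccurrence`), the `window` parameter, and the sample sentences are all assumptions made up for this example. The first function counts words and throws away their order; the second counts how often two different words appear close to each other, which is one simple way to capture "proximity."

```python
from collections import Counter

def bag_of_words(text):
    """Lowercase, split on whitespace, and count tokens; word order is discarded."""
    return Counter(text.lower().split())

def cooccurrence(texts, window=2):
    """Count pairs of distinct words that appear at most `window` tokens apart."""
    pairs = Counter()
    for text in texts:
        tokens = text.lower().split()
        for i, left in enumerate(tokens):
            # Look only at the next `window` tokens after position i.
            for right in tokens[i + 1 : i + 1 + window]:
                if left != right:
                    # Sort the pair so (a, b) and (b, a) share one counter.
                    pairs[tuple(sorted((left, right)))] += 1
    return pairs

# Hypothetical sample documents, invented for illustration.
docs = [
    "the invoice was sent to the vendor",
    "the vendor disputed the invoice",
]

bag = bag_of_words(docs[0])
pairs = cooccurrence(docs)

print(bag.most_common(3))
print(pairs.most_common(3))
```

Nothing in this sketch depends on the meaning of the words or on the language they are written in. It only counts which strings occur and which strings occur near each other.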
It essentially creates a matrix of concepts. It can then reorganize, in a way, all the case documents in your database by how conceptually similar they are. The concepts are derived from that matrix.
Documents where the concepts are similar are grouped together in a sort of three-dimensional matrix. Documents that are less conceptually similar sit farther apart, in a different area of the matrix. Essentially, it gives you different ways to search for your documents: by concept, as opposed to just by what happens to be on the face of the document.
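The difference between searching "by concept" and searching by what is on the face of the document can be sketched with a toy example. This is a simplified stand-in, not the method used by Shepherd Data or any particular vendor: each word is described by the words it shares documents with, a document is the sum of its words' descriptions, and similarity is the cosine of the angle between two such vectors. The corpus, the function names, and the choice of cosine similarity are all assumptions for illustration.

```python
import math
from collections import Counter, defaultdict

def tokenize(text):
    return text.lower().split()

def word_contexts(texts):
    """Map each word to a count of the words it shares a document with (itself included)."""
    contexts = defaultdict(Counter)
    for text in texts:
        words = set(tokenize(text))
        for word in words:
            for other in words:
                contexts[word][other] += 1
    return contexts

def concept_vector(text, contexts):
    """Represent a text as the sum of the context vectors of its words."""
    vec = Counter()
    for word in tokenize(text):
        vec.update(contexts.get(word, {}))
    return vec

def cosine(a, b):
    """Cosine similarity of two sparse vectors; 0.0 if either is empty."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

# Hypothetical four-document corpus, invented for illustration.
corpus = [
    "attorney reviewed contract",
    "lawyer reviewed contract",
    "pump valve leak",
    "engineer valve leak",
]
contexts = word_contexts(corpus)

def surface_similarity(a, b):
    """Compare texts only by the words literally on their face."""
    return cosine(Counter(tokenize(a)), Counter(tokenize(b)))

def concept_similarity(a, b):
    """Compare texts by the company their words keep across the corpus."""
    return cosine(concept_vector(a, contexts), concept_vector(b, contexts))

print(surface_similarity("attorney", "lawyer"))
print(concept_similarity("attorney", "lawyer"))
print(concept_similarity("attorney", "pump"))
```

In this toy corpus, "attorney" and "lawyer" never appear in the same document, so a face-of-the-document comparison scores them as unrelated. Because both words keep company with "reviewed" and "contract," the concept comparison places them close together, while "attorney" and "pump" stay far apart.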