Legal Analytics – Researcher Beware?

If you have been following legal information news lately, you have probably heard quite a bit about legal analytics – the so-called “MoneyBall” for lawyers.  Litigation outcomes are located, summarized, categorized, and sometimes visualized.

Although analytic services are certainly not new (LegalMetric, for instance, has been delivering detailed analytics for 15 years), they have recently gained wider appeal. Now that analytics are being integrated directly into major legal platforms, it is a good time to take a closer and more skeptical look.

Not all analytics are created equal, and some, quite frankly, are not ready to be acted on.

(Docket Navigator. Judge Sam Sparks motions to dismiss by year, patent NOS)

For example, compare the above Docket Navigator analytics output for Judge Sam Sparks’ motions to dismiss in patent lawsuits with the below Bloomberg Law output. Both services are analyzing information on the same topic, but Bloomberg Law is clearly missing the bulk of the data. Docket Navigator can also further categorize by motion to dismiss type (FRCP 12(b)(2), 12(b)(1), 41(a), etc.).

(Bloomberg Law. Judge Sam Sparks motions to dismiss all years, patent NOS.)

As information professionals become regular users and gatekeepers of analytics tools, how much transparency about the underlying information is necessary before the results can be relied upon?

Underlying dataset:

  • Is the dataset complete? If less than 100% of the underlying dataset is ingested into the platform, what is missing? Does the incomplete data undermine the validity of the end results?
  • Is the dataset normalized/cleaned/linked? For example, are variations of party names and judge names normalized?
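The two dataset questions above can be made concrete with a minimal Python sketch. It is not how any vendor actually processes dockets: the function names `normalize_name` and `coverage`, the list of stripped titles and suffixes, and the idea of comparing ingested docket IDs against an independently obtained list are all assumptions made for illustration.

```python
import re

# Hypothetical list of titles and corporate suffixes to strip; a real
# platform would need a far longer, domain-reviewed list.
_NOISE = r"\b(hon|judge|incorporated|inc|corporation|corp|llc|ltd)\b\.?"


def normalize_name(raw: str) -> str:
    """Collapse common variations of a judge or party name to one key."""
    # Turn "Last, First" into "First Last" when there is a single comma.
    if raw.count(",") == 1:
        last, first = raw.split(",")
        raw = f"{first} {last}"
    name = raw.lower()
    name = re.sub(_NOISE, " ", name)       # drop titles and suffixes
    name = re.sub(r"[^a-z\s]", " ", name)  # drop punctuation and digits
    return " ".join(name.split())          # collapse whitespace


def coverage(ingested_ids, expected_ids):
    """Return (share of expected records ingested, sorted missing IDs)."""
    expected = set(expected_ids)
    missing = expected - set(ingested_ids)
    ratio = 1.0 if not expected else 1 - len(missing) / len(expected)
    return ratio, sorted(missing)
```

With this sketch, "Hon. Sam Sparks", "Judge Sam Sparks" and "Sparks, Sam" all reduce to the same key, and `coverage` reports both how much of an expected docket list a platform holds and exactly which records are absent. The second output matters more than the percentage, since what is missing determines whether the end results can still be trusted.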

Review process of documents/datasets:

  • Do humans review the data? Algorithms? Both? What is the level of confidence in the accuracy and precision of the algorithms?
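One way to put a number on the accuracy and precision of an algorithm is to compare its labels against a human-reviewed sample of the same documents. The sketch below assumes such a sample exists. The function name `precision_recall` and the outcome labels are hypothetical, not drawn from any particular service.

```python
def precision_recall(predicted, truth, label):
    """Precision and recall for one label, given parallel lists of
    algorithm-assigned labels and human-reviewed labels."""
    pairs = list(zip(predicted, truth))
    tp = sum(1 for p, t in pairs if p == label and t == label)
    fp = sum(1 for p, t in pairs if p == label and t != label)
    fn = sum(1 for p, t in pairs if p != label and t == label)
    # Precision: of the items the algorithm tagged with this label,
    # how many did the human reviewers agree with?
    precision = tp / (tp + fp) if tp + fp else 0.0
    # Recall: of the items the reviewers tagged with this label,
    # how many did the algorithm find?
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical four-document sample of motion outcomes.
algorithm = ["granted", "granted", "denied", "denied"]
reviewers = ["granted", "denied", "granted", "denied"]
p, r = precision_recall(algorithm, reviewers, "granted")
```

Low recall for a category means the platform is silently undercounting it, which is the kind of gap a researcher cannot see from a finished chart alone.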

End categorization/visualization:

  • What decisions are made when creating end categories?
  • Are the categories meaningful to the specific legal domain or use case? For example, a ‘motion to stay’ category will need very specific subcategories to be actionable. Does the end categorization actually help answer the specific legal question at hand?

Information professionals must be acutely aware of how analytics are created and the actual value they add.  Researcher/lawyer/librarian beware.  Don’t base a decision on bad or irrelevant information.