Last week’s blog, on the importance of Data Governance, looked at how and why law firms use data. We stressed how essential it is to have a Data Governance plan, not only to safeguard the data you collect, but also to use it to improve the business of law. For many law firms, this will entail combining and analysing multiple streams of data within the firm to streamline matters, improve client relationships and increase profitability. To this end, this week we will be looking at one of the most ubiquitous data points in a law firm: the phase or task code. Is it as useful a data point as firms think? Or is there a better way to approach the collection of this data?

A historical perspective on the task code

First, let us say that phase and task codes were not designed to be inefficient. Historically, they have served a number of useful functions for law firms. The ABA task codes and the Uniform Task-Based Management System (UTBMS) were originally designed to enable electronic invoicing. They focused primarily on litigation and its related costs.

From this initial use case, firms realised that more sophisticated or adapted code sets could serve a secondary function. With the rise of the fixed fee, code sets within matters could be used to compare and build more accurate quotes, nominally by comparing like-for-like work. However, both of these uses have become increasingly unwieldy and inaccurate over time.

Is there such a thing as too much choice?

One of the key problems facing phase and task codes is that they offer too much choice and, at the same time, insufficient detail.

“Take the example of discovery in the ABA Litigation Code Set. The phase code for “Discovery” is L300, which represents the discovery phase and the task codes include “Written Discovery,” “Document Production,” “Depositions,” “Expert Discovery,” “Discovery Motions,” and “Other Discovery.” However, these labels all represent sub-phases not tasks.”

Three Geeks and a Law Blog

These descriptions are all fairly broad, and a specific task can often fit more than one of them. It can become difficult to choose the appropriate task code for the task you are completing. Add to this the number of available options for some phases: phase code E100 (Expenses) has over 20 task codes to choose from. Inaccurate or inconsistent application of phase and task codes leads to inaccurate analysis and fee quotes later on.

This is made more complex by the possibility of wilfully misused codes. How many times has code A111 (Activities – Other) been chosen simply to avoid scrolling through the list for the appropriate code? After all, who among us hasn’t had a rushed deadline and cut one final corner?

There is no universal system

The potential for poorly applied codes becomes even more complex when you take into account the number of different sets of phase and task codes that exist. Many firms have changed or modified codes to suit their own matter management process. Additionally, different countries use different code sets, and these may not be applied consistently across international matters or for international clients. This prevents meaningful analysis by a client, and leaves vital data siloed within individual firms. Firms that must collaborate on matters are unable to share like-for-like data with each other. This is obviously frustrating for a client, but it also prevents collaboration and progress within legal services as a whole.

The case for natural language processing

This was the exact conundrum facing our founders, Pieter and Bram, when developing Clocktimizer. If phase and task codes are unreliable, what else can be used to perform meaningful matter analysis? Thankfully, since the invention of phase and task codes, technology has risen to bridge the gap. 

While phase and task codes are prone to inaccuracy, time card narratives are considerably less so. After all, it is much easier to use your own words to describe an activity than to select the right code from hundreds of options. Natural language processing (NLP) is then able to read and categorise the tasks within the narrative itself. Through a process of training and firm-led refinement, it can ensure that all tasks are correctly identified and categorised. It can also assign tasks to phases that accurately reflect the way a firm works. In doing so, it creates a cleaner, more accurate data stream, which can then be used to build accurate fee quotes based on real historical data.
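To make the idea of categorising a narrative more concrete, here is a deliberately simple sketch in Python. It is not how Clocktimizer works: a plain keyword matcher stands in for a trained NLP model, and the task labels (borrowed from the discovery example above), the keyword lists and the `categorise` function are all hypothetical, chosen purely for illustration.

```python
import re
from collections import Counter

# Hypothetical keyword lists for illustration only. A real system would
# learn these associations from a firm's own historical narratives.
TASK_KEYWORDS = {
    "Depositions": ["deposition", "depose", "witness"],
    "Document Production": ["document production", "privilege review"],
    "Written Discovery": ["interrogatories", "request for admission"],
}


def categorise(narrative: str) -> str:
    """Return the task label whose keywords occur most often in the narrative."""
    text = narrative.lower()
    scores = Counter()
    for task, keywords in TASK_KEYWORDS.items():
        for keyword in keywords:
            # Match the keyword at a word boundary, allowing suffixes
            # such as "depositions" or "witnesses".
            scores[task] += len(re.findall(r"\b" + re.escape(keyword), text))
    best_task, hits = max(scores.items(), key=lambda item: item[1])
    # If no keyword matched at all, flag the entry for human review.
    return best_task if hits > 0 else "Uncategorised"


if __name__ == "__main__":
    narratives = [
        "Prepare for deposition of expert witness",
        "Draft interrogatories to defendant",
        "Lunch with client",
    ]
    for narrative in narratives:
        print(f"{narrative!r} -> {categorise(narrative)}")
```

Even in this toy version, the fee earner only writes a sentence in their own words and the label is derived afterwards, rather than being picked from a long list under time pressure. Entries that match nothing are flagged rather than forced into a catch-all code, which is where the firm-led refinement described above would come in.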

The next generation of data analysis

As we have previously explored, the quality of the insights you can extract from your matters is entirely dependent on the data you feed into your analysis. Phase and task codes are frequently applied inaccurately or too generally. Accordingly, the insights gained from them will be equally general and inaccurate. The consequences range from low-ball fee quotes to matters with reduced profitability. By combining increasingly sophisticated NLP with Machine Learning, Clocktimizer users can ensure clean, accurate data to inform their decisions. After all, wouldn’t we all rather spend less time filling in codes?

Illustration by Tara Lingard