Legal Informatics

Lead Article From The Practice November/December 2021
Taking the tediousness out of law

When I interview attorneys who work in legal innovation—at large firms, legal tech startups, and in-house departments—my goal is to find out whether they think legal innovation is worth the effort, and what types of resistance exist and why. A common reason arises for both why and why not: lawyers are really busy.

As comedian Mike Myers once parodied Lorne Michaels of Saturday Night Live: “Oh sure, it got a laugh. But did it get the right laugh?!” So sure, lawyers are busy, but are they the right kind of busy? It seems that they are often too busy with work they hate doing (or don’t need to be doing)—and their clients hate paying for—to have the time or energy to transition to a new model. This is especially true if the new method is perceived as untested or fleeting, or too burdensome to learn or use. And while a classic law firm pyramid structure might allow lawyers toward the top to focus most of their time on doing work they enjoy, it is often built on a system that requires many people to do work they do not enjoy.

A tax attorney friend of mine, a partner at a large firm, explained to me that he was inundated with extracting specific information from certain documents, over and over. Though the task was straightforward and simple to him, associates kept making errors doing it, so he was stuck doing it himself. He had clients waiting on requests, and he had no desire to spend his time on such tasks. His firm bought a reputable document analysis system, but, he said, “I can’t use it; I don’t have time to train it.”

Legal Informatics (Cambridge University Press, 2021)

This groundbreaking work offers a first-of-its-kind overview of legal informatics, the academic discipline underlying the technological transformation and economics of the legal industry. Edited by Daniel Martin Katz, Ron Dolin, and Michael J. Bommarito, and featuring contributions from more than two dozen academic and industry experts, chapters cover the history and principles of legal informatics and background technical concepts, including natural language processing and distributed ledger technology. The volume also presents real-world case studies that offer important insights into document review, due diligence, compliance, case prediction, billing, negotiation and settlement, contracting, patent management, legal research, and online dispute resolution. Written for both technical and non-technical readers, Legal Informatics is the ideal resource for anyone interested in identifying, understanding, and executing opportunities in this exciting field.

I’ve been asked many times if lawyers should innovate. But the decision to innovate is not a normative one; it is simply one of natural consequences. If the consequences of innovation are perceived as being net positive, no one has to make a moral argument. And if they are not, no moral argument is likely to change behavior.

Linked to questions of innovation in the legal profession is the exponential growth in the amount of information that is collected and stored (if not always used) by organizations. As Michael J. Bommarito II writes in a book I coedited (more on that below), law and legal professionals have always been concerned with information-cum-data: “While scholars often disagree over which system of writing came first, they generally agree that early systems of information storage emerged in response to the need to organize and manage legal and economic systems.” He goes on to note that as legal professionals became the key players in how laws are implemented and interpreted, “knowledge and data” were critical, “now or 3,000 years ago.” Of course, over time, the medium of knowledge changed from written words to 1s and 0s and grew from a few treaties to hundreds of millions and billions of data points stored electronically, all of which are now subject to new methods of analysis. Which brings us back to the beginning: information is, and always has been, the core domain of law and lawyers. However, the amount, structure, and possibilities of information have grown exponentially and are greater than at any time before. The critical question for lawyers, as well as other professionals, is how to understand, process, interpret, and use this knowledge in innovative and impactful ways.

From this context, questions of legal innovation form a critical part of a larger story and emerging field: legal informatics. This article both introduces readers to the concept of legal informatics (including a new book on the topic) and situates informatics in the wider context of legal innovation. Some of the questions I will tackle include:

  1. What is legal informatics—and why should you care?
  2. Can we help lawyers spend more time on what they enjoy doing while ensuring that lawyer profitability is maintained, client costs are contained, and work product quality meets or exceeds requirements?
  3. How do we implement legal innovation in a way that maximizes the benefits and minimizes the costs?
  4. How do we measure or evaluate legal processes?

An illustrative starting point: e-discovery

As an introduction to legal informatics and an illustration of what happens when the legal industry embraces innovation, I want to start with an example: e-discovery. In a nutshell, as the legal profession grew, so too did the number of documents. Indeed, the paperwork increased beyond anything that could be managed solely by manual review. Moreover, the costs became increasingly prohibitive. A model had developed that relied on associate burnout: just put a lot of associates in a room and make them review stacks of documents by hand (much to their torment). As one CIO told me, it got to the point that one of their clients said that they could pay for the litigation or the discovery, but not both. The cash cow of document review was simply not sustainable, and the pretext of associate “training” was exploitative.

Since the early 2000s, technology tools such as search, duplicate detection, cloud computing, and so forth have been injected into the process (for example, see a series of rulings from Zubulake v. UBS Warburg, starting with 216 F.R.D. 280 [S.D.N.Y. 2003]). As objections were raised about the potential poor quality of technology-based tools, standardized metrics (such as precision and recall) were developed to measure the accuracy of the outcomes. Problems were indeed discovered in the technology solution. However, the same metrics were then used to compare manual with automated processes, and it turned out that the manual work had higher error rates than the technology-assisted process. This is a common result with the application of technology to legal work, where gains in efficiency are often accompanied by gains in quality, and metrics are key to dispelling erroneous myths about the supposed higher quality of manual-only work.
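To make the two metrics concrete, here is a minimal Python sketch. The document IDs and both review outcomes are invented for illustration, and `precision_recall` is a hypothetical helper, not part of any e-discovery product. Precision is the share of produced documents that are actually responsive; recall is the share of responsive documents that were produced.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a set of retrieved document IDs."""
    retrieved, relevant = set(retrieved), set(relevant)
    true_positives = len(retrieved & relevant)
    # Guard against empty sets so the ratios are always defined.
    precision = true_positives / len(retrieved) if retrieved else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical collection: documents 1-10 are the truly responsive ones.
relevant = set(range(1, 11))

# Hypothetical review outcomes (IDs above 10 are non-responsive documents).
manual_review = {1, 2, 3, 4, 5, 6, 11, 12, 13, 14}
assisted_review = {1, 2, 3, 4, 5, 6, 7, 8, 11, 12}

manual_p, manual_r = precision_recall(manual_review, relevant)
assisted_p, assisted_r = precision_recall(assisted_review, relevant)

print(f"manual:   precision={manual_p:.2f} recall={manual_r:.2f}")
print(f"assisted: precision={assisted_p:.2f} recall={assisted_r:.2f}")
```

With these made-up numbers, the assisted review scores 0.80 on both measures against 0.60 for the manual pass. The numbers prove nothing about real tools; the point is only that one yardstick can be applied to both processes.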

Courts, drawing from the work of the Sedona Conference, eventually moved from cautiously tolerating e-discovery tools to mandating their use. The combination of lower costs and higher quality was viewed as a requirement of the Federal Rules of Civil Procedure Rule 26, which deals with proportionality and assignment of costs. At this point, there is robust competition in the marketplace, with new cloud-based e-discovery companies valued in the billions of dollars (e.g., DISCO, post-IPO, and Everlaw, pre-IPO).

In this example, we see common threads of the injection of innovative technology into legal practice. First, increasing inefficiencies drive change. These inefficiencies create annoyance and worse for both client (cost) and associate (tediousness). Second, tools from fields such as computer, information, and library sciences are brought to bear on legal problems, often requiring a new set of tools specifically to solve legal problems. Third, the use of standardized evaluative metrics provides clear evidence that a new system, in addition to being less expensive, maintains or improves quality. Beyond core technologies such as search, AI, networking, and UI/UX, other subjects that are integral to the overall development process include design methodology, evaluation, and business models.

All these fields, now uniquely applied to the realm of legal work, have received inadequate academic and formal treatment. By comparison, biomedical informatics is a mature field of study, with robust academic programs, hundreds of Ph.D.’s annually, and commensurate career paths specifically for those cross-trained in medicine and technology (for more on the related discipline of health care informatics, see “Tracking the Profession”). A similar integration of technology in law is necessary to develop a modern legal system that can address problems created by increased complexity, inadequate access, growing sophistication of available tools, alternate legal service providers, and increased client demands. Crucially, this is already happening: the legal tech industry is burgeoning, whether the impetus comes from the tech industry, firms, or companies. Legal education and schools of information must meet a reality where young lawyers expect to interact with and use technology and sophisticated data science to accomplish their jobs. One critical component necessary to facilitate this change will be textbooks that can be used to inject such cross-training into legal education, or in the application of general informatics programs to law.

From informatics to legal informatics

The implementation of e-discovery is part of the (reasonably) recent development of the field of legal informatics. Certain aspects of this field go back decades, and different regions of the world define it differently. Legal informatics consists of the formal computational underpinnings of legal technology and, depending on whom you ask, includes contextual issues related to the impact of technology on the nature of law, computer–human interaction including design and gamification, and the economics of innovation.

In 1999 the University of Washington School of Information Studies was the first school to offer an undergraduate major in informatics (though related disciplines such as computer science and library sciences were offered prior). “The term ‘informatics’ broadly describes the study, design, and development of information technology for the good of people, organizations, and society,” the school writes on its website. Crucially, the school states, informatics majors at the University of Washington are trained to “turn information into actionable knowledge.”

Informatics has been defined in several ways, often invoking some or all of the fields of computer science, library science, information science, data science, and more. For the purposes of this article, we define informatics as the application of these disciplines to a particular domain. An informatics perspective looks at the data structures and algorithms, the flow of information, and the meaning of information within a given field. Biomedical informatics is an instructive example, given its rich history and maturity as both an academic and professional field. An article in the Journal of Biomedical Informatics defined it as follows:

“Leveraging insights from the philosophy of information, we define informatics as the science of information, where information is data plus meaning. Biomedical informatics is the science of information as applied to or studied in the context of biomedicine. Defining the object of study of informatics as data plus meaning clearly distinguishes the field from related fields, such as computer science, statistics and biomedicine, which have different objects of study. The emphasis on data plus meaning also suggests that biomedical informatics problems tend to be difficult when they deal with concepts that are hard to capture using formal, computational definitions. In other words, problems where meaning must be considered are more difficult than problems where manipulating data without regard for meaning is sufficient. Furthermore, the definition implies that informatics research, teaching, and service should focus on biomedical information as data plus meaning rather than only computer applications in biomedicine. [emphasis added]”

In much the same way, legal informatics can be defined as the science of information as applied to or studied in the context of law. Legal informatics contends with the development of technology in order to organize and make use of the vast amount of information that exists in the legal profession. As Bommarito writes, law is, after all, largely a knowledge business, which, until recently, relied on minds and paper. Legal informatics offers a lens through which that knowledge business might be transformed.

Legal informatics 101

To help gain purchase on this, two colleagues, Daniel Martin Katz and Michael J. Bommarito, and I edited and contributed chapters to a new textbook, Legal Informatics (Cambridge University Press, 2021), which outlines the core concepts of informatics as applied to the legal profession. For the first time, we attempt to lay the groundwork for legal informatics as an academic field of study. As we write in the introduction outlining the rationale for the book:

“From document review in litigation, to compliance, case prediction, billing, negotiation and settlement, contracting, patent management, due diligence, legal research, and beyond, technology is transforming the production of legal work and in turn the economics of the legal industry. Legal informatics is the academic discipline that underlies many of these transformational technologies, and despite all of these technical advances, no modern comprehensive treatment of the field has been offered to date. With contributions from more than two dozen academic and industry experts, this book offers readers a first-of-its-kind introductory overview of the exciting field of legal informatics.”

The book covers core technical concepts that are central to legal system implementation, including:

  • How legal information is represented
  • Legal search
  • Document processing and automation
  • The use of standards
  • Issues related to AI and machine learning

Other related yet central fields are covered in chapters on:

  • Process improvement
  • Design methodology
  • Gamification
  • Process re-engineering
  • The evaluation methodology of legal work

There are also over a dozen case studies, including contract and patent analytics, blockchain distributed ledgers, case outcome prediction, e-discovery, case-law search, online dispute resolution, access to justice, and knowledge management. Perspectives range from legal tech startups and teaching to law firms and in-house legal departments. Contextual issues are discussed, including the impact of copyrights and fees for public information and an analysis of various legal service providers from the perspective of Clayton M. Christensen’s The Innovator’s Dilemma.

In looking at these issues, it is important for lawyers to understand that injecting automation for efficiency’s sake impacts law at its core. In addition to creating new trade-offs brought about by order-of-magnitude changes (for example, the ability to monitor mass behavior online rather than one-on-one surveillance), technology also increasingly allows for the prevention of many types of behavior (for example, blocking digital copying), sometimes blocking even legal actions. The balance between prevention and punishment to regulate behavior is an exemplary problem in the study of the technological implementation of legal systems. It is one of many examples of the types of academic and philosophical problems that arise in legal technology, whose resolution has real-world impact.

Potential benefits and risks of innovation


Legal informatics offers an entryway to legal innovation more generally. Why think more holistically and rigorously about innovation in the legal profession? In other words, why should we care?

There are several advantages to innovating in legal practice, all of which are worth engaging with intellectually. This article focuses on revamping existing legal markets (“sustaining innovation”) rather than creating new markets.

One reason to innovate is simply that many lawyers are overloaded, and any way to accomplish client goals that takes less lawyer time is potentially a win. For those lawyers with too much work and no lack of clients, lowering their workload per client enables more work to be done. In some cases, partners may find themselves doing associate-level work, and clients are resistant to paying partner-level fees for it. This misalignment of work and associated discounted billing hurts profitability. When innovation can assist associates in off-loading some of this, everyone benefits.

A second reason to innovate relates to another consequence of overwork: associate retention. As recent reports have indicated, the market for talented associates is competitive (for more on law firm salary hikes and the legal talent pool, see “Upping the Ante (Again)”). Associates who have grown up with the internet and smartphones are not only comfortable with technological approaches to problem solving but also resent having to do things by hand they believe can be done with automation (for more on young lawyers’ frustration with traditional legal practice, see “Contracting Out”). This is particularly true for rote, repetitive work. As a modern classic example, roomfuls of associates doing document review, whether for M&A due diligence or e-discovery, have largely been replaced by a combination of smart technology and vastly reduced staff. The old notion of working as the cash cow for a partner at the top of the law firm pyramid is increasingly unreasonable for a modern associate; they want training, mentorship, flexibility, and efficiency. The cost of associate recruitment and training is all the higher if retention is poor. Even if associates stay, they may switch practice groups, and the impact as viewed by the original group and their clients is still detrimental.

A third reason to innovate is that it lowers client costs. Lawyers sometimes get a bad rap for being too expensive or overcharging. But perhaps it’s more accurate to say that lawyers err on the side of caution, meaning that they may overproduce in an attempt to minimize client risk. A lawyer who has rare expertise, say, in dealing with a federal agency is going to be able to charge a lot. One head of legal ops said to me, referring to an expert who came in as our interview was ending, “We love him; we just hate paying for him.” Where innovation and the appropriate application of technology can lower client costs, ultimately that helps strengthen the attorney/client relationship. When a client notices that costs are going down, that shows good faith on the part of the attorney and minimizes the frustration of high costs. There are, of course, trade-offs and concerns about quality, which we’ll discuss in the next section. But when innovation can maintain or improve quality while increasing efficiency, lowering client costs is a benefit.

A fourth reason to innovate is that it enables better leveraging of lawyer expertise. To the degree that knowledge is extracted from an expert’s head and made available through some type of database, or knowledge or document management system, there is a corresponding amplification of that expertise. Not only does it reach more clients, but it also increases consistency across all clients. Updating document templates in a workflow that uses them helps reduce errors including typos, wrong client information, and outdated law, for both transactions and litigation. Regulatory compliance is easier via a robust knowledge management system (to learn more about how McKinsey’s legal department handles knowledge management, see “Knowing Is Half the Battle”).

A related benefit to amplifying the impact of expertise via knowledge capture is the recording of institutional knowledge. The ability to find in-house domain experts, to review how or why decisions were made, or to understand client organization structure, risk tolerance, and staff concerns benefits both attorney and client.

Innovative technologies such as document automation or analysis systems not only decrease the time spent to produce the same result but also increase its quality (for example, fewer typos, increased accuracy in document and clause identification). Transaction management systems that track workflow such as document versions and signatures help assure that the correct documents are signed by the right people at the right time.

And the fifth reason to innovate is the ability to aggregate disparate information to determine trends. While trend analysis by law firms is nothing new, the ability to integrate different sources in real time or provide an interface that allows clients to pose more sophisticated queries requires more advanced technology. In the end, there are competitive pressures to innovate, including alternative service providers, moves toward in-house departments, and client demands for greater efficiency, accountability, and sophistication.


The reasons not to innovate can be as reasonable and compelling as the reasons to innovate, especially on a case-by-case basis. The list is basically short and sweet: it can be characterized as “why bother?” or “if it ain’t broke, don’t fix it.”

As mentioned before, a busy attorney may simply not have the time on any given day to try to adopt new practices. Deadlines are often such that urgent matters, due in days, may reasonably take priority over the work required for potential long-term projects. There is an up-front mental cost of even contemplating changes in workflow against the backdrop of a busy schedule that seems to be working if not ideally, then at least sufficiently well to meet current demands.

Innovation requires many components to have a successful outcome, and it can be challenging to find the time, money, and energy to attempt it. In addition, it is tough to find people who know how to execute such endeavors, managers/partners who are willing to invest the resources, and users who will work with developers to iterate through the requisite trial-and-error nature of innovation. Other costs include staff training and maintenance.

Another pitfall of innovation is that lawyers risk making mistakes by handing over their work to other people, new processes, or new systems. As with the proverbial “changing the tires on a moving car,” work cannot simply stop while a new system is rolled out. The potential of a missed deadline, let alone a substantive error in a contract or brief, creates an understandable incentive to leave well enough alone. This is especially true given the siloed nature of a law firm, where the responsibility of maximizing profitability or employee retention falls outside the scope of the practitioner. The risk/reward analysis of an innovative project as viewed for a law firm overall may not equate to a similar ROI for an individual attorney who may feel more of the burden than the benefit, at least initially.

(For more on legal innovation, see “Adaptive Innovation” by Thomas Buley and me, which is also based on a chapter in Legal Informatics.)

ROI of legal innovation

If the core component of legal informatics is the how, then the complementary component is the why. In reviewing the potential costs and benefits of legal innovation discussed earlier, can we identify contextual factors that help lead to a net positive return on investment (ROI)? If we can, what conditions might need to be present to facilitate initiating a project or funding a legal tech startup?

Organizations are made up of individuals, so positive ROI for the organization overall, or for some stakeholders, might not ever translate to positive ROI for everyone. As Christensen pointed out, however, the baseline return of not innovating is often erroneously thought to be flat. In fact, in an environment in which competitors innovate, doing nothing means increasing losses over time. In addition, “more often than not, failure in innovation is rooted in not having asked an important question rather than in having arrived at an incorrect answer.” As has been said many times, a requisite mindset in innovation is learning through failures, which is often antithetical to the risk-averse perspective of many lawyers.

The challenging task of assessing ROI involves, in part, estimating costs of building something new, including many aspects discussed in the book such as the design process, analysis of building vs. buying, gamification of adoption, and so forth. Benefits can be equally difficult to assess. As one legal innovator told me, innovation can be “somewhere between difficult and impossible,” and it is as much an art as a science. There can be a tendency to build a hammer looking for a nail. One CIO told me that he wanted a big data system. “To do what?” I said. “I don’t know.” The design process involves finding nails first and building hammers as needed.

Innovation helps do away with those activities that “lawyers hate doing and clients hate paying for,” one innovator told me: due diligence, transaction management (for example, document signature and version control), and time card narratives. Many of these are not easy problems to solve. For example, the technology of automatically discovering how lawyer time is spent is inadequate. From the technology perspective, it requires a potential revamping of billing codes and workflow, incentivizing human input, developing various vocabularies, and using increasingly sophisticated AI. A lawyer involved in a startup trying to automate time card analysis told me that the current level was pretty good, whereas a head of innovation at a large law firm did not have such a high opinion.

As with e-discovery, standardized quality metrics would help resolve these different opinions. In fact, courts rely on standard quality metrics to determine the proper application of e-discovery technology; rather than dictate which technology should be used, the emphasis is on the quality of the results. For instance, in a case centered on the use of technology-assisted review (Rio Tinto PLC v. Vale S.A.), the opinion stated:

“[…] requesting parties can insure [sic] that training and review was done appropriately by other means, such as statistical estimation of recall at the conclusion of the review as well as by whether there are gaps in the production, and quality control review of samples from the documents categorized as non-responsive. [emphasis added]”

The use of “recall”—defined as the percentage of relevant documents available that are produced for discovery—is particularly notable as it is a standard metric from the world of information retrieval. The move from process control to outcome control via quality metrics frees up not just e-discovery but many areas of legal practice. The chapter “Measuring Legal Quality” in Legal Informatics explores how this can be generalized to many areas of law. The implications are far-reaching—focusing on product rather than process allows efficiency gains without sacrificing quality. As I’ve said many times, if all we cared about was efficiency, we could resolve disputes with a coin toss. Efficiency without quality is useless. Appropriate metrics unlock innovation. Their use is important for all areas of law, keeping the focus where it belongs—on outcomes, not process.
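The “statistical estimation of recall” mentioned in the opinion can be sketched roughly as follows, assuming one common approach: draw a random sample from the documents categorized as non-responsive, have a reviewer check it, and extrapolate how many responsive documents were missed. The function name, its parameters, and the `is_relevant` callback (standing in for a human reviewer) are hypothetical, and a real protocol would also report a confidence interval around the estimate.

```python
import random


def estimate_recall(relevant_produced, discard_pile, sample_size, is_relevant, seed=0):
    """Estimate recall by sampling the documents categorized as non-responsive.

    relevant_produced: count of responsive documents already produced
    discard_pile:      list of document IDs categorized as non-responsive
    sample_size:       how many discarded documents to re-review
    is_relevant:       callback standing in for a human reviewer (assumption)
    """
    if not discard_pile:
        # Nothing was set aside, so nothing can have been missed.
        return 1.0 if relevant_produced else 0.0

    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    sample = rng.sample(discard_pile, min(sample_size, len(discard_pile)))

    # Share of the sample that turned out to be responsive after all.
    missed_rate = sum(1 for doc in sample if is_relevant(doc)) / len(sample)

    # Extrapolate the sample rate to the whole discard pile.
    estimated_missed = missed_rate * len(discard_pile)

    total_relevant = relevant_produced + estimated_missed
    return relevant_produced / total_relevant if total_relevant else 0.0


# Hypothetical example: 1,000 discarded documents, of which IDs below 100
# are actually responsive; 400 responsive documents were produced.
discard_pile = list(range(1000))
recall = estimate_recall(
    relevant_produced=400,
    discard_pile=discard_pile,
    sample_size=200,
    is_relevant=lambda doc_id: doc_id < 100,
)
print(f"estimated recall: {recall:.2f}")
```

Because only a sample is re-reviewed, the result is an estimate rather than an exact figure. What matters for the argument is that the check looks at the outcome of the review, whatever process produced it.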

Should innovation be developed internally or purchased externally? The ROI model is different, since external technology is available to be sold to the general market, while internal development can be specifically tailored to unique circumstances, albeit usually at a greater cost. There are several problems with funding legal innovation, even when a given problem is feasible from a technology perspective. If a problem is straightforward to solve, and correspondingly the sales cycle is fast, then there is usually too small a market to warrant VC funding. And, not surprisingly, if the market is large enough, the sales cycle, especially if selling to law firms, is generally too slow for VC funding.

The sweet spot of having a large-enough market with a relatively fast sales cycle generally implies “product-led growth,” such as with LegalZoom, Clio, MyCase, and so forth: either platforms that enable subscriptions with add-ons, or consumer markets with viral marketing. As a result, regardless of the technological feasibility of a problem’s solution, legal innovation can be hamstrung by the mismatch between who pays and who benefits, a classic startup conundrum as seen, for example, in education innovation.

A comprehensive study of legal informatics includes not only core technology but also the academic, philosophical, and economic context in which legal innovation operates. Although technology inevitably works its way into every field, the path it takes is not the same in each. The more we train students and practitioners, whether in engineering or law, about the unique way that technology and law interact, the more we will be able to set sail and pick a direction rather than ride the current and end up in a suboptimal destination.

Teaching, training, and career paths

Legal Informatics came from the combined experience of dozens of practitioners working in legal technology at law firms, startups, in-house departments, academic institutions, and more. Their backgrounds include law, engineering, finance, design, and more. This labor of love came from a desire to modernize legal systems across the globe. Some have been making this effort for decades, generally against headwinds. For many of the reasons discussed above, one does not go to a well-paid lawyer or law firm and tell them that they should change their business model. Rather, problems arise and solutions are sought. As problems increase due to issues around scalability or an access-to-justice crisis, and as clients become more savvy and share information about legal processes, costs, and alternative legal service providers, pressure is increasing to incorporate modern engineering practice into the implementation of legal practice.

Medical informatics has come to maturity, as evidenced by the prevalence of robust training, academic programs, standards, research, and viable career paths. When a legal technologist is forced to abandon the technology hat to pursue a partner path, or decides to report either through legal or engineering, there is a resulting diminution of the value of the interdisciplinary work required to solve complex legal problems requiring the most advanced technologies such as AI, NLP, and cybersecurity.

Few people who drive need to know how to build a car. But a reasonable understanding of how cars work helps in selecting, maintaining, and using them. Basic training in legal technology is useful for the same reasons, especially as more sophisticated tools appear across the entire realm of legal practice areas. Creating solid top-level career paths is mandatory to attract the best talent in what is increasingly becoming a competitive imperative. Regardless of any potential downside associated with legal innovation, the injection of technology in the legal system is only increasing.


Technology is increasingly used to implement and reconstruct the legal system. Its characteristics of efficiency and scale promise to apply regulations to more people more often. Its tendency to work in a mechanical or statistical manner threatens to minimize the role of exceptions, exigency, and empathy that are core components of law. The need to balance efficiency, quality, fairness, and access is nothing new. But as is common with technology, the harm caused by getting it wrong increases with each order of magnitude change in implementation. The necessity of the legal technologist to understand both technology and law cannot be overstated. Doing innovation well—applying legal informatics correctly—will also allow more lawyers to spend more time being the right kind of busy: working on tasks they enjoy, that use their skills, and that yield the most benefit to their clients and society.

Ron Dolin is a Senior Research Fellow at Harvard Law School’s Center on the Legal Profession. He holds a PhD in computer science from UC Santa Barbara. Ron was one of the first 100 employees at Google, and left after several years to get a law degree. He is a licensed attorney in CA. He has taught classes on legal technology and informatics at Harvard Law School, Notre Dame Law School, and Stanford Law School. His areas of research include developing and analyzing legal quality metrics: definition, implementation, and assessment of the metrics; impact of standardized quality benchmarks and testing methodology on the legal system, including in-house selection and management of outside counsel, increasing access to justice, legal technology startups and general competition, latent middle-class market, and UPL regulations. In addition to research and teaching, he is also an angel investor, focusing on legal technology startups.