It has been a whirlwind year for the OpenAI-backed generative AI legal tech startup Harvey, which went from a $5 million seed round in November 2022 to a $21 million Series A in April 2023 to an $80 million Series B in December 2023 at a valuation of $715 million.
It was a year that included some big wins, including the decision in February 2023 by Allen & Overy, one of the world’s largest law firms, to integrate Harvey into its global practice, where it could be used by the firm’s more than 3,500 lawyers across 43 offices operating in multiple languages.
Just a month after that, Harvey and PwC announced a global partnership to give PwC’s Legal Business Solutions professionals exclusive access among the Big 4 to Harvey’s AI platform. More recently, in what they described as a “significant step beyond the exclusivity agreement,” Harvey and PwC announced a strategic alliance, which also included Harvey investor OpenAI, to train and deploy foundation models for tax, legal and human resources.
In September, another major firm, Macfarlanes, announced that it would roll out Harvey firmwide, after an initial pilot program, and last month, Forbes named Harvey to its AI 50, recognizing the most promising privately-held AI companies – the only legal-specific AI on the list.
But even with such dramatic traction for a company that is still less than two years old, there has remained an air of mystery around Harvey. The product continues to be in an early-access phase; few, other than select early-access customers, have seen the Harvey product; and the company’s founders, Winston Weinberg, its CEO, and Gabriel Pereyra, its president, have given few media interviews.
But that is about to change, the two founders told me in an interview earlier this week. They will be coming out of the early-access phase during the third quarter of the year and launching versions that they say will be more affordable to firms of various sizes, depending on their needs.
They also say that they will be showing the product more often, attending more industry conferences, and speaking more often with the press.
From Custom to Commercial
Part of the reason they have been so stealthy, Weinberg and Pereyra say, is that they have been nose-to-the-grindstone building highly customized models for the large law firms they serve.
Customization of its AI model has been Harvey’s trademark, in a sense, and a key differentiator from other popular legal AI products, most notably CoCounsel, the AI legal assistant developed by Casetext and acquired by Thomson Reuters.
More recently, however, Harvey has been building custom models and related products that can be sold commercially to multiple customers. These include the models it is building with PwC for tax, legal and human resources; new case law research models it is developing in partnership with OpenAI; and a new Vault product that will allow customers to apply generative AI capabilities to large document collections.
It also recently launched its product on the Microsoft Azure Marketplace, where it is offering a Harvey on Azure version of its product.
“Deploying on Microsoft Azure is a key milestone for Harvey,” Weinberg, who is Harvey’s CEO, said in announcing that launch. “This collaboration allows us to use Azure’s robust cloud capabilities to enhance Harvey’s vision, making it more powerful and accessible for businesses across the world.”
Later this year, in a move designed to make Harvey more accessible to a broader number of lawyers, it will begin offering commercial access to some of its products. It will start with its case law models, and then begin offering bundles of its products.
Customers will have the option of choosing a bundle of products that will include its AI assistant, various case law or specialized research models, its Vault for large document collections, and custom models or projects.
There will also be prepackaged bundles that will be offered at a discounted price.
Case Law Research
According to Weinberg and Pereyra, a key step in Harvey’s move towards broader commercial availability has been its partnership with OpenAI to build custom-trained case law models. They have already done this for U.S. law and will now be adding other jurisdictions.
In seeking to develop a research solution, they found that simply fine-tuning a foundation model such as GPT-4 or using retrieval augmented generation (RAG) was not sufficient to produce the level of results required for legal work.
Instead, the case law system they built is a combination of a large foundation model pre- and post-trained on all of U.S. case law and a case law search system that the model leverages.
It uses a combination of legal-specific data preprocessing, hybrid search, pre-training, post-training, multi-stage reasoning, retrieval and custom fine-tuned embeddings, and legal-specific answer postprocessing, Weinberg said.
“At a high level, the system we’ve developed performs legal research much like an associate would, taking a complex research query and performing case law searches, analyzing the results and eventually synthesizing all the information to provide an accurate result for the users,” Weinberg said.
“We’ve built a number of legal-specific solutions for the search and answer system including extracting citation graphs, procedural posture, and fact patterns from cases to improve search and detecting case hallucinations, inconsistent arguments, and instances of weak case support in answers.”
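To make the “hybrid search” ingredient concrete: a hybrid system typically runs a keyword search and an embedding-based search separately and then merges the two ranked lists. The article does not say how Harvey merges its results, so the sketch below uses reciprocal rank fusion, a common general-purpose technique, purely as an illustration. The function name, the case identifiers and the constant `k = 60` are assumptions, not details from Harvey.

```python
from collections import defaultdict
from typing import Dict, List


def reciprocal_rank_fusion(rankings: List[List[str]], k: int = 60) -> List[str]:
    """Merge several ranked lists of document ids into one ranking.

    Each document earns 1 / (k + rank) from every list it appears in,
    so documents ranked well by several retrievers rise to the top.
    """
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken alphabetically by id.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results from two retrievers over the same case law index.
keyword_hits = ["case_a", "case_b", "case_c"]
semantic_hits = ["case_b", "case_d", "case_a"]

fused = reciprocal_rank_fusion([keyword_hits, semantic_hits])
print(fused)
```

Here `case_b` and `case_a` come out ahead of the others because both retrievers returned them, which is the behavior a hybrid search is meant to reward.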
This legal research model can be used for traditional research and will also be able to be used for more complex workflows, such as cross-jurisdictional surveys, brief drafting, issue spotting in large sets of discovery or investigative documents, and litigation risk analysis.
Harvey also plans to partner with government entities to use these models to advance access to justice by making case law more accessible.
Already, the company has partnered with court officials in Singapore to help pro se litigants in small claims cases get answers to their legal questions in order to better understand their potential claims or defenses.
PwC Partnership
Through its partnership with PwC, Harvey is developing a series of custom-built models focused on the areas in which PwC has domain expertise – tax, legal and human resources. While Harvey provides the AI technology, PwC provides both the intellectual property and the domain experts to fine-tune and train these models.
As these models are developed, Harvey and PwC will jointly go to market with them, both selling them through their own channels. These will be sold as separate, standalone products, not part of the bundles described above.
Vault
The Vault product is being designed to enable customers to use generative AI to explore large collections of thousands of documents, either by asking natural language questions of the documents (“Ask Query”) or performing specific tasks against the set, such as finding and summarizing certain language (“Review Query”).
For example, Weinberg said, over a set of 1,000 master service agreements, an Ask Query might be, “Has the company ever executed an MSA with Oracle?” A Review Query could be, “Create a chart showing me every contract that has change-of-control provisions and, for those that do, tell me if the contract allows for termination on a change of control.”
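The difference between the two query types is in the shape of the result: an Ask Query returns one answer about the whole collection, while a Review Query returns one row per document. The sketch below illustrates only that difference. It uses plain keyword matching where Vault would use generative AI, and the contract texts, file names and function names are all invented for the example.

```python
import re
from typing import Dict, List

# Matches "change of control" and "change-of-control", in any letter case.
CHANGE_OF_CONTROL = re.compile(r"change[- ]of[- ]control", re.IGNORECASE)

# Hypothetical three-document collection standing in for 1,000 MSAs.
contracts: Dict[str, str] = {
    "msa_001.txt": "Master Services Agreement between Acme and Oracle. "
                   "Either party may terminate this Agreement upon a change of control.",
    "msa_002.txt": "Master Services Agreement between Acme and Globex. "
                   "No assignment is permitted without consent.",
    "msa_003.txt": "Master Services Agreement between Acme and Initech. "
                   "A change-of-control event requires written notice.",
}


def ask_query(docs: Dict[str, str], counterparty: str) -> bool:
    """Ask-style query: one yes/no answer about the whole collection."""
    needle = counterparty.lower()
    return any(needle in text.lower() for text in docs.values())


def review_query(docs: Dict[str, str]) -> List[Dict[str, object]]:
    """Review-style query: one chart row per document."""
    rows: List[Dict[str, object]] = []
    for name, text in sorted(docs.items()):
        sentences = re.split(r"(?<=[.!?])\s+", text.strip())
        # First sentence mentioning change of control, or "" if there is none.
        excerpt = next((s for s in sentences if CHANGE_OF_CONTROL.search(s)), "")
        rows.append({"document": name,
                     "change_of_control": bool(excerpt),
                     "excerpt": excerpt})
    return rows


print(ask_query(contracts, "Oracle"))
for row in review_query(contracts):
    print(row)
```

The sketch deliberately stops at surfacing the relevant sentence. Deciding whether that sentence actually allows termination is the part that needs a model or a lawyer, since keyword matching cannot tell “may terminate” from “shall not give rise to any right of termination.”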
Custom Models for Large Firms
Even as it develops these new products to make its technology more widely accessible, Harvey is continuing to build products for large law firms.
Specifically, it is building a platform that allows firms to securely train generative AI systems on all their private data, integrate with their existing legal tech software and workflows, and continuously learn from their legal workforce.
“Unfortunately, it isn’t as simple as training a single model on all their documents,” Weinberg said. “Doing so would result in data leakage.”
“Instead, we are building a suite of tools that allow law firms to train, evaluate and deploy generative AI systems that respect data privacy, ethical walls, and client privacy while still being highly accurate and performant,” he said.
For its largest partners, Weinberg said, Harvey is building “hyper specialized” systems for their most complex use cases.
With PwC, for example, Harvey is building foundation models in every tax jurisdiction that can answer complex tax questions over tax codes and legislation as well as perform tax due diligence and more complex scenario-based evaluations, Weinberg said.
“This system is integrated within PwC’s broader tax practice and our models leverage PwC’s other third-party vendors, their IP, and their internal software solutions and can produce reports in their historical format.”
Weinberg said that, given the complexity of such a project, the development cost could exceed $5 million, “but we are starting to find ways to provide more affordable versions to our clients and hope to do so more as we scale.”
Hiring Loads of Lawyers
When they founded Harvey in 2022, Weinberg was a former associate at law firm O’Melveny & Myers and Pereyra was a former research scientist at DeepMind and machine learning engineer at Meta AI.
When we spoke this week, they said that one of the most interesting aspects of their growth over the past year has been learning to run a company of that scale.
But they agreed that one of the aspects they are most proud of is their hiring, which has been primarily of engineers and lawyers. In fact, of the company’s 120 employees today, almost half are lawyers – many of them lawyers with big firm or corporate pedigrees – who have been brought on as domain experts.
While it is not unusual for a legal tech startup to hire lawyers, it is typically either for a non-lawyer role, such as product manager or salesperson, or for a lawyer role as in-house counsel.
But Weinberg said that most of Harvey’s lawyers are working in research roles – not to research the law, but to research lawyers’ workflows and processes around specific tasks.
“They’re saying, ‘How would I have done this task when I was at Latham or I was at Kirkland? How did I do disclosure schedules? How did I do case law research? How did I do complex summarization? How did I draft a brief?’ They’re taking all of those tasks and then mapping them onto AI.”
Weinberg and Pereyra talk about this as “process data” – the process by which a lawyer performs a task. When a lawyer sees a case for the first time, or a merger agreement, or an NDA, the lawyer has a process for analyzing that document. That is the kind of data they want to build into Harvey, and why they are hiring so many lawyers.
“Lawyers have to be trained for many years to do this, and that data – that process data – isn’t publicly available anywhere,” Weinberg said.
“The thing that’s very important is how you train the AI, and you need a combination of domain experts and AI engineers,” Weinberg said. “I think that there aren’t tons of companies that are doing that at the application layer in any vertical right now – I think the thing that’s missing in a lot of these companies is the domain experts.”
But while much of the focus so far of these lawyers on its staff has been on building custom models for large firms, Weinberg and Pereyra said they wanted to expand access to their technology, which is the reason for the new products they will be launching in the third quarter.
“A lot of the firms out there, they can’t afford to do these large, massive customizations,” Weinberg said, “so let’s build things for them that they can also use and that are great as well.”
That commercialization will be rolled out in stages, beginning with the case law models and the Vault product. Next will come the generally available bundles of products. Towards the end of this year or early next year, they will introduce a self-service model.
Of course, all the while, they will continue to offer customizations for large enterprise customers.
A question I hear often is how Harvey compares to CoCounsel, the Thomson Reuters legal AI assistant, and so I put that question to Weinberg and Pereyra.
There are two main differences, they said. One is the customization they are doing for larger law firms, something that TR does not offer with CoCounsel.
The other is the approach they are taking to building their product. On the backend, they have trained multiple models for specific tasks and then, effectively, chained them together, so there is a model for clause extraction and a different model to handle a specific type of query.
“It’s actually similar to how a law firm works, where you have a request from the client and then the partner breaks that request into ten other requests and sends it to a specialized person who does that task, and then they all combine those together,” Weinberg said. “That’s what’s happening on the back end.”
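The law firm analogy maps onto a familiar software pattern: decompose a request, route each piece to a specialist, then combine the answers. The sketch below shows only that control flow. The “specialists” are ordinary functions standing in for task-specific models, and every name in it (`Subtask`, `decompose`, `handle` and so on) is hypothetical rather than anything from Harvey’s actual system.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Subtask:
    kind: str     # which specialist should handle this piece
    payload: str  # the text that specialist receives


# Stand-ins for task-specific models; each is a plain function here.
def extract_clauses(text: str) -> str:
    clauses = [c.strip() for c in text.split(";") if c.strip()]
    return f"{len(clauses)} clause(s) found"


def summarize(text: str) -> str:
    words = text.split()
    return " ".join(words[:5]) + ("..." if len(words) > 5 else "")


SPECIALISTS: Dict[str, Callable[[str], str]] = {
    "clause_extraction": extract_clauses,
    "summarization": summarize,
}


def decompose(request: str) -> List[Subtask]:
    """The 'partner' step: break one request into specialist subtasks."""
    return [
        Subtask("clause_extraction", request),
        Subtask("summarization", request),
    ]


def handle(request: str) -> Dict[str, str]:
    """Send each subtask to its specialist, then combine the answers."""
    results: Dict[str, str] = {}
    for task in decompose(request):
        results[task.kind] = SPECIALISTS[task.kind](task.payload)
    return results


answer = handle("Term is five years; Either party may terminate on change of control")
print(answer)
```

One appeal of this design is that each specialist can be trained, evaluated and replaced on its own, as long as it keeps the same input and output contract.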
Commitment to A2J
I mentioned above Harvey’s partnership with the courts in Singapore to assist pro se litigants there, and Weinberg and Pereyra say they are committed to serving access to justice on an even broader scale.
They said they will give their product for free to court systems or A2J organizations. “This is something we want to do a lot more and in the U.S. too, to build these systems and then give them to the court for free.”
Source: https://www.lawnext.com/2024/05/harvey-ai-to-move-out-of-early-access-phase-release-more-affordable-versions-of-its-custom-ai-models.html