Patent search, classification and drafting in the age of generative AI

We are witnessing the largest worldwide AI experiment ever, but how will AI change the patent industry?

“What a diff’rence a day makes.” Dinah Washington’s popular song best describes the Artificial Intelligence rollercoaster we have been on for the last few months: from the introduction of ChatGPT, the human-like chatbot, at the end of November 2022, to a new search paradigm that is effectively a whole different gateway to the internet, to the first legal co-pilots and, just a month ago, the introduction of GPT-4, the most anticipated new model in the history of Artificial Intelligence. Every day, a new breakthrough!

These large language models ace bar exams, achieve top LSAT scores and even pass sommelier exams!

Currently, we are witnessing the largest worldwide AI experiment ever, and we are all participating!

But where will this lead, and how can we look beyond the hype? That Artificial Intelligence will change the legal industry, and the patent industry in particular, is beyond doubt. But which applications can benefit most from large language models, how can we use them responsibly (making sure the technology is defensible and transparent), and how can we integrate the technology properly into the process of patent search, classification, monitoring and ultimately drafting? What is the optimal form of Human+AI collaboration?

For an industry that does not fully understand where these AI models come from, how they work, and what their limitations are, that is a risk. The main reason is that ChatGPT (herein referred to as including GPT-4) is just a generative language model, designed to generate language based on a statistical process mimicking human language as seen during training. This process is kicked off with a so-called prompt, but that is all the steering there is. ChatGPT on its own is not a search engine, nor a source of knowledge, nor a translator, nor a linguistic classifier.
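
To make the idea of "a statistical process kicked off with a prompt" concrete, here is a deliberately tiny sketch. It is not how ChatGPT works internally: it is a toy bigram model that only remembers which word followed which in its training text, and every name in it (`train_bigram_model`, `generate`, the sample corpus) is made up for illustration. The point it shares with the real thing is that the prompt only sets the starting point; everything after that is sampled from patterns seen during training.

```python
import random
from collections import defaultdict

def train_bigram_model(text):
    """Record, for every word, the words that followed it in the training text."""
    words = text.split()
    followers = defaultdict(list)
    for current, nxt in zip(words, words[1:]):
        followers[current].append(nxt)
    return followers

def generate(followers, prompt, max_words=10, seed=None):
    """Continue the prompt by repeatedly sampling a next word seen in training."""
    rng = random.Random(seed)
    output = prompt.split()
    for _ in range(max_words):
        candidates = followers.get(output[-1])
        if not candidates:
            break  # the model has never seen this word followed by anything
        output.append(rng.choice(candidates))
    return " ".join(output)

# Hypothetical training text, invented for this example.
corpus = (
    "the claim defines the scope of the patent . "
    "the patent describes the invention . "
    "the invention solves the problem ."
)
model = train_bigram_model(corpus)
print(generate(model, "the claim", max_words=8, seed=1))
```

The output reads like patent language, yet the model looks nothing up and checks nothing: it can only recombine word sequences it has already seen.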

Because ChatGPT can do such tasks at a reasonable quality level (with a highly confident tone), and because it seems to respond to prompt changes, it feels as if it is learning to do better. Nothing could be further from the truth: it is not learning. Without a proper encoder¹, the language generation is a statistical process that one cannot control. You will probably keep tuning until the end of time without ever getting a reliable and stable system. You can already see this in the integration between ChatGPT and Bing: great for having fun on a Sunday afternoon, but not the reliable legal search assistant one should be looking for.

There are several legal initiatives that make more sense (such as Allen & Overy’s Harvey), but these are at risk of being introduced to the real world too fast: models that take short-cuts, and models whose decisions we cannot explain, simply because the people using them do not fully understand the technology or because the companies involved cover themselves with a cloak of secrecy.

That said, very few people in the world really understand these large language models. This is one of the main concerns of the Stanford Institute for Human-Centered Artificial Intelligence and of various scholars who express their opinions in publications such as the MIT Technology Review and popular magazines such as WIRED. Truly understanding these models is one of the prime research goals.

Without understanding, transparency and a proper framework for (legal) defensibility, there will be no trust. Without trust, the legal industry will not accept Artificial Intelligence, and rightly so!

Creating trust has different facets:

Understand the technical roadmap of large language models, their capabilities and how to address their limitations.
Understand how they can add value to existing technology such as search engines, integrated development environments or (contract) content management systems.
Understand how to validate the quality of such models and integrations.
Understand why these models take certain decisions and not others: what do they know and what do they not know?
Understand how to integrate them optimally in existing legal workflows.

Addressing the above facets of trust leads to a roadmap for benefiting from the success of large language models in legal applications in a responsible way.

AI+Human: The future is in patent co-pilots

Replacing patent lawyers and patent professionals with artificial intelligence algorithms is not realistic, certainly not in light of (i) the unpredictable nature of many AI algorithms, (ii) regulatory requirements (attorney-client ethics), and (iii) research showing that combining the skills of Artificial Intelligence with human skills leads to the best results.

Humans are not cognitively suited to quickly find relevant information when searching for patents. They cannot review thousands of complex documents without making errors in the process.

AI and humans working in tandem can be more effective than either AI or humans working alone because they bring different strengths and abilities to the table. AI is good at processing large volumes of data quickly, identifying patterns and trends, and making predictions based on statistical analysis. At its best, it is also not subject to biases or emotional reactions, which can sometimes cloud human judgment.

On the other hand, humans are better at tasks that require creativity, critical thinking, and the ability to interpret complex information. They also bring a wealth of experience, knowledge, and intuition that cannot be replicated by AI.

By combining AI and human capabilities, organizations can leverage the strengths of both to improve decision-making and achieve better outcomes. For example, in eDiscovery, AI can be used to sort and classify large volumes of data, while humans can review and interpret the data in the context of the legal matter at hand. In healthcare, AI can analyze medical records and imaging data to identify potential health issues, while doctors can provide personalized care and treatment recommendations based on their expertise and experience.

Furthermore, AI and humans can work together to improve the quality of AI models over time. Humans can provide feedback on the accuracy and relevance of AI-generated recommendations, which can be used to refine and improve the AI models.
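
As a rough illustration of such a feedback loop, the sketch below turns reviewer verdicts into a simple quality figure and a list of corrections. The data, the document identifiers and the function name `precision_from_feedback` are all hypothetical; a real system would need far more careful evaluation than this.

```python
def precision_from_feedback(feedback):
    """Share of AI recommendations that reviewers marked as relevant."""
    if not feedback:
        return None  # no reviews yet, so nothing can be said
    relevant = sum(1 for is_relevant in feedback.values() if is_relevant)
    return relevant / len(feedback)

# Hypothetical reviewer verdicts: document identifier -> was it relevant?
feedback = {
    "doc-001": True,
    "doc-002": False,
    "doc-003": True,
    "doc-004": True,
}

print(precision_from_feedback(feedback))  # 0.75

# The rejected recommendations are the cases worth feeding back
# into the next round of model refinement.
corrections = [doc_id for doc_id, ok in feedback.items() if not ok]
print(corrections)  # ['doc-002']
```

Tracking a figure like this over time is one simple way to see whether the human feedback is actually improving the recommendations.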

In Human + Machine: Reimagining Work in the Age of AI by Paul Daugherty and H. James Wilson, the authors explain exactly this by using many real-world examples.

A legal professional supported by technology can charge higher hourly rates, leading to higher margins and a better bottom line. But using AI irresponsibly will only lead to disaster.

So, let’s first take a look at the capabilities of large language models, how we can improve them and how to integrate them in other legal technologies. Next, we’ll discuss what is essential to create the necessary trust for large-scale integration of such models in the daily workflows of legal professionals.

Patent Co-pilots

In another experiment, Microsoft has shown that GitHub Copilot, an AI programming tool, can deliver significant gains in developer productivity. The generated code is not always optimal and may not follow today’s cyber-security standards, but one can quickly set up a framework that can then be fine-tuned.

If such a tool is properly integrated into an Integrated Development Environment (IDE) such as Visual Studio, one can achieve a perfect AI+Human experience.

The similarity between programming and patent drafting is obvious. Besides the intellectual part, where the drafting attorney delivers most of the value by, for example, defining the optimal scope of the claims, a large part of the process is more or less editorial work that ensures the comprehensiveness and high quality of the patent application. One needs to be consistent, have suitable claim fallback positions and sufficient support in the description, and overall be prepared for the “unknown future”, just as a professional programmer needs to produce production-ready, documented code covered by automatic tests. The fascinating aspect is that AI, harnessed the right way, can assist the attorney in both the intellectual and the editorial parts of the process.

Therefore, the use of patent co-pilots based on AI has the potential to revolutionize the patent industry by enabling patent professionals and engineers to work more efficiently, accurately, and cost-effectively, while also expanding access to legal services for a broader range of clients.

While there are still challenges to be addressed, such as ensuring the transparency and accountability of AI-based systems, the growing adoption of legal co-pilots by leading law firms suggests that this technology is likely to play an increasingly important role in the future of the legal profession.

Generative Artificial Intelligence has caused a watershed moment in our world. The patent industry will change as well, and we at IPRally cannot wait to be part of that development. But we understand that we cannot take shortcuts. Stability, transparency and defensibility are essential when technology such as Artificial Intelligence is used for legal and patent applications.

IPRally is in a unique position to combine these large language models with our own patent search and classification technology. This lets us control the flow and content of conversations and prevent hallucinations and non-factual output better than anyone else in the industry.

Initial applications can be in conversational search and classification, summarization and explanation of the impact of claims of a patent application, and assistance with the drafting of patent applications. But we also think of many other applications to improve our customers' productivity and make patent searching easier, better and maybe even fun!

References

Vaswani, Ashish, et al. "Attention is all you need." Advances in Neural Information Processing Systems 30 (2017).

Devlin, Jacob, et al. "BERT: Pre-training of deep bidirectional transformers for language understanding." arXiv preprint arXiv:1810.04805 (2018).

Tenney, Ian, Dipanjan Das, and Ellie Pavlick. "BERT rediscovers the classical NLP pipeline." arXiv preprint arXiv:1905.05950 (2019).

Radford, Alec, Jeff Wu, Rewon Child, David Luan, Dario Amodei, and Ilya Sutskever. "Language models are unsupervised multitask learners." (2019). GPT-2.

Brown, Tom B., et al. "Language models are few-shot learners." arXiv preprint arXiv:2005.14165 (2020). GPT-3.

Ouyang, Long, et al. "Training language models to follow instructions with human feedback." arXiv preprint arXiv:2203.02155 (2022).

Tegmark, Max. Life 3.0: Being Human in the Age of Artificial Intelligence. First ed. New York: Knopf (2017).

Russell, Stuart. "Artificial intelligence: The future is superintelligent." Nature 548 (7669): 520–521 (2017). doi:10.1038/548520a.

Russell, Stuart. Human Compatible. (2019).

Sakari Arvela and Jan C. Scholtes

April 20, 2023
