Which OpenAI Model for Legal Work? Update for GPT-5

by Avi Gesser, Diane Bernabei, and William J. Sadd


From left to right: Avi Gesser, Diane Bernabei, and William J. Sadd (photos courtesy of Debevoise & Plimpton LLP)

We recently provided a quick guide to the comparative capabilities of the various models currently available through ChatGPT Enterprise, based on our experience as lawyers using these models and on OpenAI’s own recommendations. Below is an update to that guide based on our use of the new GPT-5 models since they became available to us on August 7, 2025. Again, this blog post is not a review or endorsement of any particular GenAI model.

GPT-5 (Flagship/Router) (use in similar ways to GPT-4o)

- Overview: Default in ChatGPT. Serves as the base model (like GPT-4o) but also acts as a router to other models when it judges them more appropriate for the query. For example, if you include “think hard” in the prompt, it will likely route the query to the GPT-5 Thinking model.
- Average response time: A few seconds for simple prompts; longer when routing to other models.
- Inputs: Text, files, and images (multimodal); can analyze documents, images, and audio, and can generate images via ChatGPT tools.
- Examples of good uses for lawyers: Proofreading, drafting emails, and summarizing documents that are not particularly complex; summarizing case law or contracts; first drafts of simple legal documents; answering general legal questions; generating illustrative diagrams or images for presentations.
- Enterprise usage limits: Unlimited.
- Context window (ChatGPT, not API): 128,000 tokens (~190 single-spaced pages).

GPT-5 Thinking (use in similar ways to o3)

- Overview: Good for complex, multi-step analysis. Excels at logical reasoning, code, math, and visual tasks.
- Average response time: Typically 30–90 seconds (expect longer for image generation).
- Inputs: Text, files, and images; full tool use (web browsing, file analysis with Python, etc.) for multimodal reasoning.
- Examples of good uses for lawyers: Complex legal analyses and strategy; in-depth case reasoning, multi-step legal argument development, or analyzing large evidence datasets with code; planning or other tasks requiring rigorous step-by-step logic.
- Enterprise usage limits: 200/week, temporarily increased to 3,000/week.
- Context window (ChatGPT, not API): 196,000 tokens (~294 single-spaced pages).

GPT-5 Pro (use in similar ways to o3-pro)

- Overview: Same core capabilities as GPT-5 Thinking but with extended reasoning time for more accurate, research-grade responses.
- Average response time: 3–10 minutes.
- Inputs: Text, files, and images; full tool use (web browsing, file analysis with Python, etc.) for multimodal reasoning.
- Examples of good uses for lawyers: Very complex issues where accuracy is critical (e.g., double-checking important legal arguments or calculations).
- Enterprise usage limits: 15 requests/month.
- Context window (ChatGPT, not API): 128,000 tokens (~190 single-spaced pages).

Context Windows

In addition to differences across the GPT-5 models, it’s important to keep in mind a practical limit that applies to all models: the size of their context windows. Each GPT-5 model has a fixed context window, which is the ceiling on the number of tokens (basic units of text like words, parts of words, or punctuation marks) it can process at once, including uploaded documents, questions, and prior answers. If you hit that ceiling (e.g., 128K tokens, or roughly 190 pages, when using the GPT-5 Flagship model), the system may reject the request or drop the earliest content. Even when the document fits, very large inputs may reduce precision.

A common workaround is to split documents into smaller chunks (30–50 pages) and query them separately. As a rule of thumb, avoid filling the window completely: keep uploads to 80–90% of capacity (100K–110K tokens, roughly 150–165 pages) to leave room for questions, answers, and follow-ups. This preserves recall, avoids hitting the hard token ceiling, and improves precision.
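For readers who want to budget tokens before uploading, the arithmetic above can be sketched in a few lines of Python. This is a rough illustration only: the ~4-characters-per-token ratio is a common heuristic for English prose (not an OpenAI figure), and exact counts require OpenAI’s own tokenizer. The 128K ceiling and 80–90% target come from the guidance above; the chunk size and sample text are placeholders.

```python
CONTEXT_WINDOW = 128_000                  # GPT-5 flagship ceiling (tokens)
SAFE_BUDGET = int(CONTEXT_WINDOW * 0.85)  # leave ~15% headroom for Q&A

def estimate_tokens(text: str) -> int:
    """Rough estimate: ~4 characters per token for English prose."""
    return len(text) // 4

def chunk_by_tokens(text: str, max_tokens: int = 25_000) -> list[str]:
    """Split text into pieces of roughly max_tokens estimated tokens
    (about 35-40 single-spaced pages per chunk)."""
    max_chars = max_tokens * 4
    return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]

doc = "word " * 200_000  # stand-in for a long contract or record set
n = estimate_tokens(doc)
if n > SAFE_BUDGET:
    chunks = chunk_by_tokens(doc)
    print(f"~{n:,} tokens; split into {len(chunks)} chunks")
else:
    print(f"~{n:,} tokens; fits within the safe budget")
```

A real workflow would substitute an actual tokenizer (such as OpenAI’s open-source tiktoken library) for the character heuristic, but the budgeting logic is the same.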

Legacy ChatGPT Models

The legacy models discussed in our previous post remain available for enterprise users (with the exception of GPT-4.1 and o4-mini-high). If they do not appear in the model selector drop-down, administrators within an organization should be able to re-enable them in the enterprise workspace. Some users find that, because they are more familiar with the older models, they get better results with them than with the GPT-5 models, or they simply prefer the tone and style of the previous models. That said, although OpenAI has not yet set a date, users should plan for eventual deprecation and begin testing on the GPT-5 series of models.

New Feature – Study Mode

All the features discussed in our recent blog post, including Deep Research, Canvas, Persistent Memory, Custom GPTs, and Projects, are still available with the GPT-5 models. One recently added feature is “Study and Learn Mode,” which is available by clicking the + in the prompt window. Turning on the feature shifts chats into a guided “learn together” flow, with Socratic questions, step-by-step scaffolding, and checks for understanding. It can work off files you upload, including cases, law review articles, regulations, or technical documents (e.g., “over the next 30 minutes, teach me about the issues in dispute in the various AI copyright cases”). For lawyers, this could be a useful way to deepen subject-matter expertise or train junior associates on new topics.

Avi Gesser is a partner, Diane Bernabei is an associate, and William J. Sadd is Head of Practice and AI Systems at Debevoise & Plimpton LLP. This post was originally published on the Debevoise Data Blog.

The views, opinions and positions expressed within all posts are those of the authors alone and do not represent those of the Program on Corporate Compliance and Enforcement (PCCE) or of the New York University School of Law. PCCE makes no representations as to the accuracy, completeness and validity of any statements made on this site and will not be liable for any errors, omissions or representations. The copyright of this content belongs to the authors and any liability with regards to infringement of intellectual property rights remains with the authors.