, and it possesses powerful and helpful features. The model has a variety of parameters and options, which you need to select correctly to optimize GPT-5’s performance for your application area.
In this article, I’ll deep-dive into the different options you have when using GPT-5 and help you choose the optimal settings for your use case. I’ll cover the input modalities you can use, the features GPT-5 offers, such as tools and file upload, and the parameters you can set for the model.
This article is not sponsored by OpenAI, and is simply a summary of my experiences from using GPT-5, discussing how you can use the model effectively.
To create a custom tool, you first define a function:
def get_weather(city: str):
    return "Sunny"
You can then make your custom tools available to your model, along with a description and the parameters for your function:
tools = [
    {
        "type": "function",
        "name": "get_weather",
        "description": "Get today's weather.",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {
                    "type": "string",
                    "description": "The city you want the weather for",
                },
            },
            "required": ["city"],
        },
    },
]
It’s important to provide detailed, descriptive information in your function definitions, including a description of the function itself and of the parameters it takes.
You can define many tools to make available to your model (a sketch of passing them to a request follows after this list), but it’s important to remember the core principles for AI tool definitions:
- Tools are well described
- Tools do not overlap
- It’s obvious to the model when to use each tool; ambiguity makes tool usage ineffective
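To make this concrete, below is a minimal sketch of passing the tools list to a request and running the matching function when the model asks for it. This is based on my own setup with the Responses API, so treat the exact response fields as an assumption and verify them against OpenAI’s documentation:
import json
from openai import OpenAI

client = OpenAI()
messages = [{"role": "user", "content": "What's the weather in Oslo?"}]

response = client.responses.create(
    model="gpt-5",
    input=messages,
    tools=tools,
)

# If GPT-5 decides to call the tool, the output contains a function_call item
# holding the function name and a JSON string of arguments.
for item in response.output:
    if item.type == "function_call" and item.name == "get_weather":
        args = json.loads(item.arguments)
        print(get_weather(**args))
In a real application, you would append the function result back to the conversation and call the model again, so it can phrase a final answer for the user.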
Parameters
There are three main parameters you should care about when using GPT-5:
- Reasoning effort
- Verbosity
- Structured output
I’ll now describe the different parameters and how to approach selecting them.
Reasoning effort
Reasoning effort is a parameter where you select from:
- minimal
- low
- medium
- high
Minimal reasoning essentially makes GPT-5 a non-reasoning model and should be used for simpler tasks, where you need quick responses. You can, for example, use minimal reasoning effort in a chat application where the questions are simple to answer, and the users expect rapid responses.
The more difficult your task is, the more reasoning you should use, though you should keep in mind the cost and latency of using more reasoning. Reasoning counts as output tokens, which at the time of writing cost 10 USD per million tokens for GPT-5.
I usually experiment with the model, starting from the lowest reasoning effort. If I notice the model struggles to give high-quality responses, I move up on the reasoning level, first from minimal -> low. I then continue to test the model and see how well it performs. You should strive to use the lowest reasoning effort with acceptable quality.
You can set the reasoning effort with:
from openai import OpenAI

client = OpenAI()
request_params = {
    "model": "gpt-5",
    "input": messages,
    "reasoning": {"effort": "medium"},  # can be: minimal, low, medium, high
}
client.responses.create(**request_params)
Verbosity
Verbosity is another important configurable parameter, and you can choose from:
- low
- medium
- high
Verbosity controls how many output tokens (excluding thinking tokens) the model produces. The default is medium verbosity, which OpenAI has stated is essentially the level used by their previous models.
If you want the model to generate longer and more detailed responses, set verbosity to high. However, I mostly find myself choosing between low and medium verbosity.
- For chat applications, medium verbosity is good because a very concise model may make the users feel the model is less helpful (a lot of users prefer some more details in responses).
- For extraction purposes, however, where you only want to output specific information, such as the date from a document, I set the verbosity to low. This helps ensure the model only responds with the output I want (the date), without providing additional reasoning and context.
You can set the verbosity level with:
from openai import OpenAI

client = OpenAI()
request_params = {
    "model": "gpt-5",
    "input": messages,
    "text": {"verbosity": "medium"},  # can be: low, medium, high
}
client.responses.create(**request_params)
Structured output
Structured output is a powerful setting you can use to ensure GPT-5 responds in JSON format. This is again useful when you want to extract specific data points, and no other text, such as the date from a document. It guarantees that the model responds with a valid JSON object, which you can then parse. All the metadata extraction I do uses structured output, as it is extremely useful for ensuring consistency. You can enable it by adding the “text” key to the request params, as shown below.
from openai import OpenAI

client = OpenAI()
request_params = {
    "model": "gpt-5",
    "input": messages,
    "text": {"format": {"type": "json_object"}},
}
client.responses.create(**request_params)
Make sure to mention “JSON” in your prompt when using structured output; otherwise, you’ll get an error.
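Since the response is guaranteed to be valid JSON, parsing it is straightforward. A small sketch, reusing request_params from above and assuming the prompt asks for an object with a hypothetical “date” field:
import json

response = client.responses.create(**request_params)
data = json.loads(response.output_text)  # output_text holds the model's text response
print(data["date"])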
File upload
File upload is another powerful feature available through GPT-5. I discussed earlier the multimodal abilities of the model. However, in some scenarios, it’s useful to upload a document directly and have OpenAI parse it. For example, if you haven’t performed OCR or extracted images from a document yet, you can instead upload the document directly to OpenAI and ask it questions. From experience, uploading files is also fast, and you’ll usually get rapid responses, mostly depending on the reasoning effort you request.
If you need quick responses from documents and don’t have time to use OCR first, file upload is a powerful feature you can use.
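As a rough sketch of how this looks with the Responses API (the file name and question are placeholders, and the content types reflect my setup, so double-check them against OpenAI’s documentation):
from openai import OpenAI

client = OpenAI()

# Upload the raw document; no OCR or image extraction needed on your side.
uploaded = client.files.create(file=open("report.pdf", "rb"), purpose="user_data")

response = client.responses.create(
    model="gpt-5",
    input=[
        {
            "role": "user",
            "content": [
                {"type": "input_file", "file_id": uploaded.id},
                {"type": "input_text", "text": "What is the date on this document?"},
            ],
        }
    ],
)
print(response.output_text)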
Downsides of GPT-5
GPT-5 also has some downsides. The main downside I’ve noticed during use is that OpenAI does not share the thinking tokens when you use the model. You can only access a summary of the thinking.
This is very restrictive in live applications: if you want to use higher reasoning efforts (medium or high), you cannot stream any of the model’s thinking to the user while it reasons, which makes for a poor user experience. The alternative is to use lower reasoning efforts, which leads to lower-quality outputs. Other frontier model providers, such as Anthropic and Google (with Gemini), do make thinking tokens available.
There’s also been a lot of discussion about how GPT-5 is less creative than its predecessors, though this is usually not a big problem with the applications I’m working on, since creativity usually isn’t a requirement for API usage of GPT-5.
Conclusion
In this article, I’ve provided an overview of GPT-5 with the different parameters and options, and how to most effectively utilize the model. If used right, GPT-5 is a very powerful model, though it naturally also comes with some downsides, the main one from my perspective being that OpenAI doesn’t share the reasoning tokens. Whenever working on LLM applications, I always recommend having backup models available from other frontier model providers. This could, for example, be having GPT-5 as the main model, but if it fails, you can fall back to using Gemini 2.5 Pro from Google.
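A hypothetical sketch of that fallback pattern, where call_gpt5 and call_gemini are placeholder wrappers around the two providers’ SDKs rather than real library functions:
def answer_with_fallback(messages):
    try:
        # Primary model: GPT-5 (wrapper around the OpenAI SDK, not shown here).
        return call_gpt5(messages)
    except Exception:
        # Fallback: Gemini 2.5 Pro (wrapper around Google's SDK, not shown here).
        return call_gemini(messages)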
👉 Find me on socials:
🧑💻 Get in touch
✍️ Medium
You can also read my other articles: