I've run 50+ tests today, and I have not gotten GPT-5 to do a single parallel tool call with a CFG schema.
This is not listed anywhere in the docs. I assume the reason why it does this is because it's constraining the model output, which means you have to have it just do one tool call at a time.
Explanations are below; labeled screenshots are provided at the bottom.
GPT-5-MINI MINIMAL REASONING EFFORT
| Mode | API calls to OpenAI | Breakdown |
|---|---|---|
| normal functions | 3 | 1 to get current date/time 1 to add 3 todos 1 for final summary |
| cfg functions | 5 | 1 to get current date/time 1 to add 1 todo 1 to add 1 todo 1 to add 1 todo 1 for final summary |
GPT-5 HIGH REASONING EFFORT
| Mode | API calls to OpenAI | Breakdown |
|---|---|---|
| normal functions | 2 | 1 for 4 tool calls (price and shipping info) 1 for final summary |
| cfg functions | 5 | 1 to get price info tool call 1 to get shipping info tool call 1 to get price info tool call 1 to get shipping info tool call 1 for final summary |
GPT-5 MINIMAL REASONING EFFORT
| Mode | API calls to OpenAI | Breakdown |
|---|---|---|
| normal functions | 2 | 1 with 2 tool calls (list unread threads and get calendar availability) 1 for final summary |
| cfg functions | 3 | 1 with 1 tool call (list unread threads) 1 with 1 tool call (get calendar availability) 1 for final summary |
This project uses uv for package management and running.
- get an openai api key and add it to a
.envfile in the root of the project
OPENAI_API_KEY=sk-...
- install dependencies
uv sync
- run the tests
uv run todos-test/cfg_functions.pyuv run todos-test/normal_functions.pyuv run price-test/cfg_price_compare.pyuv run price-test/price_compare.pyuv run email-triage-test/cfg_email_triage.pyuv run email-triage-test/email_triage.py
- check the output in the
outputdirectory (and see what tools are being called per request in the terminal output)
- normal functions
- cfg functions
- normal functions
- cfg functions
- normal functions
- cfg functions