ChatGPT JSON Mode Not Returning Valid JSON? Here's the Fix
You enabled JSON mode in the OpenAI API, and the response is still not valid JSON. Maybe it is wrapped in markdown code fences. Maybe it includes a conversational preamble before the actual object. Maybe it just returns a string that looks like JSON but fails JSON.parse().
This is one of the most common frustrations with the OpenAI API, and the fixes are straightforward once you understand what is actually happening.
How JSON Mode Works (and Its Limitations)
When you set `response_format: { type: "json_object" }` in an OpenAI API call, you are telling the model to constrain its output to valid JSON. But there are important caveats:
- You must mention "JSON" in the system or user prompt. This is a hard requirement. If your prompt does not include the word "JSON," the API will return an error or ignore the format constraint.
- JSON mode guarantees valid JSON syntax, not valid structure. The model will return parseable JSON, but it might not match the schema you expect. It could return `{"result": "I don't know"}` when you expected `{"users": [...]}`.
- It does not work with all models. JSON mode is available on `gpt-4o`, `gpt-4o-mini`, `gpt-4-turbo`, and `gpt-3.5-turbo-1106` and later. Older models ignore it.
- Streaming can produce partial invalid JSON. If you are streaming the response and parsing chunks, you will get invalid JSON until the stream completes.
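The streaming caveat is manageable if you buffer: collect the content deltas and call `json.loads` only after the stream finishes. A minimal sketch, with the stream simulated as a list of chunk strings (with the real SDK you would read `chunk.choices[0].delta.content`):

```python
import json

def parse_streamed_json(chunks):
    """Accumulate streamed content; parse only once the stream is complete."""
    buffer = ""
    for chunk in chunks:       # each chunk alone is usually invalid JSON
        buffer += chunk or ""  # content deltas can be None or empty
    return json.loads(buffer)  # safe now: the stream has ended

# Simulated stream of partial chunks
chunks = ['{"langs": ["Py', 'thon", "Go"', ']}']
print(parse_streamed_json(chunks))  # {'langs': ['Python', 'Go']}
```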
Common Problems and Fixes
Problem 1: Markdown Code Fences in Output
You get back:
````
```json
{"name": "Alice", "age": 30}
```
````
Cause: You are using the Chat Completions API without `response_format` set, or you are using a model that does not support it. The model defaults to markdown formatting.
Fix: Ensure you are passing the response format correctly:
```python
import json

from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4o",
    response_format={"type": "json_object"},
    messages=[
        {"role": "system", "content": "You are a helpful assistant. Respond in JSON."},
        {"role": "user", "content": "List 3 programming languages with their year of creation."}
    ]
)

data = response.choices[0].message.content
parsed = json.loads(data)  # Parses: JSON mode guarantees valid syntax
```
If you still get markdown fences (perhaps because you are using a wrapper or different provider), strip them:
```python
import re
import json

def parse_llm_json(text):
    # Strip a leading ```json (or bare ```) fence and a trailing ``` fence
    text = re.sub(r'^```(?:json)?\s*\n?', '', text.strip())
    text = re.sub(r'\n?```\s*$', '', text.strip())
    return json.loads(text)
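As a quick sanity check, the same two substitutions recover parseable JSON from a fenced reply (the helper name here is illustrative):

```python
import json
import re

def strip_fences(text: str) -> str:
    # Remove a leading ```json (or bare ```) fence and a trailing ``` fence
    text = re.sub(r'^```(?:json)?\s*\n?', '', text.strip())
    return re.sub(r'\n?```\s*$', '', text).strip()

fenced = '```json\n{"name": "Alice", "age": 30}\n```'
parsed = json.loads(strip_fences(fenced))
print(parsed["name"])  # Alice
```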
Problem 2: Output Does Not Match Expected Schema
You ask for `{"users": [{"name": "...", "email": "..."}]}` and get `{"data": [{"user_name": "...", "user_email": "..."}]}`.

Fix: Use Structured Outputs instead of JSON mode. This is the newer, more reliable approach:
```python
from pydantic import BaseModel

class User(BaseModel):
    name: str
    email: str

class UserList(BaseModel):
    users: list[User]

# Reuses the client from the earlier example
response = client.beta.chat.completions.parse(
    model="gpt-4o",
    response_format=UserList,
    messages=[
        {"role": "system", "content": "Extract user information."},
        {"role": "user", "content": "Alice (alice@test.com) and Bob (bob@test.com)"}
    ]
)

user_list = response.choices[0].message.parsed
print(user_list.users[0].name)  # "Alice"
```
Structured Outputs guarantee the response matches your Pydantic model exactly. No more schema guessing.

Problem 3: Truncated JSON (max_tokens Hit)
The response cuts off mid-object:
```json
{"users": [{"name": "Alice", "age": 30}, {"name": "Bob", "ag
```
Cause: The `max_tokens` limit was reached before the model finished generating.

Fix: Check `response.choices[0].finish_reason`. If it is `"length"` instead of `"stop"`, the output was truncated:
```python
choice = response.choices[0]
if choice.finish_reason == "length":
    print("Warning: response was truncated. Increase max_tokens.")
```
Increase `max_tokens` or ask for less data per request. For large datasets, paginate your requests.

Problem 4: Invalid JSON Despite JSON Mode
In rare cases, the model produces syntactically invalid JSON even with JSON mode enabled. This typically happens with very long outputs or complex nested structures.
Fix: Add a repair step. You can fix common JSON errors programmatically:
```javascript
// Using jsonrepair (npm install jsonrepair)
import { jsonrepair } from 'jsonrepair';

const broken = '{"name": "Alice", "age": 30,}'; // trailing comma
const fixed = jsonrepair(broken);
const parsed = JSON.parse(fixed);
```
The jsonrepair npm package handles trailing commas, missing quotes, single quotes instead of double quotes, and other common issues. For an online option, jsonshield.com has a JSON fixer that repairs broken LLM output in the browser.

In Python:
```python
# pip install json-repair
import json
from json_repair import repair_json

broken = '{"name": "Alice", "hobbies": ["reading", "hiking",]}'
fixed = repair_json(broken)
parsed = json.loads(fixed)
```
Problem 5: The API Returns an Error About JSON Mode
```
Error: 'messages' must contain the word 'json' in some form
```
Fix: Add the word "JSON" somewhere in your messages. The simplest approach is in the system prompt:
```python
messages=[
    {"role": "system", "content": "Always respond in valid JSON format."},
    {"role": "user", "content": "What is the capital of France?"}
]
```
JSON Mode vs Structured Outputs: Which Should You Use?
| Feature | JSON Mode | Structured Outputs |
| --- | --- | --- |
| Guarantees valid JSON | Yes | Yes |
| Guarantees schema match | No | Yes |
| Supports complex schemas | N/A | Yes (Pydantic/Zod) |
| Model support | GPT-3.5+, GPT-4+ | GPT-4o+ |
| Streaming support | Partial | Yes (with parsing) |
Use JSON mode when you need flexible JSON output and can handle varying structures. Use Structured Outputs when you need a specific schema, which is almost always the better choice for production code.
Defensive Parsing Pattern
For production systems, wrap your JSON extraction in a robust parser:
```python
import json
import re
from json_repair import repair_json

def extract_json(llm_output: str) -> dict:
    """Extract and parse JSON from LLM output, handling common issues."""
    text = llm_output.strip()

    # Strip markdown fences
    text = re.sub(r'^```(?:json)?\s*\n?', '', text)
    text = re.sub(r'\n?```\s*$', '', text)
    text = text.strip()

    # Try direct parse first
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        pass

    # Try repair
    try:
        repaired = repair_json(text)
        return json.loads(repaired)
    except Exception:
        pass

    # Last resort: find the first { or [ and extract
    match = re.search(r'[\{\[]', text)
    if match:
        candidate = text[match.start():]
        try:
            return json.loads(repair_json(candidate))
        except Exception:
            pass

    raise ValueError(f"Could not extract JSON from: {text[:200]}")
```
This pattern handles markdown fences, broken syntax, and conversational preambles, and in practice it covers the vast majority of malformed LLM JSON output.
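If you cannot take the json-repair dependency, a stdlib-only variant of the same cascade (fence stripping, direct parse, then first-brace extraction, minus the repair step) still covers the fence and preamble cases:

```python
import json
import re

def extract_json_lite(llm_output: str) -> dict:
    """Stdlib-only fallback: strip fences, try a direct parse, then parse from the first brace."""
    text = re.sub(r'^```(?:json)?\s*\n?', '', llm_output.strip())
    text = re.sub(r'\n?```\s*$', '', text).strip()
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        pass
    # Conversational preamble: parse from the first { or [ onward
    match = re.search(r'[\{\[]', text)
    if match:
        return json.loads(text[match.start():])
    raise ValueError("No JSON found in output")

print(extract_json_lite('Sure! Here is the data: {"name": "Alice"}'))  # {'name': 'Alice'}
```

Note that without the repair step this cannot fix broken syntax, and it assumes the JSON runs to the end of the string; trailing prose after the object would still fail to parse.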