OpenAI API Unrecognized Messages Argument

Problem

When using OpenAI's gpt-3.5-turbo or similar chat models, developers often encounter the error:
InvalidRequestError: Unrecognized request argument supplied: messages

This occurs when:

  1. Using the legacy openai.Completion.create() method with chat models
  2. Attempting to pass messages parameter to the completions endpoint
  3. Using an incompatible SDK version for chat-based models

The error happens because chat models (GPT-3.5-turbo, GPT-4) require the Chat Completions API endpoint (/v1/chat/completions), not the legacy Completions endpoint (/v1/completions).
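
For reference, the failure can be reproduced with a call shaped like the one below (a minimal sketch, assuming a pre-1.0 SDK and a placeholder API key):

python
import openai  # pre-1.0 SDK (e.g. 0.28.x)

openai.api_key = "your-api-key"

# Chat models expect the Chat Completions endpoint; sending `messages`
# to the legacy Completions endpoint triggers:
# InvalidRequestError: Unrecognized request argument supplied: messages
openai.Completion.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Hello!"}],
)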

Solution

Always match your API method to the model type:

For Chat Models (gpt-3.5-turbo, gpt-4)

Use the Chat Completions API methods:

Option 1: OpenAI SDK v1.0.0+ (recommended)

Update to the latest SDK and use the new client structure:

bash
pip install --upgrade openai
python
from openai import OpenAI
client = OpenAI(api_key="your-api-key")

# Example prompts to send; replace these with your own
prompts = ["Say hello", "Name a color"]

responses = []
for prompt in prompts:
    completion = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
        max_tokens=20,
        temperature=0
    )
    # Chat responses live at choices[0].message.content, not choices[0].text
    responses.append(completion.choices[0].message.content)

Option 2: OpenAI SDK < v1.0.0

Use the ChatCompletion class:

python
import openai
openai.api_key = "your-api-key"

# Example prompts to send; replace these with your own
prompts = ["Say hello", "Name a color"]

responses = []
for prompt in prompts:
    completion = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
        max_tokens=20,
        temperature=0
    )
    responses.append(completion.choices[0].message.content)

Critical Differences

  • Use openai.ChatCompletion.create() instead of openai.Completion.create()
  • Remove parameters the chat endpoint does not accept, such as suffix, echo, and best_of (note that restart_sequence is a Playground setting, not an API parameter, while sampling parameters like top_p and frequency_penalty remain supported)

Key Differences Between Endpoints

| Feature | /v1/completions | /v1/chat/completions |
|---|---|---|
| Compatible Models | gpt-3.5-turbo-instruct, davinci-002 | gpt-3.5-turbo, gpt-4 |
| Required Parameter | prompt (string or list) | messages (list of dicts) |
| Message Format | Raw text | Role-content pairs |
| Best For | Basic text completion | Conversation workflows |
| Response Location | choices[0].text | choices[0].message.content |
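
To make the table concrete, here is a minimal sketch contrasting the two endpoints (assuming SDK v1.0.0+ and the models listed above):

python
from openai import OpenAI

client = OpenAI(api_key="your-api-key")

# Legacy Completions endpoint: takes `prompt`, answer is at choices[0].text
legacy = client.completions.create(
    model="gpt-3.5-turbo-instruct",
    prompt="Say hello",
    max_tokens=20,
)
print(legacy.choices[0].text)

# Chat Completions endpoint: takes `messages`, answer is at choices[0].message.content
chat = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Say hello"}],
    max_tokens=20,
)
print(chat.choices[0].message.content)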

Common Mistakes to Avoid

  1. Mixing endpoint types

    python
    # WRONG
    openai.Completion.create(model="gpt-3.5-turbo", messages=[...])
    
    # CORRECT
    openai.ChatCompletion.create(model="gpt-3.5-turbo", messages=[...])
  2. Incorrect response parsing

    python
    # WRONG (completions format)
    response['choices'][0]['text']
    
    # CORRECT (chat format)
    response.choices[0].message.content
  3. Outdated SDK version
    Always verify your installed version with the command below (a programmatic check is sketched after this list):

    bash
    pip show openai
  4. Incorrect message format
    The messages parameter requires role-content dictionaries:

    python
    # VALID
    messages=[
        {"role": "system", "content": "You are a helpful assistant"},
        {"role": "user", "content": "Hello!"}
    ]
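
As mentioned in mistake 3, you can also check the installed SDK version from Python rather than the shell; a minimal sketch using only the standard library:

python
from importlib.metadata import version

# Prints the installed openai package version, e.g. "1.3.0" or "0.28.1"
print(version("openai"))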

Pro Tip

When migrating from legacy models (a minimal wrapper sketch follows this list):

  1. Replace prompt with a messages list of role-content dictionaries
  2. Switch to ChatCompletion methods
  3. Handle responses at .choices[0].message.content
  4. Remove incompatible parameters (like best_of, logprobs)
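
Putting those steps together, a wrapper like the one below can ease the transition. This is only a sketch: the `complete` helper and the hard-coded model are illustrative assumptions, not part of the OpenAI SDK, and it assumes SDK v1.0.0+.

python
from openai import OpenAI

client = OpenAI(api_key="your-api-key")

def complete(prompt: str, **kwargs) -> str:
    """Legacy-style helper: takes a prompt string, calls the Chat
    Completions endpoint, and returns plain text like the old API did."""
    completion = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],  # prompt -> messages
        **kwargs,  # e.g. max_tokens, temperature; drop parameters the chat endpoint rejects
    )
    return completion.choices[0].message.content  # .text -> .message.content

print(complete("Say hello", max_tokens=20))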

Troubleshooting 404 Errors

If you see:

bash
openai.NotFoundError: 404 - This is a chat model and is not supported in the v1/completions endpoint
  1. Verify your model name matches exactly
  2. Confirm SDK version supports chat models
  3. Ensure you're using openai.ChatCompletion.create() for older SDKs or client.chat.completions.create() for v1.0.0+
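
To surface this failure with a clearer message in your own code, one option is to catch the exception explicitly (a sketch assuming SDK v1.0.0+, where openai.NotFoundError is defined):

python
import openai
from openai import OpenAI

client = OpenAI(api_key="your-api-key")

try:
    # Deliberately wrong: a chat model sent to the legacy Completions endpoint
    client.completions.create(model="gpt-3.5-turbo", prompt="Say hello")
except openai.NotFoundError as err:
    print(f"Wrong endpoint for this model: {err}")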

Best Practices

  1. Always use the latest SDK:

    bash
    pip install --upgrade openai
  2. Set explicit roles in messages:

    python
    messages=[
        {"role": "system", "content": "Set assistant behavior"},
        {"role": "user", "content": "Your question here"}
    ]
  3. Handle rate limits:

    python
    from tenacity import retry, stop_after_attempt, wait_exponential
    
    @retry(stop=stop_after_attempt(3), wait=wait_exponential(multiplier=1, min=4, max=10))
    def get_chat_response(prompt):
        return client.chat.completions.create(
            model="gpt-3.5-turbo",
            messages=[{"role": "user", "content": prompt}],
            max_tokens=100
        )
  4. Stream responses for long interactions:

    python
    stream = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": "Tell me about quantum computing"}],
        stream=True
    )
    
    for chunk in stream:
        if chunk.choices[0].delta.content:
            print(chunk.choices[0].delta.content, end="")