OpenAI offers a broad range of models via the API for text generation, reasoning, coding, tool use, structured outputs, and document-centric workflows.

According to the official model overview, the current models support text and image input, text output, multilingual capabilities, and vision; they are available through the Responses API and client SDKs. For complex tasks, OpenAI recommends gpt-5.4 by default; for lower latency and cost, OpenAI points to gpt-5.4-mini and gpt-5.4-nano.
OpenAI

LLM “Access our frontier models and APIs.”

8.0/10 KIFOX Score – Very good

Location: USA
OpenAI OpCo, LLC, 1455 3rd Street, San Francisco, CA 94158, USA

Image Generation Embeddings Function Calling AI Agents LLM API Multimodal AI Programming Reasoning Model Text-to-Speech Language Model Text Generation Transcription Video Generation
Free – There is a free usage tier in the API rate limit system for users in permitted geographies.

Other:

Token-based API usage – Billing by model, input/output tokens, cached input, audio/image/tool usage, and other usage-dependent factors.

Batch / Flex / Priority / Scale Tier – Options for controlling costs and latency for larger or plannable workloads.

Fine-Tuning / Evals / Tools / Agents – Additional API features for customization, evaluation, agents, web search, file search, code interpreter, realtime, and structured outputs.

Data Residency / ZDR / EKM – Enterprise-grade data controls with regional storage/processing, Zero Data Retention or Modified Abuse Monitoring, and external key management.

Calculate tokens and costs with the KIFOX Tokenizer
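
The billing factors named above (input tokens, cached input, output tokens) can be combined into a rough per-request estimate. The following is a minimal sketch in Python, not an official calculator: the class and function names are hypothetical, the prices are placeholder values invented for illustration, and it assumes that cached tokens are a subset of the input tokens and are billed at their own rate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Prices:
    """USD per 1 million tokens. Placeholder values only, not real rates."""
    input_per_m: float
    cached_input_per_m: float
    output_per_m: float


def estimate_cost(input_tokens: int, cached_tokens: int,
                  output_tokens: int, prices: Prices) -> float:
    """Estimate the cost of one request in USD.

    Assumption: cached_tokens is the part of input_tokens that is billed
    at the cached rate; the remainder is billed at the normal input rate.
    """
    if not 0 <= cached_tokens <= input_tokens:
        raise ValueError("cached_tokens must be between 0 and input_tokens")
    if output_tokens < 0:
        raise ValueError("output_tokens must not be negative")
    uncached_tokens = input_tokens - cached_tokens
    total = (uncached_tokens * prices.input_per_m
             + cached_tokens * prices.cached_input_per_m
             + output_tokens * prices.output_per_m)
    return total / 1_000_000


# Hypothetical price sheet, for illustration only.
example_prices = Prices(input_per_m=2.0, cached_input_per_m=0.5, output_per_m=8.0)
cost = estimate_cost(input_tokens=10_000, cached_tokens=4_000,
                     output_tokens=2_000, prices=example_prices)
print(f"{cost:.4f} USD")
```

Audio, image, and tool usage are billed separately according to the list above and are not covered by this sketch; real rates should be taken from the provider's current price list.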

For new professional text/coding/agent applications: gpt-5.5 or gpt-5.4.

For maximum quality on difficult tasks: gpt-5.5-pro.

For affordable, fast production workloads: gpt-5.4-mini.

For very affordable classification, routing, and extraction: gpt-5.4-nano.

For image generation and image editing: gpt-image-2.

For live voice agents: gpt-realtime-2.

For live translation: gpt-realtime-translate.

For live transcription: gpt-realtime-whisper.

For coding agents: gpt-5.3-codex.

For RAG/knowledge databases: gpt-5.4-mini or gpt-5.4-nano plus text-embedding-3-large/text-embedding-3-small.
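
The recommendations above can be kept in application code as a simple lookup, so that the model choice lives in one place. This is a minimal sketch: the use-case labels and the function name are hypothetical, and only the model names are taken from the list above. Where the list names two options, the first one is returned.

```python
# Use-case labels (keys) are invented for this sketch;
# the model names (values) come from the recommendations above.
RECOMMENDED_MODELS = {
    "professional": ["gpt-5.5", "gpt-5.4"],
    "max_quality": ["gpt-5.5-pro"],
    "fast_production": ["gpt-5.4-mini"],
    "classification_routing_extraction": ["gpt-5.4-nano"],
    "image": ["gpt-image-2"],
    "voice_agent": ["gpt-realtime-2"],
    "live_translation": ["gpt-realtime-translate"],
    "live_transcription": ["gpt-realtime-whisper"],
    "coding_agent": ["gpt-5.3-codex"],
}


def pick_model(use_case: str) -> str:
    """Return the first recommended model for a use case."""
    try:
        return RECOMMENDED_MODELS[use_case][0]
    except KeyError:
        raise ValueError(f"unknown use case: {use_case!r}") from None


print(pick_model("live_translation"))
```

Keeping the mapping in a single table makes later model migrations a one-line change instead of a search across the codebase.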

Target audience: Assessment
Developers / software teams: Very suitable – for chatbots, agents, structured outputs, code, tool calling, RAG, automation, multimodality, audio, image, and production AI applications.
SaaS providers / product teams: Very suitable – if AI is to be embedded directly into their own products, platforms, or workflows.
SMEs with IT resources: Suitable – for support automation, internal search, document analysis, content processes, and data extraction.
Large enterprises: Very suitable – because of the broad model portfolio, Data Residency options, ZDR/Modified Abuse Monitoring, Enterprise Key Management, and governance capabilities.
Private individuals without a technical background: Rather unsuitable – ChatGPT is more appropriate for them; the API requires technical integration.

GPT/reasoning/text models via API

gpt-5.5-pro – For very demanding professional tasks, complex analyses, difficult programming tasks, multi-step reasoning, strategy, architecture, legally/technically demanding drafts, and maximum response quality. According to OpenAI, GPT-5.5 pro is the more precise, more compute-intensive variant of GPT-5.5 and is intended for difficult problems.

gpt-5.5 – Best general frontier model for complex professional work, coding, analysis, technical concepts, agents, RAG workflows, structured outputs, and high-quality text generation. OpenAI describes it as the latest frontier model for complex professional work.

gpt-5.4-pro – For very difficult professional tasks when higher precision is more important than speed. Particularly suitable for complex problem-solving, long analyses, deep reasoning, and difficult code/architecture questions.

gpt-5.4 – For professional standard and enterprise workflows with high quality, but more affordable than GPT-5.5. Suitable for coding, analysis, documentation, agents, knowledge systems, advisory texts, and structured business applications.

gpt-5.4-mini – For fast, cost-efficient applications with good quality: chatbots, assistants, sub-agents, coding help, classification, data extraction, support automation, and high request volumes. OpenAI calls it a strong mini model for coding, Computer Use, and subagents.

gpt-5.4-nano – For very affordable and fast mass processing: classification, ranking, simple data extraction, routing, pre-filtering, tagging, short summaries, and sub-agents. OpenAI describes it as the most affordable GPT-5.4-class model for simple high-volume tasks.

gpt-5.3-chat-latest – ChatGPT-like model that, according to OpenAI, points to the GPT-5.3-Instant snapshot. Suitable when ChatGPT-like response behavior is desired via API; not the first choice for new technical systems if GPT-5.5 or GPT-5.4 are available.

gpt-5.2-pro – Previous pro model for professional work. Suitable for very complex tasks when compatibility, cost, or stability considerations argue against switching to GPT-5.5 pro.

gpt-5.2 – Previous frontier model for professional work with configurable reasoning. Suitable for existing applications tuned to GPT-5.2, as well as for complex analyses, coding, and agent workflows.

gpt-5.1 – Older GPT-5 model for coding and agentic tasks. Suitable for existing applications that are still optimized for GPT-5.1.

gpt-5-pro – Pro variant of GPT-5. Suitable for more complex tasks than GPT-5, especially when more compute and response precision are desired.

gpt-5 – Previous intelligent reasoning model for coding, agents, analysis, and general demanding text tasks. Today more relevant for compatibility and existing implementations.

gpt-5-mini – Faster and more affordable GPT-5 variant for clearly defined tasks, precise prompts, high request volumes, chatbots, simple agents, and standard automation.

gpt-5-nano – Fastest and most affordable GPT-5 variant. Suitable for classification, summarization, simple extraction, routing, tagging, and very high volumes.

gpt-4.1 – Strong non-reasoning model. Suitable for fast, high-quality text generation, coding, instruction following, long contexts, and general API applications when deep reasoning is not required.

gpt-4.1-mini – Smaller and faster GPT-4.1 variant. Suitable for production chatbots, support, content creation, classification, and cost optimization.

gpt-4.1-nano – Very fast, affordable GPT-4.1 variant, now deprecated according to OpenAI. Suitable only for existing workflows, simple classification, and mass processing.

gpt-4o – Multimodal GPT model for text and image understanding, fast chatbots, assistance systems, analysis of images/screenshots, and general applications. Still relevant for existing projects.

gpt-4o-mini – Affordable smaller GPT-4o variant for focused tasks, simple chatbots, classification, short texts, and high volumes. Still relevant for older implementations.

gpt-4-turbo – Older GPT-4 model, now more of a legacy option. Suitable only for existing applications that were deliberately not migrated.

gpt-4 – Older high-intelligence model. Today primarily legacy/compatibility.

gpt-3.5-turbo – Legacy model for affordable chat and text tasks. Today only really useful for old systems.

o3-pro – More compute-intensive variant of o3 for better answers. Suitable for very difficult reasoning tasks when an older o-series model is needed.

o3 – Reasoning model for complex tasks; according to OpenAI, it has now been superseded by GPT-5. Suitable for existing applications optimized for o3.

o4-mini – Fast, cost-efficient reasoning model; according to OpenAI, now deprecated and superseded by GPT-5 mini. Suitable only for existing applications, fast reasoning tasks, coding, and visual tasks.

o3-mini – Older small reasoning model. Today only legacy/compatibility.

o1-pro – Older pro variant of o1 for difficult reasoning tasks. Deprecated/legacy.

o1 – Earlier o-series reasoning generation. Deprecated/legacy.

o1-mini – Earlier small o-series variant. Deprecated/legacy.

o1-preview – Early preview version of the o-series. Deprecated/legacy.

Coding models

OpenAI offers its own Codex/coding models for software development and agentic coding tasks.

gpt-5.3-codex – Currently an important coding model for agentic software development, Codex-like workflows, refactoring, bug fixing, complex codebases, pull request work, and longer coding tasks. OpenAI describes it as a particularly powerful agentic coding model.

gpt-5.2-codex – Predecessor/existing model for long agentic coding tasks, complex code changes, and software development.

gpt-5-codex – Older GPT-5 Codex model for agentic coding. Today more of a legacy option.

gpt-5.1-codex – Older Codex model for coding agents and existing workflows. Deprecated/legacy.

gpt-5.1-codex-max – Variant for longer-running coding tasks. Deprecated/legacy.

gpt-5.1-codex-mini – Smaller, more affordable Codex variant. Deprecated/legacy.

codex-mini-latest – Fast older reasoning model for Codex CLI. Deprecated/legacy.

Image models

OpenAI lists GPT Image 2, GPT Image 1.5, chatgpt-image-latest, GPT Image 1, gpt-image-1-mini, as well as DALL·E 3 and DALL·E 2 in the API model overview.

gpt-image-2 – Current state-of-the-art image model for high-quality image generation and image editing. The official API name is GPT Image 2, not “GPT Image 2.0.” Suitable for realistic images, product images, illustrations, marketing graphics, image variants, inpainting/editing, and professional visual content.

gpt-image-1.5 – Previous image generation model. Suitable for existing workflows tuned to GPT Image 1.5.

chatgpt-image-latest – Previous image model from the ChatGPT context. Suitable when ChatGPT-like image output is desired; for new API projects, GPT Image 2 is generally preferred.

gpt-image-1 – Previous image generation model, now deprecated. Only still relevant for legacy applications.

gpt-image-1-mini – Cost-efficient image model variant. Suitable for more affordable image generation, simple variants, drafts, preview images, and scalable image workflows.

dall-e-3 – Older image generation model, now deprecated. Only still relevant for existing projects.

dall-e-2 – Earlier DALL·E generation, now deprecated. Legacy only.

Realtime, audio, and voice models

OpenAI lists realtime and audio models for live voice interaction, speech-to-speech, transcription, text-to-speech, and audio workflows.

gpt-realtime-2 – Currently the most important realtime voice model for live voice agents, real-time dialogues, call bots, support assistants, voice-controlled agents, tool calling during conversations, and more complex live interactions. OpenAI describes it as a reasoning model for realtime voice interactions.

gpt-realtime-translate – Specialized model for live speech-to-speech translation. Suitable for multilingual real-time communication, customer support, education, meetings, international teams, and interpreting workflows. OpenAI describes it as a streaming speech-to-speech translation model; the current announcement mentions real-time translation from more than 70 input languages into 13 output languages.

gpt-realtime-whisper – Streaming speech-to-text model for live transcription. Suitable for meeting captions, live subtitles, call transcription, minutes, voice notes, and documentation workflows.

gpt-realtime-1.5 – Very good voice model for audio-in/audio-out. Suitable for live voice assistants, call center prototypes, voice UX, and interactive voice dialogues.

gpt-realtime – Realtime model for text and audio input as well as audio output. Suitable for older realtime implementations and existing projects.

gpt-realtime-mini – Cost-efficient realtime variant. Suitable for more affordable voice agents, simple voice dialogues, high request volumes, and prototypes.

gpt-audio-1.5 – Audio-in/audio-out model for Chat Completions-based audio workflows. Suitable for applications that do not necessarily require WebRTC/realtime sessions.

gpt-audio – Audio model for audio inputs and audio outputs via Chat Completions. Suitable for older audio workflows, voice assistants, and multimodal audio apps.

gpt-audio-mini – Cost-efficient audio variant. Suitable for simpler audio tasks and scalable audio workloads.

gpt-4o-audio – Older/deprecated GPT-4o audio model. Suitable only for existing implementations.

gpt-4o-mini-audio – Older/deprecated smaller GPT-4o audio model. Suitable only for existing projects.

gpt-4o-realtime – Older realtime model for text and audio input as well as audio output. Suitable for existing realtime apps.

gpt-4o-mini-realtime – Smaller older realtime variant. Deprecated/legacy.

gpt-4o-transcribe – Speech-to-text model based on GPT-4o. Suitable for high-quality transcription, audio analysis, subtitles, meetings, interviews, and call analysis.

gpt-4o-mini-transcribe – Smaller, more affordable transcription variant. Suitable for high volumes, simple transcription, and cost-sensitive speech-to-text workflows.

gpt-4o-transcribe-diarize – Transcription model with speaker recognition. Suitable for interviews, meetings, conversation logs, and call center analysis when it is important to distinguish who spoke when.

gpt-4o-mini-tts – Text-to-speech model based on GPT-4o-mini. Suitable for natural-sounding speech output, chatbot read-aloud, voice UX, learning content, and simple audio outputs.

tts-1 – Older text-to-speech model, optimized for speed. Suitable for fast TTS output.

tts-1-hd – Older text-to-speech model, optimized for quality. Suitable for higher-quality speech output in legacy workflows.

whisper – General speech recognition model. Suitable for classic transcription, audio-to-text, translation/transcription in older workflows, and legacy applications.

Deep research models

o3-deep-research – Earlier deep research model for intensive research tasks, source work, and multi-step information analysis. According to the model overview, deprecated.

o4-mini-deep-research – Faster/more affordable deep research variant. According to the model overview, deprecated.

Open-weight models

gpt-oss-120b – Open-weight model under the Apache-2.0 license. Suitable for self-hosting, own infrastructure, research, customization, and scenarios where model weights are relevant. OpenAI describes it as the strongest open-weight model that fits into an H100 GPU.

gpt-oss-20b – Smaller open-weight model. Suitable for lower latency, local/self-hosted applications, experiments, and more resource-efficient deployments.

Other API models that are often forgotten

computer-use-preview – Specialized model for computer-use/browser/GUI automation. According to OpenAI, deprecated; only relevant for existing workflows.

gpt-4o-search-preview – Older GPT model for web search in Chat Completions. Deprecated; today more likely replaced by Web Search tools/Responses workflows.

gpt-4o-mini-search-preview – Small older search preview variant. Deprecated/legacy.

omni-moderation – Moderation model for detecting potentially harmful content in text and images. Suitable for safety checks, user-generated content review, and compliance filters.

text-moderation – Older text moderation model. Deprecated/legacy.

text-moderation-stable – Older stable text moderation variant. Deprecated/legacy.

text-embedding-3-large – Powerful embedding model. Suitable for semantic search, RAG, vector indexes, similarity search, knowledge databases, and clustering.

text-embedding-3-small – More affordable embedding model. Suitable for scalable RAG systems, search functions, classification, and semantic similarity at lower cost.

text-embedding-ada-002 – Older embedding model. Today mainly relevant for legacy vector indexes.

sora-2 – Video generation model with synchronized audio; according to the model overview, deprecated.

sora-2-pro – More advanced variant of Sora 2; according to the model overview, deprecated.
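
The embedding models listed above are described as suitable for semantic search, RAG, and similarity search. The retrieval step in such a system typically ranks stored vectors by cosine similarity to a query vector. The following is a minimal sketch of that ranking step only: the function names and document IDs are hypothetical, and the three-dimensional toy vectors stand in for real embeddings, which would be returned by an embedding model and have far more dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention for zero vectors in this sketch
    return dot / (norm_a * norm_b)


def top_k(query_vec, index, k=2):
    """Return the IDs of the k entries most similar to the query."""
    scored = [(cosine_similarity(query_vec, vec), doc_id)
              for doc_id, vec in index.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [doc_id for _, doc_id in scored[:k]]


# Toy vectors, invented for illustration; not real embeddings.
toy_index = {
    "pricing": [0.9, 0.1, 0.0],
    "data-residency": [0.1, 0.9, 0.1],
    "voice-models": [0.0, 0.2, 0.9],
}
query = [0.8, 0.3, 0.0]
results = top_k(query, toy_index, k=2)
print(results)
```

In a production RAG system, a vector database would usually replace the in-memory dictionary, and the retrieved passages would then be passed to a text model such as gpt-5.4-mini or gpt-5.4-nano, as recommended earlier.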

Hosting & Data

✅ = well covered ⚠️ = partial / indirect ❓ = not available / unclear

1) On-prem / local hosting
Meaning: The company operates the solution on its own hardware or within its own infrastructure. In the strictest sense, not only the application runs locally, but ideally the model as well.

2) Private cloud / data center
Meaning: The solution runs in a dedicated or more clearly separated cloud environment, often with a hosting provider or hyperscaler, but in a German data center or in a particularly controlled environment.

3) EU SaaS / managed
Meaning: The provider operates the solution itself as a service. The company uses the tool as a ready-made cloud service, ideally with EU data residency.

4) Hybrid
Meaning: One part of the processing remains internal / local / in a private cloud, while another part runs in an external cloud or EU SaaS.

5) AVV / DPA
Meaning: This is the data processing agreement (German: Auftragsverarbeitungsvertrag, AVV) or Data Processing Addendum. It stipulates that the provider processes personal data on behalf of the customer and is bound by the customer's instructions.

6) No training
Meaning: The provider does not use your prompts, uploads, attachments, chat histories, or outputs for training or improving the general model, ideally excluded by contract.

7) Open-source / transparency path
Meaning: There is a path toward greater technical transparency and sovereignty, for example through:
- open models
- documented components
- self-hostable parts
- traceable architecture
- export / switching options

✅ = well covered ⚠️ = partial / indirect ❓ = not available / unclear
On-prem / local hosting ⚠️
Private cloud / data center
EU SaaS / Managed
Hybrid ⚠️
DPA / AVV
No training on customer data
Open source / transparency path ⚠️

Overall assessment of hosting & data:
The OpenAI API is a managed cloud API service for text, reasoning, code, image, audio, voice, embeddings, moderation, tools, agents, and multimodal applications. Traditional on-premises hosting of the closed OpenAI models is not publicly documented as a standard option. Positive aspects include the broad API coverage, the Responses API, tool calling, structured outputs, batch, Flex/Priority Processing, Data Residency, Zero Data Retention, Enterprise Key Management, and the clear default policy of “no training on API data without opt-in.” Remaining concerns are third-country transfers, region-dependent feature limitations, possible data persistence in certain API functions, and the need to configure ZDR/Data Residency correctly, either contractually or on a project-specific basis.

Conclusion:
OpenAI is very strong for production-grade, scalable AI applications and enterprise use cases; for strictly regulated data, usage should be safeguarded with a DPA, ZDR/Modified Abuse Monitoring, Data Residency, EKM, and internal data classifications.
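
The safeguards named in the conclusion can be mirrored as a pre-flight check in application code, so that strictly regulated data is only sent when all of them are recorded as in place. This is a minimal sketch under stated assumptions: the class, field, and label names are hypothetical, and the flags must be set by whoever administers the contract and account, because the code itself cannot verify them.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Safeguards:
    """Flags maintained by the organization; names are hypothetical."""
    dpa_signed: bool
    zdr_or_mam_enabled: bool  # Zero Data Retention or Modified Abuse Monitoring
    data_residency_configured: bool
    ekm_enabled: bool


def missing_safeguards(classification: str, safeguards: Safeguards) -> list[str]:
    """List the safeguards still missing for the given data classification.

    Assumption: only data labeled "strictly_regulated" requires all four
    controls; other labels pass without further checks in this sketch.
    """
    if classification != "strictly_regulated":
        return []
    checks = {
        "DPA": safeguards.dpa_signed,
        "ZDR/Modified Abuse Monitoring": safeguards.zdr_or_mam_enabled,
        "Data Residency": safeguards.data_residency_configured,
        "EKM": safeguards.ekm_enabled,
    }
    return [name for name, in_place in checks.items() if not in_place]


current = Safeguards(dpa_signed=True, zdr_or_mam_enabled=False,
                     data_residency_configured=True, ekm_enabled=False)
gaps = missing_safeguards("strictly_regulated", current)
print(gaps)
```

An application could refuse to send a request whenever the returned list is not empty and log the missing items for the compliance team.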

Data controls in the OpenAI platform

Strengths & weaknesses at a glance

Strengths
- Very broad model coverage from affordable to frontier.
- Strong suitability for coding, agents, tool calling, structured outputs, and long contexts.
- According to OpenAI, API/business data is not used for training on inputs/outputs by default.
- Data residency, Zero Data Retention, and DPA are

Weaknesses
- The portfolio is complex; model selection, pricing tiers, context limits, and tool costs require explanation.
- The strongest models are significantly more expensive than Mini/Nano variants.
- Privacy and data residency options are not uniformly identical for every case, but are in some instances tied to organization type, endpoint, or enablement.
- Older model families that remain available increase operational complexity in selection and lifecycle management.

Data last updated: 16 April 2026

Reviews

There are no confirmed reviews for this tool yet.