Website | GitHub | HuggingFace | Guides
Why should you use gpt‑oss‑120b:
Excellent performance. gpt‑oss‑120b matches or surpasses o4-mini on core benchmarks like AIME, MMLU, TauBench, and HealthBench, even outperforming proprietary models like OpenAI o1 and GPT‑4o.
Efficient and flexible deployment. Despite its size, gpt‑oss‑120b can run on a single 80GB GPU (e.g., NVIDIA H100 or AMD MI300X). It's optimized for local, on-device, or cloud inference via partners like vLLM, llama.cpp and Ollama.
Adjustable reasoning levels. It supports low, medium, and high reasoning effort, letting you trade response speed against reasoning depth (see the sketch after this list).
Permissive license. gpt‑oss‑120b is released under the Apache 2.0 license, which means you can freely use it for commercial applications. This makes it a good choice for teams building custom LLM inference pipelines.
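As an illustration of the deployment and reasoning-level points above, here is a minimal sketch of querying gpt‑oss‑120b through an OpenAI-compatible endpoint, with the reasoning level selected in the system prompt. It assumes the model is already being served locally (for example via vLLM or Ollama); the base URL, port, API-key handling, and model identifier all depend on your serving setup.

```python
# Minimal sketch: call a locally served gpt-oss-120b with an explicit reasoning level.
# Assumes an OpenAI-compatible server (e.g., vLLM or Ollama) listening on localhost:8000;
# adjust base_url and model name to match your deployment.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",  # assumed local endpoint
    api_key="not-needed-locally",         # local servers typically ignore the key
)

response = client.chat.completions.create(
    model="openai/gpt-oss-120b",
    messages=[
        # Reasoning effort (low / medium / high) is set via the system prompt.
        {"role": "system", "content": "Reasoning: high"},
        {"role": "user", "content": "Summarize the trade-offs of running a 120B model on a single 80GB GPU."},
    ],
)

print(response.choices[0].message.content)
```

Dropping the reasoning level to "low" in the system prompt is the quickest way to cut latency for simple requests, while "high" is better suited to the complex, multi-step tasks the model targets.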
gpt-oss-120b: Our most advanced open model, designed for complex tasks, deeper context understanding, and enhanced reasoning capabilities.
gpt-oss-20b: A versatile, efficient open-source language model ideal for a wide range of applications, from conversational AI to creative content generation.
Last modified 22 March 2026