Nebius is launching an AI platform called Token Factory. It is designed to help companies bring open-source and customized language models into production.
Token Factory combines several components of the AI production process, such as inference, fine-tuning, and access management, within a single environment. It supports dozens of open models, including DeepSeek, Llama, GPT-OSS from OpenAI, NVIDIA Nemotron, and Qwen. Companies can also run their own models on it. The service runs on Nebius’ existing AI Cloud infrastructure.
The introduction comes at a time when many organizations are moving from experimental AI projects to practical applications. This is increasing the need for open models that offer more freedom than commercial alternatives. However, the use of such models presents challenges, for example in terms of security, scalability, and cost control. Token Factory aims to partially address these issues by automating management and monitoring.
According to co-founder and chief business officer Roman Chernin, the platform was developed to take the burden off teams that want to scale their AI solutions without constant manual intervention. He argues that companies primarily need speed, reliability, and predictable costs, and that Token Factory responds to this by largely automating infrastructure management.
The underlying infrastructure, Nebius AI Cloud 3.0 (Aether), provides monitoring and security, and its performance has been tested against industry benchmarks such as MLPerf Inference.
Transparent costs per token
Token Factory focuses on optimizing models after the training phase. Users can turn open model weights into production-ready systems with transparent costs per token. Fine-tuning and distillation are built in, so models can be adapted to business data. According to Nebius, this can also reduce response times and costs by tens of percent.
The service lets users deploy models directly, without setting up separate infrastructure. In addition, Token Factory supports team management, project separation, single sign-on, and role management, which helps organizations comply with internal and regulatory requirements.
Security complies with standards and regulations such as SOC 2 Type II, HIPAA, and ISO 27001. Data centers are located in the European Union and the United States, with support for data residency requirements and zero-retention policies.
Token Factory replaces the previous Nebius AI Studio and is available immediately. Existing users will be migrated automatically. According to Nebius, the new environment supports more than sixty open models, covering text generation, code, and image processing.
Amsterdam-based Nebius is active in cloud infrastructure for AI and has research centers in Europe, North America, and Israel. The company is listed on the Nasdaq and develops both software and hardware for computing power, storage, and model management.