UnieAI

August 3, 2026 · Model Zoo

GLM-5.2

Coming to UnieAI — running in our own datacenter, no third-party APIs.

Share the launchX in f

→Join Discord→

Show the screenshot to an admin, get US$10 credit

Inside the enterprise

Back inside the company: data is scattered, AI can't read it

Structured data sits in databases, unstructured files in drives. An LLM can't consume either.

Data Hub + Parser

Both kinds of data, one semantic layer

Batch + CDC sync, Parser semantics and vector indexing — entities, relations and labels drafted by AI, confirmed by people. Every answer cites its source.

SQLERP · PostgreSQL

SQLCRM records

PDFcontract_final_v3.pdf

XLSQ-report.xlsx

DOCmeeting-notes-0712.docx

DRIVECloud drive /shared

LOGFab sensor logs

MAILSupport email archive

PPTSlide archive

APIHTTP API · Orders

ANALYTICS

BI / Reports / Q&A

KPI

12.8M

Tokens / day

↑ 12.6%

WORKFLOWS

Agents / Automation / Decisions

Triggers

Schedule

Event

Alert

Agent Orchestration

Agent Context

Actions

Validate

Approve

Execute

INTEGRATIONS

APIs / Apps / MCP

CRM

ERP

Finance App

MCP Server

External API

Customer

Order

Product

Contract

Agent Context

Named MetricsConfirmed SemanticsData MapLineageBusiness ObjectsColumn-level JoinsKnowledge Binding

ONTOLOGY / SEMANTIC LAYER

Data Hub × Ontology × Named Metrics

SQL Databases

APIs

S3 / Files

Documents

Images

Logs

UNIEAI DATA HUB

UNIEAI MODEL ZOO

UnieInfra · GPU

Faster, cheaper inference

UnieInfra is tuned for agent workloads: 4× throughput at low load, 2× lower latency under high concurrency, and autotuning that squeezes maximum throughput out of your existing hardware.

Explore UnieInfra

Agent Core · Harness

Open models get smarter

A purpose-built harness makes models stronger and more reliable. Agent Core 2 lifts MiniMax-M2 from 78.3% to 97.2% on AIME, and runs hundreds of agents in a single process.

Explore Agent Core

Data Hub · Parser

Data becomes trusted answers

Data Hub and UnieAI Parser bring structured and unstructured data onto the platform; AI and humans co-build the ontology. Every answer ships with citations: SQL table names and full file paths.

Explore Data Hub

Model Zoo · Studio

SOTA models, one-click deploy, one API

Model Zoo keeps the latest SOTA open and proprietary models on UnieInfra, OpenAI-compatible and billed per token, with API governance via Studio: keys, logs, billing and guardrails.

Browse the Model Zoo

Partners · Monetize compute

Don't just rent out compute. Turn it into global token revenue

Mount your GPU machines and model APIs on UnieAI. Sell tokens and lease GPUs to the world through our channels.

Partner with us

Total throughput

higher is better

Qwen3.5-122B-A10B (FP16) · Nvidia H200 × 2 · test by InferenceMAX.

UnieInfravLLM 0.21.0

2×

4.5×

4.7×

4.6×

3.25×

vLLM crash

1163264128256

concurrency

NVIDIAAMDQualcommIntel

One inference engine across four accelerator ecosystems

AIME 2025 — accuracy

higher is better

Baseline scores from the public leaderboard; the UnieAI bar is our internal result.

99.0%

97.2%

96.7%

93.4%

89.3%

89.0%

83.7%

78.3%

GPT-5.2 (xhigh)

MiniMax-M2 × UnieAI Agent Core 2

GPT-5.2 (medium)

gpt-oss-120b (high)

gpt-oss-20B (high)

Nova 2.0 Pro

Claude 4.5 Haiku

MiniMax-M2 (baseline)

UnieAI BI Agent

Data Hub connected

How much did Q2 2026 revenue grow quarter over quarter?

querying erp.finance…

+18.4%

(NT$231M → NT$274M)

SQLerp.finance_revenue_quarterly

DOC/reports/2026-Q2/financial-summary.pdf

Model API Gateway

SOTA models · always current

MiniMax-M2

DeepSeek-V3

Qwen3-235B

Llama 4

GLM-4.5

Kimi K2

POST https://api.unieai.com/v1/chat/completions

{ "model": "MiniMax-M2" }

The endpoint never changes — switching models is one string. OpenAI-compatible, token-metered.

GPU in·tokens out

UnieInfra · GPU

Faster, cheaper inference

UnieInfra is tuned for agent workloads: 4× throughput at low load, 2× lower latency under high concurrency, and autotuning that squeezes maximum throughput out of your existing hardware.

Total throughput

higher is better

Qwen3.5-122B-A10B (FP16) · Nvidia H200 × 2 · test by InferenceMAX.

UnieInfravLLM 0.21.0

2×

4.5×

4.7×

4.6×

3.25×

vLLM crash

1163264128256

concurrency

NVIDIAAMDQualcommIntel

One inference engine across four accelerator ecosystems

Agent Core · Harness

Open models get smarter

A purpose-built harness makes models stronger and more reliable. Agent Core 2 lifts MiniMax-M2 from 78.3% to 97.2% on AIME, and runs hundreds of agents in a single process.

AIME 2025 — accuracy

higher is better

Baseline scores from the public leaderboard; the UnieAI bar is our internal result.

99.0%

97.2%

96.7%

93.4%

89.3%

89.0%

83.7%

78.3%

GPT-5.2 (xhigh)

MiniMax-M2 × UnieAI Agent Core 2

GPT-5.2 (medium)

gpt-oss-120b (high)

gpt-oss-20B (high)

Nova 2.0 Pro

Claude 4.5 Haiku

MiniMax-M2 (baseline)

Data Hub · Parser

Data becomes trusted answers

Data Hub and UnieAI Parser bring structured and unstructured data onto the platform; AI and humans co-build the ontology. Every answer ships with citations: SQL table names and full file paths.

UnieAI BI Agent

Data Hub connected

How much did Q2 2026 revenue grow quarter over quarter?

querying erp.finance…

+18.4%

(NT$231M → NT$274M)

SQLerp.finance_revenue_quarterly

DOC/reports/2026-Q2/financial-summary.pdf

Model Zoo · Studio

SOTA models, one-click deploy, one API

Model Zoo keeps the latest SOTA open and proprietary models on UnieInfra, OpenAI-compatible and billed per token, with API governance via Studio: keys, logs, billing and guardrails.

Model API Gateway

SOTA models · always current

MiniMax-M2

DeepSeek-V3

Qwen3-235B

Llama 4

GLM-4.5

Kimi K2

POST https://api.unieai.com/v1/chat/completions

{ "model": "MiniMax-M2" }

The endpoint never changes — switching models is one string. OpenAI-compatible, token-metered.

Partners · Monetize compute

Don't just rent out compute. Turn it into global token revenue

Mount your GPU machines and model APIs on UnieAI. Sell tokens and lease GPUs to the world through our channels.

GPU in·tokens out

Introducing UnieAI Agent Core

Tools · MCP

Session

Harness

Sandbox

Orchestration

One harness, orchestrated.

Agent Core decides when the model reasons, which tools it calls via MCP, what it remembers, and runs it safely in a sandbox.

AIME 2025 — accuracy

higher is better

Baseline scores from the public leaderboard; the UnieAI bar is our internal result.

99.0%

97.2%

96.7%

93.4%

89.3%

89.0%

83.7%

78.3%

GPT-5.2 (xhigh)

MiniMax-M2 × UnieAI Agent Core 2

GPT-5.2 (medium)

gpt-oss-120b (high)

gpt-oss-20B (high)

Nova 2.0 Pro

Claude 4.5 Haiku

MiniMax-M2 (baseline)

Stronger models, stable agents.

A purpose-built harness makes models stronger and more reliable. Agent Core 2 lifts MiniMax-M2 on AIME from 78.3% to 97.2%.

Agent Core optimizes the CPU, hundreds of agents per process. Read more UnieInfra optimizes the GPU, 2× throughput at low load. Read more

UnieAI Chat

Your AI coworker

Agent mode runs decks, financial analysis, reports and scheduled tasks — for developers and everyday users alike.

Meet UnieAI Chat

UnieAI Code

The coding agent for open models

A coding agent for open and private models, bringing the harness into your development workflow.

Meet UnieAI Code

The latest SOTA open and proprietary models, continuously updated. One API.

Browse the Model Zoo

Llama 3.3Qwen3DeepSeek-V3MiniMax-M2MistralGemma 3GLM-4.6Kimi K2Phi-4gpt-ossCommand R+YiLlama 3.3Qwen3DeepSeek-V3MiniMax-M2MistralGemma 3GLM-4.6Kimi K2Phi-4gpt-ossCommand R+Yi

Runs across every major accelerator

Mount your compute. Sell to the world.

We work with distributors, FDE teams and application companies, and we own the hard part: Token, harness and hardware. Our partners stay focused on building a moat for their customers, while inference, the agent runtime and deployment are handled by us.

UnieAI Model Zoo UnieAI Agent Core Deployment

Apps built on Agent Core, in your hands today.

Our agent harness already ships as products. UnieAI Code is a coding agent for open and private models. UnieAI Chat brings an agent mode for slides, financial analysis, report generation and scheduled tasks, for developers and everyday users alike.

UnieAI Code UnieAI Chat