Current API Models, Clearly Tiered
Free API keys can call auto and hrLLM at 40 requests per hour. PRO unlocks direct access to the rest of the new lineup, while legacy Kiwi models remain visible as deprecated compatibility entries.
EOL: 09.03.2026. They stay visible for lineage and transition planning, but the current public lineup centers on auto, hrLLM, and the new direct PRO models.Recommended Free Croatian Model
hrLLM is our Croatian-first model. It writes and answers only in grammatically correct Croatian and is being actively tuned because Croatian is still poorly covered by most general-purpose models.
Built specifically for Croatian instead of treating it as a low-priority multilingual edge case.
Keeps tone, inflection, and sentence structure cleaner than general-purpose models on Croatian prompts.
Recommended free model for Croatian-first API and dashboard workflows.
Public API access
Direct model ID: hrllm
Free API keys: 40 requests/hour
Recommended for Croatian-first products, assistants, and writing workflows.
Open hrLLM pageCurrent API Lineup
These are the current public-facing models in the lineup. hrLLM is the recommended free Croatian model, while the other direct models are available as PRO.
api.llm.kiwi
Croatian-first model for writing and answering in grammatically correct Croatian.
Best used for: Croatian customer support, formal business writing, public-sector communication, and education content.
Based on: hrllm
api.llm.kiwi
Large-scale open-source Pro model for comprehensive analysis and extensive reasoning tasks.
Best used for: Large-scale analysis, comprehensive reasoning, extensive technical documentation, and complex multi-domain tasks.
Based on: GPT-OSS-120B
api.llm.kiwi
Advanced Pro model for complex reasoning, technical analysis, and sophisticated problem solving.
Best used for: Complex technical tasks, advanced reasoning, multi-step analysis, and sophisticated problem solving.
Based on: kimik2
api.llm.kiwi
Compact Pro model for quick reasoning, drafting, and lightweight production tasks.
Best used for: Fast general chat, structured drafting, lightweight copilots, and low-latency automations.
Based on: Qwen3-1.4B
Deprecated Models
Deprecated models remain listed for continuity, migration, and provider lineage. They are intentionally greyed out and clearly marked with their EOL date.
Access and Usage Limits
These are the model-access highlights users need most often. The complete reference stays in the docs.
Free: 40 requests/hour for auto and hrLLM
PRO unlocks direct access to the new advanced models with higher sustained throughput.
192 requests/minute per IP
Cache-friendly endpoint for model discovery and compatibility metadata.
36 requests/minute per signed-in user + IP
hrLLM additionally uses a tighter free-tier hourly model limit.
24 requests/minute per signed-in user
hrLLM additionally uses a tighter free-tier hourly model limit.