Documentation
Registration & Activation
After registration, top up your balance to the activation threshold. Once reached, your account activates automatically. The required amount may increase over time — activate early.
Minimum activation balance: $10 (will grow, don't be late)
LLM
Video tutorial
Watch the full walkthrough on YouTube: youtu.be/-aCTYDOvlj0
Pricing
You pay only for GPU usage at market rate — same or cheaper than renting a GPU directly. When multiple users share a GPU simultaneously, the cost is split among them. Invite friends to lower your costs further.
Models
The model list is curated intentionally. Three reasons:
- Redundant models are excluded. 4B and 9B cost nearly the same, but 9B is significantly better — no reason to offer both.
- Some models are impractical at scale. Kimi-K2.6 requires 8 top-tier GPUs simultaneously — reliably satisfying that demand is near-impossible.
- Each model requires individual hardware and software tuning. Adding a model takes real work.
Available now:
- huihui-ai/Huihui-Qwen3.5-9B-Claude-4.6-Opus-abliterated — Most affordable yet capable model. Great for saving costs and running autonomous agents.
- huihui-ai/Huihui-Qwythos-9B-Claude-Mythos-5-1M-abliterated — Upgraded 9B — newest Claude distillation (Mythos-5), up to 1M-token context, image input, abliterated. Affordable and capable.
- huihui-ai/Huihui-Qwen3.6-35B-A3B-Claude-4.7-Opus-abliterated — Middle ground — smart and capable without breaking the bank.
- huihui-ai/Huihui-Qwen3-Coder-Next-abliterated — Most powerful coding-specialized model. Use for programming and code generation.
New models are added over time. Every model can be found on Hugging Face by the same name.
Agentic API
Beyond the model itself, the API includes a CVE and exploit database updated every 6 hours, and built-in knowledge of connected Kali Linux VPS environments. Additional features can be requested via the support ticket system.
Workflow
I personally use Imbutus with PI (pi.dev) — but you can connect any supported client and use it however works best for you.
The web UI works, but it is not the intended primary interface. When a request arrives and the GPU is offline, your client shows real-time loading progress.
Dedicated VM — give your machine a name when provisioning (e.g. kalinux01). The AI model knows it by that name and gets direct shell access to run nmap, metasploit, sqlmap and any other tool on it. Just say the name in your prompt — the model connects and operates it. Billed per day — terminate anytime.
Supported agents
OpenAI and Anthropic are the two standard API protocols supported by almost every AI tool, IDE extension, and agent framework — so anything that works with ChatGPT or Claude connects here out of the box. Pick your client below for a setup guide.
OpenAI API · /v1/chat/completionsAnthropic API · /v1/messages- PiOpen-source AI coding agent for the terminal
- OpenClawOpen-source autonomous AI agent (clawbot)
- Hermes AgentNousResearch — personal AI agent that grows with you
- ZedHigh-performance code editor with built-in AI
- Claude CodeAnthropic — AI coding agent for the terminal
- opencodeOpen-source AI coding agent — TUI, desktop & IDE
- CursorAI-first code editor
- ClineAutonomous AI coding agent for VS Code
- AiderAI pair programming in the terminal
- JanOpen-source offline-first AI desktop client
- Chatbox AI (Desktop)Cross-platform AI chat app (Mac / Windows / Linux)
- Chatbox AI (Mobile)AI chat app for iOS & Android with voice input
Any agent or tool that supports the OpenAI or Anthropic API works here too — not just the ones listed above.
Media
Media generation runs through ComfyUI — a visual workflow editor — on a dedicated GPU. There are three independent bundles: Video, Image, and Voice. Each runs on its own GPU and you can run them at the same time; you are billed per second while a GPU is active. Each bundle ships ready-made example workflows in the ComfyUI Templates panel.
Pricing
Media generation (video, voice, image) works differently: each session gets a dedicated GPU that handles only one request at a time. Because the GPU is not shared between users, its cost is not split — you pay for the full GPU while it is active.
Models
- Sulphur 2 — Generates video clips from a text prompt, image, audio, or video with LTX Director 2.0. An uncensored version of LTX 2.3.
- SCAIL-2 — Animates or replaces a character in a video — give it a reference image plus a driving motion video and it transfers the motion (people auto-masked, no manual rigging). Apache-2.0, built on Wan2.1 14B.
- Qwen-Image-2512 — Generates images from text with high prompt fidelity and strong text-in-image rendering.
- Qwen-Image-Edit-2511 — Edits existing images by prompt — background swaps, object add/remove, restyling (up to 3 input images).
- FLUX.2 dev — Flagship ~32B FLUX.2 — top-quality text-to-image and multi-reference image editing in one model. Highest detail and prompt fidelity (Turbo LoRA for usable speed).
- FLUX.2 klein 9B (True V3, uncensored) — Fast 9B FLUX.2 klein, uncensored — the "True V3" aesthetic fine-tune with an abliterated text encoder. Text-to-image and multi-reference editing, light enough for a single 24GB GPU.
- Ideogram 4 — Text-to-image with strong typography — great for legible text & design. Visually place text and elements at exact positions on a canvas (regional layout control).
- Boogu-Image Turbo — Fast 4-step text-to-image with strong photorealism and bilingual (English/Chinese) text rendering.
- Boogu-Image Edit — Instruction-based image editing — describe the change in text to insert, replace, or restyle objects in an image.
- Krea-2 Turbo — Fast, photorealistic text-to-image at up to 2K resolution, with 9 selectable style LoRAs for different looks.
- CosyVoice 3 — Text-to-speech and voice cloning / conversion, plus voice-to-SRT and SRT-to-voice. A reference voice sample defines the output voice.
Workflows & nodes
Each bundle opens in ComfyUI with ready-made example workflows (Workflows panel); generated results show up in the Assets tab. A workflow is a graph of nodes — you only need to touch the ones that ask for your input. Nodes are color-coded:
- Required — you must provide this before running — upload a file or type a prompt.
- Crucial — important to check or commonly adjusted — aspect ratio, mode, or key settings.
By bundle
- Voice — three workflows — audio→SRT (transcribe audio into timestamped subtitles), SRT→audio (paste a translated SRT plus a short reference voice to get a dubbed track that keeps the original timing), and change-voice (generate speech from text, or re-voice audio of any length to a target timbre — the Mode toggle picks which).
- Image — two workflows — text-to-image (type a prompt) and image-edit (upload an image and describe the change; an optional second image can be supplied as a reference).
- Video — text/image-to-video — type a positive prompt; optionally supply an image for image-to-video, toggled by the bypass_i2v control.
Partnership Program
Go to the Referrals page to get your promo code and share it. Anyone who uses it gets 10% off. You earn 15% of everything they spend.
Philosophy
This is the project philosophy. First of all, I built these tools for my own casual working, and shared them with everyone. I'll also work on customer demands — if something can be created in theory, then I can implement it. Just ask in the support ticket system. If several users request a single feature or tool, I'll create it. You can think of it as an AI tools boutique run by your friend.