Documentation

Registration & Activation

After registration, top up your balance to the activation threshold. Once reached, your account activates automatically. The required amount may increase over time — activate early.

Minimum activation balance: $10 (will grow, don't be late)

LLM

Video tutorial

Watch the full walkthrough on YouTube: youtu.be/-aCTYDOvlj0

Pricing

You pay only for GPU usage at market rate — same or cheaper than renting a GPU directly. When multiple users share a GPU simultaneously, the cost is split among them. Invite friends to lower your costs further.

Models

The model list is curated intentionally. Three reasons:

Redundant models are excluded. 4B and 9B cost nearly the same, but 9B is significantly better — no reason to offer both.
Some models are impractical at scale. Kimi-K2.6 requires 8 top-tier GPUs simultaneously — reliably satisfying that demand is near-impossible.
Each model requires individual hardware and software tuning. Adding a model takes real work.

Available now:

huihui-ai/Huihui-Qwen3.5-9B-Claude-4.6-Opus-abliterated — Most affordable yet capable model. Great for saving costs and running autonomous agents.
64K ctxreasoning
huihui-ai/Huihui-Qwythos-9B-Claude-Mythos-5-1M-abliterated — Upgraded 9B — newest Claude distillation (Mythos-5), up to 1M-token context, image input, abliterated. Affordable and capable.
64K ctxreasoning
huihui-ai/Huihui-Qwen3.6-35B-A3B-Claude-4.7-Opus-abliterated — Middle ground — smart and capable without breaking the bank.
128K ctxreasoning
huihui-ai/Huihui-Qwen3-Coder-Next-abliterated — Most powerful coding-specialized model. Use for programming and code generation.
256K ctxno reasoning

New models are added over time. Every model can be found on Hugging Face by the same name.

Agentic API

Beyond the model itself, the API includes a CVE and exploit database updated every 6 hours, and built-in knowledge of connected Kali Linux VPS environments. Additional features can be requested via the support ticket system.

Workflow

I personally use Imbutus with PI (pi.dev) — but you can connect any supported client and use it however works best for you.

The web UI works, but it is not the intended primary interface. When a request arrives and the GPU is offline, your client shows real-time loading progress.

Dedicated VM — give your machine a name when provisioning (e.g. kalinux01). The AI model knows it by that name and gets direct shell access to run nmap, metasploit, sqlmap and any other tool on it. Just say the name in your prompt — the model connects and operates it. Billed per day — terminate anytime.

⚠Billing runs while a session is active. To stop it — use the Stop button on the main page (visible when a model is selected), or tell the model: "stop our session". If no one else is using the GPU at that moment, it will shut down and billing stops immediately.

Supported agents

OpenAI and Anthropic are the two standard API protocols supported by almost every AI tool, IDE extension, and agent framework — so anything that works with ChatGPT or Claude connects here out of the box. Pick your client below for a setup guide.

OpenAI API · /v1/chat/completionsAnthropic API · /v1/messages

Any agent or tool that supports the OpenAI or Anthropic API works here too — not just the ones listed above.

Media

Media generation runs through ComfyUI — a visual workflow editor — on a dedicated GPU. There are three independent bundles: Video, Image, and Voice. Each runs on its own GPU and you can run them at the same time; you are billed per second while a GPU is active. Each bundle ships ready-made example workflows in the ComfyUI Templates panel.

Pricing

Media generation (video, voice, image) works differently: each session gets a dedicated GPU that handles only one request at a time. Because the GPU is not shared between users, its cost is not split — you pay for the full GPU while it is active.

Models

Sulphur 2 — Generates video clips from a text prompt, image, audio, or video with LTX Director 2.0. An uncensored version of LTX 2.3.
video generation
SCAIL-2 — Animates or replaces a character in a video — give it a reference image plus a driving motion video and it transfers the motion (people auto-masked, no manual rigging). Apache-2.0, built on Wan2.1 14B.
video generation
Qwen-Image-2512 — Generates images from text with high prompt fidelity and strong text-in-image rendering.
image generation
Qwen-Image-Edit-2511 — Edits existing images by prompt — background swaps, object add/remove, restyling (up to 3 input images).
image generation
FLUX.2 dev — Flagship ~32B FLUX.2 — top-quality text-to-image and multi-reference image editing in one model. Highest detail and prompt fidelity (Turbo LoRA for usable speed).
image generation
FLUX.2 klein 9B (True V3, uncensored) — Fast 9B FLUX.2 klein, uncensored — the "True V3" aesthetic fine-tune with an abliterated text encoder. Text-to-image and multi-reference editing, light enough for a single 24GB GPU.
image generation
Ideogram 4 — Text-to-image with strong typography — great for legible text & design. Visually place text and elements at exact positions on a canvas (regional layout control).
image generation
Boogu-Image Turbo — Fast 4-step text-to-image with strong photorealism and bilingual (English/Chinese) text rendering.
image generation
Boogu-Image Edit — Instruction-based image editing — describe the change in text to insert, replace, or restyle objects in an image.
image generation
Krea-2 Turbo — Fast, photorealistic text-to-image at up to 2K resolution, with 9 selectable style LoRAs for different looks.
image generation
CosyVoice 3 — Text-to-speech and voice cloning / conversion, plus voice-to-SRT and SRT-to-voice. A reference voice sample defines the output voice.
voice

Workflows & nodes

Each bundle opens in ComfyUI with ready-made example workflows (Workflows panel); generated results show up in the Assets tab. A workflow is a graph of nodes — you only need to touch the ones that ask for your input. Nodes are color-coded:

Required — you must provide this before running — upload a file or type a prompt.
Crucial — important to check or commonly adjusted — aspect ratio, mode, or key settings.

By bundle

Voice — three workflows — audio→SRT (transcribe audio into timestamped subtitles), SRT→audio (paste a translated SRT plus a short reference voice to get a dubbed track that keeps the original timing), and change-voice (generate speech from text, or re-voice audio of any length to a target timbre — the Mode toggle picks which).
Image — two workflows — text-to-image (type a prompt) and image-edit (upload an image and describe the change; an optional second image can be supplied as a reference).
Video — text/image-to-video — type a positive prompt; optionally supply an image for image-to-video, toggled by the bypass_i2v control.

Partnership Program

Go to the Referrals page to get your promo code and share it. Anyone who uses it gets 10% off. You earn 15% of everything they spend.

Philosophy

This is the project philosophy. First of all, I built these tools for my own casual working, and shared them with everyone. I'll also work on customer demands — if something can be created in theory, then I can implement it. Just ask in the support ticket system. If several users request a single feature or tool, I'll create it. You can think of it as an AI tools boutique run by your friend.