Blog - NickScripts

TurboQuant in LlamaMan - Squeezing More Context Out of the Same GPU

Apr 2, 2026

Exploring the LlamaMan implementation and deployment of TurboQuant after 2am.

admin-panel llama.cpp manager proxy rest-api ai llm turboquant

LlamaMan 0.8.6 - What's New

Mar 31, 2026

LlamaMan is a self-hosted web UI for managing llama.cpp instances. Check out what's new in version 0.8.6.

admin-panel llama.cpp manager proxy rest-api ai llm

LlamaMan - Somebody Had to Do It

Mar 25, 2026

Ollama was too slow, and apparently nobody had built a proper UI for managing llama.cpp instances in Docker. So I did. LlamaMan gives you full control over GPU layer offload, real-time VRAM monitoring, one-click model launching from a preset, and an Ollama-compatible proxy. Having control over your own hardware shouldn't be a novel concept in 2026.

admin-panel llama.cpp manager proxy rest-api
