Tag: manager
TurboQuant in LlamaMan - Squeezing More Context Out of the Same GPU

Exploring the LlamaMan implementation and deployment of TurboQuant after 2am.

LlamaMan 0.8.6 - What's New

LlamaMan is a self-hosted web UI for managing llama.cpp instances. Check out what's new in version 0.8.6.

LlamaMan - Somebody Had to Do It

Ollama was too slow, and apparently nobody had built a proper UI for managing llama.cpp instances in Docker. So I did. LlamaMan gives you full control over GPU layer offload, real-time VRAM monitoring, one-click model launching from presets, and an Ollama-compatible proxy. Having control over your own hardware shouldn't be a novel concept in 2026.