Best VPS for AnythingLLM (2026): Workspaces That Do Not Die
AnythingLLM occupies a useful middle ground: more than a chat UI, less than a full LangChain setup. The workspace concept and built-in document ingestion make it a sensible default for teams that want a private ChatGPT with their own knowledge base. The hosting recommendations online are wildly inconsistent because most reviewers test it with empty workspaces.
I ran AnythingLLM in production for a client project with 30 workspaces and a 200K-document corpus. Here is what the actual workload needs.
What AnythingLLM Really Costs in RAM
The application is a Node.js stack plus the embedded LanceDB vector store. Idle memory is around 400 MB.
The memory cost driver is workspace activity:
- Inactive workspace: 50 MB resident
- Active workspace with small document set (under 100 docs): 200 to 400 MB
- Active workspace with large corpus (10K+ docs in LanceDB): 1 to 3 GB
- Concurrent ingestion jobs: 500 MB to 2 GB depending on document size
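The ranges above can be turned into a rough sizing estimate. The sketch below is an illustration, not a measurement tool: the `Workload` class and `estimate_ram_mb` function are hypothetical names, and it assumes the costs add linearly, that the ingestion figure applies per job, and that 1 GB equals 1024 MB.

```python
from dataclasses import dataclass

# (low, high) figures in MB, copied from the list above.
IDLE_MB = 400
INACTIVE_MB = (50, 50)
ACTIVE_SMALL_MB = (200, 400)     # under 100 docs
ACTIVE_LARGE_MB = (1024, 3072)   # 10K+ docs in LanceDB
INGEST_JOB_MB = (500, 2048)      # assumed to apply per concurrent job


@dataclass
class Workload:
    inactive: int = 0
    active_small: int = 0
    active_large: int = 0
    ingestion_jobs: int = 0


def estimate_ram_mb(w: Workload) -> tuple[int, int]:
    """Return a (low, high) resident-memory estimate in MB.

    Assumes the per-workspace costs simply add up, which is a
    simplification; real usage depends on caching and query patterns.
    """
    parts = [
        (w.inactive, INACTIVE_MB),
        (w.active_small, ACTIVE_SMALL_MB),
        (w.active_large, ACTIVE_LARGE_MB),
        (w.ingestion_jobs, INGEST_JOB_MB),
    ]
    low = IDLE_MB + sum(count * rng[0] for count, rng in parts)
    high = IDLE_MB + sum(count * rng[1] for count, rng in parts)
    return low, high


if __name__ == "__main__":
    example = Workload(inactive=25, active_small=4, active_large=1, ingestion_jobs=1)
    low, high = estimate_ram_mb(example)
    print(f"Estimated RAM: {low} to {high} MB")
```

With 25 inactive workspaces, 4 small active ones, 1 large active one, and a single ingestion job, the estimate lands between roughly 4 and 8 GB, which is why the high end of the range matters more than the idle figure when picking a plan.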
The disk footprint surprises people too. The LanceDB index can hit 2 to 5 GB per 100K documents depending on embedding dimension and document length.
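As a back-of-the-envelope check, the disk figure can be scaled to a given corpus size. This is a minimal sketch that assumes index size grows linearly with document count; `lancedb_index_gb` is a hypothetical helper, and actual size will vary with embedding dimension and chunking.

```python
# 2 to 5 GB per 100K documents, as quoted above.
GB_PER_100K_DOCS = (2.0, 5.0)


def lancedb_index_gb(documents: int) -> tuple[float, float]:
    """Return a (low, high) index size estimate in GB.

    Linear scaling is an assumption made for illustration only.
    """
    scale = documents / 100_000
    return GB_PER_100K_DOCS[0] * scale, GB_PER_100K_DOCS[1] * scale


if __name__ == "__main__":
    low, high = lancedb_index_gb(200_000)
    print(f"200K documents: {low:.0f} to {high:.0f} GB of index")
```

Leave room on top of this for the original uploaded files, logs, and snapshots, which the estimate does not cover.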
VPS Comparison for AnythingLLM
| Provider | Plan | vCPU | RAM | Disk | Monthly | Best fit |
|---|---|---|---|---|---|---|
| Hetzner Cloud | CCX13 | 2 | 8 GB | 80 GB NVMe | 14.86 EUR | Small team, modest corpus |
| Contabo VPS | VPS M | 6 | 16 GB | 200 GB NVMe | 8.49 EUR | Budget production with large corpus |
| DigitalOcean | Premium AMD 8 GB | 4 | 8 GB | 160 GB NVMe | 56 USD | US team, ops simplicity |
| Hetzner Cloud | CCX23 | 4 | 16 GB | 160 GB NVMe | 29.74 EUR | AnythingLLM + Ollama co-hosted |
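Since RAM is the main constraint, one way to read the table is monthly price per GB of RAM. The sketch below just divides the table's own numbers; `price_per_gb` is a hypothetical helper, and prices stay in their listed currency because no exchange rate is assumed.

```python
# (plan, RAM in GB, monthly price, currency) copied from the table above.
PLANS = [
    ("CCX13", 8, 14.86, "EUR"),
    ("VPS M", 16, 8.49, "EUR"),
    ("Premium AMD 8 GB", 8, 56.00, "USD"),
    ("CCX23", 16, 29.74, "EUR"),
]


def price_per_gb(ram_gb: int, monthly: float) -> float:
    """Monthly price per GB of RAM, rounded to two decimals."""
    return round(monthly / ram_gb, 2)


if __name__ == "__main__":
    for plan, ram_gb, monthly, currency in PLANS:
        print(f"{plan}: {price_per_gb(ram_gb, monthly)} {currency} per GB of RAM")
```

This ignores the dedicated versus shared CPU difference, so it is only one input to the decision.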
Hetzner Cloud CCX13: For small teams
For a 5 to 10 user team with workspaces under 50K documents total, the CCX13 is enough. 8 GB RAM holds the app, the LanceDB index, and concurrent query activity comfortably. Dedicated CPU matters because document ingestion is CPU-heavy, and on shared-CPU plans that contention shows up as slow ingestion.
Pros:
- Dedicated CPU keeps document ingestion predictable
- NVMe handles LanceDB index writes well
- 8 GB RAM fits a few active workspaces without trouble
The trade-off: it is not enough for serious document scale. You will need to upsize once you pass 50K documents.
Get Hetzner: Hetzner Cloud.
Contabo VPS M: For document-heavy production
If your team handles large document corpuses (100K+ docs across workspaces), the Contabo VPS M’s 16 GB RAM is the cheapest path. At 8.49 EUR a month, no other plan in the table comes close on RAM per euro.
The shared CPU shows up as occasional ingestion slowness when multiple users upload simultaneously. For batch ingestion (overnight imports), this is fine. For real-time multi-user document upload, the variance becomes annoying.
Pros:
- Best raw spec per euro for AnythingLLM
- 200 GB NVMe holds years of vector index growth
- 16 GB RAM fits multiple active workspaces with large corpuses
Get Contabo: Contabo VPS.
DigitalOcean Premium AMD 8 GB: For US team ops
56 USD a month is steep, but the platform polish matters when your team relies on AnythingLLM daily. Managed Postgres for the application metadata removes one operational concern. Snapshot-based rollback after a bad document import is genuinely useful.
Honest take: 8 GB RAM is the floor here, and you may need 16 GB for serious workspace use, doubling the cost. At that point compare against a self-hosted Hetzner CCX23.
Get DigitalOcean: DigitalOcean.
Hetzner CCX23: For AnythingLLM + Ollama
This is the right tier when you want AnythingLLM, local embeddings, and local inference on one machine. The 16 GB of RAM accommodates the embedding model (1 to 2 GB), the inference model (8 GB), and AnythingLLM with its vector store (4 to 6 GB). That adds up to 13 to 16 GB, so headroom is reasonable at the low end of those ranges and thin at the high end.
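The memory budget is simple addition, and it is worth doing explicitly before committing to a plan. A minimal sketch using the figures from this section (the `memory_budget` helper is hypothetical, and OS overhead is not included):

```python
# (low, high) figures in GB, copied from the paragraph above.
COMPONENTS_GB = {
    "embedding model": (1, 2),
    "inference model": (8, 8),
    "AnythingLLM + vector store": (4, 6),
}


def memory_budget(total_gb: int) -> dict[str, tuple[int, int]]:
    """Sum the component ranges and report what is left of total_gb."""
    low = sum(lo for lo, _ in COMPONENTS_GB.values())
    high = sum(hi for _, hi in COMPONENTS_GB.values())
    return {
        "used": (low, high),
        # Worst case first: headroom when every component is at its high end.
        "headroom": (total_gb - high, total_gb - low),
    }


if __name__ == "__main__":
    budget = memory_budget(16)
    print(f"Used: {budget['used'][0]} to {budget['used'][1]} GB")
    print(f"Headroom: {budget['headroom'][0]} to {budget['headroom'][1]} GB")
```

On a 16 GB machine this leaves 0 to 3 GB, so a smaller or more aggressively quantized inference model is the easiest lever if the box starts swapping.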
Pick this for privacy-sensitive deployments where everything must stay on one machine. Cheaper than running multiple machines for the same setup.
What I Would Pick
For a small team using AnythingLLM as a private ChatGPT with light document use: Hetzner CCX13. For document-heavy production with a budget: Contabo VPS M. For fully self-hosted with local models: Hetzner CCX23. AnythingLLM is stable enough that hosting recommendations should hold through 2026.
The full VPS landscape sits at the SelfHostVPS comparison. AnythingLLM pairs naturally with Ollama; see that guide for backend choices.
Frequently asked questions
How much RAM does AnythingLLM actually need?
The Node.js process idles at around 400 MB. Each active workspace adds roughly 200 to 400 MB for a small document set and 1 to 3 GB for a large corpus, depending on document volume and vector store choice. For a small team with 5 to 10 workspaces and modest document corpuses, 4 GB RAM is fine. For larger setups (50+ workspaces, 100K+ documents), plan for 8 to 16 GB.
Does AnythingLLM need its own vector database?
It ships with LanceDB (embedded) by default, which works fine for personal use and small teams. Production deployments often swap to Qdrant or Weaviate for better concurrent query performance. The embedded LanceDB struggles past 500K documents or 20+ concurrent users. Plan to migrate if you grow past that scale.
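Those two thresholds can be written down as a simple check to run against your own usage numbers. This is only a sketch of the rule of thumb above; `should_migrate_from_lancedb` is a hypothetical helper, and the limits are this article's estimates rather than hard limits in LanceDB.

```python
# Rule-of-thumb limits quoted above for the embedded LanceDB store.
MAX_DOCUMENTS = 500_000
MAX_CONCURRENT_USERS = 20


def should_migrate_from_lancedb(documents: int, concurrent_users: int) -> bool:
    """True once either threshold is crossed (past 500K docs or 20+ users)."""
    return documents > MAX_DOCUMENTS or concurrent_users >= MAX_CONCURRENT_USERS


if __name__ == "__main__":
    print(should_migrate_from_lancedb(200_000, 8))    # small team, modest corpus
    print(should_migrate_from_lancedb(600_000, 8))    # corpus has outgrown it
```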
Can AnythingLLM share a VPS with Ollama for fully self-hosted RAG?
Yes, this is a common pattern. Plan for 16 GB RAM minimum: Ollama with a small embedding model (nomic-embed-text or similar) takes 1 to 2 GB, the inference model adds 8 GB, and AnythingLLM plus its vector store adds 4 to 6 GB. The Hetzner CCX23 fits this, though with little to spare at the upper end of those ranges.
How does AnythingLLM compare to Open WebUI for hosting?
AnythingLLM is heavier because it bundles document ingestion, vector storage, and workspace management. Open WebUI is just a chat UI that proxies to an inference backend. For pure chat use, Open WebUI is cheaper to host. For document-based Q&A (RAG), AnythingLLM removes the need to wire together separate components. For chat-only setups, Open WebUI usually costs about half as much to host.