Tunneling Local LLMs to the Cloud with Tailscale
Why I Did This
I’ve been running lightweight LLMs locally using Ollama — it's fast, simple, and a great fit for small workflows or testing. But local-only meant they were cut off from the rest of my stack, especially my VPS-hosted services. I didn’t want to expose ports or mess with reverse proxies just to bridge that gap.
So I used Tailscale to create a private mesh network between my local machine and my VPS.
Now, apps on my VPS (including my self-hosted n8n instance) can make HTTP calls directly to the LLMs running on my dev machine, with Tailscale's encrypted tunnel keeping the traffic private. Handy for experimenting, automating, or even just skipping cloud costs when I don't need GPT-4-level firepower.
How It Works
- Ollama runs various LLMs on my local dev machine.
- Tailscale connects that machine to my VPS via a secure private network.
- n8n, hosted on my VPS, uses HTTP Request nodes to talk to the local LLMs.
- Other web apps can hit the same endpoints, as long as they're inside the Tailscale network.
I didn't need to do any SSL configuration or domain setup — Tailscale handles secure transport with minimal fuss.
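For reference, here's roughly what one of those calls boils down to. This is a minimal Python sketch of the request an n8n HTTP Request node ends up sending, not the workflow itself; the hostname dev-machine and the llama3 model are placeholders, and it assumes Ollama has been told to listen on an address the tailnet can reach (by default it binds to localhost only).

```python
import json
import urllib.request

# Placeholder: my dev machine's Tailscale MagicDNS name (a 100.x.y.z tailnet IP works too).
OLLAMA_URL = "http://dev-machine:11434/api/generate"  # plain HTTP; the WireGuard tunnel does the encrypting

payload = {
    "model": "llama3",    # placeholder for whichever model Ollama has pulled locally
    "prompt": "Summarize this ticket in one sentence.",
    "stream": False,      # ask for a single JSON response instead of a token stream
}

req = urllib.request.Request(
    OLLAMA_URL,
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)

with urllib.request.urlopen(req, timeout=120) as resp:
    print(json.load(resp)["response"])
```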
Tailscale made this whole setup super easy! I was connected and using my local LLMs in minutes. Tailscale offers a very generous free tier, which is what I'm using here.
Stack Rundown
- Ollama – local LLM serving
- Tailscale – mesh VPN for private networking
- n8n – workflow automation on the VPS
- HTTP Request nodes – n8n’s built-in HTTP request functionality
- Ubuntu VPS – hosting the cloud side of things
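One thing worth doing before pointing n8n at anything is confirming the VPS can actually see Ollama across the tailnet. Here's a small sketch of that sanity check, again with a placeholder dev-machine hostname, using Ollama's model-listing endpoint:

```python
import json
import urllib.request

# Placeholder tailnet hostname for the dev machine running Ollama.
TAGS_URL = "http://dev-machine:11434/api/tags"

# /api/tags lists the models Ollama has available locally.
with urllib.request.urlopen(TAGS_URL, timeout=10) as resp:
    models = json.load(resp)["models"]

for m in models:
    print(m["name"])  # e.g. llama3:latest
```

If this times out, the likely culprit is Ollama listening only on 127.0.0.1 (its default) rather than on an address Tailscale can route to.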
What I'd Improve or Tweak
- Add a local reverse proxy (like Caddy) to standardize routes and allow friendly naming (e.g., http://llm.local.tailnet).
- Maybe use Tailscale's Funnel feature to expose selected endpoints to the internet if I ever need that.
- Look into monitoring usage and response times for better observability — right now it’s all vibes and cURL.
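On that last point, even something as basic as the sketch below would beat vibes: time a few requests against the same endpoint the workflows use and log the numbers. Hostname and model are placeholders, as before.

```python
import json
import time
import urllib.request

# Placeholder tailnet hostname; same endpoint the n8n workflows call.
OLLAMA_URL = "http://dev-machine:11434/api/generate"

def timed_generate(prompt: str, model: str = "llama3") -> float:
    """Send one non-streaming request and return the wall-clock latency in seconds."""
    payload = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode("utf-8")
    req = urllib.request.Request(OLLAMA_URL, data=payload,
                                 headers={"Content-Type": "application/json"})
    start = time.monotonic()
    with urllib.request.urlopen(req, timeout=300) as resp:
        resp.read()  # drain the response; we only care about timing here
    return time.monotonic() - start

if __name__ == "__main__":
    latencies = [timed_generate("Reply with the word 'ok'.") for _ in range(3)]
    print(f"avg {sum(latencies) / len(latencies):.2f}s | worst {max(latencies):.2f}s")
```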