Self-Hosted MCP Servers on a Tailscale VPS

Most MCP guides assume a server running locally over stdio. This is the remote version: arxiv, Scrapling, and SearXNG hosted on a hardened Tailscale VPS, reachable from any client machine, with the transport, binding, and verification choices that local guides never have to make.

9 steps · 24 min read · 2026-05-15

AI Tools Recommended

  - Claude App: Architecture & the claude.ai connector path
  - Claude Code CLI: MCP host and client wiring
  - Codex CLI: Config and prose review
  - Gemini: Cross-checking and research

Almost every Model Context Protocol guide you’ll find assumes the server runs on the same machine as the client, spoken to over stdio, spawned as a child process and forgotten. That works right up until you want the same tools from your laptop, your desktop, and a cloud session at the same time, or you want a scraper with a real browser behind it that isn’t fighting your workstation for memory. The moment the server lives somewhere else, you inherit a set of decisions the local guides never have to make: which transport, which interface to bind, how the client reaches it, and how you prove it actually works before you trust it.

This guide stands up three research servers on a remote box and wires them into Claude Code: arxiv for paper search and retrieval, Scrapling for resilient web scraping, and SearXNG for metasearch. They’re the worked examples for a reason. Plenty of research tooling already ships as a hosted connector you click once to add (Exa, Tavily, the Brave Search API); you would never self-host those. These three have no first-party hosted option, so if you want them, you host them. That makes them honest examples instead of contrived ones, and it makes the remote-hosting path the actual subject rather than an afterthought. The box itself is the hardened Tailscale-only Debian VPS from the VPS Security Foundations guide; this one assumes you’re starting from that end state.

Step 1

The Transport Problem

Start here because it determines everything downstream, including which of these servers can ever reach the Claude app versus only Claude Code.

MCP servers speak one of two transport families:

  - stdio: the server is a process that reads JSON-RPC on stdin and writes it on stdout. The client launches it and owns its lifecycle.
  - HTTP (Streamable HTTP, or the older SSE): the server is a long-running network service the client connects to by URL.

Local guides almost always use stdio because the client can just spawn the binary. The instant the server is remote, stdio doesn’t disappear as an option, but it has to be tunnelled: you run the stdio server on the VPS and let SSH carry the pipe. HTTP, by contrast, is already a network transport and needs no tunnel, just reachability and a deliberate bind address.
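To make the stdio shape concrete, here is a minimal Python sketch of what a client does with a stdio server: spawn the command, write one newline-delimited JSON-RPC frame to its stdin, and read one line back. The helper names (initialize_frame, run_stdio_once) and the stand-in server are illustrative assumptions, not part of any MCP SDK. The point is only that tunnelling over SSH changes the argv and nothing about the pipe handling.

```python
import json
import subprocess
import sys


def initialize_frame(request_id=1):
    """Build the JSON-RPC initialize request a client sends first."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "initialize",
        "params": {
            "protocolVersion": "2024-11-05",
            "capabilities": {},
            "clientInfo": {"name": "test", "version": "1.0"},
        },
    }


def run_stdio_once(argv, frame, timeout=30):
    """Spawn argv, send one frame on stdin, return the first JSON line from stdout."""
    proc = subprocess.run(
        argv,
        input=json.dumps(frame) + "\n",  # stdin is closed after this single line
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    for line in proc.stdout.splitlines():
        line = line.strip()
        if line.startswith("{"):
            return json.loads(line)
    raise RuntimeError(f"no JSON-RPC reply; stderr was: {proc.stderr.strip()!r}")


# Stand-in server for illustration only: answers one request, then exits.
FAKE_SERVER = (
    "import json, sys\n"
    "req = json.loads(sys.stdin.readline())\n"
    "print(json.dumps({'jsonrpc': '2.0', 'id': req['id'], 'result': {'echo': req['method']}}))\n"
)

LOCAL_ARGV = [sys.executable, "-c", FAKE_SERVER]
# The remote form only swaps the argv; everything above stays identical:
# REMOTE_ARGV = ["ssh", "deploy@<vps-tailscale-ip>", "/absolute/path/to/server"]
```

Calling run_stdio_once(LOCAL_ARGV, initialize_frame()) returns the stand-in server's reply as a dict. Pointing the same call at an ssh argv is the whole of "stdio over SSH" from the client's side.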

The three servers here deliberately cover the three transport shapes you’ll meet most often:

| Server | What it is | Transport as deployed |
|---|---|---|
| arxiv | stdio binary, runs on the VPS | stdio over SSH (client runs ssh ... arxiv-mcp-server) |
| Scrapling | containerized HTTP service on the VPS | native Streamable HTTP (http://<vps>:8000/mcp) |
| SearXNG | a search web app on the VPS; the MCP server is a separate stdio shim run on the client | stdio shim locally, talking HTTP to the remote SearXNG |

That third row trips people up: SearXNG itself is not an MCP server. The MCP server is mcp-searxng, a small stdio process that runs on your client machine and makes HTTP calls to a SearXNG instance. Only the search engine is remote. Keep the two layers separate in your head or the wiring later won’t make sense.

This table also decides the Claude app question, and the deciding factor is transport, not the tool:

| Server | Reachable from claude.ai? | Why |
|---|---|---|
| Scrapling | Eligible in principle | Already speaks Streamable HTTP, the transport the Claude app attaches to. What’s missing is public HTTPS exposure and auth, not protocol. |
| arxiv | No clean path | stdio only; no HTTP mode exists in the server at all. |
| SearXNG (mcp-searxng) | No clean path | Same as arxiv: the MCP shim is stdio only. |

The Claude app can only attach a remote, publicly reachable, HTTPS MCP endpoint. It cannot spawn a local process, it cannot SSH anywhere, and Anthropic’s servers cannot route into your private tailnet. That rules out arxiv and SearXNG structurally: their MCP servers are stdio, full stop. Scrapling is the only one whose protocol is already right; for it, claude.ai is purely a question of public ingress and auth, which is Step 9. The app does not force auth on the endpoint, which is exactly why exposing one is your problem to secure, and why Step 9 treats authentication as mandatory. The honest framing: all three work in Claude Code today over Tailscale; only Scrapling has any path to the Claude app, and even that is an optional add-on, not the baseline.

Step 2

Prerequisites

The base box is the VPS Security Foundations end state. Everything below assumes it: Debian 13, zero public inbound ports, SSH reachable only over Tailscale, Docker installed from the official repo, a sudo-capable deploy operator account. If you don’t have that box, build it first; this guide does not re-derive it.

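On top of that baseline, the later steps rely on:

  - uv on the VPS for the deploy user (Step 4 installs arxiv-mcp-server with it).
  - A client machine on the same tailnet that can SSH to the VPS as deploy (Steps 4 and 7).
  - Node.js with npx on the client (Step 7 runs mcp-searxng through it).
  - Claude Code on the client (Step 7).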

Throughout, <vps-tailscale-ip> is the placeholder for your VPS’s Tailscale address (Tailscale hands these out of the 100.64.0.0/10 range, so they look like 100.x.y.z), and deploy is the operator account from the base box. Substitute your own.

Step 3

The Binding Decision

This is the conceptual core, and it’s the one place this guide deliberately contradicts the base box guide.

The hardening guide’s downstream contract says: bind container ports to 127.0.0.1, never to 0.0.0.0, and front anything public with a Cloudflare Tunnel. That rule exists because the hardening guide’s apps are public web services whose only safe exposure is an outbound tunnel. MCP servers are a different animal. The “public” for an MCP server here is your own tailnet, not the internet. You explicitly want your other machines to reach it, and they reach it over the encrypted Tailscale interface. So the correct bind for these is neither loopback (your laptop can’t reach that) nor 0.0.0.0 (that’s on the public NIC too, the exact mistake the hardening guide spends a step warning about). It’s the Tailscale interface address specifically.

Concretely, a Docker port publish for these servers looks like this:

# Wrong: public interface, bypasses UFW, reachable from the internet
ports:
  - "8000:8000"

# Wrong here: loopback only, your other tailnet machines can't reach it
ports:
  - "127.0.0.1:8000:8000"

# Right for a tailnet MCP server: bound to the Tailscale IP only
ports:
  - "<vps-tailscale-ip>:8000:8000"

The reason this is safe enough to run plaintext http:// with no auth token: WireGuard, the encrypted transport Tailscale is built on, handles encryption and device authentication at the network layer. A request only reaches port 8000 if it originated from an authenticated device on your tailnet, because that port is not bound to any other interface. The plaintext HTTP inside the tunnel is wrapped in WireGuard’s encryption the moment it leaves the host. This is the same reason the base box is comfortable with SSH listening on all interfaces: the firewalls, not the daemon, do the access control. Here the Tailscale-scoped bind, not an app-level token, does it.

Two caveats worth stating before they bite you:

A host-IP-scoped Docker bind requires that IP to exist on the host at container start. If Docker starts a container before tailscaled has brought the interface up (a reboot race), the bind fails and the container won’t start. The fix is ordering: a restart: unless-stopped policy plus, if you hit the race in practice, a systemd drop-in making the Docker service wait on tailscaled.service. The base box’s reboot ordering already handles the common case; if you do hit it, a mysterious post-reboot “container won’t start” has an obvious first suspect.

And the reminder the hardening guide earned the hard way: Docker’s published ports bypass UFW. A port published as 8000:8000 is reachable from the internet even with UFW default-deny, because Docker’s iptables rules sit upstream of UFW. The <vps-tailscale-ip>:8000:8000 form is precisely the mitigation: a port bound to the Tailscale IP is not on the public NIC, so there is nothing for UFW to have to catch. The bind pattern is the defense, the provider firewall is the safety net, exactly as in the base box, just with a different “right” address for this class of service.
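The three publish forms from this step can be told apart mechanically. The sketch below is one way to do it in Python, assuming Docker's short port syntax and Tailscale's 100.64.0.0/10 address range; classify_publish is a hypothetical helper, not a Docker or Tailscale API, and it expects a real IP rather than the <vps-tailscale-ip> placeholder.

```python
import ipaddress

# Tailscale hands out node addresses from this CGNAT block.
TAILNET = ipaddress.ip_network("100.64.0.0/10")


def classify_publish(spec):
    """Classify a Docker short-syntax port publish by the interface it binds.

    Returns "all-interfaces", "loopback", "tailnet", or "other".
    """
    spec = spec.split("/")[0]        # drop a trailing /tcp or /udp
    parts = spec.rsplit(":", 2)      # [host_ip, host_port, container_port]
    if len(parts) < 3:
        # No host IP given: by default Docker binds every interface.
        return "all-interfaces"
    addr = ipaddress.ip_address(parts[0].strip("[]"))
    if addr.is_unspecified:          # 0.0.0.0 or ::
        return "all-interfaces"
    if addr.is_loopback:
        return "loopback"
    if addr in TAILNET:
        return "tailnet"
    return "other"
```

For a tailnet MCP server, only the "tailnet" result is the shape this step argues for. "all-interfaces" is the internet-reachable mistake and "loopback" is unreachable from your other machines.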

Step 4

arxiv (stdio over SSH)

Of the three, arxiv is the cleanest, and the reason it’s clean is worth understanding because it generalizes.

Install it on the VPS as the deploy user with uv:

uv tool install arxiv-mcp-server

On a default install, uv tool install writes a launcher at ~/.local/bin/arxiv-mcp-server that is a symlink into a dedicated, self-contained virtual environment under ~/.local/share/uv/tools/arxiv-mcp-server/. Confirm it landed:

readlink -f /home/deploy/.local/bin/arxiv-mcp-server

You’ll get a path inside ~/.local/share/uv/tools/. That detail matters in a second.

Create the storage directory the server writes downloaded papers into. It will not create this for you:

mkdir -p /home/deploy/arxiv-papers

Now the trap, and it is the teaching moment of this guide because it fails in the most misleading way possible. The client will invoke this server as ssh deploy@<vps-tailscale-ip> arxiv-mcp-server .... That form, ssh host command, runs a non-interactive, non-login shell on the VPS. That shell does not read .bash_profile, and .bashrc is either not read at all or, on a stock Debian account, returns at its interactive-only guard before reaching anything appended to it. The ~/.local/bin entry that uv tool update-shell added to your PATH lives in those files, so under ssh host command it never takes effect. Bare arxiv-mcp-server resolves to “command not found”.

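What makes this vicious is that every manual test you’d reach for says it’s fine:

  - In an interactive SSH session, which arxiv-mcp-server finds the launcher.
  - In that same interactive session, running arxiv-mcp-server starts the server.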

It works every way a human checks it, then fails only when the MCP client invokes it as ssh host /the/command. The prior debugging session that produced this guide lost an hour here.

The fix is to never rely on PATH. Invoke the absolute path, always:

ssh deploy@<vps-tailscale-ip> /home/deploy/.local/bin/arxiv-mcp-server --storage-path /home/deploy/arxiv-papers

This works for a reason that closes the loop on the readlink from earlier: because uv tool install made the launcher’s interpreter an absolute path into its own venv, the absolute path to the launcher is sufficient on its own. No PATH entry needed, no virtualenv activation needed, no .bashrc. The absolute path is not a workaround; it’s the correct invocation, and uv tool is what makes it self-sufficient. Write the absolute path into the config and you never debug this again.
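If you generate client configs from a script, the absolute-path rule is easy to enforce at build time. The following is a small Python sketch under that assumption; ssh_stdio_argv is a hypothetical helper name, and it rejects anything that is not an absolute POSIX path, including ~-prefixed paths.

```python
from pathlib import PurePosixPath


def ssh_stdio_argv(login, remote_command, *remote_args):
    """Build the argv for a stdio MCP server reached as `ssh host command`.

    Refuses a bare command name, because the remote non-interactive shell
    may lack the PATH entries an interactive login would have.
    """
    if not PurePosixPath(remote_command).is_absolute():
        raise ValueError(f"use an absolute remote path, not {remote_command!r}")
    return ["ssh", login, remote_command, *remote_args]
```

The returned list maps directly onto the command and args of a stdio client entry. Note that ssh joins the remote arguments with spaces and the remote shell re-parses them, so paths containing spaces would need extra quoting that this sketch does not attempt.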

Run that command by hand once. It should hang silently: a stdio server with no input, waiting on stdin. That hang is success. Ctrl+C out. The real test is Step 8; resist wiring the client until then.

How AI can help

This failure mode is the canonical example of "works when I test it, fails for the agent," and it's exactly where a second model earns its keep. Hand it the symptom ("which finds it, interactive SSH runs it, but the MCP client reports the binary not found") and the constraint (it's invoked as ssh host command). The useful answer names the non-interactive non-login shell and the unsourced .bashrc without you having to lay the trail. The fix is one line; the value is not having to rediscover why from scratch the way the original session did.

Step 5

Scrapling (containerized HTTP)

Scrapling runs as a Docker container that speaks native Streamable HTTP, so there is no SSH tunnel and no stdio. The container’s command runs the MCP server in HTTP mode:

mcp --http --host 0.0.0.0 --port 8000

The --host 0.0.0.0 looks alarming after Step 3’s warnings, but it is correct here and the distinction is the whole point of Step 3. 0.0.0.0 inside the container means “all interfaces the container has,” which is just the container’s own network namespace. The isolation is done by the host-side publish, which binds to the Tailscale IP only:

<vps-tailscale-ip>:8000->8000/tcp

So the container listens broadly inside its sandbox, and the host exposes that sandbox on exactly one address: the tailnet one. This is the standard and correct shape for a containerized network service behind the tailnet bind. Confirm what your running container is actually doing rather than trusting the compose file:

docker inspect scrapling --format '{{.Config.Cmd}}'
docker inspect scrapling --format '{{json .NetworkSettings.Ports}}'

The first should show the mcp --http command; the second should show the host binding scoped to your Tailscale IP, not 0.0.0.0 and not [::]. If you see 0.0.0.0 on the host side, stop and fix the publish before going further; that is the internet-reachable mistake, not the harmless in-container case.
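Reading the Ports JSON by eye works for one container; a short script is less error-prone once there are several. This Python sketch assumes the usual shape of .NetworkSettings.Ports (container port mapped to a list of HostIp/HostPort objects, or null when unpublished). The function name offending_bindings is illustrative.

```python
import json


def offending_bindings(ports_json, tailscale_ip):
    """Return (container_port, host_ip) pairs published anywhere but the Tailscale IP.

    ports_json is the text printed by:
      docker inspect <name> --format '{{json .NetworkSettings.Ports}}'
    """
    ports = json.loads(ports_json) or {}
    bad = []
    for container_port, bindings in ports.items():
        for binding in bindings or []:      # unpublished ports map to null
            host_ip = binding.get("HostIp", "")
            if host_ip != tailscale_ip:
                bad.append((container_port, host_ip))
    return bad
```

An empty list means every published port is scoped to the address you passed in. Anything else names the exact binding to fix.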

The client reaches it by URL, with the path Scrapling serves MCP on:

http://<vps-tailscale-ip>:8000/mcp

Plaintext http:// is deliberate and acceptable for the reason established in Step 3: the bytes are inside WireGuard before they leave the host, and the port is unreachable from anywhere that isn’t on your tailnet. No token in the client config is not an oversight; it’s the tailnet-scoped bind doing the job a token would otherwise do. Step 9 revisits this for the one case (the Claude app) where the tailnet boundary no longer applies and a real auth layer becomes mandatory.

Step 6

SearXNG (remote search app + local MCP shim)

This is the two-layer one. The VPS runs SearXNG, a metasearch web application, in a container. The MCP server is a separate stdio process, mcp-searxng, that runs on the client and queries that remote SearXNG over HTTP. Nothing about SearXNG itself is MCP-aware. Get this split right and the rest is configuration.

The container publishes to the Tailscale IP, same pattern as Scrapling:

<vps-tailscale-ip>:8080->8080/tcp

SearXNG’s config is bind-mounted from the host into the container at /etc/searxng. Find the real path rather than guessing:

docker inspect searxng --format '{{json .Mounts}}'

That points at a host directory (for example ~/searxng/searxng) containing settings.yml. Three keys in that file decide whether the MCP layer works at all, and the defaults are wrong for this use:

search:
  formats:
    - html
    - json        # mcp-searxng calls the JSON API; stock SearXNG ships HTML-only

server:
  limiter: false        # the bot limiter throttles or blocks programmatic queries
  public_instance: false # public-instance mode tightens the same anti-bot path

The one that silently breaks everything is formats. mcp-searxng queries SearXNG’s JSON endpoint. A stock searxng-docker settings.yml ships formats as HTML only, so every search returns a non-JSON response and the MCP tool fails in a way that looks like “search returns nothing” rather than a clean error. limiter: false and public_instance: false matter because the bot-detection path will rate-limit or 403 automated queries even once JSON is on; for a tailnet-private instance that only you reach, that protection is pure downside. Verify all three in one shot:

grep -nE 'formats:|json|limiter:|public_instance:' /home/deploy/searxng/searxng/settings.yml

You want to see json present under the active formats: list, limiter: false, and public_instance: false. After editing settings.yml, restart the container so it re-reads the file:

docker restart searxng
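Once the container is back up, it can help to test the JSON API directly before involving the MCP shim. The Python sketch below builds the search URL with format=json and turns a response into a one-line diagnosis. Treat the status-code mapping as an assumption to confirm on your SearXNG version: 403 for a format that is not enabled and 429 for a limiter hit reflect commonly observed behaviour, not something this guide tested. Both helper names are hypothetical.

```python
import json
from urllib.parse import urlencode


def search_url(base_url, query):
    """Build a SearXNG search URL that asks for JSON output."""
    params = urlencode({"q": query, "format": "json"})
    return f"{base_url.rstrip('/')}/search?{params}"


def diagnose(status, body):
    """Turn one HTTP response from search_url() into a short diagnosis."""
    if status == 403:
        return "forbidden: check that json is listed under search.formats"
    if status == 429:
        return "rate limited: check server.limiter"
    if status != 200:
        return f"unexpected HTTP status {status}"
    try:
        payload = json.loads(body)
    except ValueError:
        return "not JSON: the instance answered in another format"
    if not isinstance(payload, dict) or "results" not in payload:
        return "JSON without a results list"
    return f"ok: {len(payload['results'])} results"
```

Fetch the URL with any HTTP client from a tailnet machine and pass the status and body to diagnose. An "ok" line means the layer mcp-searxng depends on is working before any MCP wiring exists.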

The MCP shim itself installs nothing on the VPS. It runs on the client, on demand, via npx, pointed at the remote instance through one environment variable. That wiring is Step 7.

How AI can help

The settings.yml audit is a good delegated check. Hand a model the file and ask specifically whether mcp-searxng will function against it: a competent answer flags the HTML-only formats default as the silent failure, not just a generic "looks fine." It's a fast second pair of eyes on a file where the failure mode is invisible until you've already wired everything and are wondering why search "returns nothing."

Step 7

Wiring the Client

There are two equivalent ways to register these in Claude Code: the claude mcp add CLI, or hand-editing .claude.json. Prefer the CLI; it’s harder to get wrong than hand-editing JSON, and it writes the same config the manual route would. The hand-edited block is shown after the commands because Claude Desktop and IDE clients are configured by file rather than the CLI, and because seeing what claude mcp add actually wrote is the fastest way to debug a broken entry.

arxiv (stdio over SSH, absolute path per Step 4):

claude mcp add arxiv -- ssh deploy@<vps-tailscale-ip> /home/deploy/.local/bin/arxiv-mcp-server --storage-path /home/deploy/arxiv-papers

Scrapling (HTTP, the /mcp path matters):

claude mcp add --transport http scrapling http://<vps-tailscale-ip>:8000/mcp

SearXNG (local stdio shim, remote instance via env var):

claude mcp add searxng --env SEARXNG_URL=http://<vps-tailscale-ip>:8080 -- npx -y mcp-searxng

The equivalent entries these produce, which sit under the mcpServers key in .claude.json and are also what you’d hand-build for clients without the CLI:

{
  "mcpServers": {
    "arxiv": {
      "type": "stdio",
      "command": "ssh",
      "args": [
        "deploy@<vps-tailscale-ip>",
        "/home/deploy/.local/bin/arxiv-mcp-server",
        "--storage-path",
        "/home/deploy/arxiv-papers"
      ],
      "env": {}
    },
    "scrapling": {
      "type": "http",
      "url": "http://<vps-tailscale-ip>:8000/mcp"
    },
    "searxng": {
      "type": "stdio",
      "command": "npx",
      "args": ["-y", "mcp-searxng"],
      "env": { "SEARXNG_URL": "http://<vps-tailscale-ip>:8080" }
    }
  }
}

Note where each server actually runs. arxiv’s process is on the VPS (SSH carries the pipe). Scrapling’s is on the VPS (HTTP). SearXNG’s MCP process is on the client (npx), and only the search instance it calls is on the VPS. The config encodes that asymmetry exactly; if it looks asymmetric, it’s because it is.
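The asymmetry can be read straight off each entry's shape. A minimal Python sketch, with placement as a hypothetical helper and the rule of thumb (http type, ssh command, anything else) as a simplifying assumption that holds for these three servers but not for every possible config:

```python
def placement(entry):
    """Say where an MCP server process runs, judged from its config entry."""
    if entry.get("type") == "http":
        return "remote process, reached by URL"
    if entry.get("command") == "ssh":
        return "remote process, stdio carried over SSH"
    return "local process on the client"


# Trimmed copies of the three entries, for illustration.
SERVERS = {
    "arxiv": {"type": "stdio", "command": "ssh"},
    "scrapling": {"type": "http", "url": "http://<vps-tailscale-ip>:8000/mcp"},
    "searxng": {"type": "stdio", "command": "npx"},
}
```

Running placement over SERVERS gives two remote answers and one local one, which is the quick sanity check for a hand-edited file: if searxng ever reports as remote, the shim has been wired as if SearXNG itself spoke MCP.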

Step 8

The Verification Ladder

Do not trust a server because the client says “connected.” Connected means the transport opened, not that the tool works. The reusable discipline here is a five-rung ladder, transport-agnostic, where each rung fails differently and rung three is the one almost every guide skips:

  1. The binary or endpoint resolves. stdio: readlink -f the launcher. HTTP: the port answers at all.
  2. It runs the way the client will invoke it. arxiv: the absolute-path ssh command hangs on stdin (success). HTTP: a request to the port gets a response, not a connection refused.
  3. It speaks MCP. Send a real initialize frame and get a valid result back. This is the rung that proves the protocol, not just the pipe.
  4. The client sees it. claude mcp list shows it connected.
  5. A real tool call in a session. Ask Claude to actually search arxiv, scrape a page, run a metasearch.

Rung three is the high-value test and it’s one line per transport. The frame is the same; only the delivery differs.

For arxiv, pipe it through SSH exactly as the client will run it:

echo '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"2024-11-05","capabilities":{},"clientInfo":{"name":"test","version":"1.0"}}}' | ssh deploy@<vps-tailscale-ip> /home/deploy/.local/bin/arxiv-mcp-server --storage-path /home/deploy/arxiv-papers

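For Scrapling, POST the same frame at the HTTP endpoint. Streamable HTTP requires the client to advertise both JSON and event-stream responses in its Accept header, so send it explicitly:

curl -s -X POST http://<vps-tailscale-ip>:8000/mcp -H 'content-type: application/json' -H 'accept: application/json, text/event-stream' -d '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"2024-11-05","capabilities":{},"clientInfo":{"name":"test","version":"1.0"}}}'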

For SearXNG, run the shim locally with the env var and feed it the frame on stdin:

echo '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"2024-11-05","capabilities":{},"clientInfo":{"name":"test","version":"1.0"}}}' | SEARXNG_URL=http://<vps-tailscale-ip>:8080 npx -y mcp-searxng

A pass is a JSON-RPC result containing a serverInfo object and a capabilities block, after which the process exits or waits cleanly. Over HTTP the same result may arrive framed as a server-sent event, with the JSON on a data: line. Get that back and rungs one and two were implicitly true, the protocol works end to end, and all that’s left is the client wiring (rung four) and a live call (rung five). If rung three passes but claude mcp list shows it failing, the bug is in the config syntax, not the server, which is a much smaller place to look. That triage value is the entire point of testing the protocol before trusting the client.
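The pass criterion is simple enough to check in code rather than by eye. Here is a Python sketch that accepts the raw output of any of the three commands above, whether a bare JSON line from a stdio server or an SSE-framed data: line from an HTTP one, and applies the same test. The function names are illustrative, and the parser is deliberately naive: it assumes the whole JSON-RPC message sits on one line.

```python
import json


def extract_jsonrpc(raw):
    """Pull the first JSON-RPC object out of raw command output.

    Handles a bare JSON line (stdio) and an SSE-framed `data: {...}` line (HTTP).
    """
    for line in raw.splitlines():
        line = line.strip()
        if line.startswith("data:"):
            line = line[len("data:"):].strip()
        if line.startswith("{"):
            try:
                return json.loads(line)
            except ValueError:
                continue  # not valid JSON; keep scanning
    return None


def rung_three_passes(raw, request_id=1):
    """True if raw holds an initialize result with serverInfo and capabilities."""
    message = extract_jsonrpc(raw)
    if not message or message.get("id") != request_id:
        return False
    result = message.get("result")
    return isinstance(result, dict) and "serverInfo" in result and "capabilities" in result
```

Capture each command's stdout and pass it to rung_three_passes. A shell error such as "command not found", an HTML error page, and a JSON-RPC error object all come back False, which keeps the three transports on one pass/fail definition.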

How AI can help

The whole ladder is worth handing to a model as a script-generation task: one wrapper that runs all three rung-three checks, parses the JSON-RPC responses, and prints a pass/fail line per server. It's tedious to write by hand, mechanical enough that a model gets it right, and it turns "is my stack healthy" into one command you can re-run after any change. The value is in the reusable harness, not the one-time check.

Step 9

Exposing Scrapling to the Claude App (Optional)

This section is scoped to Scrapling alone, and the first job is to say plainly why the other two are not here.

arxiv and SearXNG’s MCP servers are stdio-only. There is no HTTP mode to enable; the Claude app cannot drive stdio, cannot SSH, and cannot route into your tailnet. The only theoretical path is to run a stdio-to-HTTP bridge (something like mcp-proxy or supergateway) in front of each, then expose that publicly with auth. It works in the sense that it’s mechanically possible, and it is not worth it: it adds a second daemon per server plus the entire public-ingress-and-auth burden, with no capability you don’t already have in Claude Code. The honest recommendation is to use Claude Code for arxiv and SearXNG and not contort them.

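Scrapling is different only because its transport is already correct. It speaks Streamable HTTP; the Claude app attaches Streamable HTTP servers. Nothing about the protocol needs to change. What’s missing is everything Step 3 deliberately left out, because the tailnet made it unnecessary, and now you’re leaving the tailnet:

  - Public ingress: an endpoint Anthropic’s servers can reach, carried over an outbound tunnel so the box keeps zero public inbound ports.
  - TLS: the endpoint has to be HTTPS, not the plaintext http:// that was acceptable inside WireGuard.
  - Authentication: an auth layer in front of the server, because the app does not force one and the tailnet boundary no longer does the job.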

The shape, then: keep the tailnet-bound Scrapling exactly as Step 5 sets it up for Claude Code, and add a separate public path (tunnel + TLS + auth proxy) only if you specifically need it from the Claude app. The tailnet setup is the baseline and is complete on its own; the public path is additive and changes none of the earlier steps.

What’s Next

Deliberately out of scope here, in rough order of how likely you are to want it:

Intentionally not included: anything that re-derives the base box (that’s the VPS Security Foundations guide and this one starts at its end state), and anything that frames stdio servers as claude.ai-capable, because they are not and pretending otherwise wastes the reader’s evening the same way it wasted the original debugging session’s.

The throughline, if you keep one thing: a remote MCP server’s safety and reachability are decided by the bind address and the transport, not by the tool. Get Step 1 and Step 3 right and the per-server steps are mechanical. Get them wrong and no amount of correct per-server config saves you.

Toolkit Reference

Servers and tooling that show up across this guide, plus the spots where a second model saves real time.

Tools and Services

  - VPS Security Foundations: The hardened Tailscale-only Debian base box this guide assumes as its starting state.
  - Tailscale: The private mesh network that is both the transport and the authentication for the tailnet-bound servers.
  - Docker: Runtime for the Scrapling and SearXNG containers, published to the Tailscale IP only.
  - uv: Installs the arxiv server into a self-contained venv with an absolute-path launcher, which is what makes the SSH invocation PATH-independent.
  - arxiv-mcp-server: Paper search and retrieval. stdio, run on the VPS, reached over SSH.
  - Scrapling: Resilient web scraping. Native Streamable HTTP; the only one of the three with any path to the Claude app.
  - SearXNG: Self-hosted metasearch. A web app, not an MCP server; needs JSON format and the limiter off.
  - mcp-searxng: The stdio MCP shim that runs on the client and queries the remote SearXNG.
  - Claude Code: The MCP host all three are wired into, via claude mcp add or .claude.json.

Where AI Earns Its Keep

  - The non-interactive SSH PATH trap: Diagnosing "works interactively, fails for the client" without laying the whole trail by hand. The canonical second-model win in this guide.
  - The SearXNG settings.yml audit: A fast check on a file whose failure mode (HTML-only formats) is invisible until everything is wired and search silently returns nothing.
  - The verification harness: Generating one script that runs the rung-three initialize check across all three transports and prints pass/fail per server. Reusable after every change.