[Community project] Proximo — an open-source, least-privilege MCP/API layer for managing PVE with an AI agent (feedback wanted)

broadway

New Member
Jun 20, 2026
Hi all — sharing an open-source side project for feedback from people who actually run Proxmox in anger. It's a community project, not affiliated with Proxmox, and this isn't a support request — I'm after criticism of the approach.

What it is: Proximo is a small server that lets an AI agent (or any MCP/A2A client) manage a PVE cluster through the REST API with a scoped API token. It does not touch the hypervisor directly and is API-only by default. Safety was the design priority, because handing an LLM access to a hypervisor is obviously risky.

What that looks like in practice:

- Dry-run before every mutation. Each change first returns a preview: the exact operation, the guest's current state, and a computed "blast radius." Before deleting/disabling a storage, for instance, it reads the cluster and lists the specific guests that would lose a disk (and which won't boot vs. just degrade). The mutation can't run without that plan having been generated first.
- Tamper-evident audit log (hash-chained), kept locally — a verifiable record of what was planned and confirmed.
- Auto-snapshot before risky changes, with one-call rollback, wherever the storage supports snapshots.
- Scoped token, least privilege. I run it day-to-day against a PVEAuditor-style read-only token; mutations are refused at the API level unless the token is actually granted them. The token is never logged.
- In-container exec is opt-in and loud. The REST API has no exec-in-LXC endpoint, so that path goes over ssh -> pct exec; it's off by default, gated by a fail-closed CTID allowlist, and it warns that it grants near-root.
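
To make the blast-radius idea concrete, here is a minimal sketch of the storage case. It is not taken from the Proximo source. It assumes guest configs arrive as plain dicts shaped like PVE config output (disk keys such as scsi0 or rootfs with values like "storage:volume,...", and a "boot: order=..." string for VMs). The names blast_radius and boot_devices, and the "boot disk lost means it won't boot" rule, are illustrative assumptions.

```python
import re

# Keys that can reference a storage volume (assumption: a simplified subset).
DISK_KEY = re.compile(r"^(scsi|virtio|sata|ide|efidisk|rootfs|mp)\d*$")


def boot_devices(cfg):
    """Devices a guest boots from: rootfs for containers, 'order=' for VMs."""
    if "rootfs" in cfg:
        return {"rootfs"}
    match = re.search(r"order=([^,]+)", cfg.get("boot", ""))
    return set(match.group(1).split(";")) if match else set()


def blast_radius(storage, guests):
    """Map each affected guest to the disks it would lose and the impact."""
    report = {}
    for vmid, cfg in guests.items():
        lost = [key for key, value in cfg.items()
                if DISK_KEY.match(key)
                and str(value).split(":", 1)[0] == storage]
        if not lost:
            continue  # guest has no volume on this storage
        boot = boot_devices(cfg)
        impact = "wont_boot" if any(k in boot for k in lost) else "degraded"
        report[vmid] = {"disks": sorted(lost), "impact": impact}
    return report


# Made-up example cluster state.
guests = {
    100: {"scsi0": "ceph:vm-100-disk-0,size=32G", "boot": "order=scsi0;net0"},
    101: {"scsi0": "local-lvm:vm-101-disk-0",
          "scsi1": "ceph:vm-101-disk-1", "boot": "order=scsi0"},
    102: {"scsi0": "local-lvm:vm-102-disk-0", "boot": "order=scsi0"},
    200: {"rootfs": "ceph:subvol-200-disk-0,size=8G"},
    201: {"rootfs": "local-lvm:vm-201-disk-0",
          "mp0": "ceph:vm-201-disk-1,mp=/data"},
}
report = blast_radius("ceph", guests)
for vmid, entry in sorted(report.items()):
    print(vmid, entry["impact"], entry["disks"])
```

In this made-up cluster, removing "ceph" leaves guests 100 and 200 unbootable, guests 101 and 201 degraded, and guest 102 untouched. A real implementation would also have to consider things this sketch ignores, such as unused disks, cloud-init drives, and ISO images attached from the same storage.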

Maturity — stated plainly: brand new (v0.6.0), no real-world adoption yet. 145 tools and a large test suite, but a good portion of that surface still runs against mocks. What I've exercised against a real PVE 9.2 API (a single node plus a nested 3-node test cluster): the core lifecycle + the governance/dangerous plane (roles/groups/users/ACLs, storage, SDN/network, realms), offline guest migration, and HA-rule config — full create/read/delete cycles. I have not validated real HA fencing (needs a hardware watchdog), online live-migration (needs shared storage), or anything at production scale. SDN/network apply I deliberately never fire on a live host — it's unrecoverable. I'm also not claiming it's the first or only safety-minded Proxmox tool; there are others with real trust mechanisms. This is just the approach I landed on and would like critiqued.

Install (runs on your machine, on demand — no daemon, no open port):

uvx proximo-proxmox (or: pip install proximo-proxmox)

Source + docs (Apache-2.0): https://github.com/john-broadway/proximo

What I'd most value from this community: where does the trust model fall down on a real cluster? Are the blast-radius assumptions wrong for setups I haven't seen (shared storage, unusual boot configs, HA edge cases)? Is the token/permission posture sane? I'd rather hear it here than after someone points it at production.

(Full disclosure, since it's relevant: it's a human+AI project — I drove the design; an AI coding partner did much of the implementation, credited per-commit in the repo. Noting it because some folks rightly want to know.)

Thanks for any time you spend kicking the tires.
 
Thanks for sharing your project here. I appreciate that you openly state this is a human+AI hybrid project. However, looking closely at the concept, the GitHub profile, and the architectural choice, I see three potential scenarios here—all of which act as major red flags that keep me from deploying or testing this on any of my systems:

1. The "Sock-Puppet" / Social Engineering Angle
Aside from your very fresh GitHub profile, there seems to be virtually zero digital footprint or verifiable history for your identity in the tech community. With all due respect, your highly emotional/sympathetic bio ("100% service-connected disabled veteran") combined with a sudden appearance out of nowhere triggers classic social engineering alerts. It positions a tool at one of the most critical access points (PVE API) while shielding the author from critical scrutiny through a bulletproof personal narrative.

2. The "Vibe Coding" Enthusiast
Even if you are indeed a veteran IT professional who just discovered the powers of modern LLMs (like Claude), the lack of any prior history is highly unusual for a 35-year track record. Even if this is genuine, pure enthusiasm for AI-driven development at a hypervisor level is deeply concerning. While I respect your enthusiasm, I am definitely not "kicking the tires" on a core infrastructure tool that was rapidly spun up by a generative AI.

3. The Conceptual Risk of AI at the Hypervisor Level
Let's assume your intentions are 100% genuine and you are trying to solve exactly what you described. Even then, building an MCP/API layer designed to hand over autonomous or semi-autonomous control of a Proxmox VE API to an AI agent is conceptually high-risk. Anyone with decades of IT experience knows how dangerously subtle hallucinations, race conditions, and edge cases are in multi-threading or complex API scenarios. Entrusting a non-deterministic AI agent with deep PVE system rights—even under the guise of a "least-privilege" layer—is an architecture I strongly advise against.

For these reasons, I will sit this one out. I recommend that everyone exercise extreme caution before pointing an AI-generated API layer at their cluster.
 
1. I've been out of the "professional world" since 2000, when I semi-retired from the dot-com business. Yes, I have mental health issues that have made me afraid to put myself out there ever since.

2. I run dual x3650s here in my home lab; yes, I'm crazy like that. I've used Proxmox for years and I try to give back when I see things and my passions flow. As for "rapidly": I've been building this tool for months, loosely across multiple toolings, MCPs, et al.

3. Agreed completely, hence the approach I took. I truly understand you, but I also know where we are heading, and if we don't figure out serious governance and rails, we will be out of the game. Single points of function across many smaller LLM/agent transactions matter.

Appreciate the pushback — it's the right instinct, and it's the whole reason the project exists.

I'm not asking anyone to trust the agent; I'm trying to make the agent's every move provable and reversible so you don't have to.

Concretely, since I posted (now v0.7.2): every dangerous op is plan-first (you see the blast radius before anything runs, so an agent can't fumble into a destroy), reversible ops snapshot first, and every action lands in a keyed, hash-chained ledger.

Here's a 25-second demo that needs nothing but pip install proximo-proxmox — no Proxmox — showing the audit trail catch a tampering attempt at the exact line, and catch a tail-truncation when you pin the head off-box: https://asciinema.org/a/a8pZZBC9hqG4hObu
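
For readers who want the mechanism rather than the recording, here is a minimal sketch of a keyed, hash-chained ledger with a pinned head. It is an illustration under stated assumptions, not Proximo's actual format: the entry layout, the all-zero genesis value, and the function names append and verify are made up for this example.

```python
import copy
import hashlib
import hmac
import json

GENESIS = "0" * 64  # assumption: fixed starting value for the chain


def _mac(key, prev, event):
    """HMAC over the previous entry's MAC plus the canonical event body."""
    body = json.dumps(event, sort_keys=True)
    return hmac.new(key, (prev + body).encode(), hashlib.sha256).hexdigest()


def append(ledger, key, event):
    """Append an entry chained to its predecessor; return the new head."""
    prev = ledger[-1]["mac"] if ledger else GENESIS
    mac = _mac(key, prev, event)
    ledger.append({"prev": prev, "event": event, "mac": mac})
    return mac  # pin this value somewhere off-box


def verify(ledger, key, pinned_head=None):
    """Return (ok, detail); detail names the first bad entry (1-based)."""
    prev = GENESIS
    for n, entry in enumerate(ledger, start=1):
        want = _mac(key, prev, entry["event"])
        if entry["prev"] != prev or not hmac.compare_digest(entry["mac"], want):
            return False, f"tampered at entry {n}"
        prev = entry["mac"]
    if pinned_head is not None and not hmac.compare_digest(prev, pinned_head):
        return False, "head mismatch (tail truncated or replaced)"
    return True, "ok"


key = b"demo-key"  # assumption: a secret the agent cannot read
ledger = []
for op in ("plan", "confirm", "snapshot"):
    head = append(ledger, key, {"op": op})

tampered = copy.deepcopy(ledger)
tampered[1]["event"]["op"] = "delete"  # edit an entry in place
truncated = ledger[:-1]                # drop the last entry

print(verify(ledger, key, head))
print(verify(tampered, key, head))
print(verify(truncated, key))
print(verify(truncated, key, head))
```

The third call is the interesting one: a truncated chain is still internally consistent, so it verifies unless the expected head was recorded somewhere the attacker cannot reach. That is why the off-box pin matters.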

The non-determinism concern is fair and I'm not hand-waving it: the point isn't "the AI won't make mistakes," it's "when it does, you have a tamper-evident record and a snapshot to roll back to."

Still early, still want the holes poked.

Repo: https://github.com/john-broadway/proximo
 
I'm not interested in using an AI agent at all on my hosts because my data is too important for me: https://www.euronews.com/next/2026/...e-database-in-9-seconds-then-wrote-an-apology

In other words: I don't trust AI, I don't trust AI-coded tools, and I don't trust people using them.


That story is the exact nightmare — and honestly, it's why this exists. The reason an agent can nuke a database in 9 seconds is that nothing stands between "the model decided to" and "it happened." No plan, no snapshot, no record — just an apology afterward.

Proximo is built so that specific thing can't happen:
- A destructive op doesn't just execute. The agent gets a PLAN back — the blast radius — and a hard stop. A human confirms separately. An agent literally cannot fumble into the delete. (The live demo is exactly this: the agent asks to delete a guest, and it gets a refusal + a plan, not a deletion.)

- Reversible changes snapshot first where the platform can — so the rollback point is taken before the mistake, not wished for after.

- Every action lands on a tamper-evident, hash-chained ledger. There's no "wrote an apology" — there's a receipt you can't quietly edit.
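
The plan-then-confirm flow described above can be sketched as a small gate object. This is a hypothetical illustration, not the project's code: the class PlanGate, its method names, and the single-use plan rule are assumptions, and the thread does not specify how the human confirmation channel is kept separate from the agent.

```python
import secrets


class PlanGate:
    """A mutation runs only against a stored plan that has been confirmed."""

    def __init__(self):
        self._plans = {}

    def plan(self, op, target, blast_radius):
        """Record what would happen and hand back an opaque plan id."""
        plan_id = secrets.token_hex(8)
        self._plans[plan_id] = {"op": op, "target": target,
                                "blast_radius": blast_radius,
                                "confirmed": False}
        return plan_id

    def confirm(self, plan_id):
        """Meant to be reachable only from a human-facing channel."""
        self._plans[plan_id]["confirmed"] = True

    def execute(self, plan_id, op, target, action):
        """Run `action` only if it matches a confirmed, unused plan."""
        plan = self._plans.get(plan_id)
        if plan is None:
            raise PermissionError("no plan exists for this mutation")
        if (plan["op"], plan["target"]) != (op, target):
            raise PermissionError("request does not match the planned operation")
        if not plan["confirmed"]:
            raise PermissionError("plan has not been confirmed")
        del self._plans[plan_id]  # plans are single-use
        return action()


gate = PlanGate()
pid = gate.plan("delete", "vm/100", ["vm 100: all disks"])

try:
    gate.execute(pid, "delete", "vm/100", lambda: "deleted")
    refused = None
except PermissionError as exc:
    refused = str(exc)  # the agent gets a refusal, not a deletion

gate.confirm(pid)  # a human approves out of band
result = gate.execute(pid, "delete", "vm/100", lambda: "deleted")
print(refused, "->", result)
```

The gate only helps if confirm is unreachable from the agent's side; if the agent can call it, the hard stop is cosmetic.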

To be clear: I'm not asking you to trust the AI. I don't either. The whole design assumes the agent will eventually do something dumb, and makes that dumb thing visible and reversible instead of silent and final. "Don't trust the agent — trust the receipts" is the entire pitch.

Completely fair if it's still not for you. Appreciate you raising that case — it's the one this was built to answer.
 
The thing is: I don't trust AI to follow any guardrails or safety rules. So your answer doesn't change anything. Your answer to @meyergru's concerns is basically "trust me bro", which also doesn't help your case.
 
Yes, I have mental health issues
Not trying to be a dick, but I would never run software maintained by a single person with mental health issues.

If you have mental health issues, I would highly recommend not using AI. For several people I know, this has led to an extremely negative downturn. One even "tuned" his model with his therapist, which I think is completely insane and unprofessional. I don't know what it is that attracts people with mental health issues to LLMs. Maybe it is the constant positive feedback, which is IMHO not healthy even for people without mental health issues.

But back to topic. Let us not put the cart before the horse. Why should I even want to manage PVE with an AI agent? Proxmox is just a hypervisor. You spend basically no time in it. So why should I manage (which I barely manage at all) Proxmox with AI?
 
The thing is: I don't trust AI to follow any guardrails or safety rules. So your answer doesn't change anything. Your answer to @meyergru's concerns is basically "trust me bro", which also doesn't help your case.

Fair — "trust me" shouldn't be the answer, and you're right to push on it. So don't trust it.

The guardrails aren't the AI's to follow. The boundary is the PVE token: Proximo runs read-only by default and can't exceed the RBAC grants on the token you mint it — Proxmox enforces that, not the agent's good behavior. Hand it a read-only token and "the AI ignored the rules" doesn't change a single thing it's able to do. Every mutation also requires a dry-run plan first and lands in a tamper-evident log you can check after.

So the model isn't "trust the AI." It's "scope it with a token, and verify what it did." If you don't trust it — good, don't. Give it a read-only token and make it prove itself on diagnosis before it's allowed to touch anything.
 
If the token only needs read-only privileges, the agent cannot actually do anything harmful, that is correct. On the other hand - it cannot do anything at all.
 
Not trying to be a dick, but I would never run software maintained by a single person with mental health issues.

If you have mental health issues, I would highly recommend not using AI. For several people I know, this has led to an extremely negative downturn. One even "tuned" his model with his therapist, which I think is completely insane and unprofessional. I don't know what it is that attracts people with mental health issues to LLMs. Maybe it is the constant positive feedback, which is IMHO not healthy even for people without mental health issues.

But back to topic. Let us not put the cart before the horse. Why should I even want to manage PVE with an AI agent? Proxmox is just a hypervisor. You spend basically no time in it. So why should I manage (which I barely manage at all) Proxmox with AI?

Let's get one thing straight before anything else. My mental illness isn't a quirk I picked up off a screen. I earned it. I'm a 100% disabled veteran — I served, I paid for it, and I carried it quietly for twenty years. You will never understand the cost, because you've never been anywhere near the price. So spare me the spoiler-tag therapy session.

You don't get to play doctor on a man whose receipts you can't even read.

Now — since you want to talk about who's fit to be trusted with infrastructure, let's go through the part you obviously skipped, because it's clear you never read how this actually works.

You think it's "trust me bro." It isn't. The AI is trusted with nothing. Proximo authenticates to the PVE API with a scoped token — by default a read-only proximo@pve role. It cannot exceed the privileges on that token, because Proxmox's own RBAC enforces it — not the model's good behavior. Mint it a read-only token and it is structurally incapable of changing anything; the model "going rogue" earns you exactly one thing: a 403. That's not a promise from me. That's the permission system you administer every day.

Past that boundary: every mutating call is gated behind a mandatory dry-run plan — no plan, no mutation, no exceptions. Every action writes to a hash-chained, HMAC-keyed audit ledger with an off-box head anchor, so tampering is detectable, not hypothetical. Snapshot-class operations take a fail-closed snapshot before they touch a thing, so rollback is one call. In-container exec is off by default and sits behind a CTID allowlist. Least-privilege, fail-closed, auditable end to end.

So here's where we really are: you couldn't engage one line of that, so you went after the diagnosis instead. The "broken" guy built a least-privilege control plane and can walk it down to the token scope. The "normal" guy read none of it and reached for "he's crazy" — because that was the ceiling of where your brain could take you.

Read the architecture. Then maybe we talk like engineers.
 
If the token only needs read-only privileges, the agent cannot actually do anything harmful, that is correct. On the other hand - it cannot do anything at all.

You're right that read-only alone can't manage anything — but that's the part I'd push on, because read-only isn't the product, it's the floor.

Stopping there misses two things.

First, read-only already does real work. Diagnosis and audit need no write at all — "why won't this guest boot," "what changed in my firewall rules," "which tokens hold privileges they shouldn't." That's the highest-frequency, lowest-risk job, and it's genuinely useful with zero trust extended. So "it can't do anything" isn't quite true — it does the part you'd want first.

Second, and this is the actual design: write isn't a single all-or-nothing trust cliff. PVE RBAC is granular — you grant VM.PowerMgmt on one pool without Sys.Modify, or storage privileges without realm privileges, on exactly the paths you choose. You scope up to the task, not to "full admin," and the token still bounds it.
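
As a rough illustration of "scope up to the task", here is a toy model of path-scoped privileges. It is deliberately simplified and is not how Proxmox implements RBAC: the rule used here (the most specific granted path wins) and the helper names effective_privs and allowed are assumptions for the example. The privilege names VM.PowerMgmt, VM.Audit, and Sys.Modify are real PVE privileges.

```python
def effective_privs(grants, path):
    """Privileges at `path`: the most specific granted ancestor wins (toy rule)."""
    stripped = path.strip("/")
    parts = stripped.split("/") if stripped else []
    # Walk from the full path up to the root, e.g. /pool/web -> /pool -> /
    for depth in range(len(parts), -1, -1):
        candidate = "/" + "/".join(parts[:depth])
        if candidate in grants:
            return grants[candidate]
    return frozenset()  # nothing granted anywhere on the way up


def allowed(grants, path, priv):
    """True if the token's grants include `priv` at `path`."""
    return priv in effective_privs(grants, path)


# A token scoped to power management on one pool, and nothing else.
grants = {"/pool/web": frozenset({"VM.PowerMgmt", "VM.Audit"})}

print(allowed(grants, "/pool/web", "VM.PowerMgmt"))  # inside the grant
print(allowed(grants, "/pool/web", "Sys.Modify"))    # privilege not granted
print(allowed(grants, "/pool/db", "VM.PowerMgmt"))   # different pool
print(allowed(grants, "/", "Sys.Modify"))            # cluster root
```

Only the first check passes. Anything the agent attempts outside that scope is rejected by the permission check itself, regardless of what the model intended.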

And even inside the scope you grant, the agent doesn't get unsupervised mutation. Every mutating call is gated by a mandatory dry-run plan that shows exactly what will change and its blast radius before it runs. Every action lands in a tamper-evident, hash-chained log. Snapshot-class operations take a fail-closed snapshot first, so rollback is on call. So write access isn't "trust the AI to do the right thing" — it's "the AI proposes a previewed, bounded, reversible, recorded change, inside a scope you granted."

That's the middle ground you're saying doesn't exist: not read-only-and-useless vs write-and-trust-me, but graduated — least-privilege token, scoped to the task, every action previewed, reversible, and logged. It's the same discipline a careful admin already runs: scoped creds, change preview, audit, snapshot before risk. The agent just doesn't get to skip any of it.
 
@broadway:

Let’s take a step back from the technical scaffolding you are proposing.

Hash-chains, dry-runs, and scoped RBAC tokens are standard practices for traditional API automation. They do not, however, mitigate the core issue we are discussing here.

You are treating this as an engineering puzzle that can be solved with more features (i.e. the programmer's perspective). But from an operational, architectural, and security perspective, the fundamental problem is a total mismatch between non-deterministic tools and core infrastructure (i.e. the administrator's perspective).

Here is why your architecture poses a problem:
  1. The Read-Only Paradox: If your tool is strictly limited to read-only roles for diagnostics, it is essentially a monitoring script. We don't need a heavy MCP/API layer or an autonomous agent to read a log or check a configuration; standard deterministic tools do this faster and without risk.
  2. The Mutation Risk: The moment your tool executes mutations—which is its actual purpose—the risk becomes absolute. A generative model cannot guarantee state tracking under pressure. Anyone who runs large clusters knows how dangerously subtle race conditions, partial storage timeouts, or unexpected API limits can be. Wrapping an AI agent in a "dry-run" feature does not prevent it from misinterpreting the dry-run output itself when hitting an unpredicted edge case.
  3. The Social Engineering and Supply-Chain Risk: Introducing a completely fresh, unverified codebase into a hypervisor environment creates severe supply-chain risks. In a worst-case scenario involving malicious intent, the automated nature of generative AI provides the perfect layer of plausible deniability—any backdoor or exploit introduced in a future "bug fix" update can simply be blamed on an "LLM hallucination." Furthermore, the highly emotional, defensive responses regarding your personal background do not lower the threat profile; to the contrary, hiding behind a bulletproof sympathetic narrative is a textbook social engineering tactic designed to lull a wary community into a false sense of security. Note: I am not saying that this is indeed the case here.

Core infrastructure demands predictable determinism. Mixing generative AI with raw hypervisor orchestration is a conceptual boundary I am unwilling to cross, even if there were no potential for malicious intent.
 