What if XML could be more than just a payload?
Published on April 26, 2026
Written by MaDrCloudDev
I built a demo where one XML bundle helps a local WebGPU model generate a Pokémon card, and it has me wondering if XML might be more useful for AI workflows than we usually give it credit for.
I’ve been testing a small idea around XML and local AI.
One thing AI keeps doing is making older tools worth another look. Tailwind is the obvious example. Huge utility strings feel a lot less annoying when a model can write them. The same goes for things like HTMX. Once the repetitive glue gets cheap, some older patterns start feeling practical again.
That made me wonder about XML.
Not whether it should replace JSON. I do not think it should. JSON is still the better default most of the time. The question I had was narrower: if a model needs to turn structured input into structured output, does XML help because the structure is already spelled out?
To test that, I built a browser demo that runs a local model over WebGPU and has it generate the HTML for a Pokémon card from one XML bundle.
I picked a Pokémon card on purpose. It is structured enough to be annoying. There is a name, HP, attack rows, an image area, smaller decorative bits, and a bunch of places where specific values need to end up in specific slots. If the model drifts, it shows up quickly.
The XML bundle is doing more than carrying values. It includes the card data, a render map, and the small assets needed for the final output. So the model is not just getting “make a Pokémon card.” It is getting a labeled bundle that already says what the pieces are.
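To make that concrete, here is a minimal sketch of what a bundle with those three parts could look like. The element names, attributes, and the invented "Sparkmouse" card are all assumptions for illustration, not the demo's real schema. It uses Python's standard library only so the shape is easy to check outside the browser.

```python
import xml.etree.ElementTree as ET

# Hypothetical bundle: every element name, attribute, and value is an
# assumption made for illustration, not the demo's actual format.
BUNDLE = """
<card-bundle>
  <card id="demo-001">
    <name>Sparkmouse</name>
    <hp>60</hp>
    <attacks>
      <attack cost="1"><name>Quick Jolt</name><damage>10</damage></attack>
      <attack cost="2"><name>Static Tail</name><damage>30</damage></attack>
    </attacks>
    <flavor>It stores static in its tail.</flavor>
  </card>
  <render-map>
    <slot target=".card-name" source="card/name"/>
    <slot target=".card-hp" source="card/hp"/>
  </render-map>
  <assets>
    <asset id="energy-icon" type="image/svg+xml">placeholder</asset>
  </assets>
</card-bundle>
"""

root = ET.fromstring(BUNDLE)

# The three top-level sections: data, mapping, and assets.
sections = [child.tag for child in root]

# Each attack is a labeled node with its own labeled children.
attack_names = [a.findtext("name") for a in root.findall("card/attacks/attack")]
```

The point of the sketch is that the data, the mapping, and the assets travel together in one document, so the model and the app are reading the same thing.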
That ended up being the part I cared about most. JSON is excellent for moving data around, but XML is more explicit about hierarchy and relationships: every node carries a name, attributes sit right next to the content they describe, and each closing tag repeats what it closes. When the task is “turn this structure into that structure,” that extra explicitness seems useful.
The demo does work, but only with a fair amount of guidance. I still had to be explicit about the mapping, and I still had to steer the model toward the kind of HTML I wanted back. Raw XML by itself was not enough. Once I added the render map and tighter instructions, the output got a lot more stable.
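As a rough illustration of "render map plus tighter instructions," the sketch below turns a render map into explicit per-slot directions and appends the bundle itself. The `build_prompt` helper, the `slot` elements, and the prompt wording are all hypothetical. The demo's real prompt may look quite different.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed-down bundle; names are assumptions for illustration.
BUNDLE = """<card-bundle>
  <card><name>Sparkmouse</name><hp>60</hp></card>
  <render-map>
    <slot target=".card-name" source="card/name"/>
    <slot target=".card-hp" source="card/hp"/>
  </render-map>
</card-bundle>"""


def build_prompt(bundle_xml: str) -> str:
    """Spell out each render-map slot as an instruction, then attach the XML."""
    root = ET.fromstring(bundle_xml)
    lines = ["Render the card below as HTML. Fill each slot exactly as listed:"]
    for slot in root.findall("render-map/slot"):
        source = slot.get("source")
        value = root.findtext(source)  # resolve the path against the bundle
        lines.append(
            f"- put {source} ({value!r}) into the element matching {slot.get('target')}"
        )
    lines.append("Return only HTML. Do not invent values that are not in the bundle.")
    lines.append("")
    lines.append(bundle_xml)
    return "\n".join(lines)


prompt = build_prompt(BUNDLE)
```

Because the instructions are derived from the bundle rather than written by hand, the prompt and the XML cannot quietly drift apart.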
So I would not frame this as XML replacing prompting. It felt more like XML gave me a better source of truth for the prompt to lean on.
Another thing I like about this setup is that the XML stays inspectable. I can open it, query it, and compare exactly what the app is using with exactly what the model is using. If you have never used XPath, the simplest way to think about it is as a way to ask the XML direct questions like “what is the second attack?” or “where is the flavor text?” Even without the AI part, that is useful.
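Here is a small sketch of those two questions asked in code. It uses Python's `xml.etree.ElementTree`, which supports only a limited subset of XPath, and the card markup is a made-up example, not the demo's actual bundle.

```python
import xml.etree.ElementTree as ET

# Hypothetical card markup; element names are assumptions for illustration.
CARD = """<card>
  <name>Sparkmouse</name>
  <attacks>
    <attack><name>Quick Jolt</name><damage>10</damage></attack>
    <attack><name>Static Tail</name><damage>30</damage></attack>
  </attacks>
  <flavor>It stores static in its tail.</flavor>
</card>"""

card = ET.fromstring(CARD)

# "What is the second attack?" Positions in XPath predicates are 1-based.
second_attack = card.findtext("attacks/attack[2]/name")

# "Where is the flavor text?" The .// prefix searches at any depth.
flavor = card.findtext(".//flavor")

print(second_attack)
print(flavor)
```

The same queries work whether or not a model is involved, which is what makes the bundle easy to check by hand.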
Running the model locally also matters here. There is no server hiding the hard part. The browser gets the XML bundle, the model runs on your machine through WebGPU, and the result succeeds or fails right there. For an experiment like this, I prefer that.
The main limitation right now is pretty simple: this is still a guided workflow. The clearer the structure and the tighter the directions, the better the result. That may sound obvious, but I think it is important to say out loud because a lot of AI demos gloss over how much scaffolding is involved.
Still, I think there is something real here. Not “use XML everywhere,” just “XML may be a decent contract format when a model needs to read structure, not just values.”
That is where I am at with it. I had a hunch, built a rough test, and got a result that feels promising enough to keep pushing on.
The next thing I want to find out is how much of the extra guidance I can remove before the output falls apart. If it turns out I can remove almost none of it, that is still useful to know. If the model needs less guidance than I thought, then it gets more interesting.
- Live Demo - Try the hosted version
Quick heads up: the local generator depends on WebGPU, which means browser support and hardware acceleration matter. If the demo does not run, you may need to enable hardware acceleration or WebGPU in your browser settings or flags, then reload the page. There is no server fallback for the model side of this demo.
At minimum, this convinced me that the format feeding a model matters more than I was giving it credit for when the goal is keeping structure intact.