CVE-2026-41674
EPSS 0.02%
xmldom has XML injection through unvalidated DocumentType serialization
Description
## Summary

The package serializes `DocumentType` node fields (`internalSubset`, `publicId`, `systemId`) verbatim without any escaping or validation. When these fields are set programmatically to attacker-controlled strings, `XMLSerializer.serializeToString` can produce output where the DOCTYPE declaration is terminated early and arbitrary markup appears outside it.

---

## Details

`DOMImplementation.createDocumentType(qualifiedName, publicId, systemId, internalSubset)` validates only `qualifiedName` against the XML QName production. The remaining three arguments are stored as-is with no validation.

The XMLSerializer emits `DocumentType` nodes as:

```
<!DOCTYPE name[ PUBLIC pubid][ SYSTEM sysid][ [internalSubset]]>
```

All fields are pushed into the output buffer verbatim: no escaping is applied and no quoting is added.

**`internalSubset` injection:** The serializer wraps `internalSubset` with ` [` and `]`. A value containing `]>` closes the internal subset and the DOCTYPE declaration at the injection point. Any content after `]>` in `internalSubset` appears outside the DOCTYPE in the serialized output as raw XML markup. Reported by @TharVid (GHSA-f6ww-3ggp-fr8h). Affected: `@xmldom/xmldom` ≥ 0.9.0 via the `createDocumentType` API; 0.8.x only via direct property write.

**`publicId` injection:** The serializer emits `publicId` verbatim after `PUBLIC` with no quoting added. A value containing an injected system identifier (e.g., `"pubid" SYSTEM "evil"`) breaks the intended quoting context, injecting a fake SYSTEM entry into the serialized DOCTYPE declaration. Identified during internal security research. Affected: both branches, all versions back to 0.1.0.

**`systemId` injection:** The serializer emits `systemId` verbatim. A value containing `>` terminates the DOCTYPE declaration early; content after `>` appears as raw XML markup outside the DOCTYPE context. Identified during internal security research. Affected: both branches, all versions back to 0.1.0.
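
The emission order described above can be reproduced without the library. The following is a minimal standalone model, not the xmldom source: the function name `serializeDoctypeModel` is hypothetical, and the literal keywords `<!DOCTYPE`, `PUBLIC`, and `SYSTEM` are assumed to be what the serializer writes. It only illustrates how plain concatenation lets each field escape its intended position.

```js
// Hypothetical model of verbatim DOCTYPE emission (illustration only, not xmldom code).
function serializeDoctypeModel({ name, publicId = '', systemId = '', internalSubset = '' }) {
  const buf = ['<!DOCTYPE', ' ', name];
  if (publicId) {
    // publicId is appended as-is; no quotes are added around it
    buf.push(' ', 'PUBLIC', ' ', publicId);
    if (systemId) buf.push(' ', systemId);
  } else if (systemId) {
    buf.push(' ', 'SYSTEM', ' ', systemId);
  }
  if (internalSubset) {
    // the value sits between " [" and "]", so "]>" inside it ends the declaration
    buf.push(' [', internalSubset, ']');
  }
  buf.push('>');
  return buf.join('');
}

console.log(serializeDoctypeModel({ name: 'root', internalSubset: ']><injected/><![CDATA[' }));
// <!DOCTYPE root []><injected/><![CDATA[]>

console.log(serializeDoctypeModel({ name: 'root', publicId: '"injected PUBLIC_ID" SYSTEM "evil"' }));
// <!DOCTYPE root PUBLIC "injected PUBLIC_ID" SYSTEM "evil">

console.log(serializeDoctypeModel({ name: 'root', systemId: '"sysid"><injected attr="pwn"/>' }));
// <!DOCTYPE root SYSTEM "sysid"><injected attr="pwn"/>>
```

In each case the attacker-supplied string supplies its own closing delimiter, and the delimiter the serializer appends afterwards ends up as stray text.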

The parse path is safe: the SAX parser enforces the `PubidLiteral` and `SystemLiteral` grammar productions, which exclude the relevant characters, and the internal subset parser only accepts a subset it can structurally validate. The vulnerability is reachable only through programmatic `createDocumentType` calls with attacker-controlled arguments.

---

## Affected code

**`lib/dom.js`, `createDocumentType` (lines 898–910):**

```js
createDocumentType: function (qualifiedName, publicId, systemId, internalSubset) {
	validateQualifiedName(qualifiedName); // only qualifiedName is validated
	var node = new DocumentType(PDC);
	node.name = qualifiedName;
	node.nodeName = qualifiedName;
	node.publicId = publicId || ''; // stored verbatim
	node.systemId = systemId || ''; // stored verbatim
	node.internalSubset = internalSubset || ''; // stored verbatim
	node.childNodes = new NodeList();
	return node;
},
```

**`lib/dom.js`, serializer DOCTYPE case (lines 2948–2964):**

```js
case DOCUMENT_TYPE_NODE:
	var pubid = node.publicId;
	var sysid = node.systemId;
	buf.push(g.DOCTYPE_DECL_START, ' ', node.name);
	if (pubid) {
		buf.push(' ', g.PUBLIC, ' ', pubid);
		if (sysid && sysid !== '.') {
			buf.push(' ', sysid);
		}
	} else if (sysid && sysid !== '.') {
		buf.push(' ', g.SYSTEM, ' ', sysid);
	}
	if (node.internalSubset) {
		buf.push(' [', node.internalSubset, ']'); // internalSubset emitted verbatim
	}
	buf.push('>');
	return;
```

---

## PoC

### internalSubset injection

```js
const { DOMImplementation, XMLSerializer } = require('@xmldom/xmldom');

const impl = new DOMImplementation();
const doctype = impl.createDocumentType(
	'root',
	'',
	'',
	']><injected/><![CDATA['
);
const doc = impl.createDocument(null, 'root', doctype);
const xml = new XMLSerializer().serializeToString(doc);
console.log(xml);
// <!DOCTYPE root []><injected/><![CDATA[]><root/>
//                   ^^^^^^^^^^ injected element outside DOCTYPE
```

### publicId quoting context break

```js
const { DOMImplementation, XMLSerializer } = require('@xmldom/xmldom');

const impl = new DOMImplementation();
const doctype = impl.createDocumentType(
	'root',
	'"injected PUBLIC_ID" SYSTEM "evil"',
	'',
	''
);
const doc = impl.createDocument(null, 'root', doctype);
console.log(new XMLSerializer().serializeToString(doc));
// <!DOCTYPE root PUBLIC "injected PUBLIC_ID" SYSTEM "evil"><root/>
// quoting context broken: SYSTEM entry injected
```

### systemId injection

```js
const { DOMImplementation, XMLSerializer } = require('@xmldom/xmldom');

const impl = new DOMImplementation();
const doctype = impl.createDocumentType(
	'root',
	'',
	'"sysid"><injected attr="pwn"/>',
	''
);
const doc = impl.createDocument(null, 'root', doctype);
console.log(new XMLSerializer().serializeToString(doc));
// <!DOCTYPE root SYSTEM "sysid"><injected attr="pwn"/>><root/>
// > in sysid closes DOCTYPE early; <injected/> appears as sibling element
```

---

## Impact

An application that programmatically constructs `DocumentType` nodes from user-controlled data and then serializes the document can emit a DOCTYPE declaration where the internal subset is closed early or where injected SYSTEM entities or other declarations appear in the serialized output. Downstream XML parsers that re-parse the serialized output and expand entities from the injected DOCTYPE declarations may be susceptible to XXE-class attacks if they enable entity expansion.

---

## Fix Applied

> **⚠ Opt-in required.** Protection is not automatic. Existing serialization calls remain
> vulnerable unless `{ requireWellFormed: true }` is explicitly passed. Applications that pass
> untrusted data to `createDocumentType()` or write untrusted values directly to a
> `DocumentType` node's `publicId`, `systemId`, or `internalSubset` properties should audit
> all `serializeToString()` call sites and add the option.

`XMLSerializer.serializeToString()` now accepts an options object as a second argument.
When `{ requireWellFormed: true }` is passed, the serializer validates the `DocumentType` node's `publicId`, `systemId`, and `internalSubset` fields before emitting the DOCTYPE declaration and throws `InvalidStateError` if any field contains an injection sequence:

- **`publicId`**: throws if non-empty and does not match the XML `PubidLiteral` production (XML 1.0 [12])
- **`systemId`**: throws if non-empty and does not match the XML `SystemLiteral` production (XML 1.0 [11])
- **`internalSubset`**: throws if it contains `]>` (which closes the internal subset and DOCTYPE declaration early)

All three checks apply regardless of how the invalid value entered the node, whether via `createDocumentType` arguments or a subsequent direct property write.

### PoC: fixed path

```js
const { DOMImplementation, XMLSerializer } = require('@xmldom/xmldom');

const impl = new DOMImplementation();

// internalSubset injection
const dt1 = impl.createDocumentType('root', '', '', ']><injected/><![CDATA[');
const doc1 = impl.createDocument(null, 'root', dt1);

// Default (unchanged): verbatim, injection present
console.log(new XMLSerializer().serializeToString(doc1));
// <!DOCTYPE root []><injected/><![CDATA[]><root/>

// Opt-in guard: throws InvalidStateError
try {
	new XMLSerializer().serializeToString(doc1, { requireWellFormed: true });
} catch (e) {
	console.log(e.name, e.message);
	// InvalidStateError: DocumentType internalSubset contains "]>"
}
```

The guard also covers post-creation property writes:

```js
const dt2 = impl.createDocumentType('root', '', '');
dt2.systemId = '"sysid"><injected attr="pwn"/>';
const doc2 = impl.createDocument(null, 'root', dt2);
new XMLSerializer().serializeToString(doc2, { requireWellFormed: true });
// InvalidStateError: DocumentType systemId is not a valid SystemLiteral
```

### Why the default stays verbatim

The W3C DOM Parsing and Serialization spec §3.2.1.3 defines a `require well-formed` flag whose **default value is `false`**. With the flag unset, the spec permits verbatim serialization of DOCTYPE fields. Unconditionally throwing would be a behavioral breaking change with no spec justification. The opt-in `requireWellFormed: true` flag allows applications that require injection safety to enable strict mode without breaking existing deployments.

### Residual limitation

`createDocumentType(qualifiedName, publicId, systemId[, internalSubset])` does not validate `publicId`, `systemId`, or `internalSubset` at creation time. Adding creation-time validation would be a breaking change and is deferred to a future breaking release.

When the default serialization path is used (without `requireWellFormed: true`), all three fields are still emitted verbatim. Applications that do not pass `requireWellFormed: true` remain exposed.
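
Since nothing is validated at creation time, an application that builds `DocumentType` nodes from untrusted input may also want to reject unsafe values before they reach `createDocumentType`. The following is a minimal sketch of such a pre-check, not part of xmldom: the names `assertSafeDoctypeFields`, `PUBID_LITERAL`, and `SYSTEM_LITERAL` are hypothetical, the regular expressions are transcribed from the XML 1.0 productions [11] to [13], and the values are assumed to carry their own surrounding quotes, as they do in the PoCs above. The `internalSubset` check only looks for `]>` and is not a full grammar validation.

```js
// Hypothetical application-side pre-check (illustration only, not an xmldom API).

// PubidLiteral: a quoted run of PubidChar; the single-quoted form excludes "'".
const PUBID_LITERAL =
  /^(?:"[\x20\r\na-zA-Z0-9\-'()+,.\/:=?;!*#@$_%]*"|'[\x20\r\na-zA-Z0-9\-()+,.\/:=?;!*#@$_%]*')$/;

// SystemLiteral: any characters except the enclosing quote character.
const SYSTEM_LITERAL = /^(?:"[^"]*"|'[^']*')$/;

function assertSafeDoctypeFields({ publicId = '', systemId = '', internalSubset = '' } = {}) {
  if (publicId && !PUBID_LITERAL.test(publicId)) {
    throw new TypeError('publicId is not a valid PubidLiteral');
  }
  if (systemId && !SYSTEM_LITERAL.test(systemId)) {
    throw new TypeError('systemId is not a valid SystemLiteral');
  }
  if (internalSubset.includes(']>')) {
    throw new TypeError('internalSubset contains "]>"');
  }
}

// Accepted: well-formed quoted literals.
assertSafeDoctypeFields({
  publicId: '"-//W3C//DTD XHTML 1.0 Strict//EN"',
  systemId: '"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"',
});

// Rejected: the systemId payload from the PoC.
try {
  assertSafeDoctypeFields({ systemId: '"sysid"><injected attr="pwn"/>' });
} catch (e) {
  console.log(e.message);
}
```

A check like this complements `{ requireWellFormed: true }` at serialization time; it does not replace it, because it does not cover values written directly to a node's properties after creation.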
Affected packages (3)
- Debian/node-xmldom from 0
- npm/xmldom from 0, <= 0.6.0
- npm/@xmldom/xmldom from 0, < 0.8.13
CVSS scores
| Source | Version | Severity | Vector |
|---|---|---|---|
| osv | CVSS 4.0 | — | CVSS:4.0/AV:N/AC:L/AT:N/PR:N/UI:N/VC:N/VI:H/VA:N/SC:N/SI:N/SA:N |
References (7)
- ADVISORY https://nvd.nist.gov/vuln/detail/CVE-2026-41674
- ADVISORY https://security-tracker.debian.org/tracker/CVE-2026-41674
- PATCH https://github.com/xmldom/xmldom
- WEB https://github.com/xmldom/xmldom/commit/372008f9ae0e20fd69f761c7b79e202598267314
- WEB https://github.com/xmldom/xmldom/releases/tag/0.8.13
- WEB https://github.com/xmldom/xmldom/releases/tag/0.9.10
- WEB https://github.com/xmldom/xmldom/security/advisories/GHSA-f6ww-3ggp-fr8h