File Carving

Prompt

The security team has found a rather strange file exiting the network, we're not sure if it's containing any sensitive information. Help us identify what's in it.

green_file.bin (4.4 KB)

Tutorial Video

Walk-Through

This challenge tests your ability to identify magic bytes and extract concatenated files.

If you do not understand how digital data is represented, you should check out our Computer Fundamentals for Cybersecurity.

Files are a sequence of bytes. Each byte can store a value between 0 and 255. To make sense of these bytes and do something useful with them, such as displaying an image or playing an audio clip, the bytes are structured into a standardized layout known as a file format. The format tells a program how to interpret the data. For example, the format for an image may dictate that the first byte of the file is the color of the first pixel, the second byte is the color of the second pixel, and so on. The format for an audio file may dictate that the first byte is the frequency of the first second of the audio clip, the second byte is the frequency of the second second, and so on. If you attempt to view an image with a music player or edit a song using a photo editor, you’ll find that these programs get confused when trying to process the file and give you an error.
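As a minimal sketch of this idea, the Python snippet below reads the same four bytes under two made-up formats. The helpers as_pixels and as_samples are hypothetical and exist only for illustration; real image and audio formats are far more involved.

```python
# Four raw bytes; each one holds a value from 0 to 255.
data = bytes([0, 128, 255, 64])

def as_pixels(raw):
    """Toy 'image format': treat each byte as a grayscale pixel intensity."""
    return [f"pixel {i}: gray level {b}" for i, b in enumerate(raw)]

def as_samples(raw):
    """Toy 'audio format': treat each byte as a sample centered on 128."""
    return [(b - 128) / 128 for b in raw]

print(list(data))        # the raw byte values
print(as_pixels(data))   # the same bytes read as pixels
print(as_samples(data))  # the same bytes read as audio samples
```

The bytes never change; only the interpretation does, which is why a program needs to know the format before it can do anything useful with a file.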

There are two primary ways that programs attempt to determine which format a file uses: the file extension (e.g. .png, .mp3, .pdf) and the file signature, also known as magic bytes. The file extension is part of the file's name, while the magic bytes are a unique sequence of byte values stored within the contents of the file itself, usually at the very beginning. Often, the file extension is used to determine the file format, since checking the filename is more efficient than opening every file in a folder to look for the magic bytes. If a file extension has been modified or removed, some programs will be tricked into thinking that they cannot open the file. Simply renaming the file to restore the correct extension quickly solves this problem.
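To make the difference between the two approaches concrete, here is a small Python sketch that compares a file's extension against its magic bytes. The SIGNATURES table holds only a handful of well-known signatures, and the function names are assumptions made for this example rather than part of any real tool.

```python
from pathlib import PurePath

# A few well-known signatures; real tools know many more.
SIGNATURES = {
    b"\x89PNG\r\n\x1a\n": ".png",
    b"\xff\xd8\xff": ".jpg",
    b"%PDF-": ".pdf",
    b"\x1f\x8b": ".gz",
}

def extension_from_magic(data):
    """Return the extension implied by the leading bytes, or None if unknown."""
    for magic, ext in SIGNATURES.items():
        if data.startswith(magic):
            return ext
    return None

def extension_matches(filename, data):
    """True when the name's extension agrees with the magic bytes."""
    return PurePath(filename).suffix.lower() == extension_from_magic(data)

# Stand-in for real file contents: a PNG signature followed by filler bytes.
png_header = b"\x89PNG\r\n\x1a\n" + b"\x00" * 8

print(extension_from_magic(png_header))                 # .png
print(extension_matches("green_file.bin", png_header))  # False
print(extension_matches("green_file.png", png_header))  # True
```

A mismatch like the one reported for green_file.bin is a hint that the name alone should not be trusted.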

image

For this challenge, the file looks relatively simple. If your operating system knows how to open it, it displays what looks like random green pixels, as shown above.

If your operating system doesn’t know how to open it, or you want to know for certain what kind of file it is, you can use the Linux file command to do that:

file green_file

Which should output something like the following:

image

The file program keeps a list of magic bytes and their corresponding file formats. It searches the file you provide for those magic bytes and reports any that match. The output tells you that this is a PNG file, so if your operating system did not know how to open it, you can rename the file to “green_file.png” and see whether it opens now.

mv green_file.bin green_file.png

To find out whether this file is hiding additional files, one tool you can use is binwalk. It is designed to identify files by reading magic bytes, along with a number of other file carving techniques. If someone combines multiple files into one, you can still identify where one file ends and the next begins by looking for the magic bytes, since these usually mark the start of a file. Once you know where each file starts, you can extract the individual files from the combined file and view each one independently.
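The core idea can be sketched in a few lines of Python. This is a deliberate simplification, not how binwalk is implemented: it searches for just two signatures, and the find_signatures helper and the synthetic blob are made up for the example. Real tools also validate the data that follows each hit, because short signatures can appear by coincidence.

```python
PNG_MAGIC = b"\x89PNG\r\n\x1a\n"
GZIP_MAGIC = b"\x1f\x8b\x08"  # gzip ID bytes plus the deflate method byte

def find_signatures(blob):
    """Return sorted (offset, label) pairs for every signature hit in blob."""
    hits = []
    for label, magic in (("PNG", PNG_MAGIC), ("gzip", GZIP_MAGIC)):
        start = blob.find(magic)
        while start != -1:
            hits.append((start, label))
            start = blob.find(magic, start + 1)
    return sorted(hits)

# Synthetic blob: two fake "PNG" bodies followed by a fake "gzip" body.
blob = PNG_MAGIC + b"A" * 10 + PNG_MAGIC + b"B" * 5 + GZIP_MAGIC + b"C" * 4

for offset, label in find_signatures(blob):
    # Show each offset in decimal and in hexadecimal.
    print(f"{offset:<8} 0x{offset:<6X} {label}")
```

Each hit marks a candidate starting point for an embedded file, which is the information needed to split the blob apart.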

In this scenario, binwalk can help identify any other files that may be packaged into this seemingly green image file.

To do that, we can execute the following command:

binwalk green_file

and the results should list the offsets where binwalk has detected a file. We can see in the output that there are 6 different files inside this seemingly green image file, most of which are PNG files, with one being a gzip compressed file. Now that we know other files are packaged inside it, we can also use binwalk to extract them. To find out how, read through the binwalk manual pages, where you’ll find that the “Extraction Options” are very relevant to this scenario.

man binwalk
image

Now we can compose our extraction command by executing

binwalk --extract --dd "png:png" green_file

which will include PNG files in the extraction and give them the .png extension.

image

Once extracted, you can run the ls command to see a new directory named _green_file.extracted. The contents of that directory are the outputs of the binwalk extraction process, and each file is named after the hexadecimal byte offset at which it was found.
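The naming scheme is easier to see with a small Python sketch. The carve helper below is hypothetical and simplified: it ends each piece where the next one starts, which is an assumption made for this example and not a description of how binwalk decides file sizes.

```python
def carve(blob, offsets):
    """Split blob at the given start offsets; key each piece by its hex offset."""
    ordered = sorted(offsets)
    ends = ordered[1:] + [len(blob)]
    return {f"{start:X}": blob[start:end] for start, end in zip(ordered, ends)}

# Three stand-in "files" glued together, starting at offsets 0, 10 and 16.
blob = b"first-file" + b"second" + b"third!!"
pieces = carve(blob, [0, 10, 16])

for name, content in pieces.items():
    print(name, content)

# A name such as CAB is simply an offset written in hexadecimal.
print(int("CAB", 16))  # 3243
```

Reading a file name as a hexadecimal number tells you how far into the original blob that file began.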

You can review each one individually and see if there is anything of interest to this scenario. The file named CAB is the only non-image file in here. We can see what it is by running the file command on it, like we did earlier. The output will show that this is a tar archive.

image

From here, we can unpack the tar archive using the tar command as follows:

tar xvf CAB
image

and you can see that the archive contains an interesting directory called flags with a flags.txt file in it, which contains the answer to the final question.
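If you prefer to inspect an archive without unpacking it to disk, Python's standard tarfile module can do the same job. The sketch below builds a small stand-in archive in memory, so the payload is a placeholder and not the real flag, and the helper names list_members and read_member are made up for this example.

```python
import io
import tarfile

def list_members(archive_bytes):
    """Return member names; mode "r:*" lets tarfile detect any compression."""
    with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode="r:*") as tar:
        return tar.getnames()

def read_member(archive_bytes, name):
    """Return the raw contents of one member of the archive."""
    with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode="r:*") as tar:
        return tar.extractfile(name).read()

# Build a gzip-compressed stand-in archive entirely in memory.
payload = b"placeholder, not the real flag\n"
buffer = io.BytesIO()
with tarfile.open(fileobj=buffer, mode="w:gz") as tar:
    info = tarfile.TarInfo(name="flags/flags.txt")
    info.size = len(payload)
    tar.addfile(info, io.BytesIO(payload))
demo = buffer.getvalue()

print(demo[:2] == b"\x1f\x8b")  # gzip magic bytes at the start
print(list_members(demo))
print(read_member(demo, "flags/flags.txt").decode())
```

Listing the members first is a reasonable habit with untrusted archives, since it shows what would be written before anything touches the disk.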

Questions

This file initially looks like something green, what's the file format of this green file?

How many files can be extracted from the binary blob?

What is the hidden flag in the file?

©️ 2024 Cyber Skyline. All Rights Reserved. Unauthorized reproduction or distribution of this copyrighted work is illegal.