Many applications that place user input into file paths implement some kind of defense against path traversal attacks, and these can often be circumvented.
If an application strips or blocks directory traversal sequences from the user-supplied filename, then it might be possible to bypass the defense using a variety of techniques.
You might be able to use an absolute path from the filesystem root, such as filename=/etc/passwd
, to directly reference a file without using any traversal sequences.
You might be able to use nested traversal sequences, such as ....//
or ....\/
, which will revert to simple traversal sequences when the inner sequence is stripped.
In some contexts, such as in a URL path or the filename parameter of a multipart/form-data request, web servers may strip any directory traversal sequences before passing your input to the application. You can sometimes bypass this kind of sanitization by URL encoding, or even double URL encoding, the ../
characters, resulting in %2e%2e%2f
or %252e%252e%252f
respectively. Various non-standard encodings, such as ..%c0%af or ..%ef%bc%8f
, may also do the trick.
If an application requires that the user-supplied filename must start with the expected base folder, such as /var/www/images
, then it might be possible to include the required base folder followed by suitable traversal sequences. For example:
filename=/var/www/images/../../../etc/passwd
If an application requires that the user-supplied filename must end with an expected file extension, such as .png, then it might be possible to use a null byte to effectively terminate the file path before the required extension. For example:
filename=../../../etc/passwd%00.png
The most effective way to prevent file path traversal vulnerabilities is to avoid passing user-supplied input to filesystem APIs altogether. Many application functions that do this can be rewritten to deliver the same behavior in a safer way.
If it is considered unavoidable to pass user-supplied input to filesystem APIs, then two layers of defense should be used together to prevent attacks:
The application should validate the user input before processing it. Ideally, the validation should compare against a whitelist of permitted values. If that isn't possible for the required functionality, then the validation should verify that the input contains only permitted content, such as purely alphanumeric characters. After validating the supplied input, the application should append the input to the base directory and use a platform filesystem API to canonicalize the path. It should verify that the canonicalized path starts with the expected base directory.
Below is an example of some simple Java code to validate the canonical path of a file based on user input:
File file = new File(BASE_DIRECTORY, userInput);
if (file.getCanonicalPath().startsWith(BASE_DIRECTORY)) {
// process file
}