In this article, we will see how to validate the extensions of uploaded files and make our applications safer against malicious uploads.

When receiving files uploaded by users, relying solely on file names to determine file types is risky, because anyone can disguise a file by changing its extension. Therefore, we need a more reliable method of verification.

To download the source code for this article, you can visit our GitHub repository.

What Is a File Signature?

A file signature, also known as a magic number, is a characteristic sequence of bytes, usually at the beginning of a file, that identifies its format. Unlike file names, which users can easily manipulate, file signatures reflect the actual content, giving us a far more reliable indication of the true nature of a file.
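
To make the idea concrete, here is a minimal sketch that checks whether a stream starts with the well-known eight-byte PNG signature. The SignatureDemo class and the LooksLikePng() method are hypothetical names used only for illustration; they are not part of the project we build in this article:

```csharp
using System;
using System.IO;

public static class SignatureDemo
{
    // Every PNG file starts with these eight bytes.
    private static readonly byte[] PngSignature = { 0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A };

    // Returns true when the stream begins with the PNG signature.
    public static bool LooksLikePng(Stream stream)
    {
        Span<byte> header = stackalloc byte[8];

        // Keep reading until the buffer is full or the stream ends.
        int bytesRead = stream.ReadAtLeast(header, header.Length, throwOnEndOfStream: false);

        return bytesRead == header.Length && header.SequenceEqual(PngSignature);
    }
}
```

A text file renamed to photo.png would fail this check, because its first bytes are ordinary characters rather than the PNG signature.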

Create a Project to Validate File Upload Extensions

Our goal is to create a basic system that accepts image and PDF uploads and verifies each file in two steps:

  1. Use the extension of the uploaded file to determine whether it belongs to one of the acceptable formats, PDF or image
  2. Verify that the file content truly corresponds to the type indicated by its extension. We perform this step only if the first one succeeds

Create the Initial Project

Let’s create a new Web API project using Visual Studio, specifying the project name and target framework. Make sure to keep the ‘Use controllers’ option checked, as we’ll use standard controllers, not Minimal APIs.

Add Our Main Classes

Next, we need to define an abstract class named FileFormatDescriptor, which will act as a base class for specific file-type implementations:

public abstract class FileFormatDescriptor
{
    protected FileFormatDescriptor()
    {
        Initialize();
        MaxMagicNumberLength = MagicNumbers.Max(m => m.Length);
    }

    protected abstract void Initialize();

    protected HashSet<string> Extensions { get; } = new HashSet<string>(StringComparer.OrdinalIgnoreCase);
    protected List<byte[]> MagicNumbers { get; } = [];
    protected int MaxMagicNumberLength { get; }
    protected string TypeName { get; set; }

    public bool IsIncludedExtention(string extention) => Extensions.Contains(extention);

    public Result Validate(IFormFile file)
    {
        using var stream = file.OpenReadStream();
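        // Signatures are short, so a stack-allocated buffer is enough here.
        Span<byte> initialBytes = stackalloc byte[MaxMagicNumberLength];
        // Read() may return fewer bytes than requested, so read until the buffer is full or the stream ends.
        int bytesRead = stream.ReadAtLeast(initialBytes, initialBytes.Length, throwOnEndOfStream: false);

        foreach (var magicNumber in MagicNumbers)
        {
            // Compare only bytes that were actually read from the file.
            if (bytesRead >= magicNumber.Length && initialBytes[..magicNumber.Length].SequenceEqual(magicNumber))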
            {
                return new Result(true, Status.GENUINE, $"{Status.GENUINE} {TypeName}");
            }
        }

        return new Result(false, Status.FAKE, $"{Status.FAKE} {TypeName}!");
    }
}

In our class, we have Extensions and MagicNumbers properties that store supported file extensions and their corresponding file signatures.

The Validate() method is responsible for performing the validation. It takes an IFormFile object as input. In this method, we open a stream, read the initial bytes, and then compare them with the stored signatures to validate the file.

Since most file formats have relatively short signatures, typically fewer than 16 bytes, we read them into a Span<byte> created with stackalloc. If a signature requires a much larger buffer, for example more than 256 bytes, we should rent the buffer from ArrayPool or use another heap-based approach instead.
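
As a rough sketch of that fallback, the helper below uses the stack for small signatures and rents a buffer from ArrayPool for larger ones. The HeaderReader class, the StartsWith() method, and the 256-byte StackLimit threshold are assumptions made for this illustration, not part of the article's project:

```csharp
using System;
using System.Buffers;
using System.IO;

public static class HeaderReader
{
    // Hypothetical threshold above which we stop using the stack.
    private const int StackLimit = 256;

    // Returns true when the stream begins with the given signature.
    public static bool StartsWith(Stream stream, byte[] signature)
    {
        byte[]? rented = null;

        // Small buffers live on the stack; larger ones are rented from the shared pool.
        Span<byte> buffer = signature.Length <= StackLimit
            ? stackalloc byte[StackLimit]
            : (rented = ArrayPool<byte>.Shared.Rent(signature.Length));

        try
        {
            // Rent() may hand back a larger array, so trim to the exact length.
            buffer = buffer[..signature.Length];
            int bytesRead = stream.ReadAtLeast(buffer, buffer.Length, throwOnEndOfStream: false);

            return bytesRead == signature.Length && buffer.SequenceEqual(signature);
        }
        finally
        {
            // Always give a rented array back to the pool.
            if (rented is not null)
            {
                ArrayPool<byte>.Shared.Return(rented);
            }
        }
    }
}
```

Returning the array in a finally block keeps the pool healthy even if reading from the stream throws.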

Note that because signatures vary in length, as is the case with PNG and JPG image files, we read as many bytes as the longest signature stored in the MagicNumbers list so that any of the stored signatures can be matched.

Now, we need a simple record to hold the result of the validation method, indicating whether the file is acceptable or not, along with an appropriate message:

public record Result(bool Acceptable, Status Status, string Message);
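
The Result record depends on a Status type that the article does not list. A plausible definition, assuming it is a plain enum whose members are exactly the three names referenced in the code (GENUINE, FAKE, and NOT_SUPPORTED), could look like this:

```csharp
// Assumed definition: the member names are taken from how Status is used in this article's code.
public enum Status
{
    GENUINE,
    FAKE,
    NOT_SUPPORTED
}
```

Interpolating an enum value into a string yields the member name, so a message such as $"{Status.FAKE} {TypeName}!" reads "FAKE PDF FILE!" for the Pdf class. Keep in mind that with the default System.Text.Json settings the Status property itself is serialized as a number in the JSON response.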

Next, we define two classes, Pdf and Image, which inherit from the abstract class FileFormatDescriptor. Each class represents a supported file format within our system. We will initialize the classes with their respective extensions and signatures.

Let’s start with the Pdf class:

public class Pdf : FileFormatDescriptor
{
    protected override void Initialize()
    {
        TypeName = "PDF FILE";
        Extensions.Add(".pdf");
        MagicNumbers.Add([0x25, 0x50, 0x44, 0x46]);
    }
}

Then, we will proceed by adding the Image class:

public class Image : FileFormatDescriptor
{
    protected override void Initialize()
    {
        TypeName = "IMAGE FILE";
        Extensions.UnionWith([".jpeg", ".jpg", ".png"]);
        MagicNumbers.AddRange(new byte[][]
        {
            [0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A],
            [0xFF, 0xD8, 0xFF, 0xE0],
            [0xFF, 0xD8, 0xFF, 0xE1],
            [0xFF, 0xD8, 0xFF, 0xE2],
            [0xFF, 0xD8, 0xFF, 0xE3]
        });
    }
}

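Because we treat all images as a single category, we combine the JPG and PNG formats into one class. As a result, a file with a .png extension that actually contains JPG data still passes validation. If we need a stricter check, we can extract each format into its own class.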

Implement Our Static Validator Class

Let’s now centralize our validation process by creating a static class that maintains a list of allowed type classes and utilizes their respective Validate() methods:

public static class FileValidator
{
    private static readonly List<FileFormatDescriptor> AllowedFormats = [new Image(), new Pdf()];

    public static Result Validate(IFormFile file)
    {
        var fileExtension = Path.GetExtension(file.FileName);
        var targetType = AllowedFormats.FirstOrDefault(x => x.IsIncludedExtention(fileExtension));

        if (targetType is null)
        {
            return new Result(false, Status.NOT_SUPPORTED, $"{Status.NOT_SUPPORTED}");
        }

        return targetType.Validate(file);
    }
}

The validation starts by extracting the uploaded file’s extension and then determines the supposed file type by searching for a matching extension in the AllowedFormats list. If no format supports the extension, we simply return a response stating that the file type is not supported. Otherwise, we pass the file to the Validate() method of the matched descriptor, targetType, to verify that the content of the file matches the expected format.

Add an Upload Endpoint

To verify uploaded files, we need an action that calls our validator. So let’s define the UploadController, which will exercise our FileValidator:

[Route("api/[controller]")]
[ApiController]
public class UploadController : ControllerBase
{
    [HttpPost]
    public IActionResult Upload(IFormFile file)
    {
        var result = FileValidator.Validate(file);

        return result.Acceptable ? Ok(result) : BadRequest(result);
    }
}

Testing With Postman

Let’s verify that our code works as expected. Because we have created a Web API, we will be using Postman for testing.

We create a new POST request and select the form-data option for the body. Then, we add a key-value pair where the key matches the name of the IFormFile parameter in the controller, file in our case, and the value is the file itself.
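
If we prefer to send the same request from code rather than from Postman, a minimal sketch with HttpClient types could build the form body as follows. The UploadRequestBuilder class and its Build() method are hypothetical helpers introduced only for this example:

```csharp
using System.Net.Http;
using System.Net.Http.Headers;

public static class UploadRequestBuilder
{
    // Builds a multipart/form-data body equivalent to the Postman form-data request.
    public static MultipartFormDataContent Build(byte[] fileBytes, string fileName)
    {
        var fileContent = new ByteArrayContent(fileBytes);
        fileContent.Headers.ContentType = new MediaTypeHeaderValue("application/octet-stream");

        var form = new MultipartFormDataContent();

        // The field name "file" must match the IFormFile parameter of the Upload action.
        form.Add(fileContent, "file", fileName);

        return form;
    }
}
```

We could then post the result with HttpClient.PostAsync() to the api/upload route of the running application. The host and port depend on the project's launch settings, so they are left out here.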

Let’s start by uploading a fake PNG file, that is, a file with a .png extension whose content is not a PNG image:

[Screenshot: fake image file]

The response indicates that the uploaded file is not acceptable, and our message property gives us a more readable explanation.

Now, let’s proceed with our testing by uploading a PDF file:

[Screenshot: true PDF file]

Finally, we will attempt to upload a Word document, which is an unsupported file type:

[Screenshot: not supported file type]

Great, the results meet our expectations. Building upon this, we can enhance our solution by implementing action filters, as detailed in our article Different Ways to Validate an Uploaded File in ASP.NET Core.

Conclusion

In this article, we have learned how to validate file upload extensions in .NET apps. We used file signatures as a simple yet effective check. Despite its simplicity, this approach adds an extra layer of protection against files that are disguised by a misleading extension.
