THINKING ENERGY
Custom File Uploads to S3

Written by David Yu

Software Engineer | B. Commerce & B. Science, University of Western Australia | David is a software engineer with experience building enterprise-grade software systems.

November 30, 2020

There are three kinds of geeks at Gridcognition: energy geeks, data science geeks, and software geeks.

I’m the third kind and I thought I’d share what I learnt last week when I was building a file uploader for Amazon S3.

Making your own file uploader for S3 can be tricky due to a lack of guides and documentation (and unhelpful error messages!).
This guide will give you a working base from which to build your own S3 file uploader.

What we will be building: a simple React file uploader that gets a signed URL from the backend and uploads to S3 via that URL.

 

Backend

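The backend code is straightforward. You can use either getSignedUrl, which here signs a putObject request that the client sends with the HTTP PUT method, or createPresignedPost, which produces a form upload that the client sends with the HTTP POST method. We will use getSignedUrl since it is the more commonly used of the two.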

import AWS from 'aws-sdk';

function getUploadUrl(bucketName, fileKey) {
  AWS.config.update({region: 'ap-southeast-2'});
  const s3 = new AWS.S3();

  const params = {
    Bucket: bucketName,
    Key: fileKey, // Path and file name to save under, e.g. folderName/file.txt (no leading slash)
    ACL: 'private',
    ContentType: 'application/octet-stream'
  };

  return s3.getSignedUrl('putObject', params); // Returns a url which any http client can use to upload data
}

 

This function will live in your backend and is normally called only after you have authenticated and authorized the request. Depending on your backend setup, you will need to give it the right permissions to access S3. Note that it will generate a seemingly valid URL even if you do not have the correct permissions. However, once you use the URL, S3 will respond with an access error.
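
As a rough sketch of the "authorize first, then sign" flow described above, the handler below only calls getSignedUrl once an authorization check has passed. The names handleUploadUrlRequest and isAuthorized and the shape of the request object are hypothetical, the s3 client is passed in so it can be swapped or stubbed, and the 5 minute expiry is just an example value.

```javascript
// Hypothetical request handler: only signs a URL once the caller is authorized.
// `s3` is expected to be an AWS.S3 instance (aws-sdk v2); `isAuthorized` is your own check.
function handleUploadUrlRequest(request, { s3, bucketName, isAuthorized }) {
  if (!isAuthorized(request.user, request.fileKey)) {
    return { status: 403, body: 'Forbidden' };
  }

  const params = {
    Bucket: bucketName,
    Key: request.fileKey,
    ContentType: 'application/octet-stream', // the browser must send this same header
    Expires: 300, // URL lifetime in seconds (assumed value; the SDK default is 900)
  };

  return { status: 200, body: s3.getSignedUrl('putObject', params) };
}
```

Keeping the expiry short limits how long a leaked URL stays usable, at the cost of failing uploads that start late.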

 

CORS

If you intend to upload from a web browser, you will need to configure CORS on the S3 bucket. This is under the Permissions tab of your S3 bucket.

You can adjust the configuration however you like, but I would suggest starting with the most permissive setup and then, once you have everything working, tightening the security, because a mistake in the config here can lead to errors down the road that are very hard to debug. Here is an example config:

[
    {
        "AllowedHeaders": [
            "*"
        ],
        "AllowedMethods": [
            "GET",
            "PUT",
            "POST",
            "DELETE",
            "HEAD"
        ],
        "AllowedOrigins": [
            "*"
        ],
        "ExposeHeaders": []
    }
]
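
Once uploads are working, the rule can be narrowed. The helper below is a sketch that builds a stricter rule set for a list of origins and prints it as JSON. buildCorsRules and the example origin are hypothetical, and allowing only PUT with a Content-Type header assumes the presigned PUT flow used in this guide; if your client sends other headers or methods, they need to be listed too.

```javascript
// Hypothetical helper: builds a stricter CORS rule set for the presigned PUT flow.
function buildCorsRules(allowedOrigins) {
  return [
    {
      AllowedHeaders: ['Content-Type'], // the only custom header the uploader sends
      AllowedMethods: ['PUT'], // presigned uploads in this guide use PUT
      AllowedOrigins: allowedOrigins, // e.g. the origin your frontend is served from
      ExposeHeaders: [],
    },
  ];
}

// Paste the printed JSON into the bucket's CORS configuration.
console.log(JSON.stringify(buildCorsRules(['https://app.example.com']), null, 4));
```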

Frontend

The HTML can be the normal file input component.

<input type="file" multiple />

Then listen for the change event to grab the files you want to upload. If you are using a framework, this event will be exposed in various ways, e.g. in React it's onChange. We will use React as the example to build the upload file component since it's rare to use vanilla JavaScript anymore…

function S3FileUploader() {
  // Immediately upload files when the user selects them.
  // You can also store the selected files in state to defer uploading for later
  async function uploadFiles(event) {
    // event.target.files is a FileList, not an array; convert it for convenience
    const filesToUpload = Array.from(event.target.files);

    // Normally you would now store those files in state

    const promises = filesToUpload.map(async (file) => {
      // Assumes the backend endpoint responds with the signed URL as plain text
      const response = await fetch('<https://yourbackendcodeyouwroteearlier>');
      const signedUrl = await response.text();
      // Upload your file
      await fetch(signedUrl, {
        method: 'PUT',
        body: file,
        headers: { 'Content-Type': 'application/octet-stream' },
      });
    });

    // Upload files in parallel
    await Promise.all(promises);

    // That's it, your files are uploaded 🎉
  }

  return (
    <input type="file" multiple onChange={uploadFiles} />
  )
}

 

That’s all the code you need to upload your files. It’s quite straightforward, but there are a few tricks here. You don’t use FormData, contrary to most file upload guides: a presigned PUT stores the request body exactly as sent, so wrapping the file in FormData will corrupt your data. event.target.files is a FileList rather than an array, so calling array methods such as map directly on it will throw an error. Lastly, you must set the Content-Type header to the value you defined in the backend, otherwise you will get a cryptic error (typically a signature mismatch).
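
One more thing worth knowing: fetch does not reject on HTTP error statuses, so a failed upload can pass silently unless you check the response. Below is a minimal sketch of a helper that fails loudly and includes S3's error body in the message. uploadToSignedUrl is a hypothetical name, and the fetchImpl parameter is only there so the function can be stubbed.

```javascript
// Hypothetical helper: PUT a file to a presigned URL and fail loudly on a non-2xx response.
async function uploadToSignedUrl(signedUrl, file, contentType, fetchImpl = fetch) {
  const response = await fetchImpl(signedUrl, {
    method: 'PUT',
    body: file, // the raw File/Blob, not FormData
    headers: { 'Content-Type': contentType }, // must match the ContentType that was signed
  });

  if (!response.ok) {
    // S3 describes the failure in the response body, so include it in the error
    const details = await response.text();
    throw new Error(`Upload failed with status ${response.status}: ${details}`);
  }
}
```

Logging the thrown message during development makes a mismatched Content-Type or a missing bucket permission much easier to spot.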

 

Suggestions

Building your own file upload component is actually very hard. There are a lot of things you need to get right, which is why companies have emerged offering file uploads as a service. Some things to consider adding:

  • Scan files for malware
  • Compress files before uploading them
  • Chunk large files and upload the parts in parallel to improve reliability and performance
  • Tighten your cors configuration
  • Provide user feedback on the progress of a file upload
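
For the chunking suggestion, S3's multipart upload requires every part except the last to be at least 5 MiB. The sketch below only covers the part-splitting arithmetic; getPartRanges is a hypothetical name and the 8 MiB default is an arbitrary choice. Each range would be passed to file.slice(start, end) and uploaded as its own part, which is not shown here.

```javascript
const MIN_PART_SIZE = 5 * 1024 * 1024; // S3 minimum for every part except the last

// Hypothetical helper: split a file of `fileSize` bytes into byte ranges for a multipart upload.
function getPartRanges(fileSize, partSize = 8 * 1024 * 1024) {
  if (partSize < MIN_PART_SIZE) {
    throw new RangeError('partSize must be at least 5 MiB');
  }

  const ranges = [];
  for (let start = 0; start < fileSize; start += partSize) {
    ranges.push({
      partNumber: ranges.length + 1, // S3 part numbers start at 1
      start,
      end: Math.min(start + partSize, fileSize), // exclusive, as Blob.slice expects
    });
  }
  return ranges;
}
```

Smaller parts mean less data to resend when one part fails, while larger parts mean fewer requests overall.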

 

If you’re a software engineer and you are interested in accelerating the energy system transition come and join the team.
