File Uploads

Now that we've added forms and form uploading to our application, we may want the additional ability to upload files. However, there is a bit more involved when uploading a file than there is for other kinds of form data.

Consider uploading an image file to our website. In addition to our raw image data (in binary), we probably want to send metadata describing the file itself - the filename, the file format, etc. While some of this information may be contained within the binary of the file itself (remember, files are typically themselves structured into a header and body section), in order to use it we would need to implement a file-reading strategy for every kind of image file we might encounter.

If we think more broadly than image files and consider all files, the enormity of this task becomes apparent: there are thousands of existing file types, and more are defined every day. Thankfully, the design of HTTP considered this, and came to the conclusion that HTTP servers should be able to treat file uploads as arbitrary blobs of data. Any metadata that the server should know about is added as extra content headers before the binary blob. Finally, we may want to upload other form data along with the files, and possibly multiple files.

Multipart Encoding

To support this, one possible format for the body is multipart encoding. Normal HTML forms are sent with a content-type of application/x-www-form-urlencoded, but multipart forms have a content-type of multipart/form-data. The intent to send a multipart form is signaled in the HTML form by setting the enctype attribute to multipart/form-data. Otherwise, the form defaults to regular form encoding, and the file data is not included.

Let's add a function to generate a form for uploading an image to our gallery:

    function generateImageForm() {
      return '<form enctype="multipart/form-data" action="/" method="POST">' +
        ' <input type="file" name="image">' +
        ' <input type="submit" value="Upload Image">' +
        '</form>';
    }

We can then invoke this function within our serveGallery() function as we build our HTML, embedding the resulting form in our Gallery page.

Multipart Boundaries

Additionally, when the content-type is multipart/form-data, an additional key/value pair for the boundary must be supplied, for a full header line like:

    Content-Type: multipart/form-data; boundary=----WebKitFormBoundaryA9ze2K1Ij2cA8X4T

The actual value of boundary is arbitrary, and decided by the user agent submitting the files (typically the browser). However, there is one caveat. It must be a series of bytes that does not occur within the contents of the form and files being sent.

This boundary is then used to separate the form data from the file data. Say we upload two files as part of a form; the body of our HTTP request would then consist of three parts, separated by our boundary sequence (let's say my-boundary for the purposes of this example):

    --my-boundary
    <file 1 content>
    --my-boundary
    <file 2 content>
    --my-boundary
    <form content>
    --my-boundary--

Again, the boundary is arbitrary, and decided by the user-agent submitting the files. The only requirement is that it be a unique sequence of bytes not found in any of the content sections, as this is used to separate those sections. Also, the specification adds that two hyphens (--) appear before the boundary bytes, and that the final boundary is followed by two more hyphens. Additionally, a new line (consisting of a carriage return and line feed) appears at the end of each boundary line and content section. Full details can be found in RFC 7578.
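To make the placement of the hyphens and line endings concrete, here is a minimal sketch that assembles a multipart body by hand. The helper name buildMultipartBody, the field names, and the sample contents are all made up for illustration; a real browser builds this for us.

```javascript
// Hypothetical helper: builds a multipart/form-data body by hand so the
// position of "--", the boundary, and every CRLF can be seen explicitly.
const CRLF = '\r\n';

function buildMultipartBody(boundary, parts) {
  const chunks = [];
  parts.forEach(function(part) {
    // Each part opens with "--" + boundary + CRLF
    chunks.push(Buffer.from('--' + boundary + CRLF));
    // Part headers, then an empty line, then the content itself
    chunks.push(Buffer.from(part.headers.join(CRLF) + CRLF + CRLF));
    chunks.push(Buffer.isBuffer(part.content) ? part.content : Buffer.from(part.content));
    // A CRLF closes the content section
    chunks.push(Buffer.from(CRLF));
  });
  // The closing boundary carries an extra trailing "--"
  chunks.push(Buffer.from('--' + boundary + '--' + CRLF));
  return Buffer.concat(chunks);
}

const exampleBody = buildMultipartBody('my-boundary', [
  {
    headers: ['Content-Disposition: form-data; name="title"'],
    content: 'Sunset'
  },
  {
    headers: [
      'Content-Disposition: form-data; name="image"; filename="a.bin"',
      'Content-Type: application/octet-stream'
    ],
    content: Buffer.from([0x01, 0x02, 0x03])
  }
]);

// JSON.stringify makes the \r\n sequences visible when printed
console.log(JSON.stringify(exampleBody.toString('latin1')));
```

Printing the result through JSON.stringify shows every carriage return and line feed as an escape sequence, which is handy when comparing against a request captured from a browser.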

In order to process the HTTP body at the server side then, we first must split it into the individual parts using the value of the boundary.

File Contents

For each file contents section, there is again a header section which typically has a few lines of key-value pairs providing metadata describing the file. For example:

    Content-Disposition: form-data; name="image"; filename="smiling.jpeg"
    Content-Type: image/jpeg

The Content-Disposition identifies the form field this file corresponds to (image), as well as the filename. The Content-Type provides the mime-type of the file. An extra empty line signifies the end of the content headers; all bytes after this are the binary contents of the file itself.

Parsing Multipart Content

With this structure in mind, we can now turn to the task of parsing a multipart request in our server. This process consists of three steps:

  1. Receive the streaming HTTP body
  2. Split the body using the boundary specified in the HTTP headers
  3. For each body content:
    • Split the content header and content body
    • Parse the content header
    • Decide what to do with the content body

This process suggests several functions, each consuming and transforming the incoming request, until the end of the chain where we must decide what to do with the transformed data.

This sounds like a lot of functions, and our server.js is already starting to get crowded... so let's modularize this code by placing it into another file.

User-Defined Modules in Node

You've already used core modules like fs and http in this example. But Node also lets us define our own modules using the CommonJS module pattern. We start by placing the module code in its own file, and defining what we want to export by assigning it to module.exports (the module object is provided by Node to each module file), i.e.

module.exports = function() { return "Hello World"; }

Or

    module.exports = {
      PI: 3.14159265359,
      sine: function(angle) {...},
      cosine: function(angle) {...}
    }

Or

    module.exports = class Bear {
      constructor() {}
      growl() {}
      ...
    }

Essentially, we can create a module that exports a function, class, or object.

These can then be required in other code with a require() call and passing the path to our file, i.e.

var myModule = require('./my-module-file');

Note the .js extension does not need to be included in the file path, though it can be.

Additionally, we can require JSON files, which are then deserialized into a JavaScript object.
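As a quick illustration of both cases, the sketch below writes a small module and a JSON file into a scratch directory and then requires them. The file names circle.js and settings.json and their contents are invented for this example only.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Create a scratch directory so the example does not touch the project.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'module-demo-'));

// A hypothetical module exporting an object with a constant and a function.
fs.writeFileSync(path.join(dir, 'circle.js'),
  'module.exports = {\n' +
  '  PI: 3.14159265359,\n' +
  '  area: function(r) { return this.PI * r * r; }\n' +
  '};\n');

// A JSON file; require() parses it into a plain JavaScript object.
fs.writeFileSync(path.join(dir, 'settings.json'), '{"maxUploadBytes": 1048576}');

// The .js extension is omitted here; the .json extension is kept.
const circle = require(path.join(dir, 'circle'));
const settings = require(path.join(dir, 'settings.json'));

console.log(circle.area(2));
console.log(settings.maxUploadBytes);
```

Note that require() caches what it loads, so requiring the same file twice returns the same exported object.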

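GOTCHA: ECMAScript now has a very different module style, using the import and export keywords. Current versions of Node support that style as well (for example in .mjs files, or when package.json sets "type" to "module"), but it follows different rules than require() and module.exports. This example uses the CommonJS style throughout.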

Defining the Multipart Module

Let's go ahead and move our form parsing code into its own module, located in multipart.js. This module will export an asynchronous function that takes a http.IncomingMessage (our req parameter), a http.ServerResponse (our res parameter) and a callback function that is invoked once the form data has been received and parsed.

Since we have a well-developed idea of our process, we'll also take a step towards literate programming by describing our functionality with comments that will guide our code-writing efforts.

    // multipart.js
    module.exports = multipart;

    /** @function multipart
     * Parses the content of a multipart request body,
     * and attaches it to the request object as the body
     * parameter, before invoking the next function with
     * the modified req and the untouched res. If an error
     * occurs, we'll short-circuit with a server error page.
     * @param {http.IncomingMessage} req - the request object
     * @param {http.ServerResponse} res - the response object
     * @param {function} next - a callback to invoke if the
     * body is successfully parsed.
     */
    function multipart(req, res, next) {}

    /** @function processBody
     * Process the complete HTTP request body using the
     * provided boundary. Use the supplied callback to
     * (return) an error or the parsed body.
     * @param {Buffer} buffer - the complete HTTP request body
     * @param {String} boundary - the bytes separating content
     * parts within the body
     * @param {Function} callback - a function to invoke when
     * done processing the body. The first parameter is an
     * error object, the second the parsed body as an
     * associative array keyed on the input name attribute.
     */
    function processBody(buffer, boundary, callback) {}

    /** @function splitContentParts
     * Splits a multipart body into the individual content
     * parts using the supplied boundary bytes.
     * @param {Buffer} buffer - the multipart body to split
     * @param {String} boundary - the bytes that separate
     * content parts in the buffer
     * @param {Function} callback - a function to invoke when
     * processing is done. The first parameter is an error
     * object, the second an array of content buffers.
     */
    function splitContentParts(buffer, boundary, callback) {}

    /** @function parseContent
     * Parses a content buffer, providing the contents
     * as a key/value pair in an array, where the key
     * is the name attribute of the input, and the value
     * is the body, or, if content is a file, an object
     * with filename, contentType, and data attributes.
     * @param {Buffer} buffer - the content buffer
     * @param {Function} callback - a function to invoke when
     * done processing.
     */
    function parseContent(buffer, callback) {}

As before, we'll need to define event handlers on the req parameter to handle the streaming body. We'll start with one to handle errors:

req.on('error', function(err){ callback(err); });

If an error happens, we simply trigger the callback function passing the error as the first argument. Because we don't pass a second argument, it will be undefined.

The next event handler needs to deal with the data event, which is triggered every time a new chunk of the request body is streamed across the internet to our server. These chunks are basically binary buffers at this point; we'll need to collect them and combine them once we have them all. We'll keep them in an array chunks:

    var chunks = [];
    req.on('data', function(chunk) {
      chunks.push(chunk);
    });

Finally, we need to deal with the end event, which is triggered once all streaming data for the request is received. At this point we can reassemble the full body as a binary data buffer, and then go through the transformation process we outlined earlier.
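The three handlers can be tried without a running server. In this sketch a PassThrough stream stands in for the request object, and collectBody is a hypothetical helper name used only here; it shows the chunks being gathered and joined with Buffer.concat() once the end event fires.

```javascript
const { PassThrough } = require('stream');

// Hypothetical helper: gathers every chunk of a readable stream and hands
// the combined Buffer to an error-first callback.
function collectBody(stream, callback) {
  const chunks = [];
  stream.on('error', function(err) { callback(err); });
  stream.on('data', function(chunk) { chunks.push(chunk); });
  stream.on('end', function() { callback(null, Buffer.concat(chunks)); });
}

// A PassThrough stream stands in for the request object in this example.
const fakeReq = new PassThrough();
let collected = null;

collectBody(fakeReq, function(err, body) {
  if (err) throw err;
  collected = body;
  console.log(body.length + ' bytes received: ' + body.toString());
});

// Simulate a body arriving in three separate pieces
fakeReq.write(Buffer.from('first '));
fakeReq.write(Buffer.from('second '));
fakeReq.end(Buffer.from('third'));
```

The callback runs asynchronously, after the last chunk has been delivered, which is why any code that needs the full body has to live inside it.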

There is one key piece of data in our header that we need for that transformation process: the sequence of bytes that defines the separation between the different content parts in the multipart request body. The boundary is defined as part of the header's Content-Type value, which comes to us in the form:

Content-Type: multipart/form-data; boundary=<BOUNDARY>

We'll need to extract the actual boundary from this header; the easiest way to do so is a regular expression, i.e.:

    var match = /boundary=([^;]+)/.exec(req.headers['content-type']);

Here the () indicate a capture group - we want to collect any characters that appear within the parentheses. The [^;] is a negated character class that matches any character except a semicolon, and the + modifies it to indicate we are looking for one or more such characters. Since key/value pairs in a header are separated by semicolons, the capture stops at the next semicolon if there is one, or at the end of the header if this is the last key/value pair. (A plain .+ followed by an optional semicolon would not work here: .+ is greedy, so it would swallow the semicolon and everything after it.) You can learn more about regular expressions in JavaScript in the MDN Regular Expressions Guide.

The RegExp's exec() method returns null if no match for the regular expression is found in the supplied string. If one is found, an array is the return value, with the first item being the full matching string. All subsequent items in the array are the text within the capture groups, in the order they were specified. So in our case, we'll want the item at index 1 - this is our boundary. If it doesn't exist, we are dealing with a malformed request, and should pass on an error to our callback. Otherwise, we can start the transformation chain with a call to processBody(), which we'll write shortly.

Thus, our full end event handler looks like:

    req.on('end', function() {
      var body = Buffer.concat(chunks);
      var match = /boundary=([^;]+)/.exec(req.headers['content-type']);
      if(match && match[1]) {
        callback(null, processBody(body, match[1]));
      } else {
        callback(new Error("No multipart boundary defined."));
      }
    });

And the full module.exports:

    module.exports = function(req, callback) {
      var chunks = [];

      // Handle error events by passing the error
      // to the callback function
      req.on('error', function(err) {
        callback(err);
      });

      // Handle data events by appending the new
      // data to the chunks array.
      req.on('data', function(chunk) {
        chunks.push(chunk);
      });

      // Handle end events by assembling the chunks
      // into a single buffer and passing that to the
      // processBody function, and sending its results
      // to the callback function. Also, supply the
      // boundary bytes defined in our header.
      req.on('end', function() {
        var body = Buffer.concat(chunks);
        var match = /boundary=([^;]+)/.exec(req.headers['content-type']);
        if(match && match[1]) {
          callback(null, processBody(body, match[1]));
        } else {
          callback(new Error("No multipart boundary defined."));
        }
      });
    }
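The boundary lookup can also be tried on its own against a few header values. This is a minimal sketch: extractBoundary is a hypothetical helper name that is not part of the module, and its pattern deliberately stops the capture at the first semicolon so that a trailing parameter is not mistaken for part of the boundary.

```javascript
// Hypothetical helper wrapping the boundary lookup.
// Returns the boundary string, or null when the header has none.
function extractBoundary(contentType) {
  // exec() returns null on no match, otherwise an array whose
  // item at index 1 is the text of the first capture group.
  var match = /boundary=([^;]+)/.exec(contentType || '');
  return match ? match[1] : null;
}

var fromBrowser = extractBoundary(
  'multipart/form-data; boundary=----WebKitFormBoundaryA9ze2K1Ij2cA8X4T');
var withTrailingParam = extractBoundary(
  'multipart/form-data; boundary=abc123; charset=utf-8');
var notMultipart = extractBoundary('application/x-www-form-urlencoded');

console.log(fromBrowser);
console.log(withTrailingParam);
console.log(notMultipart);
```

The second call is the case worth checking: the captured value should contain neither the semicolon nor the charset parameter.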

Splitting the HTTP Body into Content Parts

As discussed above, a multipart HTTP body actually consists of multiple sections split by the sequence of boundary bytes. We'll want to pull out each of these sections from the main buffer into individual buffers.

Immensely helpful for this is the Buffer.slice() method. It creates a new Buffer from an existing one with an optionally specified start and end index. The true benefit here is that the new buffer actually references the same memory as the original buffer - since there is no copying, this is very efficient. (Recent Node releases document Buffer.subarray() as the preferred name for this same operation.)

Also useful is Buffer.indexOf(), which returns the index of the first instance of a supplied buffer, byte array, or string (remember, the buffer itself is binary data, which may or may not be characters). Since the boundary was defined for us as a sequence of characters, we can supply this to Buffer.indexOf(), and it will give us the starting point of those characters in the buffer (broken on byte boundaries).

You can find more information on the Buffer object in the Node Buffer documentation.
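A short experiment shows both methods at work, including the shared-memory behavior of slice(). The sample bytes and the XX marker are arbitrary values chosen for this sketch.

```javascript
var buf = Buffer.from('headerXXbody');
var marker = 'XX';

// indexOf() accepts a string, a Buffer, or a single byte value
var at = buf.indexOf(marker);

// slice() returns views onto the same memory, not copies
var head = buf.slice(0, at);
var body = buf.slice(at + marker.length);

// Writing through the view also changes the original buffer
head[0] = 0x48; // ASCII 'H'

console.log(at, head.toString(), body.toString(), buf.toString());

// A value that does not occur yields -1
console.log(buf.indexOf('missing'));
```

Because the views share memory with the original, the parts we slice out of a request body cost almost nothing to create, but they should be treated as read-only unless changing the original is intended.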

Remember that the boundary appears in the multipart body before and after each content part, and that each appearance of the boundary is preceded by -- and the last appearance additionally has a trailing --.

We can therefore find the start of the first content part at:

var start = buffer.indexOf('--' + boundary) + boundary.length + 4;

Remember, Buffer.indexOf() returns the starting position for the bytes in question, so we need to add the length of the boundary, plus an extra two for our leading -- and another two for the trailing CRLF in order to reach the first byte of our actual content.

From that start position, we can search for our next boundary using:

    var end = buffer.indexOf('--' + boundary, start);

Now we know that the bytes between start and end are our first content part, followed by the CRLF that closes the content section (the last two bytes before end). We may have an arbitrary number of these parts, and we want to process them all. How can we go about it?

A loop is a good solution. We can start at our previous end point and search for the next end index, capture that content part, and repeat. When do we know we are done? When we reach the end of our buffer, in which case there are no remaining boundaries, only the trailing --. If we try to locate a boundary index in this situation with Buffer.indexOf(), we'll get a -1 as our return value.

As we go through our loop, we'll want to add each content part to an array we return at the end of our function:

    function splitContentParts(buffer, boundary) {
      var parts = [];
      // Skip the leading --, the boundary, and the CRLF that follows it
      var start = buffer.indexOf('--' + boundary) + boundary.length + 4;
      var end = buffer.indexOf('--' + boundary, start);
      // invariant: the bytes between start and end
      // in buffer compose a content part. The value of
      // end must therefore be greater than start,
      // and both must fall between [0,buffer.length]
      while(end != -1) {
        // Drop the CRLF that sits just before the next boundary
        parts.push(buffer.slice(start, end - 2));
        start = end + boundary.length + 4;
        end = buffer.indexOf('--' + boundary, start);
      }
      return parts;
    }
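One way to sanity-check the loop's stopping condition is to list every offset at which the delimiter occurs in a small hand-written body. In this sketch, findAll is a hypothetical helper and the two form fields are made up; it relies only on Buffer.indexOf() with a start offset and its -1 return value.

```javascript
// Hypothetical helper: lists every offset at which `needle` occurs in `buffer`.
function findAll(buffer, needle) {
  var positions = [];
  var at = buffer.indexOf(needle);
  while (at !== -1) {
    positions.push(at);
    // Resume the search just past the match we found
    at = buffer.indexOf(needle, at + needle.length);
  }
  return positions;
}

var delimiter = '--my-boundary';
var sample = Buffer.from(
  delimiter + '\r\n' +
  'Content-Disposition: form-data; name="a"\r\n\r\n' +
  '1\r\n' +
  delimiter + '\r\n' +
  'Content-Disposition: form-data; name="b"\r\n\r\n' +
  '2\r\n' +
  delimiter + '--\r\n'
);

var offsets = findAll(sample, delimiter);
var last = offsets[offsets.length - 1];

// Two parts are framed by three delimiters
console.log(offsets.length);
// The bytes right after the final delimiter are the closing "--"
console.log(sample.slice(last + delimiter.length, last + delimiter.length + 2).toString());
```

A body with n content parts contains n + 1 delimiters, so a splitting loop that pushes one part per delimiter found after the first will collect exactly n parts before indexOf() returns -1.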

Parsing the Content Parts

Content parts themselves are composed of head and body sections. The head consists of header key/value pairs, using the same patterns as HTTP headers. The most common one will be Content-Disposition, which comes with all form-data types:

Content-Disposition: form-data; name="Gallery Time!"

The name is the name of the form field.

If the content part includes a file, then we will also see a filename, and there will typically be a second header specifying the Content-Type:

    Content-Disposition: form-data; name="image"; filename="87706825_o.jpg"
    Content-Type: image/jpeg

Each header appears on its own line, and the header section ends with an empty line. Knowing this, we can split the buffer on that double line end, which consists of the characters carriage return, line feed, carriage return, line feed. Let's define a constant Buffer for that sequence:

const DOUBLE_CRLF = Buffer.from([0x0D,0x0A,0x0D,0x0A]);

Splitting the content section buffer involves finding the first instance of these bytes, and using that index to partition the head and body:

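    var index = buffer.indexOf(DOUBLE_CRLF);
    // The head is everything before the double CRLF...
    var head = buffer.slice(0, index).toString();
    // ...and the body is everything after its four bytes
    var body = buffer.slice(index + 4);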

Since we know the head is text, we can go ahead and convert its buffer to a string. The body might be text or a binary file, so we'll leave it as a buffer for now.

The next step is parsing the headers. Since we know that each line is a separate header, we can start by splitting the string on CRLF. In turn, each line can be split on the :, which separates the key from the value.

    const CRLF = '\r\n';
    var headers = {};
    head.split(CRLF).forEach(function(line) {
      var parts = line.split(': ');
      var key = parts[0];
      // Rejoin the rest in case the value itself contains ': '
      var value = parts.slice(1).join(': ');
      headers[key] = value;
    });

Once we have the headers parsed, we can use a regular expression to extract the name and filename fields from the Content-Disposition header:

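    var name = /\bname="([^"]*)"/.exec(headers['Content-Disposition']);
    var filename = /filename="([^\\/:\*\?"<>\|]+)/.exec(headers['Content-Disposition']);

Both values arrive wrapped in double quotes, so each pattern matches the opening quote and captures what follows it. The \b keeps the name pattern from matching the tail end of filename, and the filename pattern stops at the closing quote or at any character that is not safe in a file name. If a filename was found, then the content is a file; otherwise it is the value of the form field.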
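The extraction can be checked against the two Content-Disposition examples from this section. This is a sketch under the assumption that values are always double-quoted, as they are in those examples; parseDisposition is a hypothetical helper name introduced only for the check.

```javascript
// Hypothetical helper: pulls the name and filename parameters out of a
// Content-Disposition value, assuming both are wrapped in double quotes.
function parseDisposition(value) {
  // \b prevents a match on the "name=" inside "filename="
  var name = /\bname="([^"]*)"/.exec(value);
  // Capture stops at the closing quote or at a character unsafe in file names
  var filename = /filename="([^\\/:\*\?"<>\|]+)/.exec(value);
  return {
    name: name ? name[1] : null,
    filename: filename ? filename[1] : null
  };
}

var fileField = parseDisposition(
  'form-data; name="image"; filename="87706825_o.jpg"');
var textField = parseDisposition('form-data; name="Gallery Time!"');

console.log(fileField);
console.log(textField);
```

A null filename is what distinguishes an ordinary form field from an uploaded file in the parsing step.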

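Let's wrap up all the previous steps into a parseContent() function, returning the field name and value as a two-element array. If the content was an image file, the value will be an object with filename, contentType, and data attributes.

    function parseContent(buffer) {
      // Partition the content part into its head and body
      var index = buffer.indexOf(DOUBLE_CRLF);
      var head = buffer.slice(0, index).toString();
      var body = buffer.slice(index + 4);

      // Parse the header lines into key/value pairs
      var headers = {};
      head.split('\r\n').forEach(function(line) {
        var parts = line.split(': ');
        headers[parts[0]] = parts.slice(1).join(': ');
      });

      var name = /\bname="([^"]*)"/.exec(headers['Content-Disposition']);
      var filename = /filename="([^\\/:\*\?"<>\|]+)/.exec(headers['Content-Disposition']);
      if(filename) {
        // A file: keep the body as a binary buffer
        return [name[1], {filename: filename[1], contentType: headers['Content-Type'], data: body}];
      } else {
        // An ordinary field: the body is text
        return [name[1], body.toString()];
      }
    }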

Processing the Content Parts

Finally, we can bring the splitting and parsing of the content together in the function we referenced at the beginning, processBody().

First, we'll use splitContentParts() to break the multipart content into an array of buffers. This we can iterate over with the Array.forEach() method, calling parseContent() on each. The returned key/value pair can then be added to the formData object.

    function processBody(buffer, boundary) {
      var formData = {};
      splitContentParts(buffer, boundary).forEach(function(content) {
        var parts = parseContent(content);
        formData[parts[0]] = parts[1];
      });
      return formData;
    }

Saving the Image

Now, let's switch back to our server.js file to handle the POST request containing the multipart data. First, we'll need to require our new multipart library:

var multipart = require('./multipart');

We can then invoke multipart(), passing it the request object when we know we're processing a multipart request. The second argument would be a callback that takes an error and the contents of the multipart form as arguments.

Remember the form we defined earlier has a file input named image, so the contents parameter should have an image attribute, which in turn has a filename and data. We can use fs.writeFile() with these parameters to write our image to the /images directory. After that, we just need to serve our gallery using our existing serveGallery() function. If we do this inside the fs.writeFile() callback, we ensure that the image is fully saved before we render our gallery (if not, the image might be missing until the user refreshes the page, which can be confusing).

Bringing that together as an uploadImage() function:

    function uploadImage(req, res) {
      multipart(req, function(err, content) {
        if(err) {
          console.error(err);
          res.statusCode = 500;
          res.end();
          return;
        }
        fs.writeFile('images/' + content.image.filename, content.image.data, function(err) {
          if(err) {
            console.error(err);
            res.statusCode = 500;
            res.end();
            return;
          }
          serveGallery(req, res);
        });
      });
    }

Now we just need to determine, in our request handler, what constitutes an image upload. If we were to treat any POST request to our gallery as an image upload, we could differentiate between GET and POST requests using the req.method member:

    if(req.method == 'GET') {
      serveGallery(req, res);
    } else if(req.method == 'POST') {
      uploadImage(req, res);
    }
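If we later want to be stricter than "any POST is an upload", one option is to also look at the request's content type before calling uploadImage(). The sketch below assumes only the method and headers properties of the request; isMultipartUpload is a hypothetical helper name, not part of our server so far.

```javascript
// Hypothetical helper: true only for POST requests whose Content-Type
// announces a multipart/form-data body.
function isMultipartUpload(req) {
  var type = (req.headers['content-type'] || '').toLowerCase();
  return req.method === 'POST' && type.indexOf('multipart/form-data') === 0;
}

// Plain objects stand in for http.IncomingMessage in this check
var upload = {
  method: 'POST',
  headers: { 'content-type': 'multipart/form-data; boundary=my-boundary' }
};
var plainForm = {
  method: 'POST',
  headers: { 'content-type': 'application/x-www-form-urlencoded' }
};
var pageView = { method: 'GET', headers: {} };

console.log(isMultipartUpload(upload), isMultipartUpload(plainForm), isMultipartUpload(pageView));
```

Node lowercases incoming header names, which is why the lookup uses content-type rather than Content-Type.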

If we restart our server and submit an image, it should appear in our gallery.

Large or Multiple Files

We can use the same method as in the previous post on form data to reject large files before they overwhelm our server. But sometimes, we need to be able to upload really large files for a good reason - say, uploading video files.

This is one reason the http.IncomingMessage provides event handlers for data and end events - rather than holding the full contents in memory, we can stream them into a temporary file, and later move and rename the file using the fs library. This approach also helps conserve memory when we expect lots of middling-size file uploads like pictures. While one upload alone won't strain our memory reserves, thousands will, and can lead to page thrashing and poor server performance.
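A minimal sketch of that streaming approach follows. It is not a full solution: saveToTempFile is a hypothetical helper, the temporary file naming is an assumption, and the file it writes holds the raw request body (boundaries and part headers included), which would still need to be split afterwards. A PassThrough stream again stands in for the request.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');
const { PassThrough } = require('stream');

// Hypothetical helper: streams an incoming body straight to a temporary
// file instead of collecting it in memory, then reports the file's path.
function saveToTempFile(stream, callback) {
  const tempPath = path.join(os.tmpdir(), 'upload-' + process.pid + '-' + Date.now());
  const file = fs.createWriteStream(tempPath);
  stream.on('error', callback);
  file.on('error', callback);
  // 'finish' fires once every byte has been handed off to the file
  file.on('finish', function() { callback(null, tempPath); });
  // pipe() moves chunks across as they arrive and handles backpressure
  stream.pipe(file);
}

// A PassThrough stream stands in for the request object in this example.
const fakeReq = new PassThrough();
let savedSize = null;

saveToTempFile(fakeReq, function(err, tempPath) {
  if (err) throw err;
  savedSize = fs.statSync(tempPath).size;
  console.log('stored ' + savedSize + ' bytes at ' + tempPath);
  // Clean up; a real handler would move or rename the file instead
  fs.unlinkSync(tempPath);
});

fakeReq.write('chunk one, ');
fakeReq.end('chunk two');
```

Only one chunk at a time is held in memory here, so the size of the upload no longer determines how much memory the request consumes.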