
Uploading Files to Google Cloud Storage in Go


The goal of this post is to submit form data to a web handler, and directly copy the file to a bucket in GCS. Upon a successful upload to the bucket, return the URL of the object as the web response. There are a handful of examples online and you’ll see a common theme between all of them:

  • The web request must have a Content-Type of multipart/form-data.
  • There is a snippet of code that looks like this:
func Upload(w http.ResponseWriter, r *http.Request) {
  err := r.ParseMultipartForm(32 << 20)
  // handle err
}

While there is nothing inherently wrong with the above code, it is important to note that ParseMultipartForm parses the whole body up front and keeps up to 32<<20 bytes (~32 MB) of the uploaded files in memory. Anything beyond that size is spilled to temporary files on disk. This works out fine for most situations, especially when the files are small. It is less great if you're building a cloud function, don't really have access to the disk, or have a limited memory capacity.
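For context, the conventional buffered approach that most examples demonstrate ends up looking roughly like the sketch below (the field name and size limit here are illustrative, not taken from any particular example):

func Upload(w http.ResponseWriter, r *http.Request) {
  // parse the whole form up front; up to ~32MB of file data is held in memory,
  // the remainder is written to temporary files on disk
  if err := r.ParseMultipartForm(32 << 20); err != nil {
    http.Error(w, "bad request", http.StatusBadRequest)
    return
  }

  // "myfile" is an assumed field name
  f, header, err := r.FormFile("myfile")
  if err != nil {
    http.Error(w, "file required", http.StatusBadRequest)
    return
  }
  defer f.Close()

  // f is backed by memory or a temp file; copy it wherever it needs to go
  fmt.Fprintf(w, "received %s (%d bytes)\n", header.Filename, header.Size)
}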

To reduce my memory footprint, I wanted to write the data to a bucket file as it’s being uploaded to my server/function/etc. I also want to be able to give the user the option to rename the file that will be stored in the bucket.

Shall we get started?

Standing Up the Infra #

I will assume you already have the gcloud command installed and have Application Default Credentials available on your system. The first thing to do is create a new bucket. If you already have one ready, skip this step.

gsutil mb gs://drt-upload-bkt

The goal is for this content to be publicly visible and served to anyone. We can update the default object permissions on the bucket so that anyone can view an object with the following command.

gsutil defacl ch -u AllUsers:R gs://drt-upload-bkt

Base Project and Stubbed Code #

Start a new Go project and run go mod init. For bucket access and object creation, we'll be using Google's google-cloud-go repo. Install the dependency with:

go get -u cloud.google.com/go/storage

With that out of the way, let’s start by stubbing out a reusable web handler to be used on the web server.

type Handler struct {
  // passing a bucket instead of a storage.Client
  bucket *storage.BucketHandle
  // logging is important!
  log *zap.SugaredLogger
}

func New(log *zap.SugaredLogger, bucket *storage.BucketHandle) *Handler {
  return &Handler{
    log: log,
    bucket: bucket,
  }
}

func (h *Handler) Upload(w http.ResponseWriter, r *http.Request) {
  // pass off to grab possible errors and obj URL
  u, err := h.upload(r)

  // handle error
  if err != nil {
    h.log.Errorw("error during upload", "error", err)
    http.Error(w, "internal server error", http.StatusInternalServerError)
    return
  }

  // log to system
  h.log.Infow("file uploaded", "url", u)

  // display to user
  fmt.Fprintln(w, u)
}

func (h *Handler) upload(r *http.Request) (string, error) {
  // the heart of the code
  return "", errors.New("not implemented")
}

Handling the Upload #

Everything up to this point is pretty boilerplate. Now let’s break down how to tackle the upload portion.

Basic steps to achieve our goals are as follows:

  • Get a reader to go through the multi-part form data.
  • Parse the optional rename form data.
  • Parse the file form data.
  • Create a new bucket object.
  • Read the file data and copy it to the bucket writer.
  • Return the URL.

Instead of having the server parse the whole form for us, we need to do it manually. The request has a method for that called MultipartReader().

func (h *Handler) upload(r *http.Request) (string, error) {
  // the MultipartReader allows us to parse the form data piece by piece, giving
  // us access to each part of the data, similar to how you would use
  // `bufio.Scanner` or `sql.Rows`, except we won't be using a for loop
  mpr, err := r.MultipartReader()
  if err != nil {
    return "", fmt.Errorf("r.MultipartReader failure: %w", err)
  }
}

Next we want to parse the name from the form. The value is optional, but the field itself must be included in the form data.

NOTE: when submitting the form, ordering is VERY important. Since we're doing this in one pass instead of loading everything into memory, we expect the first part of the form data to be name.

func (h *Handler) upload(r *http.Request) (string, error) {
  // MULTIPARTREADER -----------------------------------------------------------
  // code omitted

  // get the next (first) piece of data from the form
  // `part` satisfies the `io.Reader` interface, which is perfect for our use case
  part, err := mpr.NextPart()

  // handle errors
  switch {

  // any unknown errors need to be bubbled up first; if there was an error,
  // `part` may be nil and cannot be inspected
  case err != nil:
    return "", fmt.Errorf("mpr.NextPart failure on first read: %w", err)

  // make sure we're dealing with the expected piece of data;
  // if the file was provided first, or some other field, we want to stop
  case part.FormName() != "name":
    return "", fmt.Errorf("expected name first")
  }

  // copy the raw bytes of the form data into our buffer
  buf := bytes.NewBuffer(nil)
  if _, err := io.Copy(buf, part); err != nil && err != io.EOF {
    return "", fmt.Errorf("io.Copy failure for name: %w", err)
  }

  // it's okay if this is an empty string; it will be checked before writing to the bucket
  uploadFilename := buf.String()
}

Next up is validating that the next part of the form is our file. The application won't actually start copying the stream until after a bucket object is created, but just like the name above, we will parse the next part of the form and make sure everything is as expected.

  // MULTIPARTREADER -----------------------------------------------------------
  // code omitted

  // HANDLE FILE RENAME --------------------------------------------------------
  // code omitted


  // get the next part of data from the form
  part, err = mpr.NextPart()

  // handle errors
  switch {

  // any unknown errors need to be bubbled up first; if there was an error,
  // `part` may be nil and cannot be inspected
  case err != nil:
    return "", fmt.Errorf("mpr.NextPart failure on second read: %w", err)

  // just like before, make sure we're dealing with the expected piece of data
  case part.FormName() != "myfile":
    return "", fmt.Errorf("file required")
  }

  // fall back to the uploaded file's original name if no rename was provided
  if uploadFilename == "" {
    uploadFilename = part.FileName()
  }

Now it's time to create/update an object in the bucket. If you've worked with the storage library before, this will be pretty straightforward. If not, it's easy to pick up. An object handle can give us a writer that satisfies the io.Writer interface. We'll use that along with the part (which satisfies io.Reader) in io.Copy to do the actual transfer.

NOTE: This code does not check whether the object name already exists. If there is an object with the same name, IT WILL BE OVERWRITTEN. You might want to include additional checks to ensure that no files are overwritten, as sketched below.
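One option (not used in the code below) is to ask the storage library for a write precondition. A rough sketch, assuming the same obj/part variables as the surrounding code:

  // guard against overwriting: with a DoesNotExist precondition, the upload fails
  // if an object with this name already exists (a precondition-failed error,
  // typically surfaced when the writer is closed)
  obj := h.bucket.Object(uploadFilename).If(storage.Conditions{DoesNotExist: true})

  sw := obj.NewWriter(r.Context())
  if _, err := io.Copy(sw, part); err != nil {
    return "", fmt.Errorf("io.Copy failure for file: %w", err)
  }
  if err := sw.Close(); err != nil {
    return "", fmt.Errorf("object may already exist: %w", err)
  }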

  // MULTIPARTREADER -----------------------------------------------------------
  // code omitted

  // HANDLE FILE RENAME --------------------------------------------------------
  // code omitted

  // HANDLE FILE UPLOAD --------------------------------------------------------
  // code omitted


  // create new/update bucket object
  obj := h.bucket.Object(uploadFilename)

  // copy data to object
  // use the request context; if connection is cancelled, the write will stop
  sw := obj.NewWriter(r.Context())
  if _, err := io.Copy(sw, part); err != nil {
    return "", fmt.Errorf("io.Copy failure for file: %w", err)
  }

  // close writer; attrs will be nil if not closed; cannot defer
  if err := sw.Close(); err != nil {
    return "", fmt.Errorf("unable to close file after writing object: %w", err)
  }

At this point the upload is complete and the file has been successfully written to the bucket. The final piece of the puzzle is returning the URL of the object to the user. This can be achieved by using the object's Attrs() method and returning the MediaLink.

  // MULTIPARTREADER -----------------------------------------------------------
  // code omitted

  // HANDLE FILE RENAME --------------------------------------------------------
  // code omitted

  // HANDLE FILE UPLOAD --------------------------------------------------------
  // code omitted

  // WRITE TO BUCKET -----------------------------------------------------------
  // code omitted


  // get URL of object
  attrs, err := obj.Attrs(r.Context())
  if err != nil {
    return "", fmt.Errorf("cannot generate attrs for object: %w", err)
  }

  return attrs.MediaLink, nil
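
MediaLink points at the storage API's download endpoint for the object. Since the bucket was made publicly readable earlier, an alternative (not used in this handler) would be to build the canonical public URL yourself:

  // illustrative alternative: the public URL for an object in a publicly readable
  // bucket (object names with special characters would need URL escaping)
  publicURL := fmt.Sprintf("https://storage.googleapis.com/%s/%s", attrs.Bucket, attrs.Name)
  return publicURL, nil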

And we’re done!

You did it!

To fully test it, you'll need to add the handler to the multiplexer of your choice (I prefer using chi) and attach it to something like /upload. With that you can use the following curl request to send your data to the server and upload it to your bucket!

curl localhost:8000/upload \
  -i \
  -F 'name="optional-rename.mp4"' \
  -F 'myfile=@./path/to/video.mp4'
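
For reference, here's a minimal sketch of how the handler could be wired up end to end. The module path, bucket name, and port are assumptions; adapt them to your setup.

package main

import (
	"context"
	"log"
	"net/http"

	"cloud.google.com/go/storage"
	"github.com/go-chi/chi/v5"
	"go.uber.org/zap"

	"example.com/upload/handler" // hypothetical import path for the handler package above
)

func main() {
	ctx := context.Background()

	// Application Default Credentials are picked up automatically
	client, err := storage.NewClient(ctx)
	if err != nil {
		log.Fatalf("storage.NewClient: %v", err)
	}

	logger, err := zap.NewProduction()
	if err != nil {
		log.Fatalf("zap.NewProduction: %v", err)
	}

	h := handler.New(logger.Sugar(), client.Bucket("drt-upload-bkt"))

	r := chi.NewRouter()
	r.Post("/upload", h.Upload)

	log.Fatal(http.ListenAndServe(":8000", r))
}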

TL;DR Full Handler Code #

I have a full implementation with a router and logging set up on GitHub to try out as well.

package handler

import (
	"bytes"
	"fmt"
	"io"
	"net/http"

	"cloud.google.com/go/storage"
	"go.uber.org/zap"
)

type Handler struct {
	bucket *storage.BucketHandle
	log    *zap.SugaredLogger
}

func New(log *zap.SugaredLogger, bucket *storage.BucketHandle) *Handler {
	return &Handler{
		log:    log,
		bucket: bucket,
	}
}

func (h *Handler) Upload(w http.ResponseWriter, r *http.Request) {
	u, err := h.upload(r)
	if err != nil {
		h.log.Errorw("error during upload", "err", err)
		switch {
		case isValidationError(err):
			http.Error(w, err.Error(), http.StatusBadRequest)
		default:
			http.Error(w, "internal server error", http.StatusInternalServerError)
		}
		return
	}
	h.log.Infow("file uploaded", "url", u)
	fmt.Fprintln(w, u)
}

func (h *Handler) upload(r *http.Request) (string, error) {
	mpr, err := r.MultipartReader()
	if err != nil {
		return "", fmt.Errorf("r.MultipartReader failure: %w", err)
	}

	// HANDLE FILE RENAME --------------------------------------------------------
	part, err := mpr.NextPart()
	switch {
	case err != nil:
		return "", fmt.Errorf("mpr.NextPart failure on first read: %w", err)
	case part.FormName() != "name":
		return "", validationError("expected name first")
	}

	buf := bytes.NewBuffer(nil)
	if _, err := io.Copy(buf, part); err != nil && err != io.EOF {
		return "", fmt.Errorf("io.Copy failure for name: %w", err)
	}
	uploadFilename := buf.String()

	// HANDLE FILE UPLOAD --------------------------------------------------------
	part, err = mpr.NextPart()
	switch {
	case err != nil:
		return "", fmt.Errorf("mpr.NextPart failure on second read: %w", err)
	case part.FormName() != "myfile":
		return "", validationError("file required")
	}

	if uploadFilename == "" {
		uploadFilename = part.FileName()
	}

	// WRITE TO BUCKET -----------------------------------------------------------

	// create new/update bucket object
	obj := h.bucket.Object(uploadFilename)

	// copy data to object
	sw := obj.NewWriter(r.Context())
	if _, err := io.Copy(sw, part); err != nil {
		return "", fmt.Errorf("io.Copy failure for file: %w", err)
	}

	// close writer; attrs will be nil if not closed; cannot defer
	if err := sw.Close(); err != nil {
		return "", fmt.Errorf("unable to close file after writing object: %w", err)
	}

	// get URL of newly created obj
	attrs, err := obj.Attrs(r.Context())
	if err != nil {
		return "", fmt.Errorf("cannot generate attrs for object: %w", err)
	}

	return attrs.MediaLink, nil
}
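
The listing above references validationError and isValidationError, which come from the full implementation on GitHub. A minimal sketch of what they might look like (it also needs the standard errors package; the repo's version may differ):

// validationError marks user-facing input problems so Upload can return a 400
// instead of a 500. This is a sketch; the actual implementation may differ.
type validationError string

func (v validationError) Error() string { return string(v) }

func isValidationError(err error) bool {
	var v validationError
	return errors.As(err, &v)
}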

Clean Up #

Don’t forget to delete your bucket! Otherwise your test files will be publicly available!

gsutil rb gs://drt-upload-bkt