Optimizing site performance by "lazy loading" images

Recently, I've been spending some time making performance improvements to my site. In my previous blog post on this topic, I described my progress optimizing the JavaScript and CSS usage on my site, and concluded that image optimization was the next step.

Last summer I published a blog post about my vacation in Acadia National Park. Included in that post are 13 photos with a combined size of about 4 MB.

When I benchmarked that post with https://webpagetest.org, it showed that it took 7.275 seconds (blue vertical line) to render the page.

The graph shows that the browser downloaded all 13 images to render the page. Why would a browser download all images if most of them are below the fold and not shown until a user starts scrolling? It makes very little sense.

As you can see from the graph, downloading all 13 images takes a very long time (purple horizontal bars). No matter how much you optimize your CSS and JavaScript, this particular blog post would remain slow until you optimize how images are loaded.

A https://webpagetest.org diagram that shows page load times for the blog post before image optimizations

"Lazy loading" images is one solution to this problem. Lazy loading means that the images aren't loaded until the user scrolls and the images come into the browser's viewport.

You might have seen lazy loading in action on websites like Facebook, Pinterest or Medium. It usually goes like this:

  • You visit a page as you normally would, scrolling through the content.
  • Instead of the actual image, you see a blurry placeholder image.
  • Then, the placeholder image gets swapped out with the final image as quickly as possible.

An animated GIF of a user scrolling a webpage while placeholder images are replaced by the final images

To support lazy loading images on my blog I do three things:

  1. Automatically generate lightweight yet useful placeholder images.
  2. Embed the placeholder images directly in the HTML to speed up performance.
  3. Replace the placeholder images with the real images when they become visible.

Generating lightweight placeholder images

To generate lightweight placeholder images, I implemented a technique used by Facebook: create a tiny image that is a downscaled version of the original image, strip out the image's metadata to optimize its size, and let the browser scale the image back up.

To create lightweight placeholder images, I resized the original images to be 5 pixels wide. Because I have about 10,000 images on my blog, my Drupal-based site automates this for me, but here is how you create one from the command line using ImageMagick's convert tool:

$ convert -resize 5x -strip original.jpg placeholder.jpg
  • -resize 5x resizes the image to be 5 pixels wide while maintaining its aspect ratio.
  • -strip removes all comments and redundant headers in the image. This helps make the image's file size as small as possible.

The resulting placeholder images are tiny — often shy of 400 bytes.

Large metal pots with wooden lids in which lobsters are boiled
The original image that we need to generate a placeholder for.
An example placeholder image shows brown and black tones
The generated placeholder, scaled up by a browser from a tiny image that is 5 pixels wide. The size of this placeholder image is only 395 bytes.

Here is another example to illustrate how the colors in the placeholders nicely match the original image:

A sunrise with beautiful reds and black silhouettes
An example placeholder image that shows red and black tones

Even though the placeholder images should only be shown for a fraction of a second, making them resemble the originals is a nice touch, as they hint at what is coming. It's also important, because users are very impatient with load times on the web.

Embedding placeholder images directly in HTML

One not-so-well-known feature of the <img> element is that you can embed an image directly into the HTML document using the data URL scheme:

<img src="data:image/jpg;base64,/9j/4AAQSkZJRgABAQEA8ADwAAD/2wB
  DAAEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQICAQECAQEB
  AgICAgICAgICAQICAgICAgICAgL/2wBDAQEBAQEBAQEBAQECAQEBAgICAgICA
  gICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgL/wA
  ARCAADAAUDAREAAhEBAxEB/8QAFAABAAAAAAAAAAAAAAAAAAAACf/EAB8QAAM
  AAQMFAAAAAAAAAAAAAAECAwcEBQYACAkTMf/EABUBAQEAAAAAAAAAAAAAAAAA
  AAIG/8QAJBEAAQIFAgcAAAAAAAAAAAAAAQURAAIDBDEGFBIVQUVRcYH/2gAMA
  wEAAhEDEQA/AAeyb5HO8o8lSLZd01Jz2nbKoK4yxDVvZqYl7uaV4CWdmZQSSS
  ST86FJBsafEJK15KD05ioNk4G6Yeg0V9bVCmZpXt08sB2hJ8DJ2Tn7H/2Q==" />

Data URLs are composed of four parts: the data: prefix, a media type indicating the type of data (image/jpg), an optional base64 token to indicate that the data is base64 encoded, and the base64 encoded image data itself.

data:[<media type>][;base64],<data>

To base64 encode an image from the command line, use:

$ base64 placeholder.jpg

To base64 encode an image in PHP, use:

$data = base64_encode(file_get_contents('placeholder.jpg'));
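
For example, here is a minimal PHP sketch (not the exact code my site uses) that reads a placeholder, base64 encodes it, and prints an <img> tag with the image embedded as a data URL:

<?php

// Read the tiny placeholder image and base64 encode it.
$data = base64_encode(file_get_contents('placeholder.jpg'));

// Embed the encoded image directly in the HTML using a data URL.
print '<img src="data:image/jpg;base64,' . $data . '" alt="Large metal pots with wooden lids in which lobsters are boiled" />';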

What is the advantage of embedding a base64 encoded image using a data URL? It eliminates HTTP requests as the browser doesn't have to set up new HTTP connections to download the images. Fewer HTTP requests usually means faster page load times.

Replacing placeholder images with real images

Next, I used JavaScript's IntersectionObserver to replace the placeholder image with the actual image when it comes into the browser's viewport. I followed Jeremy Wagner's approach shared on Google Web Fundamentals Guide on lazy loading images — with some adjustments.

It starts with the following HTML markup:

<img class="lazy" src="placeholder.jpg" data-src="original.jpg" />

The three relevant pieces are:

  1. The class="lazy" attribute, which is what you'll select the element with in JavaScript.
  2. The src attribute, which references the placeholder image that will appear when the page first loads. Instead of linking to placeholder.jpg I embed the image data using the data URL technique explained above.
  3. The data-src attribute, which contains the URL of the original image that will replace the placeholder when it enters the viewport.

Next, we use JavaScript's IntersectionObserver to replace the placeholder images with the actual images:

document.addEventListener('DOMContentLoaded', function() {
  var lazyImages = [].slice.call(document.querySelectorAll('img.lazy'));

  if ('IntersectionObserver' in window) {
    let lazyImageObserver = new IntersectionObserver(
      function(entries, observer) {
        entries.forEach(function(entry) {
          if (entry.isIntersecting) {
            let lazyImage = entry.target;
            lazyImage.src = lazyImage.dataset.src;
            lazyImageObserver.unobserve(lazyImage);
          }
        });
    });

    lazyImages.forEach(function(lazyImage) {
      lazyImageObserver.observe(lazyImage);
    });
  }
  else {
    // For browsers that don't support IntersectionObserver yet,
    // load all the images now:
    lazyImages.forEach(function(lazyImage) {
      lazyImage.src = lazyImage.dataset.src;
    });
  }
});

This JavaScript code queries the DOM for all <img> elements with the lazy class. The IntersectionObserver is used to replace the placeholder image with the original image when the img.lazy elements enter the viewport. When IntersectionObserver is not supported, the images are replaced on the DOMContentLoaded event.

By default, the IntersectionObserver's callback is triggered the moment a single pixel of the image enters the browser's viewport. However, using the rootMargin property, you can trigger the image swap before the image enters the viewport. This reduces or eliminates the visual or perceived lag time when swapping a placeholder image for the actual image.

I implemented that on my site as follows:

const config = {
  // If the image gets within 250px of the browser's viewport,
  // start the download:
  rootMargin: '250px 0px',
};

let lazyImageObserver = new IntersectionObserver(..., config);

Lazy loading images drastically improves performance

After making these changes to my site, I did a new https://webpagetest.org benchmark run:

A diagram that shows page load times for dri.es after adding lazy loading of images

You can clearly see that the page became a lot faster to render:

  • The document is complete after 0.35 seconds (blue vertical line) instead of the original 7.275 seconds.
  • No images are loaded before the document is complete, compared to 13 images being loaded before.
  • After the document is complete, one image (purple horizontal bar) is downloaded. This is triggered by the JavaScript code as the result of one image being above the fold.

Lazy loading images improves web page performance by reducing the number of HTTP requests, and consequently reduces the amount of data that needs to be downloaded to render the initial page.

Is base64 encoding images bad for SEO?

Faster sites have an SEO advantage as page speed is a ranking factor for search engines. But lazy loading might also be bad for SEO, as search engines have to be able to discover the original images.

To find out, I headed to Google Search Console. Google Search Console has a "URL inspection" feature that allows you to look at a webpage through the eyes of Googlebot.

I tested it out with my Acadia National Park blog post. As you can see in the screenshot, the first photo in the blog post was not loaded. Googlebot doesn't seem to support data URLs for images.

A screenshot that shows Googlebot doesn't render placeholder images that are embedded using data URLs

Is IntersectionObserver bad for SEO?

The fact that Googlebot doesn't appear to support data URLs does not have to be a problem. The real question is whether Googlebot will scroll the page, execute the JavaScript, replace the placeholders with the actual images, and index those. If it does, it doesn't matter that Googlebot doesn't understand data URLs.

To find out, I decided to conduct an experiment. For the experiment, I published a blog post about Matt Mullenweg and me visiting a museum together. The images in that blog post are lazy loaded and can only be discovered by Google if its crawler executes the JavaScript and scrolls the page. If those images show up in Google's index, we know there is no SEO impact.

I only posted that blog post yesterday. I'm not sure how long it takes for Google to make new posts and images available in its index, but I'll keep an eye out for it.

If the images don't show up in Google's index, lazy loading might impact your SEO. My solution would be to selectively disable lazy loading for the most important images only. (Note: even if Google finds the images, there is no guarantee that it will decide to index them — short blog posts and images are often excluded from Google's index.)

Conclusions

Lazy loading images improves web page performance by reducing the number of HTTP requests and data needed to render the initial page.

Ideally, over time, browsers will support lazy loading images natively, and some of the SEO challenges will no longer be an issue. Until then, consider adding support for lazy loading yourself. For my own site, it took about 40 lines of JavaScript code and 20 lines of additional PHP/Drupal code.

I hope that by sharing my experience, more people are encouraged to run their own sites and to optimize their sites' performance.

Two internet entrepreneurs walk into an old publishing house

A month ago, Matt Mullenweg, co-founder of WordPress and founder of Automattic, visited me in Antwerp, Belgium. While I currently live in Boston, I was born and raised in Antwerp, and also started Drupal there.

We spent the morning together walking around Antwerp and visited the Plantin Moretus Museum.

The museum is the old house of Christophe Plantin, where he lived and worked around 1575. At the time, Plantin had the largest printing shop in the world, with 56 employees and 16 printing presses. These presses printed 1,250 sheets per day.

Today, the museum hosts the two oldest printing presses in the world. In addition, the museum has original lead types of fonts such as Garamond and hundreds of ancient manuscripts that tell the story of how writing evolved into the art of printing.

The old house, printing business, presses and lead types are the earliest witnesses of a landmark moment in history: the invention of printing, and by extension, the democratization of publishing, long before our digital age. It was nice to visit that together with Matt as a break from our day-to-day focus on web publishing.

An old printing press at the Plantin Moretus Museum

Dries and Matt in front of the oldest printing presses in the world

An old globe at the Plantin Moretus Museum

Why the EU Copyright Directive is a threat to the Open Web

An image of a copyright sign

After much debate, the EU Copyright Directive is now moving to a final vote in the European Parliament. The directive, if you are not familiar, was created to prohibit spreading copyrighted material on internet platforms, protecting the rights of creators (for example, many musicians have supported this overhaul).

The overall idea behind the directive — compensating creators for their online works — makes sense. However, the implementation and execution of the directive could have a very negative impact on the Open Web. I'm surprised more has not been written about this within the web community.

For example, Article 13 requires for-profit online services to implement copyright filters for user-generated content, which includes comments on blogs, reviews on commerce sites, code on programming sites or possibly even memes and cat photos on discussion forums. Any for-profit site would need to apply strict copyright filters on content uploaded by a site's users. If sites fail to correctly filter copyrighted materials, they will be directly liable to rights holders for expensive copyright infringement violations.

While implementing copyright filters may be doable for large organizations, it may not be for smaller organizations. Instead, small organizations might decide to stop hosting comments or reviews, or allowing the sharing of code, photos or videos. The only for-profit organizations potentially excluded from these requirements are companies earning less than €10 million a year, until they have been in business for three years. It's not a great exclusion, because there are a lot of online communities that have been around for more than three years and don't make more than €10 million a year.

The EU tends to lead the way when it comes to internet legislation. For example, GDPR has proven successful for consumer data protection and has sparked state-by-state legislation in the United States. In theory, the EU Copyright Directive could do the same thing for modern internet copyright law. My fear is that in practice, these copyright filters, if too strict, could discourage the free flow of information and sharing on the Open Web.

Optimizing site performance by reducing JavaScript and CSS

I've been thinking about the performance of my site and how it affects the user experience. There are real, ethical concerns to poor web performance. These include accessibility, inclusion, waste and environmental concerns.

A faster site is more accessible, and therefore more inclusive for people visiting from a mobile device, or from areas in the world with slow or expensive internet.

For those reasons, I decided to see if I could improve the performance of my site. I used the excellent https://webpagetest.org to benchmark a simple blog post https://dri.es/relentlessly-eliminating-barriers-to-growth.

A diagram that shows page load times for dri.es before making performance improvements

The image above shows that it took a browser 0.722 seconds to download and render the page (see blue vertical line):

  • The first 210 milliseconds are used to set up the connection, which includes the DNS lookup, TCP handshake and the SSL negotiation.
  • The next 260 milliseconds (from 0.21 seconds to 0.47 seconds) are spent downloading the rendered HTML file, two CSS files and one JavaScript file.
  • After everything is downloaded, the final 330 milliseconds (from 0.475 seconds to 0.8 seconds) are used to layout the page and execute the JavaScript code.

By most standards, 0.722 seconds is pretty fast. In fact, according to HTTP Archive, it takes more than 2.4 seconds to download and render the average web page on a laptop or desktop computer.

Regardless, I noticed that the length of the horizontal green bars and the horizontal yellow bar was relatively long compared to that of the blue bar. In other words, a lot of time is spent downloading JavaScript (yellow horizontal bar) and CSS (two green horizontal bars) instead of the HTML, including the actual content of the blog post (blue bar).

To fix this, I did two things:

  1. Use vanilla JavaScript. I replaced my jQuery-based JavaScript with vanilla JavaScript. Without impacting the functionality of my site, the amount of JavaScript went from almost 45 KB to 699 bytes, a reduction of more than 98 percent.
  2. Conditionally include CSS. For example, I use Prism.js for syntax highlighting code snippets in blog posts. prism.css was downloaded for every page request, even when there were no code snippets to highlight. Using Drupal's render system, it's easy to conditionally include CSS (see the sketch after this list). By taking advantage of that, I was able to reduce the amount of CSS downloaded by 47 percent — from 4.7 KB to 2.5 KB.
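
To illustrate the conditional CSS approach, here is a rough sketch of what it can look like with Drupal's render system. The module name, library name and the way code snippets are detected are assumptions for the example, not my exact implementation:

<?php

/**
 * Implements hook_preprocess_node().
 *
 * Attaches the Prism.js library only when the node's body actually
 * contains a code snippet. (Illustrative sketch; module and library
 * names are assumptions.)
 */
function mymodule_preprocess_node(&$variables) {
  $node = $variables['node'];
  $body = $node->hasField('body') ? $node->get('body')->value : '';

  // Only attach the syntax highlighting CSS when it is needed.
  if (strpos($body, '<code') !== FALSE) {
    $variables['#attached']['library'][] = 'mymodule/prism';
  }
}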

According to the January 1st, 2019 run of HTTP Archive, the median page requires 396 KB of JavaScript and 60 KB of CSS. I'm proud that my site is well under these medians.

File type    Dri.es before    Dri.es after    World-wide median
JavaScript   45 KB            699 bytes       396 KB
CSS          4.7 KB           2.5 KB          60 KB

Because the new JavaScript and CSS files are significantly smaller, it takes the browser less time to download, parse and render them. As a result, the same blog post is now available in 0.465 seconds instead of 0.722 seconds, or 35% faster.

After a new https://webpagetest.org test run, you can clearly see that the bars for the CSS and JavaScript files became visually shorter:

A diagram that shows page load times for dri.es reduced after making performance improvements

To optimize the user experience of my site, I want it to be fast. I hope that others will see that bloated websites can come at a great cost, and will consider using tools like https://webpagetest.org to make their sites more performant.

I'll keep working on making my website even faster. As a next step, I plan to make pages with images faster by using lazy image loading.

Headless CMS: REST vs JSON:API vs GraphQL

An abstract image of three boxes

The web used to be server-centric in that web content management systems managed data and turned it into HTML responses. With the rise of headless architectures a portion of the web is becoming server-centric for data but client-centric for its presentation; increasingly, data is rendered into HTML in the browser.

This shift of responsibility has given rise to JavaScript frameworks, while on the server side, it has resulted in the development of JSON:API and GraphQL to better serve these JavaScript applications with content and data.

In this blog post, we will compare REST, JSON:API and GraphQL. First, we'll look at an architectural, CMS-agnostic comparison, followed by evaluating some Drupal-specific implementation details.

It's worth noting that there are of course lots of intricacies and "it depends" when comparing these three approaches. When we discuss REST, we mean the "typical REST API" as opposed to one that is extremely well-designed or following a specification (not REST as a concept). When we discuss JSON:API, we're referring to implementations of the JSON:API specification. Finally, when we discuss GraphQL, we're referring to GraphQL as it is used in practice. Formally, it is only a query language, not a standard for building APIs.

The architectural comparison should be useful for anyone building decoupled applications regardless of the foundation they use because the qualities we will evaluate apply to most web projects.

To frame our comparisons, let's establish that most developers working with web services care about the following qualities:

  1. Request efficiency: retrieving all necessary data in a single network round trip is essential for performance. The size of both requests and responses should make efficient use of the network.
  2. API exploration and schema documentation: the API should be quickly understandable and easily discoverable.
  3. Operational simplicity: the approach should be easy to install, configure, run, scale and secure.
  4. Writing data: not every application needs to store data in the content repository, but when it does, it should not be significantly more complex than reading.

We summarized our conclusions in the overview below, and we discuss each of these four categories in more depth in the rest of this post. If you aggregate the rankings, you see that we rank JSON:API above GraphQL and GraphQL above REST for Drupal core's needs.

Request efficiency
  • REST: Poor; multiple requests are needed to satisfy common needs. Responses are bloated.
  • JSON:API: Excellent; a single request is usually sufficient for most needs. Responses can be tailored to return only what is required.
  • GraphQL: Excellent; a single request is usually sufficient for most needs. Responses only include exactly what was requested.

Documentation, API explorability and schema
  • REST: Poor; no schema, not explorable.
  • JSON:API: Acceptable; generic schema only; links and error messages are self-documenting.
  • GraphQL: Excellent; precise schema; excellent tooling for exploration and documentation.

Operational simplicity
  • REST: Acceptable; works out of the box with CDNs and reverse proxies; few to no client-side libraries required.
  • JSON:API: Excellent; works out of the box with CDNs and reverse proxies; no client-side libraries needed, but many are available and useful.
  • GraphQL: Poor; extra infrastructure is often necessary, client-side libraries are a practical necessity, and specific patterns are required to benefit from CDNs and browser caches.

Writing data
  • REST: Acceptable; HTTP semantics give some guidance, but the specifics are left to each implementation; one write per request.
  • JSON:API: Excellent; how writes are handled is clearly defined by the spec; one write per request, but support for multiple writes is being added to the specification.
  • GraphQL: Poor; how writes are handled is left to each implementation and there are competing best practices; it's possible to execute multiple writes in a single request.

If you're not familiar with JSON:API or GraphQL, I recommend you watch the following two short videos. They will provide valuable context for the remainder of this blog post:

Request efficiency

Most REST APIs tend toward the simplest implementation possible: a resource can only be retrieved from one URI. If you want to retrieve article 42, you have to retrieve it from https://example.com/article/42. If you want to retrieve article 42 and article 72, you have to perform two requests; one to https://example.com/article/42 and one to https://example.com/article/72. If the article's author information is stored in a different content type, you have to do two additional requests, say to https://example.com/author/3 and https://example.com/author/7. Furthermore, you can't send these requests until you've requested, retrieved and parsed the article requests (you wouldn't know the author IDs otherwise).

Consequently, client-side applications built on top of basic REST APIs tend to need many successive requests to fetch their data. Often, these requests can't be sent until earlier requests have been fulfilled, resulting in a sluggish experience for the website visitor.

GraphQL and JSON:API were developed to address the typical inefficiency of REST APIs. Using JSON:API or GraphQL, you can use a single request to retrieve both article 42 and article 72, along with the author information for each. It simplifies the developer experience, but more importantly, it speeds up the application.
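
To make this concrete, consider the article example above. With a typical REST API, fetching articles 42 and 72 with their authors takes four requests, and the author requests can only be sent after the article responses have been parsed:

GET https://example.com/article/42
GET https://example.com/article/72
GET https://example.com/author/3
GET https://example.com/author/7

With JSON:API, a single request can return both articles with their authors embedded. The exact filter syntax is implementation-specific, but it could look like this:

GET https://example.com/articles?filter[id]=42,72&include=author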

Finally, both JSON:API and GraphQL have a solution to limit response sizes. A common complaint against typical REST APIs is that their responses can be incredibly verbose; they often respond with far more data than the client needs. This is both annoying and inefficient.

GraphQL eliminates this by requiring the developer to explicitly add each desired resource field to every query. This makes it difficult to over-fetch data but easily leads to very large GraphQL queries, making (cacheable) GET requests impossible.

JSON:API solves this with the concept of sparse fieldsets, or lists of desired resource fields. These behave in much the same fashion as GraphQL's field selection; however, when they're omitted, JSON:API will typically return all fields. An advantage, though, is that when a JSON:API query gets too large, sparse fieldsets can be omitted so that the request remains cacheable.
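
For example, a hypothetical JSON:API request that returns only each article's title and its author's name could use sparse fieldsets like this (the resource types and field names are illustrative):

GET https://example.com/articles?include=author&fields[articles]=title,author&fields[people]=name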

Multiple data objects in a single response
  • REST: Usually; but every implementation is different (for Drupal: custom "REST Export" view or custom REST plugin needed).
  • JSON:API: Yes.
  • GraphQL: Yes.

Embed related data (e.g. the author of each article)
  • REST: No.
  • JSON:API: Yes.
  • GraphQL: Yes.

Only needed fields of a data object
  • REST: No.
  • JSON:API: Yes; servers may choose sensible defaults, but developers must be diligent to prevent over-fetching.
  • GraphQL: Yes; strict, which eliminates over-fetching, but at the extreme it can lead to poor cacheability.

Documentation, API explorability and schema

As a developer working with web services, you want to be able to discover and understand the API quickly and easily: what kinds of resources are available, what fields does each of them have, how are they related, etc. But also, if this field is a date or time, what machine-readable format is the date or time specified in? Good documentation and API exploration can make all the difference.

Auto-generated documentation
  • REST: Depends; if using the OpenAPI standard.
  • JSON:API: Depends; if using the OpenAPI standard (formerly Swagger).
  • GraphQL: Yes; various tools available.

Interactivity
  • REST: Poor; navigable links rarely available.
  • JSON:API: Acceptable; observing available fields and links in its responses enables exploration of the API.
  • GraphQL: Excellent; autocomplete feature, instant results or compilation errors, complete and contextual documentation.

Validatable and programmable schema
  • REST: Depends; if using the OpenAPI standard.
  • JSON:API: Depends; the JSON:API specification defines a generic schema, but a reliable field-level schema is not yet available.
  • GraphQL: Yes; a complete and reliable schema is provided (with very few exceptions).

GraphQL has superior API exploration thanks to GraphiQL (demonstrated in the video above), an in-browser IDE of sorts, which lets developers iteratively construct a query. As the developer types the query out, likely suggestions are offered and can be auto-completed. At any time, the query can be run and GraphiQL will display real results alongside the query. This provides immediate, actionable feedback to the query builder. Did they make a typo? Does the response look like what was desired? Additionally, documentation can be summoned into a flyout, when additional context is needed.

On the other hand, JSON:API is more self-explanatory: APIs can be explored with nothing more than a web browser. From within the browser, you can browse from one resource to another, discover its fields, and more. So, if you just want to debug or try something out, JSON:API is usable with nothing more than cURL or your browser. Or, you can use Postman (demonstrated in the video above) — a standalone environment for developing on top of any HTTP-based API. Constructing complex queries requires some knowledge, however, and that is where GraphQL's GraphiQL shines compared to JSON:API.

Operational simplicity

We use the term operational simplicity to encompass how easy it is to install, configure, run, scale and secure each of the solutions.

The overview below should be self-explanatory, though it's important to make a remark about scalability. To scale a REST-based or JSON:API-based web service so that it can handle a large volume of traffic, you can use the same approach websites (and Drupal) already use, including reverse proxies like Varnish or a CDN. To scale GraphQL, you can't rely on HTTP caching the way you can with REST or JSON:API unless you use persisted queries. Persisted queries are not part of the official GraphQL specification, but they are a widely adopted convention amongst GraphQL users. They essentially store a query on the server, assign it an ID and permit the client to get the result of the query using a GET request with only that ID. Persisted queries add more operational complexity, and they also mean the architecture is no longer fully decoupled — if a client wants to retrieve different data, server-side changes are required.
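
As an illustration, instead of POSTing the full query text, a client using persisted queries sends only the ID of a query that was stored on the server ahead of time. The endpoint and parameter names below are hypothetical, and conventions differ between GraphQL servers:

GET https://example.com/graphql?queryId=articleTeasers&variables={"limit":10}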

Scalability: additional infrastructure requirements
  • REST: Excellent; same as a regular website (Varnish, CDN, etc.).
  • JSON:API: Excellent; same as a regular website (Varnish, CDN, etc.).
  • GraphQL: Usually poor; only the simplest queries can use GET requests; to reap the full benefit of GraphQL, servers need their own tooling.

Tooling ecosystem
  • REST: Acceptable; lots of developer tools available, but for the best experience they need to be customized for the implementation.
  • JSON:API: Excellent; lots of developer tools available; tools don't need to be implementation-specific.
  • GraphQL: Excellent; lots of developer tools available; tools don't need to be implementation-specific.

Typical points of failure
  • REST: Fewer; server, client.
  • JSON:API: Fewer; server, client.
  • GraphQL: Many; server, client, client-side caching, client and build tooling.

Writing data

For most REST APIs and JSON:API, writing data is as easy as fetching it: if you can read information, you also know how to write it. Instead of using the GET HTTP request method, you use POST and PATCH requests. JSON:API improves on typical REST APIs by eliminating differences between implementations. There is just one way to do things, which enables better, generic tooling and less time spent on server-side details.
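
For example, creating an article with JSON:API is a plain POST using the same document structure the API returns when reading (the resource type and fields below are illustrative):

POST https://example.com/articles
Content-Type: application/vnd.api+json

{
  "data": {
    "type": "articles",
    "attributes": {
      "title": "Optimizing site performance",
      "body": "A short example body."
    }
  }
}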

The nature of GraphQL's write operations (called mutations) means that you must write custom code for each write operation; unlike the JSON:API specification, GraphQL doesn't prescribe a single way of handling write operations to resources, so there are many competing best practices. In essence, the GraphQL specification is optimized for reads, not writes.

On the other hand, the GraphQL specification supports bulk/batch operations automatically for the mutations you've already implemented, whereas the JSON:API specification does not. The ability to perform batch write operations can be important. For example, adding a new tag to an article would require two requests: one to create the tag and one to update the article. That said, support for bulk/batch writes in JSON:API is on the specification's roadmap.

Writing data
  • REST: Acceptable; every implementation is different. No bulk support.
  • JSON:API: Excellent; JSON:API prescribes a complete solution for handling writes. Bulk operations are coming soon.
  • GraphQL: Poor; GraphQL supports bulk/batch operations, but writes can be tricky to design and implement. There are competing conventions.

Drupal-specific considerations

Up to this point we have provided an architectural and CMS-agnostic comparison; now we also want to highlight a few Drupal-specific implementation details. For this, we can look at the ease of installation, automatically generated documentation, integration with Drupal's entity and field-level access control systems and decoupled filtering.

Drupal 8's REST module is practically impossible to set up without the contributed REST UI module, and its configuration can be daunting. Drupal's JSON:API module is far superior to Drupal's REST module at this point. It is trivial to set up: install it and you're done; there's nothing to configure. The GraphQL module is also easy to install but does require some configuration.

Client-generated collection queries allow a consumer to filter an application's data down to just what they're interested in. This is a bit like a Drupal View except that the consumer can add, remove and control all the filters. This is almost always a requirement for public web services, but it can also make development more efficient because creating or changing a listing doesn't require server-side configuration changes.

Drupal's REST module does not support client-generated collection queries. It requires a "REST Views display" to be set up by a site administrator, and since these need to be manually configured in Drupal, a client can't craft its own queries with the filters it needs.

With JSON:API and GraphQL, clients are able to perform their own content queries without the need for server-side configuration. This means that they can be truly decoupled: changes to the front end don't always require a back-end configuration change.
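
For example, with Drupal's JSON:API module a client could ask for the ten most recently created published articles with a query along these lines (a sketch; the exact path and filter syntax depend on your content model):

GET https://example.com/jsonapi/node/article?filter[status]=1&sort=-created&page[limit]=10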

These client-generated queries are a bit simpler to use with the JSON:API module than they are with the GraphQL module because of how each module handles Drupal's extensive access control mechanisms. By default JSON:API ensures that these are respected by altering the incoming query. GraphQL instead requires the consumer to have permission to simply bypass access restrictions.

Most projects using GraphQL that cannot grant this permission use persisted queries instead of client-generated queries. This means a return to a more traditional Views-like pattern because the consumer no longer has complete control of the query's filters. To regain some of the efficiencies of client-generated queries, the creation of these persisted queries can be automated using front-end build tooling.

Ease of installation and configuration
  • REST: Poor; requires contributed module REST UI, easy to break clients by changing configuration.
  • JSON:API: Excellent; zero configuration!
  • GraphQL: Poor; more complex to use, may require additional permissions, configuration or custom code.

Automatically generated documentation
  • REST: Acceptable; requires contributed module OpenAPI.
  • JSON:API: Acceptable; requires contributed module OpenAPI.
  • GraphQL: Excellent; GraphQL Voyager included.

Security: content-level access control (entity and field access)
  • REST: Excellent; content-level access control respected.
  • JSON:API: Excellent; content-level access control respected, even in queries.
  • GraphQL: Acceptable; some use cases require the consumer to have permission to bypass all entity and/or field access.

Decoupled filtering (client can craft queries without server-side intervention)
  • REST: No.
  • JSON:API: Yes.
  • GraphQL: Depends; only in some setups and with additional tooling/infrastructure.

What does this mean for Drupal's roadmap?

Drupal grew up as a traditional web content management system but has since evolved for this API-first world and industry analysts are praising us for it.

As Drupal's project lead, I've been talking about adding out-of-the-box support for both JSON:API and GraphQL for a while now. In fact, I've been very bullish about GraphQL since 2015. My optimism was warranted; GraphQL is undergoing a meteoric rise in interest across the web development industry.

Based on this analysis, for Drupal core's needs, we rank JSON:API above GraphQL and GraphQL above REST. As such, I want to change my recommendation for Drupal 8 core. Instead of adding both JSON:API and GraphQL to Drupal 8 core, I believe only JSON:API should be added. That said, Drupal's GraphQL implementation is fantastic, especially when you have the developer capacity to build a bespoke API for your project.

On the four qualities by which we evaluated the REST, JSON:API and GraphQL modules, JSON:API has outperformed its contemporaries. Its web standards-based approach, its ability to handle reads and writes out of the box, its security model and its ease of operation make it the best choice for Drupal core. Additionally, where JSON:API underperformed, I believe that we have a real opportunity to contribute back to the specification. In fact, one of the JSON:API module's maintainers and co-authors of this blog post, Gabe Sullice (Acquia), recently became a JSON:API specification editor himself.

This decision does not mean that you can't or shouldn't use GraphQL with Drupal. While I believe JSON:API covers the majority of use cases, there are valid use cases where GraphQL is a great fit. I'm happy that Drupal is endowed with such a vibrant contributed module ecosystem that provides so many options to Drupal's users.

I'm excited to see where both the JSON:API specification and Drupal's implementation of it go in the coming months and years. As a first next step, we're preparing the JSON:API module to be added to Drupal 8.7.

Special thanks to Wim Leers (Acquia) and Gabe Sullice (Acquia) for co-authoring this blog post and to Preston So (Acquia) and Alex Bronstein (Acquia) for their feedback during the writing process.

Drupal helps rescue ultra marathon runner

I'm frequently sent examples of how Drupal has changed the lives of developers, business owners and end users. Recently, I received a very different story of how Drupal had helped in a rescue operation that saved a man's life.

The Snowdonia Ultra Marathon website

In early 2018, Race Director Mike Jones was looking to build a new website for the Ultra-Trail Snowdonia ultra marathon. He reached out to a good friend and developer, Rob Edwards, to lead the development of the website.

A photo of a runner at the Ultra-trail Snowdonia ultramarathon
© Ultra-trail Snowdonia and No Limits Photography

Rob chose Drupal for its flexibility and extensibility. As an organization supported heavily by volunteers, open source also fit the Snowdonia team's belief in community.

The resulting website, https://apexrunning.co/, included a custom-built timing module. This module allowed volunteers to register each runner and their time at every aid stop.

A runner goes missing

Rob attended the first day of Ultra-Trail Snowdonia to ensure the website ran smoothly. He also monitored the runners at the end of the race to make sure they were all accounted for.

Monitoring the system into the early hours of the morning, Rob noticed one runner, after successfully completing checkpoints one and two, hadn't passed through the third checkpoint.

A photo of a runner at the Ultra-trail Snowdonia ultramarathon
© Ultra-trail Snowdonia and No Limits Photography

Each runner carried a mobile phone with them for emergencies. Mike attempted to make contact with the runner via phone to ensure he was safe. However, this specific area was known for its poor signal and the connection was too weak to get through.

After some more time eagerly watching the live updates, it was clear the runner hadn't reached checkpoint four and more likely hadn't ever made it past checkpoint three. The Ogwen Mountain Rescue were called to action.

Due to the terrain and temperature, searching for the lost runner on foot would be too slow. Instead, the mountain rescue volunteers used a helicopter to scan the area and locate the runner.

How Drupal came to the rescue

The area covered by runners in an ultra marathon like this one is vast. The custom-built timing module helped rescuers narrow down the search area; they knew the runner passed the second checkpoint but never made it to the third.

After following the fluorescent orange markers in the area pinpointed by the Drupal website, the team quickly found the individual. He had fallen and become too injured to carry on. A mild case of hypothermia had set in. The runner was airlifted to the hospital for appropriate care. The good news: the runner survived.

Without Drupal, it might have taken much longer to notify anyone that a runner had gone missing, and there would have been no way to tell when he had dropped off.

NFC and GPS devices are now being explored for these ultra marathon runners to carry with them to provide location data as an extra safety precaution. The Drupal system will be used alongside these devices for more accurate time readings, and Rob is looking into an API to pull this additional data into the Drupal website.

Stories about Drupal having an impact on organizations and individuals, or even helping out in emergencies, drive my sense of purpose. Feel free to keep sending them my way!

Special thanks to Rob Edwards, Poppy Heap (CTI Digital) and Paul Johnson (CTI Digital) for their help with this blog post.

Pulling the plug on Facebook

Facebook social decay
© Andrei Lacatusu

Exactly one year ago, I decided to use social media less and blog more. I uninstalled the Facebook application from my phone, but kept my Facebook account for the time being.

The result is that I went from checking Facebook several times a day to once or twice a month.

Facebook can't be trusted

At the time I uninstalled the Facebook application from my phone, Mark Zuckerberg promised that he would fix Facebook. He didn't.

The remainder of 2018 was filled with Facebook scandals, including continued mishandling of personal data and privacy breaches, more misinformation, and a multitude of shady business practices.

Things got worse, not better.

The icing on the cake is that a few weeks ago we learned that Facebook knowingly duped children and their parents out of money, in some cases hundreds or even thousands of dollars, and often refused to give the money back.

And just last week, it was reported that Facebook had been collecting users' data by getting people to install a mobile application that gave Facebook root access to their network traffic.

It's clear that Facebook can't be trusted. And for that reason, I'm out.

I deleted my Facebook account twenty minutes ago.

An image of Facebook's account deletion confirmation screen

Social media's dark side

Social media, in general, have been enablers of community, transparency and positive change, but also of abuse, hate speech, bullying, misinformation, government manipulation and more. In just the past year, more and more users have woken up to the dark side of social media. Open Web and privacy advocates, on the other hand, have seen this coming for a while.

Technological change is a wonderful thing, as it can bring unprecedented improvements to billions around the globe. As a technologist, I believe in the power of the web to improve the world for many, but we also need to make sure that technology disruption is positive for all of us.

Last week, we heard that Facebook intends to further blend Instagram and WhatsApp with Facebook. If I were to guess, they want to make it harder to split up Facebook later (and harder for users to know what is happening with their data). Regulators should be all over this right now.

My social detox

I plan to stay off Facebook indefinitely, unless maybe there is a new CEO and better regulatory oversight.

I already stopped using Twitter to share personal updates and use it almost exclusively for Drupal-related updates. It remains a valuable channel to reach many people, but I wouldn't categorize my use as social anymore.

For now, I'm still on Instagram, but it's hard to ignore that Instagram is owned by Facebook. I will probably uninstall that next.

A call to rejoin the Open Web

Instant gratification and network effects have made social media successful, at the sacrifice of blogs and the Open Web.

I've always been driven by a sense of idealism. I'm optimistic that the movement away from social media is good for the Open Web.

Since I scaled back my use of social media a year ago, I blogged more, re-subscribed to many RSS feeds, and grew increasingly interested in the IndieWeb — all small shifts back to the Open Web's roots.

I plan to continue to work on my POSSE plan, and hope to share more thoughts on this topic in the coming weeks.

I'd love to see thousands more people join or rejoin the Open Web, and help innovate on top of it.

2019 Australian Open 'aces' the digital experience with Acquia and Drupal

Since I was young, I've been an avid tennis player and fan. I still play to this day, though maybe not as much as I'd like to.

In my teens, Andre Agassi was my favorite player. I've even sported some of his infamous headbands. I also remember watching him win the Australian Open in 1995.

In 2012, I traveled to Melbourne for a Drupal event, the same week the Australian Open was going on. As a tennis fan, I was lucky enough to watch Belgium's Kim Clijsters play.

Last weekend, the Australian Open wrapped up. This year, their website, https://ausopen.com, ran on Acquia and Drupal, delivered by the team at Avanade.

In a two-week timeframe, the site successfully welcomed tens of millions of visitors and served hundreds of millions of page views.

I'm very proud of the fact that many of the world's largest sporting events and media organizations (such as NBC Sports who host the Super Bowl and Olympics in the US) trust Acquia and Drupal as their chosen digital platform.

When the world is watching an event, there is no room for error!

A photo of the team that worked on the Australian Open website in 2019
Team Tennis Australia, Acquia and Avanade after the men's singles final.

Many thanks to the round-the-clock efforts from Acquia's team in Asia Pacific, as well as our partners at Avanade!

Acquia retrospective 2018

Every year, I sit down to write my annual Acquia retrospective. It's a rewarding exercise, because it allows me to reflect on how much progress Acquia has made in the past 12 months.

Overall, Acquia had an excellent 2018. I believe we are a much stronger company than we were a year ago; not only because of our financial results, but because of our commitment to strengthen our product and engineering teams.

If you'd like to read my previous retrospectives, they can be found here: 2017, 2016, 2015, 2014, 2013, 2012, 2011, 2010, 2009. This year marks the publishing of my tenth retrospective. When read together, these posts provide a comprehensive overview of Acquia's growth and trajectory.

Updating our brand

Exiting 2017, we doubled down on our transition from website management to digital experience management. In 2018, we updated our product positioning and brand narrative to reflect this change. This included a new Acquia Experience Platform diagram:

An image of the Acquia Platform in 2018
The Acquia Platform is divided into two key parts: the Experience Factory and the Marketing Hub. Drupal and Acquia Lightning power every side of the experience. The Acquia Platform supports our customers throughout the entire life cycle of a digital experience — from building to operating and optimizing digital experiences.

In 2018, the Acquia marketing team also worked hard to update Acquia's brand. The result is a refreshed look and updated brand positioning that better reflects our vision, culture, and the value we offer our customers. This included updating our tagline to read: Experience Digital Freedom.

I think Acquia's updated brand looks great, and it's been exciting to see it come to life. From highway billboards to Acquia Engage in Austin, our updated brand has been very well received.

A photo of an Acquia billboard at the Austin airport
When Acquia Engage attendees arrived at the Austin-Bergstrom International Airport for Acquia Engage 2018, they were greeted by an Acquia display.

Business momentum

This year, Acquia surpassed $200 million in annualized revenue. Overall new subscription bookings grew 33 percent year over year, and we ended the year with nearly 900 employees.

Mike Sullivan completed his first year as Acquia's CEO, and demonstrated a strong focus on improving Acquia's business fundamentals across operational efficiency, gross margins and cost optimization. The results have been tangible, as Acquia has realized unprecedented financial growth in 2018:

  • Channel-partner bookings grew 52 percent
  • EMEA-based bookings grew 103 percent
  • Gross profit grew 39 percent
  • Adjusted EBITDA grew 78 percent
  • Free cash flow grew 84 percent
A photo that summarizes Acquia's 2018 business results mentioned throughout this blog post
2018 was a record year for Acquia. Year-over-year highlights include new subscription bookings, EMEA-based bookings, free cash flow, and more.

International growth and expansion

In 2018, Acquia also witnessed unprecedented success in Europe and Asia, as new bookings in EMEA were up more than 100 percent. This included expanding our European headquarters to a new and larger space with a ribbon-cutting ceremony with the mayor of Reading in the U.K.

Acquia also expanded its presence in Asia, and opened Tokyo-based operations in 2018. Over the past few years I visited Japan twice, and I'm excited for the opportunities that doing business in Japan offers.

We selected Pune as the location for our new India office, and we are in the process of hiring our first Pune-based engineers.

Acquia now has four offices in the Asia Pacific region serving customers like Astellas Pharmaceuticals, Muji, Mediacorp, and Brisbane City Council.

A photo of an Acquia marketing one-pager in Japanese
Acquia product information, translated into Japanese.

Acquia Engage

In 2018, we welcomed more than 650 attendees to Austin, Texas, for our annual customer conference, Acquia Engage. In June, we also held our first Acquia Engage Europe and welcomed 300 attendees.

Our Engage conferences included presentations from customers like Paychex, NBC Sports, Wendy's, West Corporation, General Electric, Charles Schwab, Pac-12 Networks, Blue Cross Blue Shield, Bayer, Virgin Sport, and more. We also featured keynote presentations from our partner network, including VMLY&R, Accenture Interactive, IBM iX and MRM//McCann.

Both customers and partners continue to be the most important driver of Acquia's product strategy, and it's always rewarding to hear about this success first hand. In fact, 2018 customer satisfaction levels remain extremely high at 94 percent.

Partner program

Finally, Acquia's partner network continues to become more sophisticated. In the second half of 2018, we right-sized our partner community from 2,270 firms to 226. This was a bold move, but our goal was to place a renewed focus on the partners who were both committed to Acquia and highly capable. As a result, we saw almost 52 percent year-over-year growth in partner-sourced ACV bookings. This is meaningful because for every $1 Acquia books in collaboration with a partner, our partner makes about $5 in services revenue.

Analyst recognition

In 2018, the top industry analysts published very positive reviews about Acquia. I'm proud that Acquia was recognized by Forrester Research as the leader for strategy and vision in The Forrester Wave: Web Content Management Systems, Q4 2018. Acquia was also named a leader in the 2018 Gartner Magic Quadrant for Web Content Management, marking our placement as a leader for the fifth year in a row.

Product milestones

An image showing a timeline of Acquia's product history and evolution
Acquia's product evolution between 2008 and 2018. When Acquia was founded, our mission was to provide commercial support for Drupal and to be the "Red Hat for Drupal"; a decade later, the Acquia Platform helps organizations build, operate and optimize Drupal-based experiences.

2018 was one of the busiest years I have experienced; it was full of non-stop action every day. My biggest focus was working with Acquia's product and engineering team. We focused on growing and improving our R&D organization, modernizing Acquia Cloud, becoming user-experience first, redesigning the Acquia Lift user experience, working on headless Drupal, making Drupal easier to use, and expanding our commerce strategy.

Hiring, hiring, hiring

In partnership with Mike, we decided to increase the capacity of our research and development team by 60 percent. At the close of 2018, we were able to increase the capacity of our research and development team by 45 percent. We will continue to invest in growing our R&D team in 2019.

I spent a lot of time restructuring, improving and scaling the product organization to make sure we could handle the increased capacity and build out a world-class R&D organization.

As the year progressed, R&D capacity came online and our ability to innovate not only improved but accelerated significantly. We entered 2019 in a much better position as we now have a lot more capacity to innovate.

Acquia Cloud

Acquia Cloud and Acquia Cloud Site Factory support some of the largest and most mission-critical websites in the world. The scope and complexity that Acquia Cloud and Acquia Cloud Site Factory manages is enormous. We easily deliver more than 30 billion page views a month (excluding CDN).

Over the course of 10 years, the Acquia Cloud codebase had grown very large. Updating, testing and launching new releases took a long time because we had one large, monolithic codebase. This was something we needed to change in order to add new features faster.

Over the course of 2018, the engineering team broke the monolithic codebase down into discrete components that can be tested and released independently. We launched our component-based architecture in June. Since then, the engineering team has released changes to production 650 times, compared to our historic pace of doing one release per quarter.

A graph that shows how we moved Acquia Cloud from a monolithic codebase to a component-based code base.
This graph shows how we moved Acquia Cloud from a monolithic code base to a component-based code base. Each color on the graph represents a component. The graph shows how releases of Acquia Cloud (and the individual components in particular) have accelerated in the second half of the year.

Planning and designing for all of these services took a lot of time and focus, and was a large priority for the entire engineering team (including me). The fruits of these efforts will start to become more publicly visible in 2019. I'm excited to share more with you in future blog posts.

Acquia Cloud also remains the most secure and compliant cloud for Drupal. As we were componentizing the Acquia Cloud platform, the requirements to maintain our FedRAMP compliance became much more stringent. In April, the GDPR deadline was also nearing. Executing on hundreds of FedRAMP- and GDPR-related tasks emerged as another critical priority for many of our product and engineering teams. I'm proud that the team succeeded in accomplishing this amid all the other changes we were making.

Customer experience first

Over the years, I've felt Acquia lacked a focus on user experience (UX) for both developers and marketers. As a result, increasing the capacity of our R&D team included doubling the size of the UX team.

We've stepped up our UX research to better understand the needs and challenges of those who use Acquia products. We've begun to employ design-first methodologies, such as design sprints and a lean-UX approach. We've also created roles for customer experience designers, so that we're looking at the full customer journey rather than just our product interfaces.

With the extra capacity and data-driven changes in place, we've been working hard on updating the user experience for the entire Acquia Experience Platform. For example, you can see a preview of our new Acquia Lift product in this video, which has an increased focus on UX:

Drupal

In 2018, Drupal 8 adoption kept growing and Drupal also saw an increase in the number of community contributions and contributors, both from individuals and from organizations.

Acquia remains very committed to Drupal, and was the largest contributor to the project in 2018. We now have more than 15 employees who contribute to Drupal full-time, in addition to many others that contribute periodically. In 2018, the Drupal team's main areas of focus have been Layout Builder and the API-first initiative:

  • Layout Builder: Layout Builder offers content authors an easy-to-use page building experience. It's shaping up to be one of the most useful and pervasive features ever added to Drupal because it redefines how editors control the appearance of their content without having to rely on a developer.
  • API First: This initiative has given Drupal a true best-in-class web services API for using Drupal as a headless content management system. Headless Drupal is one of the fastest growing segments of Drupal implementations.
A photo of Acquia engineers, designers and product managers at Acquia Build Week 2018
Our R&D team gathered in Boston for our annual Build Week in June 2018.

Content and Commerce

Adobe's acquisition of Magento has been very positive for us; we're now the largest commerce-agnostic content management company to partner with. As a result, we decided to extend our investments in headless commerce and set up partnerships with Elastic Path and BigCommerce. The momentum we've seen from these partnerships in a short amount of time is promising for 2019.

The market continues to move in Acquia's direction

In 2019, I believe Acquia will continue to be positioned for long-term growth. Here are a few reasons why:

  • The current markets for content and digital experience management continue to grow rapidly, at approximately 20 percent per year.
  • Digital transformation is top-of-mind for all organizations, and impacts all elements of their business and value chain.
  • Open source adoption continues to grow at a furious pace and has seen tremendous business success in 2018.
  • Cloud adoption continues to grow. Unlike most of our CMS competitors, Acquia was born in the cloud.
  • Drupal and Acquia are leaders in headless and decoupled content management, which is a fast growing segment of our market.
  • Conversational interfaces and augmented reality continue to grow, and we embraced these channels a few years ago. Acquia Labs, our research and innovation lab, explored how organizations can use conversational UIs to develop beyond-the-browser experiences, like cooking with Alexa, and voice-enabled search for customers like Purina.

Although we hold a leadership position in our market, our relative market share is small. These trends mean that we should have plenty of opportunity to grow in 2019 and beyond.

Thank you

While 2018 was an incredibly busy year, it was also very rewarding. I have a strong sense of gratitude, and admire every Acquian's relentless determination and commitment to improve. As always, none of these results and milestones would be possible without the hard work of the Acquia team, our customers, partners, the Drupal community, and our many friends.

I've always been pretty transparent about our trajectory (e.g. Acquia 2009 roadmap and Acquia 2017 strategy) and will continue to do so in 2019. We have some big plans for 2019, and I'm excited to share them with you. If you want to get notified about what we have in store, you can subscribe to my blog at https://dri.es/subscribe.

Thank you for your support in 2018!

European Commission will start offering bug bounties for Open Source software

An image of a shield with the Drupal mascot

The European Commission made an exciting announcement; it will be awarding bug bounties to the security teams of Open Source software projects that the European Commission relies on.

If you are not familiar with the term, a bug bounty is a monetary prize awarded to people who discover and correctly report security issues.

Julia Reda — an internet activist, Member of the European Parliament (MEP) and co-founder of the Free and Open Source Software Audit (FOSSA) project — wrote the following on her blog:

Like many other organizations, institutions like the European Parliament, the Council and the Commission build upon Free Software to run their websites and many other things. But the Internet is not only crucial to our economy and our administration, it is the infrastructure that runs our everyday lives.

With over 150 Drupal sites, the European Commission is a big Drupal user, and has a large internal Drupal community. The European Commission set aside 89,000€ (or roughly $100,000 USD) for a Drupal bug bounty. They worked closely with Drupal's Security Team to set this up. To participate in the Drupal bug bounty, read the guidelines provided by Drupal's Security Team.

Over the years I've had many meetings with the European Commission, presented keynotes at some of its events, and more. During that time, I've seen the European Commission evolve from being hesitant about Open Source to recognizing the many benefits that Open Source provides for its key ICT services, to truly embracing Open Source.

In many ways, the European Commission followed classic Open Source adoption patterns; adoption went from being technology-led (bottom-up or grassroots) to policy-led (top-down and institutionalized), and now the EU is an active participant and contributor.

Today, the European Commission is a shining example and role model for how governments and other large organizations can contribute to Open Source (just like how the White House used to be).

The European Commission is actually investing in Drupal in a variety of ways — the bug bounty is just one example of that — but more about that in a future blog post.
