Dries Buytaert

Automatically exporting my Drupal content to GitHub

This page is part of my digital garden. It is more like a notebook entry than a polished blog post. It's a space where I document learnings primarily for my own reference, yet share them in case they benefit others. Unlike my blog posts, these pages are works-in-progress and updated over time. Like tending to a real garden, I periodically refine its content. I welcome suggestions for improvements at dries@buytaert.net.

This note is mostly for my future self, in case I need to set this up again. I'm sharing it publicly because parts of it might be useful to others, though it's not a complete tutorial since it relies on a custom Drupal module I haven't released.

For context: I switched to Markdown and then open-sourced my blog content by exporting it to GitHub. Every day, my Drupal site exports its content as Markdown files and commits any changes to github.com/dbuytaert/website-content. New posts appear automatically, and so do edits and deletions.

Creating the GitHub repository

Create a new GitHub repository. I called mine website-content.

Giving your server access to GitHub

For your server to push changes to GitHub automatically, you need SSH key authentication.

SSH into your server and generate a new SSH key pair:

ssh-keygen -t ed25519 -f ~/.ssh/github -N ""

This creates two files: ~/.ssh/github (your private key that stays on your server) and ~/.ssh/github.pub (your public key that you share with GitHub).

The -N "" creates the key without a passphrase. For automated scripts on secured servers, keys without a passphrase are standard practice, because nobody is around to type one when the script runs. The security comes from restricting what the key can do (a deploy key with write access to one repository) rather than from a passphrase.
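To see what the command produces without touching ~/.ssh, here is a minimal sketch that generates a throwaway key in a temporary directory. The -q and -C flags and the "export-bot" comment are my additions for the demo, not part of the setup above.

```shell
#!/bin/bash
set -e

# Throwaway directory so the demo never touches ~/.ssh.
KEYDIR=$(mktemp -d)
trap 'rm -rf "$KEYDIR"' EXIT

# Same flags as above, plus -q (quiet) and -C (a comment; "export-bot" is a made-up label).
ssh-keygen -q -t ed25519 -f "$KEYDIR/github" -N "" -C "export-bot"

# Two files: the private key and the .pub public key.
ls -l "$KEYDIR"

# Print the fingerprint of the public key.
ssh-keygen -lf "$KEYDIR/github.pub"

# The private key should be readable and writable by its owner only.
PERMS=$(ls -l "$KEYDIR/github" | cut -c1-10)
PUBTYPE=$(cut -d' ' -f1 "$KEYDIR/github.pub")
echo "private key permissions: $PERMS"
echo "public key type: $PUBTYPE"
```

The fingerprint is handy later: GitHub shows the same value next to the deploy key, which makes it easy to confirm which key a repository trusts.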

Next, tell SSH to use this key when connecting to GitHub:

cat >> ~/.ssh/config << 'EOF'
Host github.com
  IdentityFile ~/.ssh/github
  IdentitiesOnly yes
EOF
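To check which settings SSH would apply for github.com, ssh -G prints the resolved configuration without connecting. The sketch below writes the same three lines to a temporary file and passes it with -F, so it runs without modifying ~/.ssh/config; the temporary file is only there for the demo.

```shell
#!/bin/bash
set -e

# Temporary config file standing in for ~/.ssh/config.
CONF=$(mktemp)
trap 'rm -f "$CONF"' EXIT

cat > "$CONF" << 'EOF'
Host github.com
  IdentityFile ~/.ssh/github
  IdentitiesOnly yes
EOF

# -G prints the effective configuration for the host and exits without connecting.
RESOLVED=$(ssh -G -F "$CONF" github.com | grep -Ei '^(identityfile|identitiesonly) ')
echo "$RESOLVED"
```

On the server itself, ssh -G github.com | grep -i identity does the same against the real configuration.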

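Add GitHub's host keys to your known_hosts file. This prevents SSH from asking "Are you sure you want to continue connecting?" when the script runs. Because ssh-keyscan trusts whatever server answers, it's worth comparing the result against the fingerprints GitHub publishes in its documentation: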

ssh-keyscan github.com >> ~/.ssh/known_hosts

Display your public key so you can copy it:

cat ~/.ssh/github.pub

In GitHub, go to your repository's "Settings", find "Deploy keys" in the sidebar, and click "Add deploy key". Give the key a title, paste the public key you just copied, and check the box for "Allow write access".

Test that everything works:

ssh -T git@github.com

You should see: You've successfully authenticated, but GitHub does not provide shell access.
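One thing to keep in mind if this check ever ends up in a script: as far as I know, ssh -T git@github.com exits with a non-zero status even when authentication succeeds, because GitHub refuses the shell session. Checking for the message is more reliable than checking the exit code. A minimal sketch, where github_auth_ok is a hypothetical helper name:

```shell
#!/bin/bash

# Succeeds when the text on stdin contains GitHub's success message.
github_auth_ok() {
    grep -q "successfully authenticated"
}

# Real usage (needs network access and the deploy key set up above):
#   ssh -T git@github.com 2>&1 | github_auth_ok && echo "deploy key works"

# Offline demonstration using the message quoted above.
SAMPLE="You've successfully authenticated, but GitHub does not provide shell access."
if printf '%s\n' "$SAMPLE" | github_auth_ok; then
    RESULT="ok"
else
    RESULT="failed"
fi
echo "$RESULT"
```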

The export script

I created the following export script:

#!/bin/bash
set -e

TEMP=/tmp/dries-export

# Start clean in case a previous run failed, and clean up on exit
rm -rf "$TEMP"
trap 'rm -rf "$TEMP"' EXIT

# Clone the existing repository
git clone git@github.com:dbuytaert/website-content.git "$TEMP"
cd "$TEMP"

# Remove all top-level directories (.git is kept) so moved/deleted content is tracked
rm -rf */

# Export content that is at least 2 days old
drush node:export --end-date="2 days ago" --destination="$TEMP"

# Commit and push if there are changes
git config user.email "dries+bot@buytaert.net"
git config user.name "Dries Bot"
git add -A
git diff --staged --quiet || {
    git commit -m "Automatic updates for $(date +%Y-%m-%d)"
    git push
}

The drush node:export command comes from a custom Drupal module I built for my site. I have not published the module on Drupal.org because it's specific to my site and not reusable as is. I wrote about why that kind of code is still worth sharing as adaptable modules, and I hope to share this one once Drupal.org has a place for such modules.

The two-day delay (--end-date="2 days ago") gives me time to catch typos before posts are archived to GitHub. I usually find them right after hitting publish.
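To make the cutoff concrete, this sketch prints the calendar date that "2 days ago" resolves to. It assumes GNU date (Linux); BSD and macOS date use different flags. How my module interprets --end-date is not shown here, so this only illustrates the relative date itself.

```shell
#!/bin/bash

# Assumes GNU date; on BSD/macOS the equivalent would be: date -v-2d +%Y-%m-%d
CUTOFF=$(date -d "2 days ago" +%Y-%m-%d)
TODAY=$(date +%Y-%m-%d)

echo "Today:  $TODAY"
echo "Cutoff: $CUTOFF (anything newer waits for a later run)"
```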

The git add -A stages everything including deletions, so if I remove a post from my site, it disappears from GitHub too (though Git's history preserves it).
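The wipe, re-export, and git add -A cycle can be tried without GitHub or Drupal. In this sketch a local bare repository stands in for the remote, and a hypothetical sync_export function writes a few Markdown files in place of drush node:export. The names, paths, and bot identity are all made up for the demo.

```shell
#!/bin/bash
set -e

WORK=$(mktemp -d)
trap 'rm -rf "$WORK"' EXIT

# A local bare repository stands in for the GitHub remote.
git init --quiet --bare "$WORK/remote.git"
git clone --quiet "$WORK/remote.git" "$WORK/export" 2>/dev/null
cd "$WORK/export"
git config user.email "bot@example.com"
git config user.name "Example Bot"

# Same steps as the export script; the arguments are the posts to "export".
sync_export() {
    rm -rf */                      # wipe directories, .git survives
    mkdir -p blog
    for post in "$@"; do
        echo "# $post" > "blog/$post.md"
    done
    git add -A                     # stages additions, edits, and deletions
    git diff --staged --quiet || { # commit only when something changed
        git commit --quiet -m "Automatic updates"
        git push --quiet origin HEAD
    }
}

sync_export first second   # initial export: two posts
sync_export first          # "second" was deleted on the site
sync_export first          # nothing changed, so no new commit

COMMITS=$(git rev-list --count HEAD)
FILES=$(git ls-files)
echo "commits: $COMMITS"
echo "tracked files: $FILES"

# The deleted post is still available from history.
git show HEAD~1:blog/second.md
```

The third run produces no commit because git diff --staged --quiet exits with status 0 when nothing is staged, so the block after || is skipped.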

Scheduling the export

On a traditional server, you'd add this script to cron so it runs daily. My site runs on Acquia Cloud, which is Kubernetes-based and automatically scales pods up and down based on traffic. This means there is no single server to put a crontab on. Instead, Acquia Cloud provides a scheduler that runs jobs reliably across the infrastructure.
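For the traditional-server case, a crontab entry could look like the line this sketch prints. The script path, log path, and the 03:15 run time are assumptions for illustration, not what my site uses.

```shell
#!/bin/bash

# Hypothetical locations for the export script and its log file.
SCRIPT="$HOME/bin/export-content.sh"
LOG="$HOME/export-content.log"

# Fields: minute hour day-of-month month day-of-week command
CRON_LINE="15 3 * * * $SCRIPT >> $LOG 2>&1"

# Print the entry; to install it, paste it into the editor opened by: crontab -e
printf '%s\n' "$CRON_LINE"
```

Redirecting both output streams to a log file keeps a record of failed runs, which matters because the export script stops at the first error.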

And yes, this note about automatically backing up my content will itself be automatically backed up.