Moving my Blog from Wordpress to Github Pages

Written on February 10, 2021 (5 min read)

Modified on Jan 5, 2022

While I was still working as a Developer Advocate at IBM, I maintained a blog on Wordpress.com. Now that I have retired, I don’t blog much, so I decided to let the Wordpress subscription expire at the end of this year, 2021. But I didn’t want to throw away everything I had written, so I started to play with Github Pages, Jekyll, and other tools. As you can see, I have now successfully moved my blog to Github Pages.

[Image: Moving. Image by Peggy und Marco Lachmann-Anke on Pixabay]

I have used Github Pages before to write workshop instructions, but always with one of the built-in Github themes. Those don’t work well for blogs. There are many other Jekyll-based themes that can be used with Github Pages and that do work for blogs.

1. Prepare a Github repository

First of all, you need a public Github repository named yourgithubusername.github.io.

If the first part of the repository name doesn’t exactly match your username, it won’t work, so make sure to get it right.

The full URL of my repository is https://github.com/haralduebele/haralduebele.github.io and Github Pages will serve its content on https://haralduebele.github.io. This is called a user or organisation site.

2. Select a theme for Github Pages

The one I selected is called “Reverie”. I tried it, liked it, modified it and that is what you are looking at right now. The README has great setup instructions.

You need to modify _config.yml, too, before you can see something meaningful.

An important and not too obvious change is the permalink:

permalink: /:year/:month/:day/:title/

This reproduces the URL format that Wordpress.com uses for blog posts.
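To illustrate what the pattern produces, here is a small Python sketch. It is not how Jekyll is implemented; it only mimics the mapping from a post file name to a URL path for this one permalink setting. The file name in the example is hypothetical.

```python
import re
from pathlib import PurePosixPath

# Jekyll post file names follow yyyy-mm-dd-title.md
POST_NAME = re.compile(r"^(\d{4})-(\d{2})-(\d{2})-(.+)\.(?:md|markdown)$")


def permalink_for(post_path: str) -> str:
    """Return the URL path that /:year/:month/:day/:title/ yields for a post file."""
    name = PurePosixPath(post_path).name
    match = POST_NAME.match(name)
    if match is None:
        raise ValueError(f"not a Jekyll post file name: {name}")
    year, month, day, title = match.groups()
    return f"/{year}/{month}/{day}/{title}/"


# Hypothetical post file, used only for illustration
print(permalink_for("_posts/2020/2020-06-02-installing-knative.md"))
```

The date and the title part of the URL both come from the file name, which is why the naming of the post files matters later on.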

Once you commit and push your changes, it will take a moment and then you can view your new site.

3. Pack your crates

You can export your content on Wordpress.com under ‘Tools’ - ‘Export’.

I chose to export all content and also the media library:

[Image: wp export]

The exported content is a ZIP file containing an XML document. The exported media library is a TAR file that contains the images etc., sorted into folders by year and month.

What do you do with the huge Wordpress XML? Somebody (Will Boyd, lonekorean) already thought of that:

4. Convert Wordpress XML to Markdown

I found a pretty good tool here.

Using it is pretty straightforward if you follow the instructions in the repository. It requires Node.js 12.14 or later.

Unpack the Wordpress XML from the ZIP file into the root of this repository, run the script with node index.js, and answer the questions.

I had it create folders for years and months. Output looks something like this:

[Image: wp convert]

index.md is the actual post. If there is an images folder, it will contain all the images the tool was able to grab or scrape from the XML.

5. Complete the conversion

“wordpress-export-to-markdown” does a pretty good job, but you still need to move files and manually touch up the blog posts.

a. File names

In Jekyll, and therefore in Reverie, blog entries go into the _posts directory. They must follow a specific naming scheme: yyyy-mm-dd-name.md. The conversion tool uses a name like this for the folders but not for the actual md files, which are all called index.md. So you need to rename the files before you copy them over to the _posts directory. I have created year directories under _posts to make them a little easier to organize.

[Image: posts directory]
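The renaming can be done by hand, but with many posts a script helps. The following Python sketch is only an illustration under a few assumptions: the converter output lives in a folder (called output in the usage note), every post folder is named yyyy-mm-dd-name, and the targets go into year directories under _posts. The function name collect_posts is made up for this example.

```python
import re
import shutil
from pathlib import Path

# Assumption: the converter names each post folder yyyy-mm-dd-name
DATED_FOLDER = re.compile(r"^(\d{4})-\d{2}-\d{2}-.+$")


def collect_posts(export_dir: Path, posts_dir: Path) -> list[Path]:
    """Copy every <dated folder>/index.md to posts_dir/<year>/<dated folder>.md."""
    copied = []
    for index_file in sorted(export_dir.rglob("index.md")):
        match = DATED_FOLDER.match(index_file.parent.name)
        if match is None:
            continue  # skip folders that do not follow the dated naming
        target = posts_dir / match.group(1) / f"{index_file.parent.name}.md"
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(index_file, target)  # copy, so the export stays untouched
        copied.append(target)
    return copied
```

A call such as collect_posts(Path("output"), Path("_posts")) would then return the list of files it created. The script copies instead of moving, so you can rerun it if something goes wrong.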

b. Images

The images from the images folders go to the images folder in your new repo. I created year folders, with month folders under them, to keep it manageable. I believe the XML file didn’t reference all images when I exported and converted, but the media export should contain all of them.

[Image: images directory]

c. Frontmatter

The exported index.md files contain frontmatter pulled from Wordpress:

---
title: "Serverless and Knative - Part 1: Installing Knative on CodeReady Containers"
date: "2020-06-02"
tags: 
  - "knative"
  - "kubernetes"
  - "serverless"
---

But you need to add some more. This is what I usually have there, for example:

---
layout: post
title: "Serverless and Knative - Part 1: Installing Knative on CodeReady Containers"
date: "2020-06-02"
categories: [Knative,Kubernetes,Serverless]
published: false
---

You need “layout: post”. You can also add “categories”, which show up in the post and let you display all your blog entries sorted by category.

Change published: false to published: true to make the post visible.
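If you have many posts, the frontmatter change can be scripted. This Python sketch is a rough illustration, not a YAML parser: it assumes the simple layout shown above (a tags: line followed by quoted list items) and would need adjusting for anything more complex. The function name upgrade_front_matter is made up for this example.

```python
def upgrade_front_matter(text: str) -> str:
    """Add layout/published and turn the exported tags list into categories."""
    lines = text.split("\n")
    if not lines or lines[0].strip() != "---":
        return text  # no frontmatter, leave the file alone
    try:
        end = next(i for i in range(1, len(lines)) if lines[i].strip() == "---")
    except StopIteration:
        return text  # frontmatter never closes, leave the file alone
    kept, categories, in_tags = [], [], False
    for line in lines[1:end]:
        stripped = line.strip()
        if stripped.startswith("tags:"):
            in_tags = True  # drop the tags key, collect its items instead
        elif in_tags and stripped.startswith("- "):
            # capitalize() also lowercases the rest of the word
            categories.append(stripped[2:].strip().strip("\"'").capitalize())
        else:
            in_tags = False
            kept.append(line)
    front = ["---", "layout: post", *kept]
    if categories:
        front.append(f"categories: [{','.join(categories)}]")
    front += ["published: false", "---"]
    return "\n".join(front + lines[end + 1:])
```

The posts stay unpublished until you have reviewed them and flipped the flag yourself.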

Image links should look like this:

![description](/images/yyyy/mm/imagename.ext)

This assumes that you also sort your images into years and months folders.
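Rewriting the links can be scripted as well. The sketch below assumes that the converted posts reference their pictures with relative links such as images/imagename.ext, so check a few of your own files first. The function name rewrite_image_links is made up for this example.

```python
import re

# Assumption: converted posts use relative targets like images/name.ext
IMAGE_LINK = re.compile(r"!\[([^\]]*)\]\((?:\./)?images/([^)\s]+)\)")


def rewrite_image_links(markdown: str, year: str, month: str) -> str:
    """Point relative image links at /images/<year>/<month>/ in the site repo."""
    def replace(match: re.Match) -> str:
        description, filename = match.group(1), match.group(2)
        return f"![{description}](/images/{year}/{month}/{filename})"

    return IMAGE_LINK.sub(replace, markdown)


# Hypothetical input line, used only for illustration
print(rewrite_image_links("![description](images/imagename.ext)", "2020", "06"))
```

Links that already start with /images/ do not match the pattern and are left alone, so running the function twice does no harm.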

On Wordpress I sometimes used subtitles under images. In the converted blog entries, the subtitles are plain text, which doesn’t look good. I add a style line below the text like this:

The text is then centered, smaller, and in italics
{:style="color:gray;font-style:italic;font-size:90%;text-align:center;"}

Github Markdown cannot open a link in a new tab on its own. But you can simply append the HTML attribute ({:target="_blank"}) to the link:

[Link Text](https://url){:target="_blank"}

f. Syntax highlighting

The Reverie theme uses Pygments/Dracula to highlight code in preformatted sections. I found this to be helpful, especially with quoted YAML.

```sh
$ this would be shell commands
```

```yaml
and:
  this:
    - would:
        be: yaml
```

g. Escape characters

Look out for backslashes (\) in the converted posts and remove them; they are not needed.
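A cautious way to automate this is to strip the backslashes only outside fenced code blocks, because code may contain backslashes that matter. The Python sketch below does that. Which characters the converter actually escapes is an assumption here, so review the result (for example with git diff) before committing. The function name strip_escapes is made up for this example.

```python
import re

# A backslash followed by a Markdown punctuation character (assumed set)
ESCAPED = re.compile(r"\\([\\`*_{}\[\]()#+\-.!~>|])")


def strip_escapes(markdown: str) -> str:
    """Drop escape backslashes outside fenced code blocks."""
    result, in_fence = [], False
    for line in markdown.split("\n"):
        if line.lstrip().startswith("```"):
            in_fence = not in_fence  # entering or leaving a fenced block
            result.append(line)
        elif in_fence:
            result.append(line)  # code may contain meaningful backslashes
        else:
            result.append(ESCAPED.sub(r"\1", line))
    return "\n".join(result)
```

Note that removing an escape can change how a line renders, for instance a line starting with 1\. becomes a numbered list item, so a quick look at the rendered post is still worthwhile.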

6. Changes to the Theme

I made modifications to the theme, e.g. I changed the font family in style.scss to IBM Plex because that is my favorite font.

I added “read time” to my posts based on this article.

Instead of the search page that is part of the Reverie theme, I created an archive page that lists all my blog posts sorted by year. This is based on Rafa Garrido’s answer to this Stack Overflow question.

And some more stuff … you can go over the top once you have figured out how Jekyll works.

Update: Comments section

Github Pages uses Jekyll to create a static site. This means you can’t include the server-side logic that would be needed to add comments.

I looked at Disqus; the Reverie theme I use is enabled for it. But it is an external service, and pages with Disqus added seem to get very heavy and heavily tracked, too.

I read about the idea of using Github Issues to store the comments. I like this idea and looked at several examples. Then I found utterances. It is a Github App that you install in your repository. You do a little configuration and add a piece of code to post.html. That’s it. It just works. And it’s open source, too. Data is stored as Github Issues, and there is no tracking and no ads. So this is what you see below.

