V7: The Great Data Migration

Bringing it all home

I’ve done a lot of work on the site in the last two months, and a launch date, while still a ways off, is finally coming into focus. I’ve been working on this redesign very intermittently for over four years now, but at this point I expect to keep at it until it’s done, with as little interruption as possible.

Among other recent advances, I’ve moved the site from Jekyll to Eleventy, chosen a font family, and designed and built out the front end for several core components and templates, all of which I’ll hopefully discuss in more depth soon. But today’s milestone is about content: At long last, I’ve finished reformatting all remaining external data! Decades of social media posts and other assorted falderal are all neatly packaged into thousands of Markdown files, ready to be published all in one place for the first time. It wasn’t so long ago that I had no idea how I was going to accomplish this, so I’m pretty stoked.

All the reformatting was done locally with Node.js, using my established metadata structure and site map as a guide for the finished files. I did my best to avoid replicated content, so for anything that was cross-posted between, say, Twitter and Instagram, the primary post was kept and the dupes removed. The structure and quality of the various platforms’ exported data varied quite a bit, so I had to create unique processes for reformatting each one (although there was a lot of code they shared). Some notes about how it went down:

  • Dribbble: Downloading your Dribbble data gets you a single JSON file and no images, which is disappointing to say the least. The data does include the URLs for the images, and if I had been a heavier Dribbble user, I would have used this opportunity to finally become proficient at web scraping. But since there were only a few dozen images, I downloaded them manually. The data export also didn’t include anything about rebounds (which are essentially threads), but that was easy enough to clean up by hand too. Otherwise this one was pretty straightforward.
  • Flickr: Rich data, easy to work with, and included everything I needed except the image dimensions, which I was able to get easily with the image-size Node module. Flickr’s export gives you the original high-res media you uploaded, and while I expect eleventy-img to handle processing for my static images, I’ll probably have to compress the videos myself using Adobe Media Encoder or some such. There are only about 50 of them and they’re all short, so that shouldn’t be a big deal.
  • Goodreads: Lots of data I didn’t really need (around preferences, followers, etc.), and the stuff I did need was missing some core things, like authors and publication dates. As with Dribbble, I didn’t use Goodreads that much, so I was able to handle that stuff (and download book cover images) manually.
  • Google Reader: I stumbled on this data last year, a decade after I downloaded it just before Reader shut down, and realized I could turn it into link posts on my site. Of the close to 1,000 things I shared on Reader, I decided only to migrate the 224 I added notes to, and the data was easy to work with. My notes abruptly stop in October of 2011, nearly two years before Reader’s demise, so I have to wonder if a bunch of stuff is missing, but if so, I don’t suppose there’s much to be done about it at this point. Thanks to some sites’ use of feed proxies and redirects, a lot of the links I shared with Reader are now broken, which is a bummer.
  • Instagram: Ugh. Death by a thousand paper cuts with this one. I had the option to get my data in JSON and/or HTML, and each one contained information the other didn’t, so I needed both of them to get it all (which still wasn’t everything). Inconsistent formatting between IGTV videos, reels, posts, and stories; convoluted Unicode entities I couldn’t decode; tagged users omitted from posts with multiple images/videos; NO FUCKING PERMALINKS?! I had to jump through so many hoops to get everything to a decent place. The only other data source I had to spend more time with was Twitter, and that was only because that was my first Node project. At least this experience was consistent with my extremely low opinion of Meta and Instagram.
  • iTunes: The desktop app formerly known as iTunes has suffered greatly in many ways in the nine years since Apple Music started up, especially for people like me who still maintain a local music library, but luckily you can still easily export a very detailed XML file of all your data. For me, that data goes back to the very beginning of iTunes in 2001, and in 2004 I finally ripped all my CDs to MP3. This means I have reliable data about when albums were originally added to my music library from 2005 on, which I’m happy to be able to put on my site. The process got a little messy (especially when non-ASCII characters were for some reason encoded differently in directory names than they were in the MP3s’ ID3 tags), but went fairly quickly, and generating JPGs from Base64 album cover data embedded in the MP3s (using jsmediatags) was especially satisfying.
  • Letterboxd: I took an initial stab at reformatting my Letterboxd film diary with Python a while back, but that was before I decided to include directors and film posters. Letterboxd’s data export is great, but it doesn’t include directors or posters, so I had my friend Jon help me use the TMDB API to get them, which we were able to do over a weekend.
  • Twitter: Twitter’s data export is pretty fantastic (or at least it used to be—Elon has probably ruined it by now), with two exceptions: it doesn’t give you the highest quality versions of your media files, and it doesn’t include alt text. Some blessed soul wrote a Python script I’ve since lost track of that gets the media files for you, which I used right after I quit Twitter, so that was great. As for getting the alt text, there’s surely a better method than mine, but I filtered my timeline by media, went through every post since the platform added alt text capabilities in March of 2016, and copied and pasted. (All told, I had 164 images with alt text.) My final files only include tweets that aren’t retweets and aren’t part of a conversation.
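
Two of the iTunes steps above can be sketched with nothing but Node's standard library. This is an illustration, not the code used for this migration, and it does not call jsmediatags; it starts from a Base64 string like the one the post describes. The mismatch between directory names and ID3 tags is treated here as a Unicode normalization difference (composed vs. decomposed characters), which is one common cause but only an assumption about what happened in this case. The function names and sample values are hypothetical.

```javascript
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// Compare two names after normalizing both to NFC, so that "o" followed by a
// combining diaeresis matches the single precomposed character.
// Assumption: the mismatch is a normalization difference, nothing else.
function sameName(a, b) {
  return a.normalize('NFC') === b.normalize('NFC');
}

// Decode a Base64 string and write the raw bytes to outPath.
// Returns the number of bytes written.
function writeCover(base64, outPath) {
  const bytes = Buffer.from(base64, 'base64');
  fs.writeFileSync(outPath, bytes);
  return bytes.length;
}

// Made-up example: the same artist name in decomposed and composed form.
const decomposed = 'Bjo\u0308rk'; // "o" + U+0308 combining diaeresis
const composed = 'Bj\u00f6rk'; // precomposed U+00F6
console.log(decomposed === composed); // false
console.log(sameName(decomposed, composed)); // true

// Placeholder data: only the first four bytes of a JPEG (FF D8 FF E0),
// not a viewable image. Real cover data would be much longer.
const fakeCover = '/9j/4A==';
const outDir = fs.mkdtempSync(path.join(os.tmpdir(), 'covers-'));
const written = writeCover(fakeCover, path.join(outDir, 'cover.jpg'));
console.log(written); // 4
```

Normalizing both sides before comparing leaves the files on disk untouched, which keeps the fix confined to the matching step.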
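
The Twitter filter described above (no retweets, nothing from a conversation) could be sketched in Node.js roughly as follows. This is not the code used for this migration. It assumes each archive entry has the shape `{ tweet: { id_str, full_text, in_reply_to_status_id_str } }`, that retweets can be recognized by a leading `RT @`, and that "part of a conversation" covers both replies and the first tweet of a thread. The function names and sample data are made up for illustration.

```javascript
// Assumption: retweets in the export begin with "RT @username:".
function isRetweet(tweet) {
  return /^RT @\w+:/.test(tweet.full_text);
}

// Assumption: a reply carries the ID of the tweet it answers.
function isReply(tweet) {
  return Boolean(tweet.in_reply_to_status_id_str);
}

// Keep only tweets that are not retweets, not replies, and not the start of
// a thread (that is, no other tweet in the archive replies to them).
function standaloneTweets(entries) {
  const tweets = entries.map((entry) => entry.tweet);
  const repliedTo = new Set(
    tweets.map((t) => t.in_reply_to_status_id_str).filter(Boolean)
  );
  return tweets.filter(
    (t) => !isRetweet(t) && !isReply(t) && !repliedTo.has(t.id_str)
  );
}

// Hypothetical sample entries, not real archive data.
const sample = [
  { tweet: { id_str: '1', full_text: 'Hello world' } },
  { tweet: { id_str: '2', full_text: 'RT @someone: nice' } },
  { tweet: { id_str: '3', full_text: '@friend thanks!', in_reply_to_status_id_str: '999' } },
  { tweet: { id_str: '4', full_text: 'Thread start' } },
  { tweet: { id_str: '5', full_text: 'Thread continues', in_reply_to_status_id_str: '4' } },
];

console.log(standaloneTweets(sample).map((t) => t.id_str)); // [ '1' ]
```

Checking which tweets are replied to, and not only which ones are replies, is what drops the opening tweet of a self-thread along with the rest of it.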

All posts in this series

V7: Introduction

Redesigning my site in public

Welcome to RobWeychert.com V7! There are a number of new things I want to try with my site, from structure to aesthetics to code, and so it’s time to begin a fresh redesign. Inspired by my friends Jonnie and Frank, I’ve decided to do it in public from the ground up. I’m starting with bare-bones HTML and as the design process unfolds, each step will be reflected on the site in real time and documented… See more →

V7: The “viewport” meta tag

Apparently it is still necessary!

The first thing I did when setting up this new version of my site was to put together some minimum viable HTML templates. Here’s the blog post template:

<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="utf-8" />
    <title><!--POST TITLE--> | RobWeychert.com V7</title>
    <meta name="description" content="<!--POST DESCRIPTION-->" />
    <link rel="alternate" type="application/rss+xml" title="RobWeychert.com V7" href="/index.rss"/>
  </head>

  <body>
    <… See more →

V7: Content priorities

Making my projects more visible

I added a tiny bit of CSS to aid readability by keeping line lengths in check on larger viewports:

body {
  margin: 0 auto;
  max-width: 75ch;
  padding: 1rem;
}

When calling the CSS file from the page head, I include a query string based on today’s date, which I’ll update when the CSS is updated. This will let updates get past the browser’s cache.

<link rel="stylesheet" href="/assets/css/main.css?20200108" />

Hopefully this small stylistic addition will keep things tidy enough until I properly begin the visual… See more →

V7: Structural challenges

The ambitious scope of the timeline section

Most of this redesign’s structural challenges pertain to the timeline section, previously described thusly:

  • Timeline: The blog on the current version of my site, V6, collects most of what I’ve written for public consumption since 2001 across nearly 40 different sources. I’d like to expand that to include even more sources and content types, collecting virtually everything I’ve shared online in one sprawling, sortable/filterable timeline.

Since the projects section is a higher priority and the new… See more →

V7: Timeline section inventory

Untangling the content

Progress on the redesign has slowed, partly because I’ve been busy with other things, and partly because, frankly, the open questions about the timeline section enumerated in my previous post are an intimidating mess, a perfect example of the early stages of the Design Squiggle.

In a fight or flight situation like this, here are the arguments for flight:

  • “Uh, the timeline isn’t even your top priority for the site, remember? What’s more important: working on… See more →

V7: The timeline is taking shape

Making progress with sketches, wireframes, and a prototype

Though it’s mostly taken place in scattered, stolen moments, I’ve made a lot of progress on the UX of the timeline section, much of which was still a disconcerting mystery not so long ago.

With the help of the data categories and content inventory I established in the previous post, I’ve settled on a binary timeline concept: each post is either small or large. Small posts consist of up to 100 words and/or up to… See more →

V7: On dependency

How I incorporate other people’s work into my own—and how I don’t

I might have expected quarantine life to be a boon to my site’s redesign process since most of my preferred social distractions were nullified. Instead, I’ve been using the time in isolation to make music videos, finalize a home purchase, move into said home, and try to find my place in our national reckoning on racism and public safety reform. But as I slowly shift some of my attention back to the redesign, I’ve been… See more →

V7: Choosing a CMS

Do my new content requirements need a new content management system?

For a while, I had basically resigned myself to the idea that the massive amount of stray content I’m planning to bring home (thousands of tweets, Flickr photos, etc.) would necessitate moving my site onto a LAMP stack CMS. I started poking around in WordPress, which I hadn’t touched in years, and Craft, which I use regularly in my work at ProPublica. The former felt bloated and the latter’s setup presumed a level of back-end know-how… See more →

V7: Beginning data migration

Prepping hundreds of tiny blog posts for republishing

Apropos of nothing, I decided that the first of the old entries I’d bring over to V7 would be granular ones:

  • Daily Haiku: A section of the fourth version of my site, beginning back in 2005. As the name suggests, I wrote a haiku every weekday based on the Dictionary.com Word of the Day. Each haiku was originally its own entry, but when I brought them over to V6 a few years ago, I consolidated… See more →

V7: Renewed purpose

Goodbye, Twitter

It’s been nearly two years since I posted an update on this project! I’ve been moving it forward slowly and quietly since then, and I’ll share some details about those activities in due time, as well as details about how work and life changes have introduced new and different demands on my time and somewhat expanded the scope of the site. But for now, the most important takeaway is that my fundamental vision for V7… See more →

V7: The Procrastination Destination

Working on my site instead of yours

I’ve given my V7 redesign project the unofficial tagline “The Procrastination Destination” since the significant progress it’s seen in the past few months has come mostly in stolen moments, some of which turned into extremely productive (and perhaps troublingly obsessive) deep dives. This recent movement has been pretty non-linear, and the tasks in play are all interdependent enough that none of them are really done until all of them are, but I seem to be… See more →

V7: Eleventy it is

Switching static site generators

Every static site generator has idiosyncrasies, and Eleventy is no different. As is the case pretty much any time I try out software, I find that Eleventy often does things differently than I think it ought to, and it doesn’t always make itself as clear as I think it could. A couple of examples:

  • Eleventy has no built-in mechanism for date-based archives. A common blogging convention I’ve adhered to for many years involves organizing post… See more →

V7: Expanding scope

Bringing more data and functionality into the mix

In my previous post, I mentioned Tinnitus Tracker, my standalone concert diary site which can be browsed by genre, artist, venue, city, state, and year. I had been planning to continue updating that site concurrently with V7, but it recently occurred to me that it makes a lot more sense to just consolidate the two sites, which in hindsight seems incredibly obvious.

For one thing, I’ve never been satisfied with the Tinnitus Tracker design, and… See more →

V7: Metadata structure and sitemap

Solidifying the information architecture

I’ve been revising a metadata structure for blog posts and a sitemap for a few months now, and since I haven’t felt the need to tweak either of them in a while, they’re probably solid enough to document here.

Metadata structure

The blog post metadata has been developed to accommodate a wide variety of post types, to give me a lot of flexibility in how to present them, and to give users a lot of options… See more →

V7: The Great Data Migration, Part 2

Once more, with feeling

From the beginning, it was clear that data migration was going to be this redesign’s biggest, most cumbersome task, as the site was growing from 600-some blog posts to untold thousands. I assumed that reformatting the mountain of data arriving in disparate configurations from over a dozen external sources (as described in my previous post) would be the lion’s share of the work, and it would be smooth sailing from there. How wrong I was!… See more →

V7: Launch day

Expanded site, new design, same me

I started redesigning this site in January of 2020. Remember January of 2020? We didn’t know we were living in the Before Times. There were still a few people in the White House who weren’t Fox News hosts or meme coin shills or raw milk evangelists. Our tech bro billionaires hadn’t yet entered the endgame of their persistent campaign to annihilate whatever sense of objective reality we once shared. We were so young.

I wouldn’t… See more →
