By the time you read this, I'll have been running Weir as my full-time RSS reader for two and
a half weeks, starting on July 1. It's going well! Having just added OPML export, so that I
can switch if it stops being worth the trouble, I've had a chance to sit back and consider
some lessons learned from the project.
- Eight megabytes does not seem like a lot of data these days, but it adds up. Before I
started culling feeds (and before I added Gzip support to the request service), Weir was
pulling down roughly eight megs of data with each fetch, at ten-minute intervals. That's
48MB per hour, which sounds small but comes to about 1.15GB per day. By default, I get
24 gigs per month of transfer allowance on my server, which that rate would burn through
in about three weeks. Something had to go.
It's interesting to note, by the way, that this is something that RSS services like Newsblur
or Feedly don't worry so much about, because the cost of each feed is spread across all
subscribers. My subscriptions never cost Google Reader as much traffic as Weir generates on
its own.
- So I started unsubscribing. The majority of my original subscription list in Google
Reader came from Paul Irish's front-end feed collection, and while I had already
unsubscribed from the crazy people, it turns out most of the other blogs in that collection
were dead. Even with 304 support added in, Weir was downloading a ton of RSS, only to
discard much of it as being past the configured expiration date. I don't think this means
blogging is a thing of the past, personally, but it's clearly down from its heyday in favor
of social services, particularly (in the technical community) Google+.
- That said, using Feedburner seems to be a clear indication that you aren't that
interested in blogging anyway, because it accounts for a disproportionate share of the
abandoned or simply broken RSS feeds on my list. I suspect this is because it signals a lack
of ownership. If you care about your feed, you maintain it yourself.
- Even with feeds that work, sometimes connections fail, or things break, just because
it's the wild and crazy web out there. Taking that into consideration from the start, and
tracking the last result for every feed, was one of the smarter things I did. I should
probably be tracking more, but I'm too lazy to do real logging.
- Feeds are messy, and sanitization is hard. People inject all kinds of styles into their
RSS. They include height and width attributes that don't play well with mobile. They put
things into tables. They load scripts that I don't want to run, and images that I'd like to
defer until their containing post is activated. Right now, I'm using
document.implementation.createHTMLDocument() to make a functional (but "dead")
DOM, then running a sanitization task over that, but figuring out that process--and making
it watertight--has not been easy.
- In fact, working with RSS--ostensibly a "machine-readable" format--tends to drive home
just how porous the web can be, and how amazing it is that it works at all. Take the RSS
date formats discovered by the developer of another reader web app, for example. I'm
relatively isolated from the actual parsing, but there's code in Weir to work around buggy
HTTP responses, missing feed information, and weird characters. Postel's Law has a lot to
answer for, in my opinion.
- "Worse-is-better" really works for me as a personal development philosophy. My priority
has been to get things running, no matter how badly--hacks get added to the .plan file and
addressed later on, when I have time to figure out a graceful solution. This has kept my
momentum high, and ensured that I don't get bogged down with architecture that I might not
even need.
- The value of open-source software for this project can't be overstated. There's no
way I could have built Weir on my own. In addition to Node, of course, I'm using a number of
open-source modules for parsing feeds, handling two-factor auth, and storing posts in the
database. Because of open source, I can patch those various libraries together, add my own
code on top, and have a newsreader application that does everything I need. Weir doesn't
stand on the shoulders of giants--it stands on the shoulders of countless other people, each
giving a little bit back to the wider community.
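
To make the bandwidth points above a little more concrete, here is a minimal sketch of the arithmetic and of the per-feed bookkeeping that Gzip, 304 support, and "track the last result" imply. It is not Weir's actual code: the function names, the shape of the feed record (etag, lastModified, lastResult, lastChecked), and the shape of the response object are all assumptions made for illustration.

```javascript
// Hypothetical sketch; none of these names come from Weir itself.

// Daily transfer in MB for a given per-fetch size and polling interval.
function dailyTransferMB(perFetchMB, intervalMinutes) {
  const fetchesPerHour = 60 / intervalMinutes;
  return perFetchMB * fetchesPerHour * 24;
}

// Build request headers for a feed we may have fetched before.
// Validators are only sent when we have them, so a first fetch is unconditional.
function buildHeaders(feed) {
  const headers = { "Accept-Encoding": "gzip" };
  if (feed.etag) headers["If-None-Match"] = feed.etag;
  if (feed.lastModified) headers["If-Modified-Since"] = feed.lastModified;
  return headers;
}

// Record the outcome of a fetch on the feed record.
// Returns true only when there is a fresh body worth parsing.
// `response` is assumed to look like { status, headers } or { error }.
function recordResult(feed, response, now = new Date()) {
  feed.lastChecked = now.toISOString();
  if (response.error) {
    // Connection failures are expected; note them and move on.
    feed.lastResult = "error: " + response.error;
    return false;
  }
  feed.lastResult = String(response.status);
  if (response.status === 304) {
    // Not Modified: keep the validators we already have, skip parsing.
    return false;
  }
  if (response.status >= 200 && response.status < 300) {
    // Remember the new validators for the next conditional request.
    feed.etag = response.headers["etag"] || null;
    feed.lastModified = response.headers["last-modified"] || null;
    return true;
  }
  return false;
}

// Usage: eight megs every ten minutes, as described above.
console.log(dailyTransferMB(8, 10) + " MB per day");

const exampleFeed = { url: "https://example.com/feed.xml", etag: '"abc123"', lastModified: null };
console.log(buildHeaders(exampleFeed));
```

Keeping the last result on the feed record itself, rather than in a separate log, is the lazy option, but it means a dead or broken feed is visible right next to the subscription that caused it.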