A few thoughts about RSS news readers from someone who thinks about them way more than you probably do
Last weekend, while I was putting together the list of Google Reader alternatives, I discovered more than a few projects and services that have died or entered a sort of zombie state, available but not updated for years. Dead links, home pages announcing end-of-service, or sites left just as they were years ago with unchanged blog updates last dated circa 2007. You may not realize it, but news readers are a wasteland of lost dreams.
Some had all the bells and whistles that someone looking for a Reader alternative today would want. For example, I ran into this video of Streamy - which was launched in 2009 and gone only a year later - which shows a service that would probably be getting as much attention as Feedly is getting now if it were still live. There are several others just like it. Bloglines came close to closing as well before it was sold, but then was never really updated or promoted. Those are just the web-based readers; the number of desktop clients and open source projects is even bigger.
I don't think it's unusual for technologies to have a definite lifespan - they become the next big thing, they get popular, then they fall by the wayside as other technologies take their place because of improvements or simply a change in taste. Instant messaging would be a great example of that: I used to be logged into at least three different services at all times, but it's been years since I've even had an IM client running regularly. Various mobile technologies supplanted IM: first SMS, then services like WhatsApp and Snapchat. It didn't disappear entirely, but I wouldn't consider it an area with a lot of potential any more either.
But news readers seem different - they've never really gone away. Even the transition to the "Post PC" era over the past half-decade didn't fundamentally change the technology or the people using it. There are dozens and dozens of Reader clients in the App Store and on Android for mobile and tablets, and they all basically work the same way as a desktop reader from 2003. After the Google Reader announcement, the outcry over the loss of this seemingly innocuous service was deafening, and the stampede of developers racing to provide replacements has yet to stop. It is in fact an incredibly useful technology that millions of people use every day, and an area that still fires the imaginations of geeks everywhere (yours truly included).
And yet... It's very plain from looking at the past decade or so that the success of this technology has been limited at best, with more than its fair share of disappointments. What's the disconnect? Why haven't news readers ever been really successful, or at least profitable enough to keep more than just a handful of options alive at any one time?
Well, I have a guess: News readers simply don't seem to have any sort of viable business model - they never have, and I'm really not sure if they ever will. And if there's no real money to be made in a technology, it just can't survive, no matter how useful or worthwhile it might be.
I think Google's abandoning Reader is pretty compelling proof of this. Even taking into account their overall strategy of promoting Google+, and their general excuses about focusing on bigger markets, if Reader were profitable (or had any real chance of becoming profitable), they would have kept it alive, if for no other reason than to shut up the Wall Street folk who are constantly bitching about diversifying income, or more recently, the lack of ROI from Android. Considering the paucity of resources - both man and machine - they needed to dedicate to it, the Google leadership must have really considered it to be a complete dead end.
And they're probably right.
(I've said lots of bad things about Google, but being stupid is not generally one of them. Well, except for the guys making the Android version of Chrome. They're truly idiots. But I digress...)
Basically, news readers, as they are implemented today, are fundamentally broken for commercial purposes. There are a few reasons for this, both cultural and technological. Primarily, the core technology itself (polled or pushed RSS/Atom XML feeds) is brittle, bloated and bewildering, and to make matters worse, the benefits of using it are pretty unclear to just about anyone outside the most heads-down techie.
This hasn't been helped by the fact that the browser makers have done their absolute best to make news feeds as baffling as possible. Either by ignoring them, treating them as errors, displaying them in new windows or in a variety of random layouts, shunting them off to other applications that may or may not exist, or worse. And of course, no two browsers (or even two versions of the same browser) do the *same* confusing thing to feeds, so even after years there's not even a small fraction of normal web users who are comfortable with them.
For those on the other side of the pipe - content and service providers - the reasons to provide feeds have always been a bit uncertain, especially considering how little of their audience actually uses them at all. Feeds are generally a pain to set up and maintain, hard to track, almost impossible to monetize, and ripe for abuse by spam sites or even more legitimate 'aggregators' like the Huffington Post and their ilk. I think this is what's finally leading to a general breakdown of the feed "ecosystem" with a growing number of sites not bothering with news feeds at all.
Amazingly, online news readers are somehow supposed to sit in between these two bewildered parties in hopes of providing a useful service! They end up serving a relatively small number of users, yet still have to gather, store, organize and deliver vast quantities of information, over which they have little control and even fewer rights. And the commercial services have to somehow do it while trying to make money (or at least not lose too much). Given all that, it's not a surprise there have been so many failures in the past, or that Google decided to just not bother with it any more.
Let me break the issues down a bit more - like I said, the problems are both cultural and technological, and many times the two intermingle in such a way as to make it somewhat hard to get a big picture.
Rights. News readers don't own the content that they're gathering, and the rights to redistribute that content (even in summary form) are murky at best. This means monetization options are limited (no direct advertising on aggregated content), and potential licensing costs from content providers could be high - see AP vs. Meltwater for more on that. This is probably the core reason news readers are broken, and the most likely reason Google decided to close, rather than expand, Reader. Given their legal problems around the world with Google News and Google Books, my guess is Google simply didn't want to wade into another content-rights quagmire had they decided to grow Reader in the future.
Access. Content providers are under pressure to find revenue and are increasingly walling off content behind paywalls and custom magazine-style 'apps'. It won't be long before these sites stop providing news feeds completely, as the ROI for providing them has always been questionable at best. Web-based news readers have always had the problem of not being able to access password-protected content and Intranet content as well, reducing their potential functionality considerably. Finally, sites like Twitter, Facebook and Google+ have all shown clearly that access to their users and user-generated content is going to be limited at best, or completely forbidden. Access issues already severely limit news readers today, and will continue to do so in the future, steadily degrading their overall usefulness.
Alternatives. In addition to the access problems stated above, social networks are constantly expanding their functionality. As their home streams have become more sophisticated (cards, apps, etc.) much of the utility of a news reader is being integrated into the big three's core services, and in fact, improved upon by the implicit relevancy of the shared content. As a result, social networks view news readers as direct competitors, have limited access as I talked about above, and will most definitely take actions (technical, legal) against any potential workarounds in the future. And, since content providers can see the traffic (both in quality and quantity) coming from social sources rather than news feeds, less effort naturally is given to providing feeds, which creates a sort of negative feedback loop for news readers.
Technology. Let's face it, RSS and Atom are really crappy formats for delivering web content updates. XML is brittle and bloated, parsing is painful, meta-data standardization is minimal (here's an example) and updates are all or nothing. Again, regular users have never, ever understood the concept of newsfeeds, and the browser makers have never agreed on how to handle them, leading to general chaos and confusion for everyone involved. Servers (or cloud services) are expensive, and RSS/Atom feeds are incredibly inefficient, making bandwidth and storage costs non-trivial (though, admittedly, a lot less expensive than they used to be).
Market. Finally, despite the millions of soon-to-be ex-Reader users, the total numbers are not overly impressive, especially compared with other potential markets like mobile apps or social services (both in the billions). My guess is that the only viable long-term business model would be recurring payments like the ones found at boutique web services like Basecamp or Dropbox. But with large players like Feedly already in the market - and other established/funded companies like Digg coming soon - there's going to be the same price pressures that were there during most of Google Reader's existence. Basically this means there won't be enough users who are unhappy enough with the free offerings to support for-pay ones, even with significant added value. This will lead to more churn as more services fail, leading to a general attrition of potential users who eventually give up on the whole idea.
To me, all of the issues listed above show pretty clearly why so many online news readers have come and gone over the past decade. They start up with a few innovations, attract a few users, eventually get hammered with traffic and storage costs, find out their hands are tied while trying to generate income and eventually give up. Rinse and repeat.
And yet, we persist. Why!? I guess it's because the fundamental idea is solid. Users who use news readers can't imagine *not* using them. I think it's sort of an ingrained preference, honestly. The first delivery boy was hired in 1833 to deliver newspapers to subscribers at their homes, and I think since pretty much that moment, we've loved the idea of having news and information quickly brought to us in a convenient, condensed way. The Internet has only intensified that feeling. In fact, for the past week, I've been avoiding using my prototype news reader in an effort to spur me to get what will be my production code up and running in the cloud (which it's not yet). It's been driving me absolutely batshit crazy not having my reader to visit 200 times a day ("Hi, my name is Russ and I'm an info junkie...").
So the problem definitely isn't the general idea - news readers provide what is a must-have service for many, many people. I can *feel* how valuable it is from a very visceral level. This is why I chose to do a news reader rather than other things in the first place. The problem, of course, is doing it while making a lasting business out of it.
Very shortly, you're going to see a lot of those free news reader 'startups' and one-man dev teams I listed last weekend disappear. They're going to be shocked at the costs associated with providing the service, and the time involved in maintaining it as well. This is already happening at Skimr - they pulled their service off of AWS and are now serving it off their own servers in Prague because it was so much cheaper. And they're one of the 'simplified' news readers that have launched - imagine what it'll be like for more full-featured ones?
News readers are similar to search engines, in that they have to constantly crawl the web for content, but with the added challenge of storing and serving all that content to users all day, every day. An average user might spend 10 minutes a day total on their favorite website, while an average news reader user will spend 10 times that amount (or more). This is expensive. Just so you can understand real-world pricing, here's a typical AWS setup for a 'marketing' site that costs roughly $1,400/month: 1 Load Balancer, 2 Web Servers, 2 App Servers, 1 High Availability Database Server, 30 GB of Storage and 120 GB of Data Transfer. But if you want to crawl the web and be highly available, well, your infrastructure might end up looking a lot more like the bigger chunks in Obama for America's setup. If you can't figure out how to make money from each user from day one, then eventually you're just going to get sick of throwing money away and shut it down, as we've seen so many times before.
That said, NewsBlur did generate roughly $120,000 in signups the week after Google's announcement, and probably a lot more by now, which is huge. But then again, so are the costs of AWS. I'm still positive though - NewsBlur's model of having an open source version for the most advanced users, a very limited version (only 60 feeds) for free users and a nominal fee ($24 a year) for everyone else might actually work out well. We'll have to wait and see, but it could be the perfect model for a boutique service as long as they can keep costs in line and fees sufficient to cover them.
The big guys, though, are going to have a lot of trouble in my opinion. I just don't understand how Feedly is going to work, honestly. I like Feedly a lot. I've loved their design aesthetic for a while now and I love that they were ready to go when Google dropped the bomb and are willing to take a lead on an API in the future. They also seem to be the number one choice for most people who are switching from Reader. But unlike Google, Feedly doesn't have some other form of revenue that can support it while it provides free services for its ever-expanding number of users. It's going to have to figure out how to make money, and the options don't look good. From what I understand, they're going to try to offer a 'freemium' package, but I'm pretty pessimistic the economics are going to work out unless it's got some *serious* added value and a comparable price. This means Feedly will probably attempt to monetize the number of 'eyeballs' it has.
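To put those numbers side by side, here's a back-of-the-envelope sketch. It leans on two big assumptions that are mine, not NewsBlur's: that every dollar of that $120,000 came from $24-a-year subscriptions, and that the $1,400/month 'marketing' site figure quoted earlier is a usable stand-in for hosting costs (a real crawler would almost certainly cost more). The function names are made up for illustration.

```python
import math

# Figures quoted in the post; how they combine below is an assumption.
PRICE_PER_YEAR = 24.0        # NewsBlur's annual fee
SIGNUP_REVENUE = 120_000.0   # signups in the week after Google's announcement
MONTHLY_HOSTING = 1_400.0    # the 'marketing' site AWS setup, used as a stand-in


def subscribers_for(revenue, price=PRICE_PER_YEAR):
    """How many annual subscriptions a given revenue figure would represent."""
    return revenue / price


def break_even_subscribers(monthly_cost, price=PRICE_PER_YEAR):
    """Paying subscribers needed to cover a year of a given monthly cost."""
    return math.ceil(monthly_cost * 12 / price)


if __name__ == "__main__":
    print(subscribers_for(SIGNUP_REVENUE))          # 5000.0
    print(break_even_subscribers(MONTHLY_HOSTING))  # 700
```

So under those assumptions the signups work out to roughly 5,000 paying users, and it takes about 700 of them just to cover the smallest setup, before anyone gets paid for their time.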
This generally never ends well - look at the continuing disaster that is Facebook as a perfect example. Facebook's News Feed may have started as just "the wall", but as time has gone by it's become much more like a regular news reader in many ways. Nowadays, it's just as much about what your friends are sharing (the same links as every other service), as it is about what your friends are doing. I can't guess at the balance between Facebook's user-generated content (posts, pictures, etc.) and the external content they're now pulling into their news feed, but I bet it's nearing 50-50 at this point.
The question for Facebook lately has been about how to monetize their News Feed. Because Facebook never pulls in the full articles from outside sources, and has lots of home-made user generated content, they can happily put advertising pretty much anywhere on their site - something that a news reader just won't be able to do easily. Good for Facebook, right? Wrong. The result is disastrous - maybe predictably so.
This is what I saw this morning when I happened to check out Facebook on my iPad. A majority of my home screen was dominated by sponsored content and ads. It was completely obnoxious.
Facebook should be the holy grail of targeted, relevant advertising, right? Yet, despite knowing just about everything about me you can possibly know (where and when I was born, who all my friends are, where I've worked, where all my friends have worked, the fact I'm a father, where I've travelled, and more), they've yet to figure out how to present me with advertising that's useful. In fact, the crap I saw today only 'works' in a sense because they're force-feeding it to me - distorting their user interface and perverting the user experience - basically holding the rest of the service hostage so that I'm forced to pay attention to the ads. Compare this way of using ads to the utility of Google AdWords - which are essentially win-win-win for all those involved - and you can see how much trouble Facebook is truly in.
So what's my point? Well, Facebook already has a massive customer base, infrastructure in place, advertisers salivating to get at their users and a great brand name - and yet even they are having trouble making money from the content in their News Feed. If this is the case for a company like Facebook, how is it possible that a general news reader - which has a self-limiting user base, a reliance on unpredictable third-party content from around the web, and fewer options when it comes to monetizing that content - will ever be able to create a viable business based on advertising? The answer is, they probably won't.
So, do I have some sort of magic solution to the plight of the news reader that will prevent the service I create from repeating the failures of the past? Nope. Not at all. I have some ideas, but no solid answers. What I do know is that repeating the same futile efforts of the past 10+ years is definitely a plan doomed to fail. There need to be fundamental changes in how the problem is approached and the service is offered.
Like I said, the idea is great, the service is valuable and not only that, I think it's inevitable: In the future we'll all be relying on personal agents to collect information and data on our behalf, aggregate the results, and present them to us in a time and effort-saving manner. That will happen. But the path to that future definitely doesn't start with the road most of us are on now.
Here's where I see some obvious changes to get on the right path.
Technology. We need a better way to get content updates from websites in a way that's fast and inexpensive. Servers are either hammering away or getting hammered by thousands of bots every day for little reason. I tested over 50,000 feed URLs, and only about half the sites support any sort of ETag or Last-Modified header. Even for ones that do, once a single update happens, an XML page of the last dozen or so posts is returned, and it's up to the feed parser to work out the deltas. It may 'work', but from just about every other measure it sucks.
The problem is that this topic brings in technology zealots. For example, those that bow to the god of REST will say that this is how it's Supposed To Work (TM). But dogmatic REST doesn't have any concept of lists, only individual pieces of content. This is why every API out there usually has some sort of endpoint which provides a list of contents, with parameters to limit the size of the returned results, and paging as well. This stuff isn't in the zealot's REST vocabulary.
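For anyone who hasn't written a feed fetcher, here's a minimal sketch of the two chores described above: sending the conditional-GET headers (If-None-Match for an ETag, If-Modified-Since for Last-Modified), and then working out the deltas yourself when the server hands back the whole feed anyway. The function names and the sample feed are made up for illustration, it only handles plain RSS 2.0 (Atom uses namespaced entry elements), and it doesn't touch the network.

```python
import xml.etree.ElementTree as ET


def conditional_headers(etag=None, last_modified=None):
    """Build request headers for a conditional GET from saved validators."""
    headers = {}
    if etag:
        headers["If-None-Match"] = etag
    if last_modified:
        headers["If-Modified-Since"] = last_modified
    return headers


def new_items(feed_xml, seen_ids):
    """Return (id, title) pairs for RSS items not already in seen_ids.

    The item's guid is used as its id, falling back to its link.
    """
    root = ET.fromstring(feed_xml)
    fresh = []
    for item in root.iter("item"):
        item_id = item.findtext("guid") or item.findtext("link")
        if item_id and item_id not in seen_ids:
            fresh.append((item_id, item.findtext("title", default="")))
    return fresh


# A made-up feed: one new post, but all three come back in the response.
SAMPLE_FEED = """<rss version="2.0"><channel><title>Example</title>
<item><guid>post-3</guid><title>Third</title></item>
<item><guid>post-2</guid><title>Second</title></item>
<item><guid>post-1</guid><title>First</title></item>
</channel></rss>"""

if __name__ == "__main__":
    print(conditional_headers(etag='"abc123"'))
    print(new_items(SAMPLE_FEED, seen_ids={"post-1", "post-2"}))
```

Even in the good case, the reader downloads and parses every post in the feed just to discover the one it hasn't seen, which is exactly the waste I'm complaining about.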
My best guess would be to replace polling RSS for updates with JSON API calls, but it doesn't necessarily have to be that radical. A model to look towards would be Facebook's Open Graph API. They came up with a good solution for summaries using Meta Tags on web pages, since duplicated by Twitter's new card tags and Google+ tags as well. This worked out extraordinarily well for just about everyone involved with that system. Users get thumbnails and summaries of the content before they click on links (or even as they are adding the link to their post), content providers get trackable traffic, and Facebook/Twitter provide good looking streams of rich content.
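To show how little machinery the meta tag approach needs, here's a rough sketch that pulls Open Graph properties out of a page using nothing but Python's standard library. The class name, function name and sample page are hypothetical; og:title, og:type, og:url and og:image are real Open Graph properties, but a production parser would need to cope with far messier HTML than this.

```python
from html.parser import HTMLParser


class OpenGraphParser(HTMLParser):
    """Collect <meta property="og:..."> tags into a dict."""

    def __init__(self):
        super().__init__()
        self.properties = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        prop = attr.get("property") or ""
        if prop.startswith("og:") and attr.get("content") is not None:
            # Keep the first value seen for each property.
            self.properties.setdefault(prop, attr["content"])


def extract_open_graph(html):
    """Return a dict of Open Graph properties found in an HTML string."""
    parser = OpenGraphParser()
    parser.feed(html)
    return parser.properties


# A made-up page for illustration.
SAMPLE_PAGE = """<html><head>
<title>Ignored</title>
<meta property="og:title" content="A few thoughts about news readers" />
<meta property="og:type" content="article" />
<meta property="og:url" content="https://example.com/posts/news-readers" />
<meta property="og:image" content="https://example.com/thumb.png" />
<meta name="description" content="Not an Open Graph tag" />
</head><body></body></html>"""

if __name__ == "__main__":
    for key, value in extract_open_graph(SAMPLE_PAGE).items():
        print(key, "=", value)
```

The summary lives in the page itself, so there's no second document for the content provider to maintain and nothing for the user to subscribe to.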
That's only good for the content though (replacing the bulk of RSS), there still needs to be a way to get updates in the quickest, lowest touch way possible. PubSubHubbub tried and pretty much failed (with only a handful of service providers) - so I'd love to see instead something based on the simplest solution that could possibly work based on JSON and delta updates.
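Just to make the hand waving slightly more concrete, here's one way 'JSON and delta updates' could look. To be clear, this is not an existing spec or anyone's real API: the cursor idea, the field names (seq, id, title) and both functions are assumptions I'm making up for the sketch. The site hands out only the posts newer than the cursor the reader last saw, and the reader merges them into its local store.

```python
import json


def delta_since(posts, cursor):
    """Site side: return a JSON document with only posts newer than cursor.

    Each post is a dict with an increasing integer 'seq' (hypothetical field).
    """
    fresh = sorted((p for p in posts if p["seq"] > cursor), key=lambda p: p["seq"])
    next_cursor = fresh[-1]["seq"] if fresh else cursor
    return json.dumps({"cursor": next_cursor, "items": fresh})


def apply_delta(store, delta_json):
    """Reader side: merge a delta into the local store, return the new cursor."""
    delta = json.loads(delta_json)
    for item in delta["items"]:
        store[item["id"]] = item
    return delta["cursor"]


# Made-up posts for illustration.
POSTS = [
    {"seq": 1, "id": "a", "title": "First"},
    {"seq": 2, "id": "b", "title": "Second"},
    {"seq": 3, "id": "c", "title": "Third"},
]

if __name__ == "__main__":
    store = {}
    cursor = apply_delta(store, delta_since(POSTS, 0))       # first sync: everything
    print(cursor, sorted(store))
    cursor = apply_delta(store, delta_since(POSTS, cursor))  # nothing new: empty delta
    print(cursor, sorted(store))
```

When nothing has changed the response is a few dozen bytes instead of a dozen full posts, and the reader never has to diff anything.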
Yes, lots of hand waving in the last couple paragraphs, but I'm not trying to define a new spec right here - only to point out the current tools we're using are fundamentally broken.
Clarity. Part of the reason I like the Open Graph stuff is because it's integrated with the page of content itself so it's clear to the content provider what is supposed to be there, and it's clear to the end user what's going on as well. Feeds have never been like that - clicking on a link has been a dangerous proposition for years and the blame goes 100% to the browser makers.
It would be nice to create a new system and include the HTML5 web standards guys, so that any new specs that are defined have standard visual results, rather than being left up to chance or, worse, to the browser makers themselves. I highly doubt this will ever happen, but that'd be the ideal in my opinion. Update: You can find very similar thoughts from Wikimedia's Luis Villa here: Why feed reading is an open web problem, and what browsers could do about it.
More likely, we have to create a system with the expectation that the browser will have zero part to play in the process. I think copying, or riding on, the success of OAuth would be the right option here. Users generally understand the process involved with giving access permissions to third party sites. What I envision is similar, but in reverse, where content providers approve access to users. What's clear is that the end user should probably never be dealing with 'feed' URLs any more than they pass around links to developer APIs.
Compensation. Content providers need to have incentive to provide access to their content. Whether it's a blog, news site, video site, or something like Twitter and Facebook, there needs to be a way to make it clear where the content came from and provide the content provider value for that content - either by way of traffic, ads or direct payments.
This is the route that Flipboard took when it first launched. It didn't integrate with Google Reader until six months after its initial launch. Instead it did deals with big publishers, as well as Twitter (I think), to make sure content providers were part of their service from day one. This probably isn't a panacea in terms of making money for the news reader, as Flipboard is now trying to become a 'content curation' service for some reason, but in general I think the idea is sound.
The problem is that the rest of the world doesn't have a group of lawyers able to go flying around making deals with Time Warner and Condé Nast. If these guys are going to make their content available on the web, there's got to be a way for them to say, "Here's what we expect in terms of content usage and display to the end user," giving rights to the news reader to gather that content on the user's behalf. Again, similar to how you approve what a service can do when you use OAuth.
Content owners aren't exactly known for their flexibility, but I can imagine some pretty neat scenarios. For example, a website could give your news reader the option to display only a title and summary for free, without ads, but require some sort of ad link or code to be included in exchange for the full text of the content. Even simply enabling the automatic ability for news readers to gather content for subscribers (like for the WSJ or NYT) would be great. This is the sort of win-win-win system I'm talking about - something that works for the news reader, the end user and the content provider.
The other day I tweeted that I had just written a blog post which could be summed up as, "We're all totally fucking doomed," and I guess everything I've written above could definitely be construed as such. But really, it's simply about paying attention to the mistakes of the past, to avoid repeating them in the future. I'm thinking really hard about this stuff right now, and a blog post like this helps me clarify my thoughts and set my direction. (It also helps remind me of what the hell I decided, as sometimes I forget that as well.)
It might appear like gloom and doom (actually, if it doesn't, you're not looking), but really it's a massive opportunity. Again, anyone who uses a news reader can't imagine having to live without one, yet I see the feed ecosystem slowly deteriorating with no replacement in sight. That's the opportunity, in my mind. It's not about simply mimicking Google Reader (which I would never do anyways, as I've never liked it), but about bootstrapping off the system we have today, in order to create a new type of news reader for tomorrow. The future of news readers, so to speak. ;-)