Filters, agents and aggregators
I'm at the airport with *way* too much time to spare. I'm the guy who's always late and cutting it close. Being at the gate 45 minutes before boarding gives me the heebie-jeebies. There doesn't seem to be a WiFi access point available here in SFO's International Terminal (something says tmobile, but it won't let me sign up/sign in) so I'm connected via my EDGE phone. Just as good, really. All I'm doing is email, blogging and IM...
Anyways, I was wondering if anyone has done a "mashup" of an RSS News Aggregator and a SpamAssassin-like filtering algorithm yet? For two reasons. The first would be to filter splogs out of my referrer links so I only see "real" referrers from real blogs. As a person, you quickly get a pretty good idea of which domains are real blogs, and which are another-splog-ad-trap-full-of-crap.blogspot.com, no? It'd be nice if my news aggregator (I've been using NetNewsWire lately...) would grey-out the junk for me.
On the flip side of this, it'd be great if, when I was accessing my RSS news via my mobile phone, only the stuff that is most likely interesting to me showed up. Or if I was using my desktop, the stuff that I thought was great (because I trained a filter with my clicks) would be highlighted or moved to the top of the list, so I read that stuff first. And conversely, the stuff that I rarely if ever click on would float down to the bottom of the order where I don't have to worry about it unless I'm really bored. Since I interact with my aggregator quite a bit (choosing which subfolders to read in which order, clicking through to both articles and to links in those articles, etc.) it seems like a no-brainer to funnel that stuff through a proxy to train a filter, no?
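To make the click-trained idea a bit more concrete, here's a rough sketch of what I mean. It's a tiny naive Bayes scorer in Python, in the spirit of a spam filter but not how SpamAssassin or any real aggregator actually works. The `ClickFilter` class and its method names are made up for illustration, and it assumes the aggregator can tell it which items I clicked and which I skipped.

```python
import math
import re
from collections import Counter


class ClickFilter:
    """Hypothetical click-trained filter: naive Bayes over 'clicked' vs 'skipped' items."""

    def __init__(self):
        self.words = {"clicked": Counter(), "skipped": Counter()}
        self.items = {"clicked": 0, "skipped": 0}

    @staticmethod
    def tokenize(text):
        # Lowercase word tokens; deliberately simplistic.
        return re.findall(r"[a-z0-9']+", text.lower())

    def train(self, text, clicked):
        # Record one item I either clicked through or ignored.
        label = "clicked" if clicked else "skipped"
        self.items[label] += 1
        self.words[label].update(self.tokenize(text))

    def score(self, text):
        # Log-odds that I'd click this item; higher means more interesting.
        vocab = len(set(self.words["clicked"]) | set(self.words["skipped"])) or 1
        log_odds = math.log((self.items["clicked"] + 1) / (self.items["skipped"] + 1))
        for label, sign in (("clicked", 1), ("skipped", -1)):
            total = sum(self.words[label].values())
            for word in self.tokenize(text):
                # Add-one smoothing so unseen words don't blow up the log.
                log_odds += sign * math.log((self.words[label][word] + 1) / (total + vocab))
        return log_odds

    def rank(self, items):
        # Most interesting first; the junk floats to the bottom.
        return sorted(items, key=self.score, reverse=True)
```

The aggregator would call `train()` every time I click (or scroll past) an item, and `rank()` before drawing the list. On a phone you'd just show the top few; on the desktop you'd grey out anything with a negative score.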
Then that same filtering technology could be sent off to a server to serve as an agent for me to find *new* blogs that I haven't read yet (and aren't part of my aggregator yet) in order to alert me to content of interest. Imagine being able to upload a file of your filtered preferences to blo.gs or weblogs.com or whatever, and as those sites are pinged, your agent looks at the content for anything that may interest you, then sends those results to your aggregator so the next time you refresh, you see a section with interesting new articles and posts. Sort of stretches the idea of an aggregator, no?
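Here's a rough Python sketch of what that server-side agent pass might look like. To be clear, everything in it is an assumption on my part: the JSON "preferences file" layout, the `agent_pass` function and the post fields (`blog`, `title`, `summary`) are all hypothetical, and as far as I know neither blo.gs nor weblogs.com offers anything like this.

```python
import json
import re


def load_profile(profile_json):
    # Hypothetical upload format: {"weights": {word: weight}, "threshold": number}
    profile = json.loads(profile_json)
    return profile["weights"], profile.get("threshold", 1.0)


def score_post(text, weights):
    # Sum the weights of the distinct words in the post; unknown words count as 0.
    words = set(re.findall(r"[a-z0-9']+", text.lower()))
    return sum(weights.get(word, 0.0) for word in words)


def agent_pass(pinged_posts, profile_json, subscribed_blogs):
    """Return interesting posts from blogs I don't already read, best first."""
    weights, threshold = load_profile(profile_json)
    hits = []
    for post in pinged_posts:
        if post["blog"] in subscribed_blogs:
            continue  # already in my aggregator, nothing new to discover
        score = score_post(post["title"] + " " + post["summary"], weights)
        if score >= threshold:
            hits.append({"blog": post["blog"], "title": post["title"], "score": score})
    return sorted(hits, key=lambda hit: hit["score"], reverse=True)
```

The list that comes back is what would show up as the "interesting new stuff" section the next time my aggregator refreshes.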
Filters and agents have definitely been on my mind lately, especially because of their impact on mobility... Think back to the General Magic days when you would have agents go off and do cool stuff and then bring the results back to your portable device for you. (What was that scripting language called?) Well, a decade has passed and now we have machines that are the equivalent of original Pentiums in our pockets, on 100kbps+ Internet connections to boot. Now there's a real opportunity to do some really neat things on the device and on the server side as well. Nowadays we've got scalable and relatively inexpensive server solutions that could take requests from a bazillion intermittently connected devices and go off and look out on the internet for answers... then aggregate them and return them to your mobile or other device for you to use when you need them. This stuff definitely seems much more reasonable to think about now, no?
So it's a question - is there stuff like this out there? I'm suspicious of "magic" intelligent agents and all that baloney, but it seems there could be some basic tasks that could start to be done for you now with all this infrastructure and new formats like RSS that now exist out there, no? I'm thinking like this: It's like Palm's Graffiti. Everyone at the time was trying to get devices to understand normal handwriting, and they were horrible at it. In contrast, in order to use Graffiti, you had to learn to write in a very specific way, and then it'd work great. By making their users meet the problem halfway, Palm simulated handwriting recognition much better than other technologies, and their success in the late 90s showed it.
I wonder if there's a parallel there for filters and agents? Instead of saying "go do this magic for me" and expecting it to work flawlessly, we figure out a better way of addressing the question: "Which information is the most valuable to me right now?"
Thoughts? (I'll be on a plane of course, so it'll be a while before moderated comments show up...) See you on the other side of the world!