The Afterthoughts – If Google came up with an RSS Reader

So here is another post in The Afterthoughts series.

Post: If Google came up with an RSS Reader
Originally posted on: 2005-01-30

This post was made long before Google came up with Google Reader. I was experimenting with RSS readers and started wondering what it would be like if Google came up with an RSS reader.

Now that we have one from Google, it is time to look back and see how my expectations matched with the actual product.

> * It would first buy the domain “greader” or something similar.
This didn't happen. However, Google Reader is popularly called GReader. I guess I made this comment because of Gmail.
On a side note, Google does own greader.net.

> * It would have an index of more than 8 million different feeds.
This is not how an RSS reader has evolved. Google Reader does have recommendations based on the feeds you already have. It would be good to see an integration of Google Blogsearch or even Google News with Google Reader. The only integration I see is the subscription of search results from both of these in Google Reader (a 'new' feature).

> * It would offer 1 GB space for storing posts.
The storage in most online readers is unlimited.

> * It would have an excellent search feature for searching posts.
This was a surprise! The feature came in so late. Totally unexpected.

> * The interface would be simple, but at the same time powerful.
You bet this has been true. The keyboard shortcuts are just superb. The speed with which you can navigate and read feeds is extremely good. (You will need my script to make it even faster. :))

> * We would be able to mail any post just at the click of a button.
I guess this feature has been around for quite some time now.

> * It would allow us to filter posts and also label them for future reference.
With tagging and folders, this has been better than expected.

> * It would also allow us to make blog entries (of course the service would be integrated with Blogger.)
Again, this is a surprise. Google has not provided any integration with Blogger. However, recently Google added a feature to share an item with notes. With the microblogging revolution, and Google having acquired Jaiku, I guess that integration will happen first.

> * It would integrate greader with other offerings like mail, groups etc.
The integration is not that great as of now. It would be cool to see posts related to a mail, or a message in a group etc.

> It would be Beta forever. 🙂
Surprise! This isn't true!

Final thoughts:
So, more than 3 years after I made the original post (which is a lot of time in technological evolution), I should say Google did match most of the expectations that I had back then, and some features turned out much better than I had expected. However, integration with other services is one area where it could have done better.

The Afterthoughts – Gmail forwarding and service interoperability – an interesting observation

“The Afterthoughts” is a series where I revisit some of my older blog entries and see how things have changed between the time I made the blog post and now.

The posts that I will choose initially will be from 2004 to 2006.

So here is the first one in the series:

Post: Gmail forwarding and service interoperability – an interesting observation
Originally posted on: 2005-11-21

The entry explains how, when you connect various services together, you can end up with the same information multiple times.

This is increasingly becoming a problem these days. Services like Twitter and Friendfeed are not solving the problem elegantly, so you see more and more duplicates and links to the original post.

Here is a typical scenario today:
I make a blog entry. In order to ensure that my readers see my post immediately, I have a service that automatically posts a message in Twitter. This is like instantly messaging my friends (actually Twitter followers) telling them, “Look, I made a blog entry”.

Now, I use a lot of Web 2.0 services. So, in order to ensure that all my friends have a single feed to follow my activities, I use some aggregator like FriendFeed or Tumblr.

Some friend of mine (let's call him Bob) likes my blog entry and bookmarks it on del.icio.us. Another friend, Andrews, bookmarks it in Magnolia.

Let us now say there is another person, Dave, who is a friend of mine, Bob's and Andrews'. He is following all 3 of us in FriendFeed.

How many entries is Dave going to see of the original entry?
6 in total! 4 from me (1 from my blog post directly, 1 from Twitter, 2 from Tumblr: 1 via the blog post and 1 via Twitter), 1 from Bob via del.icio.us and 1 from Andrews via Magnolia.
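The count can be checked with a small sketch. The wiring below is only an assumption that mirrors the scenario in this post (the blog feeds Twitter, and Tumblr aggregates both); the names `REPUBLISHES` and `copies_emitted` are made up for illustration.

```python
# Hypothetical wiring of the scenario: each service lists the sources it republishes.
REPUBLISHES = {
    "blog": [],                      # the original post
    "twitter": ["blog"],             # auto-tweet announcing the post
    "tumblr": ["blog", "twitter"],   # Tumblr aggregates both
}


def copies_emitted(service):
    """Count how many entries about the original post a service emits."""
    sources = REPUBLISHES[service]
    if not sources:
        return 1  # the original itself
    return sum(copies_emitted(source) for source in sources)


# What Dave follows on FriendFeed: the author's three services plus two bookmarks.
from_me = sum(copies_emitted(s) for s in ("blog", "twitter", "tumblr"))
from_friends = {"Bob via del.icio.us": 1, "Andrews via Magnolia": 1}
total = from_me + sum(from_friends.values())

print(from_me, total)  # 4 6
```

The blog author's own services contribute 1 + 1 + 2 = 4 copies, and the two bookmarks bring the total to 6. Every extra aggregator in the chain adds to this sum.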

The screenshot shows duplicate entries from mashable's blog feed and from Twitter:

Now this is real noise. And it is even worse if Dave is not interested in the blog post to begin with.

So the solution?
Friendfeed allows you to hide specific feeds from specific people. For example, Dave can hide all bookmarks from Bob or all Tumblr entries from me.

Now that is not a good solution because not all bookmarks from Bob are duplicates.

Tools like Feedblendr and Blogbridge have solved this problem for simple RSS aggregation. However, things are different when it comes to social networks and aggregation.

So right now there is no simple way of detecting duplicates, and more and more people in the blogosphere are complaining about this, explaining how Friendfeed is more noise than information and why good old Google Reader is still relevant.

Here is one such discussion. As the discussion suggests, it is not just about eliminating duplicates; it also requires you to merge discussions/comments in each of these posts keeping in mind that not everyone is a friend of everyone else.

So what has changed over the last 2 years?
If anything, the problem has become a tougher one. I am sure the startup that does duplicate elimination and gives you a filtered feed taking your social networks into consideration is going to be the next hyped startup in the Web 2.0 world.

I changed my RSS reader

In my recent posts [Maintaining multiple feed lists] and [The evolution of the pub-sub model on the web], I have been writing about a new trend emerging in the RSS world, where people subscribe to tags rather than specific feeds. This helps in getting all the data related to these tags from the entire index of the search engine (or crawler, or indexer, or whatever). I also mentioned the problems I faced.

Now it should be obvious that I have been looking around for a solution to this problem, and my present reader, Google Reader, does not help me do this well. The reason is that it was designed for a very different environment, and although I was reluctant to switch from it, I had to.

So what did I find that helped me change my mind? Blogbridge.

What's so special about Blogbridge?
Blogbridge has the concept of Smartfeeds. It is not rocket science, but it gives the reader the ability to support tag subscriptions. This shows that they are trying to address the very thing that I am looking for. Now the interesting part here is that you can define simple rules to aggregate multiple Smartfeeds so that you get all of them combined into one single feed.


Maintaining multiple feed lists

said, “Hey, I want to read News in the morning and Hacking in the evening. I have a list of feeds for each. How do I use my reader to do this?”

I said, “Create multiple accounts. Simple. 🙂 ”

I was joking. Somewhere in the back of my mind, I was thinking of a solution to this. I have been facing this problem especially after the explosion of blogs/feeds that I have subscribed to. I seem to have different interests at different times and want to read a different set of feeds each time. What can I do?

As a coincidence, the founder of Feedblendr updated me on an earlier request of mine. I had asked him to remove the duplicate entries that show up when subscriptions to different tags return the same feed. He said he has fixed it and this is the link he sent.

I started experimenting with it and it works like a beauty.

Now this seems to be a good solution to the problem we were facing earlier. Just create a subscription list (or a blend) using Feedblendr containing a set of related feeds you are interested in and then use its RSS in your Reader. So basically this boils down to having ONE RSS to define each of your interests. Howzzat?!
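To make the idea of a blend concrete, here is a minimal sketch. It is not how Feedblendr works internally (I have not seen its code); it simply assumes each entry is a dictionary with a `link`, and the function name `blend` and the sample feeds are made up.

```python
def blend(*feeds):
    """Merge several feeds into one list, keeping the first copy of each link."""
    seen_links = set()
    blended = []
    for feed in feeds:
        for entry in feed:
            if entry["link"] in seen_links:
                continue  # same entry already arrived through another feed
            seen_links.add(entry["link"])
            blended.append(entry)
    return blended


# Made-up sample data: two tag subscriptions that overlap on one entry.
rdf_feed = [{"title": "RDF primer", "link": "http://example.com/rdf-primer"}]
semweb_feed = [
    {"title": "RDF primer", "link": "http://example.com/rdf-primer"},
    {"title": "OWL notes", "link": "http://example.com/owl-notes"},
]
hacking_blend = blend(rdf_feed, semweb_feed)
print([entry["title"] for entry in hacking_blend])  # ['RDF primer', 'OWL notes']
```

One blend per interest ("News", "Hacking") then gives the reader a single feed to subscribe to for each.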

The evolution of the pub-sub model on the web

Recently, I have seen a new trend emerging on the web. Until quite recently, we had people publishing their information as RSS feeds and others subscribing to it. This was the first step towards the pub-sub (publish subscribe) model.

Then came tagging and people started publishing 'relevant' tags along with the feed entries. This has helped in the emergence of a new trend, wherein I am able to track not just websites, but information pertinent to certain keywords (or tags).

A major advantage of this is that I don't have to subscribe to RSS feeds; rather, I just subscribe to a set of keywords (optionally combined using a regular expression) and then get information based on it. I have been trying this for quite some time now and have been getting wonderful results.
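A keyword subscription of this kind can be sketched in a few lines. This is only an illustration under my own assumptions: entries are dictionaries with a `tags` list, and the pattern and the name `matching_entries` are hypothetical, not taken from any particular service.

```python
import re

# Hypothetical subscription: any spelling of "semantic web" used as a tag.
SUBSCRIPTION = re.compile(r"semantic[-_ ]?web|semweb", re.IGNORECASE)


def matching_entries(entries, pattern=SUBSCRIPTION):
    """Return the entries that carry at least one tag fully matching the pattern."""
    return [
        entry for entry in entries
        if any(pattern.fullmatch(tag) for tag in entry["tags"])
    ]


# Made-up entries from anywhere on the web, not from one specific feed.
entries = [
    {"title": "Intro to RDF", "tags": ["Semantic-Web", "rdf"]},
    {"title": "Pasta recipes", "tags": ["cooking"]},
    {"title": "Ontology tools", "tags": ["semweb"]},
]
print([entry["title"] for entry in matching_entries(entries)])
# ['Intro to RDF', 'Ontology tools']
```

Matching the whole tag rather than a substring keeps unrelated tags that merely contain the keyword out of the results.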

In fact, this is how founders of websites are able to track the popularity of their tool by just subscribing to the keyword that relates to their website. The moment someone tags their blog entry with this tag, it arrives in the feed readers of the founders and they are quick to comment and 'show interest'. Here's more information and an example of how the founder of a website tracked my blog entry within a single day and here's another.

Hoping that tagging is not misused (remember what happened to <meta>?), we have a new way of tracking relevant information.

Web 2.0 service aggregation tools

Recently I noticed a new trend in the Web 2.0 aggregation tools. These are tools which combine other web 2.0 services in one place and provide a way to host a single page containing all your services. The most common services provided by these aggregating tools are combining delicious, flickr, blogspot and rss feeds in one place.

Examples of such tools are:
Suprglu
Squidoo
Peoplefeeds

Here are my pages:
My Suprglu page
My Semantic Web page @ Squidoo
My Peoplefeeds page
(I had signed up for squidoo long back and got a chance to check out their public beta offering.)

I found an inherent problem in these services.

What I tried to do is to set up a page which contains feeds of my interest based on various other tag search results. In particular, I wanted it to aggregate feeds from delicious, Technorati, Google blog search, Yahoo news search, Feedster, Icerocket etc. I wanted search results for:
(semanticweb OR semantic-web OR semweb OR sw OR semantic_web) AND (owl OR rdf OR rdfs OR ontology OR ontologies OR taxonomy OR rdql OR SPARQL OR w3c OR metadata OR semantic OR semantics OR knowledge)

These are the problems I faced:
* Most tag search engines are not intelligent enough to provide RSS feeds for such searches.
* The page is not intelligent enough to remove duplicate links. For example, suppose I have a page bookmarked in delicious with the tags semantic-web and rdf; that particular link then shows up in both tag searches. So if I combine the tag search results, the page shows up twice.
* Most of the service providers do not have an option to turn off non-English pages. So many Japanese and French (or Latin?!) pages turn up in the results.
* I want a hierarchy. I should be able to create a group “Semantic web” which contains feed results for the search query given above and another group, say “Web 2.0” which has a similar query. I should be able to relate the results of “Semantic web” group with those of “Web 2.0”.
* The ability to view feeds using different views – “Technical” and “Non-technical” or “Office related” or “Non office related”.
* Finally, there should be a theme. I would like to read my “Technical feeds” once a day and “Comics” once a week. How do I separate them?
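The first two problems above can be illustrated with a small sketch: merge the per-tag results by link, then evaluate the query as "at least one tag from each group". Everything here is an assumption for illustration: the shape of the results, the names `merge_by_link` and `matches_query`, and the sample links.

```python
# The two OR-groups from the query above, lowercased for comparison.
TOPIC = {"semanticweb", "semantic-web", "semweb", "sw", "semantic_web"}
DETAIL = {
    "owl", "rdf", "rdfs", "ontology", "ontologies", "taxonomy", "rdql",
    "sparql", "w3c", "metadata", "semantic", "semantics", "knowledge",
}


def merge_by_link(*tag_results):
    """Combine several tag-search results into one {link: set of tags} mapping."""
    merged = {}
    for results in tag_results:
        for link, tags in results:
            merged.setdefault(link, set()).update(tag.lower() for tag in tags)
    return merged


def matches_query(tags):
    """(TOPIC tag OR ...) AND (DETAIL tag OR ...)"""
    return bool(tags & TOPIC) and bool(tags & DETAIL)


# Made-up results of two separate tag searches; link "a" appears in both.
semantic_web_results = [
    ("http://example.com/a", ["semantic-web", "rdf"]),
    ("http://example.com/b", ["semantic-web"]),
]
rdf_results = [
    ("http://example.com/a", ["semantic-web", "rdf"]),
    ("http://example.com/c", ["rdf", "SPARQL"]),
]
merged = merge_by_link(semantic_web_results, rdf_results)
hits = [link for link, tags in merged.items() if matches_query(tags)]
print(hits)  # ['http://example.com/a']
```

Keying on the link means the page bookmarked under both tags is listed once, and the AND across the two groups is applied after merging instead of being left to each search engine.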

I am still looking for a solution.

RSS hacking – some observations

I tried simulating the situation that I had mentioned in my previous blog entry on Gmail forwarding and service interoperability – an interesting observation.

I first opened a new account in Reader1 (I don't want to name it) and then subscribed to my blog's RSS feed using it. Then using Reader2, I subscribed to Reader1's RSS feed. Finally, I also subscribed to Reader2's RSS using Reader1.

Nothing happened again. Reason?

The RSS 2.0 specification says that there should be one 'channel' element within the root 'rss' element. 'channel' can contain any number of 'item' elements. 'title', 'link' and 'description' are mandatory in 'channel'; in an 'item' all elements are optional, although at least one of 'title' or 'description' must be present.

Usually, every RSS feed includes a 'pubDate' element although it is not mandatory. Also they include a 'guid', which is a Globally Unique Identifier. The latter makes it unique. The former can be used along with 'link' to give a hint of duplicate entry. So the readers usually identify duplicate entries and a loop will not occur.
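Here is a rough sketch of how a reader might do this. I do not know how any particular reader implements it; the feed content and the names `entry_key` and `unseen_items` are made up, and the fallback to 'link' plus 'pubDate' is only the hint described above, not something the spec guarantees to be unique.

```python
import xml.etree.ElementTree as ET

# Made-up feed: the first item has a guid, the second one does not.
FEED = """<rss version="2.0"><channel>
<title>Example blog</title>
<link>http://example.com/</link>
<description>Demo feed</description>
<item>
  <title>First post</title>
  <link>http://example.com/first</link>
  <guid>http://example.com/first</guid>
  <pubDate>Mon, 21 Nov 2005 10:00:00 GMT</pubDate>
</item>
<item>
  <title>Second post</title>
  <link>http://example.com/second</link>
  <pubDate>Tue, 22 Nov 2005 10:00:00 GMT</pubDate>
</item>
</channel></rss>"""


def entry_key(item):
    """Prefer guid; otherwise fall back to link + pubDate as a hint."""
    guid = item.findtext("guid")
    if guid:
        return ("guid", guid.strip())
    return ("link+pubDate", item.findtext("link"), item.findtext("pubDate"))


def unseen_items(xml_text, seen):
    """Return items whose key is not in `seen`, and remember their keys."""
    fresh = []
    for item in ET.fromstring(xml_text).iter("item"):
        key = entry_key(item)
        if key not in seen:
            seen.add(key)
            fresh.append(item)
    return fresh


seen = set()
print(len(unseen_items(FEED, seen)))  # 2
print(len(unseen_items(FEED, seen)))  # 0
```

The second pass returns nothing, which is why an entry that comes back through another reader's feed does not get republished forever, as long as its guid (or link and date) survives the round trip.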

However there is something that can still be experimented:

Since 'guid' and 'pubDate' are optional, and since you cannot uniquely identify an entry using 'title', 'link' or 'description' alone (at least I could not see any mention of this in the spec), we can create an environment where we can show that the infinite loop can occur in principle.

2 things before I wind up:

First: There is a solution to stop the infinite loop problem in RSS, although it is not obvious at first sight.
Second: This problem is something that we need to seriously consider now (at this stage of web evolution), or else it could be a major design flaw that will require ugly patches later on (remember IPv4?). And this is where a formal (standards-based) approach always helps.

Problems with Podcasts

Podcasts are the new buzz thing in the WWW. While RSS provides a mechanism to subscribe to textual feeds, Podcasts help in subscribing to audio/video content. So, instead of those small orange bars, you will now see colorful iTunes images or Odeo images.

However, there is an inherent problem with podcasts. They are not searchable. A typical podcast, for example Slashdot Review, contains many different news items. In this example, Slashdot Review contains all the important stories published in Slashdot on that day.

In RSS, if I am not interested in reading a particular news item, I can just skip it and read the next one. But in Podcasts, since all the news items are aggregated together into a single audio file, we are not able to skip certain items.

However considering the fact that Podcasts are still in their infancy, we can expect a solution soon.

One such solution is to extend the RSS type to include 'skip points' in the audio file. By 'skip point' I mean a description of which news item starts at what offset. The Podcast descriptor would contain not just the location of the file, but also the contents of the audio file. This would also require a special podcast player, which is able to read and understand the podcast descriptor. Of course, this needs to be standardized so that podcasts from all providers adhere to a single standard. Another advantage of this is that the descriptors could be searched in a standard way and podcast directories are able to show news items and the exact location of those news items in podcast files.
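A sketch of what such a descriptor and a player's use of it might look like. No such standard exists as far as I know, so the structure, the field names (`offset_seconds`, `skip_points`) and the sample values are all my own assumptions.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass
class SkipPoint:
    offset_seconds: int  # where the news item starts in the audio file
    title: str           # what the news item is about (also searchable)


@dataclass
class PodcastDescriptor:
    audio_url: str
    skip_points: list  # SkipPoint objects, sorted by offset_seconds

    def _offsets(self):
        return [point.offset_seconds for point in self.skip_points]

    def item_at(self, position):
        """Return the news item playing at `position` seconds, if any."""
        index = bisect_right(self._offsets(), position) - 1
        return self.skip_points[index] if index >= 0 else None

    def next_offset(self, position):
        """Return where the player should jump to when the listener hits 'skip'."""
        offsets = self._offsets()
        index = bisect_right(offsets, position)
        return offsets[index] if index < len(offsets) else None


# Made-up episode with three news items.
episode = PodcastDescriptor(
    audio_url="http://example.com/review.mp3",
    skip_points=[
        SkipPoint(0, "Story one"),
        SkipPoint(95, "Story two"),
        SkipPoint(240, "Story three"),
    ],
)
print(episode.item_at(100).title)  # Story two
print(episode.next_offset(100))    # 240
```

A podcast directory could index the `title` fields in the same way, so a search result can point to the exact offset inside the audio file.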

However, one problem with this technique is that it is not easy to make listeners listen to advertisements. A listener who is not interested in them could simply skip them. A second problem is that the accuracy of the podcast descriptor is in the hands of the provider.

Any other solution?

If Google came up with an RSS Reader

What would it be like if Google came up with an RSS Reader?

* It would first buy the domain “greader” or something similar.
* It would have an index of more than 8 million different feeds.
* It would offer 1 GB space for storing posts.
* It would have an excellent search feature for searching posts.
* The interface would be simple, but at the same time powerful.
* We would be able to mail any post just at the click of a button.
* It would allow us to filter posts and also label them for future reference.
* It would also allow us to make blog entries (of course the service would be integrated with Blogger.)
* It would integrate greader with other offerings like mail, groups etc.

And finally one thing…

Guess what?

Ya, you guessed it right.

It would be Beta forever. 🙂

For those of you who have started RSSing

I have been using RSS for the last 4 months or so, and I have faced some problems with the way I use it (I use My Yahoo and now My MSN as well. Don't ask me, “Why 2?”. Techies always like having more than one for some unidentified reason. For this reason, they usually have a dozen email ids. 🙂 Anyway, let's continue…)

* My Yahoo displays at most a summary of the RSS feed. So if I need to read the entire entry, I have no choice but to actually go to that site. I think I mentioned this sometime back.
* I cannot distinguish between RSS feeds that I have already read and the ones that I haven't.
* I cannot save a feed for future use.
* I cannot email a particular entry to someone else. (I can only tell them the feed link.)
* The entries are refreshed every day (I have customized My Yahoo to display at most 5 entries and a short summary for each). So what if I need to see the last 2 days' entries?

Well, apparently the problem is not with RSS but with the way I have been using it. My Yahoo and My MSN provide RSS support, but apparently they are more concerned with integration of the various services that they provide (mail, address book, etc). At least, this is true of My Yahoo.

I found a better solution yesterday. I was on the lookout for a reader specializing in RSS, that is able to satisfy my requirements.

And I entered “online rss reader” and I got a million (I don't know exactly how many) sites. Newsgator won the race. A quick registration and I was done. It has the following features:

* Easy and quick registration (not like MSN 🙂 )
* Quick feed adds.
* Import feeds (using some new format OPML. Haven't heard of it before.)
* Store feeds in separate directories.
* Distinguish read and unread entries.
* Pretty neat interface.
* Store entries, mail them to friends etc…

To summarize, it looks somewhat like a mailbox with all unread mails (I mean feed entries :P) in bold and all feeds displayed in a separate column to the left. And ya, you don't have spam here!

One interesting feature I saw was that it was able to display some images as well! I am not quite sure how this is supported in RSS. Gotta find out. Anyone out there who knows?