Categories
World Wide Web

How to write your own bookmarking tool and why

(NOTE: This is a work in progress. Last edited: 2nd May 2006.)

I have never been fully satisfied with any available bookmarking tool. Each tool has its own set of cool features to offer, but no single tool offers them all. This made me think, “Why not write my own bookmarking tool? What would it take to write one?”

This is an entry in progress. I intend to update this entry as I try out various things and write about my own experiences.

WHY my own bookmarking tool???

Although the title mentions 'how' first, let us start with the 'why'. Requirements lead to implementation.

Let us look at the question: “Why do people bookmark something?”

If you do a Google search on “Why do * bookmark”, you are bound to see some analyses people have done. There is some formal research on this topic too, and I will have to go through those papers. However, the stupid answer that you will find in the informal analyses is 'to retrieve it later'.

Ok, let us go ahead with this stupid answer and ask the next question: why would someone want to retrieve it later?

I see a bookmarking tool as an aid in Personal Knowledge Management (PKM).

If you are looking at bookmarking tools from a research angle, then the reason you would want to bookmark URLs and retrieve them later is to use the information in those URLs when you need it and to analyze it. There are problems with existing bookmarking tools if you want to use them for this purpose.

You might also bookmark something to share it with people, but that is not the problem I am trying to solve here. Existing bookmarking tools are good enough for this latter requirement, although there is scope to make them better.

And you might bookmark something for some purpose I haven't thought of, in which case I can't guarantee that writing your own bookmarking tool will help.

Let us now look at the problems with existing bookmarking tools when it comes to information extraction and analysis.

As a first step, let us look at what the requirements are and then see why existing bookmarking tools cannot help us in our requirements.

The first and foremost thing I would like to see in a bookmarking tool, when using it for research work, is the ability to enter information then and there about what the resource contains, why it is useful to me, and how it relates to my other bookmarks. I should then be able to get an overall picture of the problem at hand from the information about all the bookmarks I have for that particular problem. More often than not, you learn new concepts or technologies, and then you want to delve into the topic and learn more about it. This goes on and on.

Soon, this would result in a network of concepts, where different items are related to each other in some way.

So, the reasoning is quite clear. A bookmarking tool should aid us in creating this network and also in retrieving information from it when required. Retrieval should be as efficient as possible (remember usability?).

Let's now look at existing bookmarking tools and see why they cannot help us solve this problem.

Most online bookmarking tools are not written for personal knowledge management. They are for people who want to store links somewhere and possibly share them with a group of people. The most you can expect from these tools is a set of fields to fill in clippings, comments, etc. I have not seen a single bookmarking tool that allows me to relate information between bookmarks. (I need to have a look at Connotea.) You cannot expect these tools to do more than this, because the average user would not want a whole bunch of boxes to fill in just to store that damn URL.

But then I want more. I want bookmarking tools to be extensible. And I suspect there are people out there who feel the same.

The fields these tools provide are static and may not serve everyone's purpose. I would like to add my own fields and/or edit them when required.

Ok, ok, now tell me HOW do I write one???

There are many ways we can solve this problem. One easy solution I found was to use Lazybase. Lazybase allows anyone to design, create and share a database of whatever they like.

Ok, so how can we use Lazybase to create our own bookmarking tool?

Here are the steps to create a bookmarking tool using Lazybase:

1. Create a database and name it whatever you want. I would call it “BookmarkDB”.
2. Create an item type named “Resource”, which has the following fields: Name, Keyword, URL, Bookmark Date, Clipping, Rating, Category, Read (Yes/No), Comments, Related To etc.
3. Create a bookmarklet for “Resource” and make sure the URL field is extracted from the page URL, Name from the page title, and Clipping from the selected text.
4. Drag and drop the bookmarklet onto your bookmark toolbar.
5. You can define as many item types as you want to capture relationships between bookmarks. For example, you could define an item type “Relationship” with two URLs as its fields and a third field defining the kind of relationship (why the two are related).
6. You are ready!

The actual design of the BookmarkDB, the item types, and their relationships is up to you. You can in fact make it a collaborative experience by adding an item type 'Person' and adding it to your 'Resource' to identify who bookmarked a particular URL. All you need to do is share the 'edit' URL of your database with those you want to collaborate with. If you want to export all your bookmarks, you can use the CSV export option.
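
Just to make the data model concrete, here is a rough sketch, in Java, of what the 'Resource' and 'Relationship' item types described above might look like. This is purely illustrative; Lazybase does not generate or consume any such classes, and the field names simply follow the steps above.

import java.util.Date;

// Illustrative only: the two item types rendered as plain Java classes.
class Resource {
    String name;          // extracted from the page title
    String url;           // extracted from the page URL
    String keyword;
    Date bookmarkDate;
    String clipping;      // extracted from the selected text
    int rating;
    String category;
    boolean read;         // the Yes/No field
    String comments;
}

class Relationship {
    String fromUrl;       // the two related bookmarks...
    String toUrl;
    String kind;          // ...and why the two are related
}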

Disadvantages

The solution is not perfect, but it is good enough. I have been using Furl for quite some time now, and I will miss the 'page copy' feature that Furl offers.

But with the number of bookmarks I have and the way I want to use them, I feel Furl no longer serves me well.

Categories
World Wide Web

Google, Yahoo! and innovation

Google recently released “Google Calendar”. Time and again, Google reminds me of Jeremy Zawodny's blog post, Google is building Yahoo 2.0: Google is trying to rebuild what Yahoo and others have built, but with one killer feature that makes it irresistible.

Ok, if you search for comparisons of the Yahoo and Google services, you are bound to get thousands of entries. I don't want to do the same here. But there are some things I would like to highlight from my own personal experience.

I have tried out a lot of the Yahoo services, and the same is the case with Google. Although Yahoo has a lot of features, the innovation seems to have stopped. The mail, address book, calendar, and notes services are still in the pre-Web 2.0 phase. (Yeah, they have been promising a new look and feel, but where is it??? I am waiting.) Google, on the other hand, started off in the early Web 2.0 phase and has kept adding one product or another to its portfolio, not to mention small features to existing products.

Another striking difference is the kind of integration that exists between the services. Yahoo started with lots of services, each offered individually, with little regard for what the other services offered or how the two could be related. For example, Yahoo's calendar service seems disconnected from Mail. Then there is a Briefcase service to store files and attachments, a separate chat service, different kinds of search, different bookmarking services. The list goes on and on.

Contrast this with Google. Google provided services one after the other, carefully keeping them tightly integrated. (Is this slow poison? 🙂 Get users to use one service and lure them into the rest?) Google seems to be building a 'single page interface'. “For all your requirements on the web, use Google” is what they seem to say. You can use the calendar from the mail interface; your chat logs are in your mail; you have ample space to store all your mail (you don't need a briefcase); the search is always there no matter where you are. Search for something, and if you find it interesting, save it, label it, and find it again later.

This does not mean Google has got it all right. There is a lot still left to be done. The ultimate aim seems to be information on demand: get me the information wherever I want it, whenever I want it; get me only the information I want, and all the information I want, instantly.

All this translates to a great expectation from Yahoo's new services. Will they have this kind of service integration? Or will it just be old things in new clothing? I am waiting.

Categories
My Updates

One of the geeky ways to wish someone

My room-mate wished me a happy birthday at exactly 12:00:00. Wonder how?

Both of our systems are synced with an Internet time server. So Arun wrote a simple script to play a birthday song at exactly 12:00:00.
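
Just for fun, here is a sketch of what such a script might look like in Java. The player command and the song file are of course placeholders; the point is simply to sleep until the target instant (the system clock itself being NTP-synced) and then fire.

import java.util.Calendar;

public class BirthdaySong {
    public static void main(String[] args) throws Exception {
        // Target: the next occurrence of 12:00:00 midnight.
        Calendar target = Calendar.getInstance();
        target.set(Calendar.HOUR_OF_DAY, 0);
        target.set(Calendar.MINUTE, 0);
        target.set(Calendar.SECOND, 0);
        target.set(Calendar.MILLISECOND, 0);
        if (target.getTimeInMillis() <= System.currentTimeMillis()) {
            target.add(Calendar.DAY_OF_MONTH, 1); // already past midnight today
        }
        // Sleep until the moment arrives, then launch the player.
        Thread.sleep(target.getTimeInMillis() - System.currentTimeMillis());
        Runtime.getRuntime().exec(new String[] {
            "mplayer", "happy-birthday.mp3" // hypothetical player and file
        });
    }
}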

Now that's what I call a geeky way of wishing someone.

Thanks dude.

Categories
Technology

Now how do you do this in functional programming

Consider the following piece of code:

...
int flag = 0;
for (int i = 0; i < 10; i++) {
    if (val == arr[i]) {
        flag = 1;
        break;
    }
}
if (flag == 1) {
    System.out.println("Found.");
} else {
    System.out.println("Not found.");
}
...

In functional programming, we cannot assign values to variables; variables can only be initialized. So I can't use a 'flag': if I initialize the 'flag' outside the for loop, I can't change its value inside the loop, and if I initialize it inside the loop, I can't use it outside because of scoping.

For people new to functional programming, here's an excerpt from Wikipedia on its features (or should I call them restrictions?):

...Functional programming can be contrasted with imperative programming. Functional programming appears to be missing several constructs often (though incorrectly) considered essential to an imperative language such as C or Pascal. For example, in strict functional programming, there is no explicit memory allocation and no explicit variable assignment. However, these operations occur automatically when a function is invoked: memory allocation occurs to create space for the parameters and the return value, and assignment occurs to copy the parameters into this newly allocated space and to copy the return value back into the calling function. Both operations can only occur on function entry and exit, so side effects of function evaluation are eliminated.

I had exactly this requirement in a program of mine. I was using XSLT (which is also a functional programming language) when I came across it. I used some XPath constructs to solve it, but wondered whether there is a standard way to solve it in functional programming languages.

I am new to functional programming (although I have been doing XSLT scripting for the last year) and I am still looking for a standard solution to this pattern.
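
For what it's worth, the usual functional answer seems to be recursion (or a library function like 'exists'/'any' built on it): instead of mutating a flag, you return the result from the function itself. A minimal sketch, written in Java for continuity with the snippet above:

// Functional style: no mutable flag; the recursion returns the answer.
static boolean contains(int[] arr, int val, int i) {
    if (i >= arr.length) return false;   // exhausted the array: not found
    if (arr[i] == val) return true;      // found it: stop recursing
    return contains(arr, val, i + 1);    // otherwise, check the rest
}

// Usage:
// System.out.println(contains(arr, val, 0) ? "Found." : "Not found.");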

Categories
World Wide Web

Programmable wikis, Application wikis, Situational applications

Heard of Jot? It has been in the news for some time now, calling itself the first true application wiki.

So what's this thing all about and how is it different from normal wikis?

Before we delve into this, we need to know where normal wikis fall short and how this new concept of application wikis helps solve those problems.

Wikis in their present form contain highly unstructured data. Take the example of Wikipedia, which allows users to create pages containing information about just about anything in the world.

The information in wikis would be more useful if it could be used elsewhere. For example, I might want to just double-click on a word in my browser and view its definition (and not the entire page). Or I might want to relate the content of one page with that of another, semantically. I might also want to view content based on my current expertise level (contextual views).

In order for this to happen, we require that wikis be more intelligent. Enter application wikis.

Application wikis bring in the dynamic content aggregation that is lacking in current wikis. This means the data may not even reside in one place; it might be aggregated at runtime. And that's not all. The content might be pushed out as an RSS feed, so people can subscribe to changes made to specific sections of the wiki, or have them mailed. The basic idea is to be able to 'program' the wiki to display 'information' dynamically.

This opens up some interesting applications. An application wiki could be used, for example, to create a page for a conference. This page would contain a Google Maps plugin showing the venue, the latest news in the form of an RSS feed, weather information in another portlet, a list of all participants (which might come directly from a database), and a list of all talks (which might be maintained in a separate database for some reason).

You might note that the data does not actually exist in the wiki at all. The wiki just acts like an aggregator of content.

Now comes the concept of views. As a participant in the conference, I might be given more information about the talks, while a non-participant gets less. A speaker might get a totally different set of information, and so on for the organizers.

You probably don't even need a graphical UI to view the data. You might as well have a kind of query interface that lets you view data based on your role and your preferences.

It might be obvious, but let me clarify that the content for a particular page may come from a specific section of some other page. So I could have, say, a page on Bangalore, a page on Mysore, and a page on 'Cities in Karnataka' that fetches content from the first two. The moment the content changes in the original pages, the content viewed from the aggregating page also changes.

Yeah, there's nothing special here; it's the traditional MVC pattern applied to wikis. It had to come some day.

Two initiatives that I know in this field are from Jot and Semantic MediaWiki.

If you are a Web 2.0 or Semantic Web geek, you might have come across this by now; if you have not, you've got to check it out.

Categories
World Wide Web

Yeah I do have a URI

Inspired by Sir Timbl's blog post, Give yourself a URI, I thought I should have a URI for myself.

So here it is.

Categories
World Wide Web

Semantic Crawler – an update

This is in continuation of my blog entry on Semantic Grabbers. I did some experiments after consultation with . Thanks for the inputs.

My intention was to get a set of related words given a single word as input. I wanted to make use of the <rdf:Bag> tag that Delicious provides.

The idea I had in mind was to start by counting the occurrences of each tag in the <rdf:Bag> of all the links, and then use these counts to decide which tag to analyze next: the more frequently a tag occurs, the more likely it is to be chosen.

For example, suppose I see that RDF occurs most frequently among the links; I then select it as the next tag to analyze. I keep updating this list of tags and their frequencies as I crawl through the tags.
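
Here is a minimal sketch of that selection loop in Java. The extractTags method is a hypothetical stand-in for fetching a tag's Delicious RSS feed and pulling the tags out of its <rdf:Bag> elements; everything else is just the frequency bookkeeping described above.

import java.util.*;

public class TagCrawler {

    // Hypothetical stand-in: fetch the RSS feed for 'tag' and return
    // every tag appearing in the <rdf:Bag> of its links.
    static List<String> extractTags(String tag) {
        return Collections.emptyList(); // a real version would parse the feed
    }

    public static void main(String[] args) {
        Map<String, Integer> freq = new HashMap<String, Integer>();
        Set<String> visited = new HashSet<String>();
        String current = "rdf"; // the root word

        for (int step = 0; step < 10 && current != null; step++) {
            visited.add(current);
            // Update the occurrence count of every co-occurring tag.
            for (String t : extractTags(current)) {
                Integer n = freq.get(t);
                freq.put(t, n == null ? 1 : n + 1);
            }
            // Choose the most frequent tag not yet visited as the next one.
            current = null;
            int best = 0;
            for (Map.Entry<String, Integer> e : freq.entrySet()) {
                if (!visited.contains(e.getKey()) && e.getValue() > best) {
                    best = e.getValue();
                    current = e.getKey();
                }
            }
        }
        System.out.println("Tag frequencies: " + freq);
    }
}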

Here's the problem I faced: very generic tags like tech, development, and tutorial are likely to be used in more links than others, so the crawler was misled. The selected tag becomes more and more irrelevant as the crawl proceeds.

There are some solutions that I have in mind.
1. Weight each tag by its relationship to the root word (i.e. the given word); see the sketch after this list.
2. Study 'all' the tags for the entire list, possibly including the descriptions as well, and then look at the relationships. (This emerged after my discussion with .)
3. Provide more than one word as input and use these words together to determine the set of related words.
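
For the first option, one simple weighting (my assumption of what weighting against the root word could mean) is to normalize a tag's co-occurrence with the root word by its overall frequency, so that globally common tags like 'tech' score low even though they appear everywhere:

// Hypothetical weighting for solution 1: tags that co-occur with the
// root word often, relative to how common they are overall, score high.
static double weight(int cooccurrencesWithRoot, int totalOccurrences) {
    return (double) cooccurrencesWithRoot / totalOccurrences;
}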

Determining relationships between words in folksonomies is not easy because of the lack of contextual information. But folksonomies are surely a rich source of information waiting to be exploited.

The result will be available here for a few days.

Categories
My Updates

Technologix 2006

I went to my college today to deliver a talk on behalf of ISL. The talk, on 'Innovation at IBM', was delivered jointly by me and Tariq Aftab.

We spoke about RFID, IBM Everywhere Interactive Display, and the IBM Smart Surveillance System (S3). The talk lasted about 1 hour 15 minutes. Overall, the response was good (the auditorium was jam-packed) and there were questions (which means people understood what was going on).

Ananth Narayan from LTC also accompanied us. Ananth is also an alumnus of SJCE.

We then had the privilege of meeting the principal, the head of the department of computer science and several other lecturers.

It was good to be back in college after such a long time. The atmosphere was pleasant, and I realized how much I miss college.

Overall it was a satisfactory mission.

Categories
World Wide Web

Happy birthday ? /\ /\/ /\ /\/ `/ /\

Yeah, this is the day I get to speak out and Gautham shuts up.

I am 2 years old today. I was born on 10th March 2004 at 12:45 in Mysore. Last year, I celebrated my birthday like this. I was called Modus Vivendi then, but now I am called ? /\ /\/ /\ /\/ `/ /\.

So here's how I am progressing:

Journal entries: 147

Total comments: 353

1 120
2 51
3 41
4 27
5 19
6 13
7 13
8 12
9 11
10 8
11 7
12 5
13 5
14 4
15 4
16 2
17 2
18 2
19 2
20 1
21 1
22 1
23 1
24 1

Generated using ljstats.

Categories
World Wide Web

A semantic grabber

So what's a semantic grabber? If you do a Google search for it, you get, umm, '0' results (as of 08-March-2006).

So this is definitely not a term used in the wild. What is it, then?

Well, the story began like this. I started off experimenting with the evolving pub-sub model, wherein you give a list of keywords and you get the latest feeds based on those keywords. I was trying to come up with an optimal filter that would give me really crisp information. This is a tough job, especially in the as-yet semantically immature WWW.

My first requirement was to get a good list of keywords. For example, I would like to know all the keywords related to the semantic web. I know that words like RDF, OWL, and RDQL are related to the semantic web, but I want a bigger list. (Does this remind you of Google Sets?)

Where can I get such a list? I turned to Delicious. If you are a Web 2.0 geek, you are surely aware of the rdf:Bag tag in its feeds, which gives you the list of all tags for a particular link.

For example, the RSS page for the tag 'rss' has a link with the following tags:

<taxo:topics>
  <rdf:Bag>
    <rdf:li resource="http://del.icio.us/tag/rss"/>
    <rdf:li resource="http://del.icio.us/tag/atom"/>
    <rdf:li resource="http://del.icio.us/tag/validator"/>
  </rdf:Bag>
</taxo:topics>

So you know that rss, atom, and validator are 'related' keywords. Of course, there is no context here, so people might tag http://www.google.com/ as 'irc'. (This is true; I have seen people tag Google as IRC.) But if you attach a weightage to tag relationships, you can soon come up with a model that reveals tag clusters.
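
For the curious, here is a rough sketch of pulling those related tags out of a feed with the standard Java DOM parser. It assumes the feed looks exactly like the snippet above, i.e. the tag name is the last path segment of each resource URL:

import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class BagReader {
    // Collect the tag names from every <rdf:li resource="..."/> element.
    public static List<String> relatedTags(String feedUrl) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Document doc = factory.newDocumentBuilder().parse(feedUrl);
        NodeList items = doc.getElementsByTagNameNS(
                "http://www.w3.org/1999/02/22-rdf-syntax-ns#", "li");
        List<String> tags = new ArrayList<String>();
        for (int i = 0; i < items.getLength(); i++) {
            String resource = ((Element) items.item(i)).getAttribute("resource");
            // e.g. http://del.icio.us/tag/rss -> "rss"
            tags.add(resource.substring(resource.lastIndexOf('/') + 1));
        }
        return tags;
    }
}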

Ok, now back to the topic of semantic grabbers. The idea came to my mind when I thought of writing a crawler that crawls Delicious RSS feeds and tries to find tag clusters. This crawler is not interested in the links themselves, but in the data that resides in them. That clearly distinguishes it from a normal HTTP grabber, which blindly follows links and grabs pages.

Soon, with the evolution of RDF, I guess there will be more such crawlers on the web (isn't this what agents are?), and people are already talking about how we can crawl such a web. This is my first attempt at it.

So ditch Google Sets (if you've tried it at all) and use a 'semantic grabber'. 😉