Categories
World Wide Web

Web 2.0 service aggregation tools

Recently I noticed a new trend: Web 2.0 aggregation tools. These are tools that combine other Web 2.0 services and give you a single hosted page containing all of them. Most commonly, they aggregate delicious bookmarks, flickr photos, blogspot posts and RSS feeds in one place.

Examples of such tools are:
Suprglu
Squidoo
Peoplefeeds

Here are my pages:
My Suprglu page
My Semantic Web page @ Squidoo
My Peoplefeeds page
(I had signed up for Squidoo a while back and got a chance to check out their public beta offering.)

I found some inherent problems in these services.

What I tried to do was set up a page containing feeds of interest to me, built from various tag search results. In particular, I wanted it to aggregate feeds from delicious, Technorati, Google blog search, Yahoo news search, Feedster, Icerocket etc. I wanted search results for:
(semanticweb OR semantic-web OR semweb OR sw OR semantic_web) AND (owl OR rdf OR rdfs OR ontology OR ontologies OR taxonomy OR rdql OR SPARQL OR w3c OR metadata OR semantic OR semantics OR knowledge)

These are the problems I faced:
* Most tag search engines are not intelligent enough to provide RSS feeds for such searches.
* The page is not intelligent enough to remove duplicate links. For example, suppose I have a page bookmarked in delicious with the tags semantic-web and rdf; that link shows up in both tag searches, so when I combine the tag search results the page appears twice. (A rough sketch of the de-duplication I expect is shown after this list.)
* Most of the service providers do not have an option to turn off non-English pages. So many Japanese and French (or Latin?!) pages turn up in the results.
* I want a hierarchy. I should be able to create a group “Semantic web” which contains feed results for the search query given above and another group, say “Web 2.0” which has a similar query. I should be able to relate the results of “Semantic web” group with those of “Web 2.0”.
* The ability to view feeds using different views – “Technical” and “Non-technical” or “Office related” or “Non office related”.
* Finally, there should be a way to schedule my reading. I would like to read my “Technical” feeds once a day and “Comics” once a week. How do I separate them?
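
To make the duplicate-link point concrete, here is a minimal sketch of what I would expect such an aggregator to do internally. This is not how any of these services actually works; the feed URLs are placeholders, and the only assumption is that the tag searches expose ordinary RSS that a library like feedparser can read.

    import feedparser

    # Placeholder tag-search feeds; any RSS/Atom URLs would do here.
    FEED_URLS = [
        "http://example.com/tag/semantic-web/rss",
        "http://example.com/tag/rdf/rss",
    ]

    def aggregate(urls):
        """Merge several tag-search feeds, dropping duplicate links."""
        seen_links = set()
        merged = []
        for url in urls:
            for entry in feedparser.parse(url).entries:
                link = entry.get("link")
                if link in seen_links:
                    continue  # same page reached via another tag search
                seen_links.add(link)
                merged.append(entry)
        return merged

    for entry in aggregate(FEED_URLS):
        print(entry.get("title"), "-", entry.get("link"))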

I am still looking for a solution.

Categories
World Wide Web

Any relation between President Bush’s speech and Semantic Web?

This blog entry is about a news item which says that it was a university professor who originally wrote President Bush's speech. How was that found out, and what are its implications?

Well, look at this article: National Strategy for victory in Iraq.

Now download the PDF document and view its properties. Do you see “feaver_p”?
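
If you would rather do the same check from a script than through the viewer's properties dialog, something like the following is enough. This is just a sketch using the pypdf library; the file name is a placeholder for wherever you saved the document.

    from pypdf import PdfReader

    # Placeholder path to the downloaded strategy document.
    reader = PdfReader("national_strategy_victory_in_iraq.pdf")

    info = reader.metadata  # the PDF document information dictionary
    if info is not None:
        print("Author: ", info.author)
        print("Creator:", info.creator)
        print("Title:  ", info.title)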

Here's the reasoning the NYTimes article gives about this:

The role of Dr. Feaver in preparing the strategy document came to light through a quirk of technology. In a portion of the document usually hidden from public view but accessible with a few keystrokes, the plan posted on the White House Web site showed the document's originator, or “author” in the software's designation, to be “feaver-p.”

This has raised concerns about metadata harming the privacy of people.

Some more interesting analysis here and here (these are articles which relate the metadata issue to the Semantic Web).

Ethics is going to be a really hot field soon. 🙂

Categories
World Wide Web

All your data is ours, but, but wait, what about privacy? contd…

I had recently blogged about privacy concerns with regard to storing data online. And this is what I found today: Do you trust Google?

Among the various things that the article mentions, I found these interesting:

* Google working with scientists to make available data related to human genomes. (Now who is going to gift me Google Story?)

* Google providing personal data based on RFID tags.

What is Google up to?!

Categories
My Updates

ISL Fusion Funda

ISL Fusion Funda was the annual cultural event, held at the Nimhans Convention Center on 26th November, 2005.

Categories
World Wide Web

Speech recognition -> Podcasts -> Podzinger

Podzinger is just what I was looking for! Podzinger uses speech recognition to figure out the words spoken in a podcast and then lets us search within podcasts! Although not quite 100% accurate, it is quite impressive.

This can actually be used in a number of ways:

* Just search for keywords the way you would do a normal search and get the podcasts of your choice. Podzinger actually provides RSS alerts for these keywords, so matching podcasts are delivered on the fly to your favorite reader.

* I had recently written about the Problems with podcasts, where I had mentioned:

…there is an inherent problem with podcasts. They are not searchable. A typical podcast, for example, Slashdot Review contains many different news items. In this example, Slashdot review contains all the important stories published in Slashdot in that day.

In RSS, suppose I am not interested in reading a particular news item, I can just skip and read the next one.

Podzinger helps us with this.

Usually podcast publishers provide you with a description, which tells you what the podcast contains. Just use this to search in Podzinger and you can magically be transferred to the exact location where that particular item starts.

One problem, however: Podzinger works only with IE 5.0+ and RealPlayer. (For the sake of using this utility, you can definitely go back to that browser. 🙂 )

If you care about podcasts, you definitely should give it a try!

Categories
General

Guess who?


uhtttaauuthtmumgtmauaaautmmaaaauthmuaahggagtuumamahmahuagauu
athumamtatmmmtuhthhtmgtamhgghaaamhthaaaguatahaaatmutaaaaaguh
mmmumumagtggaaamaatahutttggaaggughhhgummmahaaatmuammhggmagah
aagamgaguumtgamuahaaumgtamttahaahmghugggamathtgtmhmmhhmaauat
mhmgghmhmatahgagtgahuhhtmutagatgmahuuaauugaataaautmhgtgtmaaa
htuhggmttmmgaaathutguuahuaguatgumamuautaughagautgtaaaggmauat
htagaaahaaatuauhttuggumagggttggagatmhhuhgmgthahgmmutmhuaguuh
gmmaagmuhumgamgugmuautmhtmmahtumguhaaaaaamgmatauuaaaumuamtat
hauaaaahaghghatgaahumhmamauhhuaaahmmhmmtaaththaathggaauaahgm
ggughamhtaggthgmaamamaaatah auugtamhaahagmahaahtgmguttghuaaut
tataagmggmtaaamgaaatgathaamgtuamgthagttgaatmammtmthaaahauahh
uuaahtmtttgumtamtugttuaggammugatgmgughguuuutaguhmtmuattagaaa
gatguumumgaumhatamgmggumhuhhaahamtuhhtammgaaghtmguamhmtgtuag
tguhagauhumhamggaaauhgauamaghutuatuhgagamhtuuahttauauggumaug
amagaatghautttamauaguguhhamauugtatmguaahattathuttugaatahggha
ggaagmaghaguahmthhagmaumthtagagtahauhamhgtghamaagthtahahhgag
uhamhahhaaamhhaagggaauhmuaagmgtaguhuamahtmahhuhahhmhuhauaumm
hagagattamgtmmhmmaatuaautgtahtugtmhtmgmggahgmatghtaggamaammu
mthgahmaataggauamhtuuhggataguaagmhhugmauhgmmmtgathhahuttumga
aagagagmtmhgtuamauh thtgaautamtgatamauhttugtmtggghaumggamhttm
tggmatgtahammuhataahmaaughuutuhtaatmaamthmahhaghagtghuhgauhm
tahgtttmmatgtaahggmtauahtugtmmtauuumaahughugmtugataaumggmgau
muamtghamaamgtahhtttaauaaaatmuaamggaaahtmhgguthamtugmhamuuma
uammhuatmmuaaahutammuhtmmtgaatgumhthhhumautaugauaamtammmmmat
ghuammmmthtuthhhtgtuhuuhgmahtaaaauhtttuaggaamugugautaaghuauu
mtatuhghhutmtggauhugghamgmaatagmtauhuahghagagagthammaghhaggu
aautgaahtahhataaatattauuguttgaagahggamaaaatmtgtumamguaumaatg
hgugghtummgthuuauaahtaamagahagaugmmguhgmgugmtmhguggaumgmmagh
hauhgmummtagtaathauammtaggaaghgugahumhtuhaugtauutgattaaahggu
mtauuamttaamaugtamhhgaaaahuhuutguuaagauuaaagtumhuamtgmgmghua
aahaaggtaahhatggagatmgtumaatmagaaguhuhghmagaaumathtathhgtaaa
ghgtmuguhugtmuutmgaahaumuahgaaagmahtuahhtamtaguhahaggagautau
atmuaahthaghhmhhtaumhtuhaaauguhttatatahatutahmamtauhmamgtugh
aagtuaauautgghtamuhgataaagamagtttataghahaaaaamaamghmuhmmgaag
hgtaathtgghthgahhaaaagaatamaguhahmaamttamaatguaatmmtaumahamt
hghtaghhmgaauhaaahhhauuagttatatmmghummhaumthmgumtaammahautgg
ttahguuatmuhatuthtmmgttmuumaaaahahahaumuhgtagmauhmhahhamtmga
augtumhaauaaugmggagmauahatumhhhhauamagauthhtgahatumahgmauaam
gtgmatatgthumttuagmuatauhhgggmagtgghathttmtmagatuhattuaguath

Time to change my LJ pic. This goes better with the ? / // / // `/ ? / /-/ theme.

Categories
World Wide Web

All your data is ours, but, but wait, what about privacy?

It started with Gmail, as far as I can remember. Google provided 1 GB of space and people thought, why not store everything online? As I have already said a zillion times, this is what the single data source concept is all about. And now it is back with a bang, with Google Base.

But a thought struck me today.

How can we rely on people we don't even know? What is the guarantee that Google will not misuse our data? You might say, “What will Google do with MY data?”, but think again. The world becomes very restricted in the absence of trust. You are not ready to store your confidential or private files in such a place. That 100 billion dollar idea that you wrote down last night? Are you ready to store it in an online data source?

The solution?

It would be better if Google (or anyone else, for that matter) provided the same service but did not know what data we store.

The idea is simple.

Encrypt all data as soon as it is created, using a key that depends on the user who created it. Decrypt it only when you need it. A mediator between the client interface and the server is responsible for the encryption and decryption, and the mediator of course lives on the client side.
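
Here is a minimal sketch of what such a client-side mediator could look like, using Python's cryptography library. The passphrase, the salt handling and the “upload” are placeholder assumptions; the only point is that whatever leaves the client is ciphertext.

    import base64
    import os

    from cryptography.fernet import Fernet
    from cryptography.hazmat.primitives import hashes
    from cryptography.hazmat.primitives.kdf.pbkdf2 import PBKDF2HMAC

    def derive_key(passphrase: bytes, salt: bytes) -> bytes:
        """Turn a user passphrase into a symmetric key, so the key depends on the user."""
        kdf = PBKDF2HMAC(algorithm=hashes.SHA256(), length=32,
                         salt=salt, iterations=200_000)
        return base64.urlsafe_b64encode(kdf.derive(passphrase))

    # The salt would normally live with the user's local profile, not on the server.
    salt = os.urandom(16)
    fernet = Fernet(derive_key(b"my secret passphrase", salt))

    # Mediator: encrypt on the way out, decrypt only when the data is needed again.
    plaintext = b"That 100 billion dollar idea..."
    ciphertext = fernet.encrypt(plaintext)   # this is all the server ever stores
    assert fernet.decrypt(ciphertext) == plaintext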

And in the world of semantic web services, you can expect companies to encrypt all the data they generate. So it is OK if you store your confidential files or your company's vision document in the same single data source that you use to publish your photos to the public! (This sounds like a horror story now, but it is perfectly valid.) Accidental leaks will not be a problem.

You don't have to worry about whether someone will access that data, or whether someone will misuse it. Any copies made of the document will be useless, as people simply cannot make sense of them.

Security features like encryption and digital signatures are going to be a very important part of technological evolution in the years to come. You can bet on it!

Categories
Technology

Technology and its consequences – an update

I recently blogged about Slaves of technology, where I mentioned that over-dependence on technology might create problems.

Here's what I found today regarding the responsibilities of humans in technological evolution. This is an audio excerpt from Ray Kurzweil's interview at Accelerating Change 2005.

Categories
General

Random thought

If you are committed, luck is always on your side… ALWAYS.

Categories
World Wide Web

RSS hacking – some observations

I tried simulating the situation that I had mentioned in my previous blog entry on Gmail forwarding and service interoperability – an interesting observation.

I first opened a new account with Reader1 (I don't want to name it) and subscribed to my blog's RSS feed using it. Then, using Reader2, I subscribed to Reader1's RSS feed. Finally, I subscribed to Reader2's RSS feed using Reader1.

Again, nothing happened. The reason?

The RSS 2.0 specification says that there should be one 'channel' element within the root 'rss' element, and that 'channel' can contain any number of 'item' elements. 'title', 'link' and 'description' are the mandatory elements of 'channel'; within an 'item' all elements are optional, although at least one of 'title' or 'description' must be present.

Usually, an RSS feed includes a 'pubDate' element for each item, although it is not mandatory. Items also usually include a 'guid', a Globally Unique Identifier, which identifies the item uniquely; 'pubDate' together with 'link' can also give a hint of a duplicate entry. So readers usually identify duplicate entries, and a loop does not occur.
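
As a rough sketch of the kind of check a reader could make (not necessarily how any particular reader does it), here is how duplicates might be detected with feedparser, which exposes 'guid' as 'id' and 'pubDate' as 'published':

    import feedparser

    def entry_key(entry):
        """Prefer the guid; fall back to (link, pubDate) as a duplicate hint."""
        if entry.get("id"):                              # maps to <guid>
            return ("guid", entry["id"])
        return ("hint", entry.get("link"), entry.get("published"))

    def new_entries(feed_url, seen_keys):
        """Return only the items we have not seen before."""
        fresh = []
        for entry in feedparser.parse(feed_url).entries:
            key = entry_key(entry)
            if key not in seen_keys:
                seen_keys.add(key)
                fresh.append(entry)
        return fresh

Without a 'guid', the (link, pubDate) pair is only a heuristic, and that gap is exactly what the experiment below relies on.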

However, there is still something that can be experimented with:

Since 'guid' and 'pubDate' are optional, and since you cannot uniquely identify an item using 'title', 'link' and 'description' alone (at least I could not see any mention of this in the spec), we can create an environment in which we can show that the infinite loop can occur, in principle.

Two things before I wind up:

One: there are solutions to stop the infinite loop problem in RSS, although they are not obvious at first sight.
Two: this problem is something we need to consider seriously now, at this stage of web evolution, or else it could become a major design flaw that requires ugly patches later on (remember IPv4?). And this is where a formal, standards-based approach always helps.