When I built the ‘Twitter Trending Topics’ application, one of the things I had in mind was to see how quickly, and how cheaply, an application could be built.
While the application is working like a charm, a day into the launch I already see a few issues with the hosting solution I chose: Google AppEngine.
- All your data is in Google BigTable
This is the biggest problem I see with Google AppEngine. While the concept of BigTable is good for storing data in general, the restrictions it comes with are painful. For example, you cannot run a query that returns all of your data: there is a limit of 1000 results per query, and, as this post mentions, you cannot use offsets greater than 1000 either, which in essence caps any single query at roughly 2000 entities.
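To make the ceiling concrete, here is roughly what it looks like in the Python datastore API; the TrendingTopic model is a made-up stand-in for whatever your application stores:

```python
from google.appengine.ext import db

class TrendingTopic(db.Model):
    # Hypothetical model, standing in for whatever your app keeps in the datastore.
    name = db.StringProperty()
    created = db.DateTimeProperty(auto_now_add=True)

query = TrendingTopic.all().order('-created')

first_batch = query.fetch(1000)                # 1000 is the maximum fetch size
second_batch = query.fetch(1000, offset=1000)  # offsets beyond 1000 are rejected,
                                               # so ~2000 entities is the most any
                                               # single query can ever give you
```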
Full-text search, although available, is quite restrictive. There is the concept of a SearchableModel, but you can’t even start comparing it to Lucene/Solr.
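For reference, the SearchableModel style of search looks roughly like this (a sketch with a hypothetical Tweet model):

```python
from google.appengine.ext import db
from google.appengine.ext import search

class Tweet(search.SearchableModel):
    # Hypothetical model; SearchableModel indexes its string/text properties.
    author = db.StringProperty()
    text = db.TextProperty()

# A bare keyword search is about all you get: no field boosts, no faceting,
# no fuzzy matching, none of the query syntax you would expect from Lucene/Solr.
results = Tweet.all().search('trending topics').fetch(20)
```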
There is no bulk data export functionality. Worse, there is no bulk data delete functionality either. The way people work around it is to write delete handlers that are then called via ‘curl’, and since those requests are subject to time-out limits, you have to split the task into chunks and delete data in batches.
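The usual hack looks something like the handler below, a sketch with made-up names, mapped to a URL that you then hit with ‘curl’ over and over until the data is gone:

```python
from google.appengine.ext import db
from google.appengine.ext import webapp
from google.appengine.ext.webapp.util import run_wsgi_app

class TrendingTopic(db.Model):
    # Hypothetical model; substitute whatever kind you need to purge.
    name = db.StringProperty()

class PurgeHandler(webapp.RequestHandler):
    def get(self):
        # Delete a small batch per request so we stay well under the time-out.
        batch = TrendingTopic.all().fetch(200)
        db.delete(batch)
        self.response.out.write('deleted %d entities\n' % len(batch))

application = webapp.WSGIApplication([('/tasks/purge', PurgeHandler)])

if __name__ == '__main__':
    run_wsgi_app(application)
```

A shell loop around `curl http://yourapp.appspot.com/tasks/purge` then does the rest, stopping once the handler reports zero deletions.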
- Cron job support is restrictive
In general, it is expected that cron jobs take time to execute. In Google AppEngine, however, cron jobs have a 30-second time-out limit, so if your cron task is not completed within 30 seconds it throws an exception. The solution is to split your tasks across multiple cron jobs.
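In practice that means each cron hit does a small, bounded chunk of work and remembers where it left off. A rough sketch, with a made-up model and a cron.yaml entry assumed to point at /cron/recompute every few minutes:

```python
from google.appengine.api import memcache
from google.appengine.ext import db
from google.appengine.ext import webapp
from google.appengine.ext.webapp.util import run_wsgi_app

class TrendingTopic(db.Model):
    # Hypothetical model; the 'score' update stands in for the real work.
    name = db.StringProperty()
    score = db.IntegerProperty(default=0)

class RecomputeHandler(webapp.RequestHandler):
    """Cron hits this URL every few minutes; each hit handles one small chunk."""

    def get(self):
        query = TrendingTopic.all().order('__key__')
        last_key = memcache.get('recompute_last_key')
        if last_key:
            query.filter('__key__ >', db.Key(last_key))
        batch = query.fetch(100)
        for topic in batch:
            topic.score += 1                 # the real computation goes here
        db.put(batch)
        if batch:
            # Remember where this run stopped so the next cron hit can resume.
            memcache.set('recompute_last_key', str(batch[-1].key()))
        else:
            memcache.delete('recompute_last_key')   # done; start over next time

application = webapp.WSGIApplication([('/cron/recompute', RecomputeHandler)])

if __name__ == '__main__':
    run_wsgi_app(application)
```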
- A few things are broken in the development sandbox
In theory, an application that runs in your sandbox should work fine in the production environment. Unfortunately, I came across a few bugs that are preventing me from running the application in the sandbox. Google is aware of some of these problems and is fixing them.
- Google AppEngine apps cannot be deleted
There is a limit of 10 applications per user. Further, the applications cannot be deleted.
- No way to download access logs
It is not easy to download your access logs. The dashboard provides a way to search within them, but that is hardly a substitute for churning through the raw logs with awk/sed/perl, is it?
- GQL sucks
If you have used GQL queries, you will know what I am talking about. It takes a while to get used to what is available and what is not.
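A couple of examples of where the surprises are; the TrendingTopic model is hypothetical, but the restrictions are not:

```python
from google.appengine.ext import db

class TrendingTopic(db.Model):
    name = db.StringProperty()
    count = db.IntegerProperty()
    created = db.DateTimeProperty(auto_now_add=True)

# This much is fine: bound parameters, simple filters, ORDER BY.
hot = db.GqlQuery(
    "SELECT * FROM TrendingTopic WHERE count > :1 ORDER BY count DESC", 100)

# None of the following has an equivalent:
#   SELECT name FROM ...                 (no selecting individual properties)
#   ... JOIN ...                         (no joins of any kind)
#   WHERE name LIKE 'twit%'              (no LIKE; the usual trick is
#                                         WHERE name >= 'twit' AND name < 'twiu')
#   WHERE count > 10 AND created > :1    (inequality filters on more than one
#                                         property are rejected)
```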
- No way to restrict the bad bots
If you see one misbehaving bot sucking up your bandwidth, there is no way to restrict that one IP (or a range of IPs) from accessing your application.
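The closest thing to a workaround is checking the remote address in your own handler, which is not the same thing at all: the request has already been accepted and counted against your quota by the time your code sees it. A sketch:

```python
from google.appengine.ext import webapp
from google.appengine.ext.webapp.util import run_wsgi_app

# Hypothetical blacklist; in practice it ends up hard-coded or in the datastore.
BLOCKED_IPS = set(['192.0.2.10', '192.0.2.11'])

class MainHandler(webapp.RequestHandler):
    def get(self):
        # By the time this runs, AppEngine has already accepted the request and
        # counted it against your quota, so this only shrinks the response; it
        # does not actually keep the bot away from your application.
        if self.request.remote_addr in BLOCKED_IPS:
            self.error(403)
            return
        self.response.out.write('Twitter Trending Topics')

application = webapp.WSGIApplication([('/', MainHandler)])

if __name__ == '__main__':
    run_wsgi_app(application)
```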
While Google AppEngine has these restrictions, and many more, it is still free for the most part, and so acts as a good substitute for Amazon EC2 for trying out your application. But is it good for serious commercial application development?