What I dislike about Google AppEngine

When I built the ‘Twitter Trending Topics’ application, one of the things I had in mind was to see how quickly an application can be built in the most economical way.

While the application is working like a charm, a day into the launch I already see a few issues with the hosting solution that I chose, Google AppEngine.

  • All your data is in the Google BigTable
    This is the biggest problem I see with Google AppEngine. While the concept of BigTable is good for storing data in general, the restrictions it comes with are painful.

    For example, you cannot perform a query that returns all of your data, because there is a limit of 1000 tuples per query. As this post mentions, you can’t even use offsets greater than 1000, which in essence limits your queries to reaching at most 2000 tuples.

    Full text searching, although available, is quite restrictive. There is the concept of a SearchableModel, but you can’t even start comparing it to Lucene/Solr.

    There is no bulk data export functionality. Worse, there is no bulk data delete functionality either. People work around this by writing scripts that delete data and calling them via ‘curl’; because those requests are subject to time-out limits, you need to split the task into chunks and delete the data in batches.

  • Cron job support is restrictive
    In general, cron jobs are expected to take time to execute. However, in Google AppEngine, cron jobs have a 30-second time-out limit, so if your cron task is not completed in 30 seconds it throws an exception. The workaround is to split your tasks across multiple cron jobs.
  • A few things are broken in the development sandbox
    In theory, an application running in your sandbox should work fine in the production environment. Unfortunately, I came across a few bugs which are preventing me from running the application in the sandbox. Google is aware of some of these problems and is fixing them.
  • Google AppEngine apps cannot be deleted
    There is a limit of 10 applications per user. Further, the applications cannot be deleted.
  • No way to download access logs
    It is not easy to download your access logs. The dashboard provides a way to search within your access logs, but that is no substitute for churning through the logs with awk/sed/perl, is it?
  • GQL sucks
    If you have used GQL queries, you will know what I am talking about. It takes a while for you to get used to what is available and what is not.
  • No way to restrict the bad bots
    If you see one misbehaving bot sucking up your bandwidth, there is no way to restrict this one IP (or a range of IPs) from accessing your application.
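
The "delete in batches" and "split the work to stay under the time-out" workarounds mentioned in the list above can be sketched roughly as follows. This is only an illustration in plain Python: `FakeStore` is a hypothetical in-memory stand-in and not the AppEngine datastore API, and the batch size of 500 and the 25-second budget are assumed values chosen to leave some margin under the 30-second limit.

```python
import time


class FakeStore:
    """Hypothetical in-memory stand-in for a datastore (NOT the AppEngine API)."""

    def __init__(self, keys):
        self._keys = list(keys)

    def fetch_keys(self, limit):
        # Return at most `limit` keys, mimicking a capped query.
        return self._keys[:limit]

    def delete(self, keys):
        doomed = set(keys)
        self._keys = [k for k in self._keys if k not in doomed]

    def count(self):
        return len(self._keys)


def delete_in_batches(store, batch_size=500, budget_seconds=25.0,
                      clock=time.monotonic):
    """Delete batches until the store is empty or the time budget is spent.

    Returns (deleted_count, finished). When finished is False, the caller
    is expected to invoke the function again in a later request.
    """
    deadline = clock() + budget_seconds
    deleted = 0
    while clock() < deadline:
        keys = store.fetch_keys(batch_size)
        if not keys:
            return deleted, True
        store.delete(keys)
        deleted += len(keys)
    # Budget exhausted: report whether anything is left to delete.
    return deleted, store.count() == 0
```

Each ‘curl’ call or cron run would invoke `delete_in_batches` once and be repeated until it reports that it has finished, so no single request has to outlive the time-out.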

While Google AppEngine has these restrictions, and many more, it is still free for the most part, and so acts as a good substitute for Amazon EC2 for trying out your application. But is it good for serious commercial application development?

5 Comments

  1. Givas Travera

    June 22, 2009 at 6:15 pm

    Well, GQL of course sucks (writing SQL by hand sucks even more), but why do you use it, at all? There’s a much nicer filter() API – no need to learn a new sucky language.

  2. Hello Givas,
    I have seen the filter API, although I haven’t used it in my application. Will be interesting to see how that pans out.

  3. Check out our new Task Queue API for doing background work in a more flexible way; that should help address your Cron issues and the 1000 result limit:
    http://code.google.com/appengine/docs/python/taskqueue/overview.html

    We have support for bulk upload and deletion:
    http://code.google.com/appengine/docs/python/tools/uploadingdata.html

    You can download your request logs:
    http://code.google.com/appengine/docs/python/tools/uploadinganapp.html#Downloading_Logs

    Otherwise, do you have a list of the bugs you hit in the development server? Feedback from our users is very helpful!

    Thanks.

  4. That was quick! Thanks Brett for the links and pointers.

    As for the bugs in the development server, I am currently stuck with this one:

    http://code.google.com/p/googleappengine/issues/detail?id=1446

    which has rendered my development sandbox useless. The same application works fine in the production environment.

    The other issue I have faced is not being able to find the final URL that the redirects of a fetch resolve to, in other words, the equivalent of urllib.urlopen’s geturl() function.

    Thanks,
    Gautham

  5. No more 30-second limit for background work – With this release, we’ve significantly raised this limit for offline requests from Task Queue and Cron: you can now run for up to 10 minutes without interruption.

    from http://googleappengine.blogspot.com/2010/12/happy-holidays-from-app-engine-team-140.html

    [Edited: Admin removed link to an irrelevant site - for more info refer to - http://buzypi.in/about/comment-acceptance/ ]
