2012-01-17

Working with App Engine backends

This post is all about backends in App Engine, but first I want to make two personal points.

One is that I'm sorry I have not been updating this blog (or really contributing to Python and its great community) more this past year. It was a rather insane 2011 for me personally: I finished my Ph.D., got married, moved to SF  (w/o my wife thanks to immigration), started working at Google full-time, moved to Toronto (yes, I moved internationally within the span of 6 months), transferred to the Google Waterloo office and started on a new team. In other words I have been stressed out and busy and busy continuously for what feels like ages. But now that I am back with my wife and I am done with the crazy international moving I don't expect 2012 to be anywhere near as nuts (nor any other year in the near future for that matter if I can help it).

Two, I am no longer on the App Engine team, so all that I am about to say (and will also say in the future as I have a couple of posts to write on App Engine) is from me as just another person using the service. I have been working on a redesign of a website of mine in my spare time and these next couple of posts are based on that work (if you following me on Google+ you know what it is, but I'm not ready for a full-blown Internet debut so I'm not going to link here yet). I don't have any insider knowledge here nor speak for the team.

With all of that out of the way, let's talk about backends! First you need to understand what an instance is to App Engine. When you hit a website hosted on App Engine, an instance is what is serving that request. Now an App Engine app can have anywhere from 0 to thousands of instances up and running at any one time to handle traffic. So think of an instance as a copy of your app that is running on App Engine serving traffic.

Normally you have some time restrictions on how long any of your instances can take to service a request. If it is an HTTP request from the general Internet you have 60 seconds. If it is from a cron job or push queue (i.e. from within your own app) you have 10 minutes. Now both time restrictions are actually rather long and you can accomplish a lot within those restrictions. And in the case of push queues, you can break your work up into multiple tasks so that you really can take as much time as you need as long as you can slice your work into 10 minute time chunks.

But there are just those times when you want something that will take more than 10 minutes, such as a process that is constantly polling some other service to get data. In those instances you want a backend since they are instances that run for as long as you want (assuming App Engine doesn't kill it for misbehaving or some other sysadmin reason). You can have a resident backend which never stops running or a dynamic backend that will stop running when it is done executing, but either type can take about 24 hours to run if it wants.

The way you use instances is to send requests to the backend's subdomain. You name your backends in your backend.yaml file which becomes the subdomain off of your appspot.com subdomain for addressing your backend. So, assuming you have a backend name "forever" for an app named "spam", the domain for the instance will be forever.spam.appspot.com.  You can then create URL handlers in your instance and hit those URLs to cause your instance to do it's long-running thing.

Now for those of you who have been using App Engine for a while you might notice that the subdomain for a backend looks just like the specific version ID URL for your app, and that's on purpose. Thanks to it having the same format you can specify a backend name anywhere you can specify a version ID for your app and have it use your backend. E.g. to have a cron job that hits your backend you specify your backend's name as the "version" of your app to hit in your cron.yaml.


And before anyone worries, your backend defaults to being private to your app. If you really want your backend reachable from outside of your app (i.e. the public) you can specify that in your backend.yaml.

While backends sound fantastic and address an old complaint against App Engine and its lack of background processes, I have two warnings/tips about using them.

First, the tip: you can't use the OS X AppEngineLauncher to use backends. When you launch a dev_appserver you need to specify the --backends flag. You can do that manually in the AppEngineLauncher, but you need the flag for appcfg.py as well which you can't specify. So if you want to use backends (and thus want to specify the --backends flag) you will need to use the command-line version of the tools. Obviously this is a minor thing, but it took me quite a while to realize that was why backends were not working for me.

And my warning is that don't expect to use backends on the free quota. While you get 9 free hours of backend usage, you can blow through that surprisingly quickly. But the real issue is all of the other quotas that you have on App Engine. For instance, I was hitting another server to get data to process. What ended up happening is that I was maxing our my incoming bandwidth quota for urlfetch. So then I started to write code to throttle my requests. But even then I occasionally hit the quota. To properly handle that I would need to grab the quota exception and then retry or heavily throttle myself so that I never hit the issue. At that point I was writing code to throttle my requests and to retry when I managed to request to quickly; that's just silly when I have push queues to do that for me automatically. So I personally ended up not using backends in my case (although ironically after I refactored the code I started hitting another free quota limit which led to me turning on billing, so now I'm using $0.05/day in quota, so the app is just costing me $9/month).

But when you are willing to pay for backends they are great for what they provide, but you just have to realize they are just another tool and are not always going to be the best solution for everything (especially if you don't want to pay to use App Engine).