2010-03-06

The various ways of distributing Python source applications

I am currently cleaning up the CLI implementation of Oplop, and so I am beginning to think about the various ways I can distribute the source code as an application instead of a library without resorting to compiled solutions like py2exe or py2app. Turns out there are a ton of options!

The traditional way of distributing a Python application that you didn't package up using a tool like py2exe was to have a single package that contained the application code. You then included a script with "python" in its shebang line, and your setup.py file listed that script so it got installed. The issue with this, though, is that scripts relying on a shebang are UNIX-specific and so leave Windows users out in the cold.

Luckily Python 2.7 and runpy fix this problem somewhat. You can now add a __main__.py file to a package containing your app's front-end code to make it executable using the -m option (this changes application layout best practices, which I will discuss later in this post). That means Windows users could execute an app using python -m oplop instead of requiring a batch file to make it work.
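To make the -m mechanism concrete, here is a minimal sketch, not Oplop's actual code. It builds a throwaway package in a temporary directory and runs it with -m. The package name demo_app, its main() function, and the helper run_package_with_dash_m are all made up for illustration.

```python
import os
import subprocess
import sys
import tempfile
import textwrap

# Hypothetical front-end code that would live in demo_app/__main__.py.
MAIN_SOURCE = textwrap.dedent('''\
    import sys

    def main(argv):
        print("demo_app called with", argv)
        return 0

    if __name__ == "__main__":
        sys.exit(main(sys.argv[1:]))
''')


def run_package_with_dash_m(args):
    """Create a throwaway package and run it via `python -m demo_app`."""
    with tempfile.TemporaryDirectory() as root:
        package_dir = os.path.join(root, "demo_app")
        os.mkdir(package_dir)
        # An empty __init__.py marks the directory as a package.
        open(os.path.join(package_dir, "__init__.py"), "w").close()
        with open(os.path.join(package_dir, "__main__.py"), "w") as f:
            f.write(MAIN_SOURCE)
        # Running from `root` lets -m find the package in the current directory.
        result = subprocess.run(
            [sys.executable, "-m", "demo_app"] + list(args),
            cwd=root, capture_output=True, text=True, check=True)
        return result.stdout
```

Calling run_package_with_dash_m(["a", "b"]) returns whatever the package printed, which shows that the arguments after the package name end up in sys.argv[1:] just as they would for a regular script.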

But there is still the issue of requiring the application code to be installed somewhere on your PYTHONPATH, which really should not be necessary. Some people solve this problem by keeping a dedicated Python install just to hold their various Python applications, but that seems like more than should be required.

Luckily, since Python 2.6 you can execute directories directly. Much like executing packages, if a directory contains a __main__.py file next to the application's package then Python can run it: python /dir/to/oplop. That means you can simply toss the application code somewhere and alias the application name to the execution of Python with the directory location passed to it (this is actually how I handle executing Oplop myself, since it means I can have it always point to my Hg branch).

But still, having to keep a directory full of files shouldn't be needed either. Wouldn't it be nice to keep all the needed code in a single file, making upgrades, relocations, or deletions dead-simple? Well, since Python 2.6 you can thanks to zip files being executable (as long as you are only using Python source or bytecode; extension modules are out of luck). Using the same mechanism that allows you to execute a directory -- a __main__.py file -- you can pass a zip file to Python and it will execute it: python oplop.zip. Once again, just like the aliasing of an application for a directory, you can do the same for executing zip files.
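As a rough sketch of both approaches, the function below lays out a directory with a top-level __main__.py next to a package, zips it so that __main__.py sits at the top level of the archive, and then runs the directory and the zip file the same way. The names demo_app, greet() and run_as_directory_and_zip are hypothetical and only serve the example.

```python
import os
import subprocess
import sys
import tempfile
import zipfile


def run_as_directory_and_zip():
    """Run one application layout as a directory and then as a zip file."""
    with tempfile.TemporaryDirectory() as root:
        app_dir = os.path.join(root, "demo_app_dir")
        package_dir = os.path.join(app_dir, "demo_app")
        os.makedirs(package_dir)
        with open(os.path.join(package_dir, "__init__.py"), "w") as f:
            f.write("def greet():\n    return 'hello from demo_app'\n")
        # The top-level __main__.py sits next to the package, not inside it.
        with open(os.path.join(app_dir, "__main__.py"), "w") as f:
            f.write("import demo_app\nprint(demo_app.greet())\n")

        # Store paths relative to app_dir so __main__.py is at the zip's root.
        zip_path = os.path.join(root, "demo_app.zip")
        with zipfile.ZipFile(zip_path, "w") as archive:
            for folder, _subdirs, files in os.walk(app_dir):
                for name in files:
                    full_path = os.path.join(folder, name)
                    archive.write(full_path, os.path.relpath(full_path, app_dir))

        outputs = []
        for target in (app_dir, zip_path):
            # Same invocation for both: python <path>
            result = subprocess.run(
                [sys.executable, target],
                capture_output=True, text=True, check=True)
            outputs.append(result.stdout.strip())
        return outputs
```

The detail that matters is the archive layout: if the files were stored under an extra leading directory, Python would not find __main__.py at the top level of the zip file and would refuse to run it.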

But the best solution I can think of is the zip file one but without requiring the alias in your shell. I would love it if I could simply drop a Python application on to my PATH and have it simply work. Luckily you can on UNIX starting with Python 2.6: echo '#!'`which python` | cat - oplop.zip > /usr/bin/oplop (the single quotes keep shells such as bash from treating the "!" specially). What that shell command does is prepend the zip file with a shebang line pointing to the Python interpreter you want to execute the zip file with. Once you make the new oplop file executable you will be able to run it like any other application. This works because the operating system only looks at the shebang line, while Python recognizes the file it is handed as a zip file (a zip file's index is read from the end of the file, so the extra line at the front does no harm) and executes it as if you had specified the path to a zip file instead. Now you have a fully self-contained Python application that is executable without having to muck with your shell!
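The same prepend-a-shebang step can be sketched in Python for anyone who would rather not rely on shell quoting. The function name make_executable_zip and its parameters are assumptions made for this example, not part of any existing tool.

```python
import os
import stat
import sys


def make_executable_zip(zip_path, target_path, interpreter=sys.executable):
    """Prepend a shebang line to a zip file and mark the result executable."""
    with open(zip_path, "rb") as source, open(target_path, "wb") as target:
        # The shebang line names the interpreter that should run the file.
        target.write(b"#!" + os.fsencode(interpreter) + b"\n")
        # The zip data follows unchanged.
        target.write(source.read())
    # Add the execute bits, the equivalent of `chmod +x`.
    mode = os.stat(target_path).st_mode
    os.chmod(target_path, mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
    return target_path
```

For example, make_executable_zip("oplop.zip", "oplop") would write an oplop file to the current directory that can then be moved on to your PATH. Python 3.5 and later also ship a zipapp module in the standard library that automates this kind of packaging.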

I personally find all of this fantastic. Having so many options on how to execute a Python application gives me flexibility as a user to install and execute an app in the way I find most fitting. But unfortunately this is so new that hardly anyone is supporting it. And what is even sadder is that you can support all of this with graceful degradation of support no matter what version of Python you support if you follow some best practices.

First, move your application execution code (i.e. your main() function and support code) into a __main__.py file contained within your package. You can do this for any version of Python without any ill effects since it is just a file. It has a nice direct benefit of making it easy for anyone to find your application front-end code without having to grep around for sys.argv or something (I have personally done this multiple times to look for undocumented command-line options). It also has a nice side-effect of making your application instantly executable using python -m package_name by Python 2.7/3.1 and later. Since this is fully backwards-compatible everyone should do this for their applications.

Second, you should make a symlink from your primary execution script to a __main__.py file located next to your package directory. Once again this is totally backwards-compatible as __main__.py won't interfere with older versions of Python; it simply lets Python 2.6 interpreters and newer execute the directory directly instead of having to install the application through a setup.py file. Once again, everyone should do this.

Third, you should make your main execution script/__main__.py that sits next to your application's package be this generic execution script if you only support Python 2.5 and newer (this includes Python 3.x). This forces you to properly move your application's front-end code into a __main__.py file in your application's package. It is also a simple way to check how your code behaves when someone runs it using the -m option.
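The script linked above is not reproduced here. As an assumption about what such a generic script could look like, the sketch below uses runpy.run_module, which has been in the standard library since Python 2.5, to execute the package's own __main__ module. The package name demo_app and the helper name run_package are placeholders.

```python
import runpy

# Hypothetical package name; replace it with your application's package.
PACKAGE_NAME = "demo_app"


def run_package(package_name=PACKAGE_NAME):
    """Execute <package>.__main__ as though it had been started with -m."""
    # Naming the __main__ submodule explicitly avoids relying on runpy
    # looking it up inside a package by itself.
    return runpy.run_module(
        package_name + ".__main__", run_name="__main__", alter_sys=True)

# In a real top-level __main__.py the final line would simply be:
#     run_package()
```

Because the top-level script contains no application logic of its own, the only code path that gets exercised is the one inside the package, which is exactly what -m would run.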

Fourth, if you have extension modules, please provide pure Python equivalents! Python itself is planning to do this where reasonable (i.e. not for extension modules that wrap a C library, but for things such as datetime). If you do this then you can easily zip up your application with your generic __main__.py and distribute your application that way if you so wished. Mercurial, for instance, already has pure Python equivalents with their --pure option in their setup.py file.
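One common way to offer a pure Python equivalent is an import fallback, sketched below. The extension module name _demo_speedups and the checksum function are invented for illustration; this is not how Mercurial or Python itself structure their fallbacks.

```python
try:
    # Hypothetical C extension; used when it has been compiled and installed.
    from _demo_speedups import checksum
except ImportError:
    def checksum(data):
        """Pure Python fallback: sum of all byte values modulo 65536."""
        total = 0
        for byte in bytearray(data):
            total = (total + byte) % 65536
        return total
```

With this pattern the rest of the application just calls checksum() and never needs to know which implementation it got, so the same source tree works from a zip file where the extension module is unavailable.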

Now imagine if everyone followed these best practices. That would mean that some day there could be a distutils command called app_sdist that instead of zipping your application code into a version-specific directory, it simply zipped it up at the top level of the zip file. That would make it instantly runnable by Python on its own. And if you tossed in an app_install command that simply concatenated the shebang on to the zip file created by app_sdist, you now have a fully self-contained Python application in your PATH that is a single file! I personally think it would be fantastic to have those two commands available in distutils once people start following the best practices I laid out above, but I doubt the code will get written until there is wide support for __main__.py files. You are going to start pestering your favorite apps to follow best practices now, right? =).