Coder Who Says Py: January 2010

2010-01-23

HTML5 will lower the use of CDNs for delivering JavaScript

As Firefox 3.6 was released today, people have begun to use the async attribute for the script tag from HTML5. For those of you unaware of the new attribute, it tells the browser to execute the JavaScript that a script tag points to through its src attribute in an asynchronous manner. Now from my reading of the spec that should mean asynchronously, but one at a time, based on the order in which the script tags are found in the document. But if you play with Firefox 3.6 it becomes apparent that Mozilla disagrees with my interpretation, as Firefox will begin executing the next async JavaScript file without waiting for the previous one to finish.

And this immediate execution is where things begin to get interesting for using a CDN for JavaScript code. I don't know about some of you, but I use the Google AJAX Libraries to get my copy of jQuery through their URL interface. This is great for me as it means the traffic is served by someone else (i.e. free for me), Google's CDNs are fast, and there is a decent chance that others have used the CDN as well, which lets the browser use a cached copy of jQuery instead of fetching it again just for my web app. This means you don't concatenate jQuery in with your JavaScript code to get a single JavaScript file to serve, but my thinking (until now) has been that the perk of the browser already having a cached copy of jQuery was enough to not care about the potential separate HTTP request.

But you can't reliably use the async attribute with library code that subsequent JavaScript code depends on. In my case I use jQuery to execute JavaScript code once the page is loaded and rendered. When I put the async attribute on both the CDN-served jQuery code and on my own code, I was able on my local machine to occasionally trigger a race condition where my JavaScript code was executed before jQuery, triggering an error as $ was not defined yet. It was tough to trigger, but it definitely happened.

This means you either execute the CDN-served library code synchronously or you concatenate it with your code and serve that entire file asynchronously. The decision then becomes what gives you more benefit: possibly faster downloading from a CDN plus cache hits but blocking JavaScript execution, or having to download from your servers but having fully asynchronous execution.

Now some of you might view this as a somewhat thin argument that CDNs serving JavaScript will get marginalized, and that's fair. But what I really think will impact it is offline web applications and the app cache. For a web application to work offline you define a cache manifest file which lists which URLs to cache, which URLs should fall back to a specific file when you are not online, and which ones should always go to the network no matter what. But at issue here is the fact that all listed URLs must follow the same-origin policy. This means that for you to serve up some JavaScript library while offline you need to have it hosted on your own server to be able to list it in your cache manifest and get the benefits of an offline web app.
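To make the three kinds of manifest entries concrete, here is a minimal sketch in Python that renders a cache manifest. The section headers (CACHE:, FALLBACK:, NETWORK:) come from the HTML5 app cache format; the function name and every file name below are made up for illustration, and the JavaScript library is listed as a path on the app's own server, as the same-origin point above requires.

```python
def render_manifest(cache, fallback, network):
    """Return the text of an HTML5 cache manifest.

    cache: URLs to store for offline use
    fallback: mapping of URL prefix -> file to serve instead when offline
    network: URLs that must always be fetched from the network
    """
    lines = ["CACHE MANIFEST", "", "CACHE:"]
    lines.extend(cache)
    lines += ["", "FALLBACK:"]
    lines.extend(f"{prefix} {target}" for prefix, target in fallback.items())
    lines += ["", "NETWORK:"]
    lines.extend(network)
    return "\n".join(lines) + "\n"


# Hypothetical file names; note jQuery is served from the app's own origin.
manifest = render_manifest(
    cache=["index.html", "style.css", "js/jquery.min.js", "js/app.js"],
    fallback={"/": "offline.html"},
    network=["/api/"],
)
print(manifest)
```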

As people begin to look at offline web apps more and more I think it will be interesting to see how this impacts people's use of CDNs for stuff like JavaScript.

2010-01-19

Oplop web app, now using HTML5

Back in October I blogged about why I liked jQuery UI in my search for a JavaScript library that would let me create a wizard interface the way I wanted to do it for Oplop (which is a password hash algorithm app for creating unique account passwords; more details can be found in the How Oplop Works page). And while jQuery UI worked for the initial launch of the wizard approach I took, I quickly realized it was not going to work for me in the long term as I wanted the web app to work on both a desktop and a cell phone using a single version and that meant minimizing download size. While jQuery UI looked fine on my Android phone, it did have some extra overhead that was simply not needed by me. That's when I decided I would create my own replacement for my use of jQuery UI's accordion to get the same effect, albeit specifically tailored to my needs.

I also took this opportunity to completely go all out and only use HTML5 (which should now simply be called HTML). Having used the new version of HTML heavily for my thesis work, I have come to know and appreciate all of the new features in the spec. Add in Mark Pilgrim's wonderful Dive Into HTML5 site/book, and I knew I wanted to go nuts and be entirely cutting edge and essentially give Internet Explorer the finger (for now; I plan to add support for Chrome Frame in the future). While I am sure most of you pro developers are not looking at HTML5 yet, you should still at least give it a glance as there is already JavaScript out there to help make IE function properly with some of the new stuff (see Mark's discussion on this).

First thing I did for this redesign was go for a semantic markup of the site. There was to be no CSS styling embedded in a tag or the page, nor any JavaScript anywhere to be seen in the page layout markup. Based purely on tags, id, and class markups I should be able to look at the HTML and have the UI be obvious and clean. I also wanted the page to render exactly how it should look upon initial load with nothing but the CSS stylesheet and HTML; JavaScript was to be purely for logic and later style changes, but it should not be an initial load requirement (although at the moment the JavaScript is blocking the loading of the rest of the page; I will talk more about how I might solve that later on). The idea is that the load should take no longer than downloading and parsing all the files and not have to wait on any JavaScript short of wiring up form controls; no nasty UI pop-in from JavaScript execution, etc.

Luckily my goal of using only CSS and HTML for the initial page load styling turned out not to be a problem. Thanks to various new tags like section (to encase each step in for semantic reasons) and new attributes on form elements like autofocus (so the proper text field is selected upon initial load) the page looks exactly as it should sans JavaScript, making the view cleanly separated from the controller, all while being proper HTML.

With the look out of the way I tackled replacing jQuery UI's accordion widget. Because I imposed my restriction that the UI had to look correct w/o JavaScript, I had used CSS to hide the steps of the wizard that had not been reached yet, showing only the step marked with an open class. Using that as a base, I decided to have transitions from one step to another occur by triggering a nextStep() method which uses jQuery to find the step that has the open class, toggle that class off, go to the next section, and then toggle on the class there. This made the query simple in jQuery and allowed the transition code to be generic from step to step (although I had to add some hook support through jQuery.data() for step-specific prep and validation stuff).

That got me as far as having the web app function just like it did before I started this endeavour. But at this point I wanted to extend the potential audience, so I added iPhone support as well. This turned out to be basically simple, but I did have to tweak the UX just for the iPhone. When I designed the web app I was designing it for me while I am on my laptop, which means a full keyboard. I also made sure it worked on my ADP1 (which is the developer version of the HTC Dream), which has a physical keyboard. And being somewhat of a UNIX geek, I made sure the entire web app could be driven from the keyboard under at least Google Chrome, from moving focus properly between form controls by pressing Enter, down to having your account password be in a text field that is already selected for easy cutting by hitting Cmd-X (or whatever your cut keyboard shortcut is). Nice and elegant and damn quick for getting your account password.

But the damn iPhone has no Enter key on its soft keyboard when you are filling out a form. Instead you get a "Go" button which acts as form submission. In my case that's useless as there is no place to submit the form to; everything is client-side for security reasons. That meant I had to tweak the UX so that you can not only press Enter to transition to the next step in Oplop, but you can also click the title of the next -- and only the next -- step as well to trigger a transition. In other words Oplop can be driven by a mouse now.

Since I was already adding UX support specifically because of the iPhone, I figured I would also make the web app support being a web clip (a web app that can be added to your home screen). Thanks to Jonathan Stark's in-progress book on iPhone development I made Oplop be zoomed in, have a proper home screen icon, and ditch the location bar.

I would also like to thank ImageOptim while I am at it for being such a great little tool to optimize all the PNGs I have as icons (favicon, web clip icon, etc.).

So now I have a web app that works under (at least) Chrome, Safari (both desktop and Mobile), Firefox, and Android. I am rather happy with that browser coverage considering there is no magical browser detection stuff required, nor two versions of the web app.

But of course I am not finished yet. First thing is to write up some UX tests using Selenium 2/WebDriver and Jython. Now that everything is functioning as expected I want to keep it that way.

After that I want to clean up my JavaScript code to be more event-driven. While it is part way there thanks to having a generic function that handles step transition details, that function gets called explicitly. It would be better to instead trigger a custom event and have a single event handler for that at the top of my HTML structure. Minor, to be sure, but it just feels more "right".

Once that is done it comes time to start really optimizing the web app. I have yet to use Google's Closure compiler to minify all of my JavaScript code into a single file to shrink total download size and minimize the number of HTTP connections required to download everything (which can matter on a cell phone that has a high latency connection). I also want to play with the new async attribute on script tags. My hope is that I can add that to my minified JavaScript file so as to have it start downloading concurrently while the rest of the page is downloaded and rendered and have them finish at roughly the same time. Unfortunately I don't know if that will work reliably enough to go w/o somehow letting the user know when everything is finally wired up. Plus no browser currently supports the attribute (although Firefox 3.6 will apparently have support). After that there will be lots of time spent with Speed Tracer to see if there are any other obvious things to tweak.

And then back to adding features! The next big ones will be adding offline web app support and creating a Google Chrome extension based on the web app. The former should be relatively straightforward, but will lead to some custom Mercurial hooks for properly updating the cache manifest (I need to tweak the file every time something changes to trigger a new download of the app), and the latter should simply be fun as I think my planned approach will be more secure and have a better UX than the other password hash algorithm extensions already available (if my idea works =).
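As a rough idea of what such a hook could run, here is a minimal Python sketch that rewrites a "# version:" comment in the manifest from a hash of the cached files, so the manifest changes whenever any file does. This is only a sketch under my own assumptions: the function name and the comment convention are hypothetical, and how it gets called from Mercurial is left out entirely.

```python
import hashlib


def stamp_manifest(manifest_text, file_contents):
    """Return manifest_text with a '# version:' comment derived from the files.

    file_contents maps each cached path to its bytes. Any change to any file
    changes the digest, which changes the manifest text and so prompts the
    browser to download the app again.
    """
    digest = hashlib.sha1()
    for path in sorted(file_contents):
        digest.update(path.encode("utf-8"))
        digest.update(file_contents[path])
    # Drop any previous stamp so the comments do not accumulate.
    kept = [line for line in manifest_text.splitlines()
            if not line.startswith("# version:")]
    # The first line must stay "CACHE MANIFEST", so the stamp goes second.
    kept.insert(1, "# version: " + digest.hexdigest())
    return "\n".join(kept) + "\n"


# Hypothetical one-file app: editing index.html changes the manifest.
before = stamp_manifest("CACHE MANIFEST\nindex.html\n", {"index.html": b"<p>v1</p>"})
after = stamp_manifest(before, {"index.html": b"<p>v2</p>"})
print(before != after)
```

Stamping with unchanged files returns the same text, so the manifest only changes when the content does.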

Expect more blog posts on Oplop in the future as it is turning out to be a fun personal project. And if you actually use Oplop let me know and I will consider setting up a mailing list or something for Oplop-specific announcements (e.g. the Python 3 command-line version I just uploaded to the project site) if there are enough users beyond just a handful of friends of mine.

2010-01-16

The importers project is now public

For my PyCon talk on importers, I needed a running example. After asking online what people wanted as an example I ended up creating an importer that uses sqlite3 databases ... which I then scrapped and rewrote from scratch after realizing that there was a lot of boilerplate that I could abstract out. I decided to write some ABCs to make it easy to write other importers (specifically a zipfile one), much like the ones I have in importlib. This not only helped to simplify the code, but it let me easily conceptualize what is and is not consistent between importers. Knowing what is common helps me know what I need to cover at PyCon.

But there is no need to keep this code to myself, and so the importers project has been created. The project is hosted at Google Code and uploaded to PyPI, with docs at packages.python.org. The code has a bunch of ABCs that let you create finders and loaders as long as you can specify a few simple operations from the perspective of a file path (e.g. checking that a file exists, reading from a path, etc.). This makes it very easy to create new importers that use different archive formats as you simply need to manipulate paths and read from the archive; the ABCs do all of the fancy import work for you. I also included both the sqlite3 and zip importers in the package in case anyone wanted to use them. I have also gone ahead and included a lazy loader mixin that works with other loaders. This is the same lazy loader I blogged about in the past, but with proper unit tests.
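To give a feel for the idea of building an importer out of a couple of path operations, here is a minimal sketch written against today's importlib. It is not the importers project's API: the class name and the path_exists()/read_path() methods are hypothetical, a plain dict stands in for the archive, and packages are not handled at all.

```python
import importlib
import importlib.abc
import importlib.util
import sys


class PathStoreImporter(importlib.abc.MetaPathFinder, importlib.abc.Loader):
    """Finder and loader built on two path operations (hypothetical names)."""

    def __init__(self, store):
        # store maps archive-relative paths to source text; it stands in for
        # a zip file, a sqlite3 database, or any other archive.
        self._store = store

    # The only storage-specific operations; swap these out per archive format.
    def path_exists(self, path):
        return path in self._store

    def read_path(self, path):
        return self._store[path]

    # Everything below is generic import machinery.
    def find_spec(self, fullname, path=None, target=None):
        candidate = fullname.replace(".", "/") + ".py"
        if not self.path_exists(candidate):
            return None  # let the next finder on sys.meta_path try
        return importlib.util.spec_from_loader(fullname, self, origin=candidate)

    def create_module(self, spec):
        return None  # fall back to the default module creation

    def exec_module(self, module):
        origin = module.__spec__.origin
        code = compile(self.read_path(origin), origin, "exec")
        exec(code, module.__dict__)


importer = PathStoreImporter({"demo_greeting.py": "MESSAGE = 'hello from the store'\n"})
sys.meta_path.append(importer)
demo = importlib.import_module("demo_greeting")
print(demo.MESSAGE)
```

Supporting a different archive format in this sketch would only mean replacing the two path methods, which is the kind of split between common and storage-specific code described above.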

Thanks to the importlib dependency, this is Python 3.1 only. My hope is that feedback on the code in the project will be such that I feel confident in moving the code into importlib. While I was able to get importlib into the standard library w/o following the typical one year waiting period, I am not skirting that unofficial rule anymore. You can consider the importers project purgatory for future importlib code (assuming feedback is positive).

This has also been the first project I used Distribute with. It all went rather smoothly, even running under Python 3. What was really handy was the upload_docs command once I used the --upload-dir flag (for some reason my setup.cfg was not doing what I needed). In other words, good work Tarek and the people helping him with Distribute!

2010-01-09

Where the Hg transition stands

[edit 2010-01-09: links to mailing list archives containing latest discussion]

At PyCon 2009 it was announced that python-dev planned to move Python development from svn to hg. Well, just because we chose our distributed version control system (DVCS) did not mean that we were ready to flip the switch. For one, I took a three month sabbatical from python-dev to get my PhD thesis proposal finished (which I did, thank science). Luckily Dirkjan Ochtman stepped in with PEP 385 and volunteered to handle the transition. At that point we thought that it would be a matter of creating a new sys.mercurial attribute (which we still need to code up), writing up new developer docs on the workflow we expect to use, doing a high fidelity conversion of the revision history to hg, and then flipping the switch.

But then bloody line endings reared their ugly heads. While I was writing PEP 374 and evaluating the three leading DVCSs I was under the impression that the win32text extension for hg did what we needed. No one ever spoke up saying otherwise while the PEP was out for discussion or anything, so I simply didn't worry about it.

But then Mark Hammond came forward and said we had a problem. Obviously Mark has experience working under Windows, but he also has experience with hg thanks to his work with Mozilla. From Mark's experience it seemed that no matter how careful people were, the line endings would get messed up in the repo, and that just isn't acceptable. Martin v. Löwis then came forward and pointed out that this was not acceptable as well. It turned out that win32text didn't properly protect from mistakes as it is user-specific, not repo-specific. This was not what we wanted; svn's svn:eol-style setting is really handy and has turned out to be great to have around.

So this led to a long discussion over what an hg extension would look like that would mimic what svn:eol-style did. This led to the idea of the hgeol extension. In a nutshell we would end up with an extension where we had a .hgeol file that was version controlled. It would specify how files should be checked out (e.g. native for the OS, \n, \r\n, or binary) and make sure that no checkins are going to lead to bad line endings. The design can be found in the Mercurial wiki (be aware it is a wiki page so some people have simply dumped ideas in there). The latest discussions on various Mercurial mailing lists can be found here and here (search for [eol] to find the relevant threads).
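The behaviour being asked for can be sketched in a few lines of Python. This is not the hgeol extension's code, nor its file format: the RULES table, the fnmatch-style patterns, the function names, and the choice to store text with \n in the repository are all assumptions made purely for illustration.

```python
import fnmatch
import os

# Hypothetical rules in the spirit of a version-controlled .hgeol file;
# the first matching pattern wins.
RULES = [
    ("*.png", "binary"),
    ("*.vcproj", "crlf"),
    ("*.sh", "lf"),
    ("*", "native"),
]

EOL_BYTES = {"lf": b"\n", "crlf": b"\r\n", "native": os.linesep.encode("ascii")}


def style_for(filename, rules=RULES):
    """Return the line-ending style of the first rule matching filename."""
    for pattern, style in rules:
        if fnmatch.fnmatch(filename, pattern):
            return style
    return "binary"  # leave files with no matching rule untouched


def to_repository(filename, data, rules=RULES):
    """On checkin, store text files with \\n only (an assumed convention)."""
    if style_for(filename, rules) == "binary":
        return data
    return data.replace(b"\r\n", b"\n")


def to_working_copy(filename, data, rules=RULES):
    """On checkout, give each text file the line endings its rule asks for."""
    style = style_for(filename, rules)
    if style == "binary":
        return data
    return data.replace(b"\r\n", b"\n").replace(b"\n", EOL_BYTES[style])


# A file with mixed endings comes out consistent in both directions.
print(to_working_copy("PCbuild/python.vcproj", b"a\r\nb\n"))
print(to_repository("README", b"x\r\ny\n"))
```

Because the rules travel with the repository rather than with each user's configuration, everyone gets the same conversions, which is the property win32text lacked.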

Martin Geisler, a Mercurial contributor, in the end picked up the torch and went a good distance. He has his in-development code at bitbucket. But the work is not finished. Martin has to work on his PhD thesis, so he has stopped active development for a few months. That means those that are motivated to help would be greatly appreciated. At this point what is really needed is making sure the code is robust and that it works as desired. That means making sure the tests work and the results are as expected on both UNIX (this includes OS X) and Windows. It also means making sure that the test suite is thorough enough to cover all the possible problems that might come up during development.

This inherently helps test that the design covers what is needed. One of the reasons this entire line ending problem has not been solved before is that most of the Mercurial dev team is either not on Windows or uses editors that know how to handle line endings properly (I'm looking at you with the evil eye, Visual Studio). So while we think the current design works, we don't have any real-world usage yet. So some pounding on the extension with a repository that someone actually uses would be great to make sure we didn't miss something.

In other words we would appreciate help pounding the heck out of the extension. Running the tests, making sure the tests are thorough, and using the extension with an actual repository that gets used on a regular basis would all be highly appreciated.

Dirkjan is coming to PyCon 2010 so I would expect at least a lightning talk on this. There is also hope between Dirkjan and me that we can see this transition happen in the first half of this year, but that really depends on the hgeol extension getting into a good enough place that we are not isolating Windows developers.

2010-01-02

Announcing the "Little Bit of Python" podcast

Andrew Kuchling had an idea about doing a podcast about Python, focused more directly on the language itself, and on the PSF and its actions. He approached Jesse Noller and me about joining him in doing it as we had said on Twitter that we liked the idea. We then got Michael Foord and Steve Holden involved so we could have some variation in accents on the podcast.

And thus the 'Little Bit of Python' podcast was born. We currently have two episodes up and have recorded the third but have still to edit it and post it online. We are all new at this so forgive us if the audio is not perfect or we seem a little stiff. I know I for one was rather quiet in these first two episodes but talk a lot more in the third. We have tried to keep these podcasts clean, but I expect I will be the first to break this rule at some point. =) We also plan on having a more proper web site in the future once Jesse has the time to get it up.

You can also see Michael's post announcing the podcast.