2007-06-10

Initial impressions of OCaml

I am a programming language whore. I am also a language bigot. And finally I am a language developer. Those three things lead to me learning new languages for the sake of learning them, to be able to argue how Python is better, and to "borrow" ideas from various languages for possible inclusion in Python, respectively.

To accomplish this I have five small programs that I write in every language that I am willing to "claim" I have learned. Each serves a different purpose: working with streams (an infinite Fibonacci sequence), file I/O (a simple spellchecker that reads the words file on UNIX), objects or some other form of state (the spellchecker and the Fibonacci sequence), recursion (factorials), array usage and basic data structure recursion (heapsort), and then "Hello, World!". Those five programs actually exercise a good portion of the core of most languages. I have done this for ten languages at this point. When I get my personal site up and going in my next life I plan to toss this code up online.
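As a rough idea of what two of these exercises can look like in OCaml, here is a minimal sketch of a lazy, infinite Fibonacci stream and a recursive factorial. It is an illustration only, not the code from that weekend; the names `stream`, `fib_from`, `take`, and `factorial` are made up for this example.

```ocaml
(* An infinite stream: a head value plus a lazily computed tail. *)
type 'a stream = Cons of 'a * 'a stream Lazy.t

(* The Fibonacci sequence starting from the pair (a, b). The tail is only
   computed when someone forces it, so the recursion never runs away. *)
let rec fib_from a b = Cons (a, lazy (fib_from b (a + b)))

(* Collect the first n elements of a stream into an ordinary list. *)
let rec take n (Cons (x, rest)) =
  if n <= 0 then [] else x :: take (n - 1) (Lazy.force rest)

(* Factorial via pattern matching; negative input is rejected. *)
let rec factorial n =
  match n with
  | 0 -> 1
  | n when n < 0 -> invalid_arg "factorial: negative input"
  | n -> n * factorial (n - 1)

let () =
  take 10 (fib_from 0 1) |> List.iter (Printf.printf "%d ");
  print_newline ();
  Printf.printf "5! = %d\n" (factorial 5)
```

The stream is built by hand with `Lazy.t` rather than a library type so that the sketch does not depend on any particular standard library version.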

But this weekend I decided to add #11. I tackled OCaml since there is a possible implication for my current thesis topic. The language itself turned out to be a pleasure to work with, but learning it was more painful than I would have liked.

I had four things I worked from to learn the language: ocaml-tutorial.org, the Caltech book (PDF), the O'Reilly book, and the official docs (both in HTML format and directly from the interface files). I mostly worked off the tutorial with the other references to fill in gaps (especially the Caltech book). And I think that was the biggest frustration: the gaps. I felt like no one specific reference covered everything. I had to use Google to find out how to create an initializer for objects. You would think that would be rather basic! I think this partially stems from people presenting the language in the way they want to see the language used, and this especially applies to objects, whose use seems to be discouraged.
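For reference, the piece that was hard to find is the `initializer` clause of a class: an expression that runs right after the object has been built and that can see its instance variables. A minimal sketch, assuming a spellchecker-style word set (the class name `dictionary` and its methods are hypothetical, and the word list is passed in directly instead of being read from the UNIX words file):

```ocaml
(* A tiny word set. The class parameter `words` is visible in the whole
   class body, including the initializer. *)
class dictionary (words : string list) =
  object
    val table : (string, unit) Hashtbl.t = Hashtbl.create 16

    (* Case-insensitive membership test. *)
    method mem word = Hashtbl.mem table (String.lowercase_ascii word)

    method size = Hashtbl.length table

    (* Runs once, immediately after the object is created. *)
    initializer
      List.iter
        (fun w -> Hashtbl.replace table (String.lowercase_ascii w) ())
        words
  end

let () =
  let dict = new dictionary ["apple"; "Banana"; "cherry"] in
  Printf.printf "%b %b %d\n"
    (dict#mem "banana") (dict#mem "durian") (dict#size)
```

`String.lowercase_ascii` is the name used by current OCaml releases; older releases spell it `String.lowercase`.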

Once I realized that I was going to have to look around to get the proper depth of info I needed for what the language offered I was able to find what I wanted for the language. But then I had to accept that finding examples of how to do what I wanted was not necessarily going to be possible. For instance, I wanted to split a string on whitespace. Searching for [ocaml split string on whitespace] didn't turn up anything useful. But then I noticed the use of the Str module in some code that was otherwise of no use to me. I knew there was a String module, but not that there was a Str module.

But sure enough, the Str module has a split function! For some reason the official docs don't list Str in the standard library list but as a separate entry. If you don't notice that or don't look in the stdlib module index you won't know it is there. Because OCaml is not widely used, there is a dearth of examples online, which forces you to figure stuff out on your own, and that can be painful.
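For anyone running the same search, splitting on whitespace with Str comes down to `Str.regexp` plus `Str.split`. A minimal sketch (the helper name `split_words` is made up here; also note that Str is a separate library, so it has to be linked explicitly, for example `str.cma` for bytecode builds):

```ocaml
(* One or more spaces, tabs, newlines or carriage returns. *)
let whitespace = Str.regexp "[ \t\n\r]+"

(* Str.split ignores delimiters at the very start and end of the string,
   so leading and trailing whitespace does not produce empty words. *)
let split_words line = Str.split whitespace line

let () =
  split_words "  the quick\tbrown   fox\n"
  |> List.iter print_endline
```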

But once you do, the language is nice. I had learned Standard ML once, but never did anything with it. Plus I have done my five programs in Haskell, so I have learned functional languages that have pattern matching. But the great thing about OCaml is the objects. Having objects for those times when you want state is really handy. Yes you can use closures and various tricks, but sometimes having an object is just plain nice.
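To make the closure-versus-object comparison concrete, here is the same stateful Fibonacci generator written both ways. This is a sketch for illustration, not the original code; `make_fib` and the class `fibonacci` are hypothetical names.

```ocaml
(* Closure version: the state lives in refs captured by the returned
   function. *)
let make_fib () =
  let current = ref 0 and upcoming = ref 1 in
  fun () ->
    let result = !current in
    current := !upcoming;
    upcoming := result + !upcoming;
    result

(* Object version: the state lives in mutable instance variables. *)
class fibonacci =
  object
    val mutable current = 0
    val mutable upcoming = 1

    method next =
      let result = current in
      current <- upcoming;
      upcoming <- result + upcoming;
      result
  end

let () =
  let next = make_fib () in
  let fib = new fibonacci in
  for _i = 1 to 6 do
    Printf.printf "%d %d\n" (next ()) (fib#next)
  done
```

Both do the same job; the object version simply gives the state and the operation names, which gets more valuable once there is more than one operation on the same state.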

My biggest complaint against the language is not specific to OCaml: I really hate it when type inference gets in my way. I get frustrated when the type inference algorithm says I can't use something the way I want to just because there is some minute chance it could be used in some funky way. Either that or the error messages need to be better to help me figure out exactly how the problem could come about so I can fix the problem.
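The post does not say which error prompted this, so the following is only a guess at the kind of situation meant: one well-known case is the value restriction, where a partially applied function gets a "weak" type that locks onto the first type it is used with. A minimal sketch (the names `dup_weak` and `dup` are made up for the example):

```ocaml
(* Partial application is not a syntactic value, so the type variable is
   only weakly polymorphic until its first use pins it down. *)
let dup_weak = List.map (fun x -> (x, x))

let ints = dup_weak [1; 2]        (* fixes the element type to int *)

(* Rejected by the type checker, because dup_weak now only accepts ints:
   let strs = dup_weak ["a"; "b"] *)

(* Writing the argument out explicitly (eta-expansion) makes the
   definition a plain function, which is fully polymorphic. *)
let dup l = List.map (fun x -> (x, x)) l

let ints' = dup [1; 2]
let strs = dup ["a"; "b"]
```

The restriction exists because a weakly typed value could, in principle, hide mutable state shared between uses at different types, which is the "minute chance" flavor of problem described above.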

As of right now I would put OCaml above the other functional languages I know: Scheme, Lisp, Standard ML, and Haskell. I will have to use it some more before I decide how it falls amongst the various other languages I know.