Whaleventures: February 2012

Wednesday, February 29, 2012

Dinosaurs and Programming

Ever since I can remember I've had a love for dinosaurs. Even now, I sometimes secretly dream of flying off into the African sunset on a Pterodactyl, gunning down antelope with a .50 caliber machine gun (see http://theoatmeal.com/comics/ptero). If it wasn't for computers (or computer games actually) I would probably have studied Archaeology.

Now, at this point in time you may be asking: "What do dinosaurs have to do with programming or Whaleventures for that matter?" Well ... a lot actually. Let me set the scene for you: I was 9 years old and in standard 1 (we still referred to it as standard 1 back in the day) when Jurassic Park was released. For a 9-year-old dinosaur freak, this was the biggest thing since chocolate-coated vanilla ice-cream. Every time I got a glimpse of a trailer or poster somewhere I would nag my dad about it. Finally, I think out of sheer desperation to shut me up, my dad took me to see Jurassic Park. However, there was one technical difficulty: Jurassic Park's minimum age restriction was 10, while I was obviously 9 years old. With a suspicious look from the lady at the counter, and a white lie or two, my dad got me into the theater. I was in for an awesome action-packed 2-hour roller-coaster ride that involved a T. Rex eating a man straight off the toilet, Velociraptors raining down death and destruction on the main cast and a Brachiosaurus sneezing a ball of snot in a girl's face.

At the time I didn't notice it, but years later when I watched Jurassic Park again, something interesting caught my attention. One of the characters in the movie is a computer programmer by the name of Dennis Nedry (or Ned). If you didn't watch the movie, note: some spoilers are up ahead. He is basically the guy that was responsible for the downfall of the entire park. One set of scenes (right after the shit starts to hit the fan) is of particular interest: Samuel L. Jackson (he goes by the name Ray Arnold in the movie) tries to get the security system of the park up and running again. Here is an outline of what sort-of happened:

In the abovementioned set of scenes we get to see a strange operating system, with some lines of code in a strange (or not-so-strange) programming language.

Thus, to kick off part 1 of our "Identify the programming language / operating system in the movie / television program" series, the first challenge (which is not that difficult) is to identify the programming language in which the Jurassic Park system is coded. The first one to provide an answer (do not cheat, I'm watching you!) will get 500 internets for his/her efforts. Here is a hint (more or less) of what you'll be looking for when watching the movie:

The second challenge is to identify the operating system that is running Jurassic Park. Lex (the hacker girl) is also using it when she tries to lock the Velociraptors out during a chase scene much later in the movie. If you can identify this (without cheating) you'll be really hardcore and you'll win 20000 internets (seriously).

So, go to your nearest video store this weekend, and rent this excellent movie. You'll be amazed by the special effects (given that it was 1993), the action scenes, and more importantly: the DINOSAURS!!!

Managing callback "spaghetti" in node.js

Although node.js' evented I/O model works really, really great for writing tcp/http servers, which are inherently event driven domains, sometimes you just want to execute a number of steps in sequence, wait for them all to complete, and then take some action.

Let's take the example of adding some sample data into a test mongo database. This is a pattern you see a lot. You have some test framework, be it mocha, vows, nodeunit or whatever; it doesn't matter. There is a before hook, and you want to populate the database with a nice set of initial data for your test.

Now normally, you would end up with what I call callback "spaghetti". A bunch of async calls, taking callbacks, being called in sequence, each taking the next step as a callback, until finally you end up with a heavily-spaghetti, difficult-to-follow piece of code. And it's always the same pattern over and over again. To give an example, here is some demo code using mongojs to populate a mongo db for testing:
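As a rough illustration of the shape being described (a sketch, not the original mongojs listing), here is what three sequential inserts look like when each one is nested in the previous one's callback. The `insert` function is a hypothetical stand-in for an async driver call and simply pushes into an in-memory array.

```javascript
'use strict';

// Hypothetical stand-in for an async database insert: stores the document
// in a plain array and reports back on a later tick of the event loop.
function insert(store, doc, callback) {
  setImmediate(() => {
    store.push(doc);
    callback(null, doc);
  });
}

// Three inserts in sequence: every step is nested in the previous callback,
// and every level repeats the same error check.
function populate(store, done) {
  insert(store, { name: 'alice' }, (err) => {
    if (err) return done(err);
    insert(store, { name: 'bob' }, (err) => {
      if (err) return done(err);
      insert(store, { name: 'carol' }, (err) => {
        if (err) return done(err);
        done(null, store.length);
      });
    });
  });
}

populate([], (err, count) => {
  if (err) throw err;
  console.log('inserted', count, 'documents');
});
```

Every extra step adds another level of indentation, which is exactly what makes this style hard to follow.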

Now as you can clearly see, this very quickly gets unwieldy and annoying to work with. Luckily, there is a pattern to get around this. It is mentioned in the book Secrets of the JavaScript Ninja by John Resig (thanks Dawid for the tip). On a basic level, it has the following form:

So basically, you have an array of functions. These functions each do their work, and call the shared "callback" function. The "shared" callback function simply takes the first function off the front of the array, and executes it. Next time, it again takes the first item, and executes it, until the array is empty. At this point your sequence of steps is completed.
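A minimal sketch of that idea could look like the following. It is an illustration only, not the listing from the book: the name `runQueue` is made up here, and the shared callback is handed to each step as a parameter rather than reached through a closure.

```javascript
'use strict';

// Runs an array of step functions one after the other.
// Each step receives the shared `next` callback and must call it when done.
function runQueue(steps) {
  const queue = steps.slice(); // work on a copy, leave the caller's array alone

  function next() {
    const step = queue.shift(); // take the first remaining function
    if (step) {
      step(next);
    }
    // When the queue is empty the sequence is complete.
  }

  next();
}

// The steps finish in order even though the first timer is the slowest.
const order = [];
runQueue([
  (next) => setTimeout(() => { order.push('a'); next(); }, 20),
  (next) => setTimeout(() => { order.push('b'); next(); }, 5),
  (next) => { order.push('c'); console.log(order.join(' -> ')); next(); },
]);
```

The second step only starts once the first has called `next`, so the timers cannot overtake each other.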

This is a very nice pattern, and something that I know I've come up with in some form or another a couple of times already. But I always love it when someone actually comes along and "solidifies" the pattern. I don't know what the pattern is called, I just know that I like it.

So, I thought, what about trying to make it even more generic? What if you could have some function that takes a "done" function and a list of "step" functions (each taking a callback), where each step does its async work and then calls the callback that was passed in to it? So, I came up with this:

This node module exports a single function, waitForAll. This function takes a variable number of arguments (as is natural with any JS function). The first argument is the "done" function - this gets called upon completion of all the async steps. All the other arguments are then treated as async step functions. So inside the function, the list of actions and the in-between callback function get created. The in-between callback function is then partially applied as the first argument to all the action step functions inside the preparation code. This means that all functions passed in already have their first argument bound to the internally created callback function. All that is left then is to capture, inside the internal shared callback function, the arguments that were passed to that function, and apply them to the next partially applied step function. This allows us to pass data from one async step to the next. Consider the following test:
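A minimal sketch of such a function, together with a small test-style usage, might look like this. It is written from the description above and is not the original module: only the name `waitForAll` and the argument order come from the text, and details such as calling `done` with the last step's results are assumptions.

```javascript
'use strict';

// Sketch of waitForAll(done, step1, step2, ...).
// Each step is called as step(callback, ...resultsOfPreviousStep).
function waitForAll(done, ...steps) {
  // Shared in-between callback: forwards whatever the finished step passed
  // in to the next step, or to `done` once no steps are left (assumption).
  function next(...results) {
    const step = queue.shift();
    if (!step) {
      return done(...results);
    }
    step(...results);
  }

  // Partially apply `next` as the first argument of every step.
  const queue = steps.map((step) => step.bind(null, next));

  next();
}

module.exports = waitForAll;

// Usage: data flows from one async step to the next.
waitForAll(
  (total) => console.log('done, total =', total), // logs: done, total = 20
  (callback) => setImmediate(() => callback(1)),
  (callback, n) => setImmediate(() => callback(n + 1)),
  (callback, n) => setImmediate(() => callback(n * 10))
);
```

This sketch does no error handling: a step that never calls its callback silently stalls the chain, and a real version would probably want a convention for passing errors straight to `done`.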

As you can see, the steps are now simply arguments passed to the waitForAll function. The first argument being the done function. Each step gets the callback, and all parameters passed in from the previous step. It is the single duty of each step to ensure that the callback() function gets called when work is done.

This means that the first example becomes much, much simpler to comprehend/manage:

A big win for readability in my opinion.

Raspberry Pi vs cstick Cotton Candy

Two similar ideas with very different prices :)

Raspberry Pi launched two models of their "mini-computer" (micro-computer?) today. Model A costs $25 while Model B costs $35.

FXI Technologies also opened the cstick Cotton Candy for pre-order and it costs $199.

Deep down, you know that IDEs suck

Let's not kid ourselves, most IDEs are pretty good in general. There is however, an area where they are particularly bad. That just happens to be the editor - the main component every developer works with!

Now, an editor is the main tool every programmer uses. Without a good editor, you are simply wasting your time. Most people will probably say that IDE editors are great because they have some form of code generation and type checking.

While that may be slightly true, the editor still is the most un-feature-rich part of all IDEs. The refactoring support may be useful, but it does not aid you in writing the text that is your logic. The IDE is nothing more than a simple text editor that has some hooks allowing you to see type errors / syntax problems / etc.

So what can we do about it? Sadly not too much :P The IDE frameworks are generally monstrous and do not allow for much customization. It seems strange to say, but Emacs got it right. An editor that allows the user to customize it completely. Similar efforts are going into Sublime Text 2 (which has Python at the core) and obviously VIM (with VIMScript). These editors are what people should be using to make themselves more productive. If the editor is missing something like a snippet expansion etc, the person using the editor can just implement it themselves and more importantly, share it with the world on GitHub :)

So what does an awesome editor, something like Emacs, not have? This is what IDEs have tried to solve, but mostly very badly. It simply needs the interaction with the compiler and/or linting system. One technology that enables this is the SWANK protocol. The SLIME mode in Emacs uses it and the Clojure guys have managed to use it to create very nice tooling for Clojure.

ENSIME is another gem: it is a server (and Emacs mode) that caters for Scala and to a lesser degree Java. This means that the SWANK protocol is all that is needed for an editor to get these features (along with the editor actually being able to use this data in some way). ENSIME has very advanced Emacs support and integration is currently underway for Sublime Text 2 and for Vim. Very good stuff.

The point is, people find reasons to blame the IDE. Agreed, it's a valid complaint and you have to wait for a while for the fix to propagate into the next release. Unfortunately, software like eclipse is, simply put, very, very poor. I don't care about the notion of a "workspace" and neither should the IDE / editor. Let me write my programs effectively, that's all I as a "professional" ask for. Let the stupidity die and let's really move forward!

Tuesday, February 28, 2012

I would like to have Funtoo

I am a gentoo fanboi, but a while ago I heard about Funtoo, which is a variant of gentoo, like Sabayon and also Google's ChromeOS according to http://www.internetnews.com/skerner/2010/02/google-goes-with-gentoo-portag.html. Gentoo and Funtoo are both from the same creator, Daniel Robbins. To me it seems that the major difference is that the portage tree is stored in git, of which I am also a fanboi. I will be getting a new laptop soon and I am considering making the switch to Funtoo. I will post my experiences here.

Monday, February 27, 2012

First message in gmail

So for some reason today, I wanted to find out when I registered my gmail account. I couldn't see any direct place on the UI or whatever, so I found out that if you go to https://mail.google.com/mail/#search//p99999999 it shows you the first message ever received.

Mine is my "Gmail is different" e-mail, which I'm assuming is like your welcome e-mail. It's dated 2004-09-01

What's yours?

Saturday, February 25, 2012

Spatial Competitive Evolution

Okay, time to post some (semi-)tangible stuff. The philosophical discussions have their place, sure... but those are mostly based on opinions and biases from past experiences. Experience and empirical observation are, in my opinion, what this blog should be all about. (There I go again with philosophy... sorry, dear reader.)

Ok, I am going to assume that you are familiar with Evolutionary Algorithms (EAs). If not, where have you been? :-) No worries, have a look at http://en.wikipedia.org/wiki/Evolutionary_algorithm.

Some of the problems that I currently experience with the run-of-the-mill EA include the following:

The populations are simply too small. Evolution does not work effectively with anything under 1,000 individuals. The biological metaphor on which EAs are based operates on hundreds of thousands or even millions of individuals. Small populations tend to cause a dramatic drop in diversity with a high occurrence of genetic drift. Representing a large population (say, 100k+) tends to be too expensive in terms of processor and memory requirements, and problems which do allow large populations are typically simple, trivial problems.
Selection tends to operate on the entire population space. Again, this is unnatural and tends to produce algorithms with a higher selective pressure. Do you really date someone from Japan, Jamaica and Portugal at the same time? ...and if you do, how do you address the language barrier?
Most fitness functions are one-dimensional in nature. Several characteristics are measured on a like-for-like basis (very similar to the zero-sum games of classical AI.) What I mean by this is that a feature is identified (by the creator!) and individuals are measured in exactly the same way against that feature. The evaluation is therefore rigid and static - and limited by some predefined bias. This, in my opinion, is the true ball-and-chain of classical EAs. This often results in an "unfit" individual, with a small, yet break-through component in its genetic make-up (the super-genius in the wheel-chair), becoming extinct. Furthermore, a high selective pressure coupled with this short-sighted fitness function really tends towards hill-climbing rather than true evolution.

I am fully aware that many people will disagree with me on many of these points. Thank you, you're more than welcome (please share why)... I am currently experimenting with a new idea (I think) that I call "SCE" or "Spatial Competitive Evolution", which tries to address these issues to some extent. It has not been as successful (yet!) as I hoped, but I am continuing to fiddle with it. Here are the main ideas:

The algorithm operates on a 2D grid (for now), where each cell represents an individual. Individuals can only reproduce and compete with their eight neighbouring cells (extending this beyond is of course also possible). This increases diversity remarkably well and take-over takes much longer. Unfortunately, genetic drift still occurs and larger populations are still required. This is very similar to the "lbest" algorithm of Particle Swarm Optimisation.
The fitness function is relative-based (that is, in competition with its neighbours). The function varies in that only a specific evaluation region of the fitness function is applied during evaluation. The fitness function itself evolves with the individual and has the ability to include all evaluation regions. This allows different sub-sets of the population to excel at different aspects than elsewhere on the grid. Eventually take-over will occur and the fitness function will evaluate all regions. This is true evolution, where the environment changes (the lion runs faster) and you have to adapt (as the gazelle, you better learn some camouflage... and if the trees disappear? ...well then you better learn to run faster!).
Except for the fitness function and population representation, the standard EA is followed with standard mutation and cross-over functions. Some changes come into play with the 2D grid, but these are rather intuitive.
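The ideas above can be sketched roughly as follows. This is an illustration under several assumptions and not the SCE implementation itself: the grid wraps around at the edges, each cell competes with a single randomly chosen neighbour per generation, and the evaluation region is one shared prefix length that grows on a fixed schedule instead of evolving with each individual. All function names are made up for the sketch.

```javascript
'use strict';

const ALPHABET = 'abcdefghijklmnopqrstuvwxyz ';

// Coordinates of the eight neighbours of (x, y). Edges wrap (assumption).
function neighbours(x, y, width, height) {
  const cells = [];
  for (let dy = -1; dy <= 1; dy++) {
    for (let dx = -1; dx <= 1; dx++) {
      if (dx === 0 && dy === 0) continue;
      cells.push([(x + dx + width) % width, (y + dy + height) % height]);
    }
  }
  return cells;
}

// Fitness restricted to an evaluation region: only the first `region`
// characters of the target are compared.
function fitness(text, target, region) {
  let score = 0;
  for (let i = 0; i < region; i++) {
    if (text[i] === target[i]) score++;
  }
  return score;
}

function randomChar(rng) {
  return ALPHABET[Math.floor(rng() * ALPHABET.length)];
}

function randomText(length, rng) {
  return Array.from({ length }, () => randomChar(rng)).join('');
}

// One generation: a cell that loses against a randomly picked neighbour is
// replaced by a child (uniform crossover plus one point mutation).
function step(grid, target, region, rng = Math.random) {
  const height = grid.length;
  const width = grid[0].length;
  return grid.map((row, y) => row.map((self, x) => {
    const around = neighbours(x, y, width, height);
    const [nx, ny] = around[Math.floor(rng() * around.length)];
    const other = grid[ny][nx];
    if (fitness(other, target, region) <= fitness(self, target, region)) {
      return self;
    }
    let child = '';
    for (let i = 0; i < self.length; i++) {
      child += rng() < 0.5 ? self[i] : other[i];
    }
    const m = Math.floor(rng() * child.length);
    return child.slice(0, m) + randomChar(rng) + child.slice(m + 1);
  }));
}

// Usage: a 10x10 grid, widening the region by one letter every 10 generations.
const target = 'to be or not to be';
let grid = Array.from({ length: 10 }, () =>
  Array.from({ length: 10 }, () => randomText(target.length, Math.random)));
for (let gen = 0; gen < 200; gen++) {
  const region = Math.min(target.length, 1 + Math.floor(gen / 10));
  grid = step(grid, target, region);
}
const best = grid.flat().reduce((a, b) =>
  (fitness(a, target, target.length) >= fitness(b, target, target.length) ? a : b));
console.log(best);
```

Because selection only ever looks at adjacent cells, a good individual spreads across the grid one neighbourhood at a time, which is where the slower take-over comes from.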

Results... well... not that well at this point. I've used the algorithm to evolve pieces of text that resemble a predefined line of text (text-writing Shakespearean monkeys come to mind). The fitness function starts off by evaluating a single letter and this then expands over time with the individual. It takes significantly longer to find the final solution (compared to a vanilla GA), but it does eventually get there...

I often keep things like this secret... but I realise that sharing it on this blog will help in determining if this approach has any merit (especially with all these smart people on this blog!). I will submit what I have on Github soon (and details to follow on this blog). You're welcome to clone the repository, play with it and let us know if you find a working set of parameters.

Disqus-ting

Sigh, for the life of me ... I couldn't get disqus to send out multiple email notifications for any new comments that were posted. As a workaround, I'm forwarding all of the whaleventure gmail emails to you guys (and by you guys I'm referring to the moderators, duh!). This is the single email address that gets notified whenever a comment is posted on disqus. If you don't want to receive these emails, then just don't verify the forwarding-authorization email that you received. I don't know if there is a better way of doing this? If someone out there figured it out: you will receive an award of 57 internets if you can provide us with a solution, plus you'll be really metal and stuff.

Friday, February 24, 2012

The internet and its opinions and cliques

NOTE: Everything in this is related to the "technology"/"development" "community" roaming the twitters and the blogs and the forums and the mailing lists

The Internet is a strange beast. It seems to form opinions. Even though there are millions of people using it, and millions of people chattering and reading, popular opinions seem to rise up, and unpopular opinions seem to get lost.

It's like a sort of a crowd effect - you know, the kind of thing where a crowd of people get together, get all hysterical and then kill some "outsiders" because they walk funny or sit strange or whatever.

This has been going on throughout human history. Crowds do strange things to us. They immediately create an "us" and "them" effect. Now I'm no evolutionary biologist, but I would presume it's got something to do with survival. People are quite receptive to "anti-them" sentiment, so much so that some of the worst atrocities in human history have been committed by powerful dictators stirring up and feeding on this sentiment.

And the fact is, these "outsiders" are classified not by any sort of objective assessment - it's simply hearsay that forms these opinions. And the opinions rise up, because they are powerful.

Ok, so enough with the abstract. What am I getting at?

Well, I feel the same thing happens on the discussion channels on the Internet. The Internet is a quick moving thing - so quick in fact that it's hard to keep up. So the time to produce and consume information is limited. In fact, some forms of communication even have a strict 140 character limit!

This all lends itself to the fact that these "anti-them" and "pro-us" sentiments get the most attention, because they are the most powerful. They also tend to rise up, and become stronger and stronger over time. You see this a lot in technology discussions. An opinion by someone may start out like "I think x is better than y because of a and b". Then this gets disseminated into the big wide Internets, and then people either agree or disagree.

Now the very nature of the forums (I'm looking at you twitter) is that you need to be quick to respond. You also need to be short (there is after all a hard limit to the number of characters you can send). This results in, not discussion, but bashing, or confirmation.

It is much easier to say "Yes! I agree with John that x is better!" or "No! John is wrong, x sucks." than to actually take on the arguments and reflect on them. This process then repeats and repeats, and in my opinion, you end up with camps. Little cliques who are vehemently pro-something or anti-something-else.

Now, I guess it's human nature to form camps. To try and surround ourselves with people with similar ideas and notions. This is not conducive to argumentation or improvement though. And I feel that discussions on the Internet fall into this trap a lot.

So can we please stop putting ourselves in camps?

I don't think there's any point in forming these cliques. It's not conducive to open discussion, or learning in general. Just saying "x sucks" makes you blind to x. It doesn't make you knowledgeable. You don't need to protect yourself by dissing "x" because you're afraid that you only know "y". You also don't necessarily need to always pronounce the pros of "y" at all costs. It's all good. We're all humans. There's no need for camps.

So, let's start getting rid of our cliques and our camps. Let's embrace the things that we like, but don't try to evangelise them. Let's also not be so afraid of things we don't like or know, that we completely diss them out of hand.

I'll start: Hi, my name is Nico. And I am a programmer, with opinions.

I like dynamic typing - even though I understand and see the merits in static typing
I like javascript for the server - even though I don't pretend that it's the next coming of Jesus Christ; it is still javascript after all
I dislike the boilerplate in Java - even though I understand the power of the community around that language
I've always loved python since I started using it - even though I think the python lambdas don't really work
I like Linux - even though I still play my games in Windows.
I love programming with functions - even though I realise that OO is still the predominant paradigm
I prefer closures over objects - even though I realise that an object is a closure, and a closure is an object

Thursday, February 23, 2012

Automating Jenkins job creation

So, here at work we've got a lot (and I mean a LOT) of git repositories.
We've also got an underused build server which Gary and I put up way, way back. Now my mission in life is to get this thing into people's lives. If it saves one guy (such as myself) from forgetting to git add a new file he added, it would be worth it for that reason alone.
So, in any case, there are so many repos that it would be impossible to add them all by hand. Luckily, jenkins comes with a nice command line tool. So I thought, I'll just write a script to import all the repos!
But then I saw, even better! They've got a REST API for creating jobs. Just go to http://your_jenkins_server/api to see the docs for it.
So in any case, this endpoint expects an XML file of the same format as it stores internally. So to get this format, I first used the "get-job" command on the Jenkins CLI to get an example XML file.
Then, the process seemed simple.

Get a list of all repos
Run through each of the repos, and clone them
If there's not a pom.xml, go to the next repo (it's not buildable, thus doesn't belong on a build server)
If there is a pom, grab the groupId,artifactId and populate the job xml
POST the resulting job XML file to the jenkins server API

Sounds simple right? Well it was, actually. Guess what took the longest? Getting the damn groupId and artifactId from the XML file! For some reason, python's libxml2 didn't want to do XPath queries on the pom.xml, because of the super-awesome over-engineered XML namespace abortion. So, I had to resort to a hack to get the data from the XML. See if you can spot the hack in the code :)
Anyway, here's the code if anyone's interested (and if we maybe find out it broke everything and need to reload everything :) )
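For readers without the original Python script, the core of the steps above could be sketched in node along these lines. This is an illustration under assumptions and not the script from this post: the regex extraction is just one possible namespace-dodging hack, the `{{groupId}}` and `{{artifactId}}` placeholders in the job template are made up, and the POST assumes a Jenkins that accepts unauthenticated job creation at `/createItem?name=...` (many setups also require credentials and a CSRF crumb).

```javascript
'use strict';

const http = require('http');

// Crude regex-based extraction instead of a namespace-aware XML parser.
// The <parent> block is dropped first so its ids are not picked up.
// Assumes the project's own ids appear before any dependency's.
function extractCoordinates(pomXml) {
  const withoutParent = pomXml.replace(/<parent>[\s\S]*?<\/parent>/, '');
  const groupId = /<groupId>\s*([^<\s]+)\s*<\/groupId>/.exec(withoutParent);
  const artifactId = /<artifactId>\s*([^<\s]+)\s*<\/artifactId>/.exec(withoutParent);
  if (!artifactId) {
    return null; // not something we know how to build
  }
  return { groupId: groupId ? groupId[1] : null, artifactId: artifactId[1] };
}

// Fill a job config template. The placeholder syntax is hypothetical.
function fillTemplate(template, coords) {
  return template
    .split('{{groupId}}').join(coords.groupId)
    .split('{{artifactId}}').join(coords.artifactId);
}

// POST the job XML to the Jenkins job creation endpoint.
function createJob(host, port, jobName, configXml, callback) {
  const req = http.request({
    host,
    port,
    method: 'POST',
    path: '/createItem?name=' + encodeURIComponent(jobName),
    headers: {
      'Content-Type': 'application/xml',
      'Content-Length': Buffer.byteLength(configXml),
    },
  }, (res) => {
    res.resume(); // discard the response body
    callback(null, res.statusCode);
  });
  req.on('error', callback);
  req.end(configXml);
}
```

Listing and cloning the repositories is left out, since that part depends entirely on where the repos are hosted.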

Blog logo

I have been thinking of a logo for the blog. I searched the Internet a little bit and I quite like this cartoon drawing (even the "simple framework" one will do) from How to Draw Cartoons online. Maybe it can serve as some inspiration for Clemens? The author (Jeff Scarterfield) is very passionate about drawing and also has a similar view on creating things, which is something we all value. I don't know if we are allowed to just use the drawing. I would prefer if Clemens could create something that is unique though.

CraneCard - HighCharts

High Level overview - I created a little reporting engine that takes a JSON file with a certain structure and produces the chart above. Within the engine the chart is generated from an HTML Table.

More Details Later...

What we've been up to

I think a summary of what we've been up to in our spare time is in order. These are only the things I know of, if there's anything else please let me know:

Cilib - Gary/Wiehann/Theuns have been busy working on this. I know Gary is very excited about their inevitable move to Scala
When Again? - A very interesting side project I've been working on with Dawid and Gary. Code is here
gapaint - Jaco's GA image evolution app. Very cool!
The above was used to create an animation of the doom logo here
jsdoom - my life hobby project. Creating a tribute to my favorite game of all-time, Doom. Code is here

If I left off anything, or there is anything else you want to add, let me know in the comments. I'll update the post.

The origins of Whaleventures

So, you may be wondering, what the hell is Whaleventures? What is the point of this blog? What do whales have to do with coding, or even anything in general? Other questions may be flowing through your mind at this point as well: Why am I reading this? Why should I care? What is the point?

Let's start with a story. A story about a young boy (or girl for 21st century compliance) who likes to create. And learn. And learn to create. And create to learn. Now this boy (girl) doesn't have a name, because he is every one of us. He may still be inside of you today, however advanced your age may be. But the point is, this boy is there. He exists. And he likes to create. And he likes to learn. Sometimes he hides away, because he feels threatened or locked in, or not appreciated, but he is always there.

The story goes further that this boy likes openness, and freedom. And that's why he has an arrangement - an arrangement to escape. To escape from restrictions. To escape from rigidness. To escape from everything that wants to diminish or block creativity and learning. That wants to monetise it, and direct creativity for any purpose other than pure creation itself.

Now whilst this boy realises that there is nothing wrong with creating things and selling them for money, he does miss the pure form of creation. And thus, he uses his whale to escape. To escape to the outside world, to the primitive world. To get out of corporate mumbo-jumbo and return to where creativity and learning can be pure. This escape, is his whale. And he does this on the big, vast ocean. An ocean of endless freedom.

Whaleventures is the ongoing story of creation and learning.

Whaleventures is a coding/tech blog. Whaleventures is also a place for ranting about technology. It's also a place where you can show off your newest creations. Whaleventures is also a place where we can learn from each other!

Whaleventures is a state of mind. It is about curiosity, learning, striving for better things. It is about experimenting, creating and exploring. It is about being careless, but also very careful at the same time. The art of creation is integral to each and every human being, and it is something that brings us all together.

Whaleventures is a celebration of the spirit of creation, in the purest possible sense. Because only by sitting on top of the little water splash on the whale's back, surfing through the big wide ocean, do you have the freedom to explore new things and ways of thinking and building. You're most certainly not gonna get that freedom if you don't actively look for it. No one will hand it to you.

You need to get out on your whale, into the big scary ocean, and start learning again. And creating.

Above all, life is about creating. And learning. And having fun.

Building your own IDE

So, recently, I've been dabbling a bit in node.js / javascript. This has led me to the conclusion that these monolithic IDEs we use every day are a Bad Idea (I'm looking at you eclipse). Because of the bloatedness of IDEA/eclipse/netbeans for javascript / HTML development I went back to basics. Basically building my own IDE using smaller, more specialized components. So, my basic setup is as follows:

Sublime Text 2 - text editor and syntax highlighting
jslint - Lint tool for Javascript code. It's basically a syntax checker, together with a bit of static analysis
npm - this is the node package manager. Think maven. It can also run your tests
guard - this is a ruby gem I use to watch for file changes. It can execute arbitrary commands whenever a file changes
libnotify/notify-send - libnotify is used to make popups on the desktop

Right, so those are the pieces. How do they fit together though? Well, easy. Guard is set up to watch my source/views and tests. Whenever any of these changes, the complete test suite is run using npm, and the output is received as JSON. I then parse the JSON to create a popup on my desktop, which instantaneously gives me the results of the unit tests. This works so well that I haven't really touched the browser in the last week to test my javascript code.

Here's the code for the test running/popup:

Here's my guard file:

The popups look really nice. They are also quite customisable (depending on your notify backend). Try

notify-send "Test" test

on your PC to see what it does. Anyway, first post! Here's an example of the popup:
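The JSON-to-popup step described above could be sketched like this in node. It is an illustration, not the original script: the report shape (`stats.tests`, `stats.passes`, `stats.failures`) is an assumption about what the test runner emits, and the function names are made up. The only external command used is `notify-send` with a title and a body, as in the example above.

```javascript
'use strict';

const { execFile } = require('child_process');

// Turn a JSON test report into a popup title and body.
// Assumed report shape: { "stats": { "tests": n, "passes": n, "failures": n } }
function summarize(reportJson) {
  const { tests, passes, failures } = JSON.parse(reportJson).stats;
  return {
    title: failures === 0 ? 'Tests passed' : 'Tests FAILED',
    body: `${passes}/${tests} passed, ${failures} failed`,
  };
}

// Show the popup through notify-send (needs libnotify on the machine).
function notify({ title, body }) {
  execFile('notify-send', [title, body], (err) => {
    if (err) {
      console.error('notify-send failed:', err.message);
    }
  });
}
```

Calling `notify(summarize(json))` from whatever script Guard runs on a file change would then raise the popup.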