Integrating Play and Cassandra

Working in the IT industry, we are forever coming across new technology (or is it a new use of old technology?) and needing to get a feel for something new. Maybe you have an interview coming up? I have ended up looking into two different enterprise web frameworks recently: Spring and the Play Framework. This post is about the latter.

Play Framework

Play is an asynchronous framework built upon Akka. To quote: "Akka is a toolkit and runtime for building highly concurrent, distributed, and fault tolerant event-driven applications on the JVM." The reason I mutter about old technology is that systems I worked on 20+ years ago used asynchronous calls everywhere (e.g. RSX-11 QIO), so to me this is a step back in time. In those days we would not think of writing code that blocked; semaphores, events and the like were everyday fodder.

Unlike a threaded web server, the Play Framework runs all actions on a very small number of shared threads. If you block these threads for even short periods of time, application throughput will suffer. So when implementing the controller methods (Actions) you need to be aware of which calls may block, e.g. database calls. Play provides the concept of a Promise of something, i.e. I am going to give you the result at some point, but not yet. Think of a richer form of Future. In Java, setting these up looks scary; I suspect this will be simpler with lambdas in Java 8.

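    // Needs play.libs.F.Function, play.libs.F.Function0, play.libs.F.Promise and play.mvc.Result.
    public static Promise<Result> index() {
        return Promise.promise(
            // First step: the call that may block, producing the list of tracks.
            new Function0<List<Track>>() {
                public List<Track> apply() {
                    return factory.findSome();
                }
            }
        ).map(
            // Second step: turn the list into the rendered result.
            new Function<List<Track>, Result>() {
                public Result apply(List<Track> t) {
                    return ok(tracks.render(t));
                }
            }
        );
    }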

Breaking this down: our action returns a Promise. The framework calls the first function's apply method, which returns a List<Track>; when that completes, it calls the second function's apply method to convert the list into a rendered result that is returned to the caller.

The boilerplate uses anonymous classes to achieve what a lambda would: each one supplies an anonymous function (the apply method defined within the class) to be called.

In Scala, where functions are first-class entities, the action simplifies to:

    // Future { ... } needs an implicit ExecutionContext in scope.
    def index() = Action.async {
        scala.concurrent.Future { factory.findSome() }.map( t => Ok(tracks.render(t)) )
    }

I can see clearly what happens with the Scala version but the Java one is a little bit like black magic.
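
For comparison, here is a minimal sketch of the same two-step shape written with Java 8 lambdas. It deliberately uses the JDK's own CompletableFuture as a stand-in for Play's Promise, so it runs without Play on the classpath and is not Play's API. The Track class, findSome and render are hypothetical placeholders for the factory and template used in the action above.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;

class LambdaSketch {
    // Hypothetical stand-in for the Track model used in the action above.
    static class Track {
        final String title;
        Track(String title) { this.title = title; }
    }

    // Stand-in for factory.findSome(): in the real application this is the call that may block.
    static List<Track> findSome() {
        return Arrays.asList(new Track("A side"), new Track("B side"));
    }

    // Stand-in for ok(tracks.render(t)): turns the list into something to send back.
    static String render(List<Track> tracks) {
        StringBuilder sb = new StringBuilder();
        for (Track t : tracks) {
            sb.append(t.title).append('\n');
        }
        return sb.toString();
    }

    // Same shape as the anonymous-class version: produce the list asynchronously, then map it.
    static CompletableFuture<String> index() {
        return CompletableFuture
                .supplyAsync(() -> findSome())
                .thenApply(tracks -> render(tracks));
    }

    public static void main(String[] args) {
        // join() blocks, which is fine in a demo main but not inside an action.
        System.out.print(index().join());
    }
}
```

Each anonymous class collapses to a single lambda expression, which gets the Java version much closer to the Scala one-liner.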

Cassandra

Cassandra is a NoSQL database (I love the translation of NoSQL as "Not Only SQL"). Instead of a database that stores everything in two dimensions, it can store things in three dimensions (maybe more; that's where my brain stopped for now). For my investigations I basically treated it as a two-dimensional database. I had a CSV file of some singles lying around from a long time ago (yes, a track listing for 7" vinyl). I used this to load a table in Cassandra and then generated some simple pages and JSON from the data. This used the DataStax Java driver in blocking mode.

So, with a TrackFactory to access a Cassandra column family (AKA table), we have a simple integration of Cassandra and Play.
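
To give a feel for the shape of such a factory, here is a rough sketch under some loud assumptions. The TrackSource interface is a hypothetical seam standing in for the blocking driver query, backed here by an in-memory list so the example runs without Cassandra, and the sample rows are made up. Running the blocking call on its own small thread pool is one common option, not necessarily what the original code did. CompletableFuture again stands in for Play's Promise.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.stream.Collectors;

class TrackFactorySketch {
    // Hypothetical model: one row of the singles table.
    static class Track {
        final String artist;
        final String title;
        Track(String artist, String title) { this.artist = artist; this.title = title; }
    }

    // Hypothetical seam standing in for the blocking driver query.
    interface TrackSource {
        List<Track> fetchAll();
    }

    static class TrackFactory {
        private final TrackSource source;
        private final ExecutorService blockingPool;

        TrackFactory(TrackSource source, ExecutorService blockingPool) {
            this.source = source;
            this.blockingPool = blockingPool;
        }

        // Blocking call, as with a driver used in blocking mode.
        List<Track> findSome() {
            return source.fetchAll();
        }

        // Runs the blocking call on the dedicated pool and hands back a future.
        CompletableFuture<List<Track>> findSomeAsync() {
            return CompletableFuture.supplyAsync(this::findSome, blockingPool);
        }
    }

    static List<String> demo() {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            // Made-up sample rows in place of a real query result.
            TrackSource inMemory = () -> Arrays.asList(
                    new Track("Artist one", "Title one"),
                    new Track("Artist two", "Title two"));
            TrackFactory factory = new TrackFactory(inMemory, pool);
            return factory.findSomeAsync()
                    .thenApply(tracks -> tracks.stream().map(t -> t.title).collect(Collectors.toList()))
                    .join();
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Keeping the blocking lookup behind a small interface also makes it easy to swap the in-memory list for a real driver call later.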

Afterthought

The hardest part was getting my head around the Java Promise functionality; this is a case of the language not really supporting modern paradigms. It was also interesting getting my head around the workings of an alternative NoSQL database; the other one I have used is MongoDB.

I think the most powerful tool here is Cassandra; it makes you look at data in a different way. You have to change the way you model your data. Also, as the aims of Cassandra are redundancy and performance, we start dropping some of the core RDBMS rules, such as normalization.
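
As a loose illustration of dropping normalization, the sketch below keeps one copy of each track per query it needs to answer, using plain in-memory maps as stand-ins for two tables. It is not Cassandra code; the class, the field names and the choice of "by artist" and "by year" lookups are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class DenormalizedTracks {
    // Hypothetical model: one single from the track listing.
    static class Track {
        final String artist;
        final String title;
        final int year;
        Track(String artist, String title, int year) {
            this.artist = artist;
            this.title = title;
            this.year = year;
        }
    }

    // One "table" per query, each keyed by what that query looks up.
    private final Map<String, List<Track>> tracksByArtist = new HashMap<>();
    private final Map<Integer, List<Track>> tracksByYear = new HashMap<>();

    // A write stores the same track twice, once under each key.
    void add(Track t) {
        tracksByArtist.computeIfAbsent(t.artist, k -> new ArrayList<>()).add(t);
        tracksByYear.computeIfAbsent(t.year, k -> new ArrayList<>()).add(t);
    }

    // Each read is a single lookup by key, with no join.
    List<Track> findByArtist(String artist) {
        return tracksByArtist.getOrDefault(artist, Collections.emptyList());
    }

    List<Track> findByYear(int year) {
        return tracksByYear.getOrDefault(year, Collections.emptyList());
    }
}
```

The trade is duplicated data and more work on every write in exchange for reads that need no join.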

So, another couple of useful tools in my box.