20100126: Functional Java

I have started exploring Functional Java, which has been on my radar for quite a long time. Of course, Java's lack of syntactic sugar for closures does not help in understanding what's going on, particularly in the fj.control.parallel package, which contains the most interesting tools for concurrent computation.

What I am trying to do is to parallelize maven builds using this framework:

20100112: Maven: Why so much hatred ?

Following a post that is best left alone in the dark, I am seeing an upsurge of posts and tweets comparing various build systems, the latest of which is this one I grabbed from a tweet by builddoctor. From my tiny point of view, it seems the cyberworld contains two unevenly sized sets of people with an opinion on build systems: Maven haters and Maven users. And the funny thing is that the intersection of the two sets is not empty...

As a disclaimer for potential readers, I must confess I fall into the latter of the two groups: I have been using Maven since ... well, I could not remember precisely, so I dug into Apache's archives and found that the oldest available version was 1.0-beta-5, dated October 2003.

At that time I was undertaking my PhD with Norsys and the LIFL. I think I remember precisely how we encountered Maven at Norsys. We were starting our long road toward software factories and, like most people at that time, we were using Ant scripts. We found that our Ant scripts could be generalized and parameterized across various projects, reducing the effort needed to build, run tests, produce nice reports and package our Java applications.

Then we offered an internship to a student to industrialize that stuff, maybe even to the point where we would have a nice GUI for generating all the build.xml files from a couple of parameters. After a couple of days, our intern and I stumbled upon the Maven project and, reviewing the available material, we realized that what we were dreaming of was already being implemented by some smart guys from the ASF. So we dropped our homebrew Ant scripts and started building our projects with Maven.

Of course, we suffered a lot: Maven 1.X was (and probably still is) painful and could very quickly grow into a sour mess, but the benefits we found in the standardization of project structure and process, the integration of reports in a single site, and dependency management outweighed the inconvenience of writing programs in XML.

Then came Maven 2, which was a huge improvement, even with all its drawbacks and convoluted architecture (ah, the joys of understanding the Plexus container's DI without documentation), and which has kept improving thanks to a large and active community. On this fertile land grew a whole ecosystem which helped streamline and organize large projects: CI, repository and artifact management (how many companies simply did not manage their dependencies before Maven?), tons of reports, analysis tools, testing frameworks and reporting utilities. None of these are Maven-specific of course, but a standardized build system helped a lot in propagating those ideas.

And with success came criticism, and sometimes hatred. I feel that Maven hatred is often tied to Java-bashing. The rise of Ruby, Groovy, Scala, RoR and more agile tools, frameworks and languages may slowly push Java and its infrastructure into the realm of legacy code. But this will take time, and in the meantime Java and Maven are here to stay.

The question

My love affair with Maven is akin to the one with Java: it started in passion but quickly lost its romance and degraded into mere friendliness. We stayed close friends, to the point of going out together and meeting other people, but there was no longer any desire to go beyond this mundane relationship. In contrast, I have a passionate but distant love affair with more exotic languages, like Haskell or Scala, you know, that girl

So it is for Maven: I like it, and I use it professionally to the point that I can introduce it in some places when I need to replace a haphazardly thrown together mix of Ant and shell scripts, which is often what passes for a build system. I gave and wrote courses in French on Maven. When I start a project in Java or Scala, I write a simple to get started quickly. I know how to set up a simple (eg. httpd based) or complex (eg. Nexus based) artifact repository. I have written more than a handful of plugins and even contributed some documentation. I went so far as to start using MavenEmbedder and to dabble with Maven's internals and Plexus! I know it, yet I don't love it (anymore).

I can understand that people find more value in other tools. As said in the article cited in my introduction, Maven's learning curve is steep and the tool, while powerful, is difficult to master and very easy to turn into a mess. But one thing I cannot understand is how one can come to hate Maven.

So here comes, at last, the goal of this article: I would like to gather stories about Maven, good or bad, the best and the worst being the most interesting. So if you, gentle reader, want to share a story, a compelling one, complete with all the gory details (eg. how much time you lost/gained with Maven, how much size in build scripts you lost/gained), feel free to drop me a mail (abailly arobase oqube point com) or tweet me (abailly, less safe) with a link to your war stories with Maven. If enough people read this article and send me links, I will set up a page, something we could call a Cabinet de Curiosités dedicated to Maven.

20100104: Poor Man's Dependency Injection

Working continuously with functional programming languages like Haskell and Scala has affected my programming style and way of thinking in Java, which remains my main programming language (everybody needs food and lodging...). One important lesson I learnt is to try as hard as possible to prevent statefulness creep: the state of a system where each and every object can itself change state in hardly predictable ways through messages. This implies that, whenever I can, I try to use immutable objects and functions (ie. methods that transform a passed object into another object). This gives systems built on such elements a nice property called referential transparency: the ability to substitute equals for equals without worrying about the true identity of the objects involved.
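As a minimal sketch of what this means in Java (the Point class and its methods are hypothetical, introduced only for illustration), an immutable value whose methods return new objects lets two equal expressions be swapped freely:

```java
// Hypothetical example type: an immutable 2D point.
final class Point {
    final int x;
    final int y;

    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }

    // A "function" in the sense used above: it returns a new object
    // and leaves the receiver untouched.
    Point translate(int dx, int dy) {
        return new Point(x + dx, y + dy);
    }

    // Value equality, so that "equals for equals" is meaningful.
    @Override
    public boolean equals(Object o) {
        if (!(o instanceof Point)) return false;
        Point p = (Point) o;
        return x == p.x && y == p.y;
    }

    @Override
    public int hashCode() {
        return 31 * x + y;
    }
}

class ImmutabilityDemo {
    public static void main(String[] args) {
        Point origin = new Point(0, 0);
        Point a = origin.translate(1, 2);
        Point b = origin.translate(1, 2);
        System.out.println(a.equals(b)); // true: same inputs, same result
        System.out.println(origin.x);    // 0: the original is unchanged
    }
}
```

Because translate never mutates its receiver, any occurrence of origin.translate(1, 2) can be replaced by a or b without changing the program's behaviour.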

Back to Java, a language impaired by a noisy syntax and imperative semantics: these days I increasingly use a feature of the language that may be considered bad style, namely nested and anonymous classes. They are the closest approximation to closures that we have in Java 6 (real closures should make their way to Java 7 after all...), and closures are really useful when one wants to reap the benefits of the functional paradigm in idiomatic Java. Much like closures, nested classes capture their environment (hence the term closure) in a way that is essentially hidden from the outside world.
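To illustrate the capture of the environment, here is a small sketch in Java 6 style. The Function1 interface and the adder method are assumptions made up for this example, not part of any library:

```java
// Hypothetical one-method interface standing in for a function type.
interface Function1<A, B> {
    B apply(A a);
}

class ClosureDemo {
    // Returns an anonymous class instance that captures the local variable n.
    // In Java 6 the captured variable must be declared final.
    static Function1<Integer, Integer> adder(final int n) {
        return new Function1<Integer, Integer>() {
            @Override
            public Integer apply(Integer a) {
                return a + n; // n comes from the enclosing scope
            }
        };
    }

    public static void main(String[] args) {
        Function1<Integer, Integer> addFive = adder(5);
        System.out.println(addFive.apply(3)); // 8
    }
}
```

The caller of addFive sees only the Function1 interface; the captured value 5 is invisible from the outside.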

The context

Recently, I have been working on some (simple) piece of software where objects need to be configured with references to other objects and values before use. Being serious about the Single Responsibility Principle, I try as much as possible to segregate roles between different objects which usually leads to creating an ecosystem of objects (hence classes in Java) interacting with one another, where the various roles are usually expressed as Java interfaces (I am not very dogmatic with this one, and let interfaces grow from real needs expressed through TDD, rather than introducing them up-front).

To properly construct the resulting graph of objects, one then resorts to Dependency Injection: references and values are injected from the outside into the objects, rather than being created by the objects themselves.

In the meantime, I read Functional Pearl: Implicit Configurations, by Oleg Kiselyov and Chung-chieh Shan, Proceedings of Haskell'04 Workshop, and was quite inspired by it. The idea implemented in this paper relies on some tricky Haskell machinery (type classes, serialization through the foreign function interface, phantom types), but it allows writing very expressive configurable code. Here is an example where an arithmetic expression is configured at runtime to use a given modulus:

   test4 :: (Modular s a, Integral a) => M s a
   test4 = 3 * 3 + 5 * 5


   withIntegralModulus :: Integral a =>
                             a -> (forall s. Modular s a => M s w) -> w
   withIntegralModulus (i :: a) k :: w =                                
      reifyIntegral i (\(t :: t) ->
                         unM (k :: M (ModulusNum t a) w))
   test4' = withIntegralModulus 4 test4

The really nice thing, made possible by the way Haskell handles generic functions and type-class overloading, is that a modular arithmetic expression looks just like a standard arithmetic expression, eg. 3 × 3 + 5 × 5.

The type system here acts as an environment in which computation takes place guaranteeing some value will be provided later on, eg. at runtime, to produce a result. The net effect is that we keep the code's functional purity while allowing flexible configuration at run-time, in a type safe way.  

Of course, Java's type system, even with the addition of generics in Java 5, is rather lame when compared to Haskell's, so it would be silly to try to reproduce this. But this article and the Cake Pattern in Scala prompted me to try implementing something similar in Java.

Initially I was tempted to use one of the DI frameworks that are widely available these days: Guice, PicoContainer and Spring DI, to name a few I experimented with. I most recently used Guice and was quite pleased with it, but I must confess that:

  1. I find annotations clumsy and a bit dissonant: They introduce some orthogonal behavior whose semantics may not be immediately obvious from reading the source code and may be changed arbitrarily at run or compile-time,
  2. I would rather not tie myself to a specific framework if I can leverage the possibilities offered by the language and the platform I am using.

Hence this modest attempt.

The Code

The idea of this configuration scheme is simple. Configurable objects should implement a NeedConfiguration interface that allows retrieval of a configuration object:

public interface NeedConfiguration<Conf extends Configuration> {    
   Conf config();
}


public interface Configuration {
  Logger getLog();
}


public abstract class MyConfigurable implements NeedConfiguration<Configuration> {


   public void doSomething() {
      config().getLog().log("some message");
       ...
   }
}

NeedConfiguration classes are made abstract to ensure that they are provided with a correct configuration at runtime: they cannot be instantiated, and hence cannot be used, without one.

Configuration classes provide context to configurable classes through anonymous subclassing, and implement other configurations:

public class DefaultConfiguration implements Configuration {


    final public MyConfigurable object = new MyConfigurable() {


        @Override
        public Configuration config() {
          return DefaultConfiguration.this;
        }
    };


    private Logger log;


    @Override
    public Logger getLog() {
        return log;
    }


    public void setLog(Logger log) {
        this.log = log;
    }
}

A configuration provider simply instantiates the required configuration. For example, if we have a Main class that does some job, then we can do:

public static void main(String[] args) {
  DefaultConfiguration config = new DefaultConfiguration();
  config.setLog(new Logger(args[0]));
  config.object.doSomething();
}
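
Putting the pieces above together, the scheme can be sketched as a single compilable unit. The Logger class below is a hypothetical in-memory stand-in (the real one is not shown in this post), and ConfigurationDemo is a name invented for this example:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the Logger used above: it just records messages.
class Logger {
    final List<String> messages = new ArrayList<String>();

    void log(String message) {
        messages.add(message);
    }
}

interface Configuration {
    Logger getLog();
}

interface NeedConfiguration<Conf extends Configuration> {
    Conf config();
}

// Abstract: it can only be instantiated by code that supplies config().
abstract class MyConfigurable implements NeedConfiguration<Configuration> {
    void doSomething() {
        config().getLog().log("some message");
    }
}

class DefaultConfiguration implements Configuration {
    private Logger log;

    // The anonymous subclass captures the enclosing configuration instance.
    final MyConfigurable object = new MyConfigurable() {
        @Override
        public Configuration config() {
            return DefaultConfiguration.this;
        }
    };

    @Override
    public Logger getLog() {
        return log;
    }

    void setLog(Logger log) {
        this.log = log;
    }
}

class ConfigurationDemo {
    public static void main(String[] args) {
        DefaultConfiguration config = new DefaultConfiguration();
        config.setLog(new Logger());
        config.object.doSomething();
        System.out.println(config.getLog().messages); // [some message]
    }
}
```

Note that the configurable object looks up its logger on every call through config(), so setting the logger after the object has been created still works.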

A configuration can be made more flexible by segregating interfaces by usage, for example:

interface WithLog {
  Logger getLog();
}


interface WithCache {
  Cache getCache();
}

Of course, we lack the flexibility of Scala traits, so we cannot reuse configuration implementations directly, only their interfaces. That is, if we have both a CacheConfiguration and a LogConfiguration as concrete classes, and we want to provide a GlobalConfiguration which encapsulates both, we need to use some delegation at the cost of writing boilerplate code:

class LogConfiguration { ... }
class CacheConfiguration { ... }


class GlobalConfiguration implements WithLog, WithCache {


   MyConfigurable object = new MyConfigurable() { ... };
   LogConfiguration logc = new LogConfiguration();


   public Logger getLog() { return logc.getLog(); }
   ...
}

The important point here is that instances of the domain objects are created at the root of the configuration class(es) as instances of some anonymous class.
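
The delegation approach described above might look like the following sketch. Logger, Cache and CachingService are hypothetical types invented for this example, and the use of an intersection bound (C extends WithLog & WithCache) is one possible way to state that a domain object needs both roles, not necessarily the one used in the original code:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical collaborators, kept deliberately trivial.
class Logger {
    final List<String> messages = new ArrayList<String>();
    void log(String message) { messages.add(message); }
}

class Cache {
    private final Map<String, String> entries = new HashMap<String, String>();
    String get(String key) { return entries.get(key); }
    void put(String key, String value) { entries.put(key, value); }
}

interface WithLog { Logger getLog(); }
interface WithCache { Cache getCache(); }

class LogConfiguration implements WithLog {
    private final Logger log = new Logger();
    public Logger getLog() { return log; }
}

class CacheConfiguration implements WithCache {
    private final Cache cache = new Cache();
    public Cache getCache() { return cache; }
}

// Hypothetical domain object that needs both a log and a cache.
abstract class CachingService<C extends WithLog & WithCache> {
    abstract C config();

    String lookup(String key) {
        String value = config().getCache().get(key);
        if (value == null) {
            config().getLog().log("miss: " + key);
            value = key.toUpperCase(); // stands in for an expensive computation
            config().getCache().put(key, value);
        }
        return value;
    }
}

class GlobalConfiguration implements WithLog, WithCache {
    private final LogConfiguration logc = new LogConfiguration();
    private final CacheConfiguration cachec = new CacheConfiguration();

    // Domain object created at the root of the configuration class.
    final CachingService<GlobalConfiguration> service =
        new CachingService<GlobalConfiguration>() {
            @Override
            GlobalConfiguration config() { return GlobalConfiguration.this; }
        };

    // Boilerplate delegation: the price of not having traits.
    @Override public Logger getLog() { return logc.getLog(); }
    @Override public Cache getCache() { return cachec.getCache(); }
}

class GlobalConfigurationDemo {
    public static void main(String[] args) {
        GlobalConfiguration global = new GlobalConfiguration();
        global.service.lookup("maven"); // cache miss, logged
        global.service.lookup("maven"); // cache hit, not logged
        System.out.println(global.getLog().messages); // [miss: maven]
    }
}
```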

Discussion

What happens here is very close to what the compiler does in Scala when implementing the Cake Pattern, albeit with less power (we cannot inherit implementations, only interfaces) and more verbosity. When compared to standard DI solutions, we can say the following:

I have no idea how this simple scheme might scale when one wants to configure a lot of objects with tens of parameters, scaling here meaning of course maintaining readability and understandability, but I feel that, just as in any ecosystem of objects, good separation of concerns and an explicit domain design can help a lot.

As a last-minute addendum, it seems this idea has been pushed quite far by Qi4j.