Maven may not solve the problem you thought it solves

This post is about Apache Maven and why I think it does not deliver what many developers hope to get from it.

Maven is a build and dependency management tool. Maven claims to be something more general, but I think that is misleading:

<quote from=”Maven Getting Started Guide“>
At first glance Maven can appear to be many things, but in a nutshell Maven is an attempt to apply patterns to a project’s build infrastructure in order to promote comprehension and productivity by providing a clear path in the use of best practices. Maven is essentially a project management and comprehension tool… .
</quote>

Whatever… let’s call it a build tool for the moment. When people start looking into Maven, that is the problem they are trying to solve. All the other things Maven does I’d call a consequence of the environment it was initially developed for and grew up in.

Beyond general curiosity, that is also why I looked into Maven. I even recommended Maven to address software production issues for a non-trivial software system. I would not do that anymore. Here is why.

When I started looking into Maven I wanted to solve the problem of “maintaining a solution consisting of several interrelated modules without caring much about build problems”. As it turned out, that’s not what Maven is built for.

How Does It Work?

Let’s not look into how Maven invokes a compiler or build plugins and such. The more significant aspect is that Maven relies on repositories that hold artifacts, by name and version, typically produced by other Maven builds. In fact, you have one locally. When Maven cannot find what it needs there, it goes to remote repositories and fills your local repository, which then serves as a cache. Maven plugins, the code that actually implements the very thing you wanted Maven to do in the first place, come from there as well (which is consistent). That’s why Maven, on the first call for anything, seems to “download half the internet” (which, no, I am not complaining about; that is just bootstrapping).
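To make the lookup order concrete, here is a minimal sketch in Python. It is a toy model, not Maven’s actual implementation: the names Coordinate, Repository and resolve are made up for illustration, and real Maven coordinates carry more than a name and a version.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Coordinate:
    """An artifact identified by name and version (simplified on purpose)."""
    name: str
    version: str


@dataclass
class Repository:
    """A toy repository: a mapping from coordinate to artifact content."""
    artifacts: dict = field(default_factory=dict)


def resolve(coord, local, remotes):
    """Return the artifact for coord, caching remote hits in the local repository."""
    if coord in local.artifacts:
        return local.artifacts[coord]
    for remote in remotes:
        if coord in remote.artifacts:
            # The "download": copy the artifact into the local repository.
            local.artifacts[coord] = remote.artifacts[coord]
            return local.artifacts[coord]
    raise LookupError(f"cannot resolve {coord.name}:{coord.version}")


# Hypothetical example data.
local = Repository()
central = Repository({Coordinate("some-lib", "1.0"): b"jar bytes"})

first = resolve(Coordinate("some-lib", "1.0"), local, [central])
second = resolve(Coordinate("some-lib", "1.0"), local, [])  # served from the local cache
```

The second call succeeds without any remote repository, which is the caching effect described above.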

Maven projects are described in a Project Object Model (POM). Ignoring the details, the POM describes project type, some version numbering and dependencies (by name and version). Dependencies are resolved using the repositories.

You put stuff into a local repository by installing it. You put it into remote repositories by deploying it (see references below).
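The difference between installing and deploying can be sketched roughly as follows. Again this is a toy model in Python under my own assumptions: the dicts, the pom structure and the function names install and deploy are illustrative only and do not mirror Maven’s internals or its lifecycle.

```python
# Repositories modelled as plain dicts keyed by (name, version).
local_repo = {}
remote_repo = {}


def install(name, version, artifact):
    """Put a build result into the local repository only."""
    local_repo[(name, version)] = artifact


def deploy(name, version):
    """Publish an already installed artifact to the shared remote repository."""
    remote_repo[(name, version)] = local_repo[(name, version)]


# A toy stand-in for a POM: the project's own coordinates plus its dependencies.
pom = {
    "name": "my-module",
    "version": "1.0",
    "dependencies": [("my-lib", "2.3")],
}

install("my-lib", "2.3", b"lib bytes")
# After installing, only this machine can resolve the dependency.
missing_remotely = [d for d in pom["dependencies"] if d not in remote_repo]

deploy("my-lib", "2.3")
# After deploying, anyone using the remote repository can resolve it.
visible_to_others = [d for d in pom["dependencies"] if d in remote_repo]
```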

So in short, apart from actually invoking plugins that transform source code into something more readily processable by other tools or by a machine, Maven is all about managing artifacts between your local development environment and some remote storage that provides missing dependencies to you and others.

What Problem Does It Solve?

Maven was built for Apache, and that’s where the approach makes most sense. Many projects are really single projects with dependencies on stuff that is out of individual reach, so Maven’s repository approach provides an automated, defined interface for getting your hands on some other project in the form of its (binary) output, its released artifacts.

So, Maven is great if you want to integrate into the Apache ecosystem: you are right in it from the start, and you have a well-defined interface to express your dependencies on other people’s projects.

Maven for Solution Development?

When developing and maintaining a software system that comprises any non-trivial number of modules, however, you will want:

  • Integrity: You do not want to fiddle around and dive deeply into your dependency version vector. There should be one place to look for what is being used in your system, and there should be no doubt whatsoever where it comes from.
  • Modifiability: Sooner or later you will need to fix something that initially was taken from elsewhere and is now part of your system.
  • Repeatability: Whether you look at your system code now or in two years, if and when you need to build it, the result should be guaranteed to be identical.
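To illustrate what the integrity point asks for, here is a small sketch of a check over such a dependency version vector. It is only an illustration in Python with made-up module and library names, not something Maven provides in this form: it reports every dependency that different modules of the same system request in different versions.

```python
from collections import defaultdict


def find_version_conflicts(modules):
    """Map each dependency name to the set of requested versions, if more than one."""
    requested = defaultdict(set)
    for deps in modules.values():
        for name, version in deps.items():
            requested[name].add(version)
    return {name: versions for name, versions in requested.items() if len(versions) > 1}


# Hypothetical system: two modules, each declaring its own dependency versions.
modules = {
    "web": {"logging-lib": "1.2", "json-lib": "2.0"},
    "core": {"logging-lib": "1.4", "json-lib": "2.0"},
}
conflicts = find_version_conflicts(modules)
```

With one place that defines what is used in the system, a check like this would have nothing to report.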

So… thinking about it, what this means is that you, at the very least, need to take ownership of the surrounding infrastructure, i.e. you need to set up your own shared repositories to guarantee availability and repeatability. You may need to take over local versions of other projects to make sure you can patch them. So that’s a fundamental problem that may not be obvious in the beginning: You are now in charge of operating a non-trivial distributed infrastructure.

Secondly, you buy into the complexity of managing non-trivial version vectors although you really didn’t mean to build a network of independent components that somehow get intelligently integrated into a solution. You just wanted to work on your solution. Maybe even in a Product Line Engineering style. But you definitely didn’t want fine-grained module level versioning.

So in short, something that is useful for a large number of relatively small and independent groups of interest does not apply well to highly interdependent groups that work on a large, modular software system.

Conclusion

Well… you can guess where this is going. I am convinced that solution development and maintenance demands a system-centric approach, and I have seen that successfully implemented. Solving it by adding secondary tools and configuration from the “outside” that eventually reach a complexity similar to that of the actual solution simply doesn’t cut it.

References