Tag Archives: Java

Write awesome CLIs!

It’s time to start writing kick ass CLIs instead of hacking scripts! 🙂 It’s a lot easier than you might think.

If you’re impatient just scroll to the bottom for a link to the code in Github. 🙂

All those scripts

I see a lot of scripts around, but they usually suffer from many of these problems:

  • Missing or bad error handling
  • Limited input validation
  • Clumsy parameter handling
  • No testing, so every change requires testing all input combinations. Not to mention different state on the hard drive.
  • Copy and paste code. It’s hard to re-use libraries in scripts, even though a lot exist.
  • Implicit dependencies on the OS and OS packages

I’ve done way too much of this in my time, and I have felt the pain of maintaining 16k lines of Bash code (I know, stupid). So I started looking for something better…

What I wanted

Coming from the developer side of things I’m really used to making third party libraries do a lot of the heavy lifting for me. It felt really awkward that there was no proper way to do this when creating tools for the command line. So I set out to look for:

  • A good way to define the Command Line Interface
  • Proper error handling
  • Test frameworks to enable automated testing
  • A way to package everything together with dependencies

In addition, I really wanted to do some automated testing. I hate writing code without knowing instantly that it performs as I believe it does. You might be differently inclined. 😉

Solutions?

There are many ways of doing this, but the only ones I’ve been able to get some real experience with are Python and Java. I would really like to learn Go, but it’s usually not politically viable and would take some time to learn.

I did maintain and develop a CLI in Python for a good while. And I really like the Python language and all the awesome third party libraries available. But I always found it lacking in the distribution part. We were under certain (networking) constraints, so downloading stuff from PyPI was NOT an option. It took quite a lot of hacking with Virtualenv and Pip to set up some kind of infrastructure that enabled us to distribute our CLI with its dependencies. YMMV. 🙂

But all these hoops we were jumping through with Python made me think about what is great about Java. The classpath. 😉 Yeah, I know, I know. You all hate the classpath. But that’s because it’s been abused by the Java EE vendors through all these years. It’s really quite awesome, just make sure you take full control of it.

Java with some help from friends (see the details further down) would let me package it all up and create a truly cross platform single binary with all dependencies included (JRE required)! It even starts fast! (Unless you overload it with all kinds of Spring+Hibernate stuff. That’s on you.) And even though it sounds like something a masochist would do, it is actually kick ass. Try it. 🙂

If you don’t do this in Java, use Docopt (available in many languages). You should write CLIs and keep your build tool simple (dependencies, versioning, packaging). I’ve seen way too much tooling shoehorned into different build tools. Write CLIs for the stuff not related to building and use the right tool for the right task.

Test a Java CLI

If you just want to try how fast and easy it actually works (you’ll need a JRE on your path):

$ curl https://dl.dropboxusercontent.com/u/122923/executable-json-util-1.0-SNAPSHOT.jar > ~/bin/json-util && chmod u+x ~/bin/json-util
$ json-util
Usage:
  json-util animate me
  json-util say [--encrypt=] 

In case you did not catch that:

  • We downloaded a jar and saved it as a regular executable in ~/bin.
  • We executed it and didn’t give any parameters so it printed the help text.

It’s a really simple (and stupid) example, but to invoke some “real” functionality you can do:

$ json-util say "Hello blog!"
Hello blog!

$ json-util say --encrypt=rot13 "Hello blog!"
Uryyb oybt!

Neat! Write the utils you need to be effective in a language you know, with the tooling you know (this util is created with Maven). And write some F-ing tests while you’re at it. 😉
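To make the point about tests a bit more concrete, here is a minimal sketch of what the logic behind `--encrypt=rot13` might look like. The `Rot13` class and its method are names I made up for illustration; they are not taken from the example project. The point is only that a small pure function like this is trivial to cover with a unit test.

```java
/** Minimal ROT13 sketch; class and method names are illustrative assumptions. */
final class Rot13 {

    private Rot13() {
    }

    /** Rotates ASCII letters by 13 places and leaves every other character untouched. */
    static String rot13(String input) {
        StringBuilder out = new StringBuilder(input.length());
        for (char c : input.toCharArray()) {
            if (c >= 'a' && c <= 'z') {
                out.append((char) ('a' + (c - 'a' + 13) % 26));
            } else if (c >= 'A' && c <= 'Z') {
                out.append((char) ('A' + (c - 'A' + 13) % 26));
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Same input as the command line example above.
        System.out.println(rot13("Hello blog!"));
    }
}
```

Since ROT13 is its own inverse, applying it twice should give back the original text, which makes for an easy extra test case.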

Tell me more, tell me more…

The things that make writing CLIs in Java fun, easy and robust are:

  • Java. 🙂 Alright, alright. Maybe not the best language for this stuff. But the new IO APIs and the Streams with Lambdas in Java 8 help a lot. And it’s typed… If you’re into that kind of stuff. 🙂 You can of course do this in Groovy or anything else that runs on the JVM, but be aware that many of those languages take some time to bootstrap and you’ll notice that every time you run the CLI.
  • Maven-shade-plugin. It packages your code together with all the dependencies to one binary.
  • Maven-really-executable-jars-plugin. It modifies the single jar with a zip-compliant header that lets you skip the “java -jar …” part of executing it every time.
  • Docopt-java. It makes writing, validating and parsing command line arguments extremely easy and fun.
  • Docopt-completion. Once you have your kick ass CLI, add some kick ass tab-completion. 😉

Show me code!

You can see an example of all of this (Java and Maven required) at: https://github.com/anderssv/executable-json-util .

Java migrations tools

Wow, it’s been a while. If you’re interested in good links follow me on Twitter: http://twitter.com/anderssv . I usually update there these days.

My talk on “Agile deployment” got accepted for JavaZone this year! I’m extremely happy, but a bit scared too. 🙂 I’ll be talking about rolling out changes in a controlled manner, and one of the things that is usually neglected in this scenario is the database side. I’ll cover stuff like packaging and deployment of the application too, but that’s probably the area where I know the least. The database side of things is really sort of my expertise.

I have written some blog posts on this already, and in relation to the talk and things at work I did a quick search for Java migration tools. I have used DBDeploy earlier, but there are now a couple of other contenders. Here’s my list so far of tools that work on SQL deltas that can be checked into SCM:

  • DBDeploy – Tried, few features but works well. Ant based.
  • DbMaintain – Probably has the most features. Ant based.
  • c5-db-migration – Interesting alternative, similar to DBDeploy. Maven based.
  • scala-migrations – Based on the Ruby on Rails migrations. Interesting take.
  • migrate4j – Similar to Scala Migrations, but implemented in Java.
  • Bering – Similar to Scala Migrations, and looks a lot like Migrate4J

I’ll definitely be looking into DbMaintain and c5-db-migration soon. DbMaintain looks promising, or I might just contribute some features to DBDeploy. I’ll let you know how it went. 🙂

(updated with scala-migrations, bering and migrate4j after first post)

Agile deployment talk retro

On Wednesday I did a talk at the Norwegian Java User Group about agile deployment. The slides (in Norwegian) are available here as well as embedded on the bottom of this post.

From the comments and questions I got afterwards, I could see how I should have included more detail. That would have made it even more interesting for that kind of crowd. I probably also should have clarified that I had limited time to prepare and that this was just a slightly extended version of a lightning talk I held at XP Meetup last year. I hope to get the chance to correct this in a JavaZone talk with more details. If you did see the talk and have comments please do leave them at the bottom of the page. 🙂

Many of the questions I got revolved around the handling of the database, so I just thought I should give some pointers here to articles that better describe what I have been up to:

Check them out if you’re curious.

Migrations for Java

If you are familiar with Ruby on Rails you know what Migrations are. The same thing can and should be done in Java; it’s just not that well known.

Why migrations? Because it enables you to automatically update any environment you have to the latest version. And this is done through source control closely tied to the code. This means that every time a developer checks in a change to the database it gets propagated to all the developer sandboxes. The person responsible for deploying into the test environment doesn’t have to know which changes are to be applied or not. This is an automatic process that has been tested by CI already. But more on this later.

Simple migration

For those of you that don’t know what migrations are, a migration is a change set. An example would be a change that adds a column to a table. Writing a migration is then writing either code or SQL that represents an alter table add column statement. You do not update the original create table statement. For RoR this means some Ruby code, and with the current tools in Java it means writing SQL. I’m not quite sure which one of them I actually prefer, but when making changes to tables with millions of rows I prefer to have as much control of the SQL as possible. The rest of this entry will be focused on the Java way with writing SQL.

So when I want to make a change to the database I would create a new file called something like 0325_add_comment_column_to_user.sql :

ALTER TABLE PERSON ADD COLUMN COMMENT VARCHAR2(255);

You might notice that this is not portable. VARCHAR2 is an Oracle specific thing. How often do you change database implementations? Not very often, and your code is still vendor independent since you’re using iBatis or Hibernate. Also, during a change of databases you need to port the schema, which will then also include this change because you will port the base line.

Base line

Base line you say? For these migrations to work you will need something that is already there. You obviously can’t do an alter table without a create table first. The create table migration is a bit back in time, but it has been done. So if everything that is done in your database is migrations in separate files you will get a lot of scripts after some time. To prevent this, at regular intervals generate a new script containing the schema that represents the current state of your database. Generate it from production, or some “production like” environment. Then you can delete all the scripts that lead to that base line, and keep adding new scripts.

This also makes it easier to create new instances of your database. You first run the schema from the base line. Say this base line includes everything up to revision 307. Once you have done this, you need to apply the migrations that are still missing, from 308 up to 325, which we created above.
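The selection of scripts on top of a base line can be sketched in a few lines of Java. This is only an illustration of the idea: the `PendingMigrations` class is hypothetical, and it assumes every file name starts with a number followed by an underscore, like the one above.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Sketch only: picks the scripts that still need to run on top of a base line. */
final class PendingMigrations {

    private PendingMigrations() {
    }

    /** Reads the leading number of a name such as "0325_add_comment_column_to_user.sql". */
    static int revisionOf(String fileName) {
        // Assumes the "<number>_<description>.sql" naming convention.
        return Integer.parseInt(fileName.substring(0, fileName.indexOf('_')));
    }

    /** Returns the scripts numbered above the applied revision, lowest number first. */
    static List<String> after(int appliedRevision, List<String> fileNames) {
        List<String> pending = new ArrayList<>();
        for (String name : fileNames) {
            if (revisionOf(name) > appliedRevision) {
                pending.add(name);
            }
        }
        pending.sort(Comparator.comparingInt(PendingMigrations::revisionOf));
        return pending;
    }
}
```

Calling `PendingMigrations.after(307, fileNames)` would then return only the scripts numbered 308 and up, in the order they should be applied.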

The harder migrations

One important concept with migrations is that you can’t just think about the schema. You will also need to take into consideration the data in tables. This wasn’t really that hard when adding a column that can be null. If we were adding a column that had a ‘not null’ constraint we would need to figure out what to fill in. Maybe this was a default, or maybe you would have to calculate the new values. To achieve this you must write PL/SQL or something similar, which would scare a lot of developers off. I personally think that we should all know enough database stuff to be able to do this. It is not just useful in creating migrations, but for gaining a better understanding of databases and how we can use them effectively. I wish I could show you a harder migration here, but I actually don’t have anything to experiment on at hand, so I just hope you get the picture.
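To give at least a rough idea of the ordering involved, here is a small sketch of the simplest case: a new mandatory column that gets the same default value in every existing row. In practice this would just be a plain SQL file; the Java helper below is only there to spell out the three steps. The `NotNullColumnMigration` class, the STATUS column and the 'ACTIVE' default are all made up for illustration, and the statements assume Oracle-style syntax.

```java
import java.util.Arrays;
import java.util.List;

/** Sketch only: the three steps for adding a mandatory column to a table that already has rows. */
final class NotNullColumnMigration {

    private NotNullColumnMigration() {
    }

    /**
     * Builds Oracle-style statements in the order they have to run.
     * The default value is passed in as a ready-made SQL literal, e.g. "'ACTIVE'".
     */
    static List<String> statements(String table, String column, String type, String defaultLiteral) {
        return Arrays.asList(
                // 1. Add the column as nullable so existing rows stay valid.
                "ALTER TABLE " + table + " ADD (" + column + " " + type + ")",
                // 2. Fill in a value for every existing row.
                "UPDATE " + table + " SET " + column + " = " + defaultLiteral
                        + " WHERE " + column + " IS NULL",
                // 3. Only now is it safe to enforce the constraint.
                "ALTER TABLE " + table + " MODIFY (" + column + " NOT NULL)");
    }

    public static void main(String[] args) {
        // Hypothetical example values.
        statements("PERSON", "STATUS", "VARCHAR2(20)", "'ACTIVE'").forEach(System.out::println);
    }
}
```

When the new value has to be calculated per row, step 2 is where the PL/SQL (or a more elaborate UPDATE) would go; the order of the steps stays the same.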

The tools

We are using a simple tool called DBDeploy. It requires a table in the database that records which revisions have already been applied, and it uses this information to find out which numbered scripts on disk are to be applied. In the case of the base line above that would be scripts 308 through 325. It does no processing of the scripts; it just puts them together in a new file with some updates to the changelog table in between. This means that you can pretty much write whatever you want inside these scripts. The way to see if it is working is to run the resulting file. Because DBDeploy doesn’t actually execute anything, we have made a simple Ant integration with sqlplus for executing the statements. We used to call the sql task in Ant, but found that we needed some of the flexibility that sqlplus offers to be able to perform these migrations in a good way.
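The "put the scripts together with changelog updates in between" idea can be sketched roughly like this. To be clear, this is not DBDeploy’s actual code or output format: the `CombinedScript` class is hypothetical, and the CHANGELOG table layout with a single CHANGE_NUMBER column is an assumption made for the example.

```java
import java.util.Map;
import java.util.TreeMap;

/** Sketch only: not DBDeploy's real code, just the idea of one combined output file. */
final class CombinedScript {

    private CombinedScript() {
    }

    /**
     * Concatenates the scripts in ascending number order and records each one
     * in a changelog table. The column name CHANGE_NUMBER is an assumption.
     */
    static String build(Map<Integer, String> scriptsByNumber) {
        StringBuilder out = new StringBuilder();
        // A TreeMap iterates in ascending key order, whatever order the input had.
        for (Map.Entry<Integer, String> entry : new TreeMap<>(scriptsByNumber).entrySet()) {
            out.append("-- script ").append(entry.getKey()).append('\n');
            // The script body is copied as-is, without any parsing.
            out.append(entry.getValue().trim()).append('\n');
            out.append("INSERT INTO CHANGELOG (CHANGE_NUMBER) VALUES (")
               .append(entry.getKey()).append(");\n");
        }
        return out.toString();
    }
}
```

The resulting string is what would be written to a file and handed to something like sqlplus for execution.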

We are also using Maven to package and release these scripts in the same manner as we release code. That way we can always go back to see what DB state equals the code, and we can pull down official releases from the Maven repository. It also integrates nicely into a Continuous Integration scheme. Every time someone checks in a change it is automatically tested against a separate database. This way we catch most errors instantly and can fix them while we still remember what we were doing.

Some special cases

Simple is not easy. So there are some things that you need to be aware of, and some things that are not really handled that well. I do however believe that it is worth it regardless. Some of the things we have discovered:

  • You can not change a migration after it is checked in to source control – The script has already been applied to a database and will not be run again no matter how many changes you make to the script after the first check in. There are some exceptions to this, but the basic rule is that it can not be changed if the first script was successful. You might need to write an alter table statement for the alter table you just did.
  • Branches – Since the scripts are number based there will be a problem when merging a branch into your trunk. What if both trunk and branch have a number 62, but with different contents? First of all, DBDeploy will fail because it finds two scripts with the same number. So you’ll need to rename the script that you just merged in with a different number. But you can’t give it the last number, since some of the other scripts without conflicts from the branch might be dependent on the one with a conflict. So you should always make sure the change in the branch uses a number higher than the one in trunk, and merge that as fast as possible down into trunk so no one else uses that number. Not very good, I admit, but it’s the best solution so far.
  • Synonyms – Synonyms will be dependent on the schema names. This is not something you must do with DBDeploy, but we prefer to keep our scripts independent of user names. This enables us to have several instances for different purposes in the same environment. We have actually added an Ant search and replace to enable us to write scripts with synonyms, but replace with the correct db schema at runtime.
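The branch problem in particular is easy to check for before the tool trips over it. Below is a minimal sketch of such a check; the `DuplicateNumbers` class is hypothetical and again assumes file names of the form number, underscore, description.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Sketch only: finds script numbers that are used by more than one file. */
final class DuplicateNumbers {

    private DuplicateNumbers() {
    }

    /** Maps each clashing number to the file names that share it. */
    static Map<Integer, List<String>> find(List<String> fileNames) {
        Map<Integer, List<String>> byNumber = new TreeMap<>();
        for (String name : fileNames) {
            // Assumes the "<number>_<description>.sql" naming convention.
            int number = Integer.parseInt(name.substring(0, name.indexOf('_')));
            byNumber.computeIfAbsent(number, n -> new ArrayList<>()).add(name);
        }
        // Keep only the numbers that occur more than once.
        byNumber.values().removeIf(names -> names.size() < 2);
        return byNumber;
    }
}
```

Run right after a merge, a non-empty result tells you which scripts need renumbering before anyone else builds on top of them.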

The stuff I have mentioned here is very much still a work in progress, and my goal is that we will be able to change our database as we need instead of putting it off until it becomes a separate migration/rewrite project. We need to be able to make improvements with as little hassle as possible so our systems won’t rot away.

Clover saving time in development

I read this post about Clover and using it to minimise the number of tests run. A nice idea, so I decided to have a go at it.

What it does is use the test coverage data that it was originally written to collect to figure out which tests exercise which classes. So when you change Class1 and Class2 it knows that Test1, Test2, Test3 and Test4 need to be rerun to check if you have broken anything.
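The selection idea itself is simple enough to sketch. This is not how Clover is implemented, just my own illustration of the principle: keep a map from each test to the classes it touched on the last run, and rerun the tests whose set overlaps with what changed. The `TestSelection` class is hypothetical.

```java
import java.util.Collections;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Sketch only: not how Clover is implemented, just the selection idea. */
final class TestSelection {

    private TestSelection() {
    }

    /**
     * @param coverage       test name mapped to the classes it exercised in an earlier run
     * @param changedClasses classes modified since that run
     * @return the tests that touch at least one changed class, sorted by name
     */
    static Set<String> testsToRun(Map<String, Set<String>> coverage, Set<String> changedClasses) {
        Set<String> selected = new TreeSet<>();
        for (Map.Entry<String, Set<String>> entry : coverage.entrySet()) {
            // disjoint() is true when the two collections share no element.
            if (!Collections.disjoint(entry.getValue(), changedClasses)) {
                selected.add(entry.getKey());
            }
        }
        return selected;
    }
}
```

A test that has never been run has no coverage entry and would not be selected here, so a real tool needs to treat new tests specially.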

I did an initial test on our build which takes almost 10 minutes. I’m of the slightly paranoid type so I like to run the full build to verify my changes, at least before I check in. With changes in one class Clover figured it should rerun about 20 tests, and ran in about 4 minutes. That’s 6 minutes saved many times a day for each developer. We’re not using it at our build server just yet, but trying to save time for each developer before commit.

Any downsides? Of course. The initial time (mvn clean removes all info) to build is almost doubled for my project. You won’t catch all errors, and updates to dependencies won’t be caught. And there is a problem handling deleted classes. For some reason it creates an optimized-src directory which is a copy of all your sources and compiles from there. If you delete a class in your src folder it won’t be deleted in optimized-src and you could get compilation errors. After some initial tests it seems these problems are bigger in theory than they are in real-life situations.

Using Clover should be no excuse for bad tests though. I know all too well how I can really mess up my own tests. The long test times often stem from our inability to focus on testing the logic separately from infrastructure such as databases or external services. And of course loading the Spring context is done way too much. But that’s stuff for another post.

But even if you have good unit tests there will be time to save. And everything that can keep me from getting distracted when developing is a good thing.

I’ll give this a good run until the 30 day trial licence expires, and maybe invest. Maybe we will see something similar in Cobertura too…

Update: It looks like it is also activated when doing mvn eclipse:eclipse . Because it redefines the source folders to target/clover/src-optimized this is also what is written in your Eclipse project. So be sure to have an easy way to disable it if you’re going to use this.

JavaZone 2008 is over

So JavaZone 2008 is over. Had a blast, and saw lots of cool stuff. It was a bit crowded at times, and a few too many talks were full, but all in all good. Just a short summary of the good stuff:

  • Heidi Arnesen Austlid on Open Source in the public sector – The government in Norway has a strong preference for Open Source. The motivation for this is to reduce costs, enable exchange of information through open standards and take back control of their IT systems.
  • Rickard Öberg on Qi4J – A good introduction to the component oriented stuff Qi4J is built upon. Everything is compiled by the Java compiler, and everything is refactorable. Really nice stuff, that will be extremely interesting once it matures.
  • Mary Poppendieck on the Double Paradox of Lean Software Development – Mary is always interesting. Utilisation is not the thing to strive for, throughput is. In fact if you maximise utilisation for the expected you have no capacity to handle the unexpected and performance will suffer severely when the unexpected occurs.
  • Robert C. Martin on functions for Clean Code – Uncle Bob is also one of those really good speakers that are always entertaining. A good talk on the basics of function design and how to make functions readable and maintainable. Most of us have a lot to learn about pretty basic things, and a lot of this basic training in good programming (often good OO) is ignored in our education.

Reviewing the program I now see that I have missed more good talks than I really wanted. A mix of bad planning, beer and walking around meeting people will have to take the blame. 😉 I hope they publish most talks as videos later on.

Great conference, see you next year. 🙂

Shorter turnaround with JavaRebel

All of us have felt the pain of waiting for the web-container to reload after just a minor change to the code. Well, recently I came across JavaRebel from ZeroTurnaround and they seem to have solved our problem! And if you, like most of us, enjoy Spring you’ll also enjoy their Spring plug-in to monitor your context-files.

As they put it:

JavaRebel is a developer tool that will reload changes to compiled Java classes on-the-fly saving the time that it takes to redeploy an application or perform a container restart. It is a generic solution that works for Java EE and Java standalone applications.

Promise me you’ll have a look at their screencasts (found on the right hand side of the page). They also have a nice list of features and supported JVMs and web-containers.

They also have a Google group that answers questions you may have.

I haven’t tried it yet but am truly looking forward to it. :-)
And if you do, please post a comment and let me know how it went.