GeeCon - managing remote projects

2012-05-24 08:05
In his talk on visibility in distributed teams, Pawel Wrzeszcz explained why working remotely can be beneficial for both employees (less commute time, more family time) and employers (hiring worldwide instead of locally brings in more talent). He then went into more detail on some best practices that have worked for his company as well as for himself.

When it comes to managing your energy, the trick mainly is to find the right balance between isolating work from private life (by having a separate area in your home and a daily routine with fixed start and end times) and integrating work into your daily life by loving what you do: the more boring your job is, the less likely you are to succeed when working remotely.

There are three aspects to working remotely successfully:

  • Distributed meetings – essentially: minimize them. Have more one-on-one meetings to clear up open questions, and let technology support you where necessary (Skype is nice for calls with up to ten people; they also tried Google Hangouts, TeamSpeak and others – take whatever works for you and your colleagues).
  • Group decisions – use online brainstorming tools. A wiki will do, so will Google Docs, and there's fancier stuff should you need it. Asynchronous brainstorming can work.
  • Asynchronous communication channels – learn to value them. Avoid mail; wikis, issue trackers etc. are much better suited for longer, documentation-like communication.

Essentially what will happen is that issues within your organisation are revealed much more easily than when working on-site.

GeeCon - failing software projects fast and rapidly

2012-05-23 08:04
My second day started with a talk on how to fail projects fast and rapidly. There are a few tricks for doing that, relating to different aspects of your project. Let's take a look at each of them in turn.

The first measures to take to fail a project are really organisational:

  • Refer to developers as resources – that will demotivate them and express that they are replaceable instead of being valuable human beings.
  • Schedule meetings often and make everyone attend. However, cancel them on short notice, do not show up yourself, or come unprepared.
  • Make daily standups really long – 45min at least. Or better yet: Schedule weekly team meetings at a table, but cancel them as often as you can.
  • Always demand Minutes of Meeting after the meeting. (Hint: yes, they are good to cover your ass; however, if you have to do that, your organisation is screwed anyway.)
  • Plans are nothing, planning is everything – however, planning should be done only by the most experienced. Estimation does not have to happen collectively (that only leads to the team feeling like they promised something); rather, have estimates made by the most experienced manager.
  • Control all the details, assign your resources to tasks and do not let them self-organise.


When it comes to demotivating developers, there are a few more things beyond the obvious criticising in public that will help destroy your team culture:

  • Don't invest in tooling – the biggest screen, fastest computer, most comfortable office really should be reserved for those doing the hard work, namely managers.
  • Make working off-site impossible or really hard: Avoid having laptops for people, avoid setting up workable VPN solutions, do not open any ssh ports into your organisation.
  • Demand working overtime. People will become tired, they'll sacrifice family and hobbies, guess how long they will remain happy coders.
  • Blindly deploy coding standards across the whole company and have those agreed upon by a committee. We all know how effective committee-driven design (thanks to Pieter Hintjens for that term) is. Also demand 100% test coverage, forbid test driven development, forbid pair programming, and demand 100% JUnit coverage.
  • And of course check quality and performance as the very last thing in the development cycle. While at it, avoid frequent deployments and do not let developers onto production machines – not even with read-only access. Don't do small releases, let alone continuous deployment.
  • As a manager, when rolling out changes: forget about retrospectives and incremental change. Roll out big changes all at once.
  • As a team lead, accept broken builds; don't stop the line to fix a build – rather have one guy fix it while the others continue to add new features.


When it comes to architecture, there are a few sure ways to project death that you can follow to kill development:

  • Enforce framework usage across all projects in your company. Do the same for editors, development frameworks, databases etc. Instead of using the right tool for the job standardise the way development is done.
  • Employ a bunch of ivory tower architects that communicate with UML and Slide-ware only.
  • Remember: We are building complex systems. Complex systems need complex design. Have that design decided upon by a committee.
  • Communication should be system agnostic and standardised – why not use SOAP's XML over HTTP?
  • Use Singletons – they'll give you tightly coupled systems with a decent amount of global state (see the sketch right below this list).
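To make that last point concrete – this is my own minimal sketch, not something shown in the talk – here is the kind of hidden coupling a Singleton full of mutable state buys you:

// Minimal sketch (not from the talk): a Singleton carrying mutable state.
// Every class that touches it becomes invisibly coupled to every other one.
public final class GlobalConfig {

    private static final GlobalConfig INSTANCE = new GlobalConfig();
    private String endpoint = "http://localhost:8080";

    private GlobalConfig() {}

    public static GlobalConfig getInstance() { return INSTANCE; }

    public String getEndpoint() { return endpoint; }
    public void setEndpoint(String endpoint) { this.endpoint = endpoint; }
}

class OrderService {
    void submit() {
        // hidden dependency: behaviour changes whenever anyone, anywhere,
        // mutates the Singleton - hard to test in isolation
        System.out.println("submitting order to " + GlobalConfig.getInstance().getEndpoint());
    }
}

class TestSetup {
    void pointAtStagingServer() {
        // this call silently affects every other user of the Singleton
        GlobalConfig.getInstance().setEndpoint("http://staging:8080");
    }
}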


When it comes to development, we can also make life very hard for developers:

  • Don't establish best practices and patterns – there is no need to learn from past failure.
  • We need no definition of done – everyone knows when something is done and what exactly that means, right?
  • We need no common language – in particular not between developers and business analysts.
  • Don't use version control – or rely on ClearCase.
  • Don't do continuous integration.
  • Have no collective code ownership – instead have each module modified by only one developer and forbid others to contribute. That leaves you with a nice bus factor of 1.
  • Don't do pair programming to spread the knowledge. See above.
  • Don't do refactoring – rather get it right from the start.
  • Don't do non-functional requirements – something like “must cope with high load” is enough of a specification. Also put all testing at the end of the development process and do lots of manual testing (after all, machines cannot judge quality as well as humans can, right?). Postpone all difficult pieces to the end – with a bit of luck they get dropped anyway. Also test evenly – there is no need to test the more important or more complex pieces more heavily than others.

Disclaimer for those who do not understand irony: the speaker, Thomas Sundberg, is very much into the agile manifesto, agile principles and XP values. The fun part of irony is that you can invert the meaning of most of what is written above and get some good advice on not failing your projects.

GeeCon - TDD and its influence on software design

2012-05-22 08:04
The second talk I went to on the first day was on the influence of TDD on software design. Keith Braithwaite did a really great job of first introducing the concept of cyclomatic complexity and then showing, using Hudson as well as many other open source Java projects as examples, that the average cyclomatic complexity in those projects is actually pretty close to one and, when plotted over all methods, pretty much follows a power law distribution. Comparing the shapes of these complexity distributions across projects, he found that the less steep the curve is – that is, the more balanced the distribution and the fewer really complex pieces there are in the code – the more likely developers are to be happy with the current state of the code. Not only that: the distribution also shifts towards something more balanced after refactorings.

Looking at a selection of open source projects, he then analysed the alpha of the cyclomatic complexity distribution for projects that have no tests at all, projects that have tests, and projects developed according to TDD. It turns out the latter were the ones with the most balanced alpha.
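Keith didn't show how such an exponent is computed, but to make the "alpha" tangible here is a rough sketch of my own: it estimates the exponent of a power law from a list of per-method cyclomatic complexities using the usual crude maximum-likelihood estimator. The numbers are made up – in practice they would come from a static analysis tool.

// My own illustration, not from the talk: crude estimate of the power law
// exponent alpha from per-method cyclomatic complexities, using the
// continuous maximum-likelihood estimator alpha = 1 + n / sum(ln(x_i / x_min)).
public class ComplexityAlpha {

    static double estimateAlpha(int[] complexities, int xMin) {
        double logSum = 0.0;
        int n = 0;
        for (int c : complexities) {
            if (c >= xMin) {
                logSum += Math.log((double) c / xMin);
                n++;
            }
        }
        return 1.0 + n / logSum;
    }

    public static void main(String[] args) {
        // made-up values: most methods trivial, a few complex -
        // the typical shape described in the talk
        int[] perMethod = {1, 1, 1, 2, 1, 1, 3, 1, 2, 1, 1, 7, 1, 4, 1, 2, 1, 12};
        System.out.println("estimated alpha: " + estimateAlpha(perMethod, 1));
    }
}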

GeeCon - Randomized testing

2012-05-21 08:02
I arrived late, during lunch time on Thursday, at GeeCon – however just in time to listen to one of the most interesting talks when it comes to testing. Did you ever write code that ran well in your development environment but crashed as soon as it was rolled out at customers, only to find out that their Locale setting was causing the issue? Ever had to deal with random test failures because, against better advice, your tests depended on an execution order that is almost guaranteed to differ on new JVM releases?

The Lucene community has encountered many similar issues. In effect they are faced with having to test a huge number of different configuration combinations in order to make sure that their software runs in all client setups. In recent months they developed an approach called randomised testing to tackle this problem: essentially, on each run “random tests” are executed multiple times, each time with a slightly different configuration and input, and in a different environment (e.g. Locale settings, time zones, JVMs, operating systems). Each of these configurations is pseudo-random – however, on test failure the framework reveals the seed that was used to initialise the pseudo-random number generator, allowing you to reproduce the failure deterministically.
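To illustrate the core idea – this is just a bare-bones sketch of mine, not the actual framework – all randomness is derived from a single seed, and that seed is reported on failure so the exact run can be replayed:

// Bare-bones sketch of the seed idea (not the Lucene/Carrot Search framework):
// derive every random choice from one seed and print that seed on failure,
// so the failing combination can be reproduced deterministically.
import java.util.Locale;
import java.util.Random;

public class SeededRandomCheck {

    public static void main(String[] args) {
        // pass a seed on the command line to replay a failure, otherwise pick a fresh one
        long seed = args.length > 0 ? Long.parseLong(args[0]) : new Random().nextLong();
        Random random = new Random(seed);

        Locale[] locales = Locale.getAvailableLocales();
        Locale locale = locales[random.nextInt(locales.length)];
        int input = random.nextInt(1000) - 500;

        try {
            // stand-in for the code under test: format a number and parse it back;
            // with some locales this round trip really does break - exactly the
            // kind of surprise randomised testing is meant to surface
            String formatted = String.format(locale, "%d", input);
            if (Integer.parseInt(formatted) != input) {
                throw new AssertionError("round trip failed: " + formatted);
            }
        } catch (Throwable t) {
            System.err.println("reproduce with seed " + seed
                    + " (locale=" + locale + ", input=" + input + ")");
            throw t;
        }
    }
}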

The idea itself is not new: published in a paper by Ntafos and used by fuzzers to identify security holes in applications, this kind of technique is pretty well known. Applying it to writing tests, however, is a new idea put to use at Lucene.

The advantage is clear: with every new run of the test suite you gain confidence that your code is actually stable against any kind of user input. The downside, of course, is that you will discover all sorts of issues and bugs, not only in your code but also in the JVM itself. If your library is used in all sorts of different setups, fixing these issues upfront is crucial to avoid users being surprised that it does not work well in their environment. Make sure to fix these failures quickly though – developers tend to ignore flickering tests over time. Adding randomness – and thereby essentially increasing the number of tests in your test suite – will also add to the amount of effort you have to invest in fixing broken code.

Dawid Weiss gave a great overview of how random tests can be used to harden a code base. He introduced the test framework written at Carrot Search that isolates the random testing features: it comes with a RandomizedRunner implementation that can be used to substitute JUnit's own runner. It is capable of enforcing test isolation by tracking spawned threads that might be leaking out of tests. In addition it provides utilities, for instance for creating random strings, locales and numbers, as well as annotations to denote how often a test should run and when it should run (always vs. nightly).
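A test using that runner might look roughly like the sketch below – I'm reconstructing the class and method names from memory of the randomizedtesting project, so double-check them against the project documentation before relying on them:

// Rough sketch only - annotation and helper names are from memory of the
// Carrot Search randomizedtesting project and may differ from the current API.
import static org.junit.Assert.assertEquals;

import com.carrotsearch.randomizedtesting.RandomizedRunner;
import com.carrotsearch.randomizedtesting.RandomizedTest;
import com.carrotsearch.randomizedtesting.annotations.Repeat;
import org.junit.Test;
import org.junit.runner.RunWith;

@RunWith(RandomizedRunner.class)
public class StringReverseRandomTest extends RandomizedTest {

    @Test
    @Repeat(iterations = 100)
    public void reversingTwiceIsIdentity() {
        // input is derived from the per-test seed; on failure the runner
        // reports that seed so the exact input can be reproduced
        String input = randomAsciiOfLengthBetween(0, 50);
        String reversed = new StringBuilder(input).reverse().toString();
        assertEquals(input, new StringBuilder(reversed).reverse().toString());
    }
}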

So when your tests have random input – how do you check for correctness? The most obvious option is to check an exact property of the output where you can: when testing a sorting method, no matter what the implementation and the input are, the output should always be sorted, which is easy enough to check. Checking against a simpler, but perhaps more expensive, reference algorithm is another option.
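As a small illustration of both ideas – the sorted-output invariant plus a comparison against a trusted reference implementation (my own sketch, not from the talk):

// Sketch: property checks for a sort under random input - the output must be
// ordered and must match what a known-good reference implementation produces.
import java.util.Arrays;
import java.util.Random;

public class SortPropertyCheck {

    public static void main(String[] args) {
        long seed = new Random().nextLong();
        Random random = new Random(seed);
        int[] input = random.ints(random.nextInt(1000), -1000, 1000).toArray();

        // stand-in for the sort implementation you actually want to test
        int[] actual = input.clone();
        Arrays.sort(actual);

        // invariant: every element is <= its successor
        for (int i = 1; i < actual.length; i++) {
            if (actual[i - 1] > actual[i]) {
                throw new AssertionError("not sorted at index " + i + ", seed " + seed);
            }
        }

        // reference check against a trusted (here: the JDK's) implementation
        int[] expected = input.clone();
        Arrays.sort(expected);
        if (!Arrays.equals(expected, actual)) {
            throw new AssertionError("differs from reference, seed " + seed);
        }
        System.out.println("ok, seed " + seed);
    }
}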

A second approach is to do sanity checks: Math.abs() should, at the very least, never return a negative number. The third approach is to do no checking at all in some cases. Why would that help? You'd be surprised by how many failures and exceptions you get simply by using your API in unexpected ways or feeding your program unexpected input. This kind of behaviour checking does not need any assertions.
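A minimal sketch of the sanity-check flavour (again my own illustration): feed in lots of random values and assert only a very coarse property – or nothing beyond "does not blow up":

// Sketch: sanity checking only - random inputs, one coarse assertion.
// (Fun fact: Math.abs(Integer.MIN_VALUE) is negative due to overflow,
// so even a check this coarse can trip over a real edge case.)
import java.util.Random;

public class AbsSanityCheck {

    public static void main(String[] args) {
        long seed = new Random().nextLong();
        Random random = new Random(seed);

        for (int i = 0; i < 1_000_000; i++) {
            int value = random.nextInt();
            if (Math.abs(value) < 0) {
                throw new AssertionError("abs(" + value + ") is negative, seed " + seed);
            }
        }
        System.out.println("ok, seed " + seed);
    }
}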

Note: Really loved the APad/iMiga that Dawid used to give his talk! Been such a long time since I last played with my own Amiga...

GeeCon 2012 - part 1

2012-05-20 11:02
Devoxx, Java Posse, Qcon, Goto Con, an uncountable number of local Java User Groups – aren't there enough conferences on just Java, that weird programming language that “makes developers stupid by letting them type too much boiler plate” (Keith Braithwaite)? I spent Thursday and Friday last week in Poznan at a conference called GeeCon – its main focus is on anything Java, including TDD, agile and testability. It's all community organised – switching between Poznan and Krakow on a yearly basis, backed by the two corresponding Java User Groups, with a clear focus on good speakers and interesting content. Really well done; I wish they could have fit more talks into each of the days: five tracks in parallel left one with just around four regular talks plus keynotes per day. That does make for very humane start and end times – but with so much going on in parallel you most likely miss some of the particularly interesting content. Looking forward to the videos!

One note: if you are ever invited as a speaker to GeeCon, do accept! It's really well organised, has an incredibly friendly atmosphere, and a really tasty speaker's dinner. One thing that caught me by surprise this morning: my room was all paid for, even though I stayed longer and had offered to cover the additional nights myself. Thanks guys, you rock!

Watch this space for more details on the talks in the coming days.

Presentation shortening

2012-05-15 20:23
In an effort to make room for more talks in our schedule for this year's Berlin Buzzwords we've asked quite a few people to shorten their presentations from 40 down to 20 minutes. The thought behind it is not only to give more people a chance to talk about their work, but also to have those shorter talks focused down to the essential information people should take away.

However I've seen people who give awesome 45-minute presentations fail miserably when forced to cut down their talk – and I have myself delivered a very weak five-minute Ignite presentation.

As a result I thought it might be a good idea to share some thoughts on how to go about shortening your talk and still deliver a convincing performance:

First of all, don't take your usual 40-minute talk and simply cut away slides. As obvious as it may seem that this results in poor slides, it's still all too tempting to take a working long presentation and just throw away some content to make it shorter. What really happens is that people either cut out the meat – which leaves a shallow, brief introduction and not much else – or the meat is left in with too little around it to help listeners understand what the talk is all about. Speakers may also be tempted to leave well-working jokes in: don't do that without thinking twice – some jokes take long to prepare, and if you cut away all the preparation the fun is gone as well. Some people cut demos down to briefly switching to the browser and then back to the slides – if you like the demo and think it's worthwhile, take your time for the demo and shorten elsewhere. No one benefits from briefly seeing a browser window with not much of an application in it.

So how should you go about it when asked to cut down your talk? First of all: think about the main message you want to deliver. What is the core piece of knowledge people should take away from your talk? From there, build up your story and provide all the detail necessary for the audience to follow it.

That does not necessarily mean throwing out all Greek symbols because maths is just too hard to explain briefly – if they are needed, leave them in, take the time to explain them, and build up equations as you go.

It also doesn't mean that you should cover only the very basics. Clearly label your talk as advanced whenever that is both appropriate and possible – build on your audience's knowledge without repeating all the nitty-gritty details. It can help to openly ask simple yes/no questions at the beginning and have people raise their hands, to find out whether they are familiar with a certain technology. Knowing your attendees' background can save you a lot of time when preparing a talk.

One final piece of advice: there's one book that once helped me a lot to improve my own talks, called Presentation Zen – if you don't know it yet, it certainly is well worth reading.

PS: Dear speakers, if you are reading this but have not yet fully read the speaker acceptance notification mail - please do so now - I promise it does contain information that is valuable for you to know in particular if your employer happens to sponsor your travel to the conference.

Traveling to Berlin in June? Update: No airport changes!

2012-05-01 09:23
Update: Seems like there won't be any airport changes for Berlin Buzzwords: German article at Tagesspiegel on postponing airport opening.

If you are planning to travel to Berlin in June – e.g. to attend Berlin Buzzwords – note that there is a major change to airports happening on June 2nd:

Saturday, June 2nd will be the last day on which both Schönefeld Airport (SXF) and Tegel Airport (TXL) are open. All planes departing from TXL that day will arrive at SXF in the evening.

The morning after (Sunday, June 3rd), airport Berlin Brandenburg International (also known as BBI, IATA code BER) is going to open. This airport is located very close to Schönefeld; there will be trains and buses connecting it to the city.

Airlines should handle this change transparently. However, when arriving at TXL make sure to check which airport you are departing from, to avoid ending up in front of closed doors ;) Also, should you be arriving from the US, keep in mind that there will be a few more direct connections to Berlin starting June 3rd – e.g. Air Berlin will offer multiple daily flights to and from New York and Miami.

When travelling from the airport to the conference by public transport, keep in mind that for TXL you only need a ticket covering zones A and B – for SXF and BER you need to purchase a ticket that is valid for zones A, B and C.

Travelling from TXL to the conference venue and speaker hotel by cab costs roughly 30 Euros. From BER the fare is roughly 50 Euros.

Berlin Buzzwords Schedule online - book your ticket now

2012-04-30 10:29
As of the beginning of last week the Berlin Buzzwords schedule is online. The program committee has completed reviewing all submissions and set up a schedule containing a great lineup of speakers for this year's Berlin Buzzwords. Among the speakers are Leslie Hawthorn (Red Hat), Alex Lloyd (Google), Michael Busch (Twitter) as well as Nicolas Spiegelberg (Facebook). Check out the program in the online schedule.

Berlin Buzzwords standard conference tickets are still available. Note that we also offer a special rate for groups of five or more attendees, with a 15% discount off the standard ticket price. Make sure to book your ticket now: ticket prices will rise by another 100 Euros for last minute purchases in three weeks!

“Berlin Buzzwords is by far one of the best conferences around if you care about search, distributed systems, and NoSQL...” says Shay Banon, founder of ElasticSearch.

Berlin Buzzwords will take place June 4th and 5th 2012 at Urania Berlin. The third edition of the conference for developers and users of open source projects again focuses on everything related to scalable search, data analysis in the cloud and NoSQL databases. We are bringing together developers, scientists, and analysts working on innovative technologies for storing, analysing and searching today's massive amounts of digital data.

Berlin Buzzwords is organised by newthinking communications GmbH in collaboration with Isabel Drost (Member of the Apache Software Foundation, PMC member Apache community development and co-founder of Apache Mahout), Jan Lehnardt (PMC member Apache CouchDB) and Simon Willnauer (Member of the Apache Software Foundation, PMC member Apache Lucene).

More information, including speaker interviews, ticket sales, press information as well as "meet me at bbuzz" buttons, is available on the official Berlin Buzzwords website.

Looking forward to meeting you in June.


PS: Did I mention that Berlin is all beautiful in Summer?

Berlin Hadoop Get Together (April 2012) - videos are up

2012-04-23 14:22

Second steps with git

2012-04-22 20:34
Leaving this here in case I'll search for it later again - and I'm pretty sure I will.

The following is a simplification of the git workflow detailed earlier – in particular of the first two steps – plus a little background.

Instead of starting by cloning the github repository and then going from there as follows:


#clone the github repository
git clone git@github.com:MaineC/mahout.git

#add upstream to the local clone
git remote add upstream git://git.apache.org/mahout.git


you can also take a slightly different approach and start with an empty github repository to push your changes into instead:


#clone the upstream repository
git clone git://git.apache.org/mahout.git

#add your personal - still empty - github repo to the local clone
git remote add personal git@github.com:MaineC/mahout.git

#push your local modifications branch mods to your personal repo
git push personal mods


That should leave you with branch mods being visible in your personal repo now.