Bye, bye Germany

2010-06-20 15:27
... for the next two weeks: I'll be on vacation with strict internet interdiction. Will be a tourist exploring beaches and maybe a few hiking tracks in the next few days, so don't expect to read anything here apart from what was scheduled already ;)

Teaching Free Software Development

2010-06-20 12:55
In Summer last year I was invited to give a presentation on Apache Mahout at TU Berlin. After the talk was over some of the research group members asked me to design and give a course on scalable machine learning with open source software during the winter semester.

The project attracted four to five students - not very many - but then again it is a course people can take voluntarily. During the first semester participants were asked to integrate Mahout to build a system that crawls web pages, assigns them to clusters and makes the content searchable with Lucene. The intention was to get students to publish any patches they have to make at Mahout. In addition the code behind the system was supposed to be published after the project was over.

This setup turned out to be sub-optimal: The participants never grew confident enough to publish not only their ideas and design on the mailinglist but also send in the access data to the SCM system that hosted the project source code.

Some similar setup was run at HPI Potsdam by Christoph Böhm: He let students implement various information retrieval and machine learning algorithms on top of Apache Hadoop. After the course was over he tried to motivate students to publish their code at Apache Mahout. So far I have seen no submissions.

Being aware of these problems next time I setup the course for the summer semester at TU I chose a slightly different model: Having only four students who do not have enough free cycles to work on the project full time I set the goal to implement an HMM - including tests, example and documentation. Being roughly aligned with GSoC I asked students to publish their timeline in JIRA. As soon as coding started I urged them to publish even incremental progress and ask the community for feedback.

Now we do have an open JIRA issue with a patch attached to it. People also got some code review feedback already. Having Berlin Buzzwords in town while the course was still running I used my chance to get students in touch with other Mahout developers. Looks like at least one of them is planning to stay with the project for a little longer. For me it would be a great success if at least one student could be turned into a longer term contributor to the project.

So far it looks like applying the general principle of releasing code early and often helps people do integrate into some project. My own lesson learned from those experiences however is to urge students early on to get in touch and release their code: It was not particularly easy to get them to send e-mails to public mailing lists. However if they had done this just once, feedback usually was very positive - and surprised by how friendly and helpful in the free software community generally are.

Service as in real good customer service

2010-06-19 08:07
First thing to do after getting up: Go to the kitchen and switch on the coffee machine. However, one random Sunday morning that caused the fuse for exactly this kitchen to go off. After fixing that we turned on the coffee machine again - trying to finally get a first cup. All worked well until having a closer look at what the machine produced as coffee: It was cold!

We initially got the machine from Giuseppetti - a vendor in Berlin. Though it was to late to get it fixed on warranty we still took the thing over to his shop the following week. What happened than was amazing to see for us:

The mechanics unscrewed the machine, started examining it immediately. Knowing we had bought it there, the owner gave both of us a cup of coffee. Long before I had finished mine the fixed machine was brought back to us - a fuse in it had gone off as well.

So after less than half an hour we got our working coffee machine back w/o being charged for the repair. Next morning was way better than the previous Sunday morning.

Linus Torwalds on the Linux kernel community

2010-06-15 18:10
A few days ago, Linus send a very interesting mail on why he considers C the programming language that is most suitable for the Linux kernel. Despite the language specific arguments, the text contains quite a few insights on how the Linux kernel community works and communicates that might be interesting to non-kernel-hackers as well:


People working for free still doesn't mean that it's fine
to make the work take more effort - people still work for
other compensation, and not feeling excessively
frustrated about the tools (including language) and getting
productive work done is a big issue.


When attending Open Source conferences or contributing to free software projects I have made the exact same observation multiple times: Developers may not get money out of contributing to a free software project (though often they may do) there are other rewards that keep them working on any particular project: Learning from excellent peers usually is one reason. Being able to work on a topic you like at any time that suits you may be another one.

Developing free software is largely different than your usual professional work: People work voluntarily, putting as much effort into the code as is needed for them to be satisfied with the end-result. There may not be any deadlines fixed by contracts, still developers honour release cycles with the goal of providing a reliable product to their end users.

In the end it all boils down to being passionate about what you work on. To be involved in any open source project takes a huge amount of energy - but usually you get more in return than you are even able to invest. However if passion is a pre requisite to working on any free software this also means it is extremely hard to pay developers to work on any free software project: You just cannot buy passion or love for money.


But the thing is, "lines of code" isn't even remotely close
to being a measure of productivity, or even the gating
issue. The gating issue in any large project is pretty much
all about (a) getting the top people and (b) communication.


Can only quote that - totally agree with the analysis. This applies to any software project - free or proprietary. So if you own a software development business: What is your strategy for getting the top people and facilitating communication in your company? What is your measure of productivity?

Velocity update

2010-06-14 18:34
After Berlin Buzzwords is over now, I thought it might be time to update a formerly published velocity graph. If you have been following this blog during the past few months you may know already, that we are using Scrum @ Home for several months already. I even published results of a Scrum Nokia test earlier this year.

Today I created another one of these nice velocity charts. What is interesting about this graph are two things. First of all: There are two spikes in the velocity graph. As our scrum board is used only for things done during our spare time, these spikes roughly correspond to German vacation days (Christmas and Easter).

More interestingly the work load shortely before Berlin Buzzwords happend was not much higher than usual. Weekend was totally free of any work - except for meeting with friends, going to town and attending the Barcamp. A sure sign of fantastic preparation and organisation with that few tasks left to do. Julia, thanks for helping with scheduling tasks such that no panicky late-night action was needed before, during or after the conference.

My highly subjective Berlin Buzzwords recap

2010-06-13 18:32
Last November I innocently asked Grant what it would take to make him to give a talk in Berlin. The only requirement he told me was that I'd have to pay for his flight. About eight months later we had Berlin Buzzwords - a conference all around the topics scalability, data storage and search. With Simon Willnauer, Uwe Schindler, Michael Busch, Robert Muir, Grant Ingersoll, Andrzej Bialecki and many others we had quite a few Lucene people in town.







From the NoSQL community, Peter Neubauer, Rusty Klophaus, Jan Lehnardt, Mathias Meyer, Eric Evans and many others made sure people got their fair share of NoSQL knowledge. With Aaron Kimball, Jay Booth, Doug Judd and Steve Loughran we had several Hadoop and related people at the conference.

The conference also featured two talks on Apache Mahout: An overview from Frank Scholten as well as a more in-depth talk by Sean Owen. It's great to see the project grow - not only in terms of development community but also in terms of requests from professional Mahout users.









In addition we had a keynote by Pieter Hintjens that concentrated on messaging in general and 0MQ in particular - a scalability topic otherwise highly underrepresented at Berlin Buzzwords.







We got well over 300 attendees that filled Berlin Kosmos - a former cinema. Attendees were a good mixture of Apache and non-Apache people, developers and users. People used the breaks and bar tours after the event to get in touch, exchange ideas. It's always good to see developers discuss design issues and architectural challenges.

Monday evening was reserved for local people taking out the speakers and interested attendees for Bar Tours to Friedrichshain. Those from Berlin took Berlin Buzzwords people to their favourite restaurants and bars - or to what they considered to be "typical Berlin". Some spent evenings later that week drinking beer or Berliner Weisse.




The tour for keynote speakers Grant Ingersoll, Pieter Hintjens and friends was organised by Julia and myself. We went over to Kreuzberg - some went to famous Burgermeister for Burgers, the other half went to a nearby Indian restaurant. After that we spent the evening in Club der Visionäre - a club next to the water. Me personally I left at about midnight - several people of the Lucene community moved to the well known Fette Ecke later on.

When asking the audience about repeating the conference next year, all hands went up immediately. Beside lots of praise for the organisation, from the feedback form we put up we got some good ideas on how to improve the conference next year. I'd love to have you guys back here in 2011 - and I'd love to get even more attendees in. Was great fun having you here. Thanks for 5 great days:

Five instead of two days, because:

  • Keynote speakers got a special treatment - that is a personal city guide for the weekend before Buzzwords.
  • We had the official conference start on Sunday with a Barcamp.
  • We had another Apache dinner on Wednesday with those Apache people that live in Berlin. In addition the Aaron and Sarah joined us as they were still in town for the Apache Hadoop trainings. Also Greg Stein had pizza and beer with us - he was in town for the svn conference at the end of the week.












Thanks to all who helped turn this conference into a success: Julia Gemählich for conference management, Ulf and Wetter for WiFi setup, Nils for travel management, Simon and Jan for support ranking talks and reaching out to your communities, all speakers for fantastic talks, those taking pictures of the conference and sharing them on Flickr for showing those who stayed at home how great the conference was, peoplezapping for the videos that will soon be available online, all sponsors for supporting the conference, all attendees for their participation. I'd love to have all of you (and many more) back in Berlin next year. An informal call for presentations has been set up already - submit now and be the one to set the trend instead of just following the Buzzwords!

For those who do not want to wait for another year: We will have another Apache Hadoop Get Together in September 2010 - watch this space for more information. If you'd like to give a talk their and present your Hadoop/ Solr/ Lucene etc. system - please get in touch with me.

Apache Dinner June 2010

2010-06-09 23:54
After Berlin Buzzwords was over yesterday - and as there is an svn conference in the city that starts tomorrow, we thought we could easily put together a smallish Apache Dinner for tonight. So Torsten mailed a few people, booked some space at Heinz Minki in Kreuzberg. We announced it at the end of the conference and invited people to join us.






So after a beautiful day out I spent the evening with a bunch of Apache related guys and girls having drinks and great pizza: We had several svn committers, Greg Stein met us there - he arrived today for the svn conference. Of couse the usual suspects were there as well: Simon Willnauer and his wife, Torsten Curdt, Erik Abele, Thomas Wöhlke, Valerie Hajdik. In addition we had guests from the Apache Lucene and Apache Hadoop communities: Sarah Sproehnle and Aaron Kimball as well as Uwe Schindler joined us.







Thanks to Torsten for putting the meetup together. See you next time in July. If you are interested in joining our meetups: Subscribe to our mailing list.

Apache Dinner after Berlin Buzzwords

2010-06-03 08:52
Staying in town after Berlin Buzzwords? Interested in meeting with the Apache folks here? Torsten Curdt kindly organises an Apache Dinner on Wednesday evening after Berlin Buzzwords. If you would like to participate, contact Torsten for details on when and where it will take place.

Though named Apache dinner, there is no need to be Apache committer or even member to participate: Being generally interested in Apache projects and in meeting the guys behind the project is totally sufficient.

Apache Dinner - May 2010

2010-05-27 22:50
This evening a bunch of Apache committers and friends gathered in Berlin Kreuzberg at "Goodmorning Vietnam" for tasty food, nice drinks - or put another way, for a very nice evening. Simon had booked the table - we were expecting no more than eight people. However, as with any user group these meetup tends to grow. Shortly after the appointed time we had to move to another table to fit everyone around. See below for a quick shot taken while eating (Thanks to Eric for taking the picture):



There were people from Lucene, from SVN, Cocoon, CouchDB, HttpComponents and various other projects. Even one potential future Mahout committer :) Counting attendees quickly I guess we were about fifteen people.

Looking forward to the next meetup that will be scheduled to take place shortely after Buzzwords. Please talk to Torsten Curdt if you want to get notified or simply subscribe to our mailing list.

Scaling user groups

2010-05-26 19:32
A few hours ago, Jan Lehnardt posted a link on How to organise a nerd conference - joking that this is how we planned Berlin Buzzwords. Well, it is not exactly that easy - however the comic actually is not so far from the truth either:

About two years ago, after having started Apache Mahout together with Grant Ingersoll, Karl Wettin and others, several Apache Hadoop user groups, meetups and get togethers started to pop up all around the world. The one closest to me was the Hadoop user group UK. Back in 2008 I was pretty envious to all these user groups - being so distributed, there was no way I could ever attend all of them, though talks were certainly interesting. So the naive thought of a back then naive free software developer was: Let's have that in Berlin. To have initial talks I called Stefan Groschupf. His answer was very positive: Oh yeah, let's do this. I am in Germany for another two weeks, so it should be at about that timeframe. We agreed that if no-one showed up, we could still have some pizza together and share insights from our projects.

For the venue I knew from regular meetups of the Free Software Foundation Europe - read FSF*E* - that newthinking store was available for free for meetups for devs of free software. On I went, calling Martin from the store, booked the room. After that some mails went to the usual suspects, mailing lists and such. At the first meetup two years ago, more than 15 attendees - with two more people who had prepared slides. Pizzas obviously had to wait a little.

If you are wondering what that looked like back then - Thanks to Martin for taking the image back then and putting it online.






We (as in all attendees) decided to repeat the exercise three months later*, talks for the next time were proposed during that first session. Noone objected to having it in Berlin again - everyone knew this was the only way to avoid having to do the organization next time.

The meetup grew steadily in size, talks started being proposed three to six months in advance. I ended up creating not only a mailing list for the meetup but also a blog so I could publish news on Jan's CouchDB talk and Lars George's HBase talk back then. We got video sponsoring from Cloudera (Thanks Christophe), StudiVZ (Thanks Nils), and Nokia (Thanks Matt). Late last year I did the first European NoSQL meetup together with Jan Lehnardt - 80 attendees, lots of potential for more, the newthinking store obviously a bit too small for that :)

If you are wondering what NoSQL and Hadoop meetups looked like last time:



During that meetup the idea was born for a larger NoSQL conference in Berlin in 2010. First ideas were tossed around together with Jan and Simon Willnauer during Apache Con US in Oakland. The topic Hadoop got added there. In January 2010 finally Lucene was added to the mix. We contacted newthinking for support - got a very warm welcome.

Now - two years after the first Apache Hadoop Get Together Berlin we are proud to host Berlin Buzzwords - focussed on NoSQL, Apache Hadoop and search as in Apache Lucene.The conference is co-organised by newthinking communications, Simon Willnauer, Jan Lehnardt and myself. A big thanks to neofonie for supporting me by making it possible that I could do most of the organisation during my regular working hours.

The speaker lineup looks fantastic. Registration is going very well - exceeding expectations (did I mention that registration is still open, group and student tickets still available?).

I am really looking forward to an amazing conference on 7th and 8th of June. We will have a NoSQL barcamp in newthinking store Sunday evening before the conference. Keynote speaker packages have been sent out and were well received. Hotel rooms for speakers are booked. We are about to pull together the last loose ends in the coming days. Happy to have so many guys (and a few girls) interested in scalability topics here in town at the beginning of June. Looking forward to seeing you in Berlin.



* The second meetup turned out to be the first and so far only one that took place w/o the organiser - I broke my leg on my way to newthinking by getting hit by a BMW X5... *sigh* Note for other meetup organizers: Always have a backup moderator - in may case that was my neofonie manager Holger Düwiger who happened to attend that meetup for the first time back then.