Inglourious Basterds

2009-08-24 22:48
This evening I went to the Odeon cinema in Berlin Schöneberg. It is a pretty traditional, old-fashioned and very lovely cinema that specialises in showing non-dubbed, original versions of movies.

Showing the great movie Inglourious Basterds, the cinema was completely sold out today. Fortunately we were able to grab some of the last tickets.

Just in case the entrance seemed familiar to those who have attended a Mahout presentation in the recent past - a picture of the Odeon usually visualises one part of my motivation on the Mahout slides ;)

Apache Hadoop Event Blog

2009-08-24 20:38
As Apache Hadoop becomes ever more popular both in industry and in research, user groups, conferences and hacking days are being scheduled around the world. The goal of the event calendar blog is to provide a common space for organizers to announce their events and for potential participants to look for new conferences.

September Apache Hadoop Get Together @ Berlin

2009-08-23 20:48
The upcoming Apache Hadoop Get Together Berlin is to take place on September 29th at the newthinking store. Details are up on the web page at upcoming and will be sent out to the mailing list soon.

Fellow now

2009-08-23 20:45
After two years of volunteering as booth staff for the FSFE at the Chemnitzer Linuxtage, explaining the advantages of becoming an FSFE fellow, I have now been a fellow myself for two days ;)

I first got in contact with the FSFE through Fernanda Weiden during my time in Zürich in 2006. In the meantime I have learned more and more about the political activities of the FSFE, mostly during the local Berlin meetups at the newthinking store and as a booth member in Chemnitz.

If you yourself want to support the work of FSFE, join the fellowship.

Flying back home from Cologne

2009-08-23 20:40
Last weekend FrOSCon took place in Sankt Augustin, near Cologne. FrOSCon is organized on a yearly basis at the university of applied sciences in Sankt Augustin. It is a volunteer driven event with the goal of bringing developers and users of free software projects together. This year, the conference featured 5 tracks, two examples being cloud computing and the Java track.

Unfortunately, this year the conference started with a little surprise for me and my boyfriend: as we were both speakers, we had booked a room in Hotel Regina via the conference committee. Yet on Friday evening we learned that the reservation had never actually reached the hotel... So, after several minutes of talking to the receptionist and calling the organizers, we ended up in a room that had been booked for Friday night by someone who was known to arrive no earlier than Saturday. Fortunately for us, we have a few friends close by in Düsseldorf: Fnord was kind enough to let us have his guest couch for the following night.

Check-in time next morning: On the right hand side the regular registration booth. On the left hand side the entrance for VIPs only. The FSFE quickly realized its opportunity: They soon started distributing flyers and stickers among the waiting exhibitors and speakers.

Organizational issues aside, most of the talks were very interesting and well presented. The Java track featured two talks by Apache Tomcat committer Peter Roßbach, the first on the new Servlet 3.0 API, the second on Tomcat 7. Sadly, my talk ran in parallel to his Tomcat talk, so I couldn't attend it. I appreciated several of the ideas on cloud computing highlighted in the keynote: Cloud computing as such is not really new or innovative; it is a set of good ideas, previously known for instance as utility computing, that are now being improved and refined to make computation a commodity. At the moment, however, cloud computing providers tend to sell their offerings as new, innovative products. There is no standard API for cloud computing services, which makes switching from one provider to another extremely hard and leads to vendor lock-in for users.

The afternoon was filled by my talk. This time I tried something that so far I had only done in user groups of up to 20 people: I first gave a short introduction of who I am and then asked the audience to describe themselves in one sentence. There were about 50 people, and after 10 minutes everyone had given their self-introduction. It was a nice way of finding out in detail what knowledge to expect from people, and it was interesting to hear that people from IBM and Microsoft were in the room.

After that I attended the RestMS talk by Thilo Fromm and Pieter Hintjens. They showed a novel, community driven approach to creating standards. RestMS is a messaging standard that is based on a RESTful style of communication. So far the standard itself is still in its very early stages, yet there are already some very “alpha, alpha, alpha” implementations out there that can be used for playing around. According to Pieter there are actually people who already use these implementations on production servers and send back bug reports.

Sunday started with an overview of the DaVinci VM by Dalibor Topic, the author of the OpenJDK article series in the German Java Magazin. The second talk of the day was an introduction to Scala. I already knew a few details of the language, but the presentation made it easy to learn more: It was organised as an open question and answer session with live coding leading through the talk.

After lunch and some rest, the last two talks of interest covered the FFII's campaigns against software patents and the upcoming changes in GNOME 3.0.

This year's FrOSCon did have some organizational quirks but the quality of most of the talks was really good with at least one interesting topic in one of the sessions at nearly every time slot - though I must admit that that was easy in my case with Java and cloud computing being of interest to me.

Update: Videos are up online.

Converting a git repo to svn

2009-08-17 10:15
Unlikely though it may seem, there are cases when one might want to convert a git repo to svn and still keep all revisions intact. There is a nice explanation of how to do that on the Google Open Source blog.

September 2009 Hadoop Get Together Berlin

2009-08-17 09:11
The newthinking store Berlin is hosting the Hadoop Get Together user group meeting. It features talks on Hadoop, Lucene, Solr, UIMA, katta, Mahout and various other projects that deal with making large amounts of data accessible and processable. The event brings together leaders from the developer and user communities. The speakers present projects that build on top of Hadoop as well as case studies of applications being built and deployed on Hadoop. After the talks there is plenty of time for discussion, some beer and food.

There is also a related Xing Group on the topic of building scalable information retrieval systems. Feel free to join and meet other developers dealing with the topic of building scalable solutions.


Please see upcoming page for updates.

  • Thilo Götz: JAQL
  • Uwe Schindler: Lucene 2.9
  • Ad Recommendation with Hadoop
  • T. Schuett: Solving puzzles with Hadoop.

If you would like to give a presentation yourself: There are additional slots of 20 minutes each available. A projector is provided, so just bring your slides. To include your topic on this web site as well as the entry, please send your proposal to Isabel.

After the talks there will be time for an open discussion. We are going to a nearby restaurant after the event, so there will be plenty of time for talking, discussing and new ideas.


The Apache Hadoop Get Together takes place at the newthinking store Berlin:

newthinking store GmbH

Tucholskystr. 48

10117 Berlin



  • Homeli - not exactly in walking distance, but only a few S-Bahn stations away. Very nice Bed and Breakfast hotel. (The offer is only valid if you stay for at least three nights.)

  • Circus Berlin is a combination of hostel and hotel close by.

  • Zimmer in Berlin is yet another Bed and Breakfast hotel.

  • House boat near Friedrichshain


If you would like to be notified of news, please subscribe to our mailing list. The meetings are usually also announced on the project mailing lists as well as on the newthinking store website.


In case you have any trouble reaching the location or finding accommodation, feel free to contact the organiser Isabel.

Past events

AMQP Erlang user group talk

2009-07-10 15:56
Last Wednesday at the Erlang user group Berlin Matthias Radestock from the RabbitMQ project gave a talk on RabbitMQ, AMQP and messaging in general. Slides are available online.

First Matthias motivated the need for an open standard for messaging: So far, there are a few providers of middleware systems like Tibco and IBM. But those solutions are usually closed, expensive and cumbersome to handle. In short, they do not fit into a world where people rely on open standards for communication, free software for development and lightweight implementations.

AMQP aims to provide an open standard for messaging - that is, decoupled communication between processes that may reside on separate boxes or in different datacenters. There are a few providers of AMQP implementations. Some examples are iMatix, focussed on low latency communication, Apache Qpid with the corresponding project inside RedHat, and RabbitMQ.
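To make "decoupled communication" a bit more concrete, here is a minimal sketch of the idea. It is a toy, in-memory model and not an AMQP client: the class ToyBroker, the queue names and the routing key are all made up for illustration, and everything runs inside one process rather than across boxes or datacenters. It only borrows the general notion that a producer publishes under a routing key while consumers read from queues bound to that key.

```python
from collections import defaultdict, deque


class ToyBroker:
    """In-memory stand-in for a message broker (hypothetical, not an AMQP client)."""

    def __init__(self):
        self.queues = defaultdict(deque)    # queue name -> pending messages
        self.bindings = defaultdict(list)   # routing key -> bound queue names

    def bind(self, queue, routing_key):
        self.queues[queue]  # create the queue if it does not exist yet
        self.bindings[routing_key].append(queue)

    def publish(self, routing_key, body):
        # The producer only names a routing key; it never sees the consumers.
        for queue in self.bindings[routing_key]:
            self.queues[queue].append(body)

    def consume(self, queue):
        # Return the oldest pending message, or None if the queue is empty.
        pending = self.queues[queue]
        return pending.popleft() if pending else None


broker = ToyBroker()
broker.bind("billing", "order.created")
broker.bind("shipping", "order.created")
broker.publish("order.created", "order 17")

print(broker.consume("billing"))   # order 17
print(broker.consume("shipping"))  # order 17
print(broker.consume("shipping"))  # None
```

The point of the sketch is that the publishing side and the consuming side share nothing but the broker: either one can be replaced, or more consumers added, without touching the other.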

RabbitMQ is implemented in Erlang (after all, the talk was hosted by the Erlang User Group Berlin ;) ). With about 7000 lines of code, the code base is rather compact. The goal was not to build a super-fast implementation, but one that is scalable and highly available.

So far there is no facility for building reliable cross datacenter communication built into RabbitMQ. Yet, there are several projects available that aim at providing just that.

Solr at AOL

2009-07-02 13:06
Grant Ingersoll has posted a very interesting interview with Ian Holsman on Solr at Relegance, now AOL. It describes the business side of the decision to switch to an open source solution, provides some insight into the size of the installation and details the technological reasons that drove the decision to switch from a proprietary implementation to Solr.

Lucene slides online

2009-06-30 10:04
The slides of the Lucene talk at the last Apache Hadoop Get Together Berlin are available online: Lucene Slides. Especially interesting to me are the last few slides which detail both index size and machine setup:

The installation is running on two standard PCs with 2 dual-core processors (usual speed, bought in January 2008 for about 4000 Euro). They have 32 GB RAM, of which 24 GB are used as a ramdisk for the index. Without the ramdisk, initial queries, especially those accessing fields, are slower but still acceptable. The index contains about 19 million documents, that is 80 GB of indexed text + billions of annotated tags.
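To get a feeling for these numbers, here is a quick back-of-the-envelope calculation. It only uses the figures quoted above and assumes decimal gigabytes and that the 32 GB RAM figure applies to each machine; the variable names are mine. Note that 80 GB is the amount of indexed text, not the size of the index itself, so the sketch says nothing about how much of the ramdisk the index occupies.

```python
GB = 10**9  # assumption: decimal gigabytes

documents = 19_000_000        # "about 19 million documents"
indexed_text_bytes = 80 * GB  # "80 GB of indexed text"
ram_gb = 32                   # RAM per machine (assumed to be per machine)
ramdisk_gb = 24               # part of the RAM used as ramdisk

avg_bytes_per_doc = indexed_text_bytes / documents
ram_left_gb = ram_gb - ramdisk_gb
ramdisk_share = ramdisk_gb / ram_gb

print(f"average indexed text per document: {avg_bytes_per_doc / 1000:.1f} kB")  # 4.2 kB
print(f"RAM left besides the ramdisk: {ram_left_gb} GB")                         # 8 GB
print(f"share of RAM used as ramdisk: {ramdisk_share:.0%}")                      # 75%
```

So the average document carries roughly 4 kB of indexed text, and three quarters of the memory is dedicated to the ramdisk.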