Apache Hadoop Get Together Berlin

2009-10-29 07:40
Title: Apache Hadoop Get Together Berlin
Location: newthinking store, Tucholskystr. 48, Berlin Mitte
Description: The upcoming Apache Hadoop Get Together Berlin will feature four talks by people explaining how they put Hadoop to good use in their enterprise. A table at Cafe Aufsturz is already booked. Talks will be announced late next week.
Start Time: 17:00
Date: 2009-12-16

Videos are up

2009-10-22 07:31
As of yesterday the videos of the last Apache Hadoop Get Together Berlin are available online.

Thanks to the speakers for providing insight into their projects and thanks to Cloudera for sponsoring the videos.

The next meetup will be announced soon; three talks have already been proposed. In addition, StudiVZ has offered to sponsor videotaping of the next Get Together. Looking forward to seeing you in Berlin in December.

Lucene 2.9 @ Heise

2009-10-06 18:13
After last week's Hadoop Get Together, Heise published an in-depth article on the changes and improvements that come with the latest Lucene 2.9 release.

Thanks to Simon Willnauer for helping me write this article and patiently explaining several new features. Thanks also to Uwe Schindler for kindly proof-reading the article before it was sent out to Heise.

Getting Hadoop trunk up and running from source

2009-10-04 20:18
Having told Thilo about the possibility of writing Hadoop jobs in Python with Dumbo, we spent some time getting Dumbo 0.21 up and running over the past weekend. The first option the wiki proposes is to take a pre-0.21 Hadoop release and patch it to work with the current Dumbo release. The second option uses the not-yet-released version of Hadoop, which works without any patches.

We decided to follow the latter suggestion. After the latest split of the project, we downloaded common, hdfs and mapreduce. Building each project was easy, assuming that ant, Sun JDK 6 (for Hadoop), Forrest (for the documentation pages) and Sun JDK 5 (for Forrest) are installed.

Deviating from the documentation, the distributed filesystem and map reduce are now started from separate scripts (start-dfs.sh and start-mapred.sh instead of start-all.sh). These scripts are located in the common project. In addition, the variables HADOOP_HDFS_HOME and HADOOP_MAPRED_HOME must be set to point to the respective projects for cluster setup to work. Other than that, the setup is currently identical to the previous version.
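
The setup described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the three projects are assumed to be checked out side by side as common/, hdfs/ and mapreduce/ under one root directory, and the start scripts are assumed to sit in a bin/ subdirectory of common. The function names and the example root path are hypothetical. Only the two variable names and the two script names come from the description above.

```python
import os
from pathlib import PurePosixPath


def trunk_environment(checkout_root, base_env=None):
    """Return a copy of the environment with the two variables from the post set."""
    root = PurePosixPath(checkout_root)
    env = dict(os.environ if base_env is None else base_env)
    # Assumed layout: hdfs/ and mapreduce/ live next to common/ under one root.
    env["HADOOP_HDFS_HOME"] = str(root / "hdfs")
    env["HADOOP_MAPRED_HOME"] = str(root / "mapreduce")
    return env


def start_commands(checkout_root, script_dir="bin"):
    """Return the start scripts in launch order: filesystem first, then map reduce."""
    # Assumption: the scripts sit in a bin/ directory inside the common project.
    scripts = PurePosixPath(checkout_root) / "common" / script_dir
    return [str(scripts / "start-dfs.sh"), str(scripts / "start-mapred.sh")]
```

To actually start a cluster, each entry of start_commands(root) could be passed to subprocess.run together with env=trunk_environment(root). Keeping that step out of the helpers makes it easy to inspect the paths before anything is launched.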

Dev House Berlin 2.0

2009-10-04 20:04
This weekend DevHouseBerlin took place at Box119, kindly organized by Jan Lehnardt and sponsored by Upstream and StudiVZ. About 30 people gathered in Friedrichshain, hacking and discussing various projects, mostly Python/Django, Ruby/Rails and Erlang people.

The first day was reserved for hacking and exchanging ideas. In the late afternoon, attendees put together a list of talks that were then rated and ranked, with the top three chosen for presentation on Sunday. The list included topics on CouchDB, RestMS, Hadoop, Concurrency in Erlang, P2P CouchDB and many more. The first three topics were chosen by the participants for presentation.

During the time at DevHouse I finally got a list of topics and papers up at the Mahout TU project; now only the exact credit system for the Mahout course at TU is missing. I also got some time to work on Mahout improvements and documentation. Unfortunately I was too tired today to complete the code review for MAHOUT-157, but I promise to do that early next week.

Spending one weekend with like-minded people and being able to pair with someone else on more complex problems made the weekend a great time for me. I am planning to be there again next year. Thanks to the sponsors and organisers for making this happen.

Slides are up

2009-09-30 09:02
The slides for yesterday's talks just arrived. They are available online at:

Videos will be online early next week.

Upcoming: Apache Hadoop Get Together Berlin

2009-09-23 19:00
This is a friendly reminder that the next Apache Hadoop Get Together takes place next week on Tuesday, 29th of September* at newthinking store (Tucholskystr. 48, Berlin).

  • Thorsten Schuett, Solving Puzzles with MapReduce.
  • Thilo Götz, Text analytics on jaql.
  • Uwe Schindler, Lucene 2.9 Developments.

Big thanks go to newthinking store for providing the venue for free and to Cloudera for sponsoring videos of the talks. Links to the videos will be posted on the upcoming page linked above as well as on the Cloudera Blog soon after the event. Further thanks go to O'Reilly for providing three "Hadoop: The Definitive Guide" books to be raffled at the event.

The 7th Get Together is scheduled for December 16th. If you would like to submit a talk or sponsor the event, please contact me.

Hope to see you in Berlin next week.

* The event is scheduled right before the UIMA workshop in Potsdam, which may be of interest to you if you are a UIMA user.

Apache Hadoop Event Blog

2009-08-24 20:38
As Apache Hadoop becomes ever more popular both in industry and in research, user groups, conferences and hacking days are being scheduled around the world. The goal of the event calendar blog hosted on wordpress.com is to provide a common space where organizers can announce their events and potential participants can look for new conferences.

September Apache Hadoop Get Together @ Berlin

2009-08-23 20:48
The upcoming Apache Hadoop Get Together Berlin is to take place on September 29th in newthinking store. Details are up on the web page at upcoming and will be sent out to the mailing list soon.

September 2009 Hadoop Get Together Berlin

2009-08-17 09:11
The newthinking store Berlin is hosting the Hadoop Get Together user group meeting. It features talks on Hadoop, Lucene, Solr, UIMA, katta, Mahout and various other projects that deal with making large amounts of data accessible and processable. The event brings together leaders from the developer and user communities. The speakers present projects that build on top of Hadoop as well as case studies of applications being built and deployed on Hadoop. After the talks there is plenty of time for discussion, some beer and food.

There is also a related Xing Group on the topic of building scalable information retrieval systems. Feel free to join and meet other developers dealing with the topic of building scalable solutions.


Please see the upcoming page for updates.

  • Thilo Götz: JAQL
  • Uwe Schindler: Lucene 2.9
  • nugg.ad: Ad Recommendation with Hadoop
  • T. Schuett: Solving puzzles with Hadoop.

If you would like to give a presentation yourself, there are additional slots of 20 minutes each available. A projector is provided, so just bring your slides. To include your topic on this web site as well as in the upcoming.org entry, please send your proposal to Isabel.

After the talks there will be time for an open discussion. We will go to a nearby restaurant after the event, so there will be plenty of time for talking, discussing and new ideas.


The Apache Hadoop Get Together takes place at the newthinking store Berlin:

newthinking store GmbH

Tucholskystr. 48

10117 Berlin



  • Homeli - not exactly in walking distance, but only a few S-Bahn stations away. Very nice Bed and Breakfast hotel. (The offer is only valid if you stay for at least three nights.)

  • Circus Berlin is a combination of hostel and hotel close by.

  • Zimmer in Berlin is yet another Bed and Breakfast hotel.

  • House boat near Friedrichshain


If you would like to be notified of news, please subscribe to our mailing list. The meetings are usually also announced on the project mailing lists as well as on the newthinking store website.


In case you have any trouble reaching the location or finding accommodation, feel free to contact the organiser Isabel.

Past events