Alan Atlas at Scrumtisch Berlin

2010-02-17 23:35
At the last Berlin Scrumtisch, @AlanAtlas gave a presentation on how he introduced Scrum at Amazon, starting as early as 2004. Introducing Scrum at Amazon at that time seemed natural due to a few factors:

  • Amazon was and is very customer-centric. The original methodology of working backwards in time - that is starting with the press release, from there writing the FAQ and manual and finally getting to the code - really made people concentrate on the product.
  • Teams were sized according to the two-pizza rule: Each team is supposed to be no larger than can reasonably be fed for lunch with two pizzas. Turns out that is about five to ten people, given regular American pizzas.
  • Management didn't really care exactly *how* the job of software development was done. They only cared *that* it was done. This proved to be both an advantage, as it gave quite a bit of freedom, and a disadvantage, as indifference leads to impediments in the process that cannot be easily resolved.

The goal of Alan's team was to build an infinitely scalable, zero downtime storage system that had a web service based interface - it was the S3 team. Given the task at hand, it was clear that the task wasn't going to be anything even close to trivial. Alan's idea: Try Scrum on that.

He came up with the idea after attending an Agile conference and hearing about Scrum for the first time there. Alan requested a Scrum training from his manager, who approved it, provided it took place in Seattle, that is, close to Amazon HQ. Turns out, the Scrum trainer available in Seattle happened to be Ken Schwaber.

The idea of Scrum basically spread by word of mouth: In the hallways, in the smokers' lounge, close to the coffee machine. People started adopting it - with some teams doing better, some worse. Alan himself got two days a quarter to give people an introduction to this funny new methodology called Scrum.

18 months later, S3 was shipped. As a side effect, the number of subscribers to the internal Scrum mailing list had increased from three to 100, there were 150 CSM graduates and still three teams using XM, and the reputation of Scrum was mostly positive.

In January 2007 others started giving their own introductions to Scrum. Developers generally liked it, but there were significant failure cases and lots of resistance and misunderstanding, mostly on the management side, who sometimes confused it with lean production.

In June 2008 Alan became a full-time internal Scrum trainer. The first thing he organised was a Scrum gathering: in an Open Space-like setup, people were invited to discuss their issues and questions on Scrum.

In January 2009, about 50% Scrum usage was reached, there were 600 subscribers on the Scrum mailing list and 450 CSM graduates, and the reputation of Scrum was as good (or bad) as months before. However, it became more and more clear that further restructuring would only be possible against a lot of resistance.

There were a few lessons learned from introducing Scrum in other companies as well:

  • You cannot introduce Scrum if there is no notion of comparably stable teams (as in for at least six months, not five projects at a time per developer).
  • You need permission and ownership of commitments to really get going.
  • You need to know at least the basics of Scrum.

There are a few factors that make introduction easier:

  • Community matters. People need to be able to talk to each other, exchange experiences and learn from one another.
  • Coaching matters. Getting a coach on the team brings more support and, in lots of cases, immediate success. Credibility with management increases, the Scrum implementation is more easily understood that way, and skepticism and resistance are reduced.
  • Management matters. Middle management must be part of the process. They will mess up Scrum if not educated correctly. Going bottom-up works up to a point - but to go the whole way, you do need management.

For Amazon, that is the point where implementation got stuck for Alan: Management was neither interested nor really involved in Scrum. However, there are some impediments that cannot be fixed without management. Still, Scrum is working to some extent - teams are still trying to get better and improve the process. So it is not really "Scrum, but". Even going only part of the way helped a lot already.

As for the questions: There were a few unusual ones - e.g. on what to do with a team that is itself skeptical about the introduction of Scrum - especially if that was done by higher management. There are a few ways to remedy that: Point out success to the negative team member. Make that member a Scrum master to let him go through the process and really understand what is going on. And above all, make the reasons for going that way clear and transparent, including the promises and wishes attached to that change.

Another question dealt with how to convince management of Scrum. Clearly better performing teams with happy developers ("you cannot buy developers with money only") are some valid reasons.

Yet another question dealt with convincing customers. Alan's way is to sneak Scrum into the process: Speak the customer's language. If he does not like sprint planning, call it a project status meeting.

Asked about cases of developers feeling like they work in a hamster wheel when doing Scrum, the general consensus was that one needs to make sure that people do not only get more responsibilities but get the benefits from Scrum as well. Otherwise Scrum with all its numbers and measurements can be perfectly abused and turned into an amazing micromanagement tool.

Asked about other Bay Area companies using Scrum - yes, Apple does so, Microsoft at least tries to. Google certainly uses the ideas, without calling it Scrum. Facebook seems not to use it. Xing (not Bay Area) uses Scrum. About 50% of all software development companies worldwide seem to be using Scrum.

After the talk there was time to gather and have some pizza and pasta, time for drinks and discussions. I really appreciated the comments and ideas exchanged after the talk. Hope to see you all next time around.

FOSDEM - video recordings online

2010-02-14 20:32
As published in the FOSDEM blog, the video recordings are available online - at least for the main track and the lightning talks. Happy video watching!

Berlin Buzzwords - June 2010

2010-02-11 22:42
As announced at FOSDEM: In early June (currently scheduled for 7th/8th) a conference on scalable search, storage and processing will take place in Kalkscheune/Berlin. The conference is co-organised by newthinking store, Jan Lehnardt, Simon Willnauer, Thilo Fromm, and Isabel Drost.

The focus will be on NoSQL databases like CouchDB, Jackrabbit, MongoDB, HBase. Search tracks will cover topics like Lucene, Solr, Katta and others. Data munging tracks will focus mainly on Hadoop, MapReduce in general and distributed systems.

More information including the call for presentations will be made available online next week on a separate webpage. Early registration starts in March. Watch this blog for more information or follow @hadoopberlin.

Apache Hadoop Get Together - March 2010 - Update

2010-02-11 14:25
Due to conflicts in the schedule of newthinking store, we had to change the time of the Get Together slightly. We will start one hour earlier than announced.

When: March 10th, 4 p.m.
Where: newthinking store, Tucholskystr. 48, Berlin Mitte

Looking forward to seeing you there.

FOSDEM 2010 - part 3

2010-02-10 21:02
Sunday started in Janson with Adrian Bowyer's talk on RepRap machines, that is, devices that can be used for manufacturing and are able to replicate themselves. After that I went over to the Mono dev room to listen to Miguel de Icaza on Mono Edge. A great talk on the history of Mono, the way the community interacts with Microsoft, the C# language itself and special features only available in Mono.

After this talk we went over to Janson for Andrew Tanenbaum's talk on Minix. We knew quite a bit of the talk already from FrOSCon two years ago. However, Andrew is an awesome speaker, so it's always fun to catch up on the news on Minix.

The scalability talks started with an introduction to Hadoop by myself and continued with a talk on the Facebook infrastructure by David Recordon. According to feedback I got after the talk, laughing with Thilo helped quite a bit to keep me calm. Before the talk I received one very good recommendation from one of the audio guys: Imagine you are giving the talk to one of your best friends - and forget about the microphone. Though I had way more slides than minutes to talk, we had enough time for the Q&A session after the talk. I started the talk by learning more about the audience - however this time not by handing the microphone to those listening (room too large). I just asked them: Have you heard about Hadoop? - half of the audience. Are you Hadoop users? - one quarter maybe. How large are your clusters? - 10 to 100 nodes mostly. Have you heard of ZooKeeper? - some, Hive - some more, Pig - a few, Lucene - a lot, Solr - a little less, Mahout - maybe 5, Mahout users: 1.

Turns out the Mahout user in the audience was Olivier: It's so nice to meet people you know are active on the mailing lists for real and have a chat with them. Hope to see you more often on the lists - and meet you face to face again.

I used the chance to announce Berlin Buzzwords 2010, a two-day event on search and scalability buzzwords like cloud computing, Hadoop, Lucene, NoSQL and more. It takes place on June 7th and 8th in the center of Berlin. Follow this blog for further information. Judging from the input I got after the announcement, there is quite some need for such a conference in Europe.

The slides of my talk are soon to be available online.

After my talk I could stay in Janson: A talk on the Facebook infrastructure (not only the Hadoop side of things) followed. After that I met Lars George at the NoSQL dev room - unfortunately I did not manage to actually talk to Steven Noels, who organised the room.

The afternoon was reserved for Greg Kroah-Hartman on how to "Write and submit your first Linux Kernel Patch" - my personal conclusion: git is really awesome. I really, really need to find a few spare minutes to learn how to effectively use it.

In the evening we met with Pieter Hintjens for dinner - and to finalize an awesome weekend in Brussels and a great 10th anniversary FOSDEM. A huge Thank You to all volunteers and organisers of FOSDEM - you did a great job this year putting together an awesome schedule, you did a fantastic job making the now pretty huge event (with 306 talks and about 5000 hackers attending) run smoothly. Even the wireless was working from minute one. See you again at FOSDEM 2011.

Hadoop trainings in Europe

2010-02-02 19:23
Recently I received this mail from Christophe Bisciglia on Cloudera Hadoop trainings. Thought it might be interesting to the Hadoop Berlin community:

Hadoop Fans,

Over the next year, you'll see new options for Hadoop training and
certification from Cloudera. One of the first things you'll see will
be live sessions outside the US, tentatively planned for the April /
May time frame.

We've seen strong interest in Hadoop on all of our international
trips, so we'd like to ask for community input as we decide exactly
which cities to visit next. For cities we come to, we'll offer our 3
day developer training + certification, and with sufficient interest,
we may also include a 1 day training + certification program for
system administrators.

If you are interested in attending one or both of these sessions,
please fill out a brief survey (link below). If you're using Hadoop at
work, and it's time to train more of your team, you can let us know
how large of a group you have. Survey responses aren't a commitment to
attend, but we may reach out to respondents before we schedule a
session to get a better understanding of actual attendance.

You can fill out survey here:

If you have any trouble with the survey, or are interested in a
private training session, please don't hesitate to reach out directly.


Shopping at Ikea

2010-02-01 19:17
Some weeks ago, Thilo got a tiny little gadget not to be missed in an average geek's apartment: A server - admittedly a little old and a bit slow, but still usable for playing around. He installed Ubuntu server on it. In the evening we got it configured to run Hadoop. A little later we found out that some friends of ours might have some usable hardware left as well - we'll see on Monday.

However, having a server on your dinner table is not really practical: There's always some danger of spilling tea over it... Luckily, last week one of my colleagues posted a link to the Lack Rack wiki page in the eth-0 Wiki on one of our mailing lists.

So yesterday was one of the (very rare) days, when I got Thilo to join me on a trip to Ikea. The result can be seen in the images above. Looks like elephants invaded our living room ;)

Hadoop at Heise c't

2010-01-31 13:37
Interesting for those readers speaking German: Heise published an introductory article on Hadoop in its latest issue. Have fun reading.

Thanks to Simon for proof-reading and providing valuable input. Thanks to Thilo Fromm for the Hadoop graphics (unfortunately none of them got published in their original form), the catchy title, proof-reading the text over and over again and for keeping me sane during several past and coming months.

If you want to know more on Apache Hadoop, come watch my FOSDEM Hadoop talk next weekend. If you want to join discussions on Apache Hadoop and Lucene, stay tuned for a conference in Berlin on these topics.

March 2010 Apache Hadoop Get Together Berlin

2010-01-29 08:40
This is to announce the next Apache Hadoop Get Together that will take place in newthinking store in Berlin.

  • When: March 10th, 4 p.m.
  • Where: Newthinking store Berlin

As always there will be slots of 20 min each for talks on your Hadoop topic. After each talk there will be plenty of time to discuss. You can order drinks directly at the bar in the newthinking store. If you like, you can order pizza. We will go to Cafe Aufsturz after the event for some beer and something to eat.

Talks scheduled so far:

Chris Male (JTeam/ Amsterdam): Spatial Search with Solr

Abstract: The rise in popularity of Google Maps and mobile devices with GPS has resulted in a trend in the search field. People are no longer content with finding results that match a text query, they also want to find results which are near a location. So-called spatial search differs considerably from traditional free text search in that it cannot be achieved through common search techniques such as inverted indexes. Instead, new algorithms and data structures had to be developed that achieve efficient and accurate spatial search, and that also allow spatial search to have a role in the determination of a result's relevance. This technology has primarily been found in proprietary closed source search applications, however in the last 12-18 months, considerable effort has been invested into bringing open source spatial search support to Apache Solr and Lucene. While much is still left to be done, this talk will introduce how spatial search is currently supported in Solr, what work is happening currently, and a roadmap for future developments.
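
As a rough illustration of why "near a location" is a different kind of query than matching terms in an inverted index, the sketch below filters a handful of documents by great-circle distance using the haversine formula. This is not how Solr implements spatial search; the `DOCS` list, the `within_radius` helper and the coordinates are made up for illustration only.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def within_radius(docs, lat, lon, radius_km):
    """Return (name, distance_km) pairs for docs inside the radius, nearest first."""
    hits = []
    for name, doc_lat, doc_lon in docs:
        distance = haversine_km(lat, lon, doc_lat, doc_lon)
        if distance <= radius_km:
            hits.append((name, distance))
    return sorted(hits, key=lambda hit: hit[1])


# Hypothetical documents: (name, latitude, longitude)
DOCS = [
    ("Berlin", 52.52, 13.405),
    ("Amsterdam", 52.37, 4.895),
    ("Munich", 48.137, 11.575),
]

if __name__ == "__main__":
    for name, distance in within_radius(DOCS, 52.52, 13.405, 550):
        print(f"{name}: {distance:.0f} km")
```

A linear scan like this has to compute a distance for every document, which is the cost that dedicated spatial data structures are meant to avoid.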

Dragan Milosevic (zanox/ Berlin): Product Search and Reporting powered by Hadoop


Abstract: To efficiently process and index 80 million products, as well as store and analyse 30 million clicks and 500 million views daily, Zanox AG is using Hadoop HDFS and MapReduce technologies. This talk will present product-processing and reporting frameworks running on a 17-node Hadoop cluster, being able to (1) robustly store products and tracking data in a distributed manner, (2) rapidly consolidate, normalise and categorise products, (3) merge and aggregate tracking data and (4) efficiently build indexes for supporting distributed search and reporting, running in several search clusters.

Bob Schulze (eCircle/ Munich): Database and Table Design Tips with HBase

Abstract: Recurring design patterns for the BigTable/HBase storage model.

A big Thanks goes to the newthinking store for providing a room in the center of Berlin for us. Another big thanks goes to Nokia Gate 5 for sponsoring videos of the talks. Links to the videos will be posted here.

Please do indicate on the following Upcoming event if you are planning to attend to make planning (and booking tables at Aufsturz) easier. Registration through Xing is possible as well.

Looking forward to seeing you in Berlin,

Third "December Hadoop Get Together" video online

2010-01-05 19:29
In the following video taken at the last Hadoop Get Together in Berlin Jörg Möllenkamp explains why Hadoop is interesting for Sun - and why Sun Hardware might be a good fit for Hadoop applications:

Hadoop Jörg Möllenkamp from Isabel Drost on Vimeo.

In a blog post published after the event, Jörg gives more details on his idea of Parasitic Hadoop he introduced at the meetup.