As I keep searching for those URLs over and over again, I am linking them here. When running into JVM heap issues (an out of memory exception is a pretty sure sign, as can be the program getting slower and slower over time) there are a few things you can do for analysis:
Start by telling the affected JVM process to output some statistics on heap layout as well as thread state by sending it a SIGQUIT (if you want to use the number instead - it's 3 - avoid typing 9 instead ;) ).
More detailed insight is available via jConsole - remote setup can be a bit tricky but is well doable and worth the effort as it gives much more detail on what is running and what memory consumption really looks like.
For a detailed analysis take a heap dump with either jmap, jConsole or by starting the process with the JVM option -XX:+HeapDumpOnOutOfMemoryError. Look at it either with jhat or the IBM heap analyzer. NetBeans also offers nice support for searching for memory leaks.
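Some of the numbers that a SIGQUIT dump or jConsole show can also be read from inside the running process through the standard java.lang.management API. The following is only a minimal sketch of that idea and no replacement for the tools above; the class name HeapSnapshot and its helper methods are made up for illustration.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;
import java.lang.management.ThreadInfo;

// Hypothetical helper class, not part of the JDK or of any tool named above.
class HeapSnapshot {

    // Bytes of heap currently in use, as reported by the JVM's MemoryMXBean.
    static long usedHeapBytes() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        return heap.getUsed();
    }

    // One-line heap summary; getMax() returns -1 if no maximum is defined.
    static String heapSummary() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        return String.format("heap used=%d committed=%d max=%d",
                heap.getUsed(), heap.getCommitted(), heap.getMax());
    }

    // Number of live threads in this JVM.
    static int liveThreadCount() {
        return ManagementFactory.getThreadMXBean().getThreadCount();
    }

    // Name and state of every live thread, one per line.
    static String threadStates() {
        StringBuilder out = new StringBuilder();
        // false, false: leave out details on locked monitors and synchronizers
        for (ThreadInfo info : ManagementFactory.getThreadMXBean().dumpAllThreads(false, false)) {
            out.append(info.getThreadName())
               .append(' ')
               .append(info.getThreadState())
               .append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(heapSummary());
        System.out.print(threadStates());
    }
}
```

Logging such a summary periodically can help to spot a slowly growing heap long before the out of memory exception shows up. For the actual leak hunt a heap dump is still the way to go.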
As of Monday, February 6th a new Apache Mahout version was released. The new package features
Lots of performance improvements:
A new LDA implementation using Collapsed Variational Bayes 0th Derivative Approximation - try that out if you have been bothered by the way less than optimal performance of the old version.
Improved Decision Tree performance and added support for regression problems
Reduced runtime of dot product between vectors - many algorithms in Mahout rely on that, so these performance improvements will affect anyone using them.
Reduced runtime of LanczosSolver tests - make modifications to Mahout more easily and have faster development cycles by faster testing.
Increased efficiency of parallel ALS matrix factorization
Performance improvements in RowSimilarityJob, TransposeJob - helpful for anyone trying to find similar items or running the Hadoop based recommender
K-Trusses, Top-Down and Bottom-Up clustering, Random Walk with Restarts implementation
Added MongoDB and Cassandra DataModel support
Added numerous clustering display examples
Many bug fixes, refactorings, and other small improvements. More information is available in the Release Notes.
Overall great improvements towards better performance, better stability and integration. However there are still quite some outstanding issues and issues in need of review. Come join the project, help us improve existing patches, improve performance and in particular integration and streamlining of how to use the different parts of the project.
Though I had the chance to tinker with some Clojure code only briefly, its programming model and the resulting compact programs do fascinate me. As the resulting code runs on a JVM and integrates well with existing Java libraries, migration is comparatively cheap and easy.
Today I finally managed to attend the local Berlin Clojure meetup, co-organised by Stefan Hübner and Fronx. Timing couldn't have been much better: In this evening's event Philip Potter from Thoughtworks introduced Overtone - a library for making music with Clojure.
After installing and configuring JACK for sound output, SuperCollider, and Overtone, outputting your first tone is as simple as registering the Overtone library and typing (definst foo  (saw 220)) (foo)
To stop it type (stop).
Other types of waves are of course supported as well, as is playing different waves simultaneously and modifying them at runtime. Expressing sounds as notes (c, d, e, f, g) that may have a certain length is possible as well – which makes it so much easier to design music than having to think in frequencies.
A sample of what can easily be done with Overtone:
The original sound is way better - this sample was taken with a mobile phone, compressed, re-coded and then put online. Check out the Overtone project for the real thing - and don't even try to listen to the sample with low-end laptop speakers ;)
Overall a well organised meetup (Thanks to Soundcloud for hosting it, to the organisers for putting it together and to the speaker for a really well done introduction to Overtone) and an interesting way to get started with Clojure with very fast (audio) feedback.
Markus Andrezak : "Queue Management in Product Development with Kanban - enabling flow and fast feedback along the value chain" - It's a truism today that fast feedback from your market is a key advantage. This talk is about how you can deliver smallest product increments or MVPs (minimal viable products) quickly to your market to get fastest possible feedback on cause and effect of your product changes. To achieve that, it helps to provide a continuous deployment infrastructure as well as all you need for A/B testing and other feedback instruments. To make the most of these achievements, Kanban helps to limit work in progress, thus manage queues and speed up lead times (time from order to delivery or concept to cash). This helps us speed through the OODA Loop, i.e. Eric Ries' (The Lean Startup) Model -> Build -> Code -> Measure -> Data -> Validate -> Model. The more we can go through the loop, the more we have a chance to fine tune and validate our model of the business and finally make the right decisions.
Markus is one of Germany’s leading Kanban practitioners - writing and presenting talks about it in numerous publications and conferences. He will provide a brief view into how he is achieving fast feedback in diverse contexts. Currently he is Head of mobile commerce at mobile.de.
Martin Scholl : "On Firehoses and Storms: Event Thinking, Event Processing" - The SQL doctrine is still in full effect and still fundamentally affects the way software is designed, the state it is stored in as well as the system architecture. With the NoSQL movement people have started to realize that the manner in which data is stored affects the full stack -- and that reduction of impedance mismatch is a good thing(TM). "Thinking in events" follows this tradition of questioning what is state-of-the-art. Modeling a system not in mutable entities (as with data stores) but as a stream of immutable events that incrementally modify state, yields results that will exceed your expectations. This talk will be about event thinking, event software modeling and how Twitter's Storm can help you process events at large.
Martin Scholl is interested in data management systems. He is also a Founder of infinipool GmbH.
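To make the "thinking in events" idea from the abstract a bit more tangible: instead of updating a mutable entity in place, one appends immutable events and derives the current state by folding over them. The following is a minimal sketch under that assumption only. The account example and all names in it are made up, and it has nothing to do with Storm's actual API.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical example domain: an account balance derived from events.
class EventLog {

    // An immutable event: a signed amount applied to one account.
    static final class Event {
        final String account;
        final long delta;

        Event(String account, long delta) {
            this.account = account;
            this.delta = delta;
        }
    }

    private final List<Event> events = new ArrayList<>();

    // Events are only ever appended, never updated or deleted.
    void append(Event event) {
        events.add(event);
    }

    // The current state is a fold over the event stream.
    long balanceOf(String account) {
        long balance = 0;
        for (Event e : events) {
            if (e.account.equals(account)) {
                balance += e.delta;
            }
        }
        return balance;
    }

    // Read-only view of everything that ever happened.
    List<Event> history() {
        return Collections.unmodifiableList(events);
    }

    public static void main(String[] args) {
        EventLog log = new EventLog();
        log.append(new Event("alice", 100));
        log.append(new Event("alice", -30));
        log.append(new Event("bob", 5));
        System.out.println(log.balanceOf("alice")); // prints 70
    }
}
```

As the full history is kept, any earlier state can be recomputed by folding over a prefix of the log.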
Fabian Hüske : "Large-Scale Data Analysis Beyond Map/Reduce" - Stratosphere is a joint project by TU Berlin, HU Berlin, and HPI Potsdam and researches "Information Management on the Cloud". In the course of the project, a massively parallel data processing system is built. The current version of the system consists of the parallel PACT programming model, a database inspired optimizer, and the parallel dataflow processing engine, Nephele. Stratosphere has been released as open source. This talk will focus on the PACT programming model, which is a generalization of Map/Reduce, and show how PACT eases the specification of complex data analysis tasks. At the end of the talk, an overview of Stratosphere's upcoming release will be given.
Fabian has been a research associate at the Database Systems and Information Management (DIMA) group at the Technische Universität Berlin since June 2008. He is working in the Stratosphere research project, focusing on parallel programming models, parallel data processing, and query optimization. Fabian started his studies at the University of Cooperative Education, Stuttgart, in cooperation with IBM Germany in 2003. During that course, he visited the IBM Almaden Research Center in San Jose, USA, twice and finished in 2006. Fabian undertook his studies at Universität Ulm and earned a master's degree in 2008. His research interests include distributed information management, query processing, and query optimization.
A big Thank You goes to Axel Springer for providing the venue at no cost for our event and for paying for videos to be taped of the presentations. A huge thanks also to David Obermann for organising the event.
c-base - 8p.m. on a Monday evening - the room is packed (and pretty cloudy as well): Time for Dorkbot, a short series of talks on "People doing strange things with electricity" hosted by Frank Rieger.
First talk up on stage was Gismo on Raumfahrtagentur - a Berlin maker-space located in Wedding. Originating from the presenter's interest in electrical bikes, a group of ten people interested in hardware hacking got together. Projects include but are not limited to 3D printing, 3D scanning, textile hacking, a collaborative podcast. Essentially the idea is to provide room and infrastructure to be used collaboratively by a group of members. From an organisational point of view the group is incorporated as a GmbH - however none of the projects is mainly targeted at commercialization: Its main target group are hobbyists, researchers and open hardware/software people. If interested: Each Monday evening there is a "Sunday of the Kosmonauts" where externals are invited to come visit.
Second talk was on the project Drinkenlights (Klackerlaken) - a way for children to learn the basics of electronics without any soldering (hardware available for three Euros max). Experiences made with giving the ingredients for creating these toys to children of varying ages were interesting: From kids of about five years playing around up to ten/eleven year olds that when in school seemingly had to re-learn being creative without being given much direction or instruction on the task at hand.
In the third talk Martin Kaltenbrunner introduced his Tworse Key - a nice symbiosis of old technology (a morse key) and new media (Twitter). Essentially built on top of an Arduino Ethernet board it made it possible to turn morse messages into Tweets. Martin also gave a brief overview of related art projects and briefly touched upon the changes that open source and open hardware bring to art: There are projects that open all design and source code to the public to benefit from a wider distribution channel (without having to actually produce anything), to work on designs in a collaborative way and to get improvements back to the original project. All of these form a stark contrast to the existing idea of having one single author whose contribution is to build a physical object that is then presented in exhibitions - providing both new possibilities and new challenges to artists.
In the last presentation Milosch introduced his new project ETIB whose goal it is to bring hardware hacking geeks together with textile geeks to work on integrating circuits into clothes.
If you are interested in hacking spaces in general and what is happening in that direction in Berlin, mark this Friday in your calendar: c-base will be hosting a Hackerspace meetup - so if you want to know how hackerspaces work or want to create one yourself, this event might be interesting to you.
After quite some time off I went to the Scrumtisch Berlin. The event was incredibly well visited - roughly 50 people filled the upper floor at Cafe Hundertwasser. Today's event was organised such that participants first collected discussion topics, prioritised them together and then discussed the top three items in a timebox of 15 minutes each.
Topics collected were:
Best tricks to make teams self organised (20 votes)
What is QA doing fulltime in a team (13 votes)
Ops and planning in a team (15 votes)
PO disappears and takes backlog and vision with him - what now? (7 votes)
Working with non-Software teams (17 votes)
Pimp up my retrospective (12 votes)
Multiple teams on one project and vice versa (4 votes)
IBM doing Scrum/ massive Scrum (9 votes)
Feature knowledge vs market knowledge - what is more important in a PO if you have to choose due to people constraints (9 votes)
How to convince a team to do more (3 votes)
Why is agile good (10 votes)
Compared to previous meetings quite some topics repeat. About half of the attendees were there for the first time - so it seems there is a common set of questions people usually run into when rolling out Scrum.
Self organising teams
Seems like this is one of the most common questions run into when rolling out Scrum - how to really get to self organising teams. The question can be answered from two positions: What are the pre-requisites it takes to enable teams to become self organising? How to actually transform teams that are used to a command and control structure and are reluctant to transform?
The discussion, led by Andrea Provaglio, CSM trainer, focussed mainly on the first part of the question. Even when limiting the discussion that way, the answer will still depend heavily on the organisation structure, number of management levels and team sizes.
Marion made the topic a bit more concrete: Given the flexible vacation planning approach of Netflix, her question was whether that sort of loose approach could work in a typical German company (after all we have 30 instead of <20 vacation days, we have fixed holidays, she as a C?O of course wants to avoid customers being left alone when the whole team is on vacation.) Andrea re-phrased agility a bit here. His proposal was to not allow people to take their time off just anytime but to give them the freedom to figure out when to take time off. He identified five principles for leadership:
Clearly setting a goal (in that case: Everyone needs to have a vacation plan at a given date.)
Provide the team with all resources, information and with the environment they need to accomplish their task.
Define constraints ("there must be at least one guy in the office on any given working day")
Check back regularly
Make yourself available to answer any questions
The discussion on teams reluctant to adopt self organisation was separated out and deferred. His point was mainly about enabling and encouraging self organisation. Enforcing self organisation however is not possible.
Scrum in non-Software teams
Though phrased very broadly the topic quickly turned into a "how to do Scrum for hardware" discussion. Main problem here is that the further down you go the longer design generally takes. Even just routing lines on one decently sized circuit board can take several weeks. Mainly three possible ways out of the problem emerged from the discussion:
Loosening the definition of done - "potentially shippable" may not mean sellable or even really shippable. I don't think one should go down that slippery path. Only by actually shipping can I get the feedback I need to improve my design. So instead of loosening the definition of done we should start thinking about ways to get faster feedback, reduce risk and introduce shorter iterations.
Another way is to look for ways to reduce iteration length, even though we might not get down to software release cycles, and align releases such that integration can happen earlier.
The third way out could be to realize that maybe Scrum does not quite fit that way of working and use a different process instead that still provides the transparency and fast feedback that is needed (think Kanban).
Overall the most important result of the discussion was that the issues cannot be solved within a 15 minute discussion. After all the solution will depend on what exactly you are working on, who your suppliers are and what your team looks like. Most important is to recognize that there is a problem, to work on removing that impediment and to keep improving your process.
Operations and planning in Scrum
The last question discussed involved operations and Scrum planning: Given a team that does software development but is interrupted frequently with production issues - how should they work in a Scrum environment?
There are multiple facets to that problem: When it comes to deciding whether to deal with something immediately or not it makes sense to weigh the size of the issue against the amount of work it takes to resolve it. "Getting things done" states that the minimum size of an issue to deal with instantly is 2min of work. The issue with that is that GTD assumes that issues flow into an inbox that is dealt with when there is time. In production environments however these issues usually trickle in instantly, interrupting developers over and over again and incurring a huge cost due to task switching.
One way out might be to have an event queue and assign developers (on a rotating basis) to deal with the issues and leave time for others to work in a focused way. Make sure to rotate frequently instead of by sprint - otherwise you run into the problem of making the team unstable thus delivering no stable amount of business value each sprint.
Another obvious way is to account for frequent interruptions and include a buffer for those in your plan. The most important benefit of that approach is to make the cost of this working mode clearly visible to management - leaving the decision how to deal with it up to them.
Other simple fixes include introducing some level of indirection between the actual developer and the customer raising the issue, documenting solutions as well as incoming issues for better visibility, introducing a single point of contact capable of prioritising.
Coming back to vanilla Scrum however there is one interesting observation to be made: The main contract with iterations is for developers to be able to work in a focused way. Instead of having their tasks switched each day they are promised a fixed period of time to solve a given set of stories. In the end a sprint is a compromise between what management may need (change their mind on what is important frequently) and what developers need (working on a set of defined tasks not interrupted by re-prioritisation). If the assumption of focus no longer holds true, Scrum might be the wrong model. If what needs to be done changes daily, Kanban again might be the better option. Still making sure that the cost of task switching is visible is vitally important.
To sum up a very interesting Scrumtisch - in particular as agile methods really seem to become more and more common also in Berlin. Speaking of challenges: As user groups grow sometimes their character changes as well, in particular when built around participation and discussion. It will be interesting to watch Scrumtisch deal with that growth. Maybe at some point splitting the audience and having separate breakout sessions might make sense. Admittedly I'd also love to know more on the background of the audience: How many are actually using Scrum in the trenches vs. teaching Scrum as coaches? How long have they been using Scrum? In what kind of organisation? Maybe a topic for next time.
Last week I spent several days in Chicago mainly to attend a few meetings at the local Nokia/Navteq office. Though the schedule was pretty packed, a few hours remained to explore the then frosty and windy city:
Top three images: Some impressions of the city. Bottom left: Teddy's new friend. Bottom right: Situation at ORD when flying out - fortunately both, the airport as well as the airline (Swiss) have quite some experience with challenging weather conditions so that we could leave without too much delay.
As usual I wondered whether there are any Apache people close by. So before flying in I checked our committers map. As there were a few people in that general area I sent a brief heads-up to the greatly under-advertised, private, non-archived, committers only list email@example.com. In case you've never heard about it: The main use case of that list is to provide a means for committers to arrange for meeting up with fellow Apache people and share travel details.
As a result I received a brief list of things to do in Chicago and got to attend a small but really nice meetup. Having a means to get in touch with locals can make such a difference - thanks for the warm welcome! Hopefully next time I'm there weather is as warm - would love to explore the (at least according to my travel guide book) beautiful nature of the great lakes.
Judging from the way some people become overly careful when discussing agile in general and Scrum in particular in my presence I seem to slowly have built up a reputation for being a strong proponent of these methods. Given the large number of flaky implementations as well as misunderstandings it seems to have become fashionable to blame Scrum for all badness and dismiss it altogether - up to the point where developers are proud of finally having abandoned Scrum completely - so that now they
can work in iterations,
accept new tasks only for upcoming but not for the current iteration,
develop in a test-driven way,
have daily sync meetings,
mark tasks done only when they are delivered to and accepted by the customer,
have regular “how to improve our work” meetings,
estimate tasks in story points and only plan for as much work per iteration as was done in the past iteration
… my personal take on that: Add in regular releases and you end up with a pretty decent scrum/agile implementation, no matter what your preferred name for it may be. Just for clarification: Though very often I write about what I call Scrum, I don’t use that particular method just because it is the latest fashion. It simply is a tool that has served me well on multiple occasions and given me working guidelines when I had no idea at all what software development in a professional setting should look like.
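The last item in the list above (plan only as much work as was done in the past iteration) boils down to simple arithmetic. Here is a minimal sketch of it. The class name, the method name and the rule to stop at the first story that no longer fits are my own assumptions and not something Scrum prescribes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical planner: names and stopping rule are illustrative assumptions.
class IterationPlanner {

    // Walks the backlog in priority order and takes stories until the
    // story points completed in the past iteration are used up.
    static List<Integer> plan(List<Integer> backlogPoints, int pointsDoneLastIteration) {
        List<Integer> planned = new ArrayList<>();
        int remaining = pointsDoneLastIteration;
        for (int points : backlogPoints) {
            if (points > remaining) {
                break; // assumption: never skip ahead of a higher priority story
            }
            planned.add(points);
            remaining -= points;
        }
        return planned;
    }

    public static void main(String[] args) {
        // Backlog estimates in priority order, 10 points done last iteration.
        System.out.println(plan(Arrays.asList(5, 3, 8, 2), 10)); // prints [5, 3]
    }
}
```

Whether to stop at the first story that does not fit or to pull in a smaller one from further down the backlog is a decision for the team and the product owner.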
So where does all that friction with anything Scrum, agile, lean or whatever you call it come from? Recently I came across a blog post on jillesvangurp.com that nicely identified some grave issues with current Scrum adoption. Unfortunately the blog post only lists the failures without going into a deeper analysis of the actual defects causing those failures.
First of all, let's assume as working hypothesis that Scrum in itself does not solve any issues in your organisation but is a great tool to uncover deficiencies. The natural conclusion should be to use it as a tool to discover problems, but search for solutions for these problems elsewhere.
With that hypothesis, let's go through the issues discussed in the post above and assign them to one of three defect categories.
Category one: Issues with the team
Problem: You have a team of all-junior developers, or of all-mediocre developers.
Goal: Turn them into a high performing team.
Solution: Imagine you were not using Scrum at all, what would be the ideal solution? Well the obvious route probably is to re-adjust the team, add several seniors so that you end up with the right mix of people that have experience and share a vision - and juniors that can learn from them and adopt what works.
Comparing that to our hypothesis: Scrum is all for short delivery cycles. You will uncover teams that perform badly much faster than in methods with longer iteration periods. So it should be reasonably simple to figure out teams that have a dysfunctional configuration. Changing that configuration however no longer is dictated by Scrum.
Category two: Bugs introduced during Scrum roll-out
The failures discussed in the blog post include people following Scrum mechanically: Just because your developers are moving post-it notes from left to right does not mean they are doing anything agile. It’s perfectly possible to do waterfall in Scrum. Whether that helps solve any of your issues is a different matter.
Instead of mechanically going through the process what is more important is to understand the reasons and goals of each of the pieces that form Scrum. To make a rather far fetched comparison:
When introducing Scrum without a deep understanding of why each piece is done, what you end up with is people following that process without understanding the meaning of each step. They end up mimicking behaviour without knowing the real world: To some extent seeing only the shadows of good development patterns without understanding the real items producing these shadows.
As a general rule: Understand why there is a retrospective meeting, remember why you need estimations, think about why there are daily stand-ups (instead of weekly meetings, instead of daily sit-togethers, instead of hourly stand-ups). Figure out why there is a product owner, what the role of a scrum master is. Pro-Tip: As soon as you really have understood Scrum, you don’t need a checklist of all meetings to hold for a successful iteration - they will just fit in naturally. If they don’t, you are probably missing an important piece of the puzzle - rather than rely on a pre-fabricated checklist, go bug your trainer or coach with questions to better understand the purpose of all the different bits and pieces.
One very grave bug on roll-out is the general belief that Scrum is equal to a little bit of fairy dust you spread over your teams and all problems will automatically be solved afterwards. It is not - it’s not a silver bullet, it’s not fairy dust, it’s no magic - such things exist in fairy tales but have been seen nowhere in the real world. According to our working hypothesis above however Scrum does something really magical: By shortening delivery cycles it introduces short feedback loops which make it very easy to uncover problems in your organisation way faster than people are able to cover them up and hide them. Finding a solution on the other hand is still up to you.
The last roll-out issue mentioned is that of crappy certification - current certification programs are designed such that the naive organisation may believe that after two days of training their employees will magically turn into super heroes. Guess what - as with any certification, training is just the very first step. Actual understanding comes with experience. Compare that to learning to drive: Just because you managed to get a driver's license does not turn you into a formula one winner. Instead that requires a whole lot of training. Same applies for any Scrum Master or Product Owner.
Category three: Organisation specifics
All other issues with Scrum mentioned in the blog post are either specific to the broken structures in the organisation under investigation or due to general Scrum mis-conceptions. Leaving these aside here.
To sum up: Scrum to me is nothing but a term to summarize good, proven development practices. I don’t care how you name them - however having any one name that is well defined makes it way easier to communicate. Scrum is not a silver bullet - it does not solve all the issues your organisation may have. However it is a very effective debugger uncovering the issues employees and managers are trying to cover up. If you know all those issues very precisely already or you are certain that you don't have any, chances are you don't need Scrum.
I've heard of several people who are not quite sure yet whether they should visit Berlin Buzzwords or not - in particular when having to travel far and cross 9 time zones to attend. My general recommendation is to plan to spend some more days in Europe. The conference is conveniently scheduled on Monday and Tuesday which gives you one weekend before to explore the city and the whole week afterwards to go and see more either in the city or around.
In case you are wondering whether the city is a worthy destination when travelling with children - below is a list of things to do and places to go I sent to someone recently. Hope it helps with your decision as well. In general the city is pretty green, there are several locations specially amenable to a visit with kids - so treat the list below as what it is: An incomplete listing of some of the most obvious locations that might be of interest collected by someone who knows a few parents and their children. Also in case you speak German make sure to check out one of the many guide books for Berlin with children available in local book stores - Dussmann and Hugendubel generally have the largest selection though Chatwins is my preferred one for anything about travelling.
Dawid Weiss: Badeschiff - a pool-on-the-river thing. It's not something you get in any ordinary city :)
Steve Loughran: My son's favourite part of a trip to Berlin (age 9) was actually the Bauspielplatz: Smaller kids get a play area where they can use the sand + water to build streams, dam them and generally make a mess, while the 8+ get a playground where they actually help build it under adult supervision. They also run a good open air waffle/pancake/coffee shop. They're open in the afternoons.
Hope to see you in Berlin in June. If you need more information or recommendations don't hesitate to ask.
The countdown started several weeks ago - finally in the past days the date for Berlin Buzzwords was announced, the call for submissions published. It's exciting to see that the first talk is in already. Looking forward to yours.
Compared to last year there are two changes:
Submissions are no longer evaluated by Jan, Simon and myself only. Due to the large number of talks submitted last year we reached out for help to be able to split the task of reviewing talks.
Also the conference itself grew quite a bit in the past two years. As a result it now takes several full time positions to handle not only ticketing, hosting and software development, sponsorships, venue management, travel support, but also external communication and marketing. The team of newthinking grew quite a bit and is helping substantially with tasks that before were handled by Jan, Simon and myself exclusively to keep some of our time reserved for the fun part of schedule curation. Please make sure to include firstname.lastname@example.org if you have questions that need a quick answer.
We are looking forward to a successful community conference on all things scalable - be it search, NoSQL or data analytics. Don't be afraid to submit highly technical talks - Berlin Buzzwords always has been a place for developers to discuss new technologies, algorithms and implementations.
If your community needs more than just a day to meet - please do talk to us. We will be providing room for meetups on Wednesday after the conference. Those are handed out on a first come, first served basis.
If you are a local Berlin company and want to get Berlin Buzzwords into your offices, please talk to us - we are more than happy to get you in touch with one of the meetup organisers.
If you would like to co-locate trainings with Berlin Buzzwords - we are happy to co-promote your event. Talk to us to be included in our official schedule. In case you need any help organising your training, newthinking will be more than happy to provide their services for your event.
Looking forward to June: It's amazing how large that event grew in the past two years - and almost scary to return online after a flu and see how things unfolded magically.