FOSDEM - Django

2011-02-16 20:17

The languages/cloud computing track on Sunday started with the good, the bad and the ugly of Django's architecture. Without much ado the speaker gave a high-level overview of the general package layout of Django - unfortunately without going into much detail on the architecture itself.

What he loves about Django are the model layer abstractions, which are more than just an ORM - both relational and non-relational databases can be supported easily. Abstractions in Django are organised by the task they solve - there are multiple implementations available for caching, mailing, session handling etc. There is great geo support with options for defining geo objects and for querying a single point on a map for all geo objects overlapping it. The community is test driven, so Django features awesome debugging and testing tools. To guard against cross-site request forgery (CSRF) Django comes with built-in protection mechanisms.
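
To make the "abstractions by task" idea a bit more concrete, here is a minimal sketch in plain Python. It is not Django's actual code: the names CacheBackend, InMemoryCache, NullCache, BACKENDS and make_cache are made up for illustration. The point is only that calling code talks to a task-level interface while the implementation behind it can be swapped.

```python
from abc import ABC, abstractmethod


class CacheBackend(ABC):
    """Hypothetical task-level interface: what a cache does, not how."""

    @abstractmethod
    def get(self, key, default=None):
        ...

    @abstractmethod
    def set(self, key, value):
        ...


class InMemoryCache(CacheBackend):
    """Keeps values in a dict inside the current process."""

    def __init__(self):
        self._store = {}

    def get(self, key, default=None):
        return self._store.get(key, default)

    def set(self, key, value):
        self._store[key] = value


class NullCache(CacheBackend):
    """Accepts writes but never stores anything."""

    def get(self, key, default=None):
        return default

    def set(self, key, value):
        pass


# Hypothetical registry: the implementation is picked by name, for example
# from a settings file, without touching the calling code.
BACKENDS = {"memory": InMemoryCache, "null": NullCache}


def make_cache(name):
    return BACKENDS[name]()


def cached_square(cache, n):
    """Calling code depends only on the CacheBackend interface."""
    key = f"square:{n}"
    value = cache.get(key)
    if value is None:
        value = n * n
        cache.set(key, value)
    return value
```

Swapping make_cache("memory") for make_cache("null") changes whether results are kept around, but cached_square itself stays untouched.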

There is multi-database support for building applications. Because the core implementation is small, features can be turned on and off as needed. In addition the framework comes with great documentation: No feature addition is accepted unless it comes with decent documentation - which fits nicely with the common perception that anything that is untested and undocumented does not exist.

The bad things about Django according to the speaker? Well, the old CSRF protection implementation that might lead to token leakage. Schema changes and migrations are currently really hard to handle, though there is South to ease at least some of the migration pain. The templating implementation could use some improvement as well - being designed to make inclusion of logic in the templates hard, it makes some use cases just too clumsy to implement.

As for the ugly things: There is quite a bit of magic at work which generally leads to harder tracing of applications - that is about to get better. Too many parts of Django rely on unwieldy regular expressions. Anything that spans more than 4 lines on a screen probably is to be considered unmanageable and unchangeable. Authentication cannot really be customised - the information that is stored per user is hard coded and fixed.

Over time what was learned: Refactoring cannot be avoided as requirements change. However, being consistent in what you do makes it so much easier for users to pick up the framework. What helps with creating a great open source project: People that have the time to invest - never underestimate the time needed to really go from prototype to production ready.

FOSDEM - Saturday

2011-02-15 20:17

Day one at FOSDEM started with a very interesting and timely keynote by Eben Moglen: Starting with the example of Egypt he argued for decentralized, distributed communication systems that are harder to take over. In terms of tooling we are already almost there. Most use cases like micro blogging, social networking and real time communications can already be implemented in a distributed, fail-safe way. So instead of going for convenience it is time to think about digital independence from very few central providers.

I spent most of the morning in the data dev room. The schedule was packed with interesting presentations ranging from introductory overview talks on Hadoop to more in-depth treatment of the machine learning framework Apache Mahout. With an analysis of the Wikileaks cables the schedule also included case studies on what use cases can be implemented by thorough data analysis. The afternoon featured presentations on the background to more data analytics for better usability at Wikimedia as well as talks on building search applications.

In the lightning talks room a wide variety of projects was presented - in only ten minutes Pieter Hintjens explained the gist of using 0MQ for messaging. That talk included "Hintjens' law of concurrency": e = m * c^2, where e is the effort needed to implement and maintain, m is mass - that is the amount of code written - and c is complexity.
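
Taken literally, the formula can be played with in a few lines of Python. The numbers below are made up and carry no real units. They only illustrate that, under this law, complexity hurts much more than sheer code size.

```python
def effort(mass, complexity):
    """Hintjens' law as quoted in the talk: e = m * c^2."""
    return mass * complexity ** 2


# Illustrative numbers only: doubling the amount of code doubles the effort,
# while doubling the complexity quadruples it.
baseline = effort(mass=1000, complexity=2)
more_code = effort(mass=2000, complexity=2)
more_complex = effort(mass=1000, complexity=4)

print(more_code / baseline)     # ratio when only the code size doubles
print(more_complex / baseline)  # ratio when only the complexity doubles
```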

For me the day ended with a very interesting presentation by Matthias Kirschner/FSFE on one of their campaigns: pdfreaders has the very narrow and well scoped goal of getting links to unfree software off of governmental web pages. Using a really intuitive example they were able to convince officials to link to their vendor neutral list of PDF readers: "Just imagine a road in your city. At this road drivers will find a sign that tells them the road is well suited to be used by VW cars. Those cars can be obtained for test drive at the following address. Your government." As unthinkable as such a sign may be, that same text is included in nearly all governmental web pages linking to the Acrobat Reader.

What made pdfreaders successful is the combined effort of volunteers, its very narrow and clear scope, and its scalability by nature: People were asked to submit "broken" web pages to a bug tracker, campaign participants would then go and send out paper letters to these institutions and mark the bugs fixed as soon as the links were changed. Letters were pre-written and well prepared. So all that was needed was money for toner, paper and stamps.

One final cute example of how that worked out can be seen at


2011-01-23 15:46
It's already sort of a nice little tradition for me to spend the first weekend in February in Brussels for FOSDEM. This year I am particularly happy that there will be a Data Analytics Dev Room at FOSDEM. A huge Thanks to @ogrisel and @nmaillot who have done most of the heavy lifting of getting the schedule in place.

I'm going to FOSDEM, the Free and Open Source Software Developers' European Meeting

Looking forward to an interesting Cloud Track, to meeting Pieter Hintjens who is going to give a talk on 0MQ, the DevOps presentation and lots of very interesting DevRooms. Looks like it's again going to be tough to decide which presentations to go to at any one time.

CfP: Data Analysis Dev Room at Fosdem 2011

2010-10-27 06:56
Call for Presentations: Data Analysis Dev Room, FOSDEM
5 February 2011
1pm to 7pm
Brussels, Belgium

This is to announce the Data Analysis DevRoom co-located with FOSDEM: the first meetup on analysing and learning from data, taking place in Brussels, Belgium.

Important Dates (all dates in GMT +2):

  • Submission deadline: 2010-12-17
  • Notification of accepted speakers: 2010-12-20
  • Publication of final schedule: 2011-01-10
  • Meetup: 2011-02-05

Data analysis is an increasingly popular topic in the hacker community. This trend is illustrated by declarations such as:

"I keep saying the sexy job in the next ten years will be statisticians. People think I’m joking, but who would’ve guessed that computer engineers would’ve been the sexy job of the 1990s? The ability to take data—to be able to understand it, to process it, to extract value from it, to visualize it, to communicate it—that’s going to be a hugely important skill in the next decades, not only at the professional level but even at the educational level for elementary school kids, for high school kids, for college kids. Because now we really do have essentially free and ubiquitous data. So the complimentary scarce factor is the ability to understand that data and extract value from it."

-- Hal Varian, Google’s chief economist

The event will comprise presentations on scalable data processing. We invite you to submit talks on the topics:

  • Information retrieval / Search
  • Large Scale data processing
  • Machine Learning
  • Text Mining
  • Computer vision
  • Linked Open Data

Sample list of related open source / data projects (not exhaustive):

  • (including MapReduce, Pig, Hive, ...)
  • &
  • &

Closely related topics not explicitly listed above are welcome.

High quality, technical submissions are called for, ranging from principles to practice.

We are looking for presentations on the implementation of the systems themselves, real world applications and case studies.

Submissions should be based on free software solutions.

Proposals should be submitted at no later than 2010-12-17. Acceptance notifications will be sent out on 2010-12-20.

Please include your name, bio and email, the title of the talk and a brief abstract in English. Please indicate the level of experience with the topic your audience should have (e.g. whether your talk will be suitable for newbies or is targeted at experienced users).

The presentation format is short: 30 minutes including questions. We will be enforcing the schedule rigorously.

If you are interested in sponsoring the event (e.g. we would be happy to provide videos after the event, free drinks for attendees as well as an after-show party), please contact us. Note: "DataDevRoom sponsors" will not be endorsed as "FOSDEM sponsors" and hence not listed in the sponsors section on the website.

Follow @DataDevRoom on twitter for updates. News on the conference will be published on our website at

Program Chairs:

  • Olivier Grisel - @ogrisel
  • Isabel Drost - @MaineC
  • Nicolas Maillot - @nmaillot

Please re-distribute this CFP to people who might be interested.

FOSDEM - video recordings online

2010-02-14 20:32
As published in the FOSDEM blog the video recordings are available online - at least for the main track and the lightning talks. Happy video watching!

FOSDEM 2010 - part 3

2010-02-10 21:02
Sunday started in Janson with Adrian Bowyer's talk on RepRap machines, that is devices that can be used for manufacturing and are able to replicate themselves. After that I went over to the Mono dev room to listen to Miguel de Icaza on Mono Edge. A great talk on the history of Mono, the way the community interacts with Microsoft, the C# language itself and special features only available in Mono.

After this talk we went over to Janson for Andrew Tanenbaum's talk on Minix. We knew quite a bit of the talk already from FrOSCon two years ago, however Andrew is an awesome speaker, so it's always fun to catch up on the news on Minix.

The scalability slot started with an introduction to Hadoop by myself and continued with a talk on the Facebook infrastructure by David Recordon. According to feedback I got after the talk, laughing with Thilo helped quite a bit to keep me calm. Before the talk I received one very good recommendation from one of the audio guys: Imagine you are giving the talk to one of your best friends - and forget about the microphone. Though I had way more slides than minutes to talk, we had enough time for the Q&A session after the talk.

I started the talk by learning more about the audience - this time not by handing the microphone to those listening (the room was too large), but by simply asking. Have you heard about Hadoop? - half of the audience. Are you Hadoop users? - one quarter maybe. How large are your clusters? - 10 to 100 nodes mostly. Have you heard of ZooKeeper? - some, Hive - some more, Pig - a few, Lucene - a lot, Solr - a little less, Mahout - maybe 5, Mahout users: 1.

Turns out the Mahout user in the audience was Olivier: It's so nice to meet people you know are active on the mailing lists for real and have a chat with them. Hope to see you more often on the lists - and meet you face to face again.

I used the chance to announce the Berlin Buzzwords 2010, a two day event on search and scalability buzzwords like cloud computing, Hadoop, Lucene, NoSQL and more. It takes place on June 7th and 8th in the center of Berlin. Follow this blog for further information. Judging from the input I got after the announcement there is quite some need for such a conference in Europe.

The slides of my talk are soon to be available online.

After my talk I could stay in Janson: A talk on the Facebook infrastructure (not only the Hadoop side of things) followed. After that I met Lars George at the NoSQL dev room - unfortunately I did not manage to actually talk to Steven Noels, who organised the room.

The afternoon was reserved for Greg Kroah-Hartman on how to "Write and submit your first Linux Kernel Patch" - my personal conclusion: git is really awesome. I really, really need to find a few spare minutes to learn how to effectively use it.

In the evening we met with Pieter Hintjens for dinner - and to finalize an awesome weekend in Brussels and a great 10th anniversary FOSDEM. A huge Thank You to all volunteers and organisers of FOSDEM - you did a great job this year putting together an awesome schedule, you did a fantastic job making the now pretty huge event (with 306 talks and about 5000 hackers attending) run smoothly. Even the wireless was working from minute one. See you again at FOSDEM 2011.

FOSDEM 2010 - part 2

2010-02-09 21:00
The event itself featured 306 talks - so pretty hard to choose what to watch on two days. This time, not only the main tracks were awesome, but also several dev rooms featured very interesting talks by well known FOSS developers.

Saturday started with a FOSDEM birthday dance done by all attendees. The first keynote speaker Brooks Davis explained his experiences promoting open source methods at a large company. After that Richard Clayton gave an amazing talk on the evil on the internet. He explained not only how phishing works on a technical level but also the economics behind these attacks, including how the money flows from victims to attackers.

In the afternoon Bernard Li gave an introduction to the cluster monitoring tool Ganglia. Directly after that Lindsay Holmwood gave an overview of the monitoring and notification tools flapjack and cucumber-nagios.

The evening was filled with the speakers dinner. Thanks to the organisers for providing that. We had a really nice evening together with some of the organisers, Andrew Tanenbaum and Elena Reshetova at our table.

FOSDEM visitor seems to like my baby

2010-02-09 08:19

Another picture that was taken before the first session early in the morning:

FOSDEM 2010 - part 1

2010-02-08 21:00
Four years ago I was working in Saarbrücken. From there it is a very short ride over to FOSDEM (little more than 300km). So I decided - hey, why not stay there for a weekend. I found a very nice Brussels bed and breakfast hotel called Rovignon - featuring not only comfortable rooms at reasonable prices but also cats in the house.

Back then, I barely knew anyone at the conference. However the lineup of speakers including St Peter from XMPP and Georg Greve from FSFE was impressive.

As a result it became a loved tradition of Thilo and myself to drive over to Brussels, attend FOSDEM and watch great talks. Over time there were more and more familiar faces, e.g. at the FSFE booth, among the Debian people...

Last weekend I had an awesome time in Brussels at FOSDEM for the fourth time in a row. I am honoured to have been invited by the FOSDEM organisers for a main track talk on Hadoop in the scalability slot (in Janson...).

We arrived on Friday afternoon, however being awfully tired we unfortunately could not join the Friday evening beer event (though, as I am not drinking beer, I would probably have missed quite a bit of the fun).

Apache Hadoop at FOSDEM 2010

2009-12-11 09:19
Though the official schedule is not yet online: I will be giving an introductory talk about Apache Hadoop at next year's FOSDEM (Free and Open Source Software Developers' European Meeting) in Brussels. This will be the 10th birthday of the event - looking forward to a fun event, meeting other free and open source software developers from all over Europe.

If you are an Apache Hadoop developer and would like me to include some particular topic in the talk - please feel free to contact me. If you are an Apache Hadoop user and would like to learn more about the project, please come to the talk and ask questions. If you are an Apache Hadoop newbie - feel free to join us.

In addition there will be a NoSQL Dev Room at FOSDEM as well. The call for presentations is up already. So if you are doing fun stuff with CouchDB, HBase and friends or are a developer of these projects - submit a talk and join us in early February in Brussels.