FrOSCon - Robust Linux embedded platform

2012-09-04 20:05
The second talk I went to at FrOSCon was given by Thilo Fromm on building a robust embedded Linux platform. For more information on the underlying project see also project HidaV on GitHub. Slides of the talk "Building a robust Linux embedded platform" are already online.

Inspired by a presentation on safe upgrade procedures in embedded devices by Arnout Vandecappelle in the Embedded Dev Room at FOSDEM earlier this year, Thilo extended the scope of the presentation a bit to cover safe kernel upgrades as well as package updates in embedded systems.

The main goal of the design he presented was to allow for developing embedded systems that are robust - both in normal operation and when upgrading to a new firmware version or a set of new packages. The design includes support for upgrading and rolling back to a known working state in an atomic way. Having systems deployed somewhere in the wild to power a wind turbine, inside buses and trains or even within satellites pretty much forbids relying on an admin to press the "reset button".



Original image xkcd.com/705

The reason for putting that much energy into making these systems robust also lies in the ways they are deployed. Failure vectors include not only the usual software bugs, power failures or configuration incompatibilities: transmission errors, storage corruption, temperature and humidity all add their share to increase the probability of failure.

Achieving these goals by building a custom system isn't trivial. Building a platform that is versatile enough to be used by others building embedded systems adds to the challenge: suddenly easy-to-use build and debug tools, support for software life-cycle management and extensibility are no longer nice-to-have features.

Thilo presented two main points to address these requirements. The first is to avoid trying to cater to every use case: setting requirements for the platform in terms of performance, un-brickability (see also urban dictionary, third entry as of this writing), dual-boot support or even the internal storage technology used makes designing the platform a lot less painful.

The second step is to harden the platform itself. Here that means that upgrading the system (both firmware and packages) is atomic, can be rolled back atomically and thus no longer carries the danger of taking the device down for longer than intended: a device that no longer performs its task in the embedded world is usually considered broken and shipped back to the producer. As a result upgrading may be necessary, but should never render the device useless.

One way to deal with that is to store boot configurations in a round-robin manner - each configuration carries a "was booted" flag (set by the bootloader on boot) and an "is healthy" flag (set by the system either after a certain time of stable operation or after running self tests). This way, at each boot it is clear what the last healthy configuration was.
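
To make the selection logic concrete, here is a minimal sketch in Clojure (the language used elsewhere on this blog); slot layout, kernel names and flag keys are made up for illustration and not taken from HidaV:

;; Hypothetical round-robin boot configurations and their flags.
(def boot-configs
  [{:slot 0 :kernel "kernel-v1" :booted true  :healthy true}
   {:slot 1 :kernel "kernel-v2" :booted true  :healthy false}
   {:slot 2 :kernel "kernel-v3" :booted false :healthy false}])

;; Fall back to the newest slot that was booted successfully
;; and later marked healthy.
(defn last-healthy [configs]
  (->> configs
       (filter #(and (:booted %) (:healthy %)))
       (sort-by :slot)
       last))

(last-healthy boot-configs)
;=> {:slot 0, :kernel "kernel-v1", :booted true, :healthy true}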

To do the same with your favourite package management system is slightly more complicated: imagine running something like apt-get upgrade with the option to switch back to the previous state in an atomic way if anything goes wrong. One option presented to deal with that is to work with transparent overlay filesystems that allow for a read-only base layer - and a "transparent" r/w layer on top. If a file does not exist in the transparent layer, the filesystem will return the original r/o version. If it does exist, it will return the version in the transparent overlay. In addition there is an option to mark files as deleted in the overlay.
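
These lookup semantics can be modelled in a few lines - the following is just a toy model of the behaviour described above, with plain maps standing in for filesystem layers, not the actual overlay implementation:

;; Toy model of a transparent overlay: the overlay wins, a
;; ::deleted marker hides the base file, otherwise the read-only
;; base version is returned.
(defn lookup [base overlay path]
  (let [entry (find overlay path)]
    (cond
      (nil? entry)              (get base path)
      (= ::deleted (val entry)) nil
      :else                     (val entry))))

(lookup {"/etc/app.conf" "v1"} {"/etc/app.conf" "v2"} "/etc/app.conf")      ;=> "v2"
(lookup {"/etc/app.conf" "v1"} {"/etc/app.conf" ::deleted} "/etc/app.conf") ;=> nil
(lookup {"/etc/app.conf" "v1"} {} "/etc/app.conf")                          ;=> "v1"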

With that, upgrading becomes as easy as installing the upgraded versions into some directory in your filesystem and mounting said directory as a transparent overlay. Roll-backs as well as snapshots then become easy to do.
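
Continuing the toy model from above, an upgrade then simply pushes a new overlay layer onto the stack and a roll-back drops it again - purely illustrative, not HidaV's actual mechanism:

;; An upgrade pushes a new overlay onto the layer stack; a
;; roll-back pops it, restoring the previous state in one step.
(defn upgrade  [layers new-overlay] (conj layers new-overlay))
(defn rollback [layers]             (pop layers))

(def base-layers [{"/usr/bin/app" "app-1.0"}])
(def upgraded    (upgrade base-layers {"/usr/bin/app" "app-1.1"}))

(last upgraded)            ;=> {"/usr/bin/app" "app-1.1"}
(last (rollback upgraded)) ;=> {"/usr/bin/app" "app-1.0"}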

The third ingredient to achieving a re-usable platform presented was to use OpenEmbedded. With its easy-to-extend layer-based concept, support for fairly recent software versions, versioning and dependency modelling, and BSP layers officially supported by hardware manufacturers, building a platform on top of OpenEmbedded is one option to make it easily re-usable by others.

If you want to know more about the concepts described, join the HidaV platform project - many of the concepts described are already implemented, or soon will be.

Open Source Meetup Berlin

2012-08-19 20:22
This evening the (to my knowledge first) Berlin Open Source Meetup took place at the Prater (Bier-)garten in Berlin. There are lots of project-specific meetings, a monthly Free Software meeting and quite some events on project management. However, this was one of the rare occasions where you get Linux kernel hackers, Wikidata project members, Debian developers, security people, mobile developers as well as people writing about free software or making movies related to the topic around one table.

Despite the heat (over 30 degrees Celsius in Berlin today) over 30 people gathered for some food, cold beer, some drinks and lots of interesting discussions. It would be great to see another edition of this kind of event.

FrOSCon 2012

2012-07-31 20:25
On August 25th/26th the Free and Open Source Conference (FrOSCon) will again kick off in Sankt Augustin, Germany.



The event is completely community organised, hosted by the FH Sankt Augustin. It covers a broad range of free software topics like Arduino microcontrollers, git goodies, politics, strace, OpenNebula, Wireshark and others.

Three highlights that are on my schedule:



Looking forward to interesting talks and discussions at FrOSCon.

O'Reilly Strata coming to London

2012-07-30 20:05
O'Reilly Strata is coming to London. The first edition of Strata back in 2011 brought Big Data developers, designers, scientists and decision makers together to discuss all things scalable. This year in October the conference comes to Europe: O'Reilly Strata EU will take place in London.

Date: October 1st - 2nd 2012

Venue: Hilton London Metropole, 225 Edgware Road, London W2 1JU, UK

The schedule covers a wide range of use cases and war stories that involve big data and data-driven development. Both days are packed with deep technical as well as strategy-level presentations that can help drive your projects.

Having been on the program committee I got a glimpse of the diversity and high quality of the submissions received. Choosing the best wasn't easy, but there's only so much content you can squeeze into two conference days.

Looking forward to London.

PS: If you have any interesting war stories and anti-patterns involving big data to share, consider adding your input online.

Recsys meetup Berlin

2012-07-25 01:31
Planning a meetup in Berlin: 8 people register, a table for 14 people is booked, 16+ people arrive - all of that even though no pre-defined topic or talk was announced. It seems like building recommender systems is a hot topic in Berlin at the moment.

Thanks to Zeno Gantner from MyMediaLite for organising the event - looking forward to the next edition.

Need your input: Failing big data projects - experiences from the wild

2012-07-18 20:11
A few weeks ago my talk on "How to fail your big data project quick and rapidly" was accepted at the O'Reilly Strata conference in London. The basic intention of this talk is to share some anti-patterns, embarrassing failure modes and "please don't do this at home" kind of advice with those entering the buzzwordy space of big data.

Inspired by Thomas Sundberg's presentation on "failing software projects", the talk will be split into five chapters and highlight the top two failure factors for each.

I only have so much knowledge of what can go wrong when dealing with big data. In addition, no one likes talking about what did not work in their environment. So I'd like to invite you to share your war stories in a public etherpad - either anonymously or including your name so I can give credit. Some ideas are already sketched out - feel free to extend, adjust, re-rank or change them.

Looking forward to your stories.

Apache Sling and Jackrabbit event coming to Berlin

2012-07-12 20:59
Interested in Apache Sling and/or Apache Jackrabbit? Then you might be interested in hearing that from September 26th to 28th there will be an event in town on these two topics - mainly organised by Adobe, but labelled as a community event, meaning that a number of active community members will be attending the conference: adaptTo().

From their website:

In late September 2012 Berlin will become the global heart beat for developers working on the Adobe CQ technical stack. pro!vision and Adobe are working jointly to set up a pure technical event for developers that will be focused on Apache Sling, Apache Jackrabbit, Apache Felix and more specifically on Adobe CQ: adaptTo(), Berlin. September 26-28 2012.



Teddy in Poznan

2012-05-27 20:03
Some images taken in Poznan after GeeCon - a big thanks to Dawid for giving advice on where to go for sightseeing, exhibitions and going out.

The tour started close to the river Warta - it being a sunny day, it seemed like a perfect fit to just walk through the city, starting along the river and heading towards the cathedral:


   


After that, the Poznan Citadel was a great place to spend lunchtime - sitting somewhere green and shady:




The afternoon was dedicated to discovering the city center, several local churches and the national gallery:


    

GeeCon 2012 - part 1

2012-05-20 11:02
Devoxx, Java Posse, QCon, GOTO Con, an uncountable number of local Java User Groups – aren't there enough conferences on just Java, that weird programming language that “makes developers stupid by letting them type too much boiler plate” (Keith Braithwaite)? I spent Thursday and Friday last week in Poznan at a conference called GeeCon – its main focus is on anything Java, including TDD, Agile and testability. It's all community organised – switching between Poznan and Krakow on a yearly basis, backed by the two corresponding Java User Groups, with a clear focus on good speakers and interesting content. Really well done; I only wish they could have fit more talks into each of these days: five tracks in parallel left one with just around four regular talks plus keynotes per day. That does make for a very humane start and end time – but it feels like there's so much going on in parallel that you most likely miss some of the particularly interesting content. Looking forward to the videos!

One note: if you are ever invited as a speaker to GeeCon, do accept! It's really well organised, with an incredibly friendly atmosphere and a really tasty speaker's dinner. One thing that caught me by surprise this morning: my room was fully paid for even though I stayed longer and had offered to cover the additional nights myself - thanks guys, you rock!

Watch this space for more details on the talks in the coming days.

Clojure Berlin - March 2012

2012-03-07 22:37
In today's Clojure meetup Stefan Hübner gave an introduction to Cascalog - a Clojure library based on Cascading for large scale data processing on Apache Hadoop without hassle.

After a brief overview of what he uses the tool for - log processing at his day job for http://maps.nokia.com - Stefan went into some more detail on why he chose Cascalog over other projects that provide abstraction layers on top of Hadoop's plain map/reduce library: both Pig and Hive provide easy-to-learn SQL-like languages for quickly writing analysis jobs. The major disadvantage shows when domain-specific operators are needed - in particular when these turn out to be needed just once: developers end up switching back and forth between e.g. Pig Latin and Java code to accomplish their analysis. These kinds of one-off analysis tasks are exactly where Cascalog shines: there is no need to leave the Clojure context - you program your map/reduce jobs on a very high level (Cascalog itself is quite similar to Datalog in syntax, which makes it easy to read and lets you forget about all the nitty-gritty details of writing map/reduce jobs).

Writing a join that computes persons' age and gender from a trivial data model is as simple as typing:


;; Persons' age and gender, assuming age and gender generators
(?<- (stdout)
     [?person ?age ?gender]
     (age ?person ?age)
     (gender ?person ?gender))


Multiple sorts of input generators are implemented already: reading plain text files and using files in HDFS as input are both common use cases. Of course it is also possible to provide your own implementation to integrate any other type of data input beyond what is available already.
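
As a minimal sketch of what such a generator-based query might look like - assuming Cascalog's hfs-textline tap and a tab-separated input file; the file layout, field names and the split-fields helper are made up here:

;; Read tab-separated person records from HDFS and emit
;; ?person / ?age pairs that other queries can consume.
(use 'cascalog.api)
(require '[clojure.string :as string])

(defmapop split-fields [line]
  (string/split line #"\t"))

(defn persons [path]
  (<- [?person ?age]
      ((hfs-textline path) ?line)
      (split-fields ?line :> ?person ?age)))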

In my view Cascalog combines the speed of development that was brought by Pig and Hive with the flexibility of being able to seamlessly switch to a powerful programming language for anything custom. If you have been using or even contributing to either Cascalog or Cascading: I'd love to see your submission to Berlin Buzzwords - remember, the submission deadline is this week on Sunday *MEZ*.