Video: Max Heimel on sequence tagging w/ Apache Mahout

2010-10-26 19:58
Some time ago Max Heimel from TU Berlin gave a presentation on the new HMM support in the Mahout 0.4 release at the Apache Hadoop Get Together in Berlin:

Mahout Max Heimel from Isabel Drost on Vimeo.



Thanks to JTeam for sponsoring video taping, thanks to newthinking for providing the location and thanks to Martin Schmidt from newthinking for producing the video.

Video: Sebastian Schelter on Recommendation w/ Apache Mahout

2010-10-21 13:55
A few weeks ago we had the autumn edition of the Apache Hadoop Get Together in newthinking store in Berlin. I am glad to announce the first video online:

Mahout Sebastian Schelter from Isabel Drost on Vimeo.



Thanks to JTeam for sponsoring video taping, thanks to newthinking for providing the location and thanks to Martin Schmidt from newthinking for producing the video.

Stay tuned for the second video to be published next week.

Slides available

2010-10-08 09:19
Yesterday evening the Autumn Hadoop Get Together took place in Berlin. This time the meetup focussed mainly on the latest developments at Apache Mahout. It was kindly sponsored by JTeam, who paid for video taping of the presentations as well as for free drinks. Thanks a lot for that.

After the meetup the group went over to Cafe Aufsturz for drinks, food and lots of interesting discussions - I left them there as I still have to get rid of a persistent cold. Hope you guys had fun!

The two speakers were so kind as to provide their slides:



Videos of the talks will be posted very soon - so stay tuned.

Reminder: Apache Hadoop Get Together Berlin Today

2010-10-07 09:52
Just a brief reminder: The Apache Hadoop Get Together Berlin takes place today at newthinking store, Tucholskystr. 48, at 5 p.m.

The meeting features two talks on Apache Mahout: Committer Sebastian Schelter will explain how to scale recommender systems with Mahout. Contributor Max Heimel is going to give an introduction to the sequence labeling facilities available in Mahout.

As usual the group will move over to Cafe Aufsturz after the meetup is over.

A big Thanks goes to JTeam for sponsoring video taping as well as to newthinking store for providing the venue for free.

A Get Together Checklist

2010-10-06 19:38
Still on the list of potentially interesting books: The Checklist Manifesto - explaining why checklists can still be valuable - especially for complex problems and tasks.

Though the task is not very complex, I chose to come up with a checklist for running a Hadoop Get Together in Berlin as an exercise. I'm trying to stick with the advice provided by the Checklist for Checklists.

Parties involved


  • Find two to three speakers two months in advance.
  • Find a sponsor for the videos.

Gathering information


  • Double check time and date with all speakers and newthinking store.
  • Get name, title, abstract from the speakers.
  • Get logo and exact conditions from sponsor.

Spreading the word


  • Put together an announcement text including thanks to video and venue sponsors.
  • Publish the event on Upcoming.
  • Publish the event on Xing.
  • Augment the announcement text by the Xing event and Upcoming links.
  • Send a newsletter to the Meetup Xing group.
  • Send the text to the Get Together mailing list, and if appropriate to the Hadoop, HBase, katta, Lucene, Solr and Mahout mailing lists.
  • On event day send a reminder to the Get Together mailing list.
  • Create meetup intro slides including thanks for the sponsors, schedule, announcements of future events.

During the meetup


  • Mention newthinking bar during introduction.
  • Self-introduction of all participants.
  • Get mail addresses of future mailing list subscribers.
  • Keep presentations at 30 to 40 minutes.
  • Get speakers' slides.

After the event


  • Publish talks' slides.
  • Publish links to videos.


The more meetups you have run, the larger the chance of the main organiser getting sick on the day the meetup takes place. To avoid having to re-schedule the event, make sure there are people who are capable and willing to take over moderation.

Apache Hadoop Get Together Berlin - October 2010

2010-09-15 07:31
This is to announce the next Apache Hadoop Get Together sponsored by JTeam that will take place in newthinking store in Berlin.




When:
October 7th, 5 p.m.
Where:
Newthinking store Berlin, Tucholskystr. 48




As always there will be slots of 30 minutes each for talks on your Hadoop topic. After each talk there will be a lot of time to discuss. You can order drinks directly at the bar in the newthinking store. If you like, you can order pizza. We will go to Cafe Aufsturz after the event for some beer and something to eat.

Talks scheduled so far:

Max Heimel: "Hidden Markov Models for Apache Mahout"

Abstract: In this talk I will present and discuss an implementation of a powerful statistical tool called Hidden Markov Models for the Apache Mahout project. Hidden Markov Models allow one to mathematically deduce the structure of an underlying - and unobservable - process based on the structure of the produced data. Hidden Markov Models are thus frequently applied in pattern recognition to deduce structures that are not directly observable. Examples of applications of Hidden Markov Models include the recognition of syllables in speech recordings, handwritten letter recognition and part-of-speech tagging.

Sebastian Schelter: "Distributed Itembased Collaborative Filtering with Apache Mahout"

Abstract: Recommendation Mining helps users find items they like. A very popular way to implement this is by using Collaborative Filtering. This talk will give an introduction to an approach called Itembased Collaborative Filtering and explain Mahout's Map/Reduce based implementation of it.





Please do indicate on Upcoming if you are coming so we can more safely plan capacities.

JTeam is looking for Java developers and search enthusiasts. Check out their jobs page for more info!

As always a big Thank You goes to newthinking store for providing the venue for free for our event.

Looking forward to seeing you in Berlin as well,
Isabel

My highly subjective Berlin Buzzwords recap

2010-06-13 18:32
Last November I innocently asked Grant what it would take to make him give a talk in Berlin. The only requirement he told me was that I'd have to pay for his flight. About eight months later we had Berlin Buzzwords - a conference all about scalability, data storage and search. With Simon Willnauer, Uwe Schindler, Michael Busch, Robert Muir, Grant Ingersoll, Andrzej Bialecki and many others we had quite a few Lucene people in town.







From the NoSQL community, Peter Neubauer, Rusty Klophaus, Jan Lehnardt, Mathias Meyer, Eric Evans and many others made sure people got their fair share of NoSQL knowledge. With Aaron Kimball, Jay Booth, Doug Judd and Steve Loughran we had several Hadoop and related people at the conference.

The conference also featured two talks on Apache Mahout: An overview from Frank Scholten as well as a more in-depth talk by Sean Owen. It's great to see the project grow - not only in terms of development community but also in terms of requests from professional Mahout users.









In addition we had a keynote by Pieter Hintjens that concentrated on messaging in general and 0MQ in particular - a scalability topic otherwise highly underrepresented at Berlin Buzzwords.







We got well over 300 attendees who filled Berlin Kosmos - a former cinema. Attendees were a good mixture of Apache and non-Apache people, developers and users. People used the breaks and bar tours after the event to get in touch and exchange ideas. It's always good to see developers discuss design issues and architectural challenges.

Monday evening was reserved for local people taking out the speakers and interested attendees for Bar Tours to Friedrichshain. Those from Berlin took Berlin Buzzwords people to their favourite restaurants and bars - or to what they considered to be "typical Berlin". Some spent evenings later that week drinking beer or Berliner Weisse.




The tour for keynote speakers Grant Ingersoll, Pieter Hintjens and friends was organised by Julia and myself. We went over to Kreuzberg - some went to the famous Burgermeister for burgers, the other half went to a nearby Indian restaurant. After that we spent the evening in Club der Visionäre - a club next to the water. I personally left at about midnight - several people from the Lucene community moved on to the well-known Fette Ecke later on.

When we asked the audience about repeating the conference next year, all hands went up immediately. Besides lots of praise for the organisation, the feedback form we put up brought us some good ideas on how to improve the conference next year. I'd love to have you guys back here in 2011 - and I'd love to get even more attendees in. It was great fun having you here. Thanks for 5 great days:

Five instead of two days, because:

  • Keynote speakers got a special treatment - that is a personal city guide for the weekend before Buzzwords.
  • We had the official conference start on Sunday with a Barcamp.
  • We had another Apache dinner on Wednesday with those Apache people that live in Berlin. In addition, Aaron and Sarah joined us as they were still in town for the Apache Hadoop trainings. Greg Stein also had pizza and beer with us - he was in town for the svn conference at the end of the week.












Thanks to all who helped turn this conference into a success: Julia Gemählich for conference management, Ulf and Wetter for WiFi setup, Nils for travel management, Simon and Jan for support ranking talks and reaching out to your communities, all speakers for fantastic talks, those taking pictures of the conference and sharing them on Flickr for showing those who stayed at home how great the conference was, peoplezapping for the videos that will soon be available online, all sponsors for supporting the conference, all attendees for their participation. I'd love to have all of you (and many more) back in Berlin next year. An informal call for presentations has been set up already - submit now and be the one to set the trend instead of just following the Buzzwords!

For those who do not want to wait for another year: We will have another Apache Hadoop Get Together in September 2010 - watch this space for more information. If you'd like to give a talk there and present your Hadoop, Solr, Lucene etc. system - please get in touch with me.

Scaling user groups

2010-05-26 19:32
A few hours ago, Jan Lehnardt posted a link on How to organise a nerd conference - joking that this is how we planned Berlin Buzzwords. Well, it is not exactly that easy - however the comic actually is not so far from the truth either:

About two years ago, after I had started Apache Mahout together with Grant Ingersoll, Karl Wettin and others, several Apache Hadoop user groups, meetups and get togethers started to pop up all around the world. The one closest to me was the Hadoop user group UK. Back in 2008 I was pretty envious of all these user groups - being so distributed, there was no way I could ever attend all of them, though the talks were certainly interesting. So the naive thought of a back then naive free software developer was: Let's have that in Berlin. To have initial talks I called Stefan Groschupf. His answer was very positive: Oh yeah, let's do this. I am in Germany for another two weeks, so it should be at about that timeframe. We agreed that if no-one showed up, we could still have some pizza together and share insights from our projects.

For the venue I knew from regular meetups of the Free Software Foundation Europe - read FSF*E* - that newthinking store was available for free for meetups for devs of free software. On I went, calling Martin from the store, and booked the room. After that some mails went to the usual suspects, mailing lists and such. At the first meetup two years ago, we had more than 15 attendees - with two more people who had prepared slides. Pizzas obviously had to wait a little.

If you are wondering what that looked like back then - thanks to Martin for taking the picture and putting it online.






We (as in all attendees) decided to repeat the exercise three months later*, and talks for the next time were proposed during that first session. No one objected to having it in Berlin again - everyone knew this was the only way to avoid having to do the organization next time.

The meetup grew steadily in size, talks started being proposed three to six months in advance. I ended up creating not only a mailing list for the meetup but also a blog so I could publish news on Jan's CouchDB talk and Lars George's HBase talk back then. We got video sponsoring from Cloudera (Thanks Christophe), StudiVZ (Thanks Nils), and Nokia (Thanks Matt). Late last year I did the first European NoSQL meetup together with Jan Lehnardt - 80 attendees, lots of potential for more, the newthinking store obviously a bit too small for that :)

If you are wondering what NoSQL and Hadoop meetups looked like last time:



During that meetup the idea was born for a larger NoSQL conference in Berlin in 2010. First ideas were tossed around together with Jan and Simon Willnauer during Apache Con US in Oakland. The topic Hadoop got added there. In January 2010 finally Lucene was added to the mix. We contacted newthinking for support - got a very warm welcome.

Now - two years after the first Apache Hadoop Get Together Berlin - we are proud to host Berlin Buzzwords - focussed on NoSQL, Apache Hadoop and search as in Apache Lucene. The conference is co-organised by newthinking communications, Simon Willnauer, Jan Lehnardt and myself. A big thanks to neofonie for supporting me by making it possible that I could do most of the organisation during my regular working hours.

The speaker lineup looks fantastic. Registration is going very well - exceeding expectations (did I mention that registration is still open, group and student tickets still available?).

I am really looking forward to an amazing conference on 7th and 8th of June. We will have a NoSQL barcamp in newthinking store Sunday evening before the conference. Keynote speaker packages have been sent out and were well received. Hotel rooms for speakers are booked. We are about to pull together the last loose ends in the coming days. Happy to have so many guys (and a few girls) interested in scalability topics here in town at the beginning of June. Looking forward to seeing you in Berlin.



* The second meetup turned out to be the first and so far only one that took place w/o the organiser - I broke my leg on my way to newthinking by getting hit by a BMW X5... *sigh* Note for other meetup organizers: Always have a backup moderator - in my case that was my neofonie manager Holger Düwiger who happened to attend that meetup for the first time back then.

Some pictures

2010-03-25 11:00
Uwe and Simon were so kind to take some pictures of the last Hadoop Get Together in Berlin:

Image Hadoop Get Together Berlin



Thanks for the pictures.

Bob Schulze on Tips and patterns with HBase

2010-03-24 03:41
At the last Hadoop Get Together in Berlin Bob Schulze from eCircle in Munich gave a presentation on “Tips and patterns with HBase”. The talk has been video recorded. The result is now available online:

HBase Bob Schulze from Isabel Drost on Vimeo.



Feel free to share and distribute the video. Thanks to Bob for an awesome talk on eCircle’s usage of HBase - and for providing some background information on how HBase was applied to solve their problems.

Another thanks to Nokia for sponsoring the video taping - and to newthinking for providing the location for free.

Looking forward to Berlin Buzzwords in June. Early registration is open already. Several great talk proposals have been submitted already. If you are a Hadoop Get Together visitor (or even speaker) and would like to have a community ticket, please contact me.