Apache Mahout @ Devoxx Tools in Action Track

2010-11-01 09:32
This year's Devoxx will feature several presentations coming from the Apache Hadoop ecosystem including Tom White on the basics of Hadoop: HDFS, MapReduce, Hive and Pig as well as Michael Stack on HBase.



In addition there will be a brief Tools in Action presentation on Monday evening featuring Apache Mahout.

Please let me know if you are going to Devoxx - it would be great to meet some more Apache people there, and maybe have dinner on one of the conference days.

Apache Mahout @ Lisbon Codebits

2010-10-31 09:36
In the second week of November I'll spend a few days in Lisbon - I never would have thought that I'd return so quickly when I visited this beautiful city this summer during vacation. I'll be there for Codebits - thanks to Sapo for inviting me.



Back in summer I learned only after I returned to Germany that there was someone from Portugal seeking to meet with other Apache people exactly when I was down there. I contacted the guy, proposing to do an Apache Dinner to see how many other committers and friends could be reached. In addition Filipe asked me whether I could imagine flying down to Sapo to give a talk on Mahout, as devs there would be interested in it. Well, I told him that if I got travel support, I'd be happy to be there. This 10min chat quickly turned into an invitation to a great conference in Lisbon. Looking forward to meeting you there. (And looking forward to weather that, compared to Germany, is way warmer and sunnier right now. :) )

CfP: Data Analysis Dev Room at Fosdem 2011

2010-10-27 06:56
Call for Presentations: Data Analysis Dev Room, FOSDEM
http://fosdem.org
5 February 2011
1pm to 7pm
Brussels, Belgium


This is to announce the Data Analysis DevRoom co-located with FOSDEM: the first meetup on analysing and learning from data, taking place in Brussels, Belgium.

Important Dates (all dates in GMT +2):

  • Submission deadline: 2010-12-17
  • Notification of accepted speakers: 2010-12-20
  • Publication of final schedule: 2011-01-10
  • Meetup: 2011-02-05


Data analysis is an increasingly popular topic in the hacker community. This trend is illustrated by declarations such as:


"I keep saying the sexy job in the next ten years will be statisticians. People think I’m joking, but who would’ve guessed that computer engineers would’ve been the sexy job of the 1990s? The ability to take data—to be able to understand it, to process it, to extract value from it, to visualize it, to communicate it—that’s going to be a hugely important skill in the next decades, not only at the professional level but even at the educational level for elementary school kids, for high school kids, for college kids. Because now we really do have essentially free and ubiquitous data. So the complimentary scarce factor is the ability to understand that data and extract value from it."

-- Hal Varian, Google’s chief economist



Topics
The event will comprise presentations on scalable data processing. We invite you to submit talks on the topics:


  • Information retrieval / Search
  • Large Scale data processing
  • Machine Learning
  • Text Mining
  • Computer vision
  • Linked Open Data

Sample list of related open source / data projects (not exhaustive):

  • http://lucene.apache.org
  • http://hadoop.apache.org (including MapReduce, Pig, Hive, ...)
  • http://www.r-project.org/
  • http://scipy.org
  • http://mahout.apache.org
  • http://opennlp.sourceforge.net
  • http://nltk.org
  • http://opencv.willowgarage.com
  • http://mloss.org & http://mldata.org
  • http://dbpedia.org & http://freebase.com


Closely related topics not explicitly listed above are welcome.

High quality, technical submissions are called for, ranging from principles to practice.

We are looking for presentations on the implementation of the systems themselves, real world applications and case studies.

Submissions should be based on free software solutions.

Submission
Proposals should be submitted to fosdem.datadevroom@gmail.com no later than 2010-12-17. Acceptance notifications will be sent out on 2010-12-20.

Please include your name, bio and email, the title of the talk, and a brief abstract in English. Please indicate the level of experience with the topic your audience should have (e.g. whether your talk will be suitable for newbies or is targeted at experienced users).

The presentation format is short: 30 minutes including questions. We will be enforcing the schedule rigorously.

Sponsoring
If you are interested in sponsoring the event (e.g. we would be happy to provide videos after the event, free drinks for attendees as well as an after-show party), please contact us. Note: "DataDevRoom sponsors" will not be endorsed as "FOSDEM sponsors" and hence not listed in the sponsors section on the fosdem.org website.

Announcements
Follow @DataDevRoom on Twitter for updates. News on the conference will be published on our website at http://fosdem.org.

Program Chairs:

  • Olivier Grisel - @ogrisel
  • Isabel Drost - @MaineC
  • Nicolas Maillot - @nmaillot

Please re-distribute this CFP to people who might be interested.

Video: Max Heimel on sequence tagging w/ Apache Mahout

2010-10-26 19:58
Some time ago Max Heimel from TU Berlin gave a presentation on the new HMM support in the Mahout 0.4 release at the Apache Hadoop Get Together in Berlin:

Mahout Max Heimel from Isabel Drost on Vimeo.



Thanks to JTeam for sponsoring video taping, thanks to newthinking for providing the location and thanks to Martin Schmidt from newthinking for producing the video.

Machine Learning Gossip Meeting Berlin

2010-10-25 18:51
This evening the first Machine Learning Gossip meeting is scheduled to take place at 9p.m. at Victoriabar: Professionals working in research advancing machine learning algorithms and industry projects putting machine learning algorithms to practical use meet for some drinks, food and hopefully lots of interesting discussions.

If successful the meeting is supposed to take place on a regular schedule. Ask Michael Brückner for the date and location of the next meetup.

Video: Sebastian Schelter on Recommendation w/ Apache Mahout

2010-10-21 13:55
A few weeks ago we had the autumn edition of the Apache Hadoop Get Together in newthinking store in Berlin. I am glad to announce the first video online:

Mahout Sebastian Schelter from Isabel Drost on Vimeo.



Thanks to JTeam for sponsoring video taping, thanks to newthinking for providing the location and thanks to Martin Schmidt from newthinking for producing the video.

Stay tuned for the second video to be published next week.

Apache Mahout at Apache Con NA

2010-10-15 20:39
The upcoming Apache Con NA to take place in Atlanta will feature several tracks relevant to users of Apache Mahout, Lucene and Hadoop: There will be a full track on Hadoop as well as one on NoSQL on Wednesday featuring talks on the framework itself, Pig and Hive as well as presentations from users on special use cases and on their way of getting the system to production.

The track on Mahout, Lucene and friends starts on Thursday afternoon, followed by another series of Lucene presentations on Friday.



Also don't miss the community and business tracks for a glimpse behind the scenes. Of course there will also be tracks on the well-known Apache Tomcat, httpd, OSGi and many, many more.

Looking forward to meeting you in Atlanta!

Slides available

2010-10-08 09:19
Yesterday evening the Autumn Hadoop Get Together took place in Berlin. This time the meetup focussed mainly on the latest developments at Apache Mahout. It was kindly sponsored by JTeam, who provided video taping of the presentations as well as free drinks. Thanks a lot for that.

After the meetup the group went over to Cafe Aufsturz for drinks, food and lots of interesting discussions - I left them there as I still have to get rid of a persistent cold. Hope you guys had fun!

The two speakers were kind enough to provide their slides:



Videos of the talks will be posted very soon - so stay tuned.

Reminder: Apache Hadoop Get Together Berlin Today

2010-10-07 09:52
Just a brief reminder: The Apache Hadoop Get Together Berlin is supposed to take place today in newthinking store, Tucholskystr. 48 at 5p.m.

The meeting features two talks on Apache Mahout: Committer Sebastian Schelter will explain how to scale recommender systems with Mahout. Contributor Max Heimel is going to give an introduction to the sequence labeling facilities available in Mahout.

As usual the group will move over to Cafe Aufsturz after the meetup is over.

A big Thanks goes to JTeam for sponsoring video taping as well as to newthinking store for providing the venue for free.

Apache Hadoop Get Together Berlin - October 2010

2010-09-15 07:31
This is to announce the next Apache Hadoop Get Together sponsored by JTeam that will take place in newthinking store in Berlin.




When:
October 7th, 5p.m.
Where:
Newthinking store Berlin, Tucholskystr. 48




As always there will be slots of 30min each for talks on your Hadoop topic. After each talk there will be a lot of time to discuss. You can order drinks directly at the bar in the newthinking store. If you like, you can order pizza. We will go to Cafe Aufsturz after the event for some beer and something to eat.

Talks scheduled so far:

Max Heimel: "Hidden Markov Models for Apache Mahout"

Abstract: In this talk I will present and discuss an implementation of a powerful statistical tool called Hidden Markov Models for the Apache Mahout project. Hidden Markov Models make it possible to mathematically deduce the structure of an underlying - and unobservable - process based on the structure of the produced data. They are thus frequently applied in pattern recognition to deduce structures that are not directly observable. Examples of applications of Hidden Markov Models include the recognition of syllables in speech recordings, handwritten letter recognition and part-of-speech tagging.
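
To make the idea of deducing a hidden process from observed data a bit more concrete, here is a minimal sketch of Viterbi decoding, the standard way to find the most likely hidden state sequence of an HMM. It is plain Python and not Mahout's implementation; the two-tag part-of-speech model, its words and all probabilities are made-up assumptions for illustration only.

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most likely hidden state sequence for the observations."""
    # best[t][s]: probability of the best path that ends in state s at time t
    best = [{s: start_p[s] * emit_p[s][observations[0]] for s in states}]
    # back[t][s]: predecessor of s on that best path
    back = [{}]
    for t in range(1, len(observations)):
        best.append({})
        back.append({})
        for s in states:
            prev, prob = max(
                ((p, best[t - 1][p] * trans_p[p][s]) for p in states),
                key=lambda pair: pair[1],
            )
            best[t][s] = prob * emit_p[s][observations[t]]
            back[t][s] = prev
    # Start from the best final state and follow the back pointers.
    path = [max(states, key=lambda s: best[-1][s])]
    for t in range(len(observations) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path


# Hypothetical toy model: hidden states are tags, observations are words.
STATES = ["NOUN", "VERB"]
START_P = {"NOUN": 0.8, "VERB": 0.2}
TRANS_P = {
    "NOUN": {"NOUN": 0.3, "VERB": 0.7},
    "VERB": {"NOUN": 0.6, "VERB": 0.4},
}
EMIT_P = {
    "NOUN": {"dogs": 0.9, "bark": 0.1},
    "VERB": {"dogs": 0.1, "bark": 0.9},
}

if __name__ == "__main__":
    print(viterbi(["dogs", "bark"], STATES, START_P, TRANS_P, EMIT_P))
```

In a real setting the probabilities would be estimated from training data rather than written by hand.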

Sebastian Schelter: "Distributed Itembased Collaborative Filtering with Apache Mahout"

Abstract: Recommendation Mining helps users find items they like. A very popular way to implement this is by using Collaborative Filtering. This talk will give an introduction to an approach called Itembased Collaborative Filtering and explain Mahout's Map/Reduce based implementation of it.
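
For readers new to the topic, the basic idea behind itembased collaborative filtering can be sketched on a single machine as follows. This is only an illustration in plain Python, not Mahout's Map/Reduce implementation: the choice of cosine similarity, the function names and the tiny ratings table are all assumptions made for this example.

```python
from math import sqrt


def item_similarities(ratings):
    """Cosine similarity for every ordered pair of distinct items.

    ratings maps user -> {item: rating}.
    """
    # Invert the data to item -> {user: rating}.
    by_item = {}
    for user, prefs in ratings.items():
        for item, rating in prefs.items():
            by_item.setdefault(item, {})[user] = rating
    norms = {i: sqrt(sum(r * r for r in users.values()))
             for i, users in by_item.items()}
    sims = {}
    for a in by_item:
        for b in by_item:
            if a == b:
                continue
            common = by_item[a].keys() & by_item[b].keys()
            dot = sum(by_item[a][u] * by_item[b][u] for u in common)
            sims[(a, b)] = dot / (norms[a] * norms[b])
    return sims


def recommend(user, ratings, sims):
    """Rank items the user has not rated by a similarity-weighted
    average of the user's own ratings."""
    seen = ratings[user]
    all_items = {i for prefs in ratings.values() for i in prefs}
    scores = {}
    for candidate in all_items - seen.keys():
        weight = sum(sims[(candidate, i)] for i in seen)
        if weight > 0:
            total = sum(sims[(candidate, i)] * r for i, r in seen.items())
            scores[candidate] = total / weight
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical ratings on a 1-5 scale.
RATINGS = {
    "alice": {"A": 5, "B": 4},
    "bob": {"A": 5, "B": 5, "C": 4},
    "carol": {"B": 1, "C": 2, "D": 5},
    "dave": {"A": 4, "D": 1},
}

if __name__ == "__main__":
    print(recommend("alice", RATINGS, item_similarities(RATINGS)))
```

The item-to-item similarities depend only on the ratings, not on the user asking for recommendations, so they can be precomputed once and reused for every user.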





Please do indicate on Upcoming if you are coming so we can more safely plan capacities.

JTeam is looking for Java developers and search enthusiasts. Check out their jobs page for more info!

As always a big Thank You goes to newthinking store for providing the venue for free for our event.

Looking forward to seeing you in Berlin as well,
Isabel