Open development and inner source for fun and profit

2017-05-26 07:17
Last in a row if interesting talks at Adobe Open Source Summit was on Open Development/ Inner Source and how it benefits internal projects given by Michael Marth. Note: He knows there's subtle differences between inner source and open development, but mentioned to use the terms interchangeably in his talk.

So what is inner source all about? Essentially: Use all the tools and processes that already work for open source projects, just internally. (Company) public mailing lists, documentation, chat software, issue trackers. Taken to it's core this is very simplistic though. The more interesting aspects are waiting when looking at the people interaction patterns that emerge.

First off: The goal of making all interaction public and easy to follow for anyone is to attract more contributors. The richest source of contributors can be tapped into if your users are tech savvy as well. Being based on this assumption inner source works best when dealing with infrastructure software, or platform software where downstream users are developers themselves.

As a general rule of thumb: From 100 users, 10 will contribute, one will stick around and become a long term committer. This translates into a lot of effort for gaining additional hands for your project.

So - assuming what you want is a wildly successful open source project: You put your code on Github (or whatever the code hosting site of the day is), start some marketing effort - but still not magic happens, no stars, no unicorns, maybe there's 10 pull requests, but that's about it. What happened?

Architecting a community around an open source project is a long term investment: Over time you'll end up training numerous newbies, help people getting started and convince some of those to contribute back.

According to the speaker Michael Marth where that works best is for infrastructure projects: Where users can be turned into contributors and where projects can be turned into platform software that lasts for a decade and longer. In his opinion what is key are two factors: Enabling distributed decision making to let others participate, and a management style that lets the community take their own decisions instead of having one entity control the project. Usually what emerges from that is a distributed, peer-peer networked organisational structure with distributed teams, no calls, no standups, consensus based decision making.

In Michaels experience what works best it to adopt an open source working model from the very start. His recommendation for projects coming from comercial entities is to go to the Apache Software Foundation: There, proven principles and rules have been identified already. In addition going there gives the project much more credibility when it comes to convincing partners that decisions are actually being made in a distributed fashion without being controllable by one single entity. So telling a customer "We have to check this with the community first" as an answer to a feature request becomes much more credible.

The result of this approach are projects that under his guidance gained ten times as many people contributing to the project outside of the original entity than inside of it. The result Michael observed were partners that were much more likely to stick with the technology by means of co-owning it. Partners were participating in development. Also the project made for a lovely recruiting pipeline filled with people already familiar with the technology.

Note to self - slides for staying sane when maintaining a popular open source project

2017-05-26 07:00
For further reference - Simon MacDonald has a great collection of good advise on how to stay sane when running and maintaining a popular open source project. Link here: http://s.apache.org/sanity

Some things he mentioned:

Include a README. It should tell people what the project is about but also what the project is not about. It should have some sort of getting started guide. Potentially link to a CONTRIBUTING doc.

Contribution guidelines should outline social rules like a code of conduct, technical instructions like how to submit a pull request, a style guide, information on how to build the project and make changes etc.

Add a LICENSE file - any OSS license really, because by default it won't be open source in no jurisdiction. Add file headers to each file you publish.

Decide how to handle questions vs. issues: Either both in the issue tracker, or in separate venues.

Add an issue template that asks the user if they searched for the issue already, asks for expected behaviour, actual behaviour, logs, reproduction code, version number used. A note on issues: Having many issues is a good thing - it shows your project is popular. Having only many stale issues is a bad thing - nobody is caring for the project anymore.

Close issues that don't follow the template. Close issues that are duplicates. Close issues that are non active after asking for user input a while ago. Repeated issues asking for seamingly obvious things: Turn those into additional documentation. Asks for easy to add functionality: Let it sit for a while to give others a chance to do it and get involved. Same for bugs that are easy to fix.

Overall people are more difficult than code. Expect trolls to show up. Remain empathetic, respectful but firm in your communication. Don't be afraid to say no to external requests even if they are urgent for the requester.

Add a pull request template that asks for a description, related issue, type tag. Remember that you don't have to merge every pull request.

Build a community: Make it easy to contribute, identify beginner bugs, document the hell out of your project, turn contributors into maintainers, thank people for their effort.

Have tests but keep build times low.

Add documentation, at the very least a README file, a how to contribute file, break those files into a separate website once they grow too large.

As for releasing: Automate as much as you can. Three options: time based release schedule, release on every commit, release "when it's done".

Async decision making

2017-05-16 06:45
This is the second in a series of posts on inner source/open source. Bertrand Delacretaz gave an interesting talk on how to avoid meetings by introducing an async way of making decisions.

He started off with a little anecdote related to Paul Graham's maker's vs. manager's schedule: Bertrand's father was a carpenter. He was working in the same house that his family was living in, so joining the family for lunch was common for him. However there was one valid excuse for him to skip lunch: "I'm glueing." How's that? Glueing together a chair is a process that cannot be interrupted once started without ruining the entire chair - it's a classical maker task that can either be completed in one go, or not at all.

Software development is pretty similar to that: People typically need several hours of focus to get anything meaningful done, in my personal experience at least two (for smaller bugs) or four (for larger changes). That's the motivation to keep forced interruptions low for development teams.

Managers tend to be on a completely different schedule: Context switching all day, communicating all day adding another one hour meeting to the schedule doesn't make much of a difference.

The implication of this observation: Adding one hour of meeting time to an engineer's schedule comes with an enourmous cost if we factor the interruption into the equation. Adding to the equation that lots of meetings actually fail for various reasons (lack of preparation, lack of participants getting prepared, bad audio or video quality, missing participants, delayed start time, bad summarisation) it seems valid to ask if there is a way to reduce the number of face to face meetings while still remaining operational.

As communication still remains key to a functional organisation, one approach taken by open development at Adobe (as well as at the Apache Software Foundation really) is to differentiate between things that can only be discussed in person and decisions that can be taken in an asynchronous fashion. While this doesn't reduce the amount of time communicating (usually quite the contrary happens) it does allow for participants to participate pretty much on their own schedule thus reducing the number of forced interruptions.

How does that work in practice? In Bertrand's experience decision making tends to be a four step process: Coming from an open brainstorming sessions, options need to be condensed, consensus established and finally a decision needs to be made.

In terms of tooling in his experience what works best is to have one and only one shared communication medium for brainstorming. At Apache those are good old mailing lists. In addition there is a need for a structured issue tracker/ case management tool to make options and choices obvious, decisions visible and archived.

When looking at tooling we are missing one important ingredient though: Each meeting needs extensive preparation and thourough post processing. As an example lets take the monthly Apache board of directors' meeting: It's scheduled to last no longer than two hours. Given each of hundreds of projects are required to report on a quarterly basis, given that executive officers need to provide reports on a monthly basis, given that each month at least one major decision item comes up and given that there is still day to day decisions about personel, budget and the like to be taken: How does reducing that to two hours work? The secret sauce is a text file in svn + a web frontend called whimsy to it.

Directors will read through those reports ahead of the meeting. They will add comments to them (which will be mailed automatically to projects), often those comments are used by directors to communicate with each other as well. They will pre-approve reports, they will mark them for discussion if there is something fishy. Some people will check projects' lists to match that up with what's being discussed in the report, some will focus on community stuff, some will focus on seeing releases being mentioned. If a report gets enough pre-approvals an no mark "to be discussed" they are not being shown or touched in the real meeting.

That way most of the discussion happens before the actual meeting leaving time for those issues that are truely contentious. As the meeting is open for anyone in te foundation to attend questions raised beforehand that could not be resolved in writing can be answered in the voice call fairly quickly.

Speaking of call: How does the actual meeting proceed? All participants dial in via good old telephone. Everyone is on a telephone so the issue of "discussions among people in the same room are hard to understand for remote participants" doesn't occur. In addition to telephone there's an IRC backchannel for background discussion, chatter, jokes and less relevant discussion. All discussion that has to be archived and that relates to discussions is kept on the voice channel though. During the meeting the board's chair is moderating through the agenda. In addition the secretary will make notes of who attended, which discussions were made and which arguments exchanged. Those notes are being shared after the meeting, approved at the following month's meeting and published thereafter. If you want to dig deeper into any project's history, there's tooling to drill down into meeting minutes per project until the very beginning of the foundation.

Does the above make decision making faster? Probably not. However it enables an asynchronous work mode that fits best with a group of volunteers working together in a global, distributed community where participants do not only live in different geographies and timezones but are on different schedules and priorities as well.