My blog has been moved to

Monday, December 13, 2010

the art of repository access


What will occur when a software team is working on a project? How to tackle the change control?

The first and foremost obvious approach is simply giving full access for everyone. This is what typically happens in a group of friends working together on a cool project, a school assignment, a short hackathon, an early prototype, or any other collaborative efforts.

While this approach is very democratic (all developers are created equal), it is not very scalable. It works on a small and agile team, mainly because everyone knows each other. Once the team grows beyond a certain limit, the overhead of the communication makes it impossible to continuously get a full picture of what is being work on by whom.

Usually this problem is minimized by having somebody responsible for each module, essentially putting some organizational hierarchy in the project. However, a slight problem in the communication (which happens quickly, we're talking about engineers here) is enough to provoke a situation where conflict occurs due to a certain check in which carries over the problems and bugs from one module to another (Law of Unintended Consequences).

Another approach, certainly an evolution from the first one, is limited commit access. This particularly works well if there is a dedicated army of maintainers, whose job is to review incoming patches, give feedback, check them in. Usually this approach is very beneficial for the new members of the team, because they can learn what works and what does not work (since they are indirectly guided by the maintainer).

Quality control is definitely easier since all patches will be always filtered. Limiting the damage due to a broken commit is also not difficult since the repository can be frozen while the disaster mitigation team carries out the work.

This approach however falls apart if there is not enough maintainers and/or the rate of submitted patches increases. The symptoms are easy to spot: patches stay far too long in the queue, a maintainer has no time to do his own development, bad patches sneak in due to the time pressure, etc.

Keep in mind that due to its implicit (often also explicit) social stratification, it is important to take into account what everyone thinks about the approach. There should not be someone who feels left behind because only certain chosen guardians have the ultimate privilege access the repository. When such a situation is not tackled early on, it would grow latently and explode at some point in the future.

If there is a worry like that, then another (slightly reversal) evolution is inevitable: full access to everyone, as long as the patch is reviewed. The gut of this approach is basically "Let's trust each other, but let's also watch each other's back". Thus, inherent code review becomes mandated as well. Everyone is free to check in his patch directly to the repository, as long as some other competent fellow has looked at it and gave the green light. In fact, this should be called "brotherhood repository access" except that it may sound too exclusive and mystical.

Sometimes the system is also setup as far as rejecting the commit (e.g. from the server-side) when the log does not contained something like Reviewed by: Joe Sixpack. This is of course not always necessary, it may even require a consensus from everyone, under the pretext of preventing accidental check in.

This last approach may still not work in some situations. For example, if the team is global and located in different time zones, then coordinating the review becomes challenging (distributed control system may alleviate the problem). When code review is not a encouraged culture, counterproductive dead lock can be depressing. Also, because it is based on mutual trust, it breaks down when some people do not play by the same rule. In short, the approach depends heavily on the strong and committed communication between the team members.

In the end, the repository access approach must be evaluated on a case by case basis. Of course, in some places, no approach will work to satisfy everyone. In that case, consider again that technology can't and won't solve every social problem.

What kind of repository access do you have in your group? Which one do you prefer and why?

No comments: