
Core mentoring, and xjm's guide to patch reviews

on Wed Mar 28, 2012

Core mentoring sprint

We did something awesome at DrupalCon Denver: on code sprint Friday, we filled a room with between 75 and 100 new Drupal core contributors and helped them work on core issues for the first time.

Participants learned everything from the mechanics of triaging issues to how to write automated tests.  Six of us (myself, Andrea Soper, Tim Plunkett, Lin Clark, Cameron Eagans, and Kelly Bell) introduced and led the sprint, and several other experienced contributors (including David Rothstein, Kieran Lal, and Cathy Theys) stepped up to help mentor during the course of the sprint.  Angie Byron even demonstrated for participants how she reviews and commits core patches, and committed one participant's first core patch in real time on the big screen.

It was delightful to watch the participants' enthusiasm and collaboration.  The morning was hectic, but as the day went on, the sprint leads answered fewer and fewer questions because the sprint participants helped each other.

The terrible, glorious aftermath of all this is that I have many, many issues to review!  I've spent the past four days going through updated issues I'm following and reviewing the work that's been done, and I still haven't managed to get it down to one 50-issue page of unread updates.  So I'm sitting here now with the three next patches I'm going to review open in dreditor in three browser tabs, drinking tea, watching the sun rise, and thinking: patch review is hard.

A guide to patch review

Long ago, before I ever made my first whitespace-error-ridden commit to a contributed module, webchick wrote a post describing how she reviews patches and invited others to do the same.  Three-and-some years later, I have an answer. :)

Full frontal nicety

First and foremost, before anything else, here is the most important part of how I do a patch review: I always try to be supportive and encouraging. Real human beings have worked on this issue. They have squinted at the code, scratched their heads, and spent their valuable time. The least I can do is start by thanking them and acknowledging their work. Terse replies that only pick out the negatives can be discouraging, especially for new contributors. Rudeness sucks the joy from contributing, accelerates burnout, derails issues, and on the whole is absolutely unacceptable.

Now, I'm only human, and since I do a lot of reviews, it's easy to fall into the trap of just doing "drive-by dreditor" (cr. webchick) and rattling off the mistakes that seem obvious to me. I also lose patience on occasion like anyone else. Still, when I catch myself drafting a review that's not as supportive or encouraging as it could be, I take a couple of deep breaths, or a walk, and then make myself re-read and edit my post. I try to think about how I can say what I want to say in a way that would not turn me off core development if this were the first core issue I ever saw.

I started contributing to Drupal core because people whom I admired were nice to me. Join the movement. Tag your profile. Be nice.

My mental checklist

In IRC a few weeks ago, someone asked me what I look for when I review an issue, and I listed five things off the top of my head:

  1. Does it fix the problem without introducing regressions?
  2. Does it have tests?
  3. Are the docs and code style OK?
  4. Does it remain within scope?
  5. Is it the best fix we can come up with?

I think the list holds up under scrutiny.  For me, it's not only about reviewing the patch itself, but about the issue as a whole.  The issue's job is to communicate information about a problem and build consensus for a solution.  So, an issue that meets and communicates these five criteria is one that I'm either comfortable RTBCing myself or passing on with my +1 to someone who understands the problem space better.

Details follow.

1. Fixing the problem without regressions

  • First, the patch needs to actually fix the issue.  Someone should manually test the patch and make sure it works.
  • Next, the patch should pass testbot.  Testbot tells us right away whether the patch applies and does not introduce regressions against existing test assertions.

Many novices mistakenly think the two bullet points above are all that's required for something to be RTBC, but for me those are just the first two points of the first step.

  • If the patch introduces user interface changes (including theme changes, JS changes, CSS changes, markup changes, or string changes), then the issue should include before-and-after screenshots illustrating the change. (And please, when you take screenshots: shrink your browser window to 600px or so wide and crop the image to only the relevant portion.)
  • If the patch modifies the CSS or JS, it needs to be manually tested in each supported browser for that version of Drupal, both to confirm that it fixes the problem and to confirm that it does not introduce other bugs.
    • For Drupal 8, this includes Firefox, Chrome, Safari, and IE 8+.
    • For Drupal 7, it also includes IE 6 and 7.  (I don't care how terrible they are.  We hate them too. But the fact is that Drupal 7 is a production product that supports these browsers, so we do not introduce further regressions in these browsers.)

2. Needs tests

  • Every single functional bugfix should come with an automated test that fails when the bug is present and passes once the fix is applied.  Exceptions to this are very rare (usually related to edge-case problems on specific environments).  
  • Furthermore, the tests should not simply fail, but should fail the expected assertions.
  • The test-only patch should be uploaded above the combined patch to expose the fails on the testbot for reviewers.
  • Every feature should come with thorough test coverage for all aspects of that feature (everything a user might test manually in a browser).
  • Every update hook should be accompanied by an upgrade path test.
  • The patch should not reduce or remove existing test coverage without specific justification. Every now and then I come across issues where people make their patches pass testbot by removing a line here or there of existing coverage. This is really bad. Often testbot is doing exactly what it is supposed to (exposing regressions introduced by a change), and reducing the test coverage undercuts the entire purpose of automated testing. Basically, a removed or changed line in an automated test is a red flag, and it should always be clear exactly why it is being changed (e.g., a constant used in the line has been renamed or a matched interface text string has changed).
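To make the fail-first idea concrete, here is a minimal, framework-free sketch (plain PHP, not Drupal's SimpleTest API; `normalize_title()` is a hypothetical buggy function invented for illustration):

```php
<?php

// Hypothetical function with a bug: it should trim surrounding
// whitespace from a title, but the trim() call is missing.
function normalize_title($title) {
  return $title;
}

// A test written against the *desired* behavior. While the bug is
// present this check fails -- exactly what a tests-only patch
// should demonstrate on the testbot.
$expected = 'Foo';
$actual = normalize_title('  Foo  ');
echo ($actual === $expected) ? "pass\n" : "fail\n";
```

Once the fix (adding `trim()`) lands, the same check passes, which confirms the test really covers the bug.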

3. That spinach in your teeth

Imagine you are in a meeting. Your boss smiles and starts talking. And in her front teeth is... something. It appears to be dark green. You subconsciously rub your tongue across your own teeth. Should you say something? Did she have a salad for lunch? Maybe the spinach lasagna? If it were you, you'd want to know so you could take a drink of water, but on the other hand she's the boss and you don't want to disrupt the meeting. Maybe it's pesto... wait.  What the heck did she just say? Did she ask you a question?

This is how I feel when I review a patch with code or documentation style errors. They are minor, unimportant, and utterly distracting.  I can't see your code if your whitespace is screwed up. I don't understand it if your inline comments contain grammatical errors.

Update, Oct. 2012

Based on recent experience, I recommend the following strategy when you notice code style or documentation errors in a patch:

  • If the issue fails to comply with any of the core gates, including what is explicitly specified in the minimum requirements for in-code documentation, it can be considered Needs work.
  • Documentation or code style errors outside those specific points are not by themselves compelling reasons to mark an issue Needs work.
  • Contributor reviews that specifically target documentation and code style are best offered by submitting a rerolled patch with an interdiff along with a brief explanation of the changes, rather than as a lengthy dreditor review.

If you want to give a novice contributor the opportunity to clean up the code style as part of a larger review, please be considerate of the fact that the patch may be a work in progress, and don't interrupt an architectural discussion to nitpick wording or whitespace.

4. Scope

Patches should only include changes that are directly related to the scope of the issue. Unrelated fixes or cleanups make the patch larger, make it harder to review, and make it more likely to collide with other issues. If I notice changed lines that don't seem to have anything to do with the issue, I ask the patch author to open a followup issue instead.

5. Is there a better way?

This is by far the hardest point to evaluate, and it depends largely on intangible things like your experience as a programmer, your knowledge of the subsystem or problem space, your familiarity with Drupal's APIs and unwritten conventions, your ability to identify potential DX or performance concerns, and your grasp of the goal or big picture.

That's a whole lot of dramatic hand-waving, so here's an example of a really simple thing that falls into this category:

<?php
foreach ($foo as $bar) {
  // Do some stuff.
  // This same lookup runs on every single iteration of the loop.
  $term = taxonomy_get_term_by_name('frog');
  // Do more stuff.
}
?>

What's wrong with this picture? The code is loading the same taxonomy term for each iteration of the foreach, and this is needless overhead. It would be better to load the term once before the loop.
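A sketch of the better approach, hoisting the invariant lookup out of the loop. It uses a stand-in `expensive_lookup()` counter function rather than the real `taxonomy_get_term_by_name()` so that it runs outside Drupal:

```php
<?php

// Stand-in for an expensive lookup such as taxonomy_get_term_by_name();
// it counts its own invocations so the difference is visible.
$GLOBALS['lookup_calls'] = 0;
function expensive_lookup($name) {
  $GLOBALS['lookup_calls']++;
  return (object) array('name' => $name);
}

$items = array('a', 'b', 'c');

// Better: perform the loop-invariant lookup once, before the loop.
$term = expensive_lookup('frog');
foreach ($items as $item) {
  // Do some stuff with $item and $term.
}

echo $GLOBALS['lookup_calls'] . "\n"; // prints 1, not 3
```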

Often an experienced developer can look at a bit of code and tell you in 15 seconds or so that it's a giant hack. If you ask them why it's a hack, though, they might have to think for a minute. It might even take several minutes to explain. For this reason, I recommend that novice reviewers focus on my points 1-4 above, and then simply give this final point their best shot and leave the issue at Needs Review. Then, watch for a review from a more experienced contributor and make note of what they say. You might also ask in #drupal-contribute to see what others think of the solution.

So, like... what do you actually do?

When an issue I'm following is updated with a new patch, or when someone pings me for a review on an issue, this is my process:

  1. I scroll to the bottom of the issue and look for the most recent patch.  I immediately note whether it has a tests-only patch and/or an interdiff accompanying the main patch.
  2. If there's an interdiff and I have reviewed the patch before, then I review the interdiff first. Otherwise, I click the dreditor review button for the main patch.
  3. I glance at the diffstats (lines inserted, deleted, etc.) and the list of modified files and functions, just so I have an idea of the overall scope.
  4. I scroll all the way to the bottom of the patch. In most cases I should see a test added there. If I don't, a red flag goes up right away.
  5. At this point I used to scroll back to the top, hide deletions, and just read all the inline comments and documentation to get an idea what was going on. For the past month or two, though, I've taken a handy tip from webchick and started reading patches from the bottom up, one line at a time.  This is especially helpful if one is exceptionally distracted by the bits of spinach or prone to the dreditor drive-by.
  6. I read each line by itself, without trying to understand it in context.  If the line uses a function I'm not familiar with, I look it up in the API documentation.
  7. For each line, I make note of any code style issues, unclear comments, or other problems I see, and for each I add a comment with dreditor explaining how they should be fixed. (This is why you see me popping factoids like "1354" and "code style?" over and over in IRC; I'm getting the link to the specific point within those docs in order to justify my recommendation and give the next contributor guidance on what to do instead.) To add a comment with dreditor, double-click on the lines you wish to reference, and then type your comment in the box.
  8. If I notice the same mistake or issue multiple times, I add to my original dreditor comment by clicking on the previously reviewed lines again, then double-clicking on the additional lines with the same mistake and updating the comment.  This helps consolidate the review a bit and makes it easier to follow.
  9. When I get to the top of a hunk or function, I then read that hunk back down from the top. At this point things should kinda make sense.  Inline comments should clearly explain what subsequent lines are doing. Automated tests should follow the steps to reproduce and make sensible assertions. Etc.
  10. If I come across anything that I don't understand or that seems questionable, I make note of it in a dreditor comment.  However, I also come back to these points after I finish reading the patch (because they might be answered in the issue, which I haven't read yet).
  11. When I'm done reading the patch and adding my comments, only then do I use the "Hide" button in dreditor to scan the issue summary and comments for information relating to any unanswered questions I had. If the question is answered already, I simply delete that comment.  If it is addressed, but I am not totally sold on how, then I will update that comment to instead reply to the existing information in the issue.
  12. I click the "Paste" button in dreditor. I copy and paste my dreditor review into a text file in case of browser disaster.  I also usually reformat the dreditor review with an <ol> tag (with each code snippet and its comment together inside an <li>) so that it is easier to read and reference specific points of my review. For a particularly long review, I also preview the comment.
  13. I then "zoom out" mentally and look at the whole issue.
  14. I check the detailed results for the test-only patch and see if the correct assertions failed.
  15. I read the issue summary and comments: What's happened so far? What elements of my 5-criteria checklist above seem incomplete? Are there any unresolved questions I should weigh in on?
  16. My dreditor review now becomes the "filling" of my comment sandwich. I start with acknowledgement of the contribution and the contributors who worked on it, and give my feedback on the issue as a whole. Next comes the dreditor review. Finally, I always try to close my comment with a brief summary and an actionable next step that the next contributor looking at the issue can take.
  17. I submit my comment, setting the issue to NW, NR, or RTBC as appropriate.
  18. I update the issue summary if appropriate.

And... that's it! Done.  Maybe I re-read my post and edit it a bit for clarity.  Maybe I go on to the next issue.  Maybe I go back to my job or to chatting on IRC. ;)

Questions?  Comments?  How do you do a review?

Comments

agentrickard's picture

Wait, how is my patch supposed to include a failing test for when it hasn't been applied? That seems onerous to me because it would require a two-patch approach: the first patch would just provide a failing test, which testbot would respond to how?

xjm's picture

Testing gate reference: http://drupal.org/core-gates#testing (specifies adding a test-only patch)

The "test-only patch" in action: http://drupal.org/node/312458#comment-5718088

It's actually not that onerous, really.  It can be as simple as creating your patch with git diff, making a copy of the file, and then opening that copy and manually deleting the hunks that are part of the fix.  Takes less than 30s.  Ideally, you'll run the tests locally both with and without the fix to confirm that they actually provide test coverage for the bug, but adding the test-only patch makes the bot do the work and exposes it for everyone else to see.

Kristen's picture

This is a great review of how to do a review! :) Thanks for sharing.

Btw, I was at the "newbie contributor" sprint and all of you did an awesome job. Thanks for that too.

Kristen

Damien McKenna's picture

Thank you. No, seriously, thank you for putting this together! This is one of the single best articles, if not the best, I've read on how to do code reviews!

Brock Boland's picture

You've got quite a process! How long does this typically take for a single patch? For example, you reviewed a patch I worked on a few days ago; about how long do you spend on something that size? I'm just curious how you manage to get so many of these done!

(and I don't mean to imply, of course, that patch review should be anything but a careful and time-consuming process—it's just impressive that you can do this as much as you do)

xjm's picture

That was an easy review for a few reasons:

  1. The conceptual problem space is fairly simple (something I understand without having to do research, and in fact something that has bitten me in the past)
  2. I'd already reviewed previous iterations of it and discussed it in IRC
  3. It's nice and small at 5K :)

That one probably only took me 5 minutes.  (I can spot a miswrapped comment at 80 paces, pun intended.)  Note that part of my review there was to "delegate" the manual testing for the functionality to other reviewers, both with my comments in the issue and by adding it to my list of potential office hours tasks. ;)  That gives me more bandwidth to do other reviews, and simultaneously gives other contributors a chance to engage with that patch review in a manageable way.

Jonathan's picture

Very nice write up. How long does it take you on average to get through your process?

xjm's picture

The time it takes varies enormously depending on the size and complexity of the patch, and my familiarity with the code in question.  Size is the biggest factor.  A small, simple patch like the one Brock mentions above can take as little as 5 minutes or less.  On the other hand, a 200K API documentation cleanup patch--one that doesn't even include any changes to functional code!--might hit the hour mark (!) if it has a lot of mistakes.  I can only do a couple of reviews like that in a week. (Note: Thankfully, there is a finite number of those gargantuan documentation cleanup patches, and we are in the home stretch.)

This is why I'm such a zealot about keeping patches within a defined scope, why I have tried to help with automated coder/doc reviews on testbot, and why I'm passionate about making the review process more manageable and collaborative through issue summaries, office hours, etc.  Also why patch cleanup is such a great task for new contributors. :)  Eaton once said that "Bug triage teams are caching for lead devs."  In a similar way, reviewers are caching for core maintainers, and novice contributors can in turn be caching for those reviewers, too.  In fact, that's exactly how and why I got involved in core.

xjm's picture

I probably should have mentioned this in my original post, but a lot of what I describe in my "mental checklist" above is a working-knowledge distillation of the Drupal core gates. :)

jthorson's picture

Awesome writeup ... you just continue to impress (though by this point, we shouldn't be surprised)!

While it's got a slightly different focus, I think we need to link to this post from the project application process group as well ... sometimes, our reviewers could benefit from a 'full frontal nicety' reminder or two.

famesbond-dot-com's picture

I wish I could have been there :( Living outside the states does have its downfalls!

vijaycs85's picture

Very helpful for new reviewers.
