Automating PR merging with Bulldozer and PolicyBot

Ondřej Pelech

Senior Software Engineer @ ThreatLabs

Published

December 16, 2020

Read time

8 Minutes


    As you might have read in our previous post, we’re happily using Scala Steward for our public and internal Scala projects. It is a bot that keeps them up to date by opening Pull Requests with updated versions of dependencies. That alone is a great help, but it’s only half of the job: a developer still has to come in, verify that the PR makes sense and click the Merge button, so the process is not fully automated. In this post, we will explore ways to automate more of the PR workflow and thus decrease the burden on developers.

    For our public projects, we use Mergify, a free service for automated merging of PRs. See, for example, these projects: SST, grpc-json-bridge. For projects hosted on GitHub.com, we can highly recommend it, but what could we use for our internal GitHub Enterprise? Besides Scala Steward, we’re using Renovate for non-Scala projects. Renovate can merge its own PRs, but we would prefer to have just one tool for automated merging. That’s where Bulldozer and PolicyBot come in.

    Bulldozer

    Developed by Palantir and released as free software, Bulldozer is a bot, a GitHub App to be more precise, for merging Pull Requests. Unlike Mergify, it doesn’t have a publicly hosted instance, but it is published as a container image, which makes it easy to deploy internally at our organization.

    We deployed it to our internal Kubernetes. The app itself is configured via environment variables. This is an excerpt from the Kubernetes deployment configuration:
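    A minimal sketch of what such an excerpt might look like, assuming the image published on Docker Hub as palantirtechnologies/bulldozer and the environment variable names listed in Bulldozer’s sample configuration as we understand them. The Secret name bulldozer-secrets, its keys, the App ID value, the port and the git.company.com host are placeholders invented for illustration, not our actual values:

```yaml
# Excerpt of a Kubernetes Deployment; illustrative sketch only.
spec:
  template:
    spec:
      containers:
        - name: bulldozer
          # <version> is a placeholder; pin a concrete release tag in practice.
          image: palantirtechnologies/bulldozer:<version>
          ports:
            - containerPort: 8080  # assumed listen port
          env:
            # GitHub Enterprise endpoints (hypothetical host)
            - name: GITHUB_WEB_URL
              value: "https://git.company.com"
            - name: GITHUB_V3_API_URL
              value: "https://git.company.com/api/v3"
            - name: GITHUB_V4_API_URL
              value: "https://git.company.com/api/graphql"
            # What GitHub calls "App ID" (placeholder value)
            - name: GITHUB_APP_INTEGRATION_ID
              value: "42"
            # Sensitive values come from a hypothetical Secret
            - name: GITHUB_APP_WEBHOOK_SECRET
              valueFrom:
                secretKeyRef:
                  name: bulldozer-secrets
                  key: webhook-secret
            - name: GITHUB_APP_PRIVATE_KEY
              valueFrom:
                secretKeyRef:
                  name: bulldozer-secrets
                  key: private-key
```

    Keeping the webhook secret and the private key in a Secret rather than inline in the Deployment is a design choice of this sketch; verify the exact variable names against the README of the Bulldozer version you deploy.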

    Note that GITHUB_APP_INTEGRATION_ID is what GitHub calls “App ID”. To get the ID, you first have to register Bulldozer as an App for your GitHub organization. The page for that looks something like https://git.company.com/organizations/your-team/settings/apps, where you click the New GitHub App button in the top right corner. The required permissions and other installation details can be found in Bulldozer’s official README.

    After you have the Bulldozer App registered and the container up and running, you can install it for specific repositories or whole organizations in the Install App tab in the App’s administration panel (https://git.company.com/organizations/your-team/settings/apps/bulldozer/installations). We installed it for whole organizations, which is both convenient and safe: the bot won’t touch repositories that don’t have the .bulldozer.yml configuration file in the root.

    Bulldozer’s single purpose is to merge PRs. It merges only those PRs whose required status checks pass and that satisfy certain criteria. The criteria that matter to us are labels and pr_body_substrings, and the required checks are the TeamCity CI build and PolicyBot (more on that later). For a Scala project, the .bulldozer.yml configuration file could look like this:
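    A minimal sketch of such a file, under a few assumptions: the trigger block is called whitelist in older Bulldozer releases, and the pr_body_substrings below are our guess at the label lines Scala Steward writes into its PR descriptions, which vary between Scala Steward versions. Compare them with a real PR body before relying on them:

```yaml
version: 1

merge:
  # Called "whitelist" in older Bulldozer releases.
  trigger:
    # Matching any one of these conditions makes a PR a merge candidate.
    labels: ["bulldozer-merge"]
    pr_body_substrings:
      # Assumed Scala Steward body fragments. Major library upgrades are
      # deliberately not listed, so they still wait for a developer.
      - "labels: library-update, semver-minor"
      - "labels: library-update, semver-patch"
      - "labels: sbt-plugin-update, semver-minor"
      - "labels: sbt-plugin-update, semver-patch"
  # How the PR is merged; squash is one of the supported methods.
  method: squash
  # Remove the bot's branch once the PR is merged.
  delete_after_merge: true
```

    The checks that have to be green before a merge are taken from the branch protection rules of the target branch, so in this sketch the TeamCity build and the PolicyBot check would be marked as required there rather than in this file.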

    Notice how we tailored the pr_body_substrings to the messages Scala Steward puts into its PRs, so that everything besides major upgrades of libraries gets merged automatically. We also have the bulldozer-merge label, a very useful feature that is unrelated to dependency bots: if you open a PR and don’t want to wait for the CI to finish, just add this label, and Bulldozer will merge the PR as soon as all necessary checks are green. You don’t have to go back to it. If you are using Restrict who can push to matching branches in Branch protection rules, make sure that Bulldozer is allowed push access there.

    As you can see, Bulldozer can be very powerful, maybe even too powerful. It could wreak havoc on your repositories if something went really wrong; by merging PRs, it changes the code in your master branch, after all. That is unlikely, since it is developed by competent developers and has even been vetted by our internal security team, but it’s better to be safe than sorry. Is there a way to further restrict what Bulldozer can do, for example which files it is allowed to change?

    PolicyBot

    This is Bulldozer’s sidekick. It is developed by the same team and is deployed and configured in the same manner. The two bots can work independently, one without the other, but they play very well together. PolicyBot creates a new kind of check in a GitHub PR that is red or green based on very granular criteria, such as the full file names of the changed files.

    Like Bulldozer, PolicyBot is easy to deploy to Kubernetes:
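    A minimal sketch of such a deployment excerpt, assuming the image published on Docker Hub as palantirtechnologies/policy-bot and the environment variable names from PolicyBot’s sample configuration as we understand them. The Secret name policy-bot-secrets, its keys, the ID values and both hosts are placeholders invented for illustration:

```yaml
# Excerpt of a Kubernetes Deployment; illustrative sketch only.
spec:
  template:
    spec:
      containers:
        - name: policy-bot
          # <version> is a placeholder; pin a concrete release tag in practice.
          image: palantirtechnologies/policy-bot:<version>
          env:
            # GitHub Enterprise endpoints (hypothetical host)
            - name: GITHUB_WEB_URL
              value: "https://git.company.com"
            - name: GITHUB_V3_API_URL
              value: "https://git.company.com/api/v3"
            # What GitHub calls "App ID" (placeholder value)
            - name: GITHUB_APP_INTEGRATION_ID
              value: "43"
            # What GitHub calls "Client ID" (placeholder value)
            - name: GITHUB_OAUTH_CLIENT_ID
              value: "Iv1.0123456789abcdef"
            # URL under which the web UI is reachable (hypothetical host)
            - name: POLICYBOT_PUBLIC_URL
              value: "https://policy-bot.company.com"
            # Sensitive values come from a hypothetical Secret
            - name: GITHUB_OAUTH_CLIENT_SECRET
              valueFrom:
                secretKeyRef:
                  name: policy-bot-secrets
                  key: oauth-client-secret
            - name: GITHUB_APP_PRIVATE_KEY
              valueFrom:
                secretKeyRef:
                  name: policy-bot-secrets
                  key: private-key
            - name: POLICYBOT_SESSIONS_KEY
              valueFrom:
                secretKeyRef:
                  name: policy-bot-secrets
                  key: sessions-key
```

    The OAuth and session settings exist because PolicyBot has a web UI that users log in to; the full list of settings and their exact names should be checked against the README of the version you deploy.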

    Note that GITHUB_OAUTH_CLIENT_ID is what GitHub calls “Client ID”. Register the app with GitHub first and then install it for repositories or organizations. More details about the settings, deployment and other configuration are in PolicyBot’s README. Checking the list of changed files is only a small fraction of what it can do, so check out the documentation; maybe you will be able to employ the bot for other purposes besides guarding version upgrade PRs.

    The configuration file is .policy.yml, and only projects that contain it will get this new PR check. For Scala projects, we use a configuration that looks something like this:
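    A minimal sketch of such a policy, with several assumptions: the bot account name scala-steward, the organization your-team, the rule names and the exact list of version files are hypothetical, and the key names follow PolicyBot’s README as we understand it. The paths are regular expressions matched against the full path of each changed file:

```yaml
policy:
  approval:
    # The check turns green when either rule is satisfied.
    - or:
        - scala steward only touched version files
        - a developer has approved

approval_rules:
  - name: scala steward only touched version files
    if:
      # Hypothetical account under which Scala Steward opens PRs
      has_author_in:
        users: ["scala-steward"]
      # Every changed file has to match one of these patterns
      only_changed_files:
        paths:
          - "^build\\.sbt$"                 # library versions
          - "^project/plugins\\.sbt$"       # sbt plugin versions
          - "^project/build\\.properties$"  # sbt version
          - "^\\.scalafmt\\.conf$"          # Scalafmt version
    requires:
      # No human approval needed once the conditions above hold
      count: 0

  - name: a developer has approved
    requires:
      count: 1
      # Hypothetical organization whose members may approve
      organizations: ["your-team"]
```

    The second rule is there so that PRs opened by people, or bot PRs that touch other files, can still pass the check after a review instead of being blocked for good.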

    You can see that we limit the files in a Scala Steward PR that Bulldozer can merge to only those that contain versions of libraries, sbt plugins, sbt itself or Scalafmt. If Scala Steward changed anything else, PolicyBot’s check would be red and Bulldozer wouldn’t merge such a PR. (By the way, that can happen for legitimate reasons: Scala Steward can apply Scalafix rules when updating certain libraries, but those result in changes to the code, so we want a developer to have a look before merging.) Something similar applies to Renovate, which keeps our TeamCity configuration for the project up to date.

    PolicyBot comes with a nice web UI that explains how it reached its conclusion. You can get to it by clicking Details on the right side of the policy-bot: master row in the PR checks. Here’s an example of how it can look:

    Conclusion

    Bulldozer and PolicyBot have saved us developers a ton of onerous work since we deployed them. We hope that this post has inspired you to think about how you can automate your PR workflow. We owe a big thank you to the nice folks at Palantir who developed these two bots; we couldn’t have done this without them. They were also receptive to our improvements (1, 2, 3), so thanks again, @bluekeyes in particular!

    To automate PRs to the maximum extent possible, and to do that fearlessly, we should also talk about preventing runtime failures, such as ClassNotFoundException, that are caused by linking incompatible transitive dependencies, or rather about preventing PRs that could introduce such issues from being merged. But those issues are specific to Scala and the JVM, and this post is already long enough, so we will explore them in a future post about sbt-missinglink.
