It's Pants Plugins All the Way Down

Two and a half months ago I wrote a tutorial about writing your first Pants plugin, which I concluded with the promise of a follow-up post which would cover testing. Whelp, turns out I forgot something that I wanted to put ahead of testing, which was plugin code quality!

Let's head over to pants-example-plugin and see where we left off.

Note: I won't be doing branch-by-branch (or folder-by-folder) updates for code quality, as there is no particular order. In fact, these code quality steps could be done BEFORE you even start writing a plugin. They could also be a part of a pants-plugin-template (work in progress).

To get a jump on things, crack open the linters and formatters page, since that will drive a lot of the code quality work. In fact, if you just read that page, you can probably skip a couple of steps in this post.

Formatting

I've been using Black as my formatter of choice for some years. I use it for the same reason that I use Prettier for my web work - I like something that's got reasonable defaults and doesn't allow too much configuration. Whitespace, new lines, and brackets occupy infinitely more of my dev discussions than I would like. If you can't appreciate what I'm talking about, check out the number of options for clang-format and know that you'll have a die-hard advocate on any side of each of those options.

via GIPHY

Alright, ranting aside. To get up and running with a formatter, first add it (them) to your backend_packages and then configure them with any particular tweaks in their individual configuration sections. Use ./pants help black or check the online docs to see what you can configure.

My preferred formatters are: autoflake, Black, isort, pyupgrade. I also use DocFormatter, but not in this example, since there isn't much in-code documentation.

# pants.toml

backend_packages = [
    ...
    "pants.backend.python.lint.black",
    #"pants.backend.python.lint.docformatter",
    "pants.backend.python.lint.isort",
    "pants.backend.experimental.python.lint.autoflake",
    "pants.backend.experimental.python.lint.pyupgrade",
    ...
]

[isort]
args = "--profile black"

[pyupgrade]
args = ["--py39-plus"]

Run all of the formatters on all of your code via ./pants fmt :: and you'll have consistently formatted python code and see this in your terminal:

+ Black made changes.
+ autoflake made changes.
+ isort made changes.
✓ pyupgrade made no changes.

Linting

Linters run almost identically to formatters, and in fact if you review plugin source code, you'd see that linter and formatter code is almost identical. Thanks to a recent Pants PR, writing formatter code became less boilerplatey.

In Python (and other languages), there are dedicated code linters (e.g. flake8), as well as formatters which can run in "check-only" mode. In "check-only" mode, those formatters don't actually perform any in-place formatting of your source code - instead they will warn you when code is not correctly formatted. This format check is more important when dealing with build servers and CI, rather than during development.

The cool thing about linting (unlike formatting) is that overlap is a good thing, so throw the kitchen sink at your code.

# pants.toml (including formatters from previous step)

backend_packages = [
    ...
    "pants.backend.python.lint.bandit",
    "pants.backend.python.lint.black",
    #"pants.backend.python.lint.docformatter",
    "pants.backend.python.lint.flake8",
    "pants.backend.python.lint.isort",
    "pants.backend.python.lint.pylint",
    "pants.backend.experimental.python.lint.autoflake",
    "pants.backend.experimental.python.lint.pyupgrade",
    ...
]

[bandit]
# Skipping assertion warnings - because this is an example project
args = ["--skip B101"]

[flake8]
args = ["--max-line-length 88", "--ignore=E501"]

[isort]
args = "--profile black"

[pylint]
# Skipping documentation warnings, and f-string log warnings
args = "--disable=missing-module-docstring,missing-class-docstring,missing-function-docstring,logging-fstring-interpolation"

[pyupgrade]
args = "--py39-plus"

Run ./pants lint :: and you'll see something like:

✓ Bandit succeeded.
✓ Black succeeded.
✓ Flake8 succeeded.
✓ Pylint succeeded.
✓ autoflake succeeded.
✓ isort succeeded.
✓ pyupgrade succeeded.

Typechecking

Typechecking using mypy doesn't use the lint goal, but rather the check goal. Other languages use check in order to verify compilation of files, so I guess this seems like a decent place for mypy. Python also has py_compile which could be reasonably placed under this goal is someone was in the mood. By default, Pants will pick up a mypy.ini, .mypy.ini, setup.cfg, or pyproject.toml for mypyconfiguration. In fact, Pants performs similar checks for several linters and formatters too.

In this example, I'm locking a specific version of mypy - just to show how it can be done.

# pants.toml

backend_packages = [
    ...
    "pants.backend.python.typecheck.mypy",
    ...
]

[source]
root_patterns = [
    ...
    "pants-plugins", # Ensure the plugins are pulled in as their own source root
]

[mypy]
# Lock the mypy version for this example
version = "mypy==0.942"
lockfile = "build-support/mypy.txt"

As of Pants 2.10.0, be ready for lots of complaints about being unable to find implementation or library stubs after running ./pants check ::.

[ERROR] Completed: Typecheck using MyPy - MyPy failed (exit code 1).
pants-plugins/experimental/fooify/target_types.py:1: error: Cannot find implementation or library stub for module named "pants.engine.target"
pants-plugins/experimental/fooify/subsystem.py:1: error: Cannot find implementation or library stub for module named "pants.engine.rules"
pants-plugins/experimental/fooify/subsystem.py:2: error: Cannot find implementation or library stub for module named "pants.option.subsystem"
pants-plugins/experimental/fooify/rules.py:6: error: Cannot find implementation or library stub for module named "pants.core.goals.package"
pants-plugins/experimental/fooify/rules.py:7: error: Cannot find implementation or library stub for module named "pants.engine.fs"
pants-plugins/experimental/fooify/rules.py:8: error: Cannot find implementation or library stub for module named "pants.engine.rules"
pants-plugins/experimental/fooify/rules.py:9: error: Cannot find implementation or library stub for module named "pants.engine.target"
pants-plugins/experimental/fooify/rules.py:15: error: Cannot find implementation or library stub for module named "pants.engine.unions"
pants-plugins/experimental/fooify/rules.py:16: error: Cannot find implementation or library stub for module named "pants.util.logging"
Found 9 errors in 3 files (checked 7 source files)

𐄂 MyPy failed.

mypy is unable to pick up the Pants library using default configurations. When these errors crop up in first-party code (ie. your code), the problem tends to be a missing __init__.py or mis-configured mypy paths. However, with a third-party library, the problem comes down to missing stubs, or not having correctly installed the libraries.

In this specific instance, however, the problem comes down to mypy not being able to pick up the valid Pants library (which can manually be installed via pip install pantsbuild.pants==2.10.0) because of namespacing issues. This can be fixed using a mypy configuration file (I picked mypy.ini for no particular reason) and setting namespace_packages = True. I also set strict = True as a habit, because I prefer being explicit with method definitions and stuff (I view it basically as a form of documentation).

# mypy.ini

[mypy]
# Ensure pantsbuild.pants imports are picked up correctly
namespace_packages = True

# Strictness
strict = True

# Error output
show_column_numbers = True
show_error_context = True
show_error_codes = True
show_traceback = True
pretty = True
color_output = True
error_summary = True

Build files

Everyone knows they should format and lint their source code, but our poor, lonely build files always get ignored. Not this time though, Pants has us covered.

There is a very handy command called update-build-files, which... ummm... update your build files (where possible) to the latest and greatest Pants syntax. You should run this everytime you upgrade Pants versions. As a nice little side perk, it also runs Black over your build files, so they stay nicely consistent with the codebase as well. ./pants update-build-files:

[INFO] No required changes to BUILD files found. However, there may still be deprecations that 'update-build-files' doesn't know how to fix. See https://www.pantsbuild.org/docs/upgrade-tips for upgrade tips.

Github CI

Manually checking all of your files for formatting or lint warnings is all well and good, but it's definitely brittle. And as you build up a team, what are the odds that EVERYONE pushes code where these checks have been done? Hint: Zero.

So, let's take a minute to use Github's CI system to automatically run some checks for us. Some people like doing this on every commit, but I feel as though that's really wasteful of resources (even if you're not paying for them, they DO have an environmental impact). Additionally, while I'm just messing around with code and making breaking changes, I don't like getting hit with email spam about my tests failing. So, I run the following on commits to main and on PRs to main. I basically use the same workflow that the example-python repo uses:

# .github/workflows/pants.yml

# Copyright 2020 Pants project contributors.
# Licensed under the Apache License, Version 2.0 (see LICENSE).

# See https://pants.readme.io/docs/using-pants-in-ci for tips on how to set up your CI with Pants.

name: Pants linting and formatting

on:
  push:
    branches: [main]
  pull_request:
    branches: [main]

jobs:
  build:
    runs-on: ubuntu-latest
    env:
      PANTS_CONFIG_FILES: pants.ci.toml
    strategy:
      matrix:
        python-version: [3.9]

    steps:
      - uses: actions/checkout@v2

      - uses: actions/cache@v2
        id: cache
        with:
          path: |
            ~/.cache/pants/setup
            ~/.cache/pants/lmdb_store
            ~/.cache/pants/named_caches
          key: ${{ runner.os }}-

      - name: Setup Python ${{ matrix.python-version }}
        uses: actions/setup-python@v2
        with:
          python-version: ${{ matrix.python-version }}

      - name: Bootstrap Pants
        run: |
          ./pants --version

      - name: Check BUILD files
        run: |
          ./pants update-build-files --check

      - name: Lint and typecheck
        run: |
          ./pants lint check ::

      - name: Upload pants log
        uses: actions/upload-artifact@v2
        with:
          name: pants-log
          path: .pants.d/pants.log
        if: always() # We want the log even on failures.

Workflow optimization

In all of my examples above, I run the formatters, linters, and typecheckers on all projects/source files in the git repo. This example plugin repo is pretty small, and my computer is very fast - so it's not much of a problem to run these over everything however often I need to. However, as the size of the repo increases, or the speed of the machine decreases - there can be a noticeable impact to developer productivity. Fortunately, these tools have pretty good caching and cache invalidation, so even running commands over all source code will not re-do everything - just some of the things.

But, why even let it get that far? Why not do something more clever?

I'm not referring to tags because while that's helpful for running groups of tests, it doesn't help with running the minimal set of code. For that, we turn to the --changed-since option, with a little bit of transitive dependencies (needed for check, not fmt or lint):

./pants --changed-since=HEAD --changed-dependees=transitive fmt lint check

Eagle eyed observers will have noticed here (and in the Github CI workflow) that Pants goals can be run in succession on the same line (fmt lint check). They will be run sequentially left-to-right as necessary, which is why fmt should always come first.

In reality, on small repos, the Pants startup and setup time takes longer than running these commands on the full repo vs a subset, but as mentioned above, as the size of the repo grows - the value of a command like this increases (and it will become even more important in the next major section).

Aside: Loop lifehack

There's also a handy little helper named --loop which will monitor your files and re-run your goals when files have changed.

./pants --loop --changed-since=HEAD --changed-dependees=transitive fmt `

I think this is a good time to mention that, while I haven't yet covered tests (next Pants post), a lot of this workflow optimization is better suited to the test goal and test-driven-development, moreso than linting and formatting files... For those, we have pre-commit hooks!

Pre-commit hooks

Sweet, we're setup with formatters, linters, typecheckers, Github automatically runs the linters and typecheckers and emails us if anything fails, and we've figured out how to only run everything over changed code in the repo! What could go wrong?

A lot actually, if you're someone like me... I have a rapid edit/commit cycle while working on a feature branch, and I'll occasionally push some bugfix code on branches being PR'd without running the formatter, and then I'm back to spam city.

Well, I used to be back in spam city, until I finally started doing a better job at setting up git pre-commit hooks. If you've never used one, they're as literal as they seem. When you want to commit, git calls a hook which runs a script that needs to pass before the git commit will go through. If you've ever used a tool like pre-commit, then you know what I'm talking about. I used to use it, but it's not really necessary anymore with Pants.

If you check out a repo's .git/hooks folder, you'll see a bunch of hooks that you can run by removing the .sample extension to align the naming convention (pre-push, pre-merge, pre-rebase, etc).

You can check out what Pants-proper does to get a feel for it, but below is what I use to pseudo-replicate what happens in CI. This is a bit more efficient, as it only runs on changed code, while the CI version runs on all source code (since it happens in the background).

To automatically install this, you can run ./build-support/install-hooks.sh which will symlink files in the githooks folder back into the .git/hooks folder.

# build-support/githooks/pre-commit

#!/usr/bin/env bash

# Minimal spiritual child of: https://github.com/pantsbuild/pants/blob/main/build-support/githooks/pre-commit

set -e

COLOR_RED="\x1b[31m"
COLOR_RESET="\x1b[0m"

function git_merge_base() {
    # This prints the tracking branch if set and otherwise falls back to the commit before HEAD.
    # We fall back to the commit before HEAD to attempt to account for situations without a tracking
    # branch, which might include 'main' builds, but can also include branch-PR builds, where
    # Travis checks out a specially crafted Github '+refs/pull/11516/merge' branch.
    git rev-parse --symbolic-full-name --abbrev-ref HEAD@\{upstream\} 2> /dev/null || git rev-parse HEAD^
}

function log() {
    echo -e "$@" 1>&2
}

function die() {
    (($# > 0)) && log "\n${COLOR_RED}$*${COLOR_RESET}"
    exit 1
}

MERGE_BASE="$(git_merge_base)"

echo "* Build file checking"
./pants --changed-since="${MERGE_BASE}" update-build-files --check ||
    die "If there were errors, run \`./pants --changed-since=$(git rev-parse --symbolic "${MERGE_BASE}") update-build-files\`"

echo "* Lint checking"
./pants --changed-since="${MERGE_BASE}" lint ||
    die "If there were formatting errors, run \'./pants --changed-since=$(git rev-parse --symbolic "${MERGE_BASE}") fmt\'"

echo "* Typechecking"
./pants --changed-since="${MERGE_BASE}" --changed-dependees=transitive check

To test out the pre-commit file directly, you can run ./build-support/githooks/pre-commit - otherwise, try to commit some code. Make sure your build-support/githooks/pre-commit file is executable, otherwise the hook will silently fail.

✓ autoflake succeeded.
✓ bandit succeeded.
✓ black succeeded.
✓ flake8 succeeded.
✕ isort failed.
✓ pylint succeeded.
✓ pyupgrade succeeded.
✓ shellcheck succeeded.
✓ shfmt succeeded.

If there were formatting errors, run './pants --changed-since=c3642bf54bd932131ee6c9a3e07243531da481cf fmt'

This will prevent you from accidentally committing code that will fail on CI.

Before I forget, as we've just added some bash files to our repo - we should maintain their code quality as well. Adding the pieces below will run formatting and linting over our install script and pre-commit hook.

# pants.toml

backend_packages = [
    ...
    "pants.backend.shell",
    "pants.backend.shell.lint.shellcheck",
    "pants.backend.shell.lint.shfmt",
    ...
]

[shfmt]
# See https://github.com/mvdan/sh/blob/master/cmd/shfmt/shfmt.1.scd#printer-flags.
args = ["-i 4", "-ci", "-sr"]

# build-support/BUILD

shell_sources(sources=["install-hooks.sh", "githooks/pre-commit"])

That’s It (Again)!

Our Pants plugin is now supported by a ton of good code quality practices - formatting, linting, typechecking, CI, and pre-commit hooks.

And to repeat what I wrote last time around...

In a subsequent post, I’ll tackle the challenge of testing plugins (which I personally find trickier than creating them).

Development It's Pants Plugins All the Way Down