Raise your hand if you’ve been annoyed by imports in Haskell.
They’re not fun.
Imports are often noisy, lists are often huge, and diffs can be truly nightmarish to compare.
Using a term often requires modifying the import list, which breaks your workflow.
Fortunately, we can reduce some of the pain of these problems with a few choices in our stylish-haskell configuration and a script that gradually implements these changes in your codebase.
This post begins with a style recommendation, continues with a script to implement it gradually in your codebase, and finishes with a discussion on relevant import styles and how they affect review quality.
I use stylish-haskell as my formatting tool.
My editor’s default formatting choices with vim2hs work well for me (while I maintain that fork, it’s mostly a conglomeration of changes that other people have made to it).
I have this shortcut defined to run stylish-haskell in vim:
" Haskell
nnoremap <leader>hs ms:%!stylish-haskell<cr>'s
This sets a mark, filters the file through stylish-haskell, and then returns to the mark.
stylish-haskell is configured by a .stylish-haskell.yaml file, and it will walk up the directory tree searching for one to configure the project with.
I place mine in the root of the Haskell directory, right next to the stack.yaml or cabal.project files.
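To make the lookup concrete, here's a minimal sketch of that walk-up search as a shell function. The name find_stylish_config is made up for illustration, and it only models the parent-directory walk described above; the real tool's lookup may have further fallbacks that I'm not reproducing here.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of stylish-haskell): print the nearest
# .stylish-haskell.yaml at or above the given directory (default: ".").
find_stylish_config() {
  local dir
  dir="$(cd "${1:-.}" && pwd)" || return 1
  while :; do
    # ${dir%/} avoids a doubled slash when dir is "/".
    if [ -f "${dir%/}/.stylish-haskell.yaml" ]; then
      printf '%s\n' "${dir%/}/.stylish-haskell.yaml"
      return 0
    fi
    # Stop once the filesystem root has been checked.
    if [ "$dir" = "/" ]; then
      return 1
    fi
    dir="$(dirname "$dir")"
  done
}
```

Running find_stylish_config src/Api from inside a project should print the config file that a source file in that directory would pick up under this simplified model.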
Here are the contents that I recommend:
steps:
  - imports:
      align: none
      list_align: with_module_name
      pad_module_names: false
      long_list_align: new_line_multiline
      empty_list_align: inherit
      list_padding: 7 # length "import "
      separate_lists: false
      space_surround: false
  - language_pragmas:
      style: vertical
      align: false
      remove_redundant: true
  - simple_align:
      cases: false
      top_level_patterns: false
      records: false
  - trailing_whitespace: {}
# You need to put any language extensions that are enabled for the entire project here.
language_extensions: []
# This is up to personal preference, but 80 is the right answer.
columns: 80
Let’s look at a diff that compares the default stylish-haskell and this configuration.
I created a pull request against the servant-persistent
example project to demonstrate the style.
I left a bunch of review comments to explain the differences, and the UI for reading them is nice on GitHub.
Here’s a reproduction of the differences:
- import           Init (runApp)
+ import Init (runApp)
We no longer indent so that module names are aligned. This helps keep the column count low, and makes it easier to just type this out manually without worrying about alignment.
- {-# LANGUAGE DataKinds     #-}
+ {-# LANGUAGE DataKinds #-}
We don’t align on pragmas anymore. The diff will only show a new language pragma, not highlighting every line that was changed just to align the imports.
- import           Api.User (UserAPI, userApi, userServer)
- import           Config   (AppT (..), Config (..))
+ import Api.User (UserAPI, userApi, userServer)
+ import Config (AppT(..), Config(..))
We no longer align the explicit import lists along the longest module name. This is less noisy, because adding a new module import that is longer than any others will no longer trigger a reformat across all the imports.
- import           Database.Persist.Postgresql (Entity (..), fromSqlKey, insert,
-                                               selectFirst, selectList, (==.))
+ import Database.Persist.Postgresql
+        (Entity(..), fromSqlKey, insert, selectFirst, selectList, (==.))
If the import and module goes beyond the column count, then the import list is indented, but is kept on one line. This keeps the import lists compact in the smallest cases, where it’s easier to notice a small change.
- import           Servant ((:<|>) ((:<|>)), Proxy (Proxy), Raw,
-                           Server, serve, serveDirectoryFileServer)
+ import Servant
+        ( (:<|>)((:<|>))
+        , Proxy(Proxy)
+        , Raw
+        , Server
+        , serve
+        , serveDirectoryFileServer
+        )
If a newline indented import list expands beyond the column count, then it’ll put each term on a new line. This takes up space, but it’s really easy to read, and the diff for adding or removing an import line points to exactly the change that was made.
- import           Config                 (AppT (..))
- import           Control.Monad.Metrics  (increment, metricsCounters)
- import           Data.HashMap.Lazy      (HashMap)
- import           Data.IORef             (readIORef)
- import           Data.Text              (Text)
- import           Lens.Micro             ((^.))
- import           Models                 (User (User), runDb, userEmail,
-                                          userName)
- import qualified Models                 as Md
- import qualified System.Metrics.Counter as Counter
+ import Config (AppT(..))
+ import Control.Monad.Metrics (increment, metricsCounters)
+ import Data.HashMap.Lazy (HashMap)
+ import Data.IORef (readIORef)
+ import Data.Text (Text)
+ import Lens.Micro ((^.))
+ import Models (User(User), runDb, userEmail, userName)
+ import qualified Models as Md
+ import qualified System.Metrics.Counter as Counter
The end result is less pretty. It’s a little more cluttered to read. However, it dramatically improves diffs and merge conflicts when using qualified and explicit imports, which will improve the overall readability of the codebase significantly.
You don’t want to shotgun the entire project with this, because that’ll cause a nightmare of merge conflicts for everyone until the dust settles. But if you did, you could write:
$ stylish-haskell --inplace **/*.hs
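A recursive glob like that only descends into subdirectories if your shell has recursive globbing turned on (in bash, that's shopt -s globstar). If you'd rather not depend on that, here's a rough find-based sketch. The helper name list_hs_files is hypothetical, and skipping .stack-work, dist-newstyle, and .git is my assumption about what you don't want formatted.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print every *.hs file under the given directory
# (default: "."), NUL-delimited so that odd file names survive xargs.
# Assumption: build output and VCS directories should be skipped.
list_hs_files() {
  find "${1:-.}" \
    \( -name .stack-work -o -name dist-newstyle -o -name .git \) -prune \
    -o -type f -name '*.hs' -print0
}
```

You'd then run list_hs_files | xargs -0 stylish-haskell --inplace to format the whole tree in one go.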
This is fine for small projects with few collaborators. But on large projects with many collaborators, we want to make this a bit more gentle. So instead, we’ll only require that files changed in a given PR are formatted.
We can get that information using git diff --name-status origin/master.
If your “target” remote and branch isn’t origin/master, then substitute whatever you use.
The output of that command looks like this:
M .stylish-haskell.yaml
M Setup.hs
M app/Main.hs
M src/Api.hs
M src/Api/User.hs
M src/Config.hs
M src/DevelMain.hs
M src/Init.hs
M src/Logger.hs
M src/Models.hs
M test/ApiSpec.hs
M test/UserDbSpec.hs
All of these symbols are M, but you can also get A for additions and R for renames, and we’ll want to stylish those up too.
We’ll handle each of these cases with its own pipeline, because that’s easiest.
The first case is simply M, and we can focus on that with grep "^M".
We only want Haskell files, so we’ll filter on those with grep ".hs" (the unescaped dot matches any character, so a stricter pattern such as grep '\.hs$' avoids false matches).
We want to get the second field, so we’ll do cut -f 2.
Finally, we’ll send all the elements as arguments to stylish-haskell --inplace using xargs.
The whole command is here:
git diff --name-status origin/master \
| grep .hs \
| grep "^M" \
| cut -f 2 \
| xargs stylish-haskell --inplace
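If you want to poke at the filtering part without touching any files, you can wrap it in a function and feed it sample input. This is only a sketch: hs_paths_with_status is a name I made up, and I've swapped in the stricter '\.hs$' pattern so that paths that merely contain those letters don't sneak through.

```bash
#!/usr/bin/env bash
# Hypothetical helper: read `git diff --name-status` output on stdin and
# print the path (second tab-separated field) of every Haskell file whose
# status starts with the letter given as $1, for example M or A.
hs_paths_with_status() {
  grep "^$1" | grep '\.hs$' | cut -f 2
}
```

With that defined, git diff --name-status origin/master | hs_paths_with_status M lists the files that the pipeline above would hand to xargs.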
Added files work the same way, but you’ll have grep "^A" instead.
Renamed files are slightly different.
Those have three fields: the type (R), the original filename, and the new filename.
We only want the new filename.
So the script looks like this:
# renamed files
git diff --name-status origin/master \
| grep .hs \
| grep "^R" \
| cut -f 3 \
| xargs stylish-haskell --inplace
The only real difference is cut -f 3, which selects the third field.
Our full script is:
#!/usr/bin/env bash
set -Eeux
# modified files
git diff --name-status origin/master \
| grep .hs \
| grep "^M" \
| cut -f 2 \
| xargs stylish-haskell --inplace
# added files
git diff --name-status origin/master \
| grep .hs \
| grep "^A" \
| cut -f 2 \
| xargs stylish-haskell --inplace
# renamed files
git diff --name-status origin/master \
| grep .hs \
| grep "^R" \
| cut -f 3 \
| xargs stylish-haskell --inplace
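The three pipelines differ only in the status letter and the field number, so they can be collapsed into one pass. Here's a sketch using awk. The name changed_hs_files is hypothetical, and it relies on the current path always being the last tab-separated field, which holds for the M, A, and R lines described above.

```bash
#!/usr/bin/env bash
# Hypothetical helper: read `git diff --name-status` output on stdin and
# print the current path of every modified (M), added (A), or renamed (R)
# Haskell file. Rename lines carry two paths, so the last field is the new
# name; for M and A the last field is simply the only path.
changed_hs_files() {
  awk -F '\t' '$1 ~ /^[MAR]/ && $NF ~ /\.hs$/ { print $NF }'
}
```

The script body then shrinks to git diff --name-status origin/master | changed_hs_files | xargs stylish-haskell --inplace. git diff also has a --diff-filter option that can do the status selection for you, if you'd rather lean on git for that part.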
Save that somewhere as stylish-haskell.sh, and add an entry in your Makefile that references it (you do have a Makefile, right?).
Now, we can run make imports and it’ll format the imports in every file that has changed in our PR, but it won’t touch anything else.
Over time, the codebase will converge on the new style, but only as people are working on relevant changes.
We can add this to CI by calling the script and seeing if anything changed.
git diff has an option --exit-code that will cause git to exit with a failure if there is a difference.
In this snippet, I have some uncommitted changes:
$ git diff --exit-code
diff --git a/Makefile b/Makefile
index f8d1636..df336de 100644
--- a/Makefile
+++ b/Makefile
@@ -6,4 +6,7 @@ ghcid-devel: ## Run the server in fast development mode. See DevelMain for detai
--command "stack ghci servant-persistent" \
--test "DevelMain.update"
-.PHONY: ghcid-devel help
+imports: ## Format all the imports that have changed since the master branch.
+ ./stylish-haskell.sh
+
+.PHONY: ghcid-devel help imports
$ echo $?
1
We can use this to fail CI. In Travis CI, we can add the following lines:
script:
- make imports
- git diff --exit-code
- stack --no-terminal --install-ghc test
You can adapt this to whatever CI setup you need.
However, you’ll probably need to install stylish-haskell in CI, too.
Your build tool can handle that; just ensure that it’s present on the PATH.
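If you'd rather have a check that reports unformatted files without rewriting anything, here's a sketch that uses the formatter as a filter, the same way the vim mapping does. check_formatted is a hypothetical helper, not something stylish-haskell ships, and it assumes the formatter reads a file on stdin and writes the result to stdout.

```bash
#!/usr/bin/env bash
# Hypothetical helper. Usage: check_formatted FORMATTER FILE...
# Runs FORMATTER as a stdin-to-stdout filter over each file and reports the
# files whose formatted output differs from what is on disk.
# Nothing is modified; the return status is 1 if any file differs.
check_formatted() {
  local formatter="$1"
  local status=0
  local file
  shift
  for file in "$@"; do
    # cmp -s compares the formatter's output (stdin, "-") with the file.
    if ! "$formatter" < "$file" | cmp -s - "$file"; then
      printf 'needs formatting: %s\n' "$file" >&2
      status=1
    fi
  done
  return "$status"
}
```

For example, check_formatted stylish-haskell src/Api.hs src/Config.hs exits non-zero if either file would change, which is enough to fail a CI step.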
The default style is really aesthetically nice. Everything lines up, there’s a lot of horizontal whitespace, it’s uncluttered looking. But it just doesn’t scale!
It doesn’t look good with long module names. It doesn’t look good with long explicit import lists. It causes a ton of irrelevant diff noise and needless merge conflicts. It becomes a hassle when you’re working on a large codebase with other people.
So let’s look at all the choices, their alternatives, and why I selected these.
steps:
  - imports:
      align: none
Alignment is visually appealing, but it creates diff noise and spends columns on whitespace that could carry meaning instead.
list_align: with_module_name
This option is superfluous, because we have selected new_line_multiline for long_list_align.
pad_module_names: false
The docs for this give the justification quite nicely:
Right-pad the module names to align imports in a group:
true: a little more readable
> import qualified Data.List       as List (concat, foldl, foldr,
>                                           init, last, length)
> import qualified Data.List.Extra as List (concat, foldl, foldr,
>                                           init, last, length)
false: diff-safe
> import qualified Data.List as List (concat, foldl, foldr, init,
>                                     last, length)
> import qualified Data.List.Extra as List (concat, foldl, foldr,
>                                           init, last, length)
Default: true
Ultimately, diff-safe is preferable to aesthetics, so we go with that.
long_list_align: new_line_multiline
long_list_align determines what happens when the import list goes over the maximum column count.
This option is a recent addition.
There are a few choices here, and you may actually prefer an even more diff-friendly approach than I do.
new_line_multiline will move the import list to a new, indented line if the module and list together exceed the column limit.
If the list on its new line also exceeds the column limit, then it’ll put every import on its own line.
This is fantastic for diffs, but takes up a lot of space.
It looks quite readable, at least.
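To make the three outcomes concrete, here's a rough shell sketch of the decision as I've described it. It's an approximation for illustration, not stylish-haskell's actual algorithm: layout_import is a made-up name, it hard-codes list_padding: 7, and it ignores qualified imports and hiding clauses.

```bash
#!/usr/bin/env bash
# Hypothetical sketch. Usage: layout_import COLUMNS MODULE ITEM...
# Picks one of three layouts, mimicking new_line_multiline:
#   1. everything on one line, if it fits within COLUMNS
#   2. the list on its own indented line, if that fits
#   3. one item per line otherwise
layout_import() {
  local columns="$1"
  local module="$2"
  shift 2
  local pad='       '            # assumption: list_padding is 7
  local joined
  joined="$(printf '%s, ' "$@")" # "a, b, c, "
  joined="${joined%, }"          # drop the trailing separator
  local one_line="import ${module} (${joined})"
  local new_line="${pad}(${joined})"
  if [ "${#one_line}" -le "$columns" ]; then
    printf '%s\n' "$one_line"
  elif [ "${#new_line}" -le "$columns" ]; then
    printf 'import %s\n%s\n' "$module" "$new_line"
  else
    printf 'import %s\n' "$module"
    local prefix='( '
    local item
    for item in "$@"; do
      printf '%s%s%s\n' "$pad" "$prefix" "$item"
      prefix=', '
    done
    printf '%s)\n' "$pad"
  fi
}
```

Calling layout_import 80 Servant '(:<|>)((:<|>))' 'Proxy(Proxy)' Raw Server serve serveDirectoryFileServer lands in the third branch, because even the list on its own line is wider than 80 columns.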
empty_list_align: inherit
This is a mostly irrelevant choice, since there is no alignment.
list_padding: 7 # length "import "
This sets it up so that the import list clears the import keyword, providing a clean visual break between lines.
You could go longer or shorter, but that’s up to you.
separate_lists: false
separate_lists adds a space between a class and its methods or a type and its constructors.
true: There is a single space between the Foldable type and the list of its functions.
import Data.Foldable (Foldable (fold, foldl, foldMap))
false: There is no space between the Foldable type and the list of its functions.
import Data.Foldable (Foldable(fold, foldl, foldMap))
I like it off, but this can go either way.
space_surround: false
This doesn’t really matter and can go either way.
With multiline and now new_line_multiline, it’s probably better set to true.
Space surround option affects formatting of import lists on a single line. The only difference is single space after the initial parenthesis and a single space before the terminal parenthesis.
true: There is single space associated with the enclosing parenthesis.
import Data.Foo ( foo )
false: There is no space associated with the enclosing parenthesis
import Data.Foo (foo)
Default: false
  - language_pragmas:
      style: vertical
      align: false
      remove_redundant: true
I know it looks nice to have aligned pragmas, but it’s annoying to view a diff and not easily tell what pragmas were added or removed. This makes it obvious.
  - simple_align:
      cases: false
      top_level_patterns: false
      records: false
All of this visual alignment just ruins diffs. If you want visual alignment, align on an indentation boundary. Compare:
fromMaybe def maybeA = case maybeA of
                            Just a  -> a
                            Nothing -> def
This looks nice, but it’s annoying to maintain and change.
fromMaybe def maybeA =
    case maybeA of
        Just a ->
            a
        Nothing ->
            def
You still get alignment of the important bits, but it’s now safe to diffs and refactoring.
Likewise, adding, removing, or changing a field to a record should only trigger a diff on the relevant fields. Anything else is noise that detracts from signal.
Anyway, these are my recommendations for large projects that have multiple collaborators. If you’re working on a small project, then you don’t need to worry about anything here. These aren’t my aesthetic preferences, but noisy diffs and merge conflicts annoy me a lot more than pretty code pleases me.