Making a small Clippy lint
Introduction
This started some time ago when a friend asked out into the void:
Is there a Clippy lint to enforce exhaustive structural pattern matching?
The examples they gave were the following:
Bad Case
let Foo { bar, .. } = foo;
match bar {
Bar::A { a, b, .. } => ...,
Bar::B { a, .. } => ...,
}
Good Case
let Foo { bar, baz: _ } = foo;
match bar {
Bar::A { a, b, c: _ } => ...,
Bar::B { a, b: _ } => ...,
}
The motivation for this lint was mainly to ensure that you never had any unhandled variables in a struct you are matching on. So that if you add a new field to the struct, you will get a warning everywhere you match on it.
After some digging we could not find a lint that did it, the only thing we could find was that in a request for the opposite there was a comment asking for it.
I had some spare time so I decided to figure out how to write a lint and try to implement it.
First attempt
I looked at the Clippy book which has a great guide for the basics for creating a new lint.
Not long afterwards I had the first attempt at a lint that solves the issue, lets break it down
First we import various utilities and rustc
parts:
use clippy_utils::{diagnostics::span_lint_and_then};
use rustc_lint::{EarlyContext, EarlyLintPass};
use rustc_session::declare_lint_pass;
use rustc_ast::ast::{Pat, PatKind};
We then declare a new lint with example, lint group and which short note should be listed:
declare_clippy_lint! {
/// What it does
/// Nothing yet
///
/// ### Example
/// `` `no_run
/// struct Foo {
/// bar: u8,
/// baz: u8,
/// }
///
/// let f = Foo { bar: 1, baz: 2 };
/// let Foo { bar, .. } = f;
/// `` `
#[clippy::version = "1.90.0"]
pub REST_WHEN_DESTRUCTURING,
nursery,
"rest (..) in destructuring expression"
}
declare_lint_pass!(RestDestructuring => [REST_WHEN_DESTRUCTURING]);
Finally we do the actual interesting part and write the actual lint which will find places where the Rest part is used in in struct patterns and shout loudly when doing so.
impl EarlyLintPass for RestDestructuring {
fn check_pat(&mut self, cx: &EarlyContext<'_>, pat: &Pat) {
if let PatKind::Struct(_, _, _, rustc_ast::ast::PatFieldsRest::Rest) = pat.kind {
span_lint_and_then(
cx,
REST_WHEN_DESTRUCTURING,
pat.span,
"AAAAAAAAAAAAAAAAAAAAAAAAAAa",
|diag| {
diag.help("AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA");
},
);
}
}
}
Here we implement the trait EarlyLintPass
which has a lot of default implementations, so you should only implement the trait methods that you are interested in. We implement the check_pat
method which is run on every patterns, we are for now only interested in struct patterns that have a PatFieldsRest::Rest
so we match on that.
This will now shout at the user if they use a Rest
in a struct pattern, for example:
#![deny(clippy::rest_when_destructuring)]
#![allow(dead_code)]
#![allow(unused_variables)]
fn main() {
struct S {
a: u8,
b: u8,
c: u8,
}
let s = S { a: 1, b: 2, c: 3 };
let S { a, b, .. } = s;
}
Will output the following when run with Clippy
cargo dev lint bla.rs
Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.01s
Running `target/debug/clippy_dev lint bla.rs`
Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.04s
Running `target/debug/clippy-driver -L ./target/debug -Z no-codegen --edition 2024 bla.rs`
error: AAAAAAAAAAAAAAAAAAAAAAAAAAa
--> bla.rs:14:9
|
14 | let S { a, b, .. } = s;
| ^^^^^^^^^^^^^^
|
= help: AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
= help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#rest_when_destructuring
note: the lint level is defined here
--> bla.rs:1:9
|
1 | #![deny(clippy::rest_when_destructuring)]
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
A cleaned up version of this with better documentation and better lint comments was the first version I opened a pull request with.
Full source of the first version
use clippy_utils::diagnostics::span_lint_and_help;
use rustc_ast::ast::{Pat, PatFieldsRest, PatKind};
use rustc_lint::{EarlyContext, EarlyLintPass};
use rustc_session::declare_lint_pass;
declare_clippy_lint! {
/// ### What it does
/// Disallows the use of rest patterns when destructuring structs.
///
/// ### Why is this bad?
/// It might lead to unhandled fields when the struct changes.
///
/// ### Example
/// ```no_run
/// struct S {
/// a: u8,
/// b: u8,
/// c: u8,
/// }
///
/// let s = S { a: 1, b: 2, c: 3 };
///
/// let S { a, b, .. } = s;
/// ```
/// Use instead:
/// ```no_run
/// struct S {
/// a: u8,
/// b: u8,
/// c: u8,
/// }
///
/// let s = S { a: 1, b: 2, c: 3 };
///
/// let S { a, b, c: _ } = s;
/// ```
#[clippy::version = "1.89.0"]
pub REST_WHEN_DESTRUCTURING_STRUCT,
nursery,
"rest (..) in destructuring expression"
}
declare_lint_pass!(RestWhenDestructuringStruct => [REST_WHEN_DESTRUCTURING_STRUCT]);
impl EarlyLintPass for RestWhenDestructuringStruct {
fn check_pat(&mut self, cx: &EarlyContext<'_>, pat: &Pat) {
if let PatKind::Struct(_, _, _, PatFieldsRest::Rest) = pat.kind {
span_lint_and_help(
cx,
REST_WHEN_DESTRUCTURING_STRUCT,
pat.span,
"struct destructuring with rest (..)",
None,
"consider explicitly ignoring remaining fields with wildcard patterns (x: _)",
);
}
}
}
Issues with the first version
The two main issues with this version as pointed out by the reviewer was that it would be nice to have suggestions and that it would get triggered by code generated by macros. Both generating suggestions and figuring out if the code was generated by a proc macro required the lint to be converted to a late pass.
<aside>
What actually is the name of ..
and foo: _
when used in a struct pattern?
When we first talked about this lint we had some discussions about what The actual name of the dots and colon-underscores were in Rust, when we looked into it there seemed to be no clear answer.
With colon-underscores foo : _
the reference was calling it "placeholder", but in other places the _
pattern would be called a wildcard, for example in a match statement.
With dot dot ..
it was a bit more confusing because it could both be called a Rest
, wildcard or it could be called "et cetera".
This <aside>
discussion ended up in a issue and concluded that it should probably be called "et cetera" when used in destructuring.
I continued to call it "rest" since that is what it is called internally in the compiler, at least for now.
The second better version using a late pass.
Clippy has two kinds of passes, early passes which has access to the AST and late passes which has access to type information. If I wanted to add a hint for which fields you should add I needed the type information to get the unnamed fields.
So now I instead need to implement a late pass, we also add some statements to ensure we are outside of macros (a bit more on those later).
To use a late pass we want to implement LateLintPass
instead.
We are still only interested in when patterns are being used so we again use the check_pat
trait method, although we now have the LateContext
which allows us to get access to type information, and the HIR version of the pattern struct.
The coding style in Clippy uses let chains quite a lot, so we do quite a few things on a couple of lines:
impl<'tcx> LateLintPass<'tcx> for RestWhenDestructuringStruct {
fn check_pat(&mut self, cx: &rustc_lint::LateContext<'tcx>, pat: &'tcx rustc_hir::Pat<'tcx>) {
if let rustc_hir::PatKind::Struct(path, fields, true) = pat.kind
&& !pat.span.in_external_macro(cx.tcx.sess.source_map())
&& !is_from_proc_macro(cx, pat)
&& let qty = cx.typeck_results().qpath_res(&path, pat.hir_id)
&& let ty = cx.typeck_results().pat_ty(pat)
&& let ty::Adt(a, _) = ty.kind()
{
Here we check the following:
- That we have a struct pattern with a rest (
..
) field, that is what the true
means.
- The pattern is not in a external macro.
- The pattern is not in a proc macro.
- Get the type of the path to the struct we are matching on. (this is used to get variant in enums for example)
- Get the specific type of the struct we are matching on.
We then get the identifier of the variant of the ADT, this is really only interesting for enums which can also contain struct variants, for normal structs this is always a special ZERO
value:
let vid = qty
.opt_def_id()
.map_or(VariantIdx::ZERO, |x| a.variant_index_with_id(x));
let mut rest_fields = a.variants()[vid]
We can then extract all the fields that the ..
represents, we turn this into a string containing the code for the underscore form of the missing fields:
let mut rest_fields = a.variants()[vid]
.fields
.iter()
.map(|field| field.ident(cx.tcx))
.filter(|pf| !fields.iter().any(|x| x.ident == *pf))
.map(|x| format!("{x}: _"));
let fmt_fields = rest_fields.join(", ");
One of the larger issues I ran into when I converted the lint to a LatePass
lint was that at some point rustc
drops the span information for the ..
, so when you only has access to the rustc_hir
information you have to be a bit creative.
To extract the span of the ..
symbol we take the span of the last field in the struct (or the whole pattern if there are no fields) and then do various operations on the span together with a source map to find the span of the ..
symbol. This is probably the worst part of this code, and it should be possible to keep the span information into the HIR, but that is in rustc
and not specifically in Clippy:
// It is not possible to get the span of the et cetera symbol at HIR level
// so we have to get it in a bit of a roundabout way:
// Find the end of the last field if any.
let last_field = fields.iter().last().map(|x| x.span.shrink_to_hi());
// If no last field take the whole pattern.
let last_field = last_field.unwrap_or(pat.span.shrink_to_lo());
// Create a new span starting and ending just before the first .
let before_dot = sm.span_extend_to_next_char(last_field, '.', true).shrink_to_hi();
// Extend the span to the end of the line
let rest_of_line = sm.span_extend_to_next_char(before_dot, '\n', true);
// Shrink the span so it only contains dots
let dotdot = sm.span_take_while(rest_of_line, |x| *x == '.');
We then finally have to execute the actual link and give the suggestion on what to change:
span_lint_and_then(
cx,
REST_WHEN_DESTRUCTURING_STRUCT,
pat.span,
"struct destructuring with rest (..)",
|diag| {
diag.span_suggestion_verbose(
dotdot,
"consider explicitly ignoring remaining fields with wildcard patterns (x: _)",
fmt_fields,
rustc_errors::Applicability::MachineApplicable,
);
},
);
Here we also said that the lint is machine applicable, this means that you can use cargo clippy --fix
to automatically resolve it.
}
}
}
Full formatted source of the second version
use crate::rustc_lint::LintContext as _;
use clippy_utils::diagnostics::span_lint_and_then;
use clippy_utils::is_from_proc_macro;
use itertools::Itertools;
use rustc_abi::VariantIdx;
use rustc_lint::LateLintPass;
use rustc_middle::ty;
use rustc_session::declare_lint_pass;
declare_clippy_lint! {
/// ### What it does
/// Disallows the use of rest patterns when destructuring structs.
///
/// ### Why is this bad?
/// It might lead to unhandled fields when the struct changes.
///
/// ### Example
/// ```no_run
/// struct S {
/// a: u8,
/// b: u8,
/// c: u8,
/// }
///
/// let s = S { a: 1, b: 2, c: 3 };
///
/// let S { a, b, .. } = s;
/// ```
/// Use instead:
/// ```no_run
/// struct S {
/// a: u8,
/// b: u8,
/// c: u8,
/// }
///
/// let s = S { a: 1, b: 2, c: 3 };
///
/// let S { a, b, c: _ } = s;
/// ```
#[clippy::version = "1.89.0"]
pub REST_WHEN_DESTRUCTURING_STRUCT,
nursery,
"rest (..) in destructuring expression"
}
declare_lint_pass!(RestWhenDestructuringStruct => [REST_WHEN_DESTRUCTURING_STRUCT]);
impl<'tcx> LateLintPass<'tcx> for RestWhenDestructuringStruct {
fn check_pat(&mut self, cx: &rustc_lint::LateContext<'tcx>, pat: &'tcx rustc_hir::Pat<'tcx>) {
if let rustc_hir::PatKind::Struct(path, fields, true) = pat.kind
&& !pat.span.in_external_macro(cx.tcx.sess.source_map())
&& !is_from_proc_macro(cx, pat)
&& let qty = cx.typeck_results().qpath_res(&path, pat.hir_id)
&& let ty = cx.typeck_results().pat_ty(pat)
&& let ty::Adt(a, _) = ty.kind()
{
let vid = qty
.opt_def_id()
.map_or(VariantIdx::ZERO, |x| a.variant_index_with_id(x));
let mut rest_fields = a.variants()[vid]
.fields
.iter()
.map(|field| field.ident(cx.tcx))
.filter(|pf| !fields.iter().any(|x| x.ident == *pf))
.map(|x| format!("{x}: _"));
let fmt_fields = rest_fields.join(", ");
let sm = cx.sess().source_map();
// It is not possible to get the span of the et cetera symbol at HIR level
// so we have to get it in a bit of a roundabout way:
// Find the end of the last field if any.
let last_field = fields.iter().last().map(|x| x.span.shrink_to_hi());
// If no last field take the whole pattern.
let last_field = last_field.unwrap_or(pat.span.shrink_to_lo());
// Create a new span starting and ending just before the first .
let before_dot = sm.span_extend_to_next_char(last_field, '.', true).shrink_to_hi();
// Extend the span to the end of the line
let rest_of_line = sm.span_extend_to_next_char(before_dot, '\n', true);
// Shrink the span so it only contains dots
let dotdot = sm.span_take_while(rest_of_line, |x| *x == '.');
span_lint_and_then(
cx,
REST_WHEN_DESTRUCTURING_STRUCT,
pat.span,
"struct destructuring with rest (..)",
|diag| {
diag.span_suggestion_verbose(
dotdot,
"consider explicitly ignoring remaining fields with wildcard patterns (x: _)",
fmt_fields,
rustc_errors::Applicability::MachineApplicable,
);
},
);
}
}
}
Dealing with macros
The way that Clippy checks if a piece of code comes from a proc macro is a bit special. It starts by loading the source map of the code which is a representation of the actual code as it is written. This source map is then used to check if what you expect to be at the span of the object is also what is in the actual code. This was not implemented for patterns, I only really needed the case of struct patterns, but I ended up implementing it for all patterns:
A big match checking all kinds of patterns
fn pat_search_pat(tcx: TyCtxt<'_>, pat: &rustc_hir::Pat<'_>) -> (Pat, Pat) {
match pat.kind {
rustc_hir::PatKind::Missing | rustc_hir::PatKind::Err(_) => (Pat::Str(""), Pat::Str("")),
rustc_hir::PatKind::Wild => (Pat::Sym(kw::Underscore), Pat::Sym(kw::Underscore)),
rustc_hir::PatKind::Binding(binding_mode, _, ident, Some(end_pat)) => {
let start = match binding_mode.0 {
rustc_hir::ByRef::Yes(rustc_hir::Mutability::Not) => Pat::Str("ref"),
rustc_hir::ByRef::Yes(rustc_hir::Mutability::Mut) => Pat::Str("ref mut"),
rustc_hir::ByRef::No => {
let (s, _) = ident_search_pat(ident);
s
},
};
let (_, end) = pat_search_pat(tcx, end_pat);
(start, end)
},
rustc_hir::PatKind::Binding(binding_mode, _, ident, None) => {
let (s, end) = ident_search_pat(ident);
let start = match binding_mode.0 {
rustc_hir::ByRef::Yes(rustc_hir::Mutability::Not) => Pat::Str("ref"),
rustc_hir::ByRef::Yes(rustc_hir::Mutability::Mut) => Pat::Str("ref mut"),
rustc_hir::ByRef::No => s,
};
(start, end)
},
rustc_hir::PatKind::Struct(path, _, _) => {
let (start, _) = qpath_search_pat(&path);
(start, Pat::Str("}"))
},
rustc_hir::PatKind::TupleStruct(path, _, _) => {
let (start, _) = qpath_search_pat(&path);
(start, Pat::Str(")"))
},
rustc_hir::PatKind::Or(plist) => {
// documented invariant
debug_assert!(plist.len() >= 2);
let (start, _) = pat_search_pat(tcx, plist.first().unwrap());
let (_, end) = pat_search_pat(tcx, plist.last().unwrap());
(start, end)
},
rustc_hir::PatKind::Never => (Pat::Str("!"), Pat::Str("")),
rustc_hir::PatKind::Tuple(_, _) => (Pat::Str("("), Pat::Str(")")),
rustc_hir::PatKind::Box(p) => {
let (_, end) = pat_search_pat(tcx, p);
(Pat::Str("box"), end)
},
rustc_hir::PatKind::Deref(_) => (Pat::Str("deref!("), Pat::Str(")")),
rustc_hir::PatKind::Ref(p, _) => {
let (_, end) = pat_search_pat(tcx, p);
(Pat::Str("&"), end)
},
rustc_hir::PatKind::Expr(expr) => pat_search_pat_expr_kind(expr),
rustc_hir::PatKind::Guard(pat, guard) => {
let (start, _) = pat_search_pat(tcx, pat);
let (_, end) = expr_search_pat(tcx, guard);
(start, end)
},
rustc_hir::PatKind::Range(None, None, range) => match range {
rustc_hir::RangeEnd::Included => (Pat::Str("..="), Pat::Str("")),
rustc_hir::RangeEnd::Excluded => (Pat::Str(".."), Pat::Str("")),
},
rustc_hir::PatKind::Range(r_start, r_end, range) => {
let (start, _) = match r_start {
Some(e) => pat_search_pat_expr_kind(e),
None => match range {
rustc_hir::RangeEnd::Included => (Pat::Str("..="), Pat::Str("")),
rustc_hir::RangeEnd::Excluded => (Pat::Str(".."), Pat::Str("")),
},
};
let (_, end) = match r_end {
Some(e) => pat_search_pat_expr_kind(e),
None => match range {
rustc_hir::RangeEnd::Included => (Pat::Str(""), Pat::Str("..=")),
rustc_hir::RangeEnd::Excluded => (Pat::Str(""), Pat::Str("..")),
},
};
(start, end)
},
rustc_hir::PatKind::Slice(_, _, _) => (Pat::Str("["), Pat::Str("]")),
}
}
fn pat_search_pat_expr_kind(expr: &crate::PatExpr<'_>) -> (Pat, Pat) {
match expr.kind {
crate::PatExprKind::Lit { lit, negated } => {
let (start, end) = lit_search_pat(&lit.node);
if negated { (Pat::Str("!"), end) } else { (start, end) }
},
crate::PatExprKind::ConstBlock(_block) => (Pat::Str("const {"), Pat::Str("}")),
crate::PatExprKind::Path(path) => qpath_search_pat(&path),
}
}
This code is mostly a big match where I define what I would expect what each pattern starts and ends with, it was quite interesting to learn that as soon as you hit HIR you no longer knows where the code comes from.
Conclusion
So a bit after I made the pull request the Clippy team made this announcement, announcing a feature freeze for Clippy for 12 weeks, so for now the PR is still open and may end up having more changes before a potential merge.
All in all no matter the outcome this small project was a interesting view into the internals both Clippy and rustc