Link to my final submission

Google Summer of Code notes

Personal stash of notes I took while working on improving pattern matching for gccrs.

Compiler Glossary (i.e. terms that I encountered for the first time)

  • ADT: Algebraic data types. Refers to structs and enums.

  • Variants: An enum type has multiple variants defined. Variants are the alternative forms a value of the enum can take. Structs only have 1 variant.

  • Scrutinee: The expression that is getting matched. e.g.: match x { ... }, x is the scrutinee.

  • Discriminant: Data stored by enums to distinguish their different variants. Since structs share the same ADT representation as enums in gccrs, they also have a discriminant value, though it should always be set to 0.
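To make the glossary concrete, here is a small Rust sketch of the four terms at the language level. The type and function names (`Shape`, `Point`, `describe`, `same_variant`, `sum`) are made up for illustration and are not taken from gccrs.

```rust
// Two ADTs: an enum with three variants, and a struct (a single variant).
#[allow(dead_code)]
enum Shape {
    Circle(f64),
    Rect { w: f64, h: f64 },
    Empty,
}

struct Point(i32, i32);

// `shape` is the scrutinee; each arm names one variant of `Shape`.
fn describe(shape: &Shape) -> &'static str {
    match shape {
        Shape::Circle(_) => "circle",
        Shape::Rect { .. } => "rect",
        Shape::Empty => "empty",
    }
}

// Two values are the same variant exactly when their discriminants are
// equal, regardless of the payload they carry.
fn same_variant(a: &Shape, b: &Shape) -> bool {
    std::mem::discriminant(a) == std::mem::discriminant(b)
}

// A struct has only one variant, so this pattern can never fail and is
// allowed in a plain `let`.
fn sum(p: Point) -> i32 {
    let Point(x, y) = p;
    x + y
}
```

`std::mem::discriminant` is the standard library's way to observe the discriminant without looking at the payload.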

Classes involved in pattern compilation

  • Rust::Resolver::* & Rust::Resolver2_0::*: Namespaces for the classes that handle name resolution. Admittedly I am not very familiar with this topic, but name resolution is carried out on the AST before lowering to HIR.
  • Rust::HIR::ASTLoweringPattern: Translates patterns from AST to HIR.
  • Rust::Compile::CompilePatternCheckExpr: Compiles the checking expression for every match arm of a pattern.
  • Rust::Compile::CompilePatternBindings: Compiles the bindings of variables (e.g. (x, y, z) => { /* use x, y & z here */ } and Foo::Bar(i) => { /* use i here */ }).
  • Rust::Compile::CompilePatternLet: Compiles let statements like let <PATTERN> = my_expression (e.g. for TuplePattern, let (x, y, z) = my_tensor).

IdentifierPattern matching

rust-compile-pattern.h: CompilePatternCheckExpr's void visit (HIR::IdentifierPattern &) override is implemented as always returning boolean_true_node, even in the case where a subpattern is specified with @. The function has to be modified to return a check expression if a subpattern is specified.
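At the language level, an arm like `a @ 2` has two jobs: test the scrutinee against the subpattern `2`, and bind the whole value to `a`. That is why a constant-true check is only right for a bare identifier. A minimal Rust sketch of the expected behaviour (the function name `classify` is made up for illustration):

```rust
// Each arm with `@` only matches when its subpattern matches; the bare
// identifier in the last arm matches anything.
fn classify(x: i32) -> String {
    match x {
        a @ 2 => format!("two: {a}"),
        a @ 3..=9 => format!("small: {a}"),
        a => format!("other: {a}"),
    }
}
```

If the check for `a @ 2` were always true, every input would take the first arm.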

Dumping hir-pretty shows that the IdentifierPattern's subpattern does not make it into the HIR. Given the following simple source:

fn main() {
  let x = 2;

  match x {
    a @ 2 => {}
  }
}

We get the following AST dump, followed by the HIR dump:

fn main() {
  let x = 2;
  match x {
    a @ 2 => {
    }
    ,
  } /* tail expr */ 
}
MatchExpr [
  inner_attrs: empty
  mapping: [C: 0 Nid: 20 Hid: 28]
  outer_attributes: empty
  branch_value: 
    PathInExpression [
        segments: x, 
        mapping: [C: 0 Nid: 12 Hid: 23]
        outer_attributes: empty
        has_opening_scope_resolution: 0
    ] // PathInExpression
  match_arms {
    MatchCase [
      arm {
        MatchArm [
          match_arm_patterns {
            IdentifierPattern [
                variable_ident: a
                is_ref: 0
                mut: 0
                to_bind: none
            ] // IdentifierPattern
          } // match_arm_patterns
        ] // MatchArm
      } // arm
      expr: 
        BlockExpr [
          mapping: [C: 0 Nid: 18 Hid: 25]
          outer_attributes: empty
          inner_attrs: empty
          tail_reachable: 1
          statements: empty
        ] // BlockExpr
    ] // MatchCase
  } // match_arms
] // MatchExpr

Despite the subpattern showing up on the AST, the subpattern is not lowered to the HIR.

Strategy

Pattern lowering (I'm guessing rust-ast-lower-pattern.cc: void ASTLoweringPattern::visit (AST::IdentifierPattern &pattern)) has to be updated to recognize subpatterns of an IdentifierPattern. [DONE ✅]

Afterwards, reimplement void visit (HIR::IdentifierPattern &) override to compile a check expression if the to_bind subpattern is present (TO ASK: should this be renamed to subpattern instead? Update: to_bind was renamed to subpattern). [DONE ✅]

There still exist other areas where subpattern support is missing (e.g. under CompileVarDecl, ClosureParamInfer); those should be implemented with a lower priority. [TO-DO ❌]

TupleStructPattern matching

CompilePatternCheckExpr::visit (HIR::TupleStructPattern &pattern): payload_ref is of type INTEGER_TYPE (verified with GDB), which fails the check on the next line when it is passed as a parameter into Backend::struct_field_expression. This is a result of the code assuming that every TupleStructPattern compiled here matches an enum variant, which has a different data representation from a plain tuple struct.

Strategy

I have to add a new branch in the compilation logic for TupleStructPattern for non-enum tuple structs. The compilation code for this should follow that of TuplePattern's. [DONE ✅]
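The two cases look almost identical in source, which may be why the distinction was easy to miss. A small Rust sketch (the names `Pair`, `Message`, `is_origin` and `is_move_to_origin` are made up for illustration):

```rust
// Plain tuple struct: there is only one variant, so nothing to test
// besides the fields.
struct Pair(i32, i32);

enum Message {
    Move(i32, i32),
    Quit,
}

// Only the fields need checking here, as with a tuple pattern.
fn is_origin(p: &Pair) -> bool {
    match p {
        Pair(0, 0) => true,
        Pair(_, _) => false,
    }
}

// Here the variant has to be checked before the payload fields are read.
fn is_move_to_origin(m: &Message) -> bool {
    match m {
        Message::Move(0, 0) => true,
        Message::Move(_, _) | Message::Quit => false,
    }
}
```

Both `Pair(0, 0)` and `Message::Move(0, 0)` are tuple struct patterns, but only the second one involves a discriminant check.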

TuplePattern w/ RestPattern

Type checking and compilation of check expressions are not implemented.

Strategy

Simply complete the functions for the ItemType::RANGED cases.

  • Type checking (TypeCheckPattern::visit(HIR::TuplePattern)) [DONE ✅ | Backlog TO-DO ❌: Update the type checker to continue type checking even after the size check fails]
  • Check expression compilation (CompilePatternCheckExpr::visit(HIR::TuplePattern)) [UNDER REVIEW 🚧]
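For reference, this is the behaviour the two items above need to support, shown as a short Rust sketch (the function names `ends` and `starts_with_one` are made up for illustration):

```rust
// `..` skips any number of fields in the middle of the tuple.
fn ends(t: (i32, i32, i32, i32)) -> (i32, i32) {
    match t {
        (first, .., last) => (first, last),
    }
}

// `..` can also sit at the end, leaving only leading patterns to check.
fn starts_with_one(t: (i32, i32, i32)) -> bool {
    matches!(t, (1, ..))
}
```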

SlicePattern

Type checking and compilation of check expressions are not implemented.

Strategy

Simply complete the functions.

  • Type checking (TypeCheckPattern::visit(HIR::SlicePattern)) [ON-GOING 🚧]
  • Check expression compilation (CompilePatternCheckExpr::visit(HIR::SlicePattern), currently just returns an always-true node) [TO-DO ❌]
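A slice pattern can be matched against either an array or a slice, and the two differ in when the length is known. A minimal Rust sketch of both cases (the function names `is_unit_x` and `sum_pair` are made up for illustration):

```rust
// Array scrutinee: the length is part of the type, so a pattern of the
// wrong length is a type error and only element checks remain.
fn is_unit_x(v: [i32; 3]) -> bool {
    matches!(v, [1, 0, 0])
}

// Slice scrutinee: the length is only known at run time, so it has to
// be compared before any element is inspected.
fn sum_pair(s: &[i32]) -> Option<i32> {
    match s {
        [a, b] => Some(a + b),
        _ => None,
    }
}
```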

SlicePattern w/ RestPattern

AST lowering for RestPattern is not properly implemented: the RestPattern is translated into a nullptr, likely due to the absence of an AST lowering function that translates it. It is safe to assume that type checking and compilation of check expressions are also unimplemented.

Strategy

  • Implement AST Lowering for RestPattern in SlicePattern... [TO-DO ❌]
  • Followed by type checking (should be easy)... [TO-DO ❌]
  • And finally, full compilation support in CompilePatternCheckExpr::visit(HIR::SlicePattern). [TO-DO ❌]
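The target behaviour for the three steps above, as a short Rust sketch (the function names `first_and_last` and `middle` are made up for illustration):

```rust
// `[first, .., last]` needs at least two elements; the other arms cover
// the shorter slices.
fn first_and_last(s: &[i32]) -> Option<(i32, i32)> {
    match s {
        [first, .., last] => Some((*first, *last)),
        [only] => Some((*only, *only)),
        [] => None,
    }
}

// Inside a slice pattern the rest can also be bound with `@`, which
// yields the skipped elements as a subslice.
fn middle(s: &[i32]) -> &[i32] {
    match s {
        [_, mid @ .., _] => mid,
        _ => &[],
    }
}
```

With `..` present, the length check becomes "at least as many elements as there are other patterns" rather than an exact match.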

Week 1: Debugging changes made for IdentifierPattern support

Friday (2025-06-06) log:

Made some changes to support lowering of the IdentifierPattern subpattern from AST to HIR. After finding out that compilation of subpatterns is faulty and doing a lot of GDB-ing, I'm still not too sure what is causing the following check to fail, which causes the expression of the subpattern to be compiled into an error_mark tree node...

// gcc/rust/backend/rust-compile-expr.cc
...
void
CompileExpr::visit (HIR::LiteralExpr &expr)
{
  TyTy::BaseType *tyty = nullptr;
  if (!ctx->get_tyctx ()->lookup_type (expr.get_mappings ().get_hirid (),
				       &tyty))
    return;
...

Gonna spend the weekend figuring out the type-check system in order to find out what change I should be making to make the type check context do this lookup properly.

Sunday (2025-06-08) log:

Turns out I needed to add the corresponding type checking in gcc/rust/typecheck/rust-hir-type-check-pattern.cc (Thanks Owen Avery for pointing that out!) - have to keep this in mind when implementing support for other patterns.

Week 2: Debugging bindings for IdentifierPattern

Tuesday (2025-06-10) log:

I have to make the compiler also compile the bindings of the subpattern to the match scrutinee expression, but it does not seem to be as simple as adding this under CompilePatternBindings for IdentifierPattern:

  if (pattern.has_subpattern ())
  {
    CompilePatternBindings::Compile (pattern.get_subpattern (),
                                     match_scrutinee_expr, ctx);
  }

Did a HIR dump comparison between subpatterned and non-subpatterned patterns. For now I am unable to see why the short snippet does not compile the bindings of subpatterns properly. I have to look at GDB again...

Friday (2025-06-13) log:

I have successfully implemented and tested subpattern bindings for IdentifierPattern. It turns out that the codebase did not support name resolution of subpatterns, which I have added in my pull request (Thanks Pierre-Emmanuel Patry for the guidance!).

While there are still more scenarios where IdentifierPattern subpatterns are unsupported (like this example of a subpattern being used in a let statement), I think it is more important to move on to implementing minimal support for other patterns first. I will start investigating the TupleStructPattern compilation errors tomorrow.

Week 3: Fixing compilation for TupleStructPattern

A pretty unproductive week as I was busy spending time with family and friends. With a better understanding of how pattern compilation works in gccrs, I was able to deduce that the failures in compiling TupleStructPattern come from an oversight that assumes all TupleStructPattern-s are enums - this is partly due to enums being translated to TupleStructPattern-s when being lowered from AST to HIR. PR fixing this issue.

Week 4: Making RestPattern work

Thursday (2025-06-26) log:

As mentioned in my initial proposal, type checking support was not implemented for RestPattern and SlicePattern. I went about implementing type checking support for RestPattern in tuples (known as ItemType::RANGED in the codebase), which is derived from the default tuple type checking but split into more steps so that the lower and upper patterns (i.e. the patterns to the left and right of ..) are type checked separately. PR implementing this type checking.
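The lower/upper split can be modelled as plain index arithmetic. The sketch below is not the gccrs code; it is a hypothetical Rust helper (`rest_field_indices` is a made-up name) that shows which tuple field each pattern lines up with, and when the size check has to fail:

```rust
/// For a tuple of `arity` fields matched by `lower` patterns before `..`
/// and `upper` patterns after it, return the field index each pattern is
/// checked against, or `None` if the pattern has more elements than the
/// tuple has fields.
fn rest_field_indices(arity: usize, lower: usize, upper: usize) -> Option<Vec<usize>> {
    if lower + upper > arity {
        return None;
    }
    // Lower patterns line up with the first `lower` fields...
    let mut indices: Vec<usize> = (0..lower).collect();
    // ...and upper patterns with the last `upper` fields.
    indices.extend((arity - upper)..arity);
    Some(indices)
}
```

For example, `(a, b, .., z)` against a 5-tuple checks fields 0, 1 and 4, while `(a, b, .., z)` against a 2-tuple is rejected.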

Sunday (2025-06-29) log:

Full compilation support for RestPattern for tuples was implemented (Link to PR).

Subsequently, while waiting for the PR to be reviewed, I started looking into SlicePattern, and identified the functions that require implementation, as highlighted above in my notes.

Week 5: Type checking SlicePattern (Part 1)

Tiring week, didn't get to work on gccrs much after my work hours during weekdays. Slept through Saturday too.

SlicePattern type checking is more tricky than it seems.

  • rustc reference function (check_pat_slice)
  • Issue: The HIR implementation of ArrayType in gccrs does not have a known capacity at compile time - the capacity is compiled to an expression instead. This was actually a FIXME in ArrayType's code, maybe a future challenge for me.
    • Looking at rustc's code, I don't think there is a need to check capacity against a SliceType parent...?
  • To-do: Loop through all elements in the SlicePattern and type check each of them against the base type of the ArrayType. This can probably also be done on SliceType.
  • To-do: Set inferred type to be the ArrayType or SliceType depending on the parent.
  • Future challenge: AST Lowering must be updated to accommodate RestPattern's presence in the SlicePattern, splitting it into lower and upper patterns similarly to TuplePattern - a problem for future me to solve since I don't have experience with the AST Lowering codebase yet~

Week 6: Type checking SlicePattern (Part 2, mid-term evaluation due end of the week)

Was not diligent in updating logs for weeks 6-8, so the logs for the 3 weeks are quick summaries of what I've done within the period. PRs merged for this week:

Week 7: Codegen for SlicePattern matching (Part 1)

Worked on implementing compilation for SlicePattern matching against an ArrayType scrutinee. Didn't work on the SliceType scrutinee as I was still not confident about it; I had to ask Philip and take a bit more time to figure it out myself.

Week 8: Codegen for SlicePattern matching (Part 2)

Only worked on the PR to implement compilation for SlicePattern matching against a SliceType scrutinee. The main challenge was creating a new helper function in the backend code to generate code that accesses slice elements. It took me a long while to figure out how SliceType is represented as gcc trees within gccrs, though it's mostly thanks to the very helpful debug function in rust-gcc.cc, and I'm glad to have implemented it properly with the feedback and help from Philip as well!
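As a rough mental model only: in Rust, a slice reference is a "fat pointer" made of a data pointer and a length. The exact gcc tree layout gccrs uses is not recorded in these notes, so the `RawSlice` struct and the `from_slice` and `element_at` helpers below are hypothetical stand-ins, written in Rust rather than as backend code, to show what an element access has to do:

```rust
use std::marker::PhantomData;

/// Hypothetical stand-in for a slice value: a data pointer plus a length.
/// The lifetime ties it to the slice it was built from.
struct RawSlice<'a> {
    data: *const i32,
    len: usize,
    _marker: PhantomData<&'a [i32]>,
}

fn from_slice(s: &[i32]) -> RawSlice<'_> {
    RawSlice { data: s.as_ptr(), len: s.len(), _marker: PhantomData }
}

/// Read element `index`, comparing it against the stored length first.
fn element_at(s: &RawSlice<'_>, index: usize) -> Option<i32> {
    if index >= s.len {
        return None;
    }
    // SAFETY: `index < len`, and `data` points at `len` live elements
    // for as long as the borrowed source slice is alive.
    Some(unsafe { *s.data.add(index) })
}
```

The length comparison is the same one a slice pattern needs before any of its element checks can run.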

Week 9: RestPattern support for SlicePattern

The plan is to update the parser code and every stage after it to add support for RestPattern in SlicePattern. I foresee that it will also take a while, since I need to orient myself in the parser code, which I have never touched before.

The other thing I noticed is code duplication in TupleStructItems, TuplePatternItems and (soon) SlicePatternItems. Once I'm done with the above, I plan to ask Philip whether it would be wise to refactor that code into one common PatternItems class that stores a list of multiple patterns. If things go smoothly, my project should reach its conclusion by the end of Week 10/start of Week 11.