use old ctx if has same expand environment during decode span #127279

Open
wants to merge 1 commit into
base: master

Conversation

bvanjoi
Contributor

@bvanjoi bvanjoi commented Jul 3, 2024

Fixes #112680

The root cause of #112680 failing on the second (incremental) compilation is that opaque differs between the span of the field ident and the span that tcx.def_ident_span(field.did) loads from the incremental cache.

  • Let's call the span of the ident span_a. It is generated by apply_mark_internal, and its content is similar to:
span_a_ctx -> SyntaxContextData {
      opaque: span_a_ctx,
      opaque_and_semitransparent: span_a_ctx,
      // ....
}
  • And let's call the span returned by tcx.def_ident_span span_b. It is generated by decode_syntax_context, and its content is:
span_b_ctx -> SyntaxContextData {
      opaque: span_b_ctx,
      // note that `span_b_ctx` is not the same as `span_a_ctx`
      opaque_and_semitransparent: span_b_ctx,
      // ....
}

Although they have the same parent (both refer to the root) and the same outer_expn, I cannot find any specific connection between them. Therefore, I chose a solution that may not be the best: skip the incremental compilation cache for this query so that span_a is used in this case.

r? @petrochenkov Do you have any advice on this? Or perhaps this solution is acceptable?

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Jul 3, 2024
@petrochenkov
Contributor

Could you check if the issue reproduces with #119412?
ident in macros from other crates is known to have buggy spans, but I don't know whether this issue is related to that or not.

@bvanjoi
Contributor Author

bvanjoi commented Jul 4, 2024

Could you check if the issue reproduces with #119412

Unfortunately, the issue still reproduces.

@cjgillot
Contributor

cjgillot commented Jul 4, 2024

Whether def_ident_span is cached on disk or not must not change the returned value. If it does, the bug is in cache decoding. Avoiding the cache will just hide the bug until another code path hits it.

@petrochenkov petrochenkov added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jul 4, 2024
@bvanjoi
Contributor Author

bvanjoi commented Jul 4, 2024

Whether def_ident_span is cached on disk or not must not change the returned value

Yep, but I think this might be an oversight in how spans from the disk cache are decoded: sometimes it is not necessary to generate a new ctxt, and reusing the old one is enough. So in the latest version of this PR, I added this condition: if the span comes from the same macro expansion environment and the old ctxt exists, then use the old one.

This means the span_b will become:

span_b_ctx -> SyntaxContextData {
      opaque: span_a.opaque,
      opaque_and_semitransparent: span_a.opaque_and_semitransparent,
      // ....
}

I'm not sure if it's completely correct, but it's more convincing than simply disabling the disk cache for ident spans.
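
A minimal sketch of that condition, not the PR's actual code. It assumes "same expansion environment" means the same (parent, outer_expn, outer_transparency) triple. Table, alloc, and decode are hypothetical names used only for illustration; the real logic lives in the compiler's hygiene code and is more involved.

```rust
use std::collections::HashMap;

#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
struct SyntaxContext(u32);

#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
struct ExpnId(u32);

#[allow(dead_code)]
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
enum Transparency {
    Transparent,
    SemiTransparent,
    Opaque,
}

#[allow(dead_code)]
#[derive(Clone, Copy, Debug)]
struct SyntaxContextData {
    outer_expn: ExpnId,
    outer_transparency: Transparency,
    parent: SyntaxContext,
    opaque: SyntaxContext,
    opaque_and_semitransparent: SyntaxContext,
}

/// Assumed definition of "expansion environment": the three substantial fields.
type Key = (SyntaxContext, ExpnId, Transparency);

struct Table {
    data: Vec<SyntaxContextData>,
    by_key: HashMap<Key, SyntaxContext>,
}

impl Table {
    /// Starts with only the root context at index 0.
    fn new() -> Self {
        let root = SyntaxContext(0);
        let root_data = SyntaxContextData {
            outer_expn: ExpnId(0),
            outer_transparency: Transparency::Opaque,
            parent: root,
            opaque: root,
            opaque_and_semitransparent: root,
        };
        Table { data: vec![root_data], by_key: HashMap::new() }
    }

    /// Creates a context whose auxiliary fields point at itself, like span_a_ctx.
    fn alloc(&mut self, key: Key) -> SyntaxContext {
        let ctxt = SyntaxContext(self.data.len() as u32);
        self.data.push(SyntaxContextData {
            outer_expn: key.1,
            outer_transparency: key.2,
            parent: key.0,
            opaque: ctxt,
            opaque_and_semitransparent: ctxt,
        });
        self.by_key.insert(key, ctxt);
        ctxt
    }

    /// Hypothetical decode step: reuse the old ctxt when one already exists
    /// for the same expansion environment, otherwise create a new one.
    fn decode(&mut self, key: Key) -> SyntaxContext {
        if let Some(&existing) = self.by_key.get(&key) {
            return existing;
        }
        self.alloc(key)
    }
}
```

With this shape, decoding a context that expansion has already created in the current session returns span_a_ctx itself, so opaque and opaque_and_semitransparent match by construction.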

@bvanjoi bvanjoi changed the title from "not cache def_ident_span from disk" to "use old ctx if has same expand environment during decode span" Jul 4, 2024
@cjgillot cjgillot self-assigned this Jul 4, 2024
@petrochenkov
Contributor

I think the new fix is in the right direction.

In SyntaxContextData these three fields are substantial:

    outer_expn: ExpnId,
    outer_transparency: Transparency,
    parent: SyntaxContext,

and these two fields are caches / precomputed values for some operations on the substantial fields:

    /// This context, but with all transparent and semi-transparent expansions filtered away.
    opaque: SyntaxContext,
    /// This context, but with all transparent expansions filtered away.
    opaque_and_semitransparent: SyntaxContext,

The last field seems to also be ignored during decoding (with a reasonable explanation).

    /// Name of the crate to which `$crate` with this context would resolve.
    dollar_crate_name: Symbol,

So there are two possible strategies for encoding/decoding SyntaxContextData.

  • Encode/decode both the substantial and the auxiliary fields.
    This strategy is used now, but apparently there's a bug somewhere that prevents the decoded fields from matching.
    I'm actually interested in why they don't match; maybe the root issue is somewhere else and the new fix will just hide it as well.
  • Encode/decode only the substantial fields.
    Recompute the remaining fields (possibly using a cache as well, see syntax_context_map for an example).
    We should do this if it is faster than encoding/decoding the auxiliary fields.
    However, I still think we need to first figure out why the auxiliary fields decoding misbehaves now.
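
A rough sketch of the second strategy, under stated assumptions: only the substantial fields are stored, and the auxiliary fields are recomputed through a cached "apply mark" step. The code is a simplified model loosely following the description above, not the compiler's implementation. Table, Encoded, intern, and decode are hypothetical names, and the parent is assumed to be already decoded.

```rust
use std::collections::HashMap;

#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
struct SyntaxContext(u32);

#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
struct ExpnId(u32);

/// Variant order matters: Transparent < SemiTransparent < Opaque.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash, PartialOrd, Ord)]
enum Transparency {
    Transparent,
    SemiTransparent,
    Opaque,
}

#[allow(dead_code)]
#[derive(Clone, Copy, Debug)]
struct SyntaxContextData {
    outer_expn: ExpnId,
    outer_transparency: Transparency,
    parent: SyntaxContext,
    opaque: SyntaxContext,
    opaque_and_semitransparent: SyntaxContext,
}

/// Hypothetical on-disk form holding only the substantial fields.
struct Encoded {
    outer_expn: ExpnId,
    outer_transparency: Transparency,
    parent: SyntaxContext,
}

struct Table {
    data: Vec<SyntaxContextData>,
    map: HashMap<(SyntaxContext, ExpnId, Transparency), SyntaxContext>,
}

impl Table {
    fn new() -> Self {
        let root = SyntaxContext(0);
        let root_data = SyntaxContextData {
            outer_expn: ExpnId(0),
            outer_transparency: Transparency::Opaque,
            parent: root,
            opaque: root,
            opaque_and_semitransparent: root,
        };
        Table { data: vec![root_data], map: HashMap::new() }
    }

    /// Returns the cached context for the key, or creates it.
    /// `None` for an auxiliary field means "the new context itself".
    fn intern(
        &mut self,
        parent: SyntaxContext,
        expn: ExpnId,
        tr: Transparency,
        opaque: Option<SyntaxContext>,
        semi: Option<SyntaxContext>,
    ) -> SyntaxContext {
        if let Some(&ctxt) = self.map.get(&(parent, expn, tr)) {
            return ctxt;
        }
        let new = SyntaxContext(self.data.len() as u32);
        self.data.push(SyntaxContextData {
            outer_expn: expn,
            outer_transparency: tr,
            parent,
            opaque: opaque.unwrap_or(new),
            opaque_and_semitransparent: semi.unwrap_or(new),
        });
        self.map.insert((parent, expn, tr), new);
        new
    }

    /// Computes the context for `ctxt` plus one more expansion, deriving
    /// the auxiliary fields from the parent's auxiliary fields.
    fn apply_mark(&mut self, ctxt: SyntaxContext, expn: ExpnId, tr: Transparency) -> SyntaxContext {
        let mut opaque = self.data[ctxt.0 as usize].opaque;
        let mut semi = self.data[ctxt.0 as usize].opaque_and_semitransparent;
        if tr >= Transparency::Opaque {
            opaque = self.intern(opaque, expn, tr, None, None);
        }
        if tr >= Transparency::SemiTransparent {
            semi = self.intern(semi, expn, tr, Some(opaque), None);
        }
        self.intern(ctxt, expn, tr, Some(opaque), Some(semi))
    }

    /// Decoding never reads auxiliary fields from disk; it recomputes them.
    fn decode(&mut self, enc: Encoded) -> SyntaxContext {
        self.apply_mark(enc.parent, enc.outer_expn, enc.outer_transparency)
    }
}
```

Because decoding goes through the same cached path as expansion, a context that already exists in the session is returned as-is. Whether this is faster than encoding the auxiliary fields would still need measuring.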

@bvanjoi
Contributor Author

bvanjoi commented Jul 5, 2024

why they don't match

As you can see, the fields outer_expn, outer_transparency, and parent have not changed, so we don't need to consider them. I'd like to explain why opaque is not equal between the first compilation and the second with incremental compilation.

The content of the span that needs to be encoded is:

span_a_ctx -> SyntaxContextData {
      opaque: span_a_ctx, 
      //~^ note that `span_a_data.opaque` and `span_a_ctx` have the same value
      // ....
}

And during the second compilation:

  • span_a_ctx is created again by MarkerVisitor, which runs before the incremental cache file is processed.

  • Then, when the query system is used, the value is loaded from the disk cache, and span_a_data.opaque is deserialized into raw_id.

    A new ctxt (that is, span_b_ctx) with dummy data is then appended, because this is the first time raw_id is loaded during decoding (see code here).

    Next, decoding moves on to the content of span_a_data.opaque and enters decode_syntax_context again. Because the raw_id is the same as before, it hits a cycle and directly returns span_b_ctx (see here).

    As a result, span_b_data.opaque differs from span_a_data_created_during_second_compilation.opaque, which causes this issue.
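
The sequence above can be reproduced in a toy model. This is only a sketch under simplifying assumptions: outer_transparency and opaque_and_semitransparent are left out, and Decoder, Serialized, RAW_SPAN_A, and second_session are hypothetical names rather than compiler APIs.

```rust
use std::collections::HashMap;

#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
struct SyntaxContext(u32);

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct ExpnId(u32);

#[allow(dead_code)]
#[derive(Clone, Copy, Debug)]
struct SyntaxContextData {
    outer_expn: ExpnId,
    parent: SyntaxContext,
    opaque: SyntaxContext,
}

/// On-disk form: contexts are referenced by raw ids from the previous session.
#[derive(Clone, Copy)]
struct Serialized {
    outer_expn: ExpnId,
    parent_raw: u32,
    opaque_raw: u32,
}

/// Raw id under which the previous session stored span_a_ctx (arbitrary value).
const RAW_SPAN_A: u32 = 5;

struct Decoder {
    table: Vec<SyntaxContextData>,
    on_disk: HashMap<u32, Serialized>,
    remapped: HashMap<u32, SyntaxContext>,
}

impl Decoder {
    fn decode(&mut self, raw_id: u32) -> SyntaxContext {
        if raw_id == 0 {
            return SyntaxContext(0); // the root always maps to the root
        }
        // Already seen in this session; this lookup is also what breaks cycles.
        if let Some(&ctxt) = self.remapped.get(&raw_id) {
            return ctxt;
        }
        // First time this raw id is seen: append a ctxt with dummy data.
        let new_ctxt = SyntaxContext(self.table.len() as u32);
        let root = SyntaxContext(0);
        self.table.push(SyntaxContextData { outer_expn: ExpnId(0), parent: root, opaque: root });
        self.remapped.insert(raw_id, new_ctxt);
        // Decode the fields; `opaque_raw == raw_id` re-enters and gets `new_ctxt` back.
        let ser = self.on_disk[&raw_id];
        let parent = self.decode(ser.parent_raw);
        let opaque = self.decode(ser.opaque_raw);
        self.table[new_ctxt.0 as usize] =
            SyntaxContextData { outer_expn: ser.outer_expn, parent, opaque };
        new_ctxt
    }
}

/// State of the second compilation just before the disk cache is read:
/// expansion has already recreated span_a_ctx at index 1.
fn second_session() -> (Decoder, SyntaxContext) {
    let root = SyntaxContext(0);
    let span_a_ctx = SyntaxContext(1);
    let table = vec![
        SyntaxContextData { outer_expn: ExpnId(0), parent: root, opaque: root },
        SyntaxContextData { outer_expn: ExpnId(1), parent: root, opaque: span_a_ctx },
    ];
    let mut on_disk = HashMap::new();
    on_disk.insert(
        RAW_SPAN_A,
        Serialized { outer_expn: ExpnId(1), parent_raw: 0, opaque_raw: RAW_SPAN_A },
    );
    (Decoder { table, on_disk, remapped: HashMap::new() }, span_a_ctx)
}
```

Calling decode(RAW_SPAN_A) on the decoder returned by second_session() yields a fresh span_b_ctx whose outer_expn and parent equal those of span_a_ctx, but whose opaque points at span_b_ctx rather than span_a_ctx.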

Labels
S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Development

Successfully merging this pull request may close these issues.

decl_macro incremental compilation bug: missing field
4 participants