Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add RuntimeAnnotations to supply Gematria models with cache miss information. #27

Closed
wants to merge 45 commits into from

Conversation

virajbshah
Copy link
Collaborator

@virajbshah virajbshah commented Oct 30, 2023

Adds a general framework to add instruction level annotations to the basic block representations, intended to be used to store cache miss frequencies (and later other values, such as branch misprediction related measures).

@virajbshah virajbshah changed the title Add RuntimeAnnotation field to instruction proto. Add RuntimeAnnotations to supply Gematria models with cache miss information. Oct 30, 2023
Copy link
Collaborator

@ondrasej ondrasej left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overal, the PR looks good. I'd just go through the naming generailization (what is in the comments in the proto file, and corresponding renames elsewhere).

gematria/proto/canonicalized_instruction.proto Outdated Show resolved Hide resolved
gematria/proto/canonicalized_instruction.proto Outdated Show resolved Hide resolved
gematria/proto/canonicalized_instruction.proto Outdated Show resolved Hide resolved
gematria/basic_block/basic_block.cc Outdated Show resolved Hide resolved
gematria/basic_block/basic_block.h Outdated Show resolved Hide resolved
gematria/basic_block/basic_block.cc Outdated Show resolved Hide resolved
gematria/basic_block/basic_block.h Outdated Show resolved Hide resolved
gematria/basic_block/basic_block.h Outdated Show resolved Hide resolved
gematria/basic_block/basic_block_protos_test.cc Outdated Show resolved Hide resolved
gematria/basic_block/python/basic_block.cc Outdated Show resolved Hide resolved
gematria/granite/graph_builder.cc Outdated Show resolved Hide resolved
gematria/granite/graph_builder.cc Outdated Show resolved Hide resolved
gematria/granite/python/token_graph_builder_model.py Outdated Show resolved Hide resolved
Copy link
Collaborator

@ondrasej ondrasej left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I went through basic_block/** and proto/**. I still need to read through granite/**.

gematria/basic_block/basic_block.cc Outdated Show resolved Hide resolved
gematria/basic_block/basic_block.h Show resolved Hide resolved
gematria/basic_block/basic_block_protos.h Outdated Show resolved Hide resolved
gematria/basic_block/basic_block_protos.cc Outdated Show resolved Hide resolved
gematria/proto/annotation.proto Outdated Show resolved Hide resolved
gematria/proto/annotation.proto Show resolved Hide resolved
@virajbshah
Copy link
Collaborator Author

I have made most of the requested changes. I have some questions about whether I should make the changes related to name-only annotations or get rid of them entirely, which I've mentioned in an earlier comment replying to one of the requested changes.

Thanks for your PR review and comments.

gematria/basic_block/basic_block.h Show resolved Hide resolved
gematria/basic_block/basic_block_protos.cc Outdated Show resolved Hide resolved
gematria/basic_block/basic_block_protos.cc Outdated Show resolved Hide resolved
gematria/granite/graph_builder_test.cc Outdated Show resolved Hide resolved
Comment on lines +195 to +196
// Store the annotations for later use (inclusion in embeddings), using -1
// as a default value wherever annotations are missing.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering if there is a better default than -1, but I can't think of any.

gematria/granite/python/token_graph_builder_model.py Outdated Show resolved Hide resolved
gematria/granite/python/token_graph_builder_model.py Outdated Show resolved Hide resolved
gematria/granite/python/token_graph_builder_model.py Outdated Show resolved Hide resolved
gematria/granite/python/token_graph_builder_model.py Outdated Show resolved Hide resolved
virajbshah and others added 6 commits February 12, 2024 15:02
 * Annotations no longer have `value == -1` by default.
 * Entries in the `instruction_annotations` data structure corresponding
   to missing annotations still hold -1 as a placeholder.
 * Instruction annotation values are stored in a `vector` of `vector`s
   instead of a `unordered_map` mapping the annotation type names to
   a `vector` holding annotation values.
 * Annotation names are now stored separately in a `set`, and a list
   of the names of annotation types being used is passed into the
   graph builder's constructor.
Copy link
Collaborator

@ondrasej ondrasej left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added a few more comments, but otherwise it looks good.

gematria/basic_block/basic_block.h Outdated Show resolved Hide resolved
gematria/granite/graph_builder.h Outdated Show resolved Hide resolved
gematria/granite/graph_builder_model_inference.cc Outdated Show resolved Hide resolved
// Extracts the set of annotation names from the model. This should be a Const
// tensor, and as such, it should be readable without providing any inputs.
// Returns an error when the annotation names tensor is not found or when it is
// not readable.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Please add a TODO (for me or for you) to extract most of the logic from here and the GetNodeTokenList() to a single function.

gematria/granite/graph_builder_model_inference.cc Outdated Show resolved Hide resolved
virajbshah and others added 7 commits March 1, 2024 22:56
This patch adds the necessary infrastructure for creating lit tests for convert_bhive_to_llvm_exegesis_input. This patch also adds in a single unit test to demonstrate the functionality.
This patch adds some additional lit regression tests for the Exegesis
conversion script to test functionality that was previously
implemented/improved now that we actually have testing facilities.
This patch removes the comment specifying that the call to
GetAccessedAddrs only got the first segfault rather than iteratively
looking through each one. This isn't true in the Exegesis case. It is
true in the current version for the fast annotator, but I believe this
has already been fixed internally and just needs to be pushed out into
the open.
Currently, when disassembly fails, the error is reported to the user.
However, before this patch, there was a missing apostrophe at the end of
the hex value. This patch fixes that behavior by adding the missing
apostrophe.
Copy link
Collaborator

@ondrasej ondrasej left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few more changes, otherwise this should be good to go.

gematria/granite/graph_builder.cc Outdated Show resolved Hide resolved
gematria/granite/graph_builder.cc Outdated Show resolved Hide resolved
boomanaiden154 and others added 6 commits March 11, 2024 10:54
This patch adds a none annotator type to
convert_bhive_to_llvm_exegesis_input that doesn't run any annotator.
This is useful for looking at the output assembly for debugging
purposes.
Currently, the --report_progress_every flag, even with the default
value, will always report progress on the first block. This is not the
intended behavior, as no output should be given with the default value.
This patch fixes that behavior and also adds a test to ensure that the
flag functions as intended in the default value case and in the
value-provided case.

Not really super critical to functionality, but a little bit annoying to
have this.
Currently, the logic in convert_bhive_to_llvm_exegesis_input will create
extra (empty) files as the output at the end runs unconditionally rather
than when there are actually blocks to output. This means that if we
have a number of blocks that is an exact multiple of the value in
blocks_per_json_file, we end up getting no blocks after the loop, which
means we output an empty JSON file.

Fixes google#61.
This patch fixes a clang-formatting issue introduced in a previous patch.
@virajbshah
Copy link
Collaborator Author

This PR is being split into 4 separate PRs:

@virajbshah virajbshah closed this Mar 26, 2024
@virajbshah
Copy link
Collaborator Author

The 4th PR has now been opened as well:

@virajbshah virajbshah deleted the model-runtime-annotations branch July 22, 2024 13:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants