-
Notifications
You must be signed in to change notification settings - Fork 11
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add RuntimeAnnotations to supply Gematria models with cache miss information. #27
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overal, the PR looks good. I'd just go through the naming generailization (what is in the comments in the proto file, and corresponding renames elsewhere).
* Places `AnnotationProto` up one level so that it can be used with `BasicBlock`s along with `CanonicalizedInstruction`s. * Annotations are now stored as `repeated` in protos, and in `vector`s or `list`s rather than each type of annotation having a corresponding data member. * Naming has changed accordingly e.g. `RuntimeAnnotation` becomes `Annotation`.
…atria into model-runtime-annotations
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I went through basic_block/**
and proto/**
. I still need to read through granite/**
.
I have made most of the requested changes. I have some questions about whether I should make the changes related to name-only annotations or get rid of them entirely, which I've mentioned in an earlier comment replying to one of the requested changes. Thanks for your PR review and comments. |
// Store the annotations for later use (inclusion in embeddings), using -1 | ||
// as a default value wherever annotations are missing. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm wondering if there is a better default than -1, but I can't think of any.
* Annotations no longer have `value == -1` by default. * Entries in the `instruction_annotations` data structure corresponding to missing annotations still hold -1 as a placeholder.
* Instruction annotation values are stored in a `vector` of `vector`s instead of a `unordered_map` mapping the annotation type names to a `vector` holding annotation values. * Annotation names are now stored separately in a `set`, and a list of the names of annotation types being used is passed into the graph builder's constructor.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added a few more comments, but otherwise it looks good.
// Extracts the set of annotation names from the model. This should be a Const | ||
// tensor, and as such, it should be readable without providing any inputs. | ||
// Returns an error when the annotation names tensor is not found or when it is | ||
// not readable. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: Please add a TODO (for me or for you) to extract most of the logic from here and the GetNodeTokenList() to a single function.
This patch adds the necessary infrastructure for creating lit tests for convert_bhive_to_llvm_exegesis_input. This patch also adds in a single unit test to demonstrate the functionality.
This patch adds some additional lit regression tests for the Exegesis conversion script to test functionality that was previously implemented/improved now that we actually have testing facilities.
This patch removes the comment specifying that the call to GetAccessedAddrs only got the first segfault rather than iteratively looking through each one. This isn't true in the Exegesis case. It is true in the current version for the fast annotator, but I believe this has already been fixed internally and just needs to be pushed out into the open.
Currently, when disassembly fails, the error is reported to the user. However, before this patch, there was a missing apostrophe at the end of the hex value. This patch fixes that behavior by adding the missing apostrophe.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few more changes, otherwise this should be good to go.
This patch adds a none annotator type to convert_bhive_to_llvm_exegesis_input that doesn't run any annotator. This is useful for looking at the output assembly for debugging purposes.
Currently, the --report_progress_every flag, even with the default value, will always report progress on the first block. This is not the intended behavior, as no output should be given with the default value. This patch fixes that behavior and also adds a test to ensure that the flag functions as intended in the default value case and in the value-provided case. Not really super critical to functionality, but a little bit annoying to have this.
Currently, the logic in convert_bhive_to_llvm_exegesis_input will create extra (empty) files as the output at the end runs unconditionally rather than when there are actually blocks to output. This means that if we have a number of blocks that is an exact multiple of the value in blocks_per_json_file, we end up getting no blocks after the loop, which means we output an empty JSON file. Fixes google#61.
This patch fixes a clang-formatting issue introduced in a previous patch.
This PR is being split into 4 separate PRs:
|
The 4th PR has now been opened as well: |
Adds a general framework to add instruction level annotations to the basic block representations, intended to be used to store cache miss frequencies (and later other values, such as branch misprediction related measures).