Add RuntimeAnnotations to supply Gematria models with cache miss information. #27

virajbshah · 2023-10-30T15:05:43Z

Adds a general framework for attaching instruction-level annotations to the basic block representations, intended to store cache miss frequencies (and later other values, such as branch-misprediction-related measures).

ondrasej

Overall, the PR looks good. I'd just go through the naming generalization (what is in the comments in the proto file, and corresponding renames elsewhere).

gematria/proto/canonicalized_instruction.proto

gematria/basic_block/basic_block.cc

gematria/basic_block/basic_block.h

gematria/basic_block/basic_block.cc

gematria/basic_block/basic_block.h

gematria/basic_block/basic_block_protos_test.cc

gematria/basic_block/python/basic_block.cc

gematria/granite/graph_builder.cc

gematria/granite/python/token_graph_builder_model.py

* Places `AnnotationProto` up one level so that it can be used with `BasicBlock`s along with `CanonicalizedInstruction`s.
* Annotations are now stored as `repeated` in protos, and in `vector`s or `list`s rather than each type of annotation having a corresponding data member.
* Naming has changed accordingly, e.g. `RuntimeAnnotation` becomes `Annotation`.
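A minimal sketch of the layout these changes describe, assuming hypothetical Python stand-ins (`Annotation`, `Instruction`, `BasicBlock`, `find_annotation`) rather than the actual Gematria classes or generated proto types. Each annotation is a named value, and instructions and blocks hold a generic list of them instead of one data member per annotation type. The annotation name `cache_miss_freq` is also made up for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class Annotation:
    """Hypothetical stand-in for an annotation: a named numeric value."""
    name: str
    value: float


@dataclass
class Instruction:
    mnemonic: str
    # One generic list instead of a dedicated member per annotation type.
    annotations: List[Annotation] = field(default_factory=list)

    def find_annotation(self, name: str) -> Optional[Annotation]:
        """Returns the first annotation with the given name, or None."""
        return next((a for a in self.annotations if a.name == name), None)


@dataclass
class BasicBlock:
    instructions: List[Instruction] = field(default_factory=list)
    # Assumption: block-level annotations reuse the same annotation type.
    annotations: List[Annotation] = field(default_factory=list)


block = BasicBlock(
    instructions=[
        Instruction("MOV", [Annotation("cache_miss_freq", 0.25)]),
        Instruction("ADD"),
    ]
)
```

Adding a new kind of annotation in this layout only requires a new name, not a new field on every class.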


ondrasej

I went through basic_block/** and proto/**. I still need to read through granite/**.

gematria/basic_block/basic_block.cc

gematria/basic_block/basic_block.h

gematria/basic_block/basic_block_protos.h

gematria/basic_block/basic_block_protos.cc

gematria/proto/annotation.proto


virajbshah · 2024-01-22T16:56:03Z

I have made most of the requested changes. I have some questions about whether I should make the changes related to name-only annotations or get rid of them entirely, which I've mentioned in an earlier comment replying to one of the requested changes.

Thanks for your PR review and comments.

gematria/basic_block/basic_block.h

gematria/basic_block/basic_block_protos.cc

gematria/granite/graph_builder_test.cc

ondrasej · 2024-02-04T23:55:36Z

gematria/granite/graph_builder.cc

+    // Store the annotations for later use (inclusion in embeddings), using -1
+    // as a default value wherever annotations are missing.


I'm wondering if there is a better default than -1, but I can't think of any.
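For illustration, the placeholder scheme discussed here can be sketched as follows. This is not the graph builder's actual code: `build_annotation_rows`, `MISSING_ANNOTATION`, the dict-based instruction representation, and the row-per-instruction orientation are all assumptions made for the example.

```python
from typing import Dict, List, Sequence

# Placeholder stored wherever an instruction has no value for an annotation.
MISSING_ANNOTATION = -1.0


def build_annotation_rows(
    instructions: Sequence[Dict[str, float]],
    annotation_names: Sequence[str],
) -> List[List[float]]:
    """Returns one row per instruction and one column per annotation name."""
    return [
        [instr.get(name, MISSING_ANNOTATION) for name in annotation_names]
        for instr in instructions
    ]


rows = build_annotation_rows(
    [{"cache_miss_freq": 0.25}, {}],
    ["cache_miss_freq", "branch_mispredict_freq"],
)
```

A negative placeholder is only unambiguous as long as real annotation values (such as frequencies) are never negative, which is presumably why -1 was hard to improve on.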

gematria/granite/python/graph_builder_model_base.py

gematria/granite/python/token_graph_builder_model.py

* Annotations no longer have `value == -1` by default.
* Entries in the `instruction_annotations` data structure corresponding to missing annotations still hold -1 as a placeholder.

* Instruction annotation values are stored in a `vector` of `vector`s instead of an `unordered_map` mapping the annotation type names to a `vector` holding annotation values.
* Annotation names are now stored separately in a `set`, and a list of the names of annotation types being used is passed into the graph builder's constructor.
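A rough sketch of the storage change described above, using a hypothetical `AnnotationStore` class in Python rather than the real C++ graph builder. The names are fixed at construction time and kept in sorted order (mirroring an ordered set), so each name maps to a stable index into a list of lists. The orientation of the nested lists is an assumption.

```python
from typing import Dict, Iterable, List


class AnnotationStore:
    """Hypothetical stand-in for the graph builder's annotation storage."""

    def __init__(self, annotation_names: Iterable[str]):
        # Deduplicate and sort so every name gets a stable index.
        self.annotation_names: List[str] = sorted(set(annotation_names))
        self._index = {name: i for i, name in enumerate(self.annotation_names)}
        # values[i][j] is the value of annotation i for instruction j.
        self.values: List[List[float]] = [[] for _ in self.annotation_names]

    def add_instruction(self, annotations: Dict[str, float]) -> None:
        """Appends one instruction; unknown names are ignored, gaps get -1."""
        for name, i in self._index.items():
            self.values[i].append(annotations.get(name, -1.0))


store = AnnotationStore(["b", "a", "a"])
store.add_instruction({"a": 1.0, "c": 9.0})  # "c" was not requested.
store.add_instruction({"b": 2.0})
```

Compared with a map keyed by name, the dense layout avoids string lookups when the values are later copied into model inputs, at the cost of requiring the name list up front.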


ondrasej

I've added a few more comments, but otherwise it looks good.

gematria/basic_block/basic_block.h

gematria/granite/graph_builder.h

gematria/granite/graph_builder_model_inference.cc

ondrasej · 2024-02-19T15:01:20Z

gematria/granite/graph_builder_model_inference.cc

+// Extracts the set of annotation names from the model. This should be a Const
+// tensor, and as such, it should be readable without providing any inputs.
+// Returns an error when the annotation names tensor is not found or when it is
+// not readable.


Nit: Please add a TODO (for me or for you) to extract most of the logic from here and the GetNodeTokenList() to a single function.
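The refactoring requested in this nit could look roughly like the sketch below. It is written in Python with a plain mapping standing in for the model's constant tensors, so it shows only the shape of the shared helper, not the TensorFlow Lite calls used in the real C++ code. The helper name `read_string_list_tensor` and the tensor names `token_list` and `annotation_names` are hypothetical.

```python
from typing import List, Mapping, Sequence


def read_string_list_tensor(
    const_tensors: Mapping[str, Sequence[bytes]], tensor_name: str
) -> List[str]:
    """Reads a constant string tensor, failing if it is missing or unreadable."""
    if tensor_name not in const_tensors:
        raise KeyError(f"Tensor not found: {tensor_name}")
    try:
        return [raw.decode("utf-8") for raw in const_tensors[tensor_name]]
    except (AttributeError, UnicodeDecodeError) as error:
        raise ValueError(f"Tensor is not readable: {tensor_name}") from error


def get_node_token_list(const_tensors: Mapping[str, Sequence[bytes]]) -> List[str]:
    # Hypothetical tensor name, used only for this example.
    return read_string_list_tensor(const_tensors, "token_list")


def get_annotation_names(const_tensors: Mapping[str, Sequence[bytes]]) -> List[str]:
    # Hypothetical tensor name, used only for this example.
    return read_string_list_tensor(const_tensors, "annotation_names")


model = {"token_list": [b"MOV", b"ADD"], "annotation_names": [b"cache_miss_freq"]}
```

With the lookup and error handling in one place, the two accessors differ only in the tensor name they pass.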

gematria/granite/graph_builder_model_inference.cc


This patch removes the comment specifying that the call to GetAccessedAddrs only got the first segfault rather than iteratively looking through each one. This isn't true in the Exegesis case. It is true in the current version for the fast annotator, but I believe this has already been fixed internally and just needs to be pushed out into the open.


ondrasej

A few more changes, otherwise this should be good to go.

gematria/granite/graph_builder.cc


Currently, the --report_progress_every flag, even with the default value, will always report progress on the first block. This is not the intended behavior, as no output should be given with the default value. This patch fixes that behavior and also adds a test to ensure that the flag functions as intended in the default value case and in the value-provided case. Not really super critical to functionality, but a little bit annoying to have this.

Currently, the logic in convert_bhive_to_llvm_exegesis_input will create extra (empty) files as the output at the end runs unconditionally rather than when there are actually blocks to output. This means that if we have a number of blocks that is an exact multiple of the value in blocks_per_json_file, we end up getting no blocks after the loop, which means we output an empty JSON file. Fixes google#61.
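The empty-file bug described here comes down to flushing the final batch unconditionally. A minimal Python sketch of the guarded version follows. It is not the tool's actual C++ code: `chunk_blocks` is a hypothetical helper that returns the JSON text of each file instead of writing to disk.

```python
import json
from typing import Iterable, List


def chunk_blocks(blocks: Iterable[dict], blocks_per_json_file: int) -> List[str]:
    """Returns the JSON text of each output file, never emitting an empty one."""
    if blocks_per_json_file <= 0:
        raise ValueError("blocks_per_json_file must be positive")
    outputs: List[str] = []
    pending: List[dict] = []
    for block in blocks:
        pending.append(block)
        if len(pending) == blocks_per_json_file:
            outputs.append(json.dumps(pending))
            pending = []
    # Flush the remainder only when there is one. An unconditional flush here
    # would emit an empty file whenever the block count is an exact multiple
    # of blocks_per_json_file.
    if pending:
        outputs.append(json.dumps(pending))
    return outputs
```

With four blocks and `blocks_per_json_file=2`, the loop already emits both files, so the guarded flush adds nothing.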


virajbshah · 2024-03-26T21:45:36Z

This PR is being split into 4 separate PRs:

Add instruction annotations to the basic block representations. #92
Add support for instruction annotations to the graph builder. #93
Add support for consuming instruction annotations to TokenGraphBuilderModels. #94
And an upcoming PR including the C++ inference API changes from this PR with some improvements.

virajbshah · 2024-03-28T16:18:49Z

The 4th PR has now been opened as well:

Add support for models consuming annotations to C++ inference API. #98

Add RuntimeAnnotation field to instruction proto.

a1c0ff2

virajbshah changed the title ~~Add RuntimeAnnotation field to instruction proto.~~ Add RuntimeAnnotations to supply Gematria models with cache miss information. Oct 30, 2023

ondrasej reviewed Nov 3, 2023

gematria/proto/canonicalized_instruction.proto (3 outdated, resolved review threads)

Add instruction annotations to the instruction node feature vectors.

70b0982

ondrasej requested changes Nov 20, 2023


virajbshah and others added 13 commits November 27, 2023 20:00

Fix to accommodate annotations.

8cb7bb1

Merge branch 'google:main' into model-runtime-annotations

89ba1cf

Merge branch 'main' of github.com:virajbshah/gematria into model-runtime-annotations

a41f803

Make fixes for annotation related bindings and tests.

3e61d7f

Fix typo.

9a424a9

Fix missing function declaration.

e4b9d2e

Add basic test for annotations in the graph builder.

f2a62e8

Add yet some more test cases for annotations.

6db2989

Make adjustments to how annotations are included in node embeddings.

8c7a8a0

Merge branch 'model-runtime-annotations' of github.com:virajbshah/gematria into model-runtime-annotations

c16b29f

Fix formatting.

c2cc43b

Merge branch 'google:main' into model-runtime-annotations

a567109

ondrasej requested changes Jan 21, 2024


virajbshah and others added 3 commits January 22, 2024 22:15

Make requested changes to instruction annotations patch.

9134c0b

Merge branch 'google:main' into model-runtime-annotations

b2b5d88

Merge branch 'model-runtime-annotations' of github.com:virajbshah/gematria into model-runtime-annotations

fc7bceb

ondrasej requested changes Feb 5, 2024


virajbshah and others added 6 commits February 12, 2024 15:02

Make requested changes to basic block proto conversion.

56bc854

Remove partial support for name only annotations.

b3571e0

* Annotations no longer have `value == -1` by default.
* Entries in the `instruction_annotations` data structure corresponding to missing annotations still hold -1 as a placeholder.

Make requested change to graph builder instruction_annotations test.

444a6c9

Merge branch 'google:main' into model-runtime-annotations

fbd9179

Merge branch 'main' of github.com:virajbshah/gematria into model-runtime-annotations

c20d7f7

virajbshah added 4 commits February 19, 2024 15:25

Merge branch 'model-runtime-annotations' of github.com:virajbshah/gematria into model-runtime-annotations

8a9fb2b

Make fixes for instruction annotations with tflite inference.

5e59b3c

Fix typo.

116b464

Format some files.

2f17d6f

ondrasej reviewed Feb 19, 2024


virajbshah and others added 7 commits March 1, 2024 22:56

Change shape of instruction annotation matrix stored in graph builder.

2774669

Keep C++ inference be compatible with models without annotations.

c73ebca

Merge branch 'google:main' into model-runtime-annotations

c96a0f0

Add lit tests for convert_bhive_to_llvm_exegesis_input (google#53)

dfd130a

This patch adds the necessary infrastructure for creating lit tests for convert_bhive_to_llvm_exegesis_input. This patch also adds in a single unit test to demonstrate the functionality.

Add additional lit tests for Exegesis conversion script (google#62)

6516f5f

This patch adds some additional lit regression tests for the Exegesis conversion script to test functionality that was previously implemented/improved now that we actually have testing facilities.

Add missing apostrophe in exegesis conversion error log

63f4a18

Currently, when disassembly fails, the error is reported to the user. However, before this patch, there was a missing apostrophe at the end of the hex value. This patch fixes that behavior by adding the missing apostrophe.

ondrasej requested changes Mar 11, 2024

gematria/granite/graph_builder.cc (2 outdated, resolved review threads)

boomanaiden154 and others added 6 commits March 11, 2024 10:54

Add none annotator type (google#67)

78f6845

This patch adds a none annotator type to convert_bhive_to_llvm_exegesis_input that doesn't run any annotator. This is useful for looking at the output assembly for debugging purposes.

Fix clang formatting issue (google#69)

eb3f213

This patch fixes a clang-formatting issue introduced in a previous patch.

Fix broken git history.

13f0fc2

Refactor code to store instruction annotations.

571bda2

virajbshah mentioned this pull request Mar 17, 2024

Add functionality to convert JSON datasets to Gematria protos. #77

Draft

virajbshah added 4 commits March 18, 2024 19:35

Refactor and fix bugs with the handling of annotations by models.

ff08d2a

Add command-line flag for passing in annotation name lists.

d25bae9

Fix bugs with C++ inference instruction annotation compatibility.

f6c4eda

Allow C++ inference API to consume models with variations in inputs.

6d1ce1a

virajbshah closed this Mar 26, 2024

virajbshah deleted the model-runtime-annotations branch July 22, 2024 13:46
