Add support for consuming instruction annotations to TokenGraphBuilderModels. #94

Open · virajbshah wants to merge 10 commits into main
Conversation

virajbshah (Collaborator):

Uses instruction annotations obtained from the graph builder to compute instruction node embeddings. The last entries of each instruction node embedding vector are overwritten with the values of the annotations on the corresponding instruction (a rough sketch of this overwrite follows the change list below).

 * Add new message type `AnnotationProto`
   (`gematria/proto/annotation.proto`).
 * Include it in `CanonicalizedInstructionProto`.
 * Add new `Annotation` class.
 * Add `instruction_annotations` vector to `Instruction`.
 * Add Python bindings for these.
 * Add tests for all above changes.
 * Add `Annotation` conversion and update `Instruction` conversion.
 * Refactor `ToVector` and `ToRepeatedPtrField` to be generic.
 * Add Python bindings for conversions.
 * Add tests for all changes.
 * Make instruction annotations available to models via the graph
   builder, where they are stored as a vector-of-vector matrix.
 * Add Python bindings for the same.
 * Add tests for the C++ graph builder implementation, and update the
   tests for the Python bindings to be compatible with the changes.
 * Add a test that adds basic blocks with annotations to the graph builder.
 * Add `annotated_basic_blocks_with_throughput.pbtxt` as a source for
   annotated basic blocks for tests.
 * Add utilities to get the annotated blocks to `gematria/testing`.
 * Overwrite the ends of the node embedding vectors used by
   `TokenGraphBuilderModel`s for instruction nodes with the corresponding
   instruction annotations.
 * Add a test for `TokenGraphBuilderModel`s consuming instruction
   annotations.
 * Add a `--gematria_annotation_names_file` flag to `run_granite_model`
   to pass in a list of instruction annotation names consumed by the
   model.
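For orientation, the overwrite described above can be pictured roughly like this. It is a minimal sketch pieced together from the snippets quoted in the review below; the function name, tensor names, and shapes are illustrative and not the PR's actual code.

```python
import tensorflow.compat.v1 as tf


def overwrite_with_annotations(embeddings, instruction_node_mask,
                               instruction_annotations, embed_dim,
                               num_annotations):
  """Replaces the tail of instruction node embeddings with annotation values.

  `embeddings` is [num_nodes, embed_dim]; only rows where
  `instruction_node_mask` is True get their last `num_annotations` entries
  overwritten with `instruction_annotations`.
  """
  # The learned part of every node's embedding stays untouched.
  learned_part = tf.slice(
      embeddings, begin=[0, 0], size=[-1, embed_dim - num_annotations])
  # Take the tail of every node's embedding...
  tail = tf.slice(
      embeddings,
      begin=[0, embed_dim - num_annotations],
      size=[-1, num_annotations])
  # ...and overwrite the rows that belong to instruction nodes with the
  # corresponding annotation values.
  tail = tf.tensor_scatter_nd_update(
      tail,
      indices=tf.where(instruction_node_mask),
      updates=instruction_annotations)
  return tf.concat([learned_part, tail], axis=-1)
```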
Resolved review threads (outdated): gematria/model/python/token_model_flags.py, gematria/granite/python/graph_builder_model_base.py, gematria/granite/python/token_graph_builder_model.py
```python
tf.slice(
    embeddings,
    begin=[0, 0],
    size=[-1, self.embed_dim - self.num_annotations],
```
Collaborator:
Ideally, we should put this to the model (i.e. either compute the learned embedding size from a specified node feature size + annotation size, or compute the node state size from a specified learned embedding size + annotation size).

Collaborator Author (virajbshah):

Sorry, I'm not sure I understand what should be changed. This snippet updates the last `num_annotations` entries with the annotations for instruction nodes and keeps the remaining `embed_dim - num_annotations` entries intact, with `embed_dim` and `num_annotations` both coming from the model.

Collaborator:

I'm sorry for not being more clear.

I think what I meant here is that we should not need to do this computation here (and we shouldn't need the slice). Ideally, we'd compute the effective size of the embedding vector before (in the constructor?) and then create the embedding vector so that all we need to do here is concatenate.

As an aside: perhaps we should use composition over inheritance here, and make the embedding module a submodule of this one rather than inherit from it.

Collaborator Author (virajbshah):

Ah, got it.

Regarding the slicing: the reason it currently isn't as clean as a simple concatenate operation is that we are only updating instruction nodes, so we'd have to handle the last `num_annotations` entries of non-instruction nodes separately.

A possible solution would be to compose two `snt.Embed`s in this class: one holding the learnt embeddings of length `embed_dim - num_annotations` for all nodes, and another holding learnt embeddings of length `num_annotations` for non-instruction nodes only, which would be interleaved with `instruction_annotations` according to the `instruction_node_mask` and then concatenated with the first one.

Alternatively, instead of having the second set of learnt embeddings, we could just use a larger `embed_dim` and set the last `num_annotations` embedding entries for non-instruction nodes to something like zeros, though that sounds like it might be wasteful.
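A rough sketch of the first option, for concreteness. Everything here (module names, sizes, the free-function wrapper) is illustrative rather than the PR's implementation, and the interleaving is done with a scatter for simplicity.

```python
import sonnet as snt
import tensorflow.compat.v1 as tf

# Hypothetical sizes; in the model these would come from its configuration.
NUM_TOKENS = 1000
EMBED_DIM = 256
NUM_ANNOTATIONS = 4

# Learned embeddings shared by all nodes: the first
# EMBED_DIM - NUM_ANNOTATIONS entries of every node state.
common_embed = snt.Embed(
    vocab_size=NUM_TOKENS, embed_dim=EMBED_DIM - NUM_ANNOTATIONS)
# Learned tail entries, effectively used only by non-instruction nodes; the
# tail is computed for all nodes and instruction rows are then overwritten.
tail_embed = snt.Embed(vocab_size=NUM_TOKENS, embed_dim=NUM_ANNOTATIONS)


def annotated_node_embeddings(node_tokens, instruction_node_mask,
                              instruction_annotations):
  """Concatenates shared embeddings with an annotation-aware tail."""
  common = common_embed(node_tokens)  # [num_nodes, EMBED_DIM - NUM_ANNOTATIONS]
  tail = tail_embed(node_tokens)      # [num_nodes, NUM_ANNOTATIONS]
  # Replace the tail of instruction nodes with their annotation values.
  tail = tf.tensor_scatter_nd_update(
      tail,
      indices=tf.where(instruction_node_mask),
      updates=instruction_annotations)
  return tf.concat([common, tail], axis=-1)  # [num_nodes, EMBED_DIM]
```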

@virajbshah requested a review from @ondrasej on June 17, 2024.
Comment on lines 427 to 435
```python
tf.tensor_scatter_nd_update(
    tf.slice(
        embeddings,
        begin=[0, self.embed_dim - self.num_annotations],
        size=[-1, self.num_annotations],
    ),
    indices=tf.where(self.instruction_node_mask),
    updates=self.instruction_annotations,
),
```
Collaborator:

Did you test this with TF inference? The set of supported ops in TFLite is a bit limited, and more specialized ones (like scatters and gathers) might not be there.

If not, please add a TODO here and we should test it when possible (and maybe look for another way to implement it).

Collaborator Author (virajbshah):

I do remember testing TFLite inference after these changes, and I didn't see anything about the GraphDef having unsupported ops. I can also see scatter- and gather-related ops in the TFLite MLIR dialect `tfl`, so it looks like we are safe here :)
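For reference, a quick standalone check (not part of this PR) that `tf.tensor_scatter_nd_update` converts with TFLite builtin ops only; the function and shapes are made up for the test.

```python
import tensorflow as tf


@tf.function(input_signature=[
    tf.TensorSpec([None, 4], tf.float32),
    tf.TensorSpec([None, 1], tf.int32),
    tf.TensorSpec([None, 4], tf.float32),
])
def scatter(tensor, indices, updates):
  return tf.tensor_scatter_nd_update(tensor, indices, updates)


# The default converter allows only TFLite builtin ops, so convert() raises if
# the scatter op has no builtin lowering.
converter = tf.lite.TFLiteConverter.from_concrete_functions(
    [scatter.get_concrete_function()])
tflite_model = converter.convert()
print('Converted model size: %d bytes' % len(tflite_model))
```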

Resolved review threads (outdated): gematria/model/python/token_model_flags.py
Collaborator:

This is a copy of the original `basic_blocks_with_throughput`, right?

Collaborator Author (virajbshah):

Yes - just with random instruction annotations added to it.

Comment on lines 163 to 167
```python
self._instruction_node_mask = tf.placeholder(
    dtype=tf.dtypes.bool,
    shape=(None,),
    name=GraphBuilderModelBase.INSTRUCTION_NODE_MASK_TENSOR_NAME,
)
```
Collaborator:

The two `tf.placeholder()`s should be created in `self._create_tf_graph()`, along with the rest of the TF ops.

I remember that in the past, creating tensors in the constructor led to issues where TF ops ended up in two different TF graphs (which then caused execution problems).
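A minimal sketch of the suggested arrangement, using a toy stand-in class; apart from `_create_tf_graph()` and `INSTRUCTION_NODE_MASK_TENSOR_NAME`, which appear in this PR, the names and shapes are assumptions.

```python
import tensorflow.compat.v1 as tf

tf.disable_v2_behavior()


class ExampleGraphBuilderModel:
  """Toy stand-in for the model class; names are illustrative."""

  INSTRUCTION_NODE_MASK_TENSOR_NAME = 'instruction_node_mask'

  def __init__(self, num_annotations: int):
    # Only plain Python state in the constructor; no TF ops are created here.
    self._num_annotations = num_annotations
    self._instruction_node_mask = None
    self._instruction_annotations = None

  def _create_tf_graph(self) -> None:
    # Both placeholders are created together with the rest of the TF ops, so
    # everything ends up in the same TF graph.
    self._instruction_node_mask = tf.placeholder(
        dtype=tf.dtypes.bool,
        shape=(None,),
        name=self.INSTRUCTION_NODE_MASK_TENSOR_NAME,
    )
    self._instruction_annotations = tf.placeholder(
        dtype=tf.dtypes.float32,
        shape=(None, self._num_annotations),
        name='instruction_annotations',
    )
```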

 * Move annotation-related tensors from `TokenGraphBuilderModel` to
   `GraphBuilderModelBase`.
 * Adjust the order in which annotation-related tensors are created
   relative to other parts of the graph.
 * Refactor `TokenGraphBuilderModelNodeEmbed`.
@virajbshah requested a review from @ondrasej on June 24, 2024.
Collaborator (@ondrasej) left a comment:

Last few minor changes, otherwise this looks good.

Comment on lines +370 to +371
"""`snt.Embed`-like class representing node embeddings with instruction
annotations included.
Collaborator:

Very nitpicky: the first line of the docstring should fit into 80 columns. I think we can move the note about snt.Embed to the extended part of the docstring.
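For illustration, one way the docstring could be laid out; the base class and the class body here are assumed, only the class name comes from this PR.

```python
import sonnet as snt


class TokenGraphBuilderModelNodeEmbed(snt.Embed):
  """Node embeddings with instruction annotations included.

  Behaves like `snt.Embed`, except that for instruction nodes the last
  `num_annotations` entries of each embedding vector are replaced by the
  values of the corresponding instruction's annotations.
  """
```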

Comment on lines 398 to 402
```python
    self.instruction_annotations = instruction_annotations
    self.instruction_node_mask = instruction_node_mask

  def __call__(self, inputs):
    embeddings = super().__call__(inputs)
    # The first `embed_dim - num_annotations` embedding values for all nodes.
    self.common_embed = snt.Embed(
```
Collaborator:

We should probably mark all the attributes as internal (by using a leading underscore). I don't see a need for a user to access those.
