Add support for consuming instruction annotations to `TokenGraphBuilderModel`s #94
base: main
Conversation
* Add new message type `AnnotationProto` (`gematria/proto/annotation.proto`).
* Include it in `CanonicalizedInstructionProto`.
* Add new `Annotation` class.
* Add `instruction_annotations` vector to `Instruction`.
* Add Python bindings for these.
* Add tests for all of the above changes.
* Add `Annotation` conversion and update `Instruction` conversion.
* Refactor `ToVector` and `ToRepeatedPtrField` to be generic.
* Add Python bindings for conversions.
* Add tests for all changes.
* Make instruction annotations available to models via the graph builder, where they are stored as a vector-of-vectors matrix.
* Add Python bindings for the same.
* Add tests for the C++ graph builder implementation, and update the tests for the Python bindings to be compatible with the changes.
* Add a test adding basic blocks with annotations to the graph builder.
* Add `annotated_basic_blocks_with_throughput.pbtxt` as a source of annotated basic blocks for tests.
* Add utilities to get the annotated blocks to `gematria/testing`.
* The ends of node embedding vectors used by `TokenGraphBuilderModel`s for instructions are overwritten by the corresponding instruction annotations.
* A test for `TokenGraphBuilderModel`s consuming instruction annotations has been added as well.
* Add a `--gematria_annotation_names_file` flag to `run_granite_model` to pass in a list of instruction annotation names consumed by the model.
```python
tf.slice(
    embeddings,
    begin=[0, 0],
    size=[-1, self.embed_dim - self.num_annotations],
)
```
Ideally, we should put this to the model (i.e. either compute the learned embedding size from a specified node feature size + annotation size, or compute the node state size from a specified learned embedding size + annotation size).
Sorry, I'm not sure I understand what is to be changed - this snippet should update the last `num_annotations` entries with the annotations for instruction nodes and keep the remaining `embed_dim - num_annotations` entries intact, with `embed_dim` and `num_annotations` both coming from the model.
I'm sorry for not being more clear.
I think what I meant here is that we should not need to do this computation here (and we shouldn't need the slice). Ideally, we'd compute the effective size of the embedding vector before (in the constructor?) and then create the embedding vector so that all we need to do here is concatenate.
As an aside: perhaps we should use composition over inheritance here, and make the embedding module a submodule of this one rather than inherit from it.
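To make the suggestion concrete, here is a rough NumPy sketch of the composition idea (the class and attribute names are made up for illustration; this is not the PR's code): if the learned width is fixed once in the constructor, `__call__` reduces to a table lookup plus a concatenation, with no slicing of a full-width embedding table.

```python
import numpy as np

class AnnotatedNodeEmbed:
    """Hypothetical sketch: embedding module held by composition."""

    def __init__(self, vocab_size, embed_dim, num_annotations, rng=None):
        rng = rng or np.random.default_rng(0)
        # Effective learned width is computed once, up front.
        self.learned_dim = embed_dim - num_annotations
        # Stand-in for an snt.Embed submodule (composition, not inheritance).
        self.table = rng.standard_normal((vocab_size, self.learned_dim))

    def __call__(self, node_tokens, annotations):
        learned = self.table[node_tokens]  # (num_nodes, learned_dim)
        # All that remains here is concatenation along the feature axis.
        return np.concatenate([learned, annotations], axis=1)

embed = AnnotatedNodeEmbed(vocab_size=10, embed_dim=8, num_annotations=2)
out = embed(np.array([1, 3]), np.ones((2, 2)))
assert out.shape == (2, 8)
```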
Ah, got it.

Regarding the slicing - the reason why it currently isn't as clean as a simple concatenate operation is that we are only updating instruction nodes, so we'd have to handle the last `num_annotations` entries of non-instruction nodes separately.

A possible solution would be to have two `snt.Embed`s composed in this class - one holding the learnt embeddings of length `embed_dim - num_annotations` for all nodes, and another holding learnt embeddings of length `num_annotations` for only non-instruction nodes, which would be interleaved with `instruction_annotations` according to the `instruction_node_mask` and then concatenated with the other one.

Or, instead of having the second set of learnt embeddings, we could just use a larger `embed_dim` and let the last `num_annotations` embedding entries for non-instruction nodes be something like zeros - though that sounds like it might be wasteful.
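The interleaving in the two-`snt.Embed` proposal can be sketched in NumPy (names reused from the discussion, values made up; the real model would express this with TF ops over graph tensors):

```python
import numpy as np

num_annotations = 2
# Which nodes in the graph are instruction nodes.
instruction_node_mask = np.array([True, False, True, False])

# Real annotations for the two instruction nodes, in order of appearance.
instruction_annotations = np.array([[1.0, 2.0], [3.0, 4.0]])
# Stand-in for the second learnt table, covering the two non-instruction nodes.
non_instruction_embeds = np.array([[9.0, 9.0], [8.0, 8.0]])

# Interleave the two sources into the last `num_annotations` columns.
tail = np.empty((len(instruction_node_mask), num_annotations))
tail[instruction_node_mask] = instruction_annotations
tail[~instruction_node_mask] = non_instruction_embeds
# `tail` would then be concatenated with the common
# `embed_dim - num_annotations` learnt embeddings of all nodes.
```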
```python
tf.tensor_scatter_nd_update(
    tf.slice(
        embeddings,
        begin=[0, self.embed_dim - self.num_annotations],
        size=[-1, self.num_annotations],
    ),
    indices=tf.where(self.instruction_node_mask),
    updates=self.instruction_annotations,
),
```
Did you test this with TF inference? The set of supported ops in TFLite is a bit limited, and more specialized ones (like scatters and gathers) might not be there.
If not, please add a TODO here and we should test it when possible (and maybe look for another way to implement it).
I do remember testing TFLite inference after these changes, and I didn't see anything related to the graphdef having unsupported ops. I can see scatter- and gather-related ops in the TFLite MLIR dialect `tfl` as well - so it looks like we are safe here :)
This is a copy of the original `basic_blocks_with_throughput`, right?
Yes - just with random instruction annotations added to it.
```python
self._instruction_node_mask = tf.placeholder(
    dtype=tf.dtypes.bool,
    shape=(None,),
    name=GraphBuilderModelBase.INSTRUCTION_NODE_MASK_TENSOR_NAME,
)
```
The two `tf.placeholder()`s should be created in `self._create_tf_graph()`, along with the rest of the TF ops.

I remember that in the past, creating tensors in the constructor led to different issues with having TF ops in two different TF graphs (which then caused execution issues).
* Move annotation-related tensors from `TokenGraphBuilderModel` to `GraphBuilderModelBase`.
* Adjust the order in which annotation-related tensors are created relative to other parts of the graph.
* Refactor `TokenGraphBuilderModelNodeEmbed`.
Last few minor changes, otherwise this looks good.
```python
"""`snt.Embed`-like class representing node embeddings with instruction
annotations included.
```
Very nitpicky: the first line of the docstring should fit into 80 columns. I think we can move the note about `snt.Embed` to the extended part of the docstring.
```python
self.instruction_annotations = instruction_annotations
self.instruction_node_mask = instruction_node_mask

def __call__(self, inputs):
  embeddings = super().__call__(inputs)
  # The first `embed_dim - num_annotations` embedding values for all nodes.
  self.common_embed = snt.Embed(
```
We should probably mark all the attributes as internal (by using a leading underscore). I don't see a need for a user to access those.
Uses instruction annotations obtained from the graph builder to compute instruction node embeddings. The ends of the instruction node embedding vectors are overwritten by the values of the annotations on the corresponding instruction.
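The overwrite described above can be illustrated with a small NumPy sketch (shapes and values are made up; the actual model performs this with `tf.tensor_scatter_nd_update` over graph tensors): only the rows selected by the instruction-node mask have their last `num_annotations` entries replaced.

```python
import numpy as np

embed_dim, num_annotations = 4, 2
# Three nodes; nodes 0 and 2 are instruction nodes.
embeddings = np.zeros((3, embed_dim))
instruction_node_mask = np.array([True, False, True])
# One annotation row per instruction node, in order of appearance.
instruction_annotations = np.array([[0.5, 0.6], [0.7, 0.8]])

# Overwrite the tail of the embedding vectors of instruction nodes only.
embeddings[instruction_node_mask, embed_dim - num_annotations:] = (
    instruction_annotations
)
```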