Modify EL batching to doc-wise streaming approach #12367
…p from Iterable[Span].
Co-authored-by: Sofie Van Landeghem <[email protected]>
…o refactor/el-candidates
…ntions # Conflicts: # spacy/ml/models/entity_linker.py # website/docs/api/inmemorylookupkb.mdx
# Conflicts: # spacy/pipeline/entity_linker.py # website/docs/api/entitylinker.mdx
Co-authored-by: Madeesh Kannan <[email protected]>
Mostly looks good, but some comments still need addressing. Was the rewrite tested on enough sample docs to confirm that it all works well?
spacy/kb/kb.pyx
Outdated
@@ -30,26 +30,15 @@ cdef class KnowledgeBase:
         self.entity_vector_length = entity_vector_length
         self.mem = Pool()

-    def get_candidates_batch(self, mentions: SpanGroup) -> Iterable[Iterable[Candidate]]:
+    def get_candidates(self, mentions: Iterator[SpanGroup]) -> Iterator[Iterable[Iterable[Candidate]]]:
         """
         Return candidate entities for a specified Span mention. Each candidate defines at least the entity and the
This docstring still requires updating: it refers to a single specified Span mention.
…ch/spaCy into feature/docwise-generator-batching
Let's get this in v4 😎
Description
Modify entity linker (EL) batching to work doc-wise instead of mention-wise. For prior discussion of why this is useful, see #11669 (comment).
Review and merge after #12341.
Split off of #11669.
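To illustrate the doc-wise streaming shape described above, here is a minimal sketch in plain Python. It is not spaCy's implementation: `Mention`, `Candidate`, `ALIASES` and this free-standing `get_candidates` are hypothetical stand-ins, assumed only to mirror the nesting of the new signature (an iterator over per-doc mention groups in, an iterator of per-doc, per-mention candidate lists out).

```python
from dataclasses import dataclass
from typing import Dict, Iterator, List


# Hypothetical stand-ins for illustration only; these are NOT spaCy classes.
@dataclass(frozen=True)
class Mention:
    text: str


@dataclass(frozen=True)
class Candidate:
    entity_id: str
    prior_prob: float


# Toy alias table playing the role of a knowledge base (assumed data).
ALIASES: Dict[str, List[Candidate]] = {
    "Paris": [Candidate("Q90", 0.9), Candidate("Q167646", 0.1)],
    "Douglas Adams": [Candidate("Q42", 1.0)],
}


def get_candidates(
    mentions_per_doc: Iterator[List[Mention]],
) -> Iterator[List[List[Candidate]]]:
    """Lazily yield, for each doc, one candidate list per mention in that doc."""
    for doc_mentions in mentions_per_doc:
        # One inner list per mention; unknown mentions get an empty list.
        yield [ALIASES.get(mention.text, []) for mention in doc_mentions]


# Three docs: two mentions, no mentions, one unknown mention.
docs = [
    [Mention("Paris"), Mention("Douglas Adams")],
    [],
    [Mention("Atlantis")],
]
results = list(get_candidates(iter(docs)))
```

Because the function is a generator, candidates for a doc are only computed when the consumer asks for that doc, and the output stays aligned with the input: one outer item per doc, one inner list per mention.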
Types of change
Enhancement.
Checklist