Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Python: Add support for arbitrary sequences of hashable objects #128

Merged
merged 4 commits into from
May 4, 2019

Conversation

jbaiter
Copy link
Contributor

@jbaiter jbaiter commented Mar 10, 2019

This implements @Martinsos' suggestion from #79 to add support for sequences of arbitrary hashable objects in the Python bindings. If either query or target contain non-ascii values, they are mapped into an ASCII alphabet and the resulting byte sequences are used for doing the alignment.

One limitation that we can't get around at the moment is that the query and target sequence together must not contain more than 256 unique values.

It certainly is not the ideal way to go about this, but it should serve as an acceptable workaround for a lot of use cases until #90 is implemented.

This should help with #123, #114, #109, #104, #89 and #79.

This is a workaround until Martinsos#90
is implemented.
If either query or target contain non-ascii values, they are mapped into
an ASCII alphabet and the resulting byte sequences are used for doing
the alignment.
@Martinsos
Copy link
Owner

Martinsos commented Mar 13, 2019

@jbaiter thanks this is very cool :)!
I am certainly in for integrating this, I will take a look at it over the weekend and write comments on the code to make sure we get it right, stay tuned for that. Thanks again :)!

Copy link
Owner

@Martinsos Martinsos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I gave it a detailed look, looks pretty good!
There are few things I hope we could slightly improve before merging, I left comments, please check them out.

bindings/python/edlib.pyx Outdated Show resolved Hide resolved
bindings/python/edlib.pyx Show resolved Hide resolved
bindings/python/edlib.pyx Outdated Show resolved Hide resolved
bindings/python/edlib.pyx Outdated Show resolved Hide resolved
bindings/python/edlib.pyx Outdated Show resolved Hide resolved
bindings/python/edlib.pyx Outdated Show resolved Hide resolved
- Refactor input mapping code into separate function
- Allow no more than 256 unique values (was: 255)
- Also map additional equalities if query or target need mapping
- Update docstrings
- Fix code style
@jbaiter
Copy link
Contributor Author

jbaiter commented Mar 20, 2019

Thanks for the thorough review!

Copy link
Owner

@Martinsos Martinsos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the changes, I think we are almost there, left a few more comments and that should be it I think. Thanks for the patience :).

bindings/python/edlib.pyx Outdated Show resolved Hide resolved
bindings/python/edlib.pyx Outdated Show resolved Hide resolved
bindings/python/edlib.pyx Outdated Show resolved Hide resolved
Copy link
Owner

@Martinsos Martinsos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great, LGTM!
I will merge it now, and will publish the new version of python package in couple of days when I catch some time.
Thanks a lot :)!

@Martinsos Martinsos merged commit 910ff3a into Martinsos:master May 4, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants