Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Hufman-Shaped Integer Wavelet Tree with fixed-size Alphabet #339

Open
Jabro opened this issue Sep 5, 2016 · 5 comments
Open

Hufman-Shaped Integer Wavelet Tree with fixed-size Alphabet #339

Jabro opened this issue Sep 5, 2016 · 5 comments

Comments

@Jabro
Copy link

Jabro commented Sep 5, 2016

Hello,

When construction a huffman-shaped integer wavelet tree from a fixed-width integer, the result seems to be wrong:

wt_huff_int<> wt;
int_vector<8> vec = {3,12,4,4,5,1,6,4,2};
construct_im(wt, vec);
cout << "wt.sigma : " << wt.sigma << endl; //5
cout << wt << endl; //4 1 0 2 0 0 1 0 5 0 4 0 0 4 1 0 4 0 0 1 0 0 0 0

I noticed a similar issue #145 which is already closed. Unfortunately the supposed fix won't compile on recent versions of the sdsl (no matching function for call to ‘serialize(const sdsl::int_vector<64u>::raw_wrapper&, sdsl::osfstream&)’)

Regards & Thanks,
Jan

simongog added a commit that referenced this issue Sep 7, 2016
@Jabro
Copy link
Author

Jabro commented Sep 12, 2016

Unfortunately #340 doesn't fix the issue.

The above code still leads to the same output.
The code suggested in #145 now compiles, but leads to an empty wavelet tree.

@Jabro
Copy link
Author

Jabro commented Sep 13, 2016

When using the patch and the third parameter of the construct_im method to set the amount of bytes of the alphabet, the construction works!

    wt_huff_int<> wt;
    int_vector<8> vec = {3,12,4,4,5,1,6,4,2};
    construct_im(wt, vec.raw, 1);
    cout << "wt.sigma : " << wt.sigma << endl;
    cout << wt << endl;

Is this intended behavior?

@simongog
Copy link
Owner

Yes, of course ;)

@Jabro
Copy link
Author

Jabro commented Sep 13, 2016

Unfortunately when using a non byte-sized alphabet the construction fails, because there seems to be a bug when serializing raw int_vectors which are not byte sized.

When I run the following code and look at the file with a hex editor, it only contains the first 2 numbers encoded as 32 Bit.

    int_vector<32> vec = {3,12,4,4,5,1,6,4,2};
    store_to_file(vec.raw, "file");

Storing to file also happens during the construction in construct.hpp:70

@Jabro
Copy link
Author

Jabro commented Sep 13, 2016

When I change back line include/sdsl/int_vector.hpp:612 to write_data instead of raw_data, everything works like a charm.

It seems that the provided pull request breaks the serialization of the raw_int_vector.
Without it the program won't compile, but adding the following line (as in the pull request) in int_vector.hpp:604 seems to fix the problems:

typedef int_vector::size_type size_type;

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants