Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

EncodeOctets fails when bytes() starts with b'0x' #152

Open
jokjr opened this issue Mar 14, 2021 · 4 comments
Open

EncodeOctets fails when bytes() starts with b'0x' #152

jokjr opened this issue Mar 14, 2021 · 4 comments

Comments

@jokjr
Copy link

jokjr commented Mar 14, 2021

  • EncodeOctets fails when a bytes object starts with 0x - I ran into this where my code was failing after 50.000 invocations because I was assigning random bytes to an octet attribute
  • It looks like the function is too eager with the length check, so it will reject hex-encoded values that contain 126 octets: (253 - len('0x')) / 2 == 125.5. I looks like it should do the length checking after the decoding?
  • The length check for EncodeString also runs before the encoding, so if encoding six.text_type can expand during encoding that might also be work looking into. I don't know what six.text_type is, so I can't say if this is an issue.

I think the fix would be to have something like:

def EncodeOctets(octets):
    if isinstance(octets, str) and str.startswith('0x'):
        hexstring = str.split('0x')[1]
        octets = binascii.unhexlify(hexstring)
    if len(octets) > 253:
        raise ValueError('Can only encode strings of <= 253 characters')
    return octets

See:

pyrad/pyrad/tools.py

Lines 14 to 26 in 59b8a15

if isinstance(str, six.text_type):
return str.encode('utf-8')
else:
return str
def EncodeOctets(str):
if len(str) > 253:
raise ValueError('Can only encode strings of <= 253 characters')
if str.startswith(b'0x'):
hexstring = str.split(b'0x')[1]
return binascii.unhexlify(hexstring)

nward added a commit to SearchLightNZ/pyrad that referenced this issue May 23, 2021
nward added a commit to SearchLightNZ/pyrad that referenced this issue Jul 14, 2021
@jokjr
Copy link
Author

jokjr commented Jul 14, 2021

@nward @Istvan91 I appreciate the patch, but I think that this still has problems, consider these two cases:

b1 = b'0d\xff'
b2 = b='0x\xff'  # this will raise an exception?
s1=  '0x3064ff' # equivalent to b1
s2 = '0x3078ff' # equivalent to b2

I'm also not sure how to fix it without breaking semantics. In my code I've worked around the issue by always hex encoding strings and bytes before giving them to the library.

@Istvan91
Copy link
Collaborator

What problem do you see with the semantics?

The "obvious" solution would be to just catch the binascii error and fall back to treating the parameter as a "normal" str/bytes.

@Istvan91 Istvan91 reopened this Jul 23, 2021
@jokjr
Copy link
Author

jokjr commented Jul 23, 2021

The problem I originally ran into was that my bytes were random (MD5 for Message-Authenticator), and that my outgoing messages caused exceptions with probability 1/65536 because every now and then the 16 random bytes would start with 0x3078 ("0x"), which would fail the binascii decoding step.

Catching the exception would prevent that from happing, and would instead fail to treat bytes that contained valid hex (which wouldn't throw an error), an example would be b'0x34' with the hex encoding '0x30783334'. This would get encoded as 4 with that solution.

For the MD5, there are 16 bytes of hash, and this will happen when the first digit is 0, the second is x, and the remaining letters are in [0-9a-f], so (1/256) * (1/256) * ((10+6)/256)**(16-2) (or 16**14 / 256**16), so that is not going to happen very often, but it eventually still will.

My workaround was to always hex-encode it, regardless of the content, so it would always hit the hex decoding branch.

@benzea
Copy link

benzea commented Oct 23, 2023

I just ran into this, and as @jokjr said, the bugfix itself is a bug. We are forced to always hex encode it (using an old version right now, so even doing reply.AddAttribute("Message-Authenticator", b'0x' + hmac.hexdigest().encode('ascii'))).

A byte string can by definition contain any characters. It can include 0x, it can include asdf it can even be all hexadecimal and still be a valid md5 digest! It is an md5sum, even b'0xdeadbeefdeadbe' is a valid binary md5sum! And you would successfully "decode" it into b'\xde\xad\xbe\xef\xde\xad\xbe' which is not acceptable.

The ugly thing is, you cannot fix this without breaking some code out there, in particular because b'0x....' is the only safe input for me (older pyrad version).

I guess you do not have any choice but either breaking the API or adding a new sane API. Maybe one that just completely bypasses the encoding API and asserts that a byte string is passed. No one in their right mind would pass a bytestring that contains hex encoded binary. And if one has a bytestring, then doing the unhexlify in the surrounding code really is not enough of a burden to create an API that is impossible to use safely.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants