Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: write binary as hex strings in mysql text protocol #70

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

CookiePieWw
Copy link

@CookiePieWw CookiePieWw commented Sep 19, 2024

I hereby agree to the terms of the CLA available at: https://databend.rs/dev/policies/cla/

Summary

Currently the text protocol write binary value directly into the packet, thus the client would get the original binary values and display it like this:
image

But actually the mysql client display the binary value as hex strings (MySQL (client v8.4.0, server v9.0.1)):
image

Through the client source code it only parse binary values as hex strings when using binary charset:

// See https://github.com/mysql/mysql-server/blob/596f0d238489a9cf9f43ce1ff905984f58d227b6/client/mysql.cc
// L3916-L3918 and L4033-L4034
tatic bool is_binary_field(MYSQL_FIELD *field) {
  return (
      (field->charsetnr == 63) &&
// ...
// when write results
if (opt_binhex && is_binary_field(field))
    print_as_hex(PAGER, cur[off], lengths[off], field_max_length);

So I wonder if it's proper to directly write hex strings when using text protocol :)

@CookiePieWw CookiePieWw changed the title fix: write binary as hex strings in text protocol fix: write binary as hex strings in mysql text protocol Sep 19, 2024
@CookiePieWw
Copy link
Author

CookiePieWw commented Sep 20, 2024

After further investigations, it seems MySQL server sends original binary value as well, while the charset of text protocol is actually binary one:

// See https://github.com/mysql/mysql-server/blob/f7680e98b6bbe3500399fbad465d08a6b75d7a5c/sql/field.cc
// This is an old version (v5.7), I guess there's no change in the newer versions here.
bool Field::send_text(Protocol *protocol)
{
  if (is_null())
    return protocol->store_null();
  char buff[MAX_FIELD_WIDTH];
  // Charset is set to `my_charset_bin` all the time no matter what protocol is.
  String str(buff, sizeof(buff), &my_charset_bin);
  String *res= val_str(&str);
  return res ? protocol->store(res) : protocol->store_null();
}

String *Field_blob::val_str(String *val_buffer MY_ATTRIBUTE((unused)),
			    String *val_ptr)
{
  char *blob;
  // Directly copy the binary value, no conversion to hex strings.
  memcpy(&blob, ptr+packlength, sizeof(char*));
  if (!blob)
    val_ptr->set("",0,charset());	// A bit safer than ->length(0)
  else
    val_ptr->set((const char*) blob,get_length(ptr),charset());
  return val_ptr;
}

Then the binary to hex string conversion is still performed at client side. So maybe the correct fix is set the charset to binary here?
https://github.com/datafuselabs/opensrv/blob/6cbb80670116e8acce5dd4c473356e8df039804f/mysql/src/writers.rs#L134

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant