unable to handle 0-byte in pattern #31

Open
glensc opened this issue Apr 6, 2018 · 0 comments

glensc commented Apr 6, 2018

I wanted to strip control characters from the input, equivalent to this PHP code:

https://github.com/glensc/php-filename-normalizer/blob/d772aaad6b2a157787ae17320de5db4d3715df72/src/Normalizer.php#L30

select preg_replace('/[\x00-\x08\x0b-\x1f\x7f]/', 'a', concat('C',char(0x10),'kammkala'));
+------------------------------------------------------------------------------------+
| preg_replace('/[\x00-\x08\x0b-\x1f\x7f]/', 'a', concat('C',char(0x10),'kammkala')) |
+------------------------------------------------------------------------------------+
| aaaaaaaaa                                                                         |
+------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

First, \x00 is an escape that the PHP engine interprets, but a MySQL string literal has no \x escape, so it never becomes a NUL byte and the pattern above does not match what was intended. A different approach is needed. Both \0 and char(0) produce a NUL byte, but that is also the char * string terminator in C, which results in an error that the pattern's ending delimiter is missing:

mysql> select preg_replace(concat('/[', char(0), '-', 0x08, 0x0b, '-', 0x1f, 0x7f, ']/'), 'a', concat('C',char(0x10),'kammkala'));
ERROR:
No ending delimiter found
mysql> select preg_replace(concat('/[', '\0', '-', 0x08, 0x0b, '-', 0x1f, 0x7f, ']/'), 'a', concat('C',char(0x10),'kammkala'));
ERROR:
No ending delimiter found
mysql>
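To illustrate the suspected cause, here is a minimal Python sketch of what NUL-terminated C code would see of that pattern. It is not the UDF's actual code: the helper names c_string_view and has_ending_delimiter are hypothetical, and the delimiter check is only an assumption about how such a check might work.

```python
import ctypes


def c_string_view(data: bytes) -> bytes:
    """Return the part of data that NUL-terminated C code (e.g. strlen) would see."""
    buf = ctypes.create_string_buffer(data)  # copies data and appends a trailing NUL
    return buf.value  # .value stops at the first NUL byte


def has_ending_delimiter(pattern: bytes) -> bool:
    """Hypothetical check: the first byte is the delimiter and it must occur again."""
    if len(pattern) < 2:
        return False
    delimiter = pattern[:1]
    return pattern.find(delimiter, 1) != -1


# Same bytes as concat('/[', char(0), '-', 0x08, 0x0b, '-', 0x1f, 0x7f, ']/')
pattern = b"/[" + b"\x00" + b"-" + b"\x08" + b"\x0b" + b"-" + b"\x1f" + b"\x7f" + b"]/"

truncated = c_string_view(pattern)

print(truncated)                        # everything after the NUL byte is lost
print(has_ending_delimiter(truncated))  # the closing '/' is no longer visible
print(has_ending_delimiter(pattern))    # visible when the full length is used
```

If the pattern is treated as a NUL-terminated string, only the bytes before the first NUL survive, which would explain the "No ending delimiter found" error. Handling the pattern together with its explicit length keeps the closing delimiter visible.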

As a workaround, I'm leaving the NUL byte out of the pattern and handling it with MySQL's native replace function:

mysql> select replace(preg_replace(concat('/[', char(1), '-', 0x08, 0x0b, '-', 0x1f, 0x7f, ']/'), 'a', concat('C',char(0x0),'kammkala')), char(0), '!');
+-------------------------------------------------------------------------------------------------------------------------------------------+
| replace(preg_replace(concat('/[', char(1), '-', 0x08, 0x0b, '-', 0x1f, 0x7f, ']/'), 'a', concat('C',char(0x0),'kammkala')), char(0), '!') |
+-------------------------------------------------------------------------------------------------------------------------------------------+
| C!kammkala                                                                                                                                |
+-------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

The UDF should be able to accept \0 in its input.
