searchkit doesn't handle unicode errors #860

Open
pponnuvel opened this issue May 10, 2024 · 2 comments

@pponnuvel
Member

2024-05-10 12:12:15,355 791905 ERROR searchkit [-] caught UnicodeDecodeError while searching ./var/log/kern.log
Traceback (most recent call last):
  File "/home/pponnuvel/.local/pipx/venvs/hotsos/lib/python3.8/site-packages/searchkit/search.py", line 1043, in execute
    fd.read(1)
  File "/usr/lib/python3.8/gzip.py", line 292, in read
    return self._buffer.read(size)
  File "/usr/lib/python3.8/_compression.py", line 68, in readinto
    data = self.read(len(byte_view))
  File "/usr/lib/python3.8/gzip.py", line 479, in read
    if not self._read_gzip_header():
  File "/usr/lib/python3.8/gzip.py", line 427, in _read_gzip_header
    raise BadGzipFile('Not a gzipped file (%r)' % magic)
gzip.BadGzipFile: Not a gzipped file (b'Ap')

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/pponnuvel/.local/pipx/venvs/hotsos/lib/python3.8/site-packages/searchkit/search.py", line 1048, in execute
    stats = self._run_search(fd)
  File "/home/pponnuvel/.local/pipx/venvs/hotsos/lib/python3.8/site-packages/searchkit/search.py", line 993, in _run_search
    line = line.decode("utf-8", **self.decode_kwargs)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xd2 in position 117: invalid continuation byte
2024-05-10 12:12:15,363 791878 DEBUG searchkit [-] joining/stopping queue consumer thread
2024-05-10 12:12:15,434 791878 DEBUG searchkit [-] exiting results thread
2024-05-10 12:12:15,434 791878 DEBUG searchkit [-] stopped fetching results (total received=0)
2024-05-10 12:12:15,434 791878 DEBUG searchkit [-] consumer thread stopped successfully
2024-05-10 12:12:15,437 791878 ERROR hotsos.plugin.lxd [-] part 'auto_scenario_check' raised exception: 'utf-8' codec can't decode byte 0xd2 in position 117: invalid continuation byte
concurrent.futures.process._RemoteTraceback: 
"""
Traceback (most recent call last):
  File "/home/pponnuvel/.local/pipx/venvs/hotsos/lib/python3.8/site-packages/searchkit/search.py", line 1043, in execute
    fd.read(1)
  File "/usr/lib/python3.8/gzip.py", line 292, in read
    return self._buffer.read(size)
  File "/usr/lib/python3.8/_compression.py", line 68, in readinto
    data = self.read(len(byte_view))
  File "/usr/lib/python3.8/gzip.py", line 479, in read
    if not self._read_gzip_header():
  File "/usr/lib/python3.8/gzip.py", line 427, in _read_gzip_header
    raise BadGzipFile('Not a gzipped file (%r)' % magic)
gzip.BadGzipFile: Not a gzipped file (b'Ap')

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.8/concurrent/futures/process.py", line 239, in _process_worker
    r = call_item.fn(*call_item.args, **call_item.kwargs)
  File "/home/pponnuvel/.local/pipx/venvs/hotsos/lib/python3.8/site-packages/searchkit/search.py", line 1048, in execute
    stats = self._run_search(fd)
  File "/home/pponnuvel/.local/pipx/venvs/hotsos/lib/python3.8/site-packages/searchkit/search.py", line 993, in 
    line = line.decode("utf-8", **self.decode_kwargs)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xd2 in position 117: invalid continuation byte
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/pponnuvel/.local/pipx/venvs/hotsos/lib/python3.8/site-packages/hotsos/core/plugintools.py", line 402, in run
    always_parts().run()
  File "/home/pponnuvel/.local/pipx/venvs/hotsos/lib/python3.8/site-packages/hotsos/core/ycheck/scenarios.py", line 142, in run
    self.load()
  File "/home/pponnuvel/.local/pipx/venvs/hotsos/lib/python3.8/site-packages/hotsos/core/ycheck/scenarios.py", line 99, in load
    results = self.searcher.run()
  File "/home/pponnuvel/.local/pipx/venvs/hotsos/lib/python3.8/site-packages/searchkit/search.py", line 1446, in run
    self._run_mp(mgr, results, rs)
  File "/home/pponnuvel/.local/pipx/venvs/hotsos/lib/python3.8/site-packages/searchkit/search.py", line 1396, in _run_mp
    self.stats.update(future.result())
  File "/usr/lib/python3.8/concurrent/futures/_base.py", line 437, in result
    return self.__get_result()
  File "/usr/lib/python3.8/concurrent/futures/_base.py", line 389, in __get_result
    raise self._exception
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xd2 in position 117: invalid continuation byte

/customers/sncf/00382281/sosreport-ht202opp01-SNCF-00382281-2024-04-16-rmraxob.tar.xz has the problematic kern.log.
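For anyone without access to that sosreport, the final error can be reproduced with a few lines of plain Python. This is only a sketch: the byte string below is a made-up stand-in for the offending kern.log line, built so that byte 0xd2 sits at offset 117 and is followed by a byte that is not a valid UTF-8 continuation byte. Judging by the traceback, the earlier BadGzipFile looks like a handled gzip probe and not the real failure, but that reading is an assumption.

```python
# Hypothetical stand-in for the bad kern.log line (not the real data):
# 117 ASCII bytes, then 0xd2 (a UTF-8 lead byte) followed by a space,
# which is not a valid continuation byte.
raw_line = b"a" * 117 + b"\xd2" + b" rest of the line\n"

error = None
try:
    # Same call shape as search.py line 993, but with no extra decode kwargs.
    raw_line.decode("utf-8")
except UnicodeDecodeError as exc:
    error = exc
    print(exc)
```

The printed message matches the one in the log above, including the byte value and the position.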

@xmkg
Contributor

xmkg commented May 10, 2024

It looks like the kern.log file contains some characters from the ANSI charset, which do not play nicely with UTF-8 decoding. I think there are two possible solution paths here:

a-) skip the offending line
b-) Use a fallback decoder (e.g. cp1252)
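
The two paths could look roughly like the sketch below. It uses only the standard `bytes.decode` call; the helper names `decode_or_skip` and `decode_with_fallback` are hypothetical and are not part of searchkit. The fallback passes `errors="replace"` because cp1252 itself leaves a handful of byte values undefined.

```python
def decode_or_skip(raw):
    """Option (a): return the decoded line, or None so the caller can skip it."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return None


def decode_with_fallback(raw, fallback="cp1252"):
    """Option (b): try UTF-8 first, then a single-byte fallback codec."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # cp1252 has a few undefined bytes (e.g. 0x81), so never raise here.
        return raw.decode(fallback, errors="replace")


lines = [b"plain ascii\n", b"bad byte \xd2 here\n"]
kept = [text for text in map(decode_or_skip, lines) if text is not None]
recovered = [decode_with_fallback(raw) for raw in lines]
```

Skipping loses the whole line, which may hide a match. The fallback keeps the line but guesses at the original encoding, so the recovered character may not be the one that was written.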

pponnuvel added a commit to pponnuvel/hotsos that referenced this issue May 10, 2024
When UTF-8 decoding fails, searchkit throws an exception.

Passing decode_errors='backslashreplace' to cope with that.

Fixes canonical#860.

Signed-off-by: Ponnuvel Palaniyappan <[email protected]>
@pponnuvel
Member Author

It looks like the kern.log file contains some characters from the ANSI charset, which do not play nicely with UTF-8 decoding. I think there are two possible solution paths here:

a-) skip the offending line
b-) Use a fallback decoder (e.g. cp1252)

Yeah, searchkit's maintainer actually handled it by providing an option to ignore it :)
I've used it: #861.
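
For reference, here is what the standard error handlers do to a line with one bad byte. This is a sketch with a made-up log line. It assumes `decode_errors` ends up as the `errors` argument of `bytes.decode` via the `decode_kwargs` seen in the traceback; how searchkit wires that internally is not shown in this issue.

```python
# Hypothetical log line, not taken from the real kern.log.
raw = b"kernel: usb 1-1: Product: \xd2 hub\n"

# Assumption: decode_kwargs carries the error handler name through to decode().
decode_kwargs = {"errors": "backslashreplace"}
escaped = raw.decode("utf-8", **decode_kwargs)

# Other standard handlers, for comparison.
ignored = raw.decode("utf-8", errors="ignore")    # drops the bad byte
replaced = raw.decode("utf-8", errors="replace")  # substitutes U+FFFD

print(repr(escaped))
print(repr(ignored))
print(repr(replaced))
```

With `backslashreplace` the bad byte stays visible in the decoded text as a literal `\xd2`, so nothing is silently lost and the rest of the line is still searchable.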
