This repository has been archived by the owner on Sep 2, 2023. It is now read-only.
-
Notifications
You must be signed in to change notification settings - Fork 2
Warn/fail when an address is out of range #6
Comments
There probably needs to be detection both in the assembler step (for numeric constants) and in the linker step (during symbol resolution). |
mbitsnbites
pushed a commit
that referenced
this issue
Feb 20, 2020
This option allows to tell GDB to detect and possibly handle mismatched exec-files. A recurrent problem with GDB is that GDB uses the wrong exec-file when using the attach/detach commands successively. Also, in case the user specifies a file on the command line but attaches to the wrong PID, this error is not made visible and gives a not user understandable behaviour. For example: $ gdb ... (gdb) atta 2682 ############################################ PID running 'sleepers' executable Attaching to process 2682 [New LWP 2683] [New LWP 2684] [New LWP 2685] [Thread debugging using libthread_db enabled] Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1". 0x00007f5ff829f603 in select () at ../sysdeps/unix/syscall-template.S:84 84 ../sysdeps/unix/syscall-template.S: No such file or directory. (gdb) det Detaching from program: /home/philippe/valgrind/git/trunk_untouched/gdbserver_tests/sleepers, process 2682 [Inferior 1 (process 2682) detached] (gdb) atta 31069 ############################################ PID running 'gdb' executable Attaching to program: /home/philippe/valgrind/git/trunk_untouched/gdbserver_tests/sleepers, process 31069 Reading symbols from /lib64/ld-linux-x86-64.so.2... Reading symbols from /usr/lib/debug/.build-id/60/6df9c355103e82140d513bc7a25a635591c153.debug... 0x00007f43c23478a0 in ?? () (gdb) bt #0 0x00007f43c23478a0 in ?? () #1 0x0000558909e3ad91 in ?? () #2 0x0000202962646700 in ?? () #3 0x00007ffc69c74e70 in ?? () #4 0x000055890c1d2350 in ?? () #5 0x0000000000000000 in ?? () (gdb) The second attach has kept the executable of the first attach. (in this case, 31069 is the PID of a GDB, that has nothing to do with the first determined 'sleepers' executable). Similarly, if specifying an executable, but attaching to a wrong pid, we get: gdb /home/philippe/valgrind/git/trunk_untouched/gdbserver_tests/sleepers ... Reading symbols from /home/philippe/valgrind/git/trunk_untouched/gdbserver_tests/sleepers... (gdb) atta 31069 ############################################ PID running 'gdb' executable Attaching to program: /home/philippe/valgrind/git/trunk_untouched/gdbserver_tests/sleepers, process 31069 Reading symbols from /lib64/ld-linux-x86-64.so.2... Reading symbols from /usr/lib/debug/.build-id/60/6df9c355103e82140d513bc7a25a635591c153.debug... 0x00007f43c23478a0 in ?? () (gdb) bt #0 0x00007f43c23478a0 in ?? () #1 0x0000558909e3ad91 in ?? () #2 0x0000202962646700 in ?? () #3 0x00007ffc69c74e70 in ?? () #4 0x000055890c1d2350 in ?? () #5 0x0000000000000000 in ?? () (gdb) And it is unclear to the user what has happened/what is going wrong. This patch series implements a new option: (gdb) apropos exec-file-mismatch set exec-file-mismatch -- Set exec-file-mismatch handling (ask|warn|off). show exec-file-mismatch -- Show exec-file-mismatch handling (ask|warn|off). (gdb) help set exec-file-mismatch Set exec-file-mismatch handling (ask|warn|off). Specifies how to handle a mismatch between the current exec-file name loaded by GDB and the exec-file name automatically determined when attaching to a process: ask - warn the user and ask whether to load the determined exec-file. warn - warn the user, but do not change the exec-file. off - do not check for mismatch. "ask" means: in case of mismatch between the current exec-file name and the automatically determined exec-file name of the PID we are attaching to, give a warning to the user and ask whether to load the automatically determined exec-file. "warn" means: in case of mismatch, just give a warning to the user. "off" means: do not check for mismatch. This fixes PR gdb/17626. There was a previous trial to fix this PR. See https://sourceware.org/ml/gdb-patches/2015-07/msg00118.html This trial was however only fixing the problem for the automatically determined executable files when doing attach. It was differentiating the 'user specified executable files' ("sticky") from the executable files automatically found by GDB. But such user specified sticky executables are in most cases due to a wrong manipulation by the user, giving unexpected results such as backtrace showing no function like in the above example. This patch ensures that whenever a process executable can be determined, that the user is warned if there is a mismatch. The same tests as above then give: (gdb) atta 2682 Attaching to process 2682 [New LWP 2683] [New LWP 2684] [New LWP 2685] [Thread debugging using libthread_db enabled] Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1". 0x00007f5ff829f603 in select () at ../sysdeps/unix/syscall-template.S:84 84 ../sysdeps/unix/syscall-template.S: No such file or directory. (gdb) det Detaching from program: /home/philippe/valgrind/git/trunk_untouched/gdbserver_tests/sleepers, process 2682 [Inferior 1 (process 2682) detached] (gdb) atta 31069 Attaching to program: /home/philippe/valgrind/git/trunk_untouched/gdbserver_tests/sleepers, process 31069 warning: Mismatch between current exec-file /home/philippe/valgrind/git/trunk_untouched/gdbserver_tests/sleepers and automatically determined exec-file /bd/home/philippe/gdb/git/build_fixes/gdb/gdb exec-file-mismatch handling is currently "ask" Load new symbol table from "/bd/home/philippe/gdb/git/build_fixes/gdb/gdb"? (y or n) y Reading symbols from /bd/home/philippe/gdb/git/build_fixes/gdb/gdb... Setting up the environment for debugging gdb. ... Reading symbols from /usr/lib/debug/.build-id/60/6df9c355103e82140d513bc7a25a635591c153.debug... 0x00007f43c23478a0 in __poll_nocancel () at ../sysdeps/unix/syscall-template.S:84 84 ../sysdeps/unix/syscall-template.S: No such file or directory. (top-gdb) bt During symbol reading: incomplete CFI data; unspecified registers (e.g., rax) at 0x7f43c23478ad During symbol reading: unsupported tag: 'DW_TAG_unspecified_type' During symbol reading: cannot get low and high bounds for subprogram DIE at 0x12282a7 During symbol reading: Child DIE 0x12288ba and its abstract origin 0x1228b26 have different parents During symbol reading: DW_AT_call_target target DIE has invalid low pc, for referencing DIE 0x1229540 [in module /bd/home/philippe/gdb/git/build_fixes/gdb/gdb] #0 0x00007f43c23478a0 in __poll_nocancel () at ../sysdeps/unix/syscall-template.S:84 #1 0x0000558909e3ad91 in poll (__timeout=-1, __nfds=<optimized out>, __fds=<optimized out>) at /usr/include/x86_64-linux-gnu/bits/poll2.h:46 #2 gdb_wait_for_event (block=block@entry=1) at ../../fixes/gdb/event-loop.c:772 #3 0x0000558909e3aef4 in gdb_do_one_event () at ../../fixes/gdb/event-loop.c:347 #4 0x0000558909e3b085 in gdb_do_one_event () at ../../fixes/gdb/common/common-exceptions.h:219 #5 start_event_loop () at ../../fixes/gdb/event-loop.c:371 During symbol reading: Member function "~_Sp_counted_base" (offset 0x1c69bf7) is virtual but the vtable offset is not specified During symbol reading: Multiple children of DIE 0x1c8f5a0 refer to DIE 0x1c8f0ee as their abstract origin #6 0x0000558909ed3b78 in captured_command_loop () at ../../fixes/gdb/main.c:331 #7 0x0000558909ed4b6d in captured_main (data=<optimized out>) at ../../fixes/gdb/main.c:1174 #8 gdb_main (args=<optimized out>) at ../../fixes/gdb/main.c:1190 #9 0x0000558909c1e9a8 in main (argc=<optimized out>, argv=<optimized out>) at ../../fixes/gdb/gdb.c:32 (top-gdb) gdb /home/philippe/valgrind/git/trunk_untouched/gdbserver_tests/sleepers ... Reading symbols from /home/philippe/valgrind/git/trunk_untouched/gdbserver_tests/sleepers... (gdb) atta 31069 Attaching to program: /home/philippe/valgrind/git/trunk_untouched/gdbserver_tests/sleepers, process 31069 warning: Mismatch between current exec-file /home/philippe/valgrind/git/trunk_untouched/gdbserver_tests/sleepers and automatically determined exec-file /bd/home/philippe/gdb/git/build_fixes/gdb/gdb exec-file-mismatch handling is currently "ask" Load new symbol table from "/bd/home/philippe/gdb/git/build_fixes/gdb/gdb"? (y or n) y Reading symbols from /bd/home/philippe/gdb/git/build_fixes/gdb/gdb... Setting up the environment for debugging gdb. .... In other words, it now works as intuitively expected by the user. If ever the user gave the correct executable on the command line, then attached to the wrong pid, then confirmed loading the wrong executable, the user can simply fix this by detaching, and attaching to the correct pid, GDB will then tell again to the user that the exec-file might better be loaded. The default value of "ask" is chosen instead of e.g. "warn" as in most cases, switching of executable will be the correct action, and in any case, the user can decide to not load the executable, as GDB asks a confirmation to the user to load the new executable. For settings "ask" and "warn", the new function validate_exec_file () tries to get the inferior pid exec file and compares it with the current exec file. In case of mismatch, it warns the user and optionally load the executable. This function is called in the attach_command implementation to cover most cases of attaching to a running process. It must also be called in remote.c, as the attach command is not supported for all types of remote gdbserver. gdb/ChangeLog 2020-01-25 Philippe Waroquiers <[email protected]> * exec.c (exec_file_mismatch_names, exec_file_mismatch_mode) (show_exec_file_mismatch_command, set_exec_file_mismatch_command) (validate_exec_file): New variables, enums, functions. (exec_file_locate_attach, print_section_info): Style the filenames. (_initialize_exec): Install show_exec_file_mismatch_command and set_exec_file_mismatch_command. * gdbcore.h (validate_exec_file): Declare. * infcmd.c (attach_command): Call validate_exec_file. * remote.c ( remote_target::remote_add_inferior): Likewise.
mbitsnbites
pushed a commit
that referenced
this issue
Mar 28, 2020
Running anything with the fission.exp board fails since commit c0ab21c ("Replace init_cutu_and_read_dies with a class"). GDB crashes while reading the DWARF info. cu is NULL in read_signatured_type: Thread 1 "gdb" received signal SIGSEGV, Segmentation fault. 0x000055555780663e in read_signatured_type sig_type=0x6210000c3600) at /home/simark/src/binutils-gdb/gdb/dwarf2/read.c:22782 22782 gdb_assert (cu->die_hash == NULL); (top-gdb) bt #0 0x000055555780663e in read_signatured_type (sig_type=0x6210000c3600) at /home/simark/src/binutils-gdb/gdb/dwarf2/read.c:22782 #1 0x00005555578062dd in load_full_type_unit (per_cu=0x6210000c3600) at /home/simark/src/binutils-gdb/gdb/dwarf2/read.c:22758 #2 0x00005555577c5fb7 in queue_and_load_dwo_tu (slot=0x60600007fc00, info=0x6210000c34e0) at /home/simark/src/binutils-gdb/gdb/dwarf2/read.c:12674 #3 0x0000555559934232 in htab_traverse_noresize (htab=0x60b000063670, callback=0x5555577c5e61 <queue_and_load_dwo_tu(void**, void*)>, info=0x6210000c34e0) at /home/simark/src/binutils-gdb/libiberty/hashtab.c:775 #4 0x00005555577c6252 in queue_and_load_all_dwo_tus (per_cu=0x6210000c34e0) at /home/simark/src/binutils-gdb/gdb/dwarf2/read.c:12701 #5 0x000055555777ebd8 in dw2_do_instantiate_symtab (per_cu=0x6210000c34e0, skip_partial=false) at /home/simark/src/binutils-gdb/gdb/dwarf2/read.c:2371 #6 0x000055555777eea2 in dw2_instantiate_symtab (per_cu=0x6210000c34e0, skip_partial=false) at /home/simark/src/binutils-gdb/gdb/dwarf2/read.c:2395 #7 0x0000555557786ab6 in dw2_lookup_symbol (objfile=0x614000007240, block_index=GLOBAL_BLOCK, name=0x602000025310 "main", domain=VAR_DOMAIN) at /home/simark/src/binutils-gdb/gdb/dwarf2/read.c:3539 After creating the reader object, the reader.cu field should not be NULL. By checking the commit previous to the faulty one mentioned above, I noticed that the cu field is normally set by init_cu_die_reader, called from read_cutu_die_from_dwo, itself called from cutu_reader::init_tu_and_read_dwo_dies, itself called from cutu_reader's constructor. However, cutu_reader::init_tu_and_read_dwo_dies calls read_cutu_die_from_dwo, passing a pointer to a local `die_reader_specs` variable. So it's the `cu` field of that object that gets set. cutu_reader itself is a `die_reader_specs` (it inherits from it), and the intention was most likely to pass `this` to read_cutu_die_from_dwo. This way, the fields of the cutu_reader object, which read_signatured_type will use, are set. With this, I am able to use: make check RUNTESTFLAGS='--target_board=fission' and it looks much better. There are still some failures to be investigated, but that's the usual state of the testsuite. gdb/ChangeLog: * dwarf2/read.c (cutu_reader::init_tu_and_read_dwo_dies): Remove reader variable, pass `this` to read_cutu_die_from_dwo.
mbitsnbites
pushed a commit
that referenced
this issue
Apr 7, 2020
Since commit 981c08c "Change how complex types are printed in C", we see these FAILs: ... FAIL: gdb.fortran/mixed-lang-stack.exp: lang=auto: info args in frame #6 FAIL: gdb.fortran/mixed-lang-stack.exp: lang=c: info args in frame #6 FAIL: gdb.fortran/mixed-lang-stack.exp: lang=c: info args in frame #7 FAIL: gdb.fortran/mixed-lang-stack.exp: lang=c++: info args in frame #6 FAIL: gdb.fortran/mixed-lang-stack.exp: lang=c++: info args in frame #7 ... The problem is that printing of complex types has changed from: ... d = 4 + 5 * I ... to: ... d = 4 + 5i ... but the test-case still checks for the old printing style. Fix this by updating the test-case to check for the new style. gdb/testsuite/ChangeLog: 2020-04-02 Tom de Vries <[email protected]> * gdb.fortran/mixed-lang-stack.exp: Accept new complex printing style.
mbitsnbites
pushed a commit
that referenced
this issue
Jun 5, 2020
There has been some breakage for aarch64-linux, arm-linux and s390-linux in terms of inline frame unwinding. There may be other targets, but these are the ones i'm aware of. The following testcases started to show numerous failures and trigger internal errors in GDB after commit 1009d92, "Find tailcall frames before inline frames". gdb.opt/inline-break.exp gdb.opt/inline-cmds.exp gdb.python/py-frame-inline.exp gdb.reverse/insn-reverse.exp The internal errors were of this kind: binutils-gdb/gdb/frame.c:579: internal-error: frame_id get_frame_id(frame_info*): Assertion `fi->level == 0' failed. After a lengthy investigation to try and find the cause of these assertions, it seems we're dealing with some fragile/poorly documented code to handle inline frames and we are attempting to unwind from this fragile section of code. Before commit 1009d92, the tailcall sniffer was invoked from dwarf2_frame_prev_register. By the time we invoke the dwarf2_frame_prev_register function, we've probably already calculated the frame id (via compute_frame_id). After said commit, the call to dwarf2_tailcall_sniffer_first was moved to dwarf2_frame_cache. This is very early in a frame creation process, and we're still calculating the frame ID (so compute_frame_id is in the call stack). This would be fine for regular frames, but the above testcases all deal with some inline frames. The particularity of inline frames is that their frame ID's depend on the previous frame's ID, and the previous frame's ID relies in the inline frame's registers. So it is a bit of a messy situation. We have comments in various parts of the code warning about some of these particularities. In the case of dwarf2_tailcall_sniffer_first, we attempt to unwind the PC, which goes through various functions until we eventually invoke frame_unwind_got_register. This function will eventually attempt to create a lazy value for a particular register, and this lazy value will require a valid frame ID. Since the inline frame doesn't have a valid frame ID yet (remember we're still calculating the previous frame's ID so we can tell what the inline frame ID is) we will call compute_frame_id for the inline frame (level 0). We'll eventually hit the assertion above, inside get_frame_id: -- /* If we haven't computed the frame id yet, then it must be that this is the current frame. Compute it now, and stash the result. The IDs of other frames are computed as soon as they're created, in order to detect cycles. See get_prev_frame_if_no_cycle. */ gdb_assert (fi->level == 0); -- It seems to me we shouldn't have reached this assertion without having the inline frame ID already calculated. In fact, it seems we even start recursing a bit when we invoke get_prev_frame_always within inline_frame_this_id. But a check makes us quit the recursion and proceed to compute the id. Here's the call stack for context: #0 get_prev_frame_always_1 (this_frame=0xaaaaab85a670) at ../../../repos/binutils-gdb/gdb/frame.c:2109 RECURSION - #1 0x0000aaaaaae1d098 in get_prev_frame_always (this_frame=0xaaaaab85a670) at ../../../repos/binutils-gdb/gdb/frame.c:2124 #2 0x0000aaaaaae95768 in inline_frame_this_id (this_frame=0xaaaaab85a670, this_cache=0xaaaaab85a688, this_id=0xaaaaab85a6d0) at ../../../repos/binutils-gdb/gdb/inline-frame.c:165 #3 0x0000aaaaaae1916c in compute_frame_id (fi=0xaaaaab85a670) at ../../../repos/binutils-gdb/gdb/frame.c:550 #4 0x0000aaaaaae19318 in get_frame_id (fi=0xaaaaab85a670) at ../../../repos/binutils-gdb/gdb/frame.c:582 #5 0x0000aaaaaae13480 in value_of_register_lazy (frame=0xaaaaab85a730, regnum=30) at ../../../repos/binutils-gdb/gdb/findvar.c:296 #6 0x0000aaaaaae16c00 in frame_unwind_got_register (frame=0xaaaaab85a730, regnum=30, new_regnum=30) at ../../../repos/binutils-gdb/gdb/frame-unwind.c:268 #7 0x0000aaaaaad52604 in dwarf2_frame_prev_register (this_frame=0xaaaaab85a730, this_cache=0xaaaaab85a748, regnum=30) at ../../../repos/binutils-gdb/gdb/dwarf2/frame.c:1296 #8 0x0000aaaaaae1ae68 in frame_unwind_register_value (next_frame=0xaaaaab85a730, regnum=30) at ../../../repos/binutils-gdb/gdb/frame.c:1229 #9 0x0000aaaaaae1b304 in frame_unwind_register_unsigned (next_frame=0xaaaaab85a730, regnum=30) at ../../../repos/binutils-gdb/gdb/frame.c:1320 #10 0x0000aaaaaab76574 in aarch64_dwarf2_prev_register (this_frame=0xaaaaab85a730, this_cache=0xaaaaab85a748, regnum=32) at ../../../repos/binutils-gdb/gdb/aarch64-tdep.c:1114 #11 0x0000aaaaaad52724 in dwarf2_frame_prev_register (this_frame=0xaaaaab85a730, this_cache=0xaaaaab85a748, regnum=32) at ../../../repos/binutils-gdb/gdb/dwarf2/frame.c:1316 #12 0x0000aaaaaae1ae68 in frame_unwind_register_value (next_frame=0xaaaaab85a730, regnum=32) at ../../../repos/binutils-gdb/gdb/frame.c:1229 #13 0x0000aaaaaae1b304 in frame_unwind_register_unsigned (next_frame=0xaaaaab85a730, regnum=32) at ../../../repos/binutils-gdb/gdb/frame.c:1320 #14 0x0000aaaaaae16a84 in default_unwind_pc (gdbarch=0xaaaaab81edc0, next_frame=0xaaaaab85a730) at ../../../repos/binutils-gdb/gdb/frame-unwind.c:223 #15 0x0000aaaaaae32124 in gdbarch_unwind_pc (gdbarch=0xaaaaab81edc0, next_frame=0xaaaaab85a730) at ../../../repos/binutils-gdb/gdb/gdbarch.c:3074 #16 0x0000aaaaaad4f15c in dwarf2_tailcall_sniffer_first (this_frame=0xaaaaab85a730, tailcall_cachep=0xaaaaab85a830, entry_cfa_sp_offsetp=0x0) at ../../../repos/binutils-gdb/gdb/dwarf2/frame-tailcall.c:388 #17 0x0000aaaaaad520c0 in dwarf2_frame_cache (this_frame=0xaaaaab85a730, this_cache=0xaaaaab85a748) at ../../../repos/binutils-gdb/gdb/dwarf2/frame.c:1190 #18 0x0000aaaaaad52204 in dwarf2_frame_this_id (this_frame=0xaaaaab85a730, this_cache=0xaaaaab85a748, this_id=0xaaaaab85a790) at ../../../repos/binutils-gdb/gdb/dwarf2/frame.c:1218 #19 0x0000aaaaaae1916c in compute_frame_id (fi=0xaaaaab85a730) at ../../../repos/binutils-gdb/gdb/frame.c:550 #20 0x0000aaaaaae1c958 in get_prev_frame_if_no_cycle (this_frame=0xaaaaab85a670) at ../../../repos/binutils-gdb/gdb/frame.c:1927 #21 0x0000aaaaaae1cc44 in get_prev_frame_always_1 (this_frame=0xaaaaab85a670) at ../../../repos/binutils-gdb/gdb/frame.c:2006 FIRST CALL - #22 0x0000aaaaaae1d098 in get_prev_frame_always (this_frame=0xaaaaab85a670) at ../../../repos/binutils-gdb/gdb/frame.c:2124 #23 0x0000aaaaaae18f68 in skip_artificial_frames (frame=0xaaaaab85a670) at ../../../repos/binutils-gdb/gdb/frame.c:495 #24 0x0000aaaaaae193e8 in get_stack_frame_id (next_frame=0xaaaaab85a670) at ../../../repos/binutils-gdb/gdb/frame.c:596 #25 0x0000aaaaaae87a54 in process_event_stop_test (ecs=0xffffffffefc8) at ../../../repos/binutils-gdb/gdb/infrun.c:6857 #26 0x0000aaaaaae86bdc in handle_signal_stop (ecs=0xffffffffefc8) at ../../../repos/binutils-gdb/gdb/infrun.c:6381 #27 0x0000aaaaaae84fd0 in handle_inferior_event (ecs=0xffffffffefc8) at ../../../repos/binutils-gdb/gdb/infrun.c:5578 #28 0x0000aaaaaae81588 in fetch_inferior_event (client_data=0x0) at ../../../repos/binutils-gdb/gdb/infrun.c:4020 #29 0x0000aaaaaae5f7fc in inferior_event_handler (event_type=INF_REG_EVENT, client_data=0x0) at ../../../repos/binutils-gdb/gdb/inf-loop.c:43 #30 0x0000aaaaaae8d768 in infrun_async_inferior_event_handler (data=0x0) at ../../../repos/binutils-gdb/gdb/infrun.c:9377 #31 0x0000aaaaaabff970 in check_async_event_handlers () at ../../../repos/binutils-gdb/gdb/async-event.c:291 #32 0x0000aaaaab27cbec in gdb_do_one_event () at ../../../repos/binutils-gdb/gdbsupport/event-loop.cc:194 #33 0x0000aaaaaaef1894 in start_event_loop () at ../../../repos/binutils-gdb/gdb/main.c:356 #34 0x0000aaaaaaef1a04 in captured_command_loop () at ../../../repos/binutils-gdb/gdb/main.c:416 #35 0x0000aaaaaaef3338 in captured_main (data=0xfffffffff1f0) at ../../../repos/binutils-gdb/gdb/main.c:1254 #36 0x0000aaaaaaef33a0 in gdb_main (args=0xfffffffff1f0) at ../../../repos/binutils-gdb/gdb/main.c:1269 #37 0x0000aaaaaab6e0dc in main (argc=6, argv=0xfffffffff348) at ../../../repos/binutils-gdb/gdb/gdb.c:32 The following patch addresses this by using a function that unwinds the PC from the next (inline) frame directly as opposed to creating a lazy value that is bound to the next frame's ID (still not computed). gdb/ChangeLog: 2020-04-23 Luis Machado <[email protected]> * dwarf2/frame-tailcall.c (dwarf2_tailcall_sniffer_first): Use get_frame_register instead of gdbarch_unwind_pc.
mbitsnbites
pushed a commit
that referenced
this issue
Jun 5, 2020
Python3.9b1 is now available on Rawhide. GDB w/ Python 3.9 support can be built using the configure switch -with-python=/usr/bin/python3.9. Attempting to run gdb/Python3.9 segfaults on startup: #0 0x00007ffff7b0582c in PyEval_ReleaseLock () from /lib64/libpython3.9.so.1.0 #1 0x000000000069ccbf in do_start_initialization () at worktree-test1/gdb/python/python.c:1789 #2 _initialize_python () at worktree-test1/gdb/python/python.c:1877 #3 0x00000000007afb0a in initialize_all_files () at init.c:237 ... Consulting the the documentation... https://docs.python.org/3/c-api/init.html ...we find that PyEval_ReleaseLock() has been deprecated since version 3.2. It recommends using PyEval_SaveThread or PyEval_ReleaseThread() instead. In do_start_initialization, in gdb/python/python.c, we can replace the calls to PyThreadState_Swap() and PyEval_ReleaseLock() with a single call to PyEval_SaveThread. (Thanks to Keith Seitz for working this out.) With that in place, GDB gets a little bit further. It still dies on startup, but the backtrace is different: #0 0x00007ffff7b04306 in PyOS_InterruptOccurred () from /lib64/libpython3.9.so.1.0 #1 0x0000000000576e86 in check_quit_flag () at worktree-test1/gdb/extension.c:776 #2 0x0000000000576f8a in set_active_ext_lang (now_active=now_active@entry=0x983c00 <extension_language_python>) at worktree-test1/gdb/extension.c:705 #3 0x000000000069d399 in gdbpy_enter::gdbpy_enter (this=0x7fffffffd2d0, gdbarch=0x0, language=0x0) at worktree-test1/gdb/python/python.c:211 #4 0x0000000000686e00 in python_new_inferior (inf=0xddeb10) at worktree-test1/gdb/python/py-inferior.c:251 #5 0x00000000005d9fb9 in std::function<void (inferior*)>::operator()(inferior*) const (__args#0=<optimized out>, this=0xccad20) at /usr/include/c++/10/bits/std_function.h:617 #6 gdb::observers::observable<inferior*>::notify (args#0=0xddeb10, this=<optimized out>) at worktree-test1/gdb/../gdbsupport/observable.h:106 #7 add_inferior_silent (pid=0) at worktree-test1/gdb/inferior.c:113 #8 0x00000000005dbcb8 in initialize_inferiors () at worktree-test1/gdb/inferior.c:947 ... We checked with some Python Developers and were told that we should acquire the GIL prior to calling any Python C API function. We definitely don't have the GIL for calls of PyOS_InterruptOccurred(). I moved class_gdbpy_gil earlier in the file and use it in gdbpy_check_quit_flag() to acquire (and automatically release) the GIL. With those changes in place, I was able to run to a GDB prompt. But, when trying to quit, it segfaulted again due to due to some other problems with gdbpy_check_quit_flag(): Thread 1 "gdb" received signal SIGSEGV, Segmentation fault. 0x00007ffff7bbab0c in new_threadstate () from /lib64/libpython3.9.so.1.0 (top-gdb) bt 8 #0 0x00007ffff7bbab0c in new_threadstate () from /lib64/libpython3.9.so.1.0 #1 0x00007ffff7afa5ea in PyGILState_Ensure.cold () from /lib64/libpython3.9.so.1.0 #2 0x000000000069b58c in gdbpy_gil::gdbpy_gil (this=<synthetic pointer>) at worktree-test1/gdb/python/python.c:278 #3 gdbpy_check_quit_flag (extlang=<optimized out>) at worktree-test1/gdb/python/python.c:278 #4 0x0000000000576e96 in check_quit_flag () at worktree-test1/gdb/extension.c:776 #5 0x000000000057700c in restore_active_ext_lang (previous=0xe9c050) at worktree-test1/gdb/extension.c:729 #6 0x000000000088913a in do_my_cleanups ( pmy_chain=0xc31870 <final_cleanup_chain>, old_chain=0xae5720 <sentinel_cleanup>) at worktree-test1/gdbsupport/cleanups.cc:131 #7 do_final_cleanups () at worktree-test1/gdbsupport/cleanups.cc:143 In this case, we're trying to call a Python C API function after Py_Finalize() has been called from finalize_python(). I made finalize_python set gdb_python_initialized to false and then cause check_quit_flag() to return early when it's false. With these changes in place, GDB seems to be working again with Python3.9b1. I think it likely that there are other problems lurking. I wouldn't be surprised to find that there are other calls into Python where we don't first make sure that we have the GIL. Further changes may well be needed. I see no regressions testing on Rawhide using a GDB built with the default Python version (3.8.3) versus one built using Python 3.9b1. I've also tested on Fedora 28, 29, 30, 31, and 32 (all x86_64) using the default (though updated) system installed versions of Python on those OSes. This means that I've tested against Python versions 2.7.15, 2.7.17, 2.7.18, 3.7.7, 3.8.2, and 3.8.3. In each case GDB still builds without problem and shows no regressions after applying this patch. gdb/ChangeLog: 2020-MM-DD Kevin Buettner <[email protected]> Keith Seitz <[email protected]> * python/python.c (do_start_initialization): For Python 3.9 and later, call PyEval_SaveThread instead of PyEval_ReleaseLock. (class gdbpy_gil): Move to earlier in file. (finalize_python): Set gdb_python_initialized. (gdbpy_check_quit_flag): Acquire GIL via gdbpy_gil. Return early when not initialized.
mbitsnbites
pushed a commit
that referenced
this issue
Jun 26, 2020
Since the multi-target patch, the run command fails on Solaris with an assertion failure even for a trivial program: $ ./gdb -D ./data-directory ./hello GNU gdb (GDB) 10.0.50.20200106-git [...] Reading symbols from ./hello... (gdb) run Starting program: /vol/obj/gnu/gdb/gdb/reghunt/no-resync/122448/gdb/hello /vol/src/gnu/gdb/hg/master/reghunt/gdb/thread.c:336: internal-error: thread_info::thread_info(inferior*, ptid_t): Assertion `inf_ != NULL' failed. Here's the start of the corresponding stack trace: #0 internal_error ( file=file@entry=0x966150 "/vol/src/gnu/gdb/hg/master/reghunt/gdb/thread.c", line=line@entry=336, fmt=0x9ddb94 "%s: Assertion `%s' failed.") at /vol/src/gnu/gdb/hg/master/reghunt/gdb/gdbsupport/errors.c:51 #1 0x0000000000ef81f4 in thread_info::thread_info (this=0x1212020, inf_=<optimized out>, ptid_=...) at /vol/src/gnu/gdb/hg/master/reghunt/gdb/thread.c:344 #2 0x0000000000ef82cd in new_thread (inf=inf@entry=0x0, ptid=...) at /vol/src/gnu/gdb/hg/master/reghunt/gdb/thread.c:239 #3 0x0000000000efac3c in add_thread_silent ( targ=targ@entry=0x11b0940 <the_procfs_target>, ptid=...) at /vol/src/gnu/gdb/hg/master/reghunt/gdb/thread.c:304 #4 0x0000000000d90692 in procfs_target::create_inferior ( this=0x11b0940 <the_procfs_target>, exec_file=0x13dbef0 "/vol/obj/gnu/gdb/gdb/reghunt/no-resync/122448/gdb/hello", allargs="", env=0x13c48f0, from_tty=<optimized out>) at /vol/src/gnu/gdb/hg/master/reghunt/gdb/gdbsupport/ptid.h:47 #5 0x0000000000c84e64 in run_command_1 (args=<optimized out>, from_tty=1, run_how=run_how@entry=RUN_NORMAL) at /vol/gcc-9/include/c++/9.1.0/bits/basic_string.h:263 #6 0x0000000000c85007 in run_command (args=<optimized out>, from_tty=<optimized out>) at /vol/src/gnu/gdb/hg/master/reghunt/gdb/infcmd.c:687 Looking closer, I found that in add_thread_silent as called from procfs.c (procfs_target::create_inferior) find_inferior_ptid returns NULL. The all_inferiors (targ) iterator comes up empty. Going from there, I see that in add_thread_silent m_target_stack = {m_top = file_stratum, m_stack = {0x20190e0 <the_dummy_target>, 0x200b8c0 <exec_ops>, 0x0, 0x0, 0x0, 0x0, 0x0}}} i.e. the_procfs_target is missing compared to the_amd64_linux_nat_target on Linux/x86_64. Moving the push_target call earlier allows debugging to get over the initial assertion failure. I run instead into procfs: couldn't find pid 0 in procinfo list. which is fixed by https://sourceware.org/pipermail/gdb-patches/2020-June/169674.html Both patches tested together on amd64-pc-solaris2.11. PR gdb/25939 * procfs.c (procfs_target::procfs_init_inferior): Move push_target call ... (procfs_target::create_inferior): ... here.
mbitsnbites
pushed a commit
that referenced
this issue
Sep 7, 2020
…, part 1 Running the testsuite against an Asan-enabled build of GDB makes gdb.base/multi-target.exp expose this bug. scoped_restore_current_thread's ctor calls get_frame_id to record the selected frame's ID to restore later. If the frame ID hasn't been computed yet, it will be computed on the spot, and that will usually require accessing the target's memory and registers, which requires remote accesses. If the remote connection closes while we're computing the frame ID, the remote target exits its inferiors, unpushes itself, and throws a TARGET_CLOSE_ERROR error. If that happens, GDB can currently crash, here: > ==18555==ERROR: AddressSanitizer: heap-use-after-free on address 0x621004670aa8 at pc 0x0000007ab125 bp 0x7ffdecaecd20 sp 0x7ffdecaecd10 > READ of size 4 at 0x621004670aa8 thread T0 > #0 0x7ab124 in dwarf2_frame_this_id src/binutils-gdb/gdb/dwarf2/frame.c:1228 > #1 0x983ec5 in compute_frame_id src/binutils-gdb/gdb/frame.c:550 > #2 0x9841ee in get_frame_id(frame_info*) src/binutils-gdb/gdb/frame.c:582 > #3 0x1093faa in scoped_restore_current_thread::scoped_restore_current_thread() src/binutils-gdb/gdb/thread.c:1462 > #4 0xaee5ba in fetch_inferior_event(void*) src/binutils-gdb/gdb/infrun.c:3968 > #5 0xaa990b in inferior_event_handler(inferior_event_type, void*) src/binutils-gdb/gdb/inf-loop.c:43 > #6 0xea61b6 in remote_async_serial_handler src/binutils-gdb/gdb/remote.c:14161 > #7 0xefca8a in run_async_handler_and_reschedule src/binutils-gdb/gdb/ser-base.c:137 > #8 0xefcd23 in fd_event src/binutils-gdb/gdb/ser-base.c:188 > #9 0x15a7416 in handle_file_event src/binutils-gdb/gdbsupport/event-loop.cc:548 > #10 0x15a7c36 in gdb_wait_for_event src/binutils-gdb/gdbsupport/event-loop.cc:673 > #11 0x15a5dbb in gdb_do_one_event() src/binutils-gdb/gdbsupport/event-loop.cc:215 > #12 0xbfe62d in start_event_loop src/binutils-gdb/gdb/main.c:356 > #13 0xbfe935 in captured_command_loop src/binutils-gdb/gdb/main.c:416 > #14 0xc01d39 in captured_main src/binutils-gdb/gdb/main.c:1253 > #15 0xc01dc9 in gdb_main(captured_main_args*) src/binutils-gdb/gdb/main.c:1268 > #16 0x414ddd in main src/binutils-gdb/gdb/gdb.c:32 > #17 0x7f590110b82f in __libc_start_main ../csu/libc-start.c:291 > #18 0x414bd8 in _start (build/binutils-gdb/gdb/gdb+0x414bd8) What happens is that above, we're in dwarf2_frame_this_id, just after the dwarf2_frame_cache call. The "cache" variable that the dwarf2_frame_cache function returned is already stale. It's been released here, from within the dwarf2_frame_cache: (top-gdb) bt #0 reinit_frame_cache () at src/gdb/frame.c:1855 #1 0x00000000014ff7b0 in switch_to_no_thread () at src/gdb/thread.c:1301 #2 0x0000000000f66d3e in switch_to_inferior_no_thread (inf=0x615000338180) at src/gdb/inferior.c:626 #3 0x00000000012f3826 in remote_unpush_target (target=0x6170000c5900) at src/gdb/remote.c:5521 #4 0x00000000013097e0 in remote_target::readchar (this=0x6170000c5900, timeout=2) at src/gdb/remote.c:9137 #5 0x000000000130be4d in remote_target::getpkt_or_notif_sane_1 (this=0x6170000c5900, buf=0x6170000c5918, forever=0, expecting_notif=0, is_notif=0x0) at src/gdb/remote.c:9683 #6 0x000000000130c8ab in remote_target::getpkt_sane (this=0x6170000c5900, buf=0x6170000c5918, forever=0) at src/gdb/remote.c:9790 #7 0x000000000130bc0d in remote_target::getpkt (this=0x6170000c5900, buf=0x6170000c5918, forever=0) at src/gdb/remote.c:9623 #8 0x000000000130838e in remote_target::remote_read_bytes_1 (this=0x6170000c5900, memaddr=0x7fffffffcdc0, myaddr=0x6080000ad3bc "", len_units=64, unit_size=1, xfered_len_units=0x7fff6a29b9a0) at src/gdb/remote.c:8860 #9 0x0000000001308bd2 in remote_target::remote_read_bytes (this=0x6170000c5900, memaddr=0x7fffffffcdc0, myaddr=0x6080000ad3bc "", len=64, unit_size=1, xfered_len=0x7fff6a29b9a0) at src/gdb/remote.c:8987 #10 0x0000000001311ed1 in remote_target::xfer_partial (this=0x6170000c5900, object=TARGET_OBJECT_MEMORY, annex=0x0, readbuf=0x6080000ad3bc "", writebuf=0x0, offset=140737488342464, len=64, xfered_len=0x7fff6a29b9a0) at src/gdb/remote.c:10988 #11 0x00000000014ba969 in raw_memory_xfer_partial (ops=0x6170000c5900, readbuf=0x6080000ad3bc "", writebuf=0x0, memaddr=140737488342464, len=64, xfered_len=0x7fff6a29b9a0) at src/gdb/target.c:918 #12 0x00000000014bb720 in target_xfer_partial (ops=0x6170000c5900, object=TARGET_OBJECT_RAW_MEMORY, annex=0x0, readbuf=0x6080000ad3bc "", writebuf=0x0, offset=140737488342464, len=64, xfered_len=0x7fff6a29b9a0) at src/gdb/target.c:1148 #13 0x00000000014bc3b5 in target_read_partial (ops=0x6170000c5900, object=TARGET_OBJECT_RAW_MEMORY, annex=0x0, buf=0x6080000ad3bc "", offset=140737488342464, len=64, xfered_len=0x7fff6a29b9a0) at src/gdb/target.c:1380 #14 0x00000000014bc593 in target_read (ops=0x6170000c5900, object=TARGET_OBJECT_RAW_MEMORY, annex=0x0, buf=0x6080000ad3bc "", offset=140737488342464, len=64) at src/gdb/target.c:1419 #15 0x00000000014bbd4d in target_read_raw_memory (memaddr=0x7fffffffcdc0, myaddr=0x6080000ad3bc "", len=64) at src/gdb/target.c:1252 #16 0x0000000000bf27df in dcache_read_line (dcache=0x6060001eddc0, db=0x6080000ad3a0) at src/gdb/dcache.c:336 #17 0x0000000000bf2b72 in dcache_peek_byte (dcache=0x6060001eddc0, addr=0x7fffffffcdd8, ptr=0x6020001231b0 "") at src/gdb/dcache.c:403 #18 0x0000000000bf3103 in dcache_read_memory_partial (ops=0x6170000c5900, dcache=0x6060001eddc0, memaddr=0x7fffffffcdd8, myaddr=0x6020001231b0 "", len=8, xfered_len=0x7fff6a29bf20) at src/gdb/dcache.c:484 #19 0x00000000014bafe9 in memory_xfer_partial_1 (ops=0x6170000c5900, object=TARGET_OBJECT_STACK_MEMORY, readbuf=0x6020001231b0 "", writebuf=0x0, memaddr=140737488342488, len=8, xfered_len=0x7fff6a29bf20) at src/gdb/target.c:1034 #20 0x00000000014bb212 in memory_xfer_partial (ops=0x6170000c5900, object=TARGET_OBJECT_STACK_MEMORY, readbuf=0x6020001231b0 "", writebuf=0x0, memaddr=140737488342488, len=8, xfered_len=0x7fff6a29bf20) at src/gdb/target.c:1076 #21 0x00000000014bb6b3 in target_xfer_partial (ops=0x6170000c5900, object=TARGET_OBJECT_STACK_MEMORY, annex=0x0, readbuf=0x6020001231b0 "", writebuf=0x0, offset=140737488342488, len=8, xfered_len=0x7fff6a29bf20) at src/gdb/target.c:1133 #22 0x000000000164564d in read_value_memory (val=0x60f000029440, bit_offset=0, stack=1, memaddr=0x7fffffffcdd8, buffer=0x6020001231b0 "", length=8) at src/gdb/valops.c:956 #23 0x0000000001680fff in value_fetch_lazy_memory (val=0x60f000029440) at src/gdb/value.c:3764 #24 0x0000000001681efd in value_fetch_lazy (val=0x60f000029440) at src/gdb/value.c:3910 #25 0x0000000001676143 in value_optimized_out (value=0x60f000029440) at src/gdb/value.c:1411 #26 0x0000000000e0fcb8 in frame_register_unwind (next_frame=0x6210066bfde0, regnum=16, optimizedp=0x7fff6a29c200, unavailablep=0x7fff6a29c240, lvalp=0x7fff6a29c2c0, addrp=0x7fff6a29c300, realnump=0x7fff6a29c280, bufferp=0x7fff6a29c3a0 "@\304)j\377\177") at src/gdb/frame.c:1144 #27 0x0000000000e10418 in frame_unwind_register (next_frame=0x6210066bfde0, regnum=16, buf=0x7fff6a29c3a0 "@\304)j\377\177") at src/gdb/frame.c:1196 #28 0x0000000000f00431 in i386_unwind_pc (gdbarch=0x6210043d0110, next_frame=0x6210066bfde0) at src/gdb/i386-tdep.c:1969 #29 0x0000000000e39724 in gdbarch_unwind_pc (gdbarch=0x6210043d0110, next_frame=0x6210066bfde0) at src/gdb/gdbarch.c:3056 #30 0x0000000000c2ea90 in dwarf2_tailcall_sniffer_first (this_frame=0x6210066bfde0, tailcall_cachep=0x6210066bfee0, entry_cfa_sp_offsetp=0x0) at src/gdb/dwarf2/frame-tailcall.c:423 #31 0x0000000000c36bdb in dwarf2_frame_cache (this_frame=0x6210066bfde0, this_cache=0x6210066bfdf8) at src/gdb/dwarf2/frame.c:1198 #32 0x0000000000c36eb3 in dwarf2_frame_this_id (this_frame=0x6210066bfde0, this_cache=0x6210066bfdf8, this_id=0x6210066bfe40) at src/gdb/dwarf2/frame.c:1226 Note that remote_target::readchar in frame #4 throws TARGET_CLOSE_ERROR after the remote_unpush_target in frame #3 returns. The problem is that the TARGET_CLOSE_ERROR is swallowed by value_optimized_out in frame #25. If we fix that one, then we run into dwarf2_tailcall_sniffer_first swallowing the exception in frame #30 too. The attached patch fixes it by making those spots swallow fewer kinds of errors. gdb/ChangeLog: * frame-tailcall.c (dwarf2_tailcall_sniffer_first): Only swallow NO_ENTRY_VALUE_ERROR / MEMORY_ERROR / OPTIMIZED_OUT_ERROR / NOT_AVAILABLE_ERROR. * value.c (value_optimized_out): Only swallow MEMORY_ERROR / OPTIMIZED_OUT_ERROR / NOT_AVAILABLE_ERROR.
mbitsnbites
pushed a commit
that referenced
this issue
Sep 7, 2020
When building gdbserver with AddressSanitizer, I get this annoying little leak when gdbserver exits: ==307817==ERROR: LeakSanitizer: detected memory leaks Direct leak of 14 byte(s) in 1 object(s) allocated from: #0 0x7f7fd4256459 in __interceptor_malloc /build/gcc/src/gcc/libsanitizer/asan/asan_malloc_linux.cpp:145 #1 0x563bef981b80 in xmalloc /home/simark/src/binutils-gdb/gdbserver/../gdb/alloc.c:60 #2 0x563befb53301 in xstrdup /home/simark/src/binutils-gdb/libiberty/xstrdup.c:34 #3 0x563bef9d742b in handle_query /home/simark/src/binutils-gdb/gdbserver/server.cc:2286 #4 0x563bef9ed0b7 in process_serial_event /home/simark/src/binutils-gdb/gdbserver/server.cc:4061 #5 0x563bef9f1d9e in handle_serial_event(int, void*) /home/simark/src/binutils-gdb/gdbserver/server.cc:4402 #6 0x563befb0ec65 in handle_file_event /home/simark/src/binutils-gdb/gdbsupport/event-loop.cc:548 #7 0x563befb0f49f in gdb_wait_for_event /home/simark/src/binutils-gdb/gdbsupport/event-loop.cc:673 #8 0x563befb0d4a1 in gdb_do_one_event() /home/simark/src/binutils-gdb/gdbsupport/event-loop.cc:215 #9 0x563bef9e721a in start_event_loop /home/simark/src/binutils-gdb/gdbserver/server.cc:3484 #10 0x563bef9eb90a in captured_main /home/simark/src/binutils-gdb/gdbserver/server.cc:3875 #11 0x563bef9ec2c7 in main /home/simark/src/binutils-gdb/gdbserver/server.cc:3961 #12 0x7f7fd3330001 in __libc_start_main (/usr/lib/libc.so.6+0x27001) SUMMARY: AddressSanitizer: 14 byte(s) leaked in 1 allocation(s). This is due to the handling of unknown qsupported features in handle_query. The `qsupported` vector is built, containing all the feature names received from GDB. As we iterate on them, when we encounter unknown ones, we move them at the beginning of the vector, in preparation of passing this vector of unknown features down to the target (which may know about them). When moving these unknown features to other slots in the vector, we overwrite other pointers without freeing them, which therefore leak. An easy fix would be to add a `free` when doing the move. However, I think this is a good opportunity to sprinkle a bit of automatic memory management in this code. So, use a vector of std::string which owns all the entries. And use a separate vector (that doesn't own the entries) for the unknown ones, which is then passed to target_process_qsupported. Given that the `c_str` method of std::string returns a `const char *`, it follows that process_stratum_target::process_qsupported must accept a `const char **` instead of a `char **`. And while at it, change the pointer + size paramters to use an array_view instead. gdbserver/ChangeLog: * server.cc (handle_query): Use std::vector of std::string for `qsupported` vector. Use separate vector for unknowns. * target.h (class process_stratum_target) <process_qsupported>: Change parameters to array_view of const char *. (target_process_qsupported): Remove `count` parameter. * target.cc (process_stratum_target::process_qsupported): Change parameters to array_view of const char *. * linux-x86-low.cc (class x86_target) <process_qsupported>: Likewise. Change-Id: I97f133825faa6d7abbf83a58504eb0ba77462812
mbitsnbites
pushed a commit
that referenced
this issue
Sep 7, 2020
gdb.base/corefile.exp is showing an unexpected failure and an unresolved testcase when testing against unix/-m32: (gdb) PASS: gdb.base/corefile.exp: attach: sanity check we see the core file attach 15741 gdb/dwarf2-frame.c:1009: internal-error: dwarf2_frame_cache* dwarf2_frame_cache(frame_info*, void**): Assertion `fde != NULL' failed. A problem internal to GDB has been detected, further debugging may prove unreliable. Quit this debugging session? (y or n) FAIL: gdb.base/corefile.exp: attach: with core (GDB internal error) Resyncing due to internal error. This regressed with: From 5b6d1e4 Mon Sep 17 00:00:00 2001 From: Pedro Alves <[email protected]> Date: Fri, 10 Jan 2020 20:06:08 +0000 Subject: [PATCH] Multi-target support The assertion is here: #0 internal_error (file=0xbffffccb0 <error: Cannot access memory at address 0xbffffccb0>, line=0, fmt=0x555556327320 "en_US.UTF-8") at sr c/gdbsupport/errors.cc:51 #1 0x00005555557d4e45 in dwarf2_frame_cache (this_frame=0x55555672f950, this_cache=0x55555672f968) at src/gdb/dwarf2/frame.c:1013 #2 0x00005555557d5886 in dwarf2_frame_this_id (this_frame=0x55555672f950, this_cache=0x55555672f968, this_id=0x55555672f9b0) at src/gdb/d warf2/frame.c:1226 #3 0x00005555558b184e in compute_frame_id (fi=0x55555672f950) at src/gdb/frame.c:558 #4 0x00005555558b19b2 in get_frame_id (fi=0x55555672f950) at src/gdb/frame.c:588 #5 0x0000555555bda338 in scoped_restore_current_thread::scoped_restore_current_thread (this=0x7fffffffd0d8) at src/gdb/thread.c:1458 #6 0x00005555556ce41f in scoped_restore_current_pspace_and_thread::scoped_restore_current_pspace_and_thread (During symbol reading: .debug_line address at offset 0x1db2d3 is 0 [in module /home/pedro/gdb/cascais-builds/binutils-gdb/gdb/gdb] this=0x7fffffffd0d0) at src/gdb/progspace-and-thread.h:29 #7 0x0000555555898ea6 in remove_target_sections (owner=0x555556935550) at src/gdb/exec.c:798 #8 0x0000555555b700b6 in symfile_free_objfile (objfile=0x555556935550) at src/gdb/symfile.c:3742 #9 0x000055555565050e in std::_Function_handler<void (objfile*), void (*)(objfile*)>::_M_invoke(std::_Any_data const&, objfile*&&) (__functor=..., __args#0=@0x7fffffffd190 : 0x555556935550) at /usr/include/c++/9/bits/std_function.h:300 #10 0x0000555555a3053d in std::function<void (objfile*)>::operator()(objfile*) const (this=0x555556752a20, __args#0=0x555556935550) at /usr/include/c++/9/bits/std_function. h:688 #11 0x0000555555a2ff01 in gdb::observers::observable<objfile*>::notify (this=0x5555562eaa80 <gdb::observers::free_objfile>, args#0=0x555556935550) at /net/cascais.nfs/gdb/b inutils-gdb/src/gdb/../gdbsupport/observable.h:106 #12 0x0000555555a2c56a in objfile::~objfile (this=0x555556935550, __in_chrg=<optimized out>) at src/gdb/objfiles.c:521 #13 0x0000555555a31d46 in std::_Sp_counted_ptr<objfile*, (__gnu_cxx::_Lock_policy)2>::_M_dispose (this=0x555556c1f6f0) at /usr/include/c++/9/bits/shared_ptr_base.h:377 #14 0x00005555556d3444 in std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release (this=0x555556c1f6f0) at /usr/include/c++/9/bits/shared_ptr_base.h:155 #15 0x00005555556cec77 in std::__shared_count<(__gnu_cxx::_Lock_policy)2>::~__shared_count (this=0x555556b99ee8, __in_chrg=<optimized out>) at /usr/include/c++/9/bits/shared_ptr_base.h:730 #16 0x0000555555a2f8da in std::__shared_ptr<objfile, (__gnu_cxx::_Lock_policy)2>::~__shared_ptr (this=0x555556b99ee0, __in_chrg=<optimized out>) at /usr/include/c++/9/bits/shared_ptr_base.h:1169 #17 0x0000555555a2f8fa in std::shared_ptr<objfile>::~shared_ptr (this=0x555556b99ee0, __in_chrg=<optimized out>) at /usr/include/c++/9/bits/shared_ptr.h:103 #18 0x0000555555a63fba in __gnu_cxx::new_allocator<std::_List_node<std::shared_ptr<objfile> > >::destroy<std::shared_ptr<objfile> > (this=0x55555679f0c0, __p=0x555556b99ee0) at /usr/include/c++/9/ext/new_allocator.h:153 #19 0x0000555555a638fb in std::allocator_traits<std::allocator<std::_List_node<std::shared_ptr<objfile> > > >::destroy<std::shared_ptr<objfile> > (__a=..., __p=0x555556b99ee0) at /usr/include/c++/9/bits/alloc_traits.h:497 #20 0x0000555555a6351c in std::__cxx11::list<std::shared_ptr<objfile>, std::allocator<std::shared_ptr<objfile> > >::_M_erase (this=0x55555679f0c0, __position=std::shared_ptr<objfile> (expired, weak count 1) = {get() = 0x555556935550}) at /usr/include/c++/9/bits/stl_list.h:1921 #21 0x0000555555a62dab in std::__cxx11::list<std::shared_ptr<objfile>, std::allocator<std::shared_ptr<objfile> > >::erase (this=0x55555679f0c0, __position=std::shared_ptr<objfile> (expired, weak count 1) = {get() = 0x555556935550}) at /usr/include/c++/9/bits/list.tcc:158 #22 0x0000555555a614dd in program_space::remove_objfile (this=0x55555679f080, objfile=0x555556935550) at src/gdb/progspace.c:207 #23 0x0000555555a2c4dc in objfile::unlink (this=0x555556935550) at src/gdb/objfiles.c:497 #24 0x0000555555a2da65 in objfile_purge_solibs () at src/gdb/objfiles.c:904 #25 0x0000555555b3af74 in no_shared_libraries (ignored=0x0, from_tty=1) at src/gdb/solib.c:1236 #26 0x0000555555bbafc7 in target_pre_inferior (from_tty=1) at src/gdb/target.c:1900 #27 0x0000555555940afb in attach_command (args=0x5555563277c7 "15741", from_tty=1) at src/gdb/infcmd.c:2582 ... The problem is that the multi-target commit added a scoped_restore_current_thread to remove_target_sections (frame #7 above). scoped_restore_current_thread's ctor fetches the selected frame's frame id. If the frame had not had its frame id computed yet, it is computed then (frame #4 above). Because it has been determined earlier that the frame's unwinder is the DWARF unwinder, we end up here: static struct dwarf2_frame_cache * dwarf2_frame_cache (struct frame_info *this_frame, void **this_cache) { ... /* Find the correct FDE. */ fde = dwarf2_frame_find_fde (&pc1, &cache->per_objfile); gdb_assert (fde != NULL); And, that assertion fails. The assertion is reasonable, because the DWARF unwinder only claims the frame if it managed to find the FDE earlier (in dwarf2_frame_sniffer). (unix/-m32 is thus really a red herring here -- it's just that on x86_64 -m64, the frame is not claimed by the DWARF unwinder.) The reason the assertion is failing, is because the objfile that contains the FDE has been removed from the objfiles list already when we get here (frame #22 above). This suggests that the fix should be to invalidate DWARF frames when their objfile is removed. Or to keep it simple and safe, invalidate the frame cache when an objfile is removed. That is what this commit does. OOC, I checked why is it that when you unload a file with plain "(gdb) file", we don't hit the assertion. It must be because we're already flushing the frame cache somewhere else in that case. And indeed, we flush the frame cache here: (gdb) bt #0 reinit_frame_cache () at src/gdb/frame.c:1857 #1 0x0000555555ad1ad6 in registers_changed_ptid (target=0x0, ptid=...) at src/gdb/regcache.c:470 #2 0x0000555555ad1b58 in registers_changed () at src/gdb/regcache.c:485 #3 0x00005555558d095e in set_target_gdbarch (new_gdbarch=0x555556d5f5b0) at src/gdb/gdbarch.c:5528 #4 0x0000555555677175 in set_gdbarch_from_file (abfd=0x0) at src/gdb/arch-utils.c:601 #5 0x0000555555897c6b in exec_file_attach (filename=0x0, from_tty=1) at src/gdb/exec.c:409 #6 0x000055555589852d in exec_file_command (args=0x0, from_tty=1) at src/gdb/exec.c:571 #7 0x00005555558985a1 in file_command (arg=0x0, from_tty=1) at src/gdb/exec.c:583 #8 0x000055555572b55f in do_const_cfunc (c=0x55555672e200, args=0x0, from_tty=1) at src/gdb/cli/cli-decode.c:95 #9 0x000055555572f3d3 in cmd_func (cmd=0x55555672e200, args=0x0, from_tty=1) at src/gdb/cli/cli-decode.c:2181 #10 0x0000555555be1ecc in execute_command (p=0x555556327804 "", from_tty=1) at src/gdb/top.c:668 #11 0x0000555555895427 in command_handler (command=0x555556327800 "file") at src/gdb/event-top.c:588 #12 0x00005555558958af in command_line_handler (rl=...) at src/gdb/event-top.c:773 #13 0x0000555555894b3e in gdb_rl_callback_handler (rl=0x55555a09e240 "file") at src/gdb/event-top.c:219 #14 0x0000555555ccfeec in rl_callback_read_char () at src/readline/readline/callback.c:281 #15 0x000055555589495a in gdb_rl_callback_read_char_wrapper_noexcept () at src/gdb/event-top.c:177 #16 0x0000555555894a08 in gdb_rl_callback_read_char_wrapper (client_data=0x555556327520) at src/gdb/event-top.c:194 #17 0x00005555558952a5 in stdin_event_handler (error=0, client_data=0x555556327520) at src/gdb/event-top.c:516 #18 0x0000555555e027d6 in handle_file_event (file_ptr=0x555558d20840, ready_mask=1) at src/gdbsupport/event-loop.cc:548 #19 0x0000555555e02d88 in gdb_wait_for_event (block=1) at src/gdbsupport/event-loop.cc:673 #20 0x0000555555e01c42 in gdb_do_one_event () at src/gdbsupport/event-loop.cc:215 #21 0x00005555559c47c2 in start_event_loop () at src/gdb/main.c:356 #22 0x00005555559c490d in captured_command_loop () at src/gdb/main.c:416 #23 0x00005555559c6217 in captured_main (data=0x7fffffffdc00) at src/gdb/main.c:1253 #24 0x00005555559c6289 in gdb_main (args=0x7fffffffdc00) at src/gdb/main.c:1268 #25 0x0000555555621756 in main (argc=3, argv=0x7fffffffdd18) at src/gdb/gdb.c:32 gdb/ChangeLog: PR gdb/26336 * progspace.c (program_space::remove_objfile): Invalidate the frame cache.
mbitsnbites
pushed a commit
that referenced
this issue
Oct 25, 2020
"thread find" with multiple inferiors got broken with the multi-target work: Thread 1 "gdb" hit Breakpoint 1, internal_error (...) at ../../src/gdbsupport/errors.cc:51 51 { (top-gdb) bt #0 internal_error (file=0xffffd4d0 <error: Cannot access memory at address 0xffffd4d0>, line=0, fmt=0x555556330320 "en_US.UTF-8") at ../../src/gdbsupport/errors.cc:51 #1 0x0000555555bca4c7 in target_thread_name (info=0x555556801290) at ../../src/gdb/target.c:2035 #2 0x0000555555beb07a in thread_find_command (arg=0x7fffffffe08e "1", from_tty=0) at ../../src/gdb/thread.c:1959 #3 0x000055555572ec49 in do_const_cfunc (c=0x555556786bc0, args=0x7fffffffe08e "1", from_tty=0) at ../../src/gdb/cli/cli-decode.c:95 #4 0x0000555555732abd in cmd_func (cmd=0x555556786bc0, args=0x7fffffffe08e "1", from_tty=0) at ../../src/gdb/cli/cli-decode.c:2181 #5 0x0000555555bf1245 in execute_command (p=0x7fffffffe08e "1", from_tty=0) at ../../src/gdb/top.c:664 #6 0x00005555559cad10 in catch_command_errors (command=0x555555bf0c31 <execute_command(char const*, int)>, arg=0x7fffffffe082 "thread find 1", from_tty=0) at ../../src/gdb/main.c:457 #7 0x00005555559cc33d in captured_main_1 (context=0x7fffffffdb60) at ../../src/gdb/main.c:1218 #8 0x00005555559cc571 in captured_main (data=0x7fffffffdb60) at ../../src/gdb/main.c:1243 #9 0x00005555559cc5e8 in gdb_main (args=0x7fffffffdb60) at ../../src/gdb/main.c:1268 #10 0x0000555555623816 in main (argc=17, argv=0x7fffffffdc78) at ../../src/gdb/gdb.c:32 The problem is that we're not switching to the inferior/target before calling target methods, which trips on an assertion put in place exactly to catch this sort of problem. gdb/testsuite/ChangeLog: PR gdb/26631 * gdb.multi/multi-target-thread-find.exp: New file. gdb/ChangeLog: PR gdb/26631 * thread.c (thread_find_command): Switch inferior before calling target methods.
mbitsnbites
pushed a commit
that referenced
this issue
Oct 25, 2020
In some runtimes, there may be a "main" function in some class or namespace. The breakpoint created by runto_main may therefore have unexpected locations on some other functions than the actual main. These breakpoint locations can unexpectedly get hit during tests and lead to failures. I saw this while playing with AMD's ROCm toolchain -- I wrote a board file to run the testsuite against device kernels. There, the runtime calls a "main" function before the device kernel code is reached: Thread 4 "bit_extract" hit Breakpoint 1, 0x00007ffeea140960 in lld::elf::LinkerDriver::main(llvm::ArrayRef<char const*>) () from /opt/rocm/lib/libamd_comgr.so.1 (gdb) bt #0 0x00007ffeea140960 in lld::elf::LinkerDriver::main(llvm::ArrayRef<char const*>) () from /opt/rocm/lib/libamd_comgr.so.1 #1 0x00007ffeea2257a5 in lld::elf::link(llvm::ArrayRef<char const*>, bool, llvm::raw_ostream&, llvm::raw_ostream&) () from /opt/rocm/lib/libamd_comgr.so.1 #2 0x00007ffeea1bc374 in COMGR::linkWithLLD(llvm::ArrayRef<char const*>, llvm::raw_ostream&, llvm::raw_ostream&) () from /opt/rocm/lib/libamd_comgr.so.1 #3 0x00007ffeea1bfb09 in COMGR::InProcessDriver::execute(llvm::ArrayRef<char const*>) () from /opt/rocm/lib/libamd_comgr.so.1 #4 0x00007ffeea1c4da9 in COMGR::AMDGPUCompiler::linkToExecutable() () from /opt/rocm/lib/libamd_comgr.so.1 #5 0x00007ffeea1fde20 in dispatchCompilerAction(amd_comgr_action_kind_s, COMGR::DataAction*, COMGR::DataSet*, COMGR::DataSet*, llvm::raw_ostream&) () from /opt/rocm/lib/libamd_comgr.so.1 #6 0x00007ffeea203a87 in amd_comgr_do_action () from /opt/rocm/lib/libamd_comgr.so.1 ... To avoid that, pass "qualified" to runto, in runto_main, so that gdb_breakpoint ends up creating a breakpoint with -qualified. This avoids creating breakpoints locations for other unrelated "main" functions. Note: I first tried making runto itself use "-qualified", but that caused regressions in the gdb.ada/ tests, which use runto without specifying the whole fully-qualified function name (i.e., without the package). So I end up restricting the -qualified to runto_main/mi_runto_main. The gdb.base/ui-redirect.exp change is necessary because that testcase is looking at what "save breakpoint" generates. gdb/testsuite/ChangeLog: * gdb.base/ui-redirect.exp: Expect "break -qualified main" in saved breakpoints file. * gdb.guile/scm-breakpoint.exp: Expect "-qualified main" when inspecting breakpoint list. * lib/gdb.exp (runto_main): Add "qualified" to options. * lib/mi-support.exp (mi_runto_helper): Add 'qualified' parameter, and handle it. (mi_runto_main): Pass 1 as qualified argument. Change-Id: I51468359ab0a518f05f7c0394c97f7e33b45fe69
mbitsnbites
pushed a commit
that referenced
this issue
Oct 25, 2020
…693) Fix a regression introduced by commit 7188ed0 ("Replace dwarf2_per_cu_data::cu backlink with per-objfile map"). This patch targets both master and gdb-10-branch, since this is a regression from GDB 9. Analysis -------- The DWARF generated by the included test case looks like: 0x0000000b: DW_TAG_compile_unit DW_AT_language [DW_FORM_sdata] (4) 0x0000000d: DW_TAG_base_type DW_AT_name [DW_FORM_string] ("int") DW_AT_byte_size [DW_FORM_data1] (0x04) DW_AT_encoding [DW_FORM_sdata] (5) 0x00000014: DW_TAG_subprogram DW_AT_name [DW_FORM_string] ("apply") 0x0000001b: DW_TAG_subprogram DW_AT_specification [DW_FORM_ref4] (0x00000014 "apply") DW_AT_low_pc [DW_FORM_addr] (0x0000000000001234) DW_AT_high_pc [DW_FORM_data8] (0x0000000000000020) 0x00000030: DW_TAG_template_type_parameter DW_AT_name [DW_FORM_string] ("T") DW_AT_type [DW_FORM_ref4] (0x0000000d "int") 0x00000037: NULL 0x00000038: NULL Simply loading the file in GDB makes it crash: $ ./gdb -nx --data-directory=data-directory testsuite/outputs/gdb.dwarf2/pr26693/pr26693 [1] 15188 abort (core dumped) ./gdb -nx --data-directory=data-directory The crash happens here, where htab (a dwarf2_cu::die_hash field) is unexpectedly NULL while generating partial symbols: #0 0x000055555fa28188 in htab_find_with_hash (htab=0x0, element=0x7fffffffbfa0, hash=27) at /home/simark/src/binutils-gdb/libiberty/hashtab.c:591 #1 0x000055555cb4eb2e in follow_die_offset (sect_off=(unknown: 27), offset_in_dwz=0, ref_cu=0x7fffffffc110) at /home/simark/src/binutils-gdb/gdb/dwarf2/read.c:22951 #2 0x000055555cb4edfb in follow_die_ref (src_die=0x0, attr=0x7fffffffc130, ref_cu=0x7fffffffc110) at /home/simark/src/binutils-gdb/gdb/dwarf2/read.c:22968 #3 0x000055555caa48c5 in partial_die_full_name (pdi=0x621000157e70, cu=0x615000023f80) at /home/simark/src/binutils-gdb/gdb/dwarf2/read.c:8441 #4 0x000055555caa4d79 in add_partial_symbol (pdi=0x621000157e70, cu=0x615000023f80) at /home/simark/src/binutils-gdb/gdb/dwarf2/read.c:8469 #5 0x000055555caa7d8c in add_partial_subprogram (pdi=0x621000157e70, lowpc=0x7fffffffc5c0, highpc=0x7fffffffc5e0, set_addrmap=1, cu=0x615000023f80) at /home/simark/src/binutils-gdb/gdb/dwarf2/read.c:8737 #6 0x000055555caa265c in scan_partial_symbols (first_die=0x621000157e00, lowpc=0x7fffffffc5c0, highpc=0x7fffffffc5e0, set_addrmap=1, cu=0x615000023f80) at /home/simark/src/binutils-gdb/gdb/dwarf2/read.c:8230 #7 0x000055555ca98e3f in process_psymtab_comp_unit_reader (reader=0x7fffffffc6b0, info_ptr=0x60600009650d "\003int", comp_unit_die=0x621000157d10, pretend_language=language_minimal) at /home/simark/src/binutils-gdb/gdb/dwarf2/read.c:7614 #8 0x000055555ca9aa2c in process_psymtab_comp_unit (this_cu=0x621000155510, per_objfile=0x613000009f80, want_partial_unit=false, pretend_language=language_minimal) at /home/simark/src/binutils-gdb/gdb/dwarf2/read.c:7712 #9 0x000055555caa051a in dwarf2_build_psymtabs_hard (per_objfile=0x613000009f80) at /home/simark/src/binutils-gdb/gdb/dwarf2/read.c:8073 The special thing about this DWARF is that the subprogram at 0x1b is a template specialization described with DW_AT_specification, and has no DW_AT_name in itself. To compute the name of this subprogram, partial_die_full_name needs to load the full DIE for this partial DIE. The name is generated from the templated function name and the actual tempalate parameter values of the specialization. To load the full DIE, partial_die_full_name creates a dummy DWARF attribute of form DW_FORM_ref_addr that points to our subprogram's DIE, and calls follow_die_ref on it. This eventually causes load_full_comp_unit to be called for the exact same CU we are currently making partial symbols for: #0 load_full_comp_unit (this_cu=0x621000155510, per_objfile=0x613000009f80, skip_partial=false, pretend_language=language_minimal) at /home/simark/src/binutils-gdb/gdb/dwarf2/read.c:9238 #1 0x000055555cb4e943 in follow_die_offset (sect_off=(unknown: 27), offset_in_dwz=0, ref_cu=0x7fffffffc110) at /home/simark/src/binutils-gdb/gdb/dwarf2/read.c:22942 #2 0x000055555cb4edfb in follow_die_ref (src_die=0x0, attr=0x7fffffffc130, ref_cu=0x7fffffffc110) at /home/simark/src/binutils-gdb/gdb/dwarf2/read.c:22968 #3 0x000055555caa48c5 in partial_die_full_name (pdi=0x621000157e70, cu=0x615000023f80) at /home/simark/src/binutils-gdb/gdb/dwarf2/read.c:8441 #4 0x000055555caa4d79 in add_partial_symbol (pdi=0x621000157e70, cu=0x615000023f80) at /home/simark/src/binutils-gdb/gdb/dwarf2/read.c:8469 #5 0x000055555caa7d8c in add_partial_subprogram (pdi=0x621000157e70, lowpc=0x7fffffffc5c0, highpc=0x7fffffffc5e0, set_addrmap=1, cu=0x615000023f80) at /home/simark/src/binutils-gdb/gdb/dwarf2/read.c:8737 #6 0x000055555caa265c in scan_partial_symbols (first_die=0x621000157e00, lowpc=0x7fffffffc5c0, highpc=0x7fffffffc5e0, set_addrmap=1, cu=0x615000023f80) at /home/simark/src/binutils-gdb/gdb/dwarf2/read.c:8230 #7 0x000055555ca98e3f in process_psymtab_comp_unit_reader (reader=0x7fffffffc6b0, info_ptr=0x60600009650d "\003int", comp_unit_die=0x621000157d10, pretend_language=language_minimal) at /home/simark/src/binutils-gdb/gdb/dwarf2/read.c:7614 #8 0x000055555ca9aa2c in process_psymtab_comp_unit (this_cu=0x621000155510, per_objfile=0x613000009f80, want_partial_unit=false, pretend_language=language_minimal) at /home/simark/src/binutils-gdb/gdb/dwarf2/read.c:7712 #9 0x000055555caa051a in dwarf2_build_psymtabs_hard (per_objfile=0x613000009f80) at /home/simark/src/binutils-gdb/gdb/dwarf2/read.c:8073 load_full_comp_unit creates a cutu_reader for the CU. Since a dwarf2_cu object already exists for the CU, load_full_comp_unit is expected to find it and pass it to cutu_reader, so that cutu_reader doesn't create a new dwarf2_cu for the CU. And this is the difference between before and after the regression. Before commit 7188ed0, the dwarf2_per_cu_data -> dwarf2_cu link was a simple pointer in dwarf2_per_cu_data. This pointer was set up when starting the read the partial symbols. So it was already available at that point where load_full_comp_unit gets called. Post-7188ed02d2a7, this link is per-objfile, kept in the dwarf2_per_objfile::m_dwarf2_cus hash map. The entry is only put in the hash map once the partial symbols have been successfully read, when cutu_reader::keep is called. Therefore, it is _not_ set at the point load_full_comp_unit is called. As a consequence, a new dwarf2_cu object gets created and initialized by load_full_comp_unit (including initializing that dwarf2_cu::die_hash field). Meanwhile, the dwarf2_cu object created and used by the callers up the stack does not get initialized for full symbol reading, and the dwarf2_cu::die_hash field stays unexpectedly NULL. Solution -------- Since the caller of load_full_comp_unit knows about the existing dwarf2_cu object for the CU we are reading (the one load_full_comp_unit is expected to find), we can simply make it pass it down, instead of having load_full_comp_unit look up the per-objfile map. load_full_comp_unit therefore gets a new `existing_cu` parameter. All other callers get updated to pass `per_objfile->get_cu (per_cu)`, so the behavior shouldn't change for them, compared to the current HEAD. A test is added, which is the bare minimum to reproduce the issue. Notes ----- The original problem was reproduced by downloading https://github.com/oneapi-src/oneTBB/releases/download/v2020.3/tbb-2020.3-lin.tgz and loading libtbb.so in GDB. This code was compiled with the Intel C/C++ compiler. I was not able to reproduce the issue using GCC, I think because GCC puts a DW_AT_name in the specialized subprogram, so there's no need for partial_die_full_name to load the full DIE of the subprogram, and the faulty code doesn't execute. gdb/ChangeLog: PR gdb/26693 * dwarf2/read.c (load_full_comp_unit): Add existing_cu parameter. (load_cu): Pass existing CU. (process_imported_unit_die): Likewise. (follow_die_offset): Likewise. gdb/testsuite/ChangeLog: PR gdb/26693 * gdb.dwarf2/template-specification-full-name.exp: New test. Change-Id: I57c8042f96c45f15797a3848e4d384181c56bb44
mbitsnbites
pushed a commit
that referenced
this issue
Dec 26, 2020
The recent commit to make scoped_restore_current_thread's cdtors exception free regressed gdb.base/eh_return.exp: Breakpoint 1, 0x00000000004012bb in eh2 (gdb/frame.c:641: internal-error: frame_id get_frame_id(frame_info*): Assertion `stashed' failed. A problem internal to GDB has been detected, further debugging may prove unreliable. Quit this debugging session? (y or n) FAIL: gdb.base/eh_return.exp: hit breakpoint (GDB internal error) That testcase uses __builtin_eh_return and, before the regression, the backtrace at eh2 looked like this: (gdb) bt #0 0x00000000004006eb in eh2 (p=0x4006ec <continuation>) at src/gdb/testsuite/gdb.base/eh_return.c:54 Backtrace stopped: previous frame identical to this frame (corrupt stack?) That "previous frame identical to this frame" is caught by the cycle detection based on frame id. The assertion failing is this one: 638 /* Since this is the first frame in the chain, this should 639 always succeed. */ 640 bool stashed = frame_stash_add (fi); 641 gdb_assert (stashed); originally added by commit f245535 Author: Pedro Alves <[email protected]> AuthorDate: Mon Sep 5 18:41:38 2016 +0100 Fix PR19927: Avoid unwinder recursion if sniffer uses calls parse_and_eval The assertion is failing because frame #1's frame id was stashed before the id of frame #0 is stashed. The frame id of frame #1 was stashed here: (top-gdb) bt #0 frame_stash_add (frame=0x1e24c90) at src/gdb/frame.c:276 #1 0x0000000000669c1b in get_prev_frame_if_no_cycle (this_frame=0x19f8370) at src/gdb/frame.c:2120 #2 0x000000000066a339 in get_prev_frame_always_1 (this_frame=0x19f8370) at src/gdb/frame.c:2303 #3 0x000000000066a360 in get_prev_frame_always (this_frame=0x19f8370) at src/gdb/frame.c:2319 #4 0x000000000066b56c in get_frame_unwind_stop_reason (frame=0x19f8370) at src/gdb/frame.c:3028 #5 0x000000000059f929 in dwarf2_frame_cfa (this_frame=0x19f8370) at src/gdb/dwarf2/frame.c:1462 #6 0x00000000005ce434 in dwarf_evaluate_loc_desc::get_frame_cfa (this=0x7fffffffc070) at src/gdb/dwarf2/loc.c:666 #7 0x00000000005989a9 in dwarf_expr_context::execute_stack_op (this=0x7fffffffc070, op_ptr=0x1b2a053 "\364\003", op_end=0x1b2a053 "\364\003") at src/gdb/dwarf2/expr.c:1161 #8 0x0000000000596af6 in dwarf_expr_context::eval (this=0x7fffffffc070, addr=0x1b2a052 "\234\364\003", len=1) at src/gdb/dwarf2/expr.c:303 #9 0x0000000000597b4e in dwarf_expr_context::execute_stack_op (this=0x7fffffffc070, op_ptr=0x1b2a063 "", op_end=0x1b2a063 "") at src/gdb/dwarf2/expr.c:865 #10 0x0000000000596af6 in dwarf_expr_context::eval (this=0x7fffffffc070, addr=0x1b2a061 "\221X", len=2) at src/gdb/dwarf2/expr.c:303 #11 0x00000000005c8b5a in dwarf2_evaluate_loc_desc_full (type=0x1b564d0, frame=0x19f8370, data=0x1b2a061 "\221X", size=2, per_cu=0x1b28760, per_objfile=0x1a84930, subobj_type=0x1b564d0, subobj_byte_offset=0) at src/gdb/dwarf2/loc.c:2260 #12 0x00000000005c9243 in dwarf2_evaluate_loc_desc (type=0x1b564d0, frame=0x19f8370, data=0x1b2a061 "\221X", size=2, per_cu=0x1b28760, per_objfile=0x1a84930) at src/gdb/dwarf2/loc.c:2444 #13 0x00000000005cb769 in locexpr_read_variable (symbol=0x1b59840, frame=0x19f8370) at src/gdb/dwarf2/loc.c:3687 #14 0x0000000000663137 in language_defn::read_var_value (this=0x122ea60 <c_language_defn>, var=0x1b59840, var_block=0x0, frame=0x19f8370) at src/gdb/findvar.c:618 #15 0x0000000000663c3b in read_var_value (var=0x1b59840, var_block=0x0, frame=0x19f8370) at src/gdb/findvar.c:822 #16 0x00000000008c7d9f in read_frame_arg (fp_opts=..., sym=0x1b59840, frame=0x19f8370, argp=0x7fffffffc470, entryargp=0x7fffffffc490) at src/gdb/stack.c:542 #17 0x00000000008c89cd in print_frame_args (fp_opts=..., func=0x1b597c0, frame=0x19f8370, num=-1, stream=0x1aba860) at src/gdb/stack.c:890 #18 0x00000000008c9bf8 in print_frame (fp_opts=..., frame=0x19f8370, print_level=0, print_what=SRC_AND_LOC, print_args=1, sal=...) at src/gdb/stack.c:1394 #19 0x00000000008c92b9 in print_frame_info (fp_opts=..., frame=0x19f8370, print_level=0, print_what=SRC_AND_LOC, print_args=1, set_current_sal=1) at src/gdb/stack.c:1119 #20 0x00000000008c75f0 in print_stack_frame (frame=0x19f8370, print_level=0, print_what=SRC_AND_LOC, set_current_sal=1) at src/gdb/stack.c:366 #21 0x000000000070250b in print_stop_location (ws=0x7fffffffc9e0) at src/gdb/infrun.c:8110 #22 0x0000000000702569 in print_stop_event (uiout=0x1a8b9e0, displays=true) at src/gdb/infrun.c:8126 #23 0x000000000096d04b in tui_on_normal_stop (bs=0x1bcd1c0, print_frame=1) at src/gdb/tui/tui-interp.c:98 ... Before the commit to make scoped_restore_current_thread's cdtors exception free, scoped_restore_current_thread's dtor would call get_frame_id on the selected frame, and we use scoped_restore_current_thread pervasively. That had the side effect of stashing the frame id of frame #0 before reaching the path shown in the backtrace. I.e., the frame id of frame #0 happened to be stashed before the frame id of frame #1. But that was by chance, not by design. This commit: commit 256ae5d Author: Kevin Buettner <[email protected]> AuthorDate: Mon Oct 31 12:47:42 2016 -0700 Stash frame id of current frame before stashing frame id for previous frame Fixed a similar problem, by making sure get_prev_frame computes the frame id of the current frame before unwinding the previous frame, so that the cycle detection works properly. That fix misses the scenario we're now running against, because if you notice, the backtrace above shows that frame #4 calls get_prev_frame_always, not get_prev_frame. I.e., nothing is calling get_frame_id on the current frame. The fix here is to move Kevin's fix down from get_prev_frame to get_prev_frame_always. Or actually, a bit further down to get_prev_frame_always_1 -- note that inline_frame_this_id calls get_prev_frame_always, so we need to be careful to avoid recursion in that scenario. gdb/ChangeLog: * frame.c (get_prev_frame): Move get_frame_id call from here ... (get_prev_frame_always_1): ... to here. * inline-frame.c (inline_frame_this_id): Mention get_prev_frame_always_1 in comment. Change-Id: Id960c98ab2d072c48a436c3eb160cc4b2a5cfd1d
mbitsnbites
pushed a commit
that referenced
this issue
Jan 22, 2021
…onfig_dir As reported in PR 27157, if some environment variables read at startup by GDB are defined but empty, we hit the assert in gdb_abspath: $ XDG_CACHE_HOME= ./gdb -nx --data-directory=data-directory -q AddressSanitizer:DEADLYSIGNAL ================================================================= ==2007040==ERROR: AddressSanitizer: SEGV on unknown address 0x0000000001b0 (pc 0x5639d4aa4127 bp 0x7ffdac232c00 sp 0x7ffdac232bf0 T0) ==2007040==The signal is caused by a READ memory access. ==2007040==Hint: address points to the zero page. #0 0x5639d4aa4126 in target_stack::top() const /home/smarchi/src/binutils-gdb/gdb/target.h:1334 #1 0x5639d4aa41f1 in inferior::top_target() /home/smarchi/src/binutils-gdb/gdb/inferior.h:369 #2 0x5639d4a70b1f in current_top_target() /home/smarchi/src/binutils-gdb/gdb/target.c:120 #3 0x5639d4b00591 in gdb_readline_wrapper_cleanup::gdb_readline_wrapper_cleanup() /home/smarchi/src/binutils-gdb/gdb/top.c:1046 #4 0x5639d4afab31 in gdb_readline_wrapper(char const*) /home/smarchi/src/binutils-gdb/gdb/top.c:1104 #5 0x5639d4ccce2c in defaulted_query /home/smarchi/src/binutils-gdb/gdb/utils.c:893 #6 0x5639d4ccd6af in query(char const*, ...) /home/smarchi/src/binutils-gdb/gdb/utils.c:985 #7 0x5639d4ccaec1 in internal_vproblem /home/smarchi/src/binutils-gdb/gdb/utils.c:373 #8 0x5639d4ccb3d1 in internal_verror(char const*, int, char const*, __va_list_tag*) /home/smarchi/src/binutils-gdb/gdb/utils.c:439 #9 0x5639d5151a92 in internal_error(char const*, int, char const*, ...) /home/smarchi/src/binutils-gdb/gdbsupport/errors.cc:55 #10 0x5639d5162ab4 in gdb_abspath(char const*) /home/smarchi/src/binutils-gdb/gdbsupport/pathstuff.cc:132 #11 0x5639d5162fac in get_standard_cache_dir[abi:cxx11]() /home/smarchi/src/binutils-gdb/gdbsupport/pathstuff.cc:228 #12 0x5639d3e76a81 in _initialize_index_cache() /home/smarchi/src/binutils-gdb/gdb/dwarf2/index-cache.c:325 #13 0x5639d4dbbe92 in initialize_all_files() /home/smarchi/build/binutils-gdb/gdb/init.c:321 #14 0x5639d4b00259 in gdb_init(char*) /home/smarchi/src/binutils-gdb/gdb/top.c:2344 #15 0x5639d4440715 in captured_main_1 /home/smarchi/src/binutils-gdb/gdb/main.c:950 #16 0x5639d444252e in captured_main /home/smarchi/src/binutils-gdb/gdb/main.c:1229 #17 0x5639d44425cf in gdb_main(captured_main_args*) /home/smarchi/src/binutils-gdb/gdb/main.c:1254 #18 0x5639d3923371 in main /home/smarchi/src/binutils-gdb/gdb/gdb.c:32 #19 0x7fa002d3f0b2 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x270b2) #20 0x5639d392314d in _start (/home/smarchi/build/binutils-gdb/gdb/gdb+0x4d414d) gdb_abspath doesn't handle empty strings, so handle this case in the callers. If a variable is defined but empty, I think it's reasonable in this case to just ignore it, as if it was not defined. Note that this sometimes also lead to a segfault, because the failed assertion happens very early during startup, before things are fully initialized. gdbsupport/ChangeLog: PR gdb/27157 * pathstuff.cc (get_standard_cache_dir, get_standard_config_dir, find_gdb_home_config_file): Add empty string check. gdb/testsuite/ChangeLog: PR gdb/27157 * gdb.base/empty-host-env-vars.exp: New test. Change-Id: I8654d8e97e74e1dff6d308c111ae4b1bbf07bef9
mbitsnbites
pushed a commit
that referenced
this issue
Jan 22, 2021
When GDB is waiting trying to connect to a remote target and it receives a SIGWINCH (terminal gets resized), the blocking system call gets interrupted and we abort. For example, I connect to some port (on which nothing listens): (gdb) tar rem :1234 ... GDB blocks here, resize the terminal ... 🔢 Interrupted system call. The backtrace where GDB is blocked while waiting for the connection to establish is: #0 0x00007fe9db805b7b in select () from /usr/lib/libc.so.6 #1 0x000055f2472e9c42 in gdb_select (n=0, readfds=0x0, writefds=0x0, exceptfds=0x0, timeout=0x7ffe8fafe050) at /home/simark/src/binutils-gdb/gdb/posix-hdep.c:31 #2 0x000055f24759c212 in wait_for_connect (sock=-1, polls=0x7ffe8fafe300) at /home/simark/src/binutils-gdb/gdb/ser-tcp.c:147 #3 0x000055f24759d0e8 in net_open (scb=0x62500015b900, name=0x6020000601d8 ":1234") at /home/simark/src/binutils-gdb/gdb/ser-tcp.c:356 #4 0x000055f2475a0395 in serial_open_ops_1 (ops=0x55f24892ca60 <tcp_ops>, open_name=0x6020000601d8 ":1234") at /home/simark/src/binutils-gdb/gdb/serial.c:244 #5 0x000055f2475a01d6 in serial_open (name=0x6020000601d8 ":1234") at /home/simark/src/binutils-gdb/gdb/serial.c:231 #6 0x000055f2474d5274 in remote_serial_open (name=0x6020000601d8 ":1234") at /home/simark/src/binutils-gdb/gdb/remote.c:5019 #7 0x000055f2474d7025 in remote_target::open_1 (name=0x6020000601d8 ":1234", from_tty=1, extended_p=0) at /home/simark/src/binutils-gdb/gdb/remote.c:5571 #8 0x000055f2474d47d5 in remote_target::open (name=0x6020000601d8 ":1234", from_tty=1) at /home/simark/src/binutils-gdb/gdb/remote.c:4898 #9 0x000055f24776379f in open_target (args=0x6020000601d8 ":1234", from_tty=1, command=0x611000042bc0) at /home/simark/src/binutils-gdb/gdb/target.c:242 Fix that by using interruptible_select in wait_for_connect, instead of gdb_select. Resizing the terminal now no longer aborts the connection. It is still possible to interrupt the connection using ctrl-c. gdb/ChangeLog: * ser-tcp.c (wait_for_connect): Use interruptible_select instead of gdb_select. Change-Id: Ie25577bd1e5699e4847b6b53fdfa10b8c0dc5c89
mbitsnbites
pushed a commit
that referenced
this issue
Jan 25, 2021
Commit 5b7d941 ("gdb: add owner-related methods to struct type") introduced a regression when running gdb.base/jit-reader-simple.exp and others. A NULL pointer dereference happens here: #3 0x0000557b7e9e8650 in gdbarch_obstack (arch=0x0) at /home/simark/src/binutils-gdb/gdb/gdbarch.c:484 #4 0x0000557b7ea5b138 in copy_type_recursive (objfile=0x614000006640, type=0x62100018da80, copied_types=0x62100018e280) at /home/simark/src/binutils-gdb/gdb/gdbtypes.c:5537 #5 0x0000557b7ea5dcbb in copy_type_recursive (objfile=0x614000006640, type=0x62100018e200, copied_types=0x62100018e280) at /home/simark/src/binutils-gdb/gdb/gdbtypes.c:5598 #6 0x0000557b802cef51 in preserve_one_value (value=0x6110000b3640, objfile=0x614000006640, copied_types=0x62100018e280) at /home/simark/src/binutils-gdb/gdb/value.c:2518 #7 0x0000557b802cf787 in preserve_values (objfile=0x614000006640) at /home/simark/src/binutils-gdb/gdb/value.c:2562 #8 0x0000557b7fbaf19b in reread_symbols () at /home/simark/src/binutils-gdb/gdb/symfile.c:2489 #9 0x0000557b7ec65d1d in run_command_1 (args=0x0, from_tty=1, run_how=RUN_NORMAL) at /home/simark/src/binutils-gdb/gdb/infcmd.c:439 #10 0x0000557b7ec67a97 in run_command (args=0x0, from_tty=1) at /home/simark/src/binutils-gdb/gdb/infcmd.c:546 This is inside a TYPE_ALLOC macro. The fact that gdbarch_obstack is called means that the type is flagged as being arch-owned, but arch=0x0 means that type::arch returned NULL, probably meaning that the m_owner field contains NULL. If we look at the code before the problematic patch, in the copy_type_recursive function, we see: if (! TYPE_OBJFILE_OWNED (type)) return type; ... TYPE_OBJFILE_OWNED (new_type) = 0; TYPE_OWNER (new_type).gdbarch = get_type_arch (type); The last two lines were replaced with: new_type->set_owner (type->arch ()); get_type_arch and type->arch isn't the same thing: get_type_arch gets the type's arch owner if it is arch-owned, and gets the objfile's arch if the type is objfile owned. So it always returns non-NULL. type->arch returns the type's arch if the type is arch-owned, else NULL. So since the original type is objfile owned, it effectively made the new type arch-owned (that is good) but set the owner to NULL (that is bad). Fix this by using get_type_arch again there. I spotted one other similar change in lookup_array_range_type, in the original patch. But that one appears to be correct, as it is executed only if the type is arch-owned. Add some asserts in type::set_owner to ensure we never set a NULL owner. That would have helped catch the issue a little bit earlier, so it could help in the future. gdb/ChangeLog: * gdbtypes.c (copy_type_recursive): Use get_type_arch. * gdbtypes.h (struct type) <set_owner>: Add asserts. Change-Id: I5d8bc7bfc83b3abc579be0b5aadeae4241179a00
mbitsnbites
pushed a commit
that referenced
this issue
Feb 4, 2021
With "target extended-remote" + "maint set target-non-stop", attaching hangs like so: (gdb) attach 1244450 Attaching to process 1244450 [New Thread 1244450.1244450] [New Thread 1244450.1244453] [New Thread 1244450.1244454] [New Thread 1244450.1244455] [New Thread 1244450.1244456] [New Thread 1244450.1244457] [New Thread 1244450.1244458] [New Thread 1244450.1244459] [New Thread 1244450.1244461] [New Thread 1244450.1244462] [New Thread 1244450.1244463] * hang * Attaching to the hung GDB shows that GDB is busy in an infinite loop in stop_all_threads: (top-gdb) bt #0 stop_all_threads () at /home/pedro/gdb/binutils-gdb/src/gdb/infrun.c:4755 #1 0x000055555597b424 in stop_waiting (ecs=0x7fffffffd930) at /home/pedro/gdb/binutils-gdb/src/gdb/infrun.c:7738 #2 0x0000555555976fba in handle_signal_stop (ecs=0x7fffffffd930) at /home/pedro/gdb/binutils-gdb/src/gdb/infrun.c:5868 #3 0x0000555555975f6a in handle_inferior_event (ecs=0x7fffffffd930) at /home/pedro/gdb/binutils-gdb/src/gdb/infrun.c:5527 #4 0x0000555555971da4 in fetch_inferior_event () at /home/pedro/gdb/binutils-gdb/src/gdb/infrun.c:3910 #5 0x00005555559540b2 in inferior_event_handler (event_type=INF_REG_EVENT) at /home/pedro/gdb/binutils-gdb/src/gdb/inf-loop.c:42 #6 0x000055555597e825 in infrun_async_inferior_event_handler (data=0x0) at /home/pedro/gdb/binutils-gdb/src/gdb/infrun.c:9162 #7 0x0000555555687d1d in check_async_event_handlers () at /home/pedro/gdb/binutils-gdb/src/gdb/async-event.c:328 #8 0x0000555555e48284 in gdb_do_one_event () at /home/pedro/gdb/binutils-gdb/src/gdbsupport/event-loop.cc:216 #9 0x00005555559e7512 in start_event_loop () at /home/pedro/gdb/binutils-gdb/src/gdb/main.c:347 #10 0x00005555559e765d in captured_command_loop () at /home/pedro/gdb/binutils-gdb/src/gdb/main.c:407 #11 0x00005555559e8f80 in captured_main (data=0x7fffffffdb70) at /home/pedro/gdb/binutils-gdb/src/gdb/main.c:1239 #12 0x00005555559e8ff2 in gdb_main (args=0x7fffffffdb70) at /home/pedro/gdb/binutils-gdb/src/gdb/main.c:1254 #13 0x0000555555627c86 in main (argc=12, argv=0x7fffffffdc88) at /home/pedro/gdb/binutils-gdb/src/gdb/gdb.c:32 The problem is that the remote sends stops for all the threads: Packet received: l/home/pedro/gdb/binutils-gdb/build/gdb/testsuite/outputs/gdb.threads/attach-non-stop/attach-non-stop Sending packet: $vStopped#55...Packet received: T0006:f06e25edec7f0000;07:f06e25edec7f0000;10:f14190ccf4550000;thread:p12fd22.12fd2f;core:15; Sending packet: $vStopped#55...Packet received: T0006:f0dea5f0ec7f0000;07:f0dea5f0ec7f0000;10:e84190ccf4550000;thread:p12fd22.12fd27;core:4; Sending packet: $vStopped#55...Packet received: T0006:f0ee25f1ec7f0000;07:f0ee25f1ec7f0000;10:f14190ccf4550000;thread:p12fd22.12fd26;core:5; Sending packet: $vStopped#55...Packet received: T0006:f0bea5efec7f0000;07:f0bea5efec7f0000;10:f14190ccf4550000;thread:p12fd22.12fd29;core:1; Sending packet: $vStopped#55...Packet received: T0006:f0ce25f0ec7f0000;07:f0ce25f0ec7f0000;10:e84190ccf4550000;thread:p12fd22.12fd28;core:a; Sending packet: $vStopped#55...Packet received: T0006:f07ea5edec7f0000;07:f07ea5edec7f0000;10:e84190ccf4550000;thread:p12fd22.12fd2e;core:f; Sending packet: $vStopped#55...Packet received: T0006:f0ae25efec7f0000;07:f0ae25efec7f0000;10:df4190ccf4550000;thread:p12fd22.12fd2a;core:6; Sending packet: $vStopped#55...Packet received: T0006:0000000000000000;07:c0e8a381fe7f0000;10:bf43b4f1ec7f0000;thread:p12fd22.12fd22;core:2; Sending packet: $vStopped#55...Packet received: T0006:f0fea5f1ec7f0000;07:f0fea5f1ec7f0000;10:df4190ccf4550000;thread:p12fd22.12fd25;core:8; Sending packet: $vStopped#55...Packet received: T0006:f09ea5eeec7f0000;07:f09ea5eeec7f0000;10:e84190ccf4550000;thread:p12fd22.12fd2b;core:b; Sending packet: $vStopped#55...Packet received: OK But then wait_one never consumes them, always hitting this path: 4473 if (nfds == 0) 4474 { 4475 /* No waitable targets left. All must be stopped. */ 4476 return {NULL, minus_one_ptid, {TARGET_WAITKIND_NO_RESUMED}}; 4477 } Resulting in GDB constanly calling target_stop to stop threads, but the remote target never reporting back the stops to infrun. That TARGET_WAITKIND_NO_RESUMED path shown above is always taken because here, in wait_one too, just above: 4428 for (inferior *inf : all_inferiors ()) 4429 { 4430 process_stratum_target *target = inf->process_target (); 4431 if (target == NULL 4432 || !target->is_async_p () ^^^^^^^^^^^^^^^^^^^^^ 4433 || !target->threads_executing) 4434 continue; ... the remote target is not async. And in turn that happened because extended_remote_target::attach misses enabling async in the target-non-stop path. A testcase exercising this will be added in a following patch. gdb/ChangeLog: * remote.c (extended_remote_target::attach): Set target async in the target-non-stop path too.
mbitsnbites
pushed a commit
that referenced
this issue
Mar 4, 2021
…PR gdb/27147) PR 27147 shows that on sparc64, GDB is unable to properly unwind: Expected result (from GDB 9.2): #0 0x0000000000108de4 in puts () #1 0x0000000000100950 in hello () at gdb-test.c:4 #2 0x0000000000100968 in main () at gdb-test.c:8 Actual result (from GDB latest git): #0 0x0000000000108de4 in puts () #1 0x0000000000100950 in hello () at gdb-test.c:4 Backtrace stopped: previous frame inner to this frame (corrupt stack?) The first failing commit is 5b6d1e4 ("Multi-target support"). The cause of the change in behavior is due to (thanks for Andrew Burgess for finding this): - inferior_ptid is no longer set on entry of target_ops::wait, whereas it was set to something valid previously - deep down in linux_nat_target::wait (see stack trace below), we fetch the registers of the event thread - on sparc64, fetching registers involves reading memory (in sparc_supply_rwindow, see stack trace below) - reading memory (target_ops::xfer_partial) relies on inferior_ptid being set to the thread from which we want to read memory This is where things go wrong: #0 linux_nat_target::xfer_partial (this=0x10000fa2c40 <the_sparc64_linux_nat_target>, object=TARGET_OBJECT_MEMORY, annex=0x0, readbuf=0x7feffe3b000 "", writebuf=0x0, offset=8791798050744, len=8, xfered_len=0x7feffe3ae88) at /home/simark/src/binutils-gdb/gdb/linux-nat.c:3697 #1 0x00000100007f5b10 in raw_memory_xfer_partial (ops=0x10000fa2c40 <the_sparc64_linux_nat_target>, readbuf=0x7feffe3b000 "", writebuf=0x0, memaddr=8791798050744, len=8, xfered_len=0x7feffe3ae88) at /home/simark/src/binutils-gdb/gdb/target.c:912 #2 0x00000100007f60e8 in memory_xfer_partial_1 (ops=0x10000fa2c40 <the_sparc64_linux_nat_target>, object=TARGET_OBJECT_MEMORY, readbuf=0x7feffe3b000 "", writebuf=0x0, memaddr=8791798050744, len=8, xfered_len=0x7feffe3ae88) at /home/simark/src/binutils-gdb/gdb/target.c:1043 #3 0x00000100007f61b4 in memory_xfer_partial (ops=0x10000fa2c40 <the_sparc64_linux_nat_target>, object=TARGET_OBJECT_MEMORY, readbuf=0x7feffe3b000 "", writebuf=0x0, memaddr=8791798050744, len=8, xfered_len=0x7feffe3ae88) at /home/simark/src/binutils-gdb/gdb/target.c:1072 #4 0x00000100007f6538 in target_xfer_partial (ops=0x10000fa2c40 <the_sparc64_linux_nat_target>, object=TARGET_OBJECT_MEMORY, annex=0x0, readbuf=0x7feffe3b000 "", writebuf=0x0, offset=8791798050744, len=8, xfered_len=0x7feffe3ae88) at /home/simark/src/binutils-gdb/gdb/target.c:1129 #5 0x00000100007f7094 in target_read_partial (ops=0x10000fa2c40 <the_sparc64_linux_nat_target>, object=TARGET_OBJECT_MEMORY, annex=0x0, buf=0x7feffe3b000 "", offset=8791798050744, len=8, xfered_len=0x7feffe3ae88) at /home/simark/src/binutils-gdb/gdb/target.c:1375 #6 0x00000100007f721c in target_read (ops=0x10000fa2c40 <the_sparc64_linux_nat_target>, object=TARGET_OBJECT_MEMORY, annex=0x0, buf=0x7feffe3b000 "", offset=8791798050744, len=8) at /home/simark/src/binutils-gdb/gdb/target.c:1415 #7 0x00000100007f69d4 in target_read_memory (memaddr=8791798050744, myaddr=0x7feffe3b000 "", len=8) at /home/simark/src/binutils-gdb/gdb/target.c:1218 #8 0x0000010000758520 in sparc_supply_rwindow (regcache=0x10000fea4f0, sp=8791798050736, regnum=-1) at /home/simark/src/binutils-gdb/gdb/sparc-tdep.c:1960 #9 0x000001000076208c in sparc64_supply_gregset (gregmap=0x10000be3190 <sparc64_linux_ptrace_gregmap>, regcache=0x10000fea4f0, regnum=-1, gregs=0x7feffe3b230) at /home/simark/src/binutils-gdb/gdb/sparc64-tdep.c:1974 #10 0x0000010000751b64 in sparc_fetch_inferior_registers (regcache=0x10000fea4f0, regnum=80) at /home/simark/src/binutils-gdb/gdb/sparc-nat.c:170 #11 0x0000010000759d68 in sparc64_linux_nat_target::fetch_registers (this=0x10000fa2c40 <the_sparc64_linux_nat_target>, regcache=0x10000fea4f0, regnum=80) at /home/simark/src/binutils-gdb/gdb/sparc64-linux-nat.c:38 #12 0x00000100008146ec in target_fetch_registers (regcache=0x10000fea4f0, regno=80) at /home/simark/src/binutils-gdb/gdb/target.c:3287 #13 0x00000100006a8c5c in regcache::raw_update (this=0x10000fea4f0, regnum=80) at /home/simark/src/binutils-gdb/gdb/regcache.c:584 #14 0x00000100006a8d94 in readable_regcache::raw_read (this=0x10000fea4f0, regnum=80, buf=0x7feffe3b7c0 "") at /home/simark/src/binutils-gdb/gdb/regcache.c:598 #15 0x00000100006a93b8 in readable_regcache::cooked_read (this=0x10000fea4f0, regnum=80, buf=0x7feffe3b7c0 "") at /home/simark/src/binutils-gdb/gdb/regcache.c:690 #16 0x00000100006b288c in readable_regcache::cooked_read<unsigned long, void> (this=0x10000fea4f0, regnum=80, val=0x7feffe3b948) at /home/simark/src/binutils-gdb/gdb/regcache.c:777 #17 0x00000100006a9b44 in regcache_cooked_read_unsigned (regcache=0x10000fea4f0, regnum=80, val=0x7feffe3b948) at /home/simark/src/binutils-gdb/gdb/regcache.c:791 #18 0x00000100006abf3c in regcache_read_pc (regcache=0x10000fea4f0) at /home/simark/src/binutils-gdb/gdb/regcache.c:1295 #19 0x0000010000507920 in save_stop_reason (lp=0x10000fc5b10) at /home/simark/src/binutils-gdb/gdb/linux-nat.c:2612 #20 0x00000100005095a4 in linux_nat_filter_event (lwpid=520983, status=1407) at /home/simark/src/binutils-gdb/gdb/linux-nat.c:3050 #21 0x0000010000509f9c in linux_nat_wait_1 (ptid=..., ourstatus=0x7feffe3c8f0, target_options=...) at /home/simark/src/binutils-gdb/gdb/linux-nat.c:3194 #22 0x000001000050b1d0 in linux_nat_target::wait (this=0x10000fa2c40 <the_sparc64_linux_nat_target>, ptid=..., ourstatus=0x7feffe3c8f0, target_options=...) at /home/simark/src/binutils-gdb/gdb/linux-nat.c:3432 #23 0x00000100007f8ac0 in target_wait (ptid=..., status=0x7feffe3c8f0, options=...) at /home/simark/src/binutils-gdb/gdb/target.c:2000 #24 0x00000100004ac17c in do_target_wait_1 (inf=0x1000116d280, ptid=..., status=0x7feffe3c8f0, options=...) at /home/simark/src/binutils-gdb/gdb/infrun.c:3464 #25 0x00000100004ac3b8 in operator() (__closure=0x7feffe3c678, inf=0x1000116d280) at /home/simark/src/binutils-gdb/gdb/infrun.c:3527 #26 0x00000100004ac7cc in do_target_wait (wait_ptid=..., ecs=0x7feffe3c8c8, options=...) at /home/simark/src/binutils-gdb/gdb/infrun.c:3540 #27 0x00000100004ad8c4 in fetch_inferior_event () at /home/simark/src/binutils-gdb/gdb/infrun.c:3880 #28 0x0000010000485568 in inferior_event_handler (event_type=INF_REG_EVENT) at /home/simark/src/binutils-gdb/gdb/inf-loop.c:42 #29 0x000001000050d394 in handle_target_event (error=0, client_data=0x0) at /home/simark/src/binutils-gdb/gdb/linux-nat.c:4060 #30 0x0000010000ab5c8c in handle_file_event (file_ptr=0x10001207270, ready_mask=1) at /home/simark/src/binutils-gdb/gdbsupport/event-loop.cc:575 #31 0x0000010000ab6334 in gdb_wait_for_event (block=0) at /home/simark/src/binutils-gdb/gdbsupport/event-loop.cc:701 #32 0x0000010000ab487c in gdb_do_one_event () at /home/simark/src/binutils-gdb/gdbsupport/event-loop.cc:212 #33 0x0000010000542668 in start_event_loop () at /home/simark/src/binutils-gdb/gdb/main.c:348 #34 0x000001000054287c in captured_command_loop () at /home/simark/src/binutils-gdb/gdb/main.c:408 #35 0x0000010000544e84 in captured_main (data=0x7feffe3d188) at /home/simark/src/binutils-gdb/gdb/main.c:1242 #36 0x0000010000544f2c in gdb_main (args=0x7feffe3d188) at /home/simark/src/binutils-gdb/gdb/main.c:1257 #37 0x00000100000c1f14 in main (argc=4, argv=0x7feffe3d548) at /home/simark/src/binutils-gdb/gdb/gdb.c:32 There is a target_read_memory call in sparc_supply_rwindow, whose return value is not checked. That call fails, because inferior_ptid does not contain a valid ptid, and uninitialized buffer contents is used. Ultimately it results in a corrupt stop_pc. target_ops::fetch_registers can be (and should remain, in my opinion) independent of inferior_ptid, because the ptid of the thread from which to fetch registers can be obtained from the regcache. In other words, implementations of target_ops::fetch_registers should not rely on inferior_ptid having a sensible value on entry. The sparc64_linux_nat_target::fetch_registers case is special, because it calls a target method that is dependent on the inferior_ptid value (target_read_inferior, and ultimately target_ops::xfer_partial). So I would say it's the responsibility of sparc64_linux_nat_target::fetch_registers to set up inferior_ptid correctly prior to calling target_read_inferior. This patch makes sparc64_linux_nat_target::fetch_registers (and store_registers, since it works the same) temporarily set inferior_ptid. If we ever make target_ops::xfer_partial independent of inferior_ptid, setting inferior_ptid won't be necessary, we'll simply pass down the ptid as a parameter in some way. I chose to set/restore inferior_ptid in sparc_fetch_inferior_registers, because I am not convinced that doing so in an inner location (in sparc_supply_rwindow for instance) would always be correct. We have access to the ptid in sparc_supply_rwindow (from the regcache), so we _could_ set inferior_ptid there. However, I don't want to just set inferior_ptid, as that would make it not desync'ed with `current_thread ()` and `current_inferior ()`. It's preferable to use switch_to_thread instead, as that switches all the global "current" stuff in a coherent way. But doing so requires a `thread_info *`, and getting a `thread_info *` from a ptid requires a `process_stratum_target *`. We could use `current_inferior()->process_target()` in sparc_supply_rwindow for this (using target_read_memory uses the current inferior's target stack anyway). However, sparc_supply_rwindow is also used in the context of BSD uthreads, where a thread stratum target defines threads. I presume the ptid in the regcache would be the ptid of the uthread, defined by the thread stratum target (bsd_uthread_target). Using `current_inferior()->process_target()` would look up a ptid defined by the thread stratum target using the process stratum target. I don't think it would give good results. So I prefer playing it safe and looking up the thread earlier, in sparc_fetch_inferior_registers. I added some assertions (in sparc_supply_rwindow and others) to verify that the regcache's ptid matches inferior_ptid. That verifies that the caller has properly set the correct global context. This would have caught (though a failed assertion) the current problem. gdb/ChangeLog: PR gdb/27147 * sparc-nat.h (sparc_fetch_inferior_registers): Add process_stratum_target parameter, sparc_store_inferior_registers): update callers. * sparc-nat.c (sparc_fetch_inferior_registers, sparc_store_inferior_registers): Add process_stratum_target parameter. Switch current thread before calling sparc_supply_gregset / sparc_collect_rwindow. (sparc_store_inferior_registers): Likewise. * sparc-obsd-tdep.c (sparc32obsd_supply_uthread): Add assertion. (sparc32obsd_collect_uthread): Likewise. * sparc-tdep.c (sparc_supply_rwindow, sparc_collect_rwindow): Add assertion. * sparc64-obsd-tdep.c (sparc64obsd_collect_uthread, sparc64obsd_supply_uthread): Add assertion. Change-Id: I16c658cd70896cea604516714f7e2428fbaf4301
mbitsnbites
pushed a commit
that referenced
this issue
Mar 20, 2021
Running gdb-term.exp against gdbserver with "maint set target-non-stop on", runs into this: [infrun] fetch_inferior_event: exit [infrun] fetch_inferior_event: enter /home/pedro/gdb/binutils-gdb/src/gdb/thread.c:72: internal-error: thread_info* inferior_thread(): Assertion `current_thread_ != nullptr' failed. A problem internal to GDB has been detected, further debugging may prove unreliable. This is a bug, please report it. For instructions, see: <https://www.gnu.org/software/gdb/bugs/>. FAIL: gdb.base/gdb-sigterm.exp: expect eof #2 (GDB internal error) Resyncing due to internal error. ERROR: : spawn id exp9 not open while executing "expect { -i exp9 -timeout 10 -re "Quit this debugging session\\? \\(y or n\\) $" { send_gdb "n\n" answer incr count } -re "Create ..." ("uplevel" body line 1) invoked from within "uplevel $body" NONE : spawn id exp9 not open ERROR: Could not resync from internal error (timeout) gdb.base/gdb-sigterm.exp: expect eof #2: stepped 0 times UNRESOLVED: gdb.base/gdb-sigterm.exp: 50 SIGTERM passes The assertion fails here: ... #5 0x000055af4b4a7164 in internal_error (file=0x55af4b5e5de8 "/home/pedro/gdb/binutils-gdb/src/gdb/thread.c", line=72, fmt=0x55af4b5e5ce9 "%s: Assertion `%s' failed.") at /home/pedro/gdb/binutils-gdb/src/gdbsupport/errors.cc:55 #6 0x000055af4b25fc43 in inferior_thread () at /home/pedro/gdb/binutils-gdb/src/gdb/thread.c:72 #7 0x000055af4b26177e in any_thread_of_inferior (inf=0x55af4cf874f0) at /home/pedro/gdb/binutils-gdb/src/gdb/thread.c:638 #8 0x000055af4b26eec8 in kill_or_detach (inf=0x55af4cf874f0, from_tty=0) at /home/pedro/gdb/binutils-gdb/src/gdb/top.c:1665 #9 0x000055af4b26f37f in quit_force (exit_arg=0x0, from_tty=0) at /home/pedro/gdb/binutils-gdb/src/gdb/top.c:1767 #10 0x000055af4b2f72a7 in quit () at /home/pedro/gdb/binutils-gdb/src/gdb/utils.c:633 #11 0x000055af4b2f730b in maybe_quit () at /home/pedro/gdb/binutils-gdb/src/gdb/utils.c:657 #12 0x000055af4b1adb74 in ser_base_wait_for (scb=0x55af4d02e460, timeout=0) at /home/pedro/gdb/binutils-gdb/src/gdb/ser-base.c:236 #13 0x000055af4b1adf0f in do_ser_base_readchar (scb=0x55af4d02e460, timeout=0) at /home/pedro/gdb/binutils-gdb/src/gdb/ser-base.c:365 #14 0x000055af4b1ae06d in generic_readchar (scb=0x55af4d02e460, timeout=0, do_readchar=0x55af4b1adeb1 <do_ser_base_readchar(serial*, int)>) at /home/pedro/gdb/binutils-gdb/src/gdb/ser-base.c:444 ... The bug is that any_thread_of_inferior incorrectly assumes that there's always a selected thread. This fixes it. gdb/ChangeLog: * thread.c (any_thread_of_inferior): Check if there's a selected thread before calling inferior_thread(). Change-Id: Ica4b9ec746121a7a7c22bef09baea72103b3853d
mbitsnbites
pushed a commit
that referenced
this issue
May 2, 2021
Running gdb.server/stop-reply-no-thread-multi.exp with "maint set target-non-stop on" occasionally hit an internal error like this: ... continue Continuing. warning: multi-threaded target stopped without sending a thread-id, using first non-exited thread /home/pedro/gdb/binutils-gdb/src/gdb/inferior.c:291: internal-error: inferior* find_inferior_pid(process_stratum_target*, int): Assertion `pid != 0' failed. A problem internal to GDB has been detected, further debugging may prove unreliable. This is a bug, please report it. FAIL: gdb.server/stop-reply-no-thread-multi.exp: to_disable=Tthread: continue until exit (GDB internal error) The backtrace looks like this: ... #5 0x0000560357b0879c in internal_error (file=0x560357be6c18 "/home/pedro/gdb/binutils-gdb/src/gdb/inferior.c", line=291, fmt=0x560357be6b21 "%s: Assertion `%s' failed.") at /home/pedro/gdb/binutils-gdb/src/gdbsupport/errors.cc:55 #6 0x000056035762061b in find_inferior_pid (targ=0x5603596e9560, pid=0) at /home/pedro/gdb/binutils-gdb/src/gdb/inferior.c:291 #7 0x00005603576206e6 in find_inferior_ptid (targ=0x5603596e9560, ptid=...) at /home/pedro/gdb/binutils-gdb/src/gdb/inferior.c:305 #8 0x00005603577d43ed in remote_target::check_pending_events_prevent_wildcard_vcont (this=0x5603596e9560, may_global_wildcard=0x7fff84fb05f0) at /home/pedro/gdb/binutils-gdb/src/gdb/remote.c:7215 #9 0x00005603577d2a9c in remote_target::commit_resumed (this=0x5603596e9560) at /home/pedro/gdb/binutils-gdb/src/gdb/remote.c:6680 ... pid is 0 in this case because the queued event is a process exit event with no pid associated: (top-gdb) p event->ws During symbol reading: .debug_line address at offset 0x563c9a is 0 [in module /home/pedro/gdb/binutils-gdb/build/gdb/gdb] $1 = {kind = TARGET_WAITKIND_EXITED, value = {integer = 0, sig = GDB_SIGNAL_0, related_pid = {m_pid = 0, m_lwp = 0, m_tid = 0}, execd_pathname = 0x0, syscall_number = 0}} (top-gdb) This fixes it, and adds a "maint set target-non-stop on/off" axis to the testcase. gdb/ChangeLog: * remote.c (remote_target::check_pending_events_prevent_wildcard_vcont): Check whether the event's ptid is not null_ptid before looking up the corresponding inferior. gdb/testsuite/ChangeLog: * gdb.server/stop-reply-no-thread-multi.exp (run_test): Add "target_non_stop" parameter and use it. (top level): Add "maint set target-non-stop on/off" testing axis. Change-Id: Ia30cf275305ee4dcbbd33f731534cd71d1550eaa
mbitsnbites
pushed a commit
that referenced
this issue
May 2, 2021
When testing with "maint set target-non-stop on", gdb.server/bkpt-other-inferior.exp sometimes fails like so: (gdb) inferior 2 [Switching to inferior 2 [process 368191] (<noexec>)] [Switching to thread 2.1 (Thread 368191.368191)] [remote] Sending packet: $m7ffff7fd0100,1#5b [remote] Packet received: 48 [remote] Sending packet: $m7ffff7fd0100,1#5b [remote] Packet received: 48 [remote] Sending packet: $m7ffff7fd0100,9#63 [remote] Packet received: 4889e7e8e80c000049 #0 0x00007ffff7fd0100 in ?? () (gdb) PASS: gdb.server/bkpt-other-inferior.exp: inf 2: switch to inferior break -q main Breakpoint 2 at 0x1138: file /home/pedro/gdb/binutils-gdb/src/gdb/testsuite/gdb.server/server.c, line 21. (gdb) PASS: gdb.server/bkpt-other-inferior.exp: inf 2: set breakpoint delete breakpoints Delete all breakpoints? (y or n) y (gdb) [remote] wait: enter [remote] wait: exit FAIL: gdb.server/bkpt-other-inferior.exp: inf 2: delete all breakpoints in delete_breakpoints (timeout) ERROR: breakpoints not deleted Remote debugging from host ::1, port 55876 monitor exit The problem is here: (gdb) [remote] wait: enter The testcase isn't expecting any output after the prompt. Why is that "[remote] wait" output? What happens is that "delete breakpoints" queries the user, and `query` disables/reenables target async, which results in the remote target's async event handler ending up marked: (top-gdb) bt #0 mark_async_event_handler (async_handler_ptr=0x556bffffffff) at ../../src/gdb/async-event.c:295 #1 0x0000556bf71b711f in infrun_async (enable=1) at ../../src/gdb/infrun.c:119 #2 0x0000556bf7471387 in target_async (enable=1) at ../../src/gdb/target.c:3684 #3 0x0000556bf748a0bd in gdb_readline_wrapper_cleanup::~gdb_readline_wrapper_cleanup (this=0x7ffe3cf30eb0, __in_chrg=<optimized out>) at ../../src/gdb/top.c:1074 #4 0x0000556bf74874e2 in gdb_readline_wrapper (prompt=0x556bfa17da60 "Delete all breakpoints? (y or n) ") at ../../src/gdb/top.c:1096 #5 0x0000556bf75111c5 in defaulted_query(const char *, char, typedef __va_list_tag __va_list_tag *) (ctlstr=0x556bf7717f34 "Delete all breakpoints? ", defchar=0 '\000', args=0x7ffe3cf31020) at ../../src/gdb/utils.c:893 #6 0x0000556bf751166f in query (ctlstr=0x556bf7717f34 "Delete all breakpoints? ") at ../../src/gdb/utils.c:985 #7 0x0000556bf6f11404 in delete_command (arg=0x0, from_tty=1) at ../../src/gdb/breakpoint.c:13500 ... ... which then later results in a target_wait call: (top-gdb) bt #0 remote_target::wait_ns (this=0x7ffe3cf30f80, ptid=..., status=0xde530314f0802800, options=...) at ../../src/gdb/remote.c:7937 #1 0x0000556bf7369dcb in remote_target::wait (this=0x556bfa0b2180, ptid=..., status=0x7ffe3cf31568, options=...) at ../../src/gdb/remote.c:8173 #2 0x0000556bf745e527 in target_wait (ptid=..., status=0x7ffe3cf31568, options=...) at ../../src/gdb/target.c:2000 #3 0x0000556bf71be686 in do_target_wait_1 (inf=0x556bfa1573d0, ptid=..., status=0x7ffe3cf31568, options=...) at ../../src/gdb/infrun.c:3463 #4 0x0000556bf71be88b in <lambda(inferior*)>::operator()(inferior *) const (__closure=0x7ffe3cf31320, inf=0x556bfa1573d0) at ../../src/gdb/infrun.c:3526 #5 0x0000556bf71bebcd in do_target_wait (wait_ptid=..., ecs=0x7ffe3cf31540, options=...) at ../../src/gdb/infrun.c:3539 #6 0x0000556bf71bf97b in fetch_inferior_event () at ../../src/gdb/infrun.c:3879 #7 0x0000556bf71a27f8 in inferior_event_handler (event_type=INF_REG_EVENT) at ../../src/gdb/inf-loop.c:42 #8 0x0000556bf71cc8b7 in infrun_async_inferior_event_handler (data=0x0) at ../../src/gdb/infrun.c:9220 #9 0x0000556bf6ecb80f in check_async_event_handlers () at ../../src/gdb/async-event.c:327 #10 0x0000556bf76b011a in gdb_do_one_event () at ../../src/gdbsupport/event-loop.cc:216 ... ... which returns TARGET_WAITKIND_IGNORE. Fix this by only enabling remote output around setting the breakpoint. gdb/testsuite/ChangeLog: * gdb.server/bkpt-other-inferior.exp: Only enable remote output around setting the breakpoint. Change-Id: I2fd152fd9c46b1c5e7fa678cc4d4054dac0b2bd4
mbitsnbites
pushed a commit
that referenced
this issue
Jun 26, 2021
When loading the debug info package libLLVM.so.10-10.0.1-lp152.30.4.x86_64.debug from openSUSE Leap 15.2, we run into a dwarf error: ... $ gdb -q -batch libLLVM.so.10-10.0.1-lp152.30.4.x86_64.debug Dwarf Error: Cannot not find DIE at 0x18a936e7 \ [from module libLLVM.so.10-10.0.1-lp152.30.4.x86_64.debug] ... The DIE @ 0x18a936e7 does in fact exist, and is part of a CU @ 0x18a23e52. No error message is printed when using -readnow. What happens is the following: - a dwarf2_per_cu_data P is created for the CU. - a dwarf2_cu A is created for the same CU. - another dwarf2_cu B is created for the same CU. - the dwarf2_cu B is set in per_objfile->m_dwarf2_cus, such that per_objfile->get_cu (P) returns B. - P->load_all_dies is set to 1. - all dies are read into the A->partial_dies htab - dwarf2_cu A is destroyed. - we try to find the partial_die for the DIE @ 0x18a936e7 in B->partial_dies. We can't find it, but do not try to load all dies, because P->load_all_dies is already set to 1. - an error message is generated. The question is why we're creating dwarf2_cu A and B for the same CU. The dwarf2_cu A is created here: ... (gdb) bt #0 dwarf2_cu::dwarf2_cu (this=0x79a9660, per_cu=0x23c0b30, per_objfile=0x1ad01b0) at dwarf2/cu.c:38 #1 0x0000000000675799 in cutu_reader::cutu_reader (this=0x7fffffffd040, this_cu=0x23c0b30, per_objfile=0x1ad01b0, abbrev_table=0x0, existing_cu=0x0, skip_partial=false) at dwarf2/read.c:6487 #2 0x0000000000676eb3 in process_psymtab_comp_unit (this_cu=0x23c0b30, per_objfile=0x1ad01b0, want_partial_unit=false, pretend_language=language_minimal) at dwarf2/read.c:7028 ... And the dwarf2_cu B is created here: ... (gdb) bt #0 dwarf2_cu::dwarf2_cu (this=0x885e8c0, per_cu=0x23c0b30, per_objfile=0x1ad01b0) at dwarf2/cu.c:38 #1 0x0000000000675799 in cutu_reader::cutu_reader (this=0x7fffffffcc50, this_cu=0x23c0b30, per_objfile=0x1ad01b0, abbrev_table=0x0, existing_cu=0x0, skip_partial=false) at dwarf2/read.c:6487 #2 0x0000000000678118 in load_partial_comp_unit (this_cu=0x23c0b30, per_objfile=0x1ad01b0, existing_cu=0x0) at dwarf2/read.c:7436 #3 0x000000000069721d in find_partial_die (sect_off=(unknown: 0x18a55054), offset_in_dwz=0, cu=0x0) at dwarf2/read.c:19391 #4 0x000000000069755b in partial_die_info::fixup (this=0x9096900, cu=0xa6a85f0) at dwarf2/read.c:19512 #5 0x0000000000697586 in partial_die_info::fixup (this=0x8629bb0, cu=0xa6a85f0) at dwarf2/read.c:19516 #6 0x00000000006787b1 in scan_partial_symbols (first_die=0x8629b40, lowpc=0x7fffffffcf58, highpc=0x7fffffffcf50, set_addrmap=0, cu=0x79a9660) at dwarf2/read.c:7563 #7 0x0000000000678878 in scan_partial_symbols (first_die=0x796ebf0, lowpc=0x7fffffffcf58, highpc=0x7fffffffcf50, set_addrmap=0, cu=0x79a9660) at dwarf2/read.c:7580 #8 0x0000000000676b82 in process_psymtab_comp_unit_reader (reader=0x7fffffffd040, info_ptr=0x7fffc1b3f29b, comp_unit_die=0x6ea90f0, pretend_language=language_minimal) at dwarf2/read.c:6954 #9 0x0000000000676ffd in process_psymtab_comp_unit (this_cu=0x23c0b30, per_objfile=0x1ad01b0, want_partial_unit=false, pretend_language=language_minimal) at dwarf2/read.c:7057 ... So in frame #9, a cutu_reader is created with dwarf2_cu A. Then a fixup takes us to the following CU @ 0x18aa33d6, in frame #5. And a similar fixup in frame #4 takes us back to CU @ 0x18a23e52. At that point, there's no information available that we're already trying to read that CU, and we end up creating another cutu_reader with dwarf2_cu B. It seems that there are two related problems: - creating two dwarf2_cu's is not optimal - the unoptimal case is not handled correctly This patch addresses the last problem, by moving the load_all_dies flag from dwarf2_per_cu_data to dwarf2_cu, such that it is paired with the partial_dies field, which ensures that the two can be kept in sync. Tested on x86_64-linux. gdb/ChangeLog: 2021-05-27 Tom de Vries <[email protected]> PR symtab/27898 * dwarf2/cu.c (dwarf2_cu::dwarf2_cu): Add load_all_dies init. * dwarf2/cu.h (dwarf2_cu): Add load_all_dies field. * dwarf2/read.c (load_partial_dies, find_partial_die): Update. * dwarf2/read.h (dwarf2_per_cu_data::dwarf2_per_cu_data): Remove load_all_dies init. (dwarf2_per_cu_data): Remove load_all_dies field.
mbitsnbites
pushed a commit
that referenced
this issue
Jun 26, 2021
… when attaching / handling a fork child When trying to attach to a pthread process on a Linux system with glibc 2.33, we get: $ ./gdb -q -nx --data-directory=data-directory -p 1472010 Attaching to process 1472010 [New LWP 1472013] [New LWP 1472014] [New LWP 1472015] Error while reading shared library symbols for /usr/lib/libpthread.so.0: Cannot find user-level thread for LWP 1472015: generic error 0x00007ffff6d3637f in poll () from /usr/lib/libc.so.6 (gdb) When attaching to a process (or handling a fork child, an operation very similar to attaching), GDB reads the shared library list from the process. For each shared library (if "set auto-solib-add" is on), it reads its symbols and calls the "new_objfile" observable. The libthread-db code monitors this observable, and if it sees an objfile named somewhat like "libpthread.so" go by, it tries to load libthread_db.so in the GDB process itself. libthread_db knows how to navigate libpthread's data structures to get information about the existing threads. To locate these data structures, libthread_db calls ps_pglobal_lookup (implemented in proc-service.c), passing in a symbol name and expecting an address in return. Before glibc 2.33, libthread_db always asked for symbols found in libpthread. There was no ordering problem: since we were always trying to load libthread_db in reaction to processing libpthread (and reading in its symbols) and libthread_db only asked symbols from libpthread, the requested symbols could always be found. Starting with glibc 2.33, libthread_db now asks for a symbol name that can be found in /lib/ld-linux-x86-64.so.2 (_rtld_global). And the ordering in which GDB reads the shared libraries from the inferior when attaching is unfortunate, in that libpthread is processed before ld-linux. So when loading libthread_db in reaction to processing libpthread, and libthread_db requests the symbol that is from ld-linux, GDB is not yet able to supply it. That problematic symbol lookup happens in the thread_from_lwp function, when we call td_ta_map_lwp2thr_p, and an exception is thrown at this point: #0 0x00007ffff6681012 in __cxxabiv1::__cxa_throw (obj=0x60e000006100, tinfo=0x555560033b50 <typeinfo for gdb_exception_error>, dest=0x55555d9404bc <gdb_exception_error::~gdb_exception_error()>) at /build/gcc/src/gcc/libstdc++-v3/libsupc++/eh_throw.cc:78 #1 0x000055555e5d3734 in throw_it(return_reason, errors, const char *, typedef __va_list_tag __va_list_tag *) (reason=RETURN_ERROR, error=GENERIC_ERROR, fmt=0x55555f0c5360 "Cannot find user-level thread for LWP %ld: %s", ap=0x7fffffffaae0) at /home/simark/src/binutils-gdb/gdbsupport/common-exceptions.cc:200 #2 0x000055555e5d37d4 in throw_verror (error=GENERIC_ERROR, fmt=0x55555f0c5360 "Cannot find user-level thread for LWP %ld: %s", ap=0x7fffffffaae0) at /home/simark/src/binutils-gdb/gdbsupport/common-exceptions.cc:208 #3 0x000055555e0b0ed2 in verror (string=0x55555f0c5360 "Cannot find user-level thread for LWP %ld: %s", args=0x7fffffffaae0) at /home/simark/src/binutils-gdb/gdb/utils.c:171 #4 0x000055555e5e898a in error (fmt=0x55555f0c5360 "Cannot find user-level thread for LWP %ld: %s") at /home/simark/src/binutils-gdb/gdbsupport/errors.cc:43 #5 0x000055555d06b4bc in thread_from_lwp (stopped=0x617000035d80, ptid=...) at /home/simark/src/binutils-gdb/gdb/linux-thread-db.c:418 #6 0x000055555d07040d in try_thread_db_load_1 (info=0x60c000011140) at /home/simark/src/binutils-gdb/gdb/linux-thread-db.c:912 #7 0x000055555d071103 in try_thread_db_load (library=0x55555f0c62a0 "libthread_db.so.1", check_auto_load_safe=false) at /home/simark/src/binutils-gdb/gdb/linux-thread-db.c:1014 #8 0x000055555d072168 in try_thread_db_load_from_sdir () at /home/simark/src/binutils-gdb/gdb/linux-thread-db.c:1091 #9 0x000055555d072d1c in thread_db_load_search () at /home/simark/src/binutils-gdb/gdb/linux-thread-db.c:1146 #10 0x000055555d07365c in thread_db_load () at /home/simark/src/binutils-gdb/gdb/linux-thread-db.c:1203 #11 0x000055555d07373e in check_for_thread_db () at /home/simark/src/binutils-gdb/gdb/linux-thread-db.c:1246 #12 0x000055555d0738ab in thread_db_new_objfile (objfile=0x61300000c0c0) at /home/simark/src/binutils-gdb/gdb/linux-thread-db.c:1275 #13 0x000055555bd10740 in std::__invoke_impl<void, void (*&)(objfile*), objfile*> (__f=@0x616000068d88: 0x55555d073745 <thread_db_new_objfile(objfile*)>) at /usr/include/c++/10.2.0/bits/invoke.h:60 #14 0x000055555bd02096 in std::__invoke_r<void, void (*&)(objfile*), objfile*> (__fn=@0x616000068d88: 0x55555d073745 <thread_db_new_objfile(objfile*)>) at /usr/include/c++/10.2.0/bits/invoke.h:153 #15 0x000055555bce0392 in std::_Function_handler<void (objfile*), void (*)(objfile*)>::_M_invoke(std::_Any_data const&, objfile*&&) (__functor=..., __args#0=@0x7fffffffb4a0: 0x61300000c0c0) at /usr/include/c++/10.2.0/bits/std_function.h:291 #16 0x000055555d3595c0 in std::function<void (objfile*)>::operator()(objfile*) const (this=0x616000068d88, __args#0=0x61300000c0c0) at /usr/include/c++/10.2.0/bits/std_function.h:622 #17 0x000055555d356b7f in gdb::observers::observable<objfile*>::notify (this=0x555566727020 <gdb::observers::new_objfile>, args#0=0x61300000c0c0) at /home/simark/src/binutils-gdb/gdb/../gdbsupport/observable.h:106 #18 0x000055555da3f228 in symbol_file_add_with_addrs (abfd=0x61200001ccc0, name=0x6190000d9090 "/usr/lib/libpthread.so.0", add_flags=..., addrs=0x7fffffffbc10, flags=..., parent=0x0) at /home/simark/src/binutils-gdb/gdb/symfile.c:1131 #19 0x000055555da3f763 in symbol_file_add_from_bfd (abfd=0x61200001ccc0, name=0x6190000d9090 "/usr/lib/libpthread.so.0", add_flags=<error reading variable: Cannot access memory at address 0xffffffffffffffb0>, addrs=0x7fffffffbc10, flags=<error reading variable: Cannot access memory at address 0xffffffffffffffc0>, parent=0x0) at /home/simark/src/binutils-gdb/gdb/symfile.c:1167 #20 0x000055555d95f9fa in solib_read_symbols (so=0x6190000d8e80, flags=...) at /home/simark/src/binutils-gdb/gdb/solib.c:681 #21 0x000055555d96233d in solib_add (pattern=0x0, from_tty=0, readsyms=1) at /home/simark/src/binutils-gdb/gdb/solib.c:987 #22 0x000055555d93646e in enable_break (info=0x608000008f20, from_tty=0) at /home/simark/src/binutils-gdb/gdb/solib-svr4.c:2238 #23 0x000055555d93cfc0 in svr4_solib_create_inferior_hook (from_tty=0) at /home/simark/src/binutils-gdb/gdb/solib-svr4.c:3049 #24 0x000055555d96610d in solib_create_inferior_hook (from_tty=0) at /home/simark/src/binutils-gdb/gdb/solib.c:1195 #25 0x000055555cdee318 in post_create_inferior (from_tty=0) at /home/simark/src/binutils-gdb/gdb/infcmd.c:318 #26 0x000055555ce00e6e in setup_inferior (from_tty=0) at /home/simark/src/binutils-gdb/gdb/infcmd.c:2439 #27 0x000055555ce59c34 in handle_one (event=...) at /home/simark/src/binutils-gdb/gdb/infrun.c:4887 #28 0x000055555ce5cd00 in stop_all_threads () at /home/simark/src/binutils-gdb/gdb/infrun.c:5064 #29 0x000055555ce7f0da in stop_waiting (ecs=0x7fffffffd170) at /home/simark/src/binutils-gdb/gdb/infrun.c:8006 #30 0x000055555ce67f5c in handle_signal_stop (ecs=0x7fffffffd170) at /home/simark/src/binutils-gdb/gdb/infrun.c:6062 #31 0x000055555ce63653 in handle_inferior_event (ecs=0x7fffffffd170) at /home/simark/src/binutils-gdb/gdb/infrun.c:5727 #32 0x000055555ce4f297 in fetch_inferior_event () at /home/simark/src/binutils-gdb/gdb/infrun.c:4105 #33 0x000055555cdbe3bf in inferior_event_handler (event_type=INF_REG_EVENT) at /home/simark/src/binutils-gdb/gdb/inf-loop.c:42 #34 0x000055555d018047 in handle_target_event (error=0, client_data=0x0) at /home/simark/src/binutils-gdb/gdb/linux-nat.c:4060 #35 0x000055555e5ea77e in handle_file_event (file_ptr=0x60600008b1c0, ready_mask=1) at /home/simark/src/binutils-gdb/gdbsupport/event-loop.cc:575 #36 0x000055555e5eb09c in gdb_wait_for_event (block=0) at /home/simark/src/binutils-gdb/gdbsupport/event-loop.cc:701 #37 0x000055555e5e8d19 in gdb_do_one_event () at /home/simark/src/binutils-gdb/gdbsupport/event-loop.cc:212 #38 0x000055555dd6e0d4 in wait_sync_command_done () at /home/simark/src/binutils-gdb/gdb/top.c:528 #39 0x000055555dd6e372 in maybe_wait_sync_command_done (was_sync=0) at /home/simark/src/binutils-gdb/gdb/top.c:545 #40 0x000055555d0ec7c8 in catch_command_errors (command=0x55555ce01bb8 <attach_command(char const*, int)>, arg=0x7fffffffe28d "1472010", from_tty=1, do_bp_actions=false) at /home/simark/src/binutils-gdb/gdb/main.c:452 #41 0x000055555d0f03ad in captured_main_1 (context=0x7fffffffdd10) at /home/simark/src/binutils-gdb/gdb/main.c:1149 #42 0x000055555d0f1239 in captured_main (data=0x7fffffffdd10) at /home/simark/src/binutils-gdb/gdb/main.c:1232 #43 0x000055555d0f1315 in gdb_main (args=0x7fffffffdd10) at /home/simark/src/binutils-gdb/gdb/main.c:1257 #44 0x000055555bb70cf9 in main (argc=7, argv=0x7fffffffde88) at /home/simark/src/binutils-gdb/gdb/gdb.c:32 The exception is caught here: #0 __cxxabiv1::__cxa_begin_catch (exc_obj_in=0x60e0000060e0) at /build/gcc/src/gcc/libstdc++-v3/libsupc++/eh_catch.cc:84 #1 0x000055555d95fded in solib_read_symbols (so=0x6190000d8e80, flags=...) at /home/simark/src/binutils-gdb/gdb/solib.c:689 #2 0x000055555d96233d in solib_add (pattern=0x0, from_tty=0, readsyms=1) at /home/simark/src/binutils-gdb/gdb/solib.c:987 #3 0x000055555d93646e in enable_break (info=0x608000008f20, from_tty=0) at /home/simark/src/binutils-gdb/gdb/solib-svr4.c:2238 #4 0x000055555d93cfc0 in svr4_solib_create_inferior_hook (from_tty=0) at /home/simark/src/binutils-gdb/gdb/solib-svr4.c:3049 #5 0x000055555d96610d in solib_create_inferior_hook (from_tty=0) at /home/simark/src/binutils-gdb/gdb/solib.c:1195 #6 0x000055555cdee318 in post_create_inferior (from_tty=0) at /home/simark/src/binutils-gdb/gdb/infcmd.c:318 #7 0x000055555ce00e6e in setup_inferior (from_tty=0) at /home/simark/src/binutils-gdb/gdb/infcmd.c:2439 #8 0x000055555ce59c34 in handle_one (event=...) at /home/simark/src/binutils-gdb/gdb/infrun.c:4887 #9 0x000055555ce5cd00 in stop_all_threads () at /home/simark/src/binutils-gdb/gdb/infrun.c:5064 #10 0x000055555ce7f0da in stop_waiting (ecs=0x7fffffffd170) at /home/simark/src/binutils-gdb/gdb/infrun.c:8006 #11 0x000055555ce67f5c in handle_signal_stop (ecs=0x7fffffffd170) at /home/simark/src/binutils-gdb/gdb/infrun.c:6062 #12 0x000055555ce63653 in handle_inferior_event (ecs=0x7fffffffd170) at /home/simark/src/binutils-gdb/gdb/infrun.c:5727 #13 0x000055555ce4f297 in fetch_inferior_event () at /home/simark/src/binutils-gdb/gdb/infrun.c:4105 #14 0x000055555cdbe3bf in inferior_event_handler (event_type=INF_REG_EVENT) at /home/simark/src/binutils-gdb/gdb/inf-loop.c:42 #15 0x000055555d018047 in handle_target_event (error=0, client_data=0x0) at /home/simark/src/binutils-gdb/gdb/linux-nat.c:4060 #16 0x000055555e5ea77e in handle_file_event (file_ptr=0x60600008b1c0, ready_mask=1) at /home/simark/src/binutils-gdb/gdbsupport/event-loop.cc:575 #17 0x000055555e5eb09c in gdb_wait_for_event (block=0) at /home/simark/src/binutils-gdb/gdbsupport/event-loop.cc:701 #18 0x000055555e5e8d19 in gdb_do_one_event () at /home/simark/src/binutils-gdb/gdbsupport/event-loop.cc:212 #19 0x000055555dd6e0d4 in wait_sync_command_done () at /home/simark/src/binutils-gdb/gdb/top.c:528 #20 0x000055555dd6e372 in maybe_wait_sync_command_done (was_sync=0) at /home/simark/src/binutils-gdb/gdb/top.c:545 #21 0x000055555d0ec7c8 in catch_command_errors (command=0x55555ce01bb8 <attach_command(char const*, int)>, arg=0x7fffffffe28d "1472010", from_tty=1, do_bp_actions=false) at /home/simark/src/binutils-gdb/gdb/main.c:452 #22 0x000055555d0f03ad in captured_main_1 (context=0x7fffffffdd10) at /home/simark/src/binutils-gdb/gdb/main.c:1149 #23 0x000055555d0f1239 in captured_main (data=0x7fffffffdd10) at /home/simark/src/binutils-gdb/gdb/main.c:1232 #24 0x000055555d0f1315 in gdb_main (args=0x7fffffffdd10) at /home/simark/src/binutils-gdb/gdb/main.c:1257 #25 0x000055555bb70cf9 in main (argc=7, argv=0x7fffffffde88) at /home/simark/src/binutils-gdb/gdb/gdb.c:32 Catching the exception at this point means that the thread_db_info object for this inferior will be left in place, despite the failure to load libthread_db. This means that there won't be further attempts at loading libthread_db, because thread_db_load will think that libthread_db is already loaded for this inferior and will always exit early. To fix this, add a try/catch around calling try_thread_db_load_1 in try_thread_db_load, such that if some exception is thrown while trying to load libthread_db, we reset / delete the thread_db_info for that inferior. That alone makes attach work fine again, because check_for_thread_db is called again in the thread_db_inferior_created observer (that happens after we learned about all shared libraries and their symbols), and libthread_db is successfully loaded then. When attaching, I think that the inferior_created observer is a good place to try to load libthread_db: it is called once everything has stabilized, when we learned about all shared libraries. The only problem then is that when we first try (and fail) to load libthread_db, in reaction to learning about libpthread, we show this warning: warning: Unable to find libthread_db matching inferior's thread library, thread debugging will not be available. This is misleading, because we do succeed in loading it later. So when attaching, I think we shouldn't try to load libthread_db in reaction to the new_objfile events, we should wait until we have learned about all shared libraries (using the inferior_created observable). To do so, add an `in_initial_library_scan` flag to struct inferior. This flag is used to postpone loading libthread_db if we are attaching or handling a fork child. When debugging remotely with GDBserver, the same problem happens, except that the qSymbol mechanism (allowing the remote side to ask GDB for symbols values) is involved. The fix there is the same idea, we make GDB wait until all shared libraries and their symbols are known before sending out a qSymbol packet. This way, we never present the remote side a state where libpthread.so's symbols are known but ld-linux's symbols aren't. gdb/ChangeLog: * inferior.h (class inferior) <in_initial_library_scan>: New. * infcmd.c (post_create_inferior): Set in_initial_library_scan. * infrun.c (follow_fork_inferior): Likewise. * linux-thread-db.c (try_thread_db_load): Catch exception thrown by try_thread_db_load_1 (thread_db_load): Return early if in_initial_library_scan is set. * remote.c (remote_new_objfile): Return early if in_initial_library_scan is set. Change-Id: I7a279836cfbb2b362b4fde11b196b4aab82f5efb
mbitsnbites
pushed a commit
that referenced
this issue
Jun 30, 2021
When loading a mach-o (macOS) executable and trying to set a breakpoint, a GDB built with ASan or -D_GLIBCXX_DEBUG will crash with an out-of-bound vector access. This can be reproduced on Linux using the repro files in bug 28017 [1]: $ ./gdb -nx --data-directory=data-directory -q repro/test -ex "b main" -batch /usr/include/c++/11.1.0/debug/vector:445: In function: std::__debug::vector<_Tp, _Allocator>::const_reference std::__debug::vector<_Tp, _Allocator>::operator[](std::__debug::vector<_Tp, _Allocator>::size_type) const [with _Tp = long unsigned int; _Allocator = std::allocator<long unsigned int>; std::__debug::vector<_Tp, _Allocator>::const_reference = const long unsigned int&; std::__debug::vector<_Tp, _Allocator>::size_type = long unsigned int] Error: attempt to subscript container with out-of-bounds index 13, but container only holds 13 elements. Objects involved in the operation: sequence "this" @ 0x0x61300000a590 { type = std::__debug::vector<unsigned long, std::allocator<unsigned long> >; } The out-of-bound access happens here: #0 0x00007ffff6405d22 in raise () from /usr/lib/libc.so.6 #1 0x00007ffff63ef862 in abort () from /usr/lib/libc.so.6 #2 0x00007ffff664e21e in __gnu_debug::_Error_formatter::_M_error() const [clone .cold] from /usr/lib/libstdc++.so.6 #3 0x000055555699e5ff in std::__debug::vector<unsigned long, std::allocator<unsigned long> >::operator[] (this=0x61300000a590, __n=13) at /usr/include/c++/11.1.0/debug/vector:445 #4 0x0000555556a58c17 in objfile::section_offset (this=0x61300000a4c0, section=0x55555bbe4ac0 <_bfd_std_section>) at /home/simark/src/binutils-gdb/gdb/objfiles.h:644 #5 0x0000555556a58cac in obj_section::offset (this=0x62100016d2a8) at /home/simark/src/binutils-gdb/gdb/objfiles.h:838 #6 0x0000555556a58cfa in obj_section::addr (this=0x62100016d2a8) at /home/simark/src/binutils-gdb/gdb/objfiles.h:850 #7 0x000055555779f5f7 in sort_cmp (sect1=0x62100016d2a8, sect2=0x62100016d170) at /home/simark/src/binutils-gdb/gdb/objfiles.c:902 #8 0x00005555577aae35 in __gnu_cxx::__ops::_Iter_comp_iter<bool (*)(obj_section const*, obj_section const*)>::operator()<obj_section**, obj_section**> (this=0x7fffffffa9e0, __it1=0x60c000015970, __it2=0x60c000015940) at /usr/include/c++/11.1.0/bits/predefined_ops.h:158 #9 0x00005555577aa2b8 in std::__insertion_sort<obj_section**, __gnu_cxx::__ops::_Iter_comp_iter<bool (*)(obj_section const*, obj_section const*)> > (__first=0x60c000015940, __last=0x60c0000159c0, __comp=...) at /usr/include/c++/11.1.0/bits/stl_algo.h:1826 #10 0x00005555577a8e26 in std::__final_insertion_sort<obj_section**, __gnu_cxx::__ops::_Iter_comp_iter<bool (*)(obj_section const*, obj_section const*)> > (__first=0x60c000015940, __last=0x60c0000159c0, __comp=...) at /usr/include/c++/11.1.0/bits/stl_algo.h:1871 #11 0x00005555577a723c in std::__sort<obj_section**, __gnu_cxx::__ops::_Iter_comp_iter<bool (*)(obj_section const*, obj_section const*)> > (__first=0x60c000015940, __last=0x60c0000159c0, __comp=...) at /usr/include/c++/11.1.0/bits/stl_algo.h:1957 #12 0x00005555577a50f4 in std::sort<obj_section**, bool (*)(obj_section const*, obj_section const*)> (__first=0x60c000015940, __last=0x60c0000159c0, __comp=0x55555779f4e7 <sort_cmp(obj_section const*, obj_section const*)>) at /usr/include/c++/11.1.0/bits/stl_algo.h:4875 #13 0x00005555577a147e in update_section_map (pspace=0x61200001d2c0, pmap=0x6030000d40b0, pmap_size=0x6030000d40b8) at /home/simark/src/binutils-gdb/gdb/objfiles.c:1165 #14 0x00005555577a19a0 in find_pc_section (pc=0x100003fa0) at /home/simark/src/binutils-gdb/gdb/objfiles.c:1212 #15 0x00005555576dd39e in lookup_minimal_symbol_by_pc_section (pc_in=0x100003fa0, section=0x0, prefer=lookup_msym_prefer::TEXT, previous=0x0) at /home/simark/src/binutils-gdb/gdb/minsyms.c:750 #16 0x00005555576de552 in lookup_minimal_symbol_by_pc (pc=0x100003fa0) at /home/simark/src/binutils-gdb/gdb/minsyms.c:986 #17 0x0000555557d44b54 in find_pc_sect_line (pc=0x100003fa0, section=0x62100016d170, notcurrent=0) at /home/simark/src/binutils-gdb/gdb/symtab.c:3163 #18 0x0000555557d489fa in find_function_start_sal_1 (func_addr=0x100003fa0, section=0x62100016d170, funfirstline=true) at /home/simark/src/binutils-gdb/gdb/symtab.c:3650 #19 0x0000555557d49015 in find_function_start_sal (sym=0x621000191670, funfirstline=true) at /home/simark/src/binutils-gdb/gdb/symtab.c:3706 #20 0x0000555557485283 in symbol_to_sal (result=0x7fffffffbb30, funfirstline=1, sym=0x621000191670) at /home/simark/src/binutils-gdb/gdb/linespec.c:4460 #21 0x00005555574728c2 in convert_linespec_to_sals (state=0x7fffffffc390, ls=0x7fffffffc3e0) at /home/simark/src/binutils-gdb/gdb/linespec.c:2335 #22 0x0000555557475a8e in parse_linespec (parser=0x7fffffffc360, arg=0x60200007a550 "main", match_type=symbol_name_match_type::WILD) at /home/simark/src/binutils-gdb/gdb/linespec.c:2716 #23 0x0000555557479027 in event_location_to_sals (parser=0x7fffffffc360, location=0x606000097be0) at /home/simark/src/binutils-gdb/gdb/linespec.c:3173 #24 0x00005555574798f7 in decode_line_full (location=0x606000097be0, flags=1, search_pspace=0x0, default_symtab=0x0, default_line=0, canonical=0x7fffffffcca0, select_mode=0x0, filter=0x0) at /home/simark/src/binutils-gdb/gdb/linespec.c:3253 #25 0x0000555556b4949f in parse_breakpoint_sals (location=0x606000097be0, canonical=0x7fffffffcca0) at /home/simark/src/binutils-gdb/gdb/breakpoint.c:9134 #26 0x0000555556b6ce95 in create_sals_from_location_default (location=0x606000097be0, canonical=0x7fffffffcca0, type_wanted=bp_breakpoint) at /home/simark/src/binutils-gdb/gdb/breakpoint.c:13819 #27 0x0000555556b645a6 in bkpt_create_sals_from_location (location=0x606000097be0, canonical=0x7fffffffcca0, type_wanted=bp_breakpoint) at /home/simark/src/binutils-gdb/gdb/breakpoint.c:12631 #28 0x0000555556b4badf in create_breakpoint (gdbarch=0x621000152d10, location=0x606000097be0, cond_string=0x0, thread=0, extra_string=0x0, force_condition=false, parse_extra=1, tempflag=0, type_wanted=bp_breakpoint, ignore_count=0, pending_break_support=AUTO_BOOLEAN_AUTO, ops=0x55555bd728a0 <bkpt_breakpoint_ops>, from_tty=0, enabled=1, internal=0, flags=0) at /home/simark/src/binutils-gdb/gdb/breakpoint.c:9410 #29 0x0000555556b4d3b1 in break_command_1 (arg=0x7fffffffe291 "", flag=0, from_tty=0) at /home/simark/src/binutils-gdb/gdb/breakpoint.c:9590 #30 0x0000555556b4dc1b in break_command (arg=0x7fffffffe28d "main", from_tty=0) at /home/simark/src/binutils-gdb/gdb/breakpoint.c:9660 #31 0x0000555556d24ca9 in do_const_cfunc (c=0x61100003a240, args=0x7fffffffe28d "main", from_tty=0) at /home/simark/src/binutils-gdb/gdb/cli/cli-decode.c:102 #32 0x0000555556d2fcd3 in cmd_func (cmd=0x61100003a240, args=0x7fffffffe28d "main", from_tty=0) at /home/simark/src/binutils-gdb/gdb/cli/cli-decode.c:2160 #33 0x0000555557e84e93 in execute_command (p=0x7fffffffe290 "n", from_tty=0) at /home/simark/src/binutils-gdb/gdb/top.c:674 #34 0x00005555575a9933 in catch_command_errors (command=0x555557e84043 <execute_command(char const*, int)>, arg=0x7fffffffe28b "b main", from_tty=0, do_bp_actions=true) at /home/simark/src/binutils-gdb/gdb/main.c:523 #35 0x00005555575a9fdb in execute_cmdargs (cmdarg_vec=0x7fffffffd910, file_type=CMDARG_FILE, cmd_type=CMDARG_COMMAND, ret=0x7fffffffd5b0) at /home/simark/src/binutils-gdb/gdb/main.c:618 #36 0x00005555575ad48a in captured_main_1 (context=0x7fffffffdd00) at /home/simark/src/binutils-gdb/gdb/main.c:1322 #37 0x00005555575ada9c in captured_main (data=0x7fffffffdd00) at /home/simark/src/binutils-gdb/gdb/main.c:1343 #38 0x00005555575adb31 in gdb_main (args=0x7fffffffdd00) at /home/simark/src/binutils-gdb/gdb/main.c:1368 #39 0x000055555681e179 in main (argc=8, argv=0x7fffffffde78) at /home/simark/src/binutils-gdb/gdb/gdb.c:32 The section being dealt with at that moment is the special *COM* section: (top-gdb) p section.name $1 = 0x55555a1bbe60 "*COM*" (top-gdb) p section $2 = (bfd_section *) 0x55555bbe4ac0 <_bfd_std_section> I'm not too sure what this section is for, but this is one of four special BFD sections that GDB puts after the regular sections in the objfile::sections and objfile::section_offsets lists. You can check gdb_bfd_section_index to see how they are handled. gdb_bfd_count_sections returns "+ 4" to account for those sections. The problem is that macho_symfile_offsets uses bfd_count_sections instead of gdb_bfd_count_sections when allocating the objfile::section_offsets vector. The vector will therefore contain, say, 13 elements instead of 17. When trying to access the section offset of the *COM* section, the first after the regular sections, we access section_offsets[13], which is out of bounds. Fix that by using gdb_bfd_count_sections instead of bfd_count_sections. I'm fairly confident that this is correct, as this is what default_symfile_offsets does. With this patch, the command shown above terminates normally: $ ./gdb -nx --data-directory=data-directory -q repro/test -ex "b main" -batch Breakpoint 1 at 0x100003fad: file test.c, line 2. [1] https://sourceware.org/bugzilla/show_bug.cgi?id=28017 gdb/ChangeLog: PR gdb/28017 * machoread.c (macho_symfile_offsets): Use gdb_bfd_count_sections to allocate objfile::section_offsets. Change-Id: Ic3a56f46f7232e9f24581f8255fc1ab981935450
mbitsnbites
pushed a commit
that referenced
this issue
Jul 5, 2021
When loading a file using the file command on macOS, we get: $ ./gdb -nx --data-directory=data-directory -q -ex "file ./test" Reading symbols from ./test... Reading symbols from /Users/smarchi/build/binutils-gdb/gdb/test.dSYM/Contents/Resources/DWARF/test... /Users/smarchi/src/binutils-gdb/gdb/thread.c:72: internal-error: struct thread_info *inferior_thread(): Assertion `current_thread_ != nullptr' failed. A problem internal to GDB has been detected, further debugging may prove unreliable. Quit this debugging session? (y or n) The backtrace is: * frame #0: 0x0000000101fcb826 gdb`internal_error(file="/Users/smarchi/src/binutils-gdb/gdb/thread.c", line=72, fmt="%s: Assertion `%s' failed.") at errors.cc:52:3 frame #1: 0x00000001018a2584 gdb`inferior_thread() at thread.c:72:3 frame #2: 0x0000000101469c09 gdb`get_current_regcache() at regcache.c:421:31 frame #3: 0x00000001015f9812 gdb`darwin_solib_get_all_image_info_addr_at_init(info=0x0000603000006d00) at solib-darwin.c:464:34 frame #4: 0x00000001015f7a04 gdb`darwin_solib_create_inferior_hook(from_tty=1) at solib-darwin.c:515:5 frame #5: 0x000000010161205e gdb`solib_create_inferior_hook(from_tty=1) at solib.c:1200:3 frame #6: 0x00000001016d8f76 gdb`symbol_file_command(args="./test", from_tty=1) at symfile.c:1650:7 frame #7: 0x0000000100abab17 gdb`file_command(arg="./test", from_tty=1) at exec.c:555:3 frame #8: 0x00000001004dc799 gdb`do_const_cfunc(c=0x000061100000c340, args="./test", from_tty=1) at cli-decode.c:102:3 frame #9: 0x00000001004ea042 gdb`cmd_func(cmd=0x000061100000c340, args="./test", from_tty=1) at cli-decode.c:2160:7 frame #10: 0x00000001018d4f59 gdb`execute_command(p="t", from_tty=1) at top.c:674:2 frame #11: 0x0000000100eee430 gdb`catch_command_errors(command=(gdb`execute_command(char const*, int) at top.c:561), arg="file ./test", from_tty=1, do_bp_actions=true)(char const*, int), char const*, int, bool) at main.c:523:7 frame #12: 0x0000000100eee902 gdb`execute_cmdargs(cmdarg_vec=0x00007ffeefbfeba0 size=1, file_type=CMDARG_FILE, cmd_type=CMDARG_COMMAND, ret=0x00007ffeefbfec20) at main.c:618:9 frame #13: 0x0000000100eed3a4 gdb`captured_main_1(context=0x00007ffeefbff780) at main.c:1322:3 frame #14: 0x0000000100ee810d gdb`captured_main(data=0x00007ffeefbff780) at main.c:1343:3 frame #15: 0x0000000100ee8025 gdb`gdb_main(args=0x00007ffeefbff780) at main.c:1368:7 frame #16: 0x00000001000044f1 gdb`main(argc=6, argv=0x00007ffeefbff8a0) at gdb.c:32:10 frame #17: 0x00007fff20558f5d libdyld.dylib`start + 1 The solib_create_inferior_hook call in symbol_file_command was added by commit ea142fb ("Fix breakpoints on file reloads for PIE binaries"). It causes solib_create_inferior_hook to be called while the inferior is not running, which darwin_solib_create_inferior_hook does not expect. darwin_solib_get_all_image_info_addr_at_init, in particular, assumes that there is a current thread, as it tries to get the current thread's regcache. Fix it by adding a target_has_execution check and returning early. Note that there is a similar check in svr4_solib_create_inferior_hook. gdb/ChangeLog: * solib-darwin.c (darwin_solib_create_inferior_hook): Return early if no execution. Change-Id: Ia11dd983a1e29786e5ce663d0fcaa6846dc611bb
mbitsnbites
pushed a commit
that referenced
this issue
Jul 20, 2021
Commit 408f668 ("detach in all-stop with threads running") regressed "detach" with "target remote": (gdb) detach Detaching from program: target:/any/program, process 3671843 Detaching from process 3671843 Ending remote debugging. [Inferior 1 (process 3671843) detached] In main terminate called after throwing an instance of 'gdb_exception_error' Aborted (core dumped) Here's the exception above being thrown: (top-gdb) bt #0 throw_error (error=TARGET_CLOSE_ERROR, fmt=0x555556035588 "Remote connection closed") at src/gdbsupport/common-exceptions.cc:222 #1 0x0000555555bbaa46 in remote_target::readchar (this=0x555556a11040, timeout=10000) at src/gdb/remote.c:9440 #2 0x0000555555bbb9e5 in remote_target::getpkt_or_notif_sane_1 (this=0x555556a11040, buf=0x555556a11058, forever=0, expecting_notif=0, is_notif=0x0) at src/gdb/remote.c:9928 #3 0x0000555555bbbda9 in remote_target::getpkt_sane (this=0x555556a11040, buf=0x555556a11058, forever=0) at src/gdb/remote.c:10030 #4 0x0000555555bc0e75 in remote_target::remote_hostio_send_command (this=0x555556a11040, command_bytes=13, which_packet=14, remote_errno=0x7fffffffcfd0, attachment=0x0, attachment_len=0x0) at src/gdb/remote.c:12137 #5 0x0000555555bc1b6c in remote_target::remote_hostio_close (this=0x555556a11040, fd=8, remote_errno=0x7fffffffcfd0) at src/gdb/remote.c:12455 #6 0x0000555555bc1bb4 in remote_target::fileio_close (During symbol reading: .debug_line address at offset 0x64f417 is 0 [in module build/gdb/gdb] this=0x555556a11040, fd=8, remote_errno=0x7fffffffcfd0) at src/gdb/remote.c:12462 #7 0x0000555555c9274c in target_fileio_close (fd=3, target_errno=0x7fffffffcfd0) at src/gdb/target.c:3365 #8 0x000055555595a19d in gdb_bfd_iovec_fileio_close (abfd=0x555556b9f8a0, stream=0x555556b11530) at src/gdb/gdb_bfd.c:439 #9 0x0000555555e09e3f in opncls_bclose (abfd=0x555556b9f8a0) at src/bfd/opncls.c:599 #10 0x0000555555e0a2c7 in bfd_close_all_done (abfd=0x555556b9f8a0) at src/bfd/opncls.c:847 #11 0x0000555555e0a27a in bfd_close (abfd=0x555556b9f8a0) at src/bfd/opncls.c:814 #12 0x000055555595a9d3 in gdb_bfd_close_or_warn (abfd=0x555556b9f8a0) at src/gdb/gdb_bfd.c:626 #13 0x000055555595ad29 in gdb_bfd_unref (abfd=0x555556b9f8a0) at src/gdb/gdb_bfd.c:715 #14 0x0000555555ae4730 in objfile::~objfile (this=0x555556515540, __in_chrg=<optimized out>) at src/gdb/objfiles.c:573 #15 0x0000555555ae955a in std::_Sp_counted_ptr<objfile*, (__gnu_cxx::_Lock_policy)2>::_M_dispose (this=0x555556c20db0) at /usr/include/c++/9/bits/shared_ptr_base.h:377 #16 0x000055555572b7c8 in std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release (this=0x555556c20db0) at /usr/include/c++/9/bits/shared_ptr_base.h:155 #17 0x00005555557263c3 in std::__shared_count<(__gnu_cxx::_Lock_policy)2>::~__shared_count (this=0x555556bf0588, __in_chrg=<optimized out>) at /usr/include/c++/9/bits/shared_ptr_base.h:730 #18 0x0000555555ae745e in std::__shared_ptr<objfile, (__gnu_cxx::_Lock_policy)2>::~__shared_ptr (this=0x555556bf0580, __in_chrg=<optimized out>) at /usr/include/c++/9/bits/shared_ptr_base.h:1169 #19 0x0000555555ae747e in std::shared_ptr<objfile>::~shared_ptr (this=0x555556bf0580, __in_chrg=<optimized out>) at /usr/include/c++/9/bits/shared_ptr.h:103 #20 0x0000555555b1c1dc in __gnu_cxx::new_allocator<std::_List_node<std::shared_ptr<objfile> > >::destroy<std::shared_ptr<objfile> > (this=0x5555564cdd60, __p=0x555556bf0580) at /usr/include/c++/9/ext/new_allocator.h:153 #21 0x0000555555b1bb1d in std::allocator_traits<std::allocator<std::_List_node<std::shared_ptr<objfile> > > >::destroy<std::shared_ptr<objfile> > (__a=..., __p=0x555556bf0580) at /usr/include/c++/9/bits/alloc_traits.h:497 #22 0x0000555555b1b73e in std::__cxx11::list<std::shared_ptr<objfile>, std::allocator<std::shared_ptr<objfile> > >::_M_erase (this=0x5555564cdd60, __position=std::shared_ptr<objfile> (expired, weak count 1) = {get() = 0x555556515540}) at /usr/include/c++/9/bits/stl_list.h:1921 #23 0x0000555555b1afeb in std::__cxx11::list<std::shared_ptr<objfile>, std::allocator<std::shared_ptr<objfile> > >::erase (this=0x5555564cdd60, __position=std::shared_ptr<objfile> (expired, weak count 1) = {get() = 0x555556515540}) at /usr/include/c++/9/bits/list.tcc:158 #24 0x0000555555b19576 in program_space::remove_objfile (this=0x5555564cdd20, objfile=0x555556515540) at src/gdb/progspace.c:210 #25 0x0000555555ae4502 in objfile::unlink (this=0x555556515540) at src/gdb/objfiles.c:487 #26 0x0000555555ae5a12 in objfile_purge_solibs () at src/gdb/objfiles.c:875 #27 0x0000555555c09686 in no_shared_libraries (ignored=0x0, from_tty=1) at src/gdb/solib.c:1236 #28 0x00005555559e3f5f in detach_command (args=0x0, from_tty=1) at src/gdb/infcmd.c:2769 So frame #28 already detached the remote process, and then we're purging the shared libraries. GDB had opened remote shared libraries via the target: sysroot, so it tries closing them. GDBserver is tearing down already, so remote communication breaks down and we close the remote target and throw TARGET_CLOSE_ERROR. Note frame #14: #14 0x0000555555ae4730 in objfile::~objfile (this=0x555556515540, __in_chrg=<optimized out>) at src/gdb/objfiles.c:573 That's a dtor, thus noexcept. That's the reason for the std::terminate. Stepping back a bit, why do we still have open remote files if we've managed to detach already, and, we're debugging with "target remote"? The reason is that commit 408f668 makes detach_command hold a reference to the target, so the remote target won't be finally closed until frame #28 returns. It's closing the target that invalidates target file I/O handles. This commit fixes the issue by not relying on target_close to invalidate the target file I/O handles, instead invalidate them immediately in remote_unpush_target. So when GDB purges the solibs, and we end up in target_fileio_close (frame #7 above), there's nothing to do, and we don't try to talk with the remote target anymore. The regression isn't seen when testing with --target_board=native-gdbserver, because that does "set sysroot" to disable the "target:" sysroot, for test run speed reasons. So this commit adds a testcase that explicitly tests detach with "set sysroot target:". gdb/ChangeLog: yyyy-mm-dd Pedro Alves <[email protected]> PR gdb/28080 * remote.c (remote_unpush_target): Invalidate file I/O target handles. * target.c (fileio_handles_invalidate_target): Make extern. * target.h (fileio_handles_invalidate_target): Declare. gdb/testsuite/ChangeLog: yyyy-mm-dd Pedro Alves <[email protected]> PR gdb/28080 * gdb.base/detach-sysroot-target.exp: New. * gdb.base/detach-sysroot-target.c: New. Reported-By: Jonah Graham <[email protected]> Change-Id: I851234910172f42a1b30e731161376c344d2727d
mbitsnbites
pushed a commit
that referenced
this issue
Jul 20, 2021
…080) Before PR gdb/28080 was fixed by the previous patch, GDB was crashing like this: (gdb) detach Detaching from program: target:/any/program, process 3671843 Detaching from process 3671843 Ending remote debugging. [Inferior 1 (process 3671843) detached] In main terminate called after throwing an instance of 'gdb_exception_error' Aborted (core dumped) Here's the exception above being thrown: (top-gdb) bt #0 throw_error (error=TARGET_CLOSE_ERROR, fmt=0x555556035588 "Remote connection closed") at src/gdbsupport/common-exceptions.cc:222 #1 0x0000555555bbaa46 in remote_target::readchar (this=0x555556a11040, timeout=10000) at src/gdb/remote.c:9440 #2 0x0000555555bbb9e5 in remote_target::getpkt_or_notif_sane_1 (this=0x555556a11040, buf=0x555556a11058, forever=0, expecting_notif=0, is_notif=0x0) at src/gdb/remote.c:9928 #3 0x0000555555bbbda9 in remote_target::getpkt_sane (this=0x555556a11040, buf=0x555556a11058, forever=0) at src/gdb/remote.c:10030 #4 0x0000555555bc0e75 in remote_target::remote_hostio_send_command (this=0x555556a11040, command_bytes=13, which_packet=14, remote_errno=0x7fffffffcfd0, attachment=0x0, attachment_len=0x0) at src/gdb/remote.c:12137 #5 0x0000555555bc1b6c in remote_target::remote_hostio_close (this=0x555556a11040, fd=8, remote_errno=0x7fffffffcfd0) at src/gdb/remote.c:12455 #6 0x0000555555bc1bb4 in remote_target::fileio_close (During symbol reading: .debug_line address at offset 0x64f417 is 0 [in module build/gdb/gdb] this=0x555556a11040, fd=8, remote_errno=0x7fffffffcfd0) at src/gdb/remote.c:12462 #7 0x0000555555c9274c in target_fileio_close (fd=3, target_errno=0x7fffffffcfd0) at src/gdb/target.c:3365 #8 0x000055555595a19d in gdb_bfd_iovec_fileio_close (abfd=0x555556b9f8a0, stream=0x555556b11530) at src/gdb/gdb_bfd.c:439 #9 0x0000555555e09e3f in opncls_bclose (abfd=0x555556b9f8a0) at src/bfd/opncls.c:599 #10 0x0000555555e0a2c7 in bfd_close_all_done (abfd=0x555556b9f8a0) at src/bfd/opncls.c:847 #11 0x0000555555e0a27a in bfd_close (abfd=0x555556b9f8a0) at src/bfd/opncls.c:814 #12 0x000055555595a9d3 in gdb_bfd_close_or_warn (abfd=0x555556b9f8a0) at src/gdb/gdb_bfd.c:626 #13 0x000055555595ad29 in gdb_bfd_unref (abfd=0x555556b9f8a0) at src/gdb/gdb_bfd.c:715 #14 0x0000555555ae4730 in objfile::~objfile (this=0x555556515540, __in_chrg=<optimized out>) at src/gdb/objfiles.c:573 #15 0x0000555555ae955a in std::_Sp_counted_ptr<objfile*, (__gnu_cxx::_Lock_policy)2>::_M_dispose (this=0x555556c20db0) at /usr/include/c++/9/bits/shared_ptr_base.h:377 #16 0x000055555572b7c8 in std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release (this=0x555556c20db0) at /usr/include/c++/9/bits/shared_ptr_base.h:155 #17 0x00005555557263c3 in std::__shared_count<(__gnu_cxx::_Lock_policy)2>::~__shared_count (this=0x555556bf0588, __in_chrg=<optimized out>) at /usr/include/c++/9/bits/shared_ptr_base.h:730 #18 0x0000555555ae745e in std::__shared_ptr<objfile, (__gnu_cxx::_Lock_policy)2>::~__shared_ptr (this=0x555556bf0580, __in_chrg=<optimized out>) at /usr/include/c++/9/bits/shared_ptr_base.h:1169 #19 0x0000555555ae747e in std::shared_ptr<objfile>::~shared_ptr (this=0x555556bf0580, __in_chrg=<optimized out>) at /usr/include/c++/9/bits/shared_ptr.h:103 #20 0x0000555555b1c1dc in __gnu_cxx::new_allocator<std::_List_node<std::shared_ptr<objfile> > >::destroy<std::shared_ptr<objfile> > (this=0x5555564cdd60, __p=0x555556bf0580) at /usr/include/c++/9/ext/new_allocator.h:153 #21 0x0000555555b1bb1d in std::allocator_traits<std::allocator<std::_List_node<std::shared_ptr<objfile> > > >::destroy<std::shared_ptr<objfile> > (__a=..., __p=0x555556bf0580) at /usr/include/c++/9/bits/alloc_traits.h:497 #22 0x0000555555b1b73e in std::__cxx11::list<std::shared_ptr<objfile>, std::allocator<std::shared_ptr<objfile> > >::_M_erase (this=0x5555564cdd60, __position=std::shared_ptr<objfile> (expired, weak count 1) = {get() = 0x555556515540}) at /usr/include/c++/9/bits/stl_list.h:1921 #23 0x0000555555b1afeb in std::__cxx11::list<std::shared_ptr<objfile>, std::allocator<std::shared_ptr<objfile> > >::erase (this=0x5555564cdd60, __position=std::shared_ptr<objfile> (expired, weak count 1) = {get() = 0x555556515540}) at /usr/include/c++/9/bits/list.tcc:158 #24 0x0000555555b19576 in program_space::remove_objfile (this=0x5555564cdd20, objfile=0x555556515540) at src/gdb/progspace.c:210 #25 0x0000555555ae4502 in objfile::unlink (this=0x555556515540) at src/gdb/objfiles.c:487 #26 0x0000555555ae5a12 in objfile_purge_solibs () at src/gdb/objfiles.c:875 #27 0x0000555555c09686 in no_shared_libraries (ignored=0x0, from_tty=1) at src/gdb/solib.c:1236 #28 0x00005555559e3f5f in detach_command (args=0x0, from_tty=1) at src/gdb/infcmd.c:2769 Note frame #14: #14 0x0000555555ae4730 in objfile::~objfile (this=0x555556515540, __in_chrg=<optimized out>) at src/gdb/objfiles.c:573 That's a dtor, thus noexcept. That's the reason for the std::terminate. The previous patch fixed things such that the exception above isn't thrown anymore. However, it's possible that e.g., the remote connection drops just while a user types "nosharedlibrary", or some other reason that leads to objfile::~objfile, and then we end up the same std::terminate problem. Also notice that frames #9-#11 are BFD frames: #9 0x0000555555e09e3f in opncls_bclose (abfd=0x555556bc27e0) at src/bfd/opncls.c:599 #10 0x0000555555e0a2c7 in bfd_close_all_done (abfd=0x555556bc27e0) at src/bfd/opncls.c:847 #11 0x0000555555e0a27a in bfd_close (abfd=0x555556bc27e0) at src/bfd/opncls.c:814 BFD is written in C and thus throwing exceptions over such frames may either not clean up properly, or, may abort if bfd is not compiled with -fasynchronous-unwind-tables (x86-64 defaults that on, but not all GCC ports do). Thus frame #8 seems like a good place to swallow exceptions. More so since in this spot we already ignore target_fileio_close return errors. That's what this commit does. Without the previous fix, we'd see: (gdb) detach Detaching from program: target:/any/program, process 2197701 Ending remote debugging. [Inferior 1 (process 2197701) detached] warning: cannot close "target:/lib64/ld-linux-x86-64.so.2": Remote connection closed Note it prints a warning, which would still be a regression compared to GDB 10, if it weren't for the previous fix. gdb/ChangeLog: yyyy-mm-dd Pedro Alves <[email protected]> PR gdb/28080 * gdb_bfd.c (gdb_bfd_close_warning): New. (gdb_bfd_iovec_fileio_close): Wrap target_fileio_close in try/catch and print warning on exception. (gdb_bfd_close_or_warn): Use gdb_bfd_close_warning. Change-Id: Ic7a26ddba0a4444e3377b0e7c1c89934a84545d7
mbitsnbites
pushed a commit
that referenced
this issue
Nov 6, 2021
Simon Marchi tried gdb on OpenBSD, and it immediately segfaults when running a program. Simon tracked down the problem to x86_dr_low.get_status being nullptr at this point: (lldb) print x86_dr_low.get_status (unsigned long (*)()) $0 = 0x0000000000000000 (lldb) bt * thread #1, stop reason = step over * frame #0: 0x0000033b64b764aa gdb`x86_dr_stopped_data_address(state=0x0000033d7162a310, addr_p=0x00007f7ffffc5688) at x86-dregs.c:645:12 frame #1: 0x0000033b64b766de gdb`x86_dr_stopped_by_watchpoint(state=0x0000033d7162a310) at x86-dregs.c:687:10 frame #2: 0x0000033b64ea5f72 gdb`x86_stopped_by_watchpoint() at x86-nat.c:206:10 frame #3: 0x0000033b64637fbb gdb`x86_nat_target<obsd_nat_target>::stopped_by_watchpoint(this=0x0000033b65252820) at x86-nat.h:100:12 frame #4: 0x0000033b64d3ff11 gdb`target_stopped_by_watchpoint() at target.c:468:46 frame #5: 0x0000033b6469b001 gdb`watchpoints_triggered(ws=0x00007f7ffffc61c8) at breakpoint.c:4790:32 frame #6: 0x0000033b64a8bb8b gdb`handle_signal_stop(ecs=0x00007f7ffffc61a0) at infrun.c:6072:29 frame #7: 0x0000033b64a7e3a7 gdb`handle_inferior_event(ecs=0x00007f7ffffc61a0) at infrun.c:5694:7 frame #8: 0x0000033b64a7c1a0 gdb`fetch_inferior_event() at infrun.c:4090:5 frame #9: 0x0000033b64a51921 gdb`inferior_event_handler(event_type=INF_REG_EVENT) at inf-loop.c:41:7 frame #10: 0x0000033b64a827c9 gdb`infrun_async_inferior_event_handler(data=0x0000000000000000) at infrun.c:9384:3 frame #11: 0x0000033b6465bd4f gdb`check_async_event_handlers() at async-event.c:335:4 frame #12: 0x0000033b65070917 gdb`gdb_do_one_event() at event-loop.cc:216:10 frame #13: 0x0000033b64af0db1 gdb`start_event_loop() at main.c:421:13 frame #14: 0x0000033b64aefe9a gdb`captured_command_loop() at main.c:481:3 frame #15: 0x0000033b64aed5c2 gdb`captured_main(data=0x00007f7ffffc6470) at main.c:1353:4 frame #16: 0x0000033b64aed4f2 gdb`gdb_main(args=0x00007f7ffffc6470) at main.c:1368:7 frame #17: 0x0000033b6459d787 gdb`main(argc=5, argv=0x00007f7ffffc6518) at gdb.c:32:10 frame #18: 0x0000033b6459d521 gdb`___start + 321 On BSDs, get_status is set in _initialize_x86_bsd_nat, but only if HAVE_PT_GETDBREGS is defined. PT_GETDBREGS doesn't exist on OpenBSD, so get_status (and the other fields of x86_dr_low) are left as nullptr. OpenBSD doesn't support getting or setting the x86 debug registers, so fix by omitting debug register support entirely on OpenBSD: - Change x86bsd_nat_target to only inherit from x86_nat_target if PT_GETDBREGS is supported. - Don't include x86-nat.o and nat/x86-dregs.o for OpenBSD/amd64. They were already omitted for OpenBSD/i386.
mbitsnbites
pushed a commit
that referenced
this issue
Nov 6, 2021
The original reproducer for PR28030 required use of a specific compiler version - gcc-c++-11.1.1-3.fc34 is mentioned in the PR, though it seems probable that other gcc versions might also be able to reproduce the bug as well. This commit introduces a test case which, using the DWARF assembler, provides a reproducer which is independent of the compiler version. (Well, it'll work with whatever compilers the DWARF assembler works with.) To the best of my knowledge, it's also the first test case which uses the DWARF assembler to provide debug info for a shared object. That being the case, I provided more than the usual commentary which should allow this case to be used as a template when a combo shared library / DWARF assembler test case is required in the future. I provide some details regarding the bug in a comment near the beginning of locexpr-dml.exp. This problem was difficult to reproduce; I found myself constantly referring to the backtrace while trying to figure out what (else) I might be missing while trying to create a reproducer. Below is a partial backtrace which I include for posterity. #0 internal_error ( file=0xc50110 "/ironwood1/sourceware-git/f34-pr28030/bld/../../worktree-pr28030/gdb/gdbtypes.c", line=5575, fmt=0xc520c0 "Unexpected type field location kind: %d") at /ironwood1/sourceware-git/f34-pr28030/bld/../../worktree-pr28030/gdbsupport/errors.cc:51 #1 0x00000000006ef0c5 in copy_type_recursive (objfile=0x1635930, type=0x274c260, copied_types=0x30bb290) at /ironwood1/sourceware-git/f34-pr28030/bld/../../worktree-pr28030/gdb/gdbtypes.c:5575 #2 0x00000000006ef382 in copy_type_recursive (objfile=0x1635930, type=0x274ca10, copied_types=0x30bb290) at /ironwood1/sourceware-git/f34-pr28030/bld/../../worktree-pr28030/gdb/gdbtypes.c:5602 #3 0x0000000000a7409a in preserve_one_value (value=0x24269f0, objfile=0x1635930, copied_types=0x30bb290) at /ironwood1/sourceware-git/f34-pr28030/bld/../../worktree-pr28030/gdb/value.c:2529 #4 0x000000000072012a in gdbscm_preserve_values ( extlang=0xc55720 <extension_language_guile>, objfile=0x1635930, copied_types=0x30bb290) at /ironwood1/sourceware-git/f34-pr28030/bld/../../worktree-pr28030/gdb/guile/scm-value.c:94 #5 0x00000000006a3f82 in preserve_ext_lang_values (objfile=0x1635930, copied_types=0x30bb290) at /ironwood1/sourceware-git/f34-pr28030/bld/../../worktree-pr28030/gdb/extension.c:568 #6 0x0000000000a7428d in preserve_values (objfile=0x1635930) at /ironwood1/sourceware-git/f34-pr28030/bld/../../worktree-pr28030/gdb/value.c:2579 #7 0x000000000082d514 in objfile::~objfile (this=0x1635930, __in_chrg=<optimized out>) at /ironwood1/sourceware-git/f34-pr28030/bld/../../worktree-pr28030/gdb/objfiles.c:549 #8 0x0000000000831cc8 in std::_Sp_counted_ptr<objfile*, (__gnu_cxx::_Lock_policy)2>::_M_dispose (this=0x1654580) at /usr/include/c++/11/bits/shared_ptr_base.h:348 #9 0x00000000004e6617 in std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release (this=0x1654580) at /usr/include/c++/11/bits/shared_ptr_base.h:168 #10 0x00000000004e1d2f in std::__shared_count<(__gnu_cxx::_Lock_policy)2>::~__shared_count (this=0x190bb88, __in_chrg=<optimized out>) at /usr/include/c++/11/bits/shared_ptr_base.h:705 #11 0x000000000082feee in std::__shared_ptr<objfile, (__gnu_cxx::_Lock_policy)2>::~__shared_ptr (this=0x190bb80, __in_chrg=<optimized out>) at /usr/include/c++/11/bits/shared_ptr_base.h:1154 #12 0x000000000082ff0a in std::shared_ptr<objfile>::~shared_ptr ( this=0x190bb80, __in_chrg=<optimized out>) at /usr/include/c++/11/bits/shared_ptr.h:122 #13 0x000000000085ed7e in __gnu_cxx::new_allocator<std::_List_node<std::shared_ptr<objfile> > >::destroy<std::shared_ptr<objfile> > (this=0x114bc00, __p=0x190bb80) at /usr/include/c++/11/ext/new_allocator.h:168 #14 0x000000000085e88d in std::allocator_traits<std::allocator<std::_List_node<std::shared_ptr<objfile> > > >::destroy<std::shared_ptr<objfile> > (__a=..., __p=0x190bb80) at /usr/include/c++/11/bits/alloc_traits.h:531 #15 0x000000000085e50c in std::__cxx11::list<std::shared_ptr<objfile>, std::allocator<std::shared_ptr<objfile> > >::_M_erase (this=0x114bc00, __position= std::shared_ptr<objfile> (expired, weak count 1) = {get() = 0x1635930}) at /usr/include/c++/11/bits/stl_list.h:1925 #16 0x000000000085df0e in std::__cxx11::list<std::shared_ptr<objfile>, std::allocator<std::shared_ptr<objfile> > >::erase (this=0x114bc00, __position= std::shared_ptr<objfile> (expired, weak count 1) = {get() = 0x1635930}) at /usr/include/c++/11/bits/list.tcc:158 #17 0x000000000085c748 in program_space::remove_objfile (this=0x114bbc0, objfile=0x1635930) at /ironwood1/sourceware-git/f34-pr28030/bld/../../worktree-pr28030/gdb/progspace.c:210 #18 0x000000000082d3ae in objfile::unlink (this=0x1635930) at /ironwood1/sourceware-git/f34-pr28030/bld/../../worktree-pr28030/gdb/objfiles.c:487 #19 0x000000000082e68c in objfile_purge_solibs () at /ironwood1/sourceware-git/f34-pr28030/bld/../../worktree-pr28030/gdb/objfiles.c:875 #20 0x000000000092dd37 in no_shared_libraries (ignored=0x0, from_tty=1) at /ironwood1/sourceware-git/f34-pr28030/bld/../../worktree-pr28030/gdb/solib.c:1236 #21 0x00000000009a37fe in target_pre_inferior (from_tty=1) at /ironwood1/sourceware-git/f34-pr28030/bld/../../worktree-pr28030/gdb/target.c:2496 #22 0x00000000007454d6 in run_command_1 (args=0x0, from_tty=1, run_how=RUN_NORMAL) at /ironwood1/sourceware-git/f34-pr28030/bld/../../worktree-pr28030/gdb/infcmd.c:437 I'll note a few points regarding this backtrace: Frame #1 is where the internal error occurs. It's caused by an unhandled case for FIELD_LOC_KIND_DWARF_BLOCK. The fix for this bug adds support for this case. Frame #22 - it's a partial backtrace - shows that GDB is attempting to (re)run the program. You can see the exact command sequence that was used for reproducing this problem in the PR (at https://sourceware.org/bugzilla/show_bug.cgi?id=28030), but in a nutshell, after starting the program and advancing to the appropriate source line, GDB was asked to step into libstdc++; a "finish" command was issued, returning a value. The fact that a value was returned is very important. GDB was then used to step back into libstdc++. A breakpoint was set on a source line in the library after which a "run" command was issued. Frame #19 shows a call to objfile_purge_solibs. It's aptly named. Frame #7 is a call to the destructor for one of the objfile solibs; it turned out to be the one for libstdc++. Frames #6 thru #3 show various value preservation frames. If you look at preserve_values() in gdb/value.c, the value history is preserved first, followed by internal variables, followed by values for the extension languages (python and guile).
mbitsnbites
pushed a commit
that referenced
this issue
Nov 6, 2021
…es.exp When running test-case gdb.base/break-probes.exp on ubuntu 18.04.5, we have: ... (gdb) bt^M #0 0x00007ffff7dd6e12 in ?? () from /lib64/ld-linux-x86-64.so.2^M #1 0x00007ffff7dedf50 in ?? () from /lib64/ld-linux-x86-64.so.2^M #2 0x00007ffff7dd5128 in ?? () from /lib64/ld-linux-x86-64.so.2^M #3 0x00007ffff7dd4098 in ?? () from /lib64/ld-linux-x86-64.so.2^M #4 0x0000000000000001 in ?? ()^M #5 0x00007fffffffdaac in ?? ()^M #6 0x0000000000000000 in ?? ()^M (gdb) FAIL: gdb.base/break-probes.exp: ensure using probes ... The test-case intends to emit an UNTESTED in this case, but fails to do so because it tries to do it in a regexp clause in a gdb_test_multiple, which doesn't trigger. Instead, a default clause triggers which produces the FAIL. Also the use of UNTESTED is not appropriate, and we should use UNSUPPORTED instead. Fix this by silencing the FAIL, and emitting an UNSUPPORTED after the gdb_test_multiple: ... if { ! $using_probes } { + unsupported "probes not present on this system" return -1 } ... Tested on x86_64-linux.
mbitsnbites
pushed a commit
that referenced
this issue
Nov 6, 2021
When running test-case gdb.base/break-probes.exp on ubuntu 18.04.5, we have: ... (gdb) run^M Starting program: break-probes^M Stopped due to shared library event (no libraries added or removed)^M (gdb) bt^M #0 0x00007ffff7dd6e12 in ?? () from /lib64/ld-linux-x86-64.so.2^M #1 0x00007ffff7dedf50 in ?? () from /lib64/ld-linux-x86-64.so.2^M #2 0x00007ffff7dd5128 in ?? () from /lib64/ld-linux-x86-64.so.2^M #3 0x00007ffff7dd4098 in ?? () from /lib64/ld-linux-x86-64.so.2^M #4 0x0000000000000001 in ?? ()^M #5 0x00007fffffffdaac in ?? ()^M #6 0x0000000000000000 in ?? ()^M (gdb) UNSUPPORTED: gdb.base/break-probes.exp: probes not present on this system ... Using the backtrace, the test-case tries to establish that we're stopped in dl_main, which is used as proof that we're using probes. However, the backtrace only shows an address, because: - the dynamic linker contains no minimal symbols and no debug info, and - gdb is build without --with-separate-debug-dir so it can't find the corresponding .debug file, which does contain the mimimal symbols and debug info. Fix this by instead printing the pc and grepping for the value in the info probes output: ... (gdb) p /x $pc^M $1 = 0x7ffff7dd6e12^M (gdb) info probes^M Type Provider Name Where Object ^M ... stap rtld init_start 0x00007ffff7dd6e12 /lib64/ld-linux-x86-64.so.2 ^M ... (gdb) ... Tested on x86_64-linux.
mbitsnbites
pushed a commit
that referenced
this issue
Nov 6, 2021
When running test-case gdb.base/break-interp.exp on ubuntu 18.04.5, we have: ... (gdb) bt^M #0 0x00007eff7ad5ae12 in ?? () from break-interp-LDprelinkNOdebugNO^M #1 0x00007eff7ad71f50 in ?? () from break-interp-LDprelinkNOdebugNO^M #2 0x00007eff7ad59128 in ?? () from break-interp-LDprelinkNOdebugNO^M #3 0x00007eff7ad58098 in ?? () from break-interp-LDprelinkNOdebugNO^M #4 0x0000000000000002 in ?? ()^M #5 0x00007fff505d7a32 in ?? ()^M #6 0x00007fff505d7a94 in ?? ()^M #7 0x0000000000000000 in ?? ()^M (gdb) FAIL: gdb.base/break-interp.exp: ldprelink=NO: ldsepdebug=NO: \ first backtrace: dl bt ... Using the backtrace, the test-case tries to establish that we're stopped in dl_main. However, the backtrace only shows an address, because: - the dynamic linker contains no minimal symbols and no debug info, and - gdb is build without --with-separate-debug-dir so it can't find the corresponding .debug file, which does contain the mimimal symbols and debug info. As in "[gdb/testsuite] Improve probe detection in gdb.base/break-probes.exp", fix this by doing info probes and grepping for the address. Tested on x86_64-linux.
mbitsnbites
pushed a commit
that referenced
this issue
Nov 23, 2021
This commit fixes Bug 28308, titled "Strange interactions with dprintf and break/commands": Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=28308 Since creating that bug report, I've found a somewhat simpler way of reproducing the problem. I've encapsulated it into the GDB test case which I've created along with this bug fix. The name of the new test is gdb.base/dprintf-execution-x-script.exp, I'll demonstrate the problem using this test case, though for brevity, I've placed all relevant files in the same directory and have renamed the files to all start with 'dp-bug' instead of 'dprintf-execution-x-script'. The script file, named dp-bug.gdb, consists of the following commands: dprintf increment, "dprintf in increment(), vi=%d\n", vi break inc_vi commands continue end run Note that the final command in this script is 'run'. When 'run' is instead issued interactively, the bug does not occur. So, let's look at the interactive case first in order to see the correct/expected output: $ gdb -q -x dp-bug.gdb dp-bug ... eliding buggy output which I'll discuss later ... (gdb) run Starting program: /mesquite2/sourceware-git/f34-master/bld/gdb/tmp/dp-bug vi=0 dprintf in increment(), vi=0 Breakpoint 2, inc_vi () at dprintf-execution-x-script.c:26 26 in dprintf-execution-x-script.c vi=1 dprintf in increment(), vi=1 Breakpoint 2, inc_vi () at dprintf-execution-x-script.c:26 26 in dprintf-execution-x-script.c vi=2 dprintf in increment(), vi=2 Breakpoint 2, inc_vi () at dprintf-execution-x-script.c:26 26 in dprintf-execution-x-script.c vi=3 [Inferior 1 (process 1539210) exited normally] In this run, in which 'run' was issued from the gdb prompt (instead of at the end of the script), there are three dprintf messages along with three 'Breakpoint 2' messages. This is the correct output. Now let's look at the output that I snipped above; this is the output when 'run' is issued from the script loaded via GDB's -x switch: $ gdb -q -x dp-bug.gdb dp-bug Reading symbols from dp-bug... Dprintf 1 at 0x40116e: file dprintf-execution-x-script.c, line 38. Breakpoint 2 at 0x40113a: file dprintf-execution-x-script.c, line 26. vi=0 dprintf in increment(), vi=0 Breakpoint 2, inc_vi () at dprintf-execution-x-script.c:26 26 dprintf-execution-x-script.c: No such file or directory. vi=1 Breakpoint 2, inc_vi () at dprintf-execution-x-script.c:26 26 in dprintf-execution-x-script.c vi=2 Breakpoint 2, inc_vi () at dprintf-execution-x-script.c:26 26 in dprintf-execution-x-script.c vi=3 [Inferior 1 (process 1539175) exited normally] In the output shown above, only the first dprintf message is printed. The 2nd and 3rd dprintf messages are missing! However, all three 'Breakpoint 2...' messages are still printed. Why does this happen? bpstat_do_actions_1() in gdb/breakpoint.c contains the following comment and code near the start of the function: /* Avoid endless recursion if a `source' command is contained in bs->commands. */ if (executing_breakpoint_commands) return 0; scoped_restore save_executing = make_scoped_restore (&executing_breakpoint_commands, 1); Also, as described by this comment prior to the 'async' field in 'struct ui' in top.h, the main UI starts off in sync mode when processing command line arguments: /* True if the UI is in async mode, false if in sync mode. If in sync mode, a synchronous execution command (e.g, "next") does not return until the command is finished. If in async mode, then running a synchronous command returns right after resuming the target. Waiting for the command's completion is later done on the top event loop. For the main UI, this starts out disabled, until all the explicit command line arguments (e.g., `gdb -ex "start" -ex "next"') are processed. */ This combination of things, the state of the static global 'executing_breakpoint_commands' plus the state of the async field in the main UI causes this behavior. This is a backtrace after hitting the dprintf breakpoint for the second time when doing 'run' from the script file, i.e. non-interactively: Thread 1 "gdb" hit Breakpoint 3, bpstat_do_actions_1 (bsp=0x7fffffffc2b8) at /ironwood1/sourceware-git/f34-master/bld/../../worktree-master/gdb/breakpoint.c:4431 4431 if (executing_breakpoint_commands) #0 bpstat_do_actions_1 (bsp=0x7fffffffc2b8) at gdb/breakpoint.c:4431 #1 0x00000000004d8bc6 in dprintf_after_condition_true (bs=0x1538090) at gdb/breakpoint.c:13048 #2 0x00000000004c5caa in bpstat_stop_status (aspace=0x116dbc0, bp_addr=0x40116e, thread=0x137f450, ws=0x7fffffffc718, stop_chain=0x1538090) at gdb/breakpoint.c:5498 #3 0x0000000000768d98 in handle_signal_stop (ecs=0x7fffffffc6f0) at gdb/infrun.c:6172 #4 0x00000000007678d3 in handle_inferior_event (ecs=0x7fffffffc6f0) at gdb/infrun.c:5662 #5 0x0000000000763cd5 in fetch_inferior_event () at gdb/infrun.c:4060 #6 0x0000000000746d7d in inferior_event_handler (event_type=INF_REG_EVENT) at gdb/inf-loop.c:41 #7 0x00000000007a702f in handle_target_event (error=0, client_data=0x0) at gdb/linux-nat.c:4207 #8 0x0000000000b8cd6e in gdb_wait_for_event (block=block@entry=0) at gdbsupport/event-loop.cc:701 #9 0x0000000000b8d032 in gdb_wait_for_event (block=0) at gdbsupport/event-loop.cc:597 #10 gdb_do_one_event () at gdbsupport/event-loop.cc:212 #11 0x00000000009d19b6 in wait_sync_command_done () at gdb/top.c:528 #12 0x00000000009d1a3f in maybe_wait_sync_command_done (was_sync=0) at gdb/top.c:545 #13 0x00000000009d2033 in execute_command (p=0x7fffffffcb18 "", from_tty=0) at gdb/top.c:676 #14 0x0000000000560d5b in execute_control_command_1 (cmd=0x13b9bb0, from_tty=0) at gdb/cli/cli-script.c:547 #15 0x000000000056134a in execute_control_command (cmd=0x13b9bb0, from_tty=0) at gdb/cli/cli-script.c:717 #16 0x00000000004c3bbe in bpstat_do_actions_1 (bsp=0x137f530) at gdb/breakpoint.c:4469 #17 0x00000000004c3d40 in bpstat_do_actions () at gdb/breakpoint.c:4533 #18 0x00000000006a473a in command_handler (command=0x1399ad0 "run") at gdb/event-top.c:624 #19 0x00000000009d182e in read_command_file (stream=0x113e540) at gdb/top.c:443 #20 0x0000000000563697 in script_from_file (stream=0x113e540, file=0x13bb0b0 "dp-bug.gdb") at gdb/cli/cli-script.c:1642 #21 0x00000000006abd63 in source_gdb_script (extlang=0xc44e80 <extension_language_gdb>, stream=0x113e540, file=0x13bb0b0 "dp-bug.gdb") at gdb/extension.c:188 #22 0x0000000000544400 in source_script_from_stream (stream=0x113e540, file=0x7fffffffd91a "dp-bug.gdb", file_to_open=0x13bb0b0 "dp-bug.gdb") at gdb/cli/cli-cmds.c:692 #23 0x0000000000544557 in source_script_with_search (file=0x7fffffffd91a "dp-bug.gdb", from_tty=1, search_path=0) at gdb/cli/cli-cmds.c:750 #24 0x00000000005445cf in source_script (file=0x7fffffffd91a "dp-bug.gdb", from_tty=1) at gdb/cli/cli-cmds.c:759 #25 0x00000000007cf6d9 in catch_command_errors (command=0x5445aa <source_script(char const*, int)>, arg=0x7fffffffd91a "dp-bug.gdb", from_tty=1, do_bp_actions=false) at gdb/main.c:523 #26 0x00000000007cf85d in execute_cmdargs (cmdarg_vec=0x7fffffffd1b0, file_type=CMDARG_FILE, cmd_type=CMDARG_COMMAND, ret=0x7fffffffd18c) at gdb/main.c:615 #27 0x00000000007d0c8e in captured_main_1 (context=0x7fffffffd3f0) at gdb/main.c:1322 #28 0x00000000007d0eba in captured_main (data=0x7fffffffd3f0) at gdb/main.c:1343 #29 0x00000000007d0f25 in gdb_main (args=0x7fffffffd3f0) at gdb/main.c:1368 #30 0x00000000004186dd in main (argc=5, argv=0x7fffffffd508) at gdb/gdb.c:32 There are two frames for bpstat_do_actions_1(), one at frame #16 and the other at frame #0. The one at frame #16 is processing the actions for Breakpoint 2, which is a 'continue'. The one at frame #0 is attempting to process the dprintf breakpoint action. However, at this point, the value of 'executing_breakpoint_commands' is 1, forcing an early return, i.e. prior to executing the command(s) associated with the dprintf breakpoint. For the sake of comparison, this is what the stack looks like when hitting the dprintf breakpoint for the second time when issuing the 'run' command from the GDB prompt. Thread 1 "gdb" hit Breakpoint 3, bpstat_do_actions_1 (bsp=0x7fffffffccd8) at /ironwood1/sourceware-git/f34-master/bld/../../worktree-master/gdb/breakpoint.c:4431 4431 if (executing_breakpoint_commands) #0 bpstat_do_actions_1 (bsp=0x7fffffffccd8) at gdb/breakpoint.c:4431 #1 0x00000000004d8bc6 in dprintf_after_condition_true (bs=0x16b0290) at gdb/breakpoint.c:13048 #2 0x00000000004c5caa in bpstat_stop_status (aspace=0x116dbc0, bp_addr=0x40116e, thread=0x13f0e60, ws=0x7fffffffd138, stop_chain=0x16b0290) at gdb/breakpoint.c:5498 #3 0x0000000000768d98 in handle_signal_stop (ecs=0x7fffffffd110) at gdb/infrun.c:6172 #4 0x00000000007678d3 in handle_inferior_event (ecs=0x7fffffffd110) at gdb/infrun.c:5662 #5 0x0000000000763cd5 in fetch_inferior_event () at gdb/infrun.c:4060 #6 0x0000000000746d7d in inferior_event_handler (event_type=INF_REG_EVENT) at gdb/inf-loop.c:41 #7 0x00000000007a702f in handle_target_event (error=0, client_data=0x0) at gdb/linux-nat.c:4207 #8 0x0000000000b8cd6e in gdb_wait_for_event (block=block@entry=0) at gdbsupport/event-loop.cc:701 #9 0x0000000000b8d032 in gdb_wait_for_event (block=0) at gdbsupport/event-loop.cc:597 #10 gdb_do_one_event () at gdbsupport/event-loop.cc:212 #11 0x00000000007cf512 in start_event_loop () at gdb/main.c:421 #12 0x00000000007cf631 in captured_command_loop () at gdb/main.c:481 #13 0x00000000007d0ebf in captured_main (data=0x7fffffffd3f0) at gdb/main.c:1353 #14 0x00000000007d0f25 in gdb_main (args=0x7fffffffd3f0) at gdb/main.c:1368 #15 0x00000000004186dd in main (argc=5, argv=0x7fffffffd508) at gdb/gdb.c:32 This relatively short backtrace is due to the current UI's async field being set to 1. Yet another thing to be aware of regarding this problem is the difference in the way that commands associated to dprintf breakpoints versus regular breakpoints are handled. While they both use a command list associated with the breakpoint, regular breakpoints will place the commands to be run on the bpstat chain constructed in bp_stop_status(). These commands are run later on. For dprintf breakpoints, commands are run via the 'after_condition_true' function pointer directly from bpstat_stop_status(). (The 'commands' field in the bpstat is cleared in dprintf_after_condition_true(). This prevents the dprintf commands from being run again later on when other commands on the bpstat chain are processed.) Another thing that I noticed is that dprintf breakpoints are the only type of breakpoint which use 'after_condition_true'. This suggests that one possible way of fixing this problem, that of making dprintf breakpoints work more like regular breakpoints, probably won't work. (I must admit, however, that my understanding of this code isn't complete enough to say why. I'll trust that whoever implemented it had a good reason for doing it this way.) The comment referenced earlier regarding 'executing_breakpoint_commands' states that the reason for checking this variable is to avoid potential endless recursion when a 'source' command appears in bs->commands. We know that a dprintf command is constrained to either 1) execution of a GDB printf command, 2) an inferior function call of a printf-like function, or 3) execution of an agent-printf command. Therefore, infinite recursion due to a 'source' command cannot happen when executing commands upon hitting a dprintf breakpoint. I chose to fix this problem by having dprintf_after_condition_true() directly call execute_control_commands(). This means that it no longer attempts to go through bpstat_do_actions_1() avoiding the infinite recursion check for potential 'source' commands on the command chain. I think it simplifies this code a little bit too, a definite bonus. Summary: * breakpoint.c (dprintf_after_condition_true): Don't call bpstat_do_actions_1(). Call execute_control_commands() instead.
mbitsnbites
pushed a commit
that referenced
this issue
Mar 24, 2022
Fedora Rawhide is now using gcc-12.0. As part of updating to the gcc-12.0 package set, Rawhide is also now using a version of libgcc_s which lacks a .data section. This causes gdb to fail in the following fashion while debugging a program (such as gdb) which uses libgcc_s: (top-gdb) run Starting program: rawhide-master/bld/gdb/gdb ... objfiles.h:467: internal-error: sect_index_data not initialized A problem internal to GDB has been detected, further debugging may prove unreliable. ... I snipped the backtrace from the above output. Instead, here's a portion of a backtrace obtained using GDB's backtrace command. (Obviously, in order to obtain it, I used a GDB which has been patched with this commit.) #0 internal_error ( file=0xc6a508 "gdb/objfiles.h", line=467, fmt=0xc6a4e8 "sect_index_data not initialized") at gdbsupport/errors.cc:51 #1 0x00000000005f9651 in objfile::data_section_offset (this=0x4fa48f0) at gdb/objfiles.h:467 #2 0x000000000097c5f8 in relocate_address (address=0x17244, objfile=0x4fa48f0) at gdb/stap-probe.c:1333 #3 0x000000000097c630 in stap_probe::get_relocated_address (this=0xa1a17a0, objfile=0x4fa48f0) at gdb/stap-probe.c:1341 #4 0x00000000004d7025 in create_exception_master_breakpoint_probe ( objfile=0x4fa48f0) at gdb/breakpoint.c:3505 #5 0x00000000004d7426 in create_exception_master_breakpoint () at gdb/breakpoint.c:3575 #6 0x00000000004efcc1 in breakpoint_re_set () at gdb/breakpoint.c:13407 #7 0x0000000000956998 in solib_add (pattern=0x0, from_tty=0, readsyms=1) at gdb/solib.c:1001 #8 0x00000000009576a8 in handle_solib_event () at gdb/solib.c:1269 ... The function 'relocate_address' in gdb/stap-probe.c attempts to do its "relocation" by using objfile->data_section_offset(). That method, data_section_offset() is defined as follows in objfiles.h: CORE_ADDR data_section_offset () const { return section_offsets[SECT_OFF_DATA (this)]; } The internal error occurs when the SECT_OFF_DATA macro finds that the 'sect_index_data' field is -1: #define SECT_OFF_DATA(objfile) \ ((objfile->sect_index_data == -1) \ ? (internal_error (__FILE__, __LINE__, \ _("sect_index_data not initialized")), -1) \ : objfile->sect_index_data) relocate_address() is obtaining the section offset in order to compute a relocated address. For some ABIs, such as the System V ABI, the section offsets will all be the same. So for those ABIs, it doesn't matter which offset is used. However, other ABIs, such as the FDPIC ABI, will have different offsets for the various sections. Thus, for those ABIs, it is vital that this and other relocation code use the correct offset. In stap_probe::get_relocated_address, the address to which to add the offset (thus forming the relocated address) is obtained via this->get_address (); get_address is a getter for m_address in probe.h. It's documented/defined as follows (also in probe.h): /* The address where the probe is inserted, relative to SECT_OFF_TEXT. */ CORE_ADDR m_address; (Thanks to Tom Tromey for this observation.) So, based on this, the current use of data_section_offset / SECT_OFF_DATA is wrong. This relocation code should have been using text_section_offset / SECT_OFF_TEXT all along. That being the case, I've adjusted the stap-probe.c relocation code accordingly. Searching the sources turned up one other use of data_section_offset, in gdb/dtrace-probe.c, so I've updated that code as well. The same reasoning presented above applies to this case too. Summary: * gdb/dtrace-probe.c (dtrace_probe::get_relocated_address): Use method text_section_offset instead of data_section_offset. * gdb/stap-probe.c (relocate_address): Likewise.
mbitsnbites
pushed a commit
that referenced
this issue
Mar 24, 2022
g++ 11.1.0 has a bug where it will emit a negative DW_AT_data_member_location in some cases: $ cat test.cpp #include <memory> int main() { std::unique_ptr<int> ptr; } $ g++ -g test.cpp $ llvm-dwarfdump -F a.out ... 0x00000964: DW_TAG_member DW_AT_name [DW_FORM_strp] ("_M_head_impl") DW_AT_decl_file [DW_FORM_data1] ("/usr/include/c++/11.1.0/tuple") DW_AT_decl_line [DW_FORM_data1] (125) DW_AT_decl_column [DW_FORM_data1] (0x27) DW_AT_type [DW_FORM_ref4] (0x0000067a "default_delete<int>") DW_AT_data_member_location [DW_FORM_sdata] (-1) ... This leads to a GDB crash (when built with ASan, otherwise probably garbage results), since it tries to read just before (to the left, in ASan speak) of the value's buffer: ==888645==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x6020000c52af at pc 0x7f711b239f4b bp 0x7fff356bd470 sp 0x7fff356bcc18 READ of size 1 at 0x6020000c52af thread T0 #0 0x7f711b239f4a in __interceptor_memcpy /build/gcc/src/gcc/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:827 #1 0x555c4977efa1 in value_contents_copy_raw /home/simark/src/binutils-gdb/gdb/value.c:1347 #2 0x555c497909cd in value_primitive_field(value*, long, int, type*) /home/simark/src/binutils-gdb/gdb/value.c:3126 #3 0x555c478f2eaa in cp_print_value_fields(value*, ui_file*, int, value_print_options const*, type**, int) /home/simark/src/binutils-gdb/gdb/cp-valprint.c:333 #4 0x555c478f63b2 in cp_print_value /home/simark/src/binutils-gdb/gdb/cp-valprint.c:513 #5 0x555c478f02ca in cp_print_value_fields(value*, ui_file*, int, value_print_options const*, type**, int) /home/simark/src/binutils-gdb/gdb/cp-valprint.c:161 #6 0x555c478f63b2 in cp_print_value /home/simark/src/binutils-gdb/gdb/cp-valprint.c:513 #7 0x555c478f02ca in cp_print_value_fields(value*, ui_file*, int, value_print_options const*, type**, int) /home/simark/src/binutils-gdb/gdb/cp-valprint.c:161 #8 0x555c478f63b2 in cp_print_value /home/simark/src/binutils-gdb/gdb/cp-valprint.c:513 #9 0x555c478f02ca in cp_print_value_fields(value*, ui_file*, int, value_print_options const*, type**, int) /home/simark/src/binutils-gdb/gdb/cp-valprint.c:161 #10 0x555c4760d45f in c_value_print_struct /home/simark/src/binutils-gdb/gdb/c-valprint.c:383 #11 0x555c4760df4c in c_value_print_inner(value*, ui_file*, int, value_print_options const*) /home/simark/src/binutils-gdb/gdb/c-valprint.c:438 #12 0x555c483ff9a7 in language_defn::value_print_inner(value*, ui_file*, int, value_print_options const*) const /home/simark/src/binutils-gdb/gdb/language.c:632 #13 0x555c49758b68 in do_val_print /home/simark/src/binutils-gdb/gdb/valprint.c:1048 #14 0x555c49759b17 in common_val_print(value*, ui_file*, int, value_print_options const*, language_defn const*) /home/simark/src/binutils-gdb/gdb/valprint.c:1151 #15 0x555c478f2fcb in cp_print_value_fields(value*, ui_file*, int, value_print_options const*, type**, int) /home/simark/src/binutils-gdb/gdb/cp-valprint.c:335 #16 0x555c478f63b2 in cp_print_value /home/simark/src/binutils-gdb/gdb/cp-valprint.c:513 #17 0x555c478f02ca in cp_print_value_fields(value*, ui_file*, int, value_print_options const*, type**, int) /home/simark/src/binutils-gdb/gdb/cp-valprint.c:161 #18 0x555c4760d45f in c_value_print_struct /home/simark/src/binutils-gdb/gdb/c-valprint.c:383 #19 0x555c4760df4c in c_value_print_inner(value*, ui_file*, int, value_print_options const*) /home/simark/src/binutils-gdb/gdb/c-valprint.c:438 #20 0x555c483ff9a7 in language_defn::value_print_inner(value*, ui_file*, int, value_print_options const*) const /home/simark/src/binutils-gdb/gdb/language.c:632 #21 0x555c49758b68 in do_val_print /home/simark/src/binutils-gdb/gdb/valprint.c:1048 #22 0x555c49759b17 in common_val_print(value*, ui_file*, int, value_print_options const*, language_defn const*) /home/simark/src/binutils-gdb/gdb/valprint.c:1151 #23 0x555c478f2fcb in cp_print_value_fields(value*, ui_file*, int, value_print_options const*, type**, int) /home/simark/src/binutils-gdb/gdb/cp-valprint.c:335 #24 0x555c4760d45f in c_value_print_struct /home/simark/src/binutils-gdb/gdb/c-valprint.c:383 #25 0x555c4760df4c in c_value_print_inner(value*, ui_file*, int, value_print_options const*) /home/simark/src/binutils-gdb/gdb/c-valprint.c:438 #26 0x555c483ff9a7 in language_defn::value_print_inner(value*, ui_file*, int, value_print_options const*) const /home/simark/src/binutils-gdb/gdb/language.c:632 #27 0x555c49758b68 in do_val_print /home/simark/src/binutils-gdb/gdb/valprint.c:1048 #28 0x555c49759b17 in common_val_print(value*, ui_file*, int, value_print_options const*, language_defn const*) /home/simark/src/binutils-gdb/gdb/valprint.c:1151 #29 0x555c4760f04c in c_value_print(value*, ui_file*, value_print_options const*) /home/simark/src/binutils-gdb/gdb/c-valprint.c:587 #30 0x555c483ff954 in language_defn::value_print(value*, ui_file*, value_print_options const*) const /home/simark/src/binutils-gdb/gdb/language.c:614 #31 0x555c49759f61 in value_print(value*, ui_file*, value_print_options const*) /home/simark/src/binutils-gdb/gdb/valprint.c:1189 #32 0x555c48950f70 in print_formatted /home/simark/src/binutils-gdb/gdb/printcmd.c:337 #33 0x555c48958eda in print_value(value*, value_print_options const&) /home/simark/src/binutils-gdb/gdb/printcmd.c:1258 #34 0x555c48959891 in print_command_1 /home/simark/src/binutils-gdb/gdb/printcmd.c:1367 #35 0x555c4895a3df in print_command /home/simark/src/binutils-gdb/gdb/printcmd.c:1458 #36 0x555c4767f974 in do_simple_func /home/simark/src/binutils-gdb/gdb/cli/cli-decode.c:97 #37 0x555c47692e25 in cmd_func(cmd_list_element*, char const*, int) /home/simark/src/binutils-gdb/gdb/cli/cli-decode.c:2475 #38 0x555c4936107e in execute_command(char const*, int) /home/simark/src/binutils-gdb/gdb/top.c:670 #39 0x555c485f1bff in catch_command_errors /home/simark/src/binutils-gdb/gdb/main.c:523 #40 0x555c485f249c in execute_cmdargs /home/simark/src/binutils-gdb/gdb/main.c:618 #41 0x555c485f6677 in captured_main_1 /home/simark/src/binutils-gdb/gdb/main.c:1317 #42 0x555c485f6c83 in captured_main /home/simark/src/binutils-gdb/gdb/main.c:1338 #43 0x555c485f6d65 in gdb_main(captured_main_args*) /home/simark/src/binutils-gdb/gdb/main.c:1363 #44 0x555c46e41ba8 in main /home/simark/src/binutils-gdb/gdb/gdb.c:32 #45 0x7f71198bcb24 in __libc_start_main (/usr/lib/libc.so.6+0x27b24) #46 0x555c46e4197d in _start (/home/simark/build/binutils-gdb-one-target/gdb/gdb+0x77f197d) 0x6020000c52af is located 1 bytes to the left of 8-byte region [0x6020000c52b0,0x6020000c52b8) allocated by thread T0 here: #0 0x7f711b2b7459 in __interceptor_calloc /build/gcc/src/gcc/libsanitizer/asan/asan_malloc_linux.cpp:154 #1 0x555c470acdc9 in xcalloc /home/simark/src/binutils-gdb/gdb/alloc.c:100 #2 0x555c49b775cd in xzalloc(unsigned long) /home/simark/src/binutils-gdb/gdbsupport/common-utils.cc:29 #3 0x555c4977bdeb in allocate_value_contents /home/simark/src/binutils-gdb/gdb/value.c:1029 #4 0x555c4977be25 in allocate_value(type*) /home/simark/src/binutils-gdb/gdb/value.c:1040 #5 0x555c4979030d in value_primitive_field(value*, long, int, type*) /home/simark/src/binutils-gdb/gdb/value.c:3092 #6 0x555c478f6280 in cp_print_value /home/simark/src/binutils-gdb/gdb/cp-valprint.c:501 #7 0x555c478f02ca in cp_print_value_fields(value*, ui_file*, int, value_print_options const*, type**, int) /home/simark/src/binutils-gdb/gdb/cp-valprint.c:161 #8 0x555c478f63b2 in cp_print_value /home/simark/src/binutils-gdb/gdb/cp-valprint.c:513 #9 0x555c478f02ca in cp_print_value_fields(value*, ui_file*, int, value_print_options const*, type**, int) /home/simark/src/binutils-gdb/gdb/cp-valprint.c:161 #10 0x555c478f63b2 in cp_print_value /home/simark/src/binutils-gdb/gdb/cp-valprint.c:513 #11 0x555c478f02ca in cp_print_value_fields(value*, ui_file*, int, value_print_options const*, type**, int) /home/simark/src/binutils-gdb/gdb/cp-valprint.c:161 #12 0x555c4760d45f in c_value_print_struct /home/simark/src/binutils-gdb/gdb/c-valprint.c:383 #13 0x555c4760df4c in c_value_print_inner(value*, ui_file*, int, value_print_options const*) /home/simark/src/binutils-gdb/gdb/c-valprint.c:438 #14 0x555c483ff9a7 in language_defn::value_print_inner(value*, ui_file*, int, value_print_options const*) const /home/simark/src/binutils-gdb/gdb/language.c:632 #15 0x555c49758b68 in do_val_print /home/simark/src/binutils-gdb/gdb/valprint.c:1048 #16 0x555c49759b17 in common_val_print(value*, ui_file*, int, value_print_options const*, language_defn const*) /home/simark/src/binutils-gdb/gdb/valprint.c:1151 #17 0x555c478f2fcb in cp_print_value_fields(value*, ui_file*, int, value_print_options const*, type**, int) /home/simark/src/binutils-gdb/gdb/cp-valprint.c:335 #18 0x555c478f63b2 in cp_print_value /home/simark/src/binutils-gdb/gdb/cp-valprint.c:513 #19 0x555c478f02ca in cp_print_value_fields(value*, ui_file*, int, value_print_options const*, type**, int) /home/simark/src/binutils-gdb/gdb/cp-valprint.c:161 #20 0x555c4760d45f in c_value_print_struct /home/simark/src/binutils-gdb/gdb/c-valprint.c:383 #21 0x555c4760df4c in c_value_print_inner(value*, ui_file*, int, value_print_options const*) /home/simark/src/binutils-gdb/gdb/c-valprint.c:438 #22 0x555c483ff9a7 in language_defn::value_print_inner(value*, ui_file*, int, value_print_options const*) const /home/simark/src/binutils-gdb/gdb/language.c:632 #23 0x555c49758b68 in do_val_print /home/simark/src/binutils-gdb/gdb/valprint.c:1048 #24 0x555c49759b17 in common_val_print(value*, ui_file*, int, value_print_options const*, language_defn const*) /home/simark/src/binutils-gdb/gdb/valprint.c:1151 #25 0x555c478f2fcb in cp_print_value_fields(value*, ui_file*, int, value_print_options const*, type**, int) /home/simark/src/binutils-gdb/gdb/cp-valprint.c:335 #26 0x555c4760d45f in c_value_print_struct /home/simark/src/binutils-gdb/gdb/c-valprint.c:383 #27 0x555c4760df4c in c_value_print_inner(value*, ui_file*, int, value_print_options const*) /home/simark/src/binutils-gdb/gdb/c-valprint.c:438 #28 0x555c483ff9a7 in language_defn::value_print_inner(value*, ui_file*, int, value_print_options const*) const /home/simark/src/binutils-gdb/gdb/language.c:632 #29 0x555c49758b68 in do_val_print /home/simark/src/binutils-gdb/gdb/valprint.c:1048 Since there are some binaries with this in the wild, I think it would be useful for GDB to work around this. I did the obvious simple thing, if the DW_AT_data_member_location's value is -1, replace it with 0. I added a producer check to only apply this fixup for GCC 11. The idea is that if some other compiler ever uses a DW_AT_data_member_location value of -1 by mistake, we don't know (before analyzing the bug at least) if they did mean 0 or some other value. So I wouldn't want to apply the fixup in that case. Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=28063 Change-Id: Ieef3459b0b9bbce8bdad838ba83b4b64e7269d42
mbitsnbites
pushed a commit
that referenced
this issue
Mar 24, 2022
Starting with commit commit 1da5d0e Date: Tue Jan 4 08:02:24 2022 -0700 Change how Python architecture and language are handled we see a failure in gdb.threads/killed-outside.exp: ... Executing on target: kill -9 16622 (timeout = 300) builtin_spawn -ignore SIGHUP kill -9 16622 continue Continuing. Couldn't get registers: No such process. (gdb) [Thread 0x7ffff77c2700 (LWP 16626) exited] Program terminated with signal SIGKILL, Killed. The program no longer exists. FAIL: gdb.threads/killed-outside.exp: prompt after first continue (timeout) This is not a regression but a failure due to a change in GDB's output. Prior to the aforementioned commit, GDB has been printing the "Couldn't get registers: No such process." message twice. The second one came from (top-gdb) bt #0 amd64_linux_nat_target::fetch_registers (this=0x555557f31440 <the_amd64_linux_nat_target>, regcache=0x555558805ce0, regnum=16) at /gdb-up/gdb/amd64-linux-nat.c:225 #1 0x000055555640ac5f in target_ops::fetch_registers (this=0x555557d636d0 <the_thread_db_target>, arg0=0x555558805ce0, arg1=16) at /gdb-up/gdb/target-delegates.c:502 #2 0x000055555641a647 in target_fetch_registers (regcache=0x555558805ce0, regno=16) at /gdb-up/gdb/target.c:3945 #3 0x0000555556278e68 in regcache::raw_update (this=0x555558805ce0, regnum=16) at /gdb-up/gdb/regcache.c:587 #4 0x0000555556278f14 in readable_regcache::raw_read (this=0x555558805ce0, regnum=16, buf=0x555558881950 "") at /gdb-up/gdb/regcache.c:601 #5 0x00005555562792aa in readable_regcache::cooked_read (this=0x555558805ce0, regnum=16, buf=0x555558881950 "") at /gdb-up/gdb/regcache.c:690 #6 0x000055555627965e in readable_regcache::cooked_read_value (this=0x555558805ce0, regnum=16) at /gdb-up/gdb/regcache.c:748 #7 0x0000555556352a37 in sentinel_frame_prev_register (this_frame=0x555558181090, this_prologue_cache=0x5555581810a8, regnum=16) at /gdb-up/gdb/sentinel-frame.c:53 #8 0x0000555555fa4773 in frame_unwind_register_value (next_frame=0x555558181090, regnum=16) at /gdb-up/gdb/frame.c:1235 #9 0x0000555555fa420d in frame_register_unwind (next_frame=0x555558181090, regnum=16, optimizedp=0x7fffffffd570, unavailablep=0x7fffffffd574, lvalp=0x7fffffffd57c, addrp=0x7fffffffd580, realnump=0x7fffffffd578, bufferp=0x7fffffffd5b0 "") at /gdb-up/gdb/frame.c:1143 #10 0x0000555555fa455f in frame_unwind_register (next_frame=0x555558181090, regnum=16, buf=0x7fffffffd5b0 "") at /gdb-up/gdb/frame.c:1199 #11 0x00005555560178e2 in i386_unwind_pc (gdbarch=0x5555587c4a70, next_frame=0x555558181090) at /gdb-up/gdb/i386-tdep.c:1972 #12 0x0000555555cd2b9d in gdbarch_unwind_pc (gdbarch=0x5555587c4a70, next_frame=0x555558181090) at /gdb-up/gdb/gdbarch.c:3007 #13 0x0000555555fa3a5b in frame_unwind_pc (this_frame=0x555558181090) at /gdb-up/gdb/frame.c:948 #14 0x0000555555fa7621 in get_frame_pc (frame=0x555558181160) at /gdb-up/gdb/frame.c:2572 #15 0x0000555555fa7706 in get_frame_address_in_block (this_frame=0x555558181160) at /gdb-up/gdb/frame.c:2602 #16 0x0000555555fa77d0 in get_frame_address_in_block_if_available (this_frame=0x555558181160, pc=0x7fffffffd708) at /gdb-up/gdb/frame.c:2665 #17 0x0000555555fa5f8d in select_frame (fi=0x555558181160) at /gdb-up/gdb/frame.c:1890 #18 0x0000555555fa5bab in lookup_selected_frame (a_frame_id=..., frame_level=-1) at /gdb-up/gdb/frame.c:1720 #19 0x0000555555fa5e47 in get_selected_frame (message=0x0) at /gdb-up/gdb/frame.c:1810 #20 0x0000555555cc9c6e in get_current_arch () at /gdb-up/gdb/arch-utils.c:848 #21 0x000055555625b239 in gdbpy_before_prompt_hook (extlang=0x555557451f20 <extension_language_python>, current_gdb_prompt=0x555557f4d890 <top_prompt+16> "(gdb) ") at /gdb-up/gdb/python/python.c:1063 #22 0x0000555555f7cfbb in ext_lang_before_prompt (current_gdb_prompt=0x555557f4d890 <top_prompt+16> "(gdb) ") at /gdb-up/gdb/extension.c:922 #23 0x0000555555f7d442 in std::_Function_handler<void (char const*), void (*)(char const*)>::_M_invoke(std::_Any_data const&, char const*&&) (__functor=..., __args#0=@0x7fffffffd900: 0x555557f4d890 <top_prompt+16> "(gdb) ") at /usr/include/c++/7/bits/std_function.h:316 #24 0x0000555555f752dd in std::function<void (char const*)>::operator()(char const*) const (this=0x55555817d838, __args#0=0x555557f4d890 <top_prompt+16> "(gdb) ") at /usr/include/c++/7/bits/std_function.h:706 #25 0x0000555555f75100 in gdb::observers::observable<char const*>::notify (this=0x555557f49060 <gdb::observers::before_prompt>, args#0=0x555557f4d890 <top_prompt+16> "(gdb) ") at /gdb-up/gdb/../gdbsupport/observable.h:150 #26 0x0000555555f736dc in top_level_prompt () at /gdb-up/gdb/event-top.c:444 #27 0x0000555555f735ba in display_gdb_prompt (new_prompt=0x0) at /gdb-up/gdb/event-top.c:411 #28 0x00005555564611a7 in tui_on_command_error () at /gdb-up/gdb/tui/tui-interp.c:205 #29 0x0000555555c2173f in std::_Function_handler<void (), void (*)()>::_M_invoke(std::_Any_data const&) (__functor=...) at /usr/include/c++/7/bits/std_function.h:316 #30 0x0000555555e10c20 in std::function<void ()>::operator()() const (this=0x5555580f9028) at /usr/include/c++/7/bits/std_function.h:706 #31 0x0000555555e10973 in gdb::observers::observable<>::notify() const (this=0x555557f48d20 <gdb::observers::command_error>) at /gdb-up/gdb/../gdbsupport/observable.h:150 #32 0x00005555560e9b3f in start_event_loop () at /gdb-up/gdb/main.c:438 #33 0x00005555560e9bcc in captured_command_loop () at /gdb-up/gdb/main.c:481 #34 0x00005555560eb616 in captured_main (data=0x7fffffffddd0) at /gdb-up/gdb/main.c:1348 #35 0x00005555560eb67c in gdb_main (args=0x7fffffffddd0) at /gdb-up/gdb/main.c:1363 #36 0x0000555555c1b6b3 in main (argc=12, argv=0x7fffffffded8) at /gdb-up/gdb/gdb.c:32 Commit 1da5d0e eliminated the call to 'get_current_arch' in 'gdbpy_before_prompt_hook'. Hence, the second instance of "Couldn't get registers: No such process." does not appear anymore. Fix the failure by updating the regular expression in the test.
mbitsnbites
pushed a commit
that referenced
this issue
Mar 24, 2022
…ync." Commit 14b3360 ("do_target_wait_1: Clear TARGET_WNOHANG if the target isn't async.") broke some multi-target tests, such as gdb.multi/multi-target-info-inferiors.exp. The symptom is that execution just hangs at some point. What happens is: 1. One remote inferior is started, and now sits stopped at a breakpoint. It is not "async" at this point (but it "can async"). 2. We run a native inferior, the event loop gets woken up by the native target's fd. 3. In do_target_wait, we randomly choose an inferior to call target_wait on first, it happens to be the remote inferior. 4. Because the target is currently not "async", we clear TARGET_WNOHANG, resulting in synchronous wait. We therefore block here: #0 0x00007fe9540dbb4d in select () from /usr/lib/libc.so.6 #1 0x000055fc7e821da7 in gdb_select (n=15, readfds=0x7ffdb77c1fb0, writefds=0x0, exceptfds=0x7ffdb77c2050, timeout=0x7ffdb77c1f90) at /home/simark/src/binutils-gdb/gdb/posix-hdep.c:31 #2 0x000055fc7ddef905 in interruptible_select (n=15, readfds=0x7ffdb77c1fb0, writefds=0x0, exceptfds=0x7ffdb77c2050, timeout=0x7ffdb77c1f90) at /home/simark/src/binutils-gdb/gdb/event-top.c:1134 #3 0x000055fc7eda58e4 in ser_base_wait_for (scb=0x6250002e4100, timeout=1) at /home/simark/src/binutils-gdb/gdb/ser-base.c:240 #4 0x000055fc7eda66ba in do_ser_base_readchar (scb=0x6250002e4100, timeout=-1) at /home/simark/src/binutils-gdb/gdb/ser-base.c:365 #5 0x000055fc7eda6ff6 in generic_readchar (scb=0x6250002e4100, timeout=-1, do_readchar=0x55fc7eda663c <do_ser_base_readchar(serial*, int)>) at /home/simark/src/binutils-gdb/gdb/ser-base.c:444 #6 0x000055fc7eda718a in ser_base_readchar (scb=0x6250002e4100, timeout=-1) at /home/simark/src/binutils-gdb/gdb/ser-base.c:471 #7 0x000055fc7edb1ecd in serial_readchar (scb=0x6250002e4100, timeout=-1) at /home/simark/src/binutils-gdb/gdb/serial.c:393 #8 0x000055fc7ec48b8f in remote_target::readchar (this=0x617000038780, timeout=-1) at /home/simark/src/binutils-gdb/gdb/remote.c:9446 #9 0x000055fc7ec4da82 in remote_target::getpkt_or_notif_sane_1 (this=0x617000038780, buf=0x6170000387a8, forever=1, expecting_notif=1, is_notif=0x7ffdb77c24f0) at /home/simark/src/binutils-gdb/gdb/remote.c:9928 #10 0x000055fc7ec4f045 in remote_target::getpkt_or_notif_sane (this=0x617000038780, buf=0x6170000387a8, forever=1, is_notif=0x7ffdb77c24f0) at /home/simark/src/binutils-gdb/gdb/remote.c:10037 #11 0x000055fc7ec354d4 in remote_target::wait_ns (this=0x617000038780, ptid=..., status=0x7ffdb77c33c8, options=...) at /home/simark/src/binutils-gdb/gdb/remote.c:8147 #12 0x000055fc7ec38aa1 in remote_target::wait (this=0x617000038780, ptid=..., status=0x7ffdb77c33c8, options=...) at /home/simark/src/binutils-gdb/gdb/remote.c:8337 #13 0x000055fc7f1409ce in target_wait (ptid=..., status=0x7ffdb77c33c8, options=...) at /home/simark/src/binutils-gdb/gdb/target.c:2612 #14 0x000055fc7e19da98 in do_target_wait_1 (inf=0x617000038080, ptid=..., status=0x7ffdb77c33c8, options=...) at /home/simark/src/binutils-gdb/gdb/infrun.c:3636 #15 0x000055fc7e19e26b in operator() (__closure=0x7ffdb77c2f90, inf=0x617000038080) at /home/simark/src/binutils-gdb/gdb/infrun.c:3697 #16 0x000055fc7e19f0c4 in do_target_wait (ecs=0x7ffdb77c33a0, options=...) at /home/simark/src/binutils-gdb/gdb/infrun.c:3716 #17 0x000055fc7e1a31f7 in fetch_inferior_event () at /home/simark/src/binutils-gdb/gdb/infrun.c:4061 Before the aforementioned commit, we would not have cleared TARGET_WNOHANG, the remote target's wait would have returned nothing, and we would have consumed the native target's event. After applying this revert, the testsuite state looks as good as before for me on Ubuntu 20.04 amd64. Change-Id: Ic17a1642935cabcc16c25cb6899d52e12c2f5c3f
mbitsnbites
pushed a commit
that referenced
this issue
May 23, 2022
While working on a different patch, I triggered an assertion from the initialize_current_architecture code, specifically from one of the *_gdbarch_init functions in a *-tdep.c file. This exposes a couple of issues with GDB. This is easy enough to reproduce by adding 'gdb_assert (false)' into a suitable function. For example, I added a line into i386_gdbarch_init and can see the following issue. I start GDB and immediately hit the assert, the output is as you'd expect, except for the very last line: $ ./gdb/gdb --data-directory ./gdb/data-directory/ ../../src.dev-1/gdb/i386-tdep.c:8455: internal-error: i386_gdbarch_init: Assertion `false' failed. A problem internal to GDB has been detected, further debugging may prove unreliable. ----- Backtrace ----- ... snip ... --------------------- ../../src.dev-1/gdb/i386-tdep.c:8455: internal-error: i386_gdbarch_init: Assertion `false' failed. A problem internal to GDB has been detected, further debugging may prove unreliable. Quit this debugging session? (y or n) ../../src.dev-1/gdb/ser-event.c:212:16: runtime error: member access within null pointer of type 'struct serial' Something goes wrong when we try to query the user. Note, I configured GDB with --enable-ubsan, I suspect that without this the above "error" would actually just be a crash. The backtrace from ser-event.c:212 looks like this: (gdb) bt 10 #0 serial_event_clear (event=0x675c020) at ../../src/gdb/ser-event.c:212 #1 0x0000000000769456 in invoke_async_signal_handlers () at ../../src/gdb/async-event.c:211 #2 0x000000000295049b in gdb_do_one_event () at ../../src/gdbsupport/event-loop.cc:194 #3 0x0000000001f015f8 in gdb_readline_wrapper ( prompt=0x67135c0 "../../src/gdb/i386-tdep.c:8455: internal-error: i386_gdbarch_init: Assertion `false' failed.\nA problem internal to GDB has been detected,\nfurther debugging may prove unreliable.\nQuit this debugg"...) at ../../src/gdb/top.c:1141 #4 0x0000000002118b64 in defaulted_query(const char *, char, typedef __va_list_tag __va_list_tag *) ( ctlstr=0x2e4eb68 "%s\nQuit this debugging session? ", defchar=0 '\000', args=0x7fffffffa6e0) at ../../src/gdb/utils.c:934 #5 0x0000000002118f72 in query (ctlstr=0x2e4eb68 "%s\nQuit this debugging session? ") at ../../src/gdb/utils.c:1026 #6 0x00000000021170f6 in internal_vproblem(internal_problem *, const char *, int, const char *, typedef __va_list_tag __va_list_tag *) (problem=0x6107bc0 <internal_error_problem>, file=0x2b976c8 "../../src/gdb/i386-tdep.c", line=8455, fmt=0x2b96d7f "%s: Assertion `%s' failed.", ap=0x7fffffffa8e8) at ../../src/gdb/utils.c:417 #7 0x00000000021175a0 in internal_verror (file=0x2b976c8 "../../src/gdb/i386-tdep.c", line=8455, fmt=0x2b96d7f "%s: Assertion `%s' failed.", ap=0x7fffffffa8e8) at ../../src/gdb/utils.c:485 #8 0x00000000029503b3 in internal_error (file=0x2b976c8 "../../src/gdb/i386-tdep.c", line=8455, fmt=0x2b96d7f "%s: Assertion `%s' failed.") at ../../src/gdbsupport/errors.cc:55 #9 0x000000000122d5b6 in i386_gdbarch_init (info=..., arches=0x0) at ../../src/gdb/i386-tdep.c:8455 (More stack frames follow...) It turns out that the problem is that the async event handler mechanism has been invoked, but this has not yet been initialized. If we look at gdb_init (in gdb/top.c) we can indeed see the call to gdb_init_signals is after the call to initialize_current_architecture. If I reorder the calls, moving gdb_init_signals earlier, then the initial error is resolved, however, things are still broken. I now see the same "Quit this debugging session? (y or n)" prompt, but when I provide an answer and press return GDB immediately crashes. So what's going on now? The next problem is that the call_readline field within the current_ui structure is not initialized, and this callback is invoked to process the reply I entered. The problem is that call_readline is setup as a result of calling set_top_level_interpreter, which is called from captured_main_1. Unfortunately, set_top_level_interpreter is called after gdb_init is called. I wondered how to solve this problem for a while, however, I don't know if there's an easy "just reorder some lines" solution here. Looking through captured_main_1 there seems to be a bunch of dependencies between printing various things, parsing config files, and setting up the interpreter. I'm sure there is a solution hiding in there somewhere.... I'm just not sure I want to spend any longer looking for it. So. I propose a simpler solution, more of a hack/work-around. In utils.c we already have a function filtered_printing_initialized, this is checked in a few places within internal_vproblem. In some of these cases the call gates whether or not GDB will query the user. My proposal is to add a new readline_initialized function, which checks if the current_ui has had readline initialized yet. If this is not the case then we should not attempt to query the user. After this change GDB prints the error message, the backtrace, and then aborts (including dumping core). This actually seems pretty sane as, if GDB has not yet made it through the initialization then it doesn't make much sense to allow the user to say "no, I don't want to quit the debug session" (I think).
mbitsnbites
pushed a commit
that referenced
this issue
May 23, 2022
The variable right_lib_flags is not being set correctly to define RIGHT. The value RIGHT is needed to force the address of the library functions lib1_func3 and lib2_func4 to occur at different address in the wrong and right libraries. With RIGHT defined correctly, functions lib1_func3 and lib2_func4 occur at different addresses the test runs correctly on Powerpc. The test needs the lib2 addresses to be different in the right and wrong cases. That is the point of introducing function lib2_spacer with the ifdef RIGHT compiler directive. On Intel, the ARRAY_SIZE of 1 versus 8192 is sufficient to get the dynamic linker to move the addresses of the library. You can also get the same effect on PowerPC but you must use a value much larger than 8192. The key thing is that the test was not properly setting RIGHT to defined to get the lib2_spacer function on Intel and Powerpc. Without the patch, we have the Intel backtrace for the bad libraries: backtrace #0 break_here () at /home/ ... /gdb/testsuite/gdb.base/solib-search.c:30 #1 0x00007ffff7fae156 in ?? () #2 0x00007fffffffc150 in ?? () #3 0x00007ffff7fbb156 in ?? () #4 0x00007fffffffc160 in ?? () #5 0x00007ffff7fae146 in ?? () #6 0x00007fffffffc170 in ?? () #7 0x00007ffff7fbb146 in ?? () #8 0x00007fffffffc180 in ?? () #9 0x0000555555555156 in main () at /home/ ... /binutils-gdb/gdb/testsuite/gdb.base/solib-search.c:23 Backtrace stopped: previous frame inner to this frame (corrupt stack?) (gdb) PASS: gdb.base/solib-search.exp: backtrace (with wrong libs) (data collection) The backtrace on Intel with the good libraries is: backtrace #0 break_here () at /.../binutils-gdb/gdb/testsuite/gdb.base/solib-search.c:30 #1 0x00007ffff7fae156 in lib2_func4 () at /.../binutils-gdb/gdb/testsuite/gdb.base/solib-search-lib2.c:49 #2 0x00007ffff7fbb156 in lib1_func3 () at /.../gdb.base/solib-search-lib1.c:49 #3 0x00007ffff7fae146 in lib2_func2 () at /.../testsuite/gdb.base/solib-search-lib2.c:30 #4 0x00007ffff7fbb146 in lib1_func1 () at /.../gdb.base/solib-search-lib1.c:30 #5 0x0000555555555156 in main () at /...solib-search.c:23 (gdb) PASS: gdb.base/solib-search.exp: backtrace (with right libs) (data collection) PASS: gdb.base/solib-search.exp: backtrace (with right libs) In one case the backtrace is correct and the other it is wrong on Intel. This is due to the fact that the ARRAY_SIZE caused the dynamic linker to move the library function addresses around. I believe it has to do with the default size of the data and code sections used by the dynamic linker. So without the patch the backtrace on PowerPC looks like: backtrace #0 break_here () at /.../solib-search.c:30 #1 0x00007ffff7f007f4 in lib2_func4 () at /.../solib-search-lib2.c:49 #2 0x00007ffff7f307f4 in lib1_func3 () at /.../solib-search-lib1.c:49 #3 0x00007ffff7f007ac in lib2_func2 () at /.../solib-search-lib2.c:30 #4 0x00007ffff7f307ac in lib1_func1 () at /.../solib-search-lib1.c:30 #5 0x000000001000074c in main () at /.../solib-search.c:23 for both the good and bad libraries. The patch fixes defining RIGHT in solib-search-lib1.c and solib-search- lib2.c. Note, without the patch the lib1_spacer and lib2_spacer functions do not show up in the object dump of the Intel or Powerpc libraries as it should. The patch fixes that by making sure RIGHT gets defined. Now with the patch the backtrace for the bad library on PowerPC looks like: backtrace #0 break_here () at /.../solib-search.c:30 #1 0x00007ffff7f0083c in __glink_PLTresolve () from /.../solib-search-lib2.so Backtrace stopped: frame did not save the PC And the backtrace for the good libraries on PowerPC looks like: backtrace #0 break_here () at /.../solib-search.c:30 #1 0x00007ffff7f0083c in lib2_func4 () at /.../solib-search-lib2.c:49 #2 0x00007ffff7f3083c in lib1_func3 () at /.../solib-search-lib1.c:49 #3 0x00007ffff7f007cc in lib2_func2 () at /.../solib-search-lib2.c:30 #4 0x00007ffff7f307cc in lib1_func1 () at /.../solib-search-lib1.c:30 #5 0x000000001000074c in main () at /.../solib-search.c:23 (gdb) PASS: gdb.base/solib-search.exp: backtrace (with right libs) (data collection) PASS: gdb.base/solib-search.exp: backtrace (with right libs) The issue then is on Power where the ARRAY_SIZE of 1 versus 8192 is not sufficient to cause the dymanic linker to allocate the libraries at different addresses. I don't claim to understand the specifics of how the dynamic linker works and what the default size is for the data and code sections are. My guess is by default PowerPC allocates a larger data size by default, which is large enough to hold array[8192]. The default size of the data section allocated by the dynamic linker on Intel is not large enough to hold array[8192] thus causing the code section on Intel to have to move when the large array is defined. Note on PowerPC, if you make ARRAY_SIZE big enough, then you will cause the library addresses to occur at different addresses as the larger data section forces the code section to a different address. That was actually my original fix for the program until I spoke with Doug Evans who originally wrote the test. Doug noticed that RIGHT was not getting defined as he originally intended in the test. With the patch to fix the definition of RIGHT, PowerPC has a bad and a good backtrace because the address of lib1_func3 and lib2_func4 both move because lib1_spacer and lib2_spacer are now defined before lib1_func3 and lib2_func4. Without the patch, the lib1_spacer and lib2_spacer function doesn't show up in the binary for the correct or incorrect library on Intel or PowerPC. With the patch, RIGHT gets defined as originally intended for the test on both architectures and lib1_spacer and lib2_spacer function show up in the binaries on both architectures changing the other function addresses as intended thus causing the test work as intended on PowerPC.
mbitsnbites
pushed a commit
that referenced
this issue
May 23, 2022
… failing to attach Running $ ../gdbserver/gdbserver --once --attach :1234 539436 with ASan while /proc/sys/kernel/yama/ptrace_scope is set to 1 (prevents attaching) shows that we fail to free some platform-specific objects tied to the process_info (process_info_private and arch_process_info): Direct leak of 32 byte(s) in 1 object(s) allocated from: #0 0x7f6b558b3fb9 in __interceptor_calloc /usr/src/debug/gcc/libsanitizer/asan/asan_malloc_linux.cpp:154 #1 0x562eaf15d04a in xcalloc /home/simark/src/binutils-gdb/gdbserver/../gdb/alloc.c:100 #2 0x562eaf251548 in xcnew<process_info_private> /home/simark/src/binutils-gdb/gdbserver/../gdbsupport/poison.h:122 #3 0x562eaf22810c in linux_process_target::add_linux_process_no_mem_file(int, int) /home/simark/src/binutils-gdb/gdbserver/linux-low.cc:426 #4 0x562eaf22d33f in linux_process_target::attach(unsigned long) /home/simark/src/binutils-gdb/gdbserver/linux-low.cc:1132 #5 0x562eaf1a7222 in attach_inferior /home/simark/src/binutils-gdb/gdbserver/server.cc:308 #6 0x562eaf1c1016 in captured_main /home/simark/src/binutils-gdb/gdbserver/server.cc:3949 #7 0x562eaf1c1d60 in main /home/simark/src/binutils-gdb/gdbserver/server.cc:4084 #8 0x7f6b552f630f in __libc_start_call_main (/usr/lib/libc.so.6+0x2d30f) Indirect leak of 56 byte(s) in 1 object(s) allocated from: #0 0x7f6b558b3fb9 in __interceptor_calloc /usr/src/debug/gcc/libsanitizer/asan/asan_malloc_linux.cpp:154 #1 0x562eaf15d04a in xcalloc /home/simark/src/binutils-gdb/gdbserver/../gdb/alloc.c:100 #2 0x562eaf2a0d79 in xcnew<arch_process_info> /home/simark/src/binutils-gdb/gdbserver/../gdbsupport/poison.h:122 #3 0x562eaf295e2c in x86_target::low_new_process() /home/simark/src/binutils-gdb/gdbserver/linux-x86-low.cc:723 #4 0x562eaf22819b in linux_process_target::add_linux_process_no_mem_file(int, int) /home/simark/src/binutils-gdb/gdbserver/linux-low.cc:428 #5 0x562eaf22d33f in linux_process_target::attach(unsigned long) /home/simark/src/binutils-gdb/gdbserver/linux-low.cc:1132 #6 0x562eaf1a7222 in attach_inferior /home/simark/src/binutils-gdb/gdbserver/server.cc:308 #7 0x562eaf1c1016 in captured_main /home/simark/src/binutils-gdb/gdbserver/server.cc:3949 #8 0x562eaf1c1d60 in main /home/simark/src/binutils-gdb/gdbserver/server.cc:4084 #9 0x7f6b552f630f in __libc_start_call_main (/usr/lib/libc.so.6+0x2d30f) Those objects are deleted by linux_process_target::mourn, but that is not called if we fail to attach, we only call remove_process. I initially fixed this by making linux_process_target::attach call linux_process_target::mourn on failure (before calling error). But this isn't done anywhere else (including in GDB) so it would just be confusing to do things differently here. Instead, add a linux_process_target::remove_linux_process helper method (which calls remove_process), and call that instead of remove_process in the Linux target. Move the free-ing of the extra data from the mourn method to that new method. Change-Id: I277059a69d5f08087a7f3ef0b8f1792a1fcf7a85
mbitsnbites
pushed a commit
that referenced
this issue
May 23, 2022
Commit 152a174 ("gdb: prune inferiors at end of fetch_inferior_event, fix intermittent failure of gdb.threads/fork-plus-threads.exp") broke some tests with the native-gdbserver board, such as: (gdb) PASS: gdb.base/step-over-syscall.exp: detach-on-fork=off: follow-fork=child: break cond on target : vfork: break marker continue^M Continuing.^M terminate called after throwing an instance of 'gdb_exception_error'^M I can manually reproduce the issue by running (just the commands that the test does as a one liner): $ ./gdb -q --data-directory=data-directory \ testsuite/outputs/gdb.base/step-over-syscall/step-over-vfork \ -ex "tar rem | ../gdbserver/gdbserver - testsuite/outputs/gdb.base/step-over-syscall/step-over-vfork" \ -ex "b main" \ -ex c \ -ex "d 1" \ -ex "set displaced-stepping off" \ -ex "b *0x7ffff7d7ac5a if main == 0" \ -ex "set detach-on-fork off" \ -ex "set follow-fork-mode child" \ -ex c \ -ex "inferior 1" \ -ex "b marker" \ -ex c ... where 0x7ffff7d7ac5a is the exact address of the vfork syscall (which can be found by looking at gdb.log). The important part of the above is that a vfork syscall creates inferior 2, then inferior 2 executes until exit, then we switch back to inferior 1 and try to resume it. The uncaught exception happens here: #4 0x00005596969d81a9 in error (fmt=0x559692da9e40 "Cannot execute this command while the target is running.\nUse the \"interrupt\" command to stop the target\nand then try again.") at /home/simark/src/binutils-gdb/gdbsupport/errors.cc:43 #5 0x0000559695af6f66 in remote_target::putpkt_binary (this=0x617000038080, buf=0x559692da4380 "qSymbol::", cnt=9) at /home/simark/src/binutils-gdb/gdb/remote.c:9560 #6 0x0000559695af6aaf in remote_target::putpkt (this=0x617000038080, buf=0x559692da4380 "qSymbol::") at /home/simark/src/binutils-gdb/gdb/remote.c:9518 #7 0x0000559695ab50dc in remote_target::remote_check_symbols (this=0x617000038080) at /home/simark/src/binutils-gdb/gdb/remote.c:5141 #8 0x0000559695b3cccf in remote_new_objfile (objfile=0x0) at /home/simark/src/binutils-gdb/gdb/remote.c:14600 #9 0x0000559693bc52a9 in std::__invoke_impl<void, void (*&)(objfile*), objfile*> (__f=@0x61b0000167f8: 0x559695b3cb1d <remote_new_objfile(objfile*)>) at /usr/include/c++/11.2.0/bits/invoke.h:61 #10 0x0000559693bb2848 in std::__invoke_r<void, void (*&)(objfile*), objfile*> (__fn=@0x61b0000167f8: 0x559695b3cb1d <remote_new_objfile(objfile*)>) at /usr/include/c++/11.2.0/bits/invoke.h:111 #11 0x0000559693b8dddf in std::_Function_handler<void (objfile*), void (*)(objfile*)>::_M_invoke(std::_Any_data const&, objfile*&&) (__functor=..., __args#0=@0x7ffe0bae0590: 0x0) at /usr/include/c++/11.2.0/bits/std_function.h:291 #12 0x00005596956374b2 in std::function<void (objfile*)>::operator()(objfile*) const (this=0x61b0000167f8, __args#0=0x0) at /usr/include/c++/11.2.0/bits/std_function.h:560 #13 0x0000559695633c64 in gdb::observers::observable<objfile*>::notify (this=0x55969ef5c480 <gdb::observers::new_objfile>, args#0=0x0) at /home/simark/src/binutils-gdb/gdb/../gdbsupport/observable.h:150 #14 0x0000559695df6cc2 in clear_symtab_users (add_flags=...) at /home/simark/src/binutils-gdb/gdb/symfile.c:2873 #15 0x000055969574c263 in program_space::~program_space (this=0x6120000c8a40, __in_chrg=<optimized out>) at /home/simark/src/binutils-gdb/gdb/progspace.c:154 #16 0x0000559694fc086b in delete_inferior (inf=0x61700003bf80) at /home/simark/src/binutils-gdb/gdb/inferior.c:205 #17 0x0000559694fc341f in prune_inferiors () at /home/simark/src/binutils-gdb/gdb/inferior.c:390 #18 0x0000559695017ada in fetch_inferior_event () at /home/simark/src/binutils-gdb/gdb/infrun.c:4293 #19 0x0000559694f629e6 in inferior_event_handler (event_type=INF_REG_EVENT) at /home/simark/src/binutils-gdb/gdb/inf-loop.c:41 #20 0x0000559695b3b0e3 in remote_async_serial_handler (scb=0x6250001ef100, context=0x6170000380a8) at /home/simark/src/binutils-gdb/gdb/remote.c:14466 #21 0x0000559695c59eb7 in run_async_handler_and_reschedule (scb=0x6250001ef100) at /home/simark/src/binutils-gdb/gdb/ser-base.c:138 #22 0x0000559695c5a42a in fd_event (error=0, context=0x6250001ef100) at /home/simark/src/binutils-gdb/gdb/ser-base.c:189 #23 0x00005596969d9ebf in handle_file_event (file_ptr=0x60700005af40, ready_mask=1) at /home/simark/src/binutils-gdb/gdbsupport/event-loop.cc:574 #24 0x00005596969da7fa in gdb_wait_for_event (block=0) at /home/simark/src/binutils-gdb/gdbsupport/event-loop.cc:700 #25 0x00005596969d8539 in gdb_do_one_event () at /home/simark/src/binutils-gdb/gdbsupport/event-loop.cc:212 If I enable "set debug infrun" just before the last continue, we see: (gdb) continue Continuing. [infrun] clear_proceed_status_thread: 965604.965604.0 [infrun] proceed: enter [infrun] proceed: addr=0xffffffffffffffff, signal=GDB_SIGNAL_DEFAULT [infrun] scoped_disable_commit_resumed: reason=proceeding [infrun] start_step_over: enter [infrun] start_step_over: stealing global queue of threads to step, length = 0 [infrun] operator(): step-over queue now empty [infrun] start_step_over: exit [infrun] resume_1: step=0, signal=GDB_SIGNAL_0, trap_expected=0, current thread [965604.965604.0] at 0x7ffff7d7ac5c [infrun] do_target_resume: resume_ptid=965604.0.0, step=0, sig=GDB_SIGNAL_0 [infrun] prepare_to_wait: prepare_to_wait [infrun] reset: reason=proceeding [infrun] maybe_set_commit_resumed_all_targets: enabling commit-resumed for target remote [infrun] maybe_call_commit_resumed_all_targets: calling commit_resumed for target remote [infrun] proceed: exit [infrun] fetch_inferior_event: enter [infrun] scoped_disable_commit_resumed: reason=handling event [infrun] do_target_wait: Found 2 inferiors, starting at #1 [infrun] random_pending_event_thread: None found. [infrun] print_target_wait_results: target_wait (-1.0.0 [process -1], status) = [infrun] print_target_wait_results: 965604.965604.0 [Thread 965604.965604], [infrun] print_target_wait_results: status->kind = VFORK_DONE [infrun] handle_inferior_event: status->kind = VFORK_DONE [infrun] context_switch: Switching context from 0.0.0 to 965604.965604.0 [infrun] handle_vfork_done: not waiting for a vfork-done event [infrun] start_step_over: enter [infrun] start_step_over: stealing global queue of threads to step, length = 0 [infrun] operator(): step-over queue now empty [infrun] start_step_over: exit [infrun] resume_1: step=0, signal=GDB_SIGNAL_0, trap_expected=0, current thread [965604.965604.0] at 0x7ffff7d7ac5c [infrun] do_target_resume: resume_ptid=965604.0.0, step=0, sig=GDB_SIGNAL_0 [infrun] prepare_to_wait: prepare_to_wait [infrun] reset: reason=handling event [infrun] maybe_set_commit_resumed_all_targets: enabling commit-resumed for target remote [infrun] maybe_call_commit_resumed_all_targets: calling commit_resumed for target remote terminate called after throwing an instance of 'gdb_exception_error' What happens is: - After doing the "continue" on inferior 1, the remote target gives us a VFORK_DONE event. The core ignores it and resumes inferior 1. - Since prune_inferiors is now called after each handled event, in fetch_inferior_event, it is called after we handled that VFORK_DONE event and resumed inferior 1. - Inferior 2 is pruned, which (see backtrace above) causes its program space to be deleted, which clears the symtabs for that program space, which calls the new_objfile observable and remote_new_objfile observer (with a nullptr objfile, to indicate that the previously loaded symbols have been discarded), which calls remote_check_symbols. remote_check_symbols is the function that sends the qSymbol packet, to let the remote side ask for symbol addresses. The problem is that the remote target is working in all-stop / sync mode and is currently resumed. It has sent a vCont packet to resume the target and is waiting for a stop reply. It can't send any packets in the mean time. That causes the exception to be thrown. This wasn't a problem before, when prune_inferiors was called in normal_stop, because it was always called at a time the target was not resumed. An important observation here is that the new_objfile observable is invoked for a change in inferior 2's program space (inferior 2's program space is the current program space). Inferior 2 isn't bound to any process on the remote side (it has exited, that's why it's being pruned). It doesn't make sense to try to send a qSymbol packet for a process that doesn't exist on the remote side. remote_check_symbols actually attempts to avoid that: /* The remote side has no concept of inferiors that aren't running yet, it only knows about running processes. If we're connected but our current inferior is not running, we should not invite the remote target to request symbol lookups related to its (unrelated) current process. */ if (!target_has_execution ()) return; The problem here is that while inferior 2's program space is the current program space, inferior 1 is the current inferior. So the check above passes, since inferior has execution. We therefore try to send a qSymbol packet for inferior 1 in reaction to a change in inferior 2's program space, that's wrong. This exposes a conceptual flaw in remote_new_objfile. The "new_objfile" event concerns a specific program space, which can concern multiple inferiors, as inferiors can share a program space. We shouldn't consider the current inferior at all, but instead all inferiors bound to the affected program space. Especially since the current inferior can be unrelated to the current program space at that point. To be clear, we are in this state because ~program_space sets itself as the current program space, but there is no more inferior having that program space to switch to, inferior 2 has already been unlinked. To fix this, make remote_new_objfile iterate on all inferiors bound to the affected program space. Remove the target_has_execution check from remote_check_symbols, replace it with an assert. All callers must ensure that the current inferior has execution before calling it. Change-Id: Ica643145bcc03115248290fd310cadab8ec8371c
mbitsnbites
pushed a commit
that referenced
this issue
May 23, 2022
Luis noticed that the recent changes to gdbserver to make it track process and threads independently regressed a few gdb.multi/*.exp tests for aarch64-linux. We started seeing the following internal error for gdb.multi/multi-target-continue.exp for example: Starting program: binutils-gdb/gdb/testsuite/outputs/gdb.multi/multi-target-continue/multi-target-continue ^M Error in re-setting breakpoint 2: Remote connection closed^M ../../../repos/binutils-gdb/gdb/thread.c:85: internal-error: inferior_thread: Assertion `current_thread_ != nullptr' failed.^M A problem internal to GDB has been detected,^M further debugging may prove unreliable. A backtrace looks like: #0 thread_regcache_data (thread=thread@entry=0x0) at ../../../repos/binutils-gdb/gdbserver/inferiors.cc:120 #1 0x0000aaaaaaabf0e8 in get_thread_regcache (thread=0x0, fetch=fetch@entry=0) at ../../../repos/binutils-gdb/gdbserver/regcache.cc:31 #2 0x0000aaaaaaad785c in is_64bit_tdesc () at ../../../repos/binutils-gdb/gdbserver/linux-aarch64-low.cc:194 #3 0x0000aaaaaaad8a48 in aarch64_target::sw_breakpoint_from_kind (this=<optimized out>, kind=4, size=0xffffffffef04) at ../../../repos/binutils-gdb/gdbserver/linux-aarch64-low.cc:3226 #4 0x0000aaaaaaabe220 in bp_size (bp=0xaaaaaab6f3d0) at ../../../repos/binutils-gdb/gdbserver/mem-break.cc:226 #5 check_mem_read (mem_addr=187649984471104, buf=buf@entry=0xaaaaaab625d0 "\006", mem_len=mem_len@entry=56) at ../../../repos/binutils-gdb/gdbserver/mem-break.cc:1862 #6 0x0000aaaaaaacc660 in read_inferior_memory (memaddr=<optimized out>, myaddr=0xaaaaaab625d0 "\006", len=56) at ../../../repos/binutils-gdb/gdbserver/target.cc:93 #7 0x0000aaaaaaac3d9c in gdb_read_memory (len=56, myaddr=0xaaaaaab625d0 "\006", memaddr=187649984471104) at ../../../repos/binutils-gdb/gdbserver/server.cc:1071 #8 gdb_read_memory (memaddr=187649984471104, myaddr=0xaaaaaab625d0 "\006", len=56) at ../../../repos/binutils-gdb/gdbserver/server.cc:1048 #9 0x0000aaaaaaac82a4 in process_serial_event () at ../../../repos/binutils-gdb/gdbserver/server.cc:4307 #10 handle_serial_event (err=<optimized out>, client_data=<optimized out>) at ../../../repos/binutils-gdb/gdbserver/server.cc:4520 #11 0x0000aaaaaaafbcd0 in gdb_wait_for_event (block=block@entry=1) at ../../../repos/binutils-gdb/gdbsupport/event-loop.cc:700 #12 0x0000aaaaaaafc0b0 in gdb_wait_for_event (block=1) at ../../../repos/binutils-gdb/gdbsupport/event-loop.cc:596 #13 gdb_do_one_event () at ../../../repos/binutils-gdb/gdbsupport/event-loop.cc:237 #14 0x0000aaaaaaacacb0 in start_event_loop () at ../../../repos/binutils-gdb/gdbserver/server.cc:3518 #15 captured_main (argc=4, argv=<optimized out>) at ../../../repos/binutils-gdb/gdbserver/server.cc:3998 #16 0x0000aaaaaaab66dc in main (argc=<optimized out>, argv=<optimized out>) at ../../../repos/binutils-gdb/gdbserver/server.cc:4084 This sequence of functions is invoked due to a series of conditions: 1 - The probe-based breakpoint mechanism failed (for some reason) so ... 2 - ... gdbserver has to know what type of architecture it is dealing with so it can pick the right breakpoint kind, so it wants to check if we have a 64-bit target. 3 - To determine the size of a register, we currently fetch the current thread's register cache, and the current thread pointer is now nullptr. In #3, the current thread is nullptr because gdb_read_memory clears it on purpose, via set_desired_process, exactly to expose code relying on the current thread when it shouldn't. It was always possible to end up in this situation (when the current thread exits), but it was harder to reproduce before. This commit fixes it by tweaking is_64bit_tdesc to look at the current process's tdesc instead of the current thread's tdesc. Note that the thread's tdesc is itself filled from the process's tdesc, so this should be equivalent: struct regcache * get_thread_regcache (struct thread_info *thread, int fetch) { struct regcache *regcache; regcache = thread_regcache_data (thread); ... if (regcache == NULL) { struct process_info *proc = get_thread_process (thread); gdb_assert (proc->tdesc != NULL); regcache = new_register_cache (proc->tdesc); set_thread_regcache_data (thread, regcache); } ... Change-Id: Ibc809d7345e70a2f058b522bdc5cdbdca97e2cdc
mbitsnbites
pushed a commit
that referenced
this issue
Jul 3, 2022
Simon reported that the recent change to make GDB and GDBserver avoid reading shell registers caused a GDBserver regression, caught with ASan while running gdb.server/non-existing-program.exp: $ /home/smarchi/build/binutils-gdb/gdb/testsuite/../../gdb/../gdbserver/gdbserver stdio non-existing-program ================================================================= ==127719==ERROR: AddressSanitizer: heap-use-after-free on address 0x60f0000000e9 at pc 0x55bcbfa301f4 bp 0x7ffd238a7320 sp 0x7ffd238a7310 WRITE of size 1 at 0x60f0000000e9 thread T0 #0 0x55bcbfa301f3 in scoped_restore_tmpl<bool>::~scoped_restore_tmpl() /home/smarchi/src/binutils-gdb/gdbserver/../gdbsupport/scoped_restore.h:86 #1 0x55bcbfa2ffe9 in post_fork_inferior(int, char const*) /home/smarchi/src/binutils-gdb/gdbserver/fork-child.cc:120 #2 0x55bcbf9c9199 in linux_process_target::create_inferior(char const*, std::__debug::vector<char*, std::allocator<char*> > const&) /home/smarchi/src/binutils-gdb/gdbserver/linux-low.cc:991 #3 0x55bcbf954549 in captured_main /home/smarchi/src/binutils-gdb/gdbserver/server.cc:3941 #4 0x55bcbf9552f0 in main /home/smarchi/src/binutils-gdb/gdbserver/server.cc:4084 #5 0x7ff9d663b0b2 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x240b2) #6 0x55bcbf8ef2bd in _start (/home/smarchi/build/binutils-gdb/gdbserver/gdbserver+0x1352bd) 0x60f0000000e9 is located 169 bytes inside of 176-byte region [0x60f000000040,0x60f0000000f0) freed by thread T0 here: #0 0x7ff9d6c6f0c7 in operator delete(void*) ../../../../src/libsanitizer/asan/asan_new_delete.cpp:160 #1 0x55bcbf910d00 in remove_process(process_info*) /home/smarchi/src/binutils-gdb/gdbserver/inferiors.cc:164 #2 0x55bcbf9c4ac7 in linux_process_target::remove_linux_process(process_info*) /home/smarchi/src/binutils-gdb/gdbserver/linux-low.cc:454 #3 0x55bcbf9cdaa6 in linux_process_target::mourn(process_info*) /home/smarchi/src/binutils-gdb/gdbserver/linux-low.cc:1599 #4 0x55bcbf988dc4 in target_mourn_inferior(ptid_t) /home/smarchi/src/binutils-gdb/gdbserver/target.cc:205 #5 0x55bcbfa32020 in startup_inferior(process_stratum_target*, int, int, target_waitstatus*, ptid_t*) /home/smarchi/src/binutils-gdb/gdbserver/../gdb/nat/fork-inferior.c:515 #6 0x55bcbfa2fdeb in post_fork_inferior(int, char const*) /home/smarchi/src/binutils-gdb/gdbserver/fork-child.cc:111 #7 0x55bcbf9c9199 in linux_process_target::create_inferior(char const*, std::__debug::vector<char*, std::allocator<char*> > const&) /home/smarchi/src/binutils-gdb/gdbserver/linux-low.cc:991 #8 0x55bcbf954549 in captured_main /home/smarchi/src/binutils-gdb/gdbserver/server.cc:3941 #9 0x55bcbf9552f0 in main /home/smarchi/src/binutils-gdb/gdbserver/server.cc:4084 #10 0x7ff9d663b0b2 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x240b2) previously allocated by thread T0 here: #0 0x7ff9d6c6e5a7 in operator new(unsigned long) ../../../../src/libsanitizer/asan/asan_new_delete.cpp:99 #1 0x55bcbf910ad0 in add_process(int, int) /home/smarchi/src/binutils-gdb/gdbserver/inferiors.cc:144 #2 0x55bcbf9c477d in linux_process_target::add_linux_process_no_mem_file(int, int) /home/smarchi/src/binutils-gdb/gdbserver/linux-low.cc:425 #3 0x55bcbf9c8f4c in linux_process_target::create_inferior(char const*, std::__debug::vector<char*, std::allocator<char*> > const&) /home/smarchi/src/binutils-gdb/gdbserver/linux-low.cc:985 #4 0x55bcbf954549 in captured_main /home/smarchi/src/binutils-gdb/gdbserver/server.cc:3941 #5 0x55bcbf9552f0 in main /home/smarchi/src/binutils-gdb/gdbserver/server.cc:4084 #6 0x7ff9d663b0b2 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x240b2) Above we see that in the non-existing-program case, the process gets deleted before the starting_up flag gets restored to false. This happens because startup_inferior calls target_mourn_inferior before throwing an error, and in GDBserver, unlike in GDB, mourning deletes the process. Fix this by not using a scoped_restore to manage the starting_up flag, since we should only clear it when startup_inferior doesn't throw. Change-Id: I67325d6f81c64de4e89e20e4ec4556f57eac7f6c
mbitsnbites
pushed a commit
that referenced
this issue
Jul 3, 2022
When building gdb with -fsanitize=thread and gcc 12, and running test-case gdb.dwarf2/dwz.exp, we run into a data race between thread T2 and the main thread in the same write: ... Write of size 1 at 0x7b200000300c:^M #0 cutu_reader::cutu_reader(dwarf2_per_cu_data*, dwarf2_per_objfile*, \ abbrev_table*, dwarf2_cu*, bool, abbrev_cache*) gdb/dwarf2/read.c:6252 \ (gdb+0x82f3b3)^M ... which is here: ... this_cu->dwarf_version = cu->header.version; ... Both writes are called from the parallel for in dwarf2_build_psymtabs_hard, this one directly: ... #1 process_psymtab_comp_unit gdb/dwarf2/read.c:6774 (gdb+0x8304d7)^M #2 operator() gdb/dwarf2/read.c:7098 (gdb+0x8317be)^M #3 operator() gdbsupport/parallel-for.h:163 (gdb+0x872380)^M ... and this via the PU import: ... #1 cooked_indexer::ensure_cu_exists(cutu_reader*, dwarf2_per_objfile*, \ sect_offset, bool, bool) gdb/dwarf2/read.c:17964 (gdb+0x85c43b)^M #2 cooked_indexer::index_imported_unit(cutu_reader*, unsigned char const*, \ abbrev_info const*) gdb/dwarf2/read.c:18248 (gdb+0x85d8ff)^M #3 cooked_indexer::index_dies(cutu_reader*, unsigned char const*, \ cooked_index_entry const*, bool) gdb/dwarf2/read.c:18302 (gdb+0x85dcdb)^M #4 cooked_indexer::make_index(cutu_reader*) gdb/dwarf2/read.c:18443 \ (gdb+0x85e68a)^M #5 process_psymtab_comp_unit gdb/dwarf2/read.c:6812 (gdb+0x830879)^M #6 operator() gdb/dwarf2/read.c:7098 (gdb+0x8317be)^M #7 operator() gdbsupport/parallel-for.h:171 (gdb+0x8723e2)^M ... Fix this by setting the field earlier, in read_comp_units_from_section. The write in cutu_reader::cutu_reader() is still needed, in case read_comp_units_from_section is not used (run the test-case with say, target board cc-with-gdb-index). Make the write conditional, such that it doesn't trigger if the field is already set by read_comp_units_from_section. Instead, verify that the field already has the value that we're trying to set it to. Move this logic into into a member function set_version (in analogy to the already present member function version) to make sure it's used consistenly, and make the field private in order to enforce access through the member functions, and rename it to m_dwarf_version. While we're at it, make sure that the version is set before read, to avoid say returning true for "per_cu.version () < 5" if "per_cu.version () == 0". Tested on x86_64-linux.
mbitsnbites
pushed a commit
that referenced
this issue
Jul 12, 2022
After loading a core file, you're supposed to be able to use "detach" to unload the core file. That unfortunately regressed starting with GDB 11, with these commits: 1192f12 - gdb: generalize commit_resume, avoid commit-resuming when threads have pending statuses 408f668 - detach in all-stop with threads running resulting in a GDB crash: ... Thread 1 "gdb" received signal SIGSEGV, Segmentation fault. 0x0000555555e842bf in maybe_set_commit_resumed_all_targets () at ../../src/gdb/infrun.c:2899 2899 if (proc_target->commit_resumed_state) (top-gdb) bt #0 0x0000555555e842bf in maybe_set_commit_resumed_all_targets () at ../../src/gdb/infrun.c:2899 #1 0x0000555555e848bf in scoped_disable_commit_resumed::reset (this=0x7fffffffd440) at ../../src/gdb/infrun.c:3023 #2 0x0000555555e84a0c in scoped_disable_commit_resumed::reset_and_commit (this=0x7fffffffd440) at ../../src/gdb/infrun.c:3049 #3 0x0000555555e739cd in detach_command (args=0x0, from_tty=1) at ../../src/gdb/infcmd.c:2791 #4 0x0000555555c0ba46 in do_simple_func (args=0x0, from_tty=1, c=0x55555662a600) at ../../src/gdb/cli/cli-decode.c:95 #5 0x0000555555c112b0 in cmd_func (cmd=0x55555662a600, args=0x0, from_tty=1) at ../../src/gdb/cli/cli-decode.c:2514 #6 0x0000555556173b1f in execute_command (p=0x5555565c5916 "", from_tty=1) at ../../src/gdb/top.c:699 The code that crashes looks like: static void maybe_set_commit_resumed_all_targets () { scoped_restore_current_thread restore_thread; for (inferior *inf : all_non_exited_inferiors ()) { process_stratum_target *proc_target = inf->process_target (); if (proc_target->commit_resumed_state) ^^^^^^^^^^^ With 'proc_target' above being null. all_non_exited_inferiors filters out inferiors that have pid==0. We get here at the end of detach_command, after core_target::detach has already run, at which point the inferior _should_ have pid==0 and no process target. It is clear it no longer has a process target, but, it still has a pid!=0 somehow. The reason the inferior still has pid!=0, is that core_target::detach just unpushes, and relies on core_target::close to actually do the getting rid of the core and exiting the inferior. The problem with that is that detach_command grabs an extra strong reference to the process stratum target, so the unpush_target inside core_target::detach doesn't actually result in a call to core_target::close. Fix this my moving the cleaning up the core inferior to a shared routine called by both core_target::close and core_target::detach. We still need to cleanup the inferior from within core_file::close because there are paths to it that want to get rid of the core without going through detach. E.g., "core-file" -> "run". This commit includes a new test added to gdb.base/corefile.exp to cover the "core-file core" -> "detach" scenario. Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=29275 Change-Id: Ic42bdd03182166b19f598428b0dbc2ce6f67c893
mbitsnbites
pushed a commit
that referenced
this issue
Sep 1, 2022
Bug 29374 shows this crash: $ ./gdb -nx --data-directory=data-directory -q -batch -ex "catch throw" -ex r -ex bt a.out ... /home/simark/src/binutils-gdb/gdb/../gdbsupport/array-view.h:217: internal-error: copy: Assertion `dest.size () == src.size ()' failed. The backtrace is: #0 internal_error (file=0x5555606504c0 "/home/simark/src/binutils-gdb/gdb/../gdbsupport/array-view.h", line=217, fmt=0x55556064b700 "%s: Assertion `%s' failed.") at /home/simark/src/binutils-gdb/gdbsupport/errors.cc:51 #1 0x000055555d41c0bb in gdb::copy<unsigned char const, unsigned char> (src=..., dest=...) at /home/simark/src/binutils-gdb/gdb/../gdbsupport/array-view.h:217 #2 0x000055555deef28c in dwarf_expr_context::fetch_result (this=0x7fffffffb830, type=0x621007a86830, subobj_type=0x621007a86830, subobj_offset=0, as_lval=false) at /home/simark/src/binutils-gdb/gdb/dwarf2/expr.c:1040 #3 0x000055555def0015 in dwarf_expr_context::evaluate (this=0x7fffffffb830, addr=0x62f00004313e "0", len=1, as_lval=false, per_cu=0x60b000069550, frame=0x621007c9e910, addr_info=0x0, type=0x621007a86830, subobj_type=0x621007a86830, subobj_offset=0) at /home/simark/src/binutils-gdb/gdb/dwarf2/expr.c:1091 #4 0x000055555e084327 in dwarf2_evaluate_loc_desc_full (type=0x621007a86830, frame=0x621007c9e910, data=0x62f00004313e "0", size=1, per_cu=0x60b000069550, per_objfile=0x613000006080, subobj_type=0x621007a86830, subobj_byte_offset=0, as_lval=false) at /home/simark/src/binutils-gdb/gdb/dwarf2/loc.c:1485 #5 0x000055555e0849e2 in dwarf2_evaluate_loc_desc (type=0x621007a86830, frame=0x621007c9e910, data=0x62f00004313e "0", size=1, per_cu=0x60b000069550, per_objfile=0x613000006080, as_lval=false) at /home/simark/src/binutils-gdb/gdb/dwarf2/loc.c:1529 #6 0x000055555e0828c6 in dwarf_entry_parameter_to_value (parameter=0x621007a96e58, deref_size=0x0, type=0x621007a86830, caller_frame=0x621007c9e910, per_cu=0x60b000069550, per_objfile=0x613000006080) at /home/simark/src/binutils-gdb/gdb/dwarf2/loc.c:1235 #7 0x000055555e082f55 in value_of_dwarf_reg_entry (type=0x621007a86890, frame=0x621007acc510, kind=CALL_SITE_PARAMETER_DWARF_REG, kind_u=...) at /home/simark/src/binutils-gdb/gdb/dwarf2/loc.c:1332 #8 0x000055555e083449 in value_of_dwarf_block_entry (type=0x621007a86890, frame=0x621007acc510, block=0x61e000033568 "T\004\205\001\240\004\004\243\001T\237\004\240\004\261\004\001T\004\261\004\304\005\004\243\001T\237\004\304\005\310\005\001T\004\310\005\311\005\004\243\001T\237", block_len=1) at /home/simark/src/binutils-gdb/gdb/dwarf2/loc.c:1365 #9 0x000055555e094d40 in loclist_read_variable_at_entry (symbol=0x621007a99bd0, frame=0x621007acc510) at /home/simark/src/binutils-gdb/gdb/dwarf2/loc.c:3889 #10 0x000055555f5192e0 in read_frame_arg (fp_opts=..., sym=0x621007a99bd0, frame=0x621007acc510, argp=0x7fffffffbf20, entryargp=0x7fffffffbf60) at /home/simark/src/binutils-gdb/gdb/stack.c:559 #11 0x000055555f51c352 in print_frame_args (fp_opts=..., func=0x621007a99ad0, frame=0x621007acc510, num=-1, stream=0x6030000bad90) at /home/simark/src/binutils-gdb/gdb/stack.c:887 #12 0x000055555f521919 in print_frame (fp_opts=..., frame=0x621007acc510, print_level=1, print_what=LOCATION, print_args=1, sal=...) at /home/simark/src/binutils-gdb/gdb/stack.c:1390 #13 0x000055555f51f22e in print_frame_info (fp_opts=..., frame=0x621007acc510, print_level=1, print_what=LOCATION, print_args=1, set_current_sal=0) at /home/simark/src/binutils-gdb/gdb/stack.c:1116 #14 0x000055555f526c6d in backtrace_command_1 (fp_opts=..., bt_opts=..., count_exp=0x0, from_tty=0) at /home/simark/src/binutils-gdb/gdb/stack.c:2079 #15 0x000055555f527ae5 in backtrace_command (arg=0x0, from_tty=0) at /home/simark/src/binutils-gdb/gdb/stack.c:2198 The problem is that the type that gets passed down to dwarf_expr_context::fetch_result (the type of a variable of which we're trying to read the entry value) is a typedef whose size has never been computed yet (check_typedef has never been called on it). As we get in the DWARF_VALUE_STACK case (line 1028 of dwarf2/expr.c), the `len` variable is therefore set to 0, instead of the actual type length. We then call allocate_value on subobj_type, which does call check_typedef, so the length of the typedef gets filled in at that point. We end up passing to the copy function a source array view of length 0 and a target array view of length 4, and the assertion fails. Fix this by calling check_typedef on both type and subobj_type at the beginning of fetch_result. I tried writing a test for this using the DWARF assembler, but I haven't succeeded. It's possible that we need to get into this specific code path (value_of_dwarf_reg_entry and all) to manage to get to dwarf_expr_context::fetch_result with a typedef type that has never been resolved. In all my attempts, the typedef would always be resolved already, so the bug wouldn't show up. As a fallback, I made a gdb.dwarf2 test with compiler-generated .S files. I don't particularly like those, but I think it's better than no test. The .cpp source code is the smallest reproducer I am able to make from the reproducer given in the bug (thanks to Pedro for suggestions on how to minimize it further than I had). Since I tested on both amd64 and aarch64, I added versions of the test for these two architectures. Change-Id: I182733ad08e34df40d8bcc47af72c482fabf4900 Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=29374
mbitsnbites
pushed a commit
that referenced
this issue
Oct 8, 2022
When a GDB built with -D_GLIBCXX_DEBUG=1 reads a binary with a single character name, we hit this assertion failure: $ ./gdb -q --data-directory=data-directory -nx ./x /usr/include/c++/12.1.0/string_view:239: constexpr const std::basic_string_view<_CharT, _Traits>::value_type& std::basic_string_view<_CharT, _Traits>::operator[](size_type) const [with _CharT = char; _Traits = std::char_traits<char>; const_reference = const char&; size_type = long unsigned int]: Assertion '__pos < this->_M_len' failed. The backtrace: #3 0x00007ffff6c0f002 in std::__glibcxx_assert_fail (file=<optimized out>, line=<optimized out>, function=<optimized out>, condition=<optimized out>) at /usr/src/debug/gcc/libstdc++-v3/src/c++11/debug.cc:60 #4 0x000055555da8a864 in std::basic_string_view<char, std::char_traits<char> >::operator[] (this=0x7fffffffcc30, __pos=1) at /usr/include/c++/12.1.0/string_view:239 #5 0x00005555609dcb88 in path_join[abi:cxx11](gdb::array_view<std::basic_string_view<char, std::char_traits<char> > const>) (paths=...) at /home/simark/src/binutils-gdb/gdbsupport/pathstuff.cc:203 #6 0x000055555e0443f4 in path_join<char const*, char const*> () at /home/simark/src/binutils-gdb/gdb/../gdbsupport/pathstuff.h:84 #7 0x00005555609dc336 in gdb_realpath_keepfile[abi:cxx11](char const*) (filename=0x6060000a8d40 "/home/simark/build/binutils-gdb-one-target/gdb/./x") at /home/simark/src/binutils-gdb/gdbsupport/pathstuff.cc:122 #8 0x000055555ebd2794 in exec_file_attach (filename=0x7fffffffe0f9 "./x", from_tty=1) at /home/simark/src/binutils-gdb/gdb/exec.c:471 #9 0x000055555f2b3fb0 in catch_command_errors (command=0x55555ebd1ab6 <exec_file_attach(char const*, int)>, arg=0x7fffffffe0f9 "./x", from_tty=1, do_bp_actions=false) at /home/simark/src/binutils-gdb/gdb/main.c:513 #10 0x000055555f2b7e11 in captured_main_1 (context=0x7fffffffdb60) at /home/simark/src/binutils-gdb/gdb/main.c:1209 #11 0x000055555f2b9144 in captured_main (data=0x7fffffffdb60) at /home/simark/src/binutils-gdb/gdb/main.c:1319 #12 0x000055555f2b9226 in gdb_main (args=0x7fffffffdb60) at /home/simark/src/binutils-gdb/gdb/main.c:1344 #13 0x000055555d938c5e in main (argc=5, argv=0x7fffffffdcf8) at /home/simark/src/binutils-gdb/gdb/gdb.c:32 The problem is this line in path_join: gdb_assert (strlen (path) == 0 || !IS_ABSOLUTE_PATH (path)); ... where `path` is "x". IS_ABSOLUTE_PATH eventually calls HAS_DRIVE_SPEC_1: #define HAS_DRIVE_SPEC_1(dos_based, f) \ ((f)[0] && ((f)[1] == ':') && (dos_based)) This macro accesses indices 0 and 1 of the input string. However, `f` is a string_view of length 1, so it's incorrect to try to access index 1. We know that the string_view's underlying object is a null-terminated string, so in practice there's no harm. But as far as the string_view is concerned, index 1 is considered out of bounds. This patch makes the easy fix, that is to change the path_join parameter from a vector of to a vector of `const char *`. Another solution would be to introduce a non-standard gdb::cstring_view class, which would be a view over a null-terminated string. With that class, it would be correct to access index 1, it would yield the NUL character. If there is interest in having this class (it has been mentioned a few times in the past) I can do it and use it here. This was found by running tests such as gdb.ada/arrayidx.exp, which produce 1-char long filenames, so adding a new test is not necessary. Change-Id: Ia41a16c7243614636b18754fd98a41860756f7af
mbitsnbites
pushed a commit
that referenced
this issue
Feb 25, 2023
On Ubuntu 22.04.1 x86_64 (with glibc 2.35), I run into: ... (gdb) PASS: gdb.base/corefile.exp: $_exitcode is void bt^M #0 __pthread_kill_implementation (...) at ./nptl/pthread_kill.c:44^M #1 __pthread_kill_internal (...) at ./nptl/pthread_kill.c:78^M #2 __GI___pthread_kill (...) at ./nptl/pthread_kill.c:89^M #3 0x00007f4985e1a476 in __GI_raise (...) at ../sysdeps/posix/raise.c:26^M #4 0x00007f4985e007f3 in __GI_abort () at ./stdlib/abort.c:79^M #5 0x0000556b4ea4b504 in func2 () at gdb.base/coremaker.c:153^M #6 0x0000556b4ea4b516 in func1 () at gdb.base/coremaker.c:159^M #7 0x0000556b4ea4b578 in main (...) at gdb.base/coremaker.c:171^M (gdb) PASS: gdb.base/corefile.exp: backtrace up^M #1 __pthread_kill_internal (...) at ./nptl/pthread_kill.c:78^M 78 in ./nptl/pthread_kill.c^M (gdb) FAIL: gdb.base/corefile.exp: up ... The problem is that the regexp used here: ... gdb_test "up" "#\[0-9\]* *\[0-9xa-fH'\]* in .* \\(.*\\).*" "up" ... does not fit the __pthread_kill_internal line which lacks the instruction address due to inlining. Fix this by making the regexp less strict. Tested on x86_64-linux.
mbitsnbites
pushed a commit
that referenced
this issue
Feb 25, 2023
I would like to improve frame_info_ptr to automatically grab the information needed to reinflate a frame, and automatically reinflate it as needed. One thing that is in the way is the fact that some frames can be created out of thin air by the create_new_frame function. These frames are not the fruit of unwinding from the target's current frame. These frames are created by the "select-frame view" command. These frames are not correctly handled by the frame save/restore functions, save_selected_frame, restore_selected_frame and lookup_selected_frame. This can be observed here, using the test included in this patch: $ ./gdb --data-directory=data-directory -nx -q testsuite/outputs/gdb.base/frame-view/frame-view Reading symbols from testsuite/outputs/gdb.base/frame-view/frame-view... (gdb) break thread_func Breakpoint 1 at 0x11a2: file /home/simark/src/binutils-gdb/gdb/testsuite/gdb.base/frame-view.c, line 42. (gdb) run Starting program: /home/simark/build/binutils-gdb/gdb/testsuite/outputs/gdb.base/frame-view/frame-view [Thread debugging using libthread_db enabled] Using host libthread_db library "/usr/lib/../lib/libthread_db.so.1". [New Thread 0x7ffff7cc46c0 (LWP 4171134)] [Switching to Thread 0x7ffff7cc46c0 (LWP 4171134)] Thread 2 "frame-view" hit Breakpoint 1, thread_func (p=0x0) at /home/simark/src/binutils-gdb/gdb/testsuite/gdb.base/frame-view.c:42 42 foo (11); (gdb) info frame Stack level 0, frame at 0x7ffff7cc3ee0: rip = 0x5555555551a2 in thread_func (/home/simark/src/binutils-gdb/gdb/testsuite/gdb.base/frame-view.c:42); saved rip = 0x7ffff7d4e8fd called by frame at 0x7ffff7cc3f80 source language c. Arglist at 0x7ffff7cc3ed0, args: p=0x0 Locals at 0x7ffff7cc3ed0, Previous frame's sp is 0x7ffff7cc3ee0 Saved registers: rbp at 0x7ffff7cc3ed0, rip at 0x7ffff7cc3ed8 (gdb) thread 1 [Switching to thread 1 (Thread 0x7ffff7cc5740 (LWP 4171122))] #0 0x00007ffff7d4b4b6 in ?? () from /usr/lib/libc.so.6 Here, we create a custom frame for thread 1 (using the stack from thread 2, for convenience): (gdb) select-frame view 0x7ffff7cc3f80 0x5555555551a2 The first calls to "frame" looks good: (gdb) frame #0 thread_func (p=0x7ffff7d4e630) at /home/simark/src/binutils-gdb/gdb/testsuite/gdb.base/frame-view.c:42 42 foo (11); But not the second one: (gdb) frame #0 0x00007ffff7d4b4b6 in ?? () from /usr/lib/libc.so.6 This second "frame" command shows the current target frame instead of the user-created frame. It's not totally clear how the "select-frame view" feature is expected to behave, especially since it's not tested. I heard accounts that it used to be possible to select a frame like this and do "up" and "down" to navigate the backtrace starting from that frame. The fact that create_new_frame calls frame_unwind_find_by_frame to install the right unwinder suggest that it used to be possible. But that doesn't work today: (gdb) select-frame view 0x7ffff7cc3f80 0x5555555551a2 (gdb) up Initial frame selected; you cannot go up. (gdb) down Bottom (innermost) frame selected; you cannot go down. and "backtrace" always shows the actual thread's backtrace, it ignores the user-created frame: (gdb) bt #0 0x00007ffff7d4b4b6 in ?? () from /usr/lib/libc.so.6 #1 0x00007ffff7d50403 in ?? () from /usr/lib/libc.so.6 #2 0x000055555555521a in main () at /home/simark/src/binutils-gdb/gdb/testsuite/gdb.base/frame-view.c:56 I don't want to address all the `select-frame view` issues , but I think we can agree that the "frame" command changing the selected frame, as shown above, is a bug. I would expect that command to show the currently selected frame and not change it. This happens because of the scoped_restore_selected_frame object in print_frame_args. The frame information is saved in the constructor (the backtrace below), and restored in the destructor. #0 save_selected_frame (frame_id=0x7ffdc0020ad0, frame_level=0x7ffdc0020af0) at /home/simark/src/binutils-gdb/gdb/frame.c:1682 #1 0x00005631390242f0 in scoped_restore_selected_frame::scoped_restore_selected_frame (this=0x7ffdc0020ad0) at /home/simark/src/binutils-gdb/gdb/frame.c:324 #2 0x000056313993581e in print_frame_args (fp_opts=..., func=0x62100023bde0, frame=..., num=-1, stream=0x60b000000300) at /home/simark/src/binutils-gdb/gdb/stack.c:755 #3 0x000056313993ad49 in print_frame (fp_opts=..., frame=..., print_level=1, print_what=SRC_AND_LOC, print_args=1, sal=...) at /home/simark/src/binutils-gdb/gdb/stack.c:1401 #4 0x000056313993835d in print_frame_info (fp_opts=..., frame=..., print_level=1, print_what=SRC_AND_LOC, print_args=1, set_current_sal=1) at /home/simark/src/binutils-gdb/gdb/stack.c:1126 #5 0x0000563139932e0b in print_stack_frame (frame=..., print_level=1, print_what=SRC_AND_LOC, set_current_sal=1) at /home/simark/src/binutils-gdb/gdb/stack.c:368 #6 0x0000563139932bbe in print_stack_frame_to_uiout (uiout=0x611000016840, frame=..., print_level=1, print_what=SRC_AND_LOC, set_current_sal=1) at /home/simark/src/binutils-gdb/gdb/stack.c:346 #7 0x0000563139b0641e in print_selected_thread_frame (uiout=0x611000016840, selection=...) at /home/simark/src/binutils-gdb/gdb/thread.c:1993 #8 0x0000563139940b7f in frame_command_core (fi=..., ignored=true) at /home/simark/src/binutils-gdb/gdb/stack.c:1871 #9 0x000056313994db9e in frame_command_helper<frame_command_core>::base_command (arg=0x0, from_tty=1) at /home/simark/src/binutils-gdb/gdb/stack.c:1976 Since the user-created frame has level 0 (identified by the saved level -1), lookup_selected_frame just reselects the target's current frame, and the user-created frame is lost. My goal here is to fix this particular problem. Currently, select_frame does not set selected_frame_id and selected_frame_level for frames with level 0. It leaves them at null_frame_id / -1, indicating to restore_selected_frame to use the target's current frame. User-created frames also have level 0, so add a special case them such that select_frame saves their selected id and level. save_selected_frame does not need any change. Change the assertion in restore_selected_frame that checks `frame_level != 0` to account for the fact that we can restore user-created frames, which have level 0. Finally, change lookup_selected_frame to make it able to re-create user-created frame_info objects from selected_frame_level and selected_frame_id. Add a minimal test case for the case described above, that is the "select-frame view" command followed by the "frame" command twice. In order to have a known stack frame to switch to, the test spawns a second thread, and tells the first thread to use the other thread's top frame. Change-Id: Ifc77848dc465fbd21324b9d44670833e09fe98c7 Reviewed-By: Bruno Larsen <[email protected]>
mbitsnbites
pushed a commit
that referenced
this issue
Feb 25, 2023
…ames to the frame cache The test gdb.base/frame-view.exp fails like this on AArch64: frame^M #0 baz (z1=hahaha, /home/simark/src/binutils-gdb/gdb/value.c:4056: internal-error: value_fetch_lazy_register: Assertion `next_frame != NULL' failed.^M A problem internal to GDB has been detected,^M further debugging may prove unreliable.^M FAIL: gdb.base/frame-view.exp: with_pretty_printer=true: frame (GDB internal error) The sequence of events leading to this is the following: - When we create the user frame (the "select-frame view" command), we create a sentinel frame just for our user-created frame, in create_new_frame. This sentinel frame has the same id as the regular sentinel frame. - When printing the frame, after doing the "select-frame view" command, the argument's pretty printer is invoked, which does an inferior function call (this is the point of the test). This clears the frame cache, including the "real" sentinel frame, which sets the sentinel_frame global to nullptr. - Later in the frame-printing process (when printing the second argument), the auto-reinflation mechanism re-creates the user frame by calling create_new_frame again, creating its own special sentinel frame again. However, note that the "real" sentinel frame, the sentinel_frame global, is still nullptr. If the selected frame had been a regular frame, we would have called get_current_frame at some point during the reinflation, which would have re-created the "real" sentinel frame. But it's not the case when reinflating a user frame. - Deep down the stack, something wants to fill in the unwind stop reason for frame 0, which requires trying to unwind frame 1. This leads us to trying to unwind the PC of frame 1: #0 gdbarch_unwind_pc (gdbarch=0xffff8d010080, next_frame=...) at /home/simark/src/binutils-gdb/gdb/gdbarch.c:2955 #1 0x000000000134569c in dwarf2_tailcall_sniffer_first (this_frame=..., tailcall_cachep=0xffff773fcae0, entry_cfa_sp_offsetp=0xfffff7f7d450) at /home/simark/src/binutils-gdb/gdb/dwarf2/frame-tailcall.c:390 #2 0x0000000001355d84 in dwarf2_frame_cache (this_frame=..., this_cache=0xffff773fc928) at /home/simark/src/binutils-gdb/gdb/dwarf2/frame.c:1089 #3 0x00000000013562b0 in dwarf2_frame_unwind_stop_reason (this_frame=..., this_cache=0xffff773fc928) at /home/simark/src/binutils-gdb/gdb/dwarf2/frame.c:1101 #4 0x0000000001990f64 in get_prev_frame_always_1 (this_frame=...) at /home/simark/src/binutils-gdb/gdb/frame.c:2281 #5 0x0000000001993034 in get_prev_frame_always (this_frame=...) at /home/simark/src/binutils-gdb/gdb/frame.c:2376 #6 0x000000000199b814 in get_frame_unwind_stop_reason (frame=...) at /home/simark/src/binutils-gdb/gdb/frame.c:3051 #7 0x0000000001359cd8 in dwarf2_frame_cfa (this_frame=...) at /home/simark/src/binutils-gdb/gdb/dwarf2/frame.c:1356 #8 0x000000000132122c in dwarf_expr_context::execute_stack_op (this=0xfffff7f80170, op_ptr=0xffff8d8883ee "\217\002", op_end=0xffff8d8883ee "\217\002") at /home/simark/src/binutils-gdb/gdb/dwarf2/expr.c:2110 #9 0x0000000001317b30 in dwarf_expr_context::eval (this=0xfffff7f80170, addr=0xffff8d8883ed "\234\217\002", len=1) at /home/simark/src/binutils-gdb/gdb/dwarf2/expr.c:1239 #10 0x000000000131d68c in dwarf_expr_context::execute_stack_op (this=0xfffff7f80170, op_ptr=0xffff8d88840e "", op_end=0xffff8d88840e "") at /home/simark/src/binutils-gdb/gdb/dwarf2/expr.c:1811 #11 0x0000000001317b30 in dwarf_expr_context::eval (this=0xfffff7f80170, addr=0xffff8d88840c "\221p", len=2) at /home/simark/src/binutils-gdb/gdb/dwarf2/expr.c:1239 #12 0x0000000001314c3c in dwarf_expr_context::evaluate (this=0xfffff7f80170, addr=0xffff8d88840c "\221p", len=2, as_lval=true, per_cu=0xffff90b03700, frame=..., addr_info=0x0, type=0xffff8f6c8400, subobj_type=0xffff8f6c8400, subobj_offset=0) at /home/simark/src/binutils-gdb/gdb/dwarf2/expr.c:1078 #13 0x000000000149f9e0 in dwarf2_evaluate_loc_desc_full (type=0xffff8f6c8400, frame=..., data=0xffff8d88840c "\221p", size=2, per_cu=0xffff90b03700, per_objfile=0xffff9070b980, subobj_type=0xffff8f6c8400, subobj_byte_offset=0, as_lval=true) at /home/simark/src/binutils-gdb/gdb/dwarf2/loc.c:1513 #14 0x00000000014a0100 in dwarf2_evaluate_loc_desc (type=0xffff8f6c8400, frame=..., data=0xffff8d88840c "\221p", size=2, per_cu=0xffff90b03700, per_objfile=0xffff9070b980, as_lval=true) at /home/simark/src/binutils-gdb/gdb/dwarf2/loc.c:1557 #15 0x00000000014aa584 in locexpr_read_variable (symbol=0xffff8f6cd770, frame=...) at /home/simark/src/binutils-gdb/gdb/dwarf2/loc.c:3052 - AArch64 defines a special "prev register" function, aarch64_dwarf2_prev_register, to handle unwinding the PC. This function does frame_unwind_register_unsigned (this_frame, AARCH64_LR_REGNUM); - frame_unwind_register_unsigned ultimately creates a lazy register value, saving the frame id of this_frame->next. this_frame is the user-created frame, to this_frame->next is the special sentinel frame we created for it. So the saved ID is the sentinel frame ID. - When time comes to un-lazify the value, value_fetch_lazy_register calls frame_find_by_id, to find the frame with the ID we saved. - frame_find_by_id sees it's the sentinel frame ID, so returns the sentinel_frame global, which is, if you remember, nullptr. - We hit the `gdb_assert (next_frame != NULL)` assertion in value_fetch_lazy_register. The issues I see here are: - The ID of the sentinel frame created for the user-created frame is not distinguishable from the ID of the regular sentinel frame. So there's no way frame_find_by_id could find the right frame, in value_fetch_lazy_register. - Even if they had distinguishable IDs, sentinel frames created for user frames are not registered anywhere, so there's no easy way frame_find_by_id could find it. This patch addresses these two issues: - Give sentinel frames created for user frames their own distinct IDs - Register sentinel frames in the frame cache, so they can be found with frame_find_by_id. I initially had this split in two patches, but I then found that it was easier to explain as a single patch. Rergarding the first part of the change: with this patch, the sentinel frames created for user frames (in create_new_frame) still have stack_status == FID_STACK_SENTINEL, but their code_addr and stack_addr fields are now filled with the addresses used to create the user frame. This ensures this sentinel frame ID is different from the "target" sentinel frame ID, as well as any other "user" sentinel frame ID. If the user tries to create the same frame, with the same addresses, multiple times, create_sentinel_frame just reuses the existing frame. So we won't end up with multiple user sentinels with the same ID. Regular "target" sentinel frames remain with code_addr and stack_addr unset. The concrete changes for that part are: - Remove the sentinel_frame_id constant, since there isn't one "sentinel frame ID" now. Add the frame_id_build_sentinel function for building sentinel frame IDs and a is_sentinel_frame_id function to check if a frame id represents a sentinel frame. - Replace the sentinel_frame_id check in frame_find_by_id with a comparison to `frame_id_build_sentinel (0, 0)`. The sentinel_frame global is meant to contain a reference to the "target" sentinel, so the one with addresses (0, 0). - Add stack and code address parameters to create_sentinel_frame, to be able to create the various types of sentinel frames. - Adjust get_current_frame to create the regular "target" sentinel. - Adjust create_new_frame to create a sentinel with the ID specific to the created user frame. - Adjust sentinel_frame_prev_register to get the sentinel frame ID from the frame_info object, since there isn't a single "sentinel frame ID" now. - Change get_next_frame_sentinel_okay to check for a sentinel-frame-id-like frame ID, rather than for sentinel_frame specifically, since this function could be called with another sentinel frame (and we would want the assert to catch it). The rest of the change is about registering the sentinel frame in the frame cache: - Change frame_stash_add's assertion to allow sentinel frame levels (-1). - Make create_sentinel_frame add the frame to the frame cache. - Change the "sentinel_frame != NULL" check in reinit_frame_cache for a check that the frame stash is not empty. The idea is that if we only have some user-created frames in the cache when reinit_frame_cache is called, we probably want to emit the frames invalid annotation. The goal of that check is to avoid unnecessary repeated annotations, I suppose, so the "frame cache not empty" check should achieve that. After this change, I think we could theoritically get rid of the sentienl_frame global. That sentinel frame could always be found by looking up `frame_id_build_sentinel (0, 0)` in the frame cache. However, I left the global there to avoid slowing the typical case down for nothing. I however, noted in its comment that it is an optimization. With this fix applied, the gdb.base/frame-view.exp now passes for me on AArch64. value_of_register_lazy now saves the special sentinel frame ID in the value, and value_fetch_lazy_register is able to find that sentinel frame after the frame cache reinit and after the user-created frame was reinflated. Tested-By: Alexandra Petlanova Hajkova <[email protected]> Tested-By: Luis Machado <[email protected]> Change-Id: I8b77b3448822c8aab3e1c3dda76ec434eb62704f
mbitsnbites
pushed a commit
that referenced
this issue
Feb 25, 2023
Tom de Vries reported [1] a regression in gdb.btrace/record_goto.exp caused by 6d3717d ("gdb: call frame unwinders' dealloc_cache methods through destroying the frame cache"). This issue is caught by ASan. On a non-ASan build, it may or may not cause a crash or some other issue, I haven't tried. I managed to narrow it down to: $ ./gdb -nx -q --data-directory=data-directory testsuite/outputs/gdb.btrace/record_goto/record_goto -ex "start" -ex "record btrace" -ex "next" ... and then doing repeatedly "record goto 19" and "record goto 27". Eventually, I get: (gdb) record goto 27 ================================================================= ==1527735==ERROR: AddressSanitizer: heap-use-after-free on address 0x6210003392a8 at pc 0x55e4c26eef86 bp 0x7ffd229f24e0 sp 0x7ffd229f24d8 READ of size 8 at 0x6210003392a8 thread T0 #0 0x55e4c26eef85 in bfcache_eq /home/simark/src/binutils-gdb/gdb/record-btrace.c:1639 #1 0x55e4c37cdeff in htab_find_slot_with_hash /home/simark/src/binutils-gdb/libiberty/hashtab.c:659 #2 0x55e4c37ce24a in htab_find_slot /home/simark/src/binutils-gdb/libiberty/hashtab.c:703 #3 0x55e4c26ef0c6 in bfcache_new /home/simark/src/binutils-gdb/gdb/record-btrace.c:1653 #4 0x55e4c26f1242 in record_btrace_frame_sniffer /home/simark/src/binutils-gdb/gdb/record-btrace.c:1820 #5 0x55e4c1b926a1 in frame_unwind_try_unwinder /home/simark/src/binutils-gdb/gdb/frame-unwind.c:136 #6 0x55e4c1b930d7 in frame_unwind_find_by_frame(frame_info_ptr, void**) /home/simark/src/binutils-gdb/gdb/frame-unwind.c:196 #7 0x55e4c1bb867f in get_frame_type(frame_info_ptr) /home/simark/src/binutils-gdb/gdb/frame.c:2925 #8 0x55e4c2ae6798 in print_frame_info(frame_print_options const&, frame_info_ptr, int, print_what, int, int) /home/simark/src/binutils-gdb/gdb/stack.c:1049 #9 0x55e4c2ade3e1 in print_stack_frame(frame_info_ptr, int, print_what, int) /home/simark/src/binutils-gdb/gdb/stack.c:367 #10 0x55e4c26fda03 in record_btrace_set_replay /home/simark/src/binutils-gdb/gdb/record-btrace.c:2779 #11 0x55e4c26fddc3 in record_btrace_target::goto_record(unsigned long) /home/simark/src/binutils-gdb/gdb/record-btrace.c:2843 #12 0x55e4c2de2bb2 in target_goto_record(unsigned long) /home/simark/src/binutils-gdb/gdb/target.c:4169 #13 0x55e4c275ed98 in record_goto(char const*) /home/simark/src/binutils-gdb/gdb/record.c:372 #14 0x55e4c275edba in cmd_record_goto /home/simark/src/binutils-gdb/gdb/record.c:383 0x6210003392a8 is located 424 bytes inside of 4064-byte region [0x621000339100,0x62100033a0e0) freed by thread T0 here: #0 0x7f6ca34a5b6f in __interceptor_free ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:123 #1 0x55e4c38a4c17 in rpl_free /home/simark/src/binutils-gdb/gnulib/import/free.c:44 #2 0x55e4c1bbd378 in xfree<void> /home/simark/src/binutils-gdb/gdb/../gdbsupport/gdb-xfree.h:37 #3 0x55e4c37d1b63 in call_freefun /home/simark/src/binutils-gdb/libiberty/obstack.c:103 #4 0x55e4c37d25a2 in _obstack_free /home/simark/src/binutils-gdb/libiberty/obstack.c:280 #5 0x55e4c1bad701 in reinit_frame_cache() /home/simark/src/binutils-gdb/gdb/frame.c:2112 #6 0x55e4c27705a3 in registers_changed_ptid(process_stratum_target*, ptid_t) /home/simark/src/binutils-gdb/gdb/regcache.c:564 #7 0x55e4c27708c7 in registers_changed_thread(thread_info*) /home/simark/src/binutils-gdb/gdb/regcache.c:573 #8 0x55e4c26fd922 in record_btrace_set_replay /home/simark/src/binutils-gdb/gdb/record-btrace.c:2772 #9 0x55e4c26fddc3 in record_btrace_target::goto_record(unsigned long) /home/simark/src/binutils-gdb/gdb/record-btrace.c:2843 #10 0x55e4c2de2bb2 in target_goto_record(unsigned long) /home/simark/src/binutils-gdb/gdb/target.c:4169 #11 0x55e4c275ed98 in record_goto(char const*) /home/simark/src/binutils-gdb/gdb/record.c:372 #12 0x55e4c275edba in cmd_record_goto /home/simark/src/binutils-gdb/gdb/record.c:383 previously allocated by thread T0 here: #0 0x7f6ca34a5e8f in __interceptor_malloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:145 #1 0x55e4c0b55c60 in xmalloc /home/simark/src/binutils-gdb/gdb/alloc.c:57 #2 0x55e4c37d1a6d in call_chunkfun /home/simark/src/binutils-gdb/libiberty/obstack.c:94 #3 0x55e4c37d1c20 in _obstack_begin_worker /home/simark/src/binutils-gdb/libiberty/obstack.c:141 #4 0x55e4c37d1ed7 in _obstack_begin /home/simark/src/binutils-gdb/libiberty/obstack.c:164 #5 0x55e4c1bad728 in reinit_frame_cache() /home/simark/src/binutils-gdb/gdb/frame.c:2113 #6 0x55e4c27705a3 in registers_changed_ptid(process_stratum_target*, ptid_t) /home/simark/src/binutils-gdb/gdb/regcache.c:564 #7 0x55e4c27708c7 in registers_changed_thread(thread_info*) /home/simark/src/binutils-gdb/gdb/regcache.c:573 #8 0x55e4c26fd922 in record_btrace_set_replay /home/simark/src/binutils-gdb/gdb/record-btrace.c:2772 #9 0x55e4c26fddc3 in record_btrace_target::goto_record(unsigned long) /home/simark/src/binutils-gdb/gdb/record-btrace.c:2843 #10 0x55e4c2de2bb2 in target_goto_record(unsigned long) /home/simark/src/binutils-gdb/gdb/target.c:4169 #11 0x55e4c275ed98 in record_goto(char const*) /home/simark/src/binutils-gdb/gdb/record.c:372 #12 0x55e4c275edba in cmd_record_goto /home/simark/src/binutils-gdb/gdb/record.c:383 The problem is a stale entry in the bfcache hash table (in record-btrace.c), left across a reinit_frame_cache. This entry points to something that used to be allocated on the frame obstack, that has since been wiped by reinit_frame_cache. Before the aforementioned, unwinder deallocation functions were called by iterating on the frame chain, starting with the sentinel frame, like so: /* Tear down all frame caches. */ for (frame_info *fi = sentinel_frame; fi != NULL; fi = fi->prev) { if (fi->prologue_cache && fi->unwind->dealloc_cache) fi->unwind->dealloc_cache (fi, fi->prologue_cache); if (fi->base_cache && fi->base->unwind->dealloc_cache) fi->base->unwind->dealloc_cache (fi, fi->base_cache); } After that patch, we relied on the fact that all frames are (supposedly) in the frame_stash. A deletion function was added to the frame_stash hash table, so that dealloc functions would be called when emptying the frame stash. There is one case, however, where a frame_info is not in the frame stash. That is when we create the frame_info for the current frame (level 0, unwound from the sentinel frame), but don't compute its frame id. The computation of the frame id for that frame (and only that frame, AFAIK) is done lazily. And putting a frame_info in the frame stash requires knowing its id. So a frame 0 whose frame id is not computed yet is necessarily not in the frame stash. When replaying with btrace, record_btrace_frame_sniffer insert entries corresponding to frames in the "bfcache" hash table. It then relies on record_btrace_frame_dealloc_cache being called for each frame to remove all those entries when the frames get invalidated. If a frame reinit happens while frame 0's id is not computed (and therefore that frame is not in frame_stash), record_btrace_frame_dealloc_cache does not get called for it, and it leaves a stale entry in bfcache. That then leads to a use-after-free when that entry is accessed later, which ASan catches. The proposed solution is to explicitly call frame_info_del on frame 0, if it exists, and if its frame id is not computed. If its frame id is computed, it is expected that it will be in the frame stash, so it will be "deleted" through that. [1] https://inbox.sourceware.org/gdb-patches/[email protected]/T/#mcf1340ce2906a72ec7ed535ec0c97dba11c3d977 Reported-By: Tom de Vries <[email protected]> Tested-By: Tom de Vries <[email protected]> Change-Id: I2351882dd511f3bbc01e4152e9db13b69b3ba384
mbitsnbites
pushed a commit
that referenced
this issue
Feb 25, 2023
I noticed that if Ctrl-C was typed just while GDB is evaluating a breakpoint condition in the background, and GDB ends up reaching out to the Python interpreter, then the breakpoint condition would still fail, like: c& Continuing. (gdb) Error in testing breakpoint condition: Quit That happens because while evaluating the breakpoint condition, we enter Python, and end up calling PyErr_SetInterrupt (it's called by gdbpy_set_quit_flag, in frame #0): (top-gdb) bt #0 gdbpy_set_quit_flag (extlang=0x558c68f81900 <extension_language_python>) at ../../src/gdb/python/python.c:288 #1 0x0000558c6845f049 in set_quit_flag () at ../../src/gdb/extension.c:785 #2 0x0000558c6845ef98 in set_active_ext_lang (now_active=0x558c68f81900 <extension_language_python>) at ../../src/gdb/extension.c:743 #3 0x0000558c686d3e56 in gdbpy_enter::gdbpy_enter (this=0x7fff2b70bb90, gdbarch=0x558c6ab9eac0, language=0x0) at ../../src/gdb/python/python.c:212 #4 0x0000558c68695d49 in python_on_memory_change (inferior=0x558c6a830b00, addr=0x555555558014, len=4, data=0x558c6af8a610 "") at ../../src/gdb/python/py-inferior.c:146 #5 0x0000558c6823a071 in std::__invoke_impl<void, void (*&)(inferior*, unsigned long, long, unsigned char const*), inferior*, unsigned long, long, unsigned char const*> (__f=@0x558c6a8ecd98: 0x558c68695d01 <python_on_memory_change(inferior*, CORE_ADDR, ssize_t, bfd_byte const*)>) at /usr/include/c++/11/bits/invoke.h:61 #6 0x0000558c68237591 in std::__invoke_r<void, void (*&)(inferior*, unsigned long, long, unsigned char const*), inferior*, unsigned long, long, unsigned char const*> (__fn=@0x558c6a8ecd98: 0x558c68695d01 <python_on_memory_change(inferior*, CORE_ADDR, ssize_t, bfd_byte const*)>) at /usr/include/c++/11/bits/invoke.h:111 #7 0x0000558c68233e64 in std::_Function_handler<void (inferior*, unsigned long, long, unsigned char const*), void (*)(inferior*, unsigned long, long, unsigned char const*)>::_M_invoke(std::_Any_data const&, inferior*&&, unsigned long&&, long&&, unsigned char const*&&) (__functor=..., __args#0=@0x7fff2b70bd40: 0x558c6a830b00, __args#1=@0x7fff2b70bd38: 93824992247828, __args#2=@0x7fff2b70bd30: 4, __args#3=@0x7fff2b70bd28: 0x558c6af8a610 "") at /usr/include/c++/11/bits/std_function.h:290 #8 0x0000558c6830a96e in std::function<void (inferior*, unsigned long, long, unsigned char const*)>::operator()(inferior*, unsigned long, long, unsigned char const*) const (this=0x558c6a8ecd98, __args#0=0x558c6a830b00, __args#1=93824992247828, __args#2=4, __args#3=0x558c6af8a610 "") at /usr/include/c++/11/bits/std_function.h:590 #9 0x0000558c6830a620 in gdb::observers::observable<inferior*, unsigned long, long, unsigned char const*>::notify (this=0x558c690828c0 <gdb::observers::memory_changed>, args#0=0x558c6a830b00, args#1=93824992247828, args#2=4, args#3=0x558c6af8a610 "") at ../../src/gdb/../gdbsupport/observable.h:166 #10 0x0000558c68309d95 in write_memory_with_notification (memaddr=0x555555558014, myaddr=0x558c6af8a610 "", len=4) at ../../src/gdb/corefile.c:363 #11 0x0000558c68904224 in value_assign (toval=0x558c6afce910, fromval=0x558c6afba6c0) at ../../src/gdb/valops.c:1190 #12 0x0000558c681e3869 in expr::assign_operation::evaluate (this=0x558c6af8e150, expect_type=0x0, exp=0x558c6afcfe60, noside=EVAL_NORMAL) at ../../src/gdb/expop.h:1902 #13 0x0000558c68450c89 in expr::logical_or_operation::evaluate (this=0x558c6afab060, expect_type=0x0, exp=0x558c6afcfe60, noside=EVAL_NORMAL) at ../../src/gdb/eval.c:2330 #14 0x0000558c6844a896 in expression::evaluate (this=0x558c6afcfe60, expect_type=0x0, noside=EVAL_NORMAL) at ../../src/gdb/eval.c:110 #15 0x0000558c6844a95e in evaluate_expression (exp=0x558c6afcfe60, expect_type=0x0) at ../../src/gdb/eval.c:124 #16 0x0000558c682061ef in breakpoint_cond_eval (exp=0x558c6afcfe60) at ../../src/gdb/breakpoint.c:4971 ... The fix is to disable cooperative SIGINT handling while handling inferior events, so that SIGINT is saved in the global quit flag, and not in the extension language, while handling an event. This commit augments the testcase added by the previous commit to test this scenario as well. Approved-By: Tom Tromey <[email protected]> Change-Id: Idf8ab815774ee6f4b45ca2d0caaf30c9b9f127bb
mbitsnbites
pushed a commit
that referenced
this issue
Feb 25, 2023
…l/kernel mode addresses At the moment GDB only handles pointer authentication (pauth) for userspace addresses and if we're debugging a Linux-hosted program. The Linux Kernel can be configured to use pauth instructions for some additional security hardening, but GDB doesn't handle this well. To overcome this limitation, GDB needs a couple things: 1 - The target needs to advertise pauth support. 2 - The hook to remove non-address bits from a pointer needs to be registered in aarch64-tdep.c as opposed to aarch64-linux-tdep.c. There is a patch for QEMU that addresses the first point, and it makes QEMU's gdbstub expose a couple more pauth mask registers, so overall we will have up to 4 pauth masks (2 masks or 4 masks): pauth_dmask pauth_cmask pauth_dmask_high pauth_cmask_high pauth_dmask and pauth_cmask are the masks used to remove pauth signatures from userspace addresses. pauth_dmask_high and pauth_cmask_high masks are used to remove pauth signatures from kernel addresses. The second point is easily addressed by moving code around. When debugging a Linux Kernel built with pauth with an unpatched GDB, we get the following backtrace: #0 __fput (file=0xffff0000c17a6400) at /repos/linux/fs/file_table.c:296 #1 0xffff8000082bd1f0 in ____fput (work=<optimized out>) at /repos/linux/fs/file_table.c:348 #2 0x30008000080ade30 [PAC] in ?? () #3 0x30d48000080ade30 in ?? () Backtrace stopped: previous frame identical to this frame (corrupt stack?) With a patched GDB, we get something a lot more meaningful: #0 __fput (file=0xffff0000c1bcfa00) at /repos/linux/fs/file_table.c:296 #1 0xffff8000082bd1f0 in ____fput (work=<optimized out>) at /repos/linux/fs/file_table.c:348 #2 0xffff8000080ade30 [PAC] in task_work_run () at /repos/linux/kernel/task_work.c:179 #3 0xffff80000801db90 [PAC] in resume_user_mode_work (regs=0xffff80000a96beb0) at /repos/linux/include/linux/resume_user_mode.h:49 #4 do_notify_resume (regs=regs@entry=0xffff80000a96beb0, thread_flags=4) at /repos/linux/arch/arm64/kernel/signal.c:1127 #5 0xffff800008fb9974 [PAC] in prepare_exit_to_user_mode (regs=0xffff80000a96beb0) at /repos/linux/arch/arm64/kernel/entry-common.c:137 #6 exit_to_user_mode (regs=0xffff80000a96beb0) at /repos/linux/arch/arm64/kernel/entry-common.c:142 #7 el0_svc (regs=0xffff80000a96beb0) at /repos/linux/arch/arm64/kernel/entry-common.c:638 #8 0xffff800008fb9d34 [PAC] in el0t_64_sync_handler (regs=<optimized out>) at /repos/linux/arch/arm64/kernel/entry-common.c:655 #9 0xffff800008011548 [PAC] in el0t_64_sync () at /repos/linux/arch/arm64/kernel/entry.S:586 Backtrace stopped: Cannot access memory at address 0xffff80000a96c0c8
mbitsnbites
pushed a commit
that referenced
this issue
Mar 20, 2023
In some cases GDB will fail when attempting to complete a command that involves a rust symbol, the failure can manifest as a crash. The problem is caused by the completion_match_for_lcd object being left containing invalid data during calls to cp_symbol_name_matches_1. The first question to address is why we are calling a C++ support function when handling a rust symbol. That's due to GDB's auto language detection for msymbols, in some cases GDB can't tell if a symbol is a rust symbol, or a C++ symbol. The test application contains symbols for functions which are statically linked in from various rust support libraries. There's no DWARF for these symbols, so all GDB has is the msymbols built from the ELF symbol table. Here's the problematic symbol that leads to our crash: mangled: _ZN4core3str21_$LT$impl$u20$str$GT$5parse17h5111d2d6a50d22bdE demangled: core::str::<impl str>::parse As an msymbol this is initially created with language auto, then GDB eventually calls symbol_find_demangled_name, which loops over all languages calling language_defn::sniff_from_mangled_name, the first language that can demangle the symbol gets assigned as the language for that symbol. Unfortunately, there's overlap in the mangled symbol names, some (legacy) rust symbols can be demangled as both rust and C++, see cplus_demangle in libiberty/cplus-dem.c where this is mentioned. And so, because we check the C++ language before we check for rust, then the msymbol is (incorrectly) given the C++ language. Now it's true that is some cases we might be able to figure out that a demangled symbol is not actually a valid C++ symbol, for example, in our case, the construct '::<impl str>::' is not, I believe, valid in a C++ symbol, we could look for ':<' and '>:' and refuse to accept this as a C++ symbol. However, I'm not sure it is always possible to tell that a demangled symbol is rust or C++, so, I think, we have to accept that some times we will get this language detection wrong. If we accept that we can't fix the symbol language detection 100% of the time, then we should make sure that GDB doesn't crash when it gets the language wrong, that is what this commit addresses. In our test case the user tries to complete a symbol name like this: (gdb) complete break pars This results in GDB trying to find all symbols that match 'pars', eventually we consider our problematic symbol, and we end up with a call stack that looks like this: #0 0x0000000000f3c6bd in strncmp_iw_with_mode #1 0x0000000000706d8d in cp_symbol_name_matches_1 #2 0x0000000000706fa4 in cp_symbol_name_matches #3 0x0000000000df3c45 in compare_symbol_name #4 0x0000000000df3c91 in completion_list_add_name #5 0x0000000000df3f1d in completion_list_add_msymbol #6 0x0000000000df4c94 in default_collect_symbol_completion_matches_break_on #7 0x0000000000658c08 in language_defn::collect_symbol_completion_matches #8 0x0000000000df54c9 in collect_symbol_completion_matches #9 0x00000000009d98fb in linespec_complete_function #10 0x00000000009d99f0 in complete_linespec_component #11 0x00000000009da200 in linespec_complete #12 0x00000000006e4132 in complete_address_and_linespec_locations #13 0x00000000006e4ac3 in location_completer In cp_symbol_name_matches_1 we enter a loop, this loop repeatedly tries to match the demangled problematic symbol name against the user supplied text ('pars'). Each time around the loop another component of the symbol name is stripped off, thus, we check 'pars' against these options: core::str::<impl str>::parse str::<impl str>::parse <impl str>::parse parse As soon as we get a match the cp_symbol_name_matches_1 exits its loop and returns. In our case, when we're looking for 'pars', the match occurs on the last iteration of the loop, when we are comparing to 'parse'. Now the problem here is that cp_symbol_name_matches_1 uses the strncmp_iw_with_mode, and inside strncmp_iw_with_mode we allow for skipping over template parameters. This allows GDB to match the symbol name 'foo<int>(int,int)' if the user supplies 'foo(int,'. Inside strncmp_iw_with_mode GDB will record any template arguments that it has skipped over inside the completion_match_for_lcd object that is passed in as an argument. And so, when GDB tries to match against '<impl str>::parse', the first thing it sees is '<impl str>', GDB assumes this is a template argument and records this as a skipped region within the completion_match_for_lcd object. After '<impl str>' GDB sees a ':' character, which doesn't match with the 'pars' the user supplied, so strncmp_iw_with_mode returns a value indicating a non-match. GDB then removes the '<impl str>' component from the symbol name and tries again, this time comparing to 'parse', which does match. Having found a match, then in cp_symbol_name_matches_1 we record the match string, and the full symbol name within the completion_match_result object, and return. The problem here is that the skipped region, the '<impl str>' that we recorded in the penultimate loop iteration was never discarded, its still there in our returned result. If we look at what the pointers held in the completion_match_result that cp_symbol_name_matches_1 returns, this is what we see: core::str::<impl str>::parse | \________/ | | | '--- completion match string | '---skip range '--- full symbol name When GDB calls completion_match_for_lcd::finish, GDB tries to create a string using the completion match string (parse), but excluding the skip range, as the stored skip range is before the start of the completion match string, then GDB tries to do some weird string creation, which will cause GDB to crash. The reason we don't often see this problem in C++ is that for C++ symbols there is always some non-template text before the template argument. This non-template text means GDB is likely to either match the symbol, or reject the symbol without storing a skip range. However, notice, I did say, we don't often see this problem. Once I understood the issue, I was able to reproduce the crash using a pure C++ example: template<typename S> struct foo { template<typename T> foo (int p1, T a) { s = 0; } S s; }; int main () { foo<int> obj (2.3, 0); return 0; } Then in GDB: (gdb) complete break foo(int The problem here is that the C++ symbol for the constructor looks like this: foo<int>::foo<double>(int, double) When GDB enters cp_symbol_name_matches_1 the symbols it examines are: foo<int>::foo<double>(int, double) foo<double>(int, double) The first iteration of the loop will match the 'foo', then add the '<int>' template argument will be added as a skip range. When GDB find the ':' after the '<int>' the first iteration of the loop fails to match, GDB removes the 'foo<int>::' component, and starts the second iteration of the loop. Again, GDB matches the 'foo', and now adds '<double>' as a skip region. After that the '(int' successfully matches, and so the second iteration of the loop succeeds, but, once again we left the '<int>' in place as a skip region, even though this occurs before the start of our match string, and this will cause GDB to crash. This problem was reported to the mailing list, and a solution discussed in this thread: https://sourceware.org/pipermail/gdb-patches/2023-January/195166.html The solution proposed here is similar to one proposed by the original bug reported, but implemented in a different location within GDB. Instead of placing the fix in strncmp_iw_with_mode, I place the fix in cp_symbol_name_matches_1. I believe this is a better location as it is this function that implements the loop, and it is this loop, which repeatedly calls strncmp_iw_with_mode, that should be resetting the result object state (I believe). What I have done is add an assert to strncmp_iw_with_mode that the incoming result object is empty. I've also added some other asserts in related code, in completion_match_for_lcd::mark_ignored_range, I make some basic assertions about the incoming range pointers, and in completion_match_for_lcd::finish I also make some assertions about how the skip ranges relate to the match pointer. There's two new tests. The original rust example that was used in the initial bug report, and a C++ test. The rust example depends on which symbols are pulled in from the rust libraries, so it is possible that, at some future date, the problematic symbol will disappear from this test program. The C++ test should be more reliable, as this only depends on symbols from within the C++ source code. Since I originally posted this patch to the mailing list, the following patch has been merged: commit 6e7eef7 Date: Sun Mar 19 09:13:10 2023 -0600 Use rust_demangle to fix a crash This solves the problem of a rust symbol ending up in the C++ specific code by changing the order languages are sorted. However, this new commit doesn't address the issue in the C++ code which was fixed with this commit. Given that the C++ issue is real, and has a reproducer, I'm still going to merge this fix. I've left the discussion of rust in this commit message as I originally wrote it, but it should be read within the context of GDB prior to commit 6e7eef7. Co-Authored-By: Zheng Zhan <[email protected]>
mbitsnbites
pushed a commit
that referenced
this issue
Jun 20, 2023
Commit 7a8de0c ("Remove ALL_BREAKPOINTS_SAFE") introduced a use-after-free in the breakpoints iterations (see below for full ASan report). This makes gdb.base/stale-infcall.exp fail when GDB is build with ASan. check_longjmp_breakpoint_for_call_dummy iterates on all breakpoints, possibly deleting the current breakpoint as well as related breakpoints. The problem arises when a breakpoint in the B->related_breakpoint chain is also B->next. In that case, deleting that related breakpoint frees the breakpoint that all_breakpoints_safe has saved. The old code worked around that by manually changing B_TMP, which was the next breakpoint saved by the "safe iterator": while (b->related_breakpoint != b) { if (b_tmp == b->related_breakpoint) b_tmp = b->related_breakpoint->next; delete_breakpoint (b->related_breakpoint); } (Note that this seemed to assume that b->related_breakpoint->next was the same as b->next->next, not sure this is guaranteed.) The new code kept the B_TMP variable, but it's not useful in that context. We can't go change the next breakpoint as saved by the safe iterator, like we did before. I suggest fixing that by saving the breakpoints to delete in a map and deleting them all at the end. Here's the full ASan report: (gdb) PASS: gdb.base/stale-infcall.exp: continue to breakpoint: break-run1 print infcall () ================================================================= ==47472==ERROR: AddressSanitizer: heap-use-after-free on address 0x611000034980 at pc 0x563f7012c7bc bp 0x7ffdf3804d70 sp 0x7ffdf3804d60 READ of size 8 at 0x611000034980 thread T0 #0 0x563f7012c7bb in next_iterator<breakpoint>::operator++() /home/smarchi/src/binutils-gdb/gdb/../gdbsupport/next-iterator.h:66 #1 0x563f702ce8c0 in basic_safe_iterator<next_iterator<breakpoint> >::operator++() /home/smarchi/src/binutils-gdb/gdb/../gdbsupport/safe-iterator.h:84 #2 0x563f7021522a in check_longjmp_breakpoint_for_call_dummy(thread_info*) /home/smarchi/src/binutils-gdb/gdb/breakpoint.c:7611 #3 0x563f714567b1 in process_event_stop_test /home/smarchi/src/binutils-gdb/gdb/infrun.c:6881 #4 0x563f71454e07 in handle_signal_stop /home/smarchi/src/binutils-gdb/gdb/infrun.c:6769 #5 0x563f7144b680 in handle_inferior_event /home/smarchi/src/binutils-gdb/gdb/infrun.c:6023 #6 0x563f71436165 in fetch_inferior_event() /home/smarchi/src/binutils-gdb/gdb/infrun.c:4387 #7 0x563f7136ff51 in inferior_event_handler(inferior_event_type) /home/smarchi/src/binutils-gdb/gdb/inf-loop.c:42 #8 0x563f7168038d in handle_target_event /home/smarchi/src/binutils-gdb/gdb/linux-nat.c:4219 #9 0x563f72fccb6d in handle_file_event /home/smarchi/src/binutils-gdb/gdbsupport/event-loop.cc:573 #10 0x563f72fcd503 in gdb_wait_for_event /home/smarchi/src/binutils-gdb/gdbsupport/event-loop.cc:694 #11 0x563f72fcaf2b in gdb_do_one_event(int) /home/smarchi/src/binutils-gdb/gdbsupport/event-loop.cc:217 #12 0x563f7262b9bb in wait_sync_command_done() /home/smarchi/src/binutils-gdb/gdb/top.c:426 #13 0x563f7137a7c3 in run_inferior_call /home/smarchi/src/binutils-gdb/gdb/infcall.c:650 #14 0x563f71381295 in call_function_by_hand_dummy(value*, type*, gdb::array_view<value*>, void (*)(void*, int), void*) /home/smarchi/src/binutils-gdb/gdb/infcall.c:1332 #15 0x563f7137c0e2 in call_function_by_hand(value*, type*, gdb::array_view<value*>) /home/smarchi/src/binutils-gdb/gdb/infcall.c:780 #16 0x563f70fe5960 in evaluate_subexp_do_call(expression*, noside, value*, gdb::array_view<value*>, char const*, type*) /home/smarchi/src/binutils-gdb/gdb/eval.c:649 #17 0x563f70fe6617 in expr::operation::evaluate_funcall(type*, expression*, noside, char const*, std::__debug::vector<std::unique_ptr<expr::operation, std::default_delete<expr::operation> >, std::allocator<std::unique_ptr<expr::operation, std::default_delete<expr::operation> > > > const&) /home/smarchi/src/binutils-gdb/gdb/eval.c:677 #18 0x563f6fd19668 in expr::operation::evaluate_funcall(type*, expression*, noside, std::__debug::vector<std::unique_ptr<expr::operation, std::default_delete<expr::operation> >, std::allocator<std::unique_ptr<expr::operation, std::default_delete<expr::operation> > > > const&) /home/smarchi/src/binutils-gdb/gdb/expression.h:136 #19 0x563f70fe6bba in expr::var_value_operation::evaluate_funcall(type*, expression*, noside, std::__debug::vector<std::unique_ptr<expr::operation, std::default_delete<expr::operation> >, std::allocator<std::unique_ptr<expr::operation, std::default_delete<expr::operation> > > > const&) /home/smarchi/src/binutils-gdb/gdb/eval.c:689 #20 0x563f704b71dc in expr::funcall_operation::evaluate(type*, expression*, noside) /home/smarchi/src/binutils-gdb/gdb/expop.h:2219 #21 0x563f70fe0f02 in expression::evaluate(type*, noside) /home/smarchi/src/binutils-gdb/gdb/eval.c:110 #22 0x563f71b1373e in process_print_command_args /home/smarchi/src/binutils-gdb/gdb/printcmd.c:1319 #23 0x563f71b1391b in print_command_1 /home/smarchi/src/binutils-gdb/gdb/printcmd.c:1332 #24 0x563f71b147ec in print_command /home/smarchi/src/binutils-gdb/gdb/printcmd.c:1465 #25 0x563f706029b8 in do_simple_func /home/smarchi/src/binutils-gdb/gdb/cli/cli-decode.c:95 #26 0x563f7061972a in cmd_func(cmd_list_element*, char const*, int) /home/smarchi/src/binutils-gdb/gdb/cli/cli-decode.c:2735 #27 0x563f7262d0ef in execute_command(char const*, int) /home/smarchi/src/binutils-gdb/gdb/top.c:572 #28 0x563f7100ed9c in command_handler(char const*) /home/smarchi/src/binutils-gdb/gdb/event-top.c:543 #29 0x563f7101014b in command_line_handler(std::unique_ptr<char, gdb::xfree_deleter<char> >&&) /home/smarchi/src/binutils-gdb/gdb/event-top.c:779 #30 0x563f72777942 in tui_command_line_handler /home/smarchi/src/binutils-gdb/gdb/tui/tui-interp.c:104 #31 0x563f7100d059 in gdb_rl_callback_handler /home/smarchi/src/binutils-gdb/gdb/event-top.c:250 #32 0x7f5a80418246 in rl_callback_read_char (/usr/lib/libreadline.so.8+0x3b246) (BuildId: 092e91fc4361b0ef94561e3ae03a75f69398acbb) #33 0x563f7100ca06 in gdb_rl_callback_read_char_wrapper_noexcept /home/smarchi/src/binutils-gdb/gdb/event-top.c:192 #34 0x563f7100cc5e in gdb_rl_callback_read_char_wrapper /home/smarchi/src/binutils-gdb/gdb/event-top.c:225 #35 0x563f728c70db in stdin_event_handler /home/smarchi/src/binutils-gdb/gdb/ui.c:155 #36 0x563f72fccb6d in handle_file_event /home/smarchi/src/binutils-gdb/gdbsupport/event-loop.cc:573 #37 0x563f72fcd503 in gdb_wait_for_event /home/smarchi/src/binutils-gdb/gdbsupport/event-loop.cc:694 #38 0x563f72fcb15c in gdb_do_one_event(int) /home/smarchi/src/binutils-gdb/gdbsupport/event-loop.cc:264 #39 0x563f7177ec1c in start_event_loop /home/smarchi/src/binutils-gdb/gdb/main.c:412 #40 0x563f7177f12e in captured_command_loop /home/smarchi/src/binutils-gdb/gdb/main.c:476 #41 0x563f717846e4 in captured_main /home/smarchi/src/binutils-gdb/gdb/main.c:1320 #42 0x563f71784821 in gdb_main(captured_main_args*) /home/smarchi/src/binutils-gdb/gdb/main.c:1339 #43 0x563f6fcedfbd in main /home/smarchi/src/binutils-gdb/gdb/gdb.c:32 #44 0x7f5a7e43984f (/usr/lib/libc.so.6+0x2384f) (BuildId: 2f005a79cd1a8e385972f5a102f16adba414d75e) #45 0x7f5a7e439909 in __libc_start_main (/usr/lib/libc.so.6+0x23909) (BuildId: 2f005a79cd1a8e385972f5a102f16adba414d75e) #46 0x563f6fcedd84 in _start (/home/smarchi/build/binutils-gdb/gdb/gdb+0xafb0d84) (BuildId: 50bd32e6e9d5e84543e9897b8faca34858ca3995) 0x611000034980 is located 0 bytes inside of 208-byte region [0x611000034980,0x611000034a50) freed by thread T0 here: #0 0x7f5a7fce312a in operator delete(void*, unsigned long) /usr/src/debug/gcc/gcc/libsanitizer/asan/asan_new_delete.cpp:164 #1 0x563f702bd1fa in momentary_breakpoint::~momentary_breakpoint() /home/smarchi/src/binutils-gdb/gdb/breakpoint.c:304 #2 0x563f702771c5 in delete_breakpoint(breakpoint*) /home/smarchi/src/binutils-gdb/gdb/breakpoint.c:12404 #3 0x563f702150a7 in check_longjmp_breakpoint_for_call_dummy(thread_info*) /home/smarchi/src/binutils-gdb/gdb/breakpoint.c:7673 #4 0x563f714567b1 in process_event_stop_test /home/smarchi/src/binutils-gdb/gdb/infrun.c:6881 #5 0x563f71454e07 in handle_signal_stop /home/smarchi/src/binutils-gdb/gdb/infrun.c:6769 #6 0x563f7144b680 in handle_inferior_event /home/smarchi/src/binutils-gdb/gdb/infrun.c:6023 #7 0x563f71436165 in fetch_inferior_event() /home/smarchi/src/binutils-gdb/gdb/infrun.c:4387 #8 0x563f7136ff51 in inferior_event_handler(inferior_event_type) /home/smarchi/src/binutils-gdb/gdb/inf-loop.c:42 #9 0x563f7168038d in handle_target_event /home/smarchi/src/binutils-gdb/gdb/linux-nat.c:4219 #10 0x563f72fccb6d in handle_file_event /home/smarchi/src/binutils-gdb/gdbsupport/event-loop.cc:573 #11 0x563f72fcd503 in gdb_wait_for_event /home/smarchi/src/binutils-gdb/gdbsupport/event-loop.cc:694 #12 0x563f72fcaf2b in gdb_do_one_event(int) /home/smarchi/src/binutils-gdb/gdbsupport/event-loop.cc:217 #13 0x563f7262b9bb in wait_sync_command_done() /home/smarchi/src/binutils-gdb/gdb/top.c:426 #14 0x563f7137a7c3 in run_inferior_call /home/smarchi/src/binutils-gdb/gdb/infcall.c:650 #15 0x563f71381295 in call_function_by_hand_dummy(value*, type*, gdb::array_view<value*>, void (*)(void*, int), void*) /home/smarchi/src/binutils-gdb/gdb/infcall.c:1332 #16 0x563f7137c0e2 in call_function_by_hand(value*, type*, gdb::array_view<value*>) /home/smarchi/src/binutils-gdb/gdb/infcall.c:780 #17 0x563f70fe5960 in evaluate_subexp_do_call(expression*, noside, value*, gdb::array_view<value*>, char const*, type*) /home/smarchi/src/binutils-gdb/gdb/eval.c:649 #18 0x563f70fe6617 in expr::operation::evaluate_funcall(type*, expression*, noside, char const*, std::__debug::vector<std::unique_ptr<expr::operation, std::default_delete<expr::operation> >, std::allocator<std::unique_ptr<expr::operation, std::default_delete<expr::operation> > > > const&) /home/smarchi/src/binutils-gdb/gdb/eval.c:677 #19 0x563f6fd19668 in expr::operation::evaluate_funcall(type*, expression*, noside, std::__debug::vector<std::unique_ptr<expr::operation, std::default_delete<expr::operation> >, std::allocator<std::unique_ptr<expr::operation, std::default_delete<expr::operation> > > > const&) /home/smarchi/src/binutils-gdb/gdb/expression.h:136 #20 0x563f70fe6bba in expr::var_value_operation::evaluate_funcall(type*, expression*, noside, std::__debug::vector<std::unique_ptr<expr::operation, std::default_delete<expr::operation> >, std::allocator<std::unique_ptr<expr::operation, std::default_delete<expr::operation> > > > const&) /home/smarchi/src/binutils-gdb/gdb/eval.c:689 #21 0x563f704b71dc in expr::funcall_operation::evaluate(type*, expression*, noside) /home/smarchi/src/binutils-gdb/gdb/expop.h:2219 #22 0x563f70fe0f02 in expression::evaluate(type*, noside) /home/smarchi/src/binutils-gdb/gdb/eval.c:110 #23 0x563f71b1373e in process_print_command_args /home/smarchi/src/binutils-gdb/gdb/printcmd.c:1319 #24 0x563f71b1391b in print_command_1 /home/smarchi/src/binutils-gdb/gdb/printcmd.c:1332 #25 0x563f71b147ec in print_command /home/smarchi/src/binutils-gdb/gdb/printcmd.c:1465 #26 0x563f706029b8 in do_simple_func /home/smarchi/src/binutils-gdb/gdb/cli/cli-decode.c:95 #27 0x563f7061972a in cmd_func(cmd_list_element*, char const*, int) /home/smarchi/src/binutils-gdb/gdb/cli/cli-decode.c:2735 #28 0x563f7262d0ef in execute_command(char const*, int) /home/smarchi/src/binutils-gdb/gdb/top.c:572 #29 0x563f7100ed9c in command_handler(char const*) /home/smarchi/src/binutils-gdb/gdb/event-top.c:543 previously allocated by thread T0 here: #0 0x7f5a7fce2012 in operator new(unsigned long) /usr/src/debug/gcc/gcc/libsanitizer/asan/asan_new_delete.cpp:95 #1 0x563f7029a9a3 in new_momentary_breakpoint<program_space*&, frame_id&, int&> /home/smarchi/src/binutils-gdb/gdb/breakpoint.c:8129 #2 0x563f702212f6 in momentary_breakpoint_from_master /home/smarchi/src/binutils-gdb/gdb/breakpoint.c:8169 #3 0x563f70212db1 in set_longjmp_breakpoint_for_call_dummy() /home/smarchi/src/binutils-gdb/gdb/breakpoint.c:7582 #4 0x563f713804db in call_function_by_hand_dummy(value*, type*, gdb::array_view<value*>, void (*)(void*, int), void*) /home/smarchi/src/binutils-gdb/gdb/infcall.c:1260 #5 0x563f7137c0e2 in call_function_by_hand(value*, type*, gdb::array_view<value*>) /home/smarchi/src/binutils-gdb/gdb/infcall.c:780 #6 0x563f70fe5960 in evaluate_subexp_do_call(expression*, noside, value*, gdb::array_view<value*>, char const*, type*) /home/smarchi/src/binutils-gdb/gdb/eval.c:649 #7 0x563f70fe6617 in expr::operation::evaluate_funcall(type*, expression*, noside, char const*, std::__debug::vector<std::unique_ptr<expr::operation, std::default_delete<expr::operation> >, std::allocator<std::unique_ptr<expr::operation, std::default_delete<expr::operation> > > > const&) /home/smarchi/src/binutils-gdb/gdb/eval.c:677 #8 0x563f6fd19668 in expr::operation::evaluate_funcall(type*, expression*, noside, std::__debug::vector<std::unique_ptr<expr::operation, std::default_delete<expr::operation> >, std::allocator<std::unique_ptr<expr::operation, std::default_delete<expr::operation> > > > const&) /home/smarchi/src/binutils-gdb/gdb/expression.h:136 #9 0x563f70fe6bba in expr::var_value_operation::evaluate_funcall(type*, expression*, noside, std::__debug::vector<std::unique_ptr<expr::operation, std::default_delete<expr::operation> >, std::allocator<std::unique_ptr<expr::operation, std::default_delete<expr::operation> > > > const&) /home/smarchi/src/binutils-gdb/gdb/eval.c:689 #10 0x563f704b71dc in expr::funcall_operation::evaluate(type*, expression*, noside) /home/smarchi/src/binutils-gdb/gdb/expop.h:2219 #11 0x563f70fe0f02 in expression::evaluate(type*, noside) /home/smarchi/src/binutils-gdb/gdb/eval.c:110 #12 0x563f71b1373e in process_print_command_args /home/smarchi/src/binutils-gdb/gdb/printcmd.c:1319 #13 0x563f71b1391b in print_command_1 /home/smarchi/src/binutils-gdb/gdb/printcmd.c:1332 #14 0x563f71b147ec in print_command /home/smarchi/src/binutils-gdb/gdb/printcmd.c:1465 #15 0x563f706029b8 in do_simple_func /home/smarchi/src/binutils-gdb/gdb/cli/cli-decode.c:95 #16 0x563f7061972a in cmd_func(cmd_list_element*, char const*, int) /home/smarchi/src/binutils-gdb/gdb/cli/cli-decode.c:2735 #17 0x563f7262d0ef in execute_command(char const*, int) /home/smarchi/src/binutils-gdb/gdb/top.c:572 #18 0x563f7100ed9c in command_handler(char const*) /home/smarchi/src/binutils-gdb/gdb/event-top.c:543 #19 0x563f7101014b in command_line_handler(std::unique_ptr<char, gdb::xfree_deleter<char> >&&) /home/smarchi/src/binutils-gdb/gdb/event-top.c:779 #20 0x563f72777942 in tui_command_line_handler /home/smarchi/src/binutils-gdb/gdb/tui/tui-interp.c:104 #21 0x563f7100d059 in gdb_rl_callback_handler /home/smarchi/src/binutils-gdb/gdb/event-top.c:250 #22 0x7f5a80418246 in rl_callback_read_char (/usr/lib/libreadline.so.8+0x3b246) (BuildId: 092e91fc4361b0ef94561e3ae03a75f69398acbb) Change-Id: Id00c17ab677f847fbf4efdf0f4038373668d3d88 Approved-By: Tom Tromey <[email protected]>
mbitsnbites
pushed a commit
that referenced
this issue
Jun 20, 2023
Commit b5661ff ("gdb: fix possible use-after-free when executing commands") attempted to fix possible use-after-free in case command redefines itself. Commit 37e5833 ("gdb: fix command lookup in execute_command ()") updated the previous fix to handle subcommands as well by using the original command string to lookup the command again after its execution. This fixed the test in gdb.base/define.exp but it turned out that it does not work (at least) for "target remote" and "target extended-remote". The problem is that the command buffer P passed to execute_command () gets overwritten in dont_repeat () while executing "target remote" command itself: #0 dont_repeat () at top.c:822 #1 0x000055555730982a in target_preopen (from_tty=1) at target.c:2483 #2 0x000055555711e911 in remote_target::open_1 (name=0x55555881c7fe ":1234", from_tty=1, extended_p=0) at remote.c:5946 #3 0x000055555711d577 in remote_target::open (name=0x55555881c7fe ":1234", from_tty=1) at remote.c:5272 #4 0x00005555573062f2 in open_target (args=0x55555881c7fe ":1234", from_tty=1, command=0x5555589d0490) at target.c:853 #5 0x0000555556ad22fa in cmd_func (cmd=0x5555589d0490, args=0x55555881c7fe ":1234", from_tty=1) at cli/cli-decode.c:2737 #6 0x00005555573487fd in execute_command (p=0x55555881c802 "4", from_tty=1) at top.c:688 Therefore the second call to lookup_cmd () at line 697 fails to find command because the original command string is gone. This commit addresses this particular problem by creating a *copy* of original command string for the sole purpose of using it after command execution to lookup the command again. It may not be the most efficient way but it's safer given that command buffer is shared and overwritten in hard-to-foresee situations. Tested on x86_64-linux. PR 30249 Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=30249 Approved-By: Tom Tromey <[email protected]>
mbitsnbites
pushed a commit
that referenced
this issue
Jun 20, 2023
After this commit: commit baab375 Date: Tue Jul 13 14:44:27 2021 -0400 gdb: building inferior strings from within GDB It was pointed out that a new ASan failure had been introduced which was triggered by gdb.base/internal-string-values.exp: (gdb) PASS: gdb.base/internal-string-values.exp: test_setting: all langs: lang=ada: ptype "foo" print $_gdb_maint_setting("test-settings string") ================================================================= ==80377==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x603000068034 at pc 0x564785cba682 bp 0x7ffd20644620 sp 0x7ffd20644610 READ of size 1 at 0x603000068034 thread T0 #0 0x564785cba681 in find_command_name_length(char const*) /tmp/src/binutils-gdb/gdb/cli/cli-decode.c:2129 #1 0x564785cbacb2 in lookup_cmd_1(char const**, cmd_list_element*, cmd_list_element**, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >*, int, bool) /tmp/src/binutils-gdb/gdb/cli/cli-decode.c:2186 #2 0x564785cbb539 in lookup_cmd_1(char const**, cmd_list_element*, cmd_list_element**, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >*, int, bool) /tmp/src/binutils-gdb/gdb/cli/cli-decode.c:2248 #3 0x564785cbbcf3 in lookup_cmd(char const**, cmd_list_element*, char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >*, int, int) /tmp/src/binutils-gdb/gdb/cli/cli-decode.c:2339 #4 0x564785c82df2 in setting_cmd /tmp/src/binutils-gdb/gdb/cli/cli-cmds.c:2219 #5 0x564785c84274 in gdb_maint_setting_internal_fn /tmp/src/binutils-gdb/gdb/cli/cli-cmds.c:2348 #6 0x564788167b3b in call_internal_function(gdbarch*, language_defn const*, value*, int, value**) /tmp/src/binutils-gdb/gdb/value.c:2321 #7 0x5647854b6ebd in expr::ada_funcall_operation::evaluate(type*, expression*, noside) /tmp/src/binutils-gdb/gdb/ada-lang.c:11254 #8 0x564786658266 in expression::evaluate(type*, noside) /tmp/src/binutils-gdb/gdb/eval.c:111 #9 0x5647871242d6 in process_print_command_args /tmp/src/binutils-gdb/gdb/printcmd.c:1322 #10 0x5647871244b3 in print_command_1 /tmp/src/binutils-gdb/gdb/printcmd.c:1335 #11 0x564787125384 in print_command /tmp/src/binutils-gdb/gdb/printcmd.c:1468 #12 0x564785caac44 in do_simple_func /tmp/src/binutils-gdb/gdb/cli/cli-decode.c:95 #13 0x564785cc18f0 in cmd_func(cmd_list_element*, char const*, int) /tmp/src/binutils-gdb/gdb/cli/cli-decode.c:2735 #14 0x564787c70c68 in execute_command(char const*, int) /tmp/src/binutils-gdb/gdb/top.c:574 #15 0x564786686180 in command_handler(char const*) /tmp/src/binutils-gdb/gdb/event-top.c:543 #16 0x56478668752f in command_line_handler(std::unique_ptr<char, gdb::xfree_deleter<char> >&&) /tmp/src/binutils-gdb/gdb/event-top.c:779 #17 0x564787dcb29a in tui_command_line_handler /tmp/src/binutils-gdb/gdb/tui/tui-interp.c:104 #18 0x56478668443d in gdb_rl_callback_handler /tmp/src/binutils-gdb/gdb/event-top.c:250 #19 0x7f4efd506246 in rl_callback_read_char (/usr/lib/libreadline.so.8+0x3b246) (BuildId: 092e91fc4361b0ef94561e3ae03a75f69398acbb) #20 0x564786683dea in gdb_rl_callback_read_char_wrapper_noexcept /tmp/src/binutils-gdb/gdb/event-top.c:192 #21 0x564786684042 in gdb_rl_callback_read_char_wrapper /tmp/src/binutils-gdb/gdb/event-top.c:225 #22 0x564787f1b119 in stdin_event_handler /tmp/src/binutils-gdb/gdb/ui.c:155 #23 0x56478862438d in handle_file_event /tmp/src/binutils-gdb/gdbsupport/event-loop.cc:573 #24 0x564788624d23 in gdb_wait_for_event /tmp/src/binutils-gdb/gdbsupport/event-loop.cc:694 #25 0x56478862297c in gdb_do_one_event(int) /tmp/src/binutils-gdb/gdbsupport/event-loop.cc:264 #26 0x564786df99f0 in start_event_loop /tmp/src/binutils-gdb/gdb/main.c:412 #27 0x564786dfa069 in captured_command_loop /tmp/src/binutils-gdb/gdb/main.c:476 #28 0x564786dff61f in captured_main /tmp/src/binutils-gdb/gdb/main.c:1320 #29 0x564786dff75c in gdb_main(captured_main_args*) /tmp/src/binutils-gdb/gdb/main.c:1339 #30 0x564785381b6d in main /tmp/src/binutils-gdb/gdb/gdb.c:32 #31 0x7f4efbc3984f (/usr/lib/libc.so.6+0x2384f) (BuildId: 2f005a79cd1a8e385972f5a102f16adba414d75e) #32 0x7f4efbc39909 in __libc_start_main (/usr/lib/libc.so.6+0x23909) (BuildId: 2f005a79cd1a8e385972f5a102f16adba414d75e) #33 0x564785381934 in _start (/tmp/build/binutils-gdb/gdb/gdb+0xabc5934) (BuildId: 90de353ac158646e7dab501b76a18a76628fca33) 0x603000068034 is located 0 bytes after 20-byte region [0x603000068020,0x603000068034) allocated by thread T0 here: #0 0x7f4efcee0cd1 in __interceptor_calloc /usr/src/debug/gcc/gcc/libsanitizer/asan/asan_malloc_linux.cpp:77 #1 0x5647856265d8 in xcalloc /tmp/src/binutils-gdb/gdb/alloc.c:97 #2 0x564788610c6b in xzalloc(unsigned long) /tmp/src/binutils-gdb/gdbsupport/common-utils.cc:29 #3 0x56478815721a in value::allocate_contents(bool) /tmp/src/binutils-gdb/gdb/value.c:929 #4 0x564788157285 in value::allocate(type*, bool) /tmp/src/binutils-gdb/gdb/value.c:941 #5 0x56478815733a in value::allocate(type*) /tmp/src/binutils-gdb/gdb/value.c:951 #6 0x5647854ae81c in expr::ada_string_operation::evaluate(type*, expression*, noside) /tmp/src/binutils-gdb/gdb/ada-lang.c:10675 #7 0x5647854b63b8 in expr::ada_funcall_operation::evaluate(type*, expression*, noside) /tmp/src/binutils-gdb/gdb/ada-lang.c:11184 #8 0x564786658266 in expression::evaluate(type*, noside) /tmp/src/binutils-gdb/gdb/eval.c:111 #9 0x5647871242d6 in process_print_command_args /tmp/src/binutils-gdb/gdb/printcmd.c:1322 #10 0x5647871244b3 in print_command_1 /tmp/src/binutils-gdb/gdb/printcmd.c:1335 #11 0x564787125384 in print_command /tmp/src/binutils-gdb/gdb/printcmd.c:1468 #12 0x564785caac44 in do_simple_func /tmp/src/binutils-gdb/gdb/cli/cli-decode.c:95 #13 0x564785cc18f0 in cmd_func(cmd_list_element*, char const*, int) /tmp/src/binutils-gdb/gdb/cli/cli-decode.c:2735 #14 0x564787c70c68 in execute_command(char const*, int) /tmp/src/binutils-gdb/gdb/top.c:574 #15 0x564786686180 in command_handler(char const*) /tmp/src/binutils-gdb/gdb/event-top.c:543 #16 0x56478668752f in command_line_handler(std::unique_ptr<char, gdb::xfree_deleter<char> >&&) /tmp/src/binutils-gdb/gdb/event-top.c:779 #17 0x564787dcb29a in tui_command_line_handler /tmp/src/binutils-gdb/gdb/tui/tui-interp.c:104 #18 0x56478668443d in gdb_rl_callback_handler /tmp/src/binutils-gdb/gdb/event-top.c:250 #19 0x7f4efd506246 in rl_callback_read_char (/usr/lib/libreadline.so.8+0x3b246) (BuildId: 092e91fc4361b0ef94561e3ae03a75f69398acbb) The problem is in cli/cli-cmds.c, in the function setting_cmd, where we do this: const char *a0 = (const char *) argv[0]->contents ().data (); Here argv[0] is a value* which we know is either a TYPE_CODE_ARRAY or a TYPE_CODE_STRING. The problem is that the above line is casting the value contents directly to a C-string, i.e. one that is assumed to have a null-terminator at the end. After the above commit this can no longer be assumed to be true. A string value will be represented just as it would be in the current language, so for Ada and Fortran the string will be an array of characters with no null-terminator at the end. My proposed solution is to copy the string contents into a std::string object, and then use the std::string::c_str() value, this will ensure that a null-terminator has been added. I had a check through GDB at places TYPE_CODE_STRING was used and couldn't see any other obvious places where this type of assumption was being made, so hopefully this is the only offender. Running the above test with ASan compiled in no longer gives an error. Reviewed-By: Tom Tromey <[email protected]>
Sign up for free
to subscribe to this conversation on GitHub.
Already have an account?
Sign in.
For instance:
The text was updated successfully, but these errors were encountered: