Using dlopen on libnglib.so crashes in std::map #201

Open
bubbleguuum opened this issue Dec 17, 2024 · 2 comments · May be fixed by #202

@bubbleguuum

Version v6.2.2404, from openSUSE Tumbleweed.

Here's a weird one.

Using dlopen on libnglib.so causes a crash during the static initialization of the global variable below, defined in rw_medit.cpp:

static RegisterUserFormat reg_medit ("Medit Format", {".mesh"},
                                     ReadMeditFormat,
                                     WriteMeditFormat);

Here's the gdb stack trace:

#0  0x00007fcd066d641e in std::local_Rb_tree_decrement (__x=0x7fcd0610d688 <netgen::UserFormatRegister::format_to_entry_index[abi:cxx11]+8>) at ../../../../../libstdc++-v3/src/c++98/tree.cc:98
#1  std::_Rb_tree_decrement (__x=__x@entry=0x7fcd0610d688 <netgen::UserFormatRegister::format_to_entry_index[abi:cxx11]+8>) at ../../../../../libstdc++-v3/src/c++98/tree.cc:123
#2  0x00007fcd05f87790 in std::_Rb_tree_iterator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, int> >::operator-- (this=<synthetic pointer>) at /usr/include/c++/14/bits/stl_tree.h:298
#3  std::_Rb_tree<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, int>, std::_Select1st<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, int> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, int> > >::_M_get_insert_unique_pos (this=this@entry=0x7fcd0610d680 <netgen::UserFormatRegister::format_to_entry_index[abi:cxx11]>, __k=...) at /usr/include/c++/14/bits/stl_tree.h:2123
#4  0x00007fcd05f884d1 in std::_Rb_tree<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, int>, std::_Select1st<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, int> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, int> > >::_M_get_insert_hint_unique_pos(std::_Rb_tree_const_iterator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, int> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) [clone .isra.0] (this=this@entry=0x7fcd0610d680 <netgen::UserFormatRegister::format_to_entry_index[abi:cxx11]>, __position=..., __k=...)
    at /usr/include/c++/14/bits/stl_tree.h:2220
#5  0x00007fcd05f19787 in std::_Rb_tree<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, int>, std::_Select1st<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, int> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, int> > >::_M_emplace_hint_unique<std::piecewise_construct_t const&, std::tuple<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&>, std::tuple<> >(std::_Rb_tree_const_iterator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, int> >, std::piecewise_construct_t const&, std::tuple<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&>&&, std::tuple<>&&) [clone .constprop.0] [clone .isra.0] (__pos=..., this=<optimized out>) at /usr/include/c++/14/bits/stl_tree.h:2459
#6  0x00007fcd05b6888e in std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, int, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, int> > >::operator[] (this=<optimized out>, __k="Medit Format") at /usr/include/c++/14/bits/stl_map.h:513
#7  netgen::UserFormatRegister::Register (entry=...) at /usr/src/debug/netgen-6.2.2404/libsrc/interface/writeuser.hpp:35
#8  netgen::RegisterUserFormat::RegisterUserFormat(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, ngcore::Array<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, unsigned long>, std::optional<std::function<void (netgen::Mesh&, std::filesystem::__cxx11::path const&)> >, std::optional<std::function<void (netgen::Mesh const&, std::filesystem::__cxx11::path const&)> >, std::function<bool (std::filesystem::__cxx11::path const&)>) [clone .isra.0] (format=..., extensions=..., read=..., write=..., ftest=..., this=<optimized out>) at /usr/src/debug/netgen-6.2.2404/libsrc/interface/writeuser.hpp:62
#9  0x00007fcd05b6a382 in _sub_I_65535_0.0 () from /usr/lib64/netgen/libnglib.so
#10 0x00007fcd06a4869e in call_init (l=<optimized out>, argc=1, argv=0x7ffe49707378, env=0x7ffe49707388) at dl-init.c:74
#11 call_init (l=<optimized out>, argc=1, argv=0x7ffe49707378, env=0x7ffe49707388) at dl-init.c:26
#12 0x00007fcd06a4879c in _dl_init (main_map=0x1744e2e0, argc=1, argv=0x7ffe49707378, env=0x7ffe49707388) at dl-init.c:121
#13 0x00007fcd06a455fe in __GI__dl_catch_exception (exception=exception@entry=0x0, operate=operate@entry=0x7fcd06a4f73e <call_dl_init>, args=args@entry=0x7ffe49706df0) at dl-catch.c:215
#14 0x00007fcd06a4f6ce in dl_open_worker (a=a@entry=0x7ffe49706f90) at dl-open.c:829
#15 0x00007fcd06a45571 in __GI__dl_catch_exception (exception=exception@entry=0x7ffe49706f70, operate=operate@entry=0x7fcd06a4f63e <dl_open_worker>, args=args@entry=0x7ffe49706f90) at dl-catch.c:241
#16 0x00007fcd06a4fb2c in _dl_open (file=0x402004 "/usr/lib64/netgen/libnglib.so", mode=<optimized out>, caller_dlopen=0x401184 <main+30>, nsid=<optimized out>, argc=1, argv=0x7ffe49707378, env=0x7ffe49707388) at dl-open.c:905
#17 0x00007fcd06293a3c in dlopen_doit (a=a@entry=0x7ffe49707200) at dlopen.c:56
#18 0x00007fcd06a45571 in __GI__dl_catch_exception (exception=exception@entry=0x7ffe49707160, operate=0x7fcd062939de <dlopen_doit>, args=0x7ffe49707200) at dl-catch.c:241
#19 0x00007fcd06a456a3 in _dl_catch_error (objname=0x7ffe497071b8, errstring=0x7ffe497071c0, mallocedp=0x7ffe497071b7, operate=<optimized out>, args=<optimized out>) at dl-catch.c:260
#20 0x00007fcd062934e7 in _dlerror_run (operate=operate@entry=0x7fcd062939de <dlopen_doit>, args=args@entry=0x7ffe49707200) at dlerror.c:138
#21 0x00007fcd06293b01 in dlopen_implementation (file=<optimized out>, mode=<optimized out>, dl_caller=<optimized out>) at dlopen.c:71
#22 ___dlopen (file=<optimized out>, mode=<optimized out>) at dlopen.c:81
#23 0x0000000000401184 in main ()

It was generated by running this simple program, which segfaults inside dlopen:

#include <dlfcn.h>

int main(int argc, char **argv) {
    dlopen("/usr/lib64/netgen/libnglib.so", RTLD_NOW);
    return 0;
}

I tried to understand what could cause it but failed.
I tried a minimal C++ shared lib mimicking what is done in writeuser.hpp/writeuser.cpp, but it did not crash.
Quite puzzling, this one. In case it helps, I attached the compilation log of netgen:

netgen_build_log.txt

@bubbleguuum
Author

bubbleguuum commented Dec 18, 2024

After investigation, this crash is not specific to dlopen; it happens whenever libnglib.so is initialized, including when it is linked into any program. The following combination of CFLAGS, used by default on openSUSE TW, causes the crash (compiler: gcc 14.2.1):

-O2 -U_FORTIFY_SOURCE -D_FORTIFY_SOURCE=3 -fstack-protector-strong -funwind-tables -fasynchronous-unwind-tables -fstack-clash-protection -flto=auto -g

After a few tests with various flags, I do not think a single flag is responsible; it seems to be a combination of them, which I could not determine.

@StefanBruens
Contributor

Classic initialization order problem. The RegisterUserFormat static variables are initialized before the std::map that holds the formats has been constructed (zero-initialization is insufficient).

StefanBruens added a commit to StefanBruens/netgen that referenced this issue Dec 19, 2024
A std::map is in an invalid state when just zero-initialized, and needs
to be initialized by its constructor. As this initialization may be done
after the first call to Register, a crash will typically happen.

To fix this wrap all accesses to the map with a Meyers Singleton. Also
remove the extra Array - most accesses are using the key, and the few
format list iterations all sort the result afterwards anyway.

Fixes NGSolve#201.
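
For readers unfamiliar with the pattern named in the commit message, here is a minimal sketch of a Meyers Singleton guarding a registry map. It is not the code from the linked commit: `FormatEntry`, `FormatRegistry`, `RegisterFormat`, and `Entries()` are hypothetical names chosen for illustration, and the real `UserFormatRegister` carries more state (read/write callbacks, a file test). The point is only that the map lives as a function-local static, so it is constructed on first use, no matter which translation unit's static initializers run first.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a format entry; the real netgen struct has more fields.
struct FormatEntry {
    std::string format;
    std::vector<std::string> extensions;
};

// Hypothetical registry, sketching the "wrap all accesses" idea.
class FormatRegistry {
public:
    static void Register(FormatEntry entry) {
        std::string key = entry.format;  // copy the key before moving the entry
        Entries()[key] = std::move(entry);
    }

    static bool HaveFormat(const std::string& name) {
        return Entries().count(name) != 0;
    }

    static std::size_t Size() { return Entries().size(); }

private:
    // Meyers Singleton: the function-local static is constructed the first
    // time control passes through this declaration (thread-safe since C++11),
    // so callers never observe a merely zero-initialized map.
    static std::map<std::string, FormatEntry>& Entries() {
        static std::map<std::string, FormatEntry> entries;
        return entries;
    }
};

// Hypothetical registrar object, mirroring the role of RegisterUserFormat.
struct RegisterFormat {
    RegisterFormat(std::string format, std::vector<std::string> extensions) {
        FormatRegistry::Register({std::move(format), std::move(extensions)});
    }
};

// A namespace-scope registrar like the one in rw_medit.cpp. Its constructor
// may run before any other static in the library, which is safe here because
// the map is created on demand inside Entries().
static RegisterFormat reg_medit("Medit Format", {".mesh"});
```

By contrast, a map declared as a plain static data member in another translation unit has no guaranteed construction order relative to `reg_medit`, which matches the crash in `std::map::operator[]` seen in the stack trace above.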