Termonad Segfaults from commit 56b1fa5 onwards on NixOS 18.09 #107

Closed
craigem opened this issue Feb 17, 2019 · 25 comments

Comments

@craigem
Contributor

craigem commented Feb 17, 2019

From the commit logs I assume this behaviour is expected during the transition to the latest nixpkgs, but I felt it was worth flagging in case my assumption was incorrect.

Termonad beyond commit 56b1fa5 builds OK on NixOS 18.09, but it segfaults after compiling and then attempting to run the user's Termonad binary.

I've got strace output etc should it be deemed useful.

@cdepillabout
Owner

cdepillabout commented Feb 18, 2019

@craigem When you say building on NixOS 18.09, do you mean running something like:

$ nix-build --arg nixpkgs '<nixpkgs>'

Or do you mean just running nix-build on an 18.09 system?

As far as I am aware, the second one should work (although I don't have any support for the first one).

Would you be able to describe a little bit more about how you're compiling Termonad?

Also, you're compiling and running on Linux, correct?

@craigem
Contributor Author

craigem commented Feb 18, 2019

I thought I may have been a little vague, so the workflow specifically is:

$ nix-build

Which completes successfully. I then delete my cache:

$ rm -rf ~/.cache/termonad

and then I install the new build:

$ nix-env --file .nix-helpers/termonad-with-packages.nix --install

Lastly I run termonad. It successfully builds my cached binary but segfaults when running it. This happened on every commit from 56b1fa5 onwards except 56b1fa5 itself, which refused to build in the nix-build phase due to a missing heterocephalus, as indicated in the commit message itself.

Hopefully that clears up the workflow.

@cdepillabout
Owner

Thanks for the updated info.

I'll take a look at this and see if I can reproduce the problem.

@craigem
Contributor Author

craigem commented Feb 18, 2019

For the sake of reproduction, I first hit the issue with a simple "git pull" on master, encountered the described segfault and worked back from there (including rolling back NixOS to make sure I'd not introduced some breakage).

@cdepillabout
Owner

cdepillabout commented Feb 18, 2019

@craigem I had a chance to look at this, but I'm not able to reproduce this segfault on my system.

Doing nix-env -f ./ -i && rm -rf ~/.cache/termonad && termonad works for me without any segfaults. I'm trying to update NixOS right now to the latest release of 18.09 to see if that causes Termonad to segfault. I'll let you know if I see any segfaults.

Ideally Termonad should not be segfaulting, especially if it successfully compiles. Would you be able to run Termonad in gdb (or possibly strace or ltrace) to try and figure out where the segfault occurs? If you do this, make sure that you're actually running Termonad and not the wrapper script:

$ nix-build
/nix/store/1h6rp136j9xn2y2mg75i94ialarlvdyl-termonad-with-packages-8.6.3
$ ls -l /nix/store/1h6rp136j9xn2y2mg75i94ialarlvdyl-termonad-with-packages-8.6.3/bin
-r-xr-xr-x 1 root root 1553 Jan  1  1970 termonad
$ cat /nix/store/1h6rp136j9xn2y2mg75i94ialarlvdyl-termonad-with-packages-8.6.3/bin/termonad
#! /nix/store/gayzik3jx16rpc64czk5cy5s953431s6-bash-4.4-p23/bin/bash -e
export GIO_EXTRA_MODULES='/nix/store/m0ais13znv0g6g72mcs7k63zqg48dzsh-dconf-0.30.1-lib/lib/gio/modules'${GIO_EXTRA_MODULES:+':'}$GIO_EXTRA_MODULES
export GIO_EXTRA_MODULES='/nix/store/m0ais13znv0g6g72mcs7k63zqg48dzsh-dconf-0.30.1-lib/lib/gio/modules'${GIO_EXTRA_MODULES:+':'}$GIO_EXTRA_MODULES
export NIX_GHC='/nix/store/h2kpqakj951sqa97knhh9xsanscamk6n-ghc-8.6.3-with-packages/bin/ghc'
export XDG_DATA_DIRS='/nix/store/bll64d7jbr5xanwgyxh3wdbzdjmd0408-adwaita-icon-theme-3.30.1/share:/nix/store/bll64d7jbr5xanwgyxh3wdbzdjmd0408-adwaita-icon-theme-3.30.1/share:/nix/store/xrf7kkz9vh02rhmr9p2ssc05lbzlrvhr-hicolor-icon-theme-0.17/share:/nix/store/xrf7kkz9vh02rhmr9p2ssc05lbzlrvhr-hicolor-icon-theme-0.17/share:/nix/store/xrf7kkz9vh02rhmr9p2ssc05lbzlrvhr-hicolor-icon-theme-0.17/share:/nix/store/xrf7kkz9vh02rhmr9p2ssc05lbzlrvhr-hicolor-icon-theme-0.17/share:/nix/store/bll64d7jbr5xanwgyxh3wdbzdjmd0408-adwaita-icon-theme-3.30.1/share:/nix/store/bll64d7jbr5xanwgyxh3wdbzdjmd0408-adwaita-icon-theme-3.30.1/share:/nix/store/xrf7kkz9vh02rhmr9p2ssc05lbzlrvhr-hicolor-icon-theme-0.17/share:/nix/store/xrf7kkz9vh02rhmr9p2ssc05lbzlrvhr-hicolor-icon-theme-0.17/share:/nix/store/xrf7kkz9vh02rhmr9p2ssc05lbzlrvhr-hicolor-icon-theme-0.17/share:/nix/store/xrf7kkz9vh02rhmr9p2ssc05lbzlrvhr-hicolor-icon-theme-0.17/share'${XDG_DATA_DIRS:+':'}$XDG_DATA_DIRS
exec -a "$0" "/nix/store/1h6rp136j9xn2y2mg75i94ialarlvdyl-termonad-with-packages-8.6.3/bin/.termonad-wrapped"  "${extraFlagsArray[@]}" "$@"

The file /nix/store/1h6rp136j9xn2y2mg75i94ialarlvdyl-termonad-with-packages-8.6.3/bin/.termonad-wrapped is the actual Termonad binary, but it needs environment variables like NIX_GHC and XDG_DATA_DIRS set correctly, which is what the wrapper script is for.
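
For anyone trying this, here is a rough sketch of one way to get the wrapper's environment into a gdb session: pull the export lines out of the wrapper script and evaluate them in the current shell before launching gdb on the wrapped binary. The function name wrapper_exports is made up for illustration, and the sketch assumes bash and that every export in the wrapper sits on a single line, as in the listing above.

```shell
# Hypothetical helper: print only the `export VAR=...` lines of a wrapper script.
# Assumes each export fits on one line, as in the wrapper shown above.
wrapper_exports() {
  grep -E '^export [A-Za-z_][A-Za-z0-9_]*=' "$1"
}

# Example use (not run here): load the environment, then debug the real binary.
#   eval "$(wrapper_exports result/bin/termonad)"
#   gdb result/bin/.termonad-wrapped
```

Here result is the symlink that nix-build leaves in the current directory. The exec line of the wrapper is deliberately skipped, so the shell stays alive and gdb gets pointed at .termonad-wrapped directly.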

Or, if you could come up with some sort of virtual machine that also exhibits this segfaulting behavior, that would be easier for me to take a look at.

I have an example NixOps definition for Termonad here. You might be able to base something off of this:

https://github.com/cdepillabout/termonad/blob/516a957447b9aa093f5c5a70b867c5e4d60a4fee/.nix-helpers/nixops.nix

@cdepillabout
Owner

cdepillabout commented Feb 18, 2019

@craigem I tried updating to the latest release of 18.09 and Termonad still seems to work fine. I'm not able to reproduce the segfaults.

One more thing you could try to do is to put print statements in the code and try to figure out exactly where it segfaults.

@craigem
Contributor Author

craigem commented Feb 18, 2019

In transit, just dropped into an airport, awaiting a connecting flight.

Following your steps, the latest master still segfaults for me but your testing clearly shows this is a local issue.

I've attached a strace I collected on the flight but have yet to examine myself, should curiosity get to you :-D

termonad_strace.txt

@cdepillabout
Owner

@craigem I took a look at the strace output, but unfortunately it doesn't say a lot. Here are the relevant parts right before the segfault:

openat(AT_FDCWD, "/nix/store/0q31dphs3nw6p17lfi56lpw0y83i4c0i-termonad-1.1.0.0-data/share/ghc-8.6.3/x86_64-linux-ghc-8.6.3/termonad-1.1.0.0/img/termonad-lambda.png", O_RDONLY) = 8

...

read(8, "\211PNG\r\n\32\n\0\0\0\rIHDR\0\0\1\0\0\0\1\0\10\6\0\0\0\\r\250"..., 4096) = 4096

stat("/nix/store/b226z7lag85bb15hajm7kzzcm3la0rlw-gdk-pixbuf-2.36.12/lib/gdk-pixbuf-2.0/2.10.0/loaders/libpixbufloader-png.so", {st_mode=S_IFREG|0555, st_size=32992, ...}) = 0

openat(AT_FDCWD, "/nix/store/b226z7lag85bb15hajm7kzzcm3la0rlw-gdk-pixbuf-2.36.12/lib/gdk-pixbuf-2.0/2.10.0/loaders/libpixbufloader-png.so", O_RDONLY|O_CLOEXEC) = 9

read(9, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0@%\0\0\0\0\0\0"..., 832) = 832

...

close(9)                                = 0

...

lseek(8, 0, SEEK_SET)                   = 0

read(8, "\211PNG\r\n\32\n\0\0\0\rIHDR\0\0\1\0\0\0\1\0\10\6\0\0\0\\r\250"..., 4096) = 4096

read(8, "\212\6\264\202\35w\224)4e4mU\221\331\364\216\254\0\337\350\221\305\270\320%(G\35507\300"..., 4096) = 4096

...

read(8, "\370\355v\34\232\243 `\0000\0.q~\314\202\367?u\243\265#\204\240\313\211\220/\16\313,W"..., 4096) = 3855

--- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=0x420} ---
+++ killed by SIGSEGV +++

You can see that it reads /nix/store/0q31dphs3nw6p17lfi56lpw0y83i4c0i-termonad-1.1.0.0-data/share/ghc-8.6.3/x86_64-linux-ghc-8.6.3/termonad-1.1.0.0/img/termonad-lambda.png and then dies.

However, I guess it is possible there are multiple threads doing other stuff, so you might need to run strace -f instead of just strace.

Also, it may be easier to just do some "printf debugging" in the Termonad source code itself.

I am definitely still interested in figuring this out and fixing it though, so let me know if you have any trouble with this.

@romanofski

I ran into this too. I ran it through GDB. Here is a backtrace:

[...]
Thread 1 "termonad-linux-" received signal SIGSEGV, Segmentation fault.
0x00007ffff3cffb82 in __memmove_avx_unaligned_erms () from /nix/store/p8fyjvz3djclmndwxjfcj7zkmx89wniw-glibc-2.27/lib/libc.so.6
(gdb) bt
#0  0x00007ffff3cffb82 in __memmove_avx_unaligned_erms () from /nix/store/p8fyjvz3djclmndwxjfcj7zkmx89wniw-glibc-2.27/lib/libc.so.6
#1  0x00007fffee16e1e5 in png_combine_row () from /nix/store/3cl6mshpdrzbiz88307wmjbkwxdsmghi-libpng-apng-1.6.35/lib/libpng16.so.16
#2  0x00007fffee1613f8 in png_read_row () from /nix/store/3cl6mshpdrzbiz88307wmjbkwxdsmghi-libpng-apng-1.6.35/lib/libpng16.so.16
#3  0x00007fffee162f6a in png_read_image () from /nix/store/3cl6mshpdrzbiz88307wmjbkwxdsmghi-libpng-apng-1.6.35/lib/libpng16.so.16
#4  0x00007fffe222a0ee in gdk_pixbuf.png_image_load ()
   from /nix/store/b226z7lag85bb15hajm7kzzcm3la0rlw-gdk-pixbuf-2.36.12/lib/gdk-pixbuf-2.0/2.10.0/loaders/libpixbufloader-png.so
#5  0x00007ffff5dd09e9 in gdk_pixbuf_new_from_file ()
   from /nix/store/m88clyixj6grpf8qs0v14k6zmxbh3nd3-gdk-pixbuf-2.38.0/lib/libgdk_pixbuf-2.0.so.0
#6  0x00007ffff723424e in load_pixbuf_verbosely () from /nix/store/jdf4bpilaxz9s4lpym25wb18bdkfkdpb-gtk+3-3.24.2/lib/libgtk-3.so.0
#7  0x00007ffff723d246 in gtk_window_set_default_icon_from_file ()
   from /nix/store/jdf4bpilaxz9s4lpym25wb18bdkfkdpb-gtk+3-3.24.2/lib/libgtk-3.so.0
#8  0x000000000048cad9 in ?? ()
#9  0x00007fffffff4d84 in ?? ()
#10 0x00007ffff7de13d1 in do_lookup_x () from /nix/store/p8fyjvz3djclmndwxjfcj7zkmx89wniw-glibc-2.27/lib/ld-linux-x86-64.so.2
#11 0x000000420003d000 in ?? ()
#12 0x0000000000000017 in ?? ()
#13 0x00007ffff7fd0370 in ?? ()
#14 0x0000000000000005 in ?? ()
#15 0x0000000000000000 in ?? ()

and I'm wondering if we, the ones who run into this, use the same version of libpng?

@craigem
Contributor Author

craigem commented Feb 24, 2019

I have two versions installed but at this juncture I'm going to respond with a tentative "yes".

In IRC I recently said this to @romanofski :

        ⚡ ╡ craige [16:21:50] finds some incriminating build errors he'd been missing.

but as I've not been at it for a few days, what that was will elude me til I have another look at it.

$ find /nix/store -name 'libpng*'
/nix/store/pjc2s2z0lwffgmsaihij0yicqqblyags-source/pkgs/development/libraries/libpng
/nix/store/pjc2s2z0lwffgmsaihij0yicqqblyags-source/pkgs/misc/cups/drivers/cnijfilter_2_80/patches/libpng15.patch
/nix/store/qnb31iypx20drvay0kcyzkgq6c76hypx-nixexprs.tar.xz/pkgs/development/libraries/libpng
/nix/store/qnb31iypx20drvay0kcyzkgq6c76hypx-nixexprs.tar.xz/pkgs/misc/cups/drivers/cnijfilter_2_80/patches/libpng15.patch
/nix/store/pf30lcin289j9j7wxz11jjdr0h947n27-nixos-18.09.2243.63a09881b67/nixos/pkgs/development/libraries/libpng
/nix/store/pf30lcin289j9j7wxz11jjdr0h947n27-nixos-18.09.2243.63a09881b67/nixos/pkgs/misc/cups/drivers/cnijfilter_2_80/patches/libpng15.patch
/nix/store/a1nfcpzp9f1ncgs74x1m4hcn4ps7c20m-libpng-apng-1.6.35-dev/lib/pkgconfig/libpng.pc
/nix/store/a1nfcpzp9f1ncgs74x1m4hcn4ps7c20m-libpng-apng-1.6.35-dev/lib/pkgconfig/libpng16.pc
/nix/store/a1nfcpzp9f1ncgs74x1m4hcn4ps7c20m-libpng-apng-1.6.35-dev/bin/libpng-config
/nix/store/a1nfcpzp9f1ncgs74x1m4hcn4ps7c20m-libpng-apng-1.6.35-dev/bin/libpng16-config
/nix/store/a1nfcpzp9f1ncgs74x1m4hcn4ps7c20m-libpng-apng-1.6.35-dev/include/libpng16
/nix/store/73bda0a26bda42v22y9fj051r44wsyma-libpng-apng-1.6.35/lib/libpng16.la
/nix/store/73bda0a26bda42v22y9fj051r44wsyma-libpng-apng-1.6.35/lib/libpng.la
/nix/store/73bda0a26bda42v22y9fj051r44wsyma-libpng-apng-1.6.35/lib/libpng16.so
/nix/store/73bda0a26bda42v22y9fj051r44wsyma-libpng-apng-1.6.35/lib/libpng16.so.16
/nix/store/73bda0a26bda42v22y9fj051r44wsyma-libpng-apng-1.6.35/lib/libpng.so
/nix/store/73bda0a26bda42v22y9fj051r44wsyma-libpng-apng-1.6.35/lib/libpng16.so.16.35.0
/nix/store/3cl6mshpdrzbiz88307wmjbkwxdsmghi-libpng-apng-1.6.35/lib/libpng16.la
/nix/store/3cl6mshpdrzbiz88307wmjbkwxdsmghi-libpng-apng-1.6.35/lib/libpng.la
/nix/store/3cl6mshpdrzbiz88307wmjbkwxdsmghi-libpng-apng-1.6.35/lib/libpng16.so
/nix/store/3cl6mshpdrzbiz88307wmjbkwxdsmghi-libpng-apng-1.6.35/lib/libpng16.so.16
/nix/store/3cl6mshpdrzbiz88307wmjbkwxdsmghi-libpng-apng-1.6.35/lib/libpng.so
/nix/store/3cl6mshpdrzbiz88307wmjbkwxdsmghi-libpng-apng-1.6.35/lib/libpng16.so.16.35.0
/nix/store/bsjs7p3cn1sdcvx1y2i49ikb3ygmardc-libpng-apng-1.6.35-dev/lib/pkgconfig/libpng.pc
/nix/store/bsjs7p3cn1sdcvx1y2i49ikb3ygmardc-libpng-apng-1.6.35-dev/lib/pkgconfig/libpng16.pc
/nix/store/bsjs7p3cn1sdcvx1y2i49ikb3ygmardc-libpng-apng-1.6.35-dev/bin/libpng-config
/nix/store/bsjs7p3cn1sdcvx1y2i49ikb3ygmardc-libpng-apng-1.6.35-dev/bin/libpng16-config
/nix/store/bsjs7p3cn1sdcvx1y2i49ikb3ygmardc-libpng-apng-1.6.35-dev/include/libpng16
/nix/store/dr9l5m0qnmqjvivx8k076iwkx8zy21m9-libpng-apng-1.6.34-dev/lib/pkgconfig/libpng.pc
/nix/store/dr9l5m0qnmqjvivx8k076iwkx8zy21m9-libpng-apng-1.6.34-dev/lib/pkgconfig/libpng16.pc
/nix/store/dr9l5m0qnmqjvivx8k076iwkx8zy21m9-libpng-apng-1.6.34-dev/bin/libpng-config
/nix/store/dr9l5m0qnmqjvivx8k076iwkx8zy21m9-libpng-apng-1.6.34-dev/bin/libpng16-config
/nix/store/dr9l5m0qnmqjvivx8k076iwkx8zy21m9-libpng-apng-1.6.34-dev/include/libpng16
/nix/store/laqjm9bj0lq5bs47n4w1fn85bg90w70k-libpng-apng-1.6.34/lib/libpng16.la
/nix/store/laqjm9bj0lq5bs47n4w1fn85bg90w70k-libpng-apng-1.6.34/lib/libpng.la
/nix/store/laqjm9bj0lq5bs47n4w1fn85bg90w70k-libpng-apng-1.6.34/lib/libpng16.so
/nix/store/laqjm9bj0lq5bs47n4w1fn85bg90w70k-libpng-apng-1.6.34/lib/libpng16.so.16
/nix/store/laqjm9bj0lq5bs47n4w1fn85bg90w70k-libpng-apng-1.6.34/lib/libpng16.so.16.34.0
/nix/store/laqjm9bj0lq5bs47n4w1fn85bg90w70k-libpng-apng-1.6.34/lib/libpng.so

@cdepillabout
Owner

cdepillabout commented Feb 24, 2019

You could try just commenting out the place where the .png in question is being loaded. I wonder if this would fix the segfault.

termonadIconPath <- getDataFileName "img/termonad-lambda.png"
windowSetDefaultIconFromFile termonadIconPath

Although it is puzzling why you would be seeing this, and I wouldn't, since we should be using the exact same version of libpng. Here's what I ran in Termonad's source dir:

$ nix-build
...
/nix/store/1h6rp136j9xn2y2mg75i94ialarlvdyl-termonad-with-packages-8.6.3
$ nix-store --query --requisites /nix/store/1h6rp136j9xn2y2mg75i94ialarlvdyl-termonad-with-packages-8.6.3 | grep png
/nix/store/3cl6mshpdrzbiz88307wmjbkwxdsmghi-libpng-apng-1.6.35
/nix/store/bsjs7p3cn1sdcvx1y2i49ikb3ygmardc-libpng-apng-1.6.35-dev

@cdepillabout
Owner

@craigem @romanofski
I just updated Termonad to use the latest release of nixpkgs-19.03. Maybe that will work better for you?

@romanofski

@cdepillabout Cheers! I tried to get a small reproducer running with debug symbols this morning, but failed. I did the same nix-build on a different distro and did not see the crash. I'm almost tempted to think it might be platform/kernel specific ...

@romanofski

I've gotten around to trying out HEAD (963d912) against Linux 4.14.104 #1-NixOS SMP Wed Feb 27 09:08:09 UTC 2019 x86_64 GNU/Linux (in case that is important) and no luck.

#0  0x00007ffff6c69bb2 in __memmove_avx_unaligned_erms () from /nix/store/sw54ph775lw7b9g4hlfvpx6fmlvdy8qi-glibc-2.27/lib/libc.so.6
#1  0x00007ffff62c86f5 in png_combine_row () from /nix/store/parrh56r6sr4ccp9ypl4sh7h5b19rijg-libpng-apng-1.6.36/lib/libpng16.so.16
#2  0x00007ffff62bb8af in png_read_row () from /nix/store/parrh56r6sr4ccp9ypl4sh7h5b19rijg-libpng-apng-1.6.36/lib/libpng16.so.16
#3  0x00007ffff62bd42a in png_read_image () from /nix/store/parrh56r6sr4ccp9ypl4sh7h5b19rijg-libpng-apng-1.6.36/lib/libpng16.so.16
#4  0x00007fffec6680ee in gdk_pixbuf.png_image_load () from /nix/store/59mh0p4ng0iz1p5bq4vpyy4v8kmfgnf6-gdk-pixbuf-2.36.12/lib/gdk-pixbuf-2.0/2.10.0/loaders/libpixbufloader-png.so
#5  0x00007ffff73b1f19 in gdk_pixbuf_new_from_file () from /nix/store/vyibwikjzxwr2l19xhz3b5zmrj73d8sw-gdk-pixbuf-2.38.0/lib/libgdk_pixbuf-2.0.so.0
#6  0x00007ffff7a2852e in load_pixbuf_verbosely () from /nix/store/xywpyd0gf2fca3vxczkra72y29phlz4d-gtk+3-3.24.4/lib/libgtk-3.so.0
#7  0x00007ffff7a31546 in gtk_window_set_default_icon_from_file () from /nix/store/xywpyd0gf2fca3vxczkra72y29phlz4d-gtk+3-3.24.4/lib/libgtk-3.so.0
#8  0x000000000048c879 in ?? ()
#9  0x00007fffffff4d84 in ?? ()
#10 0x00007ffff7fdf551 in do_lookup_x () from /nix/store/sw54ph775lw7b9g4hlfvpx6fmlvdy8qi-glibc-2.27/lib/ld-linux-x86-64.so.2
#11 0x000000420003d000 in ?? ()
#12 0x0000000000000018 in ?? ()
#13 0x00007ffff58a3160 in ?? ()
#14 0x0000000000000005 in ?? ()
#15 0x0000000000000000 in ?? ()

Different version of libpng, same version of glibc. I'm still interested in getting debug symbols. Perhaps it's related to the version of glibc we're using ...

@cdepillabout
Owner

@romanofski Did you try commenting out the lines that load the PNG? I'm interested in whether or not that fixes this crash. If so, we know it is definitely a PNG-related issue.

#107 (comment)

Also, are you on NixOS-18.09? Or NixOS-unstable? I'm on 18.09 here.

> Different version of libpng, same version of glibc. I'm still interested in getting debug symbols. Perhaps it's related to our version of glibc we're using ...

Looking at the backtrace, I think we're using the same version of libpng, glibc, etc. From your backtrace, it looks like you have the following:

/nix/store/parrh56r6sr4ccp9ypl4sh7h5b19rijg-libpng-apng-1.6.36
/nix/store/sw54ph775lw7b9g4hlfvpx6fmlvdy8qi-glibc-2.27
/nix/store/59mh0p4ng0iz1p5bq4vpyy4v8kmfgnf6-gdk-pixbuf-2.36.12
/nix/store/xywpyd0gf2fca3vxczkra72y29phlz4d-gtk+3-3.24.4

Here's what it looks like for me, looking at it with nix-store --query -R:

$ nix-store --query --requisites /nix/store/i66kpbqff3mh1dbv2yycc7brpnxgn4p9-termonad-with-packages-8.6.3 | grep libpng
/nix/store/parrh56r6sr4ccp9ypl4sh7h5b19rijg-libpng-apng-1.6.36
/nix/store/8bfqx13gj1ibg9q440bqqxgk8i0hk1dq-libpng-apng-1.6.36-dev
$ nix-store --query --requisites /nix/store/i66kpbqff3mh1dbv2yycc7brpnxgn4p9-termonad-with-packages-8.6.3 | grep glibc
/nix/store/sw54ph775lw7b9g4hlfvpx6fmlvdy8qi-glibc-2.27
/nix/store/p1aicp5gllvlnnr8a49i3inrgal1w812-glibc-2.27-bin
/nix/store/qmg2fkspkzl0xsdqpmg3k06bmgp27saz-glibc-2.27-dev
$ nix-store --query --requisites /nix/store/i66kpbqff3mh1dbv2yycc7brpnxgn4p9-termonad-with-packages-8.6.3 | grep gdk-pixbuf
/nix/store/vyibwikjzxwr2l19xhz3b5zmrj73d8sw-gdk-pixbuf-2.38.0
/nix/store/gck5xpcgrd7krswqnni04p17g9q8mxkg-gdk-pixbuf-2.38.0-dev
$ nix-store --query --requisites /nix/store/i66kpbqff3mh1dbv2yycc7brpnxgn4p9-termonad-with-packages-8.6.3 | grep gtk+
/nix/store/xywpyd0gf2fca3vxczkra72y29phlz4d-gtk+3-3.24.4
/nix/store/yrbj7k2ajhnmbf2mwk0zrxnkn3n48xwq-gtk+3-3.24.4-dev

Oh wait, it looks like you have an older version of gdk-pixbuf for some reason! Could you try recompiling with the current master for Termonad and seeing if the output derivation gets linked to the correct gdk-pixbuf? Hopefully that will fix this error.

$ ldd /nix/store/i66kpbqff3mh1dbv2yycc7brpnxgn4p9-termonad-with-packages-8.6.3/bin/.termonad-wrapped  | grep -i pixbuf
	libgdk_pixbuf-2.0.so.0 => /nix/store/vyibwikjzxwr2l19xhz3b5zmrj73d8sw-gdk-pixbuf-2.38.0/lib/libgdk_pixbuf-2.0.so.0 (0x00007feabe708000)

@romanofski

romanofski commented Mar 8, 2019

@cdepillabout Sorry, I keep ignoring that point (re: commenting out the loading of the PNG). Will definitely have a look tonight.

In regards to the other points:

  • NixOS-18.09

Also will check the gdk-pixbuf version.

@romanofski

Alright answers:

% nix-store --query --requisites /nix/store/i66kpbqff3mh1dbv2yycc7brpnxgn4p9-termonad-with-packages-8.6.3 | grep glibc
/nix/store/sw54ph775lw7b9g4hlfvpx6fmlvdy8qi-glibc-2.27
/nix/store/p1aicp5gllvlnnr8a49i3inrgal1w812-glibc-2.27-bin
/nix/store/qmg2fkspkzl0xsdqpmg3k06bmgp27saz-glibc-2.27-dev
/nix/store/b79xaydz0a65d357kkqz72wwp2l6kqfn-glibc-iconv-2.27

and same output with ldd:

% ldd /nix/store/i66kpbqff3mh1dbv2yycc7brpnxgn4p9-termonad-with-packages-8.6.3/bin/.termonad-wrapped  | grep -i pixbuf
	libgdk_pixbuf-2.0.so.0 => /nix/store/vyibwikjzxwr2l19xhz3b5zmrj73d8sw-gdk-pixbuf-2.38.0/lib/libgdk_pixbuf-2.0.so.0 (0x00007f160fb62000)

However, you're right... when I run the binary which causes the segfault there is this older version of gdk-pixbuf again:

#4  0x00007fffec6680ee in gdk_pixbuf.png_image_load ()
   from /nix/store/59mh0p4ng0iz1p5bq4vpyy4v8kmfgnf6-gdk-pixbuf-2.36.12/lib/gdk-pixbuf-2.0/2.10.0/loaders/libpixbufloader-png.so
#5  0x00007ffff73b1f19 in gdk_pixbuf_new_from_file ()
   from /nix/store/vyibwikjzxwr2l19xhz3b5zmrj73d8sw-gdk-pixbuf-2.38.0/lib/libgdk_pixbuf-2.0.so.0
#6  0x00007ffff7a2852e in load_pixbuf_verbosely () from /nix/store/xywpyd0gf2fca3vxczkra72y29phlz4d-gtk+3-3.24.4/lib/libgtk-3.so.0

So .. aeh.. wtf? I'll check if I can get rid of it ...
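
A mismatch like this is easy to miss when reading frames by eye. As a rough sketch (the function name store_packages is made up for illustration, and it assumes the usual 32-character hash in store paths), the distinct store packages named in a saved backtrace can be listed like so:

```shell
# Hypothetical helper: read a gdb backtrace on stdin and print each distinct
# Nix store package (name-version) it mentions, one per line.
store_packages() {
  grep -oE '/nix/store/[a-z0-9]{32}-[^/ ]+' \
    | sed -E 's#^/nix/store/[a-z0-9]{32}-##' \
    | sort -u
}

# Example use (not run here), with the backtrace saved to a file:
#   store_packages < backtrace.txt
```

Fed the three frames above, this prints gdk-pixbuf-2.36.12, gdk-pixbuf-2.38.0 and gtk+3-3.24.4, so the two gdk-pixbuf versions end up on adjacent lines.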

@cdepillabout
Owner

@romanofski Thanks for looking into this.

So I guess for now if you (or anyone else) are running into this but still want to use Termonad, then commenting out the problematic lines is a reasonable workaround.

> I get the same output when using nix-store --query -R with one exception

Oh sorry, we do actually get the same output here. I had removed the /nix/store/b79xaydz0a65d357kkqz72wwp2l6kqfn-glibc-iconv-2.27 line from my output because I thought it was not related. Sorry for the confusion.

> However, you're right... when I run the binary which causes the segfault there is this older version of gdk-pixbuf again:
> So .. aeh.. wtf? I'll check if I can get rid of it ...

That's very strange. One version of gdk-pixbuf calling into an older version definitely sounds like it could result in a segfault. I have no idea how that could happen in practice though.

Do you maybe have something strange in your LD_LIBRARY_PATH env var? Or maybe you could try running LD_DEBUG=all termonad to see why that weird pixbuf version is being loaded. Honestly I have no idea what could be going on here.

@romanofski

@cdepillabout Yes I'm surprised as well. I've found several binaries on the system mainly installed from NixOS-18.09 which are linked against gdk-pixbuf-2.36.12 (e.g. gimp or emacs). I'd say the majority is linked against the older gdk-pixbuf version.

Cheers for the pointer to LD_DEBUG. I'll give that a shot too.

@romanofski

I straced a small reproducer I wrote https://github.com/romanofski/termReproducer and found this:

[...]
openat(AT_FDCWD, "/nix/store/dnm7ayjjbghg0fliwlfl7hjay8pb5rbq-librsvg-2.42.4/lib/gdk-pixbuf-2.0/2.10.0/loaders.cache", O_RDONLY) = 5    
fstat(5, {st_mode=S_IFREG|0444, st_size=4487, ...}) = 0                                                                                 
read(5, "# GdkPixbuf Image Loader Modules"..., 1024) = 1024

librsvg has a gdk-pixbuf loaders.cache which points to the gdk-pixbuf-2.36 loaders. The next thing that happens is:

openat(AT_FDCWD, "/nix/store/jmq0fxpm87v31kqg81s7n4v9akq6slp2-termReproducer-0.1.0.0-data/share/ghc-8.6.3/x86_64-linux-ghc-8.6.3/termReproducer-0.1.0.0/termonad-lambda.png", O_RDONLY) = 5                                                                                     
fstat(5, {st_mode=S_IFREG|0444, st_size=12047, ...}) = 0                                                                                
read(5, "\211PNG\r\n\32\n\0\0\0\rIHDR\0\0\1\0\0\0\1\0\10\6\0\0\0\\r\250"..., 4096) = 4096                                               
stat("/nix/store/59mh0p4ng0iz1p5bq4vpyy4v8kmfgnf6-gdk-pixbuf-2.36.12/lib/gdk-pixbuf-2.0/2.10.0/loaders/libpixbufloader-png.so", {st_mode=S_IFREG|0555, st_size=32992, ...}) = 0                                                                                                 
openat(AT_FDCWD, "/nix/store/59mh0p4ng0iz1p5bq4vpyy4v8kmfgnf6-gdk-pixbuf-2.36.12/lib/gdk-pixbuf-2.0/2.10.0/loaders/libpixbufloader-png.so", O_RDONLY|O_CLOEXEC) = 9 

That is, the PNG gets loaded by the different version of gdk-pixbuf. I'm surprised that librsvg has its own loaders.cache. I guess it would defeat any isolation Nix is trying to accomplish here. The same thing also happens with my termonad-linux-x86_64 binary.

I'm currently not sure if using a different version actually corrupts the loading, but it's certainly very odd.
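
To check which loaders a given loaders.cache actually points at, something like the sketch below might help. It assumes the cache lists each loader module as a quoted absolute .so path on a line of its own; loader_dirs is a made-up name for illustration.

```shell
# Hypothetical helper: print the distinct directories that the loader modules
# listed in a GdkPixbuf loaders.cache file ($1) live in.
# Assumes module lines look like: "/abs/path/to/libpixbufloader-png.so"
loader_dirs() {
  sed -nE 's#^"(/.*)/[^/"]+\.so"$#\1#p' "$1" | sort -u
}

# Example use (not run here), on the cache file seen in the strace above:
#   loader_dirs /nix/store/dnm7ayjjbghg0fliwlfl7hjay8pb5rbq-librsvg-2.42.4/lib/gdk-pixbuf-2.0/2.10.0/loaders.cache
```

If the directories printed belong to a different gdk-pixbuf store path than the one ldd reports for the binary, that would match the mixed versions seen in the backtrace.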

@cdepillabout
Owner

@romanofski

I tried searching on nixpkgs for this problem and it seems like there are other people running into this:

NixOS/nixpkgs#54278
NixOS/nixpkgs#39493

I just sent a PR that makes sure to set XDG_DATA_DIRS to point to pixbuf. I don't know if that would help at all, but you could try it out:

#109

@cdepillabout
Owner

I believe this has been fixed in #109, so I am going to go ahead and merge this.

If anyone else is seeing this problem, please leave a message on this thread.

@romanofski

Cheers @cdepillabout and @craigem. Getting to the bottom of this was an enjoyable puzzle.

@craigem
Contributor Author

craigem commented Mar 9, 2019

I'm still catching up on this and #109. It looks like a great result, and I'll have time to test a build tomorrow.

(This had become a blocker to me working on #104)

@craigem
Contributor Author

craigem commented Mar 11, 2019

Confirming that it's fixed for me. Thanks!

(built Termonad from master on NixOS 18.09)
