Reporting Issues with Pytraj using tutorials #1647

Closed
PhiMykah opened this issue Dec 19, 2023 · 28 comments
Comments

@PhiMykah

Hello!
I was testing the pytraj code, imported as a package from source, with the latest versions of Amber, cpptraj, and a compiled libcpptraj. I wanted to report some issues that came up as I worked through the tutorials. I have made 4 Jupyter notebooks documenting the errors I received, and I tested using the "tryptophan zipper." There may be something I am missing; if so, it should be evident from my logs.
I worked on four tutorials: Basic Examples, Energy Decomposition, Pairwise RMSD, and Plot Correlation Matrix. Here is the archive of the four notebooks saved:
pytraj-notebook-testing.tar.gz

Please let me know if there is anything I can do to work with these errors if possible!

@hainm
Contributor

hainm commented Dec 19, 2023

@PhiMykah Which errors are you encountering? Can you please paste them here (instead of uploading the notebooks)?

@PhiMykah
Author

Okay, there are a lot of segmentation faults and double-free errors.

Loading using pt.load:

double free or corruption (!prev)
Fatal Python error: Aborted

Current thread 0x00007fccdbe82480 (most recent call first):
  File "/.../pytraj/io.py", line 137 in load
  File "<stdin>", line 1 in <module>

Extension modules: numpy.core._multiarray_umath, numpy.core._multiarray_tests, numpy.linalg._umath_linalg, numpy.fft._pocketfft_internal, numpy.random._common, numpy.random.bit_generator, numpy.random._bounded_integers, numpy.random._mt19937, numpy.random.mtrand, numpy.random._philox, numpy.random._pcg64, numpy.random._sfc64, numpy.random._generator, pytraj.core.c_dict, pytraj.core.box, pytraj.core.parameter_types, pytraj.core.coordinfo, pytraj.datafiles.datafiles, pytraj.utils.cyutils, pytraj.analysis.c_action.c_action, pytraj.analysis.c_action.actionlist, pytraj.trajectory.c_traj.c_trajectory, pytraj.datasets.c_datasets, pytraj.core.c_core, pytraj.datasets.cast_dataset, pytraj.datasets.c_datasetlist, pytraj.trajectory.c_traj.c_trajout, pytraj.trajectory.frame, pytraj.core.c_options, pytraj.topology.topology, pytraj.math.cpp_math, pytraj.core.topology_objects, pytraj.analysis.c_analysis.c_analysis (total: 33)
Aborted (core dumped)

using pt.distance(traj, ':1-3 :5-8') on trp zip

Fatal Python error: Segmentation fault

Current thread 0x00007fdadf328480 (most recent call first):
  File "/..../pytraj/all_actions.py", line 401 in distance
  File "/.../pytraj/utils/decorators.py", line 11 in inner
  File "<stdin>", line 1 in <module>

Extension modules: numpy.core._multiarray_umath, numpy.core._multiarray_tests, numpy.linalg._umath_linalg, numpy.fft._pocketfft_internal, numpy.random._common, numpy.random.bit_generator, numpy.random._bounded_integers, numpy.random._mt19937, numpy.random.mtrand, numpy.random._philox, numpy.random._pcg64, numpy.random._sfc64, numpy.random._generator, pytraj.core.c_dict, pytraj.core.box, pytraj.core.parameter_types, pytraj.core.coordinfo, pytraj.datafiles.datafiles, pytraj.utils.cyutils, pytraj.analysis.c_action.c_action, pytraj.analysis.c_action.actionlist, pytraj.trajectory.c_traj.c_trajectory, pytraj.datasets.c_datasets, pytraj.core.c_core, pytraj.datasets.cast_dataset, pytraj.datasets.c_datasetlist, pytraj.trajectory.c_traj.c_trajout, pytraj.trajectory.frame, pytraj.core.c_options, pytraj.topology.topology, pytraj.math.cpp_math, pytraj.core.topology_objects, pytraj.analysis.c_analysis.c_analysis (total: 33)
Segmentation fault (core dumped)

Abort when loading a PDB using pt.load_pdb_rcsb('1l2y'):

double free or corruption (!prev)
Fatal Python error: Aborted

Current thread 0x00007fc3d5c8f480 (most recent call first):
  File "/.../pytraj/io.py", line 137 in load
  File "/.../pytraj/io.py", line 587 in _make_traj_from_remote_file
  File "/.../pytraj/io.py", line 572 in loadpdb_rcsb
  File "<stdin>", line 1 in <module>

Extension modules: numpy.core._multiarray_umath, numpy.core._multiarray_tests, numpy.linalg._umath_linalg, numpy.fft._pocketfft_internal, numpy.random._common, numpy.random.bit_generator, numpy.random._bounded_integers, numpy.random._mt19937, numpy.random.mtrand, numpy.random._philox, numpy.random._pcg64, numpy.random._sfc64, numpy.random._generator, pytraj.core.c_dict, pytraj.core.box, pytraj.core.parameter_types, pytraj.core.coordinfo, pytraj.datafiles.datafiles, pytraj.utils.cyutils, pytraj.analysis.c_action.c_action, pytraj.analysis.c_action.actionlist, pytraj.trajectory.c_traj.c_trajectory, pytraj.datasets.c_datasets, pytraj.core.c_core, pytraj.datasets.cast_dataset, pytraj.datasets.c_datasetlist, pytraj.trajectory.c_traj.c_trajout, pytraj.trajectory.frame, pytraj.core.c_options, pytraj.topology.topology, pytraj.math.cpp_math, pytraj.core.topology_objects, pytraj.analysis.c_analysis.c_analysis (total: 33)
Aborted (core dumped)

Attempting dssp analysis on downloaded 1l2y pdb pt.dssp(pdb):

double free or corruption (!prev)
Fatal Python error: Aborted

Current thread 0x00007fbf8c027480 (most recent call first):
  File "/.../pytraj/utils/decorators.py", line 59 in _inner
  File "/.../pytraj/analysis/dssp_analysis.py", line 101 in dssp
  File "/.../pytraj/utils/get_common_objects.py", line 314 in inner
  File "/.../pytraj/utils/decorators.py", line 20 in inner
  File "<stdin>", line 1 in <module>

Extension modules: numpy.core._multiarray_umath, numpy.core._multiarray_tests, numpy.linalg._umath_linalg, numpy.fft._pocketfft_internal, numpy.random._common, numpy.random.bit_generator, numpy.random._bounded_integers, numpy.random._mt19937, numpy.random.mtrand, numpy.random._philox, numpy.random._pcg64, numpy.random._sfc64, numpy.random._generator, pytraj.core.c_dict, pytraj.core.box, pytraj.core.parameter_types, pytraj.core.coordinfo, pytraj.datafiles.datafiles, pytraj.utils.cyutils, pytraj.analysis.c_action.c_action, pytraj.analysis.c_action.actionlist, pytraj.trajectory.c_traj.c_trajectory, pytraj.datasets.c_datasets, pytraj.core.c_core, pytraj.datasets.cast_dataset, pytraj.datasets.c_datasetlist, pytraj.trajectory.c_traj.c_trajout, pytraj.trajectory.frame, pytraj.core.c_options, pytraj.topology.topology, pytraj.math.cpp_math, pytraj.core.topology_objects, pytraj.analysis.c_analysis.c_analysis (total: 33)
Aborted (core dumped)

Attempting to search hbonds for the pdb pt.search_hbonds(pdb):

Fatal Python error: Segmentation fault

Current thread 0x00007f95af0e5480 (most recent call first):
  File "/.../pytraj/utils/get_common_objects.py", line 314 in inner
  File "/.../pytraj/utils/decorators.py", line 11 in inner
  File "<stdin>", line 1 in <module>

Extension modules: numpy.core._multiarray_umath, numpy.core._multiarray_tests, numpy.linalg._umath_linalg, numpy.fft._pocketfft_internal, numpy.random._common, numpy.random.bit_generator, numpy.random._bounded_integers, numpy.random._mt19937, numpy.random.mtrand, numpy.random._philox, numpy.random._pcg64, numpy.random._sfc64, numpy.random._generator, pytraj.core.c_dict, pytraj.core.box, pytraj.core.parameter_types, pytraj.core.coordinfo, pytraj.datafiles.datafiles, pytraj.utils.cyutils, pytraj.analysis.c_action.c_action, pytraj.analysis.c_action.actionlist, pytraj.trajectory.c_traj.c_trajectory, pytraj.datasets.c_datasets, pytraj.core.c_core, pytraj.datasets.cast_dataset, pytraj.datasets.c_datasetlist, pytraj.trajectory.c_traj.c_trajout, pytraj.trajectory.frame, pytraj.core.c_options, pytraj.topology.topology, pytraj.math.cpp_math, pytraj.core.topology_objects, pytraj.analysis.c_analysis.c_analysis (total: 33)
Segmentation fault (core dumped)

Similar errors occur in energy decomposition, pairwise RMSD, and plot correlation matrix. The details of everything I ran are in the notebooks.
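Because each of these crashes kills the interpreter, one way to collect them without losing the session is to run every call in its own child process. The following is a minimal sketch and not part of pytraj: the helper name `run_isolated` is hypothetical, and the example deliberately calls `os.abort()` as a stand-in for a pytraj call that aborts. It assumes a POSIX system, where a child killed by a signal reports a negative return code.

```python
import signal
import subprocess
import sys


def run_isolated(snippet, timeout=60):
    """Run a Python snippet in a child interpreter and report how it ended.

    Returns (status, stderr): status is "ok", "exit N", or the name of the
    signal that killed the child (for example "SIGSEGV" or "SIGABRT").
    """
    # -X faulthandler asks the child to dump a Python traceback on fatal signals.
    proc = subprocess.run(
        [sys.executable, "-X", "faulthandler", "-c", snippet],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    if proc.returncode == 0:
        return "ok", proc.stderr
    if proc.returncode < 0:
        # On POSIX a negative return code means "terminated by signal -N".
        return signal.Signals(-proc.returncode).name, proc.stderr
    return f"exit {proc.returncode}", proc.stderr


if __name__ == "__main__":
    # os.abort() stands in for a call such as pt.load(...) that aborts.
    status, err = run_isolated("import os; os.abort()")
    print(status)
```

The captured stderr carries the same kind of "Fatal Python error" traceback shown above, so each failing call can be logged separately while the parent process keeps running.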

@hainm
Contributor

hainm commented Dec 19, 2023 via email

@PhiMykah
Author

@hainm I don't know what cpptraj is, pretty sure no commits from me in there 😅

Apologies! I'm assuming that was meant for @drroe.

@hainm
Contributor

hainm commented Dec 19, 2023

I don't know what cpptraj is, pretty sure no commits from me in there 😅

Sorry droe, I will remove your comment so you won't receive notification for this thread.

@Amber-MD Amber-MD deleted a comment from droe Dec 19, 2023
@hainm
Contributor

hainm commented Dec 20, 2023

@PhiMykah for now, can you please use the cpptraj's git commit that had the green tests?

cd cpptraj
git checkout bc47e44965c04912fb2aa5018c8278d2158c33be
<build libcpptraj again>
<clean pytraj build and rebuild pytraj>

@PhiMykah
Author

for now, can you please use the cpptraj's git commit that had the green tests?

Sure! I'll give it a shot.

@PhiMykah
Author

@hainm Testing through all of the tutorial cases, they all seem to work on this cpptraj git commit head.

@hainm
Contributor

hainm commented Dec 20, 2023

Thanks @PhiMykah for confirming. I've sent an email to @drroe (cc you too).

@drroe
Contributor

drroe commented Dec 20, 2023

Hi,

So while it is possible there is indeed an issue with cpptraj, I've managed to track down the bad behavior (at least with my local conda environment) to which cython version is being used. I noticed that in a recent Python 3.8 conda environment, a pytraj test was failing with an abort:

(pytrajtest3.8_2) [droe@thor tests] (master)$ pytest test_actionlist.py 
======================================= test session starts ========================================
platform linux -- Python 3.8.18, pytest-7.4.0, pluggy-1.0.0
rootdir: /home/droe/GitHub/pytraj
collected 18 items                                                                                 

test_actionlist.py Fatal Python error: Aborted

Current thread 0x00007f138a65f500 (most recent call first):
  File "/home/droe/GitHub/pytraj/tests/test_actionlist.py", line 22 in test_distances
  File "/home/droe/Programs/anaconda3/envs/pytrajtest3.8_2/lib/python3.8/site-packages/_pytest/python.py", line 194 in pytest_pyfunc_call
...

However, an older python 3.8 conda environment was working just fine:

(pytrajtest3.8) [droe@thor tests] (master)$ pytest test_actionlist.py 
======================================= test session starts ========================================
platform linux -- Python 3.8.17, pytest-7.3.1, pluggy-1.0.0
rootdir: /home/droe/GitHub/pytraj
collected 18 items                                                                                 

test_actionlist.py ..................                                                        [100%]

======================================== 18 passed in 1.00s ========================================

Comparing the two environments via conda list, I saw that while several packages had different versions, the important one seemed to be cython: it was 0.29.35 in the working environment and 3.0.6 in the non-working one. So I created a new Python 3.8 environment with everything the same except that cython was downgraded to 0.29.35, and it worked:

======================================= test session starts ========================================
platform linux -- Python 3.8.18, pytest-7.4.0, pluggy-1.0.0
rootdir: /home/droe/GitHub/pytraj
collected 18 items                                                                                 

test_actionlist.py ..................                                                        [100%]

======================================== 18 passed in 0.84s ========================================

Version 3.0.0 was also broken, but downgrading to 0.29.36 worked. So it seems like the problem is that the 3.X.X versions of cython do not work with pytraj.
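The environment comparison described above can be automated. This is a minimal sketch that assumes the usual whitespace-separated layout of `conda list` output (name, version, build); the function names are hypothetical, and the sample build strings and numpy version are made up for illustration.

```python
def parse_conda_list(text):
    """Map package name -> version from the text printed by `conda list`."""
    versions = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and the header comments
        fields = line.split()
        if len(fields) >= 2:
            versions[fields[0]] = fields[1]
    return versions


def diff_versions(working, broken):
    """Return {name: (working_version, broken_version)} where they differ."""
    names = sorted(set(working) | set(broken))
    return {
        name: (working.get(name), broken.get(name))
        for name in names
        if working.get(name) != broken.get(name)
    }


if __name__ == "__main__":
    # Illustrative input only; build strings and the numpy version are invented.
    working_env = """\
# Name                    Version                   Build  Channel
cython                    0.29.35          py38h6a678d5_0
numpy                     1.24.3           py38h14f4228_0
"""
    broken_env = """\
# Name                    Version                   Build  Channel
cython                    3.0.6            py38h5eee18b_0
numpy                     1.24.3           py38h14f4228_0
"""
    print(diff_versions(parse_conda_list(working_env), parse_conda_list(broken_env)))
```

Packages present in only one environment show up with `None` on the other side, which helps when one environment pulled in an extra dependency.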

I often run cpptraj tests with valgrind enabled, so I'm pretty confident that there are no memory errors or leaks in libcpptraj.so itself (although I'm going to run one right now to be certain). Therefore I think the issue is either a bug in cython itself, or a bug in pytraj that is being exposed by the upgrade from the 0.X.X series to the 3.X.X series.

@PhiMykah can you try downgrading your cython to one of the 0.X.X versions and see if the issue goes away?

@hainm
Contributor

hainm commented Dec 20, 2023

hi @drroe Thanks for investigating.

It's still unclear to me how checking out the "green" commit from cpptraj helps resolve the segmentation fault too.
I will also investigate. (I need to find out how to build netcdf on my mac first :D)

@PhiMykah
Author

Hello! Looking in my conda environment, my cython version is 0.29.36, as it likely came from the amber installation. I will attempt to downgrade to python 3.8.18, but the version amber gave me was 3.11.5. Thank you for your reply and help! @drroe

@hainm
Contributor

hainm commented Dec 20, 2023

Hi @PhiMykah, I think @drroe meant checking the cython version (not the python version). The latest cython version is 3.0.7 https://pypi.org/project/Cython/ (in the context of the 0.X.X series vs. the 3.X.X series).
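Since the Python and Cython version numbers are easy to confuse here, a short script can print both side by side. This is a sketch only: the helper names are hypothetical, and the rule "major version below 3 is fine" simply encodes what this thread reports rather than any official compatibility statement.

```python
import sys
from importlib import metadata


def major_version(version):
    """Return the leading integer of a version string such as '0.29.36'."""
    return int(version.split(".")[0])


def cython_ok_for_pytraj(version):
    """True for the 0.X.X series, which this thread reports as working."""
    return major_version(version) < 3


if __name__ == "__main__":
    print("Python:", ".".join(map(str, sys.version_info[:3])))
    try:
        cython_version = metadata.version("Cython")
    except metadata.PackageNotFoundError:
        print("Cython: not installed in this environment")
    else:
        verdict = "ok" if cython_ok_for_pytraj(cython_version) else "reported as crashing in this thread"
        print("Cython:", cython_version, "->", verdict)
```

Run it with the same interpreter that builds pytraj, since a conda environment and a system Python can carry different Cython installs.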

@PhiMykah
Author

Checking the cython version (not the python version).
Yes, I believe he was saying that cython versions 3.X.X don't work properly, but cython versions 0.X.X should work, but I am on 0.29.36 for cython.

@drroe
Contributor

drroe commented Dec 20, 2023

Pinning cython to 0.29.36 worked for the CI framework: Amber-MD/cpptraj#1062

Looking in my conda environment, my cython version is 0.29.36

@PhiMykah So you're saying that with cython 0.29.36 you are still getting errors? If that is the case you should list in exact detail how you set up your tests. What version of amber, what your cmake configure line was, etc. I will try to reproduce what you are seeing locally.

@PhiMykah
Author

@drroe My Amber version is the most recent version of Amber23 (although the installed folder is named amber22). I used:

bash configure -shared -openmp -amberlib gnu -nohdf5

If I had to suspect anything, it would be the -nohdf5 flag, but I got the same issues without Amber and with HDF5. However, I was unable to get HDF5 to work with Amber activated.

I followed the amber22 guide for installing from source:

cd amber22_src/build
./run_cmake
make install
source /home/xxxx/amber22/amber.sh
cd $AMBERHOME
make test.serial

I got the 4 expected errors from the test as well. Hope this helps!

@hainm
Contributor

hainm commented Dec 20, 2023

@PhiMykah more importantly, are you using macos or not?
(since the CI is using ubuntu).

@PhiMykah
Author

@hainm I am running Ubuntu 22.04.3 LTS.

@drroe
Contributor

drroe commented Dec 20, 2023

@PhiMykah I'm using a fresh install of AmberTools 23 on Fedora 38 (GNU 13.2.1, cmake 3.27.7). Used the run_cmake script in $AMBERHOME/build. Attaching output cmake.log.
cmake.log
After sourcing amber.sh from the install directory, the pytraj test suite seems to pass OK:

[droe@thor lib]$ cd $AMBERHOME/AmberTools/test
[droe@thor test]$ cd pytraj/
[droe@thor pytraj]$ make test.pytraj 
Testing serial pytraj
/home/droe/amber/temp/amber22//miniconda/bin/python test.py
<module 'pytraj' from '/home/droe/amber/temp/amber22/lib/python3.11/site-packages/pytraj/__init__.py'>
PASSED
======================================= test session starts ========================================
platform linux -- Python 3.11.5, pytest-7.4.0, pluggy-1.0.0 -- /home/droe/amber/temp/amber22/miniconda/bin/python
cachedir: .pytest_cache
rootdir: /home/droe/amber/temp/amber22/AmberTools
plugins: anyio-3.5.0
collected 9 items                                                                                  

../../src/pytraj/tests/test_energy/test_sander_energies.py::TestSander::test_GB PASSED       [ 11%]
../../src/pytraj/tests/test_energy/test_sander_energies.py::TestSander::test_GB_QMMM PASSED  [ 22%]
../../src/pytraj/tests/test_energy/test_sander_energies.py::TestSander::test_PME PASSED      [ 33%]
../../src/pytraj/tests/test_energy/test_sander_energies.py::TestSander::test_PME_QMMM PASSED [ 44%]
../../src/pytraj/tests/test_energy/test_sander_energies.py::TestSander::test_PME_with_energy_decompositionr PASSED [ 55%]
../../src/pytraj/tests/test_energy/test_sander_energies.py::TestSander::test_frame_indices PASSED [ 66%]
../../src/pytraj/tests/test_energy/test_sander_energies.py::TestSander::test_gbneck2nu PASSED [ 77%]
../../src/pytraj/tests/test_energy/test_sander_energies.py::TestSander::test_mm_options_as_string PASSED [ 88%]
../../src/pytraj/tests/test_energy/test_sander_energies.py::TestSander::test_qm_options_as_string PASSED [100%]

======================================== 9 passed in 4.59s =========================================

Note: I did have to fix the miniconda libstdc++:

E   ImportError: /home/droe/amber/temp/amber22_src/build/CMakeFiles/miniconda/install/lib/libstdc++.so.6: version `GLIBCXX_3.4.32' not found (required by /home/droe/amber/temp/amber22/lib/libcpptraj.so)

This is because Fedora's libstdc++ (which provides the GLIBCXX symbol versions) is newer than conda's. To fix it you just need to link the conda libstdc++ to the system one. This is a conda issue and does not result in the errors you were seeing.
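To see which copy of libstdc++ is too old, one option is to scan each candidate library for the GLIBCXX version tags it advertises. The sketch below is an assumption-laden illustration: it searches the raw bytes of the file for `GLIBCXX_x.y.z` strings instead of parsing the ELF version tables, and the function names and the script name in the usage comment are hypothetical.

```python
import re
import sys
from pathlib import Path

# Matches tags such as GLIBCXX_3.4 or GLIBCXX_3.4.32 in raw library bytes.
_GLIBCXX = re.compile(rb"GLIBCXX_(\d+(?:\.\d+)+)")


def glibcxx_versions(data):
    """Return the sorted GLIBCXX version tuples found in raw library bytes."""
    found = {
        tuple(int(part) for part in match.group(1).decode().split("."))
        for match in _GLIBCXX.finditer(data)
    }
    return sorted(found)


def provides(data, required):
    """True if the library bytes advertise the required GLIBCXX version."""
    wanted = tuple(int(part) for part in required.split("."))
    return wanted in glibcxx_versions(data)


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage: python check_glibcxx.py /path/to/libstdc++.so.6 3.4.32
    library, required = sys.argv[1], sys.argv[2]
    print(provides(Path(library).read_bytes(), required))
```

Running it once against the conda libstdc++.so.6 and once against the system copy should show which one lacks the version named in the ImportError.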

An independent python script appears to work fine.

[droe@thor PytrajTesting]$ cat Test.py 
import pytraj as pt

traj = pt.iterload('/home/droe/Cpptraj/cpptraj/test/tz2.ortho.nc', '~/Cpptraj/cpptraj/test/tz2.ortho.parm7')
farray = traj[:]
farray.autoimage()
farray[1:3].save('temp.x', overwrite=True)
[droe@thor PytrajTesting]$ amber.python Test.py 
[droe@thor PytrajTesting]$ head temp.x
Cpptraj Generated trajectory                                                    
  15.354  28.680  16.102  16.280  28.978  16.374  14.845  29.469  15.731  15.381
  28.161  15.236  14.548  27.898  17.078  14.264  28.583  17.877  13.333  27.309
  16.382  13.615  26.793  15.464  12.712  26.624  16.960  12.454  28.349  15.905
  11.700  27.913  15.499  15.324  26.715  17.716  16.347  26.341  17.187  14.782
  26.142  18.775  13.955  26.505  19.228  15.385  24.906  19.392  16.334  24.634
  18.931  15.732  25.168  20.898  14.853  25.195  21.542  16.315  24.306  21.223
  16.605  26.438  21.130  16.087  27.572  21.679  15.088  27.582  22.090  17.101
  28.588  21.716  17.020  29.448  22.241  18.272  28.140  21.061  19.527  28.736
  20.950  19.771  29.730  21.295  20.511  27.973  20.217  21.501  28.358  20.022

So I'm unable to reproduce the issues you are seeing.

@hainm
Contributor

hainm commented Dec 20, 2023

@drroe I confirm with you that using pytraj's master (as of today) and cpptraj (latest, with pinning cython) and cython 0.29.36 is ok on my macos.

I will test with cython 3 soon.

@hainm
Contributor

hainm commented Dec 20, 2023

After sourcing amber.sh from the install directory, the pytraj test suite seems to pass OK:

@drroe If I understand correctly, @PhiMykah tested the code with pytraj and cpptraj in master branch, not the ones in ambertools.

@hainm
Contributor

hainm commented Dec 20, 2023

@drroe I confirm with you that using pytraj's master (as of today) and cpptraj (latest, with pinning cython) and cython 0.29.36 is ok on my macos.

I will test with cython 3 soon.

I confirm that cython 3 is indeed an issue.

@PhiMykah
Author

After sourcing amber.sh from the install directory, the pytraj test suite seems to pass OK:

I will attempt a fresh install of AmberTools 23 and get the master branches of pytraj and cpptraj.

@drroe If I understand correctly, @PhiMykah tested the code with pytraj and cpptraj in master branch, not the ones in ambertools.

Yes, I used pytraj and cpptraj both built from source on the most recent versions, not the pytraj and cpptraj installed in ambertools. I will attempt to install again while also using cython 0.29.36 and fixing the libstdc++ import error.

@drroe
Contributor

drroe commented Dec 21, 2023

@PhiMykah one important thing to note is that different versions of cpptraj/pytraj should not be mixed, i.e. don't use AmberTools pytraj with GitHub cpptraj or vice versa. The reason is that API changes in the GitHub version of cpptraj are only accounted for in the GitHub version of pytraj.
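One quick sanity check against mixing installs is to ask Python where it would import pytraj from and compare that location with the AmberTools install directory. This is only a sketch: the helper names are hypothetical, it relies on the `AMBERHOME` variable set by amber.sh, and it says nothing about which libcpptraj the extension modules were linked against.

```python
import importlib.util
import os
from pathlib import Path


def is_inside(path, root):
    """True if `path` is located under the directory `root`."""
    path, root = Path(path).resolve(), Path(root).resolve()
    return root == path or root in path.parents


def describe_origin(module_name, amberhome):
    """Report whether an importable module lives under an AmberTools install."""
    # find_spec locates a top-level module without importing it.
    spec = importlib.util.find_spec(module_name)
    if spec is None or spec.origin is None:
        return f"{module_name}: not importable"
    where = "inside" if is_inside(spec.origin, amberhome) else "outside"
    return f"{module_name}: {spec.origin} ({where} {amberhome})"


if __name__ == "__main__":
    amberhome = os.environ.get("AMBERHOME")
    if amberhome:
        print(describe_origin("pytraj", amberhome))
    else:
        print("AMBERHOME is not set in this shell")
```

A GitHub checkout installed with pip in editable mode should report a path outside the AmberTools tree, while the bundled pytraj should report one inside it.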

@PhiMykah
Author

@drroe @hainm Great news! After much struggle, the most recent version of Amber with the most recent versions of pytraj and cpptraj seems to be working properly! Some things to note:
cmake version is 3.22.1
cython 0.29.36
pytraj installed using pip install -e . in the pytraj directory

Please let me know if there's any part of the code you would like me to contribute to now that I have a smoothly working environment. What @hainm mentioned about cython is probably also correct, but I am unsure what caused my earlier source build to not work properly.

@hainm
Contributor

hainm commented Dec 21, 2023

@PhiMykah for now, can you please use the cpptraj's git commit that had the green tests?

cd cpptraj
git checkout bc47e44965c04912fb2aa5018c8278d2158c33be
<build libcpptraj again>
<clean pytraj build and rebuild pytraj>

Um, this suggestion doesn't resolve the problem when using cython 3.0 either. So it's indeed an issue with cython.

@PhiMykah for now, please stick with cython 0.29.36 (or 0.29.x)

@hainm
Contributor

hainm commented Dec 21, 2023

Please let me know if there's any part of the code you would like me to contribute to now that I have a smooth working environment.

@PhiMykah You can try to add support for #1599

@hainm
Contributor

hainm commented Dec 21, 2023

Let's close this topic and move the segmentation fault with cython 3.0 to #1648

@hainm hainm closed this as completed Dec 21, 2023