RuntimeWarning: overflow encountered in reduce #81

Open
DinosaurInSpace opened this issue Nov 12, 2019 · 5 comments

Comments

@DinosaurInSpace

I am calculating the Mordred descriptors for a subset of 10k or so of HMDB. I get the following errors:

"
RuntimeWarning: overflow encountered in reduce
return ufunc.reduce(obj, axis, dtype, out, **passkwargs)
"
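For reference, the message itself can be reproduced outside Mordred. The sketch below is not taken from Mordred's code: it assumes only NumPy, and both helper functions are hypothetical names made up for illustration. It shows that summing float64 values past the representable range gives `inf` together with this warning, and that `np.errstate(over="raise")` turns the warning into a `FloatingPointError`, whose traceback points at the call that overflowed.

```python
import warnings

import numpy as np


def sum_with_default_handling(values):
    """Sum float64 values and collect the text of any warnings emitted."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # record every warning, not only the first
        total = np.sum(np.asarray(values, dtype=np.float64))
    return total, [str(w.message) for w in caught]


def sum_with_strict_handling(values):
    """Same sum, but an overflow raises FloatingPointError instead of warning."""
    with np.errstate(over="raise"):
        return np.sum(np.asarray(values, dtype=np.float64))


# Two values near the float64 maximum: their sum is not representable.
total, messages = sum_with_default_handling([1e308, 1e308])
print(total, messages)

try:
    sum_with_strict_handling([1e308, 1e308])
except FloatingPointError as exc:
    print("raised:", exc)
```

Wrapping a descriptor calculation in the same `np.errstate` context may help locate the overflowing reduction, provided the calculation runs in the current process rather than in worker processes.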

Interestingly, the number of warnings increases over time as I run through the set. I am not sure whether this is a system issue or a bias from the order of the molecules in the HMDB subset.

I am happy to send over the HMDB IDs and structures as a pickle if you are interested.

Thanks so much! Super happy with everything else so far...

--
I am running very simple code for this operation per your tutorial:

from mordred import Calculator, descriptors

calc = Calculator(descriptors)
df = calc.pandas(mols)  # mols: iterable of RDKit Mol objects

--

channels:

  • rdkit
  • bioconda
  • mordred-descriptor
  • conda-forge
  • anaconda
  • defaults

dependencies:

  • altair=3.2.0=py36_0
  • appnope=0.1.0=py36hf537a9a_0
  • asn1crypto=1.2.0=py36_0
  • attrs=19.2.0=py_0
  • backcall=0.1.0=py36_0
  • blas=1.0=mkl
  • bleach=3.1.0=py36_0
  • bzip2=1.0.8=h1de35cc_0
  • ca-certificates=2019.10.16=0
  • cairo=1.14.12=hc4e6be7_4
  • certifi=2019.9.11=py36_0
  • cffi=1.13.1=py36hb5b8e2f_0
  • chardet=3.0.4=py36_1003
  • cryptography=2.8=py36ha12b0ac_0
  • cycler=0.10.0=py_1
  • dbus=1.13.6=h90a0687_0
  • decorator=4.4.0=py36_1
  • defusedxml=0.6.0=py_0
  • entrypoints=0.3=py36_0
  • expat=2.2.6=h0a44026_0
  • fontconfig=2.13.0=h5d5b041_1
  • freetype=2.10.0=h24853df_1
  • gettext=0.19.8.1=h15daf44_3
  • glib=2.56.2=hd9629dc_0
  • icu=58.2=h4b95b61_1
  • idna=2.8=py36_0
  • intel-openmp=2019.5=281
  • ipykernel=5.1.2=py36h39e3cac_0
  • ipython=7.8.0=py36h39e3cac_0
  • ipython_genutils=0.2.0=py36h241746c_0
  • ipywidgets=7.5.1=py_0
  • jedi=0.15.1=py36_0
  • jinja2=2.10.3=py_0
  • joblib=0.13.2=py36_0
  • jpeg=9b=he5867d9_2
  • jsonschema=3.0.2=py36_0
  • jupyter=1.0.0=py36_7
  • jupyter_client=5.3.3=py36_1
  • jupyter_console=6.0.0=py36_0
  • jupyter_core=4.5.0=py_0
  • kiwisolver=1.1.0=py36h770b8ee_0
  • libboost=1.67.0=hebc422b_4
  • libcxx=4.0.1=hcfea43d_1
  • libcxxabi=4.0.1=hcfea43d_1
  • libedit=3.1.20181209=hb402a30_0
  • libffi=3.2.1=h475c297_4
  • libgfortran=3.0.1=h93005f0_2
  • libiconv=1.15=hdd342a3_7
  • libpng=1.6.37=h2573ce8_0
  • libsodium=1.0.16=h3efe00b_0
  • libtiff=4.0.10=hcb84e12_2
  • libxml2=2.9.9=hf6e021a_1
  • libxslt=1.1.33=h33a18ac_0
  • llvm-openmp=4.0.1=hcfea43d_1
  • llvmlite=0.30.0=py36h98b8051_0
  • lxml=4.4.1=py36hef8c89e_0
  • markupsafe=1.1.1=py36h1de35cc_0
  • matplotlib=3.1.1=py36_1
  • matplotlib-base=3.1.1=py36h3a684a6_1
  • matplotlib-venn=0.11.5=py_1
  • mistune=0.8.4=py36h1de35cc_0
  • mkl=2019.5=281
  • mkl-service=2.3.0=py36hfbe908c_0
  • mkl_fft=1.0.14=py36h5e564d8_0
  • mkl_random=1.1.0=py36ha771720_0
  • mordred=1.2.0=pyhe5148d4_0
  • nbconvert=5.6.0=py36_1
  • nbformat=4.4.0=py36h827af21_0
  • ncurses=6.1=h0a44026_1
  • networkx=2.4=py_0
  • notebook=6.0.1=py36_0
  • numba=0.46.0=py36h6440ff4_0
  • numpy=1.17.2=py36h99e6662_0
  • numpy-base=1.17.2=py36h6575580_0
  • olefile=0.46=py36_0
  • openssl=1.1.1=h1de35cc_0
  • pandas=0.25.1=py36h0a44026_0
  • pandoc=2.2.3.2=0
  • pandocfilters=1.4.2=py36_1
  • parso=0.5.1=py_0
  • pcre=8.43=h0a44026_0
  • pexpect=4.7.0=py36_0
  • pickleshare=0.7.5=py36_0
  • pillow=6.2.0=py36hb68e598_0
  • pip=19.2.3=py36_0
  • pixman=0.38.0=h1de35cc_0
  • prometheus_client=0.7.1=py_0
  • prompt_toolkit=2.0.10=py_0
  • ptyprocess=0.6.0=py36_0
  • py-boost=1.67.0=py36h6440ff4_4
  • pycparser=2.19=py36_0
  • pygments=2.4.2=py_0
  • pyimzml=1.2.6=py_1
  • pyopenssl=19.0.0=py36_0
  • pyparsing=2.4.2=py_0
  • pyqt=5.9.2=py36h655552a_0
  • pyrsistent=0.15.4=py36h1de35cc_0
  • pysocks=1.7.1=py36_0
  • pyteomics=4.1.2=py_0
  • python=3.6.8=haf84260_0
  • python-dateutil=2.8.0=py36_0
  • pytz=2019.3=py_0
  • pyzmq=18.1.0=py36h0a44026_0
  • qt=5.9.7=h468cd18_1
  • qtconsole=4.5.5=py_0
  • rdkit=2019.03.4.0=py36h65625ec_1
  • readline=7.0=h1de35cc_5
  • requests=2.22.0=py36_0
  • scikit-learn=0.21.3=py36h27c97d8_0
  • scipy=1.3.1=py36h1410ff5_0
  • send2trash=1.5.0=py36_0
  • setuptools=41.4.0=py36_0
  • sip=4.19.13=py36h0a44026_0
  • six=1.12.0=py36_0
  • spectrum_utils=0.3.2=py_2
  • sqlalchemy=1.3.10=py36h1de35cc_0
  • sqlite=3.30.0=ha441bb4_0
  • tbb=2019.8=h04f5b5a_0
  • terminado=0.8.2=py36_0
  • testpath=0.4.2=py36_0
  • tk=8.6.8=ha441bb4_0
  • toolz=0.10.0=py_0
  • tornado=6.0.3=py36h01d97ff_0
  • tqdm=4.36.1=py_0
  • traitlets=4.3.3=py36_0
  • wcwidth=0.1.7=py36h8c6ec74_0
  • webencodings=0.5.1=py36_1
  • wheel=0.33.6=py36_0
  • wheezy.template=0.1.167=py_1
  • widgetsnbextension=3.5.1=py36_0
  • xz=5.2.4=h1de35cc_4
  • zeromq=4.3.1=h0a44026_3
  • zlib=1.2.11=h1de35cc_3
  • zstd=1.3.7=h5bba6e5_0
  • pip:
    • boto3==1.10.10
    • botocore==1.13.10
    • docutils==0.15.2
    • elasticsearch==5.4.0
    • elasticsearch-dsl==5.3.0
    • jmespath==0.9.4
    • metaspace2020==1.4.3
    • plotly==4.2.1
    • pymspec==0.1.2
    • pyyaml==5.1.2
    • retrying==1.3.3
    • s3transfer==0.2.1
    • urllib3==1.25.6

--

OS/distribution

Mac OSX 10.15 Catalina

conda or pip

conda

python version

Python 3.6.8 :: Anaconda, Inc.

library version

rdkit 2019.03.4.0 py36h65625ec_1 rdkit

@remseven

remseven commented Feb 11, 2021

I am facing the same issue when running Mordred from the command line, sometimes even with a single molecule. It seems to occur mostly with large molecules.

@plkx

plkx commented Feb 23, 2021

I've confirmed this problem as well.

I am trying to isolate the source by backtracking through compound sets and various Python updates. I have previously run much larger molecules without the warning.

It may be a few days before I can prioritize this.

Regards,

PLKX

@plkx

plkx commented Feb 23, 2021

A question for the posters who have experienced this problem:

What modifications to mordred, if any, are there in your system?

For example, does your environment have code modifications such as these (or comparable ones): #80 (comment), #80 (comment)?

I see that DinosaurInSpace is using networkx=2.4. I am using 2.5 with requisite code modifications in DetourMatrix.py which allow successful completion of mordred self-tests.

The mordred self-tests do not cover any molecules with as many atoms as those for which I have encountered the numpy overflow warning. Since the detour matrix is an n×n matrix, where n is the number of heavy (non-hydrogen) atoms, this seems the most logical starting point for tracking down the problem.

However, if the problem occurs in the absence of the DetourMatrix.py modifications, I may look elsewhere.
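One way to narrow this down is to run the calculation one molecule at a time and record which inputs trigger a warning. The following is a minimal sketch using only the standard library. `find_warning_inputs` and `fake_descriptor` are hypothetical names, and `fake_descriptor` is a stand-in that warns for large inputs; it is not Mordred's behaviour.

```python
import warnings


def find_warning_inputs(func, items, category=RuntimeWarning):
    """Call func(item) for each item and return (index, message) pairs
    for every warning of the given category emitted during that call."""
    hits = []
    for index, item in enumerate(items):
        with warnings.catch_warnings(record=True) as caught:
            # "always" keeps repeated warnings from the same code location
            warnings.simplefilter("always")
            func(item)
        for w in caught:
            if issubclass(w.category, category):
                hits.append((index, str(w.message)))
    return hits


def fake_descriptor(n_atoms):
    """Hypothetical stand-in for a descriptor calculation."""
    if n_atoms > 50:
        warnings.warn("overflow encountered in reduce", RuntimeWarning)
    return n_atoms * n_atoms


# Indices 1 and 3 are the "large" inputs in this toy example.
print(find_warning_inputs(fake_descriptor, [12, 80, 30, 64]))
```

With real data, `func` would be whatever computes the descriptors for a single molecule in the current process. Warnings raised inside worker processes would not be captured this way. Comparing the flagged indices against heavy-atom counts would show whether molecule size is the common factor.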

Thanks,

PLKX

@batmanscode

batmanscode commented Mar 5, 2021

I'm having this overflow reduce problem as well.


Env: google colab

Code:

! wget https://repo.anaconda.com/miniconda/Miniconda3-py37_4.8.2-Linux-x86_64.sh
! chmod +x Miniconda3-py37_4.8.2-Linux-x86_64.sh
! bash ./Miniconda3-py37_4.8.2-Linux-x86_64.sh -b -f -p /usr/local
! conda install -c rdkit rdkit -y
import sys
sys.path.append('/usr/local/lib/python3.7/site-packages/')
# used rdkit to calculate lipinski descriptors in the same notebook before installing and using mordred

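! pip install -q 'mordred[full]'

from rdkit import Chem
from mordred import Calculator, descriptors

# `data` is defined earlier in the notebook and holds the SMILES strings under 'canonical_smiles'
mols = [Chem.MolFromSmiles(smi) for smi in data['canonical_smiles']]
calc = Calculator(descriptors)  # calculator with all registered descriptors
df = calc.pandas(mols)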

data:
canon_smiles.txt

@remseven

remseven commented Mar 29, 2021

@plkx, sorry for not answering earlier...
In my case I modified my environment to use networkx=2.1.0, based on what I had read in previous issues. This was about a year ago, when I installed Mordred for the first time. At the time it seemed to fix the issues I faced (probably in the self-test).

Last week, I got this overflow problem:
~/anaconda/lib/python3.6/site-packages/numpy/core/fromnumeric.py:87: RuntimeWarning: overflow encountered in reduce
return ufunc.reduce(obj, axis, dtype, out, **passkwargs)

Previously I also had this one:
~/anaconda/lib/python3.6/site-packages/mordred/_matrix_attributes.py:251: RuntimeWarning: invalid value encountered in double_scalars
s += (eig.vec[i, eig.max] * eig.vec[j, eig.max]) ** -0.5
~/anaconda/lib/python3.6/site-packages/mordred/_matrix_attributes.py:251: RuntimeWarning: divide by zero encountered in double_scalars
s += (eig.vec[i, eig.max] * eig.vec[j, eig.max]) ** -0.5
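Both warnings on that line are consistent with how NumPy float64 scalars handle a negative fractional power. The sketch below is an illustration only: `inverse_sqrt_of_product` is a hypothetical helper, and it assumes the eigenvector entries are float64 values, as the `double_scalars` wording suggests. A zero product yields `inf` (the divide-by-zero case) and a negative product yields `nan` (the invalid-value case).

```python
import numpy as np


def inverse_sqrt_of_product(a, b):
    """Evaluate (a * b) ** -0.5 on float64 scalars with the warnings silenced."""
    with np.errstate(divide="ignore", invalid="ignore"):
        return (np.float64(a) * np.float64(b)) ** -0.5


ordinary = inverse_sqrt_of_product(0.5, 0.5)         # product is positive
zero_product = inverse_sqrt_of_product(0.5, 0.0)     # product is zero
negative_product = inverse_sqrt_of_product(0.5, -0.5)  # product is negative

print(ordinary, zero_product, negative_product)
```

If this is what happens in the traceback above, the affected sums would contain `inf` or `nan` terms whenever two eigenvector components have a zero or negative product.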

I hope this helps. Let me know if you want me to run a few tests. Sadly, I can't share the structures I am studying.
I have run the list from @batmanscode and I get some overflow warnings too (see the selected excerpts attached).

canon_smiles_calculated.txt
