exception when installing library #90

Open
mromanello opened this issue Apr 6, 2020 · 5 comments

@mromanello
Member

mromanello commented Apr 6, 2020

Installing with python setup.py install doesn't work as expected: the resources (e.g. impresso schemas, data) are missing from the zipped egg at /home/user/aflueck/.local/share/virtualenvs/impresso-text-acquisition-4opvm7nD/lib/python3.6/site-packages/text_importer-0.10.1-py3.6.egg.

However, the respective resources seem to be properly defined in the setup.py:

    package_data={
        'text_importer': [
            'data/*.*',
            'impresso-schemas/*.*',
            'impresso-schemas/json/newspaper/issue.schema.json',
        ]
    },
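
One thing worth checking here: as far as I know, the glob patterns in package_data were not recursive in the setuptools versions current at the time, so impresso-schemas/*.* would not reach files under impresso-schemas/json/newspaper/, and only issue.schema.json is listed explicitly. A minimal sketch of collecting nested files by walking the tree follows; the helper name collect_package_files is hypothetical and not part of the repository.

```python
import os


def collect_package_files(package_dir, subdir):
    """Return all files below package_dir/subdir, relative to package_dir.

    Hypothetical helper for building a package_data list.
    """
    collected = []
    root = os.path.join(package_dir, subdir)
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            full_path = os.path.join(dirpath, filename)
            # package_data entries are relative to the package directory
            # and use forward slashes regardless of platform.
            rel_path = os.path.relpath(full_path, package_dir)
            collected.append(rel_path.replace(os.sep, "/"))
    return sorted(collected)
```

In setup.py the result could be combined with the existing patterns, for example ['data/*.*'] + collect_package_files('text_importer', 'impresso-schemas'), assuming setup.py is run from the repository root.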

The resulting error when running any text importer is the following:


/home/user/aflueck/.local/share/virtualenvs/impresso-text-acquisition-CWjik7Hc/lib/python3.6/site-packages/python_jsonschema_objects/__init__.py:53: UserWarning: Schema version http://json-schema.org/draft-06/schema# not recognized. Some keywords and features may not be supported.
  self.schema["$schema"]
Traceback (most recent call last):
  File "text_importer/scripts/tetmlimporter.py", line 1, in <module>
    from text_importer.importers import generic_importer
  File "<frozen importlib._bootstrap>", line 971, in _find_and_load
  File "<frozen importlib._bootstrap>", line 955, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 656, in _load_unlocked
  File "<frozen importlib._bootstrap>", line 626, in _load_backward_compatible
  File "/home/user/aflueck/.local/share/virtualenvs/impresso-text-acquisition-CWjik7Hc/lib/python3.6/site-packages/text_importer-0.10.1-py3.6.egg/text_importer/importers/generic_importer.py", line 38, in <module>
  File "<frozen importlib._bootstrap>", line 971, in _find_and_load
  File "<frozen importlib._bootstrap>", line 955, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 656, in _load_unlocked
  File "<frozen importlib._bootstrap>", line 626, in _load_backward_compatible
  File "/home/user/aflueck/.local/share/virtualenvs/impresso-text-acquisition-CWjik7Hc/lib/python3.6/site-packages/text_importer-0.10.1-py3.6.egg/text_importer/importers/classes.py", line 14, in <module>
  File "/home/user/aflueck/.local/share/virtualenvs/impresso-text-acquisition-CWjik7Hc/lib/python3.6/site-packages/text_importer-0.10.1-py3.6.egg/text_importer/utils.py", line 49, in get_page_schema
  File "/home/user/aflueck/.local/share/virtualenvs/impresso-text-acquisition-CWjik7Hc/lib/python3.6/site-packages/pkg_resources/__init__.py", line 1146, in resource_filename
    self, resource_name
  File "/home/user/aflueck/.local/share/virtualenvs/impresso-text-acquisition-CWjik7Hc/lib/python3.6/site-packages/pkg_resources/__init__.py", line 1746, in get_resource_filename
    return self._extract_resource(manager, zip_path)
  File "/home/user/aflueck/.local/share/virtualenvs/impresso-text-acquisition-CWjik7Hc/lib/python3.6/site-packages/pkg_resources/__init__.py", line 1767, in _extract_resource
    timestamp, size = self._get_date_and_size(self.zipinfo[zip_path])
KeyError: 'text_importer/impresso-schemas/json/newspaper/page.schema.json'

As a current workaround, one may skip the installation via setup.py; the issue then doesn't occur.
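
To confirm what actually ended up in the egg, the archive can be listed directly, since a zipped egg is an ordinary zip file. A rough sketch, where missing_from_egg is a hypothetical helper and the member name in the usage note is the one from the KeyError above:

```python
import zipfile


def missing_from_egg(egg_path, expected_members):
    """Return the expected member names that are absent from the zip archive."""
    with zipfile.ZipFile(egg_path) as egg:
        present = set(egg.namelist())
    return [name for name in expected_members if name not in present]
```

Calling it with the path to the installed egg and ['text_importer/impresso-schemas/json/newspaper/page.schema.json'] should return that same name if the file was never packaged.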

@mromanello mromanello pinned this issue Apr 6, 2020
@mromanello mromanello added the bug Something isn't working label Apr 6, 2020
@mromanello
Member Author

mromanello commented Apr 6, 2020

Now, if I run tox, the tests stop with the following exception:

        # Get all BLIP dirs (named with NLP ID)
>       blip_dirs = [x for x in os.listdir(base_dir) if os.path.isdir(os.path.join(base_dir, x))]
E       FileNotFoundError: [Errno 2] No such file or directory: '/home/romanell/Documents/impresso/impresso-text-acquisition/.tox/py36/lib/python3.6/site-packages/text_importer/data/temp/'

.tox/py36/lib/python3.6/site-packages/text_importer/importers/bl/detect.py:144: FileNotFoundError

This happens because we assume that data/temp exists. The code should be made more resilient and create the data/temp folder if it does not exist.
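
A minimal sketch of the more resilient behaviour, assuming the directory listing in detect.py can be wrapped as below; list_subdirs is a hypothetical name, not the function in the repository:

```python
import os


def list_subdirs(base_dir):
    """List the sub-directories of base_dir, creating base_dir if it is missing."""
    # exist_ok=True makes this a no-op when the folder is already there.
    os.makedirs(base_dir, exist_ok=True)
    return [
        name
        for name in sorted(os.listdir(base_dir))
        if os.path.isdir(os.path.join(base_dir, name))
    ]
```

Note that creating a folder inside site-packages can fail when the installation is read-only, so a directory from the tempfile module might be a safer location.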

@mromanello mromanello changed the title from "exception when installing library with pip" to "exception when installing library" Apr 6, 2020
@aflueckiger
Collaborator

Locally, tox runs fine, unlike the situation that we have on the server. Although the data/temp is missing here as well, the test with tox outputs a different error message:

tests/importers/test_bl_importer.py:5: in <module>
    from text_importer.importers.bl.classes import BlNewspaperIssue
text_importer/importers/bl/classes.py:7: in <module>
    from text_importer.importers.mets_alto import (MetsAltoNewspaperIssue,
text_importer/importers/mets_alto/__init__.py:1: in <module>
    from text_importer.importers.mets_alto.classes import MetsAltoNewspaperPage, MetsAltoNewspaperIssue
text_importer/importers/mets_alto/classes.py:11: in <module>
    from text_importer.importers.classes import NewspaperIssue, NewspaperPage
text_importer/importers/classes.py:13: in <module>
    IssueSchema = get_issue_schema()
text_importer/utils.py:67: in get_issue_schema
    with open(os.path.join(schema_path), "r") as f:
E   FileNotFoundError: [Errno 2] No such file or directory: '/home/alex/impresso-text-acquisition/text_importer/impresso-schemas/json/newspaper/issue.schema.json'

This seems to be the original error that we had after executing setup.py.

@mromanello
Member Author

Mmm, I don't understand why it is looking for the JSON file in your local folder instead of the one under .tox, where all package-related data should have been copied... (compare with the path in the error I get and see how they differ).

@aflueckiger
Collaborator

aflueckiger commented Apr 7, 2020

I see the difference. I also get the same error when running setup.py before tox.
The resources in .tox are created this time (checked after setup.py). However, for the text_importer, I only have the following file: /home/alex/impresso-text-acquisition/.tox/py36/lib/python3.6/site-packages/text-importer.egg-link. I don't know what this file is. Apparently, it is not a zipped folder, and the other resources (the ones you have) are not there.

Simon told me that I should test within a conda environment on the server. I will check in a bit.

@mromanello
Member Author

update after some fighting:

  • the content of impresso-schemas is now included in the installed package; when running python setup.py sdist the files get copied to the right place;
  • installation with pip (version <= 18.x) works from the cloned repository (pip install .)
  • installation with pip (version <= 18.x) from the GH repo (pip install https://github.com/impresso/impresso-text-acquisition/archive/issue-90/installation.zip) fails (I still don't understand why, as it should be equivalent to the command above)
  • installation with pip 19.x or pip 20.x fails because the --process-dependency-links option was removed in these versions, and we have two deps "living" on GH instead of PyPI, thus requiring dependency-links for installation

The way forward seems to be publishing impresso-commons and dask_k8 on PyPI, so that we no longer need dependency-links.
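
Until both packages are on PyPI, one possible stopgap (an assumption, not something tested in this thread) is to declare the GitHub-hosted dependencies as PEP 508 direct references in install_requires, which newer pip versions accept without dependency-links. The helper name below is hypothetical, and the repository URL and ref are placeholders rather than the real locations:

```python
def direct_reference(name, repo_url, ref):
    """Build a PEP 508 direct reference of the form 'name @ git+<url>@<ref>'."""
    return "{} @ git+{}@{}".format(name, repo_url, ref)


# Placeholder URL and ref: replace with the real repository and tag or commit.
install_requires = [
    direct_reference(
        "impresso-commons",
        "https://example.invalid/impresso-commons.git",
        "main",
    ),
]
```

As far as I know, PyPI rejects uploads whose metadata contains direct references, so this would only help for installs from the repository and does not replace publishing the two dependencies.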
