DBRunRecoveryError in the DeltaFetch middleware #25

Open
Big2Cat opened this issue Nov 8, 2017 · 3 comments

Big2Cat commented Nov 8, 2017

  File "/home/.virtualenvs/Spider_py2/local/lib/python2.7/site-packages/scrapy_deltafetch/middleware.py", line 79, in process_spider_output
    if key in self.db:
DBRunRecoveryError: (-30974, 'DB_RUNRECOVERY: Fatal error, run database recovery -- PANIC: fatal region error detected; run recovery')

It looks like something is wrong with the middleware. The error occurred partway through a long-running spider crawl.

sola91 commented May 13, 2019

Any news regarding this issue?

@jmgomezsoriano

I have the same problem. I tried to work around it by writing this function:

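from bsddb3.db import DB  # assumption: the bsddb3 Berkeley DB bindings are installed

def recover_cache():
    """Re-store every entry that items() returns but a direct lookup cannot find."""
    db = DB()
    db.open(DELTAFETCH_DB)  # DELTAFETCH_DB: path to the DeltaFetch database file, defined elsewhere
    try:
        for k, v in db.items():
            try:
                db[k]
            except KeyError:
                db[k] = v  # write the entry back so it can be looked up by key again
    finally:
        db.close()  # close the handle even if the scan fails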

I call it to repair the index before starting the crawler. This works, but sooner or later the error appears again. I think the problem is that the method that stores a key-value pair in Berkeley DB is not synchronized, so when two threads or processes call it at the same time, the database gets corrupted. I think it is necessary to synchronize the method or use a semaphore.
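
To illustrate the kind of synchronization meant here, below is a minimal sketch, not code from scrapy-deltafetch. `LockedStore` and `demo` are hypothetical names, and a plain dict stands in for the Berkeley DB handle so the example runs without bsddb3. It only serializes threads inside one process; if several processes share the same database file, a `threading.Lock` is not enough and some inter-process locking would be needed instead.

```python
import threading


class LockedStore(object):
    """Hypothetical wrapper that serializes access to a dict-like store."""

    def __init__(self, store):
        self._store = store
        self._lock = threading.Lock()

    def __contains__(self, key):
        # Same shape as the failing check in the traceback: `key in self.db`
        with self._lock:
            return key in self._store

    def __getitem__(self, key):
        with self._lock:
            return self._store[key]

    def __setitem__(self, key, value):
        with self._lock:
            self._store[key] = value


def demo():
    # A plain dict stands in for the Berkeley DB handle (assumption).
    store = LockedStore({})

    def worker(n):
        for i in range(100):
            store["%d-%d" % (n, i)] = "seen"

    threads = [threading.Thread(target=worker, args=(n,)) for n in range(4)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return store
```

Calling `demo()` has four threads write through the same lock, so each membership test or store completes before the next one starts.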

@smartmark-pro

@jmgomezsoriano

I call it to repair the index before starting the crawler. This works, but sooner or later the error appears again. I think the problem is that the method that stores a key-value pair in Berkeley DB is not synchronized, so when two threads or processes call it at the same time, the database gets corrupted. I think it is necessary to synchronize the method or use a semaphore.

Do you mean this? http://pybsddb.sourceforge.net/ref/transapp/recovery.html

I tried running the command "db_recover -c", but the problem still exists.

In the end I gave up on repairing the database and reset it with "-a deltafetch_reset=1".
