Possible race condition in acquiring free ports for Drb #73

Open
christopherwharrop opened this issue Dec 20, 2019 · 1 comment

@christopherwharrop

The following was reported by Kate Friedman:

I am getting a random error when running rocotorun on Hera (see below). It doesn't happen all the time and will eventually go away after another attempt or two. Not sure if anyone else has encountered this today or reported it.

-bash-4.2$ rocotorun -d testff192.db -w testff192.xml
/usr/share/ruby/drb/drb.rb:863:in `initialize': Address already in use - bind(2) (Errno::EADDRINUSE)
from /usr/share/ruby/drb/drb.rb:863:in `open'
from /usr/share/ruby/drb/drb.rb:863:in `open_server_inaddr_any'
from /usr/share/ruby/drb/drb.rb:877:in `open_server'
from /usr/share/ruby/drb/drb.rb:764:in `block in open_server'
from /usr/share/ruby/drb/drb.rb:762:in `each'
from /usr/share/ruby/drb/drb.rb:762:in `open_server'
from /usr/share/ruby/drb/drb.rb:1373:in `initialize'
from /usr/share/ruby/drb/drb.rb:1664:in `new'
from /usr/share/ruby/drb/drb.rb:1664:in `start_service'
from /apps/rocoto/1.3.1/sbin/rocotodbserver:45:in `<main>'
09/23/19 19:19:27 UTC :: testff192.xml :: Workflow Manager Initialization failed.
09/23/19 19:19:27 UTC :: testff192.xml :: no implicit conversion of nil into String
09/23/19 19:19:27 UTC :: testff192.xml :: Unexpected failure: Workflow Manager Initialization failed.
09/23/19 19:19:27 UTC :: testff192.xml :: /usr/share/ruby/drb/drb.rb:748:in `+'
/usr/share/ruby/drb/drb.rb:748:in `open'
/usr/share/ruby/drb/drb.rb:746:in `open'
/usr/share/ruby/drb/drb.rb:1216:in `initialize'
/usr/share/ruby/drb/drb.rb:1196:in `new'
/usr/share/ruby/drb/drb.rb:1196:in `open'
/usr/share/ruby/drb/drb.rb:1109:in `block in method_missing'
/usr/share/ruby/drb/drb.rb:1128:in `with_friend'
/usr/share/ruby/drb/drb.rb:1108:in `method_missing'
/apps/rocoto/1.3.1/lib/workflowmgr/dbproxy.rb:130:in `rescue in initdb'
/apps/rocoto/1.3.1/lib/workflowmgr/dbproxy.rb:106:in `initdb'
/apps/rocoto/1.3.1/lib/workflowmgr/dbproxy.rb:31:in `initialize'
/apps/rocoto/1.3.1/lib/workflowmgr/workflowengine.rb:66:in `new'
/apps/rocoto/1.3.1/lib/workflowmgr/workflowengine.rb:66:in `initialize'
/apps/rocoto/1.3.1/bin/../sbin/rocotorun.rb:34:in `<main>'

@christopherwharrop

This is likely a race condition that occurs when multiple processes attempt to spin up a new server process at the same time. The call to start a new DRb server acquires a random free port. Somehow it decides that a port is free, but then finds that the port is already in use when it actually tries to bind to it (the Errno::EADDRINUSE in the trace above).
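
For reference, here is a minimal sketch of the kind of check-then-use race described above and two common ways around it. It uses plain TCPServer sockets rather than the actual rocotodbserver code, which I have not traced for this, so the helper names (probe_free_port, open_server_on_free_port, open_server_with_retry) and the 127.0.0.1 host are hypothetical and only for illustration.

```ruby
require 'socket'

# Racy pattern (assumed here for illustration, not taken from rocoto):
# ask the OS for a free port, release it, and return only the number.
# Another process can bind the same port before the caller uses it.
def probe_free_port(host = '127.0.0.1')
  probe = TCPServer.new(host, 0)  # port 0 asks the OS for any free port
  port = probe.addr[1]
  probe.close                     # from here on the port is up for grabs again
  port
end

# Safer pattern: bind to port 0 and keep the socket the OS hands back, so
# there is no gap between finding a port and binding to it.
def open_server_on_free_port(host = '127.0.0.1')
  server = TCPServer.new(host, 0)
  [server, server.addr[1]]
end

# Fallback when the port number has to be chosen before binding: treat
# EADDRINUSE as "try another candidate" instead of a fatal error.
# The block is called once per attempt and must return a candidate port.
def open_server_with_retry(host = '127.0.0.1', attempts: 5)
  tries = 0
  begin
    TCPServer.new(host, yield)
  rescue Errno::EADDRINUSE
    tries += 1
    retry if tries < attempts
    raise
  end
end
```

As far as I can tell from drb.rb, the TCP transport in DRb also accepts port 0 in its URI (for example druby://localhost:0) and DRb.uri then reports the port that was actually bound, so the second pattern may be available without touching sockets directly. Whether that fits how the server's URI is passed back to rocotorun is something that still needs checking.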
