Reconnect to an existing bond after it was broken (instance_id issue) #22

Open
dmalyuta opened this issue Jun 21, 2017 · 7 comments

@dmalyuta

It seems that because of line 309 in bond.cpp, it is not possible to re-form a bond after a node on one end died and was revived. The use case story is as follows:

  • Nodes A and B form a bond
  • Node B dies, the bond is broken and a loop is activated in A with a waitUntilFormed call
  • Node B is revived by the user; however, instead of re-forming the bond, A starts printing the "More than two locations are trying to use a single bond..." error

Is this intentional in the bond package? If so, when the bond is broken, A in the above use case has to delete the old bond, create a new, identical ("fresh") one, and call waitUntilFormed on that one instead.
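
The symptom in the use case above can be illustrated in isolation. The following is a minimal sketch, not the bondcpp source: ToyBond and onSisterHeartbeat are hypothetical names, and it assumes that a restarted node announces itself with a new instance ID. The only behavior modeled is that one end of a bond remembers the first sister instance ID it sees and rejects any other ID afterwards.

```cpp
#include <string>

// Hypothetical, simplified model of one end of a bond (not the bondcpp class).
class ToyBond {
public:
  // Returns true if the heartbeat is accepted, false if it is rejected
  // as coming from a "third location".
  bool onSisterHeartbeat(const std::string& instance_id) {
    if (sister_instance_id_.empty()) {
      sister_instance_id_ = instance_id;  // first sister seen: remember it
    }
    return sister_instance_id_ == instance_id;
  }

private:
  std::string sister_instance_id_;  // never cleared in this model
};
```

In this model, once B comes back with a different instance ID, every heartbeat is rejected until the ToyBond object itself is replaced by a fresh one, which matches the behavior described in the list above.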

@mikaelarguedas
Member

@dmalyuta thanks for reporting.
I don't think this should be the behavior; you should be able to re-form a bond. The sister_instance_id should be cleared on the sister's death one way or another, without users having to clean up the bond themselves and create a "fresh" one.

@mikaelarguedas
Member

Looking a bit more into it, it looks like the bond state machine doesn't allow exiting the "Dead" state, so supporting this use case would require more changes than just some cleanup on the sister's death. I'm going to mark this as an enhancement and will try to address it in either ROS Lunar or ROS Melodic, but not in the existing stable distros.
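
To make the "Dead" state point concrete, here is a rough sketch of a state machine with an absorbing state. It is an illustration only: apart from "Dead", the state and event names and the transition table are assumptions, not the definitions used by the bond package.

```cpp
// Hypothetical, simplified states and events (only "Dead" is named in this thread).
enum class State { WaitingForSister, Alive, Dead };
enum class Event { SisterAlive, SisterDead };

// Pure transition function. Dead is absorbing: no event leads out of it,
// so a revived sister cannot bring the bond back to Alive.
State transition(State s, Event e) {
  switch (s) {
    case State::WaitingForSister:
      return e == Event::SisterAlive ? State::Alive : State::Dead;
    case State::Alive:
      return e == Event::SisterDead ? State::Dead : State::Alive;
    case State::Dead:
      return State::Dead;
  }
  return s;  // not reached; keeps compilers from warning about a missing return
}
```

Supporting re-forming would mean adding a transition out of Dead (or resetting to WaitingForSister), which is why this is more than a cleanup of the stored sister ID.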

For the time being, I'd suggest deleting the bond and creating a new one with the same id.
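
A possible shape for that workaround, as a sketch: the helper below destroys the old bond before constructing its replacement, so two local instances with the same topic and id never coexist. recreateBond and FakeBond are hypothetical names introduced here; FakeBond is a stand-in so the snippet compiles without ROS. The helper only assumes a type with a (topic, id) constructor and a start() method, which bond::Bond provides, but it has not been tested against the real library.

```cpp
#include <memory>
#include <string>

// Hypothetical stand-in for bond::Bond so the sketch compiles without ROS.
struct FakeBond {
  FakeBond(const std::string& t, const std::string& i) : topic(t), id(i) {}
  void start() { started = true; }
  std::string topic;
  std::string id;
  bool started = false;
};

// Destroy the old bond (if any) and start a fresh one with the same topic and id.
// BondT only needs a (topic, id) constructor and a start() method.
template <typename BondT>
void recreateBond(std::unique_ptr<BondT>& bond,
                  const std::string& topic, const std::string& id) {
  bond.reset();                      // delete the broken bond first
  bond.reset(new BondT(topic, id));  // then build an identical, fresh one
  bond->start();
}
```

With the real library, BondT would be bond::Bond, and the caller would follow recreateBond with waitUntilFormed on the new object. Holding the bond in a std::unique_ptr also avoids leaking it on early returns.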

@dmalyuta
Author

Sounds good. That's what I'm doing currently (deleting bond and creating a new one).

@kjeremy

kjeremy commented Aug 18, 2017

I tried this, but it doesn't always work. If I bring one of my nodes back up, I still see "More than two locations are trying to use a single bond".

@mikaelarguedas
Member

@kjeremy Can you provide a reproducible example?

@ricsp

ricsp commented Sep 5, 2017

Hi there,
is there any news on this issue?
I am getting the same "More than two locations.." error, but only from ProcessB, with the following code.

ProcessA.cpp

#include <bondcpp/bond.h>
#include <ros/ros.h>
#include <ros/spinner.h>

int main(int argc, char **argv)
{
  ros::init(argc, argv, "ProcessA", true);
  // Spin in the background so bond messages are processed while main() blocks.
  ros::AsyncSpinner spinner(1);
  spinner.start();
  bond::Bond *bond = new bond::Bond("example_bond_topic", "myBondId123456");
  ROS_INFO("A starting bond");
  bond->start();
  ROS_INFO("A waiting for bond to be formed");
  if (!bond->waitUntilFormed(ros::Duration(10.0)))
  {
    ROS_ERROR("ERROR!");
    delete bond;
    return 1;  // non-zero exit code: "return false" would report success
  }
  ROS_INFO("A waiting for bond to be broken");
  // ... do things with B ...
  bond->waitUntilBroken(ros::Duration(20.0));
  ROS_INFO("B has broken the bond");
  ROS_INFO("A starting bond again");
  // Workaround from this thread: delete the old bond, create an identical one.
  delete bond;
  bond = new bond::Bond("example_bond_topic", "myBondId123456");
  bond->start();
  ROS_INFO("A waiting for bond to be formed again");
  if (!bond->waitUntilFormed(ros::Duration(10.0)))
  {
    ROS_ERROR("ERROR!");
    delete bond;
    return 1;
  }
  ROS_INFO("A waiting for bond to be broken again");
  bond->waitUntilBroken(ros::Duration(20.0));
  ROS_INFO("B has broken the bond");
  delete bond;
  return 0;
}

ProcessB.cpp

#include <bondcpp/bond.h>
#include <ros/ros.h>
#include <ros/spinner.h>

int main(int argc, char **argv)
{
  ros::init(argc, argv, "ProcessB", true);
  // Spin in the background so bond messages are processed during the sleeps.
  ros::AsyncSpinner spinner(1);
  spinner.start();

  bond::Bond *bond = new bond::Bond("example_bond_topic", "myBondId123456");

  ROS_INFO("Starting bond");
  bond->start();
  ROS_INFO("Bond started");

  ros::Duration(3.0).sleep();

  ROS_INFO("Breaking bond");
  bond->breakBond();
  ROS_INFO("Bond stopped");
  ros::Duration(3.0).sleep();
  ROS_INFO("Start again");
  // Same workaround as in ProcessA: replace the broken bond with a fresh one.
  delete bond;
  bond = new bond::Bond("example_bond_topic", "myBondId123456");
  bond->start();
  ros::Duration(3.0).sleep();
  bond->breakBond();
  ROS_INFO("Bond stopped");

  delete bond;
  return 0;
}

@gongyue666

Has this bug been solved?
