libpmi: move some complexity to broker #2172

Merged 12 commits on Jun 5, 2019

Commits on Jun 5, 2019

  1. libpmi2: drop incomplete PMI-2 library

    Problem: our PMI-2 library is incomplete and is an
    attractive nuisance at this point.
    
    It was added to allow OpenMPI jobs to be launched under
    flux when OpenMPI, through its MCA plugins, determines that
    it should dlopen slurm's libpmi2.so.  OpenMPI could be
    tricked via LD_LIBRARY_PATH into opening a Flux libpmi2.so
    that provides the minimal API needed by OpenMPI, and maps
    it to the simple PMI-1 wire protocol.  For more background
    see flux-framework#746
    
    Since we submitted a flux MCA module that helps OpenMPI
    choose the right PMI in the flux environment, and since
    shipping this libpmi2.so with flux is misleading (it doesn't
    implement the full PMI-2 "spec"), let's get rid of it.
    
    Fixes flux-framework#2082
    garlick committed Jun 5, 2019
    ca5ae3f
  2. libpmi: refactor duplicate pmi_cliquetostr()

    Problem: function for converting array of integers
    to comma-separated string exists in both test/pminfo.c
    and test/clique.c.
    
    Refactor to pmi_cliquetostr() in clique.[ch].
    garlick committed Jun 5, 2019
    f30bf66
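As an illustration of the helper described in commit 2, here is a minimal sketch of converting an array of integers to a comma-separated string. The name `cliquetostr_sketch` and its signature are hypothetical; the actual pmi_cliquetostr() in clique.[ch] may take different arguments and handle errors differently.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper (not the real pmi_cliquetostr()):
 * format ranks[0..len-1] as "r0,r1,..." into buf.
 * Returns 0 on success, -1 on invalid arguments or if buf is too small. */
int cliquetostr_sketch (char *buf, size_t bufsz, const int *ranks, int len)
{
    size_t used = 0;

    if (!buf || bufsz == 0 || len < 0 || (!ranks && len > 0))
        return -1;
    buf[0] = '\0';
    for (int i = 0; i < len; i++) {
        /* Prefix every element except the first with a comma. */
        int n = snprintf (buf + used, bufsz - used, "%s%d",
                          i > 0 ? "," : "", ranks[i]);
        if (n < 0 || (size_t)n >= bufsz - used)
            return -1;          /* output would have been truncated */
        used += (size_t)n;
    }
    return 0;
}
```

Returning an error on truncation, rather than silently emitting a shortened list, is a design choice of this sketch; a caller in a test program could instead size the buffer from the clique size.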
  3. libpmi: drop singleton and dlopen methods

    Problem: the PMI implementation in libpmi is a source of
    confusion because it's used by the flux broker to bootstrap
    in non-Flux environments, AND it is the PMI library provided
    to users for launching under Flux.
    
    We can significantly dumb this thing down if we assume it is
    only used by users running under Flux, and it becomes more
    understandable when looked at only from that perspective.
    Drop singleton and dlopen (wrap) methods, and in fact drop the
    method abstraction and just assume the "simple client" is the
    only method.
    
    Update pmi unit tests.
    garlick committed Jun 5, 2019
    67a163f
  4. libpmi: add some arg checks

    Problem: PMI library doesn't consistently protect against
    NULL parameters used to return values (segfault).
    
    Return PMI_ERR_INVALID_ARG in those cases.
    garlick committed Jun 5, 2019
    1ba24e9
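The argument check described in commit 4 might look roughly like the following sketch. The struct and function names are hypothetical stand-ins, and the numeric error values are assumed to mirror the conventional PMI-1 pmi.h definitions; this is not the code from the commit.

```c
#include <stddef.h>

/* Assumption: these mirror the conventional PMI-1 error codes
 * (PMI_SUCCESS and PMI_ERR_INVALID_ARG in pmi.h). */
#define SKETCH_PMI_SUCCESS          0
#define SKETCH_PMI_ERR_INVALID_ARG  3

/* Hypothetical stand-in for the library's client state. */
struct sketch_client {
    int rank;
    int size;
};

/* Return the rank through an output parameter, rejecting NULL pointers
 * up front instead of dereferencing them (which would segfault). */
int sketch_get_rank (const struct sketch_client *c, int *rank)
{
    if (!c || !rank)
        return SKETCH_PMI_ERR_INVALID_ARG;
    *rank = c->rank;
    return SKETCH_PMI_SUCCESS;
}
```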
  5. libpmi: fix strtoul() error handling

    Problem: strtoul() and related functions are used without
    proper error handling.
    
    As recommended in strtoul(3), one must set errno = 0 before
    the call and check for errno != 0 after the call.
    garlick committed Jun 5, 2019
    fdf97cd
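The errno pattern recommended by strtoul(3), as referenced in commit 5, can be sketched as follows. The wrapper name `parse_ulong_sketch` is hypothetical, and the end-pointer checks go beyond what the commit message describes; they are included because errno alone does not detect empty or partially numeric input.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical wrapper: parse an unsigned decimal integer from s.
 * Returns 0 and stores the value in *out on success, -1 on failure. */
int parse_ulong_sketch (const char *s, unsigned long *out)
{
    char *endptr;
    unsigned long val;

    if (!s || !out)
        return -1;
    errno = 0;                      /* strtoul() never clears errno itself */
    val = strtoul (s, &endptr, 10);
    if (errno != 0)                 /* e.g. ERANGE when the value overflows */
        return -1;
    if (endptr == s || *endptr != '\0')
        return -1;                  /* no digits, or trailing characters */
    *out = val;
    return 0;
}
```

Because strtoul() can legitimately return 0 or ULONG_MAX for valid input, the return value alone cannot signal failure, which is why errno has to be cleared first and inspected afterwards.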
  6. libpmi: move clique emulation to simple_client

    Problem: the code for obtaining the "clique" information
    from the 'PMI_process_mapping' key hardwires canonical
    PMI API functions, but a user (internal for now) might want
    to use the handle-based simple_client functions instead.
    
    Move this code to the simple_client class.  Currently this is
    used behind the canonical clique functions in the Flux libpmi.so,
    but not internally.  In the past it was used to assist with
    PGM wireup.
    
    Update test/pminfo.
    garlick committed Jun 5, 2019
    2b45fb6
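To make the clique emulation in commit 6 concrete, here is a rough sketch of deriving the set of co-located ranks from a 'PMI_process_mapping' value. It assumes the commonly used "(vector,(nodeid,nnodes,ppn),...)" encoding, where each block assigns ppn consecutive ranks to each of nnodes consecutive node ids. The function names are hypothetical and this is not the simple_client code from the commit.

```c
#include <stdio.h>
#include <string.h>

/* Sketch: return the node id hosting 'rank' according to a mapping of the
 * assumed form "(vector,(nodeid,nnodes,ppn),...)", or -1 on error. */
int mapping_nodeid_sketch (const char *mapping, int rank)
{
    const char *p;
    int base = 0;               /* first rank covered by the current block */
    int nodeid, nnodes, ppn;

    if (!mapping || rank < 0 || strncmp (mapping, "(vector,", 8) != 0)
        return -1;
    p = mapping + 8;
    while ((p = strchr (p, '(')) != NULL) {
        if (sscanf (p, "(%d,%d,%d)", &nodeid, &nnodes, &ppn) != 3
            || nodeid < 0 || nnodes <= 0 || ppn <= 0)
            return -1;
        if (rank < base + nnodes * ppn)
            return nodeid + (rank - base) / ppn;
        base += nnodes * ppn;
        p++;                    /* step past this block's '(' */
    }
    return -1;                  /* rank is not covered by the mapping */
}

/* Sketch: store in clique[] the ranks in [0,size) that share a node with
 * 'rank'.  Returns the clique size, or -1 on error or if max is too small. */
int mapping_clique_sketch (const char *mapping, int rank, int size,
                           int *clique, int max)
{
    int me = mapping_nodeid_sketch (mapping, rank);
    int n = 0;

    if (me < 0 || !clique)
        return -1;
    for (int r = 0; r < size; r++) {
        if (mapping_nodeid_sketch (mapping, r) == me) {
            if (n == max)
                return -1;
            clique[n++] = r;
        }
    }
    return n;
}
```

For example, with the mapping "(vector,(0,2,2))" and four ranks, ranks 0 and 1 land on node 0 and ranks 2 and 3 on node 1, so the clique of rank 3 is {2, 3}. Re-parsing the string for every rank is quadratic but keeps the sketch short.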
  7. broker/boot_pmi: add dlopen/singleton support

    Problem: support for dlopen()ing libpmi.so and for
    falling back to singleton operation was removed
    from the Flux libpmi.so, but the broker needs
    this capability.
    
    Re-add this capability, with significantly reduced
    complexity given that it need not be a faithful
    PMI-1 API interface.
    garlick committed Jun 5, 2019
    f2206b2
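The dlopen-then-singleton fallback described in commit 7 could be sketched along these lines. The struct, the function name, and the decision order are assumptions for illustration only, not the broker's boot_pmi code. Only the standard PMI-1 entry points PMI_Init, PMI_Get_rank, and PMI_Get_size are looked up. On older glibc versions, linking may require -ldl.

```c
#include <dlfcn.h>
#include <stddef.h>

/* Hypothetical bootstrap state for this sketch. */
struct boot_sketch {
    void *dso;                  /* dlopen() handle, or NULL if singleton */
    int rank;
    int size;
};

/* Try the PMI library at 'path'; on any failure fall back to singleton
 * operation (rank 0 of 1).  Returns 1 if the library was used, 0 if the
 * singleton fallback was taken, -1 on invalid arguments. */
int boot_sketch_init (struct boot_sketch *b, const char *path)
{
    int (*pmi_init) (int *);
    int (*pmi_get_rank) (int *);
    int (*pmi_get_size) (int *);
    int spawned;

    if (!b)
        return -1;
    b->dso = path ? dlopen (path, RTLD_NOW | RTLD_LOCAL) : NULL;
    if (b->dso) {
        /* POSIX-recommended idiom for storing a dlsym() result
         * in a function pointer. */
        *(void **)&pmi_init = dlsym (b->dso, "PMI_Init");
        *(void **)&pmi_get_rank = dlsym (b->dso, "PMI_Get_rank");
        *(void **)&pmi_get_size = dlsym (b->dso, "PMI_Get_size");
        if (pmi_init && pmi_get_rank && pmi_get_size
            && pmi_init (&spawned) == 0
            && pmi_get_rank (&b->rank) == 0
            && pmi_get_size (&b->size) == 0)
            return 1;
        dlclose (b->dso);
        b->dso = NULL;
    }
    b->rank = 0;                /* singleton: a one-process "job" */
    b->size = 1;
    return 0;
}
```

Because the caller only needs rank, size, and later a few key-value operations, a thin wrapper like this does not have to reproduce the full PMI-1 API, which is the complexity reduction the commit message refers to.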
  8. d33cec8
  9. 88eccc4
  10. testsuite: cover PMI_Get_universe_size()

    Problem: libpmi/test/pminfo.c didn't call PMI_Get_universe_size().
    
    Add a call.
    garlick committed Jun 5, 2019
    78c35b5
  11. testsuite: refactor PMI simple test for server reuse

    Pull out the server thread to its own reusable module,
    and change it slightly so that it could support multiple
    clients.
    garlick committed Jun 5, 2019
    ff4d053
  12. testsuite: add test of canonical PMI-1 API

    Problem: the PMI_*() API functions don't have unit
    tests per se.
    
    Add a unit test for these interfaces.
    garlick committed Jun 5, 2019
    60847e1