Skip to content

wikimedia/service-runner

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

service-runner

Generic nodejs service runner & supervisor

Features

  • Supervise and cluster node services in a generic manner with a minimal interface:
module.exports = function (options) {
    var config = options.config;
    // Logger instance
    var logger = options.logger;
    // Metrics reporter (statsd,log)
    var metrics = options.metrics;

    // Start the app, returning a promise.
    // Return an object with a `close()` function for clean shut-down support.
    // (ex: node's HTTP server instances).
    return startApp(config, logger, metrics);
}
  • standard command line parameters:
Usage: service-runner.js [command] [options]

Commands:
  docker-start  starts the service in a Docker container
  docker-test   starts the test process in a Docker container
  build         builds the service's package and deploy repo

Options:
  -n, --num-workers  number of workers to start                    [default: -1]
  -c, --config       YAML-formatted configuration file
                                             [string] [default: "./config.yaml"]
  -f, --force        force the operation to execute   [boolean] [default: false]
  -d, --deploy-repo  build only the deploy repo       [boolean] [default: false]
  -r, --review       send the patch to Gerrit after building the repo
                                                      [boolean] [default: false]
  --verbose          be verbose                       [boolean] [default: false]
  -v, --version      print the service's version and exit
                                                      [boolean] [default: false]
  -h, --help         Show help                                         [boolean]
  • config loading
  • flexible logging using bunyan, including logstash support via gelf: logger.log('info/request', { message: 'foo', uri: req.uri })
  • metric reporting using statsd, logging, and/or Prometheus. (See lib/metrics/index.js:Metrics.makeMetric())
  • heap dumps

Usage

npm install --save service-runner

As a binary

In package.json, configure npm start to call service-runner:

  "scripts": {
    "start": "service-runner"
  }

Create a config.yaml file following the spec below. Make sure to point the module parameter to your service's entry point.

Finally, start your service with npm start. You can also pass parameters to service-runner like this: npm start -- -c /etc/yourservice/config.yaml.

As a library

Service-runner can also be used to run services within an application. This is can also be used to run services for testing or other purposes.

Example script for starting a service, using commandline options:

var ServiceRunner = require('service-runner');
new ServiceRunner().start();

It is also possible to skip commandline options, and pass in a config directly to ServiceRunner.start() (see the config section for details on available options). Here is an example demonstrating this, as well as return values & the stop() method:

var ServiceRunner = require('service-runner');
var runner = new ServiceRunner();

var startupPromise = runner.start({
    num_workers: 0,
    services: [{
        name: 'parsoid',
        conf: {...}
    }],
    logging: {...},
})
.then(function(startupResults) {
    // startupResults is an array of arrays of objects returned by each
    // service. These objects should be JSON.stringify()-able.
})
.then(function() {
    // To stop a service, call the stop() method
    return runner.stop();
});

Config loading

  • Default config locations in a project: config.yaml for a customized config, and config.example.yaml for an example config for a service.
  • You can specify the location of the configuration file in two ways: by using the -c/--config command-line option; or by setting the APP_CONFIG_PATH environment variable. If both are specified, the environment variable takes precedence.
  • By default, we assume that your project depends on service-runner and you follow standard node project layout. However, if a custom layout is used, you must override the app base path with either:
    • APP_BASE_PATH environment variable
    • app_base_path config stanza.
  • If the project requires cancellable promises (which are disabled by default) you must set the APP_ENABLE_CANCELLABLE_PROMISES environment variable to a non-empty and truth-y value (like 1 or true). For more information about cancellable promises please refer to the Bluebird documentation.
  • Default top-level config format (draft):
# Number of worker processes to spawn.
# Set to 0 to run everything in a single process without clustering.
num_workers: ncpu

# Number of workers to start in parallel after the first worker.
# The first worker is always started independently. After it has completed
# its start-up, this number controls the number of workers to start in
# parallel until `num_workers` have been started. Note that setting this
# number to a too high a value might lead to high resource consumption
# (especially of CPU) during the start-up process.
startup_concurrency: 2

# Number of milliseconds to wait for a heartbeat from worker before killing
# and restarting a worker. 'false' means disabling the heartbeat feature. 
worker_heartbeat_timeout: 7500

# Logger info
logging:
  level: info
  # Sets up sample logging for some 'interesting' events.
  # Map keys correspond to the full log level names.
  # Map values specify the probability for such events to be logged
  # regardless of the configured logging level.
  sampled_levels:
    'trace/webrequest': 0.2
  streams:
  - type: stdout # log to stdout
    named_levels: true # emit log level name instead of index. e.g. INFO vs 30
  # Use gelf-stream -> logstash
  - type: gelf
    host: logstash1003.eqiad.wmnet
    port: 12201
    # Alternatively you can provide a comma-separated list of host:port pairs,
    # the server to use will be selected randomly on startup
    # uris: logstash1001.eqiad.wmnet:12201,logstash1003.eqiad.wmnet:12201

# Statsd metrics reporter
metrics:
  - type: statsd
    host: localhost
    port: 8125
    batch: # Metrics batching options. Supported only for `statsd` reporter type
      max_size: 1500 # Max size of the batch buffer (default: 1500)
      max_delay: 1000  # Max delay for an individual metric in milliseconds (default: 1000)

# Prometheus metrics endpoint
  - type: prometheus
    port: 9000

# Rate limiter (enabled by default)
ratelimit:
  type: memory
  # optional: Kademlia backend
  # type: kad
  # seeds:
  #  - 192.0.2.10
  #  - 192.0.2.20

# DNS caching, switched on by default. To disable caching use:
# dns_cache: false
# To specify caching parameters use:
dns_cache:
  ttl: 5 # Time-to-live for cache entries, in seconds. Default: 5
  size: 100 # Optional cache size. Default: 100  

services:
  - name: parsoid
    # a relative path or the name of an npm package, if different from name
    # module: ./lib/server.js

    # optionally, a version constraint of the npm package
    # version: ^0.4.0

    # per-service config
    conf:
        port: 12345
        interface: localhost
        # more per-service config settings

In the configuration file itself, you can also use environment variables:

field: '{env(ENV_VAR_NAME[,default_value])}'

The service's environment will be inspected, and if the value of ENV_VAR_NAME is defined, it will be used in the configuration. Additionally, one can also supply a default value in case the environment does not contain the sought value.

All file paths in the config are relative to the application base path. The base path is an absolute path to the folder where your application is located (where package.json file is located).

We are also working on a standard template for node services, which will set up this & other things for you.

Metric reporting

We basically expose the node-statsd interface:

// Timing: sends a timing command with the specified milliseconds
options.metrics.timing('response_time', 42);

// Increment: Increments a stat by a value (default is 1)
options.metrics.increment('my_counter');

// Decrement: Decrements a stat by a value (default is -1)
options.metrics.decrement('my_counter');

// Histogram: send data for histogram stat
options.metrics.histogram('my_histogram', 42);

// Gauge: Gauge a stat by a specified amount
options.metrics.gauge('my_gauge', 123.45);

// Set: Counts unique occurrences of a stat (alias of unique)
options.metrics.set('my_unique', 'foobar');
options.metrics.unique('my_unique', 'foobarbaz');

// Incrementing multiple items
options.metrics.increment(['these', 'are', 'different', 'stats']);

// Sampling, this will sample 25% of the time
// the StatsD Daemon will compensate for sampling
options.metrics.increment('my_counter', 1, 0.25);

// Tags, this will add user-defined tags to the data
options.metrics.histogram('my_histogram', 42, ['foo', 'bar']);

All metrics are automatically prefixed by the config-provided service name / graphite hierachy prefix to ensure a consistent graphite metric hierarchy.

Rate limiting

Service-runner provides an efficient ratelimiter instance backed by limitation. All per-request checks are done in-memory for low latency and minimal overhead.

To enforce a limit:

// Sets limit to 10 req/s, returns true if above limit.
var isAboveLimit = options.ratelimiter.isAboveLimit('some_limit_key', 10);

Several backends are supported. By default, a simple in-memory backend is used. For clusters, a Kademlia DHT based backend is available. Basic Kademlia configuration:

ratelimiter:
  type: kademlia
  # Cluster nodes
  seeds:
    # Port 3050 used by default
    - 192.168.88.99

Advanced Kademlia options:

ratelimiter:
  type: kademlia
  # Cluster nodes
  seeds:
    # Port 3050 used by default
    - 192.168.88.99
    - address: some.host.com
      port: 6030

  # Optional
  # Address / port to listen on
  # Default: localhost:3050, random port fall-back if port used
  listen:
    address: localhost
    port: 3050
  # Counter update / block interval; Default: 10000ms
  interval: 10000

Worker status tracking

At any point of the execution the service can emit a service_status message to update the worker status. Statuses are tracked and reported when the worker dies or is killed on a timeout, which is useful for debugging failure reasons.

To emit a status update use the following code:

process.emit('service_status', {
   type: 'request_processing_begin',
   uri: req.uri.toString(),
   some_other_property: 'some_value'
})

Note: The status message could be an arbitrary object, however it must not contain cyclic references.

Issue tracking

Please report issues in the service-runner phabricator project.

See also