LSD API and Reference

Mario Juric edited this page Jan 1, 2017 · 1 revision

Introduction

PyDoc-generated documentation (ugly but informative) can be found at http://nebel.rc.fas.harvard.edu/mjuric/lsd/pydocs/lsd.html. It can also be read from ipython using the help command (e.g., try import lsd; help(lsd.DB) from an ipython instance).

The classes (correctly) documented as of v0.3 are (in order of importance):

  • DB (in submodule join_ops)
  • Query (in submodule join_ops)
  • iarray (in submodule join_ops)
  • ColGroup (in submodule colgroup)
  • intervalset (in submodule interval)
  • Table (in submodule table)

and all functions in modules:

  • bhpix
  • bounds

Please disregard all others until they're documented better.

Important classes (and methods) to know about

  • DB - main database manager; you get references to everything else via a DB instance
  • ColGroup - a functional (and API-wise) equivalent of a numpy structured array, with data internally stored in columns. All LSD functions returning rows return them as instances of this class.
  • Query - the class representing a query. Obtain an instance by calling DB.query()
    • Query.fetch() - execute the query, returning the entire result as a single ColGroup
    • Query.iterate() - execute the query, yielding the results row by row, or in blocks of rows (see its {{{return_blocks}}} argument). This function is a generator.
    • Query.execute() - execute the query, passing its results to a chain of MapReduce kernels, yielding back the result of the final kernel. This function is a generator.
  • Table - the class representing an LSD Table. Obtain an instance by calling DB.table(). Usually not meant to be used directly
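The difference between fetching a whole result and iterating over it row by row or block by block can be illustrated with a toy stand-in. This is only a sketch: ToyQuery and count_rows are hypothetical names that are not part of LSD, plain lists of tuples stand in for ColGroup instances, and treating return_blocks as a boolean keyword is an assumption rather than something stated above.

```python
class ToyQuery:
    """Hypothetical stand-in that mimics the fetch()/iterate() split; not LSD code."""

    def __init__(self, rows, block_size=2):
        self._rows = list(rows)
        self._block_size = block_size

    def fetch(self):
        # Return the entire result in one piece (LSD returns a single ColGroup;
        # a plain list is used here).
        return list(self._rows)

    def iterate(self, return_blocks=False):
        # Generator: yield single rows, or whole blocks of rows when
        # return_blocks is true (assumed to be a boolean flag).
        for start in range(0, len(self._rows), self._block_size):
            block = self._rows[start:start + self._block_size]
            if return_blocks:
                yield block
            else:
                for row in block:
                    yield row


def count_rows(query):
    # Consume the result block by block so the full result is never
    # held in memory at once.
    return sum(len(block) for block in query.iterate(return_blocks=True))


rows = [(10.0, 20.0), (11.0, 21.0), (12.0, 22.0)]
q = ToyQuery(rows)

all_rows = q.fetch()                          # everything at once
row_by_row = list(q.iterate())                # one row per iteration
blocks = list(q.iterate(return_blocks=True))  # one block per iteration
```

Iterating in blocks tends to suit large results, since per-row overhead is amortized while memory use stays bounded by the block size.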

Environment variables influencing LSD behavior

As of v0.3, the following variables can be used to control aspects of LSD behavior.

  • LSD_DB=<dbdir>: The default database directory. Unless overridden by the {{{--db}}} parameter, LSD command line utilities will look for tables in the directory specified by this variable.
  • NWORKERS=<integer>: Number of worker processes to launch. Equal to the number of logical cores by default. If NWORKERS=1, no separate worker processes will be launched, and any computation delegated to the workers will be performed within the main process.
  • DEBUG=<0|1>: Turn debugging on. Has the following effects:
    • Sets NWORKERS=1 (single threaded computation)
    • Sets LOGLEVEL=debug (logging of debug messages)
  • LOG=<filename.log>: Sets the name of the log file to {{{filename.log}}}. By default, LSD appends log messages to {{{./lsd.log}}}.
  • LOGLEVEL=(info|debug): If set to 'debug', activates logging of debug messages (those emitted with {{{logging.debug()}}}).
  • PIXLEVEL=<integer>: The BHpix table pixelization level. Effective only for newly created tables. The current default is 7.
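The way these variables combine can be sketched as a small resolver. This is a minimal illustration, not LSD's own code: resolve_lsd_settings is a hypothetical helper, the 'info' default for LOGLEVEL is an assumption (the default is not stated above), and so is the choice that DEBUG=1 takes precedence over an explicitly set NWORKERS.

```python
import multiprocessing
import os


def resolve_lsd_settings(env=None):
    """Hypothetical helper: turn the documented variables into a settings dict."""
    env = os.environ if env is None else env

    settings = {
        # Default database directory; None if LSD_DB is unset.
        "db": env.get("LSD_DB"),
        # Defaults to the number of logical cores.
        "nworkers": int(env.get("NWORKERS", multiprocessing.cpu_count())),
        # Log messages are appended to ./lsd.log by default.
        "log": env.get("LOG", "./lsd.log"),
        # Assumption: 'info' is the default level.
        "loglevel": env.get("LOGLEVEL", "info"),
        # Current default pixelization level is 7.
        "pixlevel": int(env.get("PIXLEVEL", 7)),
    }

    # DEBUG=1 forces single-process computation and debug logging
    # (assumed here to override explicit NWORKERS/LOGLEVEL values).
    if env.get("DEBUG", "0") == "1":
        settings["nworkers"] = 1
        settings["loglevel"] = "debug"

    return settings


defaults = resolve_lsd_settings({})
debugging = resolve_lsd_settings({"DEBUG": "1", "NWORKERS": "8"})
```

Passing a plain dict instead of os.environ keeps the sketch easy to try without touching the real environment.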

Module files

lsd/bhpix.py		# Functions related to BHpix projection
lsd/bounds.py		# Functions to construct space/time bounds
lsd/colgroup.py		# ColGroup class (and supporting classes)
lsd/dvo.py		# DVO->LSD import routines (!!outdated!!)
lsd/fcache.py		# Tablet directory tree cache (TabletTreeCache class)
lsd/interval.py		# intervalset class, for time intervals
lsd/join_ops.py		# class DB and all of the query logic (should be refactored)
lsd/pixelization.py	# Pixelization class (mapping of coords to cells and IDs)
lsd/pool2.py		# multiprocessing support (Pool class)
lsd/query_parser.py	# Parser for LSD queries
lsd/sdss.py		# SDSS Sweep files->LSD import routines
lsd/smf.py		# PS1 .smf files->LSD import routines
lsd/table.py		# Table class (representation of a table in the database)
lsd/tasks.py		# various misc. tasks (TODO: refactor)
lsd/tui.py		# Commonly used routines for cmdline programs
lsd/utils.py		# Various utility functions