A Python package for extracting and simulating diffuse scattering of X-rays in protein crystals
If you use this software, please cite the following paper: TBC
- phenix/cctbx: runs in cctbx.python (2.7.x / 3.x) and depends on cctbx modules
- dials/dxtbx: used for reading crystallographic image files
- numpy
- scipy
- matplotlib
Make sure cctbx.python is available and the dxtbx package is installed.
Clone the repository
git clone https://github.com/jvdhorn/decaf
Update PATH and PYTHONPATH
source source_env.sh
Optional: install package
cctbx.python -m pip install .
Command-line parameters can be provided using Phil syntax. For example:
schimpy pdb_in=structure.pdb tls_in=optimised_model.json
Run any of the modules without arguments to get an overview of the available parameters for that module.
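When the modules are driven from a script, the same `name=value` arguments can be assembled programmatically. The sketch below only illustrates that calling convention; `build_command` is a hypothetical helper and not part of this package, and it does not use the Phil parser itself.

```python
import shlex

def build_command(module, **params):
    """Return an argv list such as ['schimpy', 'pdb_in=structure.pdb'].

    Hypothetical helper for illustration; not part of the package.
    """
    argv = [module]
    for name, value in params.items():
        # Sequences become one space-separated value, e.g. sc_size="5 5 10"
        if isinstance(value, (list, tuple)):
            value = " ".join(str(v) for v in value)
        argv.append("{}={}".format(name, value))
    return argv

cmd = build_command("schimpy",
                    pdb_in="structure.pdb",
                    tls_in="optimised_model.json",
                    sc_size=(5, 5, 10))

# Shell-quoted form, suitable for pasting into a terminal
print(" ".join(shlex.quote(arg) for arg in cmd))
```

The resulting list can be handed to `subprocess.run(cmd)` once the modules are on PATH.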
- schimpy - SuperCell Hierarchical Model
- stimpy - Statistical Image-processing
- slicemtz - Viewing utility for mtz-file
- submtz - Subtract two mtz-files
- plotmtz - Plot intensity distributions in mtz-files
- ccmtz - Calculate correlation coefficient and R1-values between mtz-files
- rmbragg - Remove supercell voxels around Bragg positions in mtz-file
- mtzstats - Show various statistics including R-int for mtz-file
- stimpy3d - Apply the stimpy-procedure to resolution bins in mtz-file
- patterson - Calculate Patterson map from mtz-file
- pattsize - Estimate the size of the Patterson origin-peak
- mtz2txt - Extract raw intensities from mtz-file
- mtz2map - Convert mtz to ccp4 map-file
- map2mtz - Convert ccp4 map-file to mtz
- filter_mtz - Apply a kernel-filter to an mtz-file
- qdep - Find power law for intensity decay around Bragg positions in mtz-file
- pdbrad - Estimate the size of a pdb-object
- pdbdist - Plot C-alpha covariance matrix for multistate pdb
- covbydist - Plot C-alpha covariances by distance and fit decay function
- ensemble2adp - Convert multistate pdb to (anisotropic) ADPs
- subadp - Subtract ADPs from two pdb-files
- btrace - Plot C-alpha B-factor trace
schimpy parameters:
- pdb_in - refined structure file (pdb)
- tls_in - result of the ECHT B-factor distribution (json)
  - if not provided, TLS-matrices are extracted from pdb_in (if available)
- sc_size - size of the supercell (e.g. "5 5 10")
- correlate - enable or disable correlation of TLS-groups (default True)
- use_pbc - enable or disable periodic boundary conditions (default True)
- stretch - stretch parameter for the anchor points (default 0.25)
- cutoff - distance cutoff for intermolecular interactions (default 3.0)
- weights - power for the number of interactions (default 1.5)
- max_level - highest level of the TLS-hierarchy to include (default 1)
- resolution - resolution limit (default 2.0)
- k_sol and b_sol - bulk solvent parameters (default 0.35 and 50.0)
- tls_multipliers - multipliers for all input tls matrices (e.g. "1.0 0.0 0.0")
- skip - skip correlation (but not displacements!) for these levels (e.g. "2 3 4")
- randomize_after - randomly swap positions of all molecules after this level (e.g. 2)
- reverse - start procedure at highest level of the hierarchy (default True)
- remove_waters - remove all water molecules from the input (default True)
- swap_frac - end correlation prematurely after this fraction of swaps (default 1.0)
- energy_percentile - only allow swaps for which local energy exceeds this percentile (default 0.0)
- single_mtz - write MTZ file with phases after first supercell (default False)
- processes - number of parallel simulations (default 1, watch memory usage!)
- interval - number of seconds between consecutive simulations (default 1.0)
- n_models - number of supercells to simulate (default 128)
stimpy parameters:
- image - raw image file
- radial - polarization-corrected radial average
- bin_counts - bin regions within this range of counts (default 1.0)
- N - expected number of independent rotations (default 1.0)
- discrete - estimate background using discrete noisy Wilson statistics (default False)
- median_size - kernel size of the median filter (default 9)
- dilation_size - kernel size of the mask dilation (default 5)
slicemtz parameters:
- mtz - input mtz or map-file
- lbl - array of interest (default IDFF)
- slice - desired slice to plot (default hk0)
- save - save image as high-res PNG instead of plotting (default False)
- depth - additional depth of the slice on both sides (default 0)
- sc_size - size of the supercell for drawing Bragg positions (e.g. "5 5 10")
- overlay - colour of axes and Bragg position indicators (default black)
- log - plot log10 of the intensities (default False)
- min and max - minimum and maximum values to plot
- autoscale - decrease upper bound for more detail in skewed slices (default True)
- zoom - zoom in around a given coordinate ("[x] [y] [pad]", e.g. "65 65 27")
- inset - inset zoom around a given coordinate ("[x] [y] [pad]", e.g. "65 65 27")
- contours - plot this many contours instead of raw values (e.g. 127)
- projection - construct a cylindrical projection (gp, eq or mercator)
- center - shift grid to put the corner in the center (default False)
submtz parameters:
- mtz_1 - first input mtz-file
- mtz_2 - second input mtz-file (can be multiple)
- lbl_1 - first array of interest (default IDFF)
- lbl_2 - second array of interest (default IDFF)
- mode - use a different operator (sub, add, mul, div or merge, default sub)
- scale - scale factor for second mtz-file (0 for autoscale, default 1.0)
plotmtz parameters:
- mtz - input mtz-file (can be multiple)
- lbl - array of interest (default IDFF)
- log - plot logarithmic distributions (default True)
- byres - plot average intensity in resolution bins (default False)
- resolution - low and high resolution (e.g. "3.6 3.4")
ccmtz parameters:
- mtz_1 - first input mtz-file
- mtz_2 - second input mtz-file
- lbl_1 - first array of interest (default IDFF)
- lbl_2 - second array of interest (default IDFF)
- bins - number of resolution shells (default 10)
- hlim and klim and llim - limit h, k, and l (e.g. "-20 20")
- resolution - low and high resolution (e.g. "3.6 3.4")
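As an illustration of the two statistics reported by ccmtz, the sketch below computes a Pearson correlation coefficient and an R1-value for two intensity arrays with NumPy. The exact R1 definition used by ccmtz is not stated here; the form sum|I1 - I2| / sum|I1| is an assumption, and `cc_and_r1` is a hypothetical name.

```python
import numpy as np

def cc_and_r1(i_1, i_2):
    """Pearson CC and R1 for two intensity arrays on the same reflections.

    Assumed definition: R1 = sum|I1 - I2| / sum|I1|.
    """
    i_1 = np.asarray(i_1, dtype=float)
    i_2 = np.asarray(i_2, dtype=float)
    # Use only reflections that are present (finite) in both arrays
    ok = np.isfinite(i_1) & np.isfinite(i_2)
    a, b = i_1[ok], i_2[ok]
    cc = np.corrcoef(a, b)[0, 1]
    r1 = np.abs(a - b).sum() / np.abs(a).sum()
    return cc, r1

cc, r1 = cc_and_r1([10.0, 20.0, 30.0, np.nan],
                   [12.0, 18.0, 33.0, 5.0])
print(cc, r1)
```

Applying the same function to the reflections of each resolution shell would give per-shell values analogous to the `bins` option.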
rmbragg parameters:
- mtz - input mtz-file
- sc_size - supercell size (e.g. "5 5 10")
- box - number of voxels to remove around every Bragg position (default "1 1 1")
- fraction - remove this fraction of highest intensities in every box (e.g. 0.05)
- subtract - subtract common intensities in resolution shells (mean or min)
- keep - invert selection (default False)
mtzstats parameters:
- mtz - input mtz-file
- lbl - array of interest (default IDFF)
- sg - space group for R-int (symbol or number, e.g. P43212 or 96)
- resolution - low and high resolution (e.g. "3.6 3.4")
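For reference, R-int measures how well symmetry-equivalent observations agree. The sketch below uses the common definition sum|I - mean(I)| / sum(I) over groups of equivalents; it assumes the reflections have already been mapped to a unique index under the chosen space group (that mapping is not shown), and `r_int` is a hypothetical helper rather than the mtzstats implementation.

```python
from collections import defaultdict

def r_int(observations):
    """observations: iterable of (unique_hkl, intensity) pairs.

    Assumes symmetry-equivalent reflections already share the same unique_hkl.
    """
    groups = defaultdict(list)
    for hkl, intensity in observations:
        groups[hkl].append(float(intensity))

    num = den = 0.0
    for values in groups.values():
        if len(values) < 2:
            continue  # a single observation says nothing about consistency
        mean = sum(values) / len(values)
        num += sum(abs(v - mean) for v in values)
        den += sum(values)
    return num / den if den else float("nan")

obs = [((1, 0, 0), 10.0), ((1, 0, 0), 12.0),
       ((1, 1, 0), 5.0), ((1, 1, 0), 7.0),
       ((2, 0, 0), 3.0)]
print(r_int(obs))
```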
stimpy3d parameters:
- mtz - input mtz-file
- lbl - array of interest (default IDFF)
- bins - number of resolution shells (default 1)
- N - expected number of independent rotations (default 1.0)
patterson parameters:
- mtz - input mtz-file
- lbl - array of interest (default IDFF)
- bins - divide into this many shells and subtract the mean from each one (e.g. 50)
- sample - sampling of the grid in 1/Angstrom, higher is finer (default 3.0)
- use_intensities - calculate Patterson using intensities instead of F (default False)
- center - place the origin in the center of the map (default True)
- limit - real space limit in Angstrom of the output map (e.g. 10.0)
- resolution - low and high resolution (e.g. "3.6 3.4")
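The idea behind a Patterson map can be shown in one dimension: the inverse Fourier transform of the squared amplitudes equals the autocorrelation of the density. The sketch below is a textbook illustration on a toy 1D grid, not the code used by the patterson module; the final `fftshift` mirrors what an option like `center` does conceptually.

```python
import numpy as np

# Toy 1D "density" with two atoms, 7 grid points apart
rho = np.zeros(32)
rho[3], rho[10] = 2.0, 1.0

f = np.fft.fft(rho)                 # structure factors
intensities = np.abs(f) ** 2        # |F|^2, phases are discarded
patterson = np.fft.ifft(intensities).real

# Peaks appear at the origin and at the interatomic vectors +7 and -7
print(patterson[0], patterson[7], patterson[-7])

# Put the origin peak in the middle of the array
centered = np.fft.fftshift(patterson)
```

The origin peak equals the sum of squared densities, and each interatomic vector contributes a peak weighted by the product of the two densities.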
pattsize parameters:
- map - input patterson map
- binsize - size of radial bins in Angstrom (default 1.0)
- sigma - sigma cutoff to determine size (default 2.0)
mtz2txt parameters:
- mtz - input mtz-file
- lbl - array of interest (default IDFF)
- resolution - low and high resolution (e.g. "3.6 3.4")
mtz2map parameters:
- mtz - input mtz-file
- lbl - array of interest (default IDFF)
- fill - fill value for missing reflections (default -1000)
- resolution - low and high resolution (e.g. "3.6 3.4")
map2mtz parameters:
- map - input map-file
- resolution - resolution limit (e.g. 2.0)
filter_mtz parameters:
- mtz - input mtz-file
- lbl - array of interest (default IDFF)
- size - filter size (default 1)
- filter - filter type (gaussian or uniform, default gaussian)
- interpolate - interpolate missing intensities using a normalized convolution (default False)
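A normalized convolution, as named by the `interpolate` option, divides the filtered data by the filtered mask of known values, so that missing points do not drag the result towards zero. The 1D sketch below assumes missing intensities are stored as NaN and uses a hypothetical function name; how filter_mtz represents missing data internally is not specified here.

```python
import numpy as np

def normalized_convolution(values, kernel):
    """Smooth `values` with `kernel`, ignoring NaN entries.

    Result = conv(values * known) / conv(known); NaN where no known
    value falls under the kernel.
    """
    values = np.asarray(values, dtype=float)
    known = np.isfinite(values)
    filled = np.where(known, values, 0.0)

    num = np.convolve(filled, kernel, mode="same")
    den = np.convolve(known.astype(float), kernel, mode="same")

    out = np.full_like(num, np.nan)
    np.divide(num, den, out=out, where=den > 0)
    return out

result = normalized_convolution([1.0, np.nan, 3.0], np.ones(3))
print(result)
```

In this example the missing middle value is filled with the average of its two neighbours.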
qdep parameters:
- mtz - input mtz-file
- lbl - array of interest (default IDFF)
- sc_size - supercell size (e.g. "5 5 10")
- strong - consider only this number of strongest reflections (default 100)
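A power law I(q) = A * q^n becomes a straight line in log-log space, so its exponent can be estimated with a linear fit. The sketch below shows this on synthetic data; treating q as the distance from a Bragg position and the function name `fit_power_law` are assumptions made for illustration, and the fitting procedure in qdep may differ.

```python
import numpy as np

def fit_power_law(q, intensity):
    """Fit I = A * q**n by linear regression of log(I) on log(q)."""
    q = np.asarray(q, dtype=float)
    intensity = np.asarray(intensity, dtype=float)
    # Logarithms require strictly positive values
    ok = (q > 0) & (intensity > 0)
    slope, intercept = np.polyfit(np.log(q[ok]), np.log(intensity[ok]), 1)
    return np.exp(intercept), slope  # amplitude A, exponent n

# Synthetic decay with A = 3 and n = -2
q = np.linspace(0.05, 0.5, 20)
amplitude, exponent = fit_power_law(q, 3.0 * q ** -2.0)
print(amplitude, exponent)
```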
pdbrad parameters:
- pdb - input multistate pdb-file
pdbdist parameters:
- pdb - input multistate pdb-file (can be multiple)
- mode - ensemble treatment (cov, cc, std, var or mean, default cov)
- combine - multiple input treatment (both, sub, div, add or mul, default both)
- lines - plot lines at these x and y-positions (e.g. "25.5 75.5")
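To make the `cov` and `cc` modes concrete, the sketch below computes a C-alpha covariance matrix from an array of model coordinates. It assumes the covariance between two atoms is the model-averaged dot product of their displacement vectors and that coordinates are already loaded as a `(n_models, n_atoms, 3)` array; both the definition and the name `ca_covariance` are assumptions, not taken from pdbdist.

```python
import numpy as np

def ca_covariance(coords):
    """coords: array (n_models, n_atoms, 3) of C-alpha positions.

    Returns the covariance matrix <dr_i . dr_j> and the matching
    correlation-coefficient matrix.
    """
    coords = np.asarray(coords, dtype=float)
    dev = coords - coords.mean(axis=0)           # displacements per model
    cov = np.einsum("mix,mjx->ij", dev, dev) / coords.shape[0]
    std = np.sqrt(np.diag(cov))
    with np.errstate(invalid="ignore", divide="ignore"):
        cc = cov / np.outer(std, std)            # NaN for immobile atoms
    return cov, cc

# Two atoms that move together along x in three models
coords = np.zeros((3, 2, 3))
coords[:, 0, 0] = [0.0, 1.0, 2.0]
coords[:, 1, 0] = [5.0, 6.0, 7.0]
cov, cc = ca_covariance(coords)
print(cov)
print(cc)
```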
covbydist parameters:
- pdb - input multistate pdb-file
- include_neighbours - include neighbouring molecules in the analysis (default True)
ensemble2adp parameters:
- pdb - input multistate pdb-file
- models - limit number of models from the input pdb (e.g. 100)
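Conceptually, an anisotropic ADP for an atom is the 3x3 covariance of its positions across the ensemble, and the equivalent isotropic B-factor follows from B = 8 * pi^2 * trace(U) / 3. The sketch below illustrates that relation under the assumption of a population covariance (division by the number of models); `ensemble_to_adp` is a hypothetical name and the details in ensemble2adp may differ.

```python
import math
import numpy as np

def ensemble_to_adp(coords):
    """coords: array (n_models, n_atoms, 3) in Angstrom.

    Returns U (n_atoms, 3, 3) in Angstrom^2 and the equivalent B_iso.
    """
    coords = np.asarray(coords, dtype=float)
    dev = coords - coords.mean(axis=0)
    # Per-atom covariance of the displacement components
    u = np.einsum("max,may->axy", dev, dev) / coords.shape[0]
    b_iso = 8.0 * math.pi ** 2 * np.trace(u, axis1=1, axis2=2) / 3.0
    return u, b_iso

# One atom displaced along x in three models
coords = np.zeros((3, 1, 3))
coords[:, 0, 0] = [0.0, 1.0, 2.0]
u, b_iso = ensemble_to_adp(coords)
print(u[0])
print(b_iso[0])
```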
subadp parameters:
- pdb_1 - first input pdb-file
- pdb_2 - second input pdb-file (can be multiple)
- mode - use a different operator (sub or add, default sub)
btrace parameters:
- input - input pdb or json (can be multiple)
- lines - plot vertical lines at these x-positions (e.g. "25.5 75.5")
This software relies on methods described in the following papers:
- Chapman, Henry N., et al. "Continuous diffraction of molecules and disordered molecular crystals." Journal of Applied Crystallography 50.4 (2017): 1084-1103.
- Pearce, Nicholas M., and Piet Gros. "A method for intuitively extracting macromolecular dynamics from structural disorder." Nature Communications 12.1 (2021): 5493.
- Urzhumtsev, Alexandre, et al. "From deep TLS validation to ensembles of atomic models built from elemental motions. Addenda and corrigendum." Acta Crystallographica Section D: Structural Biology 72.9 (2016): 1073-1075.