Experimental Phasing of Proteinase K
In this section, we reprocess a subset of images from the Proteinase K praseodymium phasing dataset. The study was published by Sugahara et al. as "Hydroxyethyl cellulose matrix applied to serial crystallography" in Scientific Reports (2017). Hit images from the Cheetah pipeline have been uploaded as CXIDB entry 48.
To save processing time, we use only runs 359465 to 359472. Since each run was split into three blocks, we have 24 files in total, amounting to about 23 GB. Please download them from CXIDB. For those with access to the SACLA HPC, the images are also available at /lfs01/2018A/8023/public/CXIDB-48-ProK-Pr/
To mimic a real-world situation, we do not use the refined geometry but start from scratch with the initial geometry. Download cxidb_48_metadata.tar.gz and take sacla-15oct-10keV-orig.geom from it. As discussed in the README file, the adu_per_eV lines contain errors. Change q1/adu_per_eV = 0.000 to q1/adu_per_eV = 0.001, and repeat the same for q2/adu_per_eV, q3/adu_per_eV, and so on up to q8/adu_per_eV. We call the corrected file orig.geom.
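The eight edits can also be made in one pass. The following is a minimal sketch, assuming that each affected line starts with qN/adu_per_eV and ends with the value 0.000 exactly as quoted above; the function name fix_adu_per_ev is only illustrative.

```shell
# Sketch: rewrite "qN/adu_per_eV = 0.000" to "qN/adu_per_eV = 0.001" for q1..q8.
# Reads the file(s) given as arguments, or standard input, and writes to
# standard output. Lines that do not match are passed through unchanged.
fix_adu_per_ev() {
    sed -E '/^q[1-8]\/adu_per_eV[[:space:]]*=/ s/0\.000[[:space:]]*$/0.001/' "$@"
}

# Usage (file names as in the text):
#   fix_adu_per_ev sacla-15oct-10keV-orig.geom > orig.geom
#   grep adu_per_eV orig.geom    # every line should now read 0.001
```

Writing to a new file rather than editing in place keeps the downloaded geometry available for comparison.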
Following the workflow discussed in this wiki, processing proceeds in six steps: (1) optimizing the detector distance, (2) optimizing the beam center, (3) optimizing the spot finding parameters, (4) bulk indexing, (5) metrology refinement and (6) re-integration.
In the clen-test folder, first make geometry files with varying detector distances.
for len in `seq 490 5 520`; do sed 's/clen.*/clen = 0.0'$len'/' orig.geom > sacla-15oct-$len.geom; done
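Intermediate distances that do not fall on the 5-step grid (493, 498 and 502 are tested later) need an explicit list. A minimal sketch that wraps the same substitution in a function is shown here; the function name make_clen_geoms is hypothetical, and the labels are assumed to be three-digit values such as 498 (meaning clen = 0.0498 m, that is 49.8 mm).

```shell
# Sketch: write one geometry file per detector distance label.
# Usage: make_clen_geoms TEMPLATE LABEL...
make_clen_geoms() {
    local template=$1 len
    shift
    for len in "$@"; do
        # Same substitution as the loop above: replace the clen entry.
        sed "s/clen.*/clen = 0.0${len}/" "$template" > "sacla-15oct-${len}.geom"
    done
}

# Usage:
#   make_clen_geoms orig.geom 493 498 502
#   grep -H clen sacla-15oct-*.geom    # check the values that were written
```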
Also prepare a job submission script, index-dirax-geom.sh:
#!/bin/bash
#PBS -l nodes=1:ppn=14
#PBS -q serial
if [ -n "$PBS_O_WORKDIR" -a "$PBS_ENVIRONMENT" != "PBS_INTERACTIVE" ]; then
cd $PBS_O_WORKDIR
fi
source ~sacla_sfx_app/setup.sh
if [ -z "$TARGET" -o -z "$GEOM" ]; then
echo "please set TARGET and GEOM"
exit 1
fi
TARGET=${TARGET%.lst}
indexamajig -i $TARGET.lst -o dirax-$TARGET-$GEOM.stream -j 14 -g $GEOM -p ../sfx.cell --indexing=dirax --peaks=zaef --threshold=400 --min-gradient=40000 --min-snr=5 --int-radius=3,4,7
The initial unit cell file (../sfx.cell in the command above) is:
CrystFEL unit cell file version 1.0
lattice_type = tetragonal
centering = P
unique_axis = c
a = 67 A
b = 67 A
c = 107 A
al = 90.0 deg
be = 90.0 deg
ga = 90.0 deg
We use only one HDF5 file for testing.
ls ../data/run359469-1.h5 |tee 349649-1.lst
Then submit the jobs, one per geometry file.
for f in sacla*.geom; do qsub -v TARGET=349649-1.lst,GEOM=$f index-dirax-geom.sh; done
The indexing results from index_rate *.stream (columns: stream file, number of images, number of indexed images, indexing rate in %):
dirax-349649-1-sacla-15oct-490.geom.stream 164 138 84.1463
dirax-349649-1-sacla-15oct-493.geom.stream 164 138 84.1463
dirax-349649-1-sacla-15oct-495.geom.stream 164 140 85.3659
dirax-349649-1-sacla-15oct-498.geom.stream 164 139 84.7561
dirax-349649-1-sacla-15oct-500.geom.stream 164 139 84.7561
dirax-349649-1-sacla-15oct-502.geom.stream 164 138 84.1463
dirax-349649-1-sacla-15oct-505.geom.stream 164 134 81.7073
(Since 49.5 mm and 50.0 mm looked equally good, 49.3, 49.8 and 50.2 mm were also tested; these are the 493, 498 and 502 entries above.)
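For readers without the index_rate helper, a table in the same layout can be approximated directly from the stream files. This is only a sketch and not the actual index_rate script: it assumes that each chunk in the stream contains one "Image filename:" line and one "indexed_by = ..." line, with "indexed_by = none" marking an unindexed image. The function name index_rate_sketch is hypothetical.

```shell
# Sketch: per-stream indexing rate (name, images, indexed, percent).
# Assumption: one "Image filename:" and one "indexed_by = ..." line per chunk.
index_rate_sketch() {
    local f
    for f in "$@"; do
        awk -v name="$f" '
            /^Image filename:/ { total++ }
            /^indexed_by = /   { if ($3 != "none") indexed++ }
            END {
                if (total > 0) print name, total, indexed + 0, 100 * indexed / total
                else           print name, 0, 0, 0
            }' "$f"
    done
}

# Usage:
#   index_rate_sketch dirax-*.stream
```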
Use the detector-shift script to correct the beam center, then re-run indexing on the promising geometries:
dirax-349649-1-sacla-15oct-495-predrefine.geom.stream 164 139 84.7561
dirax-349649-1-sacla-15oct-498-predrefine.geom.stream 164 139 84.7561
dirax-349649-1-sacla-15oct-500-predrefine.geom.stream 164 139 84.7561
dirax-349649-1-sacla-15oct-502-predrefine.geom.stream 164 138 84.1463
Inspecting the unit cell distributions in cell_explorer, 49.8 mm looked the best because its distributions were the most symmetrical. The distribution of the a axis length for 49.5 mm had a tail to the left, while that for 50.0 mm had a tail to the right.
The unit cell parameters were updated to 68.4 68.4 108.5 90 90 90.
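The updated values can be written out in the same layout as the initial cell file. The sketch below simply prints such a file for given a and c lengths; the function name write_tetragonal_cell and the output file name in the usage line are hypothetical, and the lattice type, centering and unique axis are copied from the initial cell.

```shell
# Sketch: print a tetragonal P cell file in the layout of the initial cell.
# Usage: write_tetragonal_cell A C   (lengths in angstrom; b is set equal to a)
write_tetragonal_cell() {
    local a=$1 c=$2
    cat <<EOF
CrystFEL unit cell file version 1.0
lattice_type = tetragonal
centering = P
unique_axis = c
a = ${a} A
b = ${a} A
c = ${c} A
al = 90.0 deg
be = 90.0 deg
ga = 90.0 deg
EOF
}

# Usage (hypothetical output name):
#   write_tetragonal_cell 68.4 108.5 > sfx-updated.cell
```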
For spot finding, both zaef and peakfinder8 were tested:
dirax-349649-1-p8-th0-snr4.0-minpix2-lbg3.stream 164 144 87.8049
dirax-349649-1-p8-th0-snr4.5-minpix2-lbg3.stream 164 138 84.1463
dirax-349649-1-p8-th0-snr5.5-minpix2-lbg3.stream 164 143 87.1951
dirax-349649-1-p8-th0-snr5-minpix2-lbg3.stream 164 140 85.3659
dirax-349649-1-p8-th400-snr4.0-minpix2-lbg3.stream 164 143 87.1951
dirax-349649-1-p8-th400-snr4.5-minpix2-lbg3.stream 164 145 88.4146
dirax-349649-1-p8-th400-snr5.5-minpix2-lbg3.stream 164 140 85.3659
dirax-349649-1-zaef-th0-gr100000-snr5.stream 164 138 84.1463
dirax-349649-1-zaef-th0-gr10000-snr3.stream 164 57 34.7561
dirax-349649-1-zaef-th0-gr10000-snr5.stream 164 141 85.9756
dirax-349649-1-zaef-th0-gr200000-snr5.stream 164 139 84.7561
dirax-349649-1-zaef-th0-gr50000-snr5.stream 164 137 83.5366
dirax-349649-1-zaef-th400-gr10000-snr3.stream 164 138 84.1463
dirax-349649-1-zaef-th400-gr10000-snr5.stream 164 138 84.1463
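A scan like the peakfinder8 part of this table can be set up with a nested loop. The sketch below only prints the indexamajig commands so they can be inspected before submission; the list, geometry and cell file names are supplied by the caller, the threshold and SNR grids are read off the stream names above, and the function name print_p8_scan is hypothetical.

```shell
# Sketch: print (not run) one indexamajig command per peakfinder8 setting.
# Usage: print_p8_scan LIST GEOM CELL
print_p8_scan() {
    local list=$1 geom=$2 cell=$3 th snr
    for th in 0 400; do
        for snr in 4.0 4.5 5 5.5; do
            # Output name follows the pattern used in the table above.
            echo indexamajig -i "$list" -g "$geom" -p "$cell" -j 14 \
                -o "dirax-${list%.lst}-p8-th${th}-snr${snr}-minpix2-lbg3.stream" \
                --indexing=dirax --peaks=peakfinder8 --threshold="$th" \
                --min-snr="$snr" --min-pix-count=2 --local-bg-radius=3 \
                --int-radius=3,4,7
        done
    done
}

# Usage:
#   print_p8_scan 349649-1.lst sacla-15oct-498-predrefine.geom ../sfx.cell
```

Printing the commands first makes it easy to check the output names and flags; each line can then be placed in a job script in the same way as index-dirax-geom.sh.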
Based on these results, all images were indexed with the best-performing setting: --indexing=dirax --peaks=peakfinder8 --threshold=400 --min-snr=4.5 --min-pix-count=2 --local-bg-radius=3 --int-radius=3,4,7
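With many parameter combinations, the best setting is easier to spot when the table is sorted by indexing rate. A minimal sketch, assuming the four-column layout shown above with the rate in the fourth column; the function name rank_by_rate is hypothetical.

```shell
# Sketch: rank index_rate-style lines (name, images, indexed, percent),
# highest percentage first. Reads the files given as arguments, or stdin.
rank_by_rate() {
    # LC_ALL=C so that "." is always treated as the decimal separator.
    LC_ALL=C sort -k4,4nr "$@"
}

# Usage:
#   index_rate *.stream | rank_by_rate | head -n 3
```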
5748 out of 6506 images were indexed. The unit cell parameters were updated to 68.4 68.4 108.4 90 90 90.
The resulting stream file (all.stream) was used for sensor metrology refinement with geoptimiser.
$ geoptimiser -g sacla-15oct-498-predrefine.geom -i all.stream -o sacla-15oct-498-opt.geom -c connected -q independent 2>&1 | tee geoptimiser.log
Error for connected group q1: 724 pixels with more than 3 peaks: RMSD = 2.1457 pixels.
Error for connected group q2: 5062 pixels with more than 3 peaks: RMSD = 1.2015 pixels.
Error for connected group q3: 5919 pixels with more than 3 peaks: RMSD = 0.9634 pixels.
Error for connected group q4: 611 pixels with more than 3 peaks: RMSD = 1.0774 pixels.
Error for connected group q5: 605 pixels with more than 3 peaks: RMSD = 2.6680 pixels.
Error for connected group q6: 5789 pixels with more than 3 peaks: RMSD = 1.0072 pixels.
Error for connected group q7: 5677 pixels with more than 3 peaks: RMSD = 0.9425 pixels.
Error for connected group q8: 781 pixels with more than 3 peaks: RMSD = 1.3405 pixels.
Detector-wide error before correction: RMSD = 1.1532 pixels.
Error for connected group q1: 724 pixels with more than 3 peaks: RMSD = 1.0700 pixels.
Error for connected group q2: 5062 pixels with more than 3 peaks: RMSD = 0.8967 pixels.
Error for connected group q3: 5919 pixels with more than 3 peaks: RMSD = 0.8454 pixels.
Error for connected group q4: 611 pixels with more than 3 peaks: RMSD = 1.0447 pixels.
Error for connected group q5: 605 pixels with more than 3 peaks: RMSD = 1.0593 pixels.
Error for connected group q6: 5789 pixels with more than 3 peaks: RMSD = 0.8358 pixels.
Error for connected group q7: 5677 pixels with more than 3 peaks: RMSD = 0.8104 pixels.
Error for connected group q8: 781 pixels with more than 3 peaks: RMSD = 0.9987 pixels.
Detector-wide error after correction: RMSD = 0.8695 pixels.
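The log lists each connected group twice, first before and then after correction. To compare the two side by side, the log can be condensed as in the sketch below, which assumes the line format shown above; the function name rmsd_table is hypothetical.

```shell
# Sketch: condense geoptimiser.log into "group before after" (RMSD in pixels).
# The first occurrence of a group is taken as "before", the second as "after".
rmsd_table() {
    awk '/^Error for connected group/ {
        group = $5
        sub(/:$/, "", group)      # "q1:" -> "q1"
        rmsd = $(NF - 1)          # value between "=" and "pixels."
        if (group in before) print group, before[group], rmsd
        else                 before[group] = rmsd
    }' "$@"
}

# Usage:
#   rmsd_table geoptimiser.log
```

In the log above, the largest improvements are for q1 (2.1457 to 1.0700) and q5 (2.6680 to 1.0593).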
Using this refined geometry, all images were re-processed. Now 5824 out of 6505 images were indexed. Rerunning geoptimiser on the new stream did not improve the RMSD further, so we took this stream file for merging.
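Expressed as indexing rates, the two passes correspond to about 88.3% before and 89.5% after metrology refinement. The small helper below (the name pct is hypothetical) reproduces this arithmetic from the counts quoted in the text.

```shell
# Sketch: percentage of indexed images, to one decimal place.
# Usage: pct INDEXED TOTAL
pct() {
    awk -v n="$1" -v d="$2" 'BEGIN { printf "%.1f\n", 100 * n / d }'
}

# Usage:
#   pct 5748 6506    # bulk indexing with the pre-refinement geometry
#   pct 5824 6505    # re-processing with the refined geometry
```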
TO BE WRITTEN