biochem_fan edited this page Apr 26, 2024 · 7 revisions

CrystFEL 0.11

This is not part of the tutorial; these are my notes on installation and the results of an initial evaluation.

Installation

SACLA HPC does not have direct internet access. Thus, compilation with Meson is tricky.

Meson

wget https://github.com/mesonbuild/meson/releases/download/1.4.0/meson-1.4.0.tar.gz
wget https://www.desy.de/~twhite/crystfel/crystfel-0.11.0.tar.gz
# then rsync to the HPC

tar xzvf meson-1.4.0.tar.gz
unzip ninja-linux.zip  # the Ninja binary release, transferred the same way
mv ninja ~/local/bin/

HDF5 libraries

With CMake, specifying which HDF5 libraries to use is easy; just set the HDF5_ROOT variable. It is harder with Meson.

Older Meson could find HDF5 via h5cc, but this feature was removed in 2020 (why?). Meson now needs pkg-config's *.pc files. These are created only when the HDF5 library is built with CMake, not with Autotools. This missing feature is one of the earliest issues (#8 out of > 4400) in the HDF5 GitHub repository, but it has not been addressed. In summary, we have to build HDF5 with CMake.

Building HDF5 with CMake is very tricky! Carefully follow the guidance in release_docs/INSTALL_CMake.txt.

It is essential that all files and directories have exactly the names the build script expects. For example, zlib-1.3.1.tar.gz is not picked up by the build script, and the resulting library lacks deflate filter support.
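Because a wrong name is silently ignored and only shows up later as a missing filter, it may be worth checking the build directory before starting the long ctest run. The following is a minimal sketch, not part of the HDF5 build system: the expected names are simply the ones used in the commands on this page, and the helper names missing_inputs and check_build_dir are made up for illustration.

```python
import os

# Names taken from the commands on this page (HDF5 1.14.4.2);
# other HDF5 releases will expect different names.
EXPECTED = ("hdf5_plugins.tar.gz", "zlib-1.3.tar.gz", "hdf5-1.14.4-2")


def missing_inputs(present, expected=EXPECTED):
    """Return the expected names that do not occur in 'present'."""
    present = set(present)
    return [name for name in expected if name not in present]


def check_build_dir(build_dir="."):
    """Apply missing_inputs() to the entries of a build directory."""
    return missing_inputs(os.listdir(build_dir))
```

Calling check_build_dir("hdf5-1.14.4.2-build") should return an empty list when everything is in place; any name it returns still has to be renamed or copied.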

wget https://github.com/HDFGroup/hdf5/archive/refs/tags/hdf5_1.14.4.2.tar.gz
# rsync to the HPC
wget https://github.com/HDFGroup/hdf5_plugins/archive/refs/tags/hdf5-1.14.4.tar.gz
# rsync to the HPC and RENAME this to hdf5_plugins.tar.gz
# Why do they name this as "hdf5" not "hdf5_plugins"? Very confusing.
wget https://github.com/madler/zlib/releases/download/v1.3.1/zlib-1.3.1.tar.gz
# rsync to the HPC and RENAME this to zlib-1.3.tar.gz

mkdir hdf5-1.14.4.2-build
cd hdf5-1.14.4.2-build
# put hdf5_plugins.tar.gz and zlib-1.3.tar.gz here
tar xzvf hdf5_1.14.4.2.tar.gz
# The directory name must be EXACTLY like this!
# I would say this is a packaging error.
mv hdf5-hdf5_1.14.4.2 hdf5-1.14.4-2

cp hdf5-1.14.4-2/config/cmake/scripts/CTestScript.cmake .
cp hdf5-1.14.4-2/config/cmake/scripts/HDF5config.cmake .
cp hdf5-1.14.4-2/config/cmake/scripts/HDF5options.cmake .

module load  cmake/3.20.3_gcc-4.8.5
INSTALLDIR=/home/sacla_sfx_app/local ctest -S HDF5config.cmake,BUILD_GENERATOR=Unix -C Release -VV -O hdf5.log

cd ~/local
bash ../packages/hdf5-1.14.4.2-build/build/HDF5-1.14.4.2-Linux.sh
# say No to "Do you want to include the subdirectory HDF5-1.14.4.2-Linux?"

After all this hassle, the generated pkg-config files are broken! We have to fix the prefix path in them.

cd /home/sacla_sfx_app/local/HDF_Group/HDF5/1.14.4.2/lib/pkgconfig
sed -i.bak -e s,packages/hdf5-1.14.4.2-build/HDF_Group/HDF5/1.14.4,local/HDF_Group/HDF5/1.14.4.2,g *.pc

# This replaces the wrong prefix
# prefix=/home/sacla_sfx_app/packages/hdf5-1.14.4.2-build/HDF_Group/HDF5/1.14.4
# with
# prefix=/home/sacla_sfx_app/local/HDF_Group/HDF5/1.14.4.2
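To confirm that the substitution worked, one can check that the prefix recorded in every *.pc file points to a directory that actually exists. This is a rough sketch under the assumption that each file has a plain "prefix=" line as shown above; pc_prefix and stale_pc_files are illustrative names, not pkg-config functionality.

```python
import os


def pc_prefix(pc_text):
    """Return the value of the first 'prefix=' line of a .pc file, or None."""
    for line in pc_text.splitlines():
        if line.startswith("prefix="):
            return line[len("prefix="):].strip()
    return None


def stale_pc_files(pkgconfig_dir):
    """List (file name, prefix) for .pc files whose prefix is not a directory."""
    stale = []
    for name in sorted(os.listdir(pkgconfig_dir)):
        if not name.endswith(".pc"):
            continue
        with open(os.path.join(pkgconfig_dir, name)) as fh:
            prefix = pc_prefix(fh.read())
        if prefix is None or not os.path.isdir(prefix):
            stale.append((name, prefix))
    return stale
```

Running stale_pc_files on the lib/pkgconfig directory should give an empty list after the sed fix; before the fix it would be expected to list every file, since the build directory prefix does not exist.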

The good news is that HDF5 libraries built in this way can read EIGER HDF5 files without setting HDF5_PLUGIN_PATH. This can be confirmed by:

h5dump -d /entry/data/data -s 0 -S 1 -c 1 -k 1 some_EIGER_dataset_data_000001.h5 

CrystFEL 0.11

Because we do not have internet access, we have to disable automatic downloading of dependencies.

wget https://www.desy.de/~twhite/crystfel/crystfel-0.11.0.tar.gz
# rsync to the HPC

tar xzvf crystfel-0.11.0.tar.gz
cd crystfel-0.11.0

# module load gsl/2.7_gcc-4.8.5
# Meson uses pkg-config. The SACLA-provided module sets
# CPATH, LD_LIBRARY_PATH and LIBRARY_PATH but not PKG_CONFIG_PATH.
# Thus one ends up using GSL 2.7's headers but linking to the system-provided GSL 1.15!

module load python/3.8.10_gcc-4.8.5
export PKG_CONFIG_PATH=/home/sacla_sfx_app/local/HDF_Group/HDF5/1.14.4.2/lib/pkgconfig:/home/sacla_sfx_app/local/lib64/pkgconfig:/home/software/local/gsl-2.7_gcc-4.8.5/lib/pkgconfig:/usr/share/pkgconfig
~/packages/meson-1.4.0/meson.py setup build --wrap-mode=nodownload
ninja -C build
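The GSL mismatch described in the comments above comes from pkg-config picking up whichever gsl.pc it finds first. The sketch below imitates that lookup along PKG_CONFIG_PATH so the result can be inspected before running Meson. It is a simplification: real pkg-config also searches its built-in default directories afterwards, and the function names here are invented for illustration.

```python
import os


def first_pc_file(package, search_path):
    """Return the first '<package>.pc' along a colon-separated search path."""
    for directory in search_path.split(":"):
        candidate = os.path.join(directory, package + ".pc")
        if directory and os.path.isfile(candidate):
            return candidate
    return None


def pc_version(pc_text):
    """Return the value of the 'Version:' field of a .pc file, or None."""
    for line in pc_text.splitlines():
        if line.startswith("Version:"):
            return line.split(":", 1)[1].strip()
    return None
```

For example, first_pc_file("gsl", os.environ.get("PKG_CONFIG_PATH", "")) shows which file wins, and pc_version on its contents should report 2.7 rather than 1.15. On the cluster itself, pkg-config --modversion gsl answers the same question directly.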

We do not "install" this, because Ninja install strips rpath and causes various problems.

The setup step reports the following missing dependencies, but they are inconsequential for us:

  • We do not use the on-the-fly features.
  • We can export MTZ files via create-mtz (so libccp4c is not needed).
  • Millepede can be built separately (see below).

Found CMake: NO
Run-time dependency libzmq found: NO (tried pkgconfig and cmake)
Run-time dependency libasapo-producer found: NO (tried pkgconfig and cmake)
Run-time dependency seedee found: NO (tried pkgconfig and cmake)
Run-time dependency libccp4c found: NO (tried pkgconfig and cmake)
Run-time dependency msgpack-c found: NO (tried pkgconfig and cmake)
Run-time dependency msgpack found: NO (tried pkgconfig and cmake)
Subproject  millepede is buildable: NO (disabling)
Program pandoc found: NO
Program pandoc found: NO

Millepede

wget https://gitlab.desy.de/claus.kleinwort/millepede-ii/-/archive/V04-16-01/millepede-ii-V04-16-01.tar.gz
# rsync to the HPC

tar xzvf millepede-ii-V04-16-01.tar.gz 
cd millepede-ii-V04-16-01/
make # do not use -jX!

cp pede ~/local/bin

For unknown reasons, compilation failed with newer GCC versions, with the error "undefined reference to `_gfortran_os_error_at'".

Changes to the geometry file

We have to comment out these lines:

;mask_good = 0x00            ; instead, we can specify bad regions below if necessary
;mask_bad = 0xFF

Otherwise CrystFEL 0.11.0 complains: "You have specified good/bad bits for mask 0 of panel q1, but not the mask location."

Add this after the panels are defined to enable geometry refinement by Millepede:

group_all = q1,q2,q3,q4,q5,q6,q7,q8

Unlike a rigid group, a group cannot be defined before its panels are defined.
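The ordering rule can be checked mechanically before handing the file to CrystFEL. The sketch below is a simplified reading of the geometry format, not CrystFEL's own parser: it assumes panel properties are written as "panel/key = value", that ";" starts a comment, and that group members may be panels or previously defined groups. The function name is made up.

```python
def groups_defined_too_early(geometry_text):
    """Return (line number, group key, member) for members not yet defined."""
    known = set()      # panel and group names seen so far
    problems = []
    for number, raw in enumerate(geometry_text.splitlines(), start=1):
        line = raw.split(";", 1)[0].strip()   # drop ';' comments
        if "=" not in line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        if key.startswith("group_"):
            for member in value.split(","):
                member = member.strip()
                if member and member not in known:
                    problems.append((number, key, member))
            known.add(key[len("group_"):])
        elif "/" in key:
            # Anything of the form name/key counts as a definition of 'name'
            # (this also catches bad regions, which is harmless here).
            known.add(key.split("/", 1)[0])
    return problems
```

An empty result means every group line comes after the panels it refers to. Lines starting with rigid_group_ are deliberately ignored, since they do not have this restriction.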

Very annoyingly, once group_all is added, the geometry file is no longer compatible with older versions of CrystFEL. Stream files generated by 0.11 also become unreadable to them, because the stream header contains a copy of the geometry file. This can be patched by commenting out reject = 1; in libcrystfel/src/datatemplate.c#L1239-L1244.

Testing

Can we refine the detector distance?

I took run 198167-1 with 476 hits and ~ 340 indexable patterns. The correct distance judged by the index rate and the histogram of unit cell parameters from CrystFEL 0.10.2 was 50.4 mm.

Indexing with various starting detector distances (49.2 to 51.2 mm in 0.4 mm steps; the log names give the distance in units of 0.1 mm) and running align_detector -l0 --out-of-plane gave:

$ grep z-tra *op.log
mille-492-op.log:    z-translation -0.919960 mm
mille-496-op.log:    z-translation -0.711770 mm
mille-500-op.log:    z-translation -0.524220 mm
mille-504-op.log:    z-translation -0.240980 mm
mille-508-op.log:    z-translation -0.126700 mm
mille-512-op.log:    z-translation +0.075574 mm

This suggests that something near 51.0 mm is the best, which is inconsistent with the result from 0.10.2. Furthermore, the shifts are too small; changing the initial distance by 0.4 mm changes the shift by only about 0.2 mm.
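Both numbers in the paragraph above can be reproduced with a straight-line fit of the reported shift against the starting distance. This is only a back-of-the-envelope check: the starting distances are assumed from the log names (mille-492 read as 49.2 mm), and the shifts are copied from the grep output.

```python
# Starting distances in mm, assumed from the log names (492 -> 49.2 mm).
distances = [49.2, 49.6, 50.0, 50.4, 50.8, 51.2]
# z-translation in mm, copied from the grep output.
shifts = [-0.919960, -0.711770, -0.524220, -0.240980, -0.126700, 0.075574]


def linear_fit(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


slope, intercept = linear_fit(distances, shifts)
zero_crossing = -intercept / slope  # starting distance where the shift vanishes

print(f"slope: {slope:.2f} mm of shift per mm of starting distance")
print(f"shift is zero at about {zero_crossing:.1f} mm")
```

The fitted slope is about 0.5, which matches the observation that a 0.4 mm change in the starting distance moves the shift by only about 0.2 mm, and the zero crossing lands near 51.0 mm. A refinement that fully corrected the distance would presumably give a slope of magnitude close to 1.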

Refining tilt and twist might be too unstable with such a small number of crystals. I hacked align_detector.c to refine z-translation but not tilt and twist.

66c66
<              "      --out-of-plane         Also refine out of x/y plane\n"
---
>              "      --out-of-plane         Also refine Z-translation (but not tilt & twist)\n"
179,180c179,180
<               write_zero_sum(fh, g, groups, n_groups, GPARAM_DET_RX);
<               write_zero_sum(fh, g, groups, n_groups, GPARAM_DET_RY);
---
> //            write_zero_sum(fh, g, groups, n_groups, GPARAM_DET_RX);
> //            write_zero_sum(fh, g, groups, n_groups, GPARAM_DET_RY);
402,403c402,403
<       fprintf(fh, "%i 0 %i\n", mille_label(0, GPARAM_DET_RX), out_of_plane ? 0 : -1);
<       fprintf(fh, "%i 0 %i\n", mille_label(0, GPARAM_DET_RY), out_of_plane ? 0 : -1);
---
>       fprintf(fh, "%i 0 %i\n", mille_label(0, GPARAM_DET_RX), -1);
>       fprintf(fh, "%i 0 %i\n", mille_label(0, GPARAM_DET_RY), -1);
413,414c413,414
<               fprintf(fh, "%i 0 %i\n", mille_label(groups[i].serial, GPARAM_DET_RX), f_outplane);
<               fprintf(fh, "%i 0 %i\n", mille_label(groups[i].serial, GPARAM_DET_RY), f_outplane);
---
>               fprintf(fh, "%i 0 %i\n", mille_label(groups[i].serial, GPARAM_DET_RX), -1);
>               fprintf(fh, "%i 0 %i\n", mille_label(groups[i].serial, GPARAM_DET_RY), -1);

However, the result was basically the same.

$ grep z-tra *onlyZ.log
mille-492-onlyZ.log:    z-translation -0.915230 mm
mille-496-onlyZ.log:    z-translation -0.710160 mm
mille-500-onlyZ.log:    z-translation -0.523160 mm
mille-504-onlyZ.log:    z-translation -0.240110 mm
mille-508-onlyZ.log:    z-translation -0.124580 mm
mille-512-onlyZ.log:    z-translation +0.088850 mm
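To put a number on "basically the same", the two grep outputs can be parsed and compared per starting distance. The sketch assumes the log names encode the starting distance in units of 0.1 mm; the regular expression and the parse_shifts helper are written for exactly the output format shown on this page.

```python
import re

LINE = re.compile(r"mille-(\d+)-\w+\.log:\s+z-translation\s+([+-]?\d+\.\d+) mm")


def parse_shifts(grep_output):
    """Map starting distance (mm) to reported z-translation (mm)."""
    return {int(m.group(1)) / 10.0: float(m.group(2))
            for m in LINE.finditer(grep_output)}


with_tilt = parse_shifts("""
mille-492-op.log:    z-translation -0.919960 mm
mille-496-op.log:    z-translation -0.711770 mm
mille-500-op.log:    z-translation -0.524220 mm
mille-504-op.log:    z-translation -0.240980 mm
mille-508-op.log:    z-translation -0.126700 mm
mille-512-op.log:    z-translation +0.075574 mm
""")
only_z = parse_shifts("""
mille-492-onlyZ.log:    z-translation -0.915230 mm
mille-496-onlyZ.log:    z-translation -0.710160 mm
mille-500-onlyZ.log:    z-translation -0.523160 mm
mille-504-onlyZ.log:    z-translation -0.240110 mm
mille-508-onlyZ.log:    z-translation -0.124580 mm
mille-512-onlyZ.log:    z-translation +0.088850 mm
""")

# Difference between the Z-only and the full out-of-plane refinement.
differences = {d: only_z[d] - with_tilt[d] for d in sorted(with_tilt)}
largest = max(abs(v) for v in differences.values())
print(f"largest difference: {largest:.4f} mm")
```

The largest difference is on the order of 0.01 mm, far smaller than the roughly 0.6 mm gap between the two distance estimates, so fixing tilt and twist does not explain the discrepancy.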

These observations suggest that this new algorithm still suffers from the correlation between cell parameters and the detector distance, and is biased by the initial value. Perhaps introducing Bravais lattice constraints and target cell restraints would stabilize the refinement (as in DIALS).