Ophys ROI segmentation

Mike Huang edited this page Nov 3, 2023 · 37 revisions

Overview

The Ophys ROI Segmentation pipeline adds steps around the suite2p sparsery segmentation algorithm to improve its recall and precision. A denoising step is applied first; it improves recall but leads to poor precision. An ROI cell classifier, trained on ROIs that humans labeled as true positive (TP) or false positive (FP), then significantly improves precision with a marginal loss of recall.
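
For reference, recall and precision can be computed from counts of true positives (TP), false positives (FP), and false negatives (FN). The sketch below is illustrative only; the counts are made-up numbers, not pipeline results.

```python
from typing import Tuple


def precision_recall(tp: int, fp: int, fn: int) -> Tuple[float, float]:
    """Return (precision, recall) from ROI counts.

    precision = TP / (TP + FP): fraction of detected ROIs that are real cells.
    recall    = TP / (TP + FN): fraction of real cells that were detected.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical counts, for illustration only.
precision, recall = precision_recall(tp=80, fp=40, fn=20)
```

A segmentation tuned for high recall drives FN down at the cost of more FP, which is the trade-off the classifier step is meant to correct.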

An example processed dataset can be found on Isilon at /allen/programs/mindscope/production/informatics/ophys_processing/specimen_1286077606/session_1306855381/experiment_1307046775/

ophys_segmentation

Denoising - Deep Interpolation

Deep Interpolation is an initial denoising step that improves the recall of suite2p segmentation.

Figure legend: green, true positives; red, false positives; cyan, false negatives.

This deep-learning-based denoiser requires a two-step training process. First, the model is trained on a large repository of movies (a pretrained model is available). Next, the model is fine-tuned on the movie to be denoised. The denoised movie is then produced by running inference with the fine-tuned model.

Finetuning

This module takes a pre-trained model and fine-tunes it on the movie to be denoised. A model previously trained on an ensemble of SSF datasets can be found on Isilon at /allen/programs/mindscope/workgroups/surround/denoising_labeling_2022/ensemble_output/ensemble_ssf_model_even_smaller_validation_mean_squared_error-0120-0.9622.h5

Run module

python -m ophys_etl.modules.denoising.fine_tuning --input_json <path_to_input_json>

View module input schema

python -m ophys_etl.modules.denoising.fine_tuning --help
Example Input JSON
{
"output_full_args": true,
"test_generator_params": {
  "pre_post_omission": 0,
  "cache_data": false,
  "batch_size": 5,
  "name": "MovieJSONGenerator",
  "total_samples": -1,
  "post_frame": 30,
  "end_frame": -1,
  "randomize": true,
  "pre_frame": 30,
  "start_frame": 0,
  "gpu_cache_full": false,
  "movie_statistics_nbframes": -100,
  "data_path": "/allen/programs/mindscope/production/informatics/ophys_processing/specimen_1286077606/session_1306855381/experiment_1307046775/DENOISING_FINETUNING/2023-10-28_09-10-20-599878/val.json",
  "normalize_cache": true,
  "seed": 1234,
  "steps_per_epoch": -1
},
"input_json": "/allen/programs/mindscope/production/informatics/ophys_processing/specimen_1286077606/session_1306855381/experiment_1307046775/DENOISING_FINETUNING/2023-10-28_09-10-20-599878/DENOISING_FINETUNING_input.json",
"run_uid": "1307046775",
"log_level": "INFO",
"finetuning_params": {
  "model_string": "",
  "cache_data": true,
  "multi_gpus": false,
  "nb_workers": 1,
  "output_dir": "/allen/programs/mindscope/production/informatics/ophys_processing/specimen_1286077606/session_1306855381/experiment_1307046775/DENOISING_FINETUNING/2023-10-28_09-10-20-599878",
  "model_source": {
    "local_path": "/allen/programs/mindscope/workgroups/surround/denoising_labeling_2022/ensemble_output/ensemble_ssf_model_even_smaller_validation_mean_squared_error-0120-0.9622.h5"
  },
  "name": "transfer_trainer",
  "caching_validation": false,
  "use_multiprocessing": false,
  "steps_per_epoch": 20,
  "loss": "mean_squared_error",
  "measure_baseline_loss": false,
  "nb_times_through_data": 1,
  "learning_rate": 0.0001,
  "verbose": 2,
  "period_save": 1,
  "apply_learning_decay": false,
  "epochs_drop": 5
},
"generator_params": {
  "pre_post_omission": 0,
  "cache_data": false,
  "batch_size": 5,
  "name": "MovieJSONGenerator",
  "total_samples": -1,
  "post_frame": 30,
  "end_frame": -1,
  "randomize": true,
  "pre_frame": 30,
  "start_frame": 0,
  "gpu_cache_full": false,
  "movie_statistics_nbframes": -100,
  "data_path": "/allen/programs/mindscope/production/informatics/ophys_processing/specimen_1286077606/session_1306855381/experiment_1307046775/DENOISING_FINETUNING/2023-10-28_09-10-20-599878/train.json",
  "normalize_cache": true,
  "seed": 1234,
  "steps_per_epoch": 20
},
"output_json": "/allen/programs/mindscope/production/informatics/ophys_processing/specimen_1286077606/session_1306855381/experiment_1307046775/DENOISING_FINETUNING/2023-10-28_09-10-20-599878/DENOISING_FINETUNING_output.json"
}
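
The `pre_frame`, `post_frame`, and `pre_post_omission` generator parameters suggest that each target frame is predicted from frames on either side of it. The sketch below shows one way such an input window could be selected. It is an assumption about the generator's behavior based on the parameter names, not a copy of the deepinterpolation code, and `input_frame_indices` is a hypothetical helper.

```python
from typing import List


def input_frame_indices(center: int, pre_frame: int, post_frame: int,
                        pre_post_omission: int = 0) -> List[int]:
    """Indices of frames used as input to predict frame `center`.

    The center frame itself, plus `pre_post_omission` frames on each side of
    it, are left out so the model never sees the frame it must predict.
    """
    before = range(center - pre_post_omission - pre_frame,
                   center - pre_post_omission)
    after = range(center + pre_post_omission + 1,
                  center + pre_post_omission + post_frame + 1)
    return list(before) + list(after)


# With the values from the example JSON (30 frames on each side, no omission),
# frame 100 would be predicted from frames 70..99 and 101..130.
indices = input_frame_indices(center=100, pre_frame=30, post_frame=30)
```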

Module outputs

  1. fine tuned model h5 - best validation loss
  2. fine tuned models h5 - epochs where validation loss improved
  3. epoch vs train/val loss plot
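
Outputs 1 and 2 suggest that a model file is written whenever the validation loss improves, with the last such file being the best model. A minimal sketch of that selection logic, using made-up loss values (the actual checkpointing is handled by the training code):

```python
from typing import List


def improved_epochs(val_losses: List[float]) -> List[int]:
    """Return the 0-based epochs at which validation loss reached a new minimum."""
    best = float("inf")
    epochs = []
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best = loss
            epochs.append(epoch)
    return epochs


# Hypothetical validation losses for five epochs.
val_losses = [0.98, 0.97, 0.975, 0.962, 0.97]
saved = improved_epochs(val_losses)   # epochs whose models would be kept
best_epoch = saved[-1]                # model with the best validation loss
```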

Inference

Run module

python -m ophys_etl.modules.denoising.inference --input_json <path_to_input_json>

View module input schema

python -m ophys_etl.modules.denoising.inference --help
Example Input JSON
{
"generator_params": {
  "batch_size": 5,
  "name": "InferenceOphysGenerator",
  "start_frame": 0,
  "cache_data": true,
  "normalize_cache": false,
  "gpu_cache_full": false,
  "data_path": "/allen/programs/mindscope/production/informatics/ophys_processing/specimen_1286077606/session_1306855381/experiment_1307046775/MOTION_CORRECTION/2023-10-28_08-10-39-046840/1307046775_suite2p_motion_output.h5",
  "seed": 1234
},
"inference_params": {
  "model_source": {
    "local_path": "/allen/programs/mindscope/production/informatics/ophys_processing/specimen_1286077606/session_1306855381/experiment_1307046775/DENOISING_FINETUNING/2023-10-28_09-10-20-599878/1307046775_mean_squared_error_transfer_model.h5"
  },
  "rescale": true,
  "save_raw": false,
  "output_padding": true,
  "output_file": "/allen/programs/mindscope/production/informatics/ophys_processing/specimen_1286077606/session_1306855381/experiment_1307046775/DENOISING_INFERENCE/2023-10-28_11-10-16-834331/1307046775_denoised_video.h5",
  "steps_per_epoch": 0
},
"run_uid": "1307046775"
}

Module outputs

  1. denoised movie h5
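
Every module on this page is driven the same way: write an input JSON, then pass its path with `--input_json`. The sketch below wraps that pattern in Python. The helper name `build_module_command` and the temporary working directory are illustrative assumptions; the module path and flag are the ones shown above.

```python
import json
import sys
import tempfile
from pathlib import Path
from typing import List


def build_module_command(module: str, args: dict, workdir: Path) -> List[str]:
    """Write `args` to an input JSON in `workdir` and return the command to run."""
    input_json = workdir / "input.json"
    input_json.write_text(json.dumps(args, indent=2))
    return [sys.executable, "-m", module, "--input_json", str(input_json)]


workdir = Path(tempfile.mkdtemp())
command = build_module_command(
    "ophys_etl.modules.denoising.inference",
    {"run_uid": "1307046775"},  # truncated; see the full example JSON above
    workdir,
)
# To actually launch the module (requires ophys_etl to be installed):
# import subprocess; subprocess.run(command, check=True)
```

Building the command as a list rather than a shell string avoids quoting problems with long Isilon paths.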

Addendum

We forked the main repository to make several improvements to deepinterpolation with regard to logging and optimizations.

Forked repo

Main repo

Change log

  • Performance optimizations
    • Cache input movie with MovieJSONGenerator
    • The data cache for the input movie is shared between the train and test generator objects instead of caching twice
    • 2D slicing with MovieJSONGenerator when generating a batch input instead of running through an inefficient for loop
  • Separate logging of train and validation by Keras. This helps determine performance bottlenecks during training
  • Bugfix with marshmallow schema validator
  • Unpin argschema in requirements
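
The second optimization above, sharing one movie cache between the train and test generators, can be pictured as follows. This is a simplified stand-in written for illustration; `MovieCache` and `FrameGenerator` are hypothetical names, not classes from the fork.

```python
from typing import Callable, List, Optional


class MovieCache:
    """Loads the movie at most once and hands the same object to every user."""

    def __init__(self, loader: Callable[[], List[List[float]]]):
        self._loader = loader
        self._movie: Optional[List[List[float]]] = None
        self.load_count = 0

    def get(self) -> List[List[float]]:
        if self._movie is None:          # first access: read from disk
            self._movie = self._loader()
            self.load_count += 1
        return self._movie               # later accesses: reuse memory


class FrameGenerator:
    """Toy generator that serves a subset of frames from a shared cache."""

    def __init__(self, cache: MovieCache, frame_indices: List[int]):
        self.cache = cache
        self.frame_indices = frame_indices

    def frames(self) -> List[List[float]]:
        movie = self.cache.get()
        return [movie[i] for i in self.frame_indices]


def fake_loader() -> List[List[float]]:
    # Stand-in for reading the motion-corrected movie from an h5 file.
    return [[float(t)] * 4 for t in range(10)]


cache = MovieCache(fake_loader)
train = FrameGenerator(cache, frame_indices=[0, 1, 2, 3, 4, 5, 6, 7])
val = FrameGenerator(cache, frame_indices=[8, 9])
train_frames, val_frames = train.frames(), val.frames()
```

Both generators read from the same in-memory movie, so the loader runs once instead of twice.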

Suite2P Segmentation

Segmentation

Segmentation is performed with suite2p's sparsery module

The threshold_scaling parameter was tuned ad hoc for the SSF datasets: the threshold was lowered to maximize recall, but not so far that it produced excessive false positives. This value may need to be re-tuned for new datasets (such as mFISH learning).

The defaults for all other parameters were used.

The resultant ROIs are further postprocessed with the following:

  1. filter by aspect ratio
    • postprocess_args.aspect_ratio_threshold
  2. binarize masks by keeping pixels whose suite2p weight crosses an absolute threshold
    • postprocess_args.abs_threshold (optional); if not set, the threshold is the quantile given by binary_quantile
    • postprocess_args.binary_quantile (optional) (default=0.1)
  3. reduce pixelation by performing binary closing followed by binary opening
  4. format suite2p output to LIMS schema
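
Postprocessing steps 1 to 3 can be sketched in plain Python as below. This is not the `ophys_etl` implementation: the aspect-ratio definition, the quantile rule, the cross-shaped 3x3 structuring element, the treatment of out-of-frame pixels as background, and the example threshold values are all assumptions chosen to keep the sketch self-contained.

```python
from typing import List, Optional

Mask = List[List[bool]]
CROSS = [(0, 0), (-1, 0), (1, 0), (0, -1), (0, 1)]  # assumed structuring element


def aspect_ratio_ok(width: int, height: int, threshold: float) -> bool:
    """Step 1: keep ROIs whose short/long side ratio is above `threshold`."""
    return min(width, height) / max(width, height) > threshold


def binarize(weights: List[List[float]], abs_threshold: Optional[float] = None,
             binary_quantile: float = 0.1) -> Mask:
    """Step 2: threshold weights by value, or by a quantile of nonzero weights."""
    if abs_threshold is None:
        nonzero = sorted(w for row in weights for w in row if w > 0)
        if not nonzero:
            return [[False] * len(row) for row in weights]
        abs_threshold = nonzero[int(binary_quantile * (len(nonzero) - 1))]
    return [[w > abs_threshold for w in row] for row in weights]


def _get(mask: Mask, r: int, c: int) -> bool:
    # Pixels outside the frame count as background.
    inside = 0 <= r < len(mask) and 0 <= c < len(mask[0])
    return inside and mask[r][c]


def dilate(mask: Mask) -> Mask:
    return [[any(_get(mask, r + dr, c + dc) for dr, dc in CROSS)
             for c in range(len(mask[0]))] for r in range(len(mask))]


def erode(mask: Mask) -> Mask:
    return [[all(_get(mask, r + dr, c + dc) for dr, dc in CROSS)
             for c in range(len(mask[0]))] for r in range(len(mask))]


def smooth(mask: Mask) -> Mask:
    """Step 3: binary closing (dilate, erode) then opening (erode, dilate)."""
    closed = erode(dilate(mask))
    return dilate(erode(closed))


# Toy weights: a 5x5 cell with a one-pixel hole, plus one stray pixel.
weights = [[0.0] * 12 for _ in range(7)]
for r in range(1, 6):
    for c in range(1, 6):
        weights[r][c] = 1.0
weights[3][3] = 0.0   # hole inside the cell
weights[3][9] = 1.0   # isolated stray pixel
mask = smooth(binarize(weights, abs_threshold=0.5))
```

In this toy case the closing fills the hole, and the opening removes the stray pixel and rounds off the square's corners, which is the "reduce pixelation" effect described in step 3.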

Usage

Run module

python -m ophys_etl.modules.segment_postprocess --input_json <path_to_input_json>

View module input schema

python -m ophys_etl.modules.segment_postprocess --help
Example Input JSON
{
"suite2p_args": {
  "h5py": "/allen/programs/mindscope/production/informatics/ophys_processing/specimen_1286077606/session_1306855381/experiment_1307046775/DENOISING_INFERENCE/2023-10-28_11-10-16-834331/1307046775_denoised_video.h5",
  "movie_frame_rate_hz": 9.48
},
"postprocess_args": {},
"output_json": "/allen/programs/mindscope/production/informatics/ophys_processing/specimen_1286077606/session_1306855381/experiment_1307046775/SEGMENTATION/2023-10-28_13-10-13-009716/SEGMENTATION_output.json"
}

Module outputs

  1. ROI json file with a list of ROIs stored with the following schema
  • x: left most coordinate of bounding box
  • y: top most coordinate of bounding box
  • width: width of bounding box
  • height: height of bounding box
  • mask_matrix: 2D boolean array of cropped ROI
  • valid_roi: boolean value (this may change in further downstream steps)
  • exclusion_labels: list of reasons the ROI may be invalidated (examples include but are not limited to intersects motion border, empty neuropil mask, is a decrosstalk ghost)
  • mask_image_plane
  • id: unique identifier for each ROI
  • max_correction_up: maximum upwards shift of motion correction
  • max_correction_down: maximum downwards shift of motion correction
  • max_correction_left: maximum left shift of motion correction
  • max_correction_right: maximum right shift of motion correction
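
To make the bounding-box fields concrete, the sketch below derives `x`, `y`, `width`, `height`, and `mask_matrix` from a full-frame boolean mask. The helper `roi_from_mask` is hypothetical and fills in only a subset of the schema fields.

```python
from typing import Dict, List


def roi_from_mask(full_mask: List[List[bool]], roi_id: int) -> Dict:
    """Crop a full-frame boolean mask to its bounding box."""
    rows = [r for r, row in enumerate(full_mask) if any(row)]
    cols = [c for c in range(len(full_mask[0]))
            if any(row[c] for row in full_mask)]
    if not rows:
        raise ValueError("mask is empty")
    y, x = rows[0], cols[0]
    height, width = rows[-1] - y + 1, cols[-1] - x + 1
    return {
        "id": roi_id,
        "x": x,                      # left-most column of the bounding box
        "y": y,                      # top-most row of the bounding box
        "width": width,
        "height": height,
        "mask_matrix": [row[x:x + width] for row in full_mask[y:y + height]],
        "valid_roi": True,
        "exclusion_labels": [],
    }


# Toy 4x6 frame with an L-shaped ROI.
frame = [
    [False, False, False, False, False, False],
    [False, False, True,  False, False, False],
    [False, False, True,  True,  True,  False],
    [False, False, False, False, False, False],
]
roi = roi_from_mask(frame, roi_id=1)
```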

ROI TP/FP Classifier

This step filters out false positives from the suite2p segmentation step. Because suite2p's segmentation is tuned for high recall, it produces an abundance of false positives.

Overview of classifier

The classifier is an ImageNet CNN, trained with human labels as ground truth, that classifies ROIs as TP or FP. Each ROI is cropped and rendered in three representations: correlation projection, max projection, and mask. The three representations form the channel dimension of the 2D CNN input.

These artifacts consist of thumbnail crops of each ROI and are generated with two separate modules, generate_correlation_projection_graph and generate_thumbnails

Training data is generated by randomly sampling ophys experiments to get representative FOVs. Each FOV is cropped in select regions. The ROIs from suite2p are labeled by human labelers as TP or FP.

Generate correlation projection graph

Run module

python -m ophys_etl.modules.segmentation.calculate_edges --input_json <path_to_input_json>

View module input schema

python -m ophys_etl.modules.segmentation.calculate_edges --help
Example Input JSON
{
"video_path": "/allen/programs/mindscope/production/informatics/ophys_processing/specimen_1286077606/session_1306855381/experiment_1307046775/DENOISING_INFERENCE/2023-10-28_11-10-16-834331/1307046775_denoised_video.h5",
"graph_output": "/allen/programs/mindscope/production/informatics/ophys_processing/specimen_1286077606/session_1306855381/experiment_1307046775/ROI_CLASSIFICATION_GENERATE_CORRELATION_PROJECTION_GRAPH/2023-10-28_16-10-13-144253/1307046775_correlation_graph.pkl",
"attribute_name": "filtered_hnc_Gaussian",
"neighborhood_radius": 7,
"n_parallel_workers": 32
}

Module output

  1. correlation projection graph pkl file

Generate thumbnails

Run module

python -m ophys_etl.modules.segmentation.calculate_edges --input_json <path_to_input_json>

View module input schema

python -m ophys_etl.modules.segmentation.calculate_edges --help
Example Input JSON
{
"video_path": "/allen/programs/mindscope/production/informatics/ophys_processing/specimen_1286077606/session_1306855381/experiment_1307046775/DENOISING_INFERENCE/2023-10-28_11-10-16-834331/1307046775_denoised_video.h5",
"graph_output": "/allen/programs/mindscope/production/informatics/ophys_processing/specimen_1286077606/session_1306855381/experiment_1307046775/ROI_CLASSIFICATION_GENERATE_CORRELATION_PROJECTION_GRAPH/2023-10-28_16-10-13-144253/1307046775_correlation_graph.pkl",
"attribute_name": "filtered_hnc_Gaussian",
"neighborhood_radius": 7,
"n_parallel_workers": 32
}

Module output

  1. png images of 128x128 cropped thumbnails around each ROI with three representations
  • correlation projection
  • max projection
  • suite2p segmentation mask
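
One way to place a 128x128 crop around an ROI is to center the window on the ROI's bounding box and shift it back inside the field of view when it would overhang an edge. This is a sketch under that assumption; the module may handle edges differently (for example by padding), and the 512x512 field of view is an example value.

```python
from typing import Tuple


def crop_window(x: int, y: int, width: int, height: int,
                fov_shape: Tuple[int, int] = (512, 512),
                size: int = 128) -> Tuple[int, int, int, int]:
    """Return (row0, row1, col0, col1) of a size x size window around an ROI."""
    n_rows, n_cols = fov_shape
    center_row = y + height // 2
    center_col = x + width // 2
    # Clamp the top-left corner so the window stays inside the FOV.
    row0 = min(max(center_row - size // 2, 0), n_rows - size)
    col0 = min(max(center_col - size // 2, 0), n_cols - size)
    return row0, row0 + size, col0, col0 + size


# An ROI near the top-left corner: the window is pushed to start at (0, 0).
corner = crop_window(x=5, y=10, width=12, height=14)
# An ROI in the middle of the FOV: the window is centered on it.
middle = crop_window(x=250, y=250, width=12, height=12)
```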

Labeling App

Training

Training is performed with DeepCell

Run module

python -m deepcell.cli.modules.cloud.train --input_json <path_to_input_json>

View module input schema

python -m deepcell.cli.modules.cloud.train --help

Inference

Run module

python -m ophys_etl.modules.classifier_inference --input_json <path_to_input_json>

View module input schema

python -m ophys_etl.modules.classifier_inference --help
Example Input JSON (Truncated with 2 ROIs)
[
{
  "experiment_id": 1307046775,
  "roi_id": "163244",
  "channel_order": [
    "CORRELATION_PROJECTION",
    "MAX_PROJECTION",
    "MASK"
  ],
  "channel_path_map": {
    "CORRELATION_PROJECTION": "/allen/programs/mindscope/production/informatics/ophys_processing/specimen_1286077606/session_1306855381/experiment_1307046775/ROI_CLASSIFICATION_GENERATE_THUMBNAILS/2023-10-28_17-10-26-596409/thumbnails/correlation_1307046775_163244.png",
    "MAX_PROJECTION": "/allen/programs/mindscope/production/informatics/ophys_processing/specimen_1286077606/session_1306855381/experiment_1307046775/ROI_CLASSIFICATION_GENERATE_THUMBNAILS/2023-10-28_17-10-26-596409/thumbnails/max_1307046775_163244.png",
    "MASK": "/allen/programs/mindscope/production/informatics/ophys_processing/specimen_1286077606/session_1306855381/experiment_1307046775/ROI_CLASSIFICATION_GENERATE_THUMBNAILS/2023-10-28_17-10-26-596409/thumbnails/mask_1307046775_163244.png"
  },
  "label": null
},
{
  "experiment_id": 1307046775,
  "roi_id": "163358",
  "channel_order": [
    "CORRELATION_PROJECTION",
    "MAX_PROJECTION",
    "MASK"
  ],
  "channel_path_map": {
    "CORRELATION_PROJECTION": "/allen/programs/mindscope/production/informatics/ophys_processing/specimen_1286077606/session_1306855381/experiment_1307046775/ROI_CLASSIFICATION_GENERATE_THUMBNAILS/2023-10-28_17-10-26-596409/thumbnails/correlation_1307046775_163358.png",
    "MAX_PROJECTION": "/allen/programs/mindscope/production/informatics/ophys_processing/specimen_1286077606/session_1306855381/experiment_1307046775/ROI_CLASSIFICATION_GENERATE_THUMBNAILS/2023-10-28_17-10-26-596409/thumbnails/max_1307046775_163358.png",
    "MASK": "/allen/programs/mindscope/production/informatics/ophys_processing/specimen_1286077606/session_1306855381/experiment_1307046775/ROI_CLASSIFICATION_GENERATE_THUMBNAILS/2023-10-28_17-10-26-596409/thumbnails/mask_1307046775_163358.png"
  },
  "label": null
}
]
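
The entries above follow a regular pattern, so the full input list can be generated from the ROI ids and the thumbnails directory. The sketch below reproduces the file-naming pattern seen in the example (`correlation_`, `max_`, and `mask_` prefixes); `build_inference_entries` is a hypothetical helper and the directory used here is a placeholder.

```python
import json
from pathlib import PurePosixPath
from typing import Dict, List

# Channel name -> thumbnail file prefix, as seen in the example JSON.
CHANNEL_PREFIX = {
    "CORRELATION_PROJECTION": "correlation",
    "MAX_PROJECTION": "max",
    "MASK": "mask",
}


def build_inference_entries(experiment_id: int, roi_ids: List[str],
                            thumbnails_dir: PurePosixPath) -> List[Dict]:
    """Build one classifier input entry per ROI."""
    entries = []
    for roi_id in roi_ids:
        entries.append({
            "experiment_id": experiment_id,
            "roi_id": roi_id,
            "channel_order": list(CHANNEL_PREFIX),
            "channel_path_map": {
                channel: str(
                    thumbnails_dir / f"{prefix}_{experiment_id}_{roi_id}.png")
                for channel, prefix in CHANNEL_PREFIX.items()
            },
            "label": None,  # unknown at inference time; serialized as null
        })
    return entries


entries = build_inference_entries(
    1307046775, ["163244", "163358"], PurePosixPath("/path/to/thumbnails"))
input_json_text = json.dumps(entries, indent=2)
```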

Module output

  1. P(cell) csv with schema:
  • roi-id
  • experiment_id
  • y_score: probability of cell by classifier
  • y_pred: boolean classification (y_score > classification_threshold)
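
The relationship between the score and the boolean prediction can be sketched as below. The threshold of 0.5 and the scores are example values only; the threshold actually used by the pipeline is not stated on this page.

```python
from typing import Dict, List


def classify(scores: Dict[str, float],
             classification_threshold: float = 0.5) -> List[dict]:
    """Turn per-ROI cell probabilities into rows of the output table."""
    return [
        {
            "roi-id": roi_id,
            "y_score": score,
            "y_pred": score > classification_threshold,
        }
        for roi_id, score in scores.items()
    ]


# Hypothetical scores for the two ROIs in the example input.
rows = classify({"163244": 0.91, "163358": 0.12})
```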