Cell Labeling Scripts

This Python module contains utility methods related to:

generating labeled masks
converting annotations
processing segmentation outputs
training a model (via CLI argument)

This program drives the cell detection process. Currently, it handles one animal and one task per run. All models are stored in /net/birdstore/Active_Atlas_Data/cell_segmentation/models/. The models are named by step: models_step_X_threshold_2000.pkl, where ‘X’ is the step number. The program can be run with the following commands:

python srs/labeling/scripts/create_labels.py --animal DKXX --task create_features
python srs/labeling/scripts/create_labels.py --animal DKXX --task detect
python srs/labeling/scripts/create_labels.py --animal DKXX --task extract
python srs/labeling/scripts/create_labels.py --animal DKXX --task train
python srs/labeling/scripts/create_labels.py --animal DKXX --task fix
python srs/labeling/scripts/create_labels.py --animal DKXX --task precomputed
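The model naming convention and the command pattern above can be sketched in a few lines of Python. This is only an illustration: the helper names model_path and build_command are hypothetical and are not part of the module, and treating the 2000 threshold as a parameter is an assumption.

```python
from pathlib import PurePosixPath

# Directory, file pattern, script path and task names are copied from the text above.
MODEL_DIR = PurePosixPath("/net/birdstore/Active_Atlas_Data/cell_segmentation/models")
TASKS = ("create_features", "detect", "extract", "train", "fix", "precomputed")


def model_path(step: int, threshold: int = 2000) -> PurePosixPath:
    """Return the expected model file for a training step (hypothetical helper)."""
    return MODEL_DIR / f"models_step_{step}_threshold_{threshold}.pkl"


def build_command(animal: str, task: str) -> list[str]:
    """Build the argument list for one animal and one task (hypothetical helper)."""
    if task not in TASKS:
        raise ValueError(f"unknown task: {task!r}")
    return [
        "python", "srs/labeling/scripts/create_labels.py",
        "--animal", animal,
        "--task", task,
    ]


if __name__ == "__main__":
    print(model_path(1))
    print(" ".join(build_command("DKXX", "detect")))
```

Because the program handles one animal and one task per run, a wrapper like this would have to be called once per task.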

Explanation for the tasks:

detect - Run this task first. It runs the cell detection model on the images and writes the detections to the cell_labels/detection_XXX.csv files.
extract - This task extracts the predictions from the detection files and creates the cell_labels/all_predictions.csv file.
train - This task uses the detection_XXX.csv files created above to train the model. The features are taken from the detection_XXX.csv files. The model is saved in the cell_segmentation/models dir. The new model can then be used to rerun the detection process. Repeat as necessary.
fix - This task is only needed when the images still contain extra tissue and skull. You will need to create the rotated and aligned masks for the images.

Training workflow: the supervised models require manual processing steps:

Detect cells on available brains.
Some brains have too many points to display directly in Neuroglancer. DK59, for example, has about 75MB of points, which will not display and will crash the browser. Instead, we can convert the points to a precomputed data format and display them much as we display large images.
Once the predicted points are displayed alongside the image stacks of the dye and virus channels, we create two more layers: a ‘bad’ layer, where the user marks incorrect predictions, and a ‘sure’ layer, where the user annotates cells that the prediction process missed.
These ‘bad’ and ‘sure’ new annotations are then saved to the database.
We then create features from these ‘bad’ and ‘sure’ coordinates.
These features are then fed back into the training process and a new model is created which we then use to repeat the process.
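As a rough illustration of how the ‘bad’ and ‘sure’ annotations could become training labels, the sketch below maps ‘bad’ coordinates to negative examples and ‘sure’ coordinates to positive examples. The Point type, the build_training_labels function, and the 0/1 label encoding are all assumptions made for this example; the actual database schema and feature code are not shown here.

```python
from typing import NamedTuple


class Point(NamedTuple):
    """A hypothetical annotation coordinate: x, y in pixels plus section number."""
    x: float
    y: float
    section: int


def build_training_labels(bad: list[Point], sure: list[Point]) -> list[tuple[Point, int]]:
    """Label 'bad' points 0 (negative) and 'sure' points 1 (positive).

    Assumption: if the same coordinate appears in both layers, the 'sure'
    label wins because it is written last.
    """
    labeled: dict[Point, int] = {}
    for point in bad:
        labeled[point] = 0
    for point in sure:
        labeled[point] = 1
    # Sort by section, then y, then x so the output order is deterministic.
    return sorted(labeled.items(), key=lambda item: (item[0].section, item[0].y, item[0].x))


if __name__ == "__main__":
    bad = [Point(10.0, 20.0, 5)]
    sure = [Point(1.0, 2.0, 3)]
    for point, label in build_training_labels(bad, sure):
        print(point, label)
```

Features would then be computed at each labeled coordinate and passed to the train task.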

Detecting Cells

To run cell detection/segmentation using a trained model, use the --task detect argument:

python -m labeling.scripts.create_labels --task detect --animal {ANIMAL_ID} --model {neuron_model_type}

This mode:

Loads the trained model from the centralized model directory
Loads the average cell image for the 2 channels (virus and dye)
Calculates features for cell candidates: energy correlation and Hu moments
Applies the model to score the cell candidates
Saves detection results (e.g. detections_{section}.csv) to: /net/birdstore/Active_Atlas_Data/cell_segmentation/data_root/pipeline_data/{ANIMAL_ID}/preps/cell_labels/
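To give a feel for the two feature families named in the steps above, here is a minimal pure-Python sketch. It assumes that the correlation feature resembles a normalized correlation between a candidate patch and the average cell image, which may differ from the module's actual definition, and it computes only the first two of the seven Hu moment invariants. The function names are hypothetical.

```python
import math

Image = list[list[float]]  # rows of pixel intensities


def normalized_correlation(patch: Image, template: Image) -> float:
    """Pearson-style correlation between two equally sized images."""
    a = [v for row in patch for v in row]
    b = [v for row in template for v in row]
    if len(a) != len(b):
        raise ValueError("patch and template must have the same size")
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    da = [v - mean_a for v in a]
    db = [v - mean_b for v in b]
    denom = math.sqrt(sum(v * v for v in da) * sum(v * v for v in db))
    # A constant image has zero variance; report no correlation in that case.
    return sum(x * y for x, y in zip(da, db)) / denom if denom else 0.0


def hu_first_two(image: Image) -> tuple[float, float]:
    """First two Hu invariants from normalized central moments."""
    m00 = sum(sum(row) for row in image)
    if m00 == 0:
        raise ValueError("image has zero total intensity")
    x_bar = sum(x * v for row in image for x, v in enumerate(row)) / m00
    y_bar = sum(y * v for y, row in enumerate(image) for v in row) / m00

    def eta(p: int, q: int) -> float:
        # Central moment mu_pq, scaled by m00^(1 + (p + q) / 2).
        mu = sum(
            (x - x_bar) ** p * (y - y_bar) ** q * v
            for y, row in enumerate(image)
            for x, v in enumerate(row)
        )
        return mu / m00 ** (1 + (p + q) / 2)

    eta20, eta02, eta11 = eta(2, 0), eta(0, 2), eta(1, 1)
    return eta20 + eta02, (eta20 - eta02) ** 2 + 4 * eta11 ** 2


if __name__ == "__main__":
    candidate = [[0.0, 1.0, 0.0], [1.0, 4.0, 1.0], [0.0, 1.0, 0.0]]
    print(normalized_correlation(candidate, candidate))
    print(hu_first_two(candidate))
```

Hu invariants are unchanged by translation, scale, and rotation, which is why they suit cell candidates whose orientation is arbitrary.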

Note: Ensure the image stacks (tif or OME-Zarr) are available in the preps folder and, if the --model argument is used, that the specified model exists.

Training the Model

To train the cell labeling model, run the script with the --task train argument:

python -m labeling.scripts.create_labels --task train --animal {ANIMAL_ID} --model {neuron_model_type} --step 1

This mode:

Reads the ‘ground truth’ neuron coordinates from the cell_labels{step} directory
Trains the specified model type using the supplied putative identifications (positive, negative)
Saves the model (with metrics) to the centralized location: /net/birdstore/Active_Atlas_Data/cell_segmentation/models/