===============
Getting Started
===============

Import required packages

.. code-block:: python

    import warnings
    warnings.simplefilter("ignore")
    import os
    from IPython.display import display, HTML
    display(HTML(""))
    from nimbus_inference.nimbus import Nimbus, prep_naming_convention
    from nimbus_inference.utils import MultiplexDataset
    from alpineer import io_utils
    from ark.utils import example_dataset
    from nimbus_inference.viewer_widget import NimbusViewer

**0: Set root directory and download example dataset**

Here we are using the example data located in :code:`/data/example_dataset/input_data`. To modify this notebook to run on your own data, change :code:`base_dir` to point to your own sub-directory within the data folder.

Set :code:`base_dir` to the path that holds all of your imaging data (i.e. multiplexed images and segmentation masks). The subdirectory :code:`nimbus_output` will contain all of the data generated by this notebook. In the following, we expect this folder structure, with :code:`fov_1` and :code:`fov_2` being either folders of individual channel images or :code:`.ome.tiff` files that contain all channels in a single file.

.. code-block:: bash

    |-- base_dir
    |   |-- image_data
    |   |   |-- fov_1
    |   |   |-- fov_2
    |   |-- segmentation
    |   |   |-- deepcell_output
    |   |-- nimbus_output

Set up the base directory

.. code-block:: python

    base_dir = os.path.normpath("../data/example_dataset")

If you would like to test Nimbus with an example dataset, run the cell below. It will download a dataset consisting of 10 FOVs with 22 channels. You can find more information about the example dataset in the ark-analysis README.

If you want to use your own data, skip the cell below.

.. code-block:: python

    example_dataset.get_example_dataset(
        dataset="cluster_pixels", save_dir=base_dir, overwrite_existing=False
    )

**1: Set file paths and parameters**

All data, images, files, etc. must be placed in the :code:`data` directory and referenced via :code:`../data/path_to_your_data`.

.. code-block:: python

    # set up file paths
    tiff_dir = os.path.join(base_dir, "image_data")
    deepcell_output_dir = os.path.join(base_dir, "segmentation", "deepcell_output")
    nimbus_output_dir = os.path.join(base_dir, "nimbus_output")

    # Create nimbus output directory
    os.makedirs(nimbus_output_dir, exist_ok=True)

    # Check if paths exist
    io_utils.validate_paths([base_dir, tiff_dir, deepcell_output_dir, nimbus_output_dir])

**2: Set up input paths and the naming convention for the segmentation data**

Store the names of the channels to include in the list below. Then either predict on all FOVs or manually specify the ones you want to apply Nimbus to.

.. code-block:: python

    # define the channels to include
    include_channels = [
        "CD3", "CD4", "CD8", "CD14", "CD20", "CD31", "CD45", "CD68", "CD163",
        "CK17", "Collagen1", "ECAD", "Fibronectin", "GLUT1", "HLADR", "IDO",
        "Ki67", "PD1", "SMA", "Vim"
    ]

    # either get all fovs in the folder...
    fov_names = os.listdir(tiff_dir)
    # ... or optionally, select a specific set of fovs manually
    # fov_names = ["fov0", "fov1"]

    # make sure to filter paths out that don't lead to FoVs, e.g. .DS_Store files.
    fov_names = [fov_name for fov_name in fov_names if not fov_name.startswith(".")]

    # construct paths for fovs
    fov_paths = [os.path.join(tiff_dir, fov_name) for fov_name in fov_names]

Define the naming convention for the segmentation data in the function :code:`segmentation_naming_convention`, which maps a FOV to the path of the associated segmentation output. The function :code:`prep_naming_convention` used below assumes that all segmentation outputs are stored in one folder, with the :code:`fov_name` as the prefix and :code:`_whole_cell.tiff` as the suffix, as shown in the folder structure below. If this does not apply to your data, you have to define your own function :code:`segmentation_naming_convention` that takes an element from :code:`fov_paths` and returns a valid path to the segmentation label map you want to use for that FOV.

.. code-block:: bash

    |-- base_dir
    |   |-- image_data
    |   |   |-- fov_1
    |   |   |-- fov_2
    |   |-- segmentation
    |   |   |-- deepcell_output
    |   |   |   |-- fov_1_whole_cell.tiff
    |   |   |   |-- fov_2_whole_cell.tiff
    |   |-- nimbus_output

.. code-block:: python

    # Prepare segmentation naming convention that maps a fov_path to the
    # corresponding segmentation label map
    segmentation_naming_convention = prep_naming_convention(deepcell_output_dir)

    # test segmentation_naming_convention
    if os.path.exists(segmentation_naming_convention(fov_paths[0])):
        print("Segmentation data exists for fov 0 and naming convention is correct")
    else:
        print("Segmentation data does not exist for fov 0 or naming convention is incorrect")

Next we will use the :code:`MultiplexDataset` class to abstract away differences in data representation. The class takes :code:`fov_paths`, :code:`segmentation_naming_convention` and a :code:`suffix`, and provides the methods :code:`.get_channel(fov, channel)` and :code:`.get_segmentation(fov)` to access the data. The :code:`suffix` is used to filter out files that do not end with the specified suffix. When you use :code:`.ome.tiff` files, make sure to set the suffix to :code:`.ome.tiff`, otherwise the ViewerWidget won't be able to display the images.

.. code-block:: python

    dataset = MultiplexDataset(
        fov_paths=fov_paths,
        suffix=".tiff",  # or .png, .jpg, .jpeg, .tif or .ome.tiff
        include_channels=include_channels,
        segmentation_naming_convention=segmentation_naming_convention,
    )

**3: Load model and initialize Nimbus application**

The following code initializes the Nimbus application and loads the model checkpoint. The model was trained on a diverse set of tissues, protein markers, imaging platforms and cell types and doesn't need re-training. If you want to use the model on a machine without a GPU, set :code:`test_time_aug=False` to speed up inference. If you run it on a laptop GPU and run into out-of-memory errors, consider reducing the :code:`batch_size` to 1 and the :code:`input_shape` to :code:`[512,512]`.

.. code-block:: python

    # Initialize the Nimbus application
    nimbus = Nimbus(
        dataset=dataset,
        output_dir=nimbus_output_dir,
        save_predictions=True,
        batch_size=4,
        test_time_aug=True,
        input_shape=[1024,1024],
        device="auto",
    )

    # check if all inputs are valid
    nimbus.check_inputs()

**4: Prepare normalization dictionary**

The next step is to iterate through all the FOVs and calculate the 0.999 marker expression quantile for each marker individually. This quantile is used to normalize the marker expressions before the model predicts marker confidence scores. You can set :code:`n_subset` to estimate the quantiles on a small subset of the data, and you can set :code:`multiprocessing=True` to speed up computation.

.. code-block:: python

    nimbus.prepare_normalization_dict(
        quantile=0.999,
        n_subset=50,
        clip_values=(0, 2),
        multiprocessing=True,
        overwrite=True
    )

**5: Make predictions with the model**

Nimbus will iterate through your samples and store the predictions in the sub-directory :code:`nimbus_output`, together with a file named :code:`nimbus_cell_table.csv` that contains the mean-per-cell predicted marker confidence scores.

.. code-block:: python

    cell_table = nimbus.predict_fovs()

**6: View multiplexed channels and Nimbus predictions side-by-side**

Select an FOV and one marker image per channel to inspect the imaging data and the associated Nimbus predictions.

.. code-block:: python

    viewer = NimbusViewer(dataset=dataset, output_dir=nimbus_output_dir)
    viewer.display()
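Step 2 notes that you have to define your own :code:`segmentation_naming_convention` if your segmentation files do not follow the default layout. The following is a minimal sketch of how such a function could look, together with a check that covers every FOV instead of only the first one. The helper names :code:`make_suffix_naming_convention` and :code:`find_missing_segmentations` are hypothetical and not part of :code:`nimbus_inference`. The sketch also assumes that the FOV name is the last component of the FOV path, with any :code:`.ome.tiff`, :code:`.tiff` or :code:`.tif` extension removed.

```python
import os


def make_suffix_naming_convention(segmentation_dir, suffix="_whole_cell.tiff"):
    """Build a function that maps a FOV path to a segmentation file path.

    Hypothetical helper for illustration only; it assumes all label maps
    live in `segmentation_dir` and are named `<fov_name><suffix>`.
    """

    def segmentation_naming_convention(fov_path):
        # Assumption: the FOV name is the last path component
        fov_name = os.path.basename(os.path.normpath(fov_path))
        # Drop an image extension if the FOV is a single multi-channel file
        for ext in (".ome.tiff", ".tiff", ".tif"):
            if fov_name.endswith(ext):
                fov_name = fov_name[: -len(ext)]
                break
        return os.path.join(segmentation_dir, fov_name + suffix)

    return segmentation_naming_convention


def find_missing_segmentations(fov_paths, naming_convention):
    """Return the FOV paths whose segmentation file does not exist on disk."""
    return [
        fov_path
        for fov_path in fov_paths
        if not os.path.exists(naming_convention(fov_path))
    ]
```

A function built this way could be passed as :code:`segmentation_naming_convention` to :code:`MultiplexDataset`. Calling :code:`find_missing_segmentations(fov_paths, segmentation_naming_convention)` beforehand should return an empty list if every FOV has a matching label map.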