Tools for Efficient Annotation of Complex Bioimage Data
Many biological questions can only be answered through visualisation. Seeing is believing; however, seeing something biologically interesting usually also requires image processing to turn a qualitative observation into quantitative information. The first step in quantifying anything about an image is to annotate it. Annotation is the act of labelling an image or parts of an image. This can be as simple as noting that a cell is present in the image, or as complex as specifying the exact pixels that make up the mitochondria within the image. Annotating an image or a 3D volume can be very difficult and time-consuming, yet it is a necessary step towards automating the process, since training data is required to teach an AI model how to perform the task. Because producing training data for complex 3D imaging data is manual and slow, very little of it is currently available.
As an added complexity, training models for image analysis often requires an iterative approach and considerable experimentation to steer the training process towards a useful model. We develop tools to assist biologists in both the creation of annotations and the training of machine learning models. In particular, the open-source tool SuRVoS2 (Pennington, A., et al., 2022) is designed for interactive, iterative workflows that allow experts to develop complex annotations and train deep learning segmentation models.
SuRVoS
Once a dataset has been reconstructed into a 3D volume from the 2D images of microscopy samples, the next step is to annotate and segment the data to extract meaning from it. SuRVoS2 is a collection of tools that accelerates annotation and segmentation in large volumetric bio-imaging workflows. It supports both shallow and deep machine learning approaches, using a suite of image processing filters, supervoxels (boundary-adherent groupings of similar, adjacent voxels), and annotation hierarchies. SuRVoS2 also provides tools for visualising and interacting with large numbers of distributed annotations (e.g. those performed by multiple members of a group or by citizen scientists). The code for SuRVoS2 is open source and publicly available.
Using Deep Learning for Analysing New Imaging Modalities
Over the past decade, deep learning methods have become widespread in volumetric image analysis. Deep learning methods use Graphics Processing Units (GPUs) to train large neural network models with many layers. Many different neural network designs, or architectures, have been proposed and different architectures have strengths and weaknesses for certain types of biological samples and for certain imaging modalities.
Particularly important for research at the Franklin is the development of image analysis methods that can adapt to new approaches to microscopy, such as recent work on cryogenic plasma FIB/SEM volume imaging of mesoscale cellular structures (Dumoux, M., et al., 2023). This work showed that deep learning methods are effective on images produced using this new approach which allows imaging of large volumes of vitrified, frozen-hydrated samples. This work utilises tools from ongoing projects focused on deep transfer learning, including SuRVoS2 and volume-segmantics (King, O.N., et al., 2022).
Volume-segmantics is an automatic segmentation tool that is highly effective on biological volumetric data. It works by generating several predictions from a single raw dataset, producing multiple outputs, each with a slightly different prediction of the class each voxel belongs to (e.g. inside of a cell vs outside of a cell). The final, essential stage is to combine these different predictions into a single segmentation that best represents all of them together. One common approach is a maximum voting strategy, in which the class with the most “votes” at each voxel wins.
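A maximum voting strategy can be sketched in a few lines of NumPy (the arrays and class labels below are invented for illustration, not taken from volume-segmantics):

```python
import numpy as np

# Three hypothetical predictions for the same 2x2x2 volume; each voxel is
# assigned class 0 (outside cell) or 1 (inside cell).
preds = np.stack([
    np.array([[[0, 1], [1, 1]], [[0, 0], [1, 1]]]),
    np.array([[[0, 1], [0, 1]], [[0, 1], [1, 1]]]),
    np.array([[[1, 1], [1, 1]], [[0, 0], [0, 1]]]),
])

n_classes = 2
# Count "votes" per class at every voxel.
votes = np.stack([(preds == c).sum(axis=0) for c in range(n_classes)])

# Maximum voting: the class with the most votes wins at each voxel.
consensus = votes.argmax(axis=0)
print(consensus)
```

Note that with an even number of predictions ties become possible; `argmax` breaks them in favour of the lower class index, so a real pipeline would need an explicit tie-breaking policy.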
There are many alternative approaches to combining predictions. LeopardGecko, an open-source software tool developed at the Franklin, provides these options. It includes statistical analysis options that can be either visually guided or automated with an additional neural network. We have also implemented a new metric (the consistency score; Figure 2) and visualisation strategies to better understand the quality of the outputs when multiple classes are present during prediction.
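The precise definition of the consistency score lives in the LeopardGecko code; as a hedged stand-in, the sketch below computes a simple per-voxel agreement fraction (the share of predictions matching the majority class), which captures the same intuition for binary labels:

```python
import numpy as np

# Hypothetical stack of binary predictions for one 2x2 slice (three models).
preds = np.array([
    [[0, 1], [1, 1]],
    [[0, 1], [0, 1]],
    [[1, 1], [1, 1]],
])

# Majority class per voxel (valid for binary labels with an odd voter count).
majority = np.round(preds.mean(axis=0))

# Agreement: the fraction of predictions matching the majority at each voxel.
agreement = (preds == majority).mean(axis=0)

# A volume-level summary: 1.0 means all predictions agree everywhere.
print(agreement.mean())
```

Per-voxel agreement maps like this can also be visualised directly, highlighting regions where the ensemble is uncertain and a human should look more closely.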
Instance Segmentation and Object-based Crowdsourced Annotation
Instance segmentation is a specialised image analysis approach, under research at the Franklin, that detects and delineates individual objects in images rather than only labelling voxels by class. Tools to support object detection for volumetric image analysis are currently limited, so we have extended SuRVoS2 to better support these tasks. Plugins created to support object-based workflows include advanced connected component analysis with component filtering, as well as mask rasterisation and segmentation cleaning. Detected point sets can be visualised in the interface, and point-based annotations can be created and used to generate mask annotations for training instance segmentation models. In addition, we have developed crowdsourcing approaches for producing annotations for object-based analysis of bioimages, and have integrated tools for cleaning crowdsourced data into SuRVoS2. These approaches and tools have been used to analyse viral infection and mitochondrial distributions in human cells.
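Connected component analysis with size-based filtering, one of the plugin capabilities mentioned above, can be sketched with SciPy (the mask and threshold are invented for illustration; the SuRVoS2 plugins go beyond this minimal version):

```python
import numpy as np
from scipy import ndimage

# Hypothetical binary segmentation mask containing two blobs.
mask = np.zeros((1, 10, 10), dtype=bool)
mask[0, 1:4, 1:4] = True   # 9-voxel component (a plausible object)
mask[0, 7, 7] = True       # 1-voxel component (likely noise)

# Label connected components, then measure the size of each.
labels, n_found = ndimage.label(mask)
sizes = ndimage.sum(mask, labels, index=range(1, n_found + 1))

# Component filtering: keep only components above an illustrative threshold.
keep = [i + 1 for i, size in enumerate(sizes) if size >= 5]
filtered = np.isin(labels, keep)
print(n_found, int(filtered.sum()))
```

Filtering out sub-threshold components is a cheap way to clean both model outputs and crowdsourced annotations before they are used for training.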
References
Dumoux, M., Glen, T., Smith, J.L., Ho, E.M., Perdigão, L.M., Pennington, A., Klumpe, S., Yee, N.B., Farmer, D.A., Lai, P.Y., & Bowles, W. (2023). Cryo-plasma FIB/SEM volume imaging of biological specimens. eLife, 12, e83623.
King, O.N., Bellos, D., & Basham, M. (2022). Volume Segmantics: A Python Package for Semantic Segmentation of Volumetric Data Using Pre-trained PyTorch Deep Learning Models. Journal of Open Source Software, 7(78), 4691.
Pennington, A., King, O. N., Tun, W. M., Ho, E. M., Luengo, I., Darrow, M. C., & Basham, M. (2022). SuRVoS 2: Accelerating Annotation and Segmentation for Large Volumetric Bioimage Workflows Across Modalities and Scales. Frontiers in Cell and Developmental Biology, 10.
Tun, W. M., Poologasundarampillai, G., Bischof, H., Nye, G., King, O. N. F., Basham, M., … & Chernyavsky, I. L. (2021). A massively multi-scale approach to characterizing tissue architecture by synchrotron micro-CT applied to the human placenta. Journal of The Royal Society Interface, 18(179), 20210140.
Project Leadership
Mark Basham
Michele Darrow
Project Members at the Franklin
Avery Pennington
Dolapo Adebo
Sam Kersley
Collaborating Institutes
University of Southampton
University of Manchester
University of Nottingham
Funded by
Wellcome Trust (previously)
Wellcome Leap – in utero
BBSRC