Image Segmentation
This document explains how the PlanktoScope software's segmenter program processes raw images (captured by the PlanktoScope's sample-imaging functionality) in order to detect objects (such as plankton, microplastics, and other particles) and to extract each object into its own segmented image for downstream use, such as for uploading to EcoTaxa. This document also lists and explains the metadata fields added by the PlanktoScope segmenter for uploading to EcoTaxa.
Currently, the segmenter only operates in batch-processing mode: the segmenter takes as input a complete raw image dataset, and it produces as output a complete segmented object dataset as well as an export archive of segmented objects which can be uploaded to EcoTaxa.
When the segmenter starts, it performs a median-calculation step on the first ten images of the dataset of raw images. The median-calculation step outputs a median image which is then used as an input for an image-correction step on each raw image; the median image will occasionally be recalculated (conditions triggering a recalculation are described below). Each image-correction step outputs a median-corrected image which is then used as the input for a mask-calculation step. Each mask-calculation step outputs a segmentation mask which is then used as an input for an object-extraction step.
For each raw image from the input dataset, after the object-extraction step outputs a set of objects, the number of extracted objects is accumulated into a cumulative moving average of the number of objects extracted per raw image. However, before the cumulative moving average is updated, the number of extracted objects is compared against the previous value of the cumulative moving average (calculated after the previous raw image was processed): if the number of extracted objects is greater than the previous value of the cumulative moving average by more than 20, then the median image will be recalculated for the next raw image. The input for the next median-calculation step will usually be the next 10 consecutive raw images, unless the next raw image is one of the last 10 raw images, in which case the previous ten images will instead be used as the input for the next median-calculation step. Yes, this logic is complicated, and yes, for some reason we don't center the sequence of raw images around the next raw image as our input to the median-calculation step.
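The recalculation rule above can be sketched in a few lines of Python. This is only an illustration of the described logic, not the segmenter's code: the function names are hypothetical, the starting value of the moving average (0) is an assumption, and "the previous ten images" is interpreted here as the ten images immediately before the next raw image.

```python
def median_window(next_index, total_images, size=10):
    """Indices of the raw images used as input for a median calculation."""
    if next_index >= total_images - size:
        # The next raw image is one of the last `size` images:
        # use the previous images instead (assumed interpretation).
        start = max(next_index - size, 0)
        return list(range(start, next_index))
    return list(range(next_index, next_index + size))


def recalculation_points(object_counts, jump=20):
    """Indices of raw images for which the median image is recalculated.

    object_counts[i] is the number of objects extracted from raw image i.
    """
    average = 0.0  # assumed starting value of the cumulative moving average
    recalculate_for = []
    for index, count in enumerate(object_counts):
        # Compare against the average from before this image is included.
        if count > average + jump and index + 1 < len(object_counts):
            recalculate_for.append(index + 1)
        # Only then update the cumulative moving average.
        average += (count - average) / (index + 1)
    return recalculate_for
```

For example, with object counts of 5, 5, 5, 40, 5, the fourth image exceeds the running average of 5 by more than 20, so the median would be recalculated for the fifth image.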
Median-calculation step
The median-calculation step takes as input a sequence of consecutive raw images, but if the image sequence consists of an even number of images then the last image is excluded from the calculation. The median-calculation step uses the raw images to calculate a median image, in which the color of each pixel of the output is calculated as the median of the colors of the corresponding pixels in the input images.
The output of this step is supposed to be an estimate of what the "background" of the image would be if there were no objects within the field of view. However, this step is not robust to sample density: if a sample is dense enough that certain pixel locations overlap with objects in more than half of any consecutive sequence of ten images, the color of the "background" in those pixel locations will be estimated as the color of an object in one of those images.
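A minimal NumPy sketch of this step is shown below, assuming that the median is taken independently for each color channel (the text above does not specify this) and that all images have the same shape. The function name is hypothetical.

```python
import numpy as np


def calculate_median(images):
    """Per-pixel, per-channel median of a sequence of equally sized images."""
    images = list(images)
    if len(images) % 2 == 0:
        images = images[:-1]  # exclude the last image so the count is odd
    stack = np.stack(images, axis=0)  # shape: (count, height, width, channels)
    # With an odd count, the median is always one of the input values.
    return np.median(stack, axis=0).astype(stack.dtype)
```

The sketch also makes the density caveat easy to reproduce: if a pixel is covered by an object in five of the nine images actually used, the median at that pixel is the object's color rather than the background's.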
Image-correction step
The image-correction step takes as input a median image and a raw image. First, the image-correction step divides the color of each pixel of the raw image by the color of the corresponding pixel of the median image; this is probably intended to correct for inhomogeneous illumination in the raw image, and to remove any objects which had been stuck to the flow cell (and thus were included in the median image) from the raw image. Next, the image-correction step slightly rescales the intensity range of the resulting image (TODO: determine what the effect of this intensity-rescaling operation is: does it make the image brighter or dimmer? Does it increase or decrease the contrast? Does it clip the white value? Why is this step performed???). The final result is a median-corrected image.
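The division part of this step can be sketched as follows. The function name is hypothetical, the guard against division by zero is an assumption, and the intensity-rescaling operation is deliberately left out because its behavior is not established above.

```python
import numpy as np


def divide_by_median(raw, median):
    """Divide each raw pixel by the corresponding median pixel.

    Returns floating-point ratios: 1.0 where the raw pixel matches the
    estimated background, and values below 1.0 where it is darker.
    """
    raw = raw.astype(np.float64)
    # Assumed safeguard: avoid dividing by zero in fully black median pixels.
    median = np.maximum(median.astype(np.float64), 1.0)
    return raw / median
```

With this sketch, a background that is unevenly lit (for example 100 on one side and 200 on the other) maps to a uniform 1.0, while a pixel that is half as bright as its background maps to 0.5 regardless of the local illumination.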
Mask-calculation step
The mask-calculation step takes as input a median-corrected image and the result from the previous mask-calculation step. It consists of the following operations:

"Simple threshold": this operation applies a global threshold to the input corrected image, using the triangle algorithm to calculate an optimal threshold value for the image; the output is a mask in which each pixel is set to 0 if the corresponding pixel of the input image is greater than the threshold, and to 255 otherwise. The resulting mask should select for objects which appear darker than the background of the image.

"Remove previous mask": this operation combines the result of the previous mask-calculation step with the mask created by the previous "simple threshold" operation, by subtracting the intersection of the two masks from the mask created by the previous "simple threshold" operation. This operation is probably intended to remove objects which had been stuck to the PlanktoScope's flow cell during imaging and thus might appear in many consecutive input corrected images. However, this operation is not robust in dense samples where two different objects might appear in overlapping locations across two consecutive raw images.

"Erode": this operation erodes the mask with a 2-pixel-by-2-pixel square kernel. In the resulting mask, small regions (such as thresholded noise) are eliminated.

"Dilate": this operation dilates the mask with an 8-pixel-diameter circular kernel. In the resulting mask, regions remaining after the previous "erode" operation are padded with a margin.

"Close": this operation dilates and then erodes the mask with an 8-pixel-diameter circular kernel. In the resulting mask, small holes in regions remaining after the previous "dilate" operation are eliminated.

"Erode2": this operation erodes the mask with an 8-pixel-diameter circular kernel, inverting the effect of the previous "dilate" operation.
The final result of these operations is a spatially-filtered segmentation mask where the value of each pixel represents whether that pixel is part of an object or part of the background of the input corrected image.
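The sequence of operations above can be sketched with NumPy and SciPy as shown below. Several things here are assumptions made for illustration: the library choice, the function names, the use of boolean masks instead of 0/255 values, the exact shape of the 8-pixel circular kernel, and a single-channel input image. The threshold is passed in as a parameter rather than computed, so any triangle-algorithm implementation could supply it.

```python
import numpy as np
from scipy import ndimage


def circular_kernel(diameter):
    """Boolean disc-shaped kernel inside a diameter-by-diameter square."""
    center = (diameter - 1) / 2
    rows, cols = np.mgrid[:diameter, :diameter]
    return (rows - center) ** 2 + (cols - center) ** 2 <= (diameter / 2) ** 2


def calculate_mask(corrected_gray, threshold, previous_mask=None):
    """Apply the six mask-calculation operations to a grayscale image."""
    # "Simple threshold": keep pixels that are not brighter than the threshold.
    mask = corrected_gray <= threshold
    # "Remove previous mask": subtract the intersection with the previous mask.
    if previous_mask is not None:
        mask = mask & ~(mask & previous_mask)
    square = np.ones((2, 2), dtype=bool)
    disc = circular_kernel(8)
    mask = ndimage.binary_erosion(mask, structure=square)  # "Erode"
    mask = ndimage.binary_dilation(mask, structure=disc)   # "Dilate"
    mask = ndimage.binary_closing(mask, structure=disc)    # "Close"
    mask = ndimage.binary_erosion(mask, structure=disc)    # "Erode2"
    return mask
```

In this sketch, a single dark noise pixel disappears at the "erode" operation, and a dark region that was already present in the previous mask is removed before any morphological filtering takes place.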
Object-extraction step
The object-extraction step takes the following inputs:

A median-corrected image

A segmentation mask

The following sample metadata fields:

acq_minimum_mesh
: the diameter of the smallest spherical object which is expected to be in the sample, usually 20 µm. This value is set on the "Fluidic Acquisition" page of the PlanktoScope's Node-RED dashboard as the "Min fraction size".
process_pixel
: the pixel size calibration of the PlanktoScope, in units of µm per pixel; then the area (in units of µm^2) per pixel is process_pixel * process_pixel. This value is set on the "Hardware Settings" page of the PlanktoScope's Node-RED dashboard as the "Pixel size calibration: um per pixel".
First, the object-extraction step calculates a minimum-area threshold for objects to extract using the input segmentation mask: the threshold (in units of pixel^2) is calculated as (2 * acq_minimum_mesh / process_pixel) ^ 2.
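As a worked example of this formula, the snippet below evaluates the threshold for the usual 20 µm mesh and a pixel size of 0.75 µm per pixel. The pixel size is a hypothetical value chosen only for illustration, and the function name is not taken from the segmenter.

```python
def minimum_area_threshold(acq_minimum_mesh, process_pixel):
    """Minimum bounding-box area (in pixel^2) for an object to be extracted.

    acq_minimum_mesh is in µm; process_pixel is in µm per pixel.
    """
    return (2 * acq_minimum_mesh / process_pixel) ** 2


# (2 * 20 / 0.75) ^ 2 = (160 / 3) ^ 2, which is about 2844.44 pixel^2
threshold = minimum_area_threshold(acq_minimum_mesh=20, process_pixel=0.75)
```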
Next, the object-extraction step identifies all connected regions of the input segmentation mask and measures properties of those regions. The object-extraction step then discards any region whose bounding-box area (area_bbox in scikit-image) is less than the minimum-area threshold.
Metadata calculation
For each resulting region after the minimum-area threshold is applied, that region will be used to extract a segmented and cropped image of the object (including pixels in any holes in the object) from the input median-corrected image. This cropped image is used to calculate some metadata fields about the distribution of colors in the object's segmented image:

MeanHue
: the mean of the hue channel of the image in a hue-saturation-value (HSV) representation of the image
StdHue
: the standard deviation of the hue channel of the image in an HSV representation of the image
MeanSaturation
: the mean of the saturation channel of the image in an HSV representation of the image
StdSaturation
: the standard deviation of the saturation channel of the image in an HSV representation of the image
MeanValue
: the mean of the value channel of the image in an HSV representation of the image
StdValue
: the standard deviation of the value channel of the image in an HSV representation of the image
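These six fields can be illustrated with a small sketch that uses only NumPy and Python's standard colorsys module. The function name is hypothetical, and the sketch makes assumptions that the text above does not settle: every channel is scaled to the range 0 to 1, the statistics are taken over all pixels of the supplied RGB crop, and the population standard deviation is used.

```python
import colorsys

import numpy as np


def hsv_statistics(rgb_image):
    """Mean and standard deviation of each HSV channel of an 8-bit RGB image."""
    pixels = rgb_image.reshape(-1, 3) / 255.0
    hsv = np.array([colorsys.rgb_to_hsv(r, g, b) for r, g, b in pixels])
    stats = {}
    for channel, name in enumerate(("Hue", "Saturation", "Value")):
        stats[f"Mean{name}"] = float(hsv[:, channel].mean())
        stats[f"Std{name}"] = float(hsv[:, channel].std())
    return stats
```

The pixel-by-pixel conversion is slow for large crops; it is used here only to keep the example free of extra dependencies.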
Additionally, some metadata for the object is calculated from the region properties calculated by scikit-image for that object's region:

label
: The identifier of the object's region, as assigned by scikit-image. This corresponds to the label region property in scikit-image.
Basic area properties:

area_exc
: Number of pixels in the region (excluding pixels in any holes). This corresponds to the area region property in scikit-image.
area
: Number of pixels of the region with all holes filled in (i.e. including pixels in any holes). This corresponds to the area_filled region property in scikit-image. Yes, it's somewhat confusing that the PlanktoScope segmenter renames scikit-image's area region property to area_exc and renames scikit-image's area_filled region property to area.
%area
: Ratio between the number of pixels in any holes in the region and the total number of pixels of the region with all holes filled in; calculated as 1 - area_exc / area. In other words, this represents the proportion of the region which consists of holes. Yes, %area is a misleading name both because of the % in the name and because of the area in the name.
Equivalent-circle properties:

equivalent_diameter
: The diameter (in pixels) of a circle with the same number of pixels in its area as the number of pixels in the region (excluding pixels in any holes). This corresponds to the equivalent_diameter_area property in scikit-image.
Equivalent-ellipse properties:

eccentricity
: Eccentricity of the ellipse that has the same second moments as the region; eccentricity is the ratio of the focal distance (distance between focal points) over the major axis length. The value is in the interval [0, 1), where a value of 0 represents a circle. This corresponds to the eccentricity property in scikit-image.
major
: The length (in pixels) of the major axis of the region's equivalent ellipse. This corresponds to the axis_major_length property in scikit-image.
minor
: The length (in pixels) of the minor axis of the region's equivalent ellipse. This corresponds to the axis_minor_length property in scikit-image.
elongation
: The ratio between major and minor.
angle
: Angle (in degrees) between the x-axis of the input median-corrected image and the major axis of the region's equivalent ellipse. Values range from 0 deg to 180 deg counter-clockwise. This is calculated from the orientation property in scikit-image.
Equivalent-object perimeter properties:

perim.
: Perimeter (in pixels) of an object which approximates the region's contour as a line through the centers of border pixels using a 4-connectivity. This corresponds to the perimeter property in scikit-image.
perimareaexc
: Ratio between the perimeter and the number of pixels in the region (excluding pixels in any holes). Calculated as perim. / area_exc.
perimmajor
: Ratio between the perimeter and the length of the major axis of the region's equivalent ellipse. Calculated as perim. / major.
circ.
: The roundness of the region's equivalent object, including pixels in any holes. Calculated as 4 * π * area / (perim. * perim.). Ranges from 1 for a perfect circle to 0 for highly non-circular shapes.
circex
: The roundness of the region's equivalent object, excluding pixels in any holes. Calculated as 4 * π * area_exc / (perim. * perim.). Ranges from 1 for a perfect circle to 0 for highly non-circular shapes or shapes with many large holes.
Bounding box (the smallest rectangle which includes all pixels of the region, under the constraint that the edges of the box are parallel to the x- and y-axes of the input median-corrected image) properties:

bx
: x-position (in pixels) of the top-left corner of the region's bounding box, relative to the top-left corner of the input median-corrected image. This corresponds to the second element of the bbox property in scikit-image.
by
: y-position (in pixels) of the top-left corner of the region's bounding box, relative to the top-left corner of the input median-corrected image. This corresponds to the first element of the bbox property in scikit-image.
width
: Width (in number of pixels) of the region's bounding box. This is calculated from the elements of the bbox property in scikit-image.
height
: Height (in number of pixels) of the region's bounding box. This is calculated from the elements of the bbox property in scikit-image.
bounding_box_area
: Number of pixels in the region's bounding box; equivalent to width * height. This corresponds to the area_bbox region property in scikit-image.
extent
: Ratio between the number of pixels in the region (excluding pixels in any holes) and the number of pixels in the region's bounding box; equivalent to area_exc / bounding_box_area. This corresponds to the extent region property in scikit-image.
Convex hull (the smallest convex polygon which encloses the region) properties:

convex_area
: Number of pixels in the convex hull of the region. This corresponds to the area_convex region property in scikit-image.
solidity
: Ratio between the number of pixels in the region (excluding pixels in any holes) and the number of pixels in the convex hull of the region. Equivalent to area_exc / convex_area. This corresponds to the solidity region property in scikit-image.
Unweighted centroid properties:

x
: x-position (in pixels) of the centroid of the object, relative to the top-left corner of the input median-corrected image. This corresponds to the second element of the centroid region property in scikit-image.
y
: y-position (in pixels) of the centroid of the object, relative to the top-left corner of the input median-corrected image. This corresponds to the first element of the centroid region property in scikit-image.
local_centroid_col
: x-position (in pixels) of the centroid of the object, relative to the top-left corner of the region's bounding box; equivalent to x - bx. This corresponds to the second element of the centroid_local region property in scikit-image.
local_centroid_row
: y-position (in pixels) of the centroid of the object, relative to the top-left corner of the region's bounding box; equivalent to y - by. This corresponds to the first element of the centroid_local region property in scikit-image.
Topological properties:

euler_number
: The Euler characteristic of the set of non-zero pixels. Computed as the number of connected components minus the number of holes (with 2-connectivity). This corresponds to the euler_number property in scikit-image.
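Many of the fields listed above are simple arithmetic on a handful of base measurements. The sketch below collects those derived fields in one place, following the formulas given in the list. The function name and argument layout are hypothetical; the bbox tuple follows scikit-image's (min_row, min_col, max_row, max_col) order and the centroid follows its (row, column) order. Two details are assumptions: elongation is taken as major divided by minor, and width and height are taken as differences of the bbox elements.

```python
import math


def derived_metrics(area_exc, area, perim, major, minor, bbox, centroid):
    """Compute the derived metadata fields from base region measurements."""
    min_row, min_col, max_row, max_col = bbox
    width = max_col - min_col    # assumed calculation from bbox elements
    height = max_row - min_row   # assumed calculation from bbox elements
    centroid_row, centroid_col = centroid
    return {
        "%area": 1 - area_exc / area,  # proportion of the region that is holes
        "elongation": major / minor,   # assumed direction of the ratio
        "perimareaexc": perim / area_exc,
        "perimmajor": perim / major,
        "circ.": 4 * math.pi * area / (perim * perim),
        "circex": 4 * math.pi * area_exc / (perim * perim),
        "bx": min_col,
        "by": min_row,
        "width": width,
        "height": height,
        "bounding_box_area": width * height,
        "extent": area_exc / (width * height),
        "x": centroid_col,
        "y": centroid_row,
        "local_centroid_col": centroid_col - min_col,  # x - bx
        "local_centroid_row": centroid_row - min_row,  # y - by
    }
```

Note how the bbox and centroid elements swap order between scikit-image's (row, column) convention and the (x, y) naming of the output fields.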
Output image cropping
Finally, a segmented and cropped image of the object (including pixels in any holes in the object) is saved from the input median-corrected image, but with the crop expanded by up to 10 pixels in each direction (TODO: check whether this description is accurate: the corresponding code is extremely unreadable).
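One plausible reading of "up to 10 pixels in each direction" is that the bounding box is padded by 10 pixels and then clamped to the image boundaries. The sketch below implements only that reading; given the TODO above, it should not be taken as a description of what the segmenter actually does. The function name is hypothetical, and the bbox uses scikit-image's (min_row, min_col, max_row, max_col) order.

```python
def expanded_crop_bounds(bbox, image_shape, margin=10):
    """Pad a bounding box by `margin` pixels, clamped to the image edges."""
    min_row, min_col, max_row, max_col = bbox
    height, width = image_shape[:2]
    return (
        max(min_row - margin, 0),
        max(min_col - margin, 0),
        min(max_row + margin, height),
        min(max_col + margin, width),
    )
```

Under this reading, an object near the top-left corner of the image gets less than 10 pixels of padding on its top and left sides.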
Thus, the output of the object-extraction step is a set of objects, each with a corresponding cropped image saved to file and with a corresponding list of metadata values.