thelper.tasks package¶
Task definition package.
This package contains classes and functions whose role is to define the input/output formats and operations expected from a trained model. This essentially defines the ‘goal’ of the model, and is used to specialize and automate its training.
Submodules¶
thelper.tasks.classif module¶
Classification task interface module.
This module contains a class that defines the objectives of models/trainers for classification tasks.
class thelper.tasks.classif.Classification(class_names: Iterable[AnyStr], input_key: collections.abc.Hashable, label_key: collections.abc.Hashable, meta_keys: Optional[Iterable[collections.abc.Hashable]] = None, multi_label: bool = False)[source]¶
Bases: thelper.tasks.utils.Task, thelper.ifaces.ClassNamesHandler
Interface for input labeling/classification tasks.
This specialization requests that when given an input tensor, the trained model should provide prediction scores for each predefined label (or class). The label names are used here to help categorize samples, and to assure that two tasks are only identical when their label counts and ordering match.
- Variables
class_names – list of label (class) names to predict (each name should be a string).
input_key – the key used to fetch input tensors from a sample dictionary.
label_key – the key used to fetch label (class) names/indices from a sample dictionary.
meta_keys – the list of extra keys provided by the data parser inside each sample.
__init__(class_names: Iterable[AnyStr], input_key: collections.abc.Hashable, label_key: collections.abc.Hashable, meta_keys: Optional[Iterable[collections.abc.Hashable]] = None, multi_label: bool = False)[source]¶
Receives and stores the class (or label) names to predict, the input tensor key, the groundtruth label (class) key, and the extra (meta) keys produced by the dataset parser(s).
The class names can be provided as a list of strings, or as a path to a json file that contains such a list. The list must contain at least two items. All other arguments are used as-is to index dictionaries, and must therefore be key-compatible types.
If multi_label is activated, samples with non-scalar class labels will be allowed in the get_class_sizes and get_class_sample_map functions.
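The sample-dictionary convention these keys index into can be sketched as follows; the key names ("image", "label", "path") and sample contents are illustrative, not part of the API:

```python
# Hypothetical sample dictionary following the convention described above;
# the key names are placeholders chosen for this sketch.
sample = {
    "image": [0.1, 0.2, 0.3],   # tensor fetched via input_key
    "label": "cat",             # groundtruth fetched via label_key
    "path": "img_0001.png",     # extra metadata listed in meta_keys
}

input_key, label_key, meta_keys = "image", "label", ["path"]
class_names = ["cat", "dog"]    # the class list must contain at least two items

assert sample[label_key] in class_names
```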
check_compat(task: thelper.tasks.utils.Task, exact: bool = False) → bool[source]¶
Returns whether the current task is compatible with the provided one or not.
This is useful for sanity-checking, and to see if the inputs/outputs of two models are compatible. If exact=True, all fields will be checked for exact (perfect) compatibility (in this case, matching meta keys and class name order).
get_class_sample_map(samples: Iterable, unset_key: collections.abc.Hashable = None) → Dict[AnyStr, List[int]][source]¶
Splits a list of samples based on their labels into a map of sample lists.
This function is useful if we need to split a dataset based on its label categories in order to sort it, augment it, or re-balance it. The samples do not need to be fully loaded for this to work, as only their label (gt) value will be queried. If a sample is missing its label, it will be ignored and left out of the generated dictionary unless a value is given for unset_key.
- Parameters
samples – the samples to split, where each sample is provided as a dictionary.
unset_key – a key under which all unlabeled samples should be kept (None = ignore).
- Returns
A dictionary that maps each class label to its corresponding list of samples.
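The splitting behavior described above can be sketched in plain Python; this is a minimal re-implementation of the described logic for illustration, not the framework's own code, and the "gt" key name is a placeholder:

```python
from collections import defaultdict

def split_by_label(samples, label_key, unset_key=None):
    # Mimics the behavior described for get_class_sample_map: maps each
    # label to the indices of the samples that carry it; unlabeled samples
    # are grouped under unset_key, or dropped when unset_key is None.
    out = defaultdict(list)
    for idx, sample in enumerate(samples):
        label = sample.get(label_key)
        if label is None:
            if unset_key is not None:
                out[unset_key].append(idx)
            continue
        out[label].append(idx)
    return dict(out)

samples = [{"gt": "cat"}, {"gt": "dog"}, {}, {"gt": "cat"}]
print(split_by_label(samples, "gt"))                     # {'cat': [0, 3], 'dog': [1]}
print(split_by_label(samples, "gt", unset_key="unset"))  # adds {'unset': [2]}
```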
get_class_sizes(samples: Iterable) → Dict[AnyStr, int][source]¶
Given a list of samples, returns a map of sample counts for each class label.
get_compat(task: thelper.tasks.utils.Task) → thelper.tasks.utils.Task[source]¶
Returns a task instance compatible with the current task and the given one.
supports_classification = True¶
thelper.tasks.detect module¶
Detection task interface module.
This module contains classes that define object detection utilities and task interfaces.
class thelper.tasks.detect.BoundingBox(class_id, bbox, include_margin=True, difficult=False, occluded=False, truncated=False, iscrowd=False, confidence=None, image_id=None, task=None)[source]¶
Bases: object
Interface used to hold instance metadata for object detection tasks.
Object detection trainers and display utilities in the framework will expect this interface to be used when parsing a predicted detection or an annotation. The base contents are based on the PASCALVOC metadata structure, and this class can be derived if necessary to contain more metadata.
- Variables
class_id – type identifier for the underlying object instance.
bbox – four-element tuple holding the (xmin,ymin,xmax,ymax) bounding box parameters.
include_margin – defines whether xmax/ymax is included in the bounding box area or not.
difficult – defines whether this instance is considered “difficult” (false by default).
occluded – defines whether this instance is considered “occluded” (false by default).
truncated – defines whether this instance is considered “truncated” (false by default).
iscrowd – defines whether this instance covers a “crowd” of objects or not (false by default).
confidence – scalar or array of prediction confidence values tied to class types (empty by default).
image_id – string used to identify the image containing this bounding box (i.e. file path or uuid).
task – reference to the task object that holds extra metadata regarding the content of the bbox (None by default).
__init__(class_id, bbox, include_margin=True, difficult=False, occluded=False, truncated=False, iscrowd=False, confidence=None, image_id=None, task=None)[source]¶
Receives and stores low-level input detection metadata for later access.
property area¶
Returns a scalar indicating the total surface of the annotation (may be None if unknown/unspecified).
property bbox¶
Returns the bounding box tuple \((x_{min}, y_{min}, x_{max}, y_{max})\).
property bottom¶
Returns the bottom bounding box edge origin offset value.
property bottom_right¶
Returns the bottom right bounding box corner coordinates \((x, y)\).
property centroid¶
Returns the bounding box centroid coordinates \((x, y)\).
property class_id¶
Returns the object class type identifier.
property confidence¶
Returns the confidence value (or array of confidence values) associated with the predicted class types.
static decode(vec, format=None)[source]¶
Returns a BoundingBox object from a vectorized representation in a specified format.
Note: the input bbox is expected to be a four-element array \((x_{min}, y_{min}, x_{max}, y_{max})\).
property difficult¶
Returns whether this bounding box is considered difficult by the dataset (false by default).
encode(format=None)[source]¶
Returns a vectorizable representation of this bounding box in a specified format.
WARNING: Encoding might cause information loss (e.g. the task reference is discarded).
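A minimal sketch of such a vectorized representation, assuming the default format is the four-element corner vector noted under decode; the function names and layout here are hypothetical, not the library's actual encode/decode implementation:

```python
def encode_bbox(class_id, bbox):
    # Illustrative vectorization: metadata beyond the corners and class id
    # (e.g. the task reference) is simply dropped, which is the kind of
    # information loss the warning above refers to.
    return [float(v) for v in bbox] + [float(class_id)]

def decode_bbox(vec):
    # Inverse of the sketch above: rebuilds the (bbox, class_id) pair from
    # a flat vector whose first four entries are (xmin, ymin, xmax, ymax).
    return tuple(vec[:4]), int(vec[4])

vec = encode_bbox(class_id=2, bbox=(10, 20, 30, 40))
bbox, class_id = decode_bbox(vec)
assert bbox == (10.0, 20.0, 30.0, 40.0) and class_id == 2
```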
property height¶
Returns the height of the bounding box.
property image_id¶
Returns the image string identifier.
property include_margin¶
Returns whether \(x_{max}\) and \(y_{max}\) are included in the bounding box area or not.
intersects(geom)[source]¶
Returns whether the bounding box intersects a geometry (i.e. a 2D point or another bbox).
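The intersection test described above amounts to a standard axis-aligned overlap check; the following is an illustrative re-derivation in plain Python, not the library's own code:

```python
def bbox_intersects(a, b):
    # Axis-aligned overlap test between two (xmin, ymin, xmax, ymax) boxes;
    # a 2D point (x, y) is treated as a degenerate zero-size box.
    if len(b) == 2:
        b = (b[0], b[1], b[0], b[1])
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

assert bbox_intersects((0, 0, 10, 10), (5, 5, 20, 20))      # overlapping boxes
assert bbox_intersects((0, 0, 10, 10), (3, 4))              # point inside
assert not bbox_intersects((0, 0, 10, 10), (11, 11, 12, 12))
```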
property iscrowd¶
Returns whether this instance covers a crowd of objects or not (false by default).
property left¶
Returns the left bounding box edge origin offset value.
property occluded¶
Returns whether this bounding box is considered occluded by the dataset (false by default).
property right¶
Returns the right bounding box edge origin offset value.
supports_detection = True¶
property task¶
Returns the reference to the task object that holds extra metadata regarding the content of the bbox.
tolist()[source]¶
Gets a list representation of the underlying bounding box tuple \((x_{min}, y_{min}, x_{max}, y_{max})\). This ensures that Tensor objects are converted to native Python types.
property top¶
Returns the top bounding box edge origin offset value.
property top_left¶
Returns the top left bounding box corner coordinates \((x, y)\).
totuple()[source]¶
Gets a tuple representation of the underlying bounding box tuple \((x_{min}, y_{min}, x_{max}, y_{max})\). This ensures that Tensor objects are converted to native Python types.
property truncated¶
Returns whether this bounding box is considered truncated by the dataset (false by default).
property width¶
Returns the width of the bounding box.
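The geometric properties above (width, height, area, centroid) all derive from the corner tuple; a sketch of the arithmetic, assuming include_margin adds one unit to each dimension in pixel-style coordinates (the library's exact margin handling may differ):

```python
def bbox_stats(bbox, include_margin=True):
    # Derives width/height/area/centroid from an (xmin, ymin, xmax, ymax)
    # tuple; when include_margin is set, the max edges count as part of
    # the box, adding one unit to each dimension.
    xmin, ymin, xmax, ymax = bbox
    pad = 1 if include_margin else 0
    width, height = xmax - xmin + pad, ymax - ymin + pad
    centroid = ((xmin + xmax) / 2.0, (ymin + ymax) / 2.0)
    return width, height, width * height, centroid

w, h, area, c = bbox_stats((0, 0, 9, 9), include_margin=True)
assert (w, h, area) == (10, 10, 100)
assert c == (4.5, 4.5)
```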
class thelper.tasks.detect.Detection(class_names, input_key, bboxes_key, meta_keys=None, input_shape=None, target_shape=None, target_min=None, target_max=None, background=None, color_map=None)[source]¶
Bases: thelper.tasks.regr.Regression, thelper.ifaces.ClassNamesHandler
Interface for object detection tasks.
This specialization requests that when given an input image, the trained model should provide a list of bounding box (bbox) proposals that correspond to probable objects detected in the image.
This specialized regression interface is currently used to help display functions.
- Variables
class_names – map of class name-value pairs for object types to detect.
input_key – the key used to fetch input tensors from a sample dictionary.
bboxes_key – the key used to fetch target (groundtruth) bboxes from a sample dictionary.
meta_keys – the list of extra keys provided by the data parser inside each sample.
input_shape – a numpy-compatible shape to expect input images to possess.
target_shape – a numpy-compatible shape to expect the predictions to be in.
target_min – a 2-dim tensor containing minimum (x,y) bounding box corner values.
target_max – a 2-dim tensor containing maximum (x,y) bounding box corner values.
background – value of the ‘background’ label (if any) used in the class map.
color_map – map of class name-color pairs to use when displaying results.
__init__(class_names, input_key, bboxes_key, meta_keys=None, input_shape=None, target_shape=None, target_min=None, target_max=None, background=None, color_map=None)[source]¶
Receives and stores the bbox types to detect, the input tensor key, the groundtruth bboxes list key, the extra (meta) keys produced by the dataset parser(s), and the color map used to color bboxes when displaying results.
The class names can be provided as a list of strings, as a path to a json file that contains such a list, or as a map of predefined name-value pairs to use in gt maps. This list/map must contain at least two elements (background and one class). All other arguments are used as-is to index dictionaries, and must therefore be key-compatible types.
property background¶
Returns the ‘background’ label value used in loss functions (can be None).
check_compat(task, exact=False)[source]¶
Returns whether the current task is compatible with the provided one or not.
This is useful for sanity-checking, and to see if the inputs/outputs of two models are compatible. If exact=True, all fields will be checked for exact (perfect) compatibility (in this case, matching meta keys and class maps).
property color_map¶
Returns the color map used to swap label indices for colors when displaying results.
get_class_sizes(samples, bbox_format=None)[source]¶
Given a list of samples, returns a map of element counts for each object type.
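The counting behavior described above can be sketched as follows; this mimics the described logic over plain dictionaries, and the "bboxes" key name and bbox representation are illustrative, not the framework's own:

```python
from collections import Counter

def count_objects(samples, bboxes_key="bboxes"):
    # Mimics the behavior described for get_class_sizes: counts how many
    # bounding boxes of each class id appear across all samples.
    counts = Counter()
    for sample in samples:
        for box in sample.get(bboxes_key, []):
            counts[box["class_id"]] += 1
    return dict(counts)

samples = [
    {"bboxes": [{"class_id": "car"}, {"class_id": "person"}]},
    {"bboxes": [{"class_id": "car"}]},
]
assert count_objects(samples) == {"car": 2, "person": 1}
```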
get_compat(task)[source]¶
Returns a task instance compatible with the current task and the given one.
supports_detection = True¶
thelper.tasks.regr module¶
Regression task interface module.
This module contains a class that defines the objectives of models/trainers for regression tasks.
class thelper.tasks.regr.Regression(input_key, target_key, meta_keys=None, input_shape=None, target_shape=None, target_type=None, target_min=None, target_max=None)[source]¶
Bases: thelper.tasks.utils.Task
Interface for n-dimension regression tasks.
This specialization requests that when given an input tensor, the trained model should provide an n-dimensional target prediction. This is a fairly generic task that (unlike image classification and semantic segmentation) is not linked to a pre-existing set of possible solutions. The task interface is used to carry useful metadata for this task, e.g. input/output shapes, types, and min/max values for rounding/saturation.
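How such metadata could be applied to model outputs can be sketched as follows; this is a hypothetical post-processing helper built from the attribute descriptions below, not code from the framework:

```python
def postprocess_prediction(pred, target_min=None, target_max=None, target_type=float):
    # Sketch of how regression task metadata could be used: each predicted
    # value is saturated to the [target_min, target_max] range and cast to
    # target_type. Names and behavior are illustrative.
    out = []
    for v in pred:
        if target_min is not None:
            v = max(v, target_min)
        if target_max is not None:
            v = min(v, target_max)
        out.append(target_type(v))
    return out

assert postprocess_prediction([-0.5, 0.3, 1.7], target_min=0.0, target_max=1.0) == [0.0, 0.3, 1.0]
```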
- Variables
input_shape – a numpy-compatible shape to expect model inputs to be in.
target_shape – a numpy-compatible shape to expect the predictions to be in.
target_type – a numpy-compatible type to cast the predictions to (if needed).
target_min – an n-dim tensor containing minimum target values (if applicable).
target_max – an n-dim tensor containing maximum target values (if applicable).
input_key – the key used to fetch input tensors from a sample dictionary.
target_key – the key used to fetch target (groundtruth) values from a sample dictionary.
meta_keys – the list of extra keys provided by the data parser inside each sample.
__init__(input_key, target_key, meta_keys=None, input_shape=None, target_shape=None, target_type=None, target_min=None, target_max=None)[source]¶
Receives and stores the keys produced by the dataset parser(s).
check_compat(task, exact=False)[source]¶
Returns whether the current task is compatible with the provided one or not.
This is useful for sanity-checking, and to see if the inputs/outputs of two models are compatible. If exact=True, all fields will be checked for exact (perfect) compatibility (in this case, matching meta keys).
get_compat(task)[source]¶
Returns a task instance compatible with the current task and the given one.
property input_shape¶
Returns the shape of the input tensors to be processed by the model.
supports_regression = True¶
property target_max¶
Returns the maximum target value(s) to be generated by the model.
property target_min¶
Returns the minimum target value(s) to be generated by the model.
property target_shape¶
Returns the shape of the output tensors to be generated by the model.
property target_type¶
Returns the type of the output tensors to be generated by the model.
class thelper.tasks.regr.SuperResolution(input_key, target_key, meta_keys=None, input_shape=None, target_type=None, target_min=None, target_max=None)[source]¶
Bases: thelper.tasks.regr.Regression
Interface for super-resolution tasks.
This specialization requests that when given an input tensor, the trained model should provide an identically-shaped target prediction that essentially contains more (or more adequate) high-frequency spatial components.
This specialized regression interface is currently used to help display functions.
- Variables
input_shape – a numpy-compatible shape to expect model inputs/outputs to be in.
target_type – a numpy-compatible type to cast the predictions to (if needed).
target_min – an n-dim tensor containing minimum target values (if applicable).
target_max – an n-dim tensor containing maximum target values (if applicable).
input_key – the key used to fetch input tensors from a sample dictionary.
target_key – the key used to fetch target (groundtruth) values from a sample dictionary.
meta_keys – the list of extra keys provided by the data parser inside each sample.
__init__(input_key, target_key, meta_keys=None, input_shape=None, target_type=None, target_min=None, target_max=None)[source]¶
Receives and stores the keys produced by the dataset parser(s).
supports_regression = True¶
thelper.tasks.segm module¶
Segmentation task interface module.
This module contains a class that defines the objectives of models/trainers for segmentation tasks.
class thelper.tasks.segm.Segmentation(class_names, input_key, label_map_key, meta_keys=None, dontcare=None, color_map=None)[source]¶
Bases: thelper.tasks.utils.Task, thelper.ifaces.ClassNamesHandler
Interface for pixel-level labeling/classification (segmentation) tasks.
This specialization requests that when given an input tensor, the trained model should provide prediction scores for each predefined label (or class), for each element of the input tensor. The label names are used here to help categorize samples, and to assure that two tasks are only identical when their label counts and ordering match.
- Variables
class_names – map of class name-value pairs to predict for each pixel.
input_key – the key used to fetch input tensors from a sample dictionary.
label_map_key – the key used to fetch label (class) maps from a sample dictionary.
meta_keys – the list of extra keys provided by the data parser inside each sample.
dontcare – value of the ‘dontcare’ label (if any) used in the class map.
color_map – map of class name-color pairs to use when displaying results.
__init__(class_names, input_key, label_map_key, meta_keys=None, dontcare=None, color_map=None)[source]¶
Receives and stores the class (or label) names to predict, the input tensor key, the groundtruth label (class) map key, the extra (meta) keys produced by the dataset parser(s), the ‘dontcare’ label value that might be present in gt maps (if any), and the color map used to swap label indices for colors when displaying results.
The class names can be provided as a list of strings, as a path to a json file that contains such a list, or as a map of predefined name-value pairs to use in gt maps. This list/map must contain at least two elements. All other arguments are used as-is to index dictionaries, and must therefore be key-compatible types.
check_compat(task, exact=False)[source]¶
Returns whether the current task is compatible with the provided one or not.
This is useful for sanity-checking, and to see if the inputs/outputs of two models are compatible. If exact=True, all fields will be checked for exact (perfect) compatibility (in this case, matching meta keys and class maps).
property color_map¶
Returns the color map used to swap label indices for colors when displaying results.
property dontcare¶
Returns the ‘dontcare’ label value used in loss functions (can be None).
get_class_sizes(samples)[source]¶
Given a list of samples, returns a map of element counts for each class label.
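For segmentation, the element counts described above are per-pixel; a sketch of that counting over nested-list label maps, with the "label_map" key name and the 255 dontcare value chosen for illustration only:

```python
from collections import Counter

def count_pixels(samples, label_map_key="label_map", dontcare=None):
    # Mimics the behavior described for get_class_sizes: counts how many
    # pixels carry each label value across all label maps, skipping the
    # 'dontcare' value when one is set.
    counts = Counter()
    for sample in samples:
        for row in sample[label_map_key]:
            for value in row:
                if value != dontcare:
                    counts[value] += 1
    return dict(counts)

samples = [{"label_map": [[0, 0, 1], [1, 255, 255]]}]
assert count_pixels(samples, dontcare=255) == {0: 2, 1: 2}
```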
get_compat(task)[source]¶
Returns a task instance compatible with the current task and the given one.
supports_segmentation = True¶
thelper.tasks.utils module¶
Task utility functions & base interface module.
This module contains utility functions used to instantiate tasks and check their compatibility, and the base interface used to define new tasks.
class thelper.tasks.utils.Task(input_key: collections.abc.Hashable, gt_key: collections.abc.Hashable = None, meta_keys: Optional[Iterable[collections.abc.Hashable]] = None)[source]¶
Bases: object
Basic task interface that defines a training objective and that holds sample i/o keys.
Since the framework’s data loaders expect samples to be passed in as dictionaries, keys are required to obtain the input that should be forwarded to a model, and to obtain the groundtruth required for the evaluation of model predictions. Other keys might also be kept by this interface for reference (these are considered meta keys).
Note that while this interface can be instantiated directly, trainers and models might not be provided enough information about their goal to be correctly instantiated. Thus, specialized task objects derived from this base class should be used if possible.
__init__(input_key: collections.abc.Hashable, gt_key: collections.abc.Hashable = None, meta_keys: Optional[Iterable[collections.abc.Hashable]] = None)[source]¶
Receives and stores the keys used to index dataset sample contents.
check_compat(task: thelper.tasks.utils.Task, exact: bool = False) → bool[source]¶
Returns whether the current task is compatible with the provided one or not.
This is useful for sanity-checking, and to see if the inputs/outputs of two models are compatible. It should be overridden in derived classes to specialize the compatibility verification. If exact=True, all fields will be checked for exact compatibility.
get_compat(task: thelper.tasks.utils.Task) → thelper.tasks.utils.Task[source]¶
Returns a task instance compatible with the current task and the given one.
property gt_key¶
Returns the key used to fetch groundtruth data tensors from a sample dictionary.
property input_key¶
Returns the key used to fetch input data tensors from a sample dictionary.
property keys¶
Returns a list of all keys used to carry tensors and metadata in samples.
property meta_keys¶
Returns the list of keys used to carry meta/auxiliary data in samples.
thelper.tasks.utils.create_global_task(tasks: Optional[Iterable[Task]]) → Optional[thelper.tasks.utils.Task][source]¶
Returns a new task object that is compatible with a list of subtasks.
When different datasets must be combined in a session, the tasks they define must also be merged. This function allows us to do so as long as the tasks all share a common objective. If creating a globally-compatible task is impossible, this function will raise an exception. Otherwise, the returned task object can be used to replace the subtasks of all used datasets.
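The merging logic described above can be sketched as a fold over pairwise compatibility checks; this assumes only the check_compat/get_compat interface documented for Task, and is not the function's actual implementation:

```python
def merge_tasks(tasks):
    # Folds a list of tasks into one globally-compatible task by chaining
    # pairwise get_compat calls, raising if any pair is incompatible.
    if not tasks:
        return None
    global_task = tasks[0]
    for task in tasks[1:]:
        if not global_task.check_compat(task):
            raise ValueError("cannot merge incompatible tasks")
        global_task = global_task.get_compat(task)
    return global_task
```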
thelper.tasks.utils.create_task(config: Union[Dict[AnyStr, Union[AnyStr, float, int, List[Any], Dict[Any, Any], ConfigDict]], AnyStr]) → Task[source]¶
Parses a configuration dictionary or repr string and instantiates a task from it.
If a string is provided, it will first be parsed to get the task type, and then the object will be instantiated by forwarding the parameters contained in the string to the constructor of that type. Note that it is important for this function to work that the constructor argument names match the names of parameters printed in the task’s __repr__ function.
If a dict is provided, it should contain a ‘type’ and a ‘params’ field with the values required for direct instantiation.
If a Task instance was specified, it is directly returned.
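A hypothetical configuration following the dict form described above ('type' plus 'params'); the parameter values are illustrative, not taken from a real session:

```python
# Sketch of a create_task configuration dict; the 'params' entries mirror
# the Classification constructor arguments, with placeholder key names.
task_config = {
    "type": "thelper.tasks.classif.Classification",
    "params": {
        "class_names": ["cat", "dog"],
        "input_key": "image",
        "label_key": "label",
        "meta_keys": ["path"],
    },
}
# task = thelper.tasks.utils.create_task(task_config)  # requires thelper
assert {"type", "params"} <= task_config.keys()
```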