Definitions
Annotate/label
The action of creating metadata that characterizes the data so that it can be used to train machine learning models. For example, identifying a house on a satellite image, or assigning a named-entity type to a piece of text in a document.
Annotation
The metadata produced by the labeling work on assets. For example, a box delimiting the house on the satellite image, or a selected piece of text together with its named-entity type.
Asset
A file or document, such as a photograph, a satellite image, a video, a PDF, or an email.
Consensus
A quality parameter to measure the agreement between several annotations of the same asset, made by different labelers. It helps keep annotators consistent with one another, which in turn improves the data quality of your project.
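As an illustration only, agreement between labelers can be sketched for the simplest case, where each labeler assigns a single category to the asset. The function name `consensus_score` and the pairwise exact-match formula are assumptions made for this example; this glossary does not specify how the consensus value is actually computed.

```python
from itertools import combinations


def consensus_score(categories):
    """Fraction of labeler pairs that chose the same category for one asset.

    Hypothetical helper: pairwise exact-match agreement is an assumed
    formula, not one taken from this glossary.
    """
    pairs = list(combinations(categories, 2))
    if not pairs:
        # Zero or one annotation: there is nothing to disagree with.
        return 1.0
    agreeing = sum(1 for first, second in pairs if first == second)
    return agreeing / len(pairs)


# Three labelers annotate the same asset; two of them agree.
score = consensus_score(["house", "house", "road"])
```

With three labelers there are three pairs, and only one of them agrees here, so the score is one third. Geometric annotations such as boxes would need an overlap measure instead of exact matching.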
Honeypot
A quality parameter to measure the agreement between a pre-annotated asset (a gold standard) and the annotation made by a labeler.
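For bounding-box annotations, one common way to compare a labeler's box with a gold-standard box is intersection over union (IoU). The sketch below assumes that metric and a hypothetical `(x_min, y_min, x_max, y_max)` box layout; the glossary does not say which agreement measure the honeypot uses.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes.

    Assumed metric for illustration; not stated in this glossary.
    """
    # Corners of the overlapping rectangle, if there is one.
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])

    # Width and height are clamped to zero when the boxes do not overlap.
    intersection = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


gold_box = (0, 0, 2, 2)      # pre-annotated gold standard
labeler_box = (1, 1, 3, 3)   # box drawn by the labeler
honeypot_agreement = iou(gold_box, labeler_box)
```

A value of 1.0 means the labeler reproduced the gold-standard box exactly, and 0.0 means the boxes do not overlap at all.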
Instructions
Instructions are guidance available in the annotation interface to help labelers complete their tasks. They can be defined at the project and job levels.
Interface
The graphical user interface configured at the beginning of a project and made available to users so that they can perform the annotation task.
Label
The annotation or combination of all annotations created on an asset. For example, all houses identified on the satellite image, or all annotated fragments of text in a document.
Number of annotations
The total number of annotations (there can be multiple annotations on a given asset).
Number of labeled assets
The total number of assets that have been labeled.
Number of hours
The total time users spent annotating and reviewing.
Organization
- An organization contains users who can create projects and easily collaborate.
- The number of labeled assets at the organization level is the sum of labeled assets over all members from projects belonging to the organization. A project belongs to an organization when the author is a member of the organization.
- Similarly, the number of hours at the organization level is the sum of labeling hours over all members from projects belonging to the organization.
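The organization-level totals described above can be sketched as a sum over projects whose author is a member of the organization. The record layout (`author`, `member_stats`, `labeled_assets`, `hours`) is hypothetical, and the sketch assumes that every project member's work counts toward the total once the project belongs to the organization.

```python
def organization_totals(organization_members, projects):
    """Sum labeled assets and hours over projects that belong to the organization.

    A project belongs to the organization when its author is a member.
    All field names are hypothetical and chosen for this example.
    """
    members = set(organization_members)
    total_assets = 0
    total_hours = 0.0
    for project in projects:
        if project["author"] not in members:
            continue  # project does not belong to this organization
        for stats in project["member_stats"].values():
            total_assets += stats["labeled_assets"]
            total_hours += stats["hours"]
    return total_assets, total_hours


projects = [
    {
        "author": "alice",
        "member_stats": {
            "alice": {"labeled_assets": 10, "hours": 1.5},
            "bob": {"labeled_assets": 5, "hours": 2.0},
        },
    },
    {
        # Authored by someone outside the organization, so it is skipped.
        "author": "mallory",
        "member_stats": {"mallory": {"labeled_assets": 7, "hours": 3.0}},
    },
]
totals = organization_totals(["alice", "bob"], projects)
```

Only the first project contributes, so the totals are 15 labeled assets and 3.5 hours.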
Project
A project is the combination of:
- A dataset (collection of assets to be annotated).
- An interface adapted to the annotation task we want to perform on the dataset.
- Members with different roles (e.g., labelers and reviewers).
- Settings for the quality management workflow.
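The four components listed above can be pictured as a simple record. This is only a sketch to make the structure concrete: the class and field names are hypothetical and do not describe the product's actual data model.

```python
from dataclasses import dataclass, field


@dataclass
class Project:
    """Illustrative container for the four parts of a project (hypothetical names)."""

    dataset: list                  # assets to be annotated
    interface: dict                # configuration of the annotation interface
    members: dict                  # user name -> role, e.g. "labeler" or "reviewer"
    quality_settings: dict = field(default_factory=dict)  # quality workflow settings


example_project = Project(
    dataset=["image_001.png", "image_002.png"],
    interface={"task": "object detection", "categories": ["house"]},
    members={"alice": "labeler", "bob": "reviewer"},
    quality_settings={"consensus_enabled": True},
)
```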
Project user
A project user is the intersection of a project and a user, that is, one user considered within one project.
- The number of labeled assets of a project user is the number of assets containing a review or default label whose author is the user.
- The total work duration of a project user is the sum of the time spent on each review or default label made in the project by the user.
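The two project-user figures above can be sketched as a filter over the labels of one project. The label fields (`asset_id`, `author`, `type`, `seconds`) and the type values `"DEFAULT"` and `"REVIEW"` are assumptions made for this example.

```python
COUNTED_TYPES = {"DEFAULT", "REVIEW"}  # assumed names for default and review labels


def project_user_metrics(labels, user):
    """Return (labeled assets, work duration in seconds) for one user in one project.

    Only review or default labels authored by the user are counted.
    Field names are hypothetical.
    """
    counted = [
        label
        for label in labels
        if label["author"] == user and label["type"] in COUNTED_TYPES
    ]
    # An asset counts once, however many qualifying labels the user made on it.
    labeled_assets = len({label["asset_id"] for label in counted})
    work_seconds = sum(label["seconds"] for label in counted)
    return labeled_assets, work_seconds


labels = [
    {"asset_id": "a1", "author": "alice", "type": "DEFAULT", "seconds": 60},
    {"asset_id": "a1", "author": "alice", "type": "REVIEW", "seconds": 30},
    {"asset_id": "a2", "author": "alice", "type": "PREDICTION", "seconds": 0},
    {"asset_id": "a2", "author": "bob", "type": "DEFAULT", "seconds": 45},
]
alice_metrics = project_user_metrics(labels, "alice")
```

Here the user "alice" has one labeled asset (both of her counted labels are on `a1`) and 90 seconds of work, because the label of another type on `a2` is ignored.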