Importing labels
You can directly import existing labels so annotators can start working on pre-annotated assets. This reduces their workload: instead of starting from scratch, they only need to validate the pre-annotations, correct a few of them, and complete the annotation.
Refer to the example recipe.
For details on the required structure of uploaded objects, refer to Kili data format.
You can also use this feature to run quality checks. For example, you can upload ground truth labels to review the annotators' work (refer to Honeypot overview).
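As a minimal sketch, uploading a ground-truth label with Kili's Python SDK could look like the snippet below. The project ID, asset external ID, and job/category names are placeholders, and the exact create_honeypot parameters may differ across SDK versions:

```python
from kili.client import Kili

kili = Kili(api_key="YOUR_API_KEY")  # placeholder credentials

# Hypothetical single-class classification job; adapt the job and
# category names to your project's interface.
ground_truth = {
    "CLASSIFICATION_JOB": {
        "categories": [{"name": "POSITIVE"}]
    }
}

# Register this label as the reference (honeypot) label for the asset,
# so annotators' answers on the same asset can be scored against it.
kili.create_honeypot(
    json_response=ground_truth,
    asset_external_id="document-42",   # placeholder external ID
    project_id="YOUR_PROJECT_ID",
)
```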
Imported labels can be:
- Predictions from a custom model
- Predictions from a weakly-supervised learning framework
- Human-labeled data from previous projects or other sources
Predictions from a custom model
If you have a custom, in-house model that already labels your assets and you have run inference on your dataset, you can upload its predictions to your project.
If you have multiple models, you can still "tag" your predictions with the source model: simply fill in the `modelName` field in the GraphQL API. You'll then be able to filter by model when working with assets and labels.
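As a sketch, the upload could look like the snippet below using Kili's Python SDK. The project ID, asset external ID, job name, and model name are placeholders, and the create_predictions parameter names (for example, model_name versus model_name_array) can vary between SDK versions:

```python
from kili.client import Kili

kili = Kili(api_key="YOUR_API_KEY")  # placeholder credentials

# Hypothetical prediction from an in-house NER model, expressed as a
# Kili jsonResponse for a named-entity recognition job.
prediction = {
    "NAMED_ENTITIES_RECOGNITION_JOB": {
        "annotations": [
            {
                "categories": [{"name": "ORGANIZATION"}],
                "content": "Kili Technology",
                "beginOffset": 0,
            }
        ]
    }
}

kili.create_predictions(
    project_id="YOUR_PROJECT_ID",
    external_id_array=["document-1"],   # assets receiving the predictions
    json_response_array=[prediction],   # one jsonResponse per asset
    model_name="in-house-ner-v2",       # lets you filter assets and labels by model later
)
```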
Predictions from a weakly-supervised learning framework
Weak supervision is the ability to combine weak predictors to build a more robust prediction.
Here are some examples of weak predictions:
- Hard-coded heuristics: usually regular expressions (regexes)
- Syntactic patterns: for example, spaCy dependency trees
- Distant supervision: external knowledge bases
- Noisy manual labels: crowdsourcing
- External models: other models with useful signals
The maturity of weakly-supervised learning depends on the complexity of your task. Our experience shows that it can be extremely powerful on text annotation, classification, and NER tasks.
We typically work with Snorkel, a framework created at Stanford.
With Snorkel, after defining your own pre-annotation functions, you can upload your predictions to Kili.
To learn more about weak supervision, refer to http://ai.stanford.edu/blog/weak-supervision.
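For illustration, here is a minimal Snorkel sketch that combines three weak labeling functions into denoised predictions; the spam-detection task, column name, and heuristics are purely illustrative:

```python
import pandas as pd
from snorkel.labeling import labeling_function, PandasLFApplier
from snorkel.labeling.model import LabelModel

ABSTAIN, HAM, SPAM = -1, 0, 1

@labeling_function()
def lf_contains_link(x):
    # Hard-coded heuristic: messages with URLs are often spam.
    return SPAM if "http" in x.text.lower() else ABSTAIN

@labeling_function()
def lf_mentions_free(x):
    # Keyword rule acting as another weak predictor.
    return SPAM if "free" in x.text.lower() else ABSTAIN

@labeling_function()
def lf_short_message(x):
    # Very short messages tend to be legitimate in this toy example.
    return HAM if len(x.text.split()) < 5 else ABSTAIN

df_train = pd.DataFrame({"text": [
    "Claim your free prize at http://spam.example now",
    "See you at noon",
    "Free trial, click http://offer.example",
    "Meeting moved to Friday morning",
]})

# Apply the labeling functions, then combine their noisy votes.
applier = PandasLFApplier(lfs=[lf_contains_link, lf_mentions_free, lf_short_message])
L_train = applier.apply(df=df_train)

label_model = LabelModel(cardinality=2, verbose=False)
label_model.fit(L_train=L_train, n_epochs=200, seed=123)
preds = label_model.predict(L=L_train)  # denoised labels, ready to convert into Kili jsonResponses
```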
Human-labeled data
There are many reasons why you may need to review or re-annotate human-labeled data.
Examples include:
- Reviewing or re-annotating a dataset annotated outside of Kili
- Performing a quality check on pre-annotated datasets
- Labeling human-generated logs from a chatbot framework
In such cases, the import process does not change: you can still upload your predictions, assets, and existing labels into Kili.
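As a sketch, importing an existing human label with Kili's Python SDK could look like the snippet below. It assumes an append_labels entry point and placeholder identifiers; check the method and parameter names available in your SDK version:

```python
from kili.client import Kili

kili = Kili(api_key="YOUR_API_KEY")  # placeholder credentials

# A previously collected human label, expressed as a Kili jsonResponse
# for a hypothetical single-class classification job.
existing_label = {
    "CLASSIFICATION_JOB": {
        "categories": [{"name": "NEGATIVE"}]
    }
}

# Attach the existing label to an already uploaded asset so reviewers
# can validate or correct it instead of annotating from scratch.
kili.append_labels(
    project_id="YOUR_PROJECT_ID",
    asset_external_id_array=["ticket-007"],   # placeholder asset external ID
    json_response_array=[existing_label],
    label_type="DEFAULT",                     # imported as a regular label, not a prediction
)
```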
How annotations are displayed based on status
The `labelVersion` field in the annotations within the `jsonResponse` of a label determines how the annotation is displayed in the labeling interface:
- labelVersion = "prediction": The annotation is shown with a dashed outline, indicating it is a prediction.
- labelVersion = "default": The annotation is shown with a solid outline, indicating it is a manually created or updated annotation.
If `labelVersion` is not defined at the annotation level, the interface checks the `labelType` of the label:
- labelType = "PREDICTION": The annotation is displayed as dashed to indicate it is a prediction.
- labelType = "DEFAULT", "AUTOSAVE", or "REVIEW": The annotation is displayed as solid, indicating it is a created or updated annotation.
Note: When an annotation is modified, its `labelVersion` is automatically updated to `"default"` to reflect the modification.
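As an illustration, a jsonResponse mixing both display modes could look like the sketch below (written as a Python dict; the object-detection job name, categories, and coordinates are placeholders):

```python
json_response = {
    "OBJECT_DETECTION_JOB": {
        "annotations": [
            {
                # Dashed outline in the interface: still a model prediction.
                "labelVersion": "prediction",
                "categories": [{"name": "CAR"}],
                "type": "rectangle",
                "boundingPoly": [{"normalizedVertices": [
                    {"x": 0.10, "y": 0.20}, {"x": 0.10, "y": 0.40},
                    {"x": 0.30, "y": 0.40}, {"x": 0.30, "y": 0.20},
                ]}],
            },
            {
                # Solid outline in the interface: created or edited by a person.
                "labelVersion": "default",
                "categories": [{"name": "PEDESTRIAN"}],
                "type": "rectangle",
                "boundingPoly": [{"normalizedVertices": [
                    {"x": 0.55, "y": 0.60}, {"x": 0.55, "y": 0.80},
                    {"x": 0.65, "y": 0.80}, {"x": 0.65, "y": 0.60},
                ]}],
            },
        ]
    }
}
```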
Learn more
For an end-to-end example of how to programmatically import model-based pre-annotations to a Kili project using Kili's Python SDK, refer to our tutorial on importing labels.