Honeypot (or gold standard) is a tool for auditing the work of labelers by measuring the accuracy of their annotations.
Honeypot works by interspersing assets that have a defined ground truth label in the annotation queue. This way, you can measure the level of agreement between your ground truth and the annotations made by labelers.
Assets used as Honeypot are intelligently distributed and sent for annotation to all project labelers.
For details on how to set assets as Honeypot, refer to How to use honeypot in your project.
Honeypot computation rules
Honeypot is computed under two conditions:
- The asset that the label was added to is marked as a Honeypot (programmatically, its isHoneypot property is set).
- A label of type Review exists for this asset. This label will be used as the ground truth.
If more than one label of type Review exists, the last one will be used as the ground truth.
If these two conditions are met, then random members of the project will see this asset in their annotation queue.
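The two conditions above can be sketched in a few lines of Python. This is a minimal illustration, not the Kili implementation: the Asset and Label classes, the created_at field, and the ground_truth helper are hypothetical names, and "last label" is assumed to mean the most recently created Review label.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Label:
    label_type: str  # "Review" or "Default", as named in this page
    created_at: int  # hypothetical sortable timestamp
    content: dict = field(default_factory=dict)


@dataclass
class Asset:
    is_honeypot: bool  # stands in for the isHoneypot property
    labels: List[Label] = field(default_factory=list)


def ground_truth(asset: Asset) -> Optional[Label]:
    """Return the label used as gold standard, or None if honeypot does not apply."""
    # Condition 1: the asset must be marked as a Honeypot.
    if not asset.is_honeypot:
        return None
    # Condition 2: at least one label of type Review must exist.
    reviews = [label for label in asset.labels if label.label_type == "Review"]
    if not reviews:
        return None
    # Assumption: "last" means the most recently created Review label.
    return max(reviews, key=lambda label: label.created_at)


example_asset = Asset(
    is_honeypot=True,
    labels=[
        Label("Review", created_at=1, content={"category": "cat"}),
        Label("Default", created_at=2, content={"category": "dog"}),
        Label("Review", created_at=3, content={"category": "dog"}),
    ],
)
chosen = ground_truth(example_asset)
```

Here, chosen is the Review label created at time 3. An asset that is not marked as a Honeypot, or that has no Review label, yields None.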
When a labeler submits a label, the Honeypot metrics are updated.
- At the label level, the Kili app compares the current label with the gold standard. You can access this metric from the project Queue page. When you expand a labeled asset, you will see its labels with their honeypot scores.
- At the asset level, the Kili app queries the latest submitted label (type = Default) for each labeler, compares it with the gold standard, and then computes the mean of all these honeypot scores. This metric is currently not accessible from the Kili app UI. To access this metric, refer to the Python SDK reference.
- At the project member level, the Kili app queries all the assets for which a specific project member submitted labels (type = Default) and then computes the mean over all the honeypot scores of these assets. To access this metric, go to Analytics page > Labelers > "List of labelers" table.
- At the project level, the Kili app queries all the assets in a specific project and then computes the mean over the honeypot scores of all the assets. To access this metric, go to Analytics page > Progress > "Quality" table.
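The aggregation levels listed above can be illustrated with a short Python sketch. It assumes each label already carries a honeypot score (the label-level comparison itself is not reproduced here). The Label class, its field names, and the three helper functions are hypothetical. The member-level function follows the wording above literally: it averages the asset-level scores of the assets the member labeled, which is one possible reading.

```python
from dataclasses import dataclass
from statistics import mean
from typing import List, Optional


@dataclass
class Label:
    asset_id: str
    author: str
    label_type: str  # "Default" or "Review"
    created_at: int  # hypothetical sortable timestamp
    honeypot_score: Optional[float] = None  # label-level score, if computed


def asset_score(labels: List[Label], asset_id: str) -> Optional[float]:
    """Mean score of the latest Default label submitted by each labeler."""
    latest = {}
    for label in labels:
        if label.asset_id != asset_id or label.label_type != "Default":
            continue
        if label.honeypot_score is None:
            continue
        previous = latest.get(label.author)
        if previous is None or label.created_at > previous.created_at:
            latest[label.author] = label
    if not latest:
        return None
    return mean(label.honeypot_score for label in latest.values())


def _mean_of_asset_scores(labels: List[Label], asset_ids) -> Optional[float]:
    scores = [asset_score(labels, asset_id) for asset_id in asset_ids]
    scores = [score for score in scores if score is not None]
    return mean(scores) if scores else None


def member_score(labels: List[Label], author: str) -> Optional[float]:
    """Mean asset-level score over assets the member labeled (type = Default)."""
    asset_ids = {
        label.asset_id
        for label in labels
        if label.author == author and label.label_type == "Default"
    }
    return _mean_of_asset_scores(labels, asset_ids)


def project_score(labels: List[Label]) -> Optional[float]:
    """Mean asset-level score over all assets that have a score."""
    return _mean_of_asset_scores(labels, {label.asset_id for label in labels})


labels = [
    Label("a1", "alice", "Default", 1, 0.5),
    Label("a1", "alice", "Default", 2, 1.0),  # alice's latest label on a1
    Label("a1", "bob", "Default", 1, 0.5),
    Label("a2", "bob", "Default", 1, 0.25),
]
a1_score = asset_score(labels, "a1")      # mean(1.0, 0.5)
bob_score = member_score(labels, "bob")   # mean over assets a1 and a2
overall = project_score(labels)
```

With this sample data, the asset a1 scores 0.75, because only Alice's latest label counts for her. Bob's member-level score and the project-level score are both 0.5.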
You can decide whether a specific labeling job is taken into account when calculating honeypot by using the isIgnoredForMetricsComputations setting. For details, refer to Customizing the interface through json settings.
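As a rough illustration, the setting could appear in a project's JSON settings as shown below, built here as a Python dictionary. The overall layout (a "jobs" mapping keyed by job name), the job names, and the jobs_counted_for_metrics helper are assumptions made for this sketch. Only the isIgnoredForMetricsComputations key comes from this page, so check the JSON settings documentation for the exact structure.

```python
import json

# Hypothetical interface settings: two jobs, one excluded from quality metrics.
settings = {
    "jobs": {
        "CLASSIFICATION_JOB": {"isIgnoredForMetricsComputations": False},
        "FREE_TEXT_JOB": {"isIgnoredForMetricsComputations": True},
    }
}


def jobs_counted_for_metrics(interface: dict) -> list:
    """List the jobs that would count toward honeypot.

    Assumption: a job with no flag is counted.
    """
    return [
        name
        for name, job in interface["jobs"].items()
        if not job.get("isIgnoredForMetricsComputations", False)
    ]


counted = jobs_counted_for_metrics(settings)
settings_json = json.dumps(settings, indent=2)  # JSON text form of the settings
```

In this sketch, only CLASSIFICATION_JOB would contribute to the honeypot score, which is useful when a job such as a free-text comment field has no meaningful ground truth.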
For calculation details, refer to Calculation rules for quality metrics.