Using optical character recognition


Kili Technology provides an interface for automatic transcription, also known as optical character recognition (OCR), to extract text from:

  • an image document (for example, a scanned PDF document)
  • a PDF document

Interface requirement

OCR results are captured and stored in bounding box transcription subjobs.

Make sure your project ontology / JSON interface is properly configured:

  • Add a transcription subjob to your bounding box tool
  • Ensure the autofill option is set to true (enabled by default)

This is required for OCR to automatically populate and refresh transcriptions based on the underlying metadata.
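As a rough illustration of the requirement above, the sketch below builds a JSON interface in which a bounding box category has a transcription subjob, then checks that no rectangle category is missing one. The job names (`BBOX_JOB`, `TRANSCRIPTION_JOB`), the category name, and the helper `boxes_without_transcription` are hypothetical, and the key layout is an assumption that should be compared with your own project's interface. The autofill setting is deliberately left out because its exact key is not specified here.

```python
# Hypothetical job and category names; the key layout is an assumption and
# should be checked against the JSON interface of your own project.
json_interface = {
    "jobs": {
        "BBOX_JOB": {
            "mlTask": "OBJECT_DETECTION",
            "tools": ["rectangle"],
            "instruction": "Draw a box around each text field",
            "required": 1,
            "isChild": False,
            "content": {
                "input": "radio",
                "categories": {
                    "TEXT_FIELD": {
                        "name": "Text field",
                        # The transcription subjob attached to this category.
                        "children": ["TRANSCRIPTION_JOB"],
                    },
                },
            },
        },
        "TRANSCRIPTION_JOB": {
            "mlTask": "TRANSCRIPTION",
            "instruction": "Transcription",
            "required": 0,
            "isChild": True,
        },
    }
}


def boxes_without_transcription(interface):
    """Return (job, category) pairs for rectangle categories with no transcription subjob."""
    jobs = interface["jobs"]
    missing = []
    for job_name, job in jobs.items():
        if "rectangle" not in job.get("tools", []):
            continue  # only bounding box jobs matter here
        for category_name, category in job["content"]["categories"].items():
            children = category.get("children", [])
            has_transcription = any(
                jobs.get(child, {}).get("mlTask") == "TRANSCRIPTION"
                for child in children
            )
            if not has_transcription:
                missing.append((job_name, category_name))
    return missing


print(boxes_without_transcription(json_interface))  # prints []
```

An empty list means every bounding box category in the sketch has somewhere to store an OCR result.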

Prerequisite: Provide OCR Metadata

To enable OCR in Kili, your images must be pre-processed and enriched with OCR metadata.

When OCR metadata is available, drawing a bounding box will automatically extract the corresponding text.

How it works

  • OCR is not performed directly in the interface
  • Instead, it relies on pre-computed OCR results attached to each asset
  • These results must be uploaded as metadata using the jsonMetadata field

Metadata format

The expected metadata structure is similar to the one produced by Google Vision API.

  • Each detected text element is associated with coordinates in the image
  • This allows Kili to match bounding boxes with the correct transcription

You can refer to this tutorial for an example.
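To make the matching idea concrete, here is a minimal sketch that walks a Google Vision-style `fullTextAnnotation` (pages, blocks, paragraphs, words, symbols, each word carrying `boundingBox.vertices`) and collects the words whose center falls inside a drawn box. This is not Kili's actual matching logic, which is not described here; the center-point rule, the pixel coordinates, and the names `make_word`, `word_boxes`, and `transcription_for_box` are assumptions made for illustration.

```python
def make_word(text, x0, y0, x1, y1):
    """Build one Vision-style word entry (helper for the sample data only)."""
    return {
        "symbols": [{"text": character} for character in text],
        "boundingBox": {
            "vertices": [
                {"x": x0, "y": y0},
                {"x": x1, "y": y0},
                {"x": x1, "y": y1},
                {"x": x0, "y": y1},
            ]
        },
    }


def word_boxes(ocr_metadata):
    """Yield (text, x_min, y_min, x_max, y_max) for every word in the metadata."""
    for page in ocr_metadata["fullTextAnnotation"]["pages"]:
        for block in page["blocks"]:
            for paragraph in block["paragraphs"]:
                for word in paragraph["words"]:
                    text = "".join(symbol["text"] for symbol in word["symbols"])
                    vertices = word["boundingBox"]["vertices"]
                    # Some serializations omit coordinates equal to zero.
                    xs = [vertex.get("x", 0) for vertex in vertices]
                    ys = [vertex.get("y", 0) for vertex in vertices]
                    yield text, min(xs), min(ys), max(xs), max(ys)


def transcription_for_box(ocr_metadata, box):
    """Join, in metadata order, the words whose center lies inside box."""
    x0, y0, x1, y1 = box
    hits = []
    for text, wx0, wy0, wx1, wy1 in word_boxes(ocr_metadata):
        center_x, center_y = (wx0 + wx1) / 2, (wy0 + wy1) / 2
        if x0 <= center_x <= x1 and y0 <= center_y <= y1:
            hits.append(text)
    return " ".join(hits)


sample = {
    "fullTextAnnotation": {
        "pages": [{
            "blocks": [{
                "paragraphs": [{
                    "words": [
                        make_word("Invoice", 10, 10, 70, 30),
                        make_word("42", 80, 10, 100, 30),
                        make_word("Total", 10, 50, 60, 70),
                    ]
                }]
            }]
        }]
    }
}

print(transcription_for_box(sample, (0, 0, 110, 40)))   # prints Invoice 42
print(transcription_for_box(sample, (0, 40, 110, 80)))  # prints Total
```

Calling `transcription_for_box` again with new coordinates yields the text under the moved or resized box, which is the same idea as recomputing a transcription after a bounding box changes.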

Importing OCR metadata

When importing your assets:

  • Include OCR results in the jsonMetadata field
  • Ensure the format matches the expected structure

For more details, see Importing asset metadata.
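The import step might look like the sketch below, which pairs each asset with its pre-computed OCR result and passes both to a client in one call. It assumes the Kili Python SDK exposes `append_many_to_dataset` with `content_array`, `external_id_array`, and `json_metadata_array` parameters; verify this against the SDK reference for your version. The helpers `build_import_batch` and `upload`, the file path, and the project ID are hypothetical.

```python
import json


def build_import_batch(assets):
    """Turn [(path_or_url, external_id, ocr_dict), ...] into parallel lists."""
    content, external_ids, metadata = [], [], []
    for path, external_id, ocr in assets:
        # Fail early if an OCR result cannot be serialized as JSON.
        json.dumps(ocr)
        content.append(path)
        external_ids.append(external_id)
        metadata.append(ocr)
    return {
        "content_array": content,
        "external_id_array": external_ids,
        "json_metadata_array": metadata,
    }


def upload(kili, project_id, batch):
    """Send the batch through a Kili client object.

    Assumption: the client has an append_many_to_dataset method that accepts
    the keyword arguments built above.
    """
    return kili.append_many_to_dataset(project_id=project_id, **batch)


# Hypothetical asset: one scanned page and its Vision-style OCR result.
batch = build_import_batch([
    ("scans/invoice_001.png", "invoice_001", {"fullTextAnnotation": {"pages": []}}),
])
print(batch["external_id_array"])  # prints ['invoice_001']
```

Building the three lists in one loop keeps each asset aligned with its own metadata, so an OCR result cannot end up attached to the wrong image.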

Creating OCR Annotations

OCR annotations allow you to extract and edit text from images using bounding boxes with transcription.

To create an OCR annotation:

  1. Select the bounding box tool configured with OCR transcription
  2. Draw a bounding box around the text you want to capture
  3. If autofill is enabled, the transcription will be automatically populated
  4. You can manually edit the transcription if needed

Refreshing OCR Transcriptions

When working with OCR annotations, you may need to adjust bounding boxes after their initial creation (e.g. moving or resizing them). In such cases, the existing transcription may no longer match the updated area.

You can now refresh the OCR transcription to reflect the updated bounding box content.

How to refresh OCR

  1. Select one or multiple bounding boxes with OCR transcription
  2. Right-click on the selection
  3. Click "Refresh OCR"

The transcription will be recalculated based on the current position and size of each bounding box.

🚧 Refresh OCR is currently available for Image projects only. Support for PDF documents will be added soon.