Kili Technology provides an interface for automatic transcription, also known as optical character recognition (OCR), to extract text from:
- an image document (for example, a scanned PDF document)
- a PDF document
To annotate an OCR job:
- Select the class.
- Draw a bounding box on the area to be transcribed.
You need to pre-process your image with OCR so that when you draw a bounding box, the text is automatically extracted. To do so, upload OCR metadata to the asset, using the
The metadata structure is similar to the one produced by Google APIs.
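As an illustration of what a Google-style structure could look like, here is a minimal sketch that builds a dictionary shaped like the `fullTextAnnotation` part of a Google Cloud Vision response (pages, blocks, paragraphs, words, symbols, with bounding boxes given as normalized vertices). The helper names `to_normalized_vertices` and `build_ocr_metadata` are hypothetical, and the exact set of fields that Kili expects is an assumption here, so check the result against the tutorial and import examples referenced on this page before uploading.

```python
import json


def to_normalized_vertices(box, page_width, page_height):
    """Convert a pixel box (x_min, y_min, x_max, y_max) into four corner
    vertices whose coordinates are scaled to the 0..1 range."""
    x_min, y_min, x_max, y_max = box
    corners = [(x_min, y_min), (x_max, y_min), (x_max, y_max), (x_min, y_max)]
    return [{"x": x / page_width, "y": y / page_height} for x, y in corners]


def build_ocr_metadata(words, page_width, page_height):
    """Build a simplified, Vision-style dictionary from (text, box) pairs.

    Assumption: all words are placed in a single block and paragraph, and
    only word-level bounding boxes are kept.
    """
    word_entries = []
    for text, box in words:
        word_entries.append({
            "boundingBox": {
                "normalizedVertices": to_normalized_vertices(
                    box, page_width, page_height
                )
            },
            # Vision-style output stores each character as a separate symbol.
            "symbols": [{"text": character} for character in text],
        })
    return {
        "fullTextAnnotation": {
            "pages": [{
                "width": page_width,
                "height": page_height,
                "blocks": [{
                    "blockType": "TEXT",
                    "paragraphs": [{"words": word_entries}],
                }],
            }],
            "text": " ".join(text for text, _ in words),
        }
    }


if __name__ == "__main__":
    # Hypothetical OCR output for a 1000 x 500 pixel scan.
    detected_words = [
        ("Invoice", (100, 50, 300, 100)),
        ("2024", (320, 50, 420, 100)),
    ]
    print(json.dumps(build_ocr_metadata(detected_words, 1000, 500), indent=2))
```

In practice, the word list would come from an OCR engine rather than being written by hand. Normalized coordinates are used in this sketch so that the boxes do not depend on the resolution at which the document is displayed.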
For an example, refer to the tutorial on creating OCR annotations using the Google Cloud Vision API.
For examples of how to import metadata for OCR, refer to Importing asset metadata.