Description
Provides the methods required for OCR activities. We recommend using this activity inside an OCR activity.
Properties
Analyst
- Display Name: Optional friendly name used for debugging, validation, exception handling, and activity tracking.
- Description: Optional text for documentation purposes.
Common
- Disable Log: Disables the logging functionality for this activity.
- Disable Protocol: Disables the protocol functionality for this activity.
Input
- Input Image: The bitmap image to process. Set this only if you are not using the Tesseract engine inside an OCR activity.
Options
- Extract Words: Select this option to extract individual words. The output is provided in the Scraped Words property.
- Language: The language model used to execute the activity. You can add other language models to the activities/tessdata folder.
- Page segmentation mode: By default, Tesseract expects a page of text when it segments an image. If you only want to run OCR on a small region, try a different segmentation mode.
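The Language and Page segmentation mode options can be illustrated with the Tesseract command-line tool. The following is a minimal sketch, not the activity's actual implementation: it assumes that language models are stored as `.traineddata` files in the tessdata folder and that the activity's modes correspond to Tesseract's standard `--psm` values. The helper names `available_languages` and `build_command` are hypothetical.

```python
from pathlib import Path
from typing import List

# A subset of the standard Tesseract page segmentation modes.
PSM_AUTO = 3          # fully automatic page segmentation (Tesseract default)
PSM_SINGLE_BLOCK = 6  # a single uniform block of text
PSM_SINGLE_LINE = 7   # a single text line
PSM_SINGLE_WORD = 8   # a single word


def available_languages(tessdata_dir: str) -> List[str]:
    """List language codes found as <code>.traineddata files in a folder."""
    folder = Path(tessdata_dir)
    if not folder.is_dir():
        return []
    return sorted(p.stem for p in folder.glob("*.traineddata"))


def build_command(image_path: str, lang: str = "eng", psm: int = PSM_AUTO) -> List[str]:
    """Build a tesseract CLI call that prints the recognized text to stdout."""
    return ["tesseract", image_path, "stdout", "-l", lang, "--psm", str(psm)]
```

For a cropped region that contains one line of German text, `build_command("crop.png", lang="deu", psm=PSM_SINGLE_LINE)` yields an argument list that can be passed to `subprocess.run`, provided the tesseract executable is installed.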
Output
- Result: The recognized words, returned as a tuple object.
- Scraped Text: The text recognized during execution, returned as a string.
- Scraped Words: The list of extracted words, returned as an ICollection.
Preprocessing
- Auto rotate: When you extract text from scanned documents, it can help to correct the orientation of the document. Select this option to do so automatically.
- Automatic profile: Settings that work for one application often work for similar applications. These settings can be stored in profiles and reused.
- Denoise: Select this option to apply a filter that reduces noise in noisy images.
- Invert: Select this option to invert the pixel values. White pixels become black and vice versa.
- Line thickness: Used by the Remove frames property. This value determines how thick the covering white line is.
- Remove frames: Select this option to remove frames around the text. The activity detects frames and covers them with a white line. The thickness of this line is set in the Line thickness property.
- Rotate: Select this option to rotate the input image by the angle set in the Rotation angle property.
- Rotation angle: The angle by which the image is rotated when the Rotate property is selected.
- Scale: Enlarges or reduces the size of the image.
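To make the effect of some of these preprocessing steps concrete, here is a minimal sketch of Invert, a fixed 90-degree rotation, and Scale on a tiny grayscale image. It is an illustration under stated assumptions, not the activity's implementation: the image is modeled as a nested list of 8-bit pixel values, scaling uses nearest-neighbor sampling, and the function names are hypothetical.

```python
from typing import List

# Rows of 8-bit grayscale pixels: 0 is black, 255 is white.
Image = List[List[int]]


def invert(img: Image) -> Image:
    """Swap dark and light: white pixels become black and vice versa."""
    return [[255 - pixel for pixel in row] for row in img]


def rotate_90_clockwise(img: Image) -> Image:
    """Rotate by 90 degrees clockwise: reverse the rows, then transpose."""
    return [list(column) for column in zip(*img[::-1])]


def scale_nearest(img: Image, factor: float) -> Image:
    """Enlarge (factor > 1) or reduce (factor < 1) by nearest-neighbor sampling."""
    if factor <= 0:
        raise ValueError("factor must be positive")
    src_h, src_w = len(img), len(img[0])
    dst_h = max(1, round(src_h * factor))
    dst_w = max(1, round(src_w * factor))
    return [
        [
            img[min(src_h - 1, int(y / factor))][min(src_w - 1, int(x / factor))]
            for x in range(dst_w)
        ]
        for y in range(dst_h)
    ]
```

Small text often benefits from enlargement before recognition, which is why a scale step typically runs before the OCR call rather than after it.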