Introduction to Optical Character Recognition Tool recipe

Use the Optical Character Recognition (OCR) Tool recipe to create a Python tool that automates text extraction from all the documents and images in a specified folder and prepares the data for further analysis.

The tool scans file directories on the machine where your Secure Agent is installed and applies optical character recognition to convert multimodal files into readable text. It uses Python libraries such as pytesseract, fitz (PyMuPDF), Image (from Pillow), and json to read the files in the folder.

The tool supports data extraction from the following file formats:

• CSV
• JPG
• JSON
• PDF
• PNG
• TXT
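
The following is a minimal sketch of how a folder scan over these formats might be organized. It is not the recipe's actual code. The names `HANDLERS`, `extract_folder`, and the `read_*` helpers are hypothetical, and the sketch assumes that pytesseract, Pillow, and PyMuPDF (imported as fitz) are installed, along with the Tesseract engine, whenever the folder contains images or PDFs. The PDF helper reads only the embedded text layer. A scanned PDF would likely need each page rendered to an image and passed through OCR.

```python
import csv
import json
from pathlib import Path


def read_txt(path: Path) -> str:
    # Plain text is returned as-is.
    return path.read_text(encoding="utf-8", errors="replace")


def read_csv(path: Path) -> str:
    # Flatten each CSV row into one comma-separated line of text.
    with path.open(newline="", encoding="utf-8", errors="replace") as handle:
        return "\n".join(", ".join(row) for row in csv.reader(handle))


def read_json(path: Path) -> str:
    # Parse and re-serialize so the output is consistently indented.
    with path.open(encoding="utf-8") as handle:
        return json.dumps(json.load(handle), indent=2)


def read_image(path: Path) -> str:
    # Imported lazily so that text-only folders work without the OCR stack.
    import pytesseract
    from PIL import Image

    with Image.open(path) as image:
        return pytesseract.image_to_string(image)


def read_pdf(path: Path) -> str:
    # Imported lazily for the same reason. Reads the embedded text layer only.
    import fitz  # PyMuPDF

    with fitz.open(str(path)) as document:
        return "\n".join(page.get_text() for page in document)


# Hypothetical mapping from file extension to extraction helper.
HANDLERS = {
    ".csv": read_csv,
    ".jpg": read_image,
    ".json": read_json,
    ".pdf": read_pdf,
    ".png": read_image,
    ".txt": read_txt,
}


def extract_folder(folder: str) -> dict:
    """Return a mapping of file name to extracted text for supported files."""
    results = {}
    for path in sorted(Path(folder).iterdir()):
        handler = HANDLERS.get(path.suffix.lower())
        if path.is_file() and handler is not None:
            results[path.name] = handler(path)
    return results
```

Calling `extract_folder` with the path of a folder returns a dictionary keyed by file name. Files with unsupported extensions are skipped. Keeping the format-specific logic in a lookup table means that support for another extension could be added with one new helper and one new entry.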

The tool enables AI agents to access and analyze files for use cases like Retrieval-Augmented Generation (RAG) and multimodal file analysis. Chatbots can also use the tool to query the files in a folder and retrieve information from them.