OCR

Examples

Download the demo resource

wget https://raw.githubusercontent.com/open-mmlab/mmocr/main/demo/demo_kie.jpeg

Use the tool directly (without agent)

from agentlego.apis import load_tool

# load tool
tool = load_tool('OCR', device='cuda', lang='en', x_ths=3., line_group_tolerance=30)

# apply tool
res = tool('demo_kie.jpeg')

For bilingual Chinese and English OCR, set lang to ['en', 'ch_sim']. The EasyOCR documentation lists all supported language codes.
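As a minimal sketch of the bilingual setup, the snippet below wraps the same `load_tool` call with a language list. The names `BILINGUAL_LANGS` and `load_bilingual_ocr` are hypothetical helpers introduced for illustration; only `load_tool`, `'OCR'`, `device` and `lang` come from the example above.

```python
# Language codes for combined English and Simplified Chinese recognition,
# as given in the text above.
BILINGUAL_LANGS = ['en', 'ch_sim']


def load_bilingual_ocr(device='cuda'):
    """Load the OCR tool for English + Simplified Chinese (hypothetical helper)."""
    # Imported lazily so this module can be imported without AgentLego installed.
    from agentlego.apis import load_tool

    return load_tool('OCR', device=device, lang=BILINGUAL_LANGS)
```

Calling `load_bilingual_ocr()` should give a tool that is used the same way as before, for example `res = tool('demo_kie.jpeg')`.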

With Lagent

from lagent import ReAct, GPTAPI, ActionExecutor
from agentlego.apis import load_tool

# load the tool and build the agent
# set the `OPENAI_API_KEY` environment variable before running.
tool = load_tool('OCR', device='cuda').to_lagent()
agent = ReAct(GPTAPI(temperature=0.), action_executor=ActionExecutor([tool]))

# run the agent with the tool.
ret = agent.chat('Here is a receipt image `demo_kie.jpeg`, please tell me the total cost.')
for step in ret.inner_steps[1:]:
    print('------')
    print(step['content'])
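Because the agent relies on `OPENAI_API_KEY` being set, it can help to fail early with a clear message when the variable is missing. The following is a small sketch of such a check; `require_openai_key` is a hypothetical helper, not part of Lagent or AgentLego.

```python
import os


def require_openai_key(env=os.environ):
    """Return the OpenAI API key, or raise if it is unset or empty."""
    key = env.get('OPENAI_API_KEY', '').strip()
    if not key:
        raise RuntimeError(
            'OPENAI_API_KEY is not set; export it before building the agent.')
    return key
```

Calling `require_openai_key()` before constructing the agent surfaces a missing key right away instead of partway through a chat.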

Set up

Before using the tool, install the required dependency with the command below.

pip install easyocr
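To confirm the dependency is available before loading the tool, a quick check along these lines may be useful. This is a sketch using only the Python standard library; `is_installed` is a hypothetical helper name.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be found without importing it."""
    return importlib.util.find_spec(module_name) is not None


if not is_installed('easyocr'):
    print('easyocr is missing; run `pip install easyocr` first.')
```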

Reference

The default implementation of the OCR tool uses EasyOCR.