Repository Details
HelloGitHub Rating: 10.0 (1 rating)
Multi-format Document Parsing and Export Tool
Free · MIT
Stars: 42.6k
Chinese: No
Language: Python
Active: Yes
Contributors: 142
Issues: 680
Organization: Yes
Latest: 2.58.0
Forks: 3k
License: MIT
docling image
This is a Python tool open-sourced by IBM that converts documents into formats suitable for generative AI. It parses popular formats such as PDF, DOCX, PPTX, images, HTML, and Markdown, and exports them to Markdown and JSON. It supports multiple OCR engines for PDFs and represents every parsed file as a unified document object (DoclingDocument), which makes it easy to integrate into retrieval-augmented generation (RAG) and question-answering applications. It suits scenarios where documents serve as input to generative AI models.
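The conversion flow described above can be sketched roughly as follows. This is a minimal illustration, not the project's official example: `convert_document` is a hypothetical helper name, and the sketch assumes the `docling` package is installed and exposes `DocumentConverter` with `convert`, `export_to_markdown`, and `export_to_dict` as in its published quickstart.

```python
import json
from typing import Any, Optional


def convert_document(source: str, converter: Optional[Any] = None) -> dict:
    """Convert one document and return its Markdown and JSON renderings.

    `convert_document` is a hypothetical helper for illustration; only the
    calls made on the converter and the document come from Docling.
    """
    if converter is None:
        # Imported lazily so this module still loads where Docling
        # is not installed (assumption: `pip install docling`).
        from docling.document_converter import DocumentConverter

        converter = DocumentConverter()

    result = converter.convert(source)  # source: local path or URL
    doc = result.document               # the unified DoclingDocument
    return {
        "markdown": doc.export_to_markdown(),
        "json": json.dumps(doc.export_to_dict(), ensure_ascii=False),
    }


if __name__ == "__main__":
    import sys

    # Usage: python convert.py path/to/file.pdf
    if len(sys.argv) > 1:
        print(convert_document(sys.argv[1])["markdown"])
```

The Markdown string can be passed to a chunker or prompt directly, while the JSON form keeps the document structure for downstream RAG indexing. Accepting an optional `converter` argument is a design choice of this sketch so that one configured converter can be reused across many files.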
Included in: Vol.115
Tags: Python