Repository Details
HelloGitHub Rating: 10.0 (1 rating)
Multi-format Document Parsing and Exporting Tool
Free, MIT
Stars: 14k
Chinese: No
Language: Python
Active: Yes
Contributors: 28
Issues: 104
Organization: Yes
Latest: 2.12.0
Forks: 698
License: MIT
docling image
Docling is an open-source Python tool developed by IBM for converting documents into formats suitable for generative AI. It parses popular document formats, including PDF, DOCX, PPTX, images, HTML, and Markdown, and exports them to Markdown and JSON. It supports several OCR engines for PDFs, provides a unified document object model (DoclingDocument), and integrates easily with Retrieval-Augmented Generation (RAG) and question-answering applications. It suits scenarios where documents serve as inputs to generative AI models.
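A minimal sketch of the conversion described above, assuming the `docling` package is installed and using its `DocumentConverter` class. The `convert_to_markdown` helper, its optional `converter` parameter, and the file name `report.pdf` are illustrative and not part of the library.

```python
def convert_to_markdown(source, converter=None):
    """Convert one document (a local path or URL) to a Markdown string.

    `converter` is a hypothetical hook for supplying a preconfigured
    converter; when omitted, a default docling DocumentConverter is used.
    """
    if converter is None:
        # Imported lazily so this module still loads without docling installed.
        from docling.document_converter import DocumentConverter

        converter = DocumentConverter()

    # convert() parses the input into a DoclingDocument, available as
    # result.document; export_to_markdown() serializes it to Markdown.
    result = converter.convert(source)
    return result.document.export_to_markdown()


# Example call (placeholder file name, requires docling):
#   markdown = convert_to_markdown("report.pdf")
```

The returned string can then be chunked and indexed for a RAG pipeline or passed directly to a model as context.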
Tags:
Python
