Repository Details
Python Library for Easily Extracting PDF Text and Tables
Received 8 stars in the past 3 days ✨
Free•MIT
Stars: 8.4k
Chinese: Yes
Language: Python
Active: Yes
Contributors: 38
Issues: 72
Organization: No
Latest: 0.11.7
Forks: 776
License: MIT

This project is a Python library for parsing PDFs and extracting data from them, including text and tables. It reports the exact position, size, and font information of each character, line, rectangle, and other element in a PDF document, and it can generate page snapshots in a single step to make debugging easier.
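
To illustrate why per-character position data is useful, here is a minimal sketch of how character records might be grouped back into lines of text. It does not call the library itself: the record fields (`text`, `x0`, `top`, `size`, `fontname`), the sample data, and the `group_chars_into_lines` helper are all assumptions made for this example, not the project's confirmed API.

```python
# Hypothetical character records, shaped like the position, size, and font
# details the description mentions. The field names are assumptions.
chars = [
    {"text": "H", "x0": 10.0, "top": 100.0, "size": 12, "fontname": "Helvetica"},
    {"text": "i", "x0": 18.0, "top": 100.4, "size": 12, "fontname": "Helvetica"},
    {"text": "O", "x0": 10.0, "top": 120.0, "size": 12, "fontname": "Helvetica"},
    {"text": "K", "x0": 19.0, "top": 119.8, "size": 12, "fontname": "Helvetica"},
]


def group_chars_into_lines(chars, tolerance=2.0):
    """Group character records into text lines by vertical position.

    Characters whose `top` value lies within `tolerance` of the first
    character of the current line are treated as part of that line.
    """
    lines = []  # each entry is (top of the line's first character, [records])
    for ch in sorted(chars, key=lambda c: (c["top"], c["x0"])):
        if lines and abs(ch["top"] - lines[-1][0]) <= tolerance:
            lines[-1][1].append(ch)
        else:
            lines.append((ch["top"], [ch]))
    # Within each line, order characters left to right before joining.
    return [
        "".join(c["text"] for c in sorted(row, key=lambda c: c["x0"]))
        for _, row in lines
    ]


print(group_chars_into_lines(chars))  # ['Hi', 'OK']
```

The tolerance absorbs small baseline differences between characters on the same line. Table extraction can build on the same idea by also clustering horizontal positions into columns.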