Plumb a PDF for detailed information about each char, rectangle, line, et cetera — and easily extract text and tables.

pdfplumber

pdfplumber is a Python library for parsing PDFs and extracting data from them, including text and tables. It exposes the position, size, and font information of each character, line, rectangle, and other page element, and it can render page snapshots to help with debugging.
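As a rough illustration of the features described above, here is a minimal sketch of how they might be used together. The helper names `summarize_fonts` and `inspect_first_page` are hypothetical and not part of pdfplumber. The library calls (`pdfplumber.open`, `page.extract_text`, `page.extract_tables`, `page.chars`, `page.to_image`) are used as I understand the public API, and the sketch assumes the PDF has at least one page.

```python
from collections import Counter


def summarize_fonts(chars):
    """Count characters per (fontname, size) pair.

    `chars` is a list of dicts shaped like the entries of pdfplumber's
    `page.chars`, which carry "fontname" and "size" keys among others.
    Sizes are rounded to one decimal so near-identical sizes group together.
    """
    return Counter((c["fontname"], round(c["size"], 1)) for c in chars)


def inspect_first_page(pdf_path, snapshot_path=None):
    """Extract text, tables, and a font summary from the first page.

    Hypothetical helper for illustration; assumes the file has >= 1 page.
    """
    # Imported here so summarize_fonts stays usable without the library.
    import pdfplumber  # third-party: pip install pdfplumber

    with pdfplumber.open(pdf_path) as pdf:
        page = pdf.pages[0]
        result = {
            "text": page.extract_text(),      # page text as a string
            "tables": page.extract_tables(),  # list of tables (rows of cells)
            "fonts": summarize_fonts(page.chars),
        }
        if snapshot_path is not None:
            # Render the page to an image file for visual debugging.
            page.to_image(resolution=150).save(snapshot_path)
    return result
```

Calling `inspect_first_page("example.pdf", snapshot_path="page1.png")` (both paths are placeholders) would return a dict with the extracted text, any detected tables, and a per-font character count, and would write a PNG snapshot of the page.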

jsvine/pdfplumber
