Repository Details
HelloGitHub Rating: 10.0 (1 rating)
Free • CC-BY-4.0
Stars: 21k
Chinese: No
Language: Jupyter Notebook
Active: Yes
Contributors: 7
Issues: 175
Organization: Yes
Latest: .2.0.0
Forks: 2k
License: CC-BY-4.0

This is an open-source screen parsing tool by Microsoft that converts UI screenshots into structured elements that are easy to process. Written in Python, it uses models such as YOLO, BLIP2, and Florence to recognize icons and generate descriptive text for them. It can also be paired with mainstream large language models (e.g., GPT-4V), which makes it suitable for building desktop automation applications.
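
To make "structured elements" concrete, here is a minimal sketch of how parsed screen elements might be represented and handed to a language model. The `UIElement` schema and the `to_prompt` and `center_px` helpers are hypothetical names chosen for illustration; they are not the project's actual API or output format.

```python
from dataclasses import dataclass, asdict
import json


@dataclass
class UIElement:
    """One parsed screen element (hypothetical schema, not the tool's own)."""
    kind: str           # e.g. "icon" or "text"
    bbox: tuple         # (x1, y1, x2, y2), normalized to the range 0..1
    description: str    # generated caption or recognized text
    interactable: bool  # whether the element can be clicked


def to_prompt(elements):
    """Serialize elements as numbered JSON lines for inclusion in an LLM prompt."""
    lines = []
    for i, el in enumerate(elements):
        lines.append(f"{i}: {json.dumps(asdict(el))}")
    return "\n".join(lines)


def center_px(el, width, height):
    """Convert a normalized bounding box to a pixel click target at its center."""
    x1, y1, x2, y2 = el.bbox
    return (round((x1 + x2) / 2 * width), round((y1 + y2) / 2 * height))


if __name__ == "__main__":
    # Sample data invented for this sketch.
    elements = [
        UIElement("icon", (0.25, 0.5, 0.75, 1.0), "Settings gear", True),
        UIElement("text", (0.0, 0.0, 0.5, 0.25), "File", True),
    ]
    print(to_prompt(elements))
    print(center_px(elements[0], 1000, 500))
```

In a desktop automation loop, the model would pick an element by its index and the caller would translate that choice into a click position with something like `center_px`.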