Repository Details
HelloGitHub Rating: 0 ratings
Lightweight vLLM Built from Scratch
Free, MIT
Stars: 5.1k
Chinese: No
Language: Python
Active: Yes
Contributors: 7
Issues: 21
Organization: No
Latest: None
Forks: 600
License: MIT
This project is a lightweight implementation of vLLM (a large language model inference engine) written in Python. The core code is only a little over 1,000 lines, with a clear structure that is easy to read. Its inference speed is comparable to that of the original vLLM, and it integrates inference optimizations such as prefix caching, tensor parallelism, and Torch compilation.
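
For readers unfamiliar with prefix caching, the general idea can be sketched as follows. This is a minimal illustration and not this project's code: the names `BLOCK_SIZE`, `block_key`, and `PrefixCache` are hypothetical, and the integer block ids stand in for real KV-cache blocks. Each full block of tokens is keyed by a hash chained to the previous block's key, so two prompts share cached blocks only while their prefixes are identical.

```python
import hashlib
from typing import Dict, List, Optional

BLOCK_SIZE = 4  # hypothetical block size, kept small for illustration


def block_key(prev_key: Optional[str], tokens: List[int]) -> str:
    """Hash a block of token ids together with the key of the preceding block."""
    h = hashlib.sha256()
    h.update((prev_key or "").encode())
    h.update(",".join(map(str, tokens)).encode())
    return h.hexdigest()


class PrefixCache:
    """Toy cache: maps chained block keys to placeholder block ids."""

    def __init__(self) -> None:
        self.blocks: Dict[str, int] = {}

    def match_and_insert(self, token_ids: List[int]) -> int:
        """Return how many leading tokens were already cached, then cache new full blocks."""
        cached_tokens = 0
        prev_key: Optional[str] = None
        still_matching = True
        n_full_blocks = len(token_ids) // BLOCK_SIZE  # a partial tail block is not cached
        for i in range(n_full_blocks):
            block = token_ids[i * BLOCK_SIZE:(i + 1) * BLOCK_SIZE]
            key = block_key(prev_key, block)
            if still_matching and key in self.blocks:
                cached_tokens += BLOCK_SIZE
            else:
                # First miss ends the reusable prefix; later blocks are only stored.
                still_matching = False
                self.blocks.setdefault(key, len(self.blocks))
            prev_key = key
        return cached_tokens


cache = PrefixCache()
first = cache.match_and_insert([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
second = cache.match_and_insert([1, 2, 3, 4, 5, 6, 7, 8, 99, 100, 101, 102])
third = cache.match_and_insert([9, 2, 3, 4, 5, 6, 7, 8])
print(first, second, third)
```

In this sketch the second prompt reuses the two full blocks it shares with the first, while the third reuses nothing because its first block differs and every later key depends on that block.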
