NVIDIA/TensorRT-LLM

TensorRT-LLM

TensorRT LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT LLM also contains components to create Python and C++ runtimes that orchestrate the inference execution in a performant way.

Repository homepage

33/100

Stars: 13,690

Forks: 2,398

Language: Python

Overview

Best suited for

Evaluating whether TensorRT-LLM fits into a Python AI workflow.
Comparing against a GitHub project with 13,690 stars and ongoing repository activity.

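Strengths

TensorRT-LLM has 13,690 stars, which can serve as a rough indicator of developer interest. Topics: blackwell, cuda, llm-serving.
The project provides an external homepage, which makes further evaluation easier.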

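Limitations

Production fit still depends on documentation depth, issue activity, and release cadence.
No license was detected, so the usage risk needs to be confirmed manually.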

Production readiness

Before using TensorRT-LLM in production, validate it against the README, release history, open issues, and your integration requirements.

License risk

GitHub did not report a license for this repository; production use typically requires a manual legal review first.
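The license and activity checks described above can be partly automated. The following is a minimal sketch, assuming the public GitHub REST endpoint `GET /repos/{owner}/{repo}` and its standard response fields (`stargazers_count`, `forks_count`, `open_issues_count`, `pushed_at`, `license`). The helper names `fetch_repo` and `summarize_repo` are hypothetical and are not part of TensorRT-LLM or of this listing. It only flags repositories for review and does not replace reading the license file itself.

```python
import json
import urllib.request

# Public GitHub REST endpoint for repository metadata.
API_URL = "https://api.github.com/repos/{owner}/{repo}"


def fetch_repo(owner: str, repo: str) -> dict:
    """Fetch repository metadata (hypothetical helper; needs network access)."""
    req = urllib.request.Request(
        API_URL.format(owner=owner, repo=repo),
        headers={"Accept": "application/vnd.github+json"},
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp)


def summarize_repo(payload: dict) -> dict:
    """Reduce a repository payload to the fields used in a pre-production review."""
    # "license" is null in the payload when GitHub reports no license.
    license_info = payload.get("license") or {}
    spdx = license_info.get("spdx_id")
    # "NOASSERTION" means a license file exists but GitHub could not classify it.
    license_detected = spdx not in (None, "NOASSERTION")
    return {
        "stars": payload.get("stargazers_count"),
        "forks": payload.get("forks_count"),
        # Note: GitHub counts open pull requests in this number as well.
        "open_issues": payload.get("open_issues_count"),
        "last_push": payload.get("pushed_at"),
        "license": spdx if license_detected else None,
        "needs_legal_review": not license_detected,
    }
```

Calling `summarize_repo(fetch_repo("NVIDIA", "TensorRT-LLM"))` would return the current values for this repository. Unauthenticated requests are rate-limited by GitHub, so a scheduled check would likely need an access token.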

Installation

git clone https://github.com/NVIDIA/TensorRT-LLM.git