Triton Inference Guides
Triton Inference tutorial directory
Each guide links to the source code for its example. In most cases, you will need to clone the repository to follow along with the tutorial's walkthrough, as sketched below.
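A typical setup step looks like the following; the repository URL shown here is an illustrative assumption, so substitute the repository linked from the specific guide you are following:

```bash
# Clone the repository containing the tutorial source code.
# (URL is illustrative; use the one linked from your guide.)
git clone https://github.com/triton-inference-server/tutorials.git
cd tutorials
```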
📄️ FasterTransformer GPT-J and GPT-NeoX 20B
Deploy GPT-J or GPT-NeoX using NVIDIA Triton Inference Server with the FasterTransformer backend