Instruction-tuned Llama 3.1 Nemotron model with 70 billion parameters, focusing on advanced task completion
Artificial Analysis Quality Index; Higher is better
Output Tokens per Second; Higher is better
USD per 1M Tokens; Lower is better
Seconds to First Tokens Chunk Received; Lower is better