NVIDIA Llama 3.1 Nemotron Instruct 70B

NVIDIA's fine-tuned version of Meta's Llama 3.1 70B Instruct model, released in October 2024

Creator: NVIDIA
License: Open

Summary

Artificial Analysis Quality Index

70
The model performs better than the average model.

Output Speed (Tokens per Second)

47
The model is slower than the average model.

Model Pricing (USD per 1M Tokens)

$0.400
The model's pricing is lower than the average model.

Latency (to receive the first token)

14.21s
The model's latency is lower than the average model.

Context Window (tokens)

128k

Artificial Analysis Quality Index; Higher is better

Output Tokens per Second; Higher is better

USD per 1M Tokens; Lower is better

Seconds to First Token Chunk Received; Lower is better