NVIDIA's NV Embed V 2 is a generalist embedding model that excels at text embedding tasks, as evidenced by its top ranking on the Massive Text Embedding Benchmark with a score of 72.31. It features a base decoder-only LLM architecture, specifically Mistral-7B-v0.1, and utilizes a latent-attention pooling type to produce high-quality embeddings. Notably, NV Embed V 2 incorporates a novel two-staged instruction tuning method and hard-negative mining approach to enhance its accuracy, particularly in retrieval tasks.
Input
Output
Context
33K
Max Output
-
Parameters
7B
Input Modalities
Output Modalities
Estimates based on INT8 quantization. Actual requirements vary by framework and configuration.
Data sourced from official provider APIs and documentation
Last updated: Jun 23, 2026
Automatically route workloads to the right model for every task, every time.