Exam NCA-GENL Topic 4 Question 32 Discussion

Actual exam question for NVIDIA's NCA-GENL exam
Question #: 32
Topic #: 4

Which metric is commonly used to evaluate machine-translation models?

A. F1 Score B. BLEU score C. ROUGE score D. Perplexity

The BLEU (Bilingual Evaluation Understudy) score is the most commonly used metric for evaluating machine-translation models. It measures the precision of n-gram overlaps between the generated translation and reference translations, providing a quantitative measure of translation quality. NVIDIA's NeMo documentation on NLP tasks, particularly machine translation, highlights BLEU as the standard metric for assessing translation performance due to its focus on precision and fluency. Option A (F1 Score) is used for classification tasks, not translation. Option C (ROUGE) is primarily for summarization, focusing on recall.
Option D (Perplexity) measures language model quality but is less specific to translation evaluation.
References:
NVIDIA NeMo Documentation: https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/stable/nlp
/intro.html
Papineni, K., et al. (2002). "BLEU: A Method for Automatic Evaluation of Machine Translation."

by faresaljhne3 at Jan 28, 2026, 03:15 AM

Limited Time Offer

15%

Off

Get Premium NCA-GENL Questions as Interactive Self Test Engine or PDF

Comments

0 Happy Clients

0 Shares

0 Demo Downloads

10 Years in Business