Exam NCA-GENL Topic 4 Question 32 Discussion

Actual exam question for NVIDIA's NCA-GENL exam
Question #: 32
Topic #: 4
Which metric is commonly used to evaluate machine-translation models?

Suggested Answer: C Vote an answer

The BLEU (Bilingual Evaluation Understudy) score is the most commonly used metric for evaluating machine-translation models. It measures the precision of n-gram overlaps between the generated translation and reference translations, providing a quantitative measure of translation quality. NVIDIA's NeMo documentation on NLP tasks, particularly machine translation, highlights BLEU as the standard metric for assessing translation performance due to its focus on precision and fluency. Option A (F1 Score) is used for classification tasks, not translation. Option C (ROUGE) is primarily for summarization, focusing on recall.
Option D (Perplexity) measures language model quality but is less specific to translation evaluation.
References:
NVIDIA NeMo Documentation: https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/stable/nlp
/intro.html
Papineni, K., et al. (2002). "BLEU: A Method for Automatic Evaluation of Machine Translation."

by faresaljhne3 at Jan 28, 2026, 03:15 AM

Comments

Chosen Answer:
This is a voting comment (?) , you can switch to a simple comment.
Switch to a voting comment New
Nick name: Submit Cancel
A voting comment increases the vote count for the chosen answer by one.

Upvoting a comment with a selected answer will also increase the vote count towards that answer by one. So if you see a comment that you already agree with, you can upvote it instead of posting a new comment.

0
0
0
10