Exam NCA-GENL Topic 6 Question 86 Discussion
Actual exam question for NVIDIA's NCA-GENL exam
Question #: 86
Topic #: 6
Question #: 86
Topic #: 6
In large-language models, what is the purpose of the attention mechanism?
Suggested Answer: D Vote an answer
The attention mechanism is a critical component of large language models, particularly in Transformer architectures, as covered in NVIDIA's Generative AI and LLMs course. Its primary purpose is to assign weights to each token in the input sequence based on its relevance to other tokens, allowing the model to focus on the most contextually important parts of the input when generating or interpreting text. This is achieved through mechanisms like self-attention, where each token computes a weighted sum of all other tokens' representations, with weights determined by their relevance (e.g., via scaled dot-product attention).
This enables the model to capture long-range dependencies and contextual relationships effectively, unlike traditional recurrent networks. Option A is incorrect because attention focuses on the input sequence, not the output sequence. Option B is wrong as the order of generation is determined by the model's autoregressive or decoding strategy, not the attention mechanism itself. Option C is also inaccurate, as capturing the order of words is the role of positional encoding, not attention. The course highlights: "The attention mechanism enables models to weigh the importance of different tokens in the input sequence, improving performance in tasks like translation and text generation." References: NVIDIA Building Transformer-Based Natural Language Processing Applications course; NVIDIA Introduction to Transformer-Based Natural Language
This enables the model to capture long-range dependencies and contextual relationships effectively, unlike traditional recurrent networks. Option A is incorrect because attention focuses on the input sequence, not the output sequence. Option B is wrong as the order of generation is determined by the model's autoregressive or decoding strategy, not the attention mechanism itself. Option C is also inaccurate, as capturing the order of words is the role of positional encoding, not attention. The course highlights: "The attention mechanism enables models to weigh the importance of different tokens in the input sequence, improving performance in tasks like translation and text generation." References: NVIDIA Building Transformer-Based Natural Language Processing Applications course; NVIDIA Introduction to Transformer-Based Natural Language
by Heloise at Jun 16, 2026, 10:05 PM
0
0
0
10
Comments
Upvoting a comment with a selected answer will also increase the vote count towards that answer by one. So if you see a comment that you already agree with, you can upvote it instead of posting a new comment.
Report Comment
Commenting
You can sign-up / login (it's free).