Sequence-to-Sequence — Translation & Summarization
Sequence-to-sequence (seq2seq) models take a variable-length sequence as input and produce a variable-length sequence as output. Translation (English → French), summarization (article → summary), and code generation (docstring → function) are all seq2seq tasks. T5 and BART are the dominant encoder-decoder Transformer architectures in modern practice.
Translation and Summarization with T5/BART
from transformers import (
    AutoTokenizer, AutoModelForSeq2SeqLM,
    MarianMTModel, MarianTokenizer,
    pipeline,
)
# ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
# MACHINE TRANSLATION — MarianMT
# ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
mt_tokenizer = MarianTokenizer.from_pretrained("Helsinki-NLP/opus-mt-en-fr")
mt_model = MarianMTModel.from_pretrained("Helsinki-NLP/opus-mt-en-fr")
text_en = "Artificial intelligence is transforming every industry."
inputs = mt_tokenizer(text_en, return_tensors="pt", padding=True)
translated_ids = mt_model.generate(
    **inputs,
    num_beams=5,          # beam search: explore 5 candidates simultaneously
    early_stopping=True,  # stop when all beams finish
    max_length=60,
)
text_fr = mt_tokenizer.batch_decode(translated_ids, skip_special_tokens=True)[0]
print(f"EN: {text_en}")
print(f"FR: {text_fr}")
# ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
# TEXT SUMMARIZATION — BART / T5
# ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
summarizer = pipeline("summarization", model="facebook/bart-large-cnn")
article = """
The development of artificial intelligence has accelerated dramatically in the past decade.
Deep learning breakthroughs have enabled computers to recognize speech, understand images,
translate languages, and generate creative content. Companies like OpenAI, Google DeepMind,
and Anthropic are racing to build increasingly capable AI systems. The launch of ChatGPT
in November 2022 brought AI into mainstream consciousness, attracting 100 million users
in just two months — faster adoption than any technology in history. These systems now
assist with coding, writing, analysis, and scientific research, transforming how experts
work across every industry from healthcare to legal services.
"""
summary = summarizer(article, max_length=80, min_length=30, do_sample=False)[0]["summary_text"]
print(f"\nSummary: {summary}")
# ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
# GENERATION STRATEGIES — how the model produces output tokens
# ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
# These are used in seq2seq AND in LLMs (GPT, Claude)
from transformers import AutoModelForCausalLM
model_name = "gpt2"
gen_model = AutoModelForCausalLM.from_pretrained(model_name)
gen_tok = AutoTokenizer.from_pretrained(model_name)
input_ids = gen_tok("The future of AI", return_tensors="pt").input_ids
strategies = {
    "Greedy": dict(max_new_tokens=30, do_sample=False),
    "Beam search (5)": dict(max_new_tokens=30, num_beams=5, early_stopping=True),
    "Top-k sampling": dict(max_new_tokens=30, do_sample=True, top_k=50, temperature=0.8),
    "Top-p (nucleus)": dict(max_new_tokens=30, do_sample=True, top_p=0.9, temperature=0.7),
}
for name, kwargs in strategies.items():
    out = gen_model.generate(input_ids, **kwargs)
    text = gen_tok.decode(out[0], skip_special_tokens=True)
    print(f"\n{name}: {text}")
# Key insights:
# Greedy: deterministic, often repetitive
# Beam search: better quality, still deterministic
# Top-k: sample from top 50 probable next tokens → diverse, creative
# Top-p: sample from smallest set covering 90% of probability mass → often the best quality/diversity balance
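Greedy decoding's tendency to repeat can also be tamed directly. The snippet below is a small illustrative sketch: no_repeat_ngram_size and repetition_penalty are standard arguments to generate() in Hugging Face transformers, and the values shown are starting points rather than tuned recommendations.
out = gen_model.generate(
    input_ids,
    max_new_tokens=30,
    do_sample=False,
    no_repeat_ngram_size=3,   # never emit the same 3-gram twice
    repetition_penalty=1.2,   # down-weight tokens that have already appeared
)
print("Greedy + repetition control:", gen_tok.decode(out[0], skip_special_tokens=True))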
Tip
Practice sequence-to-sequence translation and summarization in small, isolated examples before integrating them into larger projects. Breaking concepts into small experiments builds genuine understanding faster than reading alone.
Modern NLP = Transformer-based. Pre-train, then fine-tune.
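The fine-tuning half of that recipe looks the same for any seq2seq task. Below is a minimal sketch assuming a recent transformers and datasets install; the two-example toy dataset, output directory, and hyperparameters are placeholders for illustration, not recommendations.
from datasets import Dataset
from transformers import (
    AutoTokenizer, AutoModelForSeq2SeqLM,
    DataCollatorForSeq2Seq, Seq2SeqTrainingArguments, Seq2SeqTrainer,
)
tok = AutoTokenizer.from_pretrained("t5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")
# Toy dataset: replace with a real (document, summary) corpus such as CNN/DailyMail
raw = Dataset.from_dict({
    "document": ["Long article text goes here ...", "Another article goes here ..."],
    "summary": ["Short summary.", "Another short summary."],
})
def preprocess(batch):
    model_inputs = tok(["summarize: " + d for d in batch["document"]],
                       max_length=512, truncation=True)
    labels = tok(text_target=batch["summary"], max_length=64, truncation=True)
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs
tokenized = raw.map(preprocess, batched=True, remove_columns=raw.column_names)
args = Seq2SeqTrainingArguments(
    output_dir="t5-summarization-demo",   # placeholder path
    per_device_train_batch_size=2,
    num_train_epochs=1,
    predict_with_generate=True,
)
trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=tokenized,
    data_collator=DataCollatorForSeq2Seq(tok, model=model),
)
trainer.train()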
Practice Task
Note
Practice Task — (1) Write a working sequence-to-sequence translation or summarization example from scratch without looking at notes. (2) Modify it to handle an edge case (empty input, null value, or error state). (3) Share your solution in the Priygop community for feedback.
Common Mistake
Warning
A common mistake with sequence-to-sequence translation and summarization is skipping edge-case testing: empty inputs, null values, and unexpected data types. Always validate boundary conditions to write robust, production-ready AI code.
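As a concrete version of that advice, here is one possible guard around the summarization pipeline built above; the function name safe_summarize and the min_words threshold are invented for this example.
def safe_summarize(text, summarizer, min_words=20):
    # Reject empty or whitespace-only input instead of passing it to the model
    if text is None or not text.strip():
        raise ValueError("Input text is empty")
    # Inputs shorter than the threshold are not worth summarizing; return them unchanged
    if len(text.split()) < min_words:
        return text.strip()
    return summarizer(text, max_length=80, min_length=30, do_sample=False)[0]["summary_text"]
print(safe_summarize(article, summarizer))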