Sequence-to-Sequence — Translation & Summarization
Sequence-to-sequence (seq2seq) models take a variable-length sequence as input and produce a variable-length sequence as output. Translation (English → French), summarization (article → summary), and code generation (docstring → function) are all seq2seq tasks. T5 and BART are the dominant encoder-decoder Transformer architectures in modern practice.
Translation and Summarization with T5/BART
from transformers import (
    AutoTokenizer, AutoModelForSeq2SeqLM,
    MarianMTModel, MarianTokenizer,
    pipeline,
)
# ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
# MACHINE TRANSLATION — MarianMT
# ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
mt_tokenizer = MarianTokenizer.from_pretrained("Helsinki-NLP/opus-mt-en-fr")
mt_model = MarianMTModel.from_pretrained("Helsinki-NLP/opus-mt-en-fr")
text_en = "Artificial intelligence is transforming every industry."
inputs = mt_tokenizer(text_en, return_tensors="pt", padding=True)
translated_ids = mt_model.generate(
    **inputs,
    num_beams=5,          # beam search: explore 5 candidates simultaneously
    early_stopping=True,  # stop when all beams finish
    max_length=60,
)
text_fr = mt_tokenizer.batch_decode(translated_ids, skip_special_tokens=True)[0]
print(f"EN: {text_en}")
print(f"FR: {text_fr}")
# ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
# TEXT SUMMARIZATION — BART / T5
# ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
summarizer = pipeline("summarization", model="facebook/bart-large-cnn")
article = """
The development of artificial intelligence has accelerated dramatically in the past decade.
Deep learning breakthroughs have enabled computers to recognize speech, understand images,
translate languages, and generate creative content. Companies like OpenAI, Google DeepMind,
and Anthropic are racing to build increasingly capable AI systems. The launch of ChatGPT
in November 2022 brought AI into mainstream consciousness, attracting 100 million users
in just two months — faster adoption than any technology in history. These systems now
assist with coding, writing, analysis, and scientific research, transforming how experts
work across every industry from healthcare to legal services.
"""
summary = summarizer(article, max_length=80, min_length=30, do_sample=False)[0]["summary_text"]
print(f"\nSummary: {summary}")
# ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
# GENERATION STRATEGIES — how the model produces output tokens
# ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
# These are used in seq2seq AND in LLMs (GPT, Claude)
from transformers import AutoModelForCausalLM
model_name = "gpt2"
gen_model = AutoModelForCausalLM.from_pretrained(model_name)
gen_tok = AutoTokenizer.from_pretrained(model_name)
input_ids = gen_tok("The future of AI", return_tensors="pt").input_ids
strategies = {
    "Greedy": dict(max_new_tokens=30, do_sample=False),
    "Beam search (5)": dict(max_new_tokens=30, num_beams=5, early_stopping=True),
    "Top-k sampling": dict(max_new_tokens=30, do_sample=True, top_k=50, temperature=0.8),
    "Top-p (nucleus)": dict(max_new_tokens=30, do_sample=True, top_p=0.9, temperature=0.7),
}
for name, kwargs in strategies.items():
    out = gen_model.generate(input_ids, **kwargs)
    text = gen_tok.decode(out[0], skip_special_tokens=True)
    print(f"\n{name}: {text}")
# Key insights:
# Greedy: deterministic, often repetitive
# Beam search: better quality, still deterministic
# Top-k: sample from top 50 probable next tokens → diverse, creative
# Top-p: sample from smallest set covering 90% of probability mass → often the best quality/diversity balance
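Greedy decoding's tendency to repeat can also be tamed directly. The snippet below is a small illustrative sketch: no_repeat_ngram_size and repetition_penalty are standard arguments to generate() in Hugging Face transformers, and the values shown are starting points rather than tuned recommendations.
out = gen_model.generate(
    input_ids,
    max_new_tokens=30,
    do_sample=False,
    no_repeat_ngram_size=3,   # never emit the same 3-gram twice
    repetition_penalty=1.2,   # down-weight tokens that have already appeared
)
print("Greedy + repetition control:", gen_tok.decode(out[0], skip_special_tokens=True))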
Tip
Practice sequence-to-sequence translation and summarization in small, isolated examples before integrating them into larger projects. Breaking concepts into small experiments builds genuine understanding faster than reading alone.
Modern NLP = Transformer-based. Pre-train, then fine-tune.
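The fine-tuning half of that recipe looks the same for any seq2seq task. Below is a minimal sketch assuming a recent transformers and datasets install; the two-example toy dataset, output directory, and hyperparameters are placeholders for illustration, not recommendations.
from datasets import Dataset
from transformers import (
    AutoTokenizer, AutoModelForSeq2SeqLM,
    DataCollatorForSeq2Seq, Seq2SeqTrainingArguments, Seq2SeqTrainer,
)
tok = AutoTokenizer.from_pretrained("t5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")
# Toy dataset: replace with a real (document, summary) corpus such as CNN/DailyMail
raw = Dataset.from_dict({
    "document": ["Long article text goes here ...", "Another article goes here ..."],
    "summary": ["Short summary.", "Another short summary."],
})
def preprocess(batch):
    model_inputs = tok(["summarize: " + d for d in batch["document"]],
                       max_length=512, truncation=True)
    labels = tok(text_target=batch["summary"], max_length=64, truncation=True)
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs
tokenized = raw.map(preprocess, batched=True, remove_columns=raw.column_names)
args = Seq2SeqTrainingArguments(
    output_dir="t5-summarization-demo",   # placeholder path
    per_device_train_batch_size=2,
    num_train_epochs=1,
    predict_with_generate=True,
)
trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=tokenized,
    data_collator=DataCollatorForSeq2Seq(tok, model=model),
)
trainer.train()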
Practice Task
Note
Practice Task — (1) Write a working sequence-to-sequence translation or summarization example from scratch without looking at notes. (2) Modify it to handle an edge case (empty input, null value, or error state). (3) Share your solution in the Priygop community for feedback.
Common Mistake
Warning
A common mistake with sequence-to-sequence translation and summarization is skipping edge-case testing: empty inputs, null values, and unexpected data types. Always validate boundary conditions to write robust, production-ready AI code.
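As a concrete version of that advice, here is one possible guard around the summarization pipeline built above; the function name safe_summarize and the min_words threshold are invented for this example.
def safe_summarize(text, summarizer, min_words=20):
    # Reject empty or whitespace-only input instead of passing it to the model
    if text is None or not text.strip():
        raise ValueError("Input text is empty")
    # Inputs shorter than the threshold are not worth summarizing; return them unchanged
    if len(text.split()) < min_words:
        return text.strip()
    return summarizer(text, max_length=80, min_length=30, do_sample=False)[0]["summary_text"]
print(safe_summarize(article, summarizer))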