Word Embeddings — Semantic Representations
Word embeddings map discrete tokens to dense, continuous vectors in which semantically similar words end up geometrically close. The famous analogy king - man + woman ≈ queen emerges from training, not from explicit programming. Modern Transformers go further and learn contextual embeddings: the same word gets a different vector depending on its context.
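To make "geometrically close" concrete, closeness is usually measured with cosine similarity. The sketch below uses made-up 3-dimensional vectors purely for illustration; real embeddings have hundreds of dimensions.
import torch
# Toy vectors, invented for illustration (not from any trained model)
cat = torch.tensor([0.9, 0.8, 0.1])
dog = torch.tensor([0.85, 0.75, 0.2])
car = torch.tensor([0.1, 0.2, 0.95])
# Cosine similarity: cos(a, b) = a·b / (|a||b|), ranges from -1 to 1
print(torch.nn.functional.cosine_similarity(cat.unsqueeze(0), dog.unsqueeze(0)).item())  # near 1 → similar
print(torch.nn.functional.cosine_similarity(cat.unsqueeze(0), car.unsqueeze(0)).item())  # much lower → dissimilar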
Static vs Contextual Embeddings
import torch
import torch.nn as nn
import numpy as np
# ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
# STATIC EMBEDDINGS: nn.Embedding — same vector regardless of context
# ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
vocab_size = 30_000
embed_dim = 300
embedding_layer = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
# Shape of weight matrix: [30000, 300] — each row is a word vector
# Lookup: token ID → embedding vector
token_ids = torch.tensor([42, 101, 999]) # 3 tokens
vecs = embedding_layer(token_ids)
print(f"Embedding output: {vecs.shape}") # [3, 300]
# ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
# PRETRAINED WORD2VEC / GLOVE — load as embedding weights
# ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
# pip install gensim
import gensim.downloader as api
# Load GloVe (Global Vectors for Word Representation)
# This variant is trained on Wikipedia 2014 + Gigaword 5 (about 6B tokens)
# glove_model = api.load("glove-wiki-gigaword-300") # 300-dim, 400K words
# The famous analogies (demonstrates semantic geometry of embedding space):
# glove_model.most_similar(positive=["king", "woman"], negative=["man"], topn=1)
# → [('queen', 0.85)] → king - man + woman ≈ queen ✓
# France - Paris + Berlin ≈ Germany (country - capital + capital ≈ country)
# doctor - man + woman ≈ nurse (reveals gender bias in training data!)
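# A rough sketch of what most_similar computes under the hood (analogy() below is a
# hypothetical helper, not gensim's API; assumes glove_model was loaded above):
# def analogy(model, a, b, c):
#     query = model[a] - model[b] + model[c]                  # e.g. king - man + woman
#     query /= np.linalg.norm(query)                          # unit-normalize the query
#     sims = (model.vectors @ query) / np.linalg.norm(model.vectors, axis=1)
#     ranked = [model.index_to_key[i] for i in sims.argsort()[::-1]]
#     return [w for w in ranked if w not in (a, b, c)][0]     # best match, excluding the inputs
# analogy(glove_model, "king", "man", "woman")                # expected: 'queen'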
# BIAS IN WORD EMBEDDINGS — critical concern
embedding_biases = {
    "Gender": "doctor→man, nurse→woman learned from biased corpus",
    "Race": "certain names associated with positive/negative words",
    "Geography": "country embeddings reflect geopolitical biases",
}
print("\nEmbedding biases to be aware of:")
for bias_type, example in embedding_biases.items():
    print(f"  {bias_type}: {example}")
# ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
# CONTEXTUAL EMBEDDINGS — different vector per context
# ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
# Static: "bank" always → same vector
# Contextual: "bank" in:
# "I put money in the bank" → financial institution embedding
# "We sat on the river bank" → riverbank embedding
from transformers import AutoTokenizer, AutoModel
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")
def get_contextual_embedding(text: str, word: str):
    """Get BERT contextual embedding for a specific word in context."""
    tokens = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**tokens)
    # outputs.last_hidden_state: [1, seq_len, 768] — one embedding per token
    embeddings = outputs.last_hidden_state.squeeze(0)  # [seq_len, 768]
    print(f"Contextual embedding shape: {embeddings.shape}")
    # Find the word's position in the token sequence
    # (assumes the word maps to a single token in BERT's vocabulary, e.g. "bank")
    token_ids_list = tokens["input_ids"][0].tolist()
    word_id = tokenizer.convert_tokens_to_ids(word)
    if word_id in token_ids_list:
        pos = token_ids_list.index(word_id)
        return embeddings[pos]  # word-specific embedding
    return embeddings.mean(0)  # fallback: mean pooling over all tokens
# "bank" gets different vectors in different contexts
emb1 = get_contextual_embedding("I put money in the bank", "bank")
emb2 = get_contextual_embedding("We sat on the river bank", "bank")
# Cosine similarity between the two "bank" embeddings
cos_sim = torch.nn.functional.cosine_similarity(emb1.unsqueeze(0), emb2.unsqueeze(0))
print(f"\nSimilarity of 'bank' in financial vs river context: {cos_sim.item():.3f}")
# Should be < 1.0 — different contexts → different embeddings
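As a quick sanity check, comparing "bank" across two sentences that both use the financial sense should yield a noticeably higher similarity than the financial-vs-river pair above; exact numbers depend on the model and sentences, so treat the snippet below as a sketch that reuses the helper defined earlier.
emb3 = get_contextual_embedding("She deposited her salary at the bank", "bank")
same_sense_sim = torch.nn.functional.cosine_similarity(emb1.unsqueeze(0), emb3.unsqueeze(0))
print(f"Similarity of 'bank' in two financial contexts: {same_sense_sim.item():.3f}")
# Expected to exceed the financial-vs-river similarity printed above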
Tip
Practice word embeddings in small, isolated examples before integrating them into larger projects. Breaking concepts into small experiments builds genuine understanding faster than reading alone.
Practice Task
(1) Write a working example of word embeddings from scratch without looking at notes. (2) Modify it to handle an edge case (empty input, null value, or error state). (3) Share your solution in the Priygop community for feedback.
Common Mistake
A common mistake with word embeddings is skipping edge-case testing: empty inputs, null values, and unexpected data types. Always validate boundary conditions to write robust, production-ready AI code.