Word Embeddings — Semantic Representations
Word embeddings map discrete tokens to dense, continuous vectors in which semantically similar words end up geometrically close. The famous analogy king - man + woman ≈ queen emerges from training, not from explicit programming. Modern Transformers go further and learn contextual embeddings: the same word gets a different vector depending on its context.
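To make "geometrically close" concrete, closeness is usually measured with cosine similarity. The sketch below uses made-up 3-dimensional vectors purely for illustration; real embeddings have hundreds of dimensions.
import torch
# Toy vectors, invented for illustration (not from any trained model)
cat = torch.tensor([0.9, 0.8, 0.1])
dog = torch.tensor([0.85, 0.75, 0.2])
car = torch.tensor([0.1, 0.2, 0.95])
# Cosine similarity: cos(a, b) = a·b / (|a||b|), ranges from -1 to 1
print(torch.nn.functional.cosine_similarity(cat.unsqueeze(0), dog.unsqueeze(0)).item())  # near 1 → similar
print(torch.nn.functional.cosine_similarity(cat.unsqueeze(0), car.unsqueeze(0)).item())  # much lower → dissimilar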
Static vs Contextual Embeddings
import torch
import torch.nn as nn
import numpy as np
# ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
# STATIC EMBEDDINGS: nn.Embedding — same vector regardless of context
# ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
vocab_size = 30_000
embed_dim = 300
embedding_layer = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
# Shape of weight matrix: [30000, 300] — each row is a word vector
# Lookup: token ID → embedding vector
token_ids = torch.tensor([42, 101, 999]) # 3 tokens
vecs = embedding_layer(token_ids)
print(f"Embedding output: {vecs.shape}") # [3, 300]
# ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
# PRETRAINED WORD2VEC / GLOVE — load as embedding weights
# ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
# pip install gensim
import gensim.downloader as api
# Load GloVe (Global Vectors for Word Representation)
# This variant is trained on Wikipedia 2014 + Gigaword 5 (about 6B tokens)
# glove_model = api.load("glove-wiki-gigaword-300") # 300-dim, 400K words
# The famous analogies (demonstrates semantic geometry of embedding space):
# glove_model.most_similar(positive=["king", "woman"], negative=["man"], topn=1)
# → [('queen', 0.85)] → king - man + woman ≈ queen ✓
# France - Paris + Berlin ≈ Germany (country - capital + capital ≈ country)
# doctor - man + woman ≈ nurse (reveals gender bias in training data!)
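# A rough sketch of what most_similar computes under the hood (analogy() below is a
# hypothetical helper, not gensim's API; assumes glove_model was loaded above):
# def analogy(model, a, b, c):
#     query = model[a] - model[b] + model[c]                  # e.g. king - man + woman
#     query /= np.linalg.norm(query)                          # unit-normalize the query
#     sims = (model.vectors @ query) / np.linalg.norm(model.vectors, axis=1)
#     ranked = [model.index_to_key[i] for i in sims.argsort()[::-1]]
#     return [w for w in ranked if w not in (a, b, c)][0]     # best match, excluding the inputs
# analogy(glove_model, "king", "man", "woman")                # expected: 'queen'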
# BIAS IN WORD EMBEDDINGS — critical concern
embedding_biases = {
    "Gender": "doctor→man, nurse→woman learned from biased corpus",
    "Race": "certain names associated with positive/negative words",
    "Geography": "country embeddings reflect geopolitical biases",
}
print("\nEmbedding biases to be aware of:")
for bias_type, example in embedding_biases.items():
    print(f"  {bias_type}: {example}")
# ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
# CONTEXTUAL EMBEDDINGS — different vector per context
# ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
# Static: "bank" always → same vector
# Contextual: "bank" in:
# "I put money in the bank" → financial institution embedding
# "We sat on the river bank" → riverbank embedding
from transformers import AutoTokenizer, AutoModel
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")
def get_contextual_embedding(text: str, word: str):
    """Get BERT contextual embedding for a specific word in context."""
    tokens = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**tokens)
    # outputs.last_hidden_state: [1, seq_len, 768] — one embedding per token
    embeddings = outputs.last_hidden_state.squeeze(0)  # [seq_len, 768]
    print(f"Contextual embedding shape: {embeddings.shape}")
    # Find the word's position in the token sequence
    # (assumes the word maps to a single token in BERT's vocabulary, e.g. "bank")
    token_ids_list = tokens["input_ids"][0].tolist()
    word_id = tokenizer.convert_tokens_to_ids(word)
    if word_id in token_ids_list:
        pos = token_ids_list.index(word_id)
        return embeddings[pos]  # word-specific embedding
    return embeddings.mean(0)  # fallback: mean pooling over all tokens
# "bank" gets different vectors in different contexts
emb1 = get_contextual_embedding("I put money in the bank", "bank")
emb2 = get_contextual_embedding("We sat on the river bank", "bank")
# Cosine similarity between the two "bank" embeddings
cos_sim = torch.nn.functional.cosine_similarity(emb1.unsqueeze(0), emb2.unsqueeze(0))
print(f"\nSimilarity of 'bank' in financial vs river context: {cos_sim.item():.3f}")
# Should be < 1.0 — different contexts → different embeddings
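As a quick sanity check, comparing "bank" across two sentences that both use the financial sense should yield a noticeably higher similarity than the financial-vs-river pair above; exact numbers depend on the model and sentences, so treat the snippet below as a sketch that reuses the helper defined earlier.
emb3 = get_contextual_embedding("She deposited her salary at the bank", "bank")
same_sense_sim = torch.nn.functional.cosine_similarity(emb1.unsqueeze(0), emb3.unsqueeze(0))
print(f"Similarity of 'bank' in two financial contexts: {same_sense_sim.item():.3f}")
# Expected to exceed the financial-vs-river similarity printed above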
Tip
Practice word embeddings in small, isolated examples before integrating them into larger projects. Breaking concepts into small experiments builds genuine understanding faster than reading alone.
Practice Task
(1) Write a working example of word embeddings from scratch without looking at notes. (2) Modify it to handle an edge case (empty input, null value, or error state). (3) Share your solution in the Priygop community for feedback.
Common Mistake
A common mistake with word embeddings is skipping edge-case testing: empty inputs, null values, and unexpected data types. Always validate boundary conditions to write robust, production-ready AI code.