This course introduced modern NLP with a focus on deep learning methods and foundation models. Topics included neural and transformer architectures, pre-trained and instruction-tuned LLMs, parameter-efficient fine-tuning (PEFT), prompting, and post-training techniques. Core problems emphasized semantic understanding and generation (question answering, machine translation, etc.).


Project 1 — Sentiment Analysis with Neural Networks

Built progressively more sophisticated models for binary sentiment classification on movie reviews.

Implementation:

import torch.nn as nn

class DAN(nn.Module):
    def __init__(self, vocab_size, embed_dim, hidden_dim, num_classes=2):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.network = nn.Sequential(nn.Linear(embed_dim, hidden_dim), nn.ReLU(), nn.Linear(hidden_dim, num_classes))
        self.log_softmax = nn.LogSoftmax(dim=-1)

    def forward(self, x, lengths):
        embedded = self.embedding(x)
        # Average over non-padded tokens (pad id 0)
        mask = (x != 0).unsqueeze(-1).float()
        averaged = (embedded * mask).sum(dim=1) / lengths.unsqueeze(-1)
        return self.log_softmax(self.network(averaged))
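The masked-averaging trick can be sanity-checked in isolation; a minimal sketch with illustrative tensor values (the constant embeddings stand in for a real embedding table):

```python
import torch

# Batch of 2 sequences padded with token id 0; embedding dim 2.
x = torch.tensor([[5, 7, 0], [3, 0, 0]])   # token ids, 0 = <pad>
lengths = torch.tensor([2.0, 1.0])         # true (unpadded) lengths
embedded = torch.ones(2, 3, 2)             # stand-in for an embedding lookup

mask = (x != 0).unsqueeze(-1).float()      # zero out padding positions
averaged = (embedded * mask).sum(dim=1) / lengths.unsqueeze(-1)
# Each row averages only its real tokens, so every entry is 1.0 here.
```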

Project 2 — Transformer Encoder & Decoder

Implemented transformer architectures from scratch for classification and language modeling.

Implementation:

import math
import torch.nn as nn
import torch.nn.functional as F

class MultiHeadAttention(nn.Module):
    def forward(self, x):
        B, T, C = x.shape
        head_dim = C // self.n_head
        q = self.query(x).view(B, T, self.n_head, head_dim).transpose(1, 2)
        k = self.key(x).view(B, T, self.n_head, head_dim).transpose(1, 2)
        v = self.value(x).view(B, T, self.n_head, head_dim).transpose(1, 2)

        # Scaled dot-product attention: scores are (B, n_head, T, T)
        attn = (q @ k.transpose(-2, -1)) / math.sqrt(head_dim)
        if self.use_alibi:
            attn = attn + build_alibi_bias(self.n_head, T, x.device)
        if self.causal:
            attn = attn.masked_fill(self.mask[:, :, :T, :T] == 0, float("-inf"))
        # Merge heads back to (B, T, C) before the output projection
        out = (F.softmax(attn, dim=-1) @ v).transpose(1, 2).reshape(B, T, C)
        return self.out_proj(out)
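The snippet calls `build_alibi_bias` without showing it; a plausible sketch follows (the function name and signature come from the call site above, the internals are my assumption, using the geometric slope schedule from the ALiBi paper):

```python
import torch

def build_alibi_bias(n_head, T, device=None):
    # Per-head slopes m_h = 2^(-8h / n_head) for heads h = 1..n_head.
    slopes = torch.tensor([2 ** (-8 * h / n_head) for h in range(1, n_head + 1)], device=device)
    # Relative position (j - i): 0 on the diagonal, negative into the past,
    # so older tokens receive a larger (more negative) penalty.
    pos = torch.arange(T, device=device)
    rel = (pos[None, :] - pos[:, None]).float()   # (T, T)
    bias = slopes[:, None, None] * rel[None]      # (n_head, T, T)
    return bias.unsqueeze(0)                      # (1, n_head, T, T), broadcasts over batch
```

Because the bias encodes position directly in the attention scores, a model using it needs no learned positional embeddings.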

Project 3 — Decoding & Retrieval-Augmented Generation

Explored decoding strategies and modern RAG systems.

Part A: Beam Search Decoding
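The core loop of beam search can be sketched model-agnostically; a minimal version, assuming a caller-supplied `step_fn` that returns (token, probability) candidates for the next position (all names here are illustrative, not from the project code):

```python
import math

def beam_search(step_fn, start_token, eos_token, beam_width=3, max_len=10):
    # Each hypothesis is (token sequence, cumulative log-probability).
    beams = [([start_token], 0.0)]
    finished = []
    for _ in range(max_len):
        candidates = []
        for seq, score in beams:
            if seq[-1] == eos_token:          # hypothesis already complete
                finished.append((seq, score))
                continue
            for tok, p in step_fn(seq):       # expand with each next-token option
                candidates.append((seq + [tok], score + math.log(p)))
        if not candidates:                    # every beam has finished
            beams = []
            break
        # Keep only the top-scoring hypotheses.
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_width]
    finished.extend(beams)                    # hypotheses cut off at max_len
    return max(finished, key=lambda c: c[1])[0]
```

Summing log-probabilities rather than multiplying raw probabilities avoids numerical underflow on long sequences; production decoders typically also add a length-normalization term so short hypotheses are not unfairly favored.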

Part B: RAG System
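The retrieve-then-prompt skeleton of a RAG pipeline can be sketched with a toy bag-of-words retriever standing in for a real dense embedder (function names, the corpus, and the prompt template are all illustrative; a real system would embed with a trained encoder and send the prompt to an LLM):

```python
import math
from collections import Counter

def embed(text):
    # Toy "embedding": lowercase unigram counts (stand-in for a dense encoder).
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, corpus, k=2):
    # Rank documents by similarity to the query and keep the top k.
    q = embed(query)
    return sorted(corpus, key=lambda doc: cosine(q, embed(doc)), reverse=True)[:k]

def build_prompt(query, corpus, k=2):
    # Stuff the retrieved passages into the context window ahead of the question.
    context = "\n".join(f"- {doc}" for doc in retrieve(query, corpus, k))
    return f"Answer using the context below.\nContext:\n{context}\nQuestion: {query}"
```

The same two-stage shape (retrieve top-k, then condition generation on the retrieved text) carries over directly when the toy pieces are swapped for a vector index and an instruction-tuned LLM.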