Module 2 — Applications & Tools of Generative AI

Hands-on, plain-English cards for **what to build and with what**. Each card: one-line meaning, quick use-cases, and a 2025 note so nothing feels stale.

Text / NLP Applications

Summarization [Text]
Condense long content into key points.
  • Use: meeting notes, legal summaries, blog TL;DR.
  • 2025: Map-reduce + citations curb hallucinations.
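
Example sketch (map-reduce with chunk citations): a minimal illustration of the 2025 note above, assuming a hypothetical `summarize` callable that wraps whatever model you use. The chunk size and the `[chunk N]` tag format are illustrative choices, not a standard.

```python
from typing import Callable, List


def split_into_chunks(text: str, max_chars: int = 2000) -> List[str]:
    """Greedily pack whole paragraphs into chunks of roughly max_chars.

    A single paragraph longer than max_chars is kept as one oversized chunk.
    """
    chunks: List[str] = []
    current = ""
    for para in text.split("\n\n"):
        if current and len(current) + len(para) + 2 > max_chars:
            chunks.append(current)
            current = para
        else:
            current = f"{current}\n\n{para}" if current else para
    if current:
        chunks.append(current)
    return chunks


def map_reduce_summarize(
    text: str, summarize: Callable[[str], str], max_chars: int = 2000
) -> str:
    """`summarize` is a hypothetical stand-in for a model call."""
    chunks = split_into_chunks(text, max_chars)
    # Map: summarize each chunk and tag it so the final answer can cite it.
    partials = [
        f"[chunk {i}] {summarize(chunk)}" for i, chunk in enumerate(chunks, start=1)
    ]
    # Reduce: summarize the tagged partial summaries into one result.
    return summarize("\n".join(partials))
```

Keeping the chunk tags in the reduce input is what lets the final summary point back to where each claim came from.
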
Translation (Multilingual) [Text]
Convert text across languages.
  • Use: Bangla↔English content ops.
  • 2025: Domain glossaries + constraints improve handling of names/brands.
Classification (Intent/Topic) [Text]
Assign labels to text.
  • Use: support triage, sentiment, lead quality.
  • 2025: Fine-tuned, task-specific SLMs often outperform generic prompts.
Information Extraction [Text]
Pull structured fields from messy text.
  • Use: invoices, resumes, policy docs.
  • 2025: Constrained JSON schemas + validators for reliability.
Content Generation [Text]
Draft net-new text in a chosen voice.
  • Use: blogs, ads, product pages.
  • 2025: LoRA adapters + style guides > prompt-only.
Document Q&A [Text]
Answer questions grounded in documents.
  • Use: HR/Legal/Policy assistants.
  • 2025: Retrieval + reranker + short context beats dumping whole PDFs.
Style Transfer [Text]
Rewrite text to match a brand voice.
  • Use: single-source content → multiple personas.
  • 2025: Guard with examples + banned patterns.
Structured Output (JSON/CSV) [Text]
Force outputs into machine-readable shapes.
  • Use: pipelines, analytics.
  • 2025: Use schema + validators + retries on failure.
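
Example sketch (schema + validator + retries): one possible shape for the pattern in the Information Extraction and Structured Output cards. The invoice fields and the `call_model` callable are hypothetical; a production setup would likely use a full JSON Schema validator rather than this hand-rolled type check.

```python
import json
from typing import Callable, Dict

# Hypothetical target shape for an invoice extraction task.
INVOICE_SCHEMA: Dict[str, type] = {"invoice_id": str, "total": float, "currency": str}


def validate(raw: str, schema: Dict[str, type]) -> dict:
    """Parse JSON and check required keys and types; raise ValueError on any problem."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("top-level value must be an object")
    for key, expected in schema.items():
        if key not in data:
            raise ValueError(f"missing field: {key}")
        value = data[key]
        # JSON has one number type, so accept ints where floats are expected.
        if expected is float and isinstance(value, int) and not isinstance(value, bool):
            value = float(value)
        if not isinstance(value, expected):
            raise ValueError(f"{key} should be {expected.__name__}")
        data[key] = value
    return data


def extract_with_retries(
    prompt: str,
    call_model: Callable[[str], str],  # hypothetical model wrapper
    schema: Dict[str, type],
    max_attempts: int = 3,
) -> dict:
    last_error = ""
    for _ in range(max_attempts):
        # Feed the validation error back so the next attempt can correct it.
        full_prompt = (
            prompt
            if not last_error
            else f"{prompt}\n\nPrevious output was rejected: {last_error}"
        )
        try:
            return validate(call_model(full_prompt), schema)
        except ValueError as exc:
            last_error = str(exc)
    raise RuntimeError(f"no valid output after {max_attempts} attempts: {last_error}")
```

The retry budget is capped so a model that never produces valid JSON fails loudly instead of looping.
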

Image / Vision Applications

Image Generation [Vision]
Create pictures from prompts or references.
  • Use: lifestyle shots, ads, hero banners.
  • 2025: Latent diffusion/flow dominate; control via conditioning.
Image Editing (In/Out-painting) [Vision]
Modify parts of an image or extend the canvas.
  • Use: background swaps, object removal.
  • 2025: Masks + prompt conditioning = precise control.
Super-Resolution [Vision]
Increase image resolution and clarity.
  • Use: e-commerce packshots.
  • 2025: GAN upscalers are strong on small, simple images; diffusion upscalers handle complex detail better.
OCR (Text from Images/PDF) [Vision]
Turn scans into searchable text.
  • Use: bills, receipts, forms.
  • 2025: Layout-aware OCR, paired with RAG, improves answer accuracy on scanned documents.
Image Captioning / Alt Text [Vision]
Describe images for SEO/accessibility.
  • Use: auto alt text, catalog prep.
  • 2025: Multimodal LLMs with brand style constraints.
Visual Grounding / Control [Vision]
Guide generation with edges, poses, or boxes.
  • Use: consistent catalogs, product swaps.
  • 2025: Control adapters standard in pipelines.

Audio / Speech Applications

Speech-to-Text (ASR) [Audio]
Transcribe audio into text.
  • Use: meetings, call centers.
  • 2025: Quality hinges on diarization + timestamps.
Text-to-Speech (TTS) [Audio]
Speak text in natural voices.
  • Use: IVR, narration, accessibility.
  • 2025: Prompt-conditioned style & language switching.
Voice Cloning (Consent-based) [Audio]
Replicate a specific voice.
  • Use: branded voice assets.
  • 2025: Strong consent/gating required; watermark outputs.
Audio Enhancement / Denoise [Audio]
Clean and level recordings.
  • Use: podcasts, support calls.
  • 2025: Diffusion-based denoisers improve quality at low SNR.

Video Applications

Video Generation [Video]
Create short clips from prompts or images.
  • Use: ads, explainers.
  • 2025: Flow/consistency models cut per-frame generation cost; expect limits on clip length.
Captioning & Subtitles [Video]
Add time-aligned text tracks.
  • Use: social content, accessibility.
  • 2025: Auto translate + brand-safe profanity filters.
Scene Detection [Video]
Break videos into logical shots/scenes.
  • Use: highlight reels, editing.
  • 2025: Multimodal detectors improve on color-histogram-only methods.

Code / Dev Applications

Code Completion [Code]
Suggest next lines as you type.
  • Use: IDE copilots.
  • 2025: Repo-aware RAG + tests for safety.
Code Generation & Repair [Code]
Produce functions, fix bugs, write tests.
  • Use: scaffolding new services.
  • 2025: Sandbox + unit-test loops, not blind merges.
SQL Generation [Code]
Translate questions into SQL.
  • Use: self-serve analytics.
  • 2025: Enforce schemas + safety rails to prevent destructive queries.
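
Example sketch (safety rail for generated SQL): a deliberately crude, regex-based pre-check that only lets through a single read-only statement over allowlisted tables. The table names are hypothetical, and the check is conservative (it may reject valid queries, for example ones using CTE names). Treat it as one layer in front of a read-only database role, not a replacement for one.

```python
import re
from typing import Set

# Keywords that should never appear in a self-serve analytics query.
FORBIDDEN = re.compile(
    r"\b(insert|update|delete|drop|alter|truncate|create|grant|attach|pragma)\b",
    re.IGNORECASE,
)
TABLE_REF = re.compile(r"\b(?:from|join)\s+([A-Za-z_][A-Za-z0-9_]*)", re.IGNORECASE)


def is_safe_select(sql: str, allowed_tables: Set[str]) -> bool:
    """Return True only for one SELECT/WITH statement over allowlisted tables."""
    statement = sql.strip().rstrip(";").strip()
    if ";" in statement:  # more than one statement
        return False
    if not re.match(r"(?i)^(select|with)\b", statement):
        return False
    if FORBIDDEN.search(statement):
        return False
    # Every table named after FROM/JOIN must be on the allowlist (lowercase names).
    tables = {name.lower() for name in TABLE_REF.findall(statement)}
    return bool(tables) and tables <= allowed_tables
```
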
Documentation Assist [Code]
Auto-create docstrings, READMEs, and examples.
  • Use: onboarding speed.
  • 2025: Keep sources in repo RAG to avoid drift.

RAG / Search Patterns

Hybrid Search (BM25 + Vectors) [RAG]
Combine keyword and semantic search.
  • Use: precision on rare terms + recall on synonyms.
  • 2025: Default baseline for robust RAG.
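
Example sketch (fusing keyword and vector results): reciprocal rank fusion (RRF) is one common way to combine a BM25 ranking with a vector ranking without having to calibrate their scores against each other. The card does not prescribe a fusion method, so this is an assumed choice; `k = 60` is a conventional default, and the doc ids are made up.

```python
from collections import defaultdict
from typing import Dict, List


def reciprocal_rank_fusion(rankings: List[List[str]], k: int = 60) -> List[str]:
    """Merge several ranked lists of doc ids into one fused ranking.

    Each list contributes 1 / (k + rank) per document; higher totals rank first.
    Ties are broken by doc id so the output is deterministic.
    """
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


bm25_hits = ["d3", "d1", "d7"]    # keyword search order (illustrative)
vector_hits = ["d1", "d5", "d3"]  # semantic search order (illustrative)
fused = reciprocal_rank_fusion([bm25_hits, vector_hits])
```

Documents that appear in both lists (here `d1` and `d3`) rise to the top, which is the behavior you want from a hybrid baseline.
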
Query Rewriting [RAG]
Rewrite user questions into more searchable forms.
  • Use: handle typos, synonyms, long questions.
  • 2025: Learned rewriters lift recall materially.
Late-Interaction / Multi-Vector [RAG]
Store multiple vectors per doc for long-doc recall.
  • Use: policies, manuals.
  • 2025: Outperforms single-vector retrieval on long passages.
Cross-Encoder Reranker [RAG]
Reorder retrieved chunks by true relevance.
  • Use: boost precision/accuracy.
  • 2025: Can markedly increase correct answers vs. vectors alone; gains vary by corpus.
Citations & Attribution [RAG]
Show sources for each answer.
  • Use: trust & audit.
  • 2025: Mandatory for policy/health/finance content.
Index Freshness [RAG]
Keep your knowledge base up to date.
  • Use: nightly/streaming updates.
  • 2025: Content drift is a top hidden failure mode of RAG.
Chunking Strategy [RAG]
Split docs with headings & overlap.
  • 2025: Semantic chunkers beat fixed windows on varied docs.
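
Example sketch (heading-aware chunks with overlap): a minimal take on "split with headings & overlap", assuming headings are Markdown-style lines starting with `#`. Word counts stand in for tokens, and the default sizes are illustrative only.

```python
from typing import Dict, List, Tuple


def chunk_by_heading(
    text: str, max_words: int = 200, overlap: int = 40
) -> List[Dict[str, str]]:
    """Split text into sections at '#' headings, then window each section."""
    if overlap >= max_words:
        raise ValueError("overlap must be smaller than max_words")

    # Pass 1: group body words under the most recent heading.
    sections: List[Tuple[str, List[str]]] = []
    heading, body = "", []
    for line in text.splitlines():
        if line.startswith("#"):
            if body:
                sections.append((heading, " ".join(body).split()))
            heading, body = line.lstrip("# ").strip(), []
        elif line.strip():
            body.append(line.strip())
    if body:
        sections.append((heading, " ".join(body).split()))

    # Pass 2: sliding window inside each section; windows never cross headings.
    chunks: List[Dict[str, str]] = []
    step = max_words - overlap
    for heading, words in sections:
        for start in range(0, len(words), step):
            window = words[start:start + max_words]
            chunks.append({"heading": heading, "text": " ".join(window)})
            if start + max_words >= len(words):
                break  # the last window already reached the section end
    return chunks
```

Storing the heading on every chunk gives the retriever and reranker a cheap piece of metadata to filter on.
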

Agents / Orchestration

Function Calling [Agentic]
Let the model call APIs safely.
  • Use: calculators, DB, CRM.
  • 2025: Strict schemas + rate limits + timeouts required.
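
Example sketch (strict schema + timeout around a tool call): a small dispatcher that rejects unknown tools and mismatched arguments before anything runs, then enforces a time limit. The `add` tool, the registry layout, and the call format (`name` + `arguments`) are assumptions for illustration; rate limiting is left out to keep the sketch short.

```python
import json
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout


def add(a: float, b: float) -> float:
    return a + b


# Hypothetical tool registry: function plus the exact parameters it accepts.
TOOLS = {"add": {"fn": add, "params": {"a": (int, float), "b": (int, float)}}}
_POOL = ThreadPoolExecutor(max_workers=4)


def dispatch(call_json: str, timeout_s: float = 2.0) -> dict:
    """Validate a model-produced tool call, then run it with a time limit."""
    try:
        call = json.loads(call_json)
        name, args = call["name"], call["arguments"]
    except (ValueError, KeyError, TypeError):
        return {"ok": False, "error": "malformed tool call"}
    if not isinstance(name, str) or name not in TOOLS:
        return {"ok": False, "error": f"unknown tool: {name}"}
    spec = TOOLS[name]
    # Strict schema: exactly the declared parameters, nothing extra or missing.
    if not isinstance(args, dict) or set(args) != set(spec["params"]):
        return {"ok": False, "error": "arguments do not match schema"}
    for key, types in spec["params"].items():
        if isinstance(args[key], bool) or not isinstance(args[key], types):
            return {"ok": False, "error": f"bad type for {key}"}
    future = _POOL.submit(spec["fn"], **args)
    try:
        return {"ok": True, "result": future.result(timeout=timeout_s)}
    except FutureTimeout:
        # The worker thread keeps running; real tools need their own cancellation.
        return {"ok": False, "error": "tool timed out"}
```

Returning structured errors instead of raising lets the model see why a call was rejected and try again within its step budget.
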
Planning & Reflection [Agentic]
Model drafts a plan, executes, then self-critiques.
  • Use: multi-step tasks.
  • 2025: Cap steps/budget to avoid loops.
Routing (Model/Tool Selection) [Agentic]
Pick the right tool/model for each query.
  • Use: SLM for simple, LLM for hard.
  • 2025: Policy-based routers reduce cost massively.
Guardrails [Agentic]
Enforce allowed tools, inputs, outputs.
  • Use: prevent unsafe actions.
  • 2025: Combine classifiers + allowlists + human-in-loop.

Data & Pipeline Tools (Glossary)

Embedding Models [Tooling]
Turn items into vectors for search/clustering.
  • 2025: Domain-tuned embeddings win over general ones.
Vector Databases [Tooling]
Store and search embeddings efficiently.
  • 2025: Choose for filters, hybrid, and ops comfort.
Document Loaders [Tooling]
Ingest PDFs, HTML, docs into pipelines.
  • 2025: Layout-aware parsing matters (tables, headers).
Table Extraction [Tooling]
Pull structured tables from docs.
  • 2025: Use HTML/CSV outputs; avoid raw text tables.
Metadata Enrichment [Tooling]
Attach titles, authors, topics to chunks.
  • 2025: Critical for filtering and reranking features.
Feature Store [Tooling]
Central hub for ML features/embeddings.
  • 2025: Helps reuse vectors across products.
Evaluation Datasets (Golden Set) [Tooling]
A fixed set of queries with true answers.
  • 2025: Version & refresh quarterly to prevent overfitting.
Observability & Tracing [Tooling]
Record prompts, tool calls, latencies, costs.
  • 2025: Required for debugging and compliance.
Prompt Management [Tooling]
Version, test, and roll back prompts.
  • 2025: Treat prompts like code with A/B tests.
Caching Layers [Tooling]
Reuse responses to cut latency/cost.
  • 2025: TTL + key strategy avoids stale answers.
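
Example sketch (TTL + key strategy): an in-memory response cache where the key includes the model name and a hypothetical `index_version`, so answers are not reused after the model or the knowledge base changes. Prompt normalization here is just lowercasing and whitespace collapsing; that choice is an assumption and may be too aggressive for case-sensitive tasks.

```python
import hashlib
import time
from typing import Callable, Dict, Optional, Tuple


class TTLCache:
    """Tiny response cache with per-entry expiry."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock  # injectable so expiry can be tested without sleeping
        self._store: Dict[str, Tuple[float, str]] = {}

    @staticmethod
    def make_key(model: str, prompt: str, index_version: str) -> str:
        # Normalize whitespace and case so trivially different prompts share a key.
        normalized = " ".join(prompt.lower().split())
        raw = f"{model}|{index_version}|{normalized}"
        return hashlib.sha256(raw.encode("utf-8")).hexdigest()

    def get(self, key: str) -> Optional[str]:
        entry = self._store.get(key)
        if entry is None:
            return None
        stored_at, value = entry
        if self.clock() - stored_at > self.ttl:
            del self._store[key]  # expired: drop it and report a miss
            return None
        return value

    def put(self, key: str, value: str) -> None:
        self._store[key] = (self.clock(), value)
```
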

LLMOps, Safety & Governance

Red-Teaming [Governance]
Actively try to break the system to find risks.
  • 2025: Automate with attack libraries; review high-risk outputs weekly.
PII Handling & Privacy [Governance]
Protect personal data across ingestion and outputs.
  • 2025: Masking + allowlisted fields + data-minimization.
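
Example sketch (masking + allowlisted fields): a rough illustration of the 2025 note above. The regexes are intentionally simple and will miss some formats and over-match others (long digit runs look like phone numbers); the field names are hypothetical. Real deployments usually pair patterns like these with a dedicated PII detector.

```python
import re

EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE = re.compile(r"\+?\d[\d\s().-]{7,}\d")

# Data minimization: only these (hypothetical) fields may reach the model.
ALLOWED_FIELDS = {"ticket_id", "issue", "product"}


def mask_pii(text: str) -> str:
    """Replace emails and phone-like digit runs with placeholders."""
    text = EMAIL.sub("[EMAIL]", text)
    return PHONE.sub("[PHONE]", text)


def minimize(record: dict) -> dict:
    """Drop non-allowlisted fields, then mask what remains."""
    return {
        key: mask_pii(value) if isinstance(value, str) else value
        for key, value in record.items()
        if key in ALLOWED_FIELDS
    }
```
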
Copyright & Licensing [Governance]
Respect content licenses and track sources.
  • 2025: Keep provenance tags; use safe-source corpora for fine-tuning.
Watermarking / Provenance [Governance]
Mark or detect AI-generated media.
  • 2025: Expect mixed effectiveness; combine with policy controls.
Bias & Fairness Checks [Governance]
Detect skewed outcomes across groups.
  • 2025: Use dashboards; escalate high-impact flows to human review.
Abuse Prevention & Rate Limits [Governance]
Throttle suspicious usage and block abusive prompts.
  • 2025: Required for public endpoints and agents.

Evaluation & Metrics (Applied)

Exact-Match / F1 (Q&A) [Eval]
Measure correctness against a ground truth.
  • 2025: Add citation rate and harmful error rate.
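
Example sketch (exact match and token-level F1): a compact version of the usual extractive-QA scoring recipe, assuming SQuAD-style normalization (lowercase, strip punctuation and English articles). Other normalization rules are equally valid; pick one and keep it fixed across runs.

```python
import re
import string
from collections import Counter


def normalize(text: str) -> str:
    """Lowercase, drop punctuation and articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, gold: str) -> float:
    return float(normalize(prediction) == normalize(gold))


def token_f1(prediction: str, gold: str) -> float:
    pred_tokens = normalize(prediction).split()
    gold_tokens = normalize(gold).split()
    if not pred_tokens or not gold_tokens:
        # Both empty counts as a match; one empty counts as a miss.
        return float(pred_tokens == gold_tokens)
    # Multiset intersection counts each shared token at most as often as it appears in both.
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)
```

F1 gives partial credit when the answer is right but padded with extra words, which exact match scores as zero.
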
Retrieval Metrics [Eval]
Recall@k, MRR, nDCG for search quality.
  • 2025: Evaluate before and after reranker.
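
Example sketch (Recall@k, MRR, nDCG): plain-Python versions of the three metrics named above, assuming binary relevance (a doc is either relevant or not). Running the same functions on the retriever's order and on the reranked order gives the before/after comparison; the doc ids are made up.

```python
import math
from typing import List, Set, Tuple


def recall_at_k(ranked: List[str], relevant: Set[str], k: int) -> float:
    """Fraction of relevant docs that appear in the top k."""
    if not relevant:
        return 0.0
    return len(set(ranked[:k]) & relevant) / len(relevant)


def reciprocal_rank(ranked: List[str], relevant: Set[str]) -> float:
    """1 / rank of the first relevant doc, or 0 if none was retrieved."""
    for rank, doc_id in enumerate(ranked, start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(runs: List[Tuple[List[str], Set[str]]]) -> float:
    """MRR: average reciprocal rank over (ranking, relevant set) pairs."""
    return sum(reciprocal_rank(r, rel) for r, rel in runs) / len(runs) if runs else 0.0


def ndcg_at_k(ranked: List[str], relevant: Set[str], k: int) -> float:
    """Binary-relevance nDCG: discounted gain divided by the best possible gain."""
    dcg = sum(
        1.0 / math.log2(rank + 1)
        for rank, doc_id in enumerate(ranked[:k], start=1)
        if doc_id in relevant
    )
    ideal = sum(1.0 / math.log2(rank + 1) for rank in range(1, min(len(relevant), k) + 1))
    return dcg / ideal if ideal else 0.0


relevant = {"d1", "d2"}
before_rerank = ["d4", "d1", "d9", "d2"]  # illustrative retriever output
after_rerank = ["d1", "d2", "d4", "d9"]   # illustrative reranker output
```
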
Human Evaluation [Eval]
Humans rate outputs for quality and safety.
  • 2025: Rubrics + double-blind reviews reduce drift.
Latency & Cost KPIs [Eval]
Track time-to-first-token, total time, and price per answer.
  • 2025: Set SLOs; stream for UX.
A/B & Canary Rollouts [Eval]
Compare variants safely in production.
  • 2025: Gate new prompts/models with traffic splits.
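
Example sketch (deterministic traffic split): one simple way to gate a new prompt or model is to hash the user id into a bucket from 0 to 99 and send only the lowest buckets to the canary. The experiment name and percentages are illustrative assumptions.

```python
import hashlib


def assign_variant(user_id: str, experiment: str, canary_percent: int) -> str:
    """Stable assignment: the same user always lands in the same variant.

    Including the experiment name in the hash keeps splits independent
    across experiments.
    """
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).hexdigest()
    bucket = int(digest[:8], 16) % 100
    return "canary" if bucket < canary_percent else "control"
```

Raising `canary_percent` only moves users from control to canary, never the other way, so a ramp from 5 to 50 keeps early canary users on the new variant.
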

Efficiency & Deployment (LLM Inference)

Quantization [Deploy]
Use fewer bits to store weights for speed/cost wins.
  • 2025: 4-bit weights (with higher-precision compute) are common for SLMs.
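
Example sketch (symmetric 4-bit quantization): a toy version of "fewer bits per weight" using one scale for the whole list and the signed 4-bit range -8 to 7. Production schemes typically use per-group scales and packed storage; this only shows the round-trip and its error bound.

```python
from typing import List, Tuple


def quantize_int4(weights: List[float]) -> Tuple[List[int], float]:
    """Map floats to integers in [-8, 7] using a single shared scale."""
    max_abs = max((abs(w) for w in weights), default=0.0)
    scale = max_abs / 7 if max_abs else 1.0
    quantized = [max(-8, min(7, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized: List[int], scale: float) -> List[float]:
    """Recover approximate floats; the error is at most scale / 2 per weight."""
    return [q * scale for q in quantized]


weights = [0.7, -0.28, 0.1, 0.0]
q, scale = quantize_int4(weights)
restored = dequantize(q, scale)
```
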
Knowledge Distillation [Deploy]
Train a small “student” from a large “teacher”.
  • 2025: Core tactic for on-device assistants.
Batching [Deploy]
Process multiple requests at once for throughput.
  • 2025: Pair with priority queues for interactivity.
KV Cache [Deploy]
Reuse cached attention keys/values from earlier tokens to speed generation.
  • 2025: Essential for chat latency at scale.
Model Parallelism [Deploy]
Split a model across devices when it doesn’t fit one.
  • 2025: Prefer SLMs + quantization before sharding.
Inference Servers (vLLM, Triton, llama.cpp, ONNX Runtime) [Deploy]
Serve models efficiently with queuing/caching.
  • 2025: Pick by hardware + latency targets.

Named Tools & Stacks (Descriptive)

Hosted Model APIs [Vendors]
Access strong models via API.
  • 2025: Compare price, latency, safety, privacy.
Llama (Open Weights) [Models]
Open-weight base for fine-tuning or local use.
  • 2025: Check licenses; great SLM options.
Hugging Face (Transformers/Datasets) [Ecosystem]
Model zoo + training/inference helpers.
  • 2025: Rapid prototyping standard.
LangChain / LlamaIndex [Orchestration]
Wire prompts, tools, memory, retrieval.
  • 2025: Useful, but keep control and add tests.
Pinecone / Weaviate / Milvus / FAISS [Vector DB]
Embedding stores with ANN search.
  • 2025: Choose by ops model & hybrid search.
Weights & Biases / MLflow [Tracking]
Experiment and model tracking.
  • 2025: Needed for eval/drift audits.
Airflow / Dagster [Orchestration]
Schedule pipelines and data jobs.
  • 2025: Use for nightly indexing & eval jobs.
Postgres / Elasticsearch / OpenSearch / Supabase [Data]
Text/keyword search & storage backbones.
  • 2025: Strong hybrid search when combined with vectors.
vLLM / Triton / llama.cpp / ONNX Runtime / Ollama [Serving]
Serve or locally run models for cost control.
  • 2025: Pick by hardware, scale, and privacy needs.

Last updated: 2025-09 • Scope mirrors your Coursera Module 2 (“Applications & Tools”) but written for speed-to-mastery and current practices.