Module 2 — Applications & Tools of Generative AI
Hands-on, plain-English cards for **what to build and with what**. Each card: one-line meaning, quick use-cases, and a 2025 note so nothing feels stale.
Text / NLP Applications
Summarization • Text
Condense long content into key points.
- Use: meeting notes, legal summaries, blog TL;DR.
- 2025: Map-reduce + citations curb hallucinations.
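The map-reduce note above can be sketched as follows. This is a minimal illustration, not a definitive implementation: `summarize` is a hypothetical callable standing in for whatever model call you use, and the `[n]` chunk markers are an assumed, simple way to keep a citation trail back to the source chunks.

```python
from typing import Callable, List, Tuple


def split_into_chunks(text: str, max_words: int) -> List[str]:
    """Split text into consecutive chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def map_reduce_summarize(
    text: str,
    summarize: Callable[[str], str],  # hypothetical model call
    max_words: int = 200,
) -> Tuple[str, List[Tuple[int, str]]]:
    """Summarize each chunk (map), then summarize the numbered partials (reduce)."""
    chunks = split_into_chunks(text, max_words)
    # Map step: one partial summary per chunk, tagged with its chunk number.
    partials = [(i, summarize(chunk)) for i, chunk in enumerate(chunks, start=1)]
    # Reduce step: the chunk numbers stay in the text so the final summary can cite them.
    combined = "\n".join(f"[{i}] {summary}" for i, summary in partials)
    return summarize(combined), partials
```

Returning the partials alongside the final summary lets a reviewer check each cited chunk against the source.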
Translation (Multilingual) • Text
Convert text across languages.
- Use: Bangla↔English content ops.
- 2025: Domain glossaries + constraints improve names/brands.
Classification (Intent/Topic) • Text
Assign labels to text.
- Use: support triage, sentiment, lead quality.
- 2025: Fine-tuned, task-specific SLMs often outperform generic prompting of larger models.
Information Extraction • Text
Pull structured fields from messy text.
- Use: invoices, resumes, policy docs.
- 2025: Constrained JSON schemas + validators for reliability.
Content Generation • Text
Draft net-new text in a chosen voice.
- Use: blogs, ads, product pages.
- 2025: LoRA adapters + style guides > prompt-only.
Document Q&A • Text
Answer questions grounded in documents.
- Use: HR/Legal/Policy assistants.
- 2025: Retrieval + reranker + short context beats dumping whole PDFs.
Style Transfer • Text
Rewrite text to match a brand voice.
- Use: single-source content → multiple personas.
- 2025: Guard with examples + banned patterns.
Structured Output (JSON/CSV) • Text
Force outputs into machine-readable shapes.
- Use: pipelines, analytics.
- 2025: Use schema + validators + retries on failure.
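A minimal sketch of the schema + validator + retry loop, using only the standard library. The field names in `REQUIRED_FIELDS` and the `call_model` callable are hypothetical placeholders; a real pipeline would likely use a fuller schema validator.

```python
import json
from typing import Callable, Dict, Optional

# Hypothetical schema: field name -> required Python type.
REQUIRED_FIELDS: Dict[str, type] = {"invoice_id": str, "total": float}


def validate(raw: str) -> dict:
    """Parse raw model output and check it against REQUIRED_FIELDS."""
    data = json.loads(raw)  # raises json.JSONDecodeError (a ValueError) on bad JSON
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    for name, expected in REQUIRED_FIELDS.items():
        if name not in data:
            raise ValueError(f"missing field: {name}")
        if not isinstance(data[name], expected):
            raise ValueError(f"field {name} should be {expected.__name__}")
    return data


def generate_with_retries(
    call_model: Callable[[str], str],  # hypothetical model call
    prompt: str,
    max_attempts: int = 3,
) -> dict:
    """Call the model until its output validates, feeding the error back each time."""
    last_error: Optional[Exception] = None
    for _ in range(max_attempts):
        hint = "" if last_error is None else f"\nPrevious output was invalid: {last_error}"
        try:
            return validate(call_model(prompt + hint))
        except ValueError as err:
            last_error = err
    raise RuntimeError(f"no valid output after {max_attempts} attempts: {last_error}")
```

Capping attempts keeps a persistently malformed output from turning into an unbounded cost loop.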
Image / Vision Applications
Image Generation • Vision
Create pictures from prompts or references.
- Use: lifestyle shots, ads, hero banners.
- 2025: Latent diffusion/flow dominate; control via conditioning.
Image Editing (In/Out-painting) • Vision
Modify parts of an image or extend the canvas.
- Use: background swaps, object removal.
- 2025: Masks + prompt conditioning = precise control.
Super-Resolution • Vision
Increase image resolution and clarity.
- Use: e-commerce packshots.
- 2025: GAN upscalers remain strong on small images; diffusion suits heavier restoration and complex edits.
OCR (Text from Images/PDF) • Vision
Turn scans into searchable text.
- Use: bills, receipts, forms.
- 2025: Layout-aware OCR greatly improves extraction accuracy and gives RAG cleaner text to index.
Image Captioning / Alt Text • Vision
Describe images for SEO/accessibility.
- Use: auto alt text, catalog prep.
- 2025: Multimodal LLMs with brand style constraints.
Visual Grounding / Control • Vision
Guide generation with edges, poses, or boxes.
- Use: consistent catalogs, product swaps.
- 2025: Control adapters standard in pipelines.
Audio / Speech Applications
Speech-to-Text (ASR) • Audio
Transcribe audio into text.
- Use: meetings, call centers.
- 2025: Quality hinges on diarization + timestamps.
Text-to-Speech (TTS) • Audio
Speak text in natural voices.
- Use: IVR, narration, accessibility.
- 2025: Prompt-conditioned style & language switching.
Voice Cloning (Consent-based) • Audio
Replicate a specific voice.
- Use: branded voice assets.
- 2025: Strong consent/gating required; watermark outputs.
Audio Enhancement / Denoise • Audio
Clean and level recordings.
- Use: podcasts, support calls.
- 2025: Diffusion-based denoisers improve quality at low SNR.
Video Applications
Video Generation • Video
Create short clips from prompts or images.
- Use: ads, explainers.
- 2025: Flow/consistency models cut sampling steps, and so per-frame cost; expect limits on clip length.
Captioning & Subtitles • Video
Add time-aligned text tracks.
- Use: social content, accessibility.
- 2025: Auto translate + brand-safe profanity filters.
Scene Detection • Video
Break videos into logical shots/scenes.
- Use: highlight reels, editing.
- 2025: Multimodal detectors improve on color-histogram-only methods.
Code / Dev Applications
Code Completion • Code
Suggest next lines as you type.
- Use: IDE copilots.
- 2025: Repo-aware RAG + tests for safety.
Code Generation & Repair • Code
Produce functions, fix bugs, write tests.
- Use: scaffolding new services.
- 2025: Sandbox + unit-test loops, not blind merges.
SQL Generation • Code
Translate questions into SQL.
- Use: self-serve analytics.
- 2025: Enforce schemas + safety rails to prevent destructive queries.
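One possible safety rail is a read-only gate in front of the database. The sketch below is a conservative keyword check, not a SQL parser; the keyword list is an assumption, and in practice it should be paired with a read-only database role.

```python
import re

# Assumed deny-list of statement keywords that modify data or schema.
FORBIDDEN = re.compile(
    r"\b(insert|update|delete|drop|alter|truncate|create|grant)\b", re.IGNORECASE
)
READ_ONLY_START = re.compile(r"^\s*(select|with)\b", re.IGNORECASE)


def is_safe_select(sql: str) -> bool:
    """Return True only for a single statement that looks read-only."""
    statement = sql.strip().rstrip(";").strip()
    if ";" in statement:  # more than one statement
        return False
    if not READ_ONLY_START.match(statement):
        return False
    return FORBIDDEN.search(statement) is None
```

The check errs on the side of rejecting: a harmless query that mentions a forbidden word as a standalone token would also be blocked.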
Documentation Assist • Code
Auto-create docstrings, READMEs, and examples.
- Use: onboarding speed.
- 2025: Keep sources in repo RAG to avoid drift.
RAG / Search Patterns
Hybrid Search (BM25 + Vectors) • RAG
Combine keyword and semantic search.
- Use: precision on rare terms + recall on synonyms.
- 2025: Default baseline for robust RAG.
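The card does not say how the keyword and vector results are combined. One common option is reciprocal rank fusion (RRF), sketched here as an assumption; the constant `k = 60` is a conventional default, and the document IDs are made up for illustration.

```python
from collections import defaultdict
from typing import Dict, List


def reciprocal_rank_fusion(rankings: List[List[str]], k: int = 60) -> List[str]:
    """Merge several ranked lists of doc IDs; higher combined score ranks first."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by score descending, then by ID so ties are deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


bm25_hits = ["a", "b", "c"]    # hypothetical keyword results
vector_hits = ["c", "a", "d"]  # hypothetical semantic results
fused = reciprocal_rank_fusion([bm25_hits, vector_hits])
```

RRF uses only ranks, so it sidesteps the problem that BM25 and cosine scores live on different scales.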
Query Rewriting • RAG
Rewrite user questions into more searchable forms.
- Use: handle typos, synonyms, long questions.
- 2025: Learned rewriters lift recall materially.
Late-Interaction / Multi-Vector • RAG
Store multiple vectors per doc for long-doc recall.
- Use: policies, manuals.
- 2025: Outperforms single-vector for long passages.
Cross-Encoder Reranker • RAG
Reorder retrieved chunks by true relevance.
- Use: boost precision/accuracy.
- 2025: Often lifts answer accuracy substantially over vector retrieval alone; measure the gain on your own golden set.
Citations & Attribution • RAG
Show sources for each answer.
- Use: trust & audit.
- 2025: Mandatory for policy/health/finance content.
Index Freshness • RAG
Keep your knowledge base up to date.
- Use: nightly/streaming updates.
- 2025: Stale content is one of the most common hidden failures of RAG.
Chunking Strategy • RAG
Split docs with headings & overlap.
- 2025: Semantic chunkers beat fixed windows on varied docs.
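For reference, here is a minimal sketch of the fixed-window baseline with overlap. It shows only the overlap mechanics; heading-aware or semantic splitting is not shown, and the sizes are arbitrary assumptions.

```python
from typing import List


def chunk_words(text: str, size: int = 200, overlap: int = 40) -> List[str]:
    """Split text into windows of `size` words, each sharing `overlap` words with the previous one."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be >= 0 and smaller than size")
    words = text.split()
    step = size - overlap
    chunks: List[str] = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):  # this window already reached the end
            break
    return chunks
```

The overlap keeps a sentence that straddles a boundary retrievable from at least one chunk.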
Agents / Orchestration
Function Calling • Agentic
Let the model call APIs safely.
- Use: calculators, DB, CRM.
- 2025: Strict schemas + rate limits + timeouts required.
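A minimal sketch of the strict-schema part: every tool call is checked against an allowlisted registry before it runs. The `TOOLS` registry and the call shape (`name`, `arguments`) are hypothetical, and rate limits and timeouts are not shown.

```python
from typing import Any, Dict

# Hypothetical registry: tool name -> accepted argument types and the function to run.
TOOLS: Dict[str, Dict[str, Any]] = {
    "add": {"params": {"a": (int, float), "b": (int, float)}, "fn": lambda a, b: a + b},
}


def dispatch(call: Dict[str, Any]) -> Any:
    """Validate a model-proposed tool call, then execute it."""
    name = call.get("name")
    if name not in TOOLS:
        raise ValueError(f"tool not allowed: {name}")
    spec = TOOLS[name]
    args = call.get("arguments", {})
    if set(args) != set(spec["params"]):
        raise ValueError("missing or unexpected arguments")
    for key, expected in spec["params"].items():
        if not isinstance(args[key], expected):
            raise ValueError(f"argument {key} has the wrong type")
    return spec["fn"](**args)
```

Rejecting unknown argument names, not just missing ones, stops the model from smuggling extra parameters into a call.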
Planning & Reflection • Agentic
Model drafts a plan, executes, then self-critiques.
- Use: multi-step tasks.
- 2025: Cap steps/budget to avoid loops.
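The step cap can be as simple as a bounded loop around the agent's step function. In this sketch `step` is a hypothetical callable that returns the new state plus a done flag; a token or cost budget could be enforced the same way.

```python
from typing import Callable, Tuple, TypeVar

S = TypeVar("S")


def run_with_step_cap(
    step: Callable[[S], Tuple[S, bool]],  # hypothetical: returns (new_state, done)
    state: S,
    max_steps: int = 5,
) -> Tuple[S, int]:
    """Run step() until it reports done, or fail once max_steps is used up."""
    for steps_used in range(1, max_steps + 1):
        state, done = step(state)
        if done:
            return state, steps_used
    raise RuntimeError(f"step budget of {max_steps} exhausted")
```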
Routing (Model/Tool Selection) • Agentic
Pick the right tool/model for each query.
- Use: SLM for simple, LLM for hard.
- 2025: Policy-based routers can cut cost substantially by sending only hard queries to the LLM.
Guardrails • Agentic
Enforce allowed tools, inputs, outputs.
- Use: prevent unsafe actions.
- 2025: Combine classifiers + allowlists + human-in-loop.
Data & Pipeline Tools (Glossary)
Embedding Models • Tooling
Turn items into vectors for search/clustering.
- 2025: Domain-tuned embeddings win over general ones.
Vector Databases • Tooling
Store and search embeddings efficiently.
- 2025: Choose for filters, hybrid, and ops comfort.
Document Loaders • Tooling
Ingest PDFs, HTML, docs into pipelines.
- 2025: Layout-aware parsing matters (tables, headers).
Table Extraction • Tooling
Pull structured tables from docs.
- 2025: Use HTML/CSV outputs; avoid raw text tables.
Metadata Enrichment • Tooling
Attach titles, authors, topics to chunks.
- 2025: Critical for filtering and reranking features.
Feature Store • Tooling
Central hub for ML features/embeddings.
- 2025: Helps reuse vectors across products.
Evaluation Datasets (Golden Set) • Tooling
A fixed set of queries with true answers.
- 2025: Version & refresh quarterly to prevent overfitting.
Observability & Tracing • Tooling
Record prompts, tool calls, latencies, costs.
- 2025: Required for debugging and compliance.
Prompt Management • Tooling
Version, test, and roll back prompts.
- 2025: Treat prompts like code with A/B tests.
Caching Layers • Tooling
Reuse responses to cut latency/cost.
- 2025: TTL + key strategy avoids stale answers.
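A minimal in-memory sketch of a TTL cache with an explicit key strategy. Keying on model name plus prompt is an assumption; a real key would likely also include the prompt version and any retrieval context.

```python
import hashlib
import time
from typing import Callable, Dict, Optional, Tuple


class TTLCache:
    """Tiny response cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._store: Dict[str, Tuple[float, str]] = {}

    @staticmethod
    def make_key(model: str, prompt: str) -> str:
        """Hash everything that should change the answer into one key."""
        return hashlib.sha256(f"{model}\n{prompt}".encode("utf-8")).hexdigest()

    def get(self, key: str) -> Optional[str]:
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:  # stale: drop it and report a miss
            del self._store[key]
            return None
        return value

    def set(self, key: str, value: str) -> None:
        self._store[key] = (self._clock() + self._ttl, value)
```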
LLMOps, Safety & Governance
Red-Teaming • Governance
Actively try to break the system to find risks.
- 2025: Automate with attack libraries; review high-risk outputs weekly.
PII Handling & Privacy • Governance
Protect personal data across ingestion and outputs.
- 2025: Masking + allowlisted fields + data-minimization.
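A rough sketch of masking plus field allowlisting. The two regular expressions are deliberately simple assumptions and will miss many real-world formats; production systems typically use a dedicated PII detector.

```python
import re
from typing import Any, Dict, Set

# Simplified patterns, for illustration only.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE = re.compile(r"\+?\d[\d\s-]{7,}\d")


def mask_pii(text: str) -> str:
    """Replace email addresses and phone-like digit runs with placeholders."""
    text = EMAIL.sub("[EMAIL]", text)
    return PHONE.sub("[PHONE]", text)


def keep_allowlisted(record: Dict[str, Any], allowed: Set[str]) -> Dict[str, Any]:
    """Data minimization: pass on only the fields that are explicitly allowed."""
    return {key: value for key, value in record.items() if key in allowed}
```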
Copyright & Licensing • Governance
Respect content licenses and track sources.
- 2025: Keep provenance tags; use safe-source corpora for fine-tuning.
Watermarking / Provenance • Governance
Mark or detect AI-generated media.
- 2025: Expect mixed effectiveness; combine with policy controls.
Bias & Fairness Checks • Governance
Detect skewed outcomes across groups.
- 2025: Use dashboards; escalate high-impact flows to human review.
Abuse Prevention & Rate Limits • Governance
Throttle suspicious usage and block abusive prompts.
- 2025: Required for public endpoints and agents.
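Throttling is often implemented as a token bucket; the card does not name an algorithm, so this choice is an assumption. The sketch keeps one bucket in memory, whereas a public endpoint would keep one per user or API key in shared storage.

```python
import time
from typing import Callable


class TokenBucket:
    """Allow bursts up to `capacity`, refilling at `rate_per_sec` tokens per second."""

    def __init__(self, rate_per_sec: float, capacity: float,
                 clock: Callable[[], float] = time.monotonic):
        self.rate = rate_per_sec
        self.capacity = capacity
        self.tokens = capacity
        self._clock = clock  # injectable so refill can be tested without sleeping
        self._last = clock()

    def allow(self) -> bool:
        """Spend one token if available; otherwise reject the request."""
        now = self._clock()
        self.tokens = min(self.capacity, self.tokens + (now - self._last) * self.rate)
        self._last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```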
Evaluation & Metrics (Applied)
Exact-Match / F1 (Q&A) • Eval
Measure correctness against a ground truth.
- 2025: Add citation rate and harmful error rate.
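A minimal sketch of exact match and token-level F1. Normalization here is only lowercasing and whitespace collapsing, which is an assumption; common benchmark scripts also strip punctuation and articles.

```python
from collections import Counter


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace."""
    return " ".join(text.lower().split())


def exact_match(prediction: str, gold: str) -> bool:
    return normalize(prediction) == normalize(gold)


def token_f1(prediction: str, gold: str) -> float:
    """Harmonic mean of token precision and recall against the gold answer."""
    pred_tokens = normalize(prediction).split()
    gold_tokens = normalize(gold).split()
    if not pred_tokens or not gold_tokens:
        return float(pred_tokens == gold_tokens)
    # Multiset intersection counts each shared token at most as often as it appears in both.
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)
```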
Retrieval Metrics • Eval
Recall@k, MRR, nDCG for search quality.
- 2025: Evaluate before and after reranker.
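The three metrics can be sketched for a single query with binary relevance as below. MRR is then the mean of `reciprocal_rank` over all queries; graded relevance for nDCG is not shown.

```python
import math
from typing import List, Set


def recall_at_k(ranked: List[str], relevant: Set[str], k: int) -> float:
    """Fraction of relevant docs that appear in the top k."""
    if not relevant:
        return 0.0
    return len(set(ranked[:k]) & relevant) / len(relevant)


def reciprocal_rank(ranked: List[str], relevant: Set[str]) -> float:
    """1 / rank of the first relevant doc, or 0 if none is retrieved."""
    for rank, doc_id in enumerate(ranked, start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0


def ndcg_at_k(ranked: List[str], relevant: Set[str], k: int) -> float:
    """DCG of the top k divided by the DCG of an ideal ordering (binary gains)."""
    dcg = sum(
        1.0 / math.log2(rank + 1)
        for rank, doc_id in enumerate(ranked[:k], start=1)
        if doc_id in relevant
    )
    ideal = sum(1.0 / math.log2(rank + 1) for rank in range(1, min(len(relevant), k) + 1))
    return dcg / ideal if ideal else 0.0
```

Running the same functions on the retriever output and on the reranked output gives the before/after comparison the card recommends.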
Human Evaluation • Eval
Humans rate outputs for quality and safety.
- 2025: Rubrics + double-blind reviews reduce drift.
Latency & Cost KPIs • Eval
Track time-to-first-token, total time, and price per answer.
- 2025: Set SLOs; stream for UX.
A/B & Canary Rollouts • Eval
Compare variants safely in production.
- 2025: Gate new prompts/models with traffic splits.
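One way to implement a traffic split is deterministic hashing, so a given user always lands in the same variant. The function and variant names below are hypothetical.

```python
import hashlib


def assign_variant(user_id: str, experiment: str, canary_percent: int) -> str:
    """Map a user to 'canary' or 'control' with a stable hash-based bucket in 0..99."""
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 100
    return "canary" if bucket < canary_percent else "control"
```

Including the experiment name in the hash keeps bucket assignments independent across experiments.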
Efficiency & Deployment (LLM Inference)
Quantization • Deploy
Use fewer bits to store weights for speed/cost wins.
- 2025: 4-bit weight quantization, often with higher-precision activations, is common for SLMs.
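To make "fewer bits" concrete, here is a toy sketch of symmetric 4-bit quantization with a single scale for the whole list. Real schemes usually quantize per group or per channel and pack two values per byte; neither is shown here.

```python
from typing import List, Tuple


def quantize_int4(weights: List[float]) -> Tuple[List[int], float]:
    """Map floats to integers in [-8, 7] using one shared scale."""
    max_abs = max((abs(w) for w in weights), default=0.0)
    scale = max_abs / 7 if max_abs else 1.0
    quantized = [max(-8, min(7, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized: List[int], scale: float) -> List[float]:
    """Recover approximate floats; the rounding error is at most scale / 2 per weight."""
    return [q * scale for q in quantized]
```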
Knowledge Distillation • Deploy
Train a small “student” from a large “teacher”.
- 2025: Core tactic for on-device assistants.
Batching • Deploy
Process multiple requests at once for throughput.
- 2025: Pair with priority queues for interactivity.
KV Cache • Deploy
Reuse previously computed attention keys and values to speed generation.
- 2025: Essential for chat latency at scale.
Model Parallelism • Deploy
Split a model across devices when it doesn’t fit on one.
- 2025: Prefer SLMs + quantization before sharding.
Inference Servers (vLLM, Triton, llama.cpp, ONNX Runtime) • Deploy
Serve models efficiently with queuing/caching.
- 2025: Pick by hardware + latency targets.
Named Tools & Stacks (Descriptive)
Hosted Model APIs • Vendors
Access strong models via API.
- 2025: Compare price, latency, safety, privacy.
Llama (Open Weights) • Models
Open-weight base for fine-tuning or local use.
- 2025: Check licenses; great SLM options.
Hugging Face (Transformers/Datasets) • Ecosystem
Model zoo + training/inference helpers.
- 2025: Rapid prototyping standard.
LangChain / LlamaIndex • Orchestration
Wire prompts, tools, memory, retrieval.
- 2025: Useful, but keep control and add tests.
Pinecone / Weaviate / Milvus / FAISS • Vector DB
Embedding stores with ANN search (FAISS is a library rather than a full database).
- 2025: Choose by ops model & hybrid search.
Weights & Biases / MLflow • Tracking
Experiment and model tracking.
- 2025: Needed for eval/drift audits.
Airflow / Dagster • Orchestration
Schedule pipelines and data jobs.
- 2025: Use for nightly indexing & eval jobs.
Postgres / Elasticsearch / OpenSearch / Supabase • Data
Text/keyword search & storage backbones.
- 2025: Strong hybrid search when combined with vectors.
vLLM / Triton / llama.cpp / ONNX Runtime / Ollama • Serving
Serve or locally run open models for cost control.
- 2025: Pick by hardware, scale, and privacy needs.
Last updated: 2025-09 • Scope mirrors your Coursera Module 2 (“Applications & Tools”) but written for speed-to-mastery and current practices.