4-Day Summer Bootcamp: University-Style Full Syllabus
This single-column webpage syllabus is separate from the poster and written with the density of a short university course. Each day has lecture topics, lab work, key questions, assignments, and a deliverable. The syllabus adds detailed coverage of LLM principles, CNN/LSTM/GRU/seq2seq/attention/Transformer architectures, and their limits in empirical research.
Course Information
Course Type
4-day intensive bootcamp
For learners who want a systematic path through Python, AI, causal inference, and agentic empirical research.
Structure
Lecture + code + empirical case + project output
Each day is organized around one concrete research deliverable.
Core Tools
Python / pandas / sklearn / PyTorch / Stata / R / StatsPAI / Codex
The course emphasizes cross-checking across tools, not dependence on one software stack.
Materials
Example data, code templates, task briefings, replication checklists
All materials support the final reproducible research package.
Audience
Advanced undergraduates, graduate students, junior faculty, and empirical researchers
Especially useful for learners with research questions but no automation workflow.
Learning Objectives
Build a reusable Python empirical-research project template for data, code, tables, figures, logs, and documentation.
Master the basic ML workflow: problem definition, feature construction, train/validation/test splits, metrics, overfitting diagnosis, and model interpretation.
Understand the architectures of CNNs, RNNs, LSTMs, GRUs, seq2seq models, attention, Transformers, and large language models rather than memorizing model names.
Explain how LLM pretraining, instruction tuning, alignment, RAG, function calling, MCP, and tool use enter research workflows.
Separate prediction tasks from causal-identification tasks and learn when RCT, DID, PSM, IV, RD, synthetic control, DML, and causal forests are appropriate.
Turn a natural-language research task into an executable agent plan that leaves auditable files, logs, and outputs.
Practice publication-grade habits: data dictionaries, sample restrictions, robustness checks, interpretation, replication packages, and human review checkpoints.
Prerequisites
At least one course in statistics, econometrics, or empirical research methods is recommended.
Fluency in Python is not required, but students should be ready to install environments, run notebooks, and read basic code.
Students are encouraged to bring one research question, dataset, or paper they want to replicate.
Learners with Stata or R experience can bring existing do-files or R scripts for comparison with Python workflows.
The course assumes one principle: AI can accelerate research, but it cannot replace researcher judgment about identification, data quality, and claims.
Module Schedule
The four days move from data workflow to AI modeling, causal identification, and agent automation. Each day produces a research component that can be saved, rerun, and extended.
Day 1
Python Programming, Data Engineering, and Reproducible Project Structure
The first day moves research work from manual operations to scripted workflows. The goal is not a syntax sprint; it is a reusable project skeleton for future empirical research.
Lecture Topics
Data access: CSV/Excel, APIs, requests, BeautifulSoup, webpage structure, and scraping limits.
pandas: missing values, duplicates, type conversion, merge, concat, groupby, pivot, reshape, and time variables.
Visualization and outputs: matplotlib, seaborn, descriptive tables, figure export, and logs.
Reproducible structure: raw data stays untouched, processed data is generated, scripts rerun, and outputs are traceable.
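The reproducible structure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the folder names and the helper names `create_skeleton` and `file_sha256` are hypothetical and not prescribed by the course. The fingerprint helper shows one way to check that raw data stayed untouched between runs.

```python
from pathlib import Path
import hashlib

# Hypothetical folder names; adapt them to your own project conventions.
FOLDERS = [
    "data/raw", "data/processed", "scripts",
    "output/tables", "output/figures", "logs", "docs",
]

def create_skeleton(root):
    """Create the project folders under `root` and return them as Path objects."""
    root = Path(root)
    created = []
    for name in FOLDERS:
        folder = root / name
        folder.mkdir(parents=True, exist_ok=True)
        created.append(folder)
    return created

def file_sha256(path):
    """Fingerprint a file so a log can show whether raw data has changed."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Writing the fingerprint of each raw file into the run log makes a silent change to the raw data visible at the next rerun.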
In-Class Lab
Start from one raw dataset and finish cleaning, variable construction, descriptive statistics, and a first figure.
Move successful notebook exploration into scripts and document how to rerun the project in a README.
Use Git to commit each stage of the work, so the full path from raw data to output table is recorded.
Key Questions
What does "reproducible" mean in practice, and do reproduction failures usually come from data, paths, dependencies, random seeds, or manual steps?
Why do agents need clear file structure, explicit data dictionaries, and stable output formats?
When should a workflow remain in Stata/R, and when should it move to Python?
Assignment
Organize one research project directory and write the data source, variable notes, run order, output files, and unresolved problems.
Module Deliverable
A rerunnable data-cleaning and descriptive-analysis pipeline.
Day 2
Machine Learning, Deep Learning, and Large Language Model Principles
The second day explains how AI models learn from data. It starts with sklearn-style ML and moves into neural networks, CNN, LSTM, GRU, seq2seq, attention, Transformers, and LLMs.
Lecture Topics
ML tasks: supervised learning, unsupervised learning, classification, regression, clustering, dimensionality reduction, and anomaly detection.
Modeling workflow: feature engineering, train/validation/test splits, cross-validation, loss functions, optimizers, regularization, and metrics.
Neural-network foundations: linear layers, activations, backpropagation, gradient descent, batches, epochs, and learning rates.
LLMs: next-token pretraining, instruction tuning, alignment, RAG, function calling, tool calling, and multimodal extension.
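The train/validation/test split and cross-validation steps in the modeling workflow can be sketched without any library. This is a minimal sketch under stated assumptions: the function names, split shares, and seed are illustrative. In practice the same job is usually done with `train_test_split` and `KFold` from `sklearn.model_selection`.

```python
import random

def train_val_test_split(n_rows, val_share=0.2, test_share=0.2, seed=42):
    """Return three disjoint lists of row indices, shuffled with a fixed seed."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    n_test = int(n_rows * test_share)
    n_val = int(n_rows * val_share)
    test = indices[:n_test]
    val = indices[n_test:n_test + n_val]
    train = indices[n_test + n_val:]
    return train, val, test

def k_fold(indices, k=5):
    """Yield (fit, holdout) index lists for k-fold cross-validation."""
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        holdout = folds[i]
        fit = [row for j, fold in enumerate(folds) if j != i for row in fold]
        yield fit, holdout

# The test rows are set aside once; cross-validation only touches the training rows.
train, val, test = train_val_test_split(100)
```

Fixing the seed makes the split rerunnable. Keeping the test rows out of every tuning step is what lets the final metric diagnose overfitting.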
In-Class Lab
Train or run a sklearn baseline, a simple neural model, and an LLM prompt/RAG approach on the same text or table task.
Draw the information flow for CNN, LSTM/GRU, seq2seq, and Transformers, marking input, parameter sharing, memory, and output.
Ask an LLM to explain a variable and generate code, then check the reliability of both by running the code against the data.
Key Questions
How do sample size, computation, interpretability, and verification cost change as model complexity rises?
Why did Transformers replace many recurrent designs, and why do they still hallucinate?
In research use, which information must come from external evidence rather than model memory?
Assignment
Complete a model-choice memo comparing classical ML, deep learning, and LLM approaches for one research task, including data needs, validation strategy, risks, and recommendation.
Module Deliverable
An interpretable ML/LLM mini-experiment and a written model-choice note.
Day 3
Causal Inference, Econometrics, and Machine Learning
The third day places predictive modeling inside a causal-identification framework. ML helps with high-dimensional variables and heterogeneity, but credible claims still require a clear counterfactual and assumptions.
Lecture Topics
Causal foundations: potential outcomes, ATE/ATT, counterfactuals, selection bias, common support, and SUTVA.
Experiments and quasi-experiments: RCT, DID, event studies, PSM, IV, RD, synthetic control, SDID, and GSC.
ML for causal inference: DML, causal forests, heterogeneous treatment effects, and nuisance functions.
Applied modeling: fixed effects, clustered standard errors, sample restrictions, winsorization, balance checks, and parallel trends.
Robustness: alternative variables, alternative samples, placebo tests, sensitivity analysis, mechanisms, and heterogeneity.
Result communication: reading regression tables, explaining economic magnitude, and avoiding overstated causal language.
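The link between potential outcomes, ATE/ATT, and selection bias in the causal-foundations topic can be written out as a short derivation. The notation is an assumption that follows common textbook usage and is not taken from the syllabus: $Y_i(1)$ and $Y_i(0)$ are unit $i$'s potential outcomes with and without treatment, and $D_i$ is the treatment indicator.

```latex
\begin{aligned}
\text{ATE} &= E[Y_i(1) - Y_i(0)], \qquad
\text{ATT} = E[Y_i(1) - Y_i(0) \mid D_i = 1], \\
E[Y_i \mid D_i = 1] - E[Y_i \mid D_i = 0]
  &= \underbrace{E[Y_i(1) - Y_i(0) \mid D_i = 1]}_{\text{ATT}}
   + \underbrace{E[Y_i(0) \mid D_i = 1] - E[Y_i(0) \mid D_i = 0]}_{\text{selection bias}}.
\end{aligned}
```

The second line adds and subtracts the unobserved counterfactual $E[Y_i(0) \mid D_i = 1]$. A naive comparison of group means recovers the ATT only when the selection-bias term is zero, which randomization guarantees in an RCT and which quasi-experimental designs must argue for.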
In-Class Lab
Run a DID or event-study specification on a policy-evaluation dataset and produce a publication-style table.
Cross-check one core result across Python, Stata, or R, including sample size, coefficients, standard errors, and fixed effects.
Use DML or causal forests for high-dimensional control or heterogeneity exploration, then explain why they do not replace identification.
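For the DID exercise, the simplest two-group, two-period case reduces to four cell means. The sketch below is an illustration only: the function name `did_2x2` and the toy numbers are made up, and it produces no standard errors, fixed effects, or clustering.

```python
from statistics import mean

def did_2x2(rows):
    """Difference-in-differences from rows of (treated, post, outcome)."""
    def cell(treated, post):
        return mean(y for t, p, y in rows if t == treated and p == post)
    change_treated = cell(1, 1) - cell(1, 0)
    change_control = cell(0, 1) - cell(0, 0)
    return change_treated - change_control

# Made-up illustrative numbers, not real data.
toy_rows = [
    (1, 0, 10.0), (1, 0, 12.0), (1, 1, 15.0), (1, 1, 17.0),
    (0, 0, 8.0), (0, 0, 10.0), (0, 1, 10.0), (0, 1, 12.0),
]
estimate = did_2x2(toy_rows)  # (16 - 11) - (11 - 9) = 3.0
```

In this 2x2 case the estimate equals the coefficient on the treated-by-post interaction in an OLS regression on the two indicators and their product, which gives a convenient target when cross-checking Python against Stata or R. The number is causal only if the parallel-trends assumption holds.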
Key Questions
Does causal identification come from data, institutional background, model specification, or an ML algorithm?
How should a researcher choose among DID, IV, RD, PSM, and synthetic control?
Which robustness checks are informative, and which are mechanical table-stacking?
Assignment
Write an identification memo for your own question: treatment, outcome, sample, counterfactual, identifying assumptions, main model, and three robustness checks.
Module Deliverable
A model specification, result table, and robustness plan for a full empirical case.
Day 4
Agentic Empirical Research, MCP, and Paper-Automation Workflow
The fourth day connects the previous three days to an agent workflow. The goal is not to replace the researcher, but to make AI act like an auditable research assistant for file reading, code execution, result checking, drafting, and replication packaging.
Lecture Topics
Tool use: function calling, MCP, filesystem, Python, Stata, R, databases, browsers, and StatsPAI.
Research decomposition: data audit, variable construction, descriptive statistics, model estimation, figures, and writing.
Auditable automation: provenance, logs, commands, random seeds, intermediate files, version control, and human checkpoints.
Paper workflow: literature summaries, research-design discussion, result interpretation, limitations, appendix, and replication package.
Failure modes: hallucination, false citation, data leakage, overfitting, path errors, silent sample changes, and irreproducible outputs.
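Auditable automation can be illustrated with a small wrapper that records what ran, with which seed, on which inputs, and whether it failed. This is a sketch under stated assumptions: the name `run_logged_step` and the JSON-lines record layout are hypothetical, and both the inputs and the result are assumed to be JSON-serializable.

```python
import hashlib
import json
import random
import time

def run_logged_step(name, func, inputs, seed, log_path):
    """Run func(inputs) with a fixed seed and append one JSON line to the log."""
    random.seed(seed)
    record = {
        "step": name,
        "seed": seed,
        "started": time.time(),
        # Fingerprint of the inputs, so a later run can detect silent changes.
        "input_sha256": hashlib.sha256(
            json.dumps(inputs, sort_keys=True).encode()
        ).hexdigest(),
    }
    try:
        result = func(inputs)
        record.update(status="ok", result=result)
    except Exception as error:  # record the failure instead of hiding it
        result = None
        record.update(status="error", error=repr(error))
    with open(log_path, "a", encoding="utf-8") as handle:
        handle.write(json.dumps(record) + "\n")
    return result
```

Because failures are written to the same log as successes, a human reviewer can see path errors and failed steps rather than only the outputs that happened to work.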
In-Class Lab
Rewrite one natural-language request as an agent briefing with goal, data, variables, constraints, outputs, and validation steps.
Have an agent run data-analysis code while saving logs, output tables, errors, and draft interpretation.
Review the agent output manually for sample size, variable definitions, model specification, statistical significance, and wording.
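The agent briefing in the first lab step can be written as a structured object, so that missing pieces are caught before any code runs. This is a minimal sketch: the class name `AgentBriefing`, the example task, and the file paths are all hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class AgentBriefing:
    goal: str
    data: str
    variables: list = field(default_factory=list)
    constraints: list = field(default_factory=list)
    outputs: list = field(default_factory=list)
    validation_steps: list = field(default_factory=list)

    def missing_fields(self):
        """Return the names of fields that are still empty."""
        return [name for name, value in vars(self).items() if not value]

# Hypothetical example task; the paths and variable names are placeholders.
briefing = AgentBriefing(
    goal="Estimate the effect of a hypothetical policy on firm employment",
    data="data/raw/firms.csv",
    variables=["employment", "treated", "post"],
    outputs=["output/tables/did_main.csv"],
)
print(briefing.missing_fields())  # ['constraints', 'validation_steps']
```

A briefing with unfilled constraints or validation steps is a natural human checkpoint: the agent should not start until the list is empty.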
Key Questions
How do we make an agent call tools instead of guessing?
How should an automated research workflow be paused, resumed, and rolled back?
Which writing steps can be automated, and which must remain researcher judgment?
Assignment
Complete an agent-assisted research package: task brief, code, execution logs, result table, figure, report draft, and human-review checklist.
Module Deliverable
An auditable agentic workflow from data to first report draft.
Model Architecture Track
This track places large language models in the broader history of deep learning: local feature extraction, sequence memory, input-output generation, attention, Transformers, and tool-using LLM agents.
CNN: Convolutional Neural Networks
CNNs answer the question of how to recognize stable local patterns. Instead of connecting every input to every output, they share filters over local windows to detect repeated structures in images, spatial grids, or local text patterns.
Architecture Anatomy
Convolution layers slide kernels over inputs to learn edges, shapes, local phrases, or spatial adjacency.
Padding and stride control how boundary information is handled, how large the feature map is, and how quickly the input is downsampled.
Pooling compresses local information and increases translation robustness.
Feature maps and receptive fields explain how much of the original input a unit can see.
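The anatomy above can be made concrete with a one-dimensional convolution in plain Python. This sketch uses illustrative function names and a made-up signal. It computes cross-correlation, which is what most deep-learning libraries call convolution. For input length n, kernel width k, padding p, and stride s, the output length is floor((n + 2p - k) / s) + 1.

```python
def conv1d(signal, kernel, stride=1, padding=0):
    """Slide `kernel` over `signal`, sharing the same weights at every position."""
    padded = [0.0] * padding + list(signal) + [0.0] * padding
    width = len(kernel)
    return [
        sum(padded[start + j] * kernel[j] for j in range(width))
        for start in range(0, len(padded) - width + 1, stride)
    ]

def max_pool1d(values, size=2):
    """Keep the largest value in each non-overlapping window of `size`."""
    return [max(values[i:i + size]) for i in range(0, len(values) - size + 1, size)]

signal = [0, 0, 1, 1, 1, 0, 0, 0]
edge_kernel = [-1, 1]  # responds where the signal steps up or down
feature_map = conv1d(signal, edge_kernel)   # [0, 1, 0, 0, -1, 0, 0]
pooled = max_pool1d(feature_map, size=2)    # [1, 0, 0]
```

The same two-weight kernel detects the rising edge wherever it occurs, which is the parameter sharing that separates a convolution layer from a fully connected one.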
Research Use
Useful for remote-sensing images, night lights, street views, geographic grids, document images, contract layouts, and local textual features.
Limits and Risks
CNNs are less natural for long-range dependencies and complex semantics, so they are often combined with sequence models, attention, or pretrained models.
Lab Connection
Compare hand-crafted features, CNN features, and pretrained embeddings on a small classification task, with attention to overfitting and interpretability.
RNN / LSTM / GRU Sequence Models
The RNN family answers how a model can read a sequence while carrying memory. LSTMs and GRUs use gates to reduce fast forgetting, vanishing gradients, and long-dependency failures.
Architecture Anatomy
RNN hidden states pass information from one time step to the next.
LSTM input, forget, and output gates decide what to write, retain, and expose.
GRU update and reset gates provide a lighter memory-control design.
Bidirectional sequence models use both left and right context for labeling and classification.
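The LSTM gates listed above can be traced in a scalar toy cell. This is a sketch for intuition only: real layers use weight matrices and vectors, and the weight values below are arbitrary placeholders rather than trained parameters.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x, h_prev, c_prev, w):
    """One LSTM step for scalar input and state; `w` maps gate name -> (w_x, w_h, bias)."""
    def pre(gate):
        w_x, w_h, bias = w[gate]
        return w_x * x + w_h * h_prev + bias
    i = sigmoid(pre("input"))         # how much new content to write
    f = sigmoid(pre("forget"))        # how much old memory to retain
    o = sigmoid(pre("output"))        # how much memory to expose
    g = math.tanh(pre("candidate"))   # proposed new content
    c = f * c_prev + i * g            # updated cell memory
    h = o * math.tanh(c)              # hidden state passed to the next step
    return h, c

# Arbitrary placeholder weights, chosen only so the example runs.
weights = {
    "input": (0.5, 0.1, 0.0),
    "forget": (0.2, 0.1, 1.0),
    "output": (0.3, 0.1, 0.0),
    "candidate": (1.0, 0.5, 0.0),
}
h, c = 0.0, 0.0
for x in [1.0, 0.5, -0.3]:
    h, c = lstm_step(x, h, c, weights)
```

When the forget gate is near one and the input gate is near zero, the cell memory passes through almost unchanged, which is how the design limits fast forgetting over long sequences.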
Research Use
Useful for firm trajectories, financial time series, user behavior sequences, policy-text evolution, and event histories.
Limits and Risks
Recurrent models process tokens one step at a time, so they train slowly and parallelize poorly on long text; Transformers are often better for long or complex dependency patterns.
Lab Connection
Use panel or text sequences to compare lagged features, LSTM, and GRU models on out-of-sample performance and explanation cost.
seq2seq and Encoder-Decoder
seq2seq maps one input sequence into one output sequence. It is an early core framework for translation, summarization, Q&A, code generation, and research-report generation.
Architecture Anatomy
The encoder turns input text, code, or variable notes into contextual representations.
The decoder generates output step by step using prior generated tokens and context.
Teacher forcing stabilizes training by feeding the true previous token during learning.
Greedy search and beam search trade off speed, quality, and diversity at inference time.
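The difference between greedy and beam search shows up even in a two-step toy decoder. The sketch below assumes a hypothetical lookup table of next-token probabilities in place of a trained decoder. The tokens and numbers are invented so that the locally best first token leads to a worse full sequence.

```python
import math

# Hypothetical next-token probabilities, keyed by the tokens generated so far.
TOY_MODEL = {
    (): {"the": 0.6, "a": 0.4},
    ("the",): {"effect": 0.55, "result": 0.45},
    ("a",): {"effect": 0.9, "result": 0.1},
}

def greedy_decode(model, steps):
    """Pick the single most likely token at every step."""
    seq, logp = (), 0.0
    for _ in range(steps):
        token, prob = max(model[seq].items(), key=lambda kv: kv[1])
        seq, logp = seq + (token,), logp + math.log(prob)
    return seq, logp

def beam_decode(model, steps, beam_width):
    """Keep the `beam_width` best partial sequences at every step."""
    beams = [((), 0.0)]
    for _ in range(steps):
        candidates = [
            (seq + (token,), logp + math.log(prob))
            for seq, logp in beams
            for token, prob in model[seq].items()
        ]
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_width]
    return beams[0]

greedy_seq, _ = greedy_decode(TOY_MODEL, steps=2)              # ('the', 'effect'), p = 0.33
beam_seq, _ = beam_decode(TOY_MODEL, steps=2, beam_width=2)    # ('a', 'effect'), p = 0.36
```

Beam search with width 1 reduces to greedy search. Wider beams cost more computation and still give no guarantee of finding the most probable sequence.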
Research Use
Explains how natural-language research tasks become code, how regression tables become prose, and how literature paragraphs become summaries.
Limits and Risks
Early seq2seq models compress the entire input into a single fixed-length bottleneck representation, which makes long text difficult to handle without attention.
Lab Connection
Convert a variable definition or empirical task into structured JSON or Python pseudocode, then inspect how generation errors arise.
Attention Mechanism
Attention answers which parts of the input the model should consult at the current step. Each query vector is scored against the key vectors to produce relevance weights, and those weights are used to average the value vectors, so the model can dynamically select information.
Architecture Anatomy
Q/K/V: the query asks the current question, keys describe candidate positions, and values carry the information.
Scaled dot-product attention computes similarity weights and weighted sums.
Cross-attention lets the output side read from the input side during generation.
Attention weights can support auditing, but they are not the same as causal explanations.
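Scaled dot-product attention for a single query fits in a few lines of plain Python. This is a sketch with illustrative function names and toy vectors. Production code would use batched matrix operations in a library such as PyTorch.

```python
import math

def softmax(scores):
    """Turn raw scores into weights that are positive and sum to one."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(query, keys, values):
    """Scaled dot-product attention for a single query vector."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    output = [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]
    return weights, output

# Toy vectors: the query matches the first and third keys better than the second.
weights, output = attention(
    query=[1.0, 0.0],
    keys=[[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]],
    values=[[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]],
)
```

The weights show where the model looked for this query. They describe the computation and are not evidence about cause and effect in the data.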
Research Use
Useful for connecting policy text, paper paragraphs, variable descriptions, interviews, and multi-source evidence.
Limits and Risks
Attention is not reliable causal evidence; long context adds cost, noise, and retrieval errors.
Lab Connection
Use policy text to locate treatment definitions, timing, and sample restrictions, then convert highlighted spans into auditable evidence.
Transformer Architecture
Transformers replace recurrence with self-attention, so all positions in a sequence can be processed in parallel and multiple attention heads can capture different relationships. Modern large language models are built on this architecture.
Architecture Anatomy
Tokenization and embeddings turn text, code, and symbols into vectors.
Positional encoding adds order information.
Multi-head self-attention learns dependencies in several representation spaces.
Feed-forward networks, residual connections, and LayerNorm support nonlinear expression, stable training, and deep stacks.
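One way to add order information is the sinusoidal positional encoding from the original Transformer design. The sketch below assumes that scheme purely for illustration. Many current models use learned or rotary position encodings instead, and the function names here are hypothetical.

```python
import math

def positional_encoding(position, d_model):
    """Sinusoidal encoding of one position; d_model is assumed to be even."""
    encoding = []
    for i in range(0, d_model, 2):
        angle = position / (10000 ** (i / d_model))
        encoding.extend([math.sin(angle), math.cos(angle)])
    return encoding

def add_position(embedding, position):
    """Add the positional signal to a token embedding of the same width."""
    pe = positional_encoding(position, len(embedding))
    return [e + p for e, p in zip(embedding, pe)]

first = positional_encoding(0, 4)    # [0.0, 1.0, 0.0, 1.0]
second = positional_encoding(1, 4)   # differs from `first`, so order is visible
```

Without such a signal, self-attention treats the input as an unordered set, because the attention weights depend only on the content of the vectors.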
Research Use
Used for paper reading, policy-text encoding, summarization, code generation, table interpretation, retrieval-augmented Q&A, and agent planning.
Limits and Risks
Transformers learn statistical associations and task patterns; they do not automatically guarantee factual accuracy, valid identification, or runnable code.
Lab Connection
Dissect one LLM answer: how the prompt is tokenized, how the response is generated, and which parts require RAG, code execution, and human confirmation.
Large Language Models and Agents
LLMs learn language, code, and knowledge patterns through next-token pretraining, then become task-facing through instruction tuning, preference alignment, RAG, function calling, and tool use.
Architecture Anatomy
Pretraining learns language distributions, commonsense associations, and code patterns from large corpora.
Instruction tuning and alignment improve task following, step-by-step explanation, and output safety.
RAG retrieves external sources before generation.
Tool calling and MCP let models invoke Python, Stata, R, databases, browsers, or StatsPAI instead of only writing text.
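The idea behind tool calling can be sketched as a registry plus a dispatcher: the model proposes a structured call, and ordinary code executes it and returns a checkable result. Everything in this sketch is hypothetical, including the tool names and the JSON shape. It is not the MCP wire format or any vendor's function-calling schema.

```python
import json
from statistics import mean

def column_mean(values):
    return mean(values)

def row_count(values):
    return len(values)

# Hypothetical registry: tool name -> Python function the agent may call.
TOOLS = {"column_mean": column_mean, "row_count": row_count}

def dispatch(tool_call_json):
    """Execute a model-proposed call such as {"tool": ..., "arguments": {...}}."""
    call = json.loads(tool_call_json)
    name = call.get("tool")
    if name not in TOOLS:
        return {"ok": False, "error": f"unknown tool: {name}"}
    try:
        return {"ok": True, "result": TOOLS[name](**call.get("arguments", {}))}
    except Exception as error:  # return the failure so it can be logged and reviewed
        return {"ok": False, "error": repr(error)}

reply = dispatch('{"tool": "column_mean", "arguments": {"values": [1, 2, 3]}}')
```

The number in the reply comes from executed code rather than from model memory, and an unknown or failing call produces an explicit error instead of a plausible-looking guess.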
Research Use
Supports topic development, literature reading, coding, debugging, table interpretation, report drafting, replication packaging, and multi-tool workflow coordination.
Limits and Risks
LLMs may hallucinate, misread data, omit identification assumptions, or produce irreproducible conclusions. Logs, code execution, citations, tests, and human review are required.
Lab Connection
Turn a natural-language empirical request into an agent plan where every step leaves files, logs, outputs, and human checkpoints.
Assignments and Assessment
Code and Data Workflow
25%
Assess project structure, cleaning scripts, outputs, README, and rerun instructions.
AI Architecture and Model Choice
25%
Submit architecture notes for CNN/LSTM/GRU/seq2seq/attention/Transformer/LLM and one model-choice memo.
Causal Identification Memo
25%
Submit research question, assumptions, main model, robustness plan, and interpretation draft.
A reusable Python empirical-research project template.
An AI architecture map covering CNN, LSTM, GRU, seq2seq, attention, Transformers, and large language models.
A causal-inference empirical-case draft with main model, robustness plan, and interpretation.
An agent-assisted research workflow with natural-language tasking, tool calls, code execution, result verification, and human checkpoints.
Course Norms
The course encourages AI use, but all AI-generated code, prose, and claims must be verified through execution, citations, data checks, or human review.
Reproducibility is the first standard: every result should be regenerable from raw data and scripts.
The course discourages unexplained model stacking. Every model choice must be justified in terms of the data structure, the research goal, a validation metric, and the main failure risks.
Final outputs can become a replication package, course project, research-assistant workflow, or technical appendix for an empirical paper.