# StatsPAI for Empirical Research: Assignment Template · An Audited Estimate, End to End

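> StatsPAI Summer Bootcamp · Companion to the knowledge page `/courses/summer-bootcamp/topics/statspai-research` and `statspai_research_lab.ipynb`.
> Goal: use an agent-native causal workflow to go from data plus a question to an audited estimate backed by verifiable citations.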

---

## English Version

### Goal
For a real (or synthetic) dataset and research question, run the full agent-native causal workflow: detect the design -> recommend an estimator -> fit it and keep the result handle -> audit robustness -> analyze sensitivity -> cite only verifiable sources. Report the limits of the analysis honestly.

### Tasks
1. **Detect design**: classify the design from the data shape (panel / discontinuity / instrument / selection-on-observables) and draw the causal graph.
2. **Recommend and fit**: choose an estimator matched to the design (e.g. staggered DiD -> callaway_santanna), fit it, and obtain a result handle.
3. **Audit**: list the recommended robustness checks, mark each as done or missing, and compute coverage (the share of recommended checks completed).
4. **Sensitivity**: run at least one sensitivity analysis (E-value / Oster delta / honest DiD) to quantify how robust the conclusion is to unobserved confounding.
5. **Citation discipline**: cite only entries from a verifiable `.bib` file; demonstrate one rejection of an invented key.
6. **Report**: estimate + confidence interval + audit coverage + sensitivity + limitations.
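
For the sensitivity task, the E-value is the easiest of the three options to compute by hand. The sketch below is plain Python and does not call StatsPAI; it assumes the estimate is expressed as a risk ratio and uses the standard closed form `E = RR + sqrt(RR * (RR - 1))`, inverting protective effects first. The function names `e_value` and `e_value_ci` are illustrative, not part of any library.

```python
import math


def e_value(rr: float) -> float:
    """E-value for a risk-ratio point estimate.

    Assumes the effect is on the risk-ratio scale. Ratios below 1
    (protective effects) are inverted before applying the formula.
    """
    if rr <= 0:
        raise ValueError("risk ratio must be positive")
    rr = max(rr, 1.0 / rr)
    return rr + math.sqrt(rr * (rr - 1.0))


def e_value_ci(lower: float, upper: float) -> float:
    """E-value for the confidence limit closest to the null (RR = 1).

    If the interval contains 1, no confounding is needed to explain
    away the interval, so the E-value is 1.
    """
    if lower <= 0 or upper < lower:
        raise ValueError("need 0 < lower <= upper")
    if lower <= 1.0 <= upper:
        return 1.0
    bound = lower if lower > 1.0 else upper
    return e_value(bound)
```

In the report, state both numbers: the E-value for the point estimate and the E-value for the confidence limit nearest the null. The second is the more conservative statement of how strong an unobserved confounder would have to be.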

### Deliverables
- `analysis.ipynb` (full workflow), `audit.md` (checklist + coverage + sensitivity), `references.bib` (verifiable entries only).
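
One possible way to produce the checklist and coverage figure for `audit.md` is sketched below. It is plain Python rather than a StatsPAI call, and the check names are hypothetical examples for a staggered DiD design; replace them with the checks actually recommended for your design.

```python
from dataclasses import dataclass


@dataclass
class Check:
    name: str
    done: bool


def coverage(checks: list[Check]) -> float:
    """Share of recommended checks that have been completed."""
    if not checks:
        raise ValueError("the checklist is empty")
    return sum(c.done for c in checks) / len(checks)


def render_audit(checks: list[Check]) -> str:
    """Render the checklist as a Markdown table plus a coverage line."""
    lines = ["| Check | Status |", "|---|---|"]
    for c in checks:
        lines.append(f"| {c.name} | {'done' if c.done else 'missing'} |")
    n_done = sum(c.done for c in checks)
    lines.append("")
    lines.append(f"Coverage: {n_done}/{len(checks)} ({coverage(checks):.0%})")
    return "\n".join(lines)


# Hypothetical checklist; names are placeholders, not a library's output.
checks = [
    Check("event-study pre-trend plot", True),
    Check("alternative control group", True),
    Check("placebo outcome", False),
    Check("honest DiD sensitivity", False),
]
audit_md = render_audit(checks)
```

Listing the missing checks explicitly matters as much as the percentage: a reader should be able to see which robustness questions remain open.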

### Rubric
| Dimension | Weight |
|---|---|
| Design detection and estimator match | 25% |
| Handle chaining and audit coverage | 25% |
| Sensitivity analysis | 25% |
| Citation discipline and honest limits | 25% |

> Red lines: identification comes from design, not model complexity; never invent citations; an unaudited estimate is not a conclusion.
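
The citation red line can be enforced mechanically. The following is a minimal sketch, assuming citations are referenced by BibTeX key; it uses only the standard library, and `UnknownCitationError`, `bib_keys`, and `check_citations` are illustrative names. The key `smith2099magic` is deliberately invented to demonstrate the rejection.

```python
import re

# Inline stand-in for references.bib so the example is self-contained.
BIB_TEXT = r"""
@article{callaway2021did,
  author  = {Callaway, Brantly and Sant'Anna, Pedro H. C.},
  title   = {Difference-in-Differences with Multiple Time Periods},
  journal = {Journal of Econometrics},
  year    = {2021}
}
"""

# Matches the key in entries such as "@article{some_key,".
KEY_PATTERN = re.compile(r"@\w+\s*\{\s*([^,\s]+)\s*,")


class UnknownCitationError(ValueError):
    """Raised when a cited key has no entry in the bib."""


def bib_keys(bib_text: str) -> set[str]:
    """Collect every entry key defined in the BibTeX source."""
    return set(KEY_PATTERN.findall(bib_text))


def check_citations(cited: list[str], bib_text: str) -> None:
    """Reject any cited key that is not defined in the bib."""
    unknown = sorted(set(cited) - bib_keys(bib_text))
    if unknown:
        raise UnknownCitationError(f"citation keys not in bib: {unknown}")


check_citations(["callaway2021did"], BIB_TEXT)  # passes silently

try:
    check_citations(["callaway2021did", "smith2099magic"], BIB_TEXT)
except UnknownCitationError as err:
    rejection = str(err)  # record the rejection for the notebook
```

This only confirms that each cited key exists in the bib. Whether every bib entry points to a real, checkable source still has to be verified separately.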