Standing on the Shoulders of IMRaD and the Five-Chapter Format: How CoPaper.AI Built a Universal Empirical Paper Architecture for the AI Era

Anyone who has written an empirical paper knows that its structure determines half of the writing experience.

A good structure lets your words flow naturally, with logical chains connecting seamlessly. A poor structure leaves you endlessly debating — where does the literature review go? Should data and methods be combined? How do you separate discussion from conclusion?

IMRaD and the five-chapter format are classic frameworks that have served academia for over half a century, each with deep disciplinary roots and unique value. But when we tried to use AI to help complete a full social science empirical paper, we found that we needed a new architecture, one that both respects academic tradition and adapts to AI workflows.

This led the CoPaper.AI team to develop the P1-P2-P3-P4 four-part empirical paper architecture, built on a thorough study of classic frameworks and top journal standards worldwide. It serves as the core skeleton of CoPaper's Paper Agent.

This article provides a complete breakdown of the architecture's design philosophy, technical details, and its central role in CoPaper.AI.

Classic Frameworks Revisited: What IMRaD and the Five-Chapter Format Established

Before introducing P1234, let's pay tribute to two classic frameworks — they are P1234's most important intellectual sources.

IMRaD: Elegant Four-Step Logic

IMRaD (Introduction, Methods, Results, and Discussion) took shape in the medical and biological sciences in the first half of the 20th century and had become the dominant format for international scientific papers by the 1970s.

Its design is remarkably elegant: pose a question → design methods → present results → interpret and discuss. These four steps mirror the cognitive process of scientific research. The APA Publication Manual recommends this structure for empirical research in the social sciences, and LetPub's IMRaD writing guide notes that it "makes the logic clearer."

IMRaD's greatest insight: paper structure should correspond to the thinking process of research, not exist for the sake of having chapters.

The Five-Chapter Format: A Systematic Academic Training Framework

The other widely used framework is the dissertation's "five-chapter format" — Introduction, Literature Review, Research Methods, Results, Discussion & Conclusion. This is the standard structure for master's and doctoral theses at universities worldwide.

The five-chapter format's value lies in its systematicity: it requires authors to provide independent, complete exposition of every research component — an excellent form of academic training.

It taught us: every component of a paper deserves to be treated seriously, with sufficient space.

[Figure: Evolution from classic architectures to CoPaper P1234]

From Classic to Innovation: Why a New Architecture?

If the classic frameworks work so well, why design something new?

The answer lies in a change of context.

When AI participates in paper writing, we face an entirely new challenge: AI needs a sufficiently clear, sufficiently universal architecture as its "skeleton" to generate structurally rigorous content at every stage. Traditional frameworks were designed for human authors — humans can improvise, but AI needs more explicit segmentation rules and proportion guidelines.

At the same time, after studying the actual structure of papers in numerous top journals in China and internationally, we found:

In top Chinese journals like Economic Research Journal and Management World, the introduction and literature review are often fused together rather than separate chapters
In top international economics journals like AER, QJE, and JPE, "data description" and "identification strategy" are typically combined into one section
The results section of social science empirical papers usually occupies the largest proportion, requiring deep integration of code execution and academic writing

These observations drove us to fuse and extend IMRaD and the five-chapter format in a targeted way.

CoPaper P1234 Architecture: Four Parts, Each With a Clear Mission

The core design principle of P1234: preserve the logical integrity of classic architectures while making the information organization within each Part more cohesive and better suited for AI-driven academic writing.

[Figure: CoPaper.AI P1234 Four-Part Empirical Paper Architecture Overview]

Part 1: Introduction & Literature Review

Design philosophy: Fuse the introduction and literature review into one organic whole.

This design directly draws from the standard practice of top international economics journals. In AER and QJE papers, the literature review is frequently embedded within the introduction, forming a coherent narrative of "why I'm studying this → what others have done → what my contribution is."

Typical chapters in Part 1 (1-4 chapters):

Research background and problem statement
Theoretical foundation and literature review
Research contributions and innovations
Paper structure overview

The benefit: readers understand both "why this matters" and "what's been done" within a single, coherent narrative — logical flow comes naturally. For dissertations requiring extensive literature reviews, Part 1 also supports up to 4 chapters for flexibility.

Part 2: Data & Identification Strategy

Design philosophy: Merge data description and research methods, highlighting the central role of "identification strategy" in modern empirical research.

The empirical research methodology at Central University of Finance and Economics emphasizes that econometric methods are merely "service tools" — the real core is research design: how you identify causal relationships from data. Brandeis University's economics writing guide also recommends combining data, methods, and models into one coherent section.

CoPaper's Part 2 integrates these into a complete "research design" section (1-3 chapters):

Data sources and sample description
Variable definitions and descriptive statistics
Model specification and identification strategy

Data selection determines the method space, and method selection in turn requires specific data support — they are naturally unified.

Part 3: Empirical Results & Analysis

Design philosophy: This is the "heart" of the paper, and what sets CoPaper apart from all other AI writing tools.

In CoPaper's word allocation system, Part 3 accounts for 38% of a journal paper, the highest proportion among the four Parts. This is consistent with the weight that writing guides give to results: LetPub's IMRaD writing guide notes that research findings account for 35% of a paper's abstract, which underscores that findings are the core value of any paper.

What makes Part 3 unique: it doesn't just "display results" — it deeply integrates code execution, data analysis, and academic writing.

In CoPaper's Paper Agent system, every chapter in Part 3 goes through three stages:

Code generation: Automatically generates Python analysis code based on Part 2's research design
Sandbox execution: Runs code in a secure environment to produce real charts and statistical results
Content generation: Writes academic text based on actual execution results

This means every regression coefficient, every chart, every robustness check in the paper is genuinely computed, not "fabricated" by AI.
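The three stages above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the function names, the fixed code template standing in for AI code generation, the toy data, and the use of a separate interpreter process with a timeout, which is only a stand-in for a real sandbox and provides no security isolation. It is not CoPaper's implementation; it only shows text being written from numbers that were actually computed.

```python
import json
import subprocess
import sys

# Stage 1 stand-in: a fixed template instead of AI-generated analysis code.
# The generated script fits a one-variable OLS slope in pure Python.
ANALYSIS_TEMPLATE = """
import json
x = {x}
y = {y}
n = len(x)
mx, my = sum(x) / n, sum(y) / n
num = sum((a - mx) * (b - my) for a, b in zip(x, y))
den = sum((a - mx) ** 2 for a in x)
print(json.dumps({{"slope": num / den, "n": n}}))
"""


def generate_code(design: dict) -> str:
    """Stage 1: turn a (hypothetical) research design into analysis code."""
    return ANALYSIS_TEMPLATE.format(x=design["x"], y=design["y"])


def run_isolated(code: str) -> dict:
    """Stage 2: run the code in a separate process and parse its JSON output.

    A subprocess with a timeout is NOT a secure sandbox; it is a placeholder.
    """
    done = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True, text=True, timeout=30, check=True,
    )
    return json.loads(done.stdout)


def write_text(results: dict) -> str:
    """Stage 3: write prose only from values that were actually computed."""
    return f"The estimated slope is {results['slope']:.3f} (n = {results['n']})."


def draft_paragraph(design: dict) -> str:
    return write_text(run_isolated(generate_code(design)))


if __name__ == "__main__":
    toy_design = {"x": [1, 2, 3, 4], "y": [2, 4, 6, 8]}  # toy data, not real results
    print(draft_paragraph(toy_design))
```

The point of the design is the direction of data flow: the sentence in stage 3 is formatted from the dictionary returned by stage 2, so a number cannot appear in the text unless the code produced it.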

Typical chapters in Part 3 (1-3 chapters):

Baseline regression results
Robustness checks and endogeneity treatment
Heterogeneity analysis and mechanism tests

Part 4: Discussion & Conclusion

Design philosophy: Merge discussion and conclusion into one logically coherent closing section.

This design references the "three-section, nine-point" structure used by top Chinese management journals like Management World — brief conclusions, policy recommendations, and research limitations, delivered in one smooth flow. It also follows the common practice in top economics journals where "Conclusion" and "Discussion" are combined into a single chapter.

CoPaper's Part 4 (1-3 chapters):

Key findings and theoretical contributions
Policy implications and practical significance
Limitations and future research directions

Precision "Proportion Engineering": Every Part's Word Count Is Deliberate

The P1234 architecture isn't simply about "dividing into four parts." Behind it lies a dynamic word-count allocation system that adjusts by paper type — derived from the CoPaper team's deep analysis of paper structures in top journals including Economic Research Journal, Management World, AER, QJE, and JPE.

Paper Type	Part 1 (Intro + Lit Review)	Part 2 (Data + Methods)	Part 3 (Empirical Results)	Part 4 (Discussion + Conclusion)
Journal Paper	22%	18%	38%	22%
Course Paper	25%	15%	35%	25%
Undergraduate Thesis	28%	17%	33%	22%
Master's Thesis	27%	18%	33%	22%

Key patterns to note:

Part 3 is always the heaviest (33%-38%): Empirical results are always the core value of a paper
Journal papers have the highest Part 2 proportion (18%, matched only by master's theses): Top journals demand the most methodological rigor
Theses have a heavier Part 1 (27%-28%): Theses need a more extensive literature review to demonstrate scholarly depth
Part 4 remains stable at 22%-25%: Discussion and conclusion length doesn't need to vary dramatically across paper types

This proportion system gives AI a clear word-count target and paragraph reference for every section, ensuring overall balance.
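As a minimal sketch, the allocation table above can be written as a lookup that turns a total word target into per-Part targets. The percentages are copied from the table; the names PART_SHARES and part_word_targets and the rounding rule are illustrative assumptions, not CoPaper's actual code.

```python
# Part shares in percent of total words, copied from the table above.
# The dictionary layout and key names are illustrative assumptions.
PART_SHARES = {
    "journal_paper":        {"P1": 22, "P2": 18, "P3": 38, "P4": 22},
    "course_paper":         {"P1": 25, "P2": 15, "P3": 35, "P4": 25},
    "undergraduate_thesis": {"P1": 28, "P2": 17, "P3": 33, "P4": 22},
    "masters_thesis":       {"P1": 27, "P2": 18, "P3": 33, "P4": 22},
}


def part_word_targets(paper_type: str, total_words: int) -> dict[str, int]:
    """Split a total word target across the four Parts."""
    shares = PART_SHARES[paper_type]
    if sum(shares.values()) != 100:
        raise ValueError(f"shares for {paper_type} must sum to 100")
    # Integer arithmetic, rounded down (an assumption), avoids float surprises.
    return {part: total_words * pct // 100 for part, pct in shares.items()}


if __name__ == "__main__":
    print(part_word_targets("journal_paper", 10_000))
    # {'P1': 2200, 'P2': 1800, 'P3': 3800, 'P4': 2200}
```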

One Architecture, Three Modes: Covering All Academic Writing Scenarios

Another highlight of the P1234 architecture is its flexibility — the same architecture "morphs" into three modes for different scenarios:

[Figure: Three Generation Modes of the P1234 Architecture]

Mode 1: Full Empirical Mode

Completes the full P1→P2→P3→P4 pipeline. Part 3 includes code execution and chart generation. Suitable for the vast majority of social science empirical research papers, whether for journal submission or dissertations.

Mode 2: Theoretical/Review Mode

For theoretical papers or literature reviews that don't require data analysis, the semantic meaning of each Part shifts:

Part 2 changes from "Data & Methods" to "Literature & Theoretical Framework" (proportion increases to 30%)
Part 3 changes from "Empirical Results" to "Theoretical Arguments & Analysis" (35%, text-only)
Part 1's introduction shortens accordingly (15%), since the literature review focus moves to Part 2

Same P1234 architecture, different content semantics, entirely different proportion configurations.

Mode 3: Writing Complement Mode

This mode best reflects CoPaper's understanding of real academic workflows — many researchers prefer to run their data and examine results first, then go back to write the introduction and conclusion.

In complement mode, users upload completed Part 2 + Part 3 content (up to 40,000 words), and the system generates only Part 1 (introduction, ~55% of the generated word count) and Part 4 (conclusion, ~45%). The AI reads the uploaded empirical content so that the introduction's research motivation and the conclusion's summary of findings connect with the existing analysis.
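The proportion settings for Modes 2 and 3 can be sketched the same way. The stated numbers come from the text; the names are hypothetical, and the Part 4 share in theoretical mode is not given in the text, so the sketch derives it as the remainder under the assumption that the four shares sum to 100%.

```python
# Shares in percent. Only values stated in the text are hard-coded.
THEORETICAL_STATED = {"P1": 15, "P2": 30, "P3": 35}  # P4 is not stated in the text
COMPLEMENT_STATED = {"P1": 55, "P4": 45}             # approximate ("~") in the text
MAX_UPLOAD_WORDS = 40_000                            # upload limit stated in the text


def theoretical_shares() -> dict[str, int]:
    """Fill in P4 as the remainder, ASSUMING the four shares sum to 100."""
    shares = dict(THEORETICAL_STATED)
    shares["P4"] = 100 - sum(shares.values())
    return shares


def complement_word_targets(uploaded_words: int, words_to_generate: int) -> dict[str, int]:
    """Split the words to be generated between Part 1 and Part 4."""
    if uploaded_words > MAX_UPLOAD_WORDS:
        raise ValueError(f"upload exceeds {MAX_UPLOAD_WORDS} words")
    return {
        part: words_to_generate * pct // 100
        for part, pct in COMPLEMENT_STATED.items()
    }


if __name__ == "__main__":
    print(theoretical_shares())
    print(complement_word_targets(uploaded_words=12_000, words_to_generate=4_000))
```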

Part-by-Part Outline Configuration: Putting Control in the Author's Hands

Many AI writing tools follow a "one-click generate" logic — the user enters a title, and AI outputs the entire paper. But academic papers aren't blog posts. Researchers need precise control over paper structure.

CoPaper's Paper Agent builds a part-by-part human-in-the-loop (HITL) outline configuration system on top of the P1234 architecture.

[Figure: Part-by-Part HITL Outline Configuration Workflow]

Each Part Has an Independent Outline Configuration Stage

During paper generation, at every new Part, the system pauses and enters a human-in-the-loop outline configuration phase:

P1 outline → P1 content → P2 outline → P2 content → P3 outline → P3 code execution + content → P4 outline → P4 content
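One way to picture this sequence is as a generated list of stages, with a flag marking where the pipeline pauses for the author. This is a minimal sketch; the Stage type and its field names are hypothetical and are not taken from CoPaper's backend.

```python
from typing import Iterator, NamedTuple


class Stage(NamedTuple):
    part: str           # "P1" .. "P4"
    kind: str           # "outline" or "content"
    needs_author: bool  # True where the pipeline pauses for outline configuration
    runs_code: bool     # True only for Part 3 content, which executes analysis code


def pipeline_stages() -> Iterator[Stage]:
    """Yield the outline/content stages in the order given in the text."""
    for part in ("P1", "P2", "P3", "P4"):
        yield Stage(part, "outline", needs_author=True, runs_code=False)
        yield Stage(part, "content", needs_author=False, runs_code=(part == "P3"))


if __name__ == "__main__":
    for stage in pipeline_stages():
        pause = " (pause for author)" if stage.needs_author else ""
        code = " + code execution" if stage.runs_code else ""
        print(f"{stage.part} {stage.kind}{code}{pause}")
```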

At each configuration node, the author can:

Choose the number of chapters: Part 1 supports 1-4 chapters; Parts 2/3/4 support 1-3 chapters
Edit chapter titles and subsection arrangements: Fully customize each chapter's content
Select outline depth: 2-level structure (chapter → section) or 3-level structure (chapter → section → subsection)
Review and revise AI-generated outline drafts: Iterate until satisfied

This means you're not "accepting" a paper from AI — you're progressively "designing" your own paper. After each Part is generated, you can use the completed content to better plan the next Part's structure.
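The constraints at each configuration node (chapter counts per Part and the two allowed outline depths) lend themselves to a simple validation step. The sketch below is illustrative only: the PartOutline class and its fields are assumptions, while the numeric limits come from the text.

```python
from dataclasses import dataclass

# Allowed chapter counts per Part, as stated in the text (P1: 1-4, P2/P3/P4: 1-3).
CHAPTER_RANGES = {"P1": (1, 4), "P2": (1, 3), "P3": (1, 3), "P4": (1, 3)}
# 2 = chapter -> section; 3 = chapter -> section -> subsection.
ALLOWED_DEPTHS = (2, 3)


@dataclass
class PartOutline:
    part: str
    chapter_titles: list[str]
    depth: int = 2

    def validate(self) -> None:
        """Raise ValueError if the outline breaks the stated constraints."""
        if self.part not in CHAPTER_RANGES:
            raise ValueError(f"unknown part: {self.part}")
        low, high = CHAPTER_RANGES[self.part]
        count = len(self.chapter_titles)
        if not low <= count <= high:
            raise ValueError(f"{self.part} needs {low}-{high} chapters, got {count}")
        if self.depth not in ALLOWED_DEPTHS:
            raise ValueError(f"depth must be 2 or 3, got {self.depth}")


if __name__ == "__main__":
    outline = PartOutline(
        part="P3",
        chapter_titles=[
            "Baseline regression results",
            "Robustness checks and endogeneity treatment",
            "Heterogeneity analysis and mechanism tests",
        ],
        depth=3,
    )
    outline.validate()  # passes silently; an invalid outline raises ValueError
```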

Full-Paper Outline Preview and Journal Templates

Before writing begins, CoPaper also provides a full-paper outline preview feature with built-in Chinese journal reference templates:

Economic Research Journal econometric empirical template
Management World management empirical template
And more journal reference structures

Users can switch between template previews to see what P1234 outlines look like for different journal styles, choose the template closest to their target journal, and customize from there.

Intelligent Word Budget System

After outline configuration, the system automatically calculates the writing budget for each section:

Total word target × Part proportion ÷ number of chapters ÷ number of sections = word target per section

For example, a 15,000-word master's thesis with Part 3 at 33% (~4,950 words), divided into 3 chapters with 3 sections each, yields ~550 words per section (~4-5 paragraphs), with an 80%-115% flexibility range.

This budget system ensures overall balance — preventing situations where one section runs to 5,000 words while another barely reaches 500.
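The budget formula and the worked example above can be reproduced in a few lines. This is a sketch under stated assumptions: the function name is hypothetical, every chapter is assumed to have the same number of sections, and the rounding of the range bounds (minimum rounded up, maximum rounded down) is a choice made here, not something the text specifies.

```python
import math


def section_word_budget(total_words: int, part_pct: int, chapters: int,
                        sections_per_chapter: int,
                        low_pct: int = 80, high_pct: int = 115) -> tuple[int, int, int]:
    """Return (target, minimum, maximum) words for one section.

    Implements: total words x Part proportion / chapters / sections,
    then applies the 80%-115% flexibility range from the text.
    """
    if chapters < 1 or sections_per_chapter < 1:
        raise ValueError("chapters and sections_per_chapter must be >= 1")
    target = total_words * part_pct / 100 / chapters / sections_per_chapter
    minimum = math.ceil(target * low_pct / 100)    # rounded up (assumption)
    maximum = math.floor(target * high_pct / 100)  # rounded down (assumption)
    return round(target), minimum, maximum


if __name__ == "__main__":
    # The example from the text: 15,000-word master's thesis, Part 3 at 33%,
    # 3 chapters with 3 sections each.
    print(section_word_budget(15_000, 33, chapters=3, sections_per_chapter=3))
    # (550, 440, 632)
```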

P1234: The Soul of CoPaper.AI

Finally, a product design reflection: P1234 isn't a "feature" of CoPaper — it's the soul and skeleton of the entire product.

CoPaper's Paper Agent has built a complete academic paper generation pipeline on the backend, and every step, every transition in this pipeline revolves around the P1234 architecture:

Pipeline orchestration: Information gathering → P1 outline → P1 content + reference upload → P2 outline → P2 content → P3 outline → P3 code execution + content → P4 outline → P4 content → Abstract → Auto review
Proportion control: Built-in allocation system precisely controls word budgets for each Part
Chapter constraints: Each Part has independent chapter count ranges (P1: 1-4, P2/3/4: 1-3)
Mode adaptation: Full empirical, theoretical review, and writing complement modes are all essentially P1234 variants for different scenarios

When the Paper Agent is built around P1234 from the ground up, rather than adding an academic veneer to a generic writing tool, the quality and structural rigor of its output are fundamentally different.

P1234 is the CoPaper team's systematic answer to the question: "What should a good empirical paper look like?" Standing on the shoulders of IMRaD and the five-chapter format, it fuses and innovates specifically for the characteristics of social science empirical papers, ultimately creating a universal empirical paper architecture built for the AI era.

Empirical research is becoming increasingly complex: multiple datasets, multiple identification strategies, multiple robustness checks. Given this trend, a good paper architecture matters far more than a fast text generator.

CoPaper P1234 is our answer.

What are your thoughts on paper architecture? We'd love to hear from you.