Two "Brain Surgeries" on Our AI Paper Assistant: From Solo Player to Expert Team
It's been almost a year since I started building an AI-powered academic paper product. Recently I completed architecture upgrades on two core modules back to back — the Paper Refinement module and the Data Analysis module. Here's the thinking behind them.
Part 1: Paper Refinement Module
In CoPaper.AI, once a first draft is generated the user enters "paper refinement mode" — through conversation you can ask the AI to rewrite arguments, restructure sections, re-run regressions, polish prose, or respond to peer-review comments. In short, every iterative edit between first draft and final draft happens here.
The Problem Before: One AI Doing Everything
The original design had a single AI assistant handling every paper-editing task — rewriting content, restructuring sections, running regressions, polishing prose, responding to reviewers… over twenty tools all crammed into one agent.
The result:
- It frequently picked the wrong tool (you ask it to fix a figure, it edits a paragraph instead)
- Every call had to process an enormous context window
- Adding new features only made the bloat worse
The Upgrade: One Coordinator + Multiple Expert Agents
Drawing on recent multi-agent collaboration research, we split the "jack-of-all-trades, master-of-none" agent into one coordinator plus multiple specialized agents. Each agent handles only one category of tasks — some focus on content, some on structure, some on empirical analysis, some on formatting… each staying in its own lane.
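To make the pattern concrete, here is a minimal sketch of the coordinator idea — a classifier picks exactly one specialist, and only that specialist runs. All names here (Request, classify, the agent table) are illustrative assumptions, not CoPaper.AI's actual code:

```python
# Illustrative coordinator + specialist-agents sketch.
# Every name here is hypothetical, not CoPaper.AI's real API.

from dataclasses import dataclass
from typing import Callable

@dataclass
class Request:
    text: str

def classify(req: Request) -> str:
    """Toy intent classifier; a real system would use an LLM call here."""
    keywords = {
        "regression": "empirical",
        "format": "formatting",
        "restructure": "structure",
    }
    for kw, agent in keywords.items():
        if kw in req.text.lower():
            return agent
    return "content"  # default specialist

AGENTS: dict[str, Callable[[Request], str]] = {
    "content":    lambda r: f"[content agent] rewriting: {r.text}",
    "structure":  lambda r: f"[structure agent] restructuring: {r.text}",
    "empirical":  lambda r: f"[empirical agent] re-running analysis: {r.text}",
    "formatting": lambda r: f"[formatting agent] adjusting layout: {r.text}",
}

def coordinate(req: Request) -> str:
    """Route the request to exactly one specialist agent."""
    return AGENTS[classify(req)](req)

print(coordinate(Request("switch to a different regression method")))
```

The key property is that each request activates one narrow agent rather than a single giant one holding twenty-plus tools.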
Core Design Principles
1. Context Isolation
Each expert agent loads only the information relevant to its task. The agent handling data analysis never loads the literature review; the agent handling formatting never sees the code. This dramatically reduces token consumption per AI call.
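A sketch of what context isolation can look like in practice: each agent declares which paper sections it needs, and a loader assembles only those. The section names and the simple dict-based paper structure are assumptions for illustration:

```python
# Illustrative context-isolation sketch; section names and data shapes
# are assumptions, not CoPaper.AI internals.

PAPER = {
    "literature_review": "Prior work on panel methods ...",
    "methods": "We estimate a fixed-effects model ...",
    "code": "reg y x, fe",
    "layout": "\\documentclass{article} ...",
}

# Each agent is allowed to see only its declared sections.
AGENT_CONTEXT = {
    "empirical": ["methods", "code"],
    "formatting": ["layout"],
}

def build_context(agent: str) -> dict[str, str]:
    """Load only the sections this agent needs for its task."""
    return {k: PAPER[k] for k in AGENT_CONTEXT[agent]}
```

The empirical agent's context contains the methods and code but never the literature review; the formatting agent sees layout only — which is exactly where the token savings come from.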
2. Model Tiering
Tasks requiring deep reasoning use a strong model; pattern-based tasks use a lightweight model. Different tasks get matched to different models — the optimal balance between cost and quality.
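Model tiering can be as simple as a lookup table from task category to model tier, defaulting to the cheap tier and escalating only when needed. The model names below are placeholders, not the models we actually use:

```python
# Hypothetical model-tiering table; model names are placeholders.

MODEL_FOR_TASK = {
    "deep_reasoning": "strong-model",  # e.g. restructuring an argument
    "pattern_edit":   "light-model",   # e.g. fixing citation formatting
}

def pick_model(task_kind: str) -> str:
    # Default to the cheap tier; escalate only when the task demands it.
    return MODEL_FOR_TASK.get(task_kind, "light-model")
```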
3. Unlocking Exclusive Capabilities
After the split, each expert can have its own dedicated tools and capabilities — such as fine-grained partial code edits, cross-chapter content moves, and more. These were nearly impossible under the monolithic architecture.
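As one example of the kind of dedicated tool this unlocks, here is a sketch of a fine-grained partial-edit primitive: replace exactly one anchored span of a file and refuse ambiguous matches. This is purely illustrative, not a tool from our codebase:

```python
# Hypothetical fine-grained partial-edit tool an expert agent could own.

def partial_edit(source: str, old: str, new: str) -> str:
    """Replace exactly one occurrence of `old`; fail loudly if the
    anchor is missing or ambiguous, so the agent never edits blindly."""
    if source.count(old) != 1:
        raise ValueError("anchor must match exactly once")
    return source.replace(old, new, 1)
```

Giving this only to the agents that need it keeps every other agent's tool list short, which is part of why routing accuracy improves.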
4. Proportional Response Principle
The scope of AI edits should be proportional to the scope of the user's request. If the user asks you to fix one paragraph, fix that paragraph only — don't take the liberty of rewriting the whole paper. This prevents AI "over-enthusiasm."
Results
- Task routing accuracy significantly improved
- Per-call context dramatically reduced
- Unlocked a batch of fine-grained capabilities that were previously impossible
- Smooth toggle between old and new architecture — low-risk deployment
What This Means for Paper Refinement
The journey from first draft to final version involves countless rounds of revision — a reviewer asks you to add a robustness check, your advisor thinks chapter three's argument isn't compelling enough, you notice the numbers in the abstract don't match the body… These edits are completely different in nature, yet previously a single agent had to "guess" what you wanted.
After the upgrade, the refinement experience has changed: say "switch to a different regression method," and the empirical agent edits the code, re-runs the analysis, and updates the figures — without touching the prose you just polished. Say "adapt the paper to journal X's format," and the formatting agent adjusts layout and wording only — without changing your model specification. Each agent stays in its lane; edits are precise and predictable.
For anyone doing empirical research, paper refinement often takes more time than writing the first draft. This architecture upgrade transforms the AI from a frazzled intern trying to do everything at once into a well-coordinated team of specialists.
Part 2: Data Analysis Module
In CoPaper.AI, uploading your data takes you into "data analysis mode" — here the AI can help with data cleaning, running regressions, creating figures, and even writing the paper for you. From a simple data-cleaning utility to a system that can handle the entire "data to analysis to paper" pipeline, this module has undergone a complete transformation.
The Evolution: From "Data Cleaning Tool" to "Full-Stack Research Partner"
This module's upgrade wasn't a one-shot rewrite — it went through progressive stages of evolution:
Stage 1: Data Cleaning Tool
Initially, it was just a simple data preprocessing utility — upload data, run basic cleaning, export results. That was it.
Stage 2: Empirical Analysis Platform
We soon realized that users don't upload data just to "clean" it — they want to run empirical analysis directly. So we massively expanded statistical analysis capabilities, adding support for a wide range of mainstream econometric methods and advanced modeling techniques, along with visualization output and multi-dataset management.
The module was officially renamed and repositioned as a complete empirical analysis platform.
Stage 3: Paper Writing Capabilities
This was the biggest leap — the module went from "run models and show you results" to directly writing your analysis results into a paper. The AI can automatically generate a paper outline based on your data and results, then write section by section, with analysis code and paper chapters automatically linked. The final product can be exported in a standard document format.
The entire experience became: Upload data → Explore and analyze → Generate outline → Write section by section → Export paper — all in one place.
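The steps above compose naturally as a pipeline passing one shared state along. The stage functions below are stubs standing in for the real sub-agents; every name is an assumption for illustration:

```python
# Illustrative end-to-end pipeline sketch: upload -> analyze -> outline
# -> write -> export. Stage functions are stubs, not real sub-agents.

def analyze(state: dict) -> dict:
    return {**state, "results": f"regression results for {state['data']}"}

def outline(state: dict) -> dict:
    return {**state, "outline": ["Introduction", "Data", "Results", "Conclusion"]}

def write(state: dict) -> dict:
    return {**state, "draft": " | ".join(state["outline"])}

def export(state: dict) -> str:
    return f"exported draft: {state['draft']}"

def run_pipeline(data: str) -> str:
    """Run every stage in order, threading one state dict through."""
    state = {"data": data}  # the "upload" step seeds the state
    for stage in (analyze, outline, write):
        state = stage(state)
    return export(state)

print(run_pipeline("panel.csv"))
```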
Stage 4: Multi-Agent Architecture
As capabilities grew, so did the number of tools — and we hit the same problem as the Refinement module: the monolithic agent was overwhelmed.
We applied the same coordinator + multiple sub-agents architecture: different sub-agents handle data preparation, statistical modeling, chart visualization, and paper writing respectively. Each sub-agent receives only the context relevant to its task, leading to more accurate execution and lower costs.
A Principle That Runs Through Both Modules
Proportional Response
Whether in paper refinement or data analysis, we follow the same principle: what the AI does should precisely match the scope of the user's request — no more, no less.
Ask it to run descriptive statistics, and it outputs descriptive statistics only — it won't unilaterally tack on regression analysis. Ask it to fix one paragraph, and it fixes that paragraph only — it won't rewrite the whole paper.
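One way to enforce this mechanically is a scope guard that compares what the agent edited against what the user asked for, and rejects anything broader. The data shapes here are illustrative assumptions:

```python
# Sketch of a "proportional response" guard: an edit is accepted only if
# it stays inside the span the user requested. Shapes are illustrative.

def within_scope(requested: set[int], edited: set[int]) -> bool:
    """True only if every edited paragraph was explicitly requested."""
    return edited <= requested  # set-inclusion: no extra paragraphs touched

# User asked to fix paragraph 3 only:
print(within_scope({3}, {3}))        # edited exactly what was asked
print(within_scope({3}, {3, 4, 5}))  # rewrote neighbors too -> rejected
```

A guard like this turns "don't do too much" from a prompt-level plea into a checkable invariant.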
This principle sounds simple, but it's critical in practice — the pain point of many AI products isn't "it can't do it," but "it does too much," leaving users unable to tell which results they actually asked for.
Takeaways: Two Lessons from Building AI Products
After completing both architecture upgrades, my biggest takeaways:
1. It's not about using the strongest model — it's about using the right model for the right task. Multiple focused experts beat one generalist that knows everything but masters nothing.
2. The core experience of an AI product isn't "how smart is the AI" — it's "how well does the AI understand its boundaries." Teaching the AI "when to stop" matters more than teaching it "how to do more."
This approach isn't limited to academic papers. Any complex AI Agent product, once its tool count grows beyond a certain scale, should seriously consider splitting into a multi-agent architecture.