Meta-analysis is a systematic research methodology that synthesizes data from multiple existing studies to derive comprehensive conclusions. This approach not only mitigates limitations inherent in individual studies but also facilitates novel discoveries through integrated data analysis. Traditional meta-analysis involves a complex multi-stage pipeline, including literature retrieval, paper screening, and data extraction, which demands substantial human effort and time. While LLM-based methods can accelerate certain stages, they still face significant challenges, such as hallucinations in paper screening and data extraction. In this paper, we propose a multi-agent system, Manalyzer, which achieves end-to-end automated meta-analysis through tool calls. The hybrid review, hierarchical extraction, self-proving, and feedback checking strategies implemented in Manalyzer significantly alleviate both types of hallucination. To comprehensively evaluate performance on meta-analysis, we construct a new benchmark comprising 729 papers across 3 domains, encompassing text, image, and table modalities, with over 10,000 data points. Extensive experiments demonstrate that Manalyzer achieves significant performance improvements over the LLM baselines on multiple meta-analysis tasks.
(a) Manual meta-analysis is inherently time-consuming. This traditional approach relies heavily on human effort at every step, from paper screening to data extraction and synthesis, making it a lengthy and resource-intensive process. (b) In contrast, LLM-based methods offer some automation but are often limited to specific steps, and consequently fail to achieve true end-to-end automation. A significant drawback of these approaches is their propensity for hallucinations during critical stages such as paper screening and data extraction, which can compromise the reliability of the analysis. (c) Our proposed system, Manalyzer, addresses these limitations directly: it provides end-to-end automation for meta-analysis, and its workflow design incorporates mechanisms that significantly reduce hallucinations, thereby enhancing the accuracy and reliability of the entire meta-analysis process.
Manalyzer is a multi-agent system incorporating tool calling and feedback mechanisms, enabling end-to-end automated meta-analysis in real scientific research scenarios. We divide the meta-analysis process into three stages. The first stage receives user input, searches for and downloads papers, and then screens them to retain the relevant and valuable ones. The second stage extracts data from these selected papers and integrates it into tables. The third stage analyzes the integrated data and outputs the final meta-analysis report.
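The three-stage workflow above can be sketched as a simple pipeline. Note that all class names, function names, and data fields below are hypothetical illustrations for exposition; they do not reflect Manalyzer's actual implementation or API.

```python
from dataclasses import dataclass, field

@dataclass
class Paper:
    """Toy stand-in for a downloaded paper (hypothetical structure)."""
    title: str
    relevant: bool = True          # screening decision
    data: dict = field(default_factory=dict)  # extracted data points

def stage1_retrieve_and_screen(query, corpus):
    """Stage 1: search/download papers, then screen for relevant ones."""
    # A real system would query literature databases; here we filter a toy corpus.
    return [p for p in corpus if p.relevant]

def stage2_extract(papers):
    """Stage 2: extract data from each selected paper into one table (list of rows)."""
    return [{"paper": p.title, **p.data} for p in papers]

def stage3_analyze(table, metric):
    """Stage 3: synthesize the pooled table, e.g. an unweighted mean effect size."""
    values = [row[metric] for row in table if metric in row]
    return sum(values) / len(values) if values else None

# Usage with toy data
corpus = [
    Paper("A", relevant=True,  data={"effect": 0.4}),
    Paper("B", relevant=False, data={"effect": 0.9}),
    Paper("C", relevant=True,  data={"effect": 0.6}),
]
screened = stage1_retrieve_and_screen("example query", corpus)
table = stage2_extract(screened)
pooled = stage3_analyze(table, "effect")  # 0.5
```

In the real system each stage would be driven by agents with tool calls and feedback checking rather than the deterministic toy functions shown here.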
To comprehensively and objectively evaluate the performance of Manalyzer and LLM baselines on meta-analysis, we introduce the first meta-analysis benchmark derived from real-world, large-scale scientific papers. The benchmark includes 729 papers with 10,000+ data points across three fields, and assesses a model's ability to extract research-relevant data from multimodal content (tables, images, text) and consolidate it into structured tables.
* This work was primarily conducted during the author's internship at the Shanghai Artificial Intelligence Laboratory.
† Corresponding author.