Reporting quality of animal research in journals that published the ARRIVE 1.0 or ARRIVE 2.0 guidelines: a cross-sectional analysis of 943 studies
Highlight box
Key findings
• None of the 943 interventional animal studies from the journals that published the Animals in Research: Reporting In Vivo Experiments (ARRIVE) guidelines reported all 38 subitems of the ARRIVE 1.0 or 2.0 guidelines; only 0%, 0%, and 0.25% of studies had “excellent” reporting quality in the three periods, respectively.
• The overall reporting quality improved significantly across Pre-ARRIVE 1.0, Post-ARRIVE 1.0, and Post-ARRIVE 2.0 (P<0.001). Compliance with 15 out of 38 subitems (39.5%) improved significantly in Post-ARRIVE 1.0 compared to Pre-ARRIVE 1.0 (P<0.05). Eleven out of 27 similar and comparable subitems (40.7%) improved significantly in Post-ARRIVE 2.0 compared to Post-ARRIVE 1.0 (P<0.05).
• Reporting quality differed significantly according to whether adherence to the ARRIVE guidelines was mandatory in the author’s instructions and whether the manuscript referred to ARRIVE.
What is known and what is new?
• The reporting quality of animal studies published outside the journals that released the ARRIVE guidelines is inadequate.
• Although adherence to the ARRIVE guidelines has improved since their introduction, it remains unsatisfactory even in the journals that published the guidelines.
What is the implication, and what should change now?
• The ARRIVE guidelines and the 2.0 update have a positive impact on the reporting quality of interventional animal experiments.
• To further improve the reporting quality, more journals are encouraged to make the ARRIVE guidelines mandatory rather than simply mentioning them in the author’s instructions.
Introduction
Background
Animal research, including interventional animal studies (in vivo experiments), is pivotal in enhancing our understanding of biological phenomena within whole-life systems (1). It allows the testing of novel treatments and has contributed significantly to developing medical treatments (2). However, there are also concerns with animal experiments, including poor methodological quality, which can limit the translational value of results to the clinical situation (3,4).
In this regard, the translational value of such studies hinges on the accuracy, comprehensiveness, and transparency of reporting. Even well-designed and meticulously executed studies can be misinterpreted or misrepresented if reported inadequately (5,6). Recognizing this, the Animal Research Guidelines Development group, comprising professional researchers from a range of disciplines, statisticians, and journal editors, was established in 2009 to formulate a checklist for improved reporting of animal research. The Animals in Research: Reporting In Vivo Experiments (ARRIVE) guidelines were first published in 2010 across four journals (7-10). The updated and reorganized ARRIVE 2.0 guidelines were further disseminated in 2020 across seven journals (11-17). Since then, the ARRIVE guidelines have gained widespread recognition among authors, journal editors, and reviewers. They are also accessible on the Enhancing the Quality and Transparency of Health Research (EQUATOR) network website (18) and recommended on its front page.
Rationale and knowledge gap
Recent studies have consistently highlighted the unsatisfactory reporting quality in animal research. Zhao et al. found that only 28.2% of subitems from the ARRIVE 1.0 guidelines met the stringent 90% compliance threshold across 4,342 animal experiments published in Chinese journals (19). An analysis of 234 studies involving the placement of electrocardiogram recording telemetry devices in adult mice demonstrated that the quality of reporting was low to moderate for items related to the animals, husbandry, statistics, and risk of bias (20). A systematic evaluation of modern studies on pulmonary heart valve implantation in large animals showed that a mean of only 54.7% of ARRIVE items were adequately scored in the 31 included articles (21). Another study reported a mean coincidence score of 53% in 28 animal experiments (22). Additionally, researchers have documented low overall compliance rates for the Essential 10 (42.0%) and Recommended Set (41.5%) from the ARRIVE 2.0 guidelines (23). However, existing investigations into adherence to the ARRIVE guidelines have predominantly focused on animal research published outside the journals that originally disseminated these guidelines. A knowledge gap therefore remains: what is the level of compliance with the ARRIVE guidelines within these journals? Furthermore, has there been a positive trend in adherence following the publication of the ARRIVE 1.0 and ARRIVE 2.0 guidelines?
Objective
To answer these questions, this study aims to: (I) evaluate reporting quality: assess the reporting quality of interventional animal experiments published in the journals that initially published the guidelines before and after the release of ARRIVE 1.0 and 2.0 guidelines; (II) analyze compliance trends: present the reporting quality across three distinct periods for interventional animal experiments featured in these journals; (III) identify influencing factors: by analyzing underlying factors, provide actionable recommendations for enhancing the reporting quality of animal research. We present this article in accordance with the STROBE reporting checklist (available at https://cdt.amegroups.com/article/view/10.21037/cdt-24-413/rc).
Methods
Study design
The adherence to ARRIVE guidelines in interventional animal experiments published in target journals was evaluated across three distinct periods: Pre-ARRIVE 1.0, Post-ARRIVE 1.0, and Post-ARRIVE 2.0. Relevant articles were identified through a structured PubMed search. Compliance with ARRIVE guideline items was evaluated, and adherence levels were compared before and after the implementation of ARRIVE guidelines.
Search strategy and eligibility criteria
A comprehensive search was conducted to identify relevant interventional animal experiments from ten journals that initially released the ARRIVE 1.0 or 2.0 guidelines, including four that published ARRIVE 1.0 and seven that published ARRIVE 2.0. Notably, one journal published both versions of ARRIVE guidelines.
Periods considered: Pre-ARRIVE 1.0 (May 30, 2005 to May 30, 2010), 5 years preceding the publication of ARRIVE 1.0; Post-ARRIVE 1.0 (May 30, 2014 to May 30, 2019), 5 years after the release of ARRIVE 1.0, allowing a 4-year gap (2010–2014) to account for the guideline adaptation (24); Post-ARRIVE 2.0 (January 1, 2021 to December 31, 2021), 1 year after the publication of ARRIVE 2.0 in 2020.
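The period definitions above can be expressed as a small lookup. The following is only an illustrative sketch, not the authors' workflow: the function name `assign_period` is hypothetical, and treating both boundary dates as inclusive is an assumption.

```python
from datetime import date
from typing import Optional

# Period boundaries as stated in the text; inclusive bounds are an assumption.
PERIODS = {
    "Pre-ARRIVE 1.0": (date(2005, 5, 30), date(2010, 5, 30)),
    "Post-ARRIVE 1.0": (date(2014, 5, 30), date(2019, 5, 30)),
    "Post-ARRIVE 2.0": (date(2021, 1, 1), date(2021, 12, 31)),
}


def assign_period(published: date) -> Optional[str]:
    """Return the study period for a publication date, or None if it falls in a gap."""
    for name, (start, end) in PERIODS.items():
        if start <= published <= end:
            return name
    return None


print(assign_period(date(2016, 3, 1)))   # inside the Post-ARRIVE 1.0 window
print(assign_period(date(2012, 1, 1)))   # inside the 2010-2014 adaptation gap
```

Dates in the 2010–2014 adaptation gap and in 2020 map to no period, which mirrors how those years are left out of the comparison.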
Two reviewers (Y.L., F.Y.) independently conducted a systematic search in PubMed on March 3, 2022, using a structured search query (Table S1). Inclusion criteria: (I) interventional animal experiments (in vivo studies); (II) studies written in English. Exclusion criteria: (I) non-interventional animal experiments, including studies involving only human participants, in vitro or ex vivo experiments using animal tissues, organs, or cells, and bioinformatics analyses; (II) non-efficacy/mechanism studies, including parameter intervals, diagnoses, assay validity, and tissue morphology changes; (III) non-biomedical studies, including coral reproduction and migration; (IV) unavailability of the full text; (V) withdrawn from publication.
All the reviewers (Y.L., F.Y., and B.S.) underwent training to consistently understand the ARRIVE guidelines, screening, data extraction, and evaluation. The training consisted of a line-by-line interpretation of the checklists and the explanation & elaboration file for the ARRIVE guidelines (25). Two reviewers independently screened titles, abstracts, and full texts. Discrepancies were resolved through group discussion until reaching a consensus.
Data extraction
The data extraction table included the following information: (I) basic characteristics: study and journal titles, whether the journal was Science Citation Index (SCI)-indexed, first author’s country, publication year, and reference to ARRIVE guidelines in the manuscript; (II) adherence to the ARRIVE items. Additionally, we evaluated the ‘author’s instructions’ section on the journal websites to assess their recognition and the strength of the recommendation of the ARRIVE guidelines. If the ARRIVE statement was mentioned, we recorded it as “yes” regarding ‘reference to ARRIVE in the author’s instructions’. Endorsement of ARRIVE guidelines was categorized as “mandatory” or “non-mandatory”. The category “mandatory” was applied when the instructions used strong wording such as “mandate”, “authors are required”, or “authors must”, which makes a journal’s mandatory stance explicit. Conversely, recognition without such strong wording, for example “encourage”, “support”, “endorse”, “adopt”, “adhere”, or “recommend”, was considered “non-mandatory” (26).
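The wording rule for journal endorsement can be illustrated with a simple keyword check. This is a hedged sketch only: the classification in the study was made by reviewers reading the instructions, the function name `classify_endorsement` is hypothetical, and the example sentences are invented for illustration rather than quoted from any journal.

```python
# Strong wording listed in the text as signalling a mandatory policy.
MANDATORY_TERMS = ("mandate", "authors are required", "authors must")


def classify_endorsement(statement: str) -> str:
    """Classify an author's-instructions statement by its wording about ARRIVE."""
    text = statement.lower()
    if "arrive" not in text:
        return "not mentioned"
    if any(term in text for term in MANDATORY_TERMS):
        return "mandatory"
    # Any other mention (encourage, support, recommend, ...) counts as non-mandatory.
    return "non-mandatory"


# Hypothetical example statements, not taken from real journal instructions.
print(classify_endorsement("Authors must follow the ARRIVE guidelines."))
print(classify_endorsement("We encourage authors to consult the ARRIVE guidelines."))
```

A substring match of this kind is deliberately crude; it would only be a first pass before a reviewer confirms the category.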
Reporting quality assessment of the included studies
Intraclass correlation coefficients (ICCs) were calculated to assess the consistency of assessment among the three reviewers. An ICC of >0.75 indicates good reliability (27). It was estimated that seven studies were required to obtain sufficiently precise confidence intervals (available online: https://cdn.amegroups.cn/static/public/cdt-24-413-1.pdf); these studies were randomly selected from each time period [using the Excel function “=RANDBETWEEN()”] and assessed by each reviewer. Before the formal assessment began, all three reviewers evaluated the reporting quality of all subitems for the same 14 studies. For the subsequent ARRIVE guideline assessments, each of the three reviewers (Y.L., F.Y., and B.S.) individually assessed the reporting quality of a randomly assigned subset of subitems (38/3 = 12–13 subitems per person).
Studies from the four journals that published the ARRIVE 1.0 guidelines were evaluated against the ARRIVE 1.0 guidelines (Pre- and Post-ARRIVE 1.0), and those from the seven journals that published the ARRIVE 2.0 guidelines were evaluated against the ARRIVE 2.0 guidelines (Post-ARRIVE 2.0).
Trained reviewers with no conflict of interest independently assessed a subset of items from the ARRIVE guidelines, following specific scoring criteria: “1” (fully reported), “0.5” (partially reported), or “0” (unreported). The total score for each study ranged from 0 to 38 points, and the reporting quality coefficient was calculated as the total score divided by the maximum possible score of 38 (28).
Based on the reporting quality coefficient, the reporting quality was graded as follows: coefficient <0.5 was considered “poor”, 0.5≤ coefficient <0.8 was considered “average”, coefficient ≥0.8 was considered “excellent” (28).
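The scoring and grading rules can be sketched in a few lines of Python. This is an illustration rather than the authors' actual procedure: the function names are hypothetical, and it assumes the coefficient is the total score divided by the 38-point maximum.

```python
MAX_SCORE = 38                 # one point per subitem
ALLOWED_SCORES = (0, 0.5, 1)   # unreported, partially reported, fully reported


def quality_coefficient(subitem_scores):
    """Total score divided by the maximum score (assumed form of the coefficient)."""
    if len(subitem_scores) != MAX_SCORE:
        raise ValueError("expected one score per subitem (38 in total)")
    if any(score not in ALLOWED_SCORES for score in subitem_scores):
        raise ValueError("each score must be 0, 0.5, or 1")
    return sum(subitem_scores) / MAX_SCORE


def grade(coefficient):
    """Map a coefficient to the grade thresholds given in the text."""
    if coefficient >= 0.8:
        return "excellent"
    if coefficient >= 0.5:
        return "average"
    return "poor"


# Hypothetical study: 30 subitems fully, 2 partially, and 6 not reported.
scores = [1] * 30 + [0.5] * 2 + [0] * 6
print(grade(quality_coefficient(scores)))
```

Under these thresholds a study needs at least 30.5 of 38 points to be graded “excellent”, which helps explain why that grade was so rare.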
The original data can be downloaded from https://cdn.amegroups.cn/static/public/cdt-24-413-2.xlsx.
Statistical analysis
Statistical tests were performed using SPSS software (v26.0, IBM Corporation, Armonk, NY, USA). Categorical variables were reported as frequency (percentages). Comparisons of categorical variables between the groups were implemented using Chi-square test or Fisher’s exact test when expected cell counts were below 5. Post-hoc comparisons were performed using the Bonferroni correction to control for type I error. A two-sided P value less than 0.05 was considered statistically significant. ICC values <0.50, 0.50–0.75, >0.75 to 0.90, and >0.90 indicate poor, moderate, good and excellent reliability, respectively (27).
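As a worked illustration of the chi-square comparison, the sketch below recomputes two statistics from counts listed in Table 3. It is not the SPSS procedure itself: the helper names are hypothetical, and it assumes a Pearson chi-square without continuity correction on a 2×2 table, with the Bonferroni threshold taken as 0.05 divided by three comparisons.

```python
import math


def pearson_chi2_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


def p_value_df1(chi2):
    """Upper-tail P value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2))


# Rows: group; columns: (excellent & average, poor). Counts taken from Table 3.
mandatory_chi2 = pearson_chi2_2x2(142, 10, 462, 114)   # mandatory vs. non-mandatory
cited_chi2 = pearson_chi2_2x2(230, 4, 374, 120)        # ARRIVE cited vs. not cited

# Bonferroni-adjusted threshold for three pairwise comparisons (assumed 0.05 / 3).
bonferroni_alpha = 0.05 / 3

print(round(mandatory_chi2, 2), p_value_df1(mandatory_chi2) < 0.001)
print(round(cited_chi2, 2), p_value_df1(cited_chi2) < 0.001)
```

Under these assumptions the statistics come out at about 14.86 and 57.30, in line with the values reported in Table 3, and the adjusted threshold of about 0.0167 matches the one in the Table 2 footnote.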
Results
Study inclusion
Figure 1 depicts the search and inclusion process. In total, 215, 330, and 398 studies were included in the Pre-ARRIVE 1.0, Post-ARRIVE 1.0, and Post-ARRIVE 2.0 phases, respectively. Table 1 shows the number of studies included per journal; notably, all studies published in BMJ Open Science were excluded because they did not meet the inclusion criteria.
Table 1
Publishing ARRIVE guidelines | Journal | Inception | 2022 JIF | SCI-indexed | Reference to ARRIVE in the author’s instructions# | Number of included studies |
---|---|---|---|---|---|---|
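Journals publishing ARRIVE 1.0 guidelines | PB | 2003 | 9.8 | Yes | Yes | 275^ |
Journals publishing ARRIVE 1.0 guidelines | OC | 1993 | 7.0 | Yes | Yes | 250^ |
Journals publishing ARRIVE 1.0 guidelines | VCP | 1975 | 1.2 | Yes | Yes | 7^ |
Journals publishing ARRIVE 1.0 guidelines | JPP | 2010 | – | No | Yes | 13^ |
Journals publishing ARRIVE 2.0 guidelines | PB | 2003 | 9.8 | Yes | Yes | 51‡ |
Journals publishing ARRIVE 2.0 guidelines | BJP | 1950 | 7.3 | Yes | Yes | 134‡ |
Journals publishing ARRIVE 2.0 guidelines | JCBFM | 2002 | 6.3 | Yes | Yes | 72‡ |
Journals publishing ARRIVE 2.0 guidelines | JP | 1878 | 5.5 | Yes | Yes | 65‡ |
Journals publishing ARRIVE 2.0 guidelines | BMC-VR | 2005 | 2.6 | Yes | Yes | 15‡ |
Journals publishing ARRIVE 2.0 guidelines | EP | 1990 | 2.7 | Yes | Yes | 61‡ |
Journals publishing ARRIVE 2.0 guidelines | BMJ-OS | 2017 | – | No | Yes | 0‡ |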
#, the ARRIVE statement was mentioned in the author’s instructions. The detailed statements about ARRIVE guidelines in the author’s instructions are presented in Table S3. ^, the number of eligible studies from the four journals publishing ARRIVE 1.0 guidelines from May 30, 2005 to May 30, 2010, and from May 30, 2014 to May 30, 2019. ‡, the number of eligible studies from the seven journals publishing ARRIVE 2.0 guidelines from January 1, 2021 to December 31, 2021. JIF, Journal Impact Factor; SCI, Science Citation Index; PB, PLOS Biology; OC, Osteoarthritis and Cartilage; VCP, Veterinary Clinical Pathology; JPP, Journal of Pharmacology & Pharmacotherapeutics; BJP, British Journal of Pharmacology; JCBFM, Journal of Cerebral Blood Flow & Metabolism; JP, The Journal of Physiology; BMC-VR, BMC Veterinary Research; EP, Experimental Physiology; BMJ-OS, BMJ Open Science; ARRIVE, Animals in Research: Reporting In Vivo Experiments.
Adherence to the ARRIVE guidelines
The inter-rater agreement among the three reviewers for assessing the ARRIVE 1.0 guidelines yielded a good ICC of 0.832 (95% CI: 0.799–0.861). For the ARRIVE 2.0 guidelines, the ICC was also good at 0.753 (95% CI: 0.708–0.794).
Comparison of adherence in Pre- and Post-ARRIVE 1.0
Figure 2 summarizes the adherence to the ARRIVE guidelines. Of the 38 subitems, 20 (52.6%) had a higher percentage of “fully reported” studies in Post-ARRIVE 1.0 than in Pre-ARRIVE 1.0, and 15 (39.5%) improved significantly (P<0.05, P value marked in blue). The percentage of “fully reported” studies was significantly higher for item 2 (abstract), item 5 (ethical statement), item 8a (animals details), item 9b (husbandry), item 9c (welfare-related assessments), item 10b (sample size calculation), item 10c (independent replications), item 11a (animals allocation), item 13a (statistical methods), item 13c (assumptions of statistical methods), item 15a (the number of animals for analysis), item 16 (outcomes and estimation), item 18b (study limitations), item 18c (implications of 3R), and item 20 (funding). Notably, items 7d (procedures rationale) and 10a (animal number) showed significantly reduced rates of “fully reported” in Post-ARRIVE 1.0 (Figure 2A, P value marked in red).
All eligible animal studies in Pre-ARRIVE 1.0 and Post-ARRIVE 1.0 fully reported item 1 (title), item 3a (sufficient scientific background), item 12 (outcomes definitions), item 13b (the analysis unit), and item 18a (results interpretation). However, within the Post-ARRIVE 1.0 phase, 13 out of 38 subitems (34.2%) were left unreported by more than 50% of studies (i.e., for the remaining 65.8% of subitems, less than 50% of studies left them unreported). The five least frequently reported items were item 11b (animals’ order of treatment and evaluation, 100% unreported), item 17b (reduced adverse events, 97.88% unreported), item 15b (exclusion reasons, 97.27% unreported), item 7d (procedures rationale, 95.15% unreported) and item 10b (sample size calculation, 92.12% unreported). None of the studies reported all 38 subitems outlined in the ARRIVE 1.0 guidelines (Figure 2A).
Comparison of adherence in Post-ARRIVE 1.0 and 2.0
Reporting appears to have improved since the introduction of the ARRIVE 2.0 guidelines, and the improvement appears to be more pronounced than that in Post-ARRIVE 1.0 (Figure 2B). In Post-ARRIVE 2.0, 27 out of 38 subitems (71.1%) had less than 50% unreported. The unreported percentage exceeded 50% in 11 out of 38 subitems (28.9%) in Post-ARRIVE 2.0, including items 2b (78.14%), 3a (88.69%), 3b (87.94%), 4b (68.34%), 9c (79.15%), 9d (90.70%), 10b (87.44%), 16b (87.94%), 16c (68.59%), 17b (65.33%), and 19 (92.71%). The first 7 subitems belong to the “ARRIVE 2.0 Essential 10” and the last 4 subitems belong to the “ARRIVE 2.0 Recommended Set”. The three least reported items in Post-ARRIVE 2.0 were protocol registration (item 19, 92.71% unreported), the procedures rationale (item 9d, 90.70% unreported), and the inclusion and exclusion criteria (item 3a, 88.69% unreported). The latter two subitems belong to the “ARRIVE 2.0 Essential 10”. None of the studies reported all 38 subitems outlined in the ARRIVE 2.0 guidelines. Fully reported items included items 1b (experimental unit), 6a (outcomes definitions), 6b (primary outcome), 12a (sufficient scientific background), and 17a (results interpretation).
Notably, the content of the ARRIVE 2.0 guidelines has not changed substantially from ARRIVE 1.0. Consequently, 30 similar subitems exist between the ARRIVE 1.0 and 2.0 guidelines (Table S2). Among the 27 similar and comparable subitems (3 subitems were excluded because they were not one-to-one comparable), three consistently achieved a 100% reporting rate in both Post-ARRIVE 1.0 and 2.0 (sufficient scientific background, outcomes definitions, and results interpretation). Notably, 9 out of 27 subitems (33.3%) showed significantly reduced rates of “fully reported” (P value marked in red) and 11 out of 27 (40.7%) showed significantly increased rates (P value marked in blue) in Post-ARRIVE 2.0 compared to Post-ARRIVE 1.0 (Figure 2C). Of these 11 significantly improved subitems, 7 (63.6%) belong to “The ARRIVE Essential 10” (number of groups, experimental unit, sample size calculation, exclusion reasons, assumptions of statistical methods, animal details, and procedures rationale). Six similar subitems still exhibit an unreported percentage exceeding 50% in both Post-ARRIVE 1.0 and 2.0, including the site for the experimental procedures (1.0-item 7c: 80.61%; 2.0-item 9c: 79.15%), procedures rationale (1.0-item 7d: 95.15%; 2.0-item 9d: 90.70%), sample size calculation (1.0-item 10b: 92.12%; 2.0-item 2b: 78.14%), exclusion reasons (1.0-item 15b: 97.27%; 2.0-item 3b: 87.94%), details of all important adverse events (1.0-item 17a: 92.42%; 2.0-item 16b: 87.94%), and the study limitations (1.0-item 18b: 73.03%; 2.0-item 17b: 65.33%) (Figure 2C).
Reporting quality in Post-ARRIVE 1.0 and 2.0
The overall reporting quality differed significantly among Pre-ARRIVE 1.0, Post-ARRIVE 1.0, and Post-ARRIVE 2.0 (P<0.001) (Table 2). The percentage of studies with “average” reporting quality was higher in Post-ARRIVE 1.0 than in Pre-ARRIVE 1.0 (73.94% vs. 53.95%), whereas the percentage of studies with “poor” reporting quality was lower (26.06% vs. 46.05%). The overall reporting quality was significantly improved in Post-ARRIVE 1.0 compared to Pre-ARRIVE 1.0 (P<0.001). Additionally, the percentages of studies with “excellent” and “average” reporting quality were higher in Post-ARRIVE 2.0 than in Post-ARRIVE 1.0 (0.25% vs. 0% and 90.20% vs. 73.94%, respectively), and the percentage of studies with “poor” reporting quality was lower (9.55% vs. 26.06%). A significant improvement in overall reporting quality was also found in Post-ARRIVE 2.0 compared to Post-ARRIVE 1.0 (P<0.001).
Table 2
Reporting quality | Pre-ARRIVE 1.0 | Post-ARRIVE 1.0 | Post-ARRIVE 2.0 | P value a | P value b | P value c |
---|---|---|---|---|---|---|
Total | 215 | 330 | 398 | <0.001 | <0.001# | <0.001# |
Excellent, n (%) | 0 | 0 | 1 (0.25) | | | |
Average, n (%) | 116 (53.95) | 244 (73.94) | 359 (90.20) | | | |
Poor, n (%) | 99 (46.05) | 86 (26.06) | 38 (9.55) | | | |
a, comparison among three groups; b, Pre-ARRIVE 1.0 vs. Post-ARRIVE 1.0; c, Post-ARRIVE 1.0 vs. Post-ARRIVE 2.0. #, P<0.0167. P value b and P value c were evaluated using the Bonferroni correction for multiple comparisons. ^, four journals: PLOS Biology, Osteoarthritis and Cartilage, Veterinary Clinical Pathology, and Journal of Pharmacology & Pharmacotherapeutics. ‡, six journals: PLOS Biology, British Journal of Pharmacology, Journal of Cerebral Blood Flow & Metabolism, The Journal of Physiology, BMC Veterinary Research, and Experimental Physiology. Categorical data were expressed as frequency (percentage). Pre-ARRIVE 1.0, May 30, 2005 to May 30, 2010; Post-ARRIVE 1.0, May 30, 2014 to May 30, 2019; Post-ARRIVE 2.0, January 1, 2021 to December 31, 2021. ARRIVE, Animals in Research: Reporting In Vivo Experiments.
Factors affecting the reporting quality
Significant differences emerged in reporting quality associated with journal, mandatory adherence to the ARRIVE guidelines in the author’s instructions, and reference to ARRIVE in the manuscript (all P<0.001) (Table 3). All nine journals explicitly mention ARRIVE guidelines in their author’s instructions, but only three journals have made it mandatory (Table S3). Mandatory adherence to the ARRIVE guidelines significantly impacts the reporting quality (P<0.001), exhibiting a higher percentage of studies with “excellent & average” reporting quality (93.42%) compared to journals that do not mandate adherence (80.21%). Furthermore, the presence of an ARRIVE statement in the manuscript also had a significant impact on reporting quality (P<0.001), demonstrating a higher percentage of “excellent & average” reporting quality (98.29%) compared to those without such a statement (75.71%).
Table 3
Characteristics | Total (N) | Excellent & average, n (%) | Poor, n (%) | χ2 value | P value |
---|---|---|---|---|---|
Country (29) | | | | 3.19 | 0.074 |
Developed countries | 547 | 446 (81.54) | 101 (18.46) | | |
Developing countries | 181 | 158 (87.29) | 23 (12.71) | | |
Journals | | | | 145.65 | <0.001** |
PB | 208 | 120 (57.69) | 88 (42.31) | | |
OC | 161 | 137 (85.09) | 24 (14.91) | | |
VCP | 3 | 3 (100.00) | 0 | | |
JPP | 9 | 8 (88.89) | 1 (11.11) | | |
BJP | 134 | 133 (99.25) | 1 (0.75) | | |
JCBFM | 72 | 68 (94.44) | 4 (5.56) | | |
JP | 65 | 59 (90.77) | 6 (9.23) | | |
BMC-VR | 15 | 15 (100.00) | 0 | | |
EP | 61 | 61 (100.00) | 0 | | |
Journals-indexed | | | | 0.226 | >0.99 |
SCI-indexed | 719 | 596 (82.89) | 123 (17.11) | | |
Not SCI-indexed | 9 | 8 (88.89) | 1 (11.11) | | |
Mandatory adherence to ARRIVE guidelines in author’s instructions | | | | 14.86 | <0.001** |
Yes | 152 | 142 (93.42) | 10 (6.58) | | |
No | 576 | 462 (80.21) | 114 (19.79) | | |
Reference to ARRIVE in the manuscript | | | | 57.3 | <0.001** |
Yes | 234 | 230 (98.29) | 4 (1.71) | | |
No | 494 | 374 (75.71) | 120 (24.29) | | |
Categorical data were expressed as frequency (percentage). P values were calculated using Chi-square test or Fisher’s exact test when expected cell counts were below 5. Significant differences are indicated in bold, which are defined as **P<0.001. Post-ARRIVE 1.0, May 30, 2014 to May 30, 2019; Post-ARRIVE 2.0, January 1, 2021 to December 31, 2021. SCI, Science Citation Index; PB, PLOS Biology; OC, Osteoarthritis and Cartilage; VCP, Veterinary Clinical Pathology; JPP, Journal of Pharmacology & Pharmacotherapeutics; BJP, British Journal of Pharmacology; JCBFM, Journal of Cerebral Blood Flow & Metabolism; JP, The Journal of Physiology; BMC-VR, BMC Veterinary Research; EP, Experimental Physiology; ARRIVE, Animals in Research: Reporting In Vivo Experiments.
Discussion
Our analysis of interventional animal experiments published in the ten journals that released the ARRIVE guidelines revealed an improvement in reporting quality across the periods before and after the publication of the ARRIVE 1.0 and 2.0 guidelines. However, there remains room for improvement. The items most frequently left unreported from the ARRIVE 2.0 guidelines included item 19 (protocol registration), item 9d (procedures rationale), and item 3a (any inclusion and exclusion criteria). Furthermore, items about procedures rationale were poorly reported under both the ARRIVE 1.0 and 2.0 guidelines. Reporting quality was significantly associated with mandatory adherence to ARRIVE guidelines and with reference to ARRIVE in the manuscript.
Inadequate reporting of key aspects in animal studies impairs the reproducibility of studies. Chalmers and Glasziou pointed out that at least 50% of research reports become unusable due to incomplete reporting (30,31). The suboptimal reporting quality found in our study is consistent with the findings of other studies. Ding et al. evaluated 275 animal experiments following the ARRIVE 2.0 guidelines and found that only 50.6% discussed inclusion and exclusion criteria (item 3a); 20.9% of examined articles provided the procedures rationale (item 9d); and none of the studies reported the status of the study registration plan (item 19) (23). Another study also highlighted deficient compliance with the ARRIVE 2.0 guidelines for some critical items, including item 19 (protocol registration), item 3 (inclusion and exclusion criteria), item 4 (randomization), and item 5 (blinding) (32). To address this, journal editors, reviewers, and authors are encouraged to adhere to reporting guidelines, ensure high-quality reporting, and minimize waste resulting from incomplete or unusable research.
Of note, our study found that item 19 (protocol registration) and item 9d (procedures rationale) were the most poorly reported items in Post-ARRIVE 2.0. Regarding item 19 (protocol registration), inadequate reporting exists in both clinical trials and animal studies. A study assessing clinical trials published in The BMJ from 2013 to 2017 found that improper registration persisted as a problem, especially for government or foundation-funded clinical trials (33). Prospective registrations are crucial for minimizing selective reporting, as they serve as predefined plans that facilitate the assessment of comprehensive reports and enable comparisons (34). The same study also highlighted that improperly registered clinical trials are almost always published, indicating that medical journal editors may not actively require registration (33). Detailed reporting of methods, including experimental processes and procedures, sample size calculation, and allocation methods, is a key measure to ensure that the results of animal research can be replicated. One study demonstrated that the reliability of the results could not be evaluated in 51% of animal experiments (39/76, with more than 500 citations) because of poor methodological quality (35). Of 203 animal studies published in three prominent cardiovascular research journals, almost all studies lacked a power calculation and allocation methods (i.e., the unreported percentage exceeding 50%) (36). Similarly, when assessing the reporting quality of preclinical heart valve research, the authors found that the sample size calculation item was never reported, yielding a score of 0% (21). These findings underscore the need for greater attention from authors, editorial teams, and reviewers to improve the quality of reporting in animal studies. Other upstream initiatives, such as registration requirements in funding calls, may further improve reporting quality. For example, in February 2024, a call for funding from the German Ministry for Research and Education requested the preregistration of preclinical studies by successful applicants (37). Registration initiatives from funding agencies would simultaneously address these least-compliant items, as registration sites such as ‘animalstudyregistry.org’ require this information to be provided at the time of registration.
Authors may be unaware of the ARRIVE guidelines. Ma et al. found that only 9.4% (25 out of 266) of students and research staff were aware of the ARRIVE guidelines (38). Similarly, Reichlin et al. found that approximately half of in vivo researchers had never heard of the ARRIVE guidelines (39). These studies underscore the research community’s limited awareness of the ARRIVE guidelines and indicate room for improvement in author education regarding their importance and proper implementation. Alternatively, inadequate reporting may result from the heavy burden that adherence places on authors, reviewers, and editors. Artificial intelligence may ease this burden in the coming years. According to the ARRIVE website, an adherence checker tool to verify an article’s adherence to ARRIVE guidelines will be publicly available on the website in 2025 (40). Another reason may be the word limits imposed by certain journals, which make complete reporting challenging. However, this could be addressed by reporting details in supplementary materials.
Statistically significant differences in reporting quality were found across journals, and mandatory adherence to the ARRIVE guidelines correlated with improved reporting quality. A similar positive impact was observed in the case of journals from the Nature Publishing Group, where their editorial policy mandating a checklist led to enhanced conformity (41). However, in a 2019 survey, only 13.1% of the 198 responding journals were aware of the ARRIVE guidelines, and none of them included the ARRIVE guidelines in their author’s instructions (42). Also, the ARRIVE guidelines may be superficially endorsed by some journals (e.g., may not be mandatory or may not be enforced in practice) (43), but our findings suggest that mandatory adherence to the ARRIVE guidelines within journals could positively impact the quality of reporting and raise awareness among authors.
We analyzed whether adherence to ARRIVE guidelines in animal research was adequate in the ten journals that published the ARRIVE guidelines. Liu et al. evaluated differences in the reporting of animal studies before and after the publication of the ARRIVE 2010 guidelines in journals other than these ten, including articles from the pre-ARRIVE (2005–2010) and post-ARRIVE (2014–2019) periods. They found that adherence had improved for 42.1% (16 out of 38) of subitems and that two items were fully reported in both pre-ARRIVE and post-ARRIVE (24). In our study, 52.6% (20 out of 38) of subitems had a higher percentage of “fully reported” studies in Post-ARRIVE 1.0 than in Pre-ARRIVE 1.0, and five items were fully reported in all eligible animal studies in both Pre-ARRIVE 1.0 and Post-ARRIVE 1.0. These data indicate that publishing reporting guidelines may contribute to better guideline adherence within the publishing journals.
Although we observed suboptimal adherence to the ARRIVE guidelines, an upward trend was seen in reporting quality. The compliance rate for the majority of items improved in the Post-ARRIVE 1.0 period compared to the Pre-ARRIVE 1.0 period. Similarly, the publication of the ARRIVE 2.0 guidelines has had a significant impact on the reporting quality of interventional animal experiments. The detailed explanation and elaboration file of the ARRIVE 2.0 guidelines, a useful tool that helps authors understand the guidelines in depth, might contribute to the good reporting quality (25). In addition, the updated and reorganized ARRIVE 2.0 guidelines consist of 10 essential items and 11 recommended items (12). Our study identified 11 out of 27 comparable subitems (40.7%) that were reported significantly more completely in Post-ARRIVE 2.0 than in Post-ARRIVE 1.0. Among the subitems showing improved reporting quality, 63.6% (7 out of 11) belong to “The ARRIVE Essential 10”. Some poorly reported items in the ARRIVE 1.0 guidelines are no longer required (e.g., items 14 and 18c) or have been refined (e.g., items 6b, 11a, and 11b) in ARRIVE 2.0. This indicates that the changes in form in the ARRIVE 2.0 guidelines potentially contribute to the improved reporting quality. Guideline developers may consider streamlining future reporting guidelines, emphasizing concise implementation, and focusing on core items.
Our study has certain limitations. First, while we have assessed adherence to ARRIVE 2.0 guidelines, our evaluation covers only 1 year. A longer-term perspective would provide a more robust understanding of the impact of the updated guideline and provide more time for adoption by authors who may have been unaware of ARRIVE 2.0 guidelines. Second, although we observed sequential improvements in reporting quality across Pre-ARRIVE 1.0, Post-ARRIVE 1.0, and Post-ARRIVE 2.0, no causal relationship can be concluded between the publication of the ARRIVE 1.0 and 2.0 guidelines and reporting quality. We must consider potential time effects. Is the enhancement solely due to ARRIVE 2.0 itself, or has awareness gradually influenced adherence since the publication of ARRIVE 1.0? Third, as this study primarily aimed to explore adherence to ARRIVE guidelines in the ten journals that released these guidelines, we did not randomly select another ten journals with comparable sample sizes, time dimensions, and impact factors for an adherence comparison. Instead, we relied on existing literature for our comparison. While this approach provides valuable insights, it lacks the head-to-head comparison that a randomized selection would offer. Fortunately, we identified a literature source with a similar methodology, allowing us to draw meaningful comparisons, but methodological differences may impact outcomes (24). Finally, regarding the mandatory nature of the statement in the author’s instructions, because the study covers a long period, journals have likely changed their instructions during this time. Our analysis is based only on information viewed on February 19, 2024, so the results represent a static situation at that time point.
Conclusions
The reporting quality of interventional animal experiments improved after the release of the ARRIVE 1.0 and 2.0 guidelines. However, a significant gap in complete reporting of animal research persists, even within the journals that published the ARRIVE guidelines. While the adoption of the ARRIVE guidelines by journals is improving, the level of advocacy remains insufficient. In the future, journals should be proactive in increasing the adoption of the ARRIVE guidelines. Possible strategies include using more assertive language in the author’s instructions, requiring the ARRIVE 2.0 compliance form as part of the author’s submission documents, and incorporating the completed form into the checklists used by editors and reviewers (at a minimum, focusing on the Essential 10 items outlined in ARRIVE 2.0).
Acknowledgments
We acknowledge Bob P. Hermans for reproducing the draft of Figure 2 and for his critical and valuable comments on this study.
Funding: None.
Footnote
Reporting Checklist: The authors have completed the STROBE reporting checklist. Available at https://cdt.amegroups.com/article/view/10.21037/cdt-24-413/rc
Peer Review File: Available at https://cdt.amegroups.com/article/view/10.21037/cdt-24-413/prf
Conflicts of Interest: All authors have completed the ICMJE uniform disclosure form (available at https://cdt.amegroups.com/article/view/10.21037/cdt-24-413/coif). Y.L., F.Y., and B.S. serve as full-time staff of AME Publishing Company (publisher of Cardiovascular Diagnosis & Therapy). K.Z. serves as an unpaid Associate and Guest Editor of Cardiovascular Diagnosis & Therapy from September 2024 to August 2026, and a full-time staff of AME Publishing Company (publisher of Cardiovascular Diagnosis & Therapy). The other authors have no conflicts of interest to declare.
Ethical Statement: The authors are accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved.
Open Access Statement: This is an Open Access article distributed in accordance with the Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International License (CC BY-NC-ND 4.0), which permits the non-commercial replication and distribution of the article with the strict proviso that no changes or edits are made and the original work is properly cited (including links to both the formal publication through the relevant DOI and the license). See: https://creativecommons.org/licenses/by-nc-nd/4.0/.
References
- He Z, Xing R, Fang X, et al. On laboratory animal welfare, animal experiments and alternative methods of animal experiments. Lab Anim Sci Manage 2005;22:61-4.
- Brockhurst JK, Villano JS. The Role of Animal Research in Pandemic Responses. Comp Med 2021;71:359-68. [Crossref] [PubMed]
- Sandercock P, Roberts I. Systematic reviews of animal experiments. Lancet 2002;360:586. [Crossref] [PubMed]
- Perel P, Roberts I, Sena E, et al. Comparison of treatment effects between animal experiments and clinical trials: systematic review. BMJ 2007;334:197. [Crossref] [PubMed]
- Jilka RL. The Road to Reproducibility in Animal Research. J Bone Miner Res 2016;31:1317-9. [Crossref] [PubMed]
- Macleod M, Mohan S. Reproducibility and Rigor in Animal-Based Research. ILAR J 2019;60:17-23. [Crossref] [PubMed]
- Kilkenny C, Browne WJ, Cuthill IC, et al. Improving bioscience research reporting: the ARRIVE guidelines for reporting animal research. Osteoarthritis Cartilage 2012;20:256-60. [Crossref] [PubMed]
- Kilkenny C, Browne WJ, Cuthill IC, et al. Improving bioscience research reporting: The ARRIVE guidelines for reporting animal research. J Pharmacol Pharmacother 2010;1:94-9. [Crossref] [PubMed]
- Kilkenny C, Browne WJ, Cuthill IC, et al. Improving bioscience research reporting: the ARRIVE guidelines for reporting animal research. PLoS Biol 2010;8:e1000412. [Crossref] [PubMed]
- Kilkenny C, Browne WJ, Cuthill IC, et al. Improving bioscience research reporting: the ARRIVE guidelines for reporting animal research. Vet Clin Pathol 2012;41:27-31. [Crossref] [PubMed]
- Percie du Sert N, Hurst V, Ahluwalia A, et al. The ARRIVE guidelines 2.0: updated guidelines for reporting animal research. BMJ Open Sci 2020;4:e100115. [PubMed]
- Percie du Sert N, Hurst V, Ahluwalia A, et al. The ARRIVE guidelines 2.0: Updated guidelines for reporting animal research. Exp Physiol 2020;105:1459-66. [Crossref] [PubMed]
- Percie du Sert N, Hurst V, Ahluwalia A, et al. The ARRIVE guidelines 2.0: Updated guidelines for reporting animal research. BMC Vet Res 2020;16:242. [Crossref] [PubMed]
- Percie du Sert N, Hurst V, Ahluwalia A, et al. The ARRIVE guidelines 2.0: updated guidelines for reporting animal research. J Physiol 2020;598:3793-801. [Crossref] [PubMed]
- Percie du Sert N, Hurst V, Ahluwalia A, et al. The ARRIVE guidelines 2.0: Updated guidelines for reporting animal research. J Cereb Blood Flow Metab 2020;40:1769-77. [Crossref] [PubMed]
- Percie du Sert N, Hurst V, Ahluwalia A, et al. The ARRIVE guidelines 2.0: Updated guidelines for reporting animal research. Br J Pharmacol 2020;177:3617-24. [Crossref] [PubMed]
- Percie du Sert N, Hurst V, Ahluwalia A, et al. The ARRIVE guidelines 2.0: Updated guidelines for reporting animal research. PLoS Biol 2020;18:e3000410. [Crossref] [PubMed]
- Simera I, Altman DG, Moher D, et al. Guidelines for reporting health research: the EQUATOR network's survey of guideline authors. PLoS Med 2008;5:e139. [Crossref] [PubMed]
- Zhao B, Jiang Y, Zhang T, et al. Quality of interventional animal experiments in Chinese journals: compliance with ARRIVE guidelines. BMC Vet Res 2020;16:460. [Crossref] [PubMed]
- Gkrouzoudi A, Tsingotjidou A, Jirkof P. A systematic review on the reporting quality in mouse telemetry implantation surgery using electrocardiogram recording devices. Physiol Behav 2022;244:113645. [Crossref] [PubMed]
- Uiterwijk M, Vis A, de Brouwer I, et al. A systematic evaluation on reporting quality of modern studies on pulmonary heart valve implantation in large animals. Interact Cardiovasc Thorac Surg 2020;31:437-45. [Crossref] [PubMed]
- Abbas TO, Elawad A, Pullattayil S AK, et al. Quality of Reporting in Preclinical Urethral Tissue Engineering Studies: A Systematic Review to Assess Adherence to the ARRIVE Guidelines. Animals (Basel) 2021;11:2456. [Crossref] [PubMed]
- Ding F, Hu K, Liu X, et al. Quality of reporting and adherence to the ARRIVE guidelines 2.0 for preclinical degradable metal research in animal models of bone defect and fracture: a systematic review. Regen Biomater 2022;9:rbac076. [Crossref] [PubMed]
- Liu H, Gielen MJCAM, Bosmans JWAM, et al. Inadequate awareness of adherence to ARRIVE guidelines, regarding reporting quality of hernia models repaired with meshes: a systematic review. Hernia 2022;26:389-400. [Crossref] [PubMed]
- Percie du Sert N, Ahluwalia A, Alam S, et al. Reporting animal research: Explanation and elaboration for the ARRIVE guidelines 2.0. PLoS Biol 2020;18:e3000411. [Crossref] [PubMed]
- Knüppel H, Metz C, Meerpohl JJ, et al. How psychiatry journals support the unbiased translation of clinical research. A cross-sectional study of editorial policies. PLoS One 2013;8:e75995. [Crossref] [PubMed]
- Koo TK, Li MY. A Guideline of Selecting and Reporting Intraclass Correlation Coefficients for Reliability Research. J Chiropr Med 2016;15:155-63. [Crossref] [PubMed]
- García-González M, Muñoz F, González-Cantalapiedra A, et al. Systematic Review and Quality Evaluation Using ARRIVE 2.0 Guidelines on Animal Models Used for Periosteal Distraction Osteogenesis. Animals (Basel) 2021;11:1233. [Crossref] [PubMed]
- Statistical Annex. World economic situation and prospects 2023. Available online: https://desapublications.un.org/file/1113/download
- Chalmers I, Glasziou P. Avoidable waste in the production and reporting of research evidence. Lancet 2009;374:86-9. [Crossref] [PubMed]
- Glasziou P, Altman DG, Bossuyt P, et al. Reducing waste from incomplete or unusable reports of biomedical research. Lancet 2014;383:267-76. [Crossref] [PubMed]
- Guo A, Zheng Y, Zhong Y, et al. Effect of chitosan/inorganic nanomaterial scaffolds on bone regeneration and related influencing factors in animal models: A systematic review. Front Bioeng Biotechnol 2022;10:986212. [Crossref] [PubMed]
- Loder E, Loder S, Cook S. Characteristics and publication fate of unregistered and retrospectively registered clinical trials submitted to The BMJ over 4 years. BMJ Open 2018;8:e020037. [Crossref] [PubMed]
- Song J, Solmi M, Carvalho AF, et al. Twelve years after the ARRIVE guidelines: Animal research has not yet arrived at high standards. Lab Anim 2024;58:109-15. [Crossref] [PubMed]
- Hackam DG, Redelmeier DA. Translation of research evidence from animals to humans. JAMA 2006;296:1731-2. [Crossref] [PubMed]
- Williams JL, Chu HC, Lown MK, et al. Weaknesses in Experimental Design and Reporting Decrease the Likelihood of Reproducibility and Generalization of Recent Cardiovascular Research. Cureus 2022;14:e21086. [Crossref] [PubMed]
- Federal Ministry of Education and Research. Zweite Änderung der Richtlinie zur Förderung von Projekten zum Thema ‘Alternativmethoden zum Tierversuch’. Available online: https://www.bmbf.de/bmbf/shareddocs/bekanntmachungen/de/2024/02/2024-02-07-%C3%84nderungsbekanntmachung-Tierversuch.html. Bundesanzeiger vom 07.02.2024.
- Ma B, Xu JK, Wu WJ, et al. Survey of basic medical researchers on the awareness of animal experimental designs and reporting standards in China. PLoS One 2017;12:e0174530. [Crossref] [PubMed]
- Reichlin TS, Vogt L, Würbel H. The Researchers' View of Scientific Rigor-Survey on the Conduct and Reporting of In Vivo Research. PLoS One 2016;11:e0165999. [Crossref] [PubMed]
- ARRIVE guidelines (published in 2010). Resources. Available online: https://www.arriveguidelines.org/resources
- Did a change in Nature journals' editorial policy for life sciences research improve reporting? BMJ Open Sci 2019;3:e000035. [PubMed]
- Zhang T, Yang J, Bai X, et al. Endorsement of Animal Research: Incorporation of In Vivo Experiments (ARRIVE) Guidelines/Gold Standard Publication Checklist (GSPC) by Chinese journals: A survey of journals' instructions for authors and editors. Lab Anim 2019; Epub ahead of print. [Crossref] [PubMed]
- Novak AL, Shaw DJ, Clutton RE. Animal welfare requirements in publishing guidelines. Lab Anim 2022;56:561-75. [Crossref] [PubMed]