Linking eight Chinese Lunar New Year customs to key research principles makes statistical rigor memorable.
Eight festive traditions, from clarifying the research question to interpreting reports, guide more trustworthy research.
The eight-day Spring Festival countdown offers a cultural roadmap for rigorous research.
Blending cultural narrative with statistical guidance helps researchers avoid pitfalls and boost credibility.
Feng G., Wang H., Zhang T., et al. (2026). A Chinese Lunar New Year countdown to trustworthy medical research: Eight traditions, eight statistical checkpoints. The Innovation Medicine 4:100197. https://doi.org/10.59717/j.xinn-med.2026.100197