Performance of Artificial Intelligence in Evidence Synthesis and Systematic Reviews is Affected by Multiplicity and Duplication

Farhad
6 min read · Sep 19, 2024
Document Multiplicity and Data Duplication. Created by: https://deepai.org/machine-learning-model/text2img

Not new and not good news!

Since duplication and multiplicity are connected, I put them together in one post. They are so closely connected that I have seen people exclude multiple, but different, reports of the same study, calling them duplicates! That's not what a duplicate is — multiple reports of one study should be linked, not thrown away (see the sketch below).
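
To make the distinction concrete, here is a minimal sketch with hypothetical data (the study ID, titles, and field names are made up for illustration): multiplicity means several different reports of one study, which we keep and link; duplication means the same report retrieved more than once, where we keep one copy and drop the rest.

```python
# Multiplicity: different reports of the SAME study -> link them, keep all,
# but analyse the study only once.
study = {
    "study_id": "TRIAL-001",  # hypothetical identifier
    "reports": [
        {"title": "Trial X: primary outcomes",      "source": "journal article", "year": 2021},
        {"title": "Trial X: long-term follow-up",   "source": "journal article", "year": 2023},
        {"title": "Trial X protocol",               "source": "registry entry",  "year": 2019},
    ],
}

# Duplication: the SAME report retrieved from two databases -> keep one record,
# remove the other before screening.
duplicate_records = [
    {"title": "Trial X: primary outcomes", "source": "MEDLINE record"},
    {"title": "Trial X: primary outcomes", "source": "Embase record"},
]
```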

Duplicate Records in Screening and AI

I have babbled about duplicates and deduplication at length, from the typology of duplicates at the record level and the data level to why EndNote and other programs cannot find all the duplicates. Here is the summary:

Finding and removing duplicates was already difficult for human reviewers and machines. Now we know that duplicates can also skew the relevance weighting of records and bias the machine. So, we must remove almost all duplicates before asking AI to screen the records! I'm firmly against perfectionism in deduplication, and in another post I explained that we don't need to remove 100% of duplicate records; but in light of this new information, we may be better off doing our best to remove all duplicates if we plan to use AI. Ironically, AI could be the best help for detecting duplicate records, but many tools still ignore this important step…
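
As an illustration only, a minimal pre-screening deduplication pass could look like the sketch below. This is not the method of any particular tool: the field names ("title", "doi", "year") and the 0.95 title-similarity threshold are my assumptions for the example.

```python
from difflib import SequenceMatcher


def normalize(text: str) -> str:
    """Lowercase and strip punctuation so small formatting differences don't matter."""
    return "".join(ch for ch in text.lower() if ch.isalnum() or ch.isspace()).strip()


def is_duplicate(rec_a: dict, rec_b: dict, title_threshold: float = 0.95) -> bool:
    """Treat two records as duplicates if their DOIs match exactly, or if their
    normalized titles are nearly identical and the publication years agree."""
    doi_a, doi_b = rec_a.get("doi", "").lower(), rec_b.get("doi", "").lower()
    if doi_a and doi_a == doi_b:
        return True
    title_sim = SequenceMatcher(
        None, normalize(rec_a["title"]), normalize(rec_b["title"])
    ).ratio()
    return title_sim >= title_threshold and rec_a.get("year") == rec_b.get("year")


def deduplicate(records: list[dict]) -> list[dict]:
    """Keep the first occurrence of each record and drop near-identical ones."""
    kept: list[dict] = []
    for rec in records:
        if not any(is_duplicate(rec, k) for k in kept):
            kept.append(rec)
    return kept


# Hypothetical example: the same report exported from two databases
records = [
    {"title": "Aspirin for primary prevention: a randomised trial", "doi": "10.1000/xyz1", "year": 2020},
    {"title": "Aspirin for Primary Prevention - A Randomised Trial.", "doi": "", "year": 2020},
]
print(len(deduplicate(records)))  # -> 1
```

Even a simple pass like this leaves borderline cases for a human to check, which is exactly why near-perfect deduplication matters more once an AI screener is in the loop.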

