In Black Box We Trust: Machine Learning-Based Record Screening for Systematic Reviews

Dec 1, 2022
Black Box in Machine Learning

I had an amazing two days at Search Solutions 2022, with a lot of lively discussions and punches thrown, landed and missed :D I never miss Search Solutions; it is the only conference I attend every year.

I’m not going to keep you waiting; I’m going straight to the point. It is time for us to start trusting machine learning-based features for the searching and screening stages of systematic reviews. Believe it or not, we will have no choice but to gradually adopt them into our routine workload or fall behind our colleagues.

Using devices and technologies with Machine Learning (ML) features in daily life and work is inevitable. Sometimes you can choose whether to use them (you can turn off your smartphone), and sometimes you can’t (your bank’s chatbot, which connects you to a human only once you get mad).

If you are conducting a systematic review, you can choose whether to use ML-based technologies. Before writing this post, I spoke to many colleagues who are big fans of automation, but that does not necessarily mean they like or use ML-based features!

What is the Difference Between Automation and ML?

Automation, of course, differs from ML. In automation, the machine follows a pre-set sequence of steps and rules (algorithms) to achieve a known outcome or complete a task, usually a repetitive one, again and again with no improvement in performance. ML, by contrast, uses machines to learn from data and improve their performance on a task.
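To make the contrast concrete, here is a minimal Python sketch. Everything in it is made up for illustration: the keyword rule, the `WordScoreLearner` class and the example titles are my own assumptions, and no real screening tool works this simply. The point is only that the rule behaves the same forever, while the learner changes its answers as it sees more labelled records.

```python
import re
from collections import Counter


def tokenize(text):
    """Lower-case the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


# Automation: a fixed, hand-written rule (hypothetical keywords).
# It gives the same answer for the same title, however many records it has seen.
RULE_KEYWORDS = {"randomized", "trial"}


def rule_based_include(title):
    return bool(RULE_KEYWORDS & set(tokenize(title)))


# ML (toy version): word scores are learned from human include/exclude
# decisions, so predictions can change as more labelled records arrive.
class WordScoreLearner:
    def __init__(self):
        self.scores = Counter()

    def learn(self, title, include):
        # Each word in an included title gains a point; excluded titles lose one.
        for word in set(tokenize(title)):
            self.scores[word] += 1 if include else -1

    def predict(self, title):
        # Include only when the summed word scores are positive.
        return sum(self.scores[w] for w in set(tokenize(title))) > 0


learner = WordScoreLearner()
before = learner.predict("Statins cohort follow-up")   # nothing learned yet
learner.learn("Cohort study of statins in adults", True)
learner.learn("Editorial on hospital funding", False)
after = learner.predict("Statins cohort follow-up")    # uses the learned scores
```

The rule never picks up the statins record because nobody wrote "statins" into it; the learner does, but only after a human has labelled a similar record.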

In a systematic review context, the data could be the records, the words in the records, the density of words in each record, and the distance of words from each other. If no human decision is involved, this data set could be enough for an unsupervised ML system. If one or more humans screen some of the records and make include/exclude decisions (that is, label the data), those decisions become part of the data, and the resulting data set could be a case for a semi-supervised or supervised ML system.
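As a rough illustration of what such data might look like, the Python sketch below turns one made-up record into two simple numbers: the density of a word and the distance between two words. The function names, the example record and the "include" label are all assumptions of mine for illustration; real tools use far richer features.

```python
import re


def tokenize(text):
    """Lower-case the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def term_density(tokens, term):
    """Share of the record's words that equal `term` (0.0 for an empty record)."""
    return tokens.count(term) / len(tokens) if tokens else 0.0


def min_distance(tokens, a, b):
    """Smallest gap, counted in words, between `a` and `b`; None if either is missing."""
    pos_a = [i for i, t in enumerate(tokens) if t == a]
    pos_b = [i for i, t in enumerate(tokens) if t == b]
    if not pos_a or not pos_b:
        return None
    return min(abs(i - j) for i in pos_a for j in pos_b)


# A made-up record title, used only for illustration.
record = "Exercise therapy for chronic pain: exercise reduces pain in adults"
tokens = tokenize(record)

features = {
    "density_exercise": term_density(tokens, "exercise"),
    "distance_exercise_pain": min_distance(tokens, "exercise", "pain"),
}

# Without a human decision, the features alone are what an unsupervised
# system would see.
unlabelled_example = features

# Once a human screens the record, the decision becomes part of the data,
# which is what a supervised or semi-supervised system learns from.
labelled_example = (features, "include")
```

The only difference between the two examples at the end is the human decision attached to the second one.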

A good example of automation in systematic reviews is using a set filter in EndNote to find duplicates or using the EndNote tool to find full texts. Record screening


An Evidence Scientist with a Pinch of Career and Life Lessons