Priority Screening for Systematic Reviews: All you need to know
Love it or hate it, machine learning and artificial intelligence have hacked their way into our lives, and those who use them to their advantage live longer and work shorter hours.
If you are one of those systematic reviewers still using EndNote to screen your search results, I must respectfully point you toward the industry news: Priority Screening is here to stay.
We cannot afford to waste more hours on screening. We are done wading through all those sensitive search results; the machine is here to help without compromising your independence. And God knows we’ve paid our dues!
What is Priority Screening?
In the systematic review and evidence synthesis context, priority screening means screening first the search results that a machine learning model identifies as relevant or undecided. The remaining results can either be ignored or be viewed and marked as irrelevant by a single human reviewer.
Priority Screening has three requirements:
- Human Screener: The human reviewer starts the routine screening process by including/excluding the records.
- Machine Learning Model: After a certain number of human eligibility decisions, a machine learning (ML) model can be built (automatically by the screening system or on human command) to rank the most relevant records at the top of the screening list. As screening goes on, this ML model can be updated, either automatically by the screening system (active learning) or on human command.
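To make the ranking idea concrete, here is a deliberately minimal sketch in plain Python. Real screening tools use proper text classifiers; this toy stand-in (my own illustration, not any tool's actual implementation) just scores an unscreened record by how much its vocabulary resembles the human's "include" decisions versus the "exclude" decisions, then sorts the unscreened records so the most relevant-looking ones come first:

```python
from collections import Counter

def tokenize(text):
    return text.lower().split()

def build_model(included, excluded):
    # Toy "model": word frequencies learned from the human's include/exclude decisions.
    inc = Counter(w for t in included for w in tokenize(t))
    exc = Counter(w for t in excluded for w in tokenize(t))
    return inc, exc

def score(record, model):
    # Higher score = vocabulary closer to included records than to excluded ones.
    inc, exc = model
    words = tokenize(record)
    return sum(inc[w] for w in words) - sum(exc[w] for w in words)

def rank_unscreened(unscreened, model):
    # Most relevant-looking records first: this is the "priority" ordering.
    return sorted(unscreened, key=lambda r: score(r, model), reverse=True)

# Hypothetical titles, for illustration only.
included = ["randomised trial of aspirin for stroke prevention",
            "randomised controlled trial of statins"]
excluded = ["protein expression in mouse liver cells",
            "cell culture assay of enzyme activity"]
model = build_model(included, excluded)

unscreened = ["enzyme assay in cell culture",
              "randomised trial of aspirin dosing"]
ranked = rank_unscreened(unscreened, model)
print(ranked[0])  # the trial-like record is ranked first
```

In an active-learning loop, each new human decision would be added to `included`/`excluded` and the model rebuilt, so the ranking keeps improving as screening goes on.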
- Progress Graph: Screening continues until screening new records no longer leads to the inclusion of any relevant record. A stopping rule can be set as part of the review protocol, for example: “If during priority screening the reviewer excludes 100 consecutive records, the screening will stop”. Since counting exclusions by hand during screening is impractical, a Progress Graph can be a good indicator. If you are interested in knowing more about when to stop, see Callaghan & Müller-Hansen (2020), which discusses four ways to stop screening and their shortcomings.
Depending on the context, stopping does not always mean leaving the remaining records unscreened: you can switch to single-reviewer screening, or, with a reduced mental workload, sift through the remainder more rapidly, knowing that the majority will be irrelevant.
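The "100 consecutive exclusions" protocol rule mentioned above is easy to express in code. A minimal sketch (the threshold of 100 is just the example figure from the text; your protocol may set a different one):

```python
def should_stop(decisions, threshold=100):
    """Return True once a run of `threshold` consecutive exclusions occurs.

    decisions: list of booleans in screening order,
               True = include, False = exclude.
    """
    run = 0
    for d in decisions:
        run = 0 if d else run + 1  # any inclusion resets the run
        if run >= threshold:
            return True
    return False

# One inclusion followed by 100 straight exclusions triggers the rule:
print(should_stop([True] + [False] * 100))  # True
```

A screening tool would evaluate this after every decision; "stop" here means switching to single-reviewer mode or rapid sifting, as described above, not necessarily abandoning the remaining records.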
What is a Progress Graph?
A progress graph is a simple graph where X is the number of records screened and Y is the cumulative number of records identified as relevant. When the curve starts to flatten, it is a good time to stop or to continue with one reviewer.
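The data behind the graph is just a running total of inclusions. A small sketch (my own illustration of the idea, not any tool's implementation) that builds the Y-values you would plot against the record count:

```python
def progress_curve(decisions):
    """Progress-graph data: for each record screened (x-axis position),
    the cumulative number of inclusions so far (y-axis value).

    decisions: list of booleans in screening order,
               True = include, False = exclude.
    """
    y, total = [], 0
    for d in decisions:
        total += d          # True counts as 1, False as 0
        y.append(total)
    return y

print(progress_curve([True, False, True, False, False]))
# [1, 1, 2, 2, 2]  -- the trailing plateau is the "flat graph"
```

Plotting record index against these values gives the progress graph; a long trailing plateau is the flattening that signals a possible stopping point.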
Priority screening in the systematic review context is similar to the saturation concept in qualitative research, where interviewing additional participants adds no new content beyond the previous interviews, so further interviews are not required.
What are the benefits of priority screening?
- Sensitive search and unlimited search results: you don’t have to limit your search strategies because your team cannot manage the high number of results. The search can be sensitive and comprehensive.
- Saving time: for most reviews, during priority screening your team may need to screen only 1–20% of the search results rather than all of them; in rare cases, up to 50% of records should be enough. That will save a lot of time and resources.
- Rapid screening: highlighting keywords, priority screening based on relevancy ranking, subject expertise, and user-friendliness of the automation tool are the four main contributing factors to the screening speed.
- Acceptability as part of the PRISMA 2020 flow diagram: The exclusion of records by automation tools is now part of the PRISMA flow diagram, so the evidence synthesis community has accepted it as a method.
What automation programs support Priority Screening?
EPPI-Reviewer (not free) was probably the first and has improved since its inception. Most other programmes do not provide a progress graph, making it slightly harder to stop. Any programme using ML, such as Rayyan (free), Covidence (possibly free for LMIC and fee-based), Abstrackr (free), PICO Portal (free and fee-based version), ASReview (free and open), Colandr (free), and Sysrev (free and fee-based version) can be used for priority screening.
Tsou et al. 2020 compared EPPI-Reviewer and Abstrackr if you are interested.
Will Priority Screening by Machine Replace Screening by Humans?
For Systematic Reviews: By switching from two reviewers to one reviewer after the flat-graph point, resources will be saved, but there is no replacement; it simply saves human resources.
For Scoping Reviews: It is controversial; however, at some point it may be possible to stop screening in a scoping/mapping review after the flat-graph point without looking at the remaining records. Again, this saves human resources rather than replacing them.
For Living Systematic Reviews (LSRs): When updating an LSR, if machine and human have worked together to build a machine learning model during the baseline search, the machine might be able to fully replace the human for screening the update's search results only; however, this has not been tested or validated.
Conclusion
Like any other innovation, it will take time for the majority to learn, adapt and fluently use priority screening until it becomes a routine practice.
It is important to know the golden rule in dealing with machine learning or any other input-process-outcome (IPO) system:
Garbage in, garbage out!
- The quality of the machine learning model’s decisions will only be as good as the reviewer(s) who train it. Wrong eligibility decisions will make an ML model perform no better than chance. The more effort you put into your part of the screening, the more benefit you will get in return from the machine. As suggested: “over-cautious screening = confused model = poor performance” (conversation on Twitter).
- Before setting up the automation program for priority screening, read the documentation to ensure you are doing it correctly.
- For complex reviews, you may end up screening more than 50% until you get to a safe stopping point.
- One way to continue after the flat graph would be for one reviewer (not two) to swiftly sift through the remaining records.
We should wait for more automation programs or features to reduce the prevalence of carpal tunnel syndrome among systematic reviewers.
Life will get easier for many systematic reviewers.
If you liked this blog post, please support me by pressing the green Follow button and signing up so I can write more. Thank you :D