Automating Search Updates for Systematic Reviews by Building a Living Search System: Barriers and Solutions
A systematic review is only as up to date as its search date. A review starts going out of date as soon as the search is finished, even before you have started the screening.
It is also accepted that, depending on the topic of your review, your systematic review may require weekly, monthly, quarterly, annual, or biennial update searches. One way to handle this is to assume that you will always have access to your library resources and that the librarians actually have time to run update searches for zillions of systematic reviews. If you have access to such librarians, enjoy your luxury life!
Those who don’t have access to search experts (librarians and information specialists) are trying so hard to replace them with machines! If you are one of them, I must disappoint you because this post is not about that.
However, if you care about saving resources (including your librarians’ precious time), and you have the resources to handle the steps after searches, it seems logical to build a workflow to automate the update searches.
The steps for automating the update search are as follows:
A. A search expert designs and tests the searches. A second search expert peer-reviews the search (best practice).
B. The searches are run, the results are exported, and the search strategies are saved in a dedicated account in each search interface, with the re-run frequency set so that the update search results arrive automatically in a dedicated, stable email inbox.
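If it helps to see the moving parts in one place, the setup in step B can also be written down as a small machine-readable registry that anyone on the team can read. Here is a minimal sketch in Python; every source name, saved-search name, account, and address in it is a made-up placeholder, not a recommendation of any particular setup.

```python
# A minimal registry of saved update searches: one entry per source.
# All names, accounts, and addresses below are hypothetical placeholders.
LIVING_SEARCH_REGISTRY = [
    {
        "source": "MEDLINE (Ovid)",
        "saved_search_name": "LSR_topic_v1",
        "account": "reviews@our-organisation.example",   # shared organisational account
        "alert_frequency": "monthly",
        "alert_inbox": "lsr-updates@our-organisation.example",
        "export_format": "RIS",
    },
    {
        "source": "ClinicalTrials.gov",
        "saved_search_name": None,            # no saved-search feature: run manually
        "alert_frequency": "monthly (manual)",
        "export_format": "CSV",
    },
]

def due_this_month(registry):
    """Return the sources whose update results must be handled this month."""
    return [entry["source"] for entry in registry
            if "monthly" in entry["alert_frequency"]]

print(due_this_month(LIVING_SEARCH_REGISTRY))
```

Even a plain spreadsheet with the same columns does the job; the point is that the frequencies, accounts, and the shared inbox are recorded somewhere other than one person's head.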
As simple as it sounds, you may find it more complicated than that. As Hafez of Shiraz says: “Love seemed easy at first, but soon difficulties occurred”.
Barrier 1. Not all sources support features such as personal accounts, saved searches, automatic re-running of searches, and automatic emailing of the results.
Of course, you can do this for MEDLINE, Embase, Google Scholar, etc. But can you save searches in ClinicalTrials.gov (CTG) or WHO ICTRP? We are doomed.
Solutions
First, consider this when writing your protocol and ask whether you really need all the listed sources; if not, dump them.
Second, think about whether it is possible to replace these tricky sources with an alternative source. For example, consider whether it is possible to cover WHO ICTRP and CTG through auto-alerts on CENTRAL in the Cochrane Library. This does not mean that CENTRAL can replace searching CTG or WHO ICTRP; however, it is a pragmatic solution when needed, as long as we clearly know and list the limitations of such pragmatism. Alternatively, can you run a search for a specific website or source via Google's Site Search or Google Scholar, save it in Google Alerts or Google Scholar Alerts, and receive search updates that way? Of course, it would depend on how well the sources are indexed by Google and how reliable Google's alerting system is. I guess this is another limitation we should list in the search report.
Third, if there is no other choice, a mix of automatic and manual searches can work, as long as you run the manual searches as soon as the automatic results arrive. This keeps the recording of numbers smooth, lets you report a single search date, and means the PRISMA flow diagram is updated only once per period.
Barrier 2. The frequency of auto-alerts may vary across the sources.
It would be perfect if all the sources allowed you to set the same frequency (daily, weekly, monthly, quarterly, annual, or biennial); however, not all sources offer these exact options.
Solutions
First, ignore the lack of consistency in the frequency of the updates and deal with them as they arrive. So you deal with the monthly searches every month and with the few quarterly searches every quarter. The downside of this solution is spending more time on recording the numbers and on de-duplication, and seeing or screening the same duplicate records several times.
Second, let the results arrive in the inbox and deal with them when appropriate. If your plan is to update every month, but some of the sources provide only weekly automatic emails and others monthly ones, you can ignore the weekly emails until the end of the month and review them all at once. The best way to do this is to create a Rule (in MS Outlook) or Filter (in Gmail) that assigns these emails a label and moves them out of the inbox so you can check them at the end of the month.
Barrier 3. The format in which results are received is not supported by citation management or screening programs.
Depending on how good the source is, the format of the search results you receive will vary, and you may need to spend some time on one of these solutions.
Solutions
- Reproducing the search results from the original source and downloading them in a machine-readable format. Think about clicking a link in your PubMed automatic search alert that takes you to the search results page, where you select and export the results. Or you may have to copy the unique IDs of the records from the alert email, combine them with OR, and run them as a search in the database to re-create the result set and export it (see the sketch after this list).
- Cleaning the search results semi-manually to make them readable for the citation managers. Think about the old Ovid email auto-alerts that you copy into a .txt file, replacing four spaces with two spaces and making a few other modifications, so you can import them into a citation manager.
- Creating an import filter for the citation manager that can read the search results in the format you received them in your email. Think about the import filters we create to import search results from CTG or WHO ICTRP into EndNote (a rough converter sketch follows the next paragraph).
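For the first bullet, re-creating a result set from the record IDs in an alert email can be partly scripted. A rough sketch, assuming PubMed-style alerts where each record carries a PMID; the email text and IDs below are invented for illustration:

```python
import re

# Paste the body of the alert email here; the text below is a made-up example.
alert_email_body = """
1. Some trial of something. PMID: 38012345
2. Another relevant study. PMID: 38012399
"""

# Pull out the unique record IDs (PubMed-style PMIDs in this made-up example).
ids_from_alert = sorted(set(re.findall(r"PMID:\s*(\d+)", alert_email_body)))

# Combine them with OR so the result set can be re-created in the database's
# own interface and exported in a machine-readable format.
query = " OR ".join(ids_from_alert)
print(query)   # e.g. "38012345 OR 38012399"
```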
In an ideal world where bibliographic database developers actually listened to librarians, all update search results would be available in multiple formats, with RIS (the RefMan/Reference Manager format) and Comma-Separated Values (CSV) as the default standards.
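Until that ideal arrives, a small converter can stand in for a missing import filter. A rough sketch that turns a trials-register CSV export into a minimal RIS file a citation manager can import; the file name and column names are assumptions, so check them against the columns in your actual export:

```python
import csv

def csv_to_ris(csv_path: str, ris_path: str) -> None:
    """Convert a CSV export into a minimal RIS file.

    The column names ("Title", "URL", "First Posted") are assumptions used
    only for illustration; adjust them to match your actual export.
    """
    with open(csv_path, newline="", encoding="utf-8") as src, \
         open(ris_path, "w", encoding="utf-8") as out:
        for row in csv.DictReader(src):
            out.write("TY  - GEN\n")                        # generic record type
            out.write(f"TI  - {row.get('Title', '')}\n")
            out.write(f"UR  - {row.get('URL', '')}\n")
            out.write(f"PY  - {row.get('First Posted', '')}\n")
            out.write("ER  - \n")

# Hypothetical file names for illustration.
csv_to_ris("ctg_update_search.csv", "ctg_update_search.ris")
```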
Barrier 4. Removing and recording cross-database and cross-search duplicate records
Since you will be receiving results from multiple databases and multiple updates, handling different types of duplicate records is going to be challenging, as I have discussed before; think about that scary PRISMA diagram :D. So, we may have to exclude the same duplicate over and over again across multiple updates and record the numbers for each update search.
Solutions
- Poor reviewers will suffer from recording the numbers manually for each update.
- The wiser reviewers will use one of the screening systems I listed in the previous post. Such a system can automatically detect most duplicates, record the numbers, and even generate a PRISMA diagram automatically. However, while the system detects the duplicates, you have to mark them as duplicates manually, “one by one”, which is practical only if you are dealing with a few hundred results per update, not thousands. The dilemma is that if your librarian ran the first search and was kind enough to remove the duplicates for you in EndNote, your system-generated PRISMA will not record the duplicates the librarian removed. Remember that even though the screening programs find (but do not remove) duplicates accurately “one by one”, in terms of the speed of removing duplicates (hundreds or thousands in one click after your check), none of them can compete with EndNote (see how EndNote does that).
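For the curious, the matching logic these de-duplicators rely on can be approximated in a few lines. A toy sketch that flags likely duplicates by normalised title plus publication year; real tools such as EndNote, Covidence, and Rayyan also compare authors, DOIs, journals, and page numbers, so treat this only as an illustration:

```python
import re
from collections import defaultdict

# Made-up records standing in for an exported search result set.
records = [
    {"id": 1, "title": "A Trial of Something New.", "year": "2024"},
    {"id": 2, "title": "A trial of something new",   "year": "2024"},
    {"id": 3, "title": "An Unrelated Study",         "year": "2023"},
]

def key(record):
    """Normalise the title (lowercase, strip punctuation) and pair it with the year."""
    title = re.sub(r"[^a-z0-9 ]", "", record["title"].lower()).strip()
    return (title, record["year"])

groups = defaultdict(list)
for record in records:
    groups[key(record)].append(record["id"])

duplicates = [ids for ids in groups.values() if len(ids) > 1]
print(duplicates)   # [[1, 2]] -> records 1 and 2 are probably the same study
```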
Barrier 5. Search strategies may change as the topic is living.
Let’s be honest: if we are conducting a living systematic review, it means the research on the topic is active and developing rapidly. As a result, it is very likely that new concepts or terminology will emerge. A Living Search Strategy (as we first suggested for COVID-19) is needed. So, if you add a new search term to an existing search strategy, there are several problem scenarios, each with possible solutions.
Solutions
- The new search terms do not require a retrospective or prospective search because the current search strategy already captures the relevant records without them. In other words, adding the new terms does not retrieve any “new, unique, relevant” results. That’s the loveliest scenario. Yeah, you can wish to see this scenario in your sweet dreams!
- The search terms retrieve new, unique, relevant results, but they are so new that a prospective search would cover all the relevant studies, so a retrospective search is not needed. Still, to find these new studies, we have to run the old search strategy (Line #1), then the new search strategy (Line #2), and then NOT the two sets to keep only the new results (Line #2 NOT Line #1); see the sketch after this list. Then you can revise the saved search strategies so the automatic update searches reflect the change.
- The search terms retrieve new, unique, relevant results, some very new, some not so new! It means we have simply missed relevant words in the old search strategy. Now you need to add the new words, re-run the full search, and NOT the result sets again, as explained in the previous bullet.
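The NOTting in the last two bullets can also be automated against sources that expose an API. A minimal sketch using PubMed's public E-utilities (esearch); the two strategies below are made-up placeholders, and a real run would also need pagination and polite rate limiting:

```python
import json
import urllib.parse
import urllib.request

# Run "(new strategy) NOT (old strategy)" against PubMed via the NCBI
# E-utilities, so only the records added by the new terms come back.
# The two strategies are made-up placeholders for illustration.
OLD = '"long covid"[tiab]'
NEW = '"long covid"[tiab] OR "post-acute sequelae"[tiab]'

def esearch(term: str, retmax: int = 200) -> list[str]:
    """Return the PubMed IDs retrieved by a search term (up to retmax)."""
    params = urllib.parse.urlencode(
        {"db": "pubmed", "term": term, "retmode": "json", "retmax": retmax}
    )
    url = f"https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?{params}"
    with urllib.request.urlopen(url) as response:
        return json.load(response)["esearchresult"]["idlist"]

new_only = esearch(f"({NEW}) NOT ({OLD})")
print(len(new_only), "records retrieved only by the revised strategy")
```

The same idea, (new strategy) NOT (old strategy), works inside any interface that keeps a search history; the script only saves you from pasting the combination by hand at every update.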
In heaven, where the Living Search System is in place, you will be able to add new words to the saved search strategies and get only the new results, because an accurate internal De-Duplicator will remove the need for NOTting and for manual or semi-automated de-duplication. One more thing: in heaven, you will run your search in one meta-database (such as OpenAlex), not in several databases with horrible features, so there will be no cross-database duplicates.
Barrier 6. The review protocol may change as the topic is living.
So, people change, review protocols change, and this creates a mess. There are two kinds of changes we care about regarding the search: 1. changes that affect the search strategies; 2. changes that affect the list of search sources.
Solutions
Changes to search strategies happen in one of these forms:
- Addition of new concepts or words. Needs NOTting and/or De-Duplication (see Barrier 5).
- Removal of old concepts or words. Mostly, no action is required.
- Revision of existing concepts or words. It depends on whether the change increases or decreases the number of results, but it ‘usually’ needs NOTting and/or De-Duplication (see Barrier 5).
- Addition or removal of a limit on the search. ‘Usually’ needs NOTting and/or De-Duplication (see Barrier 5).
Similar recommendations are applicable when there are changes in the list of search sources.
Barrier 7. Rigidity of the PRISMA flow diagram.
Do you know any systematic reviewer who has not felt the PRISMA diagram’s pain? Of course, it should be automated, but simply because records are handled on several platforms, some manual work on PRISMA is always required. Which platforms?
- Databases and sources where we run the search. We de-duplicate, in a way, when we use NOT as in Barriers 5 and 6. Some databases allow excluding the records of other databases (CINAHL can remove MEDLINE records; Embase can remove MEDLINE records). And if you run a search in two databases in the same interface at the same time (e.g., Embase and MEDLINE via Ovid SP in one session), there is an option to de-duplicate.
- Citation managers. When the search is run for the very first time, most librarians use EndNote to de-duplicate thousands of results at once to make life easier for the reviewers, who would otherwise de-duplicate the records one by one. EndNote is the fastest tool for removing duplicates, but not the most accurate tool for finding them.
- Screening managers. Screening programs usually have some of the most accurate de-duplicators! However, you have to remove the duplicate records one by one in these programs, which means de-duplication can be very time-consuming if librarians have not already used EndNote, which is still the fastest tool for removing duplicates.
When assessing the performance of de-duplication tools, we need to consider three factors:
1. Time to find the duplicates. Most systems are equally good.
2. Accuracy of the duplicates found. Rayyan seems to be the more sensitive and Covidence the more specific.
3. Time to remove the duplicates. EndNote seems to be the fastest.
Since we use different platforms for de-duplication at different steps (while running the first search inside the databases, after exporting the search results to EndNote, and after importing the results into Covidence), automatic recording of the numbers in the PRISMA diagram is not yet possible.
Solutions
Sorry, manual or semi-automated solutions are still the best options.
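“Semi-automated” can be as modest as a running tally that you fill in per update round and sum when you redraw the diagram. A tiny sketch; the field names and counts are invented and cover only a slice of what PRISMA asks for:

```python
# A running tally for the PRISMA flow diagram across update searches.
# The rounds and counts below are invented; replace them with your own numbers.
updates = [
    {"round": "2024-01", "identified": 412, "duplicates_removed": 130, "screened": 282},
    {"round": "2024-02", "identified":  57, "duplicates_removed":  21, "screened":  36},
]

totals = {
    field: sum(update[field] for update in updates)
    for field in ("identified", "duplicates_removed", "screened")
}
print(totals)   # paste these running totals into the PRISMA flow diagram
```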
Barrier 8. Loss of institutional access to the databases, loss of the account owner, or loss of access to the inbox.
Not everyone is privileged enough to have access to luxury databases. Most academics and students at the university have no idea how much they cost; otherwise, they would never stop searching! Anyway, what happens to the searches saved in the databases for automatic updates if we lose institutional access because we leave or change university? What if the person who created the accounts in the databases leaves, or worse? What about losing access to the email inbox?
Solutions
As long as you plan ahead, you will be fine with plans B, C, and D. It is important that at least two or three people know the process, the logins, and the inbox. Regardless, the login information and a protocol should be in place for any of these changes. It is better to use a stable, general organisational email address that does not change when people leave. Such an email address can be used for creating accounts in the search interfaces and can serve as the inbox for receiving the automatic emails containing the search results.
Barrier 9. Databases change.
Databases change constantly. We receive emails from databases all the time telling us about new features, updates to existing records, new coverage, new policies, etc. AND, most importantly, the databases’ controlled vocabularies, fields, and field tags change. So, if you have used MeSH or Emtree terms or database fields that have changed (been added, removed, or revised), you should be mindful of the implications of any change for your search strategies. Constantly.
Solutions
See Barrier 5.
Change is the only constant, and if you are conducting a living systematic review, you know better than anyone else that everything can change.
Barrier 10. The living systematic review goes into a coma or dies.
If your review lives a full life, you have nothing to worry about. If it goes into a coma, make sure to have all the ‘reproducible’ protocols in place for its return. Such protocols are best saved not only on your organisational and internal hard drives and cloud systems but also shared openly in the public domain on platforms such as the Open Science Framework (OSF), so that regardless of how many people come and go and how many institutes take over the project, the methods are there for anyone who can afford to update the review.
Conclusion
An ideal future for systematic review searches means:
- Searching one meta-database only. Will sources such as OpenAlex make this dream come true one day? Wait and see! This has been on librarians’ list of dreams for so long!
- An accurate internal De-Duplicator in the database management system that retrieves only the unique new records for any update search or after any change to the search, so that no further de-duplication is required.
In this post, I have tried to list the barriers we face, or may face, when trying to automate the update searches for systematic reviews, along with possible solutions for each barrier (where they exist). I welcome your feedback so I can keep this post updated.
If you liked this blog post, please support me by pressing the green Follow button and signing up so I can write more (see: Email Subscription Dysfunctions). Thank you :D