Stop searching, and you will find it: Search-Resistant Concepts in Systematic Searching
You have probably heard this saying, and you have probably experienced it too. When you lose something and concentrate on finding it, it stays hidden. Only when you stop searching does it appear in front of you. Is it another Murphy’s law? I don’t know. What I do know is that we use this idea in practice in systematic searching. I call the concepts it applies to Search-Resistant Concepts.
Search-Resistant Concepts (SRCs) are concepts that, when added to a search, make it more likely to miss relevant records. In other words, they bias the search towards specificity and reduce its sensitivity by an unpredictable amount. The best examples are Clinical Outcomes. The three main reasons such concepts exist are poor reporting (Visibility and Searchability), lack of standard terminology (Indexability), and our lack of knowledge (Ignorance: Known Unknowns).
Reason 1: Visibility and Searchability (Reporting in Literature)
The first rule of finding a term through free-text or natural-language searching is that it should exist in writing, “somewhere” in the paper. By ‘somewhere’, I mean the parts of the paper or, in database terms, the fields of the record. Not all parts of a paper are visible and findable through the usual systematic searching methods. Here is why:
- Thanks to the horrible open science practice in medicine during the past century, we can only see and read the title and abstract of most academic papers for free. As a result, most search interfaces and databases can only index and search such ‘visible’ parts.
- Again thanks to terrible practice by academic publishers, authors are limited to abstracts of 150–400 words. You have to squeeze your 300-page PhD thesis into an 8-page journal paper; if that’s not enough, you must squeeze those pages into a 250-word abstract. Since many authors think more about publishing than findability, their titles and abstracts do not contain all the relevant, standard and known terminology. The poor audience has to find the published work based only on what the authors put in the title and abstract. Let’s say you have measured 30 outcomes in your randomised controlled trial, and you have a 250-word limit to write your Objective, Methods, Results, and Conclusion in the Abstract. How many outcomes will you be able to report (=make searchable)?
- Last but not least, the publishing industry is rigid and dead. The publishers usually don’t follow a living model, so it is impossible to retrospectively correct or standardise all the previous papers to make them more visible or indexable. It would not benefit the publishers because they get paid for providing access to content.
The most famous search-resistant concepts in medical systematic searching are Outcomes and Controls/Comparators. This does not mean that the reporting quality of the other PICOS concepts in clinical publications is acceptable, but the reporting of outcomes, which usually appear in the results section of the abstract and full text, is the worst. Reporting of the Control ranks second.
Drop search-resistant concepts from PICOS as long as the number of search results remains manageable.
Example: Let’s say there are 20 relevant studies in the literature with Outcome A for the PICOS Question B; however, only 6 of them reported this outcome in the searchable part of the paper. If you add Outcome A to your PICOS search, you will probably find these 6 studies, but you will miss the remaining 14. However, if you go with PICS, PIS, or PI and drop the outcome (O) instead of searching PICOS, you will find all 20 studies! It’s like magic. So to find them, you stop searching for them.
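The arithmetic in this example can be sketched in a few lines of Python. This is only a toy illustration: the study identifiers are made up, and it assumes, as the example does, that the search without the Outcome block retrieves all 20 relevant studies.

```python
# Toy data (hypothetical): 20 relevant studies, of which only 6 mention
# Outcome A in the searchable part of the record (title/abstract).
relevant = {f"study_{i}" for i in range(1, 21)}
outcome_in_abstract = {f"study_{i}" for i in range(1, 7)}


def recall(retrieved, relevant_set):
    """Share of the relevant studies that the search retrieved (sensitivity)."""
    return len(retrieved & relevant_set) / len(relevant_set)


# Search WITH the Outcome block: only studies that report Outcome A
# in the searchable fields can match.
with_outcome = relevant & outcome_in_abstract

# Search WITHOUT the Outcome block: assumed here to retrieve every
# relevant study, as in the example above.
without_outcome = set(relevant)

print(recall(with_outcome, relevant))       # 0.3
print(recall(without_outcome, relevant))    # 1.0
print(len(relevant - with_outcome))         # 14 studies missed
```

In a real search, the version without the Outcome block would also retrieve more irrelevant records, which is the price paid for the higher sensitivity.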
Alongside O (Outcomes), we can also drop C from PICOS and take PIS to the search box. In some cases, we may also get rid of S and put only PI into the search box.
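As a rough sketch of what dropping blocks means for the search string, the snippet below combines terms with OR inside a block and AND between the blocks that are kept. The terms, the `build_query` helper, and the generic Boolean syntax are all illustrative assumptions; they are not a validated strategy or any particular database’s syntax.

```python
# Hypothetical search blocks; the terms are placeholders, not a tested strategy.
blocks = {
    "P": ["schizophrenia", "psychosis"],
    "I": ["clozapine"],
    "C": ["placebo"],
    "O": ["clinical improvement"],
    "S": ["randomised controlled trial"],
}


def quote(term):
    """Wrap multi-word terms in double quotes so they are searched as phrases."""
    return f'"{term}"' if " " in term else term


def build_query(blocks, keep):
    """OR the terms within each kept block, then AND the kept blocks together."""
    groups = []
    for letter in keep:
        terms = " OR ".join(quote(t) for t in blocks[letter])
        groups.append(f"({terms})")
    return " AND ".join(groups)


print(build_query(blocks, "PICOS"))  # all five blocks
print(build_query(blocks, "PIS"))    # Outcome and Control dropped
print(build_query(blocks, "PI"))     # Study design dropped as well
```

Each block that is removed takes one AND condition out of the query, so the search can only retrieve the same records or more.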
Reason 2: Standardisation of Terminology and Indexability
As you may have noticed, we search both free-text terms (usually in the title and abstract) and controlled vocabularies (such as MeSH and Emtree). Many non-experts ask: what is the point of adding controlled vocabularies?
Some controlled vocabulary terms are assigned by human indexers, or in a semi-automated or automated way, based on the ‘full text’ of the journal papers, not just the title and abstract. Controlled vocabulary therefore widens the horizon of retrieval compared with free-text searching, which is limited to the visible parts of the papers (title and abstract). We should be grateful to the librarians who liberate these hidden parts of knowledge from the paywalled full texts and turn them into searchable content in the form of controlled vocabularies. While this is a blessing, automated full-text indexing can also be a curse, because it adds to the number of irrelevant results.
While a sub-specialty of librarians called ‘indexers’ do their best to help us make the content visible, their efforts would pay off only if there is standard or semi-standard terminology widely used in the literature. Repetition of such standard terminology in an indexer’s eye (human or machine) would mean the concept is important and is being used frequently and in a standard way. So, it can be indexed.
Standardisation of terminology increases the chance of indexability and, in turn, findability.
So lack of standardisation is the enemy of searchability and findability. ‘Clinical improvement’ is one of the most important primary outcomes in many systematic reviews, but can you imagine in how many ways it might have been measured and reported in the literature? If I’m searching for interventions that clinically improve the patient, can I add ‘clinical improvement’ to my search strategy? Do we have any idea how many ways exist to report an outcome?
Reason 3: Acknowledging our ignorance or lack of knowledge (Known unknowns)
Our lack of knowledge of concepts is another reason why we cannot add a search block for a concept with confidence. Working with 18 Cochrane groups, I noticed that there are so many emerging new interventions related to all conditions. So many that it is impossible to keep up and know them. When I finished part of my PhD, I reported that there are about 2800 interventions related to schizophrenia tested in randomised/controlled clinical trials. None of the experts was aware of all or most of them. This finding emphasises the value of knowledge discovery based on indexing by information professionals.
So we can’t search for a concept when we are not aware of all or most of the relevant terminology. For example, a review team wanted to find all the interventions for a low-global-prevalence disease. They spent 4 weeks adding to a long list of interventions until I suggested they leave the Intervention block (I) out of their PICOS, because the number of search results was still manageable without it (PS rather than PICOS). At the end of the review, they confessed that what they had been trying to achieve at the beginning (a list of interventions) was supposed to be part of their results at the end. We identified interventions that none of the review team members knew of. No matter how expert you are, acknowledge your ignorance.
We can only include the known search terms in a search strategy with confidence (Known Knowns). Knowing our ignorance (Known Unknowns), we can usually and confidently drop a concept from the search in order to discover these unknowns at later stages of the systematic review (Knowledge Discovery). A good systematic review usually discovers some knowledge (Known Knowns) and more ignorance (Known Unknowns).
Compromises in Search
However academic all this sounds, practice may differ from what we preach. I have run many searches where the team members and I were sure that it was not the best way to search because we were not aware of all the relevant terminology (known unknowns); however, to reduce the number of search results to a manageable level, we had no choice but to add a block containing only known terms. Good examples are systematic reviews of Pathophysiology or Psychotherapy. Would anyone dare to say they can list all psychotherapies or all pathophysiology terms? Yet look around at the number of systematic reviews with search blocks for both concepts. It is always good to add a limitations section to your final report to confess that no search is perfect = Don’t Judge Me, Peer-Reviewer!
Conclusion
So it is true that even in systematic searching if you stop searching, you may find what you are looking for. It’s contrary to ‘if you build it, they will come’. In Search-Resistant Concepts, if you build a block, they may not come, or only some will come!
The reasons we cannot search all concepts of a question are 1. poor reporting of terminology in searchable parts of the database records, 2. lack of standard terminology for the concept that affects the indexability adversely, and 3. lack of knowledge of the concept.
While dropping Search-Resistant Concepts is the best solution, it would only work if the number of search results for the systematic review is manageable based on the available time and human and monetary resources.
Acknowledgement
Two years after publishing this blog post, its mature version was published as a paper in the BMJ Evidence-Based Medicine journal (free accepted version here):
Shokraneh F. Stop searching and you will find it: Search-Resistant Concepts in systematic review searches. BMJ Evidence-Based Medicine 2024. doi: 10.1136/bmjebm-2023-112798. PMID: 39107090.