How to know if a journal is indexed in PubMed or Google Scholar? What's the difference?
What does indexing a journal mean?
- It means the journal's bibliographic information, abstracts, and sometimes full texts have been added to the database, and by searching the databases, you will be able to find them.
- It also means some level of light quality check has happened for a journal to meet the criteria for being indexed in the database.
- In some databases, indexing also means that humans, machines, or both have extracted or assigned (or will extract or assign) controlled vocabulary terms for each article to improve its retrieval/findability.
Why is it important for a journal to be indexed or for researchers to publish in these journals?
One word: visibility. The main, and probably the only, benefit of indexing a journal or publishing in indexed journals is making your work visible. So if you can make your work publicly available and visible (indexed), you won’t need journals at all.
Of course, peer review is another process that is part of publishing in journals, but you don't need to publish in journals to have peers review your work! Some of the greatest academic works (including dissertations) are peer-reviewed outside the commercial publishing ecosystem.
How does Google Scholar index the journal articles?
Google Scholar's indexing takes place automatically based on the meta-tags or metadata from the webpage of each article or the metadata in files (PDF, DOC, DOCX, PPT, etc.). It also indexes the main free and public sources of articles such as PubMed, preprint databases, etc. For full details on its indexing guides, please see here: https://scholar.google.co.uk/intl/en/scholar/inclusion.html
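To make the meta-tag idea concrete, here is a minimal sketch in Python that checks whether an article page carries the bibliographic meta-tags a crawler would look for. The tag names (citation_title, citation_author, citation_publication_date) are the ones I understand the inclusion guidelines linked above to describe; treating exactly these three as "required" is my assumption, and the sample page, journal name, and author are made up.

```python
from html.parser import HTMLParser

# Tag names as described in Google Scholar's inclusion guidelines; which ones
# count as "required" here is an assumption made for this illustration.
REQUIRED_TAGS = ("citation_title", "citation_author", "citation_publication_date")


class MetaTagCollector(HTMLParser):
    """Collect <meta name="..." content="..."> pairs from an HTML page."""

    def __init__(self):
        super().__init__()
        self.tags = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        name, content = attr.get("name"), attr.get("content")
        if name and content:
            self.tags.setdefault(name.lower(), []).append(content)


def missing_scholar_tags(html):
    """Return the required bibliographic meta-tags that the page lacks."""
    collector = MetaTagCollector()
    collector.feed(html)
    return [tag for tag in REQUIRED_TAGS if tag not in collector.tags]


# Hypothetical article page: the publication date tag is deliberately missing.
sample_page = """
<html><head>
<meta name="citation_title" content="A hypothetical article title">
<meta name="citation_author" content="Doe, Jane">
<meta name="citation_journal_title" content="Hypothetical Journal">
</head><body></body></html>
"""
print(missing_scholar_tags(sample_page))  # ['citation_publication_date']
```

A check like this only tells you that the tags exist on the page. It says nothing about whether Google Scholar has actually crawled or accepted it.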
About 15 years ago, I followed these guidelines to index about 23 journals in Google Scholar.
Search engines such as Google Scholar do not provide a list of the journals they index, mainly because such a list would require costly human curation and also because of web and search engine dynamics.
Reasons for removal or exclusion of journals or articles from Google Scholar
After a while, many journals disappear from Google Scholar’s index, and as a result from its search results, or search engines remove their pages from the index, for the following reasons:
- Their host server is down on several occasions, and the website has become inaccessible;
- Changes in the website’s content management system that require time for the search engine to figure out the structure;
- Lack of an XML sitemap to help the search engine figure out the structure of the website;
- Changes in the search engine’s algorithms;
- Using scripts or JavaScript that are not bot-friendly in the website;
- A mistake in the robots.txt file in the host’s root or using the noindex tag;
- Not having privacy and cookies consent;
- Not providing meta-tags (e.g. Dublin Core).
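The robots.txt item in the list above is easy to test yourself. The sketch below uses Python's standard urllib.robotparser to ask whether a given rule set would let a crawler fetch an article URL. The rules, the journal.example.org URLs, and the choice of "Googlebot" as the user agent are assumptions for illustration; it does not check for a noindex tag.

```python
from urllib.robotparser import RobotFileParser


def crawler_allowed(robots_txt, url, user_agent="Googlebot"):
    """Return True if the robots.txt text allows user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical robots.txt that accidentally blocks every article page.
robots_txt = """\
User-agent: *
Disallow: /articles/
"""

print(crawler_allowed(robots_txt, "https://journal.example.org/articles/1"))  # False
print(crawler_allowed(robots_txt, "https://journal.example.org/about"))       # True
```

A single Disallow line like this one is enough to keep every article out of a search engine's index while the rest of the site looks perfectly healthy.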
We cannot assume every journal with an online footprint is indexed in Google Scholar; many are not because the people running them don’t know the basics of web design and Search Engine Optimization (SEO). Some have part of their content indexed; others have all their content indexed in Google Scholar. However, content that is indexed today is not guaranteed to be there next year.
What are the differences between indexing in Search Engines (e.g. Google Scholar) and Bibliographic Databases (e.g. MEDLINE)?
As a search engine, the concept of indexing in Google Scholar is different from indexing in a bibliographic database.
1. Automatic/Technical: Indexing in a search engine happens automatically as its crawler (spider) goes through the web to find the pages that meet its technical criteria. There is no quality control and minimal human involvement. Webmasters can modify their websites or their Google admin account to make this happen more smoothly. It may take one month to one year from releasing the content on the web until it becomes visible in Google Scholar.
2. Dynamic: Indexed pages/articles in search engines may disappear when the search engine's policy and criteria change, when the server hosting the website is down, or for many other reasons.
3. Mixed Quality: There is no human quality assessment/control, so you can find restaurant food menus in Google Scholar because they are in PDF and have good metadata (metatags).
4. High Volume: The quantity of indexed content is more important than the content's quality for search engines.
5. Generality: A search engine such as Google Scholar indexes any page meeting its technical criteria regardless of its topic being medicine, engineering, art, philosophy, or a restaurant's menu.
In contrast, bibliographic databases need humans for indexing, their content is far more static (once indexed, almost always there), and they care about quality and about a balance between quality and quantity. Most bibliographic databases with controlled vocabularies (MeSH, Emtree) are dedicated to a specific scientific field such as medicine (MEDLINE, Embase) or engineering/physics/computing (Inspec).
Requirements for indexing in bibliographic databases
Indexing in bibliographic databases comes with many requirements (which vary across databases), such as publication history (varies from 50 articles to 3 years of publications), quality of published articles, the editorial board's diversity and expertise in the topic, the publisher's policy on publication ethics, editorial processes, peer review, and so on. Humans assess the journals and may reject them; if that happens, the journals can reapply after a waiting time (sometimes 3–5 years). Some databases only index the content from the year the journal passes the approval criteria, not the previous ones.
How to get a journal or article indexed in PubMed?
PubMed does not directly index journals/papers. To become visible (not indexed) in PubMed, the papers should get indexed/included in one of the following subsets:
1. Index the Journal in MEDLINE: probably the hardest way to get a journal to become visible in PubMed.
2. Deposit Journal Articles to PubMed Central: probably the easiest way to get a journal to become visible in PubMed (requires XML output for full-text paper, and you need an XML technician or company who can produce XML of papers).
3. Preprints that report NIH-funded research results: this makes individual papers (not journals) visible in PubMed. Some journals use this as a scam, saying that they are indexed in PubMed while only the NIH-funded papers published in these journals are visible in PubMed.
How to find out the indexing status of a journal in PubMed
The best way to find out the indexing status of a journal in PubMed/MEDLINE is to search the title of the journal in NLM Catalog:
https://www.ncbi.nlm.nih.gov/nlmcatalog
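If you need to check many journal titles, the same NLM Catalog search can be prepared programmatically. The sketch below only builds a search URL for NCBI's E-utilities ESearch endpoint with db=nlmcatalog; it does not send any request. The [Title] field tag and the currentlyindexed[All] filter are my assumptions about the NLM Catalog search syntax, so please verify them in the NLM Catalog help before relying on the results. The journal name is made up.

```python
from urllib.parse import urlencode

EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def nlm_catalog_search_url(journal_title, medline_only=False):
    """Build an ESearch URL that looks up a journal title in the NLM Catalog."""
    # Assumption: [Title] restricts the search to the title field.
    term = f'"{journal_title}"[Title]'
    if medline_only:
        # Assumption: this filter limits results to journals currently
        # indexed in MEDLINE.
        term += " AND currentlyindexed[All]"
    params = {"db": "nlmcatalog", "term": term, "retmode": "json"}
    return f"{EUTILS_ESEARCH}?{urlencode(params)}"


# Hypothetical journal name, used only to show the shape of the URL.
print(nlm_catalog_search_url("Hypothetical Journal of Medicine", medline_only=True))
```

Opening the printed URL in a browser returns the matching record IDs. Checking the record on the NLM Catalog website itself remains the most reliable way to read the indexing status.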
How to find out if a journal is indexed in Google Scholar?
Unlike bibliographic databases, search engines such as Google Scholar do not provide the list of journals they index. So, there is no definite answer.
If you search for the journal's name, or the title of one of the papers published in the journal, and find it in Google Scholar, then you or the journal can "CLAIM" the journal is indexed in Google Scholar. However, there is no way to say that ALL of the journal's content/papers are indexed in Google Scholar unless you run a search and check all papers one by one manually, which is not practical. For example, you can use the source: "Journal Name" command in the search box, or click on the hamburger menu, select Advanced search, write the journal's name in "Return articles published in", and run the search.
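As a small illustration of the source: command, the sketch below builds the search URL you would get by typing the query into the Google Scholar search box. The journal name is made up, I write the operator without a space after the colon, and the URL is meant to be opened by hand in a browser, not fetched by a script.

```python
from urllib.parse import urlencode


def scholar_source_url(journal_name):
    """Build a Google Scholar search URL using the source: operator."""
    query = f'source:"{journal_name}"'
    return "https://scholar.google.com/scholar?" + urlencode({"q": query})


# Hypothetical journal name.
print(scholar_source_url("Hypothetical Journal"))
# https://scholar.google.com/scholar?q=source%3A%22Hypothetical+Journal%22
```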
The algorithm that GS uses, as a search engine, aims to put the most relevant documents on top; its purpose is not to find every relevant paper for the user, and that's why it only shows the first 1000 results. If you want to see more than that, you must break your search into sub-searches filtered by publication year to see all the results.
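Splitting a search by publication year can be sketched like this: one search URL per year, each of which should stay under the 1000-result cap for most journals. The as_ylo and as_yhi parameters are the ones that appear in the address bar when you set a custom year range in Google Scholar; they are not a documented API, so treat them as an assumption. The journal name is made up.

```python
from urllib.parse import urlencode


def scholar_urls_by_year(journal_name, first_year, last_year):
    """Return one Google Scholar search URL per publication year."""
    urls = []
    for year in range(first_year, last_year + 1):
        params = {
            "q": f'source:"{journal_name}"',
            "as_ylo": year,  # assumed lower bound of the year filter
            "as_yhi": year,  # assumed upper bound of the year filter
        }
        urls.append("https://scholar.google.com/scholar?" + urlencode(params))
    return urls


# Hypothetical journal name; three years give three sub-searches.
for url in scholar_urls_by_year("Hypothetical Journal", 2020, 2022):
    print(url)
```

If a single year still returns more than 1000 results, you would need a further filter, which this sketch does not attempt.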
How to publish in indexed journals/other platforms to serve Open Science
For the sake of open science and fighting the commercial all-for-profit publishers:
- Don't publish if you don't have to.
- Publish in open access journals that do not charge you an article processing charge. Free for authors and free for readers.
- Do not publish in journals that put your paper behind a paywall, own the copyright and sell your work for a profit without paying you, the reviewers, or the editors (all free labour/slavery).
- If you have to publish in paywalled journals, make your paper's accepted version (before publication version) available publicly and freely on your institutional repository or other free online repositories.
- Do not publish or upload in PDF format. PDF is the greatest format enemy of open science. Try HTML, just like the blog post you are reading.
- Publish on preprint servers such as medRxiv, bioRxiv, PsyArXiv, OSF, etc., before submitting to the journals. Even if you don't publish, Google Scholar and Embase index preprints, and PubMed will index all preprints soon: https://www.ncbi.nlm.nih.gov/pmc/about/nihpreprints/