Bibliographic & Citation Databases:
Scholarly Information Sources for Researchers
Shivaram BS, PhD
Joint Head, ICAST
CSIR-National Aerospace Laboratories, Bangalore
shivaram@nal.res.in
Objectives of the Module 5.1
• To know about scholarly literature
• To explore various platforms for literature search
• To know the use of AI tools for literature search
Presentation Outline
• Literature Search Avenues
• Bibliographic Databases
• Internet based discovery platforms for Literature Search
• Patent Databases
• AI Tools for Literature Search
Database
Database: An organized collection of structured information, or data, typically stored electronically in a computer system
Scholarly Information Database: A type of database used to find academic publications on topics across academic disciplines
Types of Scholarly Databases
(Diagram: databases mapped to stages of research)
• Publication Metrics
• Publication Misconduct
• Research Writing
• Reference Management
• Literature Search
• Validation
• Experimentation / Modeling / Simulations
• Problem Definition
Literature Search
• Starts with searching bibliographic / metadata sources
• Bibliographic data: the information needed to identify and retrieve publications such as journal articles, books, conference items, etc.
• Metadata: data about data, used to describe digital objects
• Examples of bibliographic data (fields):
– Title of the book or article
– Author / Creator
– Journal name
– Year of publication
– Keywords
– Abstract
• Resources that contain bibliographic data are called bibliographic sources. Ex: search engines, directories, databases
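The bibliographic fields listed above can be pictured as one record in a database. The following is a minimal sketch, not the schema of any real database: the class name `BibRecord`, the `matches` helper and the sample keywords are assumptions made for illustration. The sample record reuses the Adams, Smart and Huff (2017) article cited later in this module.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class BibRecord:
    """One bibliographic record built from the fields listed above."""
    title: str
    authors: List[str]
    journal: str
    year: int
    keywords: List[str] = field(default_factory=list)
    abstract: str = ""

    def matches(self, term: str) -> bool:
        """Case-insensitive search in title, keywords and abstract."""
        haystack = " ".join([self.title, self.abstract, *self.keywords]).lower()
        return term.lower() in haystack


record = BibRecord(
    title="Shades of grey: guidelines for working with the grey literature "
          "in systematic reviews for management and organizational studies",
    authors=["Adams R J", "Smart P", "Huff A S"],
    journal="International Journal of Management Reviews",
    year=2017,
    keywords=["grey literature", "systematic review"],  # illustrative, not from the article
)

print(record.matches("grey literature"))
print(record.matches("patent"))
```

A bibliographic database is, in essence, a large collection of such records plus a search interface over their fields.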
Avenues for Scholarly Information/Literature (Bib sources)
Past Present
Information Retrieval on Internet
Digital Content
Tools for Web Information Retrieval
• Search Engines (Generic Information)
– Meta Search Engines
– Specialty Search Engines
– Web Directories
– Portals & Gateways
• Scholarly Databases
– Bibliographic databases
– Citation Databases
– Patent Databases
– Digital Library Platforms
– Open Access Literature Platforms
Search Engines (A layman approach)
Search Engine: a program that searches for keywords specified by the user in the databases of websites on the World Wide Web
Top Search engines
Google (95% usage) https://www.google.com/
Bing http://www.bing.com/
Ask http://www.ask.com/
Yahoo https://www.yahoo.com/
Lycos http://www.lycos.com/
DuckDuckGo (No tracking) https://duckduckgo.com/
Yandex (Translator) https://www.yandex.com/
Entireweb http://www.entireweb.com/
Gigablast (Interesting features) http://www.gigablast.com/
Meta Search Engines
Meta Search Engines: take input from a user, simultaneously send queries to third-party search engines, and present the combined results
Top Meta Search engines
WebCrawler http://www.webcrawler.com/
Dogpile http://www.dogpile.com/
Info.com http://www.info.com/
Startpage https://www.startpage.com/
Excite http://www.excite.com/
Zoo http://www.zoo.com/
Search.com http://www.search.com/
Yippy http://www.yippy.com/
Mamma https://mamma.com/
Infospace http://infospace.com/
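The way a meta search engine works (one query sent to several engines at the same time, results merged) can be sketched as below. This is only an illustration: `engine_a` and `engine_b` are hypothetical stand-ins that return made-up URLs, not real search engine APIs, and real meta search engines also rank the merged results.

```python
from concurrent.futures import ThreadPoolExecutor


# Hypothetical stand-ins for third-party engines; each returns result URLs.
def engine_a(query):
    return [f"https://a.example/{query}", "https://shared.example/page"]


def engine_b(query):
    return ["https://shared.example/page", f"https://b.example/{query}"]


def meta_search(query, engines):
    """Send the query to every engine at once and merge the results."""
    with ThreadPoolExecutor(max_workers=len(engines)) as pool:
        # pool.map keeps the result lists in the same order as `engines`
        result_lists = list(pool.map(lambda engine: engine(query), engines))
    merged, seen = [], set()
    for results in result_lists:
        for url in results:
            if url not in seen:  # drop duplicates reported by several engines
                seen.add(url)
                merged.append(url)
    return merged


results = meta_search("aerospace", [engine_a, engine_b])
for url in results:
    print(url)
```

The shared page is returned by both engines but appears only once in the merged list.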
Specialty Search Engines
Specialty Search Engines: take input from a user, simultaneously send queries to public search engines, organize the search results into clusters, and offer better visualizations
Top Specialty Search engines
Carrot2 https://search.carrot2.org/#/search/web
Millie https://millie.northernlight.com/dashboardfolder.php
Search strategy in Google
• General Search: 95% noise
• Phrase Search: to narrow down
• Google Advanced Search Options
• Restricting by Filetype: pdf
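Phrase search and the filetype restriction can be combined in a single query string. The sketch below assumes the usual Google conventions (double quotes for an exact phrase, the `filetype:` operator); the helper name `build_query` and the example phrase are made up for illustration.

```python
from urllib.parse import urlencode


def build_query(phrase, filetype=None):
    """Quote the terms for a phrase search; optionally restrict the file type."""
    query = f'"{phrase}"'
    if filetype:
        query += f" filetype:{filetype}"
    return query


query = build_query("composite materials fatigue", filetype="pdf")
print(query)

# URL-encode the query so it can be pasted into the browser address bar.
search_url = "https://www.google.com/search?" + urlencode({"q": query})
print(search_url)
```

Typing the printed query directly into the Google search box has the same effect as opening the URL.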
I am an Academician/Researcher
interested in scholarly literature! How do I get it?
Literature (Information) Band
Retrievals by General Search Engines
Books
Adams R J, Smart P and Huff A S, Shades of grey: guidelines for working with the grey literature in systematic reviews for
management and organizational studies, International Journal of Management Reviews, 19(4) (2017) 432–454
Scholarly Information
• Information created in the course of research activities
• Information published by scholars to communicate their learning / research findings
• Information that has undergone a rigorous review process by peers in the discipline
• Published within a regular publishing framework: commercial publishers, societies, open access, and so on
Scholarly Information: Document types
• Journal articles
– Review articles
– Original Research articles
– Case studies
– Rapid communications
• Conference papers
• Books / Book chapters
• Government reports
• Case study reports
Scholarly Information growth
• Global scientific output doubles every nine years (Nature News Blog dated 07 May 2014 by
Richard Van Noorden)
• 36000+ English-language and 10000+ non-English-language peer-reviewed journals, adding over 3 million articles every year (STM Report 2022)
• Scholarly Literature (Source: Web of Knowledge platform)
– Journal & Conf. Papers: 155 million
– Patents: 39.3 million patent families with more than 70 million patents
– E-Books: more than one lakh (100,000)
– Data sets: 7.3 million
Thanks to ICT, most of them are available Online
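The doubling claim above can be turned into an annual growth rate with simple arithmetic. This is a back-of-the-envelope sketch that assumes steady exponential growth, which the real publication record only approximates.

```python
doubling_time_years = 9  # from the "doubles every nine years" claim

# If output doubles every T years, then (1 + r) ** T == 2.
annual_growth = 2 ** (1 / doubling_time_years) - 1
print(f"Implied annual growth: {annual_growth:.1%}")

# Growth factor over 18 years, i.e. two doubling periods.
factor_18_years = (1 + annual_growth) ** 18
print(f"Growth over 18 years: x{factor_18_years:.1f}")
```

In other words, a nine-year doubling time corresponds to roughly 8% more output every year.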
How do I Trust Web Information (Research)?
Follow the CRAAP model: Currency, Relevance, Authority, Accuracy, Purpose
Scholarly Information: Access Modes
Offline Mode
• Libraries
• Personal Collection
• Archival Centers
Online (Digital) Mode
• Subscribed Content: STM Publishers
• Open Access Content: Gold OA, Green OA
• Free Content: Author Profiles / Homepages, Academic Social Media Platforms
Scholarly Information Discovery Platforms
• Scholarly Search Engines
• Bib. Databases
• Publisher platforms
• Aggregators
• Digital Libraries
• Library OPACs
• E-Books
• Open Access content
• E-print servers
• Report servers
• Thesis & Dissertations Servers
• Grey literature
• Patent Resources
• Data Repositories
• Datasets
• Manuals
• Reference Management platforms
Scholarly Search Engines
• Specialty Search Engine
• Academic Search Engines
• Restricted to Scholarly Content
• Add on functionalities
• Powerful search functionalities
Examples:
• Google Scholar
– https://scholar.google.co.in/
• Microsoft Academic Search
– http://academic.research.microsoft.com/
• CrossRef Metadata Search
– http://www.crossref.org/
• Semantic Scholar – AI Powered
– https://www.semanticscholar.org/
• Gettheresearch
– https://gettheresearch.org/
• BASE (Open Access articles)
– http://www.base-search.net/
Search Engine: General Vs Scholar
Google vs. Google Scholar: a scholarly search engine is always recommended
Google Scholar: Tips
Link to available full text
Cited by: links to the list of all articles that have cited this article
Related articles: brings you related articles
All Versions: links to all places where details of the article are available
Cite: exports the citation of the article (MLA, APA, Chicago, Harvard) (BibTeX, RefMan, EndNote, RefWorks)
Save: saves the article to your Google Scholar library
Demonstration
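To show what a BibTeX export contains, the sketch below formats the Adams, Smart and Huff (2017) reference cited earlier in this module as a BibTeX entry. The helper `to_bibtex` and the key `adams2017` are made up for illustration; the output produced by Google Scholar itself may differ in field order and details.

```python
def to_bibtex(key, fields):
    """Format a dict of fields as a BibTeX @article entry."""
    lines = [f"@article{{{key},"]
    for name, value in fields.items():
        lines.append(f"  {name} = {{{value}}},")
    lines.append("}")
    return "\n".join(lines)


entry = to_bibtex("adams2017", {
    "author": "Adams, R. J. and Smart, P. and Huff, A. S.",
    "title": "Shades of grey: guidelines for working with the grey literature "
             "in systematic reviews for management and organizational studies",
    "journal": "International Journal of Management Reviews",
    "volume": "19",
    "number": "4",
    "pages": "432--454",
    "year": "2017",
})
print(entry)
```

Reference managers such as EndNote, RefWorks or Zotero can import an entry of this kind directly.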
CrossRef
Not-for-profit membership organization for scholarly publishing to make content easy to find,
cite, link, and assess
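Crossref metadata can also be searched from a program. The sketch below assumes the public Crossref REST API at `https://api.crossref.org/works` with its `query` and `rows` parameters, and assumes the JSON response keeps records under `message` → `items`; the function names are made up for illustration. Only the URL is built here; `search_crossref` needs network access and is not called.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

API = "https://api.crossref.org/works"  # assumed public endpoint


def crossref_url(query, rows=5):
    """Build a metadata search URL for free-text `query`."""
    return API + "?" + urlencode({"query": query, "rows": rows})


def search_crossref(query, rows=5):
    """Fetch (DOI, title) pairs. Needs network access."""
    with urlopen(crossref_url(query, rows)) as response:
        items = json.load(response)["message"]["items"]
    # `title` is assumed to be a list; some records may have none
    return [(item.get("DOI"), (item.get("title") or [""])[0]) for item in items]


url = crossref_url("grey literature systematic reviews")
print(url)
```

Opening the printed URL in a browser shows the raw metadata records (title, authors, DOI and so on) in JSON form.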
BASE
• 100 million documents from 5000 sources; 60% is open access content
• Contains metadata of academically relevant resources: journals, institutional repositories, digital collections, etc.
• Indexes only document servers that match the quality criteria of BASE
• Discloses web resources of the "Deep Web" that commercial search engines fail to index
• Excellent refining filters (browse by Library Classification Number)
• BASE is an OAI service provider; it can be integrated with a local collection – federated search, discovery
BASE: https://www.base-search.net/
BASE Search Results
Semantic Scholar
• 20+ Million digital items across all
disciplines
• Profile based functionalities
• Citation tracking
• Citation / reference export functionalities
• Setup library (Personal collection)
• Automatic Alerts
ScienceOpen: https://www.scienceopen.com/
81 million articles, 25K journals, 3200 publishers
Advanced search, filtering options, OA articles, references export, Altmetrics
Bibliographic Databases
A bibliographic database is a database of bibliographic records: an organized digital collection of references to published literature, including journal articles, conference proceedings, reports, patents, books, etc.
• Subject Specific
• Platform for comprehensive literature search
• Wider Coverage
• CDs / DVDs / Web Version
• Powerful search interface
Engineering: Engineering Village (Combination of Databases)
• Provides access to 12 engineering document databases
• Published by Elsevier (Commercial)
• 190 engineering disciplines & 73 countries
• 3,800+ journals from 1,988 publishers
• 117 trade magazines
• 80,000+ conference proceedings & 83 book series
• Link to Full text Articles
• Quick discovery of engineering literature: Thesaurus & Controlled Vocabulary
• Analysis and landscaping of engineering research literature
• Alert features automatically push the latest updates to end users
• PlumX metrics help users evaluate the impact and relevance of articles
Inspec
• Created by the Institution of Engineering and Technology (IET)
• Service provided by EBSCO (Commercial)
• Subject coverage: physics, electrical engineering, electronics, communications, control engineering, computing, information technology, manufacturing, production and mechanical engineering
• Coverage: 30+ million articles from 4500 journals published by 500+ publishers
• Also indexes more than 6 million conference items, plus preprints, books, dissertations, patents, reports and videos
• Inspec Analytics: helps identify research trends
• Inspec Archive: Science Abstracts from 1898-1968
• Published by CAS, a division of the American Chemical Society (ACS)
• Access to the world’s most reliable and
comprehensive chemical and scientific information
– Rigorous quality check
• Powerful Smartsearch technology
– Substance Search
– Structure Search
– Chemical Properties & reaction Search
• Technology Trends
Next Class
Aggregators
Databases of full-text articles, defined by subject area and sold as a single product, rather
than as individual subscriptions.
• Ingentaconnect: http://www.ingentaconnect.com/
– 10000 publications from 290+ publishers
– 630 Engineering titles
• ProQuest: http://www.proquest.com
– 9000 publishers
• Project MUSE: http://muse.jhu.edu/
– 240 publishers in humanities and social sciences
• JSTOR: www.jstor.org
– 214 titles from 48 publishers + E-books
• HighWire Press: http://home.highwire.org/
– 3000 scholarly journals and thousands of scholarly books
Publisher Platforms
• Sciencedirect
• Springerlink
• Wiley
• Emerald
• IEEE Digital Library
• ASME / ACS
Many more!!!
Subscribed Content
Encourage users to create Profile
How do I find books published in my field?
Library OPACs – Free to access
• Library of Congress
– 17 million book titles (https://catalog.loc.gov/)
• IndCat – INFLIBNET
– 8.19 million books from 176 Indian universities (Indian books) (https://indcat.inflibnet.ac.in/)
• College OPAC
Full-text E-Books – Digital Libraries
• Internet Archive Books
– 1 million full-text books (https://archive.org/details/internetarchivebooks)
• National Digital Library
– 3.9 million books (World e-book library) (https://ndl.iitkgp.ac.in/)
• Google Books (Project Ocean)
– 30+ million books
– Free full-text access to part of the collection (https://books.google.com/)
Grey Literature Servers
“Grey literature are materials produced by
organizations outside of the traditional (commercial
or academic) publishing and distribution channels”
• HAL Repositories
– 1.7 million records (https://hal.archives-ouvertes.fr/)
• Open GreyNet
– http://www.greynet.org/opengreyrepository.html
Patent Databases
Patent Information
• Information found in patent applications and granted patents.
• Patent information includes
– Bibliographic data
– Abstract
– Description
– Claims
– Drawings
• Patent information publicly discloses newly developed technologies
• Patent information helps other inventors develop new technical solutions
Patent Databases
Free Databases
• PATENTSCOPE
• Google Patents
• Lens.org
• USPTO
• Espacenet
• Country Specific
– Japan – PAJ
– Germany – DPMA Register
– India – inPASS
• Freepatentonline
Commercial Databases
• Thomson Innovations
• Questel Orbit
• XLPAT
• IEEE Innovation Q Plus
• PATSNAP
• Patbase
Prior Art Search
• All public information available prior to the date of filing of the relevant patent application, against which the patentability of the invention will be determined
– Journal articles, conference papers, etc.
– Report literature
– Patents (filed & granted)
• Existing relevant technology
• Novelty / Non-obviousness
• First to File / First to Invent
Information not considered in prior art
• Non-public information
• Trade secrets
• Traditional knowledge / Oral disclosures
• Documents in internal use / circulation
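The basic prior-art test (public, and available before the filing date) can be sketched as a simple filter. This is a deliberately simplified illustration with made-up documents and a hypothetical filing date; actual patent law adds further rules (for example priority dates) that are not modelled here.

```python
from datetime import date

filing_date = date(2023, 6, 1)  # hypothetical filing date of the application

# Made-up documents found during a search.
documents = [
    {"title": "Journal article", "published": date(2021, 3, 15), "public": True},
    {"title": "Internal lab report", "published": date(2020, 1, 10), "public": False},
    {"title": "Later conference paper", "published": date(2024, 2, 1), "public": True},
]

# Keep only public documents that were available before the filing date.
prior_art = [
    doc["title"]
    for doc in documents
    if doc["public"] and doc["published"] < filing_date
]
print(prior_art)
```

The internal report is excluded because it is non-public, and the conference paper because it appeared after the filing date.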
Types of Prior Art Search
Novelty Search: to establish whether an invention is novel and non-obvious.
Patentability Search: to ascertain the likelihood of an invention getting a patent.
Infringement Search: to make sure that nobody makes, uses, or sells your patented invention without your consent.
Validity / Invalidity Search: conducted after the issuance of a patent to validate the enforceability of the patent’s claims.
Patent Landscape: to know business, scientific and technological trends in the area / domain.
Whitespace Analysis: to identify areas with little or no patenting activity.
Why Novelty Search?
• Large Investment
• High cost in maintaining patents
• Helps to establish the novelty of research by comparing it with prior inventions
• Helps to identify White spaces
• Helps in future R & D Strategy and Decision making
• To avoid Future litigation
White Space Analysis based on Patent Landscape Search
• White-spaces are gaps in a
technology landscape.
• “White Space” is the area with
little or no patenting activity.
• White-space analysis is used as a method for strategic product innovation
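A white-space analysis can be pictured as counting patents in each cell of a technology landscape and flagging the cells with little or no activity. The sketch below uses made-up data; the technology and application labels and the threshold of one patent are assumptions for illustration, not a standard definition.

```python
from collections import Counter

# Hypothetical landscape: each patent tagged as (technology, application).
patents = [
    ("coating", "aerospace"), ("coating", "aerospace"), ("coating", "automotive"),
    ("sensor", "aerospace"), ("sensor", "automotive"), ("sensor", "automotive"),
    ("battery", "automotive"),
]
technologies = ["coating", "sensor", "battery"]
applications = ["aerospace", "automotive", "marine"]

counts = Counter(patents)  # missing cells count as 0
threshold = 1  # assumption: "little" activity means at most one patent

white_spaces = [
    (tech, app)
    for tech in technologies
    for app in applications
    if counts[(tech, app)] <= threshold
]
for tech, app in white_spaces:
    print(f"{tech} / {app}: {counts[(tech, app)]} patent(s)")
```

Cells with two or more patents, such as coating / aerospace, are treated as occupied; the remaining cells are candidate white spaces that still need expert review.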
Patent Landscaping : Trends
Patent Information (Structure)
• Each information field is denoted by a numerical code
• First Page Information (Descriptive information)
• Patented country
• Patent Number
• Bibliographic Details
• Title
• Inventors
• Assignee
• Application Number
• Cited references
• Abstract
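The numerical codes on a patent front page are commonly INID codes (defined in WIPO Standard ST.9, which this module does not name, so treat the mapping as an added assumption). The sketch below maps a few of those codes to the fields listed above and parses a made-up front page; the field labels, the function `parse_front_page` and the sample text are illustrative only.

```python
import re

# A few INID codes (WIPO ST.9) matched to the fields on the slide.
INID = {
    "11": "Patent number",
    "19": "Publishing office / country",
    "21": "Application number",
    "54": "Title",
    "56": "Cited references",
    "57": "Abstract",
    "72": "Inventors",
    "73": "Assignee",
}

# Made-up front page text, one coded field per line.
front_page = """\
(19) XX
(11) 0000000
(21) 00/000,000
(54) EXAMPLE TITLE
(72) A. Inventor
(73) Example Assignee Ltd
"""


def parse_front_page(text):
    """Return {field name: value} for every '(NN) value' line."""
    fields = {}
    for code, value in re.findall(r"^\((\d{2})\)\s*(.+)$", text, flags=re.MULTILINE):
        fields[INID.get(code, f"INID {code}")] = value.strip()
    return fields


parsed = parse_front_page(front_page)
for name, value in parsed.items():
    print(f"{name}: {value}")
```

Because the codes are the same across patent offices, the fields can be identified even when the patent is written in an unfamiliar language.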
Patent Information (Structure)
• Drawings
– Parts named with numbers which are cross referred in
detailed description
• Field of Invention
• Background of Invention (Prior art data)
• Summary of Invention
– The objects
– Problems solved
• Detailed description of specification
• Claims
– Independent claims
– Dependent claims
Standards Database
• A standard is an agreed way of doing something
– Making a product
– Managing a process
– Delivering a service
– Supplying materials
• Standards provide a reliable basis for people to share the same expectations about a product or service
– facilitate trade
– provide a framework for achieving economies & efficiencies
– enhance consumer protection and confidence
Expensive
AI Platforms for Literature Search
Literature Discovery: Web Search Vs AI Based Search
Web / Database Search
• Familiarity of Query Language
• Search operators
– Boolean
– Proximity
– Truncation
• Manual evaluation of search results
• Manual prioritising of results
• Keyword based search
AI Search
• Natural Language Processing (NLP)
• AI based evaluation & summarization
• AI based priority display
• Contextual search (Facet analysis)
Dr. S R Ranganathan’s PMEST approach (Facet analysis)
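The three search operators named under Web / Database Search (Boolean, proximity, truncation) can be demonstrated on a tiny in-memory collection. This is a teaching sketch with made-up document titles; the `*` truncation symbol and the `near` helper imitate common database conventions, and real databases each have their own syntax.

```python
import re

# Made-up document titles, keyed by record number.
docs = {
    1: "fatigue of composite materials in aircraft structures",
    2: "composite wing design and optimisation",
    3: "metal fatigue testing methods",
}
tokens = {doc_id: re.findall(r"\w+", text.lower()) for doc_id, text in docs.items()}


def term(pattern):
    """Records containing the word; a trailing * truncates (prefix match)."""
    if pattern.endswith("*"):
        prefix = pattern[:-1]
        return {d for d, words in tokens.items()
                if any(w.startswith(prefix) for w in words)}
    return {d for d, words in tokens.items() if pattern in words}


def near(a, b, distance):
    """Proximity: both words occur within `distance` positions of each other."""
    hits = set()
    for d, words in tokens.items():
        pos_a = [i for i, w in enumerate(words) if w == a]
        pos_b = [i for i, w in enumerate(words) if w == b]
        if any(abs(i - j) <= distance for i in pos_a for j in pos_b):
            hits.add(d)
    return hits


print(sorted(term("fatigue") & term("composite")))  # Boolean AND
print(sorted(term("fatigue") | term("composite")))  # Boolean OR
print(sorted(term("fatigue") - term("composite")))  # Boolean NOT
print(sorted(term("optimi*")))                      # truncation
print(sorted(near("fatigue", "composite", 2)))      # proximity
```

AND narrows the result set, OR broadens it, and NOT excludes records; truncation catches spelling variants such as optimisation and optimization.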
AI based Search Platform: Semantic Scholar
• Semantic Scholar is a free, AI-driven search and discovery platform
• 200 million papers from 50 + reputed sources
• Uses NLP techniques
• Generates super-short summaries of an article: TLDR (Too Long; Didn’t Read) summaries
• Checks highly influential citations
• Cite option in various styles
• Online Library – AI based feeds for paper recommendations & Alerts
https://www.semanticscholar.org/
AI based Search Platform: Research Rabbit
• Build online collection – Library
• Automated summaries
• Interactive visualization
– Network of papers
– Network of authors
• Personalised Recommendations & Digests (Email)
• Zotero Integration
• Collaborations
https://www.researchrabbit.ai/
AI based Search Platform: Elicit
• Find relevant papers even if they don't match
keywords (Synonyms)
• Read summaries of abstracts specific to query
• Automatically search forwards and backwards in the
citation graph to find more relevant papers
• Filter based on study type
• Save & Export search results
https://elicit.org/
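Searching forwards and backwards in a citation graph, as mentioned above, can be sketched with a small made-up graph. This illustrates the general idea only, not how Elicit implements it; the paper labels and the functions `backward` and `forward` are hypothetical.

```python
# Hypothetical citation data: paper -> papers it cites (its reference list).
references = {
    "A": ["B", "C"],
    "B": ["C"],
    "D": ["A"],
    "E": ["A", "B"],
}


def backward(paper):
    """Backward search: older work that this paper cites."""
    return set(references.get(paper, []))


def forward(paper):
    """Forward search: newer work that cites this paper."""
    return {citing for citing, cited in references.items() if paper in cited}


print("A cites:", sorted(backward("A")))
print("A is cited by:", sorted(forward("A")))
```

Starting from one relevant paper, backward search surfaces its foundations and forward search surfaces later developments, without depending on keyword matches.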
AI Synthesizer & Summariser: System Pro
• Cloud based platform
• Currently limited to PubMed Data (Scholarly articles)
• Helps find, synthesize, and contextualize scientific literature
• Synthesized text is clearly cited with all sources used
• Keywords Relationship Maps
• Summary for each article
• Content addition – Daily basis
https://www.system.com/
AI Synthesizer & Summariser: textero.ai
• Used to write essays and
research papers
• Generate unique content
• Text Summariser
• Finding References
https://www.textero.ai/
Topics Covered
• Literature Search
• Types of Databases
• Bibliographic Databases
• Search Engines & Specialty Search Engines
• Various Platforms for Literature Search
• Patent Databases
• AI tools for Literature Search
Next Class: Search Strategies & Bibliographic Database Search