SEARCHING BIOTECHNOLOGY INFORMATION IN THE 2010s:
Section II (Databases & Search Strategies)
Luca Falciola, IP Manager, Promethera Biosciences
Sardegna Ricerche-Univ. Cagliari (Sep. 15th 2014)
Databases & Biotechnology: A foreword
Covering even a limited number of databases is practically impossible, so a selection is required according to a few criteria
Free access (at most, requiring registration at a website with a username/password and an e-mail address to get full access to services; for this purpose it is better to use a separate, dedicated free e-mail account, used only for such registrations and for receiving Tables of Contents, updates, etc.)
Overall positive reputation, importance, and a good « search experience » even for occasional users
This selection can easily be expanded for specific objectives by
Searching the structured repertory on the Nucleic Acids Research (NAR) website and its yearly update
Combining search topics in Google and/or Pubmed
Exploring the NCBI and EBI websites
Sardegna Ricerche L. Falciola 15/09/2014
1 Scientific Literature
- Pubmed
- HighWire
- Publishers’ website
2 Patent Literature
3 Chemical Structure & Biological Sequences
4 Metabases
DATABASES FOR BIOTECHNOLOGY INFORMATION
There are several databases of scientific literature, mostly thematic, and Pubmed plays a major role in the life sciences
More than 1.2 million entries for the year 2013, for a total of more than 25 million
Considered by many as the most complete database
This leading position should not make us forget other resources that, at different levels, can be competitive for identifying relevant literature
Commercial ones (EMBASE, SCISEARCH, SciVerse, BIOSIS, etc.)
Databases covering a large panel of publishers, meant to promote the purchase of articles, that provide full-text search features or other advanced search/push services
The comparison of full-text search versus indexing/completeness is actually a major topic
Scientific Literature:
Introduction
Pubmed offers almost everything you need, with the exception of full-text search
A well-organized help page including links to YouTube and other tutorials
Access to the large panel of NCBI services, as summarized in this guide and this NAR paper
Sign-in page for accessing an even larger panel of features
Guides to other literature databases, the NCBI digital library, and the MeSH system
Pubmed:
Introduction
Both features are on the same page, which can be kept open or even saved
Pubmed:
Advanced Search and Search History
A large number of fields is available for text or numeric searches
Pubmed:
Field Search
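A minimal sketch of how a fielded query could be assembled before pasting it into the Pubmed search box. The tags used here ([ti], [tiab], [au], [ta], [dp]) are standard Pubmed field tags; the helper names and the search values, including the author name, are hypothetical and only serve as an illustration.

```python
# Map readable names to Pubmed field tags (a small, non-exhaustive selection).
FIELD_TAGS = {
    "title": "ti",
    "title_abstract": "tiab",
    "author": "au",
    "journal": "ta",
}


def field_term(value, field):
    """Return one search term restricted to a single field, e.g. antibody[ti]."""
    tag = FIELD_TAGS[field]
    # Multi-word values are quoted so that they are searched as a phrase.
    text = f'"{value}"' if " " in value else value
    return f"{text}[{tag}]"


def date_range(start_year, end_year):
    """Return a numeric range on the publication date, e.g. 2010:2014[dp]."""
    return f"{start_year}:{end_year}[dp]"


# Hypothetical example: text fields combined with a numeric (date) field.
query = " AND ".join([
    field_term("monoclonal antibody", "title"),
    field_term("smith j", "author"),
    date_range(2010, 2014),
])
print(query)
```

The printed string can be pasted as-is into the search box or stored next to the search history for later reuse.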
Pubmed:
MeSH Examples (antibodies)
Some Pubmed tutorials are too complex, and some universities provide simplified versions, for instance for MeSH, that insist on pursuing a sequential, structured approach to identify the most relevant MeSH terms
Keep in mind that MeSH terms are not always present, and that they are useful for extracting a more relevant subset of references to explore with a series of related criteria
Pubmed:
MeSH tutorial
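Since MeSH terms are not always present on a record, one common way to avoid missing references is to combine a MeSH heading with free-text terms in title/abstract using OR. The sketch below only builds the query string; the helper name is hypothetical, and the chosen heading and keywords are just an example based on the antibodies case.

```python
def mesh_or_freetext(mesh_heading, freetext_terms):
    """Combine a MeSH heading with title/abstract terms using OR."""
    # [mh] restricts the term to the MeSH Terms field.
    parts = [f'"{mesh_heading}"[mh]']
    for term in freetext_terms:
        # Quote phrases; single words (possibly truncated with *) stay unquoted.
        text = f'"{term}"' if " " in term else term
        parts.append(f"{text}[tiab]")
    return "(" + " OR ".join(parts) + ")"


query = mesh_or_freetext("Antibodies", ["antibod*", "immunoglobulin*"])
print(query)
```

The resulting block can then be combined with AND to further criteria, so that records without MeSH indexing are still retrieved through the title/abstract terms.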
A large number of operators/symbols expand the possibilities well beyond AND, OR, NOT (truncation and double quotes are essential for precise but not overly extensive searches)
The search can also be refined with the large selection of filters in the left sidebar
Pubmed:
Search Operators
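The combination of Boolean operators, truncation (*), and double-quoted phrases can be sketched with a few small helpers that make the nesting explicit through parentheses. All helper names and search terms below are hypothetical; review[pt] uses the standard publication type tag.

```python
def phrase(words):
    """Double quotes force an exact phrase search."""
    return f'"{words}"'


def any_of(*terms):
    """Group alternatives (synonyms, spelling variants) with OR."""
    return "(" + " OR ".join(terms) + ")"


def all_of(*terms):
    """Require every term with AND."""
    return "(" + " AND ".join(terms) + ")"


def exclude(query, unwanted):
    """Remove records matching the unwanted term with NOT."""
    return f"{query} NOT {unwanted}"


# progenitor* is a truncated term matching progenitor, progenitors, etc.
query = exclude(
    all_of(any_of(phrase("stem cell"), "progenitor*"), "liver"),
    "review[pt]",
)
print(query)
```

Writing the parentheses explicitly avoids surprises about the order in which the operators are evaluated.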
An important issue is that Pubmed is intended to make publications available to users as soon as possible, which explains some heterogeneity in indexing and in access to articles
Pubmed:
Heterogeneity
Replace the double quotes (“”) with a hyphen (-) between two words in a phrase
Using truncation reveals how many spelling errors are present in the database, which may make you miss some relevant hits
Send Pubmed references by e-mail just by adding the PMID(s) after http://www.ncbi.nlm.nih.gov/pubmed/
e.g. http://www.ncbi.nlm.nih.gov/pubmed/25031662,25000062
The « Related » references can be saved in the search history and combined with keywords to search within them
Search History is limited in time and length (better not to exceed 50-80 entries)
Pubmed:
Some tricks
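The PMID trick can be sketched as a small helper that builds such a link from a list of identifiers. The base URL is the one given on the slide; the function name is hypothetical, and the validation is only a basic safeguard.

```python
# Base URL as given on the slide.
PUBMED_BASE = "http://www.ncbi.nlm.nih.gov/pubmed/"


def pubmed_link(pmids):
    """Build one link for several references from their PMIDs."""
    cleaned = []
    for pmid in pmids:
        text = str(pmid).strip()
        # A PMID is a plain number; reject anything else early.
        if not text.isdigit():
            raise ValueError(f"not a PMID: {pmid!r}")
        # Skip duplicates while keeping the original order.
        if text not in cleaned:
            cleaned.append(text)
    return PUBMED_BASE + ",".join(cleaned)


link = pubmed_link([25031662, "25000062", 25031662])
print(link)
```

The link can be pasted directly into an e-mail in place of a full list of references.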
Large life science literature database hosted by Stanford Univ., aggregating journals from many major publishers as well as books and conference abstracts, also as full text and with some useful filters
Highwire Press:
Overview
Highwire:
Help Page
Highwire:
Search Results & History
Highwire:
Services
Preview of keywords in context, alerts for new articles including a given citation or keywords, alternative viewing features, links to supplementary/free documents, and management of ToC are well implemented
All main publishers with a large panel of journals have nice features to keep track of new articles or to search their publications
Nature, Science, Wiley, Springer
ScienceDirect from Elsevier is particularly rich in functions and has a broad coverage (even of journals not indexed in Pubmed)
Publishers’ Website:
Introduction
Publishers’ Website:
Other Examples
Wiley
1 Scientific Literature
2 Patent Literature
- Lens
- Espacenet
- Patentscope
3 Chemical Structure & Biological Sequences
4 Metabases
DATABASES FOR BIOTECHNOLOGY INFORMATION
Patent Literature:
Introduction
Patent information that may be relevant for a biotech search is available in a variety of formats:
Text-based
Biological sequences
Chemical structures
Regular review of patent publications can be performed by appropriately using three types of tools:
Multi-patent-office websites (Patentscope, Espacenet, Lens)
Patent office-specific tools (USPTO, EPO, Australian, Indian offices, etc.), which are in general poorly implemented beyond basic searches by number or proceedings
Access for sequence- or structure-based searches (Lens, EBI)
Each approach and tool has its own strengths/weaknesses:
Need to compare/double-check
Access to PDF and identification of keyword context
Patent Literature:
Overview
Main strengths :
Patentscope and Lens: full text/stemmed/nested searches, large number of criteria, login for saving search strategies, graphical/automated grouping of results
Patentscope and Espacenet: machine-based translation
Lens: somewhat easier to use for both searching and getting/sending links to PDF files, nice support section, possible to search only granted patents, nice sorting/filtering functions, claims and abstract on the same page
Espacenet: Cooperative Patent Classification & citing/cited documents features for (non-)EP applications, link to EPO register, links to (often) reliable patent family & Inpadoc/status information
Main weaknesses:
Patentscope: instability during long search sessions, IPC only, no clear patent family information
Lens: format inconsistency for code/number fields, coverage and patent family definition, with functions appearing and disappearing (now providing IPC and USPC)
Espacenet: somewhat old-style for both searching documents and getting PDF files
In general:
No visibility on actual coverage for all collections
Limited means to identify keyword context
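Given the need to double-check results across tools and the inconsistent number formats, a small script can help compare hit lists copied from two websites. The sketch below assumes publication numbers made of a two-letter office code, digits, and an optional kind code; all numbers and names are hypothetical, and it does not harmonize leading zeros or two-digit years.

```python
import re

# Two-letter office code, digits, optional kind code such as A1 or B2.
_PUBLICATION = re.compile(r"^([A-Z]{2})(\d+)([A-Z]\d?)?$")


def normalize(raw):
    """Split a publication number into (office, number, kind code)."""
    # Drop spaces, slashes, commas, hyphens and any other punctuation.
    compact = re.sub(r"[^0-9A-Za-z]", "", raw).upper()
    match = _PUBLICATION.match(compact)
    if match is None:
        raise ValueError(f"unrecognized publication number: {raw!r}")
    office, number, kind = match.groups()
    return office, number, kind or ""


def document_key(raw):
    """Key that ignores the kind code, so A1 and B1 publications match."""
    office, number, _ = normalize(raw)
    return office + number


# Hypothetical hit lists copied from two tools for the same query.
first_hits = ["WO 2014/123456 A1", "EP 1 234 567 B1", "US-8,123,456-B2"]
second_hits = ["WO2014123456A1", "EP1234567A1", "US 2014/0000001 A1"]

first_keys = {document_key(h) for h in first_hits}
second_keys = {document_key(h) for h in second_hits}

print("found by both:", sorted(first_keys & second_keys))
print("only in the first list:", sorted(first_keys - second_keys))
print("only in the second list:", sorted(second_keys - first_keys))
```

Documents appearing in only one list are the ones worth checking manually, since the difference may come from coverage rather than from the query itself.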
Lens :
Search window
Lens:
Search Window
Lens:
Search Window
Lens:
Filtering Features
Lens:
Help Page
Espacenet:
Search Window and Criteria
Espacenet:
Patent Kind Codes & Help
Espacenet:
CPC Classification
Espacenet:
Results & Record View
Patentscope:
Search Window & Results
Patentscope:
Record & Records Analysis
1 Scientific Literature
2 Patent Literature
3 Chemical Structure & Biological Sequences
- Uniprot
- EBI-Fasta
- ChEMBL/Pubchem
4 Metabases
DATABASES FOR BIOTECHNOLOGY INFORMATION
Uniprot:
Overview & Search Criteria
Uniprot:
Overview & Search Criteria
Uniprot:
HBB in Genecards Vs Uniprot
EBI-Fasta:
Search Window
EBI-Fasta:
Overview of Results
EBI-Fasta:
Patent Sequence Record
ChEMBL:
Introduction
Medicinal chemistry data/products are now more accessible, also to non-specialists, through portals such as EBI/ChEMBL, PubChem, or DrugBank that aggregate them and make them searchable through different criteria, across biological/medical/patent information together with chemical information from proprietary repositories, for creating Molecular Clouds (Ertl and Rohde, J Cheminf 2012)
ChEMBL:
Features
ChEMBL:
Search & Browse Features
ChEMBL:
Targets, Ligands & Drug Approvals
1 Scientific Literature
2 Patent Literature
3 Chemical Structure & Biological Sequences
4 Metabases
- Google
- Google Scholar
- Drugbank
DATABASES FOR BIOTECHNOLOGY INFORMATION
Google:
Advanced Search & GoogleGuide
This site claims to have broad coverage of both scientific and patent literature, but the actual coverage is unclear: whether it goes beyond US patent documents, from which date, and of which publishers (they index papers, not journals)
The system has some additional useful features compared to “pure“ Google
Separate advanced search features
Management of alerts through your own Gmail account
Import features for reference management systems (but not always precise)
Selection of publication date instead of appearance on the web (but again not always precise)
Clear link to PDF on the left side of the window
Citation list (that can be searched separately) and “related articles” features
Metrics / search by journal
Focused help page with advice on how to get your paper indexed
Google Scholar:
Introduction
Google Scholar:
Advanced Search Features
Google Scholar:
Settings and Metrics Features
Google Scholar provides means for overcoming only some limitations of “pure” Google
Lack of visibility about publication/journal coverage
Unstructured search features within documents
Lack of indexing
It is an interesting tool for exploratory searches or completing searches made in “traditional” databases
Exploiting full-text and advanced search features in a more structured environment
Linking articles to combinations of specific technical details, cross-references, authors
Obtaining additional search criteria to be used elsewhere
Google Scholar:
Final Comments
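When Google Scholar is used to complete a search made in a “traditional” database, the extra hits have to be separated from those already found. A minimal sketch, assuming the titles were copied by hand from both result lists; the titles and function names are hypothetical, and matching on normalized titles is only an approximation.

```python
import re


def title_key(title):
    """Lowercase a title and collapse punctuation/spacing for comparison."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def merge_hits(primary, extra):
    """Append to the primary list only the extra titles not already in it."""
    seen = {title_key(t) for t in primary}
    merged = list(primary)
    for title in extra:
        key = title_key(title)
        if key not in seen:
            seen.add(key)
            merged.append(title)
    return merged


# Hypothetical titles from a structured database and from Google Scholar.
database_hits = ["Liver stem cells: an overview.", "Antibody engineering"]
scholar_hits = ["Liver Stem Cells - An Overview", "Progenitor cells in therapy"]

merged = merge_hits(database_hits, scholar_hits)
print(merged)
```

Only the titles that are new with respect to the first list are appended, so the merged list shows at a glance what the second tool contributed.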
DrugBank:
Introduction
DrugBank:
Results
DrugBank:
Records