SEARCHING BIOTECHNOLOGY INFORMATION IN THE 2010s:

Section II (Databases & Search Strategies)

Luca Falciola, IP Manager, Promethera Biosciences

Sardegna Ricerche-Univ. Cagliari (Sep. 15th 2014)

Databases & Biotechnology: A Foreword

• Covering even a limited number of databases is practically impossible, so a selection is required, based on a few criteria:
  - Free access (at most, requiring registration at a website with a username/password and an e-mail address to get full access to services; for this purpose it is better to use a separate free e-mail account, dedicated to these registrations and to receiving Tables of Contents, updates, etc.)
  - Overall positive reputation, importance, and a good « search experience » even for occasional users

• This selection can easily be expanded for specific objectives by:
  - Searching the structured repertory on the Nucl. Acids Res. website and its yearly update
  - Combining search topics in Google and/or PubMed
  - Exploring the NCBI and EBI websites

Sardegna Ricerche L. Falciola 15/09/2014

DATABASES FOR BIOTECHNOLOGY INFORMATION

1 Scientific Literature
   - PubMed
   - HighWire
   - Publishers’ websites
2 Patent Literature
3 Chemical Structure & Biological Sequences
4 Metabases

Scientific Literature: Introduction

• There are many databases of scientific literature, most of them thematic, and PubMed plays a major role in the life sciences
  - More than 1.2 million entries for the year 2013, for a total of more than 25 million
  - Considered by many as the most complete database

• This leadership should not make us forget other resources that, at different levels, may be competitive for identifying relevant literature
  - Commercial ones (EMBASE, SCISEARCH, SciVerse, BIOSIS, etc.)
  - Databases that cover a large panel of publishers in order to promote the purchase of articles, and that provide full-text search or other advanced search/push services

• The comparison between full-text search and indexing/completeness is actually a major topic

PubMed: Introduction

• PubMed offers almost everything you need, with the exception of full-text search
  - A well-organized help page, including links to YouTube and other tutorials
  - Access to the large panel of NCBI services, as summarized in this guide and this NAR paper
  - Sign-in page for accessing an even larger panel of features
  - Guides to other literature databases, the NCBI digital library, and the MeSH system

PubMed: Advanced Search and Search History

• Both features are on the same page, which can be kept or even saved

PubMed: Field Search

• A large number of fields are available for text or numeric searches

PubMed: MeSH Examples (antibodies)

PubMed: MeSH Tutorial

• Some PubMed tutorials are too complex, and some universities provide simplified versions (for MeSH, for example) that insist on:
  - Pursuing a sequential, structured approach to identify the most relevant MeSH terms
  - Not forgetting that MeSH terms are not always present, and that they are useful for extracting a more relevant subset of references to explore with a series of related criteria

PubMed: Search Operators

• A large number of operators/symbols expand the possibilities well beyond AND, OR, NOT (truncation and double quotes are essential for running precise but not overly extensive searches)

• The search can also be refined with the large selection of filters in the left sidebar
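The operators mentioned on this slide can also be combined programmatically. The following is a minimal sketch, not an official client: it assembles a query string from quoted phrases, truncation (`*`), Boolean operators and PubMed field tags (`[tiab]` for title/abstract, `[mh]` for MeSH terms, `[pt]` for publication type), then builds a URL for the NCBI E-utilities `esearch` endpoint without sending any request. The helper names and the example topic are illustrative assumptions and do not come from the original slides.

```python
from urllib.parse import urlencode

# NCBI E-utilities search endpoint (no request is sent in this sketch).
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def tagged(term, field=None):
    """Append a PubMed field tag such as [tiab] or [mh] to a term."""
    return f"{term}[{field}]" if field else term


def phrase(text, field=None):
    """Double quotes request an exact phrase search."""
    return tagged(f'"{text}"', field)


def truncated(stem, field=None):
    """A trailing asterisk matches terms that start with the stem."""
    return tagged(f"{stem}*", field)


def any_of(*terms):
    """Group alternatives with OR inside parentheses."""
    return "(" + " OR ".join(terms) + ")"


def build_esearch_url(query, retmax=20):
    """Return an esearch URL for the query (hypothetical helper)."""
    params = {"db": "pubmed", "term": query, "retmax": retmax}
    return ESEARCH + "?" + urlencode(params)


# Example topic chosen only for illustration.
query = (
    any_of(phrase("monoclonal antibody", "tiab"), truncated("antibod", "tiab"))
    + " AND " + tagged("liver", "mh")
    + " NOT " + tagged("review", "pt")
)

print(query)
print(build_esearch_url(query))
```

The same query string can be pasted directly into the PubMed search box, which is a convenient way to check it before saving it in the Search History.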

PubMed: Heterogeneity

• An important point is that PubMed is intended to make publications available to users as soon as possible, which explains some heterogeneity in indexing and in access to articles

PubMed: Some Tricks

• Replace the double quotes (“”) with a hyphen (-) between the two words of a phrase

• The use of truncation reveals how many spelling errors are present in the database, errors that may make you miss some relevant hits

• Send PubMed references by e-mail simply by appending the PMIDs after http://www.ncbi.nlm.nih.gov/pubmed/
  e.g. http://www.ncbi.nlm.nih.gov/pubmed/25031662,25000062

• The « Related » references can be saved in the search history and combined with keywords to search within them

• Search History is limited in time and length (better not to exceed 50-80 entries)
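The PMID trick lends itself to a small helper. This is a minimal sketch: the function name `pubmed_link` is a hypothetical choice, and the base URL is the one shown on the slide.

```python
PUBMED_BASE = "http://www.ncbi.nlm.nih.gov/pubmed/"


def pubmed_link(pmids):
    """Build one PubMed URL that lists several records, given their PMIDs."""
    cleaned = []
    for pmid in pmids:
        text = str(pmid).strip()
        # A PMID is a plain integer; reject anything else early.
        if not text.isdigit():
            raise ValueError(f"not a valid PMID: {pmid!r}")
        cleaned.append(text)
    if not cleaned:
        raise ValueError("at least one PMID is required")
    # PMIDs are appended after the base URL, separated by commas.
    return PUBMED_BASE + ",".join(cleaned)


print(pubmed_link([25031662, 25000062]))
# prints http://www.ncbi.nlm.nih.gov/pubmed/25031662,25000062
```

The resulting link can be pasted into an e-mail so that the recipient opens all listed references at once.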

HighWire Press: Overview

• Large life-science literature database hosted by Stanford Univ., aggregating journals from many major publishers as well as books and conference abstracts, also as full text and with some useful filters

HighWire: Help Page

HighWire: Search Results & History

HighWire: Services

• Preview of keywords in context, alerts for new articles that include a given citation or keywords, alternative viewing features, links to supplementary/free documents, and management of ToCs are well implemented

Publishers’ Websites: Introduction

• All main publishers with a large panel of journals have nice features for keeping track of new articles or searching their publications
  - Nature, Science, Wiley, Springer
  - ScienceDirect (Elsevier) is particularly rich in functions and has broad coverage (even of journals not indexed in PubMed)

Publishers’ Websites: Other Examples

• Wiley

DATABASES FOR BIOTECHNOLOGY INFORMATION

1 Scientific Literature
2 Patent Literature
   - Lens
   - Espacenet
   - Patentscope
3 Chemical Structure & Biological Sequences
4 Metabases

Patent Literature: Introduction

• Patent information that may be relevant for a biotech search is available in a variety of formats:
  - Text-based
  - Biological sequences
  - Chemical structures

• A regular review of patent publications can be performed by making appropriate use of three types of tools:
  - Multi-patent-office websites (Patentscope, Espacenet, Lens)
  - Patent office-specific tools (at the USPTO, EPO, Australian and Indian offices, etc.), which are in general poorly implemented beyond basic number or proceedings searches
  - Access for sequence- or structure-based searches (Lens, EBI)

• Each approach and tool has its own strengths/weaknesses:
  - Need to compare/double-check
  - Access to PDF and identification of keyword context
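The compare/double-check step can be partly automated once result lists have been exported from each tool. The sketch below is an assumption-laden illustration, not a feature of any of these websites: it normalizes publication numbers to "country code + digits" (dropping spaces, slashes and the kind code) and reports which hits each tool missed. The publication numbers are made up, the function names are hypothetical, and office-specific variations such as two-digit-year WO numbers are not handled.

```python
import re

# Country code, serial digits, optional kind code (e.g. A1, B2).
_PUB_NUMBER = re.compile(r"^([A-Z]{2})(\d+)([A-Z]\d?)?$")


def normalize(number):
    """Reduce a publication number to country code + digits."""
    compact = re.sub(r"[^A-Za-z0-9]", "", number).upper()
    match = _PUB_NUMBER.match(compact)
    if match is None:
        raise ValueError(f"unrecognized publication number: {number!r}")
    country, digits, _kind = match.groups()
    return country + digits


def compare(results_by_tool):
    """Return (hits common to all tools, hits missed by each tool)."""
    normalized = {
        tool: {normalize(n) for n in numbers}
        for tool, numbers in results_by_tool.items()
    }
    if not normalized:
        return [], {}
    all_hits = set().union(*normalized.values())
    common = set.intersection(*normalized.values())
    missed = {tool: sorted(all_hits - hits) for tool, hits in normalized.items()}
    return sorted(common), missed


# Made-up result lists, as they might be exported from three tools.
results = {
    "Patentscope": ["WO 2014/012345 A1", "EP 1234567 B1"],
    "Espacenet": ["WO2014012345A1", "US 2014/0123456 A1"],
    "Lens": ["WO2014012345", "EP1234567A1", "US20140123456"],
}

common, missed = compare(results)
print("Found by all tools:", common)
for tool, numbers in missed.items():
    print(f"Missed by {tool}:", numbers)
```

Dropping the kind code means that the application (A) and grant (B) publications of the same document count as one hit, which suits a coverage check but not a legal-status check.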

Patent Literature: Overview

• Main strengths:
  - Patentscope and Lens: full-text/stemmed/nested searches, large number of criteria, login for saving search strategies, graphical/automated grouping of results
  - Patentscope and Espacenet: machine translation
  - Lens: somewhat easier to use for both searching and getting/sending links to PDF files, nice support section, possibility of searching only granted patents, nice sorting/filtering functions, claims and abstract on the same page
  - Espacenet: Cooperative Patent Classification and citing/cited documents features for (non-)EP applications, link to the EPO register, links to (often) reliable patent family and INPADOC/status information

• Main weaknesses:
  - Patentscope: instability during long search sessions, IPC only, no clear patent family information
  - Lens: format inconsistency for code/number fields, coverage and patent family definition, with functions appearing and disappearing (now providing IPC and USPC)
  - Espacenet: somewhat old-style for both searching documents and getting PDF files

• In general:
  - No visibility on actual coverage for all collections
  - Limited means to identify keyword context

Lens: Search Window

Lens: Filtering Features

Lens: Help Page

Espacenet: Search Window and Criteria

Espacenet: Patent Kind Codes & Help

Espacenet: CPC Classification

Espacenet: Results & Record View

Patentscope: Search Window & Results

Patentscope: Record & Records Analysis

DATABASES FOR BIOTECHNOLOGY INFORMATION

1 Scientific Literature
2 Patent Literature
3 Chemical Structure & Biological Sequences
   - UniProt
   - EBI-Fasta
   - ChEMBL/PubChem
4 Metabases

UniProt: Overview & Search Criteria

UniProt: HBB in GeneCards vs UniProt

EBI-Fasta: Search Window

EBI-Fasta: Overview of Results

EBI-Fasta: Patent Sequence Record

ChEMBL: Introduction

• Medicinal chemistry data/products are now more accessible to non-specialists as well, through portals such as EBI/ChEMBL, PubChem, or DrugBank. These portals aggregate the data and make them searchable by different criteria, across biological/medical/patent information (together with chemical information from proprietary repositories), for example for creating Molecular Clouds (Ertl and Rohde, J Cheminf 2012)

ChEMBL: Features

ChEMBL: Search & Browse Features

ChEMBL: Targets, Ligands & Drug Approvals

DATABASES FOR BIOTECHNOLOGY INFORMATION

1 Scientific Literature
2 Patent Literature
3 Chemical Structure & Biological Sequences
4 Metabases
   - Google
   - Google Scholar
   - DrugBank

Google: Advanced Search & GoogleGuide

Google Scholar: Introduction

• This site claims to have broad coverage of both scientific and patent literature, but the actual coverage is unclear:
  - beyond US patent documents, and from which date (they index papers, not journals)
  - of which publishers

• The system has some additional useful features compared to “pure” Google:
  - Separate advanced search features
  - Management of alerts through your own Gmail account
  - Import features for reference management systems (but not always precise)
  - Selection by publication date instead of date of appearance on the web (but again not always precise)
  - Clear link to the PDF on the left side of the window
  - Citation list (which can be searched separately) and “related articles” features
  - Metrics / search by journal
  - Focused help page with advice on how to get your paper indexed

Google Scholar: Advanced Search Features

Google Scholar: Settings and Metrics Features

Google Scholar: Final Comments

• Google Scholar provides means for overcoming only some limitations of “pure” Google
  - Lack of visibility about publication/journal coverage
  - Unstructured search features within documents
  - Lack of indexing

• It is an interesting tool for exploratory searches or for complementing searches made in “traditional” databases
  - Exploiting full-text and advanced search features in a more structured environment
  - Linking articles to combinations of specific technical details, cross-references, authors
  - Obtaining additional search criteria to be used elsewhere

DrugBank: Introduction

DrugBank: Results

DrugBank: Records

Thank you!

Luca.falciola@promethera.com

The views and the opinions expressed in this presentation are the author’s personal thoughts on these subjects. They are not intended to be considered opinions and positions of Promethera, nor imply any commitment by Promethera to any particular action.
