Biological Databases

The document provides an overview of bioinformatics, focusing on sequence and genome analysis, and highlights key resources such as EMBnet and NCBI. It details the structure, mission, and services offered by EMBnet, including education, software development, and support for over 40,000 users globally. Additionally, it discusses various nucleic acid sequence databases and specialized genomic resources that facilitate research in molecular biology.

Uploaded by

Shashi Ranjan


Bioinformatics

Information Resources And Networks

Title : Bioinformatics
Subtitle : Sequence and genome analysis
Author : Mount, David W.
Publ.Plc : New York
Publ. : Cold Spring Harbor
Pages : xii, 564p.
ISBN : 0-87969-608-7

Title : Introduction to bioinformatics
Author : Attwood, Teresa K.; Parry-Smith, D. J.
Publ.Plc : New Delhi
Publ. : Pearson Education
Pages : xvi, 218p.
Ser Note : Cell and molecular biology
ISBN : 81-7808-507-0

Outline
Bioinformatics Information Resources And Networks
• EMBnet – European Molecular Biology Network
  • DBs and Tools
• NCBI – National Center For Biotechnology Information
  • DBs and Tools
• Nucleic Acid Sequence Databases
• Protein Information Resources
• Metabolic Databases
• Mapping Databases
• Databases concerning Mutations
• Literature Databases

EMBnet – European Molecular
Biology Network

 Founded in 1988
 Network that links European laboratories that use
biocomputing and bioinformatics in molecular biology research
 A science-based group of collaborating nodes throughout
Europe and nodes outside Europe
 Provides information, services and training to its users
 Works to increase the availability and
accessibility of data resources and
computing tools
 Increases knowledge and proficiency in bioinformatics through
education and training
EMBnet - Nodes
http://www.embnet.org/

EMBnet (41 nodes)
• National Nodes (18): governmental
• Specialist Nodes (9): academic, industrial research centers
• Associate Nodes (11): biocomputing centers from non-European countries

EMBnet - Nodes
National Nodes
 Appointed by the governments
 Provide on-line services, user support and training

Vienna Biocenter - Austria    BEN - Belgium
CSC - Finland                 INFOBIOGEN - France
DKFZ - Germany                HEN - Hungary
INCBI - Ireland               INN - Israel
IEN-AdR - Italy               CMBI - Netherlands
Bio - Norway                  IBB - Poland
PEN - Portugal                GeneBee - Russia
CNB-CSIC - Spain              BMC - Sweden
SIB - Switzerland             SEQNET - UK


EMBnet - Nodes
Specialist Nodes
 Academic, industrial or research centers in specific areas of
bioinformatics
 Largely responsible for maintenance of biological databases and
software

MIPS (Munich Information Center for Protein Sequences)
ICGEB
Pharmacia
F. Hoffmann – La Roche
EBI (Hinxton Hall, Cambridge UK): important key specialist node and
home of the EMBL, SWISS-PROT and TrEMBL databases
HGMP-RC
Sanger
UCL

EMBnet - Nodes
Associate Nodes
 Centers from non-European countries

IBBM - Argentina      ANGIS - Australia
CBI - China           CIGB - Cuba
CDFD - India          SANBI - South Africa
EMBnet - Brazil       CBR - Canada
EMBnet - Chile        EMBnet - Colombia
CIFN - Mexico

EMBnet’s Mission

 Assist in biotechnological and bioinformatics related research
 Provide training and education
 Exploit network infrastructures
 Investigate and develop new technologies
 Bridge between commercial and academic sectors

What does EMBnet do?
 Education and training
 Software development
 Computing resources
 Technical support
 Help desk in local languages
 Publications
Who are EMBnet’s Users?
 > 40,000 registered users from all over
the world as well as a larger number of
Internet users
 All scientists working in Life Sciences,
from undergraduate students to top level
scientists, in academia as well as
industry, can get support from EMBnet
EMBnet – SRS
Sequence Retrieval System - SRS
• Result of a research project within EMBnet to interrogate all
the resources it has gathered together
• SRS is a network browser for DBs in molecular biology
• SRS allows any flat-file DB to be indexed to any other
• Queries run across a range of different DB types via a single
interface
• Independent of underlying data structures or query languages
http://srs.ebi.ac.uk/
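
The idea of indexing one flat-file DB to another can be sketched in a few lines of Python. This is only an illustration and not how SRS itself is implemented: the sample records, the entry names and the `parse_entries` and `build_link_index` helpers are all hypothetical, and the record layout (an `ID` line, `DR` cross-reference lines, `//` as terminator) is an assumption modelled on EMBL-style flat files.

```python
from collections import defaultdict

# Hypothetical EMBL-style flat-file text: "ID" starts an entry, "DR" lines
# hold cross-references to other databanks, and "//" ends the entry.
FLAT_FILE = """\
ID   ENTRY_A
DR   PROSITE; PS00001
DR   PDB; 1ABC
//
ID   ENTRY_B
DR   PROSITE; PS00001
//
"""

def parse_entries(text):
    """Yield (entry_id, [(databank, accession), ...]) for each entry."""
    entry_id, links = None, []
    for line in text.splitlines():
        if line.startswith("ID"):
            entry_id = line.split()[1]
        elif line.startswith("DR"):
            databank, accession = line[2:].strip().split("; ")
            links.append((databank, accession))
        elif line.startswith("//"):
            if entry_id is not None:
                yield entry_id, links
            entry_id, links = None, []

def build_link_index(text):
    """Map each (databank, accession) cross-reference to the entries citing it."""
    index = defaultdict(set)
    for entry_id, links in parse_entries(text):
        for link in links:
            index[link].add(entry_id)
    return index

index = build_link_index(FLAT_FILE)
print(sorted(index[("PROSITE", "PS00001")]))
```

With such an index, a single lookup answers "which sequence entries point at this PROSITE record?" without knowing how either databank stores its data.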
Sequence Retrieval System
Network Browser for Databanks in Molecular Biology

Data Bank        No. Entries  Indexing Date  Group            Availability
SWISSPROT        163235       10-Jun-2005    Sequence         ok
SWISSNEW         81134        22-Mar-2006    Sequence         ok
NRDB             2269647      29-Mar-2006    Sequence         ok
SWALL            3022528      22-Mar-2006    Sequence         ok
UNIPROT_SPROT    212425       22-Mar-2006    Sequence         ok
UNIPROT_TREMBL   2666963      23-Mar-2006    Sequence         ok
TREMBLNEW        624819       12-Dec-2005    Sequence         ok
TREMBL           2576118      04-Oct-2005    Sequence         ok
SPTREMBL         1449374      16-Jun-2005    Sequence         ok
SPTREMBLNEW      143140       17-Jun-2005    Sequence         ok
REMTREMBL        92182        20-Jun-2005    Sequence         ok
PIR              283416       16-Jun-2005    Sequence         ok
WORMPEP          19538        16-Jun-2005    Sequence         ok
DROSOPHILA       14100        16-Jun-2005    Sequence         ok
EMBLNEW          4035816      21-Nov-2005    Sequence         ok
EMBL             20343598     30-Dec-2005    Sequence         ok
EMBLEST          31990232     06-Jan-2006    Sequence         ok
EMBLWGS          11106060     24-Sep-2005    Sequence         ok
GENBANK          19233264     18-Nov-2005    Sequence         ok
GENBANKEST       31008556     23-Feb-2006    Sequence         ok
REFSEQP          8006         16-Jun-2005    Sequence         ok
SUBTILIST        1            16-Jun-2005    Sequence         ok
PROSITE          1935         22-Mar-2006    SeqRelated       ok
PROSITEDOC       1407         22-Mar-2006    SeqRelated       ok
BLOCKS           4034         16-Jun-2005    SeqRelated       ok
EPD              1375         16-Jun-2005    SeqRelated       ok
ENZYME           4173         16-Jun-2005    SeqRelated       ok
PRINTS           865          16-Jun-2005    SeqRelated       ok
TFSITE           4342         07-Apr-2003    TransFac         ok
TFFACTOR         1799         07-Apr-2003    TransFac         ok
TFCELL           816          07-Apr-2003    TransFac         ok
TFCLASS          27           07-Apr-2003    TransFac         ok
TFMATRIX         246          07-Apr-2003    TransFac         ok
TFGENE           1035         07-Apr-2003    TransFac         ok
PDB              34927        08-Feb-2006    Protein3DStruct  ok
DSSP             30832        22-Nov-2005    Protein3DStruct  ok
HSSP             30369        08-Feb-2006    Protein3DStruct  ok
PDBFINDER        35701        28-Mar-2006    Protein3DStruct  ok
NRL3D            6063         16-Jun-2005    Protein3DStruct  ok
FLYGENES         7556         16-Jun-2005    Genome           ok
FLYREFS          0            07-Apr-2003    Genome           ok
OMIM             17004        18-Oct-2005    Mutations        ok
REPTILIA         8364         18-Jan-2006    Others           ok
EMBnet - EMBOSS
 The European Molecular Biology Open Software Suite
 EMBOSS is a free Open Source software analysis package
specially developed for the needs of the molecular biology (e.g.
EMBnet) user community.
 The software automatically copes with data in a variety of
formats and even allows transparent retrieval of sequence data
from the web.
 Also, as extensive libraries are provided with the package, it is
a platform to allow other scientists to develop and release
software in true open source spirit.
 EMBOSS also integrates a range of currently available
packages and tools for sequence analysis into a seamless
whole.
What can EMBOSS do for
you?

 Within EMBOSS you will find hundreds of programs
(applications) covering areas such as:
• Sequence alignment,
• Rapid database searching with sequence patterns,
• Protein motif identification, including domain analysis,
• Nucleotide sequence pattern analysis,
• Codon usage analysis for small genomes,
• Rapid identification of sequence patterns in large scale
sequence sets,
• Presentation tools for publication,
and much more. Check:
http://emboss.sourceforge.net/
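
To make one of these areas concrete, codon usage analysis boils down to counting in-frame triplets of a coding sequence. The sketch below is plain Python, not EMBOSS code; the `codon_usage` function and the toy sequence are hypothetical and only illustrate the kind of computation involved.

```python
from collections import Counter

def codon_usage(cds):
    """Count in-frame codons of a coding sequence.

    A trailing partial codon (1 or 2 bases) is ignored.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    return Counter(cds[i:i + 3] for i in range(0, usable, 3))

# Hypothetical toy sequence: ATG GCT GCT AAA TAA
usage = codon_usage("ATGGCTGCTAAATAA")
for codon, count in sorted(usage.items()):
    print(codon, count)
```

Real tools additionally normalise these counts per amino acid and per genome, which is what makes them useful for comparing organisms.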

NCBI – National Center For
Biotechnology Information
 Leading American information provider
 Established in 1988 as a division of the National Library of
Medicine (NLM)
• Located on the campus of the National Institutes of Health
(NIH – Bethesda, Maryland)

Mission:
 Development of new information technologies to aid our
understanding of the molecular and genetic processes that
underlie health and disease
 Creation of systems for storing and analysing biological
information
 Development of advanced methods of computer-based
information processing
 Facilitation of user access to DBs and software
 Co-ordination of efforts to gather biotechnology information
worldwide
NCBI
 Since 1992: maintenance of GenBank and collaboration
with the international nucleotide DBs EMBL and DDBJ
(Japan)
 Provides Entrez, which facilitates access to biological
DBs (similar to SRS, which is provided by EMBnet)
 gquery (https://www.ncbi.nlm.nih.gov/gquery/)
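
Entrez can also be queried programmatically. As a hedged sketch, the snippet below only builds a search URL for NCBI's E-utilities `esearch` endpoint; E-utilities is not mentioned on the slides, and the `esearch_url` helper and the example query are assumptions made for illustration. No network request is sent.

```python
from urllib.parse import urlencode

# Public E-utilities search endpoint (an assumption: not named on the slides).
EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

def esearch_url(db, term, retmax=20):
    """Build an esearch URL for an Entrez database and query term."""
    params = {"db": db, "term": term, "retmax": retmax}
    return f"{EUTILS_ESEARCH}?{urlencode(params)}"

# Hypothetical query: nucleotide records for a human gene.
url = esearch_url("nucleotide", "BRCA1[Gene Name] AND Homo sapiens[Organism]")
print(url)
```

Fetching this URL returns the identifiers of matching records, which is roughly what the Entrez web interface does behind a search box.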
NCBI - Responsibilities
 administers research on biomedical problems at the molecular
level using mathematical and computational methods
 maintains collaborations with several NIH (National Institutes of
Health) institutes, academia, industry, and other governmental
agencies
 promotes scientific communication by sponsoring meetings,
workshops, and lecture series
 supports training on basic and applied research in
computational biology for postdoctoral fellows through the NIH
Intramural Research Program
 engages members of the international scientific community in
informatics research and training through the Scientific Visitors
Program
 develops, distributes, supports, and coordinates access to a
variety of databases and software for the scientific and medical
communities
 develops and promotes standards for databases, data
deposition and exchange, and biological nomenclature
Nucleic Acid Sequence Databases
• The principal nucleic acid sequence databases are GenBank,
EMBL and DDBJ, which each collect a portion of the total sequence
data reported world-wide, and exchange new and updated entries
on a daily basis

Nucleic acid sequence Databases

EMBL (Europe)
GenBank (USA)
DDBJ (Japan)
ENSEMBL (project between EMBL - EBI and the Sanger Institute)
dbEST (division of GenBank)
GSDB (division of GenBank)
EMBL
source: http://www3.ebi.ac.uk/Services/DBStats/

Nucleic Acid Sequence Databases - EMBL

This week the EMBL Database contained 301,588,430,608 nucleotides in
199,575,971 entries
Breakdown by entry type:

Entry Type                     Entries        Nucleotides
Standard                       128,262,666    120,603,334,814
Constructed (CON)              6,381,010      225,047,233,405
Third Party Annotation (TPA)   6,894          385,832,010
Whole Genome Shotgun (WGS)     64,925,118     180,599,264,067
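
A quick way to read the breakdown is to divide nucleotides by entries for each type. The sketch below does only that arithmetic on the figures quoted above; the `mean_entry_length` helper is a hypothetical name introduced for illustration.

```python
# (entries, nucleotides) per entry type, copied from the breakdown above.
EMBL_BREAKDOWN = {
    "Standard": (128_262_666, 120_603_334_814),
    "Constructed (CON)": (6_381_010, 225_047_233_405),
    "Third Party Annotation (TPA)": (6_894, 385_832_010),
    "Whole Genome Shotgun (WGS)": (64_925_118, 180_599_264_067),
}

def mean_entry_length(breakdown):
    """Return the mean number of nucleotides per entry for each entry type."""
    return {name: nt / entries for name, (entries, nt) in breakdown.items()}

means = mean_entry_length(EMBL_BREAKDOWN)
for name, mean in means.items():
    print(f"{name}: {mean:,.0f} nucleotides per entry")
```

On these figures, constructed entries come out far longer on average than standard entries.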
The EMBL Nucleotide Sequence Database (also known as EMBL-Bank)
constitutes Europe's primary nucleotide sequence resource. Main sources
for DNA and RNA sequences are direct submissions from individual
researchers, genome sequencing projects and patent applications. The
database is produced in an international collaboration with GenBank (USA)
and the DNA Database of Japan (DDBJ). Each of the three groups collects a
portion of the total sequence data reported worldwide, and all new and
updated database entries are exchanged between the groups on a daily
basis.
Nucleic Acid Sequence Databases - EMBL
[Figure: number of entries (current 69,666,551) and total nucleotides (current 127,450,085,130)]

Ref: EMBL Nucleotide Sequence Database: developments in 2005,
Nucleic Acids Research, 2006, Vol. 34, D10–D15
Nucleic Acid Sequence Databases - EMBL
By nucleotide count
[Figure: share of nucleotides by organism: Homo sapiens, Mus musculus, Rattus norvegicus, Pan troglodytes, Bos taurus, Canis familiaris, Monodelphis domestica, Danio rerio, Macaca mulatta, Loxodonta africana, Other]
Nucleic Acid Sequence
Databases – GenBank
 GenBank, which is produced at NCBI, is split
into smaller, discrete divisions.
 This facilitates fast, specific searches by
restricting queries to particular database
subsets.
 During 1992-1997, the level of EST and STS
data within GenBank grew 10-fold.
 Even so, the overall sequence information contributed
by such partial data was still less than that of
higher quality sequences in the other major
divisions.
Specialised Genomic Resources

 In addition to the comprehensive DNA sequence DBs, there
is a variety of more specialised genomic resources.
 These so-called boutique DBs bring focus to species-specific
genomics and to particular sequencing techniques.

Specialised Genomic Resources

SGD – Saccharomyces Genome Database
UniGene - gene-oriented clusters from GenBank
TIGR - Databases of The Institute for Genomic
Research
ACeDB – A C. elegans DataBase
Specialised Genomic Databases
 SGD
http://www.yeastgenome.org/ (baker's yeast)
 AceDB
http://www.acedb.org (C. elegans)
 FlyBase
http://flybase.org/ (fruit fly)
 MGD
http://www.informatics.jax.org (mouse)
Protein Information Resources

Levels of protein sequence and structural organisation:

primary: The primary structure of a protein is its amino acid sequence.

secondary: The secondary structure of a protein corresponds to regions of
local regularity (e.g., α-helices and β-strands).

tertiary: The tertiary structure of a protein arises from the packing
of its secondary structure elements, which may form
discrete domains within a fold.
Protein Information Resources
Levels of protein sequence and structural organisation:

Level      Feature         Database            Example
primary    sequence        primary database    AVILDRYFH
secondary  motif           secondary database  [AS]-[IL]2-X[DE]-R-[FYW]2-H
tertiary   domain, module  structure database  (shown graphically in the original figure)
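
The motif row can be read as a regular expression. The sketch below is a hand translation under stated assumptions: a trailing `2` is taken to mean "repeated twice" and `X` to mean "any residue". The `find_motif` helper and the test sequence are hypothetical.

```python
import re

# Hand translation of [AS]-[IL]2-X[DE]-R-[FYW]2-H, assuming "2" means
# "repeated twice" and "X" means "any residue".
MOTIF = re.compile(r"[AS][IL]{2}.[DE]R[FYW]{2}H")

def find_motif(sequence):
    """Return the 0-based start positions of all motif matches."""
    return [match.start() for match in MOTIF.finditer(sequence)]

# Hypothetical sequence containing one occurrence (AILGDRYFH at index 2).
hits = find_motif("MKAILGDRYFHTT")
print(hits)
```

Under this reading the primary-sequence example AVILDRYFH is not matched, because its second residue is V rather than I or L; the rows of the figure appear to be independent illustrations rather than a worked match.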
Primary Protein Databases

• The primary structure of a protein is its amino acid sequence

• these are stored in primary databases as linear alphabets that
denote the constituent residues

Protein sequence Databases

SWISS-PROT - Protein knowledgebase
TrEMBL - Computer-annotated supplement to Swiss-Prot
PIR – Protein Information Resource
MIPS – Munich Information Centre for Protein Sequences
NRL-3D - produced by PIR
Protein Sequence Databases
 Swiss-Prot contains 197,228 sequence entries, comprising
71,501,181 amino acids abstracted from 135,257 references
 Total number of species represented in Swiss-Prot: 9,520
 The average sequence length in Swiss-Prot is 362 amino acids.
 Swiss-Prot is the most highly annotated protein sequence DB
 http://expasy.org/sprot/

Table of the most represented species
No.  Frequency  Species
1    13049      Homo sapiens (Human)
2    10132      Mus musculus (Mouse)
3    5189       Saccharomyces cerevisiae (Baker's yeast)
4    4847       Escherichia coli
5    4669       Rattus norvegicus (Rat)
6    3665       Arabidopsis thaliana (Mouse-ear cress)
8    2863       Schizosaccharomyces pombe (Fission yeast)
7    2814       Bacillus subtilis
9    2750       Caenorhabditis elegans
10   2286       Drosophila melanogaster (Fruit fly)
Composite Protein Sequence
Databases
 Composite databases amalgamate a variety of
different primary databases
 They render sequence searching much more
efficient, because they obviate the need to
interrogate multiple resources
 Different composite databases use different
primary sources and different redundancy
criteria in their amalgamation procedures
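
How a composite database might apply a redundancy criterion can be sketched as follows. This is a minimal illustration assuming the simplest possible criterion (identical sequences are stored once, with source order setting priority); the `build_composite` function, accessions and sequences are hypothetical, and real composite databases use their own, more elaborate rules.

```python
def build_composite(sources):
    """Merge records from several primary databases.

    `sources` is a list of (database_name, [(accession, sequence), ...]).
    For each distinct sequence only the first record seen is kept, so the
    order of `sources` sets the priority between databases.
    """
    first_seen = {}
    for database, records in sources:
        for accession, sequence in records:
            first_seen.setdefault(sequence, (database, accession))
    return [(db, acc, seq) for seq, (db, acc) in first_seen.items()]

# Hypothetical toy records; one sequence appears in both sources.
sources = [
    ("SWISS-PROT", [("SP1", "MKTAYIAK"), ("SP2", "MSLLTEVE")]),
    ("PIR", [("PIR1", "MKTAYIAK"), ("PIR2", "MADEEKLP")]),
]
composite = build_composite(sources)
for record in composite:
    print(record)
```

Changing the order of `sources`, or replacing exact identity with a similarity threshold, yields a different composite from the same inputs, which is why composites built from similar sources can still differ.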
Composite Protein Sequence
Databases

Composite DB                   Primary sources
NRDB (Non-Redundant DB)        PDB, SWISS-PROT, PIR, GenPept, SWISS-PROTupdate, GenPeptupdate
OWL                            SWISS-PROT, PIR, GenBank, NRL-3D
MIPSX                          PIR1-4, MIPSOwn, MIPSTrn, MIPSH, PIRMOD, NRL-3D, SWISS-PROT, EMTrans, GBTrans, Kabat, PseqIP
SP+TrEMBL (SwissProt, TrEMBL)  SWISS-PROT, TrEMBL
Secondary databases
 Secondary databases contain pattern data, i.e., diagnostic
signatures for protein families. These signatures encode the
most highly conserved features of multiply aligned sequences,
which are often crucial to the structure or function of the protein.
 The secondary structure of a protein corresponds to regions of
local regularity (e.g., α-helices and β-strands), which, in sequence
alignments, are often apparent as well-conserved motifs.
 Patterns are regular expressions, fingerprints, blocks, profiles,
etc.
Secondary databases

Secondary DB  Primary source   Stored information
PROSITE       SWISS-PROT       Regular expressions (patterns)
Profiles      SWISS-PROT       Weighted matrices (profiles)
PRINTS        OWL              Aligned motifs (fingerprints)
BLOCKS        PROSITE/PRINTS   Aligned motifs (blocks)
IDENTIFY      BLOCKS/PRINTS    Fuzzy regular expressions (patterns)
Secondary databases

 TRANSFAC
http://transfac.gbf.de
 EPD
http://www.epd.isb-sib.ch
 InterPro
http://www.ebi.ac.uk/interpro/
 PROSITE
http://www.expasy.ch/prosite
 BLOCKS
http://blocks.fhcrc.org
 PRINTS
ftp://ftp.seqnet.dl.ac.uk/pub/database/prints
 PFAM
http://www.sanger.ac.uk/Software/Pfam/index.shtml
 ProDom
http://www.toulouse.inra.fr/prodom.html
 GeneCards
http://bioinformatics.weizmann.ac.il/cards
 ENSEMBL
http://www.ensembl.org
 EcoCyc
http://ecocyc.panbio.com/ecocyc/ecocyc.html
Secondary databases
 There is some overlap in content between the secondary
databases
 PDBsum alone has 35,291 entries
 Pattern DB growth is slow because the addition of
detailed family annotation is very time consuming.
 PROSITE and PRINTS are the only comprehensively,
manually annotated secondary DBs
 To address the annotation bottleneck, the secondary
database curators have together created a unified
database of protein families known as InterPro
Structure Classification DBs
 Contain 3D structures available from
crystallographic and spectroscopic studies

Structure Classification Databases

PDBsum – Protein Data Bank
CATH – Class, Architecture, Topology, Homology
SCOP – Structural Classification of Proteins
Structure Classification DBs
 PDB
http://www.rcsb.org
 SCOP
http://scop.mrc-lmb.cam.ac.uk/scop
 CATH
http://www.cathdb.info/
 DSSP
http://swift.cmbi.ru.nl/gv/dssp/
 FSSP
http://www.ebi.ac.uk/dali/fssp
 HSSP
http://swift.cmbi.kun.nl/swift/hssp/
Metabolic Databases

A number of metabolic databases are available electronically,
some with features for querying and visualizing metabolic
pathways and regulatory networks.

 KEGG (Kyoto Encyclopedia of Genes and Genomes)
http://www.genome.ad.jp/kegg
 ENZYME (Enzyme nomenclature database)
http://www.expasy.ch/enzyme
 BRENDA (Enzyme Information System)
http://www.brenda-enzymes.org/
 EMP (Enzymes and Metabolic Pathways database)
http://www.metacyc.org/
Mapping Databases

 OMIM
http://www.ncbi.nlm.nih.gov/omim

 GDB (The GDB Human Genome Data Base: a source of
integrated genetic mapping and disease data)
http://morissardjerome.free.fr/infobiogen/www.gdb.org/gdb/
Databases concerning
Mutations

 dbSNP
http://www.ncbi.nlm.nih.gov/SNP

 The SNP Consortium (TSC)
http://snp.cshl.org

 http://www4a.biotec.or.th/PASNP
Literature Databases

 PubMed
http://www.ncbi.nlm.nih.gov/entrez/query

 Bioinformatics Online
http://www.bioinformatics.oupjournals.org

 Nature
http://www.nature.com

 Science
http://www.sciencemag.org
In 2003 scientists in the Human Genome
Project obtained the DNA sequence of the 3
billion base pairs making up the human
genome
Sequencing the Human Genome:
A Landmark in the History of Mankind
What we’ve learned so far from
the Human Genome Project

The human genome is nearly the same (99.9%) in all people

Only about 2% of the human genome contains genes, which are the
instructions for making proteins
Other Lessons from the
Human Genome Project

Humans have an estimated 30,000 genes; the functions of more than
half of them are unknown

Almost half of all human proteins share similarities with those of
other organisms, underscoring the unity of life
Sequence Alignment Logic
Evaluation of the alignment is a biological concept (significance)
Are you ready for the revolution?

If biologists do not adapt to the powerful computational tools
needed to exploit huge data sets, says Declan Butler, they could find
themselves floundering in the wake of advances in genomics.
Need to understand better from Human Genome sequence:

•Gene number, exact locations, and functions

•Gene regulation
•DNA sequence organization
•Chromosomal structure and organization
•Noncoding DNA types, amount, distribution, information content, and
functions
•Coordination of gene expression, protein synthesis, and post-translational
events
•Interaction of proteins in complex molecular machines
•Predicted vs experimentally determined gene function
•Evolutionary conservation among organisms
•Protein conservation (structure and function)
•Proteomes (total protein content and function) in organisms
•Correlation of SNPs (single-base DNA variations among individuals) with
health and disease
•Disease-susceptibility prediction based on gene sequence variation
•Genes involved in complex traits and multigene diseases
•Complex systems biology including microbial consortia useful for
environmental restoration
•Developmental genetics, genomics
Fast Forward to 2020: What to Expect in Molecular Medicine?

 Docs to tailor Dose by your Smart Card
 Personalized Medicine
 More Effective Pharmaceuticals
 Genetic Testing, Therapy
 Societal Implications
 Understanding Life
 Challenges
Outlook – coming lecture
 Introduction to sequence alignment
 Pairwise sequence alignment
• The Dot Matrix
• Dynamic Programming
• Scoring Matrices
 Local alignment
 Alignment tools
• BLAST
• FASTA