Lecture 1 and 2 Introduction

The document provides an introduction to bioinformatics, highlighting its integration of biology, computer science, and information technology to solve biological problems. It covers various related fields such as genomics, proteomics, and medical informatics, as well as the history and development of bioinformatics and biological databases. Additionally, it details the types of biological databases, their architecture, and examples of important databases for genomic and protein data.

Uploaded by

Its Zainu


Introduction to Bioinformatics & Biological Databases
Ishtiaq Ahmad
IIB GCU Lahore
What Is Bioinformatics?
• Bioinformatics is the unified discipline formed
from the combination of biology, computer
science, and information technology.
• "The mathematical, statistical and computing
methods that aim to solve biological problems
using DNA and amino acid sequences and
related information." – Fredj Tekaia
A Molecular Alphabet
• Macromolecules are polymers of monomers
• All monomers belong to the same general class,
but there are several types with distinct and well-
defined characteristics
• Many monomers can be joined to form a single,
large macromolecule; the ordering of monomers
in the macromolecule encodes information, just
like the letters of an alphabet
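As a rough illustration of the alphabet analogy, a DNA molecule can be written as a string over the four-letter alphabet A, C, G, T. The sketch below checks that a string uses only those letters and counts each monomer. The names `DNA_ALPHABET`, `is_valid_dna` and `composition` are illustrative assumptions, not part of the lecture.

```python
from collections import Counter

# The four DNA nucleotides, written as one-letter codes.
DNA_ALPHABET = frozenset("ACGT")


def is_valid_dna(seq: str) -> bool:
    """Return True if seq is non-empty and uses only DNA letters."""
    return len(seq) > 0 and set(seq.upper()) <= DNA_ALPHABET


def composition(seq: str) -> dict:
    """Count how often each of the four monomers occurs in seq."""
    counts = Counter(seq.upper())
    return {base: counts.get(base, 0) for base in "ACGT"}


example = "ATGGCCATTA"  # made-up sequence, for illustration only
print(is_valid_dna(example))
print(composition(example))
```

The same idea carries over to proteins, where the alphabet has 20 standard amino acid letters instead of four.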
Related Fields:
Computational Biology
• The study and application of computing
methods for biological data
• Primarily concerned with the computation
of data related to evolutionary, population
and theoretical biology aspects.
Related Fields:
Medical Informatics
• The study and application of computing
methods to improve communication,
understanding, and management of
medical data
• Generally concerned with how the data is
manipulated rather than the data itself.
Related Fields:
Cheminformatics
• The study and application of computing
methods, along with chemical and
biological technology, for drug design and
development
Related Fields:
Genomics
• Analysis and comparison of the entire
genome of a single species or of multiple
species
• A genome is the complete set of genetic material
(all genes and non-coding sequences) possessed by an organism
• Genomics existed before any genomes
were completely sequenced, but in a very
primitive state
Related Fields:
Proteomics
• Study of how the genome is expressed in
proteins, and of how these proteins
function and interact
• Concerned with the actual states of
specific cells, rather than the potential
states described by the genome
Related Fields:
Pharmacogenomics
• The application of genomic methods to
identify drug targets
• For example, searching entire genomes
for potential drug receptors, or studying
gene expression patterns in tumors
Related Fields:
Pharmacogenetics
• The use of genomic methods to determine
what causes variations in individual
response to drug treatments
• The goal is to identify drugs that may
only be effective for subsets of patients, or
to tailor drugs for specific individuals or
groups
History of Bioinformatics
• Genetics
• Computers and Computer Science
• Bioinformatics
History of Genetics
• Gregor Mendel
• Chromosomes
• DNA
History of Chromosomes
• Walter Flemming
• August Weismann
• Theodor Boveri
• Walter S. Sutton
• Thomas Hunt Morgan
History of Computers
Computer Timeline
• ~1000BC The abacus
• 1621 The slide rule invented
• 1625 Wilhelm Schickard's mechanical calculator
• 1822 Charles Babbage's Difference Engine
• 1926 First patent for a semiconductor transistor
• 1937 Alan Turing invents the Turing Machine
• 1939 Atanasoff-Berry Computer created at Iowa State
– the world's first electronic digital computer
• 1939 to 1944 Howard Aiken's Harvard Mark I (the IBM ASCC)
• 1940 Konrad Zuse's Z2 uses telephone relays instead of mechanical logical
circuits
• 1943 Colossus - British vacuum tube computer
• 1944 Grace Hopper, Mark I Programmer (Harvard Mark I)
• 1945 First Computer "Bug", Vannevar Bush "As we may think"
Computer Timeline (cont.)
• 1948 to 1951 The first commercial computer – UNIVAC
• 1952 G.W.A. Dummer conceives integrated circuits
• 1954 FORTRAN language developed by John Backus (IBM)
• 1955 First disk storage (IBM)
• 1958 First integrated circuit
• 1963 Mouse invented by Douglas Engelbart
• 1964 BASIC (Beginner's All-purpose Symbolic Instruction Code) was written at Dartmouth
College by mathematicians John George Kemeny and Thomas Kurtz as a teaching tool for undergraduates
• 1969 UNIX OS developed by Kenneth Thompson
• 1970 First static and dynamic RAMs
• 1971 First microprocessor: the 4004
• 1972 C language created by Dennis Ritchie
• 1975 Microsoft founded by Bill Gates and Paul Allen
• 1976 Apple I microcomputer released; the Apple II followed in 1977
• 1981 First IBM PC with DOS
• 1985 Microsoft Windows introduced
• 1985 C++ language introduced
• 1993 Pentium processor
• 1993 First PDA
• 1995 Java introduced by James Gosling
• 2000 C# language introduced
Putting it all Together
• Bioinformatics is where the findings of genetics
and advances in computing meet: computers
help advance genetics.
• Depending on the definition of Bioinformatics used, or
the source, it can be anywhere between 30 and 55 years
old
– Bioinformatics-like studies were being performed in
the ’60s, long before the field was given a name
• Sometimes called “molecular evolution”
– The term Bioinformatics was first published in 1991
Genomics
• Classic Genomics
• Post Genomic era
– Comparative Genomics
– Functional Genomics
– Structural Genomics
What is Genomics?
• Genome
– complete set of genetic instructions for
making an organism
• Genomics
– any attempt to analyze or compare the entire
genetic complement of a species
– Early genomics was mostly recording genome
sequences
History of Genomics
• 1995
– Haemophilus influenzae genome sequenced (bacterium, 1.8 Mb)
• 1996
– Saccharomyces cerevisiae (baker's yeast, 12.1 Mb)
• 1997
– E. coli (4.7 Mbp)
• 2000
– Pseudomonas aeruginosa (6.3 Mbp)
– A. thaliana genome (100 Mb)
– D. melanogaster genome (180Mb)
2001 The Big One
• The Human Genome sequence is
published
– 3 Gb
What next?
• Post Genomic era
– Comparative Genomics
– Functional Genomics
– Structural Genomics
Comparative Genomics
• the management and analysis of the
millions of data points that result from
Genomics
– Sorting out the mess

Comparative genomics involves the management and analysis of the
vast amounts of data resulting from genomics studies.
Functional Genomics
• Large-scale, more direct ways of
identifying gene functions and
associations
– for example, yeast two-hybrid methods
Functional genomics aims to directly identify the
functions and associations of genes within a
genome. It involves large-scale methods for
studying gene functions, interactions, and
regulatory mechanisms.
Structural Genomics
• emphasizes high-throughput, whole-
genome analysis.
– outlines the current state
– future plans of structural genomics efforts
around the world and describes the possible
benefits of this research
Structural genomics emphasizes high-throughput
analysis of the 3D structures of biomolecules, such
as proteins and nucleic acids, at a genome-wide
scale. It seeks to determine the structures of all the
proteins encoded by an organism's genome.
What Is Proteomics?
• Proteomics is the study of the proteome—
the “PROTEin complement of the
genOME”
• More specifically, "the qualitative and
quantitative comparison of proteomes
under different conditions to further
unravel biological processes"
What Makes Proteomics
Important?
• A cell’s DNA—its genome—describes a
blueprint for the cell’s potential, all the
possible forms that it could conceivably
take. It does not describe the cell’s actual,
current form, in the same way that the
source code of a computer program does
not tell us what input a particular user is
currently giving his copy of that program.
What Makes Proteomics
Important?
• All cells in an organism contain the same DNA.
• This DNA encodes every possible cell type in
that organism—muscle, bone, nerve, skin, etc.
• If we want to know about the type and state of a
particular cell, the DNA does not help us, in the
same way that knowing what language a
computer program was written in tells us nothing
about what the program does.
Biological Databases
• Biological databases are collections of
biological data, organized and annotated so
that the data can be reused for research purposes.
• The data in biological databases can come
from sophisticated experimental results,
published literature or computational
analyses related to taxonomy,
phylogeny, genomics, proteomics, microarray
gene expression, etc.
Basic Components of Biological
Database Architecture
Biological database design, development
and management are core areas of
bioinformatics. They require the following:
• Relational database management system
(RDBMS) programs from computer science.
• Information retrieval systems from digital
libraries.
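To make the RDBMS component concrete, here is a minimal sketch using Python's built-in `sqlite3` module with an in-memory database. The schema (tables `organism` and `seq_record`), the `DEMO` accessions and the sequences are invented for illustration; real biological databases use far richer schemas.

```python
import sqlite3

# In-memory database: nothing is written to disk.
conn = sqlite3.connect(":memory:")

# Two related tables: each sequence record points at one organism.
conn.execute(
    """
    CREATE TABLE organism (
        organism_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL UNIQUE
    )
    """
)
conn.execute(
    """
    CREATE TABLE seq_record (
        accession   TEXT PRIMARY KEY,
        organism_id INTEGER NOT NULL REFERENCES organism(organism_id),
        description TEXT,
        residues    TEXT NOT NULL
    )
    """
)

# Hypothetical example rows.
conn.execute("INSERT INTO organism (organism_id, name) VALUES (1, 'Escherichia coli')")
conn.executemany(
    "INSERT INTO seq_record VALUES (?, ?, ?, ?)",
    [
        ("DEMO0001", 1, "made-up example gene", "ATGGCCATTA"),
        ("DEMO0002", 1, "another made-up gene", "ATGTTTGGA"),
    ],
)


def sequences_for(organism_name):
    """Return (accession, sequence length) pairs for one organism."""
    rows = conn.execute(
        """
        SELECT s.accession, LENGTH(s.residues)
        FROM seq_record AS s
        JOIN organism AS o ON o.organism_id = s.organism_id
        WHERE o.name = ?
        ORDER BY s.accession
        """,
        (organism_name,),
    )
    return rows.fetchall()


print(sequences_for("Escherichia coli"))
```

Keeping organisms in their own table and joining on a key is the usual relational design choice: the organism name is stored once, and every sequence record refers to it.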
Information in Biological Databases

The information contained in different
biological databases may be
• A gene or protein sequence (e.g. in SwissProt,
GenBank etc.)
• Descriptions in text form.
• Ontological classification
• Citation record
• Tables
Data Formats of Biological
Databases

The majority of them contain semi-structured
data in the form of text descriptions
• Tabular data.
• Tab- or space-delimited data records.
• XML (Extensible Markup Language) data format.
• Cross-references to other databases.
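The formats listed above can be read with the Python standard library. The sketch below parses one tab-delimited table and one small XML record that carries a cross-reference. The field names, the `DEMO0001` accession and the `OtherDB` cross-reference are all made up for illustration; each real database defines its own layout.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical tab-delimited records with a header row.
TSV_TEXT = (
    "accession\torganism\tlength\n"
    "DEMO0001\tEscherichia coli\t10\n"
    "DEMO0002\tMus musculus\t9\n"
)

# Hypothetical XML record with a cross-reference to another database.
XML_TEXT = """<records>
  <record accession="DEMO0001">
    <organism>Escherichia coli</organism>
    <xref db="OtherDB" id="X123"/>
  </record>
</records>"""


def parse_tsv(text):
    """Turn tab-delimited text into a list of dicts keyed by column name."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return [dict(row) for row in reader]


def parse_xml(text):
    """Extract accession, organism and cross-references from each record."""
    root = ET.fromstring(text)
    records = []
    for rec in root.findall("record"):
        records.append(
            {
                "accession": rec.get("accession"),
                "organism": rec.findtext("organism"),
                "xrefs": [(x.get("db"), x.get("id")) for x in rec.findall("xref")],
            }
        )
    return records


print(parse_tsv(TSV_TEXT))
print(parse_xml(XML_TEXT))
```

Note that the delimited parser returns every value as a string, so numeric columns such as `length` need an explicit conversion before use.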
Primary Sequence Databases
• Genome sequence
- Nucleotide sequence of gene(s)
- DNA and RNA

• Proteome sequence
- Amino acid sequence of proteins
expressed or derived from the gene
sequences
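How an amino acid sequence is derived from a gene sequence can be sketched with the standard genetic code. The code below builds the 64-codon table and translates a DNA coding sequence, reading from the first base and stopping at the first stop codon. The function name `translate` and the example sequence are illustrative assumptions; real genes also involve reading frames, introns and alternative genetic codes, which this sketch ignores.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a coding sequence into one-letter amino acid codes.

    Reads codons from position 0 and stops at the first stop codon ('*').
    Trailing bases that do not fill a codon are ignored. Letters outside
    A, C, G, T/U raise KeyError.
    """
    dna = dna.upper().replace("U", "T")  # accept RNA input as well
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


print(translate("ATGGCCTAA"))
```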
Genome Databases
• Collect, organize, annotate, analyze and
manage the whole genome sequence of
single or different organisms.
Examples: Corn (MaizeGDB), a database of the maize genome;
Ensembl, a database of human, mouse, other
vertebrate and eukaryote genomes
• These databases are publicly accessible
Important Genome Databases
• Corn: Maize genome www.maizegdb.org
• ERIC (Enteropathogen Resource Integration Center): Enteropathogen genomes www.ericbrc.org
• National Microbial Pathogen Data Resource www.nmpdr.org
• JGI Genomes (Joint Genome Institute): Eukaryote and microbial genomes
http://genome.jgi.doe.gov/
• MGI (Mouse Genome Informatics): Mouse genome www.informatics.jax.org
• Wormbase: C. elegans genome
• Flybase: Genome of fruit fly
• Saccharomyces Genome Database: Genome of yeast model organism
• Ensembl: Human, mouse, other vertebrates and eukaryotic genome
database www.ensembl.org
• TAIR (The Arabidopsis Information Resource): Arabidopsis http://arabidopsis.org
Nucleotide (Gene) Sequence Databases

• DDBJ: DNA Data Bank of Japan
http://www.ddbj.nig.ac.jp/Welcome-e.html
• EMBL Nucleotide DB: European Molecular Biology
Laboratory http://www.ebi.ac.uk/embl/index.html
• GenBank: National Center for Biotechnology Information (NCBI)
www.ncbi.nlm.nih.gov/genbank/
Protein Sequence Databases
Protein sequences have been stored in
different databases as annotations
containing general and specific details
about different aspects of protein
properties and features along with
sequence details of each protein.
List of Protein Sequence Databases
• UniProt: http://www.ebi.ac.uk/, http://expasy.org
• PIR: http://www-nbrf.georgetown.edu/pir/searchdb.html
• SwissProt: http://expasy.org
• PROSITE: Database of Protein Families and Domains
www.expasy.org/prosite
• DIP: Database of Interacting Proteins sequences and
structures http://dip.doe-mbi.ucla.edu/
• Pfam: Protein families database of alignments and
HMMs http://www.sanger.ac.uk/Software/Pfam
• ProDom: Comprehensive set of Protein Domain Families
http://protein.toulouse.inra.fr/prodom/current/html/home.php
Protein Structure Databases
• Protein Data Bank (PDB) www.rcsb.org
• CATH (Class, Architecture, Topology,
Homologous super-family): Protein structure
classification www.cathdb.info
• SCOP: Structural Classification of Proteins
http://scop.mrc-lmb.cam.ac.uk/scop/
• PDBe: www.ebi.ac.uk/pdbe/
• SWISS-MODEL: A server and collection of
protein structures from PDB acting as templates
http://swissmodel.expasy.org//SWISS-MODEL.html
• ModBase: A database of comparative structure
Models of proteins http://salilab.org/modbase
Protein-Protein Interaction Databases

• STRING: A database of experimental &
predicted protein-protein interactions
http://string.embl.de/
• DIP: Database of Interacting Proteins
sequences and structures http://dip.doe-mbi.ucla.edu/
• BIND: A database of biomolecular
interaction network www.bind.ca
Metabolic Pathway Databases
• BioCyc: A collection of 3563
Pathway/Genome Databases
with tools for understanding their
data http://biocyc.org/
• KEGG: Kyoto Encyclopedia of Genes and Genomes
• MANET Molecular Ancestry Networks
• Reactome
Microarray-Gene Expression Databases

• ArrayExpress (EBI)
• Gene Expression Omnibus (NCBI)
• maxd (Univ. of Manchester)
• SMD (Stanford University)
• GPX (Scottish Centre for Genomic
Technology and Informatics)
Mathematical Model Databases

• CellML: http://www.cellml.org/models

• Biomodels: http://www.ebi.ac.uk/biomodels/
PCR Primer Databases

• PathoOligoDB: A free QPCR oligo
database for pathogens
Meta-Databases
A meta-database is a database source or platform
that hosts several other databases and
presents their data in a new, simpler and
unified form, or that gathers the information
about a particular gene or protein together
with its implications for a specific disease.
Entrez is one example of a meta-database.
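The idea of a unified view over several sources can be sketched in a few lines. In the example below, the two source "databases" are plain dictionaries, and `unified_view` collects whatever each source knows about one gene. The source names, the gene identifiers `GENE1` and `GENE2`, and the stored fields are all hypothetical; this is not how Entrez or any other listed meta-database is actually implemented.

```python
# Hypothetical source databases, each keyed by gene identifier.
SEQUENCE_DB = {"GENE1": {"length": 1200}}
LITERATURE_DB = {"GENE1": {"citations": 3}, "GENE2": {"citations": 1}}

SOURCES = {"sequence": SEQUENCE_DB, "literature": LITERATURE_DB}


def unified_view(gene_id, sources):
    """Collect the records for gene_id from every source that has one."""
    merged = {"gene_id": gene_id}
    for source_name, records in sources.items():
        if gene_id in records:
            merged[source_name] = records[gene_id]
    return merged


print(unified_view("GENE1", SOURCES))
print(unified_view("GENE2", SOURCES))
```

Keeping each source's data under its own key, rather than flattening everything into one record, preserves where each piece of information came from.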
Major Meta Databases
• Entrez
• euGenes
• GeneCards
• SOURCE
• mGen
• Bioinformatic Harvester
• MetaBase
Questions and Answers
