Bioinformatics

The document provides definitions and explanations of various biological concepts and tools, including KEGG, ORF, and Sanger sequencing. It discusses the importance of databases like STRING and PDB for protein analysis, as well as the functionalities of different BLAST types for sequence alignment. Additionally, it covers RNA and DNA chemistry, secondary structures, and the significance of NGS in modern biological research.

Uploaded by

Tamanna Jena

4. Define KEGG? What is BRITE in KEGG database?

Ans: KEGG (Kyoto Encyclopedia of Genes and Genomes) is a collection of databases
dealing with genomes, biological pathways, diseases, drugs, and chemical substances. KEGG
is utilized for bioinformatics research and education, including data analysis in genomics,
metagenomics, metabolomics and other omics studies, modeling and simulation in systems
biology, and translational research in drug development.
The KEGG BRITE database is a collection of BRITE hierarchy files,
called htext (hierarchical text) files, with additional files for binary relations.

5.Book page 98 and 105.

6. Define ORF with a valid explanation? Which codons are called initiation and
termination codons?
Ans: An open reading frame (ORF) is a portion of a DNA sequence that begins with a start
codon and continues in the same reading frame, uninterrupted by a stop codon, until the first
stop codon (which functions as a stop signal). Potential coding regions can be detected by
looking for ORFs:
– In a single reading frame, a genome of length n comprises about (n/3) codons
– Stop codons (TAA, TAG or TGA) break the genome into segments
between consecutive stop codons
– The subsegments of these that start from the start codon (ATG) are ORFs
– ORFs in different frames may overlap
A start (initiation) codon interacts with initiation factors or nearby sequences to initiate the
translation process, whereas a stop (termination) codon alone is sufficient to trigger
termination. The standard start codon is AUG. The standard stop codons are UAG, UGA and UAA.
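
The ORF-detection steps listed above can be sketched in a few lines of Python. This is only a minimal illustration, not a production gene finder: it scans the three forward reading frames of a DNA string, and the function name `find_orfs` and the `min_codons` parameter are assumptions introduced for this example. The reverse strand is not scanned.

```python
# Minimal ORF scan over the three forward reading frames (illustrative sketch).
STOP_CODONS = {"TAA", "TAG", "TGA"}
START_CODON = "ATG"


def find_orfs(seq, min_codons=1):
    """Return (frame, start, end, orf_sequence) tuples.

    start/end are 0-based, end-exclusive positions; the ORF includes
    its stop codon. min_codons counts codons before the stop codon.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None  # position of the first ATG seen since the last stop
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == START_CODON:
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((frame, start, i + 3, seq[start:i + 3]))
                start = None
    return orfs


if __name__ == "__main__":
    for orf in find_orfs("CCATGAAATAGGG"):
        print(orf)
```

A real search would also scan the reverse complement and usually apply a larger minimum length to filter out short chance ORFs.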
7.
Some tRNAs can form base pairs with more than one codon.
Atypical base pairs (between nucleotides other than A-U and G-C) can form at the third
position of the codon, a phenomenon known as wobble. Wobble pairing doesn't follow the
normal rules, but it does have its own rules. For instance, a G in the anticodon can pair with a
C or U (but not an A or G) in the third position of the codon. Rules like this
ensure codons are read correctly despite wobble.

Wobble pairing allows fewer tRNAs to cover all the codons of the
genetic code, while still making sure that the code is read accurately.
[A wobble base pair is a pairing between two nucleotides in RNA molecules that does not
follow Watson-Crick base pair rules](1 mark question).
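
As a small illustration of how wobble lets one tRNA read several codons, the sketch below encodes the commonly taught wobble rules (first anticodon base against third codon base) as a lookup table. The function names are assumptions made for this example, and modified bases other than inosine (I) are not covered.

```python
# Wobble rules: 5' base of the anticodon -> allowed 3' bases of the codon.
WOBBLE_RULES = {
    "C": {"G"},
    "A": {"U"},
    "U": {"A", "G"},
    "G": {"C", "U"},
    "I": {"U", "C", "A"},  # inosine
}

# Standard Watson-Crick pairs, used for the other two positions.
WATSON_CRICK = {"A": "U", "U": "A", "G": "C", "C": "G"}


def can_pair(anticodon_first_base, codon_third_base):
    """True if the wobble position of the anticodon can read this codon base."""
    return codon_third_base in WOBBLE_RULES[anticodon_first_base]


def codons_read_by(anticodon):
    """List the codons (5'->3') that an anticodon (5'->3') can read.

    Pairing is antiparallel: anticodon base 3 pairs with codon base 1,
    anticodon base 2 with codon base 2, and anticodon base 1 (the wobble
    position) with codon base 3.
    """
    first_two = WATSON_CRICK[anticodon[2]] + WATSON_CRICK[anticodon[1]]
    return sorted(first_two + third for third in WOBBLE_RULES[anticodon[0]])


if __name__ == "__main__":
    print(codons_read_by("GAA"))  # anticodon with G at the wobble position
```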

8. Define terminal and internal nodes in a phylogenetic tree structure?

Ans:
Terminal nodes - represent the data (e.g., sequences) under comparison (A, B, C, D, E), also
known as OTUs (Operational Taxonomic Units).

Internal nodes - represent inferred ancestral units (usually without empirical
data), also known as HTUs (Hypothetical Taxonomic Units).

9. Which confidence measure does AlphaFold2 use?

Ans: AlphaFold2 uses the predicted local-distance difference test (pLDDT), a per-residue
confidence score on a 0-100 scale. pLDDT reliably predicts the Cα local-distance difference
test (lDDT-Cα) accuracy of the corresponding prediction, and side-chain accuracy is high
when the backbone prediction is accurate.

10. Which amino acids act as helix breakers and helix formers?
Ans: Proline and glycine - helix breakers
Alanine - helix former

11. State the formula for systematic conformational search?

Ans: Systematic (deterministic) search procedures
● Grid Scan
● Custom Search
● Cyclic Modelling

Two of these, Grid Scan and Custom Search, are described here.
In a Grid Scan search, each specified torsion angle is varied over a grid of equally spaced
values. If more than one torsion angle is involved, the variation of the torsion angles is
nested. If there are two angles a and b, then for each value of a, angle b runs over its whole
grid of values. If b is the faster torsion angle, the b loop is inside the a loop. Although
the application is capable of handling up to 10 grid torsions, it is impractical in most cases to
employ Grid Scan for more than four torsion angles.
In a Custom Search, torsion angles are assigned specific values. These values do not need to
be equally spaced. This is an advantage in those cases where favorable states of a torsion
angle are known from previous modeling studies and the intent is to restrict the systematic
search to these values. A further advantage of Custom Search is that it can handle, if so
desired, simultaneous changes in several torsion angles. As in Grid Scan, these changes may
also be nested.
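
The nested loops described above can be sketched as follows. This is only an illustration of the enumeration, not the API of any modelling package; the function names are assumptions. It also shows why Grid Scan becomes impractical: with n torsions and a step of s degrees, the number of conformations is (360/s)^n, which is probably the count the question has in mind when it asks for a "formula".

```python
from itertools import product


def grid_scan(n_torsions, step_deg):
    """Yield every combination of torsion angles on an equally spaced grid.

    itertools.product varies the last angle fastest, i.e. the last
    torsion is the innermost loop, as in the a/b example above.
    """
    grid = range(0, 360, step_deg)
    return product(grid, repeat=n_torsions)


def custom_search(allowed_values):
    """Yield combinations from explicit (not necessarily equally spaced) values.

    allowed_values is a list with one list of angles per torsion.
    """
    return product(*allowed_values)


def n_grid_conformations(n_torsions, step_deg):
    """Number of conformations visited by a grid scan: (360 / step) ** n."""
    return (360 // step_deg) ** n_torsions


if __name__ == "__main__":
    print(list(grid_scan(2, 120)))
    print(n_grid_conformations(4, 30))
    print(list(custom_search([[60, 180, 300], [0, 90]])))
```

With a 30 degree step, four torsions already give 12^4 = 20736 conformations, and each extra torsion multiplies that by 12.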

12. Write down different databases used for constructing functional association
networks for proteins?
Ans: STRING, HumanNet, GeneMANIA, HumanBase, IMP, I2D, and ConsensusPathDB.

13. Define SANGER? What is it used for?

Ans: Sanger sequencing is a method of DNA sequencing that involves electrophoresis and is
based on the random incorporation of chain-terminating dideoxynucleotides by DNA
polymerase during in vitro DNA replication.
Sanger sequencing, also known as the “chain termination method”, is a method for
determining the nucleotide sequence of DNA. Sanger sequencing was used in the Human
Genome Project to determine the sequences of relatively small fragments of human DNA
(900 bp or less). These fragments were used to assemble larger DNA fragments and,
eventually, entire chromosomes.
14. Give a difference between BLAST and BLAST+.

Ans: The Basic Local Alignment Search Tool (BLAST) finds regions of local similarity
between sequences. The program compares nucleotide or protein sequences to sequence
databases and calculates the statistical significance of matches. BLAST can be used to infer
functional and evolutionary relationships between sequences as well as help identify
members of gene families.
The NCBI provides a suite of command-line tools to run BLAST called BLAST+. This
allows users to perform BLAST searches on their own server without size, volume and
database restrictions. BLAST+ can be used with a command line so it can be integrated
directly into your workflow.

15. What is the E and S value used in BLAST?

E value :
• The E-value (expect value) comes from the statistical theory that BLAST applies to the
alignment of each pair of sequences; it indicates whether an alignment is good or
whether the match could have arisen by chance.

• The BLAST E-value is the number of hits of similar quality (score) expected to be
found just by chance; an E-value of 10 means that up to 10 such hits can be
expected to be found by chance.

• The E-value thus indicates how likely it is that a given sequence
match is purely due to chance, and it is used as the first quality filter for BLAST search
results.

• The lower the E-value, the better the match: if E is less than 1e-50,
there is high confidence that the database match is the result of a homologous
relationship.

• If E is between 0.01 and 10, the match is considered non-significant
but may reflect a weak homology relationship.

• If E is greater than 10, the sequences under consideration are
either unrelated or, if related, only very distantly so.

• The E-value is the bit score corrected for the size of the search space, so
it depends on the size of the sequence database used.

• The same sequence hit gets a better (lower) E-value in a smaller database.

S Value :
• Once BLAST has found a sequence in the database that is similar to the query,
it becomes essential to know whether the alignment is good and whether it
reflects a possible biological relationship. BLAST therefore
uses statistical theory to produce a bit score for each aligned pair.

• The bit score indicates the quality of the alignment: the higher
the score, the better the alignment.

• The score is calculated from the aligned identical or similar
residues and from the gaps introduced while aligning the sequences.

• A “substitution matrix” supplies the score for each possible pair of aligned residues.

• For most of the BLAST programs, the BLOSUM62 matrix is the default, with the
exception of BLASTn and MegaBLAST, as these programs perform
nucleotide-nucleotide comparisons and do not use protein-specific matrices.

• Bit scores are normalized, so scores from different alignments can be compared
even if different matrices were used.

• The bit score does not depend on the size of the database and gives the same value for hits
in databases of different sizes.
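
The relationship between the two values can be made concrete with the Karlin-Altschul formulas that BLAST statistics are based on: the bit score is S' = (λS − ln K) / ln 2, and the E-value is E = m · n · 2^(−S'), where S is the raw score, m the query length and n the database length. The sketch below is a simplified illustration: λ and K depend on the scoring system and are passed in by the caller, the numbers in the demo are hypothetical, and real BLAST uses length-corrected ("effective") values of m and n.

```python
import math


def bit_score(raw_score, lam, k):
    """Convert a raw alignment score S into a bit score S'.

    lam and k are the Karlin-Altschul parameters of the scoring
    system; the caller must supply them.
    """
    return (lam * raw_score - math.log(k)) / math.log(2)


def e_value(bits, query_len, db_len):
    """Expected number of chance hits with at least this bit score."""
    return query_len * db_len * 2.0 ** (-bits)


if __name__ == "__main__":
    # Hypothetical numbers, chosen only to show the database-size effect.
    bits = 40.0
    print(e_value(bits, query_len=300, db_len=1_000_000))
    print(e_value(bits, query_len=300, db_len=100_000_000))
```

The same bit score yields a 100-fold larger E-value in the 100-fold larger database, which is why the bit score is independent of database size but the E-value is not.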

16. Same as 6.

17. Name some secondary protein structures. Which tool can be used to visualize
these structures?

Ans: The most common types of secondary structures are the α helix and the β pleated
sheet. Both structures are held in shape by hydrogen bonds, which form between the carbonyl
O of one amino acid and the amino H of another.

Structures deposited in the PDB can be visualized with the 3D viewer on the RCSB PDB
website or with molecular viewers such as PyMOL, UCSF Chimera, or RasMol.

18. Same as 7.

19. Name the databases for metabolic pathways, and protein structure information.

Ans: Metabolic pathway databases: KEGG

PDB for protein structure information.

20. Same as 8.

21. What is AlphaFold2?

Ans: AlphaFold2 is an artificial intelligence (AI) tool developed by DeepMind. The software
predicts the 3D shape of a protein from its amino acid sequence with, for the most part,
very high accuracy.

Q. Give a plausible explanation on the chemistry of RNA and DNA? Describe the
components of RNA secondary structure? Which database is used for RNA 3D
structure prediction?

DNA (deoxyribonucleic acid) is the genomic material in cells that contains the genetic
information used in the development and functioning of all known living organisms. DNA,
along with RNA and proteins, is one of the three major macromolecules that are essential for
life. In eukaryotic cells, most of the DNA is located in the nucleus, although a small amount
can be found in mitochondria (mitochondrial DNA). Within the nucleus, DNA is
organized into structures called chromosomes. DNA consists of two long polymers of simple
units called nucleotides, with backbones made of sugars and phosphate groups joined by
phosphodiester bonds. These two strands run in opposite directions to each other and are
therefore anti-parallel. Attached to each sugar is one of four types of molecules called
nucleobases (bases). It is the sequence of these four bases along the backbone that encodes
information. This sequence is read using the genetic code, which specifies the
sequence of the amino acids within proteins. The ends of DNA strands are called the 5′ (five
prime) and 3′ (three prime) ends. The 5′ end has a terminal phosphate group and the 3′ end a
terminal hydroxyl group. Bases are classified into two types: the purines, A and G, and the
pyrimidines, the six-membered rings C, T and U. Uracil (U) takes the place of thymine in
RNA and differs from thymine by lacking a methyl group on its ring. Uracil is not usually
found in DNA, occurring only as a breakdown product of cytosine.
Levels of DNA structure:
1) Primary
2) Secondary
3) Tertiary
4) Quaternary

RNA is another macromolecule essential for all known forms of life. Like DNA, RNA is
made up of nucleotides. Once thought to play ancillary roles, RNAs are now understood to be
among a cell’s key regulatory players: they catalyze biological reactions, control and
modulate gene expression, and sense and communicate responses to cellular signals. The
chemical structure of RNA is very similar to that of DNA: each nucleotide consists of a
nucleobase, a ribose sugar, and a phosphate group. There are two differences that distinguish
DNA from RNA: (a) RNA contains the sugar ribose, while DNA contains the slightly
different sugar deoxyribose (a type of ribose that lacks one oxygen atom), and (b) RNA has
the nucleobase uracil while DNA contains thymine. Unlike DNA, most RNA molecules are
single-stranded and can adopt very complex three-dimensional structures.

RNA secondary structure is also called the stem-and-loop structure. Once all the
paired bases of an RNA sequence are determined, the secondary structure of the entire RNA
is determined.

Chemically speaking, DNA and RNA are very similar. Nucleic acid structure is often divided
into four different levels: primary, secondary, tertiary, and quaternary.
Levels of RNA structure:

1) Primary: the sequence of nucleotides (i.e., the four bases A, C, G, and
U) in the single-stranded polymer of RNA.

2) Secondary: hairpins, bulges and internal loops.

3) Tertiary: A-minor motif, 3-way junction, pseudoknot, etc.

4) Quaternary: supramolecular organisation.

The database used for RNA 3D structure prediction is RNArchitecture

Q. Give a brief description of the different types of BLAST and describe their functionalities.
Ans:
• BLASTN
   – The query is a nucleotide sequence
   – The database is a nucleotide database
   – No conversion is done on the query or database
   – DNA :: DNA homology
   – Mapping oligos to a genome
   – Annotating genomic DNA with transcriptome data from ESTs and RNA-Seq
   – Annotating untranslated regions

• BLASTP
   – The query is an amino acid sequence
   – The database is an amino acid database
   – No conversion is done on the query or database
   – Protein :: Protein homology
   – Protein function exploration
   – Novel gene: make parameters more sensitive

• BLASTX
   – The query is a nucleotide sequence
   – The database is an amino acid database
   – All six reading frames are translated on the query and used to search the database
   – Coding nucleotide seq :: Protein homology
   – Gene finding in genomic DNA
   – Annotating ESTs and transcripts assembled from RNA-Seq data

• TBLASTN
   – The query is an amino acid sequence
   – The database is a nucleotide database
   – All six frames are translated in the database and searched with the protein
     sequence
   – Protein :: Coding nucleotide DB homology
   – Mapping a protein to a genome
   – Mining ESTs and RNA-Seq data for protein similarities

• TBLASTX
   – The query is a nucleotide sequence
   – The database is a nucleotide database
   – All six frames are translated on the query and on the database
   – Coding :: Coding homology
   – Searching distantly-related species
   – Sensitive but expensive
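
The "all six frames are translated" step used by BLASTX, TBLASTN and TBLASTX can be sketched as below. This is an illustration of the idea only, not BLAST's own code: it uses the standard genetic code, marks stop codons with `*` and unknown codons with `X`, and the function names are assumptions made for this example.

```python
# Standard genetic code, built from the usual TCAG ordering of bases.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate complete codons from position 0; '*' marks a stop codon."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translation(seq):
    """Return translations for frames +1, +2, +3, -1, -2, -3."""
    seq = seq.upper()
    frames = {}
    for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(strand_seq[offset:])
    return frames


if __name__ == "__main__":
    for name, protein in six_frame_translation("ATGGCCTAA").items():
        print(name, protein)
```

Translating both query and database in six frames, as TBLASTX does, multiplies the number of comparisons, which is why it is listed as sensitive but expensive.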

Q. Suppose you have two sequences, and you suspect that they diverged from a common
ancestor. What possible events might have occurred during the evolution process? Draw a
schematic to represent the evolution process. State the differences between homology,
orthology, paralogy, xenology, analogy and cenancestor. (Sequence alignment concepts: Pg
4-6)
Ans: Divergence from the common ancestor can be due either to duplication or to speciation.
Mutational events that occur during evolution:
● Substitutions
● Deletions
● Insertions

Definitions:
● Homology: the two sequences diverged from a common ancestor. Classically, the same
organ under every variety of form and function. Homology is the relationship of any two
characters that have descended, usually with divergence, from a common ancestral
character.

● Analogy: relationship of two characters that have developed convergently from
unrelated ancestors.

● Orthology: relationship of any two homologous characters whose common ancestor
lies in the cenancestor of the taxa from which the two sequences were obtained.

● Paralogy: relationship of two characters arising from a duplication of the gene for
that character.

● Xenology: relationship of any two characters whose history, since their common
ancestor, involves interspecies (horizontal) transfer of the genetic material for at least
one of those characters.

● Cenancestor: the most recent common ancestor of the taxa under consideration.

Q. Describe NGS. What are the various techniques used to carry out NGS? Give brief
elaboration.
Ans: Next-generation sequencing (NGS) is a massively parallel sequencing technology that
offers ultra-high throughput, scalability, and speed. The technology is used to determine the
order of nucleotides in entire genomes or targeted regions of DNA or RNA. NGS has
revolutionized the biological sciences, allowing labs to perform a wide variety of applications
and study biological systems at a level never before possible.
Q. Suppose you are given a particular protein named ‘Keratin’. How will you retrieve its
(a) nucleic acid sequence, (b) protein sequence, (c) carbohydrate-binding site, if present,
(d) protein chains, (e) amino acid frequency, etc.? Describe briefly.

All the required information concerning any protein (here, keratin) can be obtained from the
appropriate databases.
(a) The nucleic acid sequence encoding keratin (gene and cDNA or mRNA) can be obtained from
the NCBI and Ensembl databases. These services contain the complete sequences of all human
genes, as well as genes present in other organisms.
(b) The protein sequence can also be found in these databases (NCBI, Ensembl), as well as in
the UniProt database. Alternatively, the protein sequence can be obtained by a simple
translation of the cDNA or mRNA sequence using the ExPASy Translate tool. In general, the
reading frame that gives the longest translation product corresponds to the correct protein
sequence.
(c) / (d) Both the carbohydrate-binding site and the protein chains are structural
features of the protein that can be retrieved from the RCSB PDB database, which contained
164174 biological macromolecular structures at the time of writing, along with their
structural and functional features.
(e) Amino acid frequency can be calculated using the ExPASy ProtParam tool, which takes the
one-letter amino acid sequence as input and calculates the percentage of each amino acid in
the protein.
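
The amino acid frequency calculation in (e) is simple enough to reproduce locally. The sketch below is a minimal stand-in for that one ProtParam output, not the tool itself; the function name is an assumption, and the input in the demo is a made-up peptide rather than a real keratin sequence.

```python
from collections import Counter


def amino_acid_frequency(protein_seq):
    """Return the percentage of each residue in a one-letter protein sequence."""
    seq = "".join(protein_seq.split()).upper()  # drop whitespace and line breaks
    counts = Counter(seq)
    total = len(seq)
    return {aa: 100.0 * n / total for aa, n in sorted(counts.items())}


if __name__ == "__main__":
    # Hypothetical short peptide, used only to show the output format.
    for aa, pct in amino_acid_frequency("MSGSGAAK").items():
        print(f"{aa}: {pct:.1f}%")
```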

Q. Give an overview of High-throughput sequencing?

Ans: Sequencing that is capable of sequencing multiple DNA molecules in parallel, enabling
hundreds of millions of DNA molecules to be sequenced at a time.

Q. Perform the Needleman-Wunsch algorithm with explanation (tabular chart) and algorithm for
the following sequences.
Sequence 1: GATTACA
Sequence 2: GTCGACGCA
Match score 2
Mismatch score -2
Gap score -5
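
No worked answer is given in these notes, so the following is a sketch of how the table could be filled in and traced back with the scores stated in the question (match 2, mismatch -2, linear gap penalty -5). The function name and return format are assumptions for this example. Each cell F[i][j] is the best of three moves: diagonal (align the two residues), up (gap in sequence 2), or left (gap in sequence 1).

```python
def needleman_wunsch(a, b, match=2, mismatch=-2, gap=-5):
    """Global alignment; returns (score, aligned_a, aligned_b, table)."""
    n, m = len(a), len(b)
    # F[i][j] = best score for aligning a[:i] with b[:j]
    F = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        F[i][0] = i * gap  # a[:i] aligned against gaps only
    for j in range(1, m + 1):
        F[0][j] = j * gap  # b[:j] aligned against gaps only
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            F[i][j] = max(F[i - 1][j - 1] + s,   # diagonal
                          F[i - 1][j] + gap,     # up: gap in b
                          F[i][j - 1] + gap)     # left: gap in a
    # Traceback from the bottom-right corner to the origin.
    top, bottom = [], []
    i, j = n, m
    while i > 0 or j > 0:
        s = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and F[i][j] == F[i - 1][j - 1] + s:
            top.append(a[i - 1]); bottom.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and F[i][j] == F[i - 1][j] + gap:
            top.append(a[i - 1]); bottom.append("-"); i -= 1
        else:
            top.append("-"); bottom.append(b[j - 1]); j -= 1
    return F[n][m], "".join(reversed(top)), "".join(reversed(bottom)), F


if __name__ == "__main__":
    score, top, bottom, table = needleman_wunsch("GATTACA", "GTCGACGCA")
    for row in table:  # the tabular chart asked for in the question
        print(" ".join(f"{v:4d}" for v in row))
    print(score)
    print(top)
    print(bottom)
```

For these sequences the optimal score is -8: because sequence 2 is two residues longer, at least two gaps are needed (-10), and the best arrangement of the remaining seven columns has four matches (+8) and three mismatches (-6). Several alignments reach this score, for example GATTAC--A over GTCGACGCA, so the traceback printed by the sketch is one optimum among ties.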

Q. How to identify a biomarker?

● Bioinformatics plays a key role in the biomarker discovery
process, bridging the gap between initial discovery phases
such as experimental design, clinical study execution, and
bioanalytics, including sample preparation, separation and
high-throughput profiling and independent validation of
identified candidate biomarkers.

● Once a biomarker cohort study has been set up, and sample
collection, preparation, separation and MS analysis have
been carried out, an extensive technical review of
generated data is essential to ensure a high degree of
consistency, completeness and reproducibility in the data.

● Data preprocessing, as a preliminary data mining practice
performed on the raw data, is necessary to transform data
into a format that will be more easily and effectively
processed for the purpose of targeted analyses. There are a
number of methods used for data preprocessing, including
data transformation (e.g. logarithmic scaling of data) and
normalization, e.g. using z-transformation, data sampling or
outlier detection.
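
Two of the preprocessing steps named above, logarithmic scaling and z-transformation, can be sketched with the Python standard library. This is a minimal illustration on a plain list of intensities; the function names and the demo values are assumptions, and a real biomarker pipeline would also need to handle zeros, missing values and batch effects.

```python
import math
import statistics


def log2_transform(values):
    """Logarithmic scaling: compress the dynamic range of positive intensities."""
    return [math.log2(v) for v in values]


def z_transform(values):
    """Normalize to mean 0 and (sample) standard deviation 1."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


if __name__ == "__main__":
    # Hypothetical raw intensities for one feature across five samples.
    raw = [120.0, 480.0, 960.0, 240.0, 1920.0]
    scaled = z_transform(log2_transform(raw))
    print([round(z, 2) for z in scaled])
```

After z-transformation, values far from 0 (a cut-off such as |z| > 3 is a common convention) can be flagged as candidate outliers.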

Q. How to download fasta sequence of protein?

1. Open NCBI website (http://www.ncbi.nlm.nih.gov/)

2. Select the Protein database (instead of All Databases) and type the name of the protein.
3. From the list of results, choose the specific protein and click on it.
4. Just below the name of the protein there is a FASTA link; click on it.
5. Download the sequence in .txt format.
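
The same download can be scripted through NCBI's E-utilities `efetch` service instead of the web page. The sketch below is an assumption-laden illustration rather than part of the original procedure: the helper names are made up, the accession must be supplied by the user (`YOUR_ACCESSION` is a placeholder, not a real record), and a network connection is needed for `fetch_fasta`.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

EFETCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def fasta_url(accession):
    """Build the efetch URL for a protein record in FASTA format."""
    params = {
        "db": "protein",
        "id": accession,
        "rettype": "fasta",
        "retmode": "text",
    }
    return EFETCH_URL + "?" + urlencode(params)


def fetch_fasta(accession):
    """Download the FASTA text for one accession (requires network access)."""
    with urlopen(fasta_url(accession)) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    # Placeholder accession: replace with the accession of the protein you
    # found in step 3 before calling fetch_fasta().
    print(fasta_url("YOUR_ACCESSION"))
```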
