Lecture 3 Flashcards

(49 cards)

1
Q

Procedure of comparing two or more sequences by searching for a series of individual characters that appear in the same order in the sequences

A

Sequence Alignment

2
Q

Significance of Sequence Alignment

A
  • a way to learn about a gene or protein
  • discover functional, structural and evolutionary information
3
Q

process of lining up two sequences to achieve maximal levels of identity and show conservation of residues

A

pairwise sequence alignment

4
Q

used to assess degree of similarity and possibly homology

A

Pairwise sequence alignment

5
Q

term that refers to the same residues between two proteins; may be in a global or local alignment

A

Identical

6
Q

term that refers to residues that are structurally or functionally related; can only be described as a “higher/lower degree of similarity”

A

Similar

7
Q

sum of both identical and similar residues; goal of pairwise alignment

A

Percent similarity

8
Q

Calculating percent similarity, steps

A
  1. Count the number of matching positions
  2. Count the number of aligned positions
  3. Divide the number of matching positions by the total number of aligned positions
  4. Multiply by 100
9
Q

Formula for percent similarity

A

Percent Similarity = (number of matching positions / total number of aligned positions) x 100
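
The formula can be sketched in Python as follows. This is only an illustration: the function name, the "-" gap character, and the optional `similar_pairs` argument (residue pairs to count as similar rather than identical) are assumptions, not part of the lecture.

```python
def percent_similarity(seq_a, seq_b, similar_pairs=(), gap="-"):
    """Percent of aligned positions whose residues are identical or similar."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    # Hypothetical helper set: unordered residue pairs treated as similar.
    similar = {frozenset(pair) for pair in similar_pairs}
    # Aligned positions: columns where neither sequence has a gap.
    aligned = [(a, b) for a, b in zip(seq_a, seq_b) if a != gap and b != gap]
    if not aligned:
        return 0.0
    matching = sum(1 for a, b in aligned if a == b or frozenset((a, b)) in similar)
    return matching / len(aligned) * 100


# 3 matching positions out of 4 aligned positions (the gap column is skipped).
print(percent_similarity("AC-GT", "ACTGA"))
```

With `similar_pairs` left empty, the result is the percent identity; passing pairs such as `[("K", "R")]` adds similar residues to the count.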

10
Q

Refers to the exact match between two nucleotides or amino acids

A

Identity

11
Q

refers to a resemblance between two residues that is greater than one would expect at random

A

Similarity

12
Q
  • Simple fraction of identical residues
  • genetic distances based on models of DNA sequence evolution
A

Nucleotide sequences

13
Q

Genetic distances based on models of AMINO ACID sequence evolution
- measured using matrices

A

Peptide sequences

14
Q

Example of matrices used to measure peptide sequences

A
  • PAM
  • BLOSUM
15
Q

sequences that share a common evolutionary ancestry

A

Homologous sequences

16
Q

sequence regions that are homologous are also called _____

A

Conserved

17
Q

share a common evolutionary ancestry

A

Homologs

18
Q

derived from a single ancestral gene in the last common ancestor of the species being compared

A

Orthologous genes

19
Q

two or more homologous genes found within a single species

  • duplicated genes within the same genome; may evolve new functions
A

Paralogous Genes

20
Q

Two types of pairwise comparison

A
  • Graphical analysis
  • Residue/Residue analysis
21
Q

Example of Graphical Analysis

A

Dot plot matrix

22
Q

Types of Residue/Residue analysis

A
  1. Global Alignment
  2. Local alignment
23
Q
  • visually compares two sequences (such as DNA, RNA, or protein) on a 2D grid, placing a dot for each matching character to highlight similarities and identify repeats, insertions, or deletions through graphical patterns
A

Dot plot matrix
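
A minimal text-based sketch of the dot plot idea, assuming a simple exact-match rule with no window or threshold filtering (real dot plot tools usually add both). The function name and the "*" and "." symbols are illustrative choices.

```python
def dot_plot(seq_x, seq_y):
    """Return one text row per residue of seq_y; '*' marks a match with seq_x."""
    return ["".join("*" if x == y else "." for x in seq_x) for y in seq_y]


# Columns follow the first sequence, rows follow the second.
for row in dot_plot("GATTACA", "GATCACA"):
    print(row)
```

Runs of matches show up as diagonal lines; breaks or shifts in the diagonal suggest mismatches, insertions, or deletions.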
24
Q
  • Compares the sequences as a whole
  • analyzes polymorphism between closely related sequences
A

Global Sequence Alignment

25
What algorithm does Global sequence alignment use?
Needleman-Wunsch algorithm
26
uses dynamic programming equations to align sequences - aims to align all sites optimally within the sequences
Needleman-Wunsch algorithm (1970)
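
The dynamic programming idea can be sketched as below. This is a simplified illustration, not a reference implementation: it assumes a linear gap penalty and simple match/mismatch scores (the values 1, -1, -1 are arbitrary) instead of a substitution matrix, and it returns only one of the possibly many optimal alignments.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Global alignment; returns (score, aligned_a, aligned_b)."""
    rows, cols = len(a) + 1, len(b) + 1
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner so every site is aligned.
    out_a, out_b = [], []
    i, j = len(a), len(b)
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return score[-1][-1], "".join(reversed(out_a)), "".join(reversed(out_b))


print(needleman_wunsch("ACGT", "AGT"))
```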
27
stretches of sequences with high density of matches are aligned - detect similar subsequences in two sequences
Local alignment
28
algorithm that is used in Local alignment
- Smith-Waterman algorithm
29
algorithm based on high-scoring local matches - considers all possible subsequences of the two sequences
Smith-Waterman algorithm
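
A minimal sketch of local alignment, under the same simplifying assumptions as a basic global aligner (linear gap penalty, arbitrary match/mismatch values of 2, -1, -2, no substitution matrix). The key differences from global alignment are that cell scores never drop below zero and the traceback starts at the highest-scoring cell and stops at the first zero.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Local alignment; returns (score, aligned_sub_a, aligned_sub_b)."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            # The 0 lets a new local alignment start at any cell.
            score[i][j] = max(0,
                              score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)
    # Trace back from the best cell until the score reaches zero.
    out_a, out_b = [], []
    i, j = best_pos
    while i > 0 and j > 0 and score[i][j] > 0:
        sub = match if a[i - 1] == b[j - 1] else mismatch
        if score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


print(smith_waterman("TTTACGTTT", "GGACGGG"))
```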
30
is used in sequence alignment to assign numerical values to matches, mismatches, and substitutions.
Scoring matrix
31
types of substitution (scoring) matrices used in sequence alignment
- PAM (Point Accepted Mutation) - BLOSUM-X (BLOcks SUbstitution Matrix)
32
Derived from observed evolutionary mutations in closely related proteins. - Estimates the rate at which each possible residue in a sequence changes to each other residue over time
PAM
33
- Derived from conserved sequence blocks in protein families. - Designed to find both close and distant sequence relationships
BLOSUM
34
Different pairwise alignment Software
- ClustalW2/ ClustalOmega - TCoffee - JAligner - BLAST
35
Early version for multiple sequence alignment
ClustalW2
36
Modern, improved version of ClustalW2; handles large datasets efficiently.
ClustalOmega
37
- Produces highly accurate multiple sequence alignments. - Combines results from different alignment methods for better quality.
TCoffee
38
an algorithm for rapid searching of nucleotide and protein databases
Basic Local Alignment Search Tool
39
Applications of Blast
- Sequence Identification - Determine homologs - Determine proteins or genes - Gene discovery - Discover variants of gene or protein - Investigate alternative splice sites
40
Types of BLAST Variants
- blastp - blastn - blastx - tblastn - tblastx
41
used to compare a protein query against a database of proteins
blastp
42
used to compare both strands of a DNA query against a DNA database
blastn
43
translates a DNA sequence into six protein sequences using all six possible reading frames, and then compares "each of these" proteins to a protein database
blastx
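
The six-frame translation step can be sketched as follows. This only illustrates the translation idea, not how BLAST implements it; it assumes the standard genetic code, and the frame labels ("+1" to "-3") and function names are illustrative.

```python
from itertools import product

# Standard genetic code, listed in TCAG order for each codon position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(dna):
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna):
    # Codons with unknown bases (for example N) become 'X'; '*' is a stop codon.
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X")
                   for i in range(0, len(dna) - 2, 3))


def six_frame_translations(dna):
    """Translate three frames on the given strand and three on its reverse complement."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


for name, protein in six_frame_translations("ATGGCCTAA").items():
    print(name, protein)
```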
44
used to translate every DNA sequence in a database into six potential proteins, and then compare "your protein query" against each of those translated proteins
tblastn
45
is the most computationally intensive BLAST algorithm. - translates DNA from both a query and a database into six potential proteins
tblastx
46
What does the E-value represent in a BLAST result?
- evaluates the statistical significance of a BLAST alignment.
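
As a rough illustration of why a lower E-value means a more significant hit, the sketch below uses the commonly cited relation E = m x n x 2^(-S'), where S' is the bit score, m the query length, and n the database length. The function and parameter names are assumptions, and BLAST itself uses effective (corrected) lengths, so its reported values will differ.

```python
def expected_hits(bit_score, query_len, db_len):
    """Approximate number of chance hits scoring at least bit_score."""
    return query_len * db_len * 2.0 ** (-bit_score)


# Each additional bit of score halves the expected number of chance hits.
print(expected_hits(40, 300, 1_000_000))
print(expected_hits(41, 300, 1_000_000))
```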
47
the E-value that is considered the cutoff point
E=10^-4
48
The E-value which means that the two sequences are statistically identical
E=0
49