protein structure prediction Flashcards

1
Q

motivation for structure prediction

A
  • inform about function
  • guide rational drug design
  • mutagenesis
  • solve structures from experimental data
  • fundamental understanding of chemistry of protein structure
2
Q

CASP

A
  • critical assessment of protein structure prediction
  • blind trial to evaluate different approaches
  • sequences sent to predictors prior to revealing experimental coordinates
  • manual evaluation every 2 years
  • combined with server-only predictions
3
Q

ab initio energy calculations

A
  • original idea to describe interactions between atoms
  • search for conformation of lowest energy
  • energy minimisation methods, followed by molecular dynamics
  • from first principles
  • energy function needed first
4
Q

energy function

A
  • potential energy of a protein in a particular conformation
  • V = bond length + bond angle + bond dihedral rotation + VDW + electrostatic interactions
  • molecular dynamics adds water molecules
  • energy minimisation adds ad hoc terms for hydrophobicity
    • or works in vacuo
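The terms of V can be written out as small functions. This is a toy sketch, not a real force field: the harmonic bond form, the Lennard-Jones form for the VDW term, the Coulomb constant (roughly 332 in kcal·Å/(mol·e²)) and every parameter name are assumptions chosen for illustration, and the angle and dihedral terms are left out.

```python
COULOMB_K = 332.06  # assumed units: kcal*Angstrom/(mol*e^2)


def bond_energy(r, r0, k):
    """Harmonic bond-length term: zero at the ideal length r0."""
    return 0.5 * k * (r - r0) ** 2


def vdw_energy(r, epsilon, sigma):
    """Lennard-Jones 12-6 term for one non-bonded atom pair."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 ** 2 - sr6)


def electrostatic_energy(r, q1, q2):
    """Coulomb term for two point charges (in units of e) at distance r."""
    return COULOMB_K * q1 * q2 / r


def total_energy(bonds, nonbonded):
    """Sum the terms.

    bonds:     list of (r, r0, k)
    nonbonded: list of (r, epsilon, sigma, q1, q2)
    """
    energy = sum(bond_energy(r, r0, k) for r, r0, k in bonds)
    for r, epsilon, sigma, q1, q2 in nonbonded:
        energy += vdw_energy(r, epsilon, sigma)
        energy += electrostatic_energy(r, q1, q2)
    return energy
```

Working in vacuo corresponds to using these terms as they are; a solvent correction would be one more term added to the sum.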
5
Q

energy minimisation

A
  • x, y, z obtained for each atom
  • calculate energy
  • make small positional changes to find path to lowest energy conformation (deltaG is minimal)
  • some success with small proteins
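The loop above (coordinates, energy, small positional changes) can be sketched as steepest descent. This is a minimal illustration under assumptions: the only energy term is a Lennard-Jones pair potential with epsilon = sigma = 1, the gradient is taken numerically, and the step size and iteration count are arbitrary choices that happen to be stable for this toy case.

```python
import math


def lj_energy(coords, epsilon=1.0, sigma=1.0):
    """Sum a Lennard-Jones energy over all atom pairs."""
    total = 0.0
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            r = math.dist(coords[i], coords[j])
            sr6 = (sigma / r) ** 6
            total += 4.0 * epsilon * (sr6 ** 2 - sr6)
    return total


def minimise(coords, step=0.01, h=1e-6, n_steps=2000):
    """Move every x, y, z a little downhill on each iteration."""
    coords = [list(atom) for atom in coords]
    for _ in range(n_steps):
        grad = []
        for atom in coords:
            g_atom = []
            for k in range(3):
                original = atom[k]
                atom[k] = original + h
                e_plus = lj_energy(coords)
                atom[k] = original - h
                e_minus = lj_energy(coords)
                atom[k] = original
                # central-difference estimate of dE/dx for this coordinate
                g_atom.append((e_plus - e_minus) / (2 * h))
            grad.append(g_atom)
        for atom, g_atom in zip(coords, grad):
            for k in range(3):
                atom[k] -= step * g_atom[k]
    return coords


start = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
final = minimise(start)
```

For two atoms the distance should settle near 2^(1/6)·sigma, the bottom of the Lennard-Jones well.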
6
Q

issues with energy minimisation

A
  • can get stuck in local minimum
    • think it is the lowest point but there is a global minimum
      • just can’t get there
    • solve with molecular dynamics
      • simulate protein as moving object
      • has momentum to overcome energy barriers
  • energy landscape is difficult to define
    • unsure if you are going up or down
    • energy terms are difficult to define - calculation can be wrong
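The local-minimum problem shows up even in one dimension. The sketch below uses a made-up double-well function (an assumption, not a protein energy) and plain gradient descent: the answer depends entirely on which side of the barrier the search starts.

```python
def energy(x):
    """Toy double-well landscape: shallow well at x > 0, deeper well at x < 0."""
    return x ** 4 - 2 * x ** 2 + 0.5 * x


def gradient(x):
    return 4 * x ** 3 - 4 * x + 0.5


def descend(x, step=0.01, n_steps=5000):
    """Follow the slope downhill; there is no way back over a barrier."""
    for _ in range(n_steps):
        x -= step * gradient(x)
    return x


local_min = descend(1.5)    # starts right of the barrier, ends in the shallow well
global_min = descend(-1.5)  # starts left of the barrier, ends in the deep well
```

Both runs report convergence, but only the second reaches the lower energy; a method that carries momentum, as in molecular dynamics, is one way to cross the barrier between them.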
7
Q

secondary structure prediction

A
  • identify local structures
    • alpha, beta, coil, sometimes turn
    • 3 or 4 state prediction
  • determines local 3D structure to an extent
  • doesn’t work with 7 residue sequences
    • same 7 residue sequence in different proteins can produce completely different structure
  • algorithms look at window of ~15
    • long range effects involved
8
Q

secondary structure prediction

accuracy measure

A
  • number of residues correctly predicted / number of residues considered
  • Q3 = accuracy measure of 3 state prediction
    • random result with equal numbers of each state = 33%
    • in a protein dominated by helices (80:20), best random prediction would say all helical = 80%
    • typical mix of 3 states, random result = 40%
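The Q3 ratio is easy to compute once predicted and observed states are written as strings. A minimal sketch, assuming one letter per residue (H, E, C here, which is a labelling convention and not part of the definition):

```python
def q3(predicted, observed):
    """Fraction of residues whose predicted state matches the observed state."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not observed:
        raise ValueError("empty input")
    correct = sum(p == o for p, o in zip(predicted, observed))
    return correct / len(observed)


# A protein that is 80% helix: predicting "all helix" already scores 0.8.
observed = "H" * 8 + "C" * 2
all_helix = "H" * 10
baseline = q3(all_helix, observed)
```

This is why a Q3 value only means something relative to the composition of the protein being predicted.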
9
Q

old single sequence methods

A
  • simpler
  • used to derive newer methods
  • mainly based on obtaining rules from counting frequencies of residues in known structures
    • empirical
  • e.g. Chou-Fasman
10
Q

chou-fasman

A
  • numerical residue scores derived from data and ad hoc rules
  • based on secondary structure propensity
  • score > 1 implies residue occurs in helix more frequently than by chance
  • create matrix for alpha and beta propensities of all amino acids
  • pro/gly = helix breakers
  • some residues are similar
  • can be greater than 1 (not probability)
11
Q

helix breakers

A
  • helices need H bonds between NH and CO
  • pro:
    • side chain bends back and bonds covalently to the backbone N
      • no NH hydrogen left to donate an H bond
  • gly:
    • small residue
    • makes a cavity
    • packs poorly against the rest of the helix
12
Q

rules of chou-fasman

A
  • helix if:
    • run of 4 out of 6 residues favouring a helix
    • average helix propensity > 1 and > average beta strand propensity
  • extend helix until pro is found, or run of 4 residues with helix propensity <1
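The nucleation rule can be sketched as a sliding window. This covers only the first step (4 of 6 helix-favouring residues and an average helix propensity above 1); the comparison with beta propensity and the extension step are omitted. The propensity values are the commonly quoted Chou-Fasman helix scores for a few residues, listed from memory as an assumption, and should be checked against a primary table before any real use.

```python
# Partial helix-propensity table (assumed values, see note above).
HELIX_PROPENSITY = {
    "E": 1.51, "A": 1.42, "L": 1.21, "K": 1.16,
    "S": 0.77, "N": 0.67, "G": 0.57, "P": 0.57,
}


def helix_nuclei(sequence, propensity, window=6, min_formers=4):
    """Return start indices of windows that satisfy the nucleation rule."""
    starts = []
    for start in range(len(sequence) - window + 1):
        scores = [propensity[res] for res in sequence[start:start + window]]
        formers = sum(s > 1.0 for s in scores)  # residues favouring helix
        if formers >= min_formers and sum(scores) / window > 1.0:
            starts.append(start)
    return starts


nuclei = helix_nuclei("GSNGAELKAA", HELIX_PROPENSITY)
```

In this toy sequence the windows that contain the AELKAA stretch qualify, while the Gly/Ser/Asn-rich start does not.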
13
Q

stereochemical methods

A
  • recognise patterns of hydrophobic residues that favour secondary structures
  • empirical
    • enhanced by inspection of structures
  • no longer used but pattern concept still important
    • difficult to program
  • original Q3 ~ 60%
    • improved by ML and neural networks
14
Q

stereochemical methods

alpha vs beta

A
  • alpha:
    • 3.6 res per turn
    • amphipathic pattern consistent with helix
    • helical wheel plot
      • one side hydrophobic, other hydrophilic
  • beta:
    • can be buried
      • sheet with helices either side, run of hydrophobic residues
    • can be surface
      • stacked pair of beta sheets (Ig fold)
      • bottom sheet alternates
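A helical wheel is just each residue placed at 100° from the previous one (360° / 3.6 residues per turn). The sketch below computes those angles and a crude amphipathicity score; the score (hydrophobic = +1, everything else = -1, then the length of the averaged vector) and the choice of hydrophobic residues are assumptions made for illustration, not a standard scale.

```python
import cmath
import math

HYDROPHOBIC = set("AVLIMFW")  # assumed classification
DEGREES_PER_RESIDUE = 100.0   # 360 / 3.6 residues per turn


def wheel_angles(sequence):
    """Angle of each residue around the helix axis, in degrees."""
    return [(i * DEGREES_PER_RESIDUE) % 360.0 for i in range(len(sequence))]


def amphipathicity(sequence):
    """Near 0 when hydrophobic residues are spread round the wheel,
    larger when they cluster on one face."""
    total = 0j
    for residue, angle in zip(sequence, wheel_angles(sequence)):
        score = 1.0 if residue in HYDROPHOBIC else -1.0
        total += score * cmath.exp(1j * math.radians(angle))
    return abs(total) / len(sequence)


one_sided = amphipathicity("LLKKLLKKLKKLLKKLLK")  # hydrophobics on one face
blocky = amphipathicity("LLLLLLLLLKKKKKKKKK")     # same composition, no pattern
```

The two sequences have identical composition; only the i, i+3, i+4 spacing of the first one puts its leucines on the same side of the wheel.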
15
Q

artificial neural networks

A
  • simulates computation of brains
  • input signal and set of nodes
    • weight nodes so that input signal gives an output signal of alpha/beta
    • input and answer known - only need to find weights
  • once weights are known, output can be produced for a new sequence
  • improved with MSAs
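The input signal for such a network is usually built from a window of residues around the position being predicted. The sketch below shows one plausible encoding, assuming a 15-residue window, one-hot vectors over the 20 amino acids plus a padding symbol, and an odd window size; the network itself and its weights are not shown.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
PAD = "-"                      # assumed symbol for positions past the chain ends
ALPHABET = AMINO_ACIDS + PAD


def one_hot(residue):
    """21-element vector with a single 1 at the residue's index."""
    vector = [0] * len(ALPHABET)
    vector[ALPHABET.index(residue)] = 1
    return vector


def encode_windows(sequence, window=15):
    """One input vector per residue, centred on that residue."""
    half = window // 2
    padded = PAD * half + sequence + PAD * half
    encoded = []
    for centre in range(len(sequence)):
        features = []
        for residue in padded[centre:centre + window]:
            features.extend(one_hot(residue))
        encoded.append(features)
    return encoded


inputs = encode_windows("ACDEFGHIK")
```

With an MSA, each one-hot vector would be replaced by the residue frequencies in that alignment column, which is where the extra signal comes from.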
16
Q

use of MSAs

A
  • average Q3 = 80%
    • nearly all alpha identified, most beta
    • short edge beta strands poorly predicted
    • errors in defining precise ends
  • programs:
    • PSIPRED
    • Jnet neural nets
17
Q

homology modelling

A
  • infer structural similarity from sequence homology
    • search query sequence against sequence library of known structures
  • comparative/template based modelling
  • most accurate structure prediction method
18
Q

process of homology modelling

A
  • match query to database sequences
  • use best match to predict fold
    • >20% identity in PSI-BLAST
    • can use multiple sequences/structures
      • helps gap placement
  • fit sequence and align main chain
  • add and adjust side chains to create predicted 3D structure
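The identity check on a candidate template can be sketched as follows. It assumes the two sequences are already aligned (same length, "-" for gaps) and counts identity over columns where both have a residue; other denominators are also in use, so the 20% cut-off from the card is only meaningful for a fixed convention.

```python
def percent_identity(aligned_query, aligned_template, gap="-"):
    """Percent of gap-free columns where query and template agree."""
    if len(aligned_query) != len(aligned_template):
        raise ValueError("sequences must be aligned to the same length")
    matches = 0
    aligned = 0
    for q, t in zip(aligned_query, aligned_template):
        if q == gap or t == gap:
            continue  # skip columns with a gap in either sequence
        aligned += 1
        if q == t:
            matches += 1
    if aligned == 0:
        return 0.0
    return 100.0 * matches / aligned


def usable_template(aligned_query, aligned_template, threshold=20.0):
    """Apply the >20% identity rule of thumb."""
    return percent_identity(aligned_query, aligned_template) > threshold
```

For example, `percent_identity("ACDEF", "ACDQF")` counts 4 matches in 5 columns.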
19
Q

loop regions in homology modelling

A
  • fix backbone coordinates immediately in structurally conserved regions
  • model variable regions in one of 2 ways:
    • database search of PDB
      • find short region of best guess of loop and transplant
    • energy minimisation to find geometrically viable conformations for 5 residue regions
  • <6 residues generally well predicted
  • long loops poorly predicted
20
Q

homology modelling

side chain packing

A
  • specify dihedral chi angles of side chains
  • build up library of allowed angles
    • series of rotamers
    • each has position and likelihood
  • use algorithm to decide which rotamers fit well together
    • remove clashes to find best combination
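Choosing rotamers that fit together can be sketched as a search over combinations. Everything here is a toy under assumptions: each rotamer is reduced to a single point and a likelihood, a clash is any pair closer than 2 Å, the penalty value is arbitrary, and the search is exhaustive, which only works for a handful of residues.

```python
import itertools
import math


def score(choice, min_dist=2.0, clash_penalty=10.0):
    """Lower is better: unlikely rotamers and clashing pairs both cost."""
    total = sum(-math.log(rotamer["prob"]) for rotamer in choice)
    for a, b in itertools.combinations(choice, 2):
        if math.dist(a["pos"], b["pos"]) < min_dist:
            total += clash_penalty
    return total


def best_rotamers(library):
    """Try every combination of one rotamer per residue."""
    best = None
    for indices in itertools.product(*(range(len(r)) for r in library)):
        choice = [library[i][k] for i, k in enumerate(indices)]
        s = score(choice)
        if best is None or s < best[0]:
            best = (s, indices)
    return best[1]


# Hypothetical two-residue library: the two most likely rotamers clash.
library = [
    [{"pos": (0.0, 0.0, 0.0), "prob": 0.7}, {"pos": (3.0, 0.0, 0.0), "prob": 0.3}],
    [{"pos": (0.5, 0.0, 0.0), "prob": 0.6}, {"pos": (0.0, 4.0, 0.0), "prob": 0.4}],
]
best = best_rotamers(library)
```

Picking the most likely rotamer for each residue independently would give a clash here, so the best combination keeps the first residue's preferred rotamer and moves the second residue to its less likely one.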
21
Q

homology modelling

refinement

A
  • improve predicted structure with energy minimisation
  • remove bumps
  • molecular mechanics to calculate energy
  • CASP:
    • only a few groups managed to improve models
    • often makes it worse
    • bumps drag things a long way out of position
      • better off to leave them sometimes
22
Q

fold recognition

A
  • threading
  • enhanced version of remote homolog recognition, beyond PSI-BLAST and template-based modelling
  • 3D information of library as well as searching the sequence
    • recognising fold as well as sequence
  • match sequence vs sequence with HMM as well as structure vs structure relationships
  • HHSEARCH
  • HHPRED
23
Q

Phyre2

A
  • algorithm for fold recognition
  • single domain:
    • HHblits gives MSA for query
    • PSIPRED predicts secondary structure
    • HMM for sequence and structure
    • carry over fixed regions from alignment
    • loop modelling
    • add side chains
  • multiple domains need to be combined
    • add together with ab initio methods
    • final model with linked domains