
New methods for analyzing FAH data


Guest post from Dr. Gregory Bowman, UC Berkeley

Two general objectives of the Folding@home project are (1) to explain the molecular origins of existing experimental data and (2) to provide new insights that will inspire the next generation of cutting-edge experiments.  We have made tremendous progress in both areas, but particularly in the first.  Obtaining new insight is even more of an art and, therefore, less automatable.

To help facilitate new insights, I recently developed a Bayesian algorithm for coarse-graining our models.  To explain: when we are studying some process, like the folding of a particular protein, we typically start by drawing on the computing resources you share with us to run extensive simulations of the process.  Next, we build a Markov model from this data.  As I've explained previously, these models are something like maps of the conformational space a protein explores.  Specifically, they enumerate the conformations the protein can adopt, how likely the protein is to form each of these structures, and how long it takes to morph from one structure to another.  Typically, our initial models have tens of thousands of parameters and are capable of capturing fine details of the process at hand.  Such models are superb for making a connection with experiments because we can capture all the little details that contribute to particular experimental observations.  However, they are extremely hard to understand.  Therefore, it is to our advantage to coarse-grain them.  That is, we attempt to build a model with very few parameters that is as close as possible to the original, complicated model.  If done properly, the new model can capture the essence of the phenomenon in a way that is easier for us to wrap our minds around.  Based on the understanding this new model provides, we can start to generate new hypotheses and then test them with our more complicated models and, ultimately, via experiment.
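The Markov model construction described above can be sketched in a few lines: count transitions between discrete conformational states and row-normalize to get transition probabilities. This is a minimal illustration, not the actual Folding@home tooling; it assumes simulation frames have already been clustered into integer state labels by some upstream step:

```python
import numpy as np

def build_markov_model(state_trajectory, n_states, lag=1):
    """Estimate a transition-count matrix and a transition-probability
    matrix from a discrete state trajectory at the given lag time."""
    counts = np.zeros((n_states, n_states))
    for i in range(len(state_trajectory) - lag):
        counts[state_trajectory[i], state_trajectory[i + lag]] += 1
    # Row-normalize counts into probabilities; rows with no observed
    # exits fall back to a uniform distribution to avoid dividing by zero.
    row_sums = counts.sum(axis=1, keepdims=True)
    T = np.where(row_sums > 0, counts / np.maximum(row_sums, 1), 1.0 / n_states)
    return counts, T

# Toy trajectory hopping between three conformational states.
traj = [0, 0, 1, 1, 2, 2, 1, 0, 0, 1]
counts, T = build_markov_model(traj, n_states=3)
```

Real models are built from many trajectories and a carefully chosen lag time, but the core data structure is this pair of count and probability matrices.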

Statistical uncertainty is a major hurdle in performing this sort of coarse-graining.  For example, if we observe 100 transitions between a pair of conformations and each of these transitions is slow, then we can be pretty sure this is really a slow transition.  However, if we only observe another transition once and it happens to occur slowly, who knows?  It could be that it is really a slow transition.  On the other hand, it could be that we just got unlucky.
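The 100-observations-versus-one contrast can be made concrete with a small sketch: under a uniform Beta(1, 1) prior (my choice for illustration; the post does not specify one), the posterior over a transition probability narrows sharply as observations accumulate.

```python
import math

def beta_posterior_summary(successes, trials):
    """Posterior mean and standard deviation of a transition probability
    under a uniform Beta(1, 1) prior, given `successes` out of `trials`."""
    a = successes + 1
    b = trials - successes + 1
    mean = a / (a + b)
    var = a * b / ((a + b) ** 2 * (a + b + 1))
    return mean, math.sqrt(var)

# 100 observations of a transition vs. a single one: similar point
# estimates, very different confidence.
m_many, s_many = beta_posterior_summary(100, 100)
m_one, s_one = beta_posterior_summary(1, 1)
```

The single-observation posterior has a standard deviation more than twenty times larger, which is exactly the "who knows?" situation described above.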

Existing methods for coarse-graining our Markov models assume we have enough data to accurately describe each transition.  Therefore, they often pick up these poorly characterized transitions as being important (for protein folding, we typically care most about the slow steps, so slow and important are synonymous).  The new method I've developed (described here) explicitly takes into account how many times a transition was observed.  Therefore, it can appropriately place emphasis on the transitions we observed enough times to trust while disregarding the transitions we don't trust.  To accomplish this, I draw on Bayesian statistics.  I can't do this subject justice here, but if you're ever trying to make sense of data that you have varying degrees of faith in, I highly recommend you look into Bayesian statistics.
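One way to see the count-aware idea in code: sample the posterior over each transition probability and compare credible intervals, so that transitions observed many times get narrow, trustworthy intervals while singly-observed transitions get wide ones we can discount. This is a sketch of the general Bayesian principle, not the specific algorithm described in the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

def transition_credible_interval(counts_ij, counts_i_total, n_samples=10000):
    """Posterior mean and 95% credible interval for a transition
    probability, sampled from a Beta posterior under a uniform prior.
    A hypothetical stand-in for count-aware weighting."""
    samples = rng.beta(counts_ij + 1, counts_i_total - counts_ij + 1, n_samples)
    lo, hi = np.percentile(samples, [2.5, 97.5])
    return samples.mean(), (lo, hi)

# Well-sampled transition: 100 counts out of 1000 exits from the state.
mean_a, ci_a = transition_credible_interval(100, 1000)
# Singly-observed transition: 1 count out of 1 exit.
mean_b, ci_b = transition_credible_interval(1, 1)
```

A coarse-graining procedure built on this principle would lean on the first kind of transition and effectively ignore the second, rather than treating both point estimates as equally reliable.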
