A Question about Naming/Numbering Conventions for FoldWork

Posted on Fri Oct 05, 2012 11:27 am

I'd post this in one of the F@H "official" forums, but I don't have an ID yet. Will set one up if I decide to keep folding after Frankie.

So what is the numbering convention in parentheses that comes after the project number?

For example, I've got one running right now for 7808(6,32,27) on my i7 laptop; it's set to wrap up in about 8 hours. It originally had a 1.2-day estimate, but I lost a few hours when the laptop went to sleep while I was at work yesterday. This piece of work seems to be a big one, with a 7,000-ish point bonus for beating the timeout. Barring any unforeseen problems, that machine will easily finish in plenty of time.

So what does the 6,32,27 part mean? Do any of those numbers indicate the amount of processing/resources that might be expected?

Thanks!
BIF
Gerbil Jedi
Gold subscriber
 
 
Posts: 1629
Joined: Tue May 25, 2004 7:41 pm

Re: A Question about Naming/Numbering Conventions for FoldWo

Posted on Fri Oct 05, 2012 11:40 am

Those numbers are the RUN, CLONE, and GENERATION of that particular protein.

http://fah-web.stanford.edu/cgi-bin/fah ... ned?p=7808

Dan Ensign @ Stanford wrote: Okay, here it is: The CLONE numbers are labels for each trajectory that we run. Each GENeration is another chunk of time along that trajectory. So, say that I benchmark CLONE0, GEN0 (the first 4 ns). That WU is then done, and the FAH software builds a new WU with starting coordinates (and velocities and stuff) where mine left off. Then the new WU -- GEN1 of CLONE0 -- gets sent to you, and you simulate the next 4 ns. And so on. So CLONE is a label for an individual trajectory, and GENerations are time steps along that trajectory.

RUNs are groups of similar CLONEs. All the CLONEs in a RUN have the exact same atoms, the exact atom positions, the same temperature, etc. The difference is the starting velocities -- the initial motions of all the atoms in the protein are randomized. Although statistically the velocities are determined by the temperature, there are countless ways of partitioning the velocities to the atoms, so we try out 100 or so CLONEs to get a good feel for the sample space. Assigning different velocity sets to the atoms turns out to be wildly important: if the conformation we start with happens to represent the transition state (sort of halfway from folded, halfway from unfolded) then 50 of our 100 CLONEs will fold, and 50 won't.

The different RUNs in a PROJect might, in their simplest form, represent different starting conformations. So, we could start off 100 RUNs of different partially unfolded structures and try to find the one for which half of its CLONEs fold -- then that RUN has the conformation of a representative of the transition state.
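To make the labeling concrete, here is a rough Python sketch that splits a label like BIF's 7808(6,32,27) into its PROJECT, RUN, CLONE, and GEN parts. The names `WorkUnitId`, `parse_work_unit`, and `simulated_ns_before` are made up for illustration and are not part of any F@H client. The 4 ns per generation figure comes from Dan's example above and is only an assumption here; the real chunk length varies by project.

```python
import re
from typing import NamedTuple


class WorkUnitId(NamedTuple):
    """Hypothetical container for a PROJECT(RUN,CLONE,GEN) label."""
    project: int
    run: int
    clone: int
    gen: int


# Matches labels such as "7808(6,32,27)", tolerating stray spaces.
_WU_PATTERN = re.compile(
    r"^\s*(\d+)\s*\(\s*(\d+)\s*,\s*(\d+)\s*,\s*(\d+)\s*\)\s*$"
)


def parse_work_unit(label: str) -> WorkUnitId:
    """Split a label into project, run, clone, and generation numbers."""
    match = _WU_PATTERN.match(label)
    if match is None:
        raise ValueError(f"not a PROJECT(RUN,CLONE,GEN) label: {label!r}")
    return WorkUnitId(*(int(group) for group in match.groups()))


def simulated_ns_before(wu: WorkUnitId, ns_per_gen: float) -> float:
    """Trajectory time already simulated before this generation starts.

    GEN0 covers the first chunk, so GEN n starts after n earlier chunks.
    ns_per_gen is an assumption supplied by the caller.
    """
    return wu.gen * ns_per_gen


wu = parse_work_unit("7808(6,32,27)")
print(wu)
# Assuming 4 ns chunks as in Dan's example (not confirmed for project 7808):
print(simulated_ns_before(wu, ns_per_gen=4.0), "ns simulated before this WU")
```

Read this way, 7808(6,32,27) is project 7808, RUN 6, CLONE 32 (one trajectory), GEN 27 (the 28th chunk along that trajectory, since numbering starts at 0).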
Main: i7 4790K - Z97 mATX - 16GiB DDR3 1866 - GTX 970 - 256GB 850 PRO - 500GB 840 EVO - HGST 3TB - U2415 - Win8 Pro x64
Work/Play: 2012 13" Macbook Air
Work: 2014 Dell Inspiron 7000
DancinJack
Minister of Gerbil Affairs
 
Posts: 2067
Joined: Sat Nov 25, 2006 3:21 pm
Location: Massachusetts

Re: A Question about Naming/Numbering Conventions for FoldWo

Posted on Fri Oct 05, 2012 3:18 pm

Fascinating, but... damn, I think I just heard a popping sound in my brain while I read that! :o

Thanks for that info; I'm loving this stuff!

So then, a follow-up question. It's obvious that run/clone/gen does not indicate the "bigness" or complexity of a work unit. Is there a consistent way to tell, or is it maybe easiest to judge from the base points awarded and/or the bonus points?

My system's reported "ETA" is one way, but ETA on my i7 doesn't align with ETA on the old Q6600, so it's really hard to compare.
BIF
Gerbil Jedi
Gold subscriber
 
 
Posts: 1629
Joined: Tue May 25, 2004 7:41 pm

