PcPerf.fr
Everything posted by PcPerf bot

  1. I wanted to post an update on our GPU3 beta test. It is going well, so we have put the GPU3 client on our high-performance client download page. This new client is required for all Fermi hardware, but it also allows pre-Fermi NVIDIA GPUs to access the new GPU3 cores. These cores are labeled core15 (which has already been extensively tested and is in production right now) as well as a new core16, which will appear in testing in the coming weeks. We are also working to finish our OpenCL port for ATI GPUs to support GPU3 on ATI, but there are still performance issues holding back this release. You can find more information about the key software behind the GPU3 cores at the OpenMM project website. If you're curious, there is OpenCL code there for ATI, and we invite the open-source OpenCL community to check out this code and see how they can help if interested (note that the code is released under an LGPL license). View the full article
  2. There has been an ATI WU shortage over the last few days. We have been working on it during the week, but there have been some unusual circumstances involved this time which have delayed a complete fix. We made a temporary fix early Sunday (1 AM Pacific time) which should help some donors, but we are continuing to monitor the situation and expect that a more complete fix will come next week. View the full article
  3. With Folding@home (FAH), we have the computer power to tackle challenging problems involved with protein folding. One of the interesting folding-related problems has to do with how proteins (and their conformational change) catalyze viral infection. While viral infection is not a major thrust of FAH, it has been a pilot project for several years. We are happy to announce the publication of some of our recent FAH scientific results: "Atomic-Resolution Simulations Predict a Transition State for Vesicle Fusion Defined by Contact of a Few Lipid Tails". This work represents a major step forward in this project, as we can now study the process in all-atom detail and get a better sense of the role of proteins and protein conformational change in the process. This paper describes work on the mechanism of vesicle fusion, a process involved in viral infection, the transmission of nerve impulses, and cellular secretion. In it, we have analyzed the mechanism of membrane fusion in greater detail than previously feasible, yielding predictions for how influenza may use this mechanism to enter cells. This analysis was powered by our Folding@home donors. FAH project 2681 directly contributed to this work; we are also following up on several other avenues. http://www.ploscompbiol.org/article/info:d...al.pcbi.1000829 A summary follows: Membrane fusion is a common underlying process critical to neurotransmitter release, cellular trafficking, and infection by many viruses. Proteins have been identified that catalyze fusion, and mutations to these proteins have yielded important information on how fusion occurs. However, the precise mechanism by which membrane fusion begins is the subject of active investigation. We have used atomic-resolution simulations to model the process of vesicle fusion and to identify a transition state for the formation of an initial fusion stalk. Doing so required substantial technical advances in combining high-performance simulation and distributed computing to analyze the transition state of a complex reaction in a large system. The transition state we identify in our simulations involves specific structural changes by a few lipid molecules. We also simulate fusion peptides from influenza hemagglutinin and show that they promote the same structural changes as are required for fusion in our model. We therefore hypothesize that these changes to individual lipid molecules may explain a portion of the catalytic activity of fusion proteins such as influenza hemagglutinin. View the full article
  4. Here's an update on our v7 client efforts. The v7 client is a complete rewrite intended to make the client more reliable, simpler to maintain, and easier to extend with features that donors have requested. The first versions will not have everything that donors have asked for, but there will be some significant changes, such as the integration of the classic, SMP, and GPU clients, the ability for a single client to handle all of these (e.g. multi-core + GPU) simultaneously via multiple cores (a sketch of this slot idea follows at the end of this list), a cleaner and more reliable GUI, much better and more sophisticated tools for third-party developers, and in time the ability to use the FAH client to manage multiple machines easily. The console version of the v7 client has gone through internal testing and we're starting a very limited alpha release. There are definitely some rough edges, but I'm excited to see it get this far. Assuming there are no show stoppers, I hope that we'll have something ready for open beta testing in a month or two, maybe sooner. View the full article
  5. One key SMP server is down (vspg9), which brings down all of its associated interfaces (vspg9a, vspg9b). This is making us very short on SMP WUs. We are actively working on this one, although our IT staff has told me that it isn't an easy fix (multiple restarts haven't brought the key RAID back, and they are working with the hardware vendor to see what's going on). I will post updates as we get them. View the full article
  6. The GPU3 open beta test for NVIDIA GPUs is going well. There have been a few issues uncovered and our team is working on them. However, there haven't been any major show stoppers so far, which is good news. Here's what lies ahead. We will continue QA in an open beta format for a little while longer until we can resolve the remaining most significant issues. Then, the new client will replace the existing GPU client. However, the science/WS switchover will take much longer. In general for FAH, science calculations that were started with one core (e.g. core11) will need to be completed with it. New NVIDIA GPU projects will start up with core15 (although there may be a few projects already in the pipeline that will still use core11), but this switchover to using only core15 could take a while, easily 3 to 6 months, or maybe longer depending on how long we need to complete the existing core11 projects. Since Fermi boards must run core15, we are prioritizing core15 assignments to Fermi (with only a few core15 projects, we would run out of Fermi work unless we do this; a simplified sketch of this priority rule follows at the end of this list). So, donors without Fermi cards will likely see mostly core11 WUs short term. This will change as more core15 projects come online in the coming months. However, there is still a major benefit for GPU donors to run the new client. As we switch over, more and more WUs will be running in core15, so running the old client would eventually lead to WU shortages, etc. There is no need to switch over immediately as there will be plenty of core11 WUs for quite a while (e.g. on the weeks-to-months timescale). We are also making a major push for OpenCL on ATI and NVIDIA. We are working closely with NVIDIA and ATI on this and together we are making progress, although this new core does still seem to be some time out. View the full article
  7. We have a new GPU core (core15) going into an open beta test for NVIDIA clients. This core requires a new client (see below) as well as the latest drivers (197.45). This core is the first run of the GPU3 technology, derived from the OpenMM project at Stanford (http://simtk.org/home/openmm). You can find more information in our GPU3 FAQ (see the URL below). While this release is for NVIDIA only to start, we are actively pushing ATI support (with the help of AMD/ATI), although we have no ETA at the moment. This is the first open beta test of this new client and core, so there are likely bugs to be found as more donors try this out on more diverse sets of hardware. The documentation (GPU3 FAQ) is new too, so there may be some errors there as well. However, the client has been QA'd both internally at Stanford and with our closed group of beta testers and is looking pretty good so far. Some testers in the closed beta test have found problems with 8800- and 9800-class GPUs (we are working on this). Please post bugs or issues in the new GPU3 section of this forum. NVIDIA client downloads: SYSTRAY: http://www.stanford.edu/~friedrim/.Folding@home-systray-632.msi (md5sum=effd87ba12c96be28e252bccbe776ff9) VISTA CONSOLE: http://www.stanford.edu/~friedrim/.Folding@home-Win32-GPU_Vista-631.zip (md5sum=b41301886881958c64c1907b3ed6acae) XP CONSOLE: http://www.stanford.edu/~friedrim/.Folding@home-Win32-GPU_XP-631.zip (md5sum=885e36a477d247487f8009335bd4e3cc) GPU3 FAQ: http://folding.stanford.edu/English/FAQ-NVIDIA-GPU3 (a sketch of verifying these md5sums follows at the end of this list). View the full article
  8. We've been working on the WU shortage issue and have some positive items to report. First, we have greatly improved the AS logic so it uses more information about the CPU. This information is only available in the v6 client or later, so it is important to upgrade to v6 if you're not getting WUs. The main gist is that we can now identify directly whether a machine has SSE or SSE2 support, so we can better assign to cores that only support SSE or SSE2 (such as the Protomol core, which currently only supports SSE2; a sketch of this matching follows at the end of this list). This should be a big help to Linux clients as well, which were not well handled by the AS before. Second, there are a lot of available WUs for Protomol right now, but only for Advanced Methods clients. If you would like to try that out, set your client to the "Advanced Methods" setting. Note that the Protomol team looks to have fixed the checkpoint bug (which has kept this core at the Advanced Methods QA level) and we hope to roll out this core to all of FAH once again with this issue fixed. Finally, we have also identified a potential issue with the AS code which might make its logic fail in certain cases. Basically, in the old days of FAH, we could get away with 32-bit floating-point numbers for internal AS calculations, but now, with so many servers and all of FAH's complexities, floating-point roundoff in certain AS logic could be causing problems. We will be working on a fix for this, but this is something we must do carefully (not just a global replace of float -> double) and so it will take some time to implement and test. View the full article
  9. We're low on jobs for machines without SSE capabilities. We are working to fix this. By the way, I often get asked "how come FAH can get low on jobs?" This is a good question: since FAH studies temporal phenomena, when one work unit (WU) comes in, the work servers automatically build the next one. So, it should be impossible (or at least very difficult) to run out of jobs, if everyone plays by the rules. But that's not the case. Many people attempt to "cherry pick" WUs, i.e. they dump WUs until they get one which is most favorable for them points-wise. This means that they take away WUs from other people, since our server waits until the WU times out before sending it to someone else (a toy illustration of this rule follows at the end of this list). This can take a long time on certain WUs. We have several schemes implemented to fight cherry picking and keep WUs flowing to all the donors, but sometimes the cherry picking gets very aggressive and we run out of WUs, like today. We are looking into addressing this issue short term (getting more jobs going) as well as long term (better solutions to the cherry-picking problem). The FFF bonus scheme is one example of such a plan, and it seems to be working reasonably well. We are looking into expanding it more broadly. However, you can help us help other donors (and keep our research going). Please do not cherry pick WUs. It slows down FAH's progress, makes other donors unhappy, and (e.g. based on FFF schemes) will lead to lower points for those who do this in the future. View the full article
  10. One machine (vsp08) and its interfaces (vsp08a, vsp08b, vsp08c) are down. Our sysadmins have been notified. Their response is slower on weekends, so this may have to wait until Monday to come back up. View the full article
  11. We needed to take one physical server down, but that takes down several interfaces, including vsp07, vsp07v, vsp07b, vsp17, vsp17v, vsp22, and vsp22v. The machine is running fsck now, so it may take a few hours for it to come back. UPDATE 7am PST Apr 28 2010: This machine is still down and may be having trouble coming back from its restart. We will know more when the sysadmins report on this later today. UPDATE 2pm PST Apr 28 2010: This issue is now resolved. View the full article
  12. As we've discussed in previous posts, due to its great computational abilities, our GPU client has had a great scientific impact so far. In our most recent FAH paper, the GPU clients play a star role in allowing Folding@home to push to unprecedented levels, simulating protein folding on the millisecond timescale in an atomistic model. We are prepping for the rollout of the next-generation GPU client (GPU3). As mentioned in previous posts, GPU3 will allow for greatly enhanced science (including more accurate models, new kinds of science, 2x faster execution of the science, more stable simulations, OpenCL support for run-time science optimizations, and greater flexibility for adding new scientific capabilities). This is accomplished through the use of the OpenMM GPU library (which originally came from FAH GPU code, but has been significantly enhanced by Simbios staff). We would like to give donors a heads up on what's coming. We are doing internal testing now and will do closed beta testing hopefully soon. With the rollout of the new GPU3/OpenMM-based core (core15) for NVIDIA GPU clients, we will need donors to do two software installs (please note that this is not required immediately, since the new client is not openly available yet): 1) In order to get WUs using this new core, donors will need to make sure their CUDA level is at least CUDA 2.2, but ideally 2.3 or the most recent. To know which version of CUDA you have, you can find out based on your driver version: CUDA 2.0: 177.35+; CUDA 2.1: 180.60+; CUDA 2.2: 185.85+; CUDA 2.3: 190.38+; CUDA 3.0/OpenCL: 195.36+ (a sketch of applying this table follows at the end of this list). 2) A new client will be needed to access GPU3 WUs. This new client will report the CUDA level to the assignment server, so it can assign around machines with less capable CUDA levels. Note that "assigning around" the issue means that if your client can't do the work available, it won't be assigned a WU, so it's best to make sure your CUDA drivers are reasonably up to date. We feel this is better than giving out a WU which will crash the core, etc. While the new client has not been openly released yet, we wanted to give this heads up to donors so they have time to upgrade their drivers. Thanks to all of the GPU folders. We have done some great work so far and the best results are yet to come! View the full article
  13. We had some WU shortages over the weekend, but for the most part handled the biggest demands. However, we still have shortages for certain types of WUs, especially for pre-v6 clients. We are working to add more A3 WUs as well as more staff members to prepare A3 WU projects. Several new classic WUs came online. Also, once the Protomol core (B4) gains broader applicability to older hardware (e.g. pre-SSE), those WUs will be able to roll out more broadly as well. View the full article
  14. We have been working behind the scenes to optimize the Folding@home GPU client for the new NVIDIA GTX 4xx hardware. So far it's been going well, and we are hitting some strong performance numbers. We are testing this internally and hope to release it for outside beta testing soon (within weeks). Please note that GTX 4xx support will require a new client and also requires some changes to our cluster backend software. View the full article
  15. We will likely be taking an ATI GPU server (vspg2v2) down today for maintenance. We are working to get more jobs on its sister server (vspg3v2) to avoid WU shortages, but we're giving donors a heads up just in case the timing doesn't work out well. View the full article
  16. There have been some questions about the server status for Folding@home. The problem from the donor perspective is that a lack of WUs looks very similar to the servers being down -- the client reports it can't connect to the server. The servers have been up and in good shape since about Feb 25 (see the blog post on that day -- and that was just for the limited case of NVIDIA GPU servers, not FAH-wide). However, we have had a WU shortage now and again over the last week or so, which donors have mistaken for server reliability issues. To fix this, we have been working to greatly increase the number of WUs, both for classic and SMP clients. There are a very large number of classic WUs coming out in new cores: the Protomol B4 core and the new Gromacs A4 core. B4 has rolled out and A4 is coming out in a week or so (maybe sooner). We also have new A3 SMP WUs in the pipeline; there have been some science issues with them that we have been working out. Right now we are running a bit too close to demand, which leads to shortages for certain types of clients. My goal is to have far more WUs than we need so there are never any donor delays in getting WUs, and we are pretty close to that, once the last few issues get worked out. View the full article
  17. It is looking like the WS bug fix has helped resolve the NVIDIA GPU work server issues. We are keeping a close eye on things, but it looks like the situation has been stable so far. View the full article
  18. Joe has been pounding on the v5 WS, trying to shake it out after the recent disaster with problems returning NVIDIA GPU WUs. The upshot of all of this is that the v5 server code was pushed hard in many ways and several issues have now been found. Joe is testing them, but we're hopeful that, beyond the initial good news we had a few days ago, several additional issues may now be fixed. It's too early to tell since we're still testing, but I'm optimistic. This only affects particular servers (vsp07b, vspg10a, vsp11a) and the vsp09a CS. View the full article
  19. We have been working to track down the nasty bug on the NVIDIA GPU WSs that is causing problems for donors sending back WUs. We have been trying different fixes over the last week, but this has been very tricky to figure out. After another brainstorming session this afternoon, I think we have a good plan for the short term and long term. I hope that newly assigned WUs won't see this problem, due to rerouting of assignments. Joe is also going to pound out the bugs on his new WS on vspg11a to get that going. I'm very sorry for this major issue. This has been called the worst outage we've had, and I think we agree. I've had a long chat with the development team about this and we've talked about how to fix issues in the WS code release cycle. I think the plan we have in place will stop this from happening in the future, but the main issue right now is to solve the problems at hand. View the full article
  20. It looks like the main Stanford web servers (including those which host the main university page, www.stanford.edu) are down or very slow. We've filed a ticket. Note that the stats pages on fah-web.stanford.edu run on a separate system and that server is up, including DONOR STATS: http://fah-web.stanford.edu/cgi-bin/main.py?qtype=userstats TEAM STATS: http://fah-web.stanford.edu/cgi-bin/main.py?qtype=teamstats View the full article
  21. Our main stats web server is being hit with a denial-of-service-like attack from several machines. They are accessing cgi-bin URLs multiple times per second per IP, which is slowing down the web server for everyone else. We have banned some IPs, and we will look back and ban more as needed. Please stop running scripts -- it ruins the stats for everyone else. UPDATE: It seems like the attackers are often going to the fahproject page, so we have deactivated it for now to keep the rest of the site up. This seems to have helped a lot, coupled with some IP banning. View the full article
  22. We're done with the bulk of our initial update to new hardware. We'll be doing some more work in the future to build up additional capacity, namely hopefully getting to the point where the stats are never offline. For now I think we're in good shape. The stats are much faster than before, so we've turned back on a lot of the capabilities we previously turned off. Also, stats updates are taking about 5 minutes and are now limited not so much by db access as by other issues. Moreover, we have now set the third-party stats files to update once an hour (instead of once every 3 hours). They are set to update 10 minutes before the hour, every hour, so checking on the hour should be safe (a download sketch follows at the end of this list). Note that the pages that are updated are: http://fah-web.stanford.edu/daily_user_summary.txt.bz2 http://fah-web.stanford.edu/daily_team_summary.txt.bz2 Please do not use scripts to access our main pages (i.e. anything with cgi-bin in the URL). We reserve the right to ban any IP that violates this rule, as it slows down the stats for everyone else. View the full article
  23. We've talked about this for some time, but now's the time to start the migration to the new stats db hardware. We are doing it now and everything looks OK so far. We are keeping several safeguards in place in case there is a problem. If there is a problem with the stats, please bear with us. There are several links we need to update and it's possible that a link is still pointing to the old db. Also, in case of emergency, we are keeping track of all the new stats from this point on in a special place, so even in the worst-case scenario, we can just go back to the old db and input all the new stats into it. So, the stats will be down for a bit and there may be some inconsistencies for a day or so while we get all the links updated. The good news is that we'll have much faster stats soon, which will be great for all of us. UPDATE 1: The migration is now done and it looks like everything is working. We've tested the stats pages and done a small manual stats update. All looks good. However, since stats are so important, before diving in and putting everything back on automatic updates, I wanted to see if donors notice any problems. If you do, please report them in our forum (http://foldingforum.org). UPDATE 2: It looks like everything has migrated well. We have the stats back on normal updates and those updates are going fast (under 10 minutes). With the new hardware, I bet we can make it even faster, but that's for later. We have turned back on certain features we previously turned off (e.g. CPU counts). We have more ambitious plans for the future, especially ideally getting to the point where the stats are never offline (even during updates), which is now possible with the new hardware. View the full article
  24. Over the weekend, we've had trouble with cron on the server which initiates stats updates. This doesn't affect the stats data itself (past, present, or future), just the initiation of various tasks. This machine was backed up and then restarted this morning and it looks to be happy now. We are keeping an eye on things to make sure that all is back to normal. View the full article
  25. The electrical power is out at most Stanford buildings as of 5:20 this morning (Tuesday, Jan 19). SHC Engineering and Maintenance reports that the Stanford Cogen power plant (the plant that powers Stanford) is currently offline. Emergency generators in most of our server rooms are operating, but one room (associated with VSPG* servers) is currently without power. We do not currently have an estimate of when power will be restored. Moreover, while much of FAH is still up at the moment, we may have to take servers down if the temperature in the server rooms gets too high (the cooling is down as well). We'll update the blog as we get news. You can also see which servers are up or down on our serverstat page. View the full article
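
The v7 description in item 4 centers on one client driving several work resources ("slots") at once. The following Python sketch is purely illustrative and assumes nothing about the real v7 code: the Slot class and run_slot() are invented names, and a thread pool stands in for the client's own scheduling.

    # Hypothetical illustration of the "one client, many slots" idea from item 4.
    # Slot and run_slot() are made-up names; the real v7 client works differently.
    from concurrent.futures import ThreadPoolExecutor
    from dataclasses import dataclass

    @dataclass
    class Slot:
        slot_id: int
        kind: str        # "smp" or "gpu"
        resources: str   # e.g. "6 cores" or "GPU 0"

    def run_slot(slot):
        # Placeholder for fetching, folding, and returning a WU on this slot.
        return "slot %d (%s, %s): WU complete" % (slot.slot_id, slot.kind, slot.resources)

    # A single client process drives a CPU slot and a GPU slot concurrently.
    slots = [Slot(0, "smp", "6 cores"), Slot(1, "gpu", "GPU 0")]
    with ThreadPoolExecutor(max_workers=len(slots)) as pool:
        for line in pool.map(run_slot, slots):
            print(line)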
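Item 6 explains that scarce core15 work is reserved for Fermi boards while pre-Fermi GPUs keep getting core11. Here is a simplified sketch of that priority rule; it is an illustration only, not the actual assignment server logic, and the function name and WU counts are invented.

    # Simplified illustration of the core15/core11 priority described in item 6.
    # Not the real assignment server code; pick_core() and its inputs are invented.
    from typing import Optional

    def pick_core(is_fermi: bool, core15_wus: int, core11_wus: int) -> Optional[str]:
        if is_fermi:
            # Fermi boards can only run core15; if none is available, they wait.
            return "core15" if core15_wus > 0 else None
        # Pre-Fermi GPUs can run either core, but core11 is preferred for them
        # so the limited core15 supply stays available for Fermi.
        if core11_wus > 0:
            return "core11"
        return "core15" if core15_wus > 0 else None

    print(pick_core(is_fermi=True, core15_wus=3, core11_wus=500))   # core15
    print(pick_core(is_fermi=False, core15_wus=3, core11_wus=500))  # core11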
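Item 7 publishes md5sums alongside each download. Here is a minimal sketch of checking one of those files before installing, assuming the installer has already been saved locally under its original name.

    # Verify a downloaded installer against the md5sum published in item 7.
    import hashlib

    EXPECTED_MD5 = "effd87ba12c96be28e252bccbe776ff9"  # systray 632 installer

    def md5_of(path):
        h = hashlib.md5()
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(1 << 20), b""):
                h.update(chunk)
        return h.hexdigest()

    digest = md5_of("Folding@home-systray-632.msi")
    print("OK" if digest == EXPECTED_MD5 else "MISMATCH: " + digest)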
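Item 8 describes the assignment server matching cores to the CPU features a v6 client reports. The sketch below shows the general idea with an invented requirements table (only the detail that Protomol currently needs SSE2 comes from the post); it is not the real AS code.

    # Toy version of capability-based assignment from item 8. The requirements
    # table is illustrative; only "Protomol currently needs SSE2" is from the post.
    CORE_REQUIREMENTS = {
        "protomol_b4": {"sse2"},
        "gromacs_sse": {"sse"},
        "classic_plain": set(),   # no special instruction set required
    }

    def eligible_cores(reported_flags):
        # A core is eligible only if every feature it needs was reported.
        return [core for core, needed in CORE_REQUIREMENTS.items()
                if needed <= set(reported_flags)]

    print(eligible_cores(["sse", "sse2"]))  # all three cores
    print(eligible_cores(["sse"]))          # everything except protomol_b4
    print(eligible_cores([]))               # only classic_plain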
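Item 9 notes that a dumped WU cannot be reissued until it times out, which is how cherry-picking starves other donors. The toy model below illustrates only that one rule; the timeout value, class names, and queue are invented and are not work-server code.

    # Toy model of the "wait for timeout before reassigning" rule from item 9.
    import time
    from dataclasses import dataclass
    from typing import Optional

    TIMEOUT_SECONDS = 3 * 24 * 3600  # invented example; real timeouts vary per project

    @dataclass
    class WorkUnit:
        wu_id: int
        assigned_at: Optional[float] = None

        def assignable(self, now):
            # Never assigned yet, or the previous assignment has timed out.
            return self.assigned_at is None or now - self.assigned_at > TIMEOUT_SECONDS

    def next_wu(queue, now):
        for wu in queue:
            if wu.assignable(now):
                wu.assigned_at = now
                return wu
        return None  # every WU is still locked to a (possibly idle) client

    queue = [WorkUnit(1), WorkUnit(2)]
    now = time.time()
    print(next_wu(queue, now))  # WorkUnit 1 is handed out
    print(next_wu(queue, now))  # WorkUnit 2 is handed out
    print(next_wu(queue, now))  # None: dumped or idle WUs stay locked until timeout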
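Item 12 gives a driver-version-to-CUDA-level table and says core15 needs at least CUDA 2.2. Below is a small sketch of applying that table; the thresholds come from the post, while the function and the example driver number are only for illustration.

    # Map an NVIDIA driver version to a CUDA level using the table in item 12,
    # then check it against the CUDA 2.2 minimum mentioned for the GPU3 core.
    DRIVER_TO_CUDA = [          # (minimum driver version, CUDA level)
        (195.36, "3.0/OpenCL"),
        (190.38, "2.3"),
        (185.85, "2.2"),
        (180.60, "2.1"),
        (177.35, "2.0"),
    ]

    def cuda_level(driver_version):
        for min_driver, level in DRIVER_TO_CUDA:
            if driver_version >= min_driver:
                return level
        return "pre-2.0"

    installed = 197.45  # example value; this is the driver version item 7 asks for
    level = cuda_level(installed)
    ok = level not in ("pre-2.0", "2.0", "2.1")
    print("Driver %.2f -> CUDA %s -> %s" % (installed, level,
          "GPU3-ready" if ok else "too old for GPU3"))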
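Item 22 asks third-party stats sites to pull the hourly bz2 summaries instead of scraping cgi-bin pages. Here is a minimal sketch of doing that, suitable for running on the hour (the files are regenerated 10 minutes before the hour); it assumes nothing about the file contents beyond bz2-compressed text.

    # Fetch the third-party stats summaries listed in item 22 rather than
    # hitting cgi-bin pages with scripts.
    import bz2
    import urllib.request

    URLS = [
        "http://fah-web.stanford.edu/daily_user_summary.txt.bz2",
        "http://fah-web.stanford.edu/daily_team_summary.txt.bz2",
    ]

    for url in URLS:
        with urllib.request.urlopen(url) as resp:
            text = bz2.decompress(resp.read()).decode("utf-8", errors="replace")
        lines = text.splitlines()
        first = lines[0] if lines else "(empty)"
        print("%s: %d lines, first line: %s" % (url, len(lines), first))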