BOINC 6.4.5 and CUDA?

Message boards : Number crunching : BOINC 6.4.5 and CUDA?



Profile mdoerner
Volunteer developer
Volunteer tester
Joined: 30 Jul 08
Posts: 202
Credit: 6,998,388
RAC: 0
Message 771 - Posted: 29 Jan 2009, 23:46:17 UTC

Hiya TJM,

Looks like the latest version of BOINC supports NVIDIA GPUs as co-processors. Any chance our source code can be updated to accommodate those of us with NVIDIA GPUs? I suppose it's just as easy to update the code with the AMD ACML libraries too.....;-)

Mike Doerner
Profile mdoerner
Volunteer developer
Volunteer tester
Joined: 30 Jul 08
Posts: 202
Credit: 6,998,388
RAC: 0
Message 789 - Posted: 15 Feb 2009, 23:32:35 UTC - in response to Message 771.  

What....no sense of humor here???? ;-)

Mike Doerner

PS I take it that incorporating CUDA technology into the code is a rather big deal.....
thinking_goose

Joined: 12 Nov 07
Posts: 116
Credit: 1,105,645
RAC: 0
Message 790 - Posted: 16 Feb 2009, 0:32:26 UTC - in response to Message 789.  

I have also noticed that the new client supports these GPUs. It would be nice to take advantage of this facility, but at the moment I believe only one or two projects have actually attempted to incorporate it. I can only think it is very involved.
noderaser
Joined: 24 Dec 08
Posts: 88
Credit: 629,026
RAC: 0
Message 791 - Posted: 16 Feb 2009, 7:27:03 UTC

This question is being asked at virtually every project, and the answer for most is either "not very soon" or "never". I'm no programming master, but GPUs are not designed for general-purpose computing, and there are only a few projects where GPUs would be useful for the types of calculations being done. I'm going to say that codebreaking is probably not one of the project types that would benefit from adding GPU support.

I would expect that when things like OpenCL come out, making programming for CPUs and GPUs virtually the same, you will see GPU support for more applications and projects.

There are others who can describe the inner workings of a GPU, and why they're not ideal for all projects, better than I can, but that's the gist of it.

Profile TJM
Project administrator
Project developer
Project scientist
Joined: 25 Aug 07
Posts: 843
Credit: 69,628,840
RAC: 383,626
Message 792 - Posted: 16 Feb 2009, 13:12:50 UTC

Implementing CUDA basically means the whole app has to be rewritten to support GPU or CPU+GPU. This is definitely going to be too hard for me; my C/C++ programming skills are limited, and my knowledge of CUDA programming is close to none :-P

M4 Project homepage
M4 Project wiki
batan

Joined: 26 Nov 08
Posts: 1
Credit: 11,815
RAC: 0
Message 795 - Posted: 16 Feb 2009, 16:36:39 UTC - in response to Message 791.  
Last modified: 16 Feb 2009, 16:37:52 UTC

... I'm going to say that codebreaking is probably not one of the project types that would benefit from adding GPU support.

From the Wikipedia article about CUDA: "CUDA has also been used to accelerate non-graphical applications in computational biology, cryptography and other fields by an order of magnitude or more."
Carter11

Joined: 25 Nov 08
Posts: 12
Credit: 148,116
RAC: 1
Message 799 - Posted: 18 Feb 2009, 18:29:33 UTC

If I understand the server status page right, it will be at most one year until all the work for this project is done (after hceyz is done in July, completion time for awgly will decline too, right?). I wonder if it's worth rewriting the whole app for that.
bill brandt-gasuen

Joined: 19 Oct 08
Posts: 1
Credit: 1,097,807
RAC: 0
Message 800 - Posted: 24 Feb 2009, 5:08:43 UTC

I for one say don't mess with a good thing. For all TJM's claims of lacking the required proficiency, this project has run like clockwork ever since I started.
I just hope the solution doesn't end up like Walter Miller's A Canticle For Leibowitz!
Petter Neumann

Joined: 26 Jan 09
Posts: 1
Credit: 2,865,602
RAC: 0
Message 1275 - Posted: 1 Oct 2009, 15:20:49 UTC - in response to Message 800.  

Enigma CUDA support??

Could this be implemented??

http://www.werty98.homelinux.com/tech/enigma/
fitz

Joined: 15 Apr 09
Posts: 31
Credit: 147,954
RAC: 0
Message 1276 - Posted: 1 Oct 2009, 18:29:28 UTC

Looks like I may have to get a CUDA card :( ... but it would certainly be cool. Might be worth seeing if they could port it to OpenCL - I think there are a lot of peeps like me with ATI cards out there who are feeling left out!
Profile TJM
Project administrator
Project developer
Project scientist
Joined: 25 Aug 07
Posts: 843
Credit: 69,628,840
RAC: 383,626
Message 1279 - Posted: 3 Oct 2009, 0:44:40 UTC - in response to Message 1275.  
Last modified: 3 Oct 2009, 0:57:02 UTC

Enigma CUDA support??

Could this be implemented??

http://www.werty98.homelinux.com/tech/enigma/


This is a Turing bombe simulator - a completely different approach from what we use here, and useless without cribs. I think most of the possible cribs for the third message have been tried already; I remember that someone on the M4 mailing list was trying to solve the 3rd message that way.
This of course doesn't mean we couldn't try more cribs, but there are at least two problems:

- The bombe input data cannot be auto-generated. I think it wouldn't be too hard to write a script/program to auto-generate the bombe menu, or multiple menus, but it would still require user interaction to provide a crib for each workunit (or set of workunits - perhaps a better approach would be to try a crib at all possible locations).

- The bombe output is a (sometimes very long) list of machine settings. Again, checking it can only be partially automated - writing a script to produce the plaintext for each entry on the list won't be too hard, but then someone has to review all the results. Theoretically this could also use bigram/trigram scoring to detect/mark the best candidates.
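Trying a crib at all possible locations can at least be pre-filtered automatically: an Enigma machine never encrypts a letter to itself, so any alignment where the crib and the ciphertext share a letter at the same position is impossible. A minimal sketch of that filter (the example strings are hypothetical, not actual M4 data):

```python
def possible_crib_positions(ciphertext, crib):
    """Return offsets where the crib could align with the ciphertext.

    Enigma's reflector guarantees that no letter ever encrypts to
    itself, so any offset where crib and ciphertext agree on a letter
    can be ruled out before running the bombe.
    """
    return [
        i
        for i in range(len(ciphertext) - len(crib) + 1)
        if all(c != p for c, p in zip(ciphertext[i:], crib))
    ]

# Hypothetical example: "WETTER" (weather) can only align at offsets
# where it never collides with an identical ciphertext letter.
print(possible_crib_positions("RWETIVBXTTERWQA", "WETTER"))
```

This kind of filter typically eliminates a large fraction of offsets before any machine settings are tried, which is why crib-dragging was practical even by hand.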

Integrating the simulator into BOINC seems quite easy; I believe it would run without problems under the standard BOINC wrapper, though probably without a progress meter.

I'm open to any suggestions - if you want to run a 'distributed bombe', I'll ask the author of the simulator for permission to use his app and eventually we'll do it.


Looks like I may have to get a CUDA card :( ... but it would certainly be cool. Might be worth seeing if they could port it to OpenCL - I think there are a lot of peeps like me with ATI cards out there who are feeling left out!


Is OpenCL support for ATI cards finished already? I just started learning GPU programming, so I thought I'd try to learn both CUDA and OpenCL in parallel.
I have access to the latest NVIDIA tools/drivers/docs (I'm a registered developer), so I've already played a bit with OpenCL. For now it still seems quite buggy; even with the examples provided with the SDK I had glitches here and there, and the compiler itself didn't like my install paths at all, so I had to edit a few things to make it work.
Orakk
Joined: 11 Oct 09
Posts: 1
Credit: 786
RAC: 0
Message 1296 - Posted: 11 Oct 2009, 1:13:02 UTC

MilkyWay and Collatz are doing well with both CUDA and ATI right now. Working well too is BOINC development; 6.10.3 is mostly behaving itself.
SeriousCrunchers@Home
MJ

Joined: 17 Nov 07
Posts: 16
Credit: 95,844
RAC: 0
Message 1301 - Posted: 23 Oct 2009, 20:49:50 UTC

I think it would be really cool for people to be able to suggest cribs. It would help keep interest in the project. Too bad you kind of have to know German, though. I guess you could still just try common words from Enigma messages without knowing German.
Profile doublechaz

Joined: 5 Mar 09
Posts: 15
Credit: 1,123,069
RAC: 0
Message 1375 - Posted: 19 Nov 2009, 1:19:20 UTC

Given this project, I don't advocate CUDA. It's too bad, though. I tried distributed.net's RC5 tool on CUDA and got a speedup of about 100x with my GTX 285: 3 Mkeys/s to 308 Mkeys/s. Wow!

But we'll finish this project in less time than the port would take. At least I hope we will. :)
Profile TJM
Project administrator
Project developer
Project scientist
Joined: 25 Aug 07
Posts: 843
Credit: 69,628,840
RAC: 383,626
Message 1409 - Posted: 26 Nov 2009, 1:04:34 UTC - in response to Message 1375.  
Last modified: 26 Nov 2009, 1:12:16 UTC

Well, the third Naval M4 message is still unbroken, so I think a CUDA app might still be useful in the future.
For now, instead of porting the entire app, I'd rather go for a brute-force approach.
The hillclimbing algorithm isn't that complicated, but rewriting it to run fast on a GPU might exceed my skills. I'm not sure an algorithm with so many conditionals and loops / short procedures could even run at a decent speed on a GPU.
Brute force, on the other hand, is much simpler, so it will be much easier to implement on a GPU. At least I hope so. It has to be extremely fast to be useful, because the total number of possible machine settings for an M4 Naval Enigma is just insane:

619 953 965 349 522 804 000 000


Going through such a huge number of combinations would be impossible on CPUs, but a parallel algorithm on modern GPUs might be able to do it.
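For scale, the plugboard alone contributes the dominant factor in that count: with 10 cables there are about 1.5 × 10^14 ways to wire it. A short sketch of the standard combinatorics (ordinary textbook counting, not code from the project):

```python
from math import comb, factorial

def plugboard_settings(pairs=10, letters=26):
    # Choose which 2*pairs letters are steckered, then count the
    # distinct ways to match those letters into unordered pairs:
    # (2p)! / (p! * 2^p) pairings for each choice of letters.
    pairings = factorial(2 * pairs) // (factorial(pairs) * 2 ** pairs)
    return comb(letters, 2 * pairs) * pairings

print(plugboard_settings())  # 150738274937250 ten-pair plugboard settings
```

The remaining factors (wheel orders, reflectors, ring settings, rotor positions) multiply this further; how they combine into the exact figure quoted above depends on what is counted as a distinct setting.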
The brute force itself seems realistic, but there's another problem - no one will ever be able to check all the results manually, so there must be some kind of automated scoring algorithm, perhaps more than one, to check the results and filter out the junk. It also has to be fast, otherwise it will slow down the entire process. I'll probably see if it's possible to run something similar to bigram/trigram scoring in parallel on a GPU.
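On a CPU, trigram scoring of the kind mentioned here amounts to summing log-probabilities over every overlapping three-letter window. A minimal sketch - the log values below are illustrative placeholders, not a real table, which would be built from frequency counts over a large German corpus:

```python
# Illustrative log-probability scores for a few common German trigrams.
# A real table would cover all 26^3 trigrams from corpus statistics.
LOG_TRIGRAMS = {"EIN": -2.3, "SCH": -2.4, "DER": -2.5, "CHT": -2.8}
FLOOR = -7.0  # penalty for any trigram missing from the table

def trigram_score(text):
    # Higher (less negative) totals indicate more plausible plaintext.
    return sum(LOG_TRIGRAMS.get(text[i:i + 3], FLOOR)
               for i in range(len(text) - 2))

# A candidate containing common trigrams outscores random-looking text.
print(trigram_score("DEIN") > trigram_score("QXZV"))  # True
```

Because each window is scored independently, this maps naturally onto a GPU: each thread can score one candidate decrypt against a table held in constant or shared memory.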
Sword

Joined: 18 Nov 09
Posts: 11
Credit: 1,052,256
RAC: 0
Message 1411 - Posted: 26 Nov 2009, 16:28:50 UTC - in response to Message 1409.  

The first two Naval messages from the M4 project are broken. Both were signed by U-boat commanders, either at the very end of the text or at the very beginning, as far as I understand. The third, unbroken message may also be from another U-boat commander.

I have an idea that may not be original or useful, but it came to me while reading about so-called cribs. Here is a link to a list of all German U-boat commanders during WW2: http://www.uboat.net/men/commanders/l.htm . The names of both Schroeder and Looks can be found there.

By first "washing" this list and then testing the last names in the right position, maybe it would help get nearer to a solution.

Profile mdoerner
Volunteer developer
Volunteer tester
Joined: 30 Jul 08
Posts: 202
Credit: 6,998,388
RAC: 0
Message 1414 - Posted: 27 Nov 2009, 19:10:58 UTC - in response to Message 1409.  

Hey TJM,

I believe the NVIDIA nvcc compiler is based on the PathScale/Open64 compiler, similar to what AMD did with Open64. You may not need to optimize it for the GPU, since I would think all the optimization would be done by the compiler. (I think)

Mike D
Profile TJM
Project administrator
Project developer
Project scientist
Joined: 25 Aug 07
Posts: 843
Credit: 69,628,840
RAC: 383,626
Message 1419 - Posted: 30 Nov 2009, 12:30:36 UTC - in response to Message 1414.  

Hey TJM,

I believe the NVIDIA nvcc compiler is based on the PathScale/Open64 compiler, similar to what AMD did with Open64. You may not need to optimize it for the GPU, since I would think all the optimization would be done by the compiler. (I think)

Mike D



It doesn't work that way. Theoretically it's possible to port a simple app to the GPU with only minor changes to the code, but without completely redesigning the app the performance is very poor. For example, some time ago I compiled a single function (icscore) to check its performance on a GPU. The standard code was very slow - on a GF 9600GT it could barely be compared to a Pentium III at 700-800 MHz - while the same function rewritten to fully use the GPU's capabilities ran around 50 times faster.
I think some parts of the Enigma code would be extremely slow on a GPU due to the large number of conditional statements switching/skipping parts of the code.
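For reference, icscore presumably computes the index of coincidence, a standard statistic for rating candidate decrypts. A plain-Python sketch of the textbook formula (not the project's actual code):

```python
from collections import Counter

def ic_score(text):
    """Index of coincidence: the probability that two letters drawn at
    random from the text are equal. German or English plaintext scores
    roughly 0.072-0.078, while uniformly random letters score about
    1/26 = 0.0385, so a high IC flags candidates worth a closer look.
    """
    n = len(text)
    counts = Counter(text)
    return sum(c * (c - 1) for c in counts.values()) / (n * (n - 1))

print(ic_score("AAAA"), ic_score("ABCD"))  # 1.0 0.0
```

The scoring loop itself is branch-free, which is why a carefully restructured version (one candidate per thread, histogram in shared memory) can run well on a GPU even when the surrounding Enigma simulation does not.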
Profile TJM
Project administrator
Project developer
Project scientist
Joined: 25 Aug 07
Posts: 843
Credit: 69,628,840
RAC: 383,626
Message 1421 - Posted: 30 Nov 2009, 21:05:28 UTC
Last modified: 30 Nov 2009, 23:44:40 UTC

I tested the brute-force method on a CPU. It takes around 30 seconds to test a single stecker setting, doing a full round on M3's single wheel order. Considering the number of all stecker connections (all combinations of 10 pairs out of 26 letters), that's way too slow.
Now the question is how fast a GPU app could run compared to the CPU. Without a ~1000x speedup, I doubt we could see any result without traveling in time :-)
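A back-of-envelope calculation with the numbers above shows the scale (a sketch only: it assumes one CPU core, one wheel order, and the standard count of 10-pair plugboard settings):

```python
PLUGBOARD_SETTINGS = 150_738_274_937_250  # 10-pair stecker combinations
SECONDS_PER_STECKER = 30                  # measured CPU time quoted above
SECONDS_PER_YEAR = 3600 * 24 * 365

cpu_years = PLUGBOARD_SETTINGS * SECONDS_PER_STECKER / SECONDS_PER_YEAR
gpu_years = cpu_years / 1000              # hoped-for ~1000x GPU speedup

print(f"{cpu_years:.2e} CPU-years, {gpu_years:.2e} GPU-years per wheel order")
```

Even with the ~1000x speedup this is on the order of 10^5 device-years per wheel order, so a brute-force search would only become plausible spread across many GPUs, and with a much faster per-setting test than the CPU baseline.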
Profile mdoerner
Volunteer developer
Volunteer tester
Joined: 30 Jul 08
Posts: 202
Credit: 6,998,388
RAC: 0
Message 1422 - Posted: 30 Nov 2009, 23:37:52 UTC - in response to Message 1421.  

I hope you get an NVIDIA GPU soon. Of course, my GPU is available for testing, should the need arise....;-) The one thing I didn't realize about my particular video card is that my G92 GPU (an older 9600 GSO card, made by PNY) is OVERCLOCKED by default from the factory. Occasionally it chokes, which is why I don't do GPUGrid.net anymore. I'm sure more cooling would help, or using nvclock to slow it down, but that's after DAYS of non-stop crunching. FWIW.

Mike D





Copyright © 2017 TJM