BOINC 6.4.5 and CUDA?

Message boards : Number crunching : BOINC 6.4.5 and CUDA?



Profile mdoerner
Volunteer developer
Volunteer tester
Joined: 30 Jul 08
Posts: 202
Credit: 6,998,388
RAC: 0
Message 771 - Posted: 29 Jan 2009, 23:46:17 UTC

Hiya TJM,

Looks like the latest version of BOINC supports NVIDIA GPUs as co-processors. Any chance our source code can be updated to accommodate those of us with NVIDIA GPUs? I suppose it's just as easy to update the code with the AMD ACML libraries too.....;-)

Mike Doerner
Profile mdoerner
Volunteer developer
Volunteer tester
Joined: 30 Jul 08
Posts: 202
Credit: 6,998,388
RAC: 0
Message 789 - Posted: 15 Feb 2009, 23:32:35 UTC - in response to Message 771.  

What....no sense of humor here???? ;-)

Mike Doerner

PS I take it that incorporating CUDA technology into the code is a rather big deal.....
thinking_goose

Joined: 12 Nov 07
Posts: 116
Credit: 1,105,645
RAC: 0
Message 790 - Posted: 16 Feb 2009, 0:32:26 UTC - in response to Message 789.  

I have also noticed that the new client supports these GPUs. It would be nice to take advantage of this facility, but at the moment I believe only one or two projects have actually attempted to incorporate it. I can only think it is very involved.
noderaser
Joined: 24 Dec 08
Posts: 88
Credit: 629,026
RAC: 0
Message 791 - Posted: 16 Feb 2009, 7:27:03 UTC

This question is being asked at virtually every project, and the answer for most is either "not very soon" or "never". I'm no programming master, but GPUs are not designed for general-purpose computing, and there are only a few projects where GPUs would be useful for the types of calculations being done. I'm going to say that codebreaking is probably not one of the project types that would benefit from adding GPU support.

I would expect that when things like OpenCL come out, making programming for CPUs and GPUs virtually the same, you will see GPU support for more applications and projects.

There are others who can describe the inner workings of a GPU, and why they're not ideal for all projects, better than I can, but that's the gist of it.

Profile TJM
Project administrator
Project developer
Project scientist
Joined: 25 Aug 07
Posts: 843
Credit: 69,628,840
RAC: 383,626
Message 792 - Posted: 16 Feb 2009, 13:12:50 UTC

Implementing CUDA basically means the whole app has to be rewritten to support GPU or CPU+GPU. This is definitely going to be too hard for me; my C/C++ programming skills are limited, and my knowledge of CUDA programming is close to none :-P

M4 Project homepage
M4 Project wiki
batan

Joined: 26 Nov 08
Posts: 1
Credit: 11,815
RAC: 0
Message 795 - Posted: 16 Feb 2009, 16:36:39 UTC - in response to Message 791.  
Last modified: 16 Feb 2009, 16:37:52 UTC

... I'm going to say that codebreaking is probably not one of the project types that would benefit from adding GPU support.

From the Wikipedia article about CUDA: "CUDA has also been used to accelerate non-graphical applications in computational biology, cryptography and other fields by an order of magnitude or more."
Carter11

Joined: 25 Nov 08
Posts: 12
Credit: 148,116
RAC: 1
Message 799 - Posted: 18 Feb 2009, 18:29:33 UTC

If I understand the server status page right, it will be at most one year until all the work for this project is done (after hceyz is done in July, completion time for awgly will decline too, right?). I wonder if it's worth rewriting the whole app for that.
bill brandt-gasuen

Joined: 19 Oct 08
Posts: 1
Credit: 1,097,807
RAC: 0
Message 800 - Posted: 24 Feb 2009, 5:08:43 UTC

I for one say don't mess with a good thing. For all TJM's claims of lacking the required proficiency, this project has run like clockwork ever since I started.
I just hope the solution doesn't end up like Walter Miller's A Canticle For Leibowitz!
Petter Neumann

Joined: 26 Jan 09
Posts: 1
Credit: 2,865,602
RAC: 0
Message 1275 - Posted: 1 Oct 2009, 15:20:49 UTC - in response to Message 800.  

Enigma CUDA support??

Could this be implemented??

http://www.werty98.homelinux.com/tech/enigma/
fitz

Joined: 15 Apr 09
Posts: 31
Credit: 147,954
RAC: 0
Message 1276 - Posted: 1 Oct 2009, 18:29:28 UTC

Looks like I may have to get a CUDA card :( ... but it would certainly be cool. Might be worth seeing if they could port it to OpenCL - I think there are a lot of peeps like me with ATI cards out there who are feeling left out!
Profile TJM
Project administrator
Project developer
Project scientist
Joined: 25 Aug 07
Posts: 843
Credit: 69,628,840
RAC: 383,626
Message 1279 - Posted: 3 Oct 2009, 0:44:40 UTC - in response to Message 1275.  
Last modified: 3 Oct 2009, 0:57:02 UTC

Enigma CUDA support??

Could this be implemented??

http://www.werty98.homelinux.com/tech/enigma/


This is a Turing bombe simulator - a completely different approach from what we use here, and useless without cribs. I think most of the possible cribs for the third message have been tried already; I remember that someone on the M4 mailing list was trying to solve the 3rd message that way.
This of course doesn't mean we couldn't try more cribs, but there are at least two problems:

- The bombe input data cannot be auto-generated. I think it wouldn't be too hard to write a script/program to auto-generate the bombe menu, or multiple menus, but it would still require user interaction to provide a crib for each workunit (or set of workunits - perhaps a better approach would be to try a crib at all possible locations).

- The bombe output is a (sometimes very long) list of machine settings. Again, checking it can only be partially automated - writing a script to produce the plaintext for each entry on the list won't be too hard, but then someone has to review all the results. Theoretically this could also use bigram/trigram scoring to detect/mark the best candidates.
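Trying a crib at all possible locations can at least be pre-filtered automatically: an Enigma machine never encrypts a letter to itself, so any alignment where the crib and the ciphertext share a letter at the same position is impossible. A minimal sketch of that filter (the example strings are hypothetical, not actual M4 data):

```python
def possible_crib_positions(ciphertext, crib):
    """Return offsets where the crib could align with the ciphertext.

    Enigma's reflector guarantees that no letter ever encrypts to
    itself, so any offset where crib and ciphertext agree on a letter
    can be ruled out before running the bombe.
    """
    return [
        i
        for i in range(len(ciphertext) - len(crib) + 1)
        if all(c != p for c, p in zip(ciphertext[i:], crib))
    ]

# Hypothetical example: "WETTER" (weather) can only align at offsets
# where it never collides with an identical ciphertext letter.
print(possible_crib_positions("RWETIVBXTTERWQA", "WETTER"))
```

This kind of filter typically eliminates a large fraction of offsets before any machine settings are tried, which is why crib-dragging was practical even by hand.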

Integrating the simulator into BOINC seems quite easy; I believe it would run without problems under the standard BOINC wrapper, though probably without a progress meter.

I'm open to any suggestions - if you want to run a 'distributed bombe', I'll ask the author of the simulator for permission to use his app and eventually we'll do it.


Looks like I may have to get a CUDA card :( ... but it would certainly be cool. Might be worth seeing if they could port it to OpenCL - I think there are a lot of peeps like me with ATI cards out there who are feeling left out!


Is OpenCL support for ATI cards finished already? I just started learning GPU programming, so I thought I'd try to learn both CUDA and OpenCL in parallel.
I have access to the latest NVIDIA tools/drivers/docs (I'm a registered developer), so I've already played a bit with OpenCL. For now it still seems quite buggy; even with the examples provided with the SDK I had glitches here and there, and the compiler itself didn't like my install paths at all, so I had to edit a few things to make it work.
Orakk
Joined: 11 Oct 09
Posts: 1
Credit: 786
RAC: 0
Message 1296 - Posted: 11 Oct 2009, 1:13:02 UTC

MilkyWay and Collatz are doing well with both CUDA and ATI right now. Working well too is BOINC development; 6.10.3 is mostly behaving itself.
SeriousCrunchers@Home
MJ

Joined: 17 Nov 07
Posts: 16
Credit: 95,844
RAC: 0
Message 1301 - Posted: 23 Oct 2009, 20:49:50 UTC

I think it would be really cool for people to be able to suggest cribs. It would help keep interest in the project. Too bad you kind of have to know German, though. I guess you could still just try common words from Enigma messages without knowing German.
Profile doublechaz

Joined: 5 Mar 09
Posts: 15
Credit: 1,123,069
RAC: 0
Message 1375 - Posted: 19 Nov 2009, 1:19:20 UTC

Given this project, I don't advocate CUDA. It's too bad, though. I tried distributed.net's RC5 tool on CUDA and got a speedup of about 100x with my GTX 285: 3 Mkeys/s to 308 Mkeys/s. Wow!

But we'll finish this project in less time than the port would take. At least I hope we will. :)
Profile TJM
Project administrator
Project developer
Project scientist
Joined: 25 Aug 07
Posts: 843
Credit: 69,628,840
RAC: 383,626
Message 1409 - Posted: 26 Nov 2009, 1:04:34 UTC - in response to Message 1375.  
Last modified: 26 Nov 2009, 1:12:16 UTC

Well, the third Naval M4 message is still unbroken, so I think a CUDA app might still be useful in the future.
For now, instead of porting the entire app, I'd rather go for a brute-force approach.
The hillclimbing algorithm isn't that complicated, but rewriting it to run fast on a GPU might exceed my skills. I'm not sure an algorithm with so many conditionals and loops / short procedures could even run at a decent speed on a GPU.
Brute force, on the other hand, is much simpler, so it will be much easier to implement on a GPU. At least I hope so. It has to be extremely fast to be useful, because the total number of possible machine settings for an M4 Naval Enigma is just insane:

619 953 965 349 522 804 000 000


Going through such a huge number of combinations would be impossible on CPUs, but a parallel algorithm on modern GPUs might be able to do it.
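For scale, the plugboard alone contributes the dominant factor in that count: with 10 cables there are about 1.5 × 10^14 ways to wire it. A short sketch of the standard combinatorics (ordinary textbook counting, not code from the project):

```python
from math import comb, factorial

def plugboard_settings(pairs=10, letters=26):
    # Choose which 2*pairs letters are steckered, then count the
    # distinct ways to match those letters into unordered pairs:
    # (2p)! / (p! * 2^p) pairings for each choice of letters.
    pairings = factorial(2 * pairs) // (factorial(pairs) * 2 ** pairs)
    return comb(letters, 2 * pairs) * pairings

print(plugboard_settings())  # 150738274937250 ten-pair plugboard settings
```

The remaining factors (wheel orders, reflectors, ring settings, rotor positions) multiply this further; how they combine into the exact figure quoted above depends on what is counted as a distinct setting.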
The brute force itself seems realistic, but there's another problem - no one will ever be able to check all the results manually, so there must be some kind of automated scoring algorithm, perhaps more than one, to check the results and filter out the junk. It also has to be fast, otherwise it will slow down the entire process. I'll probably see if it's possible to run something similar to bigram/trigram scoring in parallel on a GPU.
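On a CPU, trigram scoring of the kind mentioned here amounts to summing log-probabilities over every overlapping three-letter window. A minimal sketch - the log values below are illustrative placeholders, not a real table, which would be built from frequency counts over a large German corpus:

```python
# Illustrative log-probability scores for a few common German trigrams.
# A real table would cover all 26^3 trigrams from corpus statistics.
LOG_TRIGRAMS = {"EIN": -2.3, "SCH": -2.4, "DER": -2.5, "CHT": -2.8}
FLOOR = -7.0  # penalty for any trigram missing from the table

def trigram_score(text):
    # Higher (less negative) totals indicate more plausible plaintext.
    return sum(LOG_TRIGRAMS.get(text[i:i + 3], FLOOR)
               for i in range(len(text) - 2))

# A candidate containing common trigrams outscores random-looking text.
print(trigram_score("DEIN") > trigram_score("QXZV"))  # True
```

Because each window is scored independently, this maps naturally onto a GPU: each thread can score one candidate decrypt against a table held in constant or shared memory.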
Sword

Joined: 18 Nov 09
Posts: 11
Credit: 1,052,256
RAC: 0
Message 1411 - Posted: 26 Nov 2009, 16:28:50 UTC - in response to Message 1409.  

The first two Naval messages from the M4 project are broken. Both were signed by U-boat commanders, either at the very end of the text or at the very beginning, as far as I understand. The third, unbroken message may also be from another U-boat commander.

I have an idea that may not be original or useful, but it came to me while reading about so-called cribs. Here is a link to a list of all German U-boat commanders during WW2: http://www.uboat.net/men/commanders/l.htm . The names of both Schroeder and Looks can be found there.

By first "washing" this list and then testing the last names in the right position, maybe it would help get nearer to a solution.

Profile mdoerner
Volunteer developer
Volunteer tester
Joined: 30 Jul 08
Posts: 202
Credit: 6,998,388
RAC: 0
Message 1414 - Posted: 27 Nov 2009, 19:10:58 UTC - in response to Message 1409.  

Hey TJM,

I believe the NVIDIA nvcc compiler is based on the PathScale/Open64 compiler, similar to what AMD did with Open64. You may not need to optimize it for the GPU, since I would think all the optimization would be done by the compiler. (I think)

Mike D
Profile TJM
Project administrator
Project developer
Project scientist
Joined: 25 Aug 07
Posts: 843
Credit: 69,628,840
RAC: 383,626
Message 1419 - Posted: 30 Nov 2009, 12:30:36 UTC - in response to Message 1414.  

Hey TJM,

I believe the NVIDIA nvcc compiler is based on the PathScale/Open64 compiler, similar to what AMD did with Open64. You may not need to optimize it for the GPU, since I would think all the optimization would be done by the compiler. (I think)

Mike D



It doesn't work that way. Theoretically it's possible to port a simple app to the GPU with only minor changes to the code, but without completely redesigning the app the performance is very poor. For example, some time ago I compiled a single function (icscore) to check its performance on a GPU. The standard code was very slow - on a GF 9600GT it could barely be compared to a Pentium III at 700-800 MHz - while the same function rewritten to fully use the GPU's capabilities ran around 50 times faster.
I think some parts of the Enigma code would be extremely slow on a GPU due to the large number of conditional statements switching/skipping parts of the code.
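For reference, icscore presumably computes the index of coincidence, a standard statistic for rating candidate decrypts. A plain-Python sketch of the textbook formula (not the project's actual code):

```python
from collections import Counter

def ic_score(text):
    """Index of coincidence: the probability that two letters drawn at
    random from the text are equal. German or English plaintext scores
    roughly 0.072-0.078, while uniformly random letters score about
    1/26 = 0.0385, so a high IC flags candidates worth a closer look.
    """
    n = len(text)
    counts = Counter(text)
    return sum(c * (c - 1) for c in counts.values()) / (n * (n - 1))

print(ic_score("AAAA"), ic_score("ABCD"))  # 1.0 0.0
```

The scoring loop itself is branch-free, which is why a carefully restructured version (one candidate per thread, histogram in shared memory) can run well on a GPU even when the surrounding Enigma simulation does not.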
Profile TJM
Project administrator
Project developer
Project scientist
Joined: 25 Aug 07
Posts: 843
Credit: 69,628,840
RAC: 383,626
Message 1421 - Posted: 30 Nov 2009, 21:05:28 UTC
Last modified: 30 Nov 2009, 23:44:40 UTC

I tested the brute-force method on a CPU. It takes around 30 seconds to test a single stecker setting, doing a full round on M3's single wheel order. Considering the number of all stecker connections (all combinations of 10 pairs out of 26 letters), that's way too slow.
Now the question is how fast a GPU app could run compared to the CPU. Without a ~1000x speedup, I doubt we could see any result without traveling in time :-)
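A back-of-envelope calculation with the numbers above shows the scale (a sketch only: it assumes one CPU core, one wheel order, and the standard count of 10-pair plugboard settings):

```python
PLUGBOARD_SETTINGS = 150_738_274_937_250  # 10-pair stecker combinations
SECONDS_PER_STECKER = 30                  # measured CPU time quoted above
SECONDS_PER_YEAR = 3600 * 24 * 365

cpu_years = PLUGBOARD_SETTINGS * SECONDS_PER_STECKER / SECONDS_PER_YEAR
gpu_years = cpu_years / 1000              # hoped-for ~1000x GPU speedup

print(f"{cpu_years:.2e} CPU-years, {gpu_years:.2e} GPU-years per wheel order")
```

Even with the ~1000x speedup this is on the order of 10^5 device-years per wheel order, so a brute-force search would only become plausible spread across many GPUs, and with a much faster per-setting test than the CPU baseline.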
Profile mdoerner
Volunteer developer
Volunteer tester
Joined: 30 Jul 08
Posts: 202
Credit: 6,998,388
RAC: 0
Message 1422 - Posted: 30 Nov 2009, 23:37:52 UTC - in response to Message 1421.  

I hope you get an NVIDIA GPU soon. Of course, my GPU is available for testing, should the need arise....;-) The one thing I didn't realize about my particular video card is that my G92 GPU (an older 9600 GSO card, made by PNY) is OVERCLOCKED by default from the factory. Occasionally it chokes, which is why I don't do GPUGrid.net anymore. I'm sure more cooling would help, or using nvclock to slow it down, but that's after DAYS of non-stop crunching. FWIW.

Mike D





Copyright © 2017 TJM