Optimized app Makefile and x86 Open64 Compiler Suite


Profile mdoerner
Volunteer developer
Volunteer tester
Joined: 30 Jul 08
Posts: 202
Credit: 6,998,388
RAC: 0
Message 853 - Posted: 29 Apr 2009, 2:00:35 UTC

Howdy TJM,

AMD has just released the x86 Open64 Compiler Suite. I'd like to try recompiling a copy of the benchmark with it (I'd love to get an optimized app closer to Intel speed if possible). However, the Makefile doesn't call gcc directly; it calls an executable named "compile". What is "compile"? How does it interact with gcc? And most importantly, is the application code in C or C++? AMD provides "opencc" for C code and "openCC" for C++ code. (I can make coffee, design conveyor systems, and run a Makefile, but that's about it....;-) )

Here's the link in case anyone wants to try it for themselves.....

http://developer.amd.com/CPU/OPEN64/Pages/default.aspx

Mike Doerner
Profile mdoerner
Message 854 - Posted: 29 Apr 2009, 2:07:54 UTC - in response to Message 853.  

....and it looks like we're working in C and not C++ as the files end in .c and not .C......time for a nap....;-)

Mike Doerner
Profile mdoerner
Message 856 - Posted: 29 Apr 2009, 11:11:16 UTC

Just a guess here, but do "conf-cc" and "conf-ld" need to be edited?

Mike Doerner
Profile mdoerner
Message 857 - Posted: 29 Apr 2009, 11:50:32 UTC - in response to Message 856.  
Last modified: 29 Apr 2009, 11:53:19 UTC

Also, conf-cc contains the line.....

gcc -Wall -W -O3

What does "-W" do? I see that "-w" suppresses warning messages, but I don't see any gcc or opencc documentation for the "-W" flag.

So if I want to use an alternate compiler I would just change the line to....

opencc -Wall -W -Ofast

...assuming gcc and opencc's other options/flags are the same? (opencc has -O3 and a new -Ofast flag)

....and then conf-ld is...

gcc -fomit-frame-pointer -s

....but it looks like there are no flags like these available under opencc (unless "-s"="-S"). Also, -fomit-frame-pointer is not listed as a flag in the opencc documentation.

opencc -S

??????? - My lack of C compiling experience is really showing here...:-(
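A rough sketch of the compiler swap discussed above, written in Python purely for illustration. It assumes the Makefile builds its "compile" and "load" helper scripts from the first line of conf-cc and conf-ld, as djb-style build setups do; nothing in this thread confirms that. The function name swap_compiler is hypothetical. Two gcc facts may help when reading the flags: in gcc, -W is the older spelling of -Wextra, and -s (strip symbols when linking) is not the same as -S (stop after producing assembly), so the two are not interchangeable. Whether opencc honours every gcc flag is not established here.

```python
def swap_compiler(command, new_compiler, replace_flags=None):
    """Return `command` with its first word replaced by `new_compiler`.

    `replace_flags` optionally maps an existing flag to its replacement,
    e.g. {"-O3": "-Ofast"}. Flags not listed are passed through unchanged.
    """
    parts = command.split()
    if not parts:
        raise ValueError("empty compiler command")
    mapping = replace_flags or {}
    flags = [mapping.get(flag, flag) for flag in parts[1:]]
    return " ".join([new_compiler] + flags)


# First lines of conf-cc and conf-ld as quoted in the post.
conf_cc = "gcc -Wall -W -O3"
conf_ld = "gcc -fomit-frame-pointer -s"

# Candidate replacement lines (assumption: opencc accepts the same flags).
new_conf_cc = swap_compiler(conf_cc, "opencc", {"-O3": "-Ofast"})
new_conf_ld = swap_compiler(conf_ld, "opencc")

print(new_conf_cc)
print(new_conf_ld)
```

Keeping a backup copy of the original conf-cc and conf-ld before editing makes it easy to go back to the gcc build for comparison.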

Mike Doerner
Profile mdoerner
Message 858 - Posted: 29 Apr 2009, 12:28:58 UTC
Last modified: 29 Apr 2009, 12:40:15 UTC

Really, I don't normally talk to myself this much, but here are some benchmark results.....

P3 optimized app run on my Phenom 3.0Ghz....(the one TJM included in the tgz file)

real 3m18.083s
user 3m14.084s
sys 0m0.072s

Phenom optimized app I compiled on gcc 4.3.2....

real 3m8.567s
user 3m7.072s
sys 0m0.024s


And the new opencc 4.2.2 optimized app (same flags as in conf-cc and conf-ld, except that -O3 was changed to -Ofast)

real 2m33.937s
user 2m32.274s
sys 0m0.000s


I got my 22.2% performance increase, so now I should be clock-for-clock comparable to the Intel Core 2 architecture. Dunno about i7 yet.....

I will start using my new app today and I should see another productivity increase here soon. WOO-HOO!
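For anyone who wants to check the percentages against the "real" times listed above, here is a small Python sketch. The helper names (parse_time, time_saved_pct, speedup_pct) are made up for this example, and the labels in the dictionary are just shorthand for the three builds. Note that the figure you get depends on which run is the baseline and on whether you measure time saved or throughput gained, so treat the output as a cross-check rather than the definitive number.

```python
import re


def parse_time(text):
    """Convert a bash `time` value such as '3m18.083s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", text.strip())
    if match is None:
        raise ValueError(f"unrecognised time value: {text!r}")
    return int(match.group(1)) * 60 + float(match.group(2))


def time_saved_pct(old, new):
    """Percentage of the old run time that the new build saves."""
    return (old - new) / old * 100.0


def speedup_pct(old, new):
    """Percentage increase in throughput (work per second)."""
    return (old / new - 1.0) * 100.0


# "real" times copied from the benchmark runs in the post.
runs = {
    "P3 app": "3m18.083s",
    "gcc 4.3.2": "3m8.567s",
    "opencc 4.2.2": "2m33.937s",
}

seconds = {name: parse_time(value) for name, value in runs.items()}
new = seconds["opencc 4.2.2"]
for name in ("P3 app", "gcc 4.3.2"):
    old = seconds[name]
    print(f"{name} -> opencc: {time_saved_pct(old, new):.1f}% less time, "
          f"{speedup_pct(old, new):.1f}% more throughput")
```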

Mike Doerner
Profile mdoerner
Message 859 - Posted: 29 Apr 2009, 14:36:15 UTC
Last modified: 29 Apr 2009, 14:38:48 UTC

More results....
These times are at 2.6Ghz....


Before - awgly100_0_1998795_r0 3,747.89 seconds of cpu time (default app)
After - awgly100_0_2001488_r0 2,765.39 seconds of cpu time (optimized phenom app)



And the Intel....


Well this is annoying. My other computer is a laptop w/ C2D T7500 2.2Ghz.

With the C2D optimized app in Windows.....

awgly100_0_1760349_r0 2,542.52 Secs CPU time. (220 seconds less than my 2.6 Ghz Phenom)



My Phenom is currently clocked at 2.93Ghz, and I'm getting awgly100_0_2153304_r0 2,487.11 secs CPU time w/ my new optimized app. 2.6 / 2.93 = 0.887, so 2765.39 should be 2453.93 secs if I had overclocked earlier with the old optimized app. Right in the ballpark, but hard to say if I've actually saved a minute of computing time or not, as I see completion times change between the different awgly100's. I'll have to just watch it I guess....:-)
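The clock-scaling estimate above can be written out as a short Python sketch. It assumes the work unit is purely CPU-bound, so run time scales inversely with clock speed; memory speed and variation between work units are ignored. The function name scale_time is hypothetical.

```python
def scale_time(seconds, old_ghz, new_ghz):
    """Estimate run time at `new_ghz` from a run measured at `old_ghz`.

    Assumes run time is inversely proportional to clock speed.
    """
    return seconds * old_ghz / new_ghz


# Values from the post: 2,765.39 s measured at 2.6 GHz, now running at 2.93 GHz.
predicted = scale_time(2765.39, 2.6, 2.93)
measured = 2487.11

print(f"predicted at 2.93 GHz: {predicted:.2f} s")
print(f"measured:              {measured:.2f} s")
print(f"difference:            {measured - predicted:.2f} s")
```

A gap this small is well inside the run-to-run spread between different awgly100 work units, so a single pair of results can't settle it either way.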
Mike Doerner


From the old app I compiled (gcc 4.3.2 optimized).....

awgly100_0_5472713_r0 2,408.81 secs

The new app as of this morning (opencc 4.2.2)......

awgly100_0_5491438_r0 1,964.76 secs

So to compare by total clock cycles....(Hz = cycles/sec)

Intel Core 2 (w/ TJM's C2D opt 32-bit app) -> 2.2 Ghz * 2,542.52 secs = 5593.544 Gigacycles to complete a task in total.

Phenom (gcc optimized) -> 2.997Ghz * 2408.81 secs = 7219.20 Gigacycles to complete a task (old optimized app) -> this is where the 20%-25% "Integer Superiority" of Intel's C2D came into play....

Phenom (opencc optimized) -> 2.997Ghz * 1964.76 secs = 5888.38 Gigacycles total. This brings us to within about 5% of Intel's C2D performance. (Looks like AMD got tired of being beaten by Intel ;-) )

hceyz72_0_5487171_r0 took 1,426.06 secs under my old gcc app, and the new opencc app reports hceyz72_0_5488007_r0 as 1,175.20 secs of elapsed time.....a 17.6% reduction in run time compared to my gcc app.
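The cycle comparison above is just clock speed times CPU seconds, which a few lines of Python can reproduce. This is only a sketch: the function name gigacycles and the dictionary labels are invented for the example, the numbers are copied from the post, and comparing different work units across machines is only approximate.

```python
def gigacycles(clock_ghz, cpu_seconds):
    """Total clock cycles for one task, in billions (GHz * seconds)."""
    return clock_ghz * cpu_seconds


# (clock in GHz, CPU seconds) for each build, as reported in the post.
results = {
    "Core 2 Duo T7500 (TJM's C2D app)": (2.2, 2542.52),
    "Phenom (gcc 4.3.2)": (2.997, 2408.81),
    "Phenom (opencc 4.2.2)": (2.997, 1964.76),
}

baseline = gigacycles(*results["Core 2 Duo T7500 (TJM's C2D app)"])

for name, (ghz, secs) in results.items():
    total = gigacycles(ghz, secs)
    extra = (total / baseline - 1.0) * 100.0
    print(f"{name}: {total:.2f} gigacycles ({extra:+.1f}% vs C2D)")
```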

Mike Doerner

PS Hey TJM, it may be worthwhile to recompile the AMD apps with opencc from AMD. I didn't see any restrictions on distribution, like those on the Intel compiler, when I downloaded it. I am seeing improved performance across the board.

You may want to verify I'm not pushing up bad WU's though, just to be on the safe side.
Profile mdoerner
Message 860 - Posted: 30 Apr 2009, 9:02:10 UTC
Last modified: 30 Apr 2009, 9:02:41 UTC

Also, apparently Intel has contributed to Open64 in the form of Itanium optimizations. Does anyone with an Intel Itanium or Intel Core 2 Duo want to try compiling an optimized app with the software from the AMD web site? I'm curious whether the Open64 builds would run faster than the gcc optimized apps TJM has provided previously for the Intel processors.

Mike Doerner





Copyright © 2024 TJM