CPDN, GPUs, code effectiveness and efficiency

#1 CPDN, GPUs, code effectiveness and efficiency

Post by Nightlord »

Hi guys, seeing the discussion in the 2010 goals thread, I wanted to comment, but maybe a new thread here is a better place than clogging up the goals thread. So here goes....

One of the main reasons given by the project for not even contemplating a shift in the code is that it is one million lines long.

I've never subscribed to the idea that this is the blocking factor to developing newer modules that take advantage of home computing advances. Here's why:

One million lines of code - you what!? Do you know how long it takes to write that amount of code accurately and debug it completely? Even if the application developers had started 30 years ago (which may be the reason for using Fortran), it shows a fundamental lack of foresight to stay with one software model and not to make the beast modular, sustainable and dynamic.

How large is the BOINC science application? One million lines of code should translate into a simply humongous client-side executable. Of course, it isn't a direct port of the main application - that runs on the Met Office servers. So, if it isn't a direct port, then perhaps there are elements that could be optimised? Only then do we hear the "can't do that because it might introduce model instability" argument. The trouble with that mode of working is that no innovation takes place and a stale application results.

Anyway, having gone down that route, the code seems to be a sacred cow that cannot be touched for fear of breaking something - hence the complete intransigence of the scientists and project staff towards accommodating any shift in it.

Scientific advance is more often than not made by proposing a hypothesis, testing it, proving it wrong, improving the hypothesis and so on, until new discoveries are made and understanding is advanced. Staying put on a baseline of code that is so large and unwieldy causes the project to underachieve, due to its inability to take advantage of technical advances (and here I'm not just talking about GPUs - there are plenty of other parallel computing advances out there).

I suspect the real reason is to do with scientific secrecy - which boils down to money in the end. The application is closed and cannot be shared except under very specific terms. This keeps it very close to the Met Office, so they can benefit from any discoveries made; though I suspect that the opportunities for discoveries from software which does not advance are somewhat limited.

CPDN is a worthy project indeed, but I have long felt it has been playing with its loyal subscribers and has been unable to take the difficult decisions needed to move forwards.

Rant mode off....... I feel better for that :wink:

#2 Re: CPDN, GPUs, code effectiveness and efficiency

Post by PinkPenguin »

...Don't knock Fortran. It may be long in the tooth, but it is still used to benchmark the fastest computers and is widely used in scientific computing for a variety of good reasons. It is modular and sustainable... The CPDN software is based on the Unified Model developed by the Met Office, and the code for an older 1998 version is actually available on the NCAS site. From what I understand, the agreement allows the academic system to make use of it on a non-profit basis.

I think the real problem is whether or not the model produces accurate results/predictions, and I am under the impression that this has not been established, as there is still a lively debate over the accuracy of the models. Until there is some consensus on the accuracy of the current models, it may not be a good idea to introduce radical changes - whatever other motives scientists may have (and they are human), I can't help but have some sympathy with scientific conservatism...

...after all my boss thinks I'm Scrat and says that the next time I plant another nut (a 2-3 million line project) she'll nail my ....s to the pool table and play billiards with them... :shock:

#3

Post by Nightlord »

PinkPenguin wrote: ...after all my boss thinks I'm Scrat and says that the next time I plant another nut (a 2-3 million line project) she'll nail my ....s to the pool table and play billiards with them... :shock:
That's a picture I'm going to have trouble getting out of my head! :wink:

I'm not knocking Fortran at all. I guess I'm frustrated by the project's public communication on the issue of development and optimisation. First they say the main reason is one million lines of code, then they say that 32-bit GPUs do not have sufficient precision (but aren't the majority of CPUs 32-bit???), then they say it's memory addressing (which could be overcome by porting modules to the GPU and leaving elements on the CPU), then they note that the disk array (a shatteringly huge 21 terabytes) is nearly full and can't manage the extraction of results for the project scientists any faster than we are filling it up. Each time a reason is questioned, a new reason is presented.

A long time ago, in another life so to speak, a wise man told me always to ensure at least an element of truth in public communication but, much more importantly, to be 100% consistent in whatever is said. To suggest the project is lying might be libellous, so just to be clear, I'm not accusing them of that. However, the only thing they have been consistent about over the last couple of years of discussion on the subject is "nope - we're not going there".

I would much prefer a straight answer such as "We don't really understand whether the models are accurate yet, so we want to play safe and not introduce further variables until we know for certain", followed by a plan to demonstrate the effectiveness or otherwise of the models. Instead, what we get is "It has taken the developers two years to modify the latest models and they are still not ready for release yet". It doesn't give a very good impression, and it leads to more probing questions, which in turn are pushed to one side.

I've been involved on both sides of the equation over the years and yes, at times I had to spin a line - but only when there were overriding factors about the project(s) that benefited from a degree of obfuscation.

Sorry for the grumpiness, I just get annoyed at projects that treat volunteer users in this way.

#4

Post by PinkPenguin »

I agree they did give a variety of reasons (some of them not very valid), and their choice of communication strategy was mainly technical (which is usually a big mistake at this level - I think the example you gave is very good).

I believe it's the operating systems that are mainly 32-bit, while the CPUs are 64-bit and have been for a while. I don't think it's a real problem; double-precision FP has been available for yonks and does not depend on the addressing width of the CPU. But then software usually lags ten or twenty years behind hardware.... :?
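
The split is easy to check for yourself. Here's a minimal C sketch (purely illustrative, nothing to do with the CPDN code): build it once as a 32-bit binary (e.g. with gcc -m32) and once as a 64-bit one - the pointer size changes, the double doesn't:

    #include <stdio.h>
    #include <float.h>

    /* Address width and floating-point width are independent things:
     * a 32-bit build still has 64-bit, 15-digit doubles.
     * Illustrative only - this is not CPDN or Unified Model code. */
    int main(void)
    {
        printf("pointer size : %zu bytes\n", sizeof(void *));  /* 4 with -m32, 8 on a 64-bit build */
        printf("double size  : %zu bytes\n", sizeof(double));  /* 8 on both builds */
        printf("double digits: %d decimal digits\n", DBL_DIG); /* 15 on both builds */
        return 0;
    }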

The short of it is that they could resolve the problem if they wanted to, and much of what they say sounds more like an excuse than a reason (it is really only a question of time, effort and money... 8) ). I think the grumpiness is justified...

#5

Post by Ben »

I think it was due to GPUs not having the depth of precision that a CPU can deliver. If you look at the IEEE support in each, the CPU can do much more precise calculations, which they say they need...
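
For anyone curious about the actual IEEE 754 figures behind that: single precision carries a 24-bit significand (roughly 6-7 decimal digits) against double precision's 53 bits (roughly 15-16 digits). A toy C sketch - purely illustrative, nothing to do with the CPDN application - prints the figures and shows what the gap means in practice:

    #include <stdio.h>
    #include <float.h>

    /* Compare IEEE 754 single and double precision: the raw figures,
     * then a toy counting loop showing the practical consequence.
     * Illustrative only - this is not CPDN or Unified Model code.   */
    int main(void)
    {
        float  f = 0.0f;   /* 32-bit accumulator */
        double d = 0.0;    /* 64-bit accumulator */
        long i;

        printf("float : %d significand bits, %d decimal digits, epsilon %.1e\n",
               FLT_MANT_DIG, FLT_DIG, (double)FLT_EPSILON);
        printf("double: %d significand bits, %d decimal digits, epsilon %.1e\n",
               DBL_MANT_DIG, DBL_DIG, DBL_EPSILON);

        /* Count to twenty million in both precisions. */
        for (i = 0; i < 20000000L; i++) {
            f += 1.0f;     /* once f reaches 2^24 the increment is rounded away */
            d += 1.0;
        }
        printf("single-precision count: %.1f\n", f);  /* 16777216.0 - stuck at 2^24 */
        printf("double-precision count: %.1f\n", d);  /* 20000000.0 - exact */
        return 0;
    }

The single-precision counter silently sticks at 16,777,216 because the next integer won't fit in a 24-bit significand; a model accumulating millions of small tendencies hits the same kind of quiet rounding, just less visibly, which is presumably the instability the project worries about with 32-bit GPU arithmetic.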

I still agree with you though, Nightlord. It isn't too expensive these days to get a 1/2TB hard drive! And if they implemented a GPU version, that would be a godsend - I would swap to that project any day! A worthwhile project in my opinion :)