#1 Re: QuChemPedIA@home

Posted: Sun Oct 06, 2019 1:24 am
by Dirk Broer
Do you have the invitation code then?

#2 Re: QuChemPedIA@home

Posted: Sun Oct 06, 2019 1:22 pm
by Dirk Broer
Thanks! They did not accept the code you supplied, but a new one popped up after I pressed 'join'.

#3 Re: QuChemPedIA@home

Posted: Wed Jan 01, 2020 11:17 am
by davidbam
This is now a FB project, and it looks awkward enough that it might put people off. A chance of some points, perhaps?

#4 Re: QuChemPedIA@home

Posted: Wed Jan 01, 2020 2:29 pm
by Bryan
There is a native Linux app that doesn't require VBox, and it runs quite well. The only caveat is that you need to turn HT OFF if you have more than 32 threads. The app issues a taskset command with the mask 0xFFFFFFFF, which puts all running WU onto the 1st 32 threads. Quite often the program launches child processes, and every time one of those starts it reissues the 32-thread affinity mask, so you can't even set up your own script to use more than 32 threads.

I asked damotbe to change the affinity mask to allow at least 72 threads but he said that they don't currently have a programmer so it may or may not happen at some point.
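
For anyone unsure what that mask means: each bit in a taskset-style affinity mask stands for one logical CPU, so 0xFFFFFFFF (32 one-bits) covers logical CPUs 0 to 31 and nothing above. A minimal Python sketch of the arithmetic follows; the helper names are made up for illustration and are not part of the QuChem app.

```python
def cpus_in_mask(mask: int) -> list[int]:
    """Return the logical CPU indices whose bit is set in an affinity mask."""
    return [cpu for cpu in range(mask.bit_length()) if (mask >> cpu) & 1]


def mask_for_first(n: int) -> int:
    """Build a mask with the lowest n bits set (logical CPUs 0 .. n-1)."""
    return (1 << n) - 1


# The mask mentioned in the post: 32 bits set, so CPUs 0..31 only.
cpus = cpus_in_mask(0xFFFFFFFF)
print(len(cpus), cpus[0], cpus[-1])  # prints: 32 0 31

# A mask wide enough for 72 threads needs 72 bits set.
print(mask_for_first(72).bit_length())  # prints: 72
```

So on a 64- or 72-thread machine, anything pinned with that mask can never be scheduled on the upper threads.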

#5 Re: QuChemPedIA@home

Posted: Wed Jan 01, 2020 2:32 pm
by davidbam
So I guess it doesn't help to run several instances of 32 threads?

Is it worth trying with HT on and a different project on threads 33-and-up ?

#6 Re: QuChemPedIA@home

Posted: Wed Jan 01, 2020 2:46 pm
by Bryan
No, regardless of how many instances are run ALL QuChem WU will be assigned to the 1st 32 threads with the taskset affinity mask.

You could certainly run another project on the top 32 threads, but then you are using HT and the QuChem WU will take about twice as long on threads as they do on full cores. The WU take 18 to 30 hours with HT off (IIRC).

Of course you could always get off your lazy butt and turn HT OFF .... just sayin' :roll:

#7 Re: QuChemPedIA@home

Posted: Wed Jan 01, 2020 2:51 pm
by davidbam
Naw - won't be doing that. It would take all day to put monitor / keyboard on all the headless workstations. We can't all afford server mobo with IPMI :D

If I left HT on but only ran 32 threads on the 2990wx machines, would that be ALMOST the same as turning HT off ?

#8 Re: QuChemPedIA@home

Posted: Wed Jan 01, 2020 3:13 pm
by Bryan
No, you would wind up with the top 2 dies sitting idle and all 32 WU stuck on the 16 cores of the bottom 2 dies.
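
Whether the first 32 logical CPUs land on 16 or on 32 physical cores depends on how the kernel numbers sibling threads on a given box, so it may be worth checking rather than assuming. A rough sketch, assuming a Linux machine that exposes thread_siblings_list under /sys/devices/system/cpu; the function names are hypothetical.

```python
def parse_cpu_list(text):
    """Parse a kernel CPU list such as '0,32' or '0-1' into a sorted list of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        lo, _, hi = part.partition("-")
        cpus.update(range(int(lo), int(hi or lo) + 1))
    return sorted(cpus)


def physical_cores_used(n_logical, sysfs="/sys/devices/system/cpu"):
    """Count distinct physical cores covered by logical CPUs 0 .. n_logical-1.

    Two logical CPUs share a core when they report the same sibling list.
    """
    cores = set()
    for cpu in range(n_logical):
        path = f"{sysfs}/cpu{cpu}/topology/thread_siblings_list"
        try:
            with open(path) as fh:
                cores.add(tuple(parse_cpu_list(fh.read())))
        except OSError:
            continue  # CPU not present or topology not exposed
    return len(cores)
```

Calling physical_cores_used(32) on the machine in question would show how many real cores a 32-thread mask actually reaches with HT on.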

BTW, unless they've changed it, the WU do NOT checkpoint, so once started you want to make sure they keep on keepin' on.

When they 1st started the project ALL WU were assigned to the 1st thread on each CPU so it had 2 threads available for processing. I had a script that ran every 2 minutes and would set the affinity mask for 72 threads. Needless to say, I was kicking some butt since I could use all threads. They changed the executable/wrapper so it would set the affinity mask to the 1st 32 threads/cores. The script is no longer useful because the program launches child processes quite frequently and every time a new one launches it issues the taskset command and slams ALL running executables to the 1st 32 threads.

HERE is a link to a PHP implementation but, as I said, all child processes reissue the affinity mask, so it isn't all that useful anymore.
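
Roughly what such a periodic script could look like in Python (this is not the original script or the PHP version linked above, just a sketch). It assumes Linux, and the process name passed in is a placeholder because the thread doesn't give the real executable name. As noted above, the app's child processes reset the mask, so treat this as illustration only.

```python
import os
import time


def find_pids(name, proc_root="/proc"):
    """Return PIDs whose command name (<proc_root>/<pid>/comm) equals name.

    Note: the kernel truncates comm to 15 characters.
    """
    pids = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "comm")) as fh:
                if fh.read().strip() == name:
                    pids.append(int(entry))
        except OSError:
            continue  # process exited, or comm is unreadable
    return sorted(pids)


def widen_affinity(name, n_threads):
    """Allow every process called `name` to run on logical CPUs 0 .. n_threads-1.

    os.sched_setaffinity is Linux-only and affects the thread with that PID.
    Returns the PIDs that were changed.
    """
    cpus = set(range(n_threads))
    changed = []
    for pid in find_pids(name):
        try:
            os.sched_setaffinity(pid, cpus)
            changed.append(pid)
        except OSError:
            continue  # no permission, or the process has gone away
    return changed


def watch(name, n_threads, interval_s=120):
    """Re-apply the wider mask every interval_s seconds until interrupted."""
    while True:
        widen_affinity(name, n_threads)
        time.sleep(interval_s)
```

Something like watch("hypothetical_app_name", 72) would mirror the every-2-minutes approach for a 72-thread box.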

#9 Re: QuChemPedIA@home

Posted: Wed Jan 01, 2020 3:20 pm
by davidbam
30 hours with no checkpointing !!!!! Sounds as bad as SRbase

I'll maybe try it on one 2990wx with HT off but, man, when I said 'difficult', I didn't realise how bdooly difficult.

#10 Re: QuChemPedIA@home

Posted: Wed Jan 01, 2020 5:47 pm
by Bryan
The server status is showing the avg to be 1.7 hours so I guess they are running a shorter batch of WU now.

#11 Re: QuChemPedIA@home

Posted: Sat Jan 04, 2020 11:31 am
by Hal Bregg
davidBAM wrote: Wed Jan 01, 2020 3:20 pm 30 hours with no checkpointing !!!!! Sounds as bad as SRbase

I'll maybe try it on one 2990wx with HT off but, man, when I said 'difficult', I didn't realise how bdooly difficult.
SRBase has checkpoints. The progress bar doesn't reflect actual percentage of work done if you restart the task (see stderr.txt file for actual work progress).

More about checkpoints in this thread
http://srbase.my-firewall.org/sr5/forum ... =1001#4460

#12 Re: QuChemPedIA@home

Posted: Sat Jan 04, 2020 11:34 am
by davidbam
Oh, TYVM. That would certainly help a lot

#13 Re: QuChemPedIA@home

Posted: Sat Jan 04, 2020 10:53 pm
by Megacruncher
I only joined this project yesterday. Mostly because I hadn't noticed it before.
It doesn't seem to be causing any problems and the credit isn't too bad. As well as getting some FB points for us I'm making it my next million credit target.
I notice an issue above about it not working well with hyperthreading. But my experience suggests that it is working okay. My Threadripper 1950X 16-Core Processor Linux machine is running 32 WU at a time without error and is more or less matching a 32 CPU instance on my Threadripper 2990WX 32-Core Processor, also Linux, machine.

#14 Re: QuChemPedIA@home

Posted: Sat Jan 04, 2020 11:44 pm
by Bryan
The problem isn't hyperthreading. The issue is it slams ALL WU onto the 1st 32 threads of a machine. If you only have 32 threads then it isn't an issue.

#15 Re: QuChemPedIA@home

Posted: Sun Jan 05, 2020 1:48 am
by Alez
It also seems to require quite a bit of memory. Had more than a few tasks sitting at "waiting for memory" whilst running it.

#16 Re: QuChemPedIA@home

Posted: Sun Jan 05, 2020 4:48 pm
by Megacruncher
I just tried it on Windows. Total wipeout. 169 tasks. 169 errors. I'll stick to Linux.

#17 Re: QuChemPedIA@home

Posted: Sun Jan 05, 2020 11:32 pm
by Bryan
Linux runs native, Win runs VBox.

#18 Re: QuChemPedIA@home

Posted: Sun Jan 05, 2020 11:44 pm
by Megacruncher
Bryan wrote: Sun Jan 05, 2020 11:32 pm Win runs VBox.
I'll not bother then!

#19 Re: QuChemPedIA@home

Posted: Mon Jan 06, 2020 1:11 am
by Alez
Same for me, Vbox has been a pointless exercise.

#20 Re: QuChemPedIA@home

Posted: Wed Jan 15, 2020 10:06 am
by davidbam
Not a good start from me. Two WU errored, 50 were invalid after significant run times. Only 3 validated at about 8pts per thread per hour on an OC 3900X. To add insult to injury, I forgot to join the team so even they didn't count.

Is this normal? (Linux)

#21 Re: QuChemPedIA@home

Posted: Wed Jan 15, 2020 1:28 pm
by UBT - Woodles
The only errors I've had have been during downloading or cancelled by the server, none resulted in any time being lost.

However, I do have about a third ending up as invalid, most after 30 ish seconds but some running to the normal execution time.

Everyone has invalids, it's a project "feature" :)

I'm getting about 50 credits an hour with a single thread, are you running them multithreaded?

#22 Re: QuChemPedIA@home

Posted: Wed Jan 15, 2020 1:37 pm
by davidbam
Thanks. No, not running MT. Does the last setting really mean # threads per WU ?

#23 Re: QuChemPedIA@home

Posted: Wed Jan 15, 2020 1:56 pm
by UBT - Woodles
You would think so, but no. I'm not sure what it's used for; I have it set to "No Limit" but tasks run on a single thread.

They have no work at the moment so I can't try different options.

The project developer has said that they've experimented and there's no advantage to using more than one core per workunit.

#24 Re: QuChemPedIA@home

Posted: Wed Jan 15, 2020 2:11 pm
by davidbam
Okay thanks. Maybe I'll try again later on a lesser machine but there is no way I am putting my thoroughbred 3900X onto something which has 50 out of 55 invalid :D

#25 Re: QuChemPedIA@home

Posted: Wed Jan 15, 2020 2:41 pm
by UBT - Woodles
Makes sense :lol:

#26 Re: QuChemPedIA@home

Posted: Thu Jan 16, 2020 8:47 am
by davidbam
Tried an Intel machine this time, older OS (Ubuntu 18.10): 3 valid, 74 invalid.

Not a good ratio, especially when some of the invalid WU have run for 20-30 mins before failing.

#27 Re: QuChemPedIA@home

Posted: Thu Jan 16, 2020 9:07 am
by UBT - Woodles
I only have the one box on QuChem so can't help with different CPUs or OS.

If it helps, I've just downloaded four tasks and all ended up as invalid, three after 30 seconds, one after twenty minutes. Normally I'd get at least one valid out of four but it's a small sample size.

#28 Re: QuChemPedIA@home

Posted: Thu Jan 16, 2020 9:27 am
by davidbam
The CPU/OS may be a red herring but, out of interest, what are you running on, please? I'll see if I have anything close.

#29 Re: QuChemPedIA@home

Posted: Thu Jan 16, 2020 1:11 pm
by UBT - Woodles
It's a 1950X, default settings with 32 Gig of RAM running Ubuntu 19.04 if I remember correctly. QuChem is down again so I can't check.

Edit: Just had a thought and checked on WUProp, details are correct (also Boinc version 7.14.2 if it matters?)

#30 Re: QuChemPedIA@home

Posted: Thu Mar 04, 2021 7:55 pm
by Dirk Broer
Any specs on the machine in question? 64-bit is pretty standard on modern machines now.
I'd even go so far as saying that QuChemPedIA only has 64-bit apps for vbox (used for Windows and MacOS), and needs at least a 64-bit CPU for the rest (=Linux).