Xeon - not utilising more than 32 threads
- scole of TSBT
- Boinc Major General
- Posts: 5982
- Joined: Mon Feb 03, 2014 2:38 pm
- Location: Goldsboro, (Eastern) North Carolina, USA
#71 Re: Xeon - not utilising more than 32 threads
I would guess it has something to do with your specific application and a bottleneck with the other system resources such as disk, network or memory. What do your memory, disk and network resource stats look like with no instances of your program running and what do they look like with 1, then 2, then 3 and so on running?
#72 Re: Xeon - not utilising more than 32 threads
The processes are compute-bound. Very few system resources are used while they are active. Disk I/O, network I/O, and memory use are all low (the system has 32 GB of RAM and each process uses about 100 MB, so about 9-10 GB of RAM in total across the 88 processes). The processes run great on my other two systems, of 28 cores each, with hyperthreading either on or off. It must have something to do with the way scheduling interacts with the NUMA nodes for >64 processes, don't you think?
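For context on the >64 question: Windows schedules logical processors in processor groups of at most 64, so an 88-thread machine cannot sit in a single group. Below is a rough sketch of that arithmetic, assuming an even split across groups. The real assignment depends on the NUMA topology Windows detects, and `split_into_groups` is a hypothetical helper for illustration, not a Windows API.

```python
import math

# Windows caps each processor group at 64 logical processors.
MAX_GROUP_SIZE = 64


def split_into_groups(logical_processors, max_group_size=MAX_GROUP_SIZE):
    """Return group sizes, assuming the smallest number of evenly sized groups.

    This is an illustrative assumption; the actual grouping is decided by
    Windows from the NUMA layout it detects.
    """
    groups = math.ceil(logical_processors / max_group_size)
    base, extra = divmod(logical_processors, groups)
    return [base + 1 if i < extra else base for i in range(groups)]


print(split_into_groups(88))  # 44 cores with hyperthreading on
print(split_into_groups(56))  # 28 cores with hyperthreading on
print(split_into_groups(44))  # 44 cores with hyperthreading off
```

Under this assumption the 88-thread box is the only configuration mentioned here that ends up with more than one group, which would fit the symptoms described.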
#73 Re: Xeon - not utilising more than 32 threads
I have .bat files set up to launch instances of BOINC. I'm trying to blow off the cobwebs, but I remember running into a problem when I put the START command in the same .bat file as the program setup. What works on my systems without fail is to create one .bat file that sets up the program/thread, and then call it from a second .bat file that contains the START command and points to the first one. What I'm suggesting is that you create the file you want to execute (filename.bat) and then call it from another command file that contains START with the /NODE and /AFFINITY options followed by filename.bat.
Install ProcessLasso on your Windows system. From there you can see where each of the threads is allowed to run. It will show the affinity. If you've set them up as half on each CPU, it should show the affinity on half of them as 0-23 and on the other half as 24-43. If it doesn't, then the NUMA node is not being set up correctly.
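The hex mask that START /AFFINITY expects is easy to get wrong by hand, so here is a minimal sketch that builds it for a contiguous range of logical processors and prints the launcher line. The helper names and the 0-21 range are made up for illustration; filename.bat is the placeholder from the post above. As far as I know, when /NODE and /AFFINITY are combined, the mask is interpreted relative to the first processor of that node, so treat the node numbering as an assumption to verify on your own box.

```python
def affinity_mask(first_cpu, last_cpu):
    """Bit mask with one bit set per logical processor in [first_cpu, last_cpu]."""
    if first_cpu < 0 or last_cpu < first_cpu:
        raise ValueError("expected 0 <= first_cpu <= last_cpu")
    count = last_cpu - first_cpu + 1
    return ((1 << count) - 1) << first_cpu


def start_command(node, first_cpu, last_cpu, script):
    """Build a cmd.exe START line for the launcher .bat file (hypothetical helper)."""
    mask = affinity_mask(first_cpu, last_cpu)
    # The empty "" is the window title, so START does not mistake a quoted
    # script path for a title.
    return f'START "" /NODE {node} /AFFINITY 0x{mask:X} {script}'


# Example: the first 22 logical processors of NUMA node 0.
print(start_command(0, 0, 21, "filename.bat"))
# START "" /NODE 0 /AFFINITY 0x3FFFFF filename.bat
```

Paste the printed line into the second (launcher) .bat file and check the result in ProcessLasso as described above.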
- scole of TSBT
- Boinc Major General
- Posts: 5982
- Joined: Mon Feb 03, 2014 2:38 pm
- Location: Goldsboro, (Eastern) North Carolina, USA
#74 Re: Xeon - not utilising more than 32 threads
Exactly what model CPUs are these? Are they retail or ES (engineering samples)?
Also, running some WCG for the next 24 hours might help us diagnose the issue.
#75 Re: Xeon - not utilising more than 32 threads
Thanks for all these comments. The affinity is correct as you described; I had already checked that. As I mentioned before, the core dropout on one NUMA node corresponds to when one batch file is finishing up and the next one is taking over from it. The explanation has to have something to do with that. However, I've now checked whether hyperthreading makes any difference on my 28-core systems (identical hardware apart from the lower core count), which are working perfectly. For these single-threaded command-line processes that run in parallel in the OS, there is no measurable difference between hyperthreading on and off. I can run twice the number of processes with hyperthreading on and have a batch list that is half as long, or run fewer processes with hyperthreading off and a longer batch list. The total time to completion is identical to within statistical error. Therefore the simplest solution for me now is to turn hyperthreading off on the 44-core system, which brings the thread count below 64, and all is well (I've already tested it).
Perhaps I will come back to this later to try to get to the root of the problem!
[EDIT] The CPU model is E5-2696, running at 2.8 GHz when all cores are at 100%. They are retail chips.
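The hyperthreading comparison in this post can be put into a toy model. This is only a sketch: it assumes each process takes about twice as long when two of them share a physical core, which is what "no measurable difference" implies for compute-bound work, and the job count and per-job time are made-up numbers.

```python
import math


def makespan(jobs, slots, minutes_per_job):
    """Total time when jobs run in waves of `slots` parallel processes."""
    waves = math.ceil(jobs / slots)
    return waves * minutes_per_job


JOBS = 880                   # hypothetical length of the whole batch list
CORES = 44
MINUTES_ALONE = 10.0         # hypothetical run time with a core to itself
MINUTES_SHARED = 2 * MINUTES_ALONE  # assumption: two threads split one core evenly

ht_off = makespan(JOBS, CORES, MINUTES_ALONE)       # 20 waves of 44 processes
ht_on = makespan(JOBS, 2 * CORES, MINUTES_SHARED)   # 10 waves of 88 processes

print(f"HT off: {ht_off} min, HT on: {ht_on} min")
```

With those assumptions both settings finish at the same time, so giving up hyperthreading to stay under 64 threads costs nothing for this kind of workload.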
- Dirk Broer
- Corsair
- Posts: 1964
- Joined: Thu Feb 20, 2014 11:24 pm
- Location: Leiden, South Holland, Netherlands
- Contact:
#76 Re: Xeon - not utilising more than 32 threads
With the new Threadripper coming out with 32 cores/64 threads and future EPYCs rumoured to get yet more cores (64c/128t), this thread about threads is getting more and more interesting... A dual EPYC2 mobo might run into yet another limit with its 256 threads, I fear.
- Dirk Broer
- Corsair
- Posts: 1964
- Joined: Thu Feb 20, 2014 11:24 pm
- Location: Leiden, South Holland, Netherlands
- Contact:
#77 Re: Xeon - not utilising more than 32 threads
There is computing outside of x86, of course. IBM's Power9 comes with either 12 cores with 8-way SMT or 24 cores with 4-way SMT, and their systems can house sixteen (16!) sockets.
That is 16 x 12 x 8, or 16 x 24 x 4, both ending up with 1536 threads per system, max. These systems are not meant for Joe Sixpack though (but who is stopping sysadmin Sixpack from 'burning in' the new company server with a 7-day long BOINC run?), and they won't be running Windows either.
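For anyone who wants to check these thread counts, a quick sketch (the function name is just for illustration; the figures are the ones quoted in this thread):

```python
def hardware_threads(sockets, cores_per_socket, threads_per_core):
    """Total hardware threads in a system."""
    return sockets * cores_per_socket * threads_per_core


print(hardware_threads(16, 12, 8))  # Power9, 12 cores with 8-way SMT -> 1536
print(hardware_threads(16, 24, 4))  # Power9, 24 cores with 4-way SMT -> 1536
print(hardware_threads(2, 64, 2))   # rumoured dual 64c/128t EPYC -> 256
```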