Xeon - not utilising more than 32 threads

Forum to discuss and compare Hardware profiles and Benchmarking
scole of TSBT
Boinc Major General
Posts: 5982
Joined: Mon Feb 03, 2014 2:38 pm
Location: Goldsboro, (Eastern) North Carolina, USA

#71 Re: Xeon - not utilising more than 32 threads

Post by scole of TSBT »

I would guess it has something to do with your specific application and a bottleneck with the other system resources such as disk, network or memory. What do your memory, disk and network resource stats look like with no instances of your program running and what do they look like with 1, then 2, then 3 and so on running?
noetus
Boinc Corporal
Posts: 50
Joined: Tue May 30, 2017 3:15 am

#72 Re: Xeon - not utilising more than 32 threads

Post by noetus »

The processes are compute-bound. Very few system resources are used while they are active. Disk I/O, network I/O and memory use are all low (the system has 32 GB of RAM and each process uses about 100 MB, so about 9-10 GB of RAM in total across the 88 processes). The processes run great on my other two systems, of 28 cores each, with hyperthreading either on or off. It must have something to do with the way scheduling interacts with the NUMA nodes for >64 processes, don't you think?
Bryan
Boinc Brigadier
Posts: 2621
Joined: Thu May 21, 2015 6:18 pm

#73 Re: Xeon - not utilising more than 32 threads

Post by Bryan »

I have .bat files set up to launch instances of BOINC. I'm still blowing off the cobwebs, but I remember running into a problem when I put the START command in the same .bat file as the program setup. What works on my systems without fail is to create one .bat file that sets up the program/thread, and then call it from a second .bat file that contains the START command and points to the first one. What I'm suggesting is that you create the file you want to execute (filename.bat) and then call it from another command file that runs START with the /NODE and /AFFINITY switches on filename.bat.

Install ProcessLasso on your Windows system. From there you can see where each of the threads is allowed to run; it will show the affinity. If you've set them up as half on each CPU, it should show the affinity on half of them as 0-23 and on the other half as 24-43. If it doesn't, then the NUMA node is not being set up correctly.
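For anyone who wants to generate the launcher lines rather than type them by hand, here is a rough Python sketch of the idea. It only builds the command text for the second .bat file; it does not launch anything. The helper names (affinity_mask, start_command) are made up for this example, and filename.bat is the placeholder from the post above. It assumes the documented behaviour of START that, when /NODE and /AFFINITY are combined, the mask is counted from bit zero of that node's own processors.

```python
def affinity_mask(cpu_indexes):
    """Return a hex affinity mask with one bit set per listed CPU index.

    Indexes are relative to the NUMA node (0 = first logical CPU of the node),
    which is how START reads the mask when /NODE is also given.
    """
    mask = 0
    for index in cpu_indexes:
        mask |= 1 << index
    return format(mask, "X")


def start_command(node, cpu_indexes, target):
    """Build one START line for the launcher .bat file (hypothetical helper)."""
    return f"start /NODE {node} /AFFINITY 0x{affinity_mask(cpu_indexes)} {target}"


if __name__ == "__main__":
    # One line per NUMA node, each allowed on the first two logical CPUs of its node.
    for node in (0, 1):
        print(start_command(node, [0, 1], "filename.bat"))
```

Paste the printed lines into the launcher .bat file and then check the resulting affinity in ProcessLasso as described above.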
scole of TSBT
Boinc Major General
Posts: 5982
Joined: Mon Feb 03, 2014 2:38 pm
Location: Goldsboro, (Eastern) North Carolina, USA

#74 Re: Xeon - not utilising more than 32 threads

Post by scole of TSBT »

Exactly what model CPUs are these? Are they retail or ES?

And running some WCG for the next 24 hours might help us diagnose the issue :-)
noetus
Boinc Corporal
Posts: 50
Joined: Tue May 30, 2017 3:15 am

#75 Re: Xeon - not utilising more than 32 threads

Post by noetus »

Thanks for all these comments. The affinity is correct as you described; I had already checked that. As I mentioned before, the core dropout on one NUMA node coincides with the moment one batch file is finishing up and the next one is taking over from it. The explanation has to have something to do with that. However, I've now checked whether hyperthreading makes any difference on my 28-core systems (an identical hardware setup apart from the lower core count), which are working perfectly. For these single-threaded command-line processes that run in parallel in the OS, there is no measurable difference between hyperthreading on or off. I can run twice the number of processes with hyperthreading on and have a batch list that is half as long, or run fewer processes with hyperthreading off and a longer batch list. The total time to completion is identical to within statistical error. Therefore the simplest solution for me now is to turn hyperthreading off on the 44-core system, which brings the thread count below 64, and all is well (I've already tested it).

Perhaps I will come back to this later to try to get to the root of the problem!

[EDIT] The CPU model numbers are E5-2696 @ 2.8 GHz when all cores are at 100%. They are retail chips.
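The thread counts behind the workaround above can be checked with a few lines of Python. This is only a sketch of the arithmetic: it assumes two hardware threads per core when hyperthreading is on, and it uses the 64-logical-processor size limit of a Windows processor group as the threshold. Whether that limit is really the root cause of the dropout is still just a hypothesis in this thread. The function names are invented for the example.

```python
MAX_GROUP_SIZE = 64  # a Windows processor group holds at most 64 logical processors


def logical_processors(total_cores, hyperthreading):
    """Logical processor count, assuming 2 threads per core with hyperthreading on."""
    return total_cores * (2 if hyperthreading else 1)


def fits_in_one_group(total_cores, hyperthreading):
    """True when every logical processor fits inside a single processor group."""
    return logical_processors(total_cores, hyperthreading) <= MAX_GROUP_SIZE


if __name__ == "__main__":
    # The two configurations discussed in this thread: 28-core and 44-core systems.
    for cores in (28, 44):
        for ht in (True, False):
            count = logical_processors(cores, ht)
            verdict = "one group" if fits_in_one_group(cores, ht) else "more than one group"
            print(f"{cores} cores, hyperthreading {'on' if ht else 'off'}: {count} threads, {verdict}")
```

Only the 44-core system with hyperthreading on goes over the threshold, which matches the one configuration that misbehaves.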
Dirk Broer
Corsair
Posts: 1964
Joined: Thu Feb 20, 2014 11:24 pm
Location: Leiden, South Holland, Netherlands

#76 Re: Xeon - not utilising more than 32 threads

Post by Dirk Broer »

With the new Threadripper coming out with 32 cores/64 threads and future EPYCs rumoured to get yet more cores (64c/128t), this thread about threads is getting more and more interesting... A dual EPYC2 mobo might run into yet another limit with its 256 threads, I fear.
Dirk Broer
Corsair
Posts: 1964
Joined: Thu Feb 20, 2014 11:24 pm
Location: Leiden, South Holland, Netherlands

#77 Re: Xeon - not utilising more than 32 threads

Post by Dirk Broer »

There is computing outside of x86, of course. IBM's Power9 comes with either 12 cores with 8-way SMT or 24 cores with 4-way SMT, and their systems can house sixteen (16!) sockets.
That is 16 x 12 x 8, or 16 x 24 x 4, both ending up at 1536 threads per system, max. These systems are not meant for Joe Sixpack though (but who is stopping sysadmin Sixpack from 'burning in' the new company server with a 7-day BOINC run? :wink: ), and they won't be running Windows either.