Collatz GPU optimisations

Alez
[ TSBT's Pirate ]
Posts: 10363
Joined: Thu Oct 04, 2012 1:22 pm
Location: roaming the planet

#1 Collatz GPU optimisations

Post by Alez »

For my 2 x 980s I'm running this, and it is twice as fast as the standard settings.

["app_config.xml"]

Code: Select all

<app_config>
<app>
<name>collatz_sieve</name>
<max_concurrent>2</max_concurrent>
<gpu_versions>
<gpu_usage>1.0</gpu_usage>
<cpu_usage>0.5</cpu_usage>
</gpu_versions>
</app>
</app_config>
Use Notepad to create this file and save it with "Save as" so that it does not end up with a .txt extension. For a single card, change <max_concurrent>2</max_concurrent> to <max_concurrent>1</max_concurrent>.
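
The relationship between the number of cards and the values in this file can be sketched in Python. This is only a rough illustration: `build_app_config` and its `tasks_per_gpu` argument are hypothetical names, and it assumes gpu_usage is 1 divided by the tasks per card and max_concurrent is cards times tasks per card, which matches the two-card file above.

```python
import xml.etree.ElementTree as ET


def build_app_config(num_gpus, tasks_per_gpu=1, cpu_usage=0.5,
                     app_name="collatz_sieve"):
    """Return app_config.xml text for a given card count (hypothetical helper)."""
    root = ET.Element("app_config")
    app = ET.SubElement(root, "app")
    ET.SubElement(app, "name").text = app_name
    # Assumption: one slot per task, so cards * tasks per card in total.
    ET.SubElement(app, "max_concurrent").text = str(num_gpus * tasks_per_gpu)
    gpu_versions = ET.SubElement(app, "gpu_versions")
    # Assumption: each task claims an equal fraction of one card.
    ET.SubElement(gpu_versions, "gpu_usage").text = str(1.0 / tasks_per_gpu)
    ET.SubElement(gpu_versions, "cpu_usage").text = str(cpu_usage)
    if hasattr(ET, "indent"):  # pretty-printing needs Python 3.9 or newer
        ET.indent(root)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(build_app_config(num_gpus=2))  # two cards, one task each
    print(build_app_config(num_gpus=1))  # single card
```

Write the returned text to app_config.xml in the project folder yourself; the sketch deliberately does not touch any files.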

For the application's .config file:

Code: Select all

verbose=0
kernels_per_reduction=48
threads=8
lut_size=17
sleep=0
cache_sieve=1
reduce_cpu=0
sieve_size=30
This file is collatz_sieve_1.30_windows_x86_64_opencl_nvidia_gpu.config and will be empty to begin with. On Linux the file name will be slightly different, and you will be looking for the AMD version, in /var/lib/boinc/projects/boinc.thesonntags.com_collatz

For different cards look here and pick the closest.

Edit it using gedit. In a terminal, run gksudo gedit and then navigate to the file. If you don't have gedit, then sudo apt-get install gedit will do the trick first. Gedit is standard in Ubuntu but not in my preferred Lubuntu.
The best form of help from above is a sniper on the rooftop....

#2 Re: Collatz GPU optimisations

Post by Alez »

Configuration file location (shown for Windows; on other operating systems, replace the leading path components accordingly):

C:\ProgramData\BOINC\projects\boinc.thesonntags.com_collatz\<app_name>.config

During project initialization on your client, empty <app_name>.config files will be created for each of the application versions that match your GPUs. You can enter parameters into these files in order to deviate from default values, and they will be picked up as soon as a Collatz GPU task starts.

Configuration file format

Plain text file with one "parameter=value" pair per line. Unrecognized parameter names are simply ignored (you can use this to comment out some parameters during testing), and missing parameters fall back to their default values.

Example (suitable for a GTX 1080):
kernels_per_reduction=48
threads=9
lut_size=17
sieve_size=30
cache_sieve=1
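
To make the file format concrete, here is a small Python sketch of a reader that follows the rules described above. It is not the application's actual parser: `parse_config`, `DEFAULTS` and `KNOWN` are hypothetical names, the defaults are copied from the parameter list in this post (cache_sieve is marked there as uncertain), and it assumes every value is an integer.

```python
# Defaults as listed in the parameter reference in this post.
# sieve_size has no documented default, so it only appears if the file sets it.
DEFAULTS = {
    "cache_sieve": 1,  # listed as "1 (?)", so treat as uncertain
    "kernels_per_reduction": 32,
    "lut_size": 10,
    "reduce_cpu": 0,
    "sleep": 1,
    "threads": 6,
    "verbose": 0,
}
KNOWN = set(DEFAULTS) | {"sieve_size"}


def parse_config(text):
    """Read "parameter=value" lines, ignoring names that are not recognized."""
    settings = dict(DEFAULTS)  # missing parameters keep their defaults
    for line in text.splitlines():
        name, sep, value = line.partition("=")
        name = name.strip()
        if not sep or name not in KNOWN:
            continue  # covers blank lines and "commented out" names
        settings[name] = int(value)  # assumption: all values are integers
    return settings


EXAMPLE = """kernels_per_reduction=48
threads=9
lut_size=17
sieve_size=30
cache_sieve=1
"""

if __name__ == "__main__":
    # "#sleep=0" is not a recognized name, so sleep stays at its default.
    print(parse_config(EXAMPLE + "#sleep=0\n"))
```

How the real application reacts to malformed values is not stated in the thread; this sketch would simply raise a ValueError.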

Parameters

cache_sieve
default: 1 (?)
range: 0 or 1 (?)
definition: "any setting other than 1 will add several seconds to the run time as it will re-create the sieve for each WU run rather than re-using it"

kernels_per_reduction
default: 32
range: 1...64
definition: "the number of kernels that will be run before doing a reduction. Too high a number may cause a video driver crash or poor video response. Too low a number will slow down processing. Suggested values are between 8 and 48 depending upon the speed of the GPU."
comment: "affects GPU usage and video lag the most from what I [sosiris] tested."

lut_size
default: 10
range: 2...31
definition: "the size (in power of 2) of the lookup table. Chances are that any value over 20 will cause the GPU driver to crash and processing to hang. The default results in 2^10 or 1024 items. Each item uses 8 bytes. So 10 would result in 2^10 * 8 bytes or 8192 bytes. Larger is better so long as it will fit in the GPUs L1/L2 cache. Once it exceeds the cache size, it will actually take longer to complete a WU since it has to read from slower global memory rather than high speed cached memory."
comment: "I [sosiris] choose 16, 65536 items for the look up table because it would fit into the L2$ (512KB) in GCN devices. IMHO it could be 20 for NV GPUs, just like previous apps, because NV GPUs have better caching."
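
The arithmetic in the lut_size definition (2^lut_size items at 8 bytes each) can be checked with a few lines of Python. `lut_bytes` is a hypothetical helper name; the 8 bytes per item comes from the quoted definition.

```python
def lut_bytes(lut_size, bytes_per_item=8):
    """Size in bytes of a lookup table with 2**lut_size items."""
    return (2 ** lut_size) * bytes_per_item


if __name__ == "__main__":
    for size in (10, 16, 17, 20):
        kib = lut_bytes(size) / 1024
        print(f"lut_size={size}: {lut_bytes(size)} bytes ({kib:g} KiB)")
```

With these numbers, lut_size=16 is exactly 512 KiB, which is the L2 cache figure sosiris quotes for GCN devices, while lut_size=17 is 1 MiB and lut_size=20 is 8 MiB.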

reduce_cpu
default: 0
range: 0 or 1
definition: "The default is 0 which will do the total steps summation and high steps comparison on the GPU. Setting to 1 will result in more CPU utilization but may make the video more responsive. I have yet to find a reason to do the reduction on the CPU other than for testing the output of new versions."
comment: "I [sosiris] choose to do the reduction on the CPU because AMD OpenCL apps will take up a CPU core no matter what you do (aka 'busy waiting') and because I want better video response."

sieve_size
default: ?
range: 15...32
definition: "controls both the size of the sieve used 2^15 thru 2^32 as well as the items per kernel as they are directly associated with the sieve size. A sieve size of 26 uses approx 1 million items per kernel. Each value higher roughly doubles the amount. Each value lower decreases the amount by about half. Too high a value will crash the video driver."
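
As a rough guide to how quickly the work per kernel grows, the doubling rule in the definition can be written out in Python. This is an estimate only: `approx_items_per_kernel` is a hypothetical helper that takes "approx 1 million items at sieve_size=26" as its reference point and doubles or halves from there, as the definition describes.

```python
def approx_items_per_kernel(sieve_size, reference_size=26,
                            reference_items=1_000_000):
    """Estimate items per kernel by doubling/halving from the quoted reference."""
    return reference_items * 2 ** (sieve_size - reference_size)


if __name__ == "__main__":
    for size in (25, 26, 27, 30):
        print(f"sieve_size={size}: ~{approx_items_per_kernel(size):,.0f} items")
```

By this estimate sieve_size=30 is about 16 times the work per kernel of sieve_size=26.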

sleep
default: 1
range: ?
definition: "the number of milliseconds to sleep while waiting for a kernel to complete. A higher value may result in less CPU utilization and improve video response, but it also may lengthen the processing time."

threads
default: 6
range: 6...11
definition: "the 2^N size of the local size (a.k.a. work group size or threads). Too high a value results in more threads but that means more registers being used. If too many registers are used, it will use slower non-register memory. The goal is to use as many as possible, but not so many that processing slows down. AMD GPUs tend to work best with a value of 6 or 7 even though they can support values of up to 10 or 11. nVidia GPUs seem to work as well with higher values as lower values."
comment: "I [sosiris] didn't see lots of difference once items per work-group is more than wavefront size (64) of my HD7850 in the profiler."
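
Since threads is an exponent rather than a count, a quick conversion helps when comparing settings. A minimal Python sketch, with `work_group_size` as a hypothetical helper name and the 6...11 range taken from the list above:

```python
def work_group_size(threads):
    """Local (work group) size for a given threads setting, i.e. 2**threads."""
    if not 6 <= threads <= 11:
        raise ValueError("threads is documented here as 6...11")
    return 2 ** threads


if __name__ == "__main__":
    for n in (6, 8, 9, 11):
        print(f"threads={n}: local size {work_group_size(n)}")
```

So the default of 6 gives a local size of 64, which is the wavefront size sosiris mentions, and threads=9 gives 512.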

verbose
default: 0
range: 0 or 1
definition: "1 will result in more detail in the output."

Definitions are taken from Slicker's post from June 2015, last modified in September 2015.
Comments are taken from sosiris' post from June 2015.
Edit April 28 2018, added definition of cache_sieve from a post from Slicker from April 2018

#3 Re: Collatz GPU optimisations

Post by Alez »

If you have more than one GPU in the system, you will want to create this cc_config.xml file so that both (or all) of them are used. By default BOINC will only use the most capable GPU in a system.

["cc_config.xml"] usually in C:\ProgramData\BOINC on Windows, or /var/lib/boinc (or /var/lib/boinc-client) on Linux

Code: Select all

<cc_config>
<options>
<use_all_gpus>1</use_all_gpus>
<skip_cpu_benchmarks>1</skip_cpu_benchmarks>
<report_results_immediately>1</report_results_immediately>
</options>
</cc_config>
Create the file with Notepad or gedit, depending on your system. Make sure you use "Save as" when finished. Do not simply save, or your file will be called cc_config.xml.txt and will not be read.
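
If you want to double-check the file before restarting BOINC, something like the following Python sketch could be used. Both helper names are hypothetical; the sketch only parses the XML text and looks at the file name, and does not talk to the BOINC client.

```python
import xml.etree.ElementTree as ET


def uses_all_gpus(cc_config_text):
    """True if <options><use_all_gpus> is set to 1 in the given XML text."""
    root = ET.fromstring(cc_config_text)
    value = root.findtext("options/use_all_gpus", default="0")
    return value.strip() == "1"


def saved_with_txt_extension(filename):
    """True for names like cc_config.xml.txt, which BOINC will not read."""
    return filename.lower().endswith(".xml.txt")


SAMPLE = """<cc_config>
<options>
<use_all_gpus>1</use_all_gpus>
<skip_cpu_benchmarks>1</skip_cpu_benchmarks>
<report_results_immediately>1</report_results_immediately>
</options>
</cc_config>"""

if __name__ == "__main__":
    print(uses_all_gpus(SAMPLE))
    print(saved_with_txt_extension("cc_config.xml.txt"))
```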

#4 Re: Collatz GPU optimisations

Post by Alez »

And from Slicker's FAQ:

Q: How come I'm not getting any work?
A: Your computer may already have enough work. Just because the boinc log says it requested work, it may have requested 0 seconds. The ONLY way to see what it really asked for is to enable sched_op_debug in Boinc Manager via the Options, Event Log Options screen.

If you want work for your GPU you need to have OpenCL drivers installed. The Windows drivers installed automatically by Microsoft may not contain the required OpenCL files. Try installing the version from the AMD, nVidia, or Intel web sites.

Check what preferences you have set for the Collatz project via the web site. You won't get work if you don't have it enabled.

Lastly, BOINC bases its calculations on how many floating point operations your computer can do. Unfortunately, Collatz only uses integers which causes the estimates to be way off. In addition, the GPU applications can run anywhere from twice as fast (older slower GPUs and Intel embedded GPUs) to hundreds of times faster. For example, it thinks my Android phone is 1/4 the speed of my i7 laptop when in reality, it is about 1/400 the speed.

Q: All the workunits have errors. What's wrong?
A: The Windows versions require the Microsoft C Runtime library. If you are running a 64-bit version of Windows, you will need BOTH 32 and 64 bit versions since BOINC will likely send you both even though the server has been set to prefer sending 64-bit apps to 64-bit operating systems.

Q: When are the new apps going to be available for my computer?
A: It takes about 40 hours to test each individual application to make sure it calculates correctly. That's 25 apps x 40 hours each for OS X, Windows, and Linux. So, it takes 1,000 hours to run through all the tests, and if there's a bug, start over. Since this is not my full time job just as crunching is not your full time job, I have limited time to spend doing it.
davidbam
General Bitchin'
Posts: 6371
Joined: Wed Aug 15, 2018 1:15 pm
Location: Huntly, Scotland
Contact:

#5 Re: Collatz GPU optimisations

Post by davidbam »

Question: I have my nvidia GTX 1080 turning out a WU in 6 mins with 2 WU loaded onto the GPU. When I add a second identical GTX 1080, I get 4 running WU but each one takes twice as long to run!!

Thinking it might need more CPU, I disabled WCG which was the only CPU project running - to no avail. Plenty of CPU available, plenty RAM, plenty SSD, CPU has 40 PCIe lanes

Should I hyperthread or not?
I think this is fool-proof but could you just try it for me please? • There are 10 types of people in the world; those who understand binary, and those who don’t

#6 Re: Collatz GPU optimisations

Post by Alez »

Are you running the app_config.xml? Did you remember to double the CPU you have allocated? The CPU allocated there is for Collatz GPU in total, so if it is set to 0.5 then there is only half a core to feed 2 GPUs / 4 apps. I presume the PSU is up to supplying enough power?

#7 Re: Collatz GPU optimisations

Post by davidbam »

Yes, I get 4 WU to run. Will check PSU when I get home tomorrow afternoon

#8 Re: Collatz GPU optimisations

Post by Alez »

davidBAM wrote: Sun Nov 18, 2018 11:19 am Yes, I get 4 WU to run. Will check PSU when I get home tomorrow afternoon
What I mean is, if you are running

Code: Select all

<cpu_usage>0.5</cpu_usage>
in your app_config.xml
then change it to

Code: Select all

<cpu_usage>1</cpu_usage>
so that you are still allocating 0.25 core per running app.

#9 Re: Collatz GPU optimisations

Post by davidbam »

Will check when I get back. Tried logging in remotely but I think my connection must be down
scole of TSBT
Boinc Major General
Posts: 5980
Joined: Mon Feb 03, 2014 2:38 pm
Location: Goldsboro, (Eastern) North Carolina, USA

#10 Re: Collatz GPU optimisations

Post by scole of TSBT »

I've not run 2 GPUs in one system for a while but make sure it's using the 2nd GPU. Sounds like all 4 are running on 1 GPU.

Also, regardless of how many WUs you are running per GPU, allocate 1 CPU per WU. OpenCL is CPU intensive.
<gpu_usage>.5</gpu_usage>
<cpu_usage>1</cpu_usage>

Is this line needed in the cc_config.xml?
<use_all_gpus>1</use_all_gpus>

#11 Re: Collatz GPU optimisations

Post by Alez »

Funny, that was exactly what I was thinking. Are you sure it is 2 units per GPU and not 4 units on one and nothing on the other?
Use this cc_config.xml in /var/lib/boinc

Code: Select all

<cc_config>
<options>
<use_all_gpus>1</use_all_gpus>
<skip_cpu_benchmarks>1</skip_cpu_benchmarks>
<report_results_immediately>1</report_results_immediately>
</options>
</cc_config>
Use this in /var/lib/boinc/projects/boinc.thesonntags.com_collatz

Code: Select all

<app_config>
<app>
<name>collatz_sieve</name>
<max_concurrent>4</max_concurrent>
<gpu_versions>
<gpu_usage>0.5</gpu_usage>
<cpu_usage>1</cpu_usage>
</gpu_versions>
</app>
</app_config>

#12 Re: Collatz GPU optimisations

Post by davidbam »

Code: Select all

root@lw1-asrockx79:/var/lib/boinc# cat cc*xml
<!--
This is a minimal configuration file cc_config.xml of the BOINC core client.
For a complete list of all available options and logging flags and their
meaning see: https://boinc.berkeley.edu/wiki/client_configuration
-->
<cc_config>
	 <options>
 <use_all_gpus>1</use_all_gpus>
 </options>
  <log_flags>
    <task>1</task>
    <file_xfer>1</file_xfer>
    <sched_ops>1</sched_ops>
  </log_flags>
</cc_config>

root@lw1-asrockx79:/var/lib/boinc/projects/boinc.thesonntags.com_collatz# cat app_config.xml
<app_config>
<app>
<name>collatz_sieve</name>
<max_concurrent>4</max_concurrent>
<gpu_versions>
<gpu_usage>0.5</gpu_usage>
<cpu_usage>1</cpu_usage>
</gpu_versions>
</app>
</app_config>

All other Boinc projects suspended. Boincmgr permits 75% of available CPUs - and 100% of CPU time. I am baffled. Hyperthreading is enabled
Image

#13 Re: Collatz GPU optimisations

Post by davidbam »

This is from a Collatz WU when only one GPU is installed:

Code: Select all

Name 	collatz_sieve_435baba7-08a4-4cdd-8f59-3f43b2860b50_0
Workunit 	13092421
Created 	15 Nov 2018, 21:55:39 UTC
Sent 	15 Nov 2018, 22:08:01 UTC
Report deadline 	29 Nov 2018, 22:08:01 UTC
Received 	15 Nov 2018, 23:46:24 UTC
Server state 	Over
Outcome 	Success
Client state 	Done
Exit status 	0 (0x00000000)
Computer ID 	842433
Run time 	6 min 7 sec
CPU time 	3 sec
Validate state 	Valid
Credit 	27,706.70
Device peak FLOPS 	9,074.50 GFLOPS
Application version 	Collatz Sieve v1.40 (opencl_nvidia)
x86_64-pc-linux-gnu
Peak working set size 	168.96 MB
Peak swap size 	13,092.76 MB
Peak disk usage 	48.74 MB
And when 2 are installed, it takes twice as long and the CPU time is 7 times as long:

Code: Select all

Name 	collatz_sieve_c194960f-a03f-492b-867d-87a8679837e1_0
Workunit 	13098944
Created 	15 Nov 2018, 23:41:42 UTC
Sent 	15 Nov 2018, 23:52:45 UTC
Report deadline 	29 Nov 2018, 23:52:45 UTC
Received 	16 Nov 2018, 12:23:07 UTC
Server state 	Over
Outcome 	Success
Client state 	Done
Exit status 	0 (0x00000000)
Computer ID 	842433
Run time 	13 min 38 sec
CPU time 	22 sec
Validate state 	Valid
Credit 	28,740.65
Device peak FLOPS 	9,074.50 GFLOPS
Application version 	Collatz Sieve v1.40 (opencl_nvidia)
x86_64-pc-linux-gnu
Peak working set size 	168.99 MB
Peak swap size 	21,284.78 MB
Peak disk usage 	48.74 MB

#14 Re: Collatz GPU optimisations

Post by davidbam »

Hmmm - is it maybe RAM? I only have 24 GB in there

2 x Peak swap size 13,092.76 MB will swap much much less than
4 x Peak swap size 21,284.78 MB

Or am I reading that wrong? I wasn't expecting any swapping whatsoever TBH

#15 Re: Collatz GPU optimisations

Post by Alez »

The machine I have 2 x 980s in only has 12 GB of RAM, I think; it is running hyper-threaded and only has a single core set aside as excluded from BOINC.
Is that running 2 units per card? Are you sure it is not running all 4 units on the one card?
I think the best option is to take a step back. Run a single unit on each card and see what times you get as a comparison.

#16 Re: Collatz GPU optimisations

Post by davidbam »

Okay ta. The screenshot in post #12 shows 2 units on each card

#17 Re: Collatz GPU optimisations

Post by Alez »

davidBAM wrote: Tue Nov 20, 2018 8:10 am Okay ta. The screenshot in post #12 shows 2 units on each card
Yes, that's what the config says and that's what it should be. What I'm asking is whether it is actually working or not. Have you checked that both GPUs are actually loaded? Check in the nVidia panel that both GPUs are being used (load, temp etc.). Also check the startup log in BOINC Manager to confirm that BOINC sees 2 GPUs and that the cc_config is being found and read. There will be a flag at the start of the log stating that cc_config is present and that use all GPUs has been set.
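
If you would rather not read through the log by eye, a short Python sketch along these lines can pull out the relevant lines from a copied startup log. The function name is hypothetical, and the patterns are simply matched against the wording seen in a BOINC 7.12 startup log ("OpenCL: NVIDIA GPU n:", "Config: use all coprocessors", "Found app_config.xml"); other client versions may word these lines differently.

```python
import re


def summarize_startup_log(log_text):
    """Report how many OpenCL NVIDIA GPUs are listed and which config flags appear."""
    gpu_indexes = set(re.findall(r"OpenCL: NVIDIA GPU (\d+):", log_text))
    return {
        "opencl_gpu_count": len(gpu_indexes),
        "use_all_gpus": "Config: use all coprocessors" in log_text,
        "app_config_found": "Found app_config.xml" in log_text,
    }


SAMPLE_LOG = """\
Tue 20 Nov 2018 09:56:14 GMT |  | OpenCL: NVIDIA GPU 0: GeForce GTX 1080 (driver version 390.87, device version OpenCL 1.2 CUDA, 8120MB, 3982MB available, 9070 GFLOPS peak)
Tue 20 Nov 2018 09:56:14 GMT |  | OpenCL: NVIDIA GPU 1: GeForce GTX 1080 (driver version 390.87, device version OpenCL 1.2 CUDA, 8118MB, 3982MB available, 9070 GFLOPS peak)
Tue 20 Nov 2018 09:56:14 GMT | collatz | Found app_config.xml
Tue 20 Nov 2018 09:56:14 GMT |  | Config: use all coprocessors
"""

if __name__ == "__main__":
    print(summarize_startup_log(SAMPLE_LOG))
```

This only tells you what BOINC detected at startup; whether both cards are actually loaded still has to be checked in the driver's own monitoring.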

#18 Re: Collatz GPU optimisations

Post by davidbam »

Just back from walking dogs - took ages for them to catch up with 3 days of p-mail :P

The screen dump was from boincmgr so I took it at face value. I've put a monitor/keyboard on it now so will check all. Both cards are certainly very hot to the touch

#19 Re: Collatz GPU optimisations

Post by davidbam »

Code: Select all

Tue 20 Nov 2018 09:56:14 GMT |  | Starting BOINC client version 7.12.0 for x86_64-pc-linux-gnu
Tue 20 Nov 2018 09:56:14 GMT |  | log flags: file_xfer, sched_ops, task
Tue 20 Nov 2018 09:56:14 GMT |  | Libraries: libcurl/7.61.0 OpenSSL/1.1.1 zlib/1.2.11 libidn2/2.0.5 libpsl/0.20.2 (+libidn2/2.0.4) nghttp2/1.32.1 librtmp/2.3
Tue 20 Nov 2018 09:56:14 GMT |  | Data directory: /var/lib/boinc-client
Tue 20 Nov 2018 09:56:14 GMT |  | CUDA: NVIDIA GPU 0: GeForce GTX 1080 (driver version 390.87, CUDA version 9.1, compute capability 6.1, 4096MB, 3982MB available, 9070 GFLOPS peak)
Tue 20 Nov 2018 09:56:14 GMT |  | CUDA: NVIDIA GPU 1: GeForce GTX 1080 (driver version 390.87, CUDA version 9.1, compute capability 6.1, 4096MB, 3982MB available, 9070 GFLOPS peak)
Tue 20 Nov 2018 09:56:14 GMT |  | OpenCL: NVIDIA GPU 0: GeForce GTX 1080 (driver version 390.87, device version OpenCL 1.2 CUDA, 8120MB, 3982MB available, 9070 GFLOPS peak)
Tue 20 Nov 2018 09:56:14 GMT |  | OpenCL: NVIDIA GPU 1: GeForce GTX 1080 (driver version 390.87, device version OpenCL 1.2 CUDA, 8118MB, 3982MB available, 9070 GFLOPS peak)
Tue 20 Nov 2018 09:56:14 GMT |  | [libc detection] gathered: 2.28, Ubuntu GLIBC 2.28-0ubuntu1
Tue 20 Nov 2018 09:56:14 GMT |  | Host name: lw1-asrockx79
Tue 20 Nov 2018 09:56:14 GMT |  | Processor: 12 GenuineIntel Intel(R) Core(TM) i7-4960X CPU @ 3.60GHz [Family 6 Model 62 Stepping 4]
Tue 20 Nov 2018 09:56:14 GMT |  | Processor features: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm cpuid_fault pti ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid fsgsbase smep erms xsaveopt dtherm ida arat pln pts flush_l1d
Tue 20 Nov 2018 09:56:14 GMT |  | OS: Linux Ubuntu: Ubuntu 18.10 [4.18.0-11-generic|libc 2.28 (Ubuntu GLIBC 2.28-0ubuntu1)]
Tue 20 Nov 2018 09:56:14 GMT |  | Memory: 23.49 GB physical, 0 bytes virtual
Tue 20 Nov 2018 09:56:14 GMT |  | Disk: 251.15 GB total, 231.93 GB free
Tue 20 Nov 2018 09:56:14 GMT |  | Local time is UTC +0 hours
Tue 20 Nov 2018 09:56:14 GMT | collatz | Found app_config.xml
Tue 20 Nov 2018 09:56:14 GMT | PrimeGrid | Found app_config.xml

Tue 20 Nov 2018 09:56:14 GMT |  | Config: use all coprocessors
Tue 20 Nov 2018 09:56:14 GMT | collatz | URL https://boinc.thesonntags.com/collatz/; Computer ID 842433; resource share 10000
Tue 20 Nov 2018 09:56:14 GMT | PrimeGrid | URL http://www.primegrid.com/; Computer ID 940055; resource share 1000
Tue 20 Nov 2018 09:56:14 GMT | World Community Grid | URL http://www.worldcommunitygrid.org/; Computer ID 5016304; resource share 2000
Tue 20 Nov 2018 09:56:14 GMT |  | General prefs: from http://einstein.phys.uwm.edu/ (last modified 06-Nov-2018 04:14:50)
Tue 20 Nov 2018 09:56:14 GMT |  | Host location: none
Tue 20 Nov 2018 09:56:14 GMT |  | General prefs: using your defaults
Tue 20 Nov 2018 09:56:14 GMT |  | Reading preferences override file
Tue 20 Nov 2018 09:56:14 GMT |  | Preferences:
Tue 20 Nov 2018 09:56:14 GMT |  | max memory usage when active: 12027.89 MB
Tue 20 Nov 2018 09:56:14 GMT |  | max memory usage when idle: 21650.20 MB
Tue 20 Nov 2018 09:56:14 GMT |  | max disk usage: 226.03 GB
Tue 20 Nov 2018 09:56:14 GMT |  | max CPUs used: 9
Tue 20 Nov 2018 09:56:14 GMT |  | suspend work if non-BOINC CPU load exceeds 25%
Tue 20 Nov 2018 09:56:14 GMT |  | (to change preferences, visit a project web site or select Preferences in the Manager)
Tue 20 Nov 2018 09:56:14 GMT |  | Setting up project and slot directories
Tue 20 Nov 2018 09:56:14 GMT |  | Checking active tasks
Tue 20 Nov 2018 09:56:14 GMT |  | Setting up GUI RPC socket
Image

I have now reduced it to 1 WU per GPU and it looks as if times are reducing significantly. Not sure if credits earned will be any higher than putting 2 WU on a single card to be honest. Will report back

Incidentally, the graph shows both cards jammed up at 100% utilisation

#20 Re: Collatz GPU optimisations

Post by Alez »

Yes, everything looks correct; BOINC definitely sees and is using both cards. Tons of memory, CPU etc. I really have no idea. As a long shot, set "suspend work if non-BOINC CPU load exceeds" to 75%. The only other thing I can think of is that one slot is x16 and the other is x8, but even that shouldn't account for double the time. If the utilisation is at 100% with one unit per card, then there is no point in running 2 per card. Very strange. Let it run 1 unit/card for a bit and see what the times/credits are as a benchmark.

#21 Re: Collatz GPU optimisations

Post by davidbam »

Alez wrote: Tue Nov 20, 2018 11:22 am If the utilisation is at 100% with one unit per card then no point in running 2 per card
THIS !!!

I remembered it all wrong. With the app optimisation on Collatz, there is no benefit from running 2 WU per GPU. I think that thought was a hangover from trying it on a sprint.

A Collatz WU is now completing in 6 mins +/- a few seconds, so comfortably over 13 million credits / day on Collatz from one machine. It has 9 threads on WCG as a bonus
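
For anyone wanting to compare setups, the daily figure can be estimated with a few lines of Python. This is a back-of-envelope sketch: `credits_per_day` is a hypothetical helper, and it assumes every task earns about the 27,706.70 credits shown in the earlier task report and that the cards run around the clock.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def credits_per_day(run_time_s, credit_per_task, num_gpus, tasks_per_gpu=1):
    """Estimated daily credit if every task takes run_time_s seconds."""
    tasks_per_day = SECONDS_PER_DAY / run_time_s * num_gpus * tasks_per_gpu
    return tasks_per_day * credit_per_task


if __name__ == "__main__":
    credit = 27_706.70  # assumption: taken from one task report in this thread
    one_each = credits_per_day(6 * 60, credit, num_gpus=2)
    two_each = credits_per_day(13 * 60 + 38, credit, num_gpus=2, tasks_per_gpu=2)
    print(f"1 WU per GPU at 6 min:      {one_each:,.0f} credits/day")
    print(f"2 WU per GPU at 13 min 38 s: {two_each:,.0f} credits/day")
```

With those inputs, one WU per card at 6 minutes works out to roughly 13.3 million credits per day for two cards, and two WU per card at 13 min 38 sec comes out lower, which is consistent with running single units.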

#22 Re: Collatz GPU optimisations

Post by Alez »

Sorted then, 1 unit at 6 mins and 2 at 13 mins. Better to run the single units and all working as it should be. Very nice points haul per day from only 2 cards.
Those two cards are matching the output I have from 4 and a bit cards :clap:
................. Of course this is how arms races start :whistle:

#23 Re: Collatz GPU optimisations

Post by davidbam »

Cheers - I am keeping my eyes peeled for another GTX 1080 as they go for a fair bit less than the GTX 1080 Ti. The Ti doesn't seem to be dropping much in price - possibly due to the bad press that its successor seems to be getting.

Doubtless the points-per-£ ratio will all change in a few months though

#24 Re: Collatz GPU optimisations

Post by davidbam »

my next problem is very slow WU on GTX960 - 2.5 hours or thereby so approx 320K per day.

I've tried the optimisation parameters from https://boinc.thesonntags.com/collatz/f ... postid=769 and they give a computation error. I note this was for Windoze; is it different under Linux?

#25 Re: Collatz GPU optimisations

Post by Alez »

davidBAM wrote: Wed Nov 21, 2018 7:33 pm my next problem is very slow WU on GTX960 - 2.5 hours or thereby so approx 320K per day.

I've tried the optimisation parameters from https://boinc.thesonntags.com/collatz/f ... postid=769 and they give a computation error. I note this was for Windoze; is it different under Linux?
That is wrong. The GTX 960 is only 2 GB and not massively more powerful than my GTX 750 Ti, but that card manages an un-optimised task in approx 46 mins on average.
The comp error is probably due to the card not handling the optimisation; in particular, sieve_size=30 probably overflows the 2 GB of memory. I'd expect the card to run a task in approx 30 mins.

try

Code: Select all

verbose=0
kernels_per_reduction=48
threads=9
lut_size=17
sieve_size=27
reduce_cpu=0
cache_sieve=1
sleep=0
If that crashes, go down to sieve_size=26 and threads=8.

#26 Re: Collatz GPU optimisations

Post by davidbam »

Thanks - that is marginally better but still estimating over 2 hours :-(

The card reports as having 4 GB, btw ...
Wed 21 Nov 2018 23:20:29 GMT | | CUDA: NVIDIA GPU 0: GeForce GTX 960 (driver version 390.87, CUDA version 9.1, compute capability 2.1, 4096MB, 4011MB available, 691 GFLOPS peak)
Wed 21 Nov 2018 23:20:29 GMT | | OpenCL: NVIDIA GPU 0: GeForce GTX 960 (driver version 390.87, device version OpenCL 1.1 CUDA, 4535MB, 4011MB available, 691 GFLOPS peak)

Just noticed ... OpenCL 1.1? Is that correct? And CUDA version 9.1. Wondering if I have the wrong drivers?
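
To compare what different cards report, the driver and OpenCL versions can be pulled out of those log lines with a regular expression. A minimal Python sketch, with `parse_opencl_line` as a hypothetical helper; the pattern is written against the line format shown above and may need adjusting for other vendors or client versions.

```python
import re

GPU_LINE = re.compile(
    r"OpenCL: NVIDIA GPU (?P<index>\d+): (?P<model>.+?) "
    r"\(driver version (?P<driver>[\d.]+), "
    r"device version OpenCL (?P<opencl>[\d.]+)"
)


def parse_opencl_line(line):
    """Return index, model, driver and OpenCL version as strings, or None."""
    match = GPU_LINE.search(line)
    return match.groupdict() if match else None


LINE_960 = (
    "Wed 21 Nov 2018 23:20:29 GMT | | OpenCL: NVIDIA GPU 0: GeForce GTX 960 "
    "(driver version 390.87, device version OpenCL 1.1 CUDA, 4535MB, "
    "4011MB available, 691 GFLOPS peak)"
)

if __name__ == "__main__":
    print(parse_opencl_line(LINE_960))
```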

#27 Re: Collatz GPU optimisations

Post by Alez »

@davidBAM I currently don't have any systems running GPU / Linux, but 390 is old. I believe 396 works well, as do the 415 / 416 beta drivers. Some have claimed problems with 410 and 411, although 410.73 is reported to be very good on GPUgrid. Make of that what you will. You could try 396 or go straight to the beta 415 / 416. I believe 410 is the latest version officially approved for LTS, so you should be able to choose it from the software center. If not, follow the instructions below.
I see your 1080s report OpenCL 1.2.

Try updating the drivers:

Code: Select all

sudo add-apt-repository ppa:graphics-drivers/ppa
sudo apt-get update
Check which drivers are available in the software center, or continue below, changing the 410 to whichever version you want:

Code: Select all

sudo apt-get install nvidia-driver-410
sudo apt-get install nvidia-modprobe --reinstall
sudo apt-get autoremove
If you want the absolute latest drivers, then install this PPA:

Code: Select all

sudo add-apt-repository ppa:xorg-edgers/ppa
Check that OpenCL hasn't been wiped after the new driver install:

Code: Select all

sudo apt install ocl-icd-libopencl1

#28 Re: Collatz GPU optimisations

Post by davidbam »

An update ... I was going fairly well until "sudo apt-get install nvidia-driver-410" errored with all sorts of dependency failures. I worked through these until 410 seemed to be installed but a reboot had boincmgr report no useable GPU.

I think I'll have a fresh try at this from a new install after the Sprint (as that machine is having some success with LHC CPU work). Problem is though - I have about 1000 Collatz jobs bunkered for the 20 fake GPU cards too :lol: :lol: I am going to be very unpopular if I reset that project

#29 Re: Collatz GPU optimisations

Post by Alez »

@davidBAM do this:

Code: Select all

sudo apt-get install libnvidia-compute-410
and then do

Code: Select all

sudo apt-get install nvidia-driver-410
sudo apt-get install nvidia-modprobe --reinstall
sudo apt-get autoremove
Hopefully that works; then check that OpenCL is installed.

Or, if it has installed, BOINC will not see the GPU unless you install nvidia-modprobe.
Bryan
Boinc Brigadier
Posts: 2621
Joined: Thu May 21, 2015 6:18 pm

#30 Re: Collatz GPU optimisations

Post by Bryan »

If you get into a bind and have to reinstall Linux, you can save all your BOINC WU. Copy the /var/lib/boinc-client folder to a USB stick. After you get the new Linux installed, install the same version of BOINC. Once you have it running, shut it down and overwrite the new boinc-client folder with the one that you saved.
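
The copy step could be scripted roughly as follows in Python. This is a sketch under assumptions: `backup_boinc_data` is a hypothetical helper, the paths in the usage lines are examples only, it would normally need to run as root to read /var/lib/boinc-client, and it is probably safest to stop the BOINC client first so files are not changing mid-copy.

```python
import shutil
from pathlib import Path


def backup_boinc_data(data_dir, backup_dir):
    """Copy the whole BOINC data folder to backup_dir, which must not exist yet."""
    data_dir, backup_dir = Path(data_dir), Path(backup_dir)
    if not data_dir.is_dir():
        raise FileNotFoundError(f"no such folder: {data_dir}")
    if backup_dir.exists():
        raise FileExistsError(f"refusing to overwrite: {backup_dir}")
    # symlinks=True copies links as links instead of following them
    shutil.copytree(data_dir, backup_dir, symlinks=True)
    return backup_dir


if __name__ == "__main__":
    # Example paths only; adjust the destination to wherever the USB stick is mounted.
    backup_boinc_data("/var/lib/boinc-client", "/media/usb/boinc-client-backup")
```

Restoring is the same copy in the other direction once the freshly installed client has been shut down.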

#31 Re: Collatz GPU optimisations

Post by davidbam »

Awesome, thanks. I did wonder about that but felt the chances of success were ... well ... limited