Feature #11194

Mixed GPU & CPU pixelpipeline for the best possible performance?

Added by Anonymous almost 3 years ago.

Status: New
Priority: Low
Assignee: -
Category: General
Target version: -
Start date: 10/05/2016
Due date:
% Done: 0%
Affected Version: 2.0.6
System: Ubuntu
bitness: 64-bit
hardware architecture: amd64/x86

Description

Hi,

I couldn't find any way to ask my question other than this tracker. I'm using darktable from the Darktable Releases PPA (v2.0.6) on Ubuntu 16.04, with an older Intel Core2Quad Q6600 processor and a cheap GeForce GT610. In the newer darktable installs OpenCL is activated by default, and it works fine functionally with my setup. However, I found that processing is slower than with the older installations where the OpenCL option was not activated.

The problem seems to be that many NVIDIA cards perform poorly on some operations (non-local means?). On my system the kernels denoiseprofile_vert and denoiseprofile_accu run slower on the GT610 using OpenCL than the same work does directly on the CPU. All other modules and filters work multiple times faster on the GT610 than on the Intel CPU. So a mixed setup would give darktable the biggest performance boost!

Below I attached part of the debug log for exporting a photo with exactly the same settings, once using the GPU and once using the CPU (denoise (profiled) activated). You can easily see that only some modules perform very slowly on the GPU.

My question is: Is it possible to activate the OpenCL option globally while disabling it for some modules? Then darktable would get a huge performance win compared to a purely GPU-based or purely CPU-based pixelpipeline. Perhaps it would be possible to implement some kind of "initial benchmarking" to check which modules to run on the GPU and which on the CPU. That would be a very luxurious option. If this isn't possible, perhaps there is a way to deactivate the GPU manually in the config files for, e.g., the denoise (profiled) filter or others? It seems that there is already a module, "gamma", that works on the CPU all the time. It also seems that I'm not the only one with this problem.
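To make the "initial benchmarking" idea concrete, here is a minimal sketch in Python. It is only an illustration of the selection logic, not darktable code: the function names `pick_devices` and `mixed_total` are hypothetical, and the timings are the wall-clock seconds of four modules copied from the two export logs below. It also assumes that switching between GPU and CPU costs nothing, which a real mixed pixelpipeline could not assume because buffers have to be copied between host and device.

```python
# Hypothetical sketch: darktable 2.0.6 offers no such per-module switch.

def pick_devices(gpu_secs, cpu_secs):
    """Return {module: "GPU" or "CPU"}, choosing the lower wall-clock time."""
    return {
        name: "GPU" if gpu_secs[name] < cpu_secs[name] else "CPU"
        for name in gpu_secs
    }


def mixed_total(gpu_secs, cpu_secs):
    """Sum of the faster time per module (ignores host/device transfer cost)."""
    return sum(min(gpu_secs[name], cpu_secs[name]) for name in gpu_secs)


# Wall-clock seconds per module, copied from the export logs in this report.
GPU_SECS = {
    "Entrauschen (Profil)": 70.229,
    "Objektivkorrektur": 3.852,
    "Lokaler Kontrast": 0.382,
    "Vignettierung": 0.122,
}
CPU_SECS = {
    "Entrauschen (Profil)": 28.618,
    "Objektivkorrektur": 2.209,
    "Lokaler Kontrast": 1.041,
    "Vignettierung": 0.690,
}

if __name__ == "__main__":
    for module, device in pick_devices(GPU_SECS, CPU_SECS).items():
        print(f"{module}: {device}")
    print(f"mixed total: {mixed_total(GPU_SECS, CPU_SECS):.3f} s")
```

A one-time benchmark like this would only be valid for the image size and module settings it was measured with, so a real implementation would probably need to re-measure or store results per device.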

Thank you for a great photography workflow application.

Regards,
Oliver.

[opencl_init] OpenCL successfully initialized.
[opencl_init] here are the internal numbers and names of OpenCL devices available to darktable:
[opencl_init] 0 'GeForce GT 610'
[opencl_init] these are your device priorities:
[opencl_init] image preview export thumbnail
[opencl_init] 0 -1 0 0
[opencl_init] FINALLY: opencl is AVAILABLE on this system.
[opencl_init] initial status of opencl enabled flag is OFF.
[opencl_create_kernel] successfully loaded kernel `zero' (0) for device 0
[opencl_create_kernel] successfully loaded kernel `splat' (1) for device 0
[opencl_create_kernel] successfully loaded kernel `blur_line' (2) for device 0
[opencl_create_kernel] successfully loaded kernel `blur_line_z' (3) for device 0
[opencl_create_kernel] successfully loaded kernel `slice' (4) for device 0
[opencl_create_kernel] successfully loaded kernel `slice_to_output' (5) for device 0
[opencl_create_kernel] successfully loaded kernel `gaussian_column_1c' (6) for device 0
[opencl_create_kernel] successfully loaded kernel `gaussian_transpose_1c' (7) for device 0
[opencl_create_kernel] successfully loaded kernel `gaussian_column_4c' (8) for device 0
[opencl_create_kernel] successfully loaded kernel `gaussian_transpose_4c' (9) for device 0
[opencl_create_kernel] successfully loaded kernel `interpolation_resample' (10) for device 0
[opencl_create_kernel] successfully loaded kernel `blendop_mask_Lab' (11) for device 0
[opencl_create_kernel] successfully loaded kernel `blendop_mask_RAW' (12) for device 0
[opencl_create_kernel] successfully loaded kernel `blendop_mask_rgb' (13) for device 0
[opencl_create_kernel] successfully loaded kernel `blendop_Lab' (14) for device 0
[opencl_create_kernel] successfully loaded kernel `blendop_RAW' (15) for device 0
[opencl_create_kernel] successfully loaded kernel `blendop_rgb' (16) for device 0
[opencl_create_kernel] successfully loaded kernel `blendop_copy_alpha' (17) for device 0
[opencl_create_kernel] successfully loaded kernel `blendop_set_mask' (18) for device 0
[opencl_create_kernel] successfully loaded kernel `colorout' (19) for device 0
[opencl_create_kernel] successfully loaded kernel `clip_rotate_bilinear' (20) for device 0
[opencl_create_kernel] successfully loaded kernel `clip_rotate_bicubic' (21) for device 0
[opencl_create_kernel] successfully loaded kernel `clip_rotate_lanczos2' (22) for device 0
[opencl_create_kernel] successfully loaded kernel `clip_rotate_lanczos3' (23) for device 0
[opencl_create_kernel] successfully loaded kernel `highpass_invert' (24) for device 0
[opencl_create_kernel] successfully loaded kernel `highpass_hblur' (25) for device 0
[opencl_create_kernel] successfully loaded kernel `highpass_vblur' (26) for device 0
[opencl_create_kernel] successfully loaded kernel `highpass_mix' (27) for device 0
[opencl_create_kernel] successfully loaded kernel `lowlight' (28) for device 0
[opencl_create_kernel] successfully loaded kernel `vignette' (29) for device 0
[opencl_create_kernel] successfully loaded kernel `tonecurve' (30) for device 0
[opencl_create_kernel] successfully loaded kernel `velvia' (31) for device 0
[opencl_create_kernel] successfully loaded kernel `zonesystem' (32) for device 0
[opencl_create_kernel] successfully loaded kernel `channelmixer' (33) for device 0
[opencl_create_kernel] successfully loaded kernel `colorin_unbound' (34) for device 0
[opencl_create_kernel] successfully loaded kernel `colorin_clipping' (35) for device 0
[opencl_create_kernel] successfully loaded kernel `graduatedndp' (36) for device 0
[opencl_create_kernel] successfully loaded kernel `graduatedndm' (37) for device 0
[opencl_create_kernel] successfully loaded kernel `lens_distort_bilinear' (38) for device 0
[opencl_create_kernel] successfully loaded kernel `lens_distort_bicubic' (39) for device 0
[opencl_create_kernel] successfully loaded kernel `lens_distort_lanczos2' (40) for device 0
[opencl_create_kernel] successfully loaded kernel `lens_distort_lanczos3' (41) for device 0
[opencl_create_kernel] successfully loaded kernel `lens_vignette' (42) for device 0
[opencl_create_kernel] successfully loaded kernel `sharpen_hblur' (43) for device 0
[opencl_create_kernel] successfully loaded kernel `sharpen_vblur' (44) for device 0
[opencl_create_kernel] successfully loaded kernel `sharpen_mix' (45) for device 0
[opencl_create_kernel] successfully loaded kernel `nlmeans_init' (46) for device 0
[opencl_create_kernel] successfully loaded kernel `nlmeans_dist' (47) for device 0
[opencl_create_kernel] successfully loaded kernel `nlmeans_horiz' (48) for device 0
[opencl_create_kernel] successfully loaded kernel `nlmeans_vert' (49) for device 0
[opencl_create_kernel] successfully loaded kernel `nlmeans_accu' (50) for device 0
[opencl_create_kernel] successfully loaded kernel `nlmeans_finish' (51) for device 0
[opencl_create_kernel] successfully loaded kernel `colorreconstruction_zero' (52) for device 0
[opencl_create_kernel] successfully loaded kernel `colorreconstruction_splat' (53) for device 0
[opencl_create_kernel] successfully loaded kernel `colorreconstruction_blur_line' (54) for device 0
[opencl_create_kernel] successfully loaded kernel `colorreconstruction_slice' (55) for device 0
[opencl_create_kernel] successfully loaded kernel `pixelmax_first' (56) for device 0
[opencl_create_kernel] successfully loaded kernel `pixelmax_second' (57) for device 0
[opencl_create_kernel] successfully loaded kernel `global_tonemap_reinhard' (58) for device 0
[opencl_create_kernel] successfully loaded kernel `global_tonemap_drago' (59) for device 0
[opencl_create_kernel] successfully loaded kernel `global_tonemap_filmic' (60) for device 0
[opencl_create_kernel] successfully loaded kernel `invert_1f' (61) for device 0
[opencl_create_kernel] successfully loaded kernel `invert_4f' (62) for device 0
[opencl_create_kernel] successfully loaded kernel `flip' (63) for device 0
[opencl_create_kernel] successfully loaded kernel `colormapping_histogram' (64) for device 0
[opencl_create_kernel] successfully loaded kernel `colormapping_mapping' (65) for device 0
[opencl_create_kernel] successfully loaded kernel `exposure' (66) for device 0
[opencl_create_kernel] successfully loaded kernel `colisa' (67) for device 0
[opencl_create_kernel] successfully loaded kernel `colorbalance' (68) for device 0
[opencl_create_kernel] successfully loaded kernel `rawprepare_1f' (69) for device 0
[opencl_create_kernel] successfully loaded kernel `rawprepare_4f' (70) for device 0
[opencl_create_kernel] successfully loaded kernel `colorcorrection' (71) for device 0
[opencl_create_kernel] successfully loaded kernel `colorzones' (72) for device 0
[opencl_create_kernel] successfully loaded kernel `highlights_1f' (73) for device 0
[opencl_create_kernel] successfully loaded kernel `highlights_4f' (74) for device 0
[opencl_create_kernel] successfully loaded kernel `soften_overexposed' (75) for device 0
[opencl_create_kernel] successfully loaded kernel `soften_hblur' (76) for device 0
[opencl_create_kernel] successfully loaded kernel `soften_vblur' (77) for device 0
[opencl_create_kernel] successfully loaded kernel `soften_mix' (78) for device 0
[opencl_create_kernel] successfully loaded kernel `clip_and_zoom_demosaic_half_size' (79) for device 0
[opencl_create_kernel] successfully loaded kernel `ppg_demosaic_green' (80) for device 0
[opencl_create_kernel] successfully loaded kernel `green_equilibration' (81) for device 0
[opencl_create_kernel] successfully loaded kernel `pre_median' (82) for device 0
[opencl_create_kernel] successfully loaded kernel `ppg_demosaic_green_median' (83) for device 0
[opencl_create_kernel] successfully loaded kernel `ppg_demosaic_redblue' (84) for device 0
[opencl_create_kernel] successfully loaded kernel `clip_and_zoom' (85) for device 0
[opencl_create_kernel] successfully loaded kernel `border_interpolate' (86) for device 0
[opencl_create_kernel] successfully loaded kernel `color_smoothing' (87) for device 0
[opencl_create_kernel] successfully loaded kernel `passthrough_monochrome' (88) for device 0
[opencl_create_kernel] successfully loaded kernel `clip_and_zoom_demosaic_passthrough_monochrome' (89) for device 0
[opencl_create_kernel] successfully loaded kernel `splittoning' (90) for device 0
[opencl_create_kernel] successfully loaded kernel `monochrome_filter' (91) for device 0
[opencl_create_kernel] successfully loaded kernel `monochrome' (92) for device 0
[opencl_create_kernel] successfully loaded kernel `basecurve' (93) for device 0
[opencl_create_kernel] successfully loaded kernel `colorcontrast' (94) for device 0
[opencl_create_kernel] successfully loaded kernel `vibrance' (95) for device 0
[opencl_create_kernel] successfully loaded kernel `shadows_highlights_mix' (96) for device 0
[opencl_create_kernel] successfully loaded kernel `levels' (97) for device 0
[opencl_create_kernel] successfully loaded kernel `relight' (98) for device 0
[opencl_create_kernel] successfully loaded kernel `colorize' (99) for device 0
[opencl_create_kernel] successfully loaded kernel `whitebalance_4f' (100) for device 0
[opencl_create_kernel] successfully loaded kernel `whitebalance_1f' (101) for device 0
[opencl_create_kernel] successfully loaded kernel `lowpass_mix' (102) for device 0
[opencl_create_kernel] successfully loaded kernel `denoiseprofile_precondition' (103) for device 0
[opencl_create_kernel] successfully loaded kernel `denoiseprofile_init' (104) for device 0
[opencl_create_kernel] successfully loaded kernel `denoiseprofile_dist' (105) for device 0
[opencl_create_kernel] successfully loaded kernel `denoiseprofile_horiz' (106) for device 0
[opencl_create_kernel] successfully loaded kernel `denoiseprofile_vert' (107) for device 0
[opencl_create_kernel] successfully loaded kernel `denoiseprofile_accu' (108) for device 0
[opencl_create_kernel] successfully loaded kernel `denoiseprofile_finish' (109) for device 0
[opencl_create_kernel] successfully loaded kernel `denoiseprofile_backtransform' (110) for device 0
[opencl_create_kernel] successfully loaded kernel `denoiseprofile_decompose' (111) for device 0
[opencl_create_kernel] successfully loaded kernel `denoiseprofile_synthesize' (112) for device 0
[opencl_create_kernel] successfully loaded kernel `denoiseprofile_reduce_first' (113) for device 0
[opencl_create_kernel] successfully loaded kernel `denoiseprofile_reduce_second' (114) for device 0
[opencl_create_kernel] successfully loaded kernel `eaw_decompose' (115) for device 0
[opencl_create_kernel] successfully loaded kernel `eaw_synthesize' (116) for device 0
[opencl_create_kernel] successfully loaded kernel `overexposed' (117) for device 0
[opencl_create_kernel] successfully loaded kernel `profilegamma' (118) for device 0
[opencl_create_kernel] successfully loaded kernel `bloom_threshold' (119) for device 0
[opencl_create_kernel] successfully loaded kernel `bloom_hblur' (120) for device 0
[opencl_create_kernel] successfully loaded kernel `bloom_vblur' (121) for device 0
[opencl_create_kernel] successfully loaded kernel `bloom_mix' (122) for device 0
[opencl_create_kernel] successfully loaded kernel `borders_fill' (123) for device 0
[dev] took 0,268 secs (0,424 CPU) to load the image.
[export] creating pixelpipe took 0,188 secs (0,256 CPU)

[opencl_update_enabled] enabled flag set to ON
[pixelpipe_process] [export] using device 0
[dev_pixelpipe] took 0,000 secs (0,000 CPU) initing base buffer [export]
[dev_pixelpipe] took 0,036 secs (0,040 CPU) processed `Raw-Schwarz-/Weißpunkt' on GPU, blended on GPU [export]
[dev_pixelpipe] took 0,030 secs (0,040 CPU) processed `Weißabgleich' on GPU, blended on GPU [export]
[dev_pixelpipe] took 0,031 secs (0,012 CPU) processed `Spitzlicht-Rekonstruktion' on GPU, blended on GPU [export]
[dev_pixelpipe] took 0,251 secs (0,124 CPU) processed `Entrastern' on GPU, blended on GPU [export]
[dev_pixelpipe] took 70,229 secs (42,540 CPU) processed `Entrauschen (Profil) ' on GPU, blended on GPU [export]
[dev_pixelpipe] took 0,083 secs (0,048 CPU) processed `Belichtung ' on GPU, blended on GPU [export]
[dev_pixelpipe] took 3,852 secs (3,648 CPU) processed `Objektivkorrektur ' on GPU, blended on GPU [export]
[dev_pixelpipe] took 0,145 secs (0,104 CPU) processed `Basiskurve ' on GPU, blended on GPU [export]
[dev_pixelpipe] took 0,087 secs (0,060 CPU) processed `Eingabefarbprofil' on GPU, blended on GPU [export]
[dev_pixelpipe] took 0,084 secs (0,040 CPU) processed `Lebendigkeit ' on GPU, blended on GPU [export]
[dev_pixelpipe] took 0,382 secs (0,248 CPU) processed `Lokaler Kontrast ' on GPU, blended on GPU [export]
[dev_pixelpipe] took 0,148 secs (0,088 CPU) processed `Farbkurve ' on GPU, blended on GPU [export]
[dev_pixelpipe] took 0,383 secs (0,232 CPU) processed `Schärfen ' on GPU, blended on GPU [export]
[dev_pixelpipe] took 0,239 secs (0,140 CPU) processed `Ausgabefarbprofil' on GPU, blended on GPU [export]
[dev_pixelpipe] took 0,122 secs (0,068 CPU) processed `Vignettierung ' on GPU, blended on GPU [export]
[dev_pixelpipe] took 0,239 secs (0,364 CPU) processed `Gamma' on CPU, blended on CPU [export]
[opencl_profiling] spent 0,0137 seconds in [Write Image (from host to device)]
[opencl_profiling] spent 0,0207 seconds in rawprepare_1f
[opencl_profiling] spent 0,0273 seconds in whitebalance_1f
[opencl_profiling] spent 0,0290 seconds in highlights_1f
[opencl_profiling] spent 0,0806 seconds in ppg_demosaic_green
[opencl_profiling] spent 0,1600 seconds in ppg_demosaic_redblue
[opencl_profiling] spent 0,0084 seconds in border_interpolate
[opencl_profiling] spent 0,0825 seconds in denoiseprofile_precondition
[opencl_profiling] spent 0,0276 seconds in denoiseprofile_init
[opencl_profiling] spent 9,4712 seconds in denoiseprofile_dist
[opencl_profiling] spent 5,4537 seconds in denoiseprofile_horiz
[opencl_profiling] spent 37,4401 seconds in denoiseprofile_vert
[opencl_profiling] spent 17,4766 seconds in denoiseprofile_accu
[opencl_profiling] spent 0,1118 seconds in denoiseprofile_finish
[opencl_profiling] spent 0,0814 seconds in exposure
[opencl_profiling] spent 0,2115 seconds in [Write Buffer (from host to device)]
[opencl_profiling] spent 0,1115 seconds in lens_vignette
[opencl_profiling] spent 3,2186 seconds in lens_distort_lanczos3
[opencl_profiling] spent 0,1425 seconds in basecurve
[opencl_profiling] spent 0,0827 seconds in colorin_unbound
[opencl_profiling] spent 0,0825 seconds in vibrance
[opencl_profiling] spent 0,0000 seconds in zero
[opencl_profiling] spent 0,2960 seconds in splat
[opencl_profiling] spent 0,0001 seconds in [Copy Buffer to Buffer (on device)]
[opencl_profiling] spent 0,0010 seconds in blur_line
[opencl_profiling] spent 0,0000 seconds in blur_line_z
[opencl_profiling] spent 0,0828 seconds in slice
[opencl_profiling] spent 0,1446 seconds in tonecurve
[opencl_profiling] spent 0,1394 seconds in sharpen_hblur
[opencl_profiling] spent 0,0902 seconds in sharpen_vblur
[opencl_profiling] spent 0,1511 seconds in sharpen_mix
[opencl_profiling] spent 0,2352 seconds in colorout
[opencl_profiling] spent 0,1195 seconds in vignette
[opencl_profiling] spent 0,1694 seconds in [Read Image (from device to host)]
[opencl_profiling] spent 75,7629 seconds totally in command queue (with 0 events missing)
[dev_process_export] pixel pipeline processing took 76,571 secs (48,644 CPU)
[export_job] exported to [...]IMG_5234_02.jpg

[dev] took 0,000 secs (0,000 CPU) to load the image.
[export] creating pixelpipe took 0,189 secs (0,260 CPU)
[opencl_update_enabled] enabled flag set to OFF
[pixelpipe_process] [export] using device '-1
[dev_pixelpipe] took 0,000 secs (0,000 CPU) initing base buffer [export]
[dev_pixelpipe] took 0,038 secs (0,072 CPU) processed `Raw-Schwarz-/Weißpunkt' on CPU, blended on CPU [export]
[dev_pixelpipe] took 0,037 secs (0,088 CPU) processed `Weißabgleich' on CPU, blended on CPU [export]
[dev_pixelpipe] took 0,020 secs (0,040 CPU) processed `Spitzlicht-Rekonstruktion' on CPU, blended on CPU [export]
[dev_pixelpipe] took 0,337 secs (0,792 CPU) processed `Entrastern' on CPU, blended on CPU [export]
[dev_pixelpipe] took 28,618 secs (108,888 CPU) processed `Entrauschen (Profil) ' on CPU, blended on CPU [export]
[dev_pixelpipe] took 0,119 secs (0,416 CPU) processed `Belichtung ' on CPU, blended on CPU [export]
[dev_pixelpipe] took 2,209 secs (8,028 CPU) processed `Objektivkorrektur ' on CPU, blended on CPU [export]
[dev_pixelpipe] took 0,119 secs (0,404 CPU) processed `Basiskurve ' on CPU, blended on CPU [export]
[dev_pixelpipe] took 0,098 secs (0,292 CPU) processed `Eingabefarbprofil' on CPU, blended on CPU [export]
[dev_pixelpipe] took 0,127 secs (0,400 CPU) processed `Lebendigkeit ' on CPU, blended on CPU [export]
[dev_pixelpipe] took 1,041 secs (3,984 CPU) processed `Lokaler Kontrast ' on CPU, blended on CPU [export]
[dev_pixelpipe] took 0,126 secs (0,412 CPU) processed `Farbkurve ' on CPU, blended on CPU [export]
[dev_pixelpipe] took 0,287 secs (1,004 CPU) processed `Schärfen ' on CPU, blended on CPU [export]
[dev_pixelpipe] took 0,173 secs (0,600 CPU) processed `Ausgabefarbprofil' on CPU, blended on CPU [export]
[dev_pixelpipe] took 0,690 secs (2,248 CPU) processed `Vignettierung ' on CPU, blended on CPU [export]
[dev_pixelpipe] took 0,061 secs (0,164 CPU) processed `Gamma' on CPU, blended on CPU [export]
[dev_process_export] pixel pipeline processing took 34,357 secs (128,712 CPU)
[export_job] exported to [...]IMG_5234_02.jpg
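For anyone who wants to compare the two runs module by module, the `[dev_pixelpipe]` lines above can be read with a short script. This is a rough sketch that assumes the line layout shown in these logs, including the comma as decimal separator that this locale produces. The function name `parse_timings` is made up for the example.

```python
import re

# Matches lines such as:
# [dev_pixelpipe] took 0,251 secs (0,124 CPU) processed `Entrastern' on GPU, blended on GPU [export]
LINE = re.compile(
    r"\[dev_pixelpipe\] took (?P<wall>\d+,\d+) secs \((?P<cpu>\d+,\d+) CPU\) "
    r"processed `(?P<module>[^']+)' on (?P<device>GPU|CPU)"
)


def parse_timings(log_text):
    """Return {module name: (device, wall-clock seconds)} for processed modules."""
    timings = {}
    for match in LINE.finditer(log_text):
        # The log uses a decimal comma, so convert it before calling float().
        wall = float(match.group("wall").replace(",", "."))
        # Some module names carry a trailing space inside the quotes.
        timings[match.group("module").strip()] = (match.group("device"), wall)
    return timings


sample = """\
[dev_pixelpipe] took 0,000 secs (0,000 CPU) initing base buffer [export]
[dev_pixelpipe] took 70,229 secs (42,540 CPU) processed `Entrauschen (Profil) ' on GPU, blended on GPU [export]
[dev_pixelpipe] took 0,239 secs (0,364 CPU) processed `Gamma' on CPU, blended on CPU [export]
"""

if __name__ == "__main__":
    for module, (device, secs) in parse_timings(sample).items():
        print(f"{module}: {secs:.3f} s on {device}")
```

Running it once over the GPU log and once over the CPU log gives two dictionaries that can be compared key by key.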
