Image import time grows quadratically with the size of the image collection
As discussed on the darktable-users list:
Importing images becomes slower and slower as the database grows. The reason for this seems to be that image.c:dt_image_import() does a SQL SELECT for images with the same filename on each import. Apparently sqlite is doing a full table scan to respond to that SELECT. Adding an index to the filename field in the images table should fix this.
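A minimal sketch of the suspected behaviour, using Python's sqlite3 module rather than darktable's own code. The table layout and the index name images_filename_index are assumptions made for illustration; the real images table has many more columns. If every import scans the whole table, n imports cost on the order of n^2 row visits in total, which would match the observed slowdown.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return SQLite's query plan for `sql` as a single string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The human-readable plan text is the last column of each row.
    return " | ".join(row[-1] for row in rows)


def demo():
    conn = sqlite3.connect(":memory:")
    # Hypothetical, simplified schema (assumption, not darktable's real one).
    conn.execute(
        "CREATE TABLE images (id INTEGER PRIMARY KEY, film_id INTEGER, filename TEXT)"
    )
    conn.executemany(
        "INSERT INTO images (film_id, filename) VALUES (?, ?)",
        [(1, "%d.jpg" % i) for i in range(1000)],
    )
    lookup = "SELECT id FROM images WHERE filename = ?"

    # Without an index on filename, SQLite has to scan the table.
    before = query_plan(conn, lookup, ("42.jpg",))

    # With the index, the same lookup becomes an index search.
    conn.execute("CREATE INDEX images_filename_index ON images (filename)")
    after = query_plan(conn, lookup, ("42.jpg",))

    conn.close()
    return before, after


if __name__ == "__main__":
    before, after = demo()
    print("without index:", before)
    print("with index:   ", after)
```

The exact wording of the plan differs between SQLite versions, but the first plan should report a scan of images and the second a search using the new index.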
#1 Updated by Tobias Ellinghaus about 6 years ago
- % Done changed from 0 to 50
- Assignee set to Tobias Ellinghaus
- Status changed from New to In Progress
I am not sure whether an index on filename or on (film_id, filename) would be better, but since we already have one on film_id, and I hope that sqlite is smart, I will just add one on filename. Please report back if you can try it with master once it's in. I don't have a big collection around to test the speed improvement.
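One way to compare the two options is to ask sqlite for its query plan with each index in place. This is only a sketch: the schema and index names are made up, and it assumes the lookup filters on both film_id and filename, which may not be what dt_image_import() actually does.

```python
import sqlite3


def plan_with_index(index_sql):
    """Create a tiny hypothetical images table with one index and
    return the query plan for a lookup by film_id and filename."""
    conn = sqlite3.connect(":memory:")
    # Simplified, assumed schema; not darktable's real table definition.
    conn.execute(
        "CREATE TABLE images (id INTEGER PRIMARY KEY, film_id INTEGER, filename TEXT)"
    )
    conn.execute(index_sql)
    rows = conn.execute(
        "EXPLAIN QUERY PLAN "
        "SELECT id FROM images WHERE film_id = ? AND filename = ?",
        (1, "0.jpg"),
    ).fetchall()
    conn.close()
    # The human-readable plan text is the last column of each row.
    return " | ".join(row[-1] for row in rows)


# Option 1: index on filename only (index name is hypothetical).
single = plan_with_index(
    "CREATE INDEX images_filename_index ON images (filename)"
)
# Option 2: composite index on (film_id, filename) (name is hypothetical).
composite = plan_with_index(
    "CREATE INDEX images_film_filename_index ON images (film_id, filename)"
)

if __name__ == "__main__":
    print("filename only:      ", single)
    print("(film_id, filename):", composite)
```

With the filename index the plan constrains only filename and checks film_id on the matching rows. With the composite index both columns are resolved inside the index. Either way the full table scan is gone, so the difference should only matter when many films contain files with the same name.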
#4 Updated by Tobias Ellinghaus about 6 years ago
- I added printf("XXXX %f\n", dt_get_wtime()); to dt_image_import() to get the wall clock time when each image got imported
- I created 10k JPEG files using
convert -size 6000x4000 -compress LZW xc:white 0.jpg; for ((i=1; i<10000; i++)); do cp 0.jpg $i.jpg; done;
- I imported those into darktable running it as
/opt/darktable/bin/darktable | grep XXXX > /tmp/benchmark.txt (/tmp/ is a ramdisk)
- I imported the resulting file into libreoffice and generated a graph from the times. It's attached.
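The benchmark file from the steps above can also be checked without a spreadsheet. The sketch below assumes each line looks like "XXXX <seconds>", as produced by the printf above; the function names are made up for this example. It compares the time spent on the second half of the imports with the time spent on the first half: for linear growth the ratio is about 1, for quadratic growth about 3.

```python
def parse_wall_times(lines):
    """Extract the timestamps from lines such as 'XXXX 1395612345.123456'."""
    times = []
    for line in lines:
        parts = line.split()
        if len(parts) == 2 and parts[0] == "XXXX":
            times.append(float(parts[1]))
    return times


def half_ratio(times):
    """Time taken by the second half of the imports divided by the
    time taken by the first half."""
    if len(times) < 3:
        raise ValueError("need at least three timestamps")
    mid = len(times) // 2
    first = times[mid] - times[0]
    second = times[-1] - times[mid]
    return second / first


if __name__ == "__main__":
    # Path taken from the benchmark command above.
    with open("/tmp/benchmark.txt") as f:
        print("second half / first half:", half_ratio(parse_wall_times(f)))
```

This is a rough heuristic rather than a proper curve fit, but it is enough to tell the two growth patterns apart.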
The result looked quite linear so I am not sure why you get quadratic runtimes.
In the "Affected Version" field you said 1.4.1, did you also try it with a compile from git master (I guess that the answer is yes, as you tried the indexing commit)?
#5 Updated by Pedro Côrte-Real about 6 years ago
- File FirstRunChart.png added
- File 20140323 DarktableImportStatistics.gnumeric added
I've attached my chart and gnumeric spreadsheet. I made it by using a stopwatch and noting the elapsed time each time a given number of images had been imported (as the counter in the interface updated). It's clearly quadratic (it fits perfectly). This graph was done with 1.4.1, as was the bug report, since that was what I had installed. I tested it with master, though, and the results were the same.
I'm not sure how you imported into darktable but is it possible the lighttable view update wasn't being run? You were in map mode perhaps? Or you've disabled the dt_image_import signal? It could also be that for some reason your sqlite is faster than mine and so the penalty isn't so large. Maybe it's a newer version or running under a VM makes it much slower for me?