Feature #9460

tool to regroup confused images

Added by Antoine Beaupré over 4 years ago. Updated over 4 years ago.

Status: Patch attached
Priority: Low
Assignee: -
Category: General
Target version:
Start date: 06/09/2013
Due date:
% Done: 70%
Affected Version:
System:
bitness: 64-bit
hardware architecture: amd64/x86

Description

During imports, Darktable may fail to recognize JPG versions of RAW files. I stumbled upon this problem (bug?) when importing from Shotwell (more of that fun here: http://www.darktable.org/redmine/projects/users/wiki/Import_from_other_software).

The problem is that Darktable imports the JPG and the CR2 as two distinct images. I like that DT still imports the JPG, but the two should be grouped. I have no clue how to fix this in the importer, but a simple way seems to be to fix the grouping directly in the DB.

The attached script does just that, and fairly safely. Sample run:

anarcat@marcos:~$ python bin/darktable-no-dupe.py -v -l 1 -d -n
loading darktable database...
/home/anarcat/Photos/2012/11/12/IMG_0283.CR2    4027    4027    /home/anarcat/Photos/2012/11/12/IMG_0283_CR2.jpg      4028     4028
would be executing UPDATE images SET group_id = 4027 WHERE id = 4028
grouped 0 images
anarcat@marcos:~$ python bin/darktable-no-dupe.py -v -l 1 -d
loading darktable database...
/home/anarcat/Photos/2012/11/12/IMG_0283.CR2    4027    4027    /home/anarcat/Photos/2012/11/12/IMG_0283_CR2.jpg      4028     4028
would be executing UPDATE images SET group_id = 4027 WHERE id = 4028
grouped 1 images
anarcat@marcos:~$ python bin/darktable-no-dupe.py -v -l 10
loading darktable database...
executing UPDATE images SET group_id = 4065 WHERE id = 4067
executing UPDATE images SET group_id = 4069 WHERE id = 4071
executing UPDATE images SET group_id = 4073 WHERE id = 4074
executing UPDATE images SET group_id = 4076 WHERE id = 4077
executing UPDATE images SET group_id = 4079 WHERE id = 4080
executing UPDATE images SET group_id = 4082 WHERE id = 4083
executing UPDATE images SET group_id = 4085 WHERE id = 4087
executing UPDATE images SET group_id = 4089 WHERE id = 4090
executing UPDATE images SET group_id = 4095 WHERE id = 4097
executing UPDATE images SET group_id = 4099 WHERE id = 4100
grouped 10 images
anarcat@marcos:~$ python bin/darktable-no-dupe.py  -l 1000
loading darktable database...
grouped 281 images
anarcat@marcos:~$ python bin/darktable-no-dupe.py  -l 1
loading darktable database...
grouped 0 images

It only works for CR2 images now, but could easily be adapted to others...
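
For readers without the attachment, here is a minimal sketch of the grouping step, not the attached script itself. The `images` table and its `id` and `group_id` columns are taken from the UPDATE statements in the sample run above; the function name `regroup`, the `pairs` argument and the `dry_run` parameter are hypothetical names chosen for illustration.

```python
import sqlite3


def regroup(conn, pairs, dry_run=True):
    """Set each JPEG's group_id to the id of its RAW image.

    pairs is an iterable of (raw_id, jpg_id) tuples (hypothetical input).
    Returns the number of rows actually changed; a dry run changes nothing.
    """
    grouped = 0
    for raw_id, jpg_id in pairs:
        if dry_run:
            print(f"would be executing UPDATE images SET group_id = {raw_id} WHERE id = {jpg_id}")
            continue
        # Skip rows that are already in the right group so reruns count 0.
        cur = conn.execute(
            "UPDATE images SET group_id = ? WHERE id = ? AND group_id != ?",
            (raw_id, jpg_id, raw_id),
        )
        grouped += cur.rowcount
    if not dry_run:
        conn.commit()  # never commit on a dry run
    return grouped


# Demo on a throwaway in-memory table, not on a real darktable library.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE images (id INTEGER PRIMARY KEY, group_id INTEGER)")
conn.executemany("INSERT INTO images VALUES (?, ?)", [(4027, 4027), (4028, 4028)])
print("grouped", regroup(conn, [(4027, 4028)], dry_run=True), "images")
```

Working on a copy of the database first is the safer choice, since the update rewrites group membership in place.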

darktable-no-dupe.py (1.65 KB) Antoine Beaupré, 06/09/2013 05:40 AM

darktable-no-dupe.py (1.68 KB) Antoine Beaupré, 06/09/2013 06:58 AM

History

#1 Updated by Antoine Beaupré over 4 years ago

small change: don't commit on dry runs.

#2 Updated by Tobias Ellinghaus over 4 years ago

  • % Done changed from 0 to 20
  • Status changed from New to Incomplete

What do the filenames of your CR2 and JPEG images look like? In theory darktable should already group them if they only differ by file extension.

#3 Updated by Antoine Beaupré over 4 years ago

  • % Done changed from 20 to 70
  • Status changed from Incomplete to Patch attached

The files are named .CR2 and _CR2.jpg; they come from a Canon PowerShot G12.
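
To make the naming pattern concrete, here is a rough sketch of how a `_CR2.jpg` name could be mapped back to its RAW name. The helper `raw_name_for` and the `RAW_EXTENSIONS` tuple are hypothetical; only the CR2 case is confirmed in this thread, and any other extension passed in is an assumption on the caller's side.

```python
import os

# Only CR2 is confirmed here; extend at your own risk.
RAW_EXTENSIONS = ("CR2",)


def raw_name_for(jpeg_name, raw_extensions=RAW_EXTENSIONS):
    """Return the RAW filename that a name like IMG_0283_CR2.jpg refers to.

    Returns None when the name is not a JPEG or has no _<RAWEXT> suffix.
    """
    stem, ext = os.path.splitext(jpeg_name)
    if ext.lower() not in (".jpg", ".jpeg"):
        return None
    for raw_ext in raw_extensions:
        suffix = "_" + raw_ext
        if stem.upper().endswith(suffix.upper()):
            # Keep the extension's original letter case from the JPEG name.
            return stem[: -len(suffix)] + "." + stem[-len(raw_ext):]
    return None


print(raw_name_for("IMG_0283_CR2.jpg"))
```

A JPEG that differs from the RAW only by extension (IMG_0283.jpg) yields None here, since that case should already be grouped at import.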

But they were imported from a Shotwell directory, and I think that's part of the problem. In fact, sometimes there were three images for the RAW: the one from the camera, and another one generated by Shotwell (!), which was a problem for a time in Shotwell (see http://redmine.yorba.org/issues/4261).

So this is not about a "bug" in Darktable, but more a way of cleaning up photos that may have been duplicated, whatever the reason. I am also considering extending this tool to actually delete JPG images that are copies of RAW files or simply duplicates.

Would that be useful in the tools/ directory?
