How to remove duplicate imported photos in Shotwell











I've noticed that Shotwell has imported many images twice (e.g. from my camera's SD card). Apparently the duplicate detection fails once a photo has been imported, tagged and then re-imported.



I have "write meta data tags" enabled in the settings. If I import a photo test-images.jpg and add tags to it, the duplicate detection will not recognize the photo when the same file is imported again.
On the second import the file is named test-images-1.jpg and placed in the library folder according to the active rules (not necessarily in the same folder as the first copy).



test-images.jpg and test-images-1.jpg contain the same image data, but because of the added tags/metadata the files are no longer byte-identical, so a duplicate search that compares whole files (e.g. by md5 hash) will not find them.



The usage scenario that caused the duplicates is as follows:




  1. I take pictures with my phone.

  2. I import the photos from my phone and add tags, but leave the images on the phone because I want to keep them for sharing etc.

  3. I add further tags to the imported photos.

  4. After some weeks I repeat the import from the phone, and old photos that I have already imported are imported again (with '-1.jpg' or '-2.jpg' appended to the name).


How do I clean up the duplicates?
A file-name-based search would be possible, but I can't rule out that I have imported a file whose name ends in -1 and that is not a duplicate.
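A minimal sketch of such a file-name-based search in Python, assuming the library lives under one directory and that re-imported files get a numeric suffix such as -1 or _2 before the extension. The function name suffix_candidates and the example path are made up for illustration. It only reports suffixed files for which an un-suffixed sibling exists somewhere in the tree; a name match alone does not prove that two files show the same picture, so nothing is deleted.

```python
import re
from pathlib import Path

# Matches a stem such as "test-images-1" or "test-images_2" (assumed naming rule).
SUFFIXED_STEM = re.compile(r"^(?P<stem>.+)[-_]\d+$")


def suffix_candidates(library):
    """Return (candidate, originals) pairs for suffixed files with an un-suffixed sibling."""
    # Index every file in the tree by its lower-cased name, regardless of folder.
    by_name = {}
    for path in sorted(Path(library).rglob("*")):
        if path.is_file():
            by_name.setdefault(path.name.lower(), []).append(path)

    pairs = []
    for name, paths in by_name.items():
        stem, ext = Path(name).stem, Path(name).suffix
        match = SUFFIXED_STEM.match(stem)
        if match is None:
            continue
        # Look for e.g. "test-images.jpg" when the candidate is "test-images-1.jpg".
        originals = by_name.get(match.group("stem") + ext, [])
        if originals:
            pairs.extend((candidate, originals) for candidate in paths)
    return pairs


# Example (the path is an assumption):
# for candidate, originals in suffix_candidates(Path.home() / "Pictures"):
#     print(candidate, "<-", originals[0])
```

The resulting list is meant for manual review, for instance together with a comparison of the actual image data.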



How can I clean up my photo library? I tried the search function in Shotwell, but with more than 1000 photos there must be a better, more reliable, less error-prone and simpler way.



I'm not too worried about tags getting lost; typically the second import (the duplicate) has no tags applied.







shotwell photo-management tag duplicate






asked Jun 30 '14 at 11:40 by seb, edited Oct 12 '14 at 13:08












  • possible duplicate of How to use fdupes?
    – Panther
    Jun 30 '14 at 16:08






  • I disagree that this is a duplicate. As described in the linked bug, Shotwell does detect duplicates, but once the first imported file has been tagged (and the tag written to the file), the fingerprint/hash of the file changes. Thus the first imported file differs from the next imported file, although they were identical when imported.
    – seb
    Jul 1 '14 at 11:05










  • If the files differ in their hash and you do not see some sort of pattern in the duplicate files, you will have to resolve the problem manually. Can you update your question with the information you posted in your comment, and post additional information on the files?
    – Panther
    Jul 1 '14 at 12:29



















4 Answers

















I ran into the same problem a few weeks ago. The solution I found is basic but works: inside Shotwell, create a new saved search that displays all pictures that are not tagged AND whose filenames end with "_1.jpg".
You can then delete all files Shotwell lists for this search, but be careful and make a backup first ;-)
In my case I deleted 2000+ pictures!







answered Nov 9 '14 at 22:59 by Cowboydan
























Sort of spammy, but I ran into the same problem a few months ago and wrote a small utility that does just that:

https://github.com/jesjimher/imgdupes

It's a Python script that scans a directory tree looking for duplicates. Its syntax is intentionally similar to that of fdupes, with the difference that imgdupes ignores all metadata and analyzes only the image data chunk of a JPEG file. This means that two versions of the same image with different tags, rotation flags, dates, etc. will be reported as duplicates even if the physical files differ (and are thus not detected as duplicates by fdupes or Shotwell).
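As a rough illustration of the idea (not the actual code of imgdupes), the following Python sketch hashes a JPEG while skipping the segments that usually carry metadata. It assumes that tagging only rewrites APPn segments (Exif, XMP, IPTC) and COM comments and leaves the remaining segments and the compressed scan data untouched; the function name jpeg_image_hash is made up for this example, and the parser ignores unusual files such as those with standalone markers before the scan.

```python
import hashlib
import struct


def jpeg_image_hash(data):
    """Hash JPEG bytes while skipping metadata segments (APPn and COM)."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG stream (missing SOI marker)")
    digest = hashlib.sha256()
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError(f"expected a marker at offset {pos}")
        marker = data[pos + 1]
        if marker == 0xFF:  # fill byte in front of a marker
            pos += 1
            continue
        if marker == 0xDA:  # SOS: everything that follows is scan data
            digest.update(data[pos:])
            break
        # The 2-byte big-endian length counts itself but not the marker.
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        is_metadata = 0xE0 <= marker <= 0xEF or marker == 0xFE
        if not is_metadata:
            digest.update(data[pos:pos + 2 + length])
        pos += 2 + length
    return digest.hexdigest()


# Example (file names taken from the question):
# from pathlib import Path
# same = (jpeg_image_hash(Path("test-images.jpg").read_bytes())
#         == jpeg_image_hash(Path("test-images-1.jpg").read_bytes()))
```

Two files that yield the same digest here differ at most in their metadata segments, which is the case a whole-file checksum misses.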



It was recently renamed to jpegdupes and is now on PyPI, so scanning a tree for duplicated images can be done like this:

    sudo pip install jpegdupes
    jpegdupes -d ~/Photos/

(or whatever your path is)

It looks for JPEGs that are actually the same picture (differing only in metadata), interactively shows the differences and asks which version to keep.
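The grouping step behind such a tool can be sketched roughly as follows in Python. This is not the code of jpegdupes; the names whole_file_digest and duplicate_groups are invented for the example. The key function is pluggable: with the default whole-file digest the sketch behaves like a plain checksum comparison, while a metadata-insensitive key would also catch tagged copies. Listing the largest file first rests on the assumption that the tagged copy carries more metadata and is therefore larger, which is plausible but not guaranteed.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def whole_file_digest(path):
    """Default key: digest of the complete file, metadata included."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def duplicate_groups(paths, key=whole_file_digest):
    """Group paths by key(path) and return only groups with more than one member."""
    groups = defaultdict(list)
    for path in paths:
        groups[key(path)].append(Path(path))

    result = []
    for members in groups.values():
        if len(members) > 1:
            # Largest file first: assumed (not guaranteed) to be the tagged copy.
            members.sort(key=lambda p: p.stat().st_size, reverse=True)
            result.append(members)
    return result


# Example (the path is an assumption):
# for group in duplicate_groups(Path.home().glob("Photos/**/*.jpg")):
#     print("keep?", group[0], "duplicates:", group[1:])
```

The sketch only reports groups; deciding which copy to keep is left to the user, as in the interactive tool.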



Hope it helps.







answered Dec 9 '14 at 12:11 by jesjimher, edited 8 hours ago












  • Great, I will give this a shot, in addition to the other options I have tried, e.g. the search suggestion.
    – seb
    Jan 27 '17 at 10:28


















You could just copy the tagged files back to your phone, so that they are no longer different. I think Shotwell ought to cope with its own tagging, though, and this does look like a bug to me.
I have a similar problem, but with Shotwell re-developing camera raw files every time it's run.







answered Dec 9 '14 at 12:01 by Mark Williams






















I ran into the same problem and solved it by exporting all images from Shotwell into another folder. Even if you have duplicates, Shotwell shows them only once. For instance, I had 64K files in the folder but Shotwell showed only 32K. So I selected all of them and exported them, preserving size, name, metadata etc.
The only downside: if you had a complicated folder structure and want to keep it, this solution may not work for you. I have everything in one folder now. BTW, it looks like this bug is fixed now.







answered Nov 1 at 0:24 by Stan M






























                                 
