petal
Sun, Nov 25 '12, 01:45
Dupe detection?
Sorry, not sure where to ask this--does this booru do dupe detection? One of the big reasons I haven't uploaded anything for a while is because it's a pain to check all the relevant pictures and make sure I'm not duping anything.
Mindwipe
Sun, Nov 25 '12, 01:59
Yes and no. It can detect duplicate FILES, but not duplicate pics. In other words, it doesn't actually examine the pic to see if it matches one that's already posted. It just checks file properties, or something (I honestly don't know exactly what it looks for).

Anyway, duplicates are bound to pop up eventually. We've already had some, and they were removed promptly. I mean, it'd be a bummer if all the pics you uploaded were dupes, but there'd be no harm done. So I wouldn't worry too much about it.
petal
Sun, Nov 25 '12, 02:08
I would assume it checks the MD5 hash, since the full Gelbooru software supports searching by MD5. It's not the most reliable method of dupe detection: the hash is computed from the file's exact bytes, so it stops matching if the image is edited in any way. Even re-saving the image without changing anything visible is usually enough, because the re-encoded file rarely comes out byte-identical. It's very easy to implement, though, and it's a lot better than nothing! Thanks for letting me know.
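
To make that concrete, here is a minimal sketch of hash-based dupe detection in Python. It assumes the booru does something along these lines; the DupeIndex class and its add method are made up for illustration and are not taken from the Gelbooru code.

```python
import hashlib
from typing import Dict, Optional


def md5_hex(data: bytes) -> str:
    """Return the MD5 digest of raw file bytes as a hex string."""
    return hashlib.md5(data).hexdigest()


class DupeIndex:
    """Hypothetical in-memory index of hashes for files already uploaded."""

    def __init__(self) -> None:
        # Maps MD5 hex digest -> name of the first file seen with that digest.
        self._seen: Dict[str, str] = {}

    def add(self, name: str, data: bytes) -> Optional[str]:
        """Register a file; return the earlier file's name if it is an exact dupe."""
        digest = md5_hex(data)
        if digest in self._seen:
            return self._seen[digest]
        self._seen[digest] = name
        return None
```

Only byte-identical files collide here, which is why a re-encoded copy of the same picture would slip through.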
Mindwipe
Sun, Nov 25 '12, 02:12
That would explain why simply re-saving the image without even editing it is enough to bypass the system.

I learned something today.
Stem_Cell
Mon, Nov 26 '12, 15:08
Well, one way to check for dupes is to search for the tags. Considering there are few pics on this site, some combination of tags will eventually tell you if the image is there or not, particularly tags which are very common (such as including eye and hair color in your search, as in "brown_hair blue_eyes collar", for example).

File MD5s change even if you change one single byte of the image. I use this for checking file integrity on optical media. For everyone who might burn data discs, this is very handy: http://code.kliu.org/hashcheck/ (open source, free, very lightweight and simple to use)
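
For anyone who would rather script the integrity check, a rough Python sketch of the same idea follows. It is not how HashCheck itself is implemented; file_md5 and verify are hypothetical helper names, and the 1 MiB chunk size is an arbitrary choice.

```python
import hashlib


def file_md5(path, chunk_size=1 << 20):
    """Hash a file in chunks so large disc images need not fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        # iter() with a sentinel keeps reading until read() returns b"".
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected_hex):
    """Return True if the file still matches the checksum recorded earlier."""
    return file_md5(path) == expected_hex.strip().lower()
```

Record file_md5(path) before burning, then call verify(path_on_disc, recorded_value) afterwards; a single changed byte makes it return False.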
Mindwipe
Mon, Nov 26 '12, 18:54
Stem_Cell said:
Well, one way to check for dupes is to search for the tags. Considering there are few pics on this site, some combination of tags will eventually tell you if the image is there or not, particularly tags which are very common (such as including eye and hair color in your search, as in "brown_hair blue_eyes collar", for example).


That only works assuming that all pics have been tagged thoroughly, and they haven't. I've gone back and tagged some of the old images, but there are still a LOT that are missing tags. =/
Stem_Cell
Tue, Nov 27 '12, 06:14
Well yeah, I wish http://iqdb.org/ would index this site. Although I understand them not caring: there are several boorus serving as personal pic dumps which wouldn't help anybody if indexed.
deathwish
Thu, Dec 13 '12, 15:01
Let's see... Chan sites manage to catch exact copies, but fail if you change the image even a little bit (resize it by one pixel, add a dot, reverse it, etc.), so that wouldn't work. I have DupDetector for helping me clear doubled pics off my system, and at certain settings it detects resized versions and matches my manips to my source pics, but within that similarity range, it detects all uncolored drawings (including manga scans and doujins) as dupes, so that wouldn't work either.
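
The trade-off can be sketched with a simple "average hash", one common way of comparing pictures by content instead of by file bytes. This is an assumption for illustration only: it is not how DupDetector works internally, the function names are made up, and the max_distance threshold of 5 is arbitrary. Images are plain grids (lists of rows) of grayscale values from 0 to 255, so nothing beyond the standard library is needed.

```python
def shrink(pixels, size=8):
    """Reduce a grayscale grid to size x size by averaging blocks of pixels."""
    h, w = len(pixels), len(pixels[0])
    out = []
    for by in range(size):
        y0 = by * h // size
        y1 = max((by + 1) * h // size, y0 + 1)
        row = []
        for bx in range(size):
            x0 = bx * w // size
            x1 = max((bx + 1) * w // size, x0 + 1)
            block = [pixels[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


def average_hash(pixels, size=8):
    """Build a bit string: 1 where a shrunken cell is brighter than the mean."""
    cells = [v for row in shrink(pixels, size) for v in row]
    mean = sum(cells) / len(cells)
    bits = 0
    for v in cells:
        bits = (bits << 1) | (1 if v > mean else 0)
    return bits


def hamming(a, b):
    """Count the bits that differ between two hashes."""
    return bin(a ^ b).count("1")


def is_probable_dupe(img_a, img_b, max_distance=5):
    """Treat two images as dupes if their hashes differ in only a few bits."""
    return hamming(average_hash(img_a), average_hash(img_b)) <= max_distance
```

A resized copy shrinks to nearly the same grid, so it matches even though its MD5 is completely different. A mirrored copy generally does not match. Raising max_distance catches more edits but also starts matching unrelated pictures that share the same overall light and dark layout, which is the kind of false positive described above.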

Even if we could alter how the booru handles dupe detection, I don't think there is a perfect way.