Re: 1.01 billion images in 0.632 seconds?



On Sat, 14 Feb 2009 05:17:10 -0500
Sherm Pendley <spamtrap@xxxxxxxxxxx> wrote in:
<m1ljs9wcex.fsf@xxxxxxxxxxx>

dorayme <doraymeRidThis@xxxxxxxxxxxxxxx> writes:

http://tineye.com/ claims to have searched 1.01 billion images
in 0.632 seconds? to find three or four pics out there that are
identical to one on my hard disk (yes, I pinched a small Roger
Rabbit one ages ago ...). Should this be believed?

To find one that's *identical*, down to every single pixel? Yeah,
that's believable. They could calculate a hash value for each
image as it's being added to their database. When you upload your
own image, they'd calculate its hash value. Getting a list of
images with the same hash value as yours would then be a simple
query on an indexed integer column - very, very fast for modern
databases.

I can't imagine them doing any more complicated matching in
that kind of time though - no face matching, color space
conversion, fuzzy matching, none of that sort of thing.
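
For what it's worth, the exact-match lookup Sherm describes could look something like the sketch below. This is only an illustration, not how TinEye works: the table name, column names, and the choice of SHA-1 truncated to a 64-bit integer are all my own assumptions, and SQLite stands in for whatever database they really use.

```python
import hashlib
import sqlite3


def image_hash(data: bytes) -> int:
    """Hash the raw image bytes down to a signed 64-bit integer.

    Assumption: SHA-1 truncated to 8 bytes, so the value fits an
    SQLite INTEGER column. Any stable hash would do for the sketch.
    """
    digest = hashlib.sha1(data).digest()
    return int.from_bytes(digest[:8], "big", signed=True)


def build_index(images: dict[str, bytes]) -> sqlite3.Connection:
    """Store (url, hash) rows and index the hash column (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE images (url TEXT, hash INTEGER)")
    conn.execute("CREATE INDEX idx_images_hash ON images (hash)")
    conn.executemany(
        "INSERT INTO images (url, hash) VALUES (?, ?)",
        [(url, image_hash(data)) for url, data in images.items()],
    )
    conn.commit()
    return conn


def find_identical(conn: sqlite3.Connection, data: bytes) -> list[str]:
    """Return URLs whose stored hash equals the hash of the uploaded bytes."""
    rows = conn.execute(
        "SELECT url FROM images WHERE hash = ? ORDER BY url",
        (image_hash(data),),
    )
    return [row[0] for row in rows]
```

The lookup is a single indexed equality query, so its cost grows very slowly with the number of rows. It only finds byte-for-byte identical files, though: change one pixel, or re-save the file, and the hash no longer matches.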

I have no idea how it's done or how many images it checked, but it is
very quick.

http://tineye.com/search/efec127c48945335a83a88a3f6acf127e2ad7896

In the above example, uploading the image appeared to take longer than
returning the results.

As I think you will see, the results are not exact matches, but they are
similar in part.
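
Results that are similar rather than identical suggest something other than an exact file hash. I don't know what TinEye actually uses, but one well-known way to get fast near-matches is a perceptual hash, where similar pictures produce fingerprints that differ in only a few bits. Below is a minimal sketch of the simple "average hash" variant. It assumes the image has already been shrunk to a small grayscale grid (say 8x8) by some other tool, and the function names are mine.

```python
def average_hash(pixels: list[list[int]]) -> int:
    """Fingerprint a small grayscale grid (values 0-255).

    Each pixel contributes one bit: 1 if it is brighter than the
    grid's mean brightness, otherwise 0.
    """
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for p in flat:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Count the bit positions where two fingerprints differ."""
    return bin(a ^ b).count("1")


# Tiny 2x2 grids stand in for real downscaled images.
original = [[0, 255], [255, 0]]
recompressed = [[10, 250], [240, 5]]   # slightly altered copy
inverted = [[255, 0], [0, 255]]        # a different picture

print(hamming(average_hash(original), average_hash(recompressed)))  # 0
print(hamming(average_hash(original), average_hash(inverted)))      # 4
```

A small Hamming distance means "probably the same picture", so a slightly altered copy can still be found, which an exact hash cannot do.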

I find it interesting. Very nice find.



