Re: I would've sworn it was mentioned here
- From: singhals <singhals@xxxxxxxxx>
- Date: Fri, 07 Mar 2008 10:36:57 -0500
Ian Goddard wrote:
singhals wrote:
Didn't someone in the past, oh, say, month, mention software that would compare databases and flag matches?
Not necessarily a *specific* genealogy program database, just databases in general?
I'm looking for an easy way to vacuum up "hit" lists from Ancestry, WC, Google, et al, and find the common ones.
Cheryl
I don't recall anything like that and a quick google doesn't find anything. Wishful thinking?
It's an interesting problem. First of all, what's the format of the hit lists? Are the hits from all the sources in the same format?
Secondly, most comparison tools that I can think of work on a specific file format, usually a flat text file although there are some that work on XML files. You would need to get the files into the appropriate format.
Thirdly, many comparison tools do the opposite of what you want - they look for differences. My favourite approach to looking for multiple occurrences of *identical* lines across multiple files would be the Unix command
cat x y z|sort|uniq -c|sort -rn|more
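where x, y & z would be 3 file names (you can cat as few or as many files as you like). This concatenates the files and sorts the lines so that duplicates follow each other, collapses each run of identical lines into a single line prefixed with the number of times it occurred, re-sorts the result in descending order of count and pages the output. You can then see which lines turned up more than once, but not which files they were in. Note that a line repeated within a single file is counted too, so the count only equals the number of files if no file contains duplicates of its own.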
This requires that you have the hits in a common flat file format or can convert them to that; that hits which you would consider matching are identical within the files; that you either don't care which lists the matches were in, don't mind just comparing them in pairs, or are prepared to hunt for them in the files; and finally that you have access to Unix-style commands (if you're on Windows only, google for "cygwin").
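If not knowing which list a match came from is the sticking point, something along these lines might do instead of the pipeline above. It is only a rough sketch in Python, not a tested tool: the function name common_lines is made up for illustration, it assumes each hit list is a plain UTF-8 text file with one hit per line, and it still only matches lines that are identical (apart from leading and trailing whitespace, which it strips). Unlike uniq -c, it counts the number of files a line appears in, not the total number of occurrences.

```python
from collections import defaultdict
from pathlib import Path


def common_lines(paths):
    """Return (line, files) pairs for lines found in two or more files.

    Hypothetical helper: assumes one hit per line, UTF-8 text, and that
    matching hits are identical once surrounding whitespace is stripped.
    """
    seen = defaultdict(set)  # line -> set of file names containing it
    for path in paths:
        for line in Path(path).read_text(encoding="utf-8").splitlines():
            line = line.strip()
            if line:  # ignore blank lines
                seen[line].add(str(path))

    # Keep only lines present in more than one file.
    shared = {line: sorted(files) for line, files in seen.items() if len(files) > 1}

    # Most widely shared lines first, then alphabetical order.
    return sorted(shared.items(), key=lambda item: (-len(item[1]), item[0]))
```

Calling common_lines(["x", "y", "z"]) would return a list of pairs, each holding a shared line and the names of the files it was found in, so the matches no longer have to be hunted down by hand.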
Yes, quite possibly I was mis-remembering either the details or the list. I couldn't find it either. (g)
I've done it by hand, and it's not /that/ onerous, but the person who needs it would reach for the smellin' salts if I mentioned Unix or even CMD lines.
Thanks.
Cheryl