Duplicate record elimination

Post by indiramovilla » Sun Jan 03, 2016 1:21 pm

As a user of MediaMonkey for the last 8 years, and a database developer by profession, I have been concerned with the issue of duplicate record elimination for many years now.

Some may discount the importance of this issue because, to them, the only cost is some wasted disk space, and that cost is insignificant these days.

However, for me the issue is different. I needed a utility program that would let me identify duplicates and then automatically choose, from each set of duplicate records, the one with the highest quality, bitrate, or some other criterion.

To that end, I researched what programs were available, and about a year ago I chose one called Similarity. After a year’s use, I am very satisfied with the program, and for that reason I wanted to share my experience with other MM users via this forum.

Like MediaMonkey, Similarity is very efficient with large databases and all file types. My own database has 130,000 records, and Similarity can complete a re-scan that includes 10,000 new records in under 3 hours. For those with much larger databases, I was told by one of Similarity’s developers that the forthcoming new version will be able to handle databases of over a million records efficiently.

After having located the duplicate groups, Similarity analyzes those files to determine the quality rating of each recording. It then chooses automatically from each duplicate group which file to keep and which to delete based on predetermined criteria and priorities. Obviously, the quality of each recording is the most important criterion, with the possible exception of old, historical recordings. Other factors include bitrate, sampling rate, size, length, and location (incoming vs. permanent location of database).
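To make the selection step concrete, here is a minimal Python sketch of picking one file to keep from a duplicate group by comparing criteria in priority order. It is not Similarity's actual algorithm: the `Track` fields, the `quality` score, and the priority order in `rank_key` are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Track:
    # All field names here are hypothetical, chosen to mirror the criteria above.
    path: str
    quality: float        # assumed quality score, higher is better
    bitrate_kbps: int
    sample_rate_hz: int
    size_bytes: int
    length_s: float
    in_library: bool      # True if already in the permanent location


def rank_key(track):
    """Order criteria by assumed priority: quality first, then bitrate,
    sampling rate, permanent location, and finally file size."""
    return (
        track.quality,
        track.bitrate_kbps,
        track.sample_rate_hz,
        track.in_library,
        track.size_bytes,
    )


def split_group(group):
    """Return (keeper, discard) for one group of duplicate tracks."""
    keeper = max(group, key=rank_key)
    discard = [t for t in group if t is not keeper]
    return keeper, discard
```

Because Python compares tuples element by element, a later criterion only matters when all earlier ones tie. An exception such as old historical recordings would need its own rule ahead of the quality comparison.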

All in all, with the help of Similarity I have brought my music database into optimal shape, and every time I add new recordings, I let Similarity do the thinking and decide which records to keep and which to delete. So, when I am listening to music, I am sure to be listening to the best available recording of each particular piece.

Naturally, if the same capability can be achieved with MMW, I'm open to learning how.

Re: Duplicate record elimination

Post by Peke » Sun Jan 03, 2016 1:24 pm

Nice thing to read, thank you for posting. It raises interesting questions about duplicate handling.

Personally, I think fully automatic duplicate decisions are a bad idea, but a semi-automatic approach can be achieved with scripting.

The Advanced Duplicate find and fix script is one example.
Best regards,
Pavle
MM Core Developer and Admin of free MediaMonkey extensions Hosting

Re: Duplicate record elimination

Post by dtsig » Sun Jan 03, 2016 1:42 pm

indiramovilla, just so that I understand: you are talking about removing tracks from *albums* that are also on other *albums*. Is this true? So you would delete 'Lady Jane' from the Rolling Stones' Flowers album because it is also on their Aftermath album and is the same (bitrate, etc.)?
Or do you actually have duplicates of, say, 'Lady Jane' on the Flowers album?

Not affiliated with MediaMonkey ... just a RABID user/lover
DTSig

