View Single Post
Old 04-25-2008, 03:08 PM   #8
Alex
.
 
Join Date: Feb 2005
Posts: 13,354
Alex is the epitome of coolAlex is the epitome of coolAlex is the epitome of coolAlex is the epitome of coolAlex is the epitome of coolAlex is the epitome of coolAlex is the epitome of coolAlex is the epitome of coolAlex is the epitome of coolAlex is the epitome of coolAlex is the epitome of cool
Here's a brute force method. I'm sure I'll think of something graceful later when I can't actually test it out at work.

Assume your name list is in Column A and consists of the following five names (starting in cell A1):

Jones, Greg
Smith, John
Smith, John A.
Smith, John Adam
Wilson, Peter

Then, on the assumption that a first and last name match is sufficient for your purposes of assuming that they are the same person:

In cell B1 enter the following formula:

Code:
 =IF(ISERR(LEFT(A1,FIND(" ",A1,1))&LEFT(RIGHT(A1,LEN(A1)-FIND(" ",A1,1)),FIND(" ",RIGHT(A1,LEN(A1)-FIND(" ",A1,1)),1))), LEFT(A1,FIND(" ",A1,1)) & TRIM(RIGHT(A1,FIND(" ",A1,1)-3)),TRIM(LEFT(A1,FIND(" ",A1,1))&LEFT(RIGHT(A1,LEN(A1)-FIND(" ",A1,1)),FIND(" ",RIGHT(A1,LEN(A1)-FIND(" ",A1,1)),1))))
Hopefully that comes out in a cut and pastable way. If you hand type it you are going to screw up an impossible to find parenthesis.

This will result in cell B1 saying "Jones, Greg" which is no change from A1. However, when you copy that formula down the column it will result in this:

Code:
Jones, Greg            Jones, Greg
Smith, John            Smith, John
Smith, John A.         Smith, John
Smith, John Adam       Smith, John 
Wilson, Peter          Wilson, Peter


So, all of the names that did consist of more than just first and last are now trimmed.

Into cell C2, copy this formula (note, you're leaving C1 blank):


Code:
=IF(B2=B1, "DUP", " ")
Copy that formula down the rest of the C column.

Now what you'll have is:


Code:
 Jones, Greg            Jones, Greg     
 Smith, John            Smith, John
 Smith, John A.         Smith, John      DUP
 Smith, John Adam       Smith, John      DUP
 Wilson, Peter          Wilson, Peter
 


In other words, each row that has a trimmed name that is identical to the row before is now labelled "DUP". You can now quickly scroll through the list and see the duplicates. If you need to manually combine the index lists for each name this may be fine. If you don't need to do that and just want to delete each duplicate row then do this:

Insert a new column between B and C.

Right click on the "B" at the top of the B column so the entire column is highlighted and press ctrl-c. Right click on the new C column and select "Paste special" from the options list. Select the "Values" radio button from the list of options in the window that results. This will paste the values of the B column into C without the underlying formulas.


Update the formula in what is now column D to reference Column C instead of B. Now you can sort the entire spreadsheet by column D, which will group all the duplicate rows and you can mass select and delete them. Resort by Column A to get them back in alphabetical order and voila.

Very brute force and lacking grace.

Alex is offline   Submit to Quotes Reply With Quote