VERIFIED SOLUTION i
X

How to find All Records With Duplicate Values in a Column

UPDATED: July 16, 2018


The process described in this article will find all duplicate records in a table, for a given column and how to display records greater than one. Find these steps by performing two Query > SQL Select commands. 

Please note: Objects will be lost when performing queries columns selected in the "Group By" column section. Please see this article on how to use a group by statement and retain map objects.


In the example in MapInfo Professional, click on Query > SQL Select.

1. Select the table that has the duplicate records. 
2. Click in the Group By Columns section and select Columns drop down for column of duplicate records.
3. Click in the Select Columns section and select Columns drop down for column of duplicate records.
4. Click on Select Columns section again and click on Aggregates drop down and click on Count
5. In quotes type in an alias that represents the duplicate records, in the example it shows "State Duplicates
User-added image

6. Click on Verify button to make sure that the syntax is correct. If so, click the OK button, producing the first Query > SQL Select Command.

User-added image

The next Query > SQL Select command statement will show only duplicate records greater than one. Click on Query > SQL Query to get the SQL Select dialog.

Follow these steps:

1. In the example click on the produced query from the Table drop down and select "Query1"
2. Click in where Condition and click on Columns drop down  for the duplicate records column "StateDuplicates"
3. Next, click on the Aggregates drop down for Greater than sign ">"
4. type in the number 1 following after greater than sign, click the Verify button and click OK.
User-added image

The end result in "Query2"  > SQL Select statement will now display all duplicate records in a column greater than one. 
User-added image

Users can now export the results and use this information to fix and review data as needed.

Note: current versions of 64-bit MapInfo Pro ship with a tool that deletes duplicate records and retains map objects called
"Delete Duplicates":

From the online help for the tool:

The Delete Duplicates tool deletes duplicate records from a table while retaining map objects. This tool does not change the original table in any way. Keep in mind that MapInfo Pro has a 255 byte/ 5-column limit for a Group by clause, so you may have to change the size of your columns if they are unusually wide.

Duplicate rows refers to rows that have duplicate occurrences of the same column value for the column you select.

Removing Duplicate Records from a Table

Before you open the Delete Duplicates tool in MapInfo Pro, make sure the tables you want to work with are open.

To remove duplicate records from a table and save the cleaned table to a new file:

  1. Open the table or tables you want to delete duplicates from.
  2. On the HOME tab, in the Tools group, click Tool Extensions, and double-click Delete Duplicates on the Running tab.

    The Delete Duplicate Records dialog box opens. The list of tables on the left contains all of the currently open tables.

  3. Select the table you want to delete duplicates from in the first column. A list of the columns in that table displays on the right.
  4. Select the column that contains information that should be unique to each row from the Select Columns list.
    CAUTION:
    To avoid deleting data incorrectly, give this step a lot of thought. For example, we do not recommend you select customer names or addresses for this step.
  5. Select the Count check box to add a column to your new table that shows the number of instances of the column value found in the original table. This is an optional feature.
  6. Click OK. The Please Choose a Path dialog box displays.
  7. Select a path and a file name for the new table in this dialog box. Then select Save. The tool removes rows containing duplicate column values according to the column you selected. A Browser window displays with the results of the new table.

    The last column contains the count of instances of the row containing duplicate column values in the original table.



Interested to know more, please read this related articles on duplicate values. 
MapInfo Professional and how to use SQL Select to identify duplicate values in a column of data

If you still not a member, please register to Li360 with this link to stay updated with latest in Location Intelligence world.
 



 

Environment Details

Products affected: MapInfo Pro™

Downloads

  • No Downloads