Remove Duplicates

The Remove Duplicates transformation command removes rows with duplicate values from its input. The data is first sorted, either by a field you choose or, if no sort order is specified, by row number in the data source. The transform then compares the values of the selected comparison fields. For each set of rows whose comparison values match, it keeps the top row in the sort order and removes the rest.

Remove Duplicates Illustration

Configuration

Remove Duplicates Configuration

After you add a Remove Duplicates ETL command to the ETL designer and connect an input to it, you need to specify:

  • The comparison fields to check for duplicates

  • For each comparison field, whether the comparison is case-sensitive

  • The sort order applied before the comparison. Within each set of duplicates, the row at the top after the sort is kept and the remaining rows are removed.
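
The behavior described above can be approximated in a few lines of Python. This is only an illustrative sketch, not the product's implementation: the function name `remove_duplicates`, its parameters, the `descending` flag, and the sample field names (`email`, `updated`) are all assumptions made for the example.

```python
from typing import Any, Dict, List, Optional

Row = Dict[str, Any]


def remove_duplicates(
    rows: List[Row],
    comparison_fields: Dict[str, bool],  # hypothetical: field name -> case-sensitive?
    sort_field: Optional[str] = None,    # hypothetical: None keeps data source order
    descending: bool = False,            # hypothetical sort direction option
) -> List[Row]:
    """Keep the first row of each duplicate group after sorting."""
    ordered = list(rows)
    if sort_field is not None:
        # sorted() is stable, so ties keep their original relative order.
        ordered = sorted(ordered, key=lambda r: r[sort_field], reverse=descending)

    def comparison_key(row: Row) -> tuple:
        parts = []
        for field, case_sensitive in comparison_fields.items():
            value = row[field]
            # Fold case only for string fields marked as case-insensitive.
            if isinstance(value, str) and not case_sensitive:
                value = value.casefold()
            parts.append(value)
        return tuple(parts)

    seen = set()
    kept = []
    for row in ordered:
        key = comparison_key(row)
        if key not in seen:  # first (top) row for this key wins
            seen.add(key)
            kept.append(row)
    return kept


rows = [
    {"email": "Ann@example.com", "updated": 1},
    {"email": "ann@example.com", "updated": 3},
    {"email": "bob@example.com", "updated": 2},
]

# Case-insensitive comparison on "email", newest row first.
latest = remove_duplicates(rows, {"email": False}, sort_field="updated", descending=True)

# Case-sensitive comparison, no sort order: rows keep their source order.
strict = remove_duplicates(rows, {"email": True})
```

In this sketch, `latest` holds the `updated` 3 and 2 rows, because the two "ann" addresses match once case is ignored and the most recent one sorts to the top. `strict` holds all three rows, because a case-sensitive comparison treats `Ann@example.com` and `ann@example.com` as different values.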