The TDM 4.1 documentation contained an error: the cross reference identifier does not go in parm3. It goes in the Cross Reference column of the masking CSV. If the cross reference table is prepopulated and you only want to do a lookup, use the IGNORE function together with the cross reference identifier.
In FDM, the cross reference identifier MUST be in the Cross Reference column of the masking CSV. Typically you would not use a cross reference with HASHLOV, as HASHLOV already gives consistent masking. Parm3 specifies the column to hash on; if it is left blank, the current value (ADDR_LINE1 in the example) is used for the hash.
The connectscramble.txt file is simply the connection to your scramble database. The XREF details mentioned in the documentation were under the masking rules and options available. Some gaps in our documentation have been fixed since TDM 4.1.
Using cross reference in FDM:
One way of guaranteeing consistent masking across different tables or databases is to use values from a cross reference table. The structure of the table is as follows:
CREATE TABLE gtsrc_xref (
rx_ref_id varchar (254) NOT NULL,
rx_old_value varchar (254) NOT NULL,
rx_new_value varchar (254) );
ALTER TABLE gtsrc_xref
ADD CONSTRAINT gtsrc_xref_pk PRIMARY KEY (
CREATE INDEX gtsrc_xref_nu ON scramble.gtsrc_xref (
This table can exist in the same schema as the one you are masking, but more typically exists in a different one. If scramble components for Subset have been installed, the table will exist in the scramble schema for Oracle or the scramble database for SQL Server.
When masking two different columns in FDM, if the SAME cross reference identifier is used in both masks, the data is masked in the same way. The following is an example of how the cross reference table is populated when two credit card numbers are masked.
When a cross reference identifier is supplied, FDM checks during masking whether the value to be masked already exists in the rx_old_value column of the cross reference table for that identifier. If it does, the stored rx_new_value is used. Otherwise, a new row is inserted into the table with both the old value and the new (masked) value. In this way, consistency is achieved.
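The lookup-or-insert behaviour described above can be sketched in Python against an in-memory SQLite copy of the table. This is only an illustration of the logic, not FDM's implementation: the function names mask_with_xref and random_digits, the identifier "CC_XREF", and the random-digit mask are all hypothetical, and the table is created without the key and index from the DDL above.

```python
import secrets
import sqlite3


def mask_with_xref(conn, ref_id, old_value, make_mask):
    """Return the masked value for old_value, reusing a stored mapping if one exists."""
    row = conn.execute(
        "SELECT rx_new_value FROM gtsrc_xref "
        "WHERE rx_ref_id = ? AND rx_old_value = ?",
        (ref_id, old_value),
    ).fetchone()
    if row is not None:
        # The value was masked before under this identifier: reuse the result.
        return row[0]
    # First time this value is seen: mask it and record old and new values.
    new_value = make_mask(old_value)
    conn.execute(
        "INSERT INTO gtsrc_xref (rx_ref_id, rx_old_value, rx_new_value) "
        "VALUES (?, ?, ?)",
        (ref_id, old_value, new_value),
    )
    return new_value


def random_digits(value):
    """Hypothetical stand-in mask: random digits of the same length as the input."""
    return "".join(secrets.choice("0123456789") for _ in value)


# In-memory stand-in for the cross reference table (columns only, no keys).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE gtsrc_xref ("
    " rx_ref_id varchar(254) NOT NULL,"
    " rx_old_value varchar(254) NOT NULL,"
    " rx_new_value varchar(254))"
)

# The same card number masked from two different columns, using the SAME
# (hypothetical) identifier, resolves to the same masked value.
first = mask_with_xref(conn, "CC_XREF", "4111111111111111", random_digits)
second = mask_with_xref(conn, "CC_XREF", "4111111111111111", random_digits)
```

Because every value costs at least one query, and every new value costs an insert as well, the sketch also suggests why this approach slows down as the data set grows.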
Note that this approach is not the recommended method for consistent masking in FDM, as it does not perform well on larger data sets. A better way to achieve consistent masking is to use one of the deterministic functions in FDM: HASHLOV when masking names or addresses, or FORMATENCRYPT when masking an id column.
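To illustrate why a deterministic function needs no cross reference table, the following Python sketch hashes the input value and uses the hash to pick an entry from a list of replacement values. It shows the general idea only. It is not FDM's HASHLOV algorithm, and the function name hash_pick and the STREETS list are hypothetical.

```python
import hashlib


def hash_pick(value, candidates):
    """Pick a replacement from candidates based only on a hash of value."""
    digest = hashlib.sha256(value.encode("utf-8")).digest()
    # Reduce the first 8 bytes of the digest to an index into the list.
    index = int.from_bytes(digest[:8], "big") % len(candidates)
    return candidates[index]


# Hypothetical list of replacement address lines.
STREETS = ["1 Elm Street", "22 Oak Avenue", "7 Mill Lane", "90 High Road"]

# The same input always maps to the same replacement, in any table or
# database, with no lookup table to query or maintain.
masked_once = hash_pick("12 Acacia Avenue", STREETS)
masked_again = hash_pick("12 Acacia Avenue", STREETS)
```

The result depends only on the input and the list, so consistency holds as long as the same list is used everywhere.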
If using a cross reference, the following must also be supplied in the options:
CROSSREFCONNECT - this is the name of the connection file used to connect to the cross reference table. Use the mapper to create a connection profile to the schema containing the cross reference table. A drop-down list in the options shows the available connection profiles; choose the one that connects to gtsrc_xref.
CROSSREFTABLE - this is the name of the cross reference table, usually gtsrc_xref.