If un-obfuscated data contains more than 1 double quote the same is not reflected in the obfuscated file

Document ID : KB000008288
Last Modified Date : 14/02/2018
Issue:

Our team has been given the requirement that if the un-obfuscated data contains more than one double quote, the obfuscated file must contain the same number of double quotes. For example, """"""""""""H"""""""""""" should come out on the other side as """"""""""""H"""""""""""", with the H obfuscated appropriately. Currently it comes out as an obfuscated "H" without the additional quotes. Is there a way to indicate that the additional quotes are wanted? If not, can the program be made to retain those additional double quotes?

Environment:
FDM 4.2 GA
Resolution:

A patch is available for this issue, FastDataMasker-4.7.31.0.zip, which adds the option RETAINDOUBLEQUOTESINDATA. This option must be set to retain all double quotes. Please contact Support for this patch. If you install the latest GA after September 25, 2017, verify with CA Support that this enhancement is included in that GA release of TDM.

For further clarification of potential use cases, could you please view the below details and let us know what needs to be done to get the expected output? 

 

Details:
1. Used RETAINDOUBLEQUOTESINDATA=Y in the options file. 
2. The values "TEXT1","TEXT2","TEXT3","TEXT4","TEXT5","TEXT5",""TEXT6"","TEXT7" are changed to TEXT1,TEXT2,TEXT3,TEXT4,TEXT5,"""TEXT6""",TEXT7
3. But the expected output is "TEXT1","TEXT2","TEXT3","TEXT4","TEXT5","TEXT5",""TEXT6"","TEXT7"

 

Expected Results based on Details provided:

This looks like expected behavior. In CSV files, "TEXT1" and TEXT1 are identical. You enclose a value in double quotes only if you want to have a comma in the data or double quotes within the data itself, for example "TE,XT1" or 1997,Ford,E350,"Super, ""luxurious"" truck". The CSV parser returns the value TEXT1 for both of the original values "TEXT1" and TEXT1, so there is no way to determine whether the original value had enclosing double quotes or not. If the data itself contains a comma, FDM encloses it in double quotes automatically; otherwise it does not.
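
The behavior described above can be reproduced with any standards-following CSV parser. The following is a minimal sketch using Python's built-in csv module, not FDM's own parser; the helper names parse_line and write_row are hypothetical and exist only for this illustration.

```python
import csv
import io

def parse_line(line):
    """Parse a single CSV line into a list of field values."""
    return next(csv.reader([line]))

def write_row(row):
    """Write one row using the csv module's default minimal quoting."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n").writerow(row)
    return buf.getvalue().rstrip("\n")

# Enclosing quotes are syntax, not data: both lines yield the same value.
quoted = parse_line('1111,"TEXT1"')
bare = parse_line('1111,TEXT1')
print(quoted == bare)  # True

# On output, quotes are added only where the value needs them.
print(write_row(["TEXT1", "TE,XT1"]))  # TEXT1,"TE,XT1"
```

Because the parsed values are identical, a tool that reads the file and writes it back has no record of which fields were originally quoted.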

If you want ALL data enclosed by double quotes, check the following option in the definition file:


KB_doublequote_results_00.png
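
As a rough analogy only, the effect of enclosing every value in double quotes can be sketched with Python's csv module and its QUOTE_ALL setting. This is not the FDM option shown in the screenshot, and write_row_all is a hypothetical helper name.

```python
import csv
import io

def write_row_all(row):
    """Write one row with every field enclosed in double quotes."""
    buf = io.StringIO()
    writer = csv.writer(buf, quoting=csv.QUOTE_ALL, lineterminator="\n")
    writer.writerow(row)
    return buf.getvalue().rstrip("\n")

# Every field is quoted, including empty ones and ones without commas.
print(write_row_all(["", "1111", "TEXT4"]))  # "","1111","TEXT4"
```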


To have double quotes in the data itself, the cell must be enclosed in double quotes, and each double quote within the data must be escaped with another double quote. That is, """TEXT6""" is valid, but ""TEXT6"" is not.

Reference:  see https://en.wikipedia.org/wiki/Comma-separated_values 
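
The escaping rule can be checked with a short sketch. It assumes Python's csv module in strict mode as a stand-in for a standards-following parser (FDM's parser may report malformed input differently), and parse_strict is a hypothetical helper name.

```python
import csv

def parse_strict(line):
    """Parse one CSV line, raising csv.Error on malformed quoting."""
    return next(csv.reader([line], strict=True))

# Valid: the cell is enclosed in quotes and each inner quote is doubled.
valid = parse_strict(',,1111,,,"""TEXT6"""')
print(valid[-1])  # "TEXT6"

embedded = parse_strict(',,1111,,,"SOME ""FINE"" TEXT6"')
print(embedded[-1])  # SOME "FINE" TEXT6

# Invalid: ""TEXT6"" closes the quoted field before any data is read.
try:
    parse_strict(',,1111,,,""TEXT6""')
    malformed_rejected = False
except csv.Error:
    malformed_rejected = True
print(malformed_rejected)  # True
```

Without strict mode, lenient parsers usually accept the malformed line but return something other than the intended "TEXT6" value.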

 

As an example: 

ID,ID2,ID3,KEY,DATE,STATUS_DESCRIPTION 
,,1111,,,"TEXT1" 
,,1111,,,"TEXT2" 
,,1111,,,"TEXT3" 
,,1111,,,TEXT4 
,,1111,,,"TEXT5" 
,,1111,,,""TEXT6"" 
,,1111,,, 
,,1111,,, 
,,1111,,,"TEXT7" 

Displays as:

KB_doublequote_results_01.png

 

Whereas 

ID,ID2,ID3,KEY,DATE,STATUS_DESCRIPTION 
,,1111,,,"TEXT1" 
,,1111,,,"TEXT2" 
,,1111,,,"TEXT3" 
,,1111,,,TEXT4 
,,1111,,,"TEXT5" 
,,1111,,,"""TEXT6""" 
,,1111,,, 
,,1111,,, 
,,1111,,,"TEXT7" 

Displays as below, with the double quotes at the start and end of TEXT6:
 
KB_doublequote_results_02.png

Another valid example would be: 

ID,ID2,ID3,KEY,DATE,STATUS_DESCRIPTION 
,,1111,,,TEXT1 
,,1111,,,TEXT2 
,,1111,,,TEXT3 
,,1111,,,TEXT4 
,,1111,,,TEXT5 
,,1111,,,"SOME ""FINE"" TEXT6" 
,,1111,,, 
,,1111,,, 
,,1111,,,TEXT7 

Which displays as:

KB_doublequote_results_03.png