DevTest: Writing Your Own Data De-Identifier

Document ID : KB000077039
Last Modified Date : 10/04/2018
Introduction:
The Workstation's VSE Recorder can automatically scrub data, removing sensitive information, as it is being recorded.
DevTest automatically handles several types of sensitive information, including names (first and last), addresses, phone numbers, social security numbers, and credit card information.
See the "De-identifying Data" section of the documentation for your version of DevTest.

However, sometimes a custom de-identifier is needed.
Note: Support cannot assist in writing custom code. If assistance is needed, please contact your account manager. The examples provided below are meant as a guideline only.
Background:
The DevTest_Home/de-identify.xml file contains information about the different types of de-identifiers.
The file mentions that you can create your own de-identifier but gives no examples. This document provides one.
Environment:
All supported DevTest versions and environments
Instructions:
How to create your own de-identifier:

Decide on the type of de-identifier needed:
The two basic types match either on a key or on the data itself, using a regex. Examples:
Key: <City>, <State>, <fname>, <lname>, <Temperature>, <Humidity>, and so on.
Regex: looks for patterns in the data. For an email address, it looks for:
  • One or more letters, digits, or the characters . _ % + -
  • An "@" symbol
  • One or more domain labels (letters, digits, or hyphens), each followed by a dot "."
  • A final label of two to four letters
The regex for this pattern is: [A-Z0-9._%+-]+@(?:[A-Z0-9-]+\.)+[A-Z]{2,4}
The character classes list uppercase letters only, so the pattern matches lowercase addresses only when it is applied case-insensitively.
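Before relying on a regex in a de-identifier, it can help to try it against sample data in plain Java. The following is a minimal sketch, not DevTest code: the class name EmailPatternCheck and the sample text are hypothetical, and the CASE_INSENSITIVE flag is an assumption added so that lowercase addresses match.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for trying the email pattern outside DevTest.
class EmailPatternCheck {

    // Email pattern: local part, "@", dot-separated domain labels, 2-4 letter final label.
    // CASE_INSENSITIVE is an assumption; the character classes list uppercase only.
    static final Pattern EMAIL = Pattern.compile(
            "[A-Z0-9._%+-]+@(?:[A-Z0-9-]+\\.)+[A-Z]{2,4}",
            Pattern.CASE_INSENSITIVE);

    /** Returns every substring of the text that the pattern matches, in order. */
    static List<String> findEmails(String text) {
        List<String> hits = new ArrayList<>();
        Matcher m = EMAIL.matcher(text);
        while (m.find()) {
            hits.add(m.group());
        }
        return hits;
    }

    public static void main(String[] args) {
        // Made-up sample data containing two addresses
        String sample = "Contact jane.doe@example.com or BOB@MAIL.EXAMPLE.ORG today";
        System.out.println(findEmails(sample));
    }
}
```

Running it against a few lines copied from a real recording shows quickly whether the pattern catches too much or too little.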
 
Prepare the working directory:
Create a directory outside of DevTest
Copy DevTest_Home/lib/core/desensitize-{version number}.jar to the new directory
Ensure that your Java SDK (JDK) version is compatible with the Java version used by your DevTest installation.

Write the code:
In this example, a provider named MyWeatherIdProvider replaces a five-digit Weather Station code with another five-digit code.
Two codes (12345 and 24680) are replaced with specific codes.
Any other code is replaced with a randomly generated five-digit code.

import com.itko.lisa.desensitize.NVPairProvider;
import java.util.Random;

public class MyWeatherIdProvider extends NVPairProvider {

    private final Random random = new Random();

    /**
     * Called by the VSE recorder during the recording phase.
     * Returns the replacement for the matched value ("hit").
     */
    protected String newValue(String hit) {
        // Fixed replacements for two known station codes
        if ("12345".equals(hit)) return "54321";
        if ("24680".equals(hit)) return "08642";
        // Any other code is replaced with a random five-digit string
        return randomNumericString(5);
    }

    /** Builds a string of the given length from random digits. */
    private String randomNumericString(int length) {
        StringBuilder digits = new StringBuilder(length);
        for (int i = 0; i < length; i++) {
            digits.append(getNextNumber());
        }
        return digits.toString();
    }

    /** Returns a single random digit, 0 through 9, as a string. */
    private String getNextNumber() {
        return Integer.toString(random.nextInt(10));
    }
}
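Because the provider only compiles against the desensitize jar, it can be convenient to check the replacement logic on its own first. The following is a minimal sketch under that assumption: WeatherIdLogicCheck is a hypothetical stand-alone class that copies the mapping from MyWeatherIdProvider.newValue and does not use any DevTest classes.

```java
import java.util.Random;

// Hypothetical stand-alone copy of the provider's mapping, for checking outside DevTest.
class WeatherIdLogicCheck {

    private static final Random RANDOM = new Random();

    /** Same mapping as MyWeatherIdProvider.newValue: two fixed codes, otherwise random. */
    static String replacementFor(String hit) {
        if ("12345".equals(hit)) return "54321";
        if ("24680".equals(hit)) return "08642";
        return randomNumericString(5);
    }

    /** Builds a string of the given length from random digits 0 through 9. */
    static String randomNumericString(int length) {
        StringBuilder digits = new StringBuilder(length);
        for (int i = 0; i < length; i++) {
            digits.append(RANDOM.nextInt(10));
        }
        return digits.toString();
    }

    public static void main(String[] args) {
        // The third code is a made-up value that takes the random branch
        for (String id : new String[] {"12345", "24680", "99999"}) {
            System.out.println(id + " -> " + replacementFor(id));
        }
    }
}
```

Note that the random branch does not guarantee a value different from the input, and the same input can produce a different replacement on each call.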

Compile the code and make a jar file:
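Compile the code from the working directory, substituting the version number of the jar you copied. On Windows:
javac -cp ".;desensitize-9.5.1.jar" MyWeatherIdProvider.java
On Linux or macOS, use ":" instead of ";" as the classpath separator.
Create the jar file like this:
jar cf MyWeatherIdProvider.jar MyWeatherIdProvider.class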

Deploy your new de-identifier:
Copy the jar file to the DevTest_Home/lib/patches directory (create the patches directory if needed).
Add your de-identifier information to the DevTest_Home/de-identify.xml file:
<filter name="MyWeatherID" provider="MyWeatherIdProvider">
    <key>WeatherID</key>
    <replacement><![CDATA[Not used in this basic example]]></replacement>
</filter>

where:
filter name = a unique name for the filter
provider = the class name of the provider created above
<key> = the name of the message element that this provider handles
<replacement> = not used in this example. See de-identify.xml for other examples.
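A typo in the filter entry is easy to miss until the Workstation is restarted, so it may be worth confirming that the entry is well-formed XML first. The following is a minimal sketch using the standard Java XML parser; FilterEntryCheck is a hypothetical helper, it checks only well-formedness and the values read back, and it does not reflect how DevTest itself loads the file.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical helper: parses a <filter> entry and reports what was read.
class FilterEntryCheck {

    // The filter entry from this article, as a single string
    static final String FILTER_XML =
            "<filter name=\"MyWeatherID\" provider=\"MyWeatherIdProvider\">"
          + "<key>WeatherID</key>"
          + "<replacement><![CDATA[Not used in this basic example]]></replacement>"
          + "</filter>";

    /**
     * Returns "name|provider|key" for a filter entry that has a <key> child.
     * Throws IllegalArgumentException if the XML cannot be parsed.
     */
    static String summarize(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            Element filter = doc.getDocumentElement();
            String key = filter.getElementsByTagName("key").item(0).getTextContent().trim();
            return filter.getAttribute("name") + "|" + filter.getAttribute("provider") + "|" + key;
        } catch (Exception e) {
            throw new IllegalArgumentException("Filter entry could not be parsed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(summarize(FILTER_XML));
    }
}
```

The provider value read back should match the compiled class name exactly, including case.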

Restart the Workstation:
The DevTest Workstation must be restarted to pick up the new provider.
 
Additional Information:
Not Applicable