FastDataMasker (FDM): Throws java.lang.OutOfMemoryError

Document ID : KB000005241
Last Modified Date : 01/05/2018
Issue:
FDM may throw an OutOfMemoryError during large masking jobs.
Cause:
Large data sets require more memory as part of the masking job.

Seedlists are always held in memory, so large and/or custom seedlists add directly to the memory requirement.

Data fetched from the table being masked is held in memory. The size of each fetched result set is determined by the FETCHSIZE option.

Masked data is held in memory until it is written back to the database. Masked data is processed in batches, the size of which is determined by the BATCHSIZE option.

If individual rows are large (for example, rows containing CLOB, XML, and/or JSON data), the memory requirements increase significantly.
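
To make the effect of row size concrete, the following is a rough back-of-the-envelope sketch, not a description of FDM internals. It assumes the fetch buffer and the batch buffer each hold whole rows, ignores seedlists and JVM object overhead, and uses made-up average row sizes (200 bytes and 2 MiB). The class and method names are hypothetical.

```java
// Back-of-the-envelope heap estimate for the fetch and batch buffers.
// All figures are assumptions for illustration; this is not how FDM accounts for memory.
final class MaskingMemoryEstimate {

    /** Bytes needed to hold `rows` rows of `avgRowBytes` each (payload only). */
    static long bufferBytes(long rows, long avgRowBytes) {
        return Math.multiplyExact(rows, avgRowBytes);
    }

    /** Fetched rows plus masked rows waiting to be written back. */
    static long estimateBytes(long fetchSize, long batchSize, long avgRowBytes) {
        return Math.addExact(bufferBytes(fetchSize, avgRowBytes),
                             bufferBytes(batchSize, avgRowBytes));
    }

    public static void main(String[] args) {
        long narrowRow = 200;            // assumed: ~200 bytes per row of plain columns
        long clobRow = 2L * 1024 * 1024; // assumed: ~2 MiB per row with a CLOB/XML/JSON column

        System.out.println("10000-row buffers, narrow rows: "
                + estimateBytes(10_000, 10_000, narrowRow) + " bytes");
        System.out.println("10000-row buffers, CLOB rows:   "
                + estimateBytes(10_000, 10_000, clobRow) + " bytes");
        System.out.println("100-row buffers, CLOB rows:     "
                + estimateBytes(100, 100, clobRow) + " bytes");
    }
}
```

Under these assumed sizes, 10,000-row buffers of 2 MiB rows would need roughly 39 GiB, while 100-row buffers of the same rows would need about 400 MiB. Actual usage will be higher because of object overhead and seedlists.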
Resolution:

Resolution 1 - Adjust FDM Options

Setting the following options to low values (100 in this example) keeps less data in memory and commits to the database more frequently:

FETCHSIZE=100
BATCHSIZE=100
COMMIT=100

Lowering the number of rows per commit will help avoid some issues with transaction logs for large jobs.

Note that this will not work for Netezza targets, as that JDBC driver works differently than most others. For Netezza, increasing the memory for the JVM is the only viable solution.
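
The idea behind a smaller batch size can be illustrated with a small simulation. This is only a sketch: it does not use FDM or JDBC, the class and method names are hypothetical, and "flush" stands in for sending a batch of updates to the database. It shows that the peak number of rows held in the buffer never exceeds the batch size, at the cost of more round trips.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: simulates buffering masked rows and flushing every batchSize rows.
final class BatchFlushSketch {

    /** Result of a simulated run: number of flushes and the largest buffer seen. */
    static final class Stats {
        final int flushes;
        final int peakBufferedRows;

        Stats(int flushes, int peakBufferedRows) {
            this.flushes = flushes;
            this.peakBufferedRows = peakBufferedRows;
        }
    }

    static Stats run(int totalRows, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<String> buffer = new ArrayList<>();
        int flushes = 0;
        int peak = 0;
        for (int row = 0; row < totalRows; row++) {
            buffer.add("masked-row-" + row);   // masked value waiting to be written
            peak = Math.max(peak, buffer.size());
            if (buffer.size() == batchSize) {  // buffer full: write it out and release it
                buffer.clear();
                flushes++;
            }
        }
        if (!buffer.isEmpty()) {               // final partial batch
            buffer.clear();
            flushes++;
        }
        return new Stats(flushes, peak);
    }

    public static void main(String[] args) {
        Stats small = run(1_000, 100);
        Stats large = run(1_000, 1_000);
        System.out.println("batch 100:  flushes=" + small.flushes + " peak=" + small.peakBufferedRows);
        System.out.println("batch 1000: flushes=" + large.flushes + " peak=" + large.peakBufferedRows);
    }
}
```

For 1,000 rows, a batch size of 100 gives 10 flushes with at most 100 rows buffered, while a batch size of 1,000 gives a single flush with all 1,000 rows buffered at once.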
 

Resolution 2 - Increase Memory for the JVM - GUI

When running jobs directly from the FDM GUI, adjust the Start/Max memory parameters before saving and running the mask.
 

Resolution 3 - Increase Memory for the JVM - Batch

Providing additional memory to the FDM Java process should allow the job to succeed.

Try increased values such as -Xms1000M and -Xmx3000M (initial and maximum heap size, respectively). If these do not work for your specific configuration, increase them further as the memory of your machine allows.

The following example snippet shows the contents of a masking batch file with the default memory values.

java -Djava.util.logging.config.file="C:/Program Files/Grid-Tools/FastDataMasker/logging.properties" -Xms100M -Xmx1000M -jar "C:/Program Files/Grid-Tools/FastDataMasker/Fastdatamasker.jar" "C:/Program Files/Grid-Tools/FastDataMasker/doc/connectSQLSERVER.txt" "C:\Users\<username>\AppData\Roaming\Grid-Tools\Fastdatamasker\MyMask.csv" "C:\Users\<username>\AppData\Roaming\Grid-Tools\Fastdatamasker\MyMask_options.txt"


To increase the memory in the above example, change the -Xms and -Xmx values as follows:

java -Djava.util.logging.config.file="C:/Program Files/Grid-Tools/FastDataMasker/logging.properties" -Xms1000M -Xmx3000M -jar "C:/Program Files/Grid-Tools/FastDataMasker/Fastdatamasker.jar" "C:/Program Files/Grid-Tools/FastDataMasker/doc/connectSQLSERVER.txt" "C:\Users\<username>\AppData\Roaming\Grid-Tools\Fastdatamasker\MyMask.csv" "C:\Users\<username>\AppData\Roaming\Grid-Tools\Fastdatamasker\MyMask_options.txt"


Where:

  • C:/Program Files/Grid-Tools/FastDataMasker/Fastdatamasker.jar shows the location of the Fast Data Masker .jar file.
  • C:/Program Files/Grid-Tools/FastDataMasker/doc/connectSQLSERVER.txt shows the location of the Microsoft SQL Server connection file.
  • C:\Users\<username>\AppData\Roaming\Grid-Tools\Fastdatamasker\MyMask.csv shows the location of the file that contains masking information.
  • C:\Users\<username>\AppData\Roaming\Grid-Tools\Fastdatamasker\MyMask_options.txt shows the location of the file that contains the applied options information.

Update the file locations to match your environment, or modify the batch file generated by the FDM UI.
Additional Information:

JRE Defaults

By default, the Java Runtime Environment (JRE) sets the initial (-Xms) and maximum (-Xmx) heap sizes based on the total physical memory of the system. For a server-based OS, this is generally 1/64 of physical memory for the initial size and 1/4 of physical memory for the maximum. On a system with 16 GB of RAM, this equates to 256 MB and 4 GB respectively. FDM allows users to specify custom values in the UI or to modify the batch file as described above.
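
The arithmetic behind these fractions can be checked with a short sketch. It only reproduces the 1/64 and 1/4 rule of thumb quoted above; the actual defaults depend on the JVM version, platform, and flags. The class name is hypothetical.

```java
// Reproduces the "1/64 initial, 1/4 maximum" rule of thumb for default heap sizes.
// This is plain arithmetic, not a query of what a particular JVM will actually choose.
final class DefaultHeapRule {

    static long initialHeapBytes(long physicalBytes) {
        return physicalBytes / 64;
    }

    static long maxHeapBytes(long physicalBytes) {
        return physicalBytes / 4;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024;
        long physical = 16L * 1024 * mb; // 16 GB of RAM, as in the example above
        System.out.println("Initial: " + initialHeapBytes(physical) / mb + " MB"); // 256 MB
        System.out.println("Maximum: " + maxHeapBytes(physical) / mb + " MB");     // 4096 MB
    }
}
```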
 

JRE Memory Validation

You can validate the initial and maximum heap sizes using the "jinfo" command, which is shipped with the Java Development Kit (JDK). This only works while the Java process in question is running (which should not be a problem for larger/longer FDM jobs).

jinfo -flag InitialHeapSize [PID]

jinfo -flag MaxHeapSize [PID]
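
Where the JDK tools are not available, a small standalone program can show how a given JRE interprets the -Xms and -Xmx flags. This is a sketch with a hypothetical class name; it reports the heap limits of the JVM it runs in, not those of an already running FDM process, so it should be started with the same java executable and the same memory flags as the masking job.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

// Prints the heap limits of the JVM this class runs in.
// Example: java -Xms1000M -Xmx3000M HeapCheck
final class HeapCheck {

    static long toMegabytes(long bytes) {
        return bytes / (1024L * 1024L);
    }

    public static void main(String[] args) {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        // getInit() and getMax() return -1 when the value is undefined; that prints as 0 MB here.
        System.out.println("Initial heap (MB):      " + toMegabytes(heap.getInit()));
        System.out.println("Max heap (MB):          " + toMegabytes(heap.getMax()));
        System.out.println("Runtime.maxMemory (MB): " + toMegabytes(Runtime.getRuntime().maxMemory()));
    }
}
```

The reported maximum can be slightly lower than the -Xmx value, because some collectors exclude part of the heap from the usable total.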

Note: The jinfo command works best when the versions of the JDK and JRE are properly matched. You can confirm the java version used by TDM Portal by running the following command:

"C:\Program Files\CA\CA Test Data Manager Portal\jre\bin\java.exe" -version

 

Example errors thrown for different databases

Microsoft SQL

Exception in thread "pool-2-thread-1" java.lang.OutOfMemoryError: Java heap space
    at java.util.Arrays.copyOfRange(Unknown Source)
    at com.sun.crypto.provider.CipherCore.update(CipherCore.java:737)
    at com.sun.crypto.provider.AESCipher.engineUpdate(AESCipher.java:410)
    at javax.crypto.Cipher.update(Cipher.java:1945)
    at sun.security.ssl.CipherBox.decrypt(Unknown Source)
    at sun.security.ssl.InputRecord.decrypt(Unknown Source)
    at sun.security.ssl.SSLSocketImpl.readRecord(Unknown Source)
    at sun.security.ssl.SSLSocketImpl.readDataRecord(Unknown Source)
    at sun.security.ssl.AppInputStream.read(Unknown Source)
    at com.microsoft.sqlserver.jdbc.TDSChannel.read(IOBuffer.java:1782)
    at com.microsoft.sqlserver.jdbc.TDSReader.readPacket(IOBuffer.java:4838)
    at com.microsoft.sqlserver.jdbc.TDSCommand.detach(IOBuffer.java:5762)
    at com.microsoft.sqlserver.jdbc.SQLServerConnection.executeCommand(SQLServerConnection.java:1704)
    at com.microsoft.sqlserver.jdbc.SQLServerStatement.executeCommand(SQLServerStatement.java:180)
    at com.microsoft.sqlserver.jdbc.SQLServerStatement.executeStatement(SQLServerStatement.java:155)
    at com.microsoft.sqlserver.jdbc.SQLServerStatement.executeQuery(SQLServerStatement.java:616)
    at com.grid_tools.products.datamasker.DBUtils.gettablecols(DBUtils.java:1955)
    at com.grid_tools.products.datamasker.SQLUpdate.columnDataType(SQLUpdate.java:6256)
    at com.grid_tools.products.datamasker.SQLUpdate.batchPreparedMasking(SQLUpdate.java:3374)
    at com.grid_tools.products.datamasker.SQLUpdate.run(SQLUpdate.java:6402)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.lang.Thread.run(Unknown Source)

PostgreSQL

Exception in thread "pool-1-thread-1" java.lang.OutOfMemoryError: GC overhead limit exceeded
    at org.postgresql.util.GT.<init>(GT.java:45)
    at org.postgresql.util.GT.<clinit>(GT.java:24)
    at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:1857)
    at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:255)
    at org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:555)
    at org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:417)
    at org.postgresql.jdbc2.AbstractJdbc2Statement.executeQuery(AbstractJdbc2Statement.java:302)
    at com.grid_tools.products.datamasker.SQLUpdate.batchPreparedMasking(SQLUpdate.java:2630)
    at com.grid_tools.products.datamasker.SQLUpdate.run(SQLUpdate.java:6000)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.lang.Thread.run(Unknown Source)

Netezza

Exception in thread "pool-1-thread-1" java.lang.OutOfMemoryError: Java heap space
    at java.lang.StringCoding$StringDecoder.decode(Unknown Source)
    at java.lang.StringCoding.decode(Unknown Source)
    at java.lang.String.<init>(Unknown Source)
    at org.netezza.core.Encoding.decode(Encoding.java:57)
    at org.netezza.core.Encoding.decode(Encoding.java:77)
    at org.netezza.sql.NzResultSet.getDbosString(NzResultSet.java:3451)
    at org.netezza.sql.NzResultSet.getDbosObject(NzResultSet.java:4082)
    at org.netezza.sql.NzResultSet.getObject(NzResultSet.java:1205)
    at com.grid_tools.products.datamasker.SQLUpdate.batchPreparedMasking(SQLUpdate.java:3443)
    at com.grid_tools.products.datamasker.SQLUpdate.run(SQLUpdate.java:6645)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.lang.Thread.run(Unknown Source)