TDM Portal tile taking very long time to complete compared to what it use to take. Why?

Document ID : KB000015606
Last Modified Date : 14/02/2018
Show Technical Document Details

The very basic elements of this environment are two database tables.  The complication is that one is a DB2 table on a Mainframe, and the other is Oracle table.  If they were both Oracle tables, we could simply do a join to get to our target rows and columns.  Due to company policy, we cannot simply extract and hold in another repository the data.  Previous publishes use to complete in 8 minutes, but now it is taking 40 minutes.  Why?


How can we determine where the bottleneck is in this extract?  How can we optimize this so we can reduce our time to delivery?  

DB2 OracleTDM 4.2

The Best Possible Solution would be to create a join between the DB2 data and the Oracle table data. However, due to the disparity between the two databases, this is not possible.

The Best Practice Solution requires copying over the DB2 data into a reusable Oracle table, performing a join between the two Oracle tables, and then generating the required output. This solution by-passes the constant SQL queries that are performed against the DB2 and Oracle databases.
However, the Best Practice Solution may not be possible due to company database and security policies.

Therefore, concentrating on performance will expose the Root Cause for the performance issue which could be one or more of the following:

  • A full table scan due to a missing index.
  • The sql is returning all the rows from a very large table.
  • A problem with network latency or the database itself.

The first step in discovering Root Cause is breaking down the SQL into smaller pieces and then build the SQL back up again to the point of failure. 

Debugging Tips:

  • If you do specific column value retrievals, time each column retrieval.  Use that data against the questions that follow.
  • Perform SQL counts against returned records.  Does one return an entire table row count?  If so, you have an unbound SQL call.  The remedy to this would be to use a WHERE clause to limit the rows returned.
  • Create a series of SQL calls from the SQL console or in GTDatamaker from a SQL window. Execute each SQL query separately, note the time to complete, and see if there is a pattern that points to the issue.  If no pattern, all SQL queries seem to be the same long time, this points to latency.  If one is much longer than the rest, continue with the following tips.
  • Does it take a long time for any specific column or columns?  This could be a couple of things.  Top of the list might be an index needs to be added to that column(s).
  • If so, is it related to either specific database?  This could be a matter of latency if DB2.  What is the round trip in ms?  Multiply that times the number of calls to that database - is this where the time is accumulating?
  • If these do not yield anything obvious, then it is back out to original SQL.  The current number of rows is 2500.  By plotting the total time for 100 rows, 500 rows, 1000 rows, 1500 rows, and 2000 rows, does the ratio of number of rows to time taken produce an incremental straight line or is there a jump somewhere?
  • If none of these give a definitive answer, then you can ask the DBA to run some traces on the DB to see where the bottleneck is.  Perhaps there are error messages about not enough disk space, out of memory - doing garbage collection, or perhaps the database is expanding and taking time doing that for additional table space.
  • If the DBA gives the database a clean bill of health, then ask your Network Administrator to put on a line analyzer and see where the time is being spent.  It very well could be a leg of the network is under performing and you are going through that bad router perhaps or some other network appliance that is throttling the throughput.

Based on your data, you should be able to resolve the issue.  In the case that generated this Knowledge Document, the query returned all the table contents.