In the previous post, we have seen that only few smart moves/changes can make cube processing time far less than the current cube processing time. In this post, we will explore the way to identify the performance bottleneck and different ways to resolve it.
In one of my project, cube processing was taking 8 hours where we were running 2 measure groups in parallel. Each measure group were having daily load partitions. Here are the initial stats:
Now when we investigated about load size for each partition, we found that MG1 daily partition has approximately 36-40M rows and MG2 daily partition has approximately 12M rows to process so it was obvious that something is wrong at SSAS application side or at relational database side because 40M rows should be processed in few minutes not in hours.
We can monitor different performance related variable using SQL profiler so we thought let's monitor some of the performance parameters to identify problem space(SSAS/Relational DB).
SSAS or Relational database
For identifying the problem, we need to setup performance counters available under "MSAS 2008: Processing" in PERFMON.
Now you can see that "Rows Read/sec" is ~2K/sec which will be approximately same rate as 40M rows/7-8 Hours. It means that SSAS application is not taking time for writing to partition, it is relational data base read which is consuming most of the processing time. It may be problem with network bandwidth or small packets(fewer rows in one network IO) from relational database.
Upon network investigation, we couldn't find the network issue or any congestion so it was obvious that problem is with relational database. We were using Oracle as source database and we tried to search parameter in connection string so we can increase number of rows in one network trip. After little research, we found Oracle row fetchsize concept. According to Oracle documentation,
"You can set a value of fetchsize in a connection object that affects each statement produced through that connection, and you can override that value in any particular statement object. The default value in a connection object is 10. Prefetching data into the client reduces the number of round trips to the server."
In one of my project, cube processing was taking 8 hours where we were running 2 measure groups in parallel. Each measure group were having daily load partitions. Here are the initial stats:
| Cube Name | Measure Group | Partition | Average Processing time |
| Cube1 | MG1 | Pn | 7-8 hours |
| Cube1 | MG2 | Pn | 4-5 Hours |
Now when we investigated about load size for each partition, we found that MG1 daily partition has approximately 36-40M rows and MG2 daily partition has approximately 12M rows to process so it was obvious that something is wrong at SSAS application side or at relational database side because 40M rows should be processed in few minutes not in hours.
We can monitor different performance related variable using SQL profiler so we thought let's monitor some of the performance parameters to identify problem space(SSAS/Relational DB).
SSAS or Relational database
For identifying the problem, we need to setup performance counters available under "MSAS 2008: Processing" in PERFMON.
Now you can see that "Rows Read/sec" is ~2K/sec which will be approximately same rate as 40M rows/7-8 Hours. It means that SSAS application is not taking time for writing to partition, it is relational data base read which is consuming most of the processing time. It may be problem with network bandwidth or small packets(fewer rows in one network IO) from relational database.
Upon network investigation, we couldn't find the network issue or any congestion so it was obvious that problem is with relational database. We were using Oracle as source database and we tried to search parameter in connection string so we can increase number of rows in one network trip. After little research, we found Oracle row fetchsize concept. According to Oracle documentation,
"You can set a value of fetchsize in a connection object that affects each statement produced through that connection, and you can override that value in any particular statement object. The default value in a connection object is 10. Prefetching data into the client reduces the number of round trips to the server."
So we added this value in connection extended property(FetchSize=496):
It did wonder for us and now speed was ~62K/Sec instead of 2K/sec:
and new cube processing stats were:
| Cube Name | Measure Group | Partition | Average Processing time |
| Cube1 | MG1 | Pn | 12 Minutes |
| Cube1 | MG2 | Pn | 4 Minutes |
It was huge improvement in performance and great relief to our development team.
SSAS 2008 perforamnce guide was great help in understanding performance related concepts and in identifying the problem. Same can be downloaded from below Microsoft link:
Download SSAS 2008 Performance Guide
I noticed that you posted this article on "Friday, March 16, 2012". ???
ReplyDeleteIdentity of the bottleneck and performance of this cube1 and cube2 thanks for great analysis.web hosting review
ReplyDelete