Publication Performance
I have a 5-page Publication with over 300 dynamic text elements and 4 Discovers. I work in Higher Education, and this is an academic program profile that looks at 5 years of data for a degree major. Filters consist of College, Department, and Major.
All of the data is coming from 6 Microsoft OLAP Cubes.
Generating one Publication as a PDF takes between 5 and 6 minutes. This is too long.
We have a pretty vanilla installation of Pyramid 2023 on a single server. Server specs: Intel Xeon E5-2670 CPU at 2.60 GHz (12 processors) and 20 GB of RAM, running Windows Server 2022.
Is there anything that can be done to reduce the Publication runtime either from a Pyramid settings perspective or a hardware perspective?
Thanks!
-
Please provide more context on your environment and setup:
- Who else and what else is using this machine when these things are being processed? Are other people using this solution at the same time?
- A single server is not the minimum recommended approach for Pyramid (check the scaling guide in help).
- Where is the database repository? It should not be on the same machine; that can slow things down considerably.
- How many threads are you running concurrently? Too many concurrent threads, on a single machine no less, will slow things down.
- Have you looked at the query times in the transaction log?
- How long are the queries taking to resolve on your cubes?
- Are your cubes performing well in general?
- How many reports are you generating per batch?
- Is the server a VM? If so, is it sharing cores with other VMs, such that you do not truly have 12 cores fully at your disposal?
- The 300 text elements are being driven off the 4 queries:
- If you have very large queries to answer so many figures, you should change them into several smaller queries that do less work. Large queries with irrelevant data impact performance - unless you are using every element to resolve the 300 items.
- Are your 300 items reading 300 independent objects and building them into sentences? You are better off constructing larger, whole sentences with multiple values that resolve from multiple data points, producing fewer DT objects (see the sketch after this list).
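To make that consolidation point concrete, here is a minimal MDX sketch. Every cube, dimension, and measure name in it is a hypothetical placeholder, not something from your model. One query like this can resolve many dynamic text values at once, instead of a single-cell query per value:
```
-- A single query returning several measures across five academic years.
-- All object names ([Student Outcomes], [Academic Year], [Program], and
-- the measures) are hypothetical placeholders for your own cube's names.
SELECT
  { [Measures].[Enrollment],
    [Measures].[Degrees Awarded],
    [Measures].[Retention Rate] } ON COLUMNS,
  { [Academic Year].[Year].&[2019] : [Academic Year].[Year].&[2023] } ON ROWS
FROM [Student Outcomes]
WHERE ( [Program].[Major].&[Biology] )
```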
-
Thanks for the response. My answers to your questions are inline below.
- Who else and what else is using this machine when these things are being processed? Are other people using this solution at the same time?
Currently this is on our development server. We have 4 users who have access to this server. Normally only one person, usually me, is using the server when this publication is being processed.
- A single server is not the minimum recommended approach for Pyramid (check the scaling guide in help).
We have 198 total users and have nowhere near 25% concurrent use in our production environment.
- Where is the database repository? It should not be on the same machine.
The database repository is on a separate server.
- How many threads are you running concurrently?
There are about 4,000 concurrent threads.
- Have you looked at the query times in the transaction log? How long are the queries taking to resolve on your cubes?
It's taking the queries 223,030 ms (about 223 seconds) to run.
- Are your cubes performing well in general?
I am not an SSAS expert, but we do have an outside consultant who monitors the health and performance of our cubes and ensures that they are running optimally.
- How many reports are you generating per batch?
1 report per batch is being run.
- Is the server a VM? If so, is it sharing cores with other VMs?
It is a VM. I will have to do some checking with our IT team on the VM setup.
- The 300 text elements are being driven off the 4 queries:
Let me try to provide a little more context. There are 3 filters (College, Department, Major) on the Publication, but it is the Major filter that is doing most of the heavy lifting. The College filter is only being used to populate a page heading placeholder; the Department filter is used to populate the page heading plus filter 1 Discover and 1 dynamic text element. Major is what filters the rest of the data.
- If you have very large queries to answer so many figures, you should change them into several smaller queries that do less work.
I don't think these are particularly large queries. I am using approximately 32 Discovers to populate the dynamic text. These Discovers are looking at data for a 5-year period for undergraduate students, filtered by Major.
- Are your 300 items reading 300 independent objects and building them into sentences?
I am using Matrix tables with dynamic text populating the cell values. I have attached a screenshot to show you what I am doing.
-
Based on your responses, the most obvious issue is the query times. If the queries are taking 223 seconds, or just under 4 minutes, to run, then Pyramid itself is only consuming about 1 to 2 minutes of your 5-6 minute processing time (roughly 75% queries vs 25% rendering), which is not too bad considering it needs to read, compile, and render a PDF with at least 300 objects in it. This means your cubes are likely the limiting factor. It's worth trying to work out why the queries are so slow.
Your responses are not quite clear on concurrency: 4,000 threads sounds far too high; it should be much lower. You only have 12 CPUs on this machine, so this could be a contributing factor. You're stretching your processing too thin, and it's working against you.
Last comment: using dynamic text to construct your one example grid is awfully cumbersome. If all these figures are coming from one cube, you could easily build the entire grid as a single query; it should respond far quicker, and the results will be easier to display and maintain. You can construct these values using formulas and then build a grid that pulls all the formulas into that single query (see the sketch below). Another approach, if you have licensed it, is to use the Tabulate tool, which is easier than dynamic text for this purpose.
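A sketch of that single-query idea in MDX (again with purely hypothetical cube, dimension, and measure names): the derived figures are defined once as calculated members and pulled into one grid:
```
-- Calculated members fold the derived figures into one query, so one
-- grid replaces many separate dynamic text elements. All names are
-- hypothetical placeholders.
WITH
  MEMBER [Measures].[Avg Class Size] AS
    [Measures].[Seat Count] / [Measures].[Section Count]
  MEMBER [Measures].[Graduation Rate] AS
    [Measures].[Graduates] / [Measures].[Cohort Size], FORMAT_STRING = 'Percent'
SELECT
  { [Measures].[Avg Class Size], [Measures].[Graduation Rate] } ON COLUMNS,
  { [Academic Year].[Year].&[2019] : [Academic Year].[Year].&[2023] } ON ROWS
FROM [Student Outcomes]
WHERE ( [Program].[Major].&[Biology] )
```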
If you cannot resolve some of this, you may have to contact Pyramid support or the CS team to help you find the issue. For comparison, we run reports that are 450-500 pages long, with about 1,000 different queries, visualizations, and dynamic text elements. They run on relational data but render in about 8 minutes each: longer, but the workload is far bigger. And for scale, we run those reports with the task engine on a dedicated server in the cluster.