12

Pyramid Usage Model and Analysis

Using Pyramid to analyze itself.

The following document describes how to build a data model using Pyramid’s database repository as a data source to be able to analyze content usage and performance in Pyramid itself. 


The resulting model will give you insights into: 

  1. WHO: Query transactions executed by users and tenants
  2. WHEN: Query transactions by time - down to the hour and minute level
  3. WHAT: Query transactions by data source server and type, database and model, with further details related to content items, the main content folders, specific hierarchies and measures used in each query (from the “elements” hierarchies)
  4. WHERE: Query transactions by processing server
  5. WHICH: Query transactions by which types of content 

And of course, any combination of the above. 

Setup

  1. Flow:
    1. Start a new advanced data model in Pyramid. Add a relevant data source (PostgreSQL, MS SQL Server, Oracle), depending on which type of repository you have elected to deploy
      1. From the database listing in the property cards, select the Pyramid database repository
      2. From the table selection, choose three views
        • contentview*
        • modelview**
        • transactionview
          *Available in 2020.24 and later versions
          **Use the server_log_transaction_columns table instead for versions older than 2020.24
      3. Click Add Selected Nodes, to add these views to the flow
    2. Add a new Convert Columns node after the transactionview select node in the flow from the Column Operations side menu
      1. Change the date column to type “date”, with format yyyy-mm-dd
    3. Add a new In-memory target database node and provide a name for the database (“Pyramid Usage”)
  2. Model:
    1. Change the default measure to “Total Time” on the configuration tab
    2. On the Tables tab:
      1. Rename the modelview to “Elements”
      2. Rename the transactionview table to “Events”
      3. Make sure the Elements table is inner joined to the Events table using the “transaction_id” column
      4. Make sure the contentview table is inner joined to the Events table using the “item_id” column
      5. Change the following measures in the Events table to “Average” aggregations: column_count, row_count, cell_count, connection_time, post_query_time, pre_query_time, other_time, query_time, total_time
      6. Set client as a non-measure and select it
      7. Right-click on transaction_id column and add a count measure
      8. Optional: Right-click on user_name and add a distinct count measure (to count unique users)
      9. Under contentview, set item_type and sub_type as non-measures and select them 
    3. On the Hierarchy tab:
      1. Add a time hierarchy based on Events with year, quarter-year, month-year, date, hour ,minute
      2. Add a content tree Parent-child hierarchy with Orphans Handling set to Roots and set the following Attribute Selections:
        Child Key – item_id
        Parent Key – folder_id
        Caption - name 
  3. Save and then Process your model.  

The resulting model will take a snapshot of your transactional activity in Pyramid for further analysis.
Set up a schedule on this model to reprocess the database once a day, typically off-peak.  

Metrics

Response Size Metrics

  • Column Count: The number of RAW columns in the result set
  • Row Count: The number of RAW rows in the result set
  • Cell count = columns x rows

The RAW result set is not always indicative of the drawn result in the visual.
For instance, matrix grids can be much larger drawn than their underlying RAW results.

Time Metrics

  •    Pre-query: Pyramid engine time before the query is submitted.
  •    Post-query: Pyramid engine time after the query is submitted.
  •    Other: any time spent on scripting logic and dynamic functions as part of the query
  • Server: pre-query + post-query + other
  •    Connection: the time taken to connect to the data source
  •    Query: the time data source takes to respond to the query
  • Query Engine: Connection + Queries and Calculations
  • Total: server + query engine + network + client

The time aggregates may be slightly off because:

  • the measurements are in millisecond integers - so rounding is ignored.
  • not every time element is always fully additive - because some items may run in parallel to each other (like 'connection' and 'other' times)

Using these metrics, you should be able to spot performance issues in your query designs vs data source performance vs Pyramid's processing performance.

  

12 replies

null
    • Uros_Zdesar
    • 5 yrs ago
    • Reported - view

    Hi Itamar,

     

    This is very useful feature. But I have one question regarding this. The view is returning only data for last 4 months. Is there any way we can extend this period?

     

    Thanks,

    Uroš

    • Itamar_Birenzweig
    • 5 yrs ago
    • Reported - view

    Hi Uroš,
     

    The Usage Model is based on the transactional data that is stored in the Pyramid database repository.
    This data is stored in the db for a limited time which Administrators can change if needed.

    To configure Pyramid to keep the data for a longer period of time, you can change the "Keep Transactions Logs For These Days" setting in the Admin Console > Settings > Logging.

    Before making this change, please make sure the database is capable of storing that much information.


     

    • "making the sophisticated simple"
    • AviPerez
    • 5 yrs ago
    • Reported - view

    For explanations on the model metrics see this posting.

    https://community.pyramidanalytics.com/t/184nyf/pyramid-usage-model

    • Shyam_Pendyala.1
    • 3 yrs ago
    • Reported - view

    Hi Itamar, 

    Is there a way to find exactly how many users are currently logged into Pyramid Application?

    Thanks,
    Shyam Pendyala.

      • "making the sophisticated simple"
      • AviPerez
      • 3 yrs ago
      • Reported - view

      Shyam Pendyala Pyramid, like all web apps, is stateless. That means the client is disconnected from the server (even though it might not appear to be). Its therefore quite difficult to get an accurate read on this without doing constant polling of all clients - an expensive function, using bandwidth.

    • Lead Consultant Advanced Analytics
    • Michael_Daun
    • 1 yr ago
    • Reported - view

    Oh, this seems to be be very useful...
    Does it work for version 2020.27 as well?

    In our ORACLE repository DB I found two databases which sound reasonable - PYRAMID_APPLICATION (no views, only tables) and PYRAMID_ADMIN (empty). But I could not find especially the mentioned views...

    THANKS anyway for the inspiration
    Michael

    • David_Gordon
    • 1 yr ago
    • Reported - view

    Hi Michael,

    You should be able to see the views if you have selected the repository database. Please open a ticket for support to assist you with this

      • Lead Consultant Advanced Analytics
      • Michael_Daun
      • 1 yr ago
      • Reported - view

      David Gordon, thank you!

      There was an issue with ORACLE repository databases. The support team helped very quickly and provided the appropriate scripts. As of version 2020.27.124, the problem has been solved anyway.
       

    • Lead Consultant Advanced Analytics
    • Michael_Daun
    • 1 yr ago
    • Reported - view

    Now that it works on our server, I like to say it again: Very useful showcase!

    But I still have issues with the TIME fragments. In the SQL code of the views I find the following:

    ...
    trans.pre_query_time,
    trans.post_query_time,
    trans.r_script_time + trans.dynamic_function_time as other_time,
    trans.connection_time,
    trans.query_time,
    trans.engine_time,
    trans.post_query_time+trans.engine_time+trans.r_script_time + trans.dynamic_function_time as total_time,
    ...
    

    Together with Itamar's initial explanations regarding the "Time Metrics" I think the first three columns sum up to something we could call "server_time". The sum "server_time" is not part of the view but it could easily be added, right?

    Then, I understand that "engine_time" is more than the sum of "connection_time" and "query_time", due to the calculations which most probably cannot be extracted from the repository database.

    But, why is "pre_query_time" not part of "total_time"? I understand that the time consumption for network and client is hard to retrieve, that is okay - but as Itamar explained, besides "engine_time", the entire "server_time" (= the possibly additional column) should belong to it!

    And, finally, when I look at the concrete times on my server, only "post_query_time", "other_time" and "engine_time" add up to "total_time". And for some rows the sum of "connection_time" and "query_time" is much greater than "engine_time"! How can that be?

      • imbarmarinescubar_pyram
      • 1 yr ago
      • Reported - view

      Hi Michael Daun ,

      The server time is the entire time that the server handled the request minus the Query Time. That might include queueing time, for example, if the server are receiving more requests than what the services are configured to handle in parallel, or for very big requests there might be an overhead time of creating the payload for the server to send back to the client (browser), and these are not captured by any specific column.
      For the most part, you can assume that Pre+Post+Other+(Engine-Query) is a close figure to the Server-Time displayed in the admin, but not an exact match.

      The Engine Time is the time that the query was executed including internal processing by pyramid where as the Query Time is only data source related time consumption (connecting, executing, retrieving the data).

      Connection time is an average time of all the connections that were generated in the request. For simple requests it would probably be an average of 1 item, but there might be multiple. (You can see in the admin view that it's labeled as "average connection").
      The Query Time is a smart accumulation of all the queries executed against the data source, with consideration of parallel queries. The connection times are nested inside this figure, so Connection Time should not be added to Query Time.

      Regarding Pre-Query,  as you commented, it should probably be added to the Total Time column, we will take a look into that.

      Hope this covered your questions, please let me know if I've missed anything.

      Imbar.
       

      • Lead Consultant Advanced Analytics
      • Michael_Daun
      • 1 yr ago
      • Reported - view

      imbar, THANK you very much!

    • Michael_Raam.1
    • 1 yr ago
    • Reported - view

    Hi,

    We are using the usage model and it is very useful.

    We started using the mobile devices for Pyramid in addition to the desktops, how is it possible to see the usage driven by mobile vs the desktop ?

    Tnx,

    Michael

Content aside

  • 12 Likes
  • 1 yr agoLast active
  • 12Replies
  • 1562Views
  • 11 Following