Technical Questions on LLM/GenAI Architecture and Data Flow
Hello PA Team,
I'm looking for a deeper technical understanding of how the GenAI/LLM integration works, specifically regarding data flow and security. I've seen the general documentation, but I have a few more detailed architectural questions.
Could you please provide clarification on the following points:
What data is sent? When a GenAI feature is used, what exactly is included in the payload sent to the external LLM provider? For example, does it send metadata like table/column names, the query structure, or aggregated/sample data from the visual?
Where is it sent? What are the specific endpoints (domains/IPs) that the Pyramid server communicates with for these features? This is important for firewall configurations.
How is it sent? What protocol and port are used for this communication (e.g., HTTPS on port 443)?
Is there any logging? Can I audit or log the prompts sent to the LLM and the responses received back from it? Having a full trail of the entire "conversation" would be very helpful for monitoring and troubleshooting.
A more technical description of this process would be extremely valuable for understanding the security and data governance aspects of using this feature in an enterprise environment.
Thanks!
5 replies
-
Hi,
I found some information regarding LLM Audit/logging for you to use.
Essentially, if you navigate to the admin console and select Logs, you'll find the log settings; scroll down and you'll see "Detailed LLM Logging". Here you can enable extra LLM data to be captured and adjust the related settings, such as how long you want the data to be retained, and there is a "Purge Now" option to clear the logged data immediately. The only catch is that this puts a heavier load on the database and should only be used for diagnostic purposes.
Also, keep in mind that logging can slow down the application.
Here is a link for your reference with more details on this subject:
Log Settings
-
Happy to give you my thoughts, as I understand it.
What data is sent? When a GenAI feature is used, what exactly is included in the payload sent to the external LLM provider? For example, does it send metadata like table/column names, the query structure, or aggregated/sample data from the visual?
When using the GenAI functions (the chat interface), no actual data is sent to the LLM. Context is provided to the LLM with placeholder/sample data of the same type; the response is then interpreted by Pyramid, which inserts your data and produces the graphs.
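To make that concrete, here is a rough sketch of the idea (my own illustration; Pyramid's actual payload format is internal, and all the names and values below are invented):

```python
# Hypothetical illustration only -- Pyramid's real payload format is internal.
# The point: the LLM sees structure (names, types, sample-shaped values),
# never the actual rows from your data source.

payload_sent_to_llm = {
    "model_metadata": {
        "tables": ["Sales", "Customers"],  # table names from the semantic model
        "columns": {
            "Sales": ["OrderDate", "Revenue"],
            "Customers": ["Region", "Segment"],
        },
    },
    # Placeholders of the same type as the real data, not real values:
    "sample_values": {"Revenue": 1234.56, "Region": "North"},
    "user_question": "Show revenue by region for last quarter",
}

# What is NOT sent in the chat flow: the actual query result rows.
# Pyramid runs the LLM's returned "recipe" locally and fills in real data itself.
print(payload_sent_to_llm)
```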
Where is it sent? What are the specific endpoints (domains/IPs) that the Pyramid server communicates with for these features? This is important for firewall configurations.
Depends on the LLM you have used, but as a general rule, when you create your LLM API key you will be given an endpoint and a key; if you use Azure you will also have a model name to consider. You will need to create a rule for that specific endpoint for your chosen LLM provider.
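As a quick sanity check for the firewall rule, a minimal sketch like this can confirm outbound HTTPS is open to the endpoint (the hostname below is just an example; substitute the endpoint you were given with your API key):

```python
# Minimal sketch: verify the firewall allows outbound HTTPS (port 443)
# to your LLM endpoint. Replace the hostname with your provider's endpoint.
import socket
import ssl

host = "api.openai.com"  # example only; use your own provider's endpoint
context = ssl.create_default_context()

with socket.create_connection((host, 443), timeout=5) as sock:
    with context.wrap_socket(sock, server_hostname=host) as tls:
        print(f"TLS handshake OK with {host}, negotiated {tls.version()}")
```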
How is it sent? What protocol and port are used for this communication (e.g., HTTPS on port 443)?
HTTPS on port 443; it's a generic REST API call.
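For reference, the general shape of such a call looks like the sketch below (an OpenAI-style chat completion; the endpoint, model name, and key are examples of a generic provider call, not Pyramid's internal payload):

```python
# Sketch of a generic REST call to an LLM provider over HTTPS/443.
# Endpoint, model, and key are placeholders -- use your provider's values.
import json
import urllib.request

url = "https://api.openai.com/v1/chat/completions"  # port 443 implied by https
body = json.dumps({
    "model": "gpt-4o-mini",
    "messages": [{"role": "user", "content": "ping"}],
}).encode()

req = urllib.request.Request(
    url,
    data=body,
    headers={
        "Authorization": "Bearer YOUR_API_KEY",  # key issued by your provider
        "Content-Type": "application/json",
    },
)
with urllib.request.urlopen(req, timeout=30) as resp:
    print(json.load(resp))
```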
Is there any logging? Can I audit or log the prompts sent to the LLM and the responses received back from it? Having a full trail of the entire "conversation" would be very helpful for monitoring and troubleshooting.
Yes, I can view the prompts sent by users; it's all in the repository and may require a model to be built. Our Pyramid onboarding team has been fantastic and helped us set this up. That being said, there is a new LLM Logging option in 2025. I don't have this turned on yet, but I would hope it appears in the transaction log.
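For illustration only, pulling the logged conversations out of the repository might look something like the sketch below. Every table and column name here is an invented placeholder; the real repository schema is Pyramid-internal, so check with support/onboarding before building anything on it:

```python
# Hypothetical sketch -- the repository schema is Pyramid-internal and the
# table/column names below are INVENTED placeholders for illustration.
import sqlite3  # substitute your repository's actual driver (e.g. a PostgreSQL client)

conn = sqlite3.connect("pyramid_repository.db")  # placeholder connection
rows = conn.execute(
    """
    SELECT event_time, user_name, prompt_text, response_text  -- invented columns
    FROM llm_conversation_log                                 -- invented table
    ORDER BY event_time DESC
    LIMIT 50
    """
).fetchall()
for row in rows:
    print(row)
```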
Just one final point: this is all related to the GenAI Chat function. If you use the GenAI Model nodes then some of the above is not true; the model nodes will send data to the LLM, and logging is not possible. I throw tens of thousands of records at the LLM with the same prompt and then pass the result back to my data warehouse. I would like it if we could add some dynamic prompt function.
Please upvote these requests :)
https://community.pyramidanalytics.com/t/35y12q8/dynamic-prompt-options-for-llm-general-node
https://community.pyramidanalytics.com/t/g9y1239/llm-general-node-structured-output-mapping
-
Hi All,
See my response inline below:
What data is sent? When a GenAI feature is used, what exactly is included in the payload sent to the external LLM provider? For example, does it send metadata like table/column names, the query structure, or aggregated/sample data from the visual?
Pyramid does not send any user data to the LLM when a query is initiated. It does, however, send the metadata of the Semantic Model to the LLM so that the LLM has some understanding of the data structure. This is combined with a very sophisticated engineered prompt which ensures the LLM returns a "recipe" to Pyramid describing what queries etc. need to be run against the data.
If you have enabled the "Slide Insights" option in Present, then the query results WILL be sent to the LLM for interpretation and output in natural language.
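To make the "recipe" idea concrete, here is a conceptual sketch (my own illustration; the actual schema Pyramid uses is internal, and the field names below are invented):

```python
# Conceptual sketch only -- the actual "recipe" format is Pyramid-internal.
# The LLM returns a structured description of WHAT to query; Pyramid then
# executes it against the data source, so raw data never leaves the server
# in the standard chat flow.

recipe_from_llm = {
    "measure": "Revenue",
    "group_by": ["Region"],
    "filter": {"column": "OrderDate", "operator": "last_n_quarters", "value": 1},
    "visual": "bar_chart",
}

# Pyramid (not the LLM) turns this into a real query and renders the result.
print(recipe_from_llm)
```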
Where is it sent? What are the specific endpoints (domains/IPs) that the Pyramid server communicates with for these features? This is important for firewall configurations.
We currently support OpenAI, AzureAI, Gemini and Mistral LLMs. You can find their URL endpoints with a bit of judicious Googling.
How is it sent? What protocol and port are used for this communication (e.g., HTTPS on port 443)?
Generally, yes. See the relevant documentation for details, for example https://platform.openai.com/docs/api-reference/introduction
Is there any logging? Can I audit or log the prompts sent to the LLM and the responses received back from it? Having a full trail of the entire "conversation" would be very helpful for monitoring and troubleshooting.
Yes, see the response above.
There are also usage statistics available that monitor the number of tokens exchanged between Pyramid and the LLM. Most LLM providers charge per token, so this can be used to cross-charge the service or simply as an indication of usage and costs. A rough cost calculation is sketched below.
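As a rough illustration, estimating spend from those token counts might look like this (the token counts and per-token prices are placeholders; check your provider's current rate card):

```python
# Back-of-the-envelope cost estimate from token usage statistics.
# All numbers below are placeholders -- substitute your own figures.
input_tokens = 1_200_000   # e.g. from Pyramid's usage statistics
output_tokens = 300_000

price_per_1k_input = 0.0005   # USD per 1k input tokens, placeholder rate
price_per_1k_output = 0.0015  # USD per 1k output tokens, placeholder rate

cost = (input_tokens / 1000) * price_per_1k_input \
     + (output_tokens / 1000) * price_per_1k_output
print(f"Estimated spend: ${cost:.2f}")
```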
Hope that helps!
Ian