Hybrid Data Connectivity via Pulse
With more cloud options and distributed servers, many organizations now have a BI server in the cloud, remote data sources, and end users in different locations, potentially on different continents. This presents a dilemma: either move or copy the data to the cloud for analysis, or deploy the BI solution on-premises.
Both options have shortcomings. The first approach entails duplicating data and potentially sacrificing on-premises investments. The second approach, staying on-premises, may curtail the scalability and flexibility offered by the cloud. Hence the need for a “hybrid” strategy that delivers the best of both worlds: BI deployed in the cloud, with data and infrastructure remaining on-premises.
The challenge is to connect to the remote data with simplicity, without compromising speed and security or creating an administrative nightmare when managing the various data sources in different hosting frameworks (which is potentially exacerbated by multi-tenant access issues).
To address this challenge, Pyramid’s Pulse presents remote data sources to users like any local data source and offers exceptional speed and security. Configuring Pulse and connecting data is simple, and all data sources (remote or otherwise) can be centrally managed and governed from the main administrative tools—with full multi-tenant support.
The problem
Companies may implement BI in the cloud for easier maintenance, reduced capital expenses, and flexibility. But if they don’t move the underlying data with it, they then typically experience speed and security issues when running against remote data sources.
When connecting directly to remote data via VPN, traffic is rerouted across the network, usually making access slow. Transferring the data from the data center to the cloud is often not viable because it adds considerable overhead for both data transfer and storage. Worse still, it duplicates the data, which runs counter to the single-source-of-truth paradigm.
A few third-party BI vendors offer a remote data access solution, but they come with limitations. Microsoft’s “Gateway,” for example, doesn’t connect to all data sources supported in Power BI. SAP’s “Agent” can only connect to a limited set of data sources—and it’s highly complicated to set up and configure.
Other tools tend to run the queries inefficiently over the “long wire” (i.e., not at the gateway level), resulting in fatter payloads and increased response times. With these types of hurdles, most BI solutions inadvertently encourage users to export their raw data out of remote data servers and reimport it into their cloud-hosted native data stacks to get better analytical functionality. This breaches data governance and security protocols, duplicates analytic layers, decentralizes the “truth,” and ultimately deleverages the investments made in the remote data sources.
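To make the “fatter payloads” point concrete, here is a minimal Python sketch comparing how many bytes cross the wire when raw rows are shipped to the cloud for aggregation versus when the aggregation runs next to the data. The sample rows, the `aggregate` function, and the JSON encoding are all illustrative assumptions; this is not how any particular vendor's gateway works.

```python
import json
from collections import defaultdict

# Hypothetical sample data: 1,000 sales rows for each of three regions.
rows = [{"region": r, "sales": s} for r in ("EU", "US", "APAC") for s in range(1000)]


def aggregate(records):
    """Sum sales per region (the work a BI query would typically ask for)."""
    totals = defaultdict(int)
    for record in records:
        totals[record["region"]] += record["sales"]
    return dict(totals)


def payload_bytes(obj):
    """Size of the object when serialized as UTF-8 JSON for transfer."""
    return len(json.dumps(obj).encode("utf-8"))


# Option A: ship every raw row over the long wire, aggregate in the cloud.
raw_payload = payload_bytes(rows)

# Option B: aggregate at the gateway, ship only the small result.
pushed_payload = payload_bytes(aggregate(rows))

print(f"raw rows: {raw_payload} bytes, aggregated result: {pushed_payload} bytes")
```

Both options produce the same totals; only the amount of data sent across the network differs.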
Pyramid’s solution
Pulse is a stand-alone application designed to be deployed into disparate networks, separate from the main Pyramid deployment. It allows connectivity to remote data sources without using a VPN or other network security protocols, while still providing full data security. From within the main Pyramid application, the remote data sources are accessible like local data sources without the complexities of setting up reverse agents.
Pulse is quick and easy to set up (2-3 minutes), and its multi-tenancy capabilities allow different tenants to have their own deployments in a shared Pyramid instance. Further, multiple instances of Pulse in an environment offer a high-throughput solution. Lastly, Pulse supports both direct querying of a data source and data streaming into data mash-ups and modeling applications, without degradation of function, feature, or performance.
Business case
Michelle is the BI administrator for CPO Ventures, a multinational retailer with worldwide branches. They store their SAP HANA data on a server in Europe and run a Pyramid cloud installation on AWS on the East Coast of the U.S. Bruce, a BI analyst in Australia, builds dashboards and performs analysis via a web browser. Michelle has set up access to the remote HANA instance using Pulse to provide all worldwide users with easy access and speed.
A typical user session will involve: (1) an end-user client interaction being sent to Pyramid hosted in the cloud; (2) Pyramid launching a request to the Pulse server, which runs a direct query against the data server on its local network; (3) the Pulse server sending a response back to the cloud; and (4) Pyramid transmitting the result back to the end user.
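The four-step session above can be sketched as a small Python simulation. Every class and method name here (`LocalDataServer`, `PulseNode`, `PyramidCloud`, `handle_client`) is hypothetical and exists only to illustrate the request flow; none of it reflects Pulse's actual interfaces.

```python
from dataclasses import dataclass, field


@dataclass
class LocalDataServer:
    """Stand-in for the remote database (for example, the HANA server)."""
    rows: list

    def run_query(self, min_sales):
        # The query executes next to the data, so only matching rows leave.
        return [r for r in self.rows if r["sales"] >= min_sales]


@dataclass
class PulseNode:
    """Hypothetical model of the agent deployed in the data's network."""
    data_server: LocalDataServer

    def handle(self, request):
        # Step 2: run the direct query against the local data server.
        result = self.data_server.run_query(request["min_sales"])
        # Step 3: hand the response back to the cloud.
        return {"rows": result}


@dataclass
class PyramidCloud:
    """Hypothetical model of the cloud-hosted BI server."""
    pulse: PulseNode
    trace: list = field(default_factory=list)

    def handle_client(self, request):
        self.trace.append("1: client request reaches the cloud")
        self.trace.append("2: cloud asks Pulse to query the local data server")
        response = self.pulse.handle(request)
        self.trace.append("3: Pulse response arrives back in the cloud")
        self.trace.append("4: result is returned to the client")
        return response["rows"]


server = LocalDataServer(rows=[{"store": "A", "sales": 120},
                               {"store": "B", "sales": 40}])
cloud = PyramidCloud(pulse=PulseNode(server))
result_rows = cloud.handle_client({"min_sales": 100})
print(result_rows)
print("\n".join(cloud.trace))
```

The point of the sketch is that the filter runs inside `LocalDataServer`, so only the matching row travels back through the other hops.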
In order to set up Pulse, Michelle ran the Pulse installation on a remote server in the same network as the HANA data (two minutes) and then supplied the Pyramid URL address and Pulse security key. Michelle then added a new data source for the HANA server in the admin console as if it were local to Pyramid in the AWS cloud (two more minutes).
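As a rough illustration of the two values Michelle supplies, the sketch below models them as a small settings object with basic sanity checks. The `PulseSettings` class, its field names, the HTTPS requirement, and the example URL and key are all assumptions made for illustration; the real installer collects these values through its own interface.

```python
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass(frozen=True)
class PulseSettings:
    """Hypothetical container for the two values entered during setup."""
    pyramid_url: str   # address of the cloud-hosted Pyramid instance
    security_key: str  # key that pairs this Pulse node with Pyramid

    def validate(self):
        parsed = urlparse(self.pyramid_url)
        # Assumption for this sketch: the cloud endpoint is reached over HTTPS.
        if parsed.scheme != "https" or not parsed.netloc:
            raise ValueError("pyramid_url must be an https URL")
        if not self.security_key.strip():
            raise ValueError("security_key must not be empty")
        return self


# Placeholder values, not real endpoints or credentials.
settings = PulseSettings("https://pyramid.example.com", "example-key").validate()
print(settings.pyramid_url)
```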
Bruce can now create a new data discovery by selecting the HANA server like any other data source hosted “locally.” The Pulse logo indicates the model is using Pulse to access the database and models—the only indication to Bruce that this is not a “local” data source.
Bruce then uses the Discover app to create a scatter plot of sales versus expenses and is able to query and retrieve 6,684 records in 850 ms, a round trip of less than one second!
Summary
Multiple servers in different locations present the difficult challenge of connecting speedily and securely without using a VPN or other network security protocols. Remote data centers reached over a VPN provide very slow access as traffic is rerouted across the network. Third-party tools can access only a limited set of data sources, with poor response times, using their remote data implementations. As such, most BI tools encourage users to export raw data out of remote data servers into their native data stacks, deleveraging the remote data center investment.
In contrast, Pyramid’s Pulse application is quick and easy to set up and allows connectivity to remote data centers without using a VPN or other network security protocols, providing access to remote data sources as if they were local data sources. Both direct querying and data streaming are supported with no degradation of security, function, feature, or performance, offering tremendous speed and security.
This post originally appeared at https://www.pyramidanalytics.com/blog/details/blog-hybrid-data-connectivity-via-pulse
Other Resources
- Cloud Based Analytics [Video]