Maintaining Data Models

David_Gordon
updated 4 yrs ago

Data modeling is now central to all business analytic processes as companies embrace the “Decision Intelligence” paradigm. Delivering advanced data modeling options in a self-service framework is challenging – and maintaining the workflows created in the process can complicate things further.

Pyramid provides enhanced no-code tools to allow users to better maintain existing data flows and ETL logic in its data modeling toolset – making the process of creating and maintaining such complex flows easy, accessible, and fully self-service centric.

The Problem

After building data preparation pipelines to cleanse, transform and enrich data, model designers often need to adjust them later. For example, when migrating from a development to a production environment, the data sources or targets need to be adjusted; or, after a change in the data warehouse, tables may have been removed or columns renamed.

Some of the changes may simply involve the changing of input tables, while all other logic and manipulations remain intact. Alternatively, designers may want to adjust a node in the data pipeline and then walk through the downstream adjustments required.

Instead of creating a sequence of events that disrupts the pipeline such that it is rendered useless, Pyramid identifies the adjustments and provides a specialized mapping wizard to walk the designer through the process of rewiring the logic to the changes. Ultimately this maintains all subsequent logic without requiring a full do-over. The column mapping wizards also let designers correct any variances and issues WITHOUT CODE, sparing the designer the burden of having to recreate all the logic and data manipulations in the model.

Example

This model has been created in a development environment, with a masking process, custom Python code and a Distinct process used for manipulating and enriching the source data.

In moving from the testing environment to production, the server and, consequently, the table names have changed. Because the names differ, an error is displayed on the affected tables.

By simply clicking on the error and selecting the correct table name, the model “validates” correctly. All pre-existing logic downstream in the pipeline remains valid, so the model designer does not have to recreate any of the logic.
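Conceptually, this step amounts to checking which input tables are missing in the new environment and swapping only those references, leaving every downstream step untouched. The following is a minimal Python sketch of that idea, not Pyramid's implementation: the Pipeline class, the function names, and the table names are all hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Pipeline:
    """Hypothetical stand-in for a data model: source tables plus downstream steps."""
    inputs: tuple   # names of the source tables
    steps: tuple    # names of downstream transformations, in order


def missing_inputs(pipeline, available_tables):
    """Return the input tables that do not exist in the target environment."""
    available = set(available_tables)
    return [table for table in pipeline.inputs if table not in available]


def rewire_inputs(pipeline, replacements):
    """Return a copy of the pipeline with input tables renamed; steps are unchanged."""
    new_inputs = tuple(replacements.get(table, table) for table in pipeline.inputs)
    return replace(pipeline, inputs=new_inputs)


# Illustrative names only: a model built against development tables.
dev = Pipeline(
    inputs=("dev_sales", "dev_customers"),
    steps=("masking", "python_script", "distinct"),
)
prod_tables = ["sales", "customers", "products"]

broken = missing_inputs(dev, prod_tables)   # tables that would be flagged with an error
prod = rewire_inputs(dev, {"dev_sales": "sales", "dev_customers": "customers"})
```

After the rewiring, the check finds no missing tables and the list of downstream steps is identical to the original, which mirrors the behaviour described above: only the input references change.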

If an error exists on a column due to a mismatch or a differently named column, the joins are changed to dotted red links. Clicking on the link between the table and the subsequent process opens a column mapping panel where the model designer can rectify the error. Tools are available to map or unmap source columns to destination columns, either individually or heuristically. After clicking on the apply button and saving the model, the new model mapped to the current datasource is fully operational, without having to recreate all logic from scratch.
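To give a feel for what heuristic mapping can mean, here is a minimal Python sketch that pairs old column names with new ones by string similarity. It is an assumption about one plausible approach, not a description of how Pyramid's wizard works: the function name, the similarity cutoff, and the sample column names are hypothetical, and only the standard-library difflib module is used.

```python
from difflib import get_close_matches


def suggest_column_mapping(expected, available, cutoff=0.6):
    """Suggest a mapping from expected (old) column names to available (new) ones.

    Case-insensitive exact matches are taken first; remaining columns are
    paired by string similarity. Columns with no good candidate map to None.
    """
    mapping = {}
    # Lowercased name -> original name, for columns not yet assigned.
    unused = {name.lower(): name for name in available}

    # Pass 1: exact matches, ignoring case.
    for col in expected:
        if col.lower() in unused:
            mapping[col] = unused.pop(col.lower())

    # Pass 2: closest remaining name above the similarity cutoff.
    for col in expected:
        if col in mapping:
            continue
        candidates = get_close_matches(col.lower(), list(unused), n=1, cutoff=cutoff)
        mapping[col] = unused.pop(candidates[0]) if candidates else None

    return mapping


# Illustrative column names only.
expected = ["CustomerID", "FirstName", "OrderDate"]
available = ["customer_id", "first_name", "order_dt", "region"]
suggested = suggest_column_mapping(expected, available)
```

A suggestion of None corresponds to a column that the designer would still have to map by hand, which is why a tool of this kind would normally offer individual mapping alongside the heuristic option.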

Summary

Model designers often perform simple changes, such as changing input tables or adjusting a node in the data pipeline. Instead of disrupting the pipeline and rendering it useless, Pyramid provides a new code-free mapping wizard that lets the designer rewire the logic to the new changes. This spares the designer the burden of having to recreate all the logic and data manipulations in the model, saving time and money.