Massive Growth Led to Complex Development. SDF Came in to Simplify

The team at Linqto recently found themselves at a crossroads: pursue the conventional data stack to address the increasing demands on financial reporting analytics or find an alternative that addressed their growing issues. The data engineering team had already grown in size, and the number of analytics transformations was rapidly increasing. Their goals, like many in the data industry, were to increase developer efficiency while also controlling costs wherever possible.

"I was working on building a new tool, similar to my previous open source creation, Astronomer Cosmos, but one that addressed multiple new areas of concern: quality, controls, and faster compile times. I then found SDF and knew it was perfect for our data management and transformation layer."
Chris Hronek, Director of Data Engineering

Instead of pursuing alternatives requiring multiple tools to provide a data catalog, a complete data map and lineage, quality controls, and governance, the Linqto team set up SDF. Within the first few minutes of integrating SDF, the team was able to create quality controls that restrict common errors.

Enforcing Quality Controls Before Run Time

The Linqto Data Warehouse was built with general rules and designs in mind. The team loads raw, untransformed data into schemas, then copies it into models where testing, conversion, and transformations are applied and the data is prepped for visualization. New developments and changes are tested in a development/staging environment before being pushed to production.

Before SDF, the team reviewed changes to the data model through a process on GitHub, and it was the responsibility of the reviewer to enforce the agreed-upon rules. It was an error-prone process with small issues slipping through.

"One common error we kept missing in our deployments was simple references to development data warehouses instead of production data warehouses. It was so frustrating to see these simple problems pop up."
Chris Hronek, Director of Data Engineering

With SDF, the team has implemented Code Contracts that verify the presence of specific properties, along with quality checks that do not require human intervention or review. All these rules run locally on the team’s workstations and are integrated directly into a Continuous Integration (CI) pipeline that activates when a team member opens a pull request to change the data model. The CI run passes if the submission adheres to the rules and fails if it violates them.

Run SDF Contracts and SDF within GitHub Actions CI/CD

If a check fails, the GitHub CI run fails and the failure is reported at the exact location where it occurred. In addition to failures being reported on GitHub, team members have quality checks run directly on their workstations with each change, as SDF compiles the checks and reports exactly where a failure has occurred.
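To make the idea concrete, the kind of rule described above (catching references to development warehouses before they reach production) can be sketched in plain Python. This is not SDF's contract syntax or API. The `dev_` and `staging_` prefixes, the `Violation` type, and the `find_dev_references` function are all hypothetical names chosen for illustration, and the sketch assumes that development databases can be recognized by a naming prefix.

```python
import re
from typing import NamedTuple

# Hypothetical naming convention: development databases start with these prefixes.
DEV_PREFIXES = ("dev_", "staging_")


class Violation(NamedTuple):
    line_number: int  # 1-based line in the SQL text
    reference: str    # the offending dotted reference


# Matches dotted identifiers such as database.schema.table or alias.column.
_REFERENCE = re.compile(r"\b[A-Za-z_][A-Za-z0-9_]*(?:\.[A-Za-z_][A-Za-z0-9_]*)+")


def find_dev_references(sql: str, dev_prefixes=DEV_PREFIXES) -> list[Violation]:
    """Return every dotted reference whose first part looks like a dev database."""
    violations = []
    for line_number, line in enumerate(sql.splitlines(), start=1):
        for match in _REFERENCE.finditer(line):
            reference = match.group(0)
            # str.startswith accepts a tuple of prefixes.
            if reference.lower().startswith(tuple(dev_prefixes)):
                violations.append(Violation(line_number, reference))
    return violations
```

In a CI job, a non-empty result would be printed with its line numbers and turned into a non-zero exit code so the pull request check fails. A text scan like this also flags references inside SQL comments, which is one reason a tool that actually parses and compiles the SQL is a better fit for the job.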

"SDF ensures that data models support company advancement while maintaining compliance and safeguarding sensitive information. By automating best practices and reinforcing data security, SDF stands as a cornerstone in scaling data infrastructures efficiently and responsibly."
Chris Hronek, Director of Data Engineering

Gaining a Complete View of the Data Warehouse

Another primary goal of the Linqto team was to establish a complete understanding of their data warehouse. Building anything off a data model, whether static visualizations, reverse ETLs, or data processes, can lead to concerns that changes will break something downstream and paralyze business operations. This not only slows down data development cycles but can also lead to high-value analytics models failing during critical times.

After compiling their entire data warehouse and parsing all models in a matter of seconds, Chris was able to visualize the entire Linqto data model. With SDF providing column-level lineage that shows downstream column and table references, developers can make changes with greater confidence. The team is estimated to save over $130,000 in labor costs with this time saver. SDF brings data quality, data governance, and data management benefits back into the hands of analytics engineers.

An example screen capture of the SDF Cloud Workspace
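The question that column-level lineage answers ("what breaks downstream if I change this column?") can be sketched as a graph traversal. The lineage map, the column names, and the `downstream_columns` function below are hypothetical and are not taken from Linqto's warehouse or from SDF's API. The sketch only assumes that lineage is available as a mapping from each column to the columns that read from it.

```python
from collections import deque

# Hypothetical column-level lineage: each key feeds the columns listed as its value.
LINEAGE = {
    "raw.orders.amount": ["staging.orders.amount_usd"],
    "staging.orders.amount_usd": [
        "analytics.revenue.total",
        "analytics.finance_report.gross",
    ],
    "analytics.revenue.total": ["dashboards.kpi.revenue"],
}


def downstream_columns(lineage, column):
    """Return every column reachable from `column`, in breadth-first order."""
    seen = set()
    order = []
    queue = deque([column])
    while queue:
        current = queue.popleft()
        for child in lineage.get(current, []):
            if child not in seen:  # guards against revisiting shared or cyclic paths
                seen.add(child)
                order.append(child)
                queue.append(child)
    return order
```

Calling `downstream_columns(LINEAGE, "raw.orders.amount")` lists the four columns that depend on the raw amount, directly or indirectly, which is the set a developer would want to review before changing it.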

SDF Brings Data Quality, Data Governance, and Data Management Into the Hands of Analytics Engineers

SDF brings immediate value with multiple data quality, data governance, and data management benefits back into the control of data engineers and analytics engineers. With the SDF CLI and SDF Cloud, teams can proactively establish quality rules and identify issues before run time. As the Linqto data team continues to grow its data warehouse, it aims to further utilize SDF for its governance, quality, and transformation needs. The end goal is for all data models to be transformed, controlled, and governed by SDF.

Linqto - Empowering Individual Investors and Changing Private Investing

Linqto's vision is to democratize private investing by making it accessible, affordable, and liquid for individual investors. Through an intuitive platform, Linqto empowers individual investors to engage in the private equity market, which was once only within reach for the privileged few.

With an inclusive, easy-to-use, and fast digital platform, Linqto provides qualified accredited investors access to investments in leading tech companies while they’re still private.
