GAP Helps Get Your Infrastructure Right for DaaP
About This Document
Explore the strategic case for transforming data into a valuable commodity with this detailed look at “Data as a Product” (DaaP). This document distinguishes DaaP from standard data products by advocating a product-like treatment of data, emphasizing benefits like enhanced decision-making and increased operational efficiency. It describes the essential steps of documentation, testing, and monitoring needed to ensure data meets rigorous standards before being marketed to third parties. It then outlines how GAP supports organizations in implementing DaaP, integrating it with existing frameworks such as Scrum, Agile, or Kanban, and linking these to comprehensive data strategy development and project management.

Your organization is producing reams of data every day — so isn’t it time you started making money out of it?
This is where Data as a Product (DaaP) comes in: a style of data management in which you apply a product mindset to your data sets. This not only helps create a new line of business, but also helps you maintain and enhance it, building security and governance attributes into your data.
Yet this is not just a case of putting your product in a marketplace and hoping for the best. Forward-thinking organizations need to look at how their infrastructure as well as their data facilitate this process, and this is where a trusted partner like GAP can help.
Before we jump in, let’s first define the difference between “Data as a Product” and “data products.” Any product or service built around data, such as a recommendation engine or a data visualization, is not DaaP; it is a data product. Data as a Product, in contrast, means treating data with the same quality rigor as any commodity that can be bought, sold or exchanged, either on a data marketplace — such as through Snowflake or a major cloud provider — or through other means.

Within Data as a Product, there are two different styles:
The first is developing a product or service that generates data, which means the data has to be subject to the same level of rigor as the application code and have the requisite documentation, testing and monitoring, as it will be offered as a product to third parties.
The second is to look at the output of any codebase as externally customer-facing. A data warehouse, for example, serves internal customers, from data scientists to product managers — so why can it not be put into a public environment (as long as it’s safe)? As Monte Carlo Data notes, “Anything that’s pushed to a ‘production data environment’ that the company can access is a product.”
Other benefits for organizations pursuing a DaaP strategy do not necessarily have to be customer-facing. DaaP projects can provide valuable insights and allow organizations to make more accurate data-driven decisions, as well as improve operational efficiency through infrastructure and process optimization.
Where GAP comes in is by being able to align to your working methodology. We help operationalize your infrastructure and guide you through all steps of the process, from developing a data strategy and monetizing your data, to technology implementation and project management. If your organization works with standard Scrum, Agile or Kanban methodologies, GAP can marry that to architectural implementation, model development and process optimization.
Let’s say your organization has a relatively small data team. You might have a nice proof of concept with regard to taking a product management approach to your data, or you might have sold one item on the Snowflake marketplace. What would a customer journey look like through working with GAP?
The first step would be to assess what data the company has, how it can be marketed, and how it can be packaged into data products, cataloging it in a data dictionary or a metadata repository. The dictionary software identifies each data point structurally — whether it is a string of characters, a number or a true/false value — as well as semantically, evaluating whether each value makes sense.
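The structural pass of such a data dictionary can be sketched in a few lines. This is an illustrative example only — the function names and the sample column are assumptions, not part of any GAP or dictionary tooling — showing how raw values might be classified as booleans, numbers or strings, and how a mixed profile signals a column that needs attention.

```python
def classify(value: str) -> str:
    """Return a coarse structural type for a raw string value."""
    if value.strip().lower() in {"true", "false"}:
        return "boolean"
    try:
        float(value)
        return "number"
    except ValueError:
        return "string"


def column_profile(values: list[str]) -> dict[str, int]:
    """Count how many values of each structural type a column holds."""
    profile: dict[str, int] = {}
    for v in values:
        kind = classify(v)
        profile[kind] = profile.get(kind, 0) + 1
    return profile


# A hypothetical column mixing numbers, booleans and free text:
print(column_profile(["42", "true", "hello", "3.14"]))
# A mixed profile like this suggests the column needs cleaning before publication.
```

A real metadata repository would persist these profiles per column and track them over time, but the classification step itself is as simple as shown here.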

The semantic pass, for example, helps identify faults in the data that need to be rectified before making it public-facing. Take a column for “age” in a customer’s data set: if there were a spike in those figures, or if the figures were incomprehensible — such as negative numbers or numbers in the thousands — the alarm would be raised with the customer.
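A minimal sketch of that kind of semantic check might look like the following. The plausible-age threshold is an illustrative assumption, not a value from any GAP tooling; the point is that the faulty rows are surfaced for the customer rather than silently published.

```python
def flag_bad_ages(ages: list[int], max_age: int = 120) -> list[tuple[int, int]]:
    """Return (row index, value) pairs for ages outside a plausible range."""
    return [(i, a) for i, a in enumerate(ages) if a < 0 or a > max_age]


# Negative ages and ages in the thousands are flagged for review:
print(flag_bad_ages([34, -2, 51, 4300]))  # → [(1, -2), (3, 4300)]
```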
The next step is to review the client’s existing infrastructure and make recommendations, depending on whether they want to publish the data to a public marketplace or to an internal, private set of partners. If the client is an AWS customer, it would be a simple process of setting the correct permissions on S3 buckets, as well as establishing policies and pipelines that feed those buckets on a schedule matching how often the client’s data is updated at their end.
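For the private-partner case, those S3 permissions boil down to a bucket policy. The sketch below — with a placeholder bucket name and partner account ID, and without any actual AWS call — shows the shape of a policy granting a partner account read-only access; in practice it would be applied via the AWS console, infrastructure-as-code tooling, or an SDK call such as boto3’s `put_bucket_policy`.

```python
import json


def partner_read_policy(bucket: str, partner_account_id: str) -> str:
    """Build an S3 bucket policy JSON granting a partner account read access.

    Bucket name and account ID are placeholders for illustration.
    """
    policy = {
        "Version": "2012-10-17",
        "Statement": [{
            "Sid": "PartnerReadOnly",
            "Effect": "Allow",
            "Principal": {"AWS": f"arn:aws:iam::{partner_account_id}:root"},
            "Action": ["s3:GetObject", "s3:ListBucket"],
            "Resource": [
                f"arn:aws:s3:::{bucket}",
                f"arn:aws:s3:::{bucket}/*",
            ],
        }],
    }
    return json.dumps(policy, indent=2)


print(partner_read_policy("example-data-product", "123456789012"))
```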
Assuming the client runs on AWS and is looking for a public marketplace, a connection between their S3 account and Snowflake, for instance, could be provisioned and automated. If the data needs to be cleansed of PII (personally identifiable information), or if other aggregation needs to be performed on the data (to do things such as protect intellectual property), this can also be done on the client’s existing infrastructure, by establishing data transformation steps along the pipeline.
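A transformation step of that kind can be sketched as follows. The field names (“name”, “email”, “region”) are hypothetical examples: PII columns are dropped record by record, and the remainder is aggregated by a non-identifying key before the data moves further down the pipeline.

```python
# Hypothetical PII fields for illustration; a real pipeline would drive
# this from the data dictionary rather than a hard-coded set.
PII_FIELDS = {"name", "email", "phone"}


def strip_pii(record: dict) -> dict:
    """Remove known PII fields from a single record."""
    return {k: v for k, v in record.items() if k not in PII_FIELDS}


def aggregate_by(records: list[dict], key: str) -> dict:
    """Aggregate record counts by a non-identifying key, e.g. region."""
    counts: dict = {}
    for r in records:
        counts[r[key]] = counts.get(r[key], 0) + 1
    return counts


records = [
    {"name": "Ada", "email": "ada@example.com", "region": "EMEA"},
    {"name": "Lin", "email": "lin@example.com", "region": "APAC"},
    {"name": "Sam", "email": "sam@example.com", "region": "EMEA"},
]
cleaned = [strip_pii(r) for r in records]
print(cleaned[0])                       # → {'region': 'EMEA'}
print(aggregate_by(cleaned, "region"))  # → {'EMEA': 2, 'APAC': 1}
```

In a real deployment, steps like these would run inside the pipeline feeding the marketplace bucket, so no raw record containing PII ever leaves the client’s infrastructure.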

Working with GAP gives you expertise in infrastructure as code (IaC) and other techniques to provision and optimize infrastructure in days rather than months, with help from the GAPBuilt Accelerators focused on IaC. GAP also has many certified Snowflake experts, and its Snowflake Partner status provides high-level access to Snowflake staff and support, as well as services that are not publicly available.
Combine this with project management expertise across myriad data disciplines, and you will be in safe hands if you want to make your data work better for your organization.