Snowflake's Head of Data Collaboration on productising public sector data
System-thinking is turning data hoards into data products. Standardised, “spaghetti” data is no more, as modern data platforms are enabling user-centric, privacy-compliant interoperability.
The potential is immense, says Jonathan Wolf, Head of Data Collaboration Strategy at Snowflake. If data products are the new reality, this is a game changer for quantifying the value of data and pinpointing quick wins in government-wide efficiency.
Value-added data provision
What role does collaboration play in productising public sector data? Large-scale data providers like the US Census Bureau have enormous, complex and often inaccessible data assets. Supporting value by design with organisations across the globe, Wolf takes an aerial view of their entire data ecosystem. Whilst this can at first appear opaque, Wolf tells us that it all starts by looking at the diverse needs of data recipients to build an informed view of how to optimise the entire system.
“Most departments have a mandate to make data accessible and useful, but the model of providing opaque, uniform data to stakeholders with diverse needs is changing to a more bespoke approach that is adding value to the user experience. We can understand this as data productisation,” Wolf explained.
“No one wants your pile of data,” Wolf joked. An interactive dashboard with built-in explainability is probably better suited to citizens who want answers to specific questions.
Departments with data science teams like the Office for National Statistics (ONS) are more likely to prefer a data deluge where they can track the movement and consumption of data across departments in real-time.
Whether sold or not, viewing data as a ‘product’ positions it within a system-wide value chain, with an obligation to meet higher accessibility standards tailored to the ‘customer’. This sets the stage for data to yield a return on investment.
User-centricity and privacy compliance: two sides of the same coin?
Constructing data in the form of a bespoke product for end-users is also helping organisations address some of the historical barriers to data sharing, with tools like data clean rooms rapidly advancing the prospect of a data sharing environment with built-in privacy compliance.
“For a long time, there was a distinct tension between the need for transparency and insight and the need to protect sensitive data,” Wolf reflected. Relying on arduous, manual data cleaning has meant waiting on answers to important analytical questions that require cross-government data sharing. In the medium term, this lag has held back proactive, evidence-driven government.
However, Wolf observes that there is a growing number of use cases where technologies are facilitating multi-party collaboration, allowing the bespoke data requirements of stakeholders to be met whilst still shielding sensitive data from the recipient.
“We collaborate with government bodies to understand what the core issues are that recipients aim to solve by accessing certain data. This approach enables us to provide valuable insights while preserving sensitive identifiers. Data productisation helps us understand the recipient's needs, which in turn, helps us determine which elements we can conceal, transforming raw data into a tailor-made, privacy-compliant solution.”
With the construct of productised data sharing in place, Wolf elaborated on how to use it as a foundation for monetising those data assets with the private sector whilst making the most of departmental resources.
Commercialising data assets through public-private collaboration
Gone are the days of departments relying on the ONS as the exclusive source of data expertise. With departments improving their data infrastructure, capabilities and literacy, the ONS’ role is less and less centred on being an intermediary data-cleaning service for other departments. Wolf observed that this decentralisation of expertise has given departments greater bandwidth to optimise their data resources through new ways of working with the private sector.
“In the last few years, we’ve seen an opportunity open up where governments can begin commercialising their data. Previously, building bespoke data products from aggregate data was a service monopolised by private actors, but since the tension between insight and privacy has been addressed by technologies like data clean rooms, governments can now vertically integrate and start to profit from the data they are sourcing and storing.”
Enhancing the value of data assets can foster creative solutions for pressing national concerns. In sectors like energy, where private companies play a significant role in service provision, there's an opportunity for businesses to leverage public sector data to develop premium data products for the private sector. As Wolf underlined, these businesses could then pass on the resulting gains through reduced utility bills, bringing the benefits of system-thinking and collaboration directly to citizens.
The virtuous cycle of data-driven budget allocation
But the opportunities for data productisation to make money go further aren’t confined to public-private partnerships. Attaching quantifiable value to a department's data products also lets it communicate the value of its data provision, allowing for a more strategic view of budget allocation.
“Modern data platforms are going beyond simply showing the number of downloads and are actually providing consumption metrics about what queries are being made against specific data points. We’re seeing agencies use these metrics to argue for more budget, which they can reinvest in bespoke data products to drive a virtuous cycle.”
By attaching quantifiable value to data products, departments can demonstrate the significance of their data provision and secure additional resources for reinvestment. This data-driven approach fosters a virtuous cycle of continuous improvement, shifting away from linear, unspecialised data sharing towards an enriched data ecosystem that can be reinforced by top-down direction.