Data Governance with Azure Purview

The modern workplace is a data-centric world. Organisations keep increasing the amount of data they must collect, store, and analyse to inform critical business decisions and measure their performance. As data becomes increasingly important, so do the tools needed to discover, manage, and track it.

In the absence of a data governance program, decisions about key data systems are made by the staff who “own” each system. When systems have different owners (which is very common), the result is inconsistency in data availability, collection, usability, integrity, and security. This quickly causes data quality problems. It can also harm the user experience if there is no integration, or no procedure for updating existing data, within your environment. If you and your organisation find yourselves in such a predicament, Prometix highly recommends deploying an Azure Purview solution.

What is Azure Purview? 

At the end of last year, Microsoft launched Azure Purview, a unified data governance solution intended to solve the inherent challenges of managing disparate data assets. The platform is designed to give your organisation a better understanding of its data and to support data-driven collaboration among users.

At its core, Azure Purview is a unified data governance tool that automates data discovery, scanning, classification, and labelling, and maintains a business glossary. This gives users a holistic view and understanding of critical data assets. It gathers metadata from the various Azure repositories and from third-party data stores, including on-prem SQL servers, SaaS applications, and other forms of cloud storage. It is built on Apache Atlas, an open-source platform for data governance, and deploys seamlessly on the Azure platform. Purview is designed with comprehensive security and data compliance built in.

Why use Azure Purview?
Many organizations don’t have a centralized way to register all their data sources. Without insight into the location and movement of sensitive data, some users may be unaware of the available data sources or may not easily understand where to get the types of data they need. Users may also lose confidence in the quality of the data they are accessing, since they are unsure how it was sourced or who the data owner is. This is where the Azure Purview Elastic Data Map comes to the rescue!


Azure Purview automates the discovery and scanning of many types of data sources today (with many more in development), from on-prem servers to cloud-based services, not only in Azure but also in other clouds such as Amazon Web Services. Once the metadata from those scanned data sources is ingested, a vast amount of information becomes available about data sources across the organization, including a visual mapping of all the data source “collections.” The metadata can then be enriched with additional information your organization provides, like the data owner, connection strings, descriptions, and associated Business Glossary terms. You can select the Business Glossary terms from a default store, or you can import your own. This process turns all these data sources into a Data Catalog that your data consumers can easily browse and search.
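To make the scan, enrich, and search flow more concrete, here is a minimal sketch in plain Python. It does not call Purview at all: the `CatalogEntry` class, the `search` function, and the two example sources are hypothetical names made up for illustration, and a real catalog holds far richer metadata than this.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class CatalogEntry:
    """One scanned data source plus the enrichment a data steward adds later."""
    name: str
    source_type: str
    owner: str = ""
    description: str = ""
    glossary_terms: set[str] = field(default_factory=set)


def search(catalog: list[CatalogEntry], keyword: str) -> list[str]:
    """Return names of entries whose name, description, or glossary terms mention keyword."""
    needle = keyword.lower()
    hits = []
    for entry in catalog:
        haystack = [entry.name, entry.description, *entry.glossary_terms]
        if any(needle in text.lower() for text in haystack):
            hits.append(entry.name)
    return hits


# Metadata as it might look straight after a scan (hypothetical sources).
catalog = [
    CatalogEntry(name="sales_db", source_type="Azure SQL Database"),
    CatalogEntry(name="clickstream-bucket", source_type="Amazon S3"),
]

# Enrichment supplied by the organization after the scan.
catalog[0].owner = "finance-team"
catalog[0].description = "Monthly invoices and payments"
catalog[0].glossary_terms.add("Revenue")

print(search(catalog, "revenue"))
```

The point of the sketch is that `sales_db` only becomes findable under "revenue" once someone attaches the glossary term, which is why enrichment matters as much as the scan itself.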

In addition, the Data Lineage is visible to your data consumers once they have found the data source they want to use in the Data Catalog. The Data Lineage lets users visually trace data assets from the system of origin through movement, transformation, and enrichment across the various storage and processing systems in the cloud, all the way to consumption in an analytics system like Power BI. This data audit trail is brought to you by a serverless cloud service that is elastic: the service automatically increases or decreases capacity based on your needs, meaning you only pay for what you use!
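Conceptually, lineage is a directed graph of assets, and tracing a report back to its origins is a graph walk. The sketch below illustrates that idea only; it is not how Purview stores or queries lineage, and the asset names, the `LINEAGE` dictionary, and the `upstream` function are all assumptions invented for the example.

```python
from __future__ import annotations

from collections import deque

# Each entry means "data flows from the key to each listed target" (hypothetical assets).
LINEAGE = {
    "crm_sql_table": ["adf_copy_pipeline"],
    "adf_copy_pipeline": ["data_lake_raw"],
    "data_lake_raw": ["synapse_sales_view"],
    "synapse_sales_view": ["power_bi_sales_report"],
    "erp_export_csv": ["data_lake_raw"],
}


def upstream(asset: str, lineage: dict[str, list[str]]) -> list[str]:
    """Breadth-first walk backwards from asset to everything that feeds it."""
    # Invert the edges so each asset can look up its direct sources.
    parents: dict[str, list[str]] = {}
    for source, targets in lineage.items():
        for target in targets:
            parents.setdefault(target, []).append(source)

    seen = {asset}
    order = []
    queue = deque([asset])
    while queue:
        current = queue.popleft()
        for parent in parents.get(current, []):
            if parent not in seen:
                seen.add(parent)
                order.append(parent)
                queue.append(parent)
    return order


for name in upstream("power_bi_sales_report", LINEAGE):
    print(name)
```

Starting from the Power BI report, the walk reaches both the CRM table and the ERP export, which is the kind of question a data consumer asks when deciding whether to trust a report.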

What does Azure Purview do? 

Purview serves three primary purposes: mapping data from multiple, disparate data sources; categorising and labelling data based on pre-built or custom classification rules; and generating usable data insights.
  • Data Mapping – Purview creates a unified data map by scanning the data sources across your entire data estate. Data lineage lets users see where data is sourced and how it is transformed, at both the asset level and the granular object level, on supported data sources. Azure Purview integrates with third-party data systems through Apache Atlas APIs.
  • Data Cataloguing – Purview presents the discovered data so users understand what relevant information is available and where each data set is stored. It supports searching for and tagging data sets containing specific technical or business terms, and the system automatically determines the data’s sensitivity level by examining columns for certain keywords.
  • Data Insights – The resulting data catalogue gives you insights into the location, movement, and transformation of data within a multi-faceted or hybrid data environment. 
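The keyword-based classification mentioned under Data Cataloguing can be pictured as a set of rules matched against column names. The following is a simplified sketch under that assumption; the `RULES` list, the labels, the patterns, and `classify_columns` are hypothetical and are not Purview's built-in classifiers or its rule engine.

```python
import re

# Hypothetical rules: (classification label, pattern matched against column names).
RULES = [
    ("Email Address", re.compile(r"e[-_]?mail", re.IGNORECASE)),
    ("Phone Number", re.compile(r"phone|mobile", re.IGNORECASE)),
    ("Credit Card Number", re.compile(r"card[-_]?(no|num|number)", re.IGNORECASE)),
]


def classify_columns(columns):
    """Map each column name to the labels whose pattern it matches.

    Columns that match no rule are left out of the result.
    """
    result = {}
    for column in columns:
        labels = [label for label, pattern in RULES if pattern.search(column)]
        if labels:
            result[column] = labels
    return result


scanned = ["customer_email", "MobilePhone", "card_number", "order_total"]
for column, labels in classify_columns(scanned).items():
    print(column, "->", ", ".join(labels))
```

A custom classification rule would then amount to appending another entry to the rule list, while unmatched columns such as `order_total` simply stay unlabelled.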

Azure Purview as a Data Management and Governance Solution
In essence, Purview helps you manage and govern data from multiple sources, allowing you to map and visualise your data landscape. Purview bridges the gap between scattered data sources and centralised data management. By discovering, mapping, and categorising data, Purview gives you a bird’s-eye view of your entire enterprise data environment. Additionally, it provides useful information about data distribution across different domains, data movement, and the locations of sensitive, high-value information.

Azure Purview brings into focus the need for a centralized data governance system and delivers on it by virtually consolidating data sources into a single pane of glass. In the real world it is rarely possible to have just one data repository, because of the complexities of dealing with variable data sets, formats, sources, and sensitivity levels. Purview is a much-needed and welcome data management solution for organizations using multi-cloud infrastructures, SaaS applications, on-premises datacenters, and Azure data storage services.

Summary
Azure Purview is also valuable to security and compliance teams that need to keep track of the organization's data while ensuring it is secure and appropriately used based on its sensitivity. Data governance is becoming increasingly difficult as data assets grow exponentially within many organizations, increasing the security and compliance risks. Azure Purview Insights gives customers a single-pane-of-glass view into their catalog and aims to provide specific insights to roles such as data source administrators, business users, data stewards, data officers, and security administrators. This feature includes reports that provide details about your data assets, scans, glossary, classifications, sensitivity labeling, file extensions, and file types.

In addition, Sensitive Data Insights will provide a simplified compliance risk assessment across all your operational and transactional data sources using built-in data classifiers or custom data classifiers you can create as needed. This feature enables the evaluation of risk and the ability to derive audit trails of data qualified by sensitivity and business relevance.

Prometix, a Microsoft Gold certified O365/Azure consultancy (Sydney, Canberra, Melbourne & Perth), has delivered numerous Azure data warehouse-based solutions. If you need any assistance, please feel free to contact us via enquiries@prometix.com.au.
