Organizations are collecting more data than ever before. According to a whitepaper by International Data Corporation, the volume of data generated per year is expected to grow to 175 zettabytes (ZB) by 2025. For reference, 1 ZB is 1 billion terabytes, so the need to manage that data will grow at a similarly alarming rate. Imagine being part of a large organization with hundreds of data sources: the data required to make business decisions is calculated from multiple sources and is transformed at various points in its lifecycle. At some point, governing this complex, integrated data landscape becomes a challenging and time-consuming process.

Managing the lifecycle of data from creation to its end point is a long process that requires significant investment, including the creation of processes, the adoption of new technologies and the training of people. A poor or inefficient governance plan leads to the following problems:
– Poor data quality due to lack of clarity and process
– Poor data visibility
– No clarity on data ownership and accountability
– Inconsistent definitions and terms across different areas of the same organization

How do you tackle this without any previous experience? Microsoft's Azure Purview is one platform that can help you with your Data Governance nightmares. In addition to reducing process time, Purview simplifies data protection and reduces the complexity of governance.

What is Data Governance?

Dataversity defines Data Governance as a collection of practices and processes that help to ensure the formal management of data assets within an organization. It covers topics such as Data Quality, Data Ownership and Data Management, which help the organization gain better control over its data assets. It also includes methods to manage data efficiently and ensure security, privacy and compliance. Through a proper Data Governance process, companies can oversee the flow of data within the organization, improve efficiency and withstand regulatory scrutiny.

Manage, govern and get more value out of your data with Azure Purview

In a traditional Data Governance setup, data discovery is a tedious process that takes weeks to complete and may have to be repeated if things go wrong. There is no central location that registers the metadata. Users are not aware of a data asset or source until they actually need it, and an asset's use is not visible to a user until they learn about it or it is specified to them in a use case.
Moreover, the documentation of a data asset lives in a different location from the asset itself. This makes it difficult to keep track of where the different assets are located and how they are defined, since definitions differ across the organization. It is also difficult to keep the documentation up to date when a data asset changes, because you have to repeat the same long process again.

As for Data Owners and Data Stewards, a user who has questions about a data asset must identify the person responsible for each asset, and finding the right person can require a lot of communication and documentation reading.

Microsoft introduced Azure Purview in 2020 as a unified solution for Data Governance. The platform simplifies the discovery and compliance processes of Data Governance. It provides the following benefits:
– Automated data discovery, lineage identification and classification
– Unified map of your data assets across different sources and their relationships
– Sensitivity labels to filter sensitive data assets
– Hierarchical business glossary
– Data discovery through a glossary of business and technical terms
– Insight into the sensitive data location and their movement across your organization

Key components of Purview

You can connect to different platforms such as Microsoft Azure, Power Platform and Microsoft SQL Server, as well as on-premises servers, and see the flow of data across your organization. This means that you can evaluate the transformation lifecycle of your data, better known as Data Lineage. The platform is also access-controlled, so only users who have the required permissions can view the data flow across the organization, which keeps the data secure. Azure Purview comprises three key components: Data Map, Data Catalog and Data Insights.
Data Map:
Purview addresses the problems of a traditional governance process and provides a unified solution for viewing all data assets and data sources in one central location. The Data Map is kept up to date automatically through built-in scanning and classification. You can visually trace the lineage of a data asset as it moves and is transformed across various systems in the cloud, as well as in analytics tools like Power BI.
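To make the idea of lineage more concrete, it helps to picture it as a graph in which every asset points to the assets it was derived from. The short Python sketch below walks such a graph to list everything that feeds into a report. It is only an illustration of the concept and does not use the Purview API; the asset names and the trace_lineage helper are hypothetical.

```python
from collections import deque

# Hypothetical lineage edges: each key is an asset, each value lists the
# assets it is directly derived from (its upstream sources).
UPSTREAM = {
    "sales_dashboard": ["sales_summary"],
    "sales_summary": ["orders_clean", "customers_clean"],
    "orders_clean": ["orders_raw"],
    "customers_clean": ["customers_raw"],
    "orders_raw": [],
    "customers_raw": [],
}


def trace_lineage(asset, upstream):
    """Return every asset that feeds into `asset`, nearest sources first."""
    seen = []
    queue = deque(upstream.get(asset, []))
    while queue:
        current = queue.popleft()
        if current in seen:
            continue  # already visited through another path
        seen.append(current)
        queue.extend(upstream.get(current, []))
    return seen


if __name__ == "__main__":
    for source in trace_lineage("sales_dashboard", UPSTREAM):
        print(source)
```

A breadth-first walk is used here so that the direct sources are listed before the sources further back in the chain, which is roughly the order in which you would read a lineage diagram from the report towards the raw data.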
Data Catalog:
Each data asset has glossary terms and business definitions connected to it. Each asset also lists its responsible owners and a contact person who handles issues related to that data. The documentation is synced with the data sources, so with Data Catalog, inconsistent documentation is no longer a problem. You can also filter your data assets by sensitivity label or other filters.
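As a rough mental model, a catalog entry can be thought of as a small record that keeps the owner, the contact person, the glossary terms and the sensitivity label next to the asset name. The Python sketch below shows that idea, together with a filter by sensitivity label and an owner lookup. It is an assumption-laden illustration, not Purview's data model or API: the CatalogEntry class, the label values and the sample entries are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class CatalogEntry:
    """Hypothetical catalog record; not Purview's actual schema."""
    name: str
    owner: str
    contact: str
    glossary_terms: list = field(default_factory=list)
    sensitivity_label: str = "General"


# Made-up sample entries for illustration only.
CATALOG = [
    CatalogEntry("customers_raw", "Sales Ops", "sales-ops@example.com",
                 ["Customer"], "Confidential"),
    CatalogEntry("orders_clean", "Data Engineering", "data-eng@example.com",
                 ["Order", "Revenue"], "General"),
    CatalogEntry("payroll", "HR", "hr@example.com",
                 ["Employee"], "Highly Confidential"),
]


def filter_by_label(catalog, label):
    """Return all entries that carry the given sensitivity label."""
    return [entry for entry in catalog if entry.sensitivity_label == label]


def find_owner(catalog, asset_name):
    """Return (owner, contact) for an asset, or None if it is not registered."""
    for entry in catalog:
        if entry.name == asset_name:
            return entry.owner, entry.contact
    return None


if __name__ == "__main__":
    print([e.name for e in filter_by_label(CATALOG, "Confidential")])
    print(find_owner(CATALOG, "payroll"))
```

Keeping the owner and contact on the record itself is what removes the search for the responsible person: the question "who do I ask about this asset?" becomes a single lookup.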
Data Insights:
Data Insights gives users a single view into their catalog and provides specific insights for data source administrators, business users, data owners, data stewards and security administrators. As of today, Purview has six types of Insights reports: Asset Insights, Scan Insights, Glossary Insights, Classification Insights, Sensitivity Labelling Insights and File Extension Insights. We will explain these in more detail in future blog posts.

Conclusion

Purview addresses a major problem in Data Governance processes. It is a one-stop solution that can help maximize the value of your organizational data. Through automated scans, it simplifies the data discovery process and decreases asset identification time. Since it is a central repository for your metadata and business glossaries, it provides a consistent data experience across the organization.
The data that you deal with will keep increasing, and so will the complexity of handling it. Azure Purview eases this process by bringing some of the Data Governance practices into a single location. Today, data is the most valuable asset. Yet its value is realized only if you can discover data and identify insights quickly.
To learn more about the importance of Data Governance, see the link below, which leads to my 10-minute piece on data governance that I wrote a couple of months ago.
Peter Schmäling

Tushar Poojary

Tushar Poojary is a Junior Solution Architect at HUBSTER.S