Data Lineage - is it a tall claim?
Hello Fabricators,

We all know the "lineage view" that every workspace provides by default. Fabric advocates this feature heavily, claiming that it shows the complete flow from ingestion in the Bronze layer through to loading in the Gold layer. The feature even attracted one of our customers, for whom we are now migrating existing Power BI reports to the Fabric ecosystem.

One of the customer's pain points is that Power BI end users are left clueless when they come across a discrepancy: they cannot trace the journey of the data or work out whom to contact. The customer has high hopes for data lineage and sees it as a panacea for these troubles.

However, when we look into the lineage view, the level of detail it provides is very limited. Even data engineers and coders like me are unable to fathom what to make of the way lineage is shown.

Our expectation from lineage: the end-to-end journey, i.e. the list of all the components (notebooks, Dataflow Gen2, Data Factory pipelines, lakehouses, warehouses) that the data passes through before it culminates in the final report.

What we actually see: the lineage of a report shows nothing beyond the "semantic data model", the "SQL end point", and the "warehouse" (or lakehouse, where applicable). There is no sign of any dataflow or data pipeline.

Considering this, I feel the data lineage feature is not fully mature. It may still be under development.

Does anyone else share this view of data lineage? Is there any workaround to produce the "detailed flow" we expect from data lineage (maybe by using some third-party libraries, etc.)?
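To make the expectation concrete, here is a minimal sketch of the kind of upstream trace I have in mind. It does not call any Fabric API: the edge list is maintained by hand, and every item name in it (`pipeline:ingest_sales`, `report:sales_overview`, and so on) is hypothetical and used only for illustration. The function `trace_upstream` is likewise my own helper, not part of any library.

```python
from collections import defaultdict, deque

# Hypothetical, hand-maintained edges (upstream item -> downstream item).
# None of these names come from a real workspace or from a Fabric API.
EDGES = [
    ("pipeline:ingest_sales", "lakehouse:bronze"),
    ("lakehouse:bronze", "notebook:clean_sales"),
    ("notebook:clean_sales", "lakehouse:silver"),
    ("lakehouse:silver", "dataflow:build_gold"),
    ("dataflow:build_gold", "warehouse:gold"),
    ("warehouse:gold", "semantic_model:sales"),
    ("semantic_model:sales", "report:sales_overview"),
]


def build_parents(edges):
    """Map each item to the list of items that feed directly into it."""
    parents = defaultdict(list)
    for src, dst in edges:
        parents[dst].append(src)
    return parents


def trace_upstream(item, edges):
    """Return every item upstream of `item`, nearest first (breadth-first)."""
    parents = build_parents(edges)
    seen = {item}
    order = []
    queue = deque([item])
    while queue:
        current = queue.popleft()
        for parent in parents.get(current, []):
            if parent not in seen:  # guard against cycles and shared parents
                seen.add(parent)
                order.append(parent)
                queue.append(parent)
    return order


if __name__ == "__main__":
    # Print the full journey behind one report, from nearest to farthest item.
    for step in trace_upstream("report:sales_overview", EDGES):
        print(step)
```

For a report, this lists the semantic model, the warehouse, and then the dataflow, notebook, lakehouses, and pipeline behind it, which is the level of detail we would like the lineage view to show. The open question remains how to populate such an edge list automatically rather than by hand.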