Hi everyone! I was wondering if any of you have experience with more advanced workspace architectures, specifically metadata-driven architectures with Data Factory pipelines to handle them, and how to tie it all into CI/CD pipelines.

In a Dev/Test/Prod setup with Bronze, Silver and Gold workspaces we end up with 9 workspaces, each with its own lakehouses. The reason for this architecture is the security demands on the data: incoming data lands in Bronze, is processed in Silver, and is presented through Gold, where the semantic model lives for end users, or for data scientists who are more interested in the datasets (that is an interesting question too; maybe data scientists need more "secured" data...).

But how should the Data Factory pipelines be arranged? I must drive data through all the workspaces and update the lakehouses in their separate workspaces. I think it is possible to keep all pipelines in one workspace, either a dedicated pipeline/notebooks workspace (similar to the last image) or, for instance, in the Bronze area. I have tested that I can use pipelines to copy data to lakehouses across workspaces, but in a metadata-driven setup all the workspaces' lakehouses must be parameterized.
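To make the parameterization idea concrete, here is a minimal sketch of what I mean by a metadata-driven lookup: a mapping from (environment, layer) to workspace and lakehouse identifiers, from which a notebook or pipeline builds OneLake paths. All the names and GUID placeholders below are made up for illustration; in practice the mapping could live in a control table or pipeline parameters instead of a dict.

```python
# Sketch of metadata-driven lakehouse addressing across workspaces.
# All workspace/lakehouse identifiers below are hypothetical placeholders.

WORKSPACES = {
    # (environment, layer) -> (workspace_id, lakehouse_id)
    ("dev", "bronze"): ("ws-dev-bronze-guid", "lh-bronze-guid"),
    ("dev", "silver"): ("ws-dev-silver-guid", "lh-silver-guid"),
    ("dev", "gold"):   ("ws-dev-gold-guid",   "lh-gold-guid"),
    # ... test and prod entries follow the same pattern
}

def lakehouse_path(env: str, layer: str, table: str) -> str:
    """Build a OneLake ABFSS path for a table in a given env/layer."""
    ws_id, lh_id = WORKSPACES[(env, layer)]
    return (f"abfss://{ws_id}@onelake.dfs.fabric.microsoft.com/"
            f"{lh_id}/Tables/{table}")

# A single set of pipelines/notebooks can then read from one layer
# and write to the next without hard-coding any workspace:
src = lakehouse_path("dev", "bronze", "sales")
dst = lakehouse_path("dev", "silver", "sales")
```

With something like this, the same pipeline definition can run unchanged in dev, test and prod; only the environment parameter differs.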
Then how do we store all this in Git? I think each layer (Bronze, Silver and Gold) can have its own folder in DevOps, with the workspace items placed there. For deployment I imagine separate Bronze, Silver and Gold deployment pipelines driving the items through dev, test and prod, since putting everything into one deployment pipeline may be too complicated, or? In Azure DevOps I think this can be controlled better.
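Just to illustrate the folder idea, something like the layout below is what I have in mind: one repo, one folder per layer, each folder Git-connected to the corresponding dev workspace, and a separate deployment pipeline per layer promoting dev to test to prod. The folder and item names are only an example.

```
repo/
├── bronze/    # Bronze dev workspace items: ingestion pipelines, lakehouse
├── silver/    # Silver dev workspace items: cleansing notebooks, lakehouse
└── gold/      # Gold dev workspace items: semantic models, reports, lakehouse
```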
Phew, sorry for putting so much into one thread. I'll give you some of my sources for these thoughts, so perhaps you can understand my reasoning better.
Have a nice day or weekend