
19 contributions to Learn Microsoft Fabric
Exciting Governance features revealed (First Look, not released yet)
Just saw this post on LinkedIn from Jon Stjernegaard Vöge about some new governance features currently in development. I'm glad Microsoft is taking this direction; it looks very promising. Jon's post:

"There were a few hidden gems revealed at #FABCONEUROPE yesterday which were not shown at the keynote: Fabric / OneLake Catalog and Data Access Hub, two extremely promising governance features which will help patch some of the current shortcomings. I snapped a few (poor) pictures, which might give you an idea.

1) Fabric / OneLake Catalog: an at-a-glance overview of governance status by Domain, Workspace, and Item with suggested actions, plus easy-to-browse lists and lineage views of all items, all searchable and filterable. It also seems Domains in general will play a larger role in your architecture.

2) Data Access Hub: are you also trying to keep track of which people have access to what? Microsoft is planning a one-stop shop for data access, where you can browse, review, and edit all user and item permissions across your data estate. Sensitivity Labels and Endorsements appear tightly integrated and will play a pivotal role in this as well 🤝

The exact timeline and functionality for these appear unknown at this time, but I'm personally very excited! What do you think?"
17
5
New comment Sep 28
1 like • Sep 26
Is this supposed to mimic Purview functionality within Fabric? We will be adopting Purview as part of our tech stack for data catalog and governance when we begin implementing an MDM solution.
Need help with measure performance on direct lake model
Hi everyone, I have created the following measure on a semantic model using Direct Lake, which looks something like this:

Measure =
SWITCH(
    TRUE(),
    SELECTEDVALUE(Transactions[ClassID]) IN {1234, 5678}, SUM(Transactions[Quantity]),
    DISTINCTCOUNT(Transactions[TransactionID])
)

The measure works perfectly fine in a Power BI card visual, but when I try to use the same measure in a table/matrix, any resulting query basically times out. I've tried the following, but nothing seems to work:
- Using IF instead of SWITCH made no difference; the query still times out.
- Using other aggregate functions instead of DISTINCTCOUNT (for example, both branches using SUM) also made no difference; the query still times out.
- Even with both branches doing SUM(Transactions[Quantity]) it still doesn't work. Measure = SUM(Transactions[Quantity]) on its own works perfectly fine, of course.

Without showing the entire DAX query, when I traced the query in Power BI Desktop for a simple matrix, snippets of it look like the below, which eventually times out:

VAR __DS0Core =
    SUMMARIZECOLUMNS(
        ROLLUPADDISSUBTOTAL(
            ROLLUPGROUP(
                'Transactions'[TransactionDisplayName],
                'CampaignCategory'[CampaignCategoryName],
                'Item'[ItemName],
                'Class'[ClassName]
            ),
            "IsGrandTotalRowTotal"
        ),
        __DS0FilterTable,
        __DS0FilterTable2,
        __DS0FilterTable3,
        "SumQuantity", CALCULATE(SUM('silver_Transaction'[Quantity])),
        "Measure", 'Transactions'[Measure]
    )

I'm not good at DAX, so if anyone can shed some insight on how to rewrite this measure to perform well in a table/matrix, it would be highly appreciated. Thanks in advance.
0
3
New comment Sep 25
0 likes • Sep 24
@Yann Franck Transaction ID is not unique; a single Transaction ID can contain multiple line items, separated by Transaction Line ID. However, I tried COUNT instead of DISTINCTCOUNT on Transaction ID and the measure still times out.
0 likes • Sep 25
I did some more tests today with additional observations; unfortunately, they don't amount to a solution to the DAX query performance issue:
- The same measure performs poorly in both Direct Lake mode and Import mode, so Direct Lake doesn't appear to be solely to blame for this problem.
- I did additional testing displaying the measure in a table. The DAX query still runs quickly when it only uses the Transactions fact table, but once I introduce joins to dimension tables, performance suffers significantly. The query still returns if I join one or two smaller dimension tables, but once I start joining larger ones (the fact table has ~300k rows while the large dimension table has ~20k rows), performance degrades quickly. When the query joins the fact table to one large dimension table and another (smaller) dimension table, it times out.

For now I'll resort to calculating and storing the output of the measure within the transaction fact table in the lakehouse (since calculated columns are not supported in Direct Lake, IIRC), but if anyone knows how to optimize the DAX measure, it would be much appreciated.
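One way to sketch the "store it in the fact table" workaround: since the DISTINCTCOUNT branch is usually the expensive part, a precomputed per-row weight of 1/(lines per transaction) lets a plain SUM reproduce the distinct transaction count. The sample rows and column names below are hypothetical stand-ins for the Transactions table in the post; in the lakehouse the same logic would be a PySpark window or groupBy/join rather than plain Python.

```python
from collections import Counter

# Hypothetical sample rows mimicking the Transactions fact table.
rows = [
    {"TransactionID": "T1", "TransactionLineID": 1, "ClassID": 1234, "Quantity": 2},
    {"TransactionID": "T1", "TransactionLineID": 2, "ClassID": 9999, "Quantity": 5},
    {"TransactionID": "T2", "TransactionLineID": 1, "ClassID": 5678, "Quantity": 3},
]

# Precompute a per-row weight so that SUM(TxnWeight) reproduces
# DISTINCTCOUNT(TransactionID): each transaction's lines sum to exactly 1.
line_counts = Counter(r["TransactionID"] for r in rows)
for r in rows:
    r["TxnWeight"] = 1.0 / line_counts[r["TransactionID"]]

distinct_txns = sum(r["TxnWeight"] for r in rows)
print(distinct_txns)  # → 2.0 (two distinct transactions: T1, T2)
```

Caveat: the weighted sum only matches DISTINCTCOUNT when report filters never split the lines of a single transaction apart, so it's a sketch of the trade-off, not a drop-in replacement.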
AI Skills... does this change everything?
Yesterday, Microsoft released the AI Skill item to Public Preview. I hope I'm not going over the top here when I say: I really think there's a chance that this feature could change nearly everything we know about business analytics. If a business user can get reliable (key point) answers to their business questions, within seconds, without the need for an analyst or a dashboard... then doesn't that change quite a lot? Databricks seem to be going massively in this direction too with their recent AI/BI feature. All up for discussion, of course, and highly dependent on whether the feature actually works properly 😂 What are your thoughts?
11
13
New comment Sep 22
0 likes • Aug 15
@Will Needham Thanks Will, that's what I thought as well, since MS turned off Copilot for trial capacity, until I saw this YouTube video demoing AI Skills in Fabric and noticed that he is on a Fabric trial as well. I guess he managed to get AI Skills working some other way, and I dropped a comment asking how he got it working on trial capacity: https://www.youtube.com/watch?v=5ivA4zdnDtE
0 likes • Aug 15
@Will Needham Ah, that makes sense. As an MS employee, he probably has access to multiple Fabric capacities, both trial and non-trial.
Ingest nested JSON to Lakehouse tables
Hello! I am trying to ingest the JSON file below into (two) tables in a Lakehouse. https://www.kaggle.com/datasets/aditeloo/the-world-dataset-of-covid19?resource=download&select=owid-covid-data.json

I was testing Pipelines and PySpark notebooks for this task. The file is less than 50 MB, so it is fairly small.

1) A Pipeline cannot handle (preview) this file as a source. I have attached two screenshots showing the error.
2) The file is fairly simple; however, its data is nested. It has countries (a dimension table) and, for each country, daily COVID cases (a fact table). This means I can attempt to load "Dim Country" and "Fact Covid" tables using PySpark. However, due to the structure of the JSON file, it does not fit nicely into a Spark DataFrame: each country code appears as a column instead of a row.

I am looking for a way to get two DataFrames, one for "Dim Country" and another for "Fact Covid", saved as Delta tables in the Lakehouse. I have added two screenshots. I am keen to hear feedback from other users, and if someone can try to load this file and guide me in the right direction, I'd be very grateful.
1
20
New comment Aug 17
1 like • Aug 14
@Austin Libal Thanks! This is great. I need this for other use cases where I get JSON responses from API calls; in most cases I've been using json_normalize from pandas to extract the nested JSON and concat the pandas DataFrames, but your solution works much better.
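The json_normalize approach mentioned above can be sketched like this. The miniature input is a hypothetical stand-in for the owid-covid-data.json structure (country codes as top-level keys, each with scalar attributes plus a nested "data" list of daily records):

```python
import pandas as pd

# Hypothetical miniature of the owid-covid-data.json structure.
raw = {
    "AFG": {"location": "Afghanistan", "continent": "Asia",
            "data": [{"date": "2020-03-01", "new_cases": 1.0},
                     {"date": "2020-03-02", "new_cases": 0.0}]},
    "ALB": {"location": "Albania", "continent": "Europe",
            "data": [{"date": "2020-03-09", "new_cases": 2.0}]},
}

# Dim Country: one row per country code, dropping the nested daily data.
dim_country = pd.DataFrame(
    [{"iso_code": code, **{k: v for k, v in attrs.items() if k != "data"}}
     for code, attrs in raw.items()]
)

# Fact Covid: flatten each country's "data" list, tagged with its iso_code.
fact_covid = pd.json_normalize(
    [{"iso_code": code, "data": attrs["data"]} for code, attrs in raw.items()],
    record_path="data", meta=["iso_code"],
)

print(dim_country.shape, fact_covid.shape)  # → (2, 3) (3, 3)
```

In a Fabric notebook, each pandas DataFrame could then be converted with spark.createDataFrame(...) and written out as a Delta table; that last step is environment-specific and not shown here.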
Data Pipeline Failure Notification
Does anyone have a good idea how to get a notification when a data pipeline fails? As far as I know, there are notification activities (Teams or Outlook), but they work at the activity level, not the pipeline or workspace level: you'd need to add a notification activity for each pipeline activity to monitor the entire pipeline properly. Someone suggested wrapping the pipeline within another pipeline and running an Outlook or Teams activity if the wrapped pipeline fails. Any good ideas?
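As a sketch of the wrapper idea, the outer layer's failure path (for example, a notebook activity) could post to a Teams incoming webhook. The webhook URL, pipeline name, and error text below are assumptions for illustration, not Fabric-specific APIs; only the card format follows the generic Teams MessageCard shape.

```python
import json
from urllib import request

def build_failure_card(pipeline_name: str, error_message: str) -> dict:
    """Build a simple Teams MessageCard payload for a pipeline-failure alert."""
    return {
        "@type": "MessageCard",
        "@context": "http://schema.org/extensions",
        "themeColor": "CC0000",
        "summary": f"Pipeline failed: {pipeline_name}",
        "title": f"Data pipeline '{pipeline_name}' failed",
        "text": error_message,
    }

def notify_teams(webhook_url: str, card: dict) -> None:
    """POST the card to an incoming-webhook URL configured on a Teams channel."""
    req = request.Request(
        webhook_url,
        data=json.dumps(card).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    request.urlopen(req)  # raises on HTTP errors

# Hypothetical failure details; notify_teams(...) would be called with a real
# webhook URL inside the wrapper's on-failure branch.
card = build_failure_card("nightly_ingest", "Copy activity timed out after 3 retries")
print(card["summary"])  # → Pipeline failed: nightly_ingest
```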
3
11
New comment Aug 7
1 like • Aug 7
@Jerry Lee I can live without built-in key management for now, as long as it's easy enough to access Azure Key Vault from a notebook, but the lack of alerting capabilities in the Fabric monitoring hub is a huge oversight. I did end up trying to link Fabric with Log Analytics, and as expected, only Power BI-related activities flow into Log Analytics and nothing from notebook/pipeline runs, which is quite disappointing. I'll likely need to log a feature request with MS.
2 likes • Aug 7
@Adeola Adeyemo The problem is that it's still a workaround you need to apply to individual notebooks/pipelines, while the Fabric monitoring hub already has most of the metadata about notebook/pipeline runs. It'd be great if this information could be easily queried, similar to data in Log Analytics. I found and upvoted the idea below, though I'm sure similar ideas have already been posted as well: https://ideas.fabric.microsoft.com/ideas/idea/?ideaid=6fdb7392-d79f-ee11-92bd-6045bd85676f
Brian Szeto
Level 3 · 35 points to level up
@brian-szeto-2173
I'm a data architect with extensive data warehousing experience, primarily focused on the financial industry.

Active 4d ago
Joined Feb 26, 2024