Chinmay Phadke

Learn Microsoft Fabric

Activity

Mon

Wed

Fri

Sun

Dec

Jan

Feb

Mar

Apr

May

Jun

Jul

Aug

Sep

Oct

Nov

What is this?

Less

Memberships

Learn Microsoft Fabric

Public • 5.3k • Free

Learn Power Apps

Private • 2k • Free

13 contributions to Learn Microsoft Fabric

Will Needham

Sep 24 in

General

Incremental Refresh for Dataflow Gen2 (Public Preview)

Hey everyone, the conference updates are now starting 👀 Starting with this incremental refresh feature added to the Dataflow Gen2. You can read more here: https://blog.fabric.microsoft.com/en-us/blog/announcing-public-preview-incremental-refresh-in-dataflows-gen2/

New comment 23d ago

Incremental Refresh for Dataflow Gen2 (Public Preview)

Chinmay Phadke

1 like • Oct 3

Finally this feature is here. Better late than never. I wish Incremental Refresh feature was available in Dataflow Gen2 months back. Our pains in the ongoing implementation would have been reduced.

Maksim Khamets

Jul 8 in

Technical

Seeking Best Practices for CSV to JSON Transformation in Fabric

Hi everyone! My name is Max, and I work as an analyst in a small company. And i'm thrilled to build Fabric solutions for my tasks instead of huge piles of different tools that are hard to log and monitor. I'm new to Microsoft Fabric, and as an exercise to learn it, I decided to migrate one of my existing projects to the Fabric architecture. I'm facing some challenges with the architecture. Here is a brief description of the old process: 1. The client sends a list of agencies in a .csv file. 2. Validation and mapping are done within a Logic App. 3. The agency table is transformed into a set of JSON objects (one object per row) using Logic App and Azure Functions. The resulting JSON object has a more complex structure than the original table, so it cannot be converted 1:1 from the table to JSON. 4. All JSONs are saved in Blob Storage (in the new architecture, I was thinking of switching to Lakehouse to better orchestrate and log all these JSON files). While the first two steps are straightforward to implement in Fabric (thanks to Will's videos), I'm struggling with the third step. I can't figure out what is the best way to separate csv rows into set of json files. I have tried PySpark, but either I didn't understand something, or it's not the best fit for this task - saving JSON files to Lake Storage is behaving oddly. Also, I couldn't find a suitable solution in DataFlow and Pipeline (calling Azure Function seems like a possible option, but I can't figure out the best way to organize it). SQL doesn't look helpful, but It doesn't accept FOR JSON for some reason. Or should I just use something outside Fabric for this specific operation? I would appreciate any ideas on how to make this transformation as efficient and effective as possible, thanks! 💚

New comment Jul 10

Chinmay Phadke

0 likes • Jul 9

From the explanation you provided, if you need to leverage Microsoft Fabric, then there is no need of converting to JSON in this scenario, unless there is a specific requirement. Fabric is capable of Ingesting the CSV directly into Lakehouse tables, using 3 approaches: 1. Dataflow Gen2 2. Copy activity in Data factory pipelines 3. Pyspark notebooks

Bijal Patel

Jul 9 in

Technical

Cross-workspace access.

Hello everyone, Is it possible to access one dataflow from different workspaces.

New comment Jul 10

Chinmay Phadke

3 likes • Jul 9

"Accessing a dataflow from different workspace" has multiple interpretations. Please check if any of these answer your questions. All these points have an underlying assumption that all these multiple workspaces are accessible by the user. 1. If your question is that when we open up workspace A and land up on its homepage, does the dataflow from workspace B get listed here or can be searched here? Answer is No. 2. A dataflow defined in workspace A, can choose tables from lakehouse belonging to workspace B as sources and can choose destination tables as those from lakehouse belonging to workspace C. 3. A dataflow is defined in workspace A but you want to create a pipeline in workspace B, where one of the activities is expected to be thus dataflow from workspace A. Answer is Yes, this is possible.

Will Needham

May 20 in

General

What are your biggest pain points currently with Fabric?

Hey everyone, happy Monday! I'm currently planning out future content for the YouTube channel, and want to always produce the content that is most relevant/helpful to you! So, some questions: - What are your biggest pain points currently with Fabric? - Anything you're struggling to understand? - Things you think are important, but don't quite grasp yet? It's this kind of engagement that led to the Power BI -> Fabric series, and then the DP-600 series, so I thought I'd open it up to you again! Look forward to hearing from you - thank you! Here's some potential things: - Delta file format - integrating AI/ ML/ Azure AI / Open AI - copilot - Git integration / deployment pipelines / CI/CD - data modelling / implementing SCDs - medallion implementation - more advanced pyspark stuff - data pipelines - metadata driven workflows - dataflows (and optimising dataflows) - lakehouse achitectures - real-time - data science projects - semantic link - migrating semantic models - using python to manage semantic models - administration/ automation - fabric api - other...?

Complete action

New comment Aug 7

What are your biggest pain points currently with Fabric?

Chinmay Phadke

1 like • May 21

@Will Needham this is my list itself. I faced each of these during my ongoing implementation at a client project. It was a thrilling experience to overcome each of these challenges by exploring multiple alternate approaches. 🙂

Chinmay Phadke

2 likes • May 29

@Will Needham and @Julio Ochoa , I really appreciate the efforts you took in reading my long comment so patiently and answering those. Hats off to your efforts. My "knowledge repository" about Fabric features has definitely enriched on reading these answers. Thank you very much. Some of the limitations which I mentioned are due to certain constraints in the customer environment and for most of the questions, I got really good inputs from both of you. I will surely get back on those however, in mid-June. The go-live of this Fabric implementation is scheduled next week. There is an excitement as well as fear. With the blessings of you and the community members, I am sure the ride will be smoother. Can't wait to share my experiences and takeaways from this Fabric implementation once the go-live and post-go-live processes are accomplished.

Steve Malcolm

May 28 in

Technical

Explicit Path Question

Good morning, I have been following along with the Spark Tutorial in Microsoft Fabric and ran into an interesting issue that I have yet to understand. After leaving my office Friday I paused the lesson after running the line of code: df = spark.read.csv('Files/property-sales-extended.csv', header=True, inferSchema=True) df.show() The code ran fine on Friday and did again this morning. After loading another file the code failed: df = spark.read.csv('Files/property-sales-missing.csv', header=True, inferSchema=True) df.show() The error message was that the path was not valid. Same LakeHouse, same folder, same everything as far as I could tell. I solved the issue by changing the call to explicitly include the name of the LakeHouse folder: df = spark.read.csv('Files/csv/property-sales-missing.csv', header=True, inferSchema=True) df.show() The ABFS path also works. Can someone help me understand the reason for this? Thanks, Steve Malcolm

New comment May 29

Chinmay Phadke

0 likes • May 29

@Steve Malcolm , plz try with this code: df = spark.read.csv('/lakehouse/default/Files/property-sales-extended.csv', header=True, inferSchema=True)

1-10 of 13

Level 3

40points to level up

Chinmay Phadke

@chinmay-phadke-9568

I am a Data Engineer.

Active 3d ago

Joined Apr 2, 2024

Contributions

Followers

Following