Another noob question...
If Spark DataFrames are processed 'in memory', isn't there a limit on the size of the frame relative to the cluster size and available RAM? Fine if you have a cluster of many machines, but in most scenarios the cluster won't be that big. Is there some spilling to disk to handle this?
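For context on what the question is asking about: Spark does spill to disk when execution memory runs out (e.g. during shuffles, sorts, and aggregations), and cached DataFrames can be given a storage level like `MEMORY_AND_DISK` so blocks that don't fit in RAM go to local disk instead of failing. A rough sketch of the knobs involved, assuming Spark 3.x defaults (exact values and behavior depend on version and deployment):

```
# spark-defaults.conf — sketch of memory settings related to spilling (Spark 3.x defaults shown)

# Fraction of (heap - 300MB) shared by execution and storage memory;
# execution that exceeds its share spills intermediate data to local disk.
spark.memory.fraction          0.6

# Portion of the above protected for cached (storage) blocks;
# execution can borrow the rest and force spills when it needs room.
spark.memory.storageFraction   0.5

# Where shuffle and spill files are written on each worker.
spark.local.dir                /tmp
```

So the practical limit is less about total DataFrame size than about having enough local disk for shuffle/spill files; spilling keeps jobs running but is much slower than staying in memory.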
Timothy Blackwell