Another noob question...
If Spark DataFrames are processed in memory, isn't there a limit on the size of a frame relative to the cluster size and available RAM? That's fine if you have a cluster of many machines, but in most scenarios the cluster won't be that big. Is there some spilling to disk to handle this?
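To make the scenario concrete, here's a rough PySpark sketch of what I'm picturing. The lakehouse path and column name are just placeholders, not a real dataset:

```python
from pyspark.sql import SparkSession
from pyspark import StorageLevel

spark = SparkSession.builder.appName("spill-demo").getOrCreate()

# Placeholder path -- imagine any table bigger than executor RAM.
df = spark.read.parquet("/lakehouse/default/Files/big_table")

# MEMORY_AND_DISK keeps partitions in memory and spills the ones that
# don't fit to local disk instead of throwing an out-of-memory error.
df.persist(StorageLevel.MEMORY_AND_DISK)

# Wide operations like this aggregation also spill shuffle data to disk
# automatically when execution memory runs low.
df.groupBy("some_column").count().show()
```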