Super impressive in terms of audio flow. It feels more like a natural back-and-forth: less scripted, little to no gaps, with crosstalk and human artifacts. Without any control, though, it's hard to get what you actually want. I tried to transcribe those speech nuances using Whisper, but it does not distinguish between voices.