I am working on a new app that allows drawing on top of a video in real time. There are a variety of use cases: live sporting events, adding notes or funny doodles, pointing things out, or covering things up. The app will support both the device camera and loading a video from a file.
The biggest challenge I have faced so far has been keeping the audio from a file in sync with the current player position. I want the user to be able to pause playback but continue recording, seek randomly in the stream, or start and stop recording rapidly for stop-motion effects. All of this requires pulling the exact audio samples spanning the previous frame to the current frame. A daunting challenge indeed.
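One way to pull exactly the audio between two player positions is to point an AVAssetReader at just that span via its timeRange property. This is a hedged sketch rather than the app's actual code: the URL, the frame times, and the omission of error handling are all illustrative.

```objc
// Illustrative setup: read only the audio between two frame timestamps.
NSURL *videoURL = [NSURL fileURLWithPath:@"movie.mov"];    // hypothetical file
AVAsset *asset = [AVAsset assetWithURL:videoURL];
AVAssetTrack *audioTrack =
    [[asset tracksWithMediaType:AVMediaTypeAudio] firstObject];

NSError *error = nil;
AVAssetReader *reader = [AVAssetReader assetReaderWithAsset:asset
                                                      error:&error];
AVAssetReaderTrackOutput *output =
    [AVAssetReaderTrackOutput assetReaderTrackOutputWithTrack:audioTrack
                                               outputSettings:nil];
[reader addOutput:output];

// Restrict the reader to the span between the previous and current frame.
// Example values only; in practice these come from the player position.
CMTime previousFrameTime = CMTimeMake(1001, 30000);
CMTime currentFrameTime  = CMTimeMake(2002, 30000);
reader.timeRange = CMTimeRangeFromTimeToTime(previousFrameTime,
                                             currentFrameTime);

if ([reader startReading]) {
    CMSampleBufferRef sample = NULL;
    while ((sample = [output copyNextSampleBuffer]) != NULL) {
        // Hand the samples to the writer or mixer here.
        CFRelease(sample);
    }
}
```

Note that timeRange must be set before calling startReading; seeking means tearing down the reader and creating a new one for the next range.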
I’ve delved deep into the realm of Core Audio in order to pull this off. I now know more about CMSampleBufferRef objects, CMBlockBufferRef objects, and all manner of dark and hidden secrets that Core Audio has to offer. I’m by no means an expert, but I know enough now to be dangerous 🙂
One of the trickiest parts of Core Audio is memory management. CMBlockBufferRef is a great way to represent and use audio samples, but it’s also very tricky to get working properly without crashing. AVFoundation likes to hold on to Core Audio objects longer than it should and release them lazily, so you can’t assume you can free your memory just because a call to [AVAssetWriterInput appendSampleBuffer:] has returned. You also can’t simply use CMBlockBufferCreateWithBufferReference to carve a sample into smaller chunks and write them out as separate CMSampleBufferRef objects; you will get weird audio artifacts if you do.
I ended up having to pull the raw bytes out of each sample with CMSampleBufferGetAudioBufferListWithRetainedBlockBuffer and re-copy them into my own block buffers created with CMBlockBufferCreateWithMemoryBlock. After many late nights, I finally feel like I have a handle on how to work with Core Audio.
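The copy-out approach can be sketched roughly as follows. Everything here is illustrative (the function name, the error handling) rather than the app's actual code, and it assumes interleaved audio, i.e. a single AudioBuffer in the list.

```objc
// Deep-copy a sample buffer's audio bytes into a block buffer we own
// outright, so AVFoundation's lazy releases can't bite us later.
static CMBlockBufferRef CreateOwnedAudioCopy(CMSampleBufferRef sample)
{
    AudioBufferList bufferList;
    CMBlockBufferRef sourceBlock = NULL;

    // Borrow a pointer to the raw audio bytes. The "RetainedBlockBuffer"
    // variant hands back a retained sourceBlock that we must release.
    OSStatus status = CMSampleBufferGetAudioBufferListWithRetainedBlockBuffer(
        sample,
        NULL,               // bufferListSizeNeededOut: not needed
        &bufferList,
        sizeof(bufferList),
        NULL,               // block buffer structure allocator
        NULL,               // block buffer memory allocator
        kCMSampleBufferFlag_AudioBufferList_Assure16ByteAlignment,
        &sourceBlock);
    if (status != noErr) {
        return NULL;
    }

    size_t length = CMBlockBufferGetDataLength(sourceBlock);

    // Create a fresh block buffer backed by memory that *we* allocate.
    CMBlockBufferRef ownedBlock = NULL;
    status = CMBlockBufferCreateWithMemoryBlock(
        kCFAllocatorDefault,
        NULL,               // NULL memoryBlock: allocate one for us
        length,
        kCFAllocatorDefault,
        NULL,               // no custom block source
        0,                  // offsetToData
        length,
        0,                  // flags
        &ownedBlock);
    if (status == noErr) {
        // Force the (possibly lazy) allocation, then copy the bytes in.
        status = CMBlockBufferAssureBlockMemory(ownedBlock);
    }
    if (status == noErr) {
        status = CMBlockBufferReplaceDataBytes(bufferList.mBuffers[0].mData,
                                               ownedBlock, 0, length);
    }

    CFRelease(sourceBlock);
    if (status != noErr) {
        if (ownedBlock) {
            CFRelease(ownedBlock);
        }
        return NULL;
    }
    return ownedBlock;  // caller is responsible for releasing
}
```

From there the owned block buffer can be wrapped in a new CMSampleBufferRef carrying the timing info for whatever slice is being written out.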