Acknowledgement part 5

Acknowledgement is taking a lot longer than I initially had hoped, but I continue to make progress.

One of the requests that came up during beta testing was speakerphone mode. While I was testing voice acknowledgement myself, I used earphones. This makes it easy to separate the microphone input from the clips that are playing. But the beta testers didn’t want to use earphones, or AirPods, or anything like that. They just wanted to plop the phone down and have it play clips, like it has always done, and still reiterate clips back to acknowledge. In other words, to use it like a speakerphone.

This was problematic because Reiterate kept triggering itself when used in speakerphone mode. The clip would play, and the app would hear the clip playing, and count it as acknowledgement.

One way to solve this would be to disable recording while a clip was playing. But that creates a frustrating experience. Often, if you’re paying attention, you can recognize a clip from the first word or two, and it’s very nice to be able to reiterate it back to acknowledge while the clip is still playing. Having to wait for the clip to finish feels awkward. I didn’t want to do that.

I know that it’s possible to use the iPhone as a speakerphone; there are plenty of apps that do exactly that. It turns out I need to use a different API.

There are three different audio APIs you can use to play and record audio on iOS. The first, and simplest, is AVAudioPlayer and AVAudioRecorder. These classes handle everything for you. You just have to give the player a file and it plays it. To record, just give the recorder a URL to save its data. Simple and easy.
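To give a sense of how little code this level requires, here’s a minimal sketch of playing one file and recording another. The URLs (`clipURL`, `recordingURL`) and the recording settings are illustrative assumptions, not Reiterate’s actual code:

```swift
import AVFoundation

// Hypothetical file locations for this sketch.
let clipURL = URL(fileURLWithPath: "/tmp/clip.m4a")
let recordingURL = URL(fileURLWithPath: "/tmp/recording.m4a")

// Playback: hand the player a file and tell it to play.
let player = try AVAudioPlayer(contentsOf: clipURL)
player.play()

// Recording: hand the recorder a destination URL and a format.
let settings: [String: Any] = [
    AVFormatIDKey: kAudioFormatMPEG4AAC,
    AVSampleRateKey: 44_100.0,
    AVNumberOfChannelsKey: 1,
]
let recorder = try AVAudioRecorder(url: recordingURL, settings: settings)
recorder.record()
```

That’s essentially the whole API surface: no buffers, no graphs, no callbacks unless you want them.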

The next level API is AVAudioEngine and AVAudioNode. These classes let you set up a network of audio processing nodes. At first glance, it looks like it’s geared towards music or MIDI apps, with Echo and Reverb nodes that you can connect together to do spiffy effect processing.
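For flavor, here’s a sketch of what that node-graph style looks like: a player node routed through a reverb effect to the output. Again, `clipURL` is a placeholder, and the preset and mix values are arbitrary:

```swift
import AVFoundation

let engine = AVAudioEngine()
let playerNode = AVAudioPlayerNode()
let reverb = AVAudioUnitReverb()
reverb.loadFactoryPreset(.mediumHall)
reverb.wetDryMix = 30  // percent wet; arbitrary for this sketch

engine.attach(playerNode)
engine.attach(reverb)

// Wire the graph: player -> reverb -> main mixer -> hardware output.
engine.connect(playerNode, to: reverb, format: nil)
engine.connect(reverb, to: engine.mainMixerNode, format: nil)

try engine.start()

// Schedule a file on the player node and start it.
let clipURL = URL(fileURLWithPath: "/tmp/clip.m4a")  // hypothetical
let file = try AVAudioFile(forReading: clipURL)
playerNode.scheduleFile(file, at: nil)
playerNode.play()
```

It’s more ceremony than AVAudioPlayer, but the graph is what buys you access to the processing units.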

Finally, at the lowest level, there’s Core Audio. If you Google for developer tips on how to use Core Audio you’ll find lots of warnings and admonitions that basically say “stay away from this”. It seems to be awkward to use and not documented very well.

When I first developed Reiterate, the simplest API seemed the most appropriate, so it plays its clips with AVAudioPlayer and records using AVAudioRecorder. And that’s worked fine, up to now.

The problem is that in order to let the iPhone record audio while playing back, I need to enable echo cancellation, which is part of iOS’s voice processing support. And that’s available only through the AVAudioEngine API.
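Concretely, the hook is `setVoiceProcessingEnabled(_:)` on the engine’s input node (iOS 13 and later), which turns on Apple’s echo canceller so the microphone tap stops hearing the clip the app itself is playing. This is a minimal sketch of the setup, not Reiterate’s actual implementation; the session options and tap parameters are assumptions:

```swift
import AVFoundation

// Configure the session for simultaneous play and record, routed
// to the built-in speaker (speakerphone-style).
let session = AVAudioSession.sharedInstance()
try session.setCategory(.playAndRecord, options: [.defaultToSpeaker])
try session.setActive(true)

let engine = AVAudioEngine()

// Enabling voice processing on the input node turns on echo
// cancellation, so playback through the speaker is subtracted
// from what the microphone hears.
try engine.inputNode.setVoiceProcessingEnabled(true)

// Tap the (now echo-cancelled) microphone input.
let format = engine.inputNode.outputFormat(forBus: 0)
engine.inputNode.installTap(onBus: 0, bufferSize: 1024, format: format) { buffer, _ in
    // Feed the buffer to the acknowledgement detector here.
}

try engine.start()
```

The catch, as described above, is that once the engine owns the audio session like this, the old AVAudioPlayer/AVAudioRecorder code can’t coexist with it.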

It also turns out that you can’t mix-and-match different audio APIs. If you try to do that, they step on each other, and you get all sorts of funny runtime exceptions with obscure error codes you can’t even Google.

So what I now need to do is rip out all the existing audio code in Reiterate, and replace it with code based on AVAudioEngine. I hope that gives me what I want. If I have to go all the way down to Core Audio it’s going to get very messy.
