音频编码 Audio Converter

需求

iOS中将采集到的原始音频数据(PCM)进行编码以得到压缩数据类型(AAC...).

本例最终实现的是通过Audio Unit采集到PCM数据,将其压缩转为AAC数据,并以录制的形式保存在沙盒中.可调整编码后音频数据格式,采样率,编码器类型等参数.


实现原理

利用Audio Toolbox Framework中的Audio Converter可以实现音频数据编码,即将PCM数据转为其他压缩格式.


阅读前提:


GitHub地址(附代码) : 音频编码

简书地址 : 音频编码

掘金地址 : 音频编码

博客地址 : 音频编码


1.初始化

1.1. 初始化编码器

初始化编码器实例, 通过指定原始数据格式,最终编码后的格式,采样率,以及使用硬编还是软编,以下是具体步骤.

- (instancetype)initWithSourceFormat:(AudioStreamBasicDescription)sourceFormat destFormatID:(AudioFormatID)destFormatID sampleRate:(float)sampleRate isUseHardwareEncode:(BOOL)isUseHardwareEncode {
    if (self = [super init]) {
        mSourceFormat   = sourceFormat;
        mAudioConverter = [self configureEncoderBySourceFormat:sourceFormat
                                                    destFormat:&mDestinationFormat
                                                  destFormatID:destFormatID
                                                    sampleRate:sampleRate
                                           isUseHardwareEncode:isUseHardwareEncode];
    }
    return self;
}

1.2. 配置编码后ASBD音频流信息

 AudioStreamBasicDescription destinationFormat = {};
    destinationFormat.mSampleRate = sampleRate;
    if (destFormatID == kAudioFormatLinearPCM) {
        NSLog(@"Not get PCM format after encoding !");
        return NULL;
    } else {
        destinationFormat.mFormatID = destFormatID;
        
        // For iLBC, the number of channels must be 1.
        destinationFormat.mChannelsPerFrame = (destFormatID == kAudioFormatiLBC ? 1 : sourceFormat.mChannelsPerFrame);
        
        // Use AudioFormat API to fill out the rest of the description.
        size = sizeof(destinationFormat);
        if (![self checkError:AudioFormatGetProperty(kAudioFormatProperty_FormatInfo, 0, NULL, &size, &destinationFormat) withErrorString:@"AudioFormatGetProperty couldn't fill out the destination data format"]) {
            return NULL;
        }
    }
    memcpy(destFormat, &destinationFormat, sizeof(AudioStreamBasicDescription));

对音频做编码操作,实际就是将PCM格式转为如AAC等音频压缩格式(VBR格式),通过kAudioFormatProperty_FormatInfo属性可以自动获取指定音频格式的参数信息.

注意: 如果音频格式是iLBC, 声道数只能为1.

1.3. 选择编码器类型

AudioClassDescription结构体描述了系统使用音频编码器信息,其中最重要的就是指定使用硬编或软编。然后编码器的数量,即数组的个数,由当前的声道数决定。

  // encoder conut by channels.
    AudioClassDescription requestedCodecs[destinationFormat.mChannelsPerFrame];
    const OSType subtype = destFormatID;
    for (int i = 0; i < destinationFormat.mChannelsPerFrame; i++) {
        AudioClassDescription codec = {
            kAudioEncoderComponentType,
            subtype,
            isUseHardwareEncode ? kAppleHardwareAudioCodecManufacturer : kAppleSoftwareAudioCodecManufacturer,
        };
        requestedCodecs[i] = codec;
    }

注意:硬编即利用设备GPU硬件完成高效编码,降低CPU消耗. 软编就是传统的通过CPU计算。

1.4. 创建编码器

AudioConverterNewSpecific: 通过指定编码器来创建audio converter实例对象.第3,4个
分别是编码器的数量与编码器描述,同上,与声道数保持一致.

 // Create the AudioConverterRef.
    AudioConverterRef converter = NULL;
    if (![self checkError:AudioConverterNewSpecific(&sourceFormat, &destinationFormat, destinationFormat.mChannelsPerFrame, requestedCodecs, &converter) withErrorString:@"AudioConverterNew failed"]) {
        return NULL;
    }else {
        printf("Audio converter create successful \n");
    }

1.5. 设置码率

我们可以手动设置需要的码率,如果没有特殊要求一般可以根据采样率使用建议值,如下.

 /*
     If encoding to AAC set the bitrate kAudioConverterEncodeBitRate is a UInt32 value containing
     the number of bits per second to aim for when encoding data when you explicitly set the bit rate
     and the sample rate, this tells the encoder to stick with both bit rate and sample rate
     but there are combinations (also depending on the number of channels) which will not be allowed
     if you do not explicitly set a bit rate the encoder will pick the correct value for you depending
     on samplerate and number of channels bit rate also scales with the number of channels,
     therefore one bit rate per sample rate can be used for mono cases and if you have stereo or more,
     you can multiply that number by the number of channels.
     */
    
    if (destinationFormat.mFormatID == kAudioFormatMPEG4AAC) {
        UInt32 outputBitRate = 64000;
        
        UInt32 propSize = sizeof(outputBitRate);
        
        if (destinationFormat.mSampleRate >= 44100) {
            outputBitRate = 192000;
        } else if (destinationFormat.mSampleRate < 22000) {
            outputBitRate = 32000;
        }
        outputBitRate *= destinationFormat.mChannelsPerFrame;
        
        // Set the bit rate depending on the sample rate chosen.
        if (![self checkError:AudioConverterSetProperty(converter, kAudioConverterEncodeBitRate, propSize, &outputBitRate) withErrorString:@"AudioConverterSetProperty kAudioConverterEncodeBitRate failed!"]) {
            return NULL;
        }
        
        // Get it back and print it out.
        AudioConverterGetProperty(converter, kAudioConverterEncodeBitRate, &propSize, &outputBitRate);
        printf ("AAC Encode Bitrate: %u\n", (unsigned int)outputBitRate);
    }

1.6. 设置中断后是否可恢复

kAudioConverterPropertyCanResumeFromInterruption: 设置converter能否在中断后恢复.

如果没有显式实现该属性或get此属性返回错误,说明当前不是硬编,如果此查询返回1表明编码器可以在中断后恢复.否则不能恢复.


    /*
     Can the Audio Converter resume after an interruption?
     this property may be queried at any time after construction of the Audio Converter after setting its output format
     there's no clear reason to prefer construction time, interruption time, or potential resumption time but we prefer
     construction time since it means less code to execute during or after interruption time.
     */
    BOOL canResumeFromInterruption = YES;
    UInt32 canResume = 0;
    size = sizeof(canResume);
    OSStatus error = AudioConverterGetProperty(converter, kAudioConverterPropertyCanResumeFromInterruption, &size, &canResume);
    
    if (error == noErr) {
        /*
         we recieved a valid return value from the GetProperty call
         if the property's value is 1, then the codec CAN resume work following an interruption
         if the property's value is 0, then interruptions destroy the codec's state and we're done
         */
        
        if (canResume == 0) {
            canResumeFromInterruption = NO;
        }
        
        printf("Audio Converter %s continue after interruption!\n", (!canResumeFromInterruption ? "CANNOT" : "CAN"));
        
    } else {
        /*
         if the property is unimplemented (kAudioConverterErr_PropertyNotSupported, or paramErr returned in the case of PCM),
         then the codec being used is not a hardware codec so we're not concerned about codec state
         we are always going to be able to resume conversion after an interruption
         */
        
        if (error == kAudioConverterErr_PropertyNotSupported) {
            printf("kAudioConverterPropertyCanResumeFromInterruption property not supported - see comments in source for more info.\n");
            
        } else {
            printf("AudioConverterGetProperty kAudioConverterPropertyCanResumeFromInterruption result %d, paramErr is OK if PCM\n", (int)error);
        }
        
        error = noErr;
    }

2.编码

2.1. 估算音频大小

kAudioConverterPropertyMaximumOutputPacketSize: 可以查询编码后音频数据最大数值.此值常用来估算音频编码后最大值.可以通过此值为音频数据分配空间.

UInt32 outputSizePerPacket = destFormat.mBytesPerPacket;
    if (outputSizePerPacket == 0) {
        // if the destination format is VBR, we need to get max size per packet from the converter
        UInt32 size = sizeof(outputSizePerPacket);
        if (![self checkError:AudioConverterGetProperty(audioConverter, kAudioConverterPropertyMaximumOutputPacketSize, &size, &outputSizePerPacket) withErrorString:@"AudioConverterGetProperty kAudioConverterPropertyMaximumOutputPacketSize failed!"]) {
            return;
        }
    }

2.2. 为编码后音频数据预分配内存

我们可以将2.1中算出的最大size为这个Buffer list分配内存,也可用原始音频数据的大小为其分配内存,因为我们无法直接得知编码后数据到底是多大,所以用估算出来的最大值或原始数据大小分配内存都可以生效,因为最终编码器会将有效大小的值赋值进去.

// Set up output buffer list.
    AudioBufferList fillBufferList = {};
    fillBufferList.mNumberBuffers = 1;
    fillBufferList.mBuffers[0].mNumberChannels  = destFormat.mChannelsPerFrame;
    fillBufferList.mBuffers[0].mDataByteSize    = theOutputBufferSize;
    fillBufferList.mBuffers[0].mData            = malloc(theOutputBufferSize * sizeof(char));

2.3. 编码音频数据

解析AudioConverterFillComplexBuffer:用来编码音频数据.同时需要指定回调函数(C语言函数),

第二个参数即指定回调函数,此回调函数中主要做的是为即将编码的数据进行赋值,即我们要把原始音频数据赋值给回调函数中的ioData参数,这是我们在编码前最后一次控制原始音频数据,此回调函数执行后即完成了编码的过程,新的数据会填充到第五个参数中,也就是我们上面预定义的fillBufferList.

  • userInfo: 自定义一个结构体,用来与编码回调函数间交互以传递数据.在这里是将原始音频数据信息传给编码回调函数中.
  • ioOutputDataPackets: 填入函数中时表示原始音频数据包的数量,而函数调用完成时表示转换后输出的音频数据包总数
  • outputPacketDescriptions: 转换完成后,如果此参数非空,表示转换器输出使用的音频数据包描述,它必须提前分配好内存,以让转换器赋值到其中.

最终,我们将转换后得到的AAC数据以回调函数的形式传给调用者.


OSStatus EncodeConverterComplexInputDataProc(AudioConverterRef              inAudioConverter,
                                             UInt32                         *ioNumberDataPackets,
                                             AudioBufferList                *ioData,
                                             AudioStreamPacketDescription   **outDataPacketDescription,
                                             void                           *inUserData) {
    XDXConverterInfoType *info = (XDXConverterInfoType *)inUserData;
    ioData->mNumberBuffers              = 1;
    ioData->mBuffers[0].mData           = info->sourceBuffer;
    ioData->mBuffers[0].mNumberChannels = info->sourceChannelsPerFrame;
    ioData->mBuffers[0].mDataByteSize   = info->sourceDataSize;
    
    return noErr;
}

- (void)encodeFormatByConverter:(AudioConverterRef)audioConverter sourceBuffer:(void *)sourceBuffer sourceBufferSize:(UInt32)sourceBufferSize sourceFormat:(AudioStreamBasicDescription)sourceFormat dest:(AudioStreamBasicDescription)destFormat completeHandler:(void(^)(AudioBufferList *destBufferList, UInt32 outputPackets, AudioStreamPacketDescription *outputPacketDescriptions))completeHandler {
    ...
    
    XDXConverterInfoType userInfo   = {0};
    userInfo.sourceBuffer           = sourceBuffer;
    userInfo.sourceDataSize         = sourceBufferSize;
    userInfo.sourceChannelsPerFrame = sourceFormat.mChannelsPerFrame;
    
    UInt32 numberOutputPackets = 1;
    UInt32 theOutputBufferSize = sourceBufferSize;
    UInt32 ioOutputDataPackets = numberOutputPackets;
    AudioStreamPacketDescription outputPacketDescriptions;
    // Convert data
    OSStatus status = AudioConverterFillComplexBuffer(audioConverter,
                                                      EncodeConverterComplexInputDataProc,
                                                      &userInfo,
                                                      &ioOutputDataPackets,
                                                      &fillBufferList,
                                                      &outputPacketDescriptions);
    
    
    
    // if interrupted in the process of the conversion call, we must handle the error appropriately
    if (status != noErr) {
        if (status == kAudioConverterErr_HardwareInUse) {
            printf("Audio Converter returned kAudioConverterErr_HardwareInUse!\n");
        } else {
            if (![self checkError:status withErrorString:@"AudioConverterFillComplexBuffer error!"]) {
                return;
            }
        }
    } else {
        if (ioOutputDataPackets == 0) {
            // This is the EOF condition.
            status = noErr;
        }
        
        completeHandler(&fillBufferList, ioOutputDataPackets, &outputPacketDescriptions);
    }
}

3. 模块对接

因为音频编码要依赖音频采集,所以我们这里以audio unit采集为例作示范,即使用audio unit采集pcm数据然后使用此模块编码得到aac数据.如需了解请参考如下链接

3.1. 初始化编码器

如下,在音频采集的类中声明一个编码器实例变量,然后初始化它. 仅仅需要设置原始数据格式,编码后的格式,采样率,使用硬编,软编即可.

@property (nonatomic, strong) XDXAduioEncoder *audioEncoder;

...

        self->_audioEncoder = [[XDXAduioEncoder alloc] initWithSourceFormat:m_audioDataFormat
                                                               destFormatID:kAudioFormatMPEG4AAC
                                                                 sampleRate:44100
                                                        isUseHardwareEncode:YES];

3.2. 编码音频数据

在Audio Unit采集PCM音频数据的回调中将PCM数据送入编码器,然后在回调函数中将得到的AAC数据其写入文件.

static OSStatus AudioCaptureCallback(void                       *inRefCon,
                                     AudioUnitRenderActionFlags *ioActionFlags,
                                     const AudioTimeStamp       *inTimeStamp,
                                     UInt32                     inBusNumber,
                                     UInt32                     inNumberFrames,
                                     AudioBufferList            *ioData) {
    AudioUnitRender(m_audioUnit, ioActionFlags, inTimeStamp, inBusNumber, inNumberFrames, m_buffList);
    
    XDXAudioCaptureManager *manager = (__bridge XDXAudioCaptureManager *)inRefCon;

    void    *bufferData = m_buffList->mBuffers[0].mData;
    UInt32   bufferSize = m_buffList->mBuffers[0].mDataByteSize;
    
    [manager.audioEncoder encodeAudioWithSourceBuffer:bufferData
                                       sourceBufferSize:bufferSize
                                        completeHandler:^(AudioBufferList * _Nonnull destBufferList, UInt32 outputPackets, AudioStreamPacketDescription * _Nonnull outputPacketDescriptions) {
                                            if (manager.isRecordVoice) {
                                                [[XDXAudioFileHandler getInstance] writeFileWithInNumBytes:destBufferList->mBuffers->mDataByteSize
                                                                                              ioNumPackets:outputPackets
                                                                                                  inBuffer:destBufferList->mBuffers->mData
                                                                                              inPacketDesc:outputPacketDescriptions];
                                            }
                                            
                                            free(destBufferList->mBuffers->mData);
                                        }];
    

    
    return noErr;
}

3.4. 释放内存

使用完编码后的音频数据,记得释放内存.

free(destBufferList->mBuffers->mData);

4. 文件录制

此部分可参考另一篇文章: 音频文件录制

5. 释放编码器资源

如需释放内存,请保证编码器工作彻底结束后再释放内存.

- (void)freeEncoder {
    if (mAudioConverter) {
        AudioConverterDispose(mAudioConverter);
        mAudioConverter = NULL;
    }
}
最后编辑于
©著作权归作者所有,转载或内容合作请联系作者
  • 序言:七十年代末,一起剥皮案震惊了整个滨河市,随后出现的几起案子,更是在滨河造成了极大的恐慌,老刑警刘岩,带你破解...
    沈念sama阅读 206,126评论 6 481
  • 序言:滨河连续发生了三起死亡事件,死亡现场离奇诡异,居然都是意外死亡,警方通过查阅死者的电脑和手机,发现死者居然都...
    沈念sama阅读 88,254评论 2 382
  • 文/潘晓璐 我一进店门,熙熙楼的掌柜王于贵愁眉苦脸地迎上来,“玉大人,你说我怎么就摊上这事。” “怎么了?”我有些...
    开封第一讲书人阅读 152,445评论 0 341
  • 文/不坏的土叔 我叫张陵,是天一观的道长。 经常有香客问我,道长,这世上最难降的妖魔是什么? 我笑而不...
    开封第一讲书人阅读 55,185评论 1 278
  • 正文 为了忘掉前任,我火速办了婚礼,结果婚礼上,老公的妹妹穿的比我还像新娘。我一直安慰自己,他们只是感情好,可当我...
    茶点故事阅读 64,178评论 5 371
  • 文/花漫 我一把揭开白布。 她就那样静静地躺着,像睡着了一般。 火红的嫁衣衬着肌肤如雪。 梳的纹丝不乱的头发上,一...
    开封第一讲书人阅读 48,970评论 1 284
  • 那天,我揣着相机与录音,去河边找鬼。 笑死,一个胖子当着我的面吹牛,可吹牛的内容都是我干的。 我是一名探鬼主播,决...
    沈念sama阅读 38,276评论 3 399
  • 文/苍兰香墨 我猛地睁开眼,长吁一口气:“原来是场噩梦啊……” “哼!你这毒妇竟也来了?” 一声冷哼从身侧响起,我...
    开封第一讲书人阅读 36,927评论 0 259
  • 序言:老挝万荣一对情侣失踪,失踪者是张志新(化名)和其女友刘颖,没想到半个月后,有当地人在树林里发现了一具尸体,经...
    沈念sama阅读 43,400评论 1 300
  • 正文 独居荒郊野岭守林人离奇死亡,尸身上长有42处带血的脓包…… 初始之章·张勋 以下内容为张勋视角 年9月15日...
    茶点故事阅读 35,883评论 2 323
  • 正文 我和宋清朗相恋三年,在试婚纱的时候发现自己被绿了。 大学时的朋友给我发了我未婚夫和他白月光在一起吃饭的照片。...
    茶点故事阅读 37,997评论 1 333
  • 序言:一个原本活蹦乱跳的男人离奇死亡,死状恐怖,灵堂内的尸体忽然破棺而出,到底是诈尸还是另有隐情,我是刑警宁泽,带...
    沈念sama阅读 33,646评论 4 322
  • 正文 年R本政府宣布,位于F岛的核电站,受9级特大地震影响,放射性物质发生泄漏。R本人自食恶果不足惜,却给世界环境...
    茶点故事阅读 39,213评论 3 307
  • 文/蒙蒙 一、第九天 我趴在偏房一处隐蔽的房顶上张望。 院中可真热闹,春花似锦、人声如沸。这庄子的主人今日做“春日...
    开封第一讲书人阅读 30,204评论 0 19
  • 文/苍兰香墨 我抬头看了看天上的太阳。三九已至,却和暖如春,着一层夹袄步出监牢的瞬间,已是汗流浃背。 一阵脚步声响...
    开封第一讲书人阅读 31,423评论 1 260
  • 我被黑心中介骗来泰国打工, 没想到刚下飞机就差点儿被人妖公主榨干…… 1. 我叫王不留,地道东北人。 一个月前我还...
    沈念sama阅读 45,423评论 2 352
  • 正文 我出身青楼,却偏偏与公主长得像,于是被迫代替她去往敌国和亲。 传闻我的和亲对象是个残疾皇子,可洞房花烛夜当晚...
    茶点故事阅读 42,722评论 2 345