AQuA DLL library - develop your own voice and audio quality software

AQuA DLL Library

Purpose of the library
Library initialization
Library information
Data structure containing audio quality measurements
Audio quality estimation functions
Settings of audio quality algorithm
Checking file formats
Displaying audio quality testing results
Analysis of possible reasons for audio quality loss
    Duration distortion
    Delay/Advancing of audio signal activity
    Audio signal activity mistiming
    Corrupted signal spectrum
AQuA software benefits
Simple example of using AQuA DLL

Purpose of the library
AQuA DLL provides intrusive audio signal quality measurement by comparing an original sound signal with
a test signal. The original signal is treated as the reference (etalon) for audio quality: the more the test
signal differs from it, the lower the quality estimation. The software uses a set of objective and subjective
characteristics for perceptual audio quality measurement.
The library gives developers functions to obtain information about the library itself, to configure the
audio quality estimation algorithm, to retrieve audio quality test results and to prepare those results for
display. The AQuA library contains the following functions:

Library initialization
Library initialization includes two functions:

SSA_SDK_API bool SSA_InitLib(void);
- performs library initialization and checks the library copyrights and the period and validity of the
software license. If initialization succeeds the function returns “true”, otherwise it returns “false”.

SSA_SDK_API void SSA_ReleaseLib(void);
- this function must be called when work with the library is finished. Calling it is important for any
further work with the library.
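
The init/release pairing described above can be kept balanced with a small scope guard. The following is only a sketch: the class name ScopedLibrary is hypothetical, the two callables stand in for SSA_InitLib and SSA_ReleaseLib, and it assumes that the release call is needed only after a successful initialization, which this document does not state.

```cpp
#include <functional>
#include <utility>

// Hypothetical scope guard for an init/release pair.
// In real code the callables would be SSA_InitLib and SSA_ReleaseLib.
class ScopedLibrary {
public:
    ScopedLibrary(std::function<bool()> init, std::function<void()> release)
        : release_(std::move(release)), ok_(init()) {}

    // Assumption: release is only required after a successful init.
    ~ScopedLibrary() {
        if (ok_) release_();
    }

    ScopedLibrary(const ScopedLibrary&) = delete;
    ScopedLibrary& operator=(const ScopedLibrary&) = delete;

    // True when the init callable reported success.
    bool ok() const { return ok_; }

private:
    std::function<void()> release_;
    bool ok_;
};
```

With the library header available, usage would look like `ScopedLibrary lib(SSA_InitLib, SSA_ReleaseLib); if (lib.ok()) { ... }`, so the release call also runs on early returns.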

Library information
This function returns information about the library:

SSA_SDK_API TSSA_AQuA_Info * SSA_GetPAQuAInfo(void);
- returns a pointer to a data structure containing information about the AQuA library. The function
SSA_GetPAQuAInfo() may be called before the library is initialized by SSA_InitLib. The returned data
structure has the following format:

struct TSSA_AQuA_Info
{
    int dStructSize;                        // Structure size
    wchar_t * dCopyrightString;             // Copyright string
    wchar_t * dVersionString;               // Product name and version number string
    int dSampleRateLimit;                   // Maximum sampling frequency supported by the library
    int dChannelsLimit;                     // Maximum number of channels supported by the library
    bool isDifferentFFmtCheckingEnabled;    // Whether comparison of audio files in different formats is allowed
    wchar_t * pSupportedBitsPerSampleList;  // List of supported bits-per-sample values
    wchar_t * pSupportedCodecsList;         // List of supported audio compression algorithms
};
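
As an illustration of reading this structure, the sketch below formats a few of its fields into one string. The structure is redeclared locally only so that the sketch compiles on its own (real code would include the library header instead), and the helper names describeInfo and orEmpty are hypothetical. The null checks are a precaution; the document does not say whether the string fields can be null.

```cpp
#include <sstream>
#include <string>

// Local copy of the documented layout, for a standalone sketch only.
struct TSSA_AQuA_Info {
    int dStructSize;
    wchar_t* dCopyrightString;
    wchar_t* dVersionString;
    int dSampleRateLimit;
    int dChannelsLimit;
    bool isDifferentFFmtCheckingEnabled;
    wchar_t* pSupportedBitsPerSampleList;
    wchar_t* pSupportedCodecsList;
};

// Null-safe view of a wide string field.
static const wchar_t* orEmpty(const wchar_t* s) { return s ? s : L""; }

// Builds a short multi-line summary of the library information.
std::wstring describeInfo(const TSSA_AQuA_Info* info) {
    if (info == nullptr) return L"(no library information)";
    std::wostringstream out;
    out << orEmpty(info->dVersionString) << L"\n"
        << orEmpty(info->dCopyrightString) << L"\n"
        << L"sample rate limit: " << info->dSampleRateLimit << L"\n"
        << L"channels limit: " << info->dChannelsLimit << L"\n"
        << L"mixed formats allowed: "
        << (info->isDifferentFFmtCheckingEnabled ? L"yes" : L"no") << L"\n"
        << L"bits per sample: " << orEmpty(info->pSupportedBitsPerSampleList) << L"\n"
        << L"codecs: " << orEmpty(info->pSupportedCodecsList) << L"\n";
    return out.str();
}
```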

Data structure containing audio quality measurements
The structure contains the quality estimations obtained by the measurement. If an estimation (Percentage,
MOS or PESQ) was not requested, the corresponding field contains -1.

struct TSSA_AQuA_Results
{
    double dPercent;   // Percentage
    double dMOSLike;   // MOS-like
    double dPESQLike;  // PESQ-like
};

The structure is filled by the following function:
SSA_SDK_API int SSA_FillQualityResultsStruct(void * anSSA_ID,
    TSSA_AQuA_Results * aPQResults);
- fills the structure with the results of audio quality measurements.
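
A caller will usually want to know which of the three fields actually carry an estimation. The sketch below assumes that a field equal to -1 holds no estimation and, to stay robust against rounding, treats any negative value that way. The structure is redeclared locally so the sketch is standalone, and the helper names are hypothetical.

```cpp
#include <string>
#include <vector>

// Local copy of the documented layout, for a standalone sketch only.
struct TSSA_AQuA_Results {
    double dPercent;
    double dMOSLike;
    double dPESQLike;
};

// Assumption: negative values (the documented marker is -1) mean "no estimation".
inline bool hasValue(double v) { return v >= 0.0; }

// Returns the names of the estimations that are present in the structure.
std::vector<std::string> availableEstimations(const TSSA_AQuA_Results& r) {
    std::vector<std::string> names;
    if (hasValue(r.dPercent)) names.push_back("percent");
    if (hasValue(r.dMOSLike)) names.push_back("MOS-like");
    if (hasValue(r.dPESQLike)) names.push_back("PESQ-like");
    return names;
}
```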

Audio quality estimation functions
Audio quality estimation is provided by three functions that create, release and run the quality
analyzer.

SSA_SDK_API void * SSA_CreateAudioQualityAnalyzer(void);
- creates an audio analyzer. On success the function returns the analyzer’s handle; on failure it returns NULL.

SSA_SDK_API void SSA_ReleaseAudioQualityAnalyzer(void * anSSA_ID);
- finishes work with the analyzer and deletes all variables used during its work.

SSA_SDK_API int SSA_OnTestAudioFiles(void * anSSA_ID);
- estimates the quality of the audio files that were passed to the analyzer by the setting functions.
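
Because every created analyzer has to be released, the handle can be wrapped in a smart pointer. This is a sketch under assumptions, not part of the library: AnalyzerHandle and makeAnalyzer are hypothetical names, and the two callables stand in for SSA_CreateAudioQualityAnalyzer and SSA_ReleaseAudioQualityAnalyzer.

```cpp
#include <functional>
#include <memory>
#include <utility>

// Owning handle: the deleter runs only when the handle is not NULL.
using AnalyzerHandle = std::unique_ptr<void, std::function<void(void*)>>;

// create:  callable returning void* (NULL on failure)
// release: callable taking the handle that create returned
template <typename Create>
AnalyzerHandle makeAnalyzer(Create&& create, std::function<void(void*)> release) {
    return AnalyzerHandle(create(), std::move(release));
}
```

With the library header available, usage would look like `AnalyzerHandle h = makeAnalyzer(SSA_CreateAudioQualityAnalyzer, SSA_ReleaseAudioQualityAnalyzer);`, after which `h.get()` is passed wherever an anSSA_ID argument is expected.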

Settings of audio quality algorithm
All analyzer parameters are set through a single function:

SSA_SDK_API bool SSA_SetAny(void * anSSA_ID,
    wchar_t * aPParName, void * aPParValue);

There are three input parameters:
anSSA_ID – identifier of the analyzer;
aPParName – parameter name;
aPParValue – pointer to the value assigned to the parameter.

The table below lists the parameters and, where applicable, the range of their values.

Name                      Type      Range / unit
SrcAudioFileName          wchar_t
TstAudioFileName          wchar_t
FaultsReportFileName      wchar_t
CoefficientsType          int       0, 1, 2
EnergyNormalizationFlag   bool
NumberOfLinkPoints        int       1..10
EnvelopeSmoothingLevel    int       1..10
OutputEstimations         wchar_t   %, m, p
QualityMode               int       0, 1
SAPrecisionDegree         int       8..16
DeltaCorrectionFlag       bool
IntegrationMode           int       0, 1, 2
MusicalPriority           bool
TstStartDelay             long      ms
BOFBandwidth              double    Hz
EOFBandwidth              double    Hz
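
The integer ranges in the table can be checked on the caller's side before a value is passed to the library. The sketch below copies the ranges from the table; the helper names are hypothetical, and whether the library itself rejects out-of-range values is not stated in this document.

```cpp
#include <map>
#include <string>

struct IntRange {
    int min;
    int max;
};

// Ranges copied from the parameter table; parameters without a listed
// numeric range are intentionally absent.
const std::map<std::wstring, IntRange>& intParamRanges() {
    static const std::map<std::wstring, IntRange> ranges = {
        {L"CoefficientsType", {0, 2}},
        {L"NumberOfLinkPoints", {1, 10}},
        {L"EnvelopeSmoothingLevel", {1, 10}},
        {L"QualityMode", {0, 1}},
        {L"SAPrecisionDegree", {8, 16}},
        {L"IntegrationMode", {0, 2}},
    };
    return ranges;
}

// True only for a known ranged parameter whose value lies inside its range.
bool isIntParamValueValid(const std::wstring& name, int value) {
    const auto it = intParamRanges().find(name);
    if (it == intParamRanges().end()) return false;
    return value >= it->second.min && value <= it->second.max;
}
```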

L"SrcAudioFileName"
- name of the file containing the original audio;
L"TstAudioFileName"
- name of the file under test (degraded);
L"FaultsReportFileName"
- name of the file in which to store the reasons for audio quality loss;
L"CoefficientsType"
- type of weight coefficients for frequency groups. These coefficients control the contribution of
different frequency bands to the overall audio signal quality. There are three types of weight coefficients:
o 0 (uniform) – all frequency bands contribute equally
o 1 (linear) – the contribution of a frequency band is inversely proportional to its energy
o 2 (logarithmic) – the contribution of a frequency band is inversely proportional to its loudness
We recommend linear or logarithmic coefficients when signal quality in the high frequency bands is
especially important.

L"EnergyNormalizationFlag"
- energy normalization flag. Normalizing energy may be useful if it is known in advance that signal
processing changes the amplitude of the signal under test uniformly. In other cases an attempt to
normalize energy may make the comparison procedure behave unstably.

L"NumberOfLinkPoints"
- number of link points. If the original signal and the signal under test contain reasonably long pauses,
the compared files can be virtually split into parts. In that case a quality estimation is obtained
for each pair of virtual files, and these estimations are then processed further to obtain the overall
quality score. An option (mode auto) allows the number of virtual files to be detected automatically.
Set the number of link points manually with great care, because a wrong value may desynchronize
the original signal and the signal under test.

L"EnvelopeSmoothingLevel"
- envelope smoothing level. This option controls how smoothly the audio activity detector
(or voice activity detector, VAD) works. The higher the value of this parameter, the smoother
the transition from one detector state to another.

L"OutputEstimations"
- list of quality estimation values to output. The software can return three values as the audio
quality estimation:
o % – audio quality estimation in percent
o m – MOS-like estimation
o p – PESQ-like estimation
Obtaining the MOS-like and PESQ-like estimations does not require any additional signal processing.

L"QualityMode"
- mode of audio quality measurement. The software can produce two types of quality measurement:
o 0 (quality) – typical audio quality estimation
o 1 (naturalness) – detection of how natural the audio sounds.
For most cases the first type of estimation (0, quality) is optimal; testing how natural the audio sounds
is an experimental estimation of audio quality.

L"SAPrecisionDegree"
- precision degree of the spectrum analyzer. This parameter controls the trade-off between the speed
and the precision of the quality measurement. There is an option (mode auto) to choose the precision
degree automatically from the sampling frequency, which gives the optimal ratio between speed and
accuracy.

L"DeltaCorrectionFlag"
- delta correction flag. Turning delta correction on takes into account additional factors that may
cause quality loss in the audio signal. When it is enabled, the quality score is lower if these factors
are present in the audio.

L"IntegrationMode"
- sets the working mode of the software integrator. There are three modes of integration:
o 0 (linear)
o 1 (log)
o 2 (10log)
The integration mode controls how the quality estimation algorithm works. The linear mode of integration
is the most “sensitive”.

L"MusicalPriority"
- sets the type of the compared signals. When enabled, the software treats the input signals as music;
when disabled, as speech.

L"TstStartDelay"
- time shift from the beginning of the signal in milliseconds. This option lets the user exclude the
starting fragment of the test file from analysis and so tune the algorithm to the task at hand.

L"BOFBandwidth", L"EOFBandwidth"
- beginning and end of the bandwidth under test. This lets the user specify the frequency bands to
analyze and tune the quality estimation algorithm to the task at hand.
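
Since SSA_SetAny takes every value as void *, thin typed wrappers can make call sites harder to get wrong. The sample program at the end of this document passes strings directly and numbers and flags by address, and the wrappers below follow that convention. They are a sketch: the function names are hypothetical, the setter is passed in as a callable with the SSA_SetAny signature, and it is an assumption that the library neither modifies the arguments nor keeps the pointers after the call returns.

```cpp
#include <string>

// setAny: callable with the shape bool(void*, wchar_t*, void*), e.g. SSA_SetAny.

// Integer parameters are passed by address of a temporary copy.
template <typename SetAny>
bool setIntParam(SetAny&& setAny, void* analyzer, const wchar_t* name, int value) {
    return setAny(analyzer, const_cast<wchar_t*>(name), static_cast<void*>(&value));
}

// Boolean flags are passed the same way.
template <typename SetAny>
bool setBoolParam(SetAny&& setAny, void* analyzer, const wchar_t* name, bool value) {
    return setAny(analyzer, const_cast<wchar_t*>(name), static_cast<void*>(&value));
}

// String parameters are passed as a pointer to the first character.
template <typename SetAny>
bool setStringParam(SetAny&& setAny, void* analyzer, const wchar_t* name,
                    const std::wstring& value) {
    return setAny(analyzer, const_cast<wchar_t*>(name),
                  const_cast<wchar_t*>(value.c_str()));
}
```

With the library header available, a call would look like `setIntParam(SSA_SetAny, id, L"SAPrecisionDegree", 8);`.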

Checking file formats
File format checking is provided by the following two functions:

SSA_SDK_API bool SSA_IsFileFormatSupportable(void * anSSA_ID, wchar_t * aPFName);
- checks whether the format of the file is supported by the library. If the format is supported the
function returns “true”, otherwise it returns “false”.

SSA_SDK_API bool SSA_AreFilesComparable(void * anSSA_ID,
wchar_t * aPSrcFName, wchar_t * aPTstFName);
- checks whether the files can be compared. If the file formats are supported and comparison is supported
for these formats, the function returns “true”; otherwise it returns “false”.

Displaying audio quality testing results
SSA_SDK_API int SSA_GetQualityStringSize(void * anSSA_ID);
- returns the length of the string containing the test results as text.

SSA_SDK_API int SSA_FillQualityString(void * anSSA_ID, wchar_t * aPString);
- fills the string with the text of the test results. The caller must allocate memory for the string; the
required amount can be obtained with SSA_GetQualityStringSize.

SSA_SDK_API int SSA_GetSrcSignalSpecSize(void * anSSA_ID);
- returns the size of the array for the integral energy spectrum of the original signal. Note that the
signal spectrum is available only after quality estimation has been performed and only in the mode
“QualityMode” = 0. If the signal spectrum was not calculated the function returns 0; in case of an error
it returns -1.

SSA_SDK_API int SSA_GetTstSignalSpecSize(void * anSSA_ID);
- returns the size of the array for the integral energy spectrum of the signal under test. Note that the
signal spectrum is available only after quality estimation has been performed and only in the mode
“QualityMode” = 0. If the signal spectrum was not calculated the function returns 0; in case of an error
it returns -1.

SSA_SDK_API int SSA_FillSrcSignalSpecArray(void * anSSA_ID, float * aPSpecArray);
- fills the array with the integral energy spectrum of the original signal. Note that the signal spectrum
is available only after quality estimation has been performed and only in the mode “QualityMode” = 0.
If the signal spectrum was not calculated the function returns 0; in case of an error it returns -1.

SSA_SDK_API int SSA_FillTstSignalSpecArray(void * anSSA_ID, float * aPSpecArray);
- fills the array with the integral energy spectrum of the signal under test. Note that the signal spectrum
is available only after quality estimation has been performed and only in the mode “QualityMode” = 0.
If the signal spectrum was not calculated the function returns 0; in case of an error it returns -1.
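
The return codes described above (0 when the spectrum was not calculated, -1 on error) can be folded into one small helper so that no array is ever allocated with an invalid size. The sketch below is illustrative: the names SpectrumStatus, SpectrumResult and fetchSpectrum are hypothetical, and the size and fill callables stand in for the matching SSA_Get...SpecSize and SSA_Fill...SpecArray functions.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

enum class SpectrumStatus { Ok, NotCalculated, Error };

struct SpectrumResult {
    SpectrumStatus status;
    std::vector<float> values;
};

// sizeFn: callable returning int (array size, 0 or -1)
// fillFn: callable taking float* and returning int (0 or -1 on failure)
template <typename SizeFn, typename FillFn>
SpectrumResult fetchSpectrum(SizeFn&& sizeFn, FillFn&& fillFn) {
    const int size = sizeFn();
    if (size < 0) return {SpectrumStatus::Error, {}};
    if (size == 0) return {SpectrumStatus::NotCalculated, {}};

    std::vector<float> values(static_cast<std::size_t>(size), 0.0f);
    const int rc = fillFn(values.data());
    if (rc < 0) return {SpectrumStatus::Error, {}};
    if (rc == 0) return {SpectrumStatus::NotCalculated, {}};
    return {SpectrumStatus::Ok, std::move(values)};
}
```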

SSA_SDK_API int SSA_GetFaultsAnalysisStringSize(void * anSSA_ID);
- returns the size of the string containing the reasons for quality loss. The size does not include the
terminating 0 character.

SSA_SDK_API int SSA_FillFaultsAnalysisString(void * anSSA_ID, wchar_t * aPString);
- fills the string with the reasons for audio quality loss. The string aPString contains only meaningful
characters and is not terminated with a 0 character, so the caller must terminate it.
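
Because the filled string carries no terminating 0, the caller's buffer has to provide one. A minimal sketch of the size-then-fill pattern follows; the name fetchString is hypothetical, the callables stand in for a size/fill pair such as SSA_GetFaultsAnalysisStringSize and SSA_FillFaultsAnalysisString, and it assumes the reported size is a count of characters rather than bytes.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// sizeFn: callable returning the string length as int
// fillFn: callable taking wchar_t* and writing the characters into it
template <typename SizeFn, typename FillFn>
std::wstring fetchString(SizeFn&& sizeFn, FillFn&& fillFn) {
    const int length = sizeFn();
    if (length <= 0) return std::wstring();

    // One extra zero-initialized slot guarantees termination even when
    // the fill function writes no terminating 0 itself.
    std::vector<wchar_t> buffer(static_cast<std::size_t>(length) + 1, L'\0');
    fillFn(buffer.data());
    return std::wstring(buffer.data());
}
```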

Analysis of possible reasons for audio quality loss
Besides the audio quality score, AQuA can analyze and determine the possible reasons for audio signal
degradation. The software automatically prepares analysis results, which can be returned as a string or
stored in a log file, depending on the chosen option.

The additional audio quality metrics returned by the system may not be trivial to understand; this chapter
describes the main principles on which these metrics are built and how to interpret them.

AQuA returns additional metrics only when they are outside the range of their “typical values”. If all
metrics are within range, the system returns “Cannot determine the major reason for audio quality loss”.

Duration distortion
This metric represents the continuity of the compared audio files. Ideally the amount of audio data in the
original signal and in the file under test is the same. During audio processing or transfer over
communication channels, audio fragments may be lost from or inserted into the audio. If such degradation
took place, the value of this metric differs from 100%. The bigger the difference, the stronger the
degradation; however, this metric does not take possible starting pauses into account.
When the value is less than 100%, audio data was lost and the analysis result will be:
    Audio shrinking corresponds to XX.XX percent.
where XX.XX is the deviation from 100%.

When the value is more than 100%, data was inserted and the analysis result will be:
    Audio stretching corresponds to XX.XX percent.
where XX.XX is the deviation from 100%.

Tolerance range for this value is 100% ± 1%.
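
To make the rule concrete, the sketch below reproduces the reporting logic described above for a given metric value. It is an illustration of the documented rule, not the library's implementation: the function name is hypothetical, and treating values exactly on the tolerance boundary as "within tolerance" is an assumption.

```cpp
#include <cmath>
#include <cstdio>
#include <string>

// Returns the report line for a duration metric value given in percent,
// or an empty string when the value is inside the tolerance range.
std::string describeDuration(double valuePercent, double tolerancePercent = 1.0) {
    const double deviation = std::fabs(valuePercent - 100.0);
    if (deviation <= tolerancePercent) return std::string();

    char text[96];
    std::snprintf(text, sizeof(text), "Audio %s corresponds to %.2f percent.",
                  valuePercent < 100.0 ? "shrinking" : "stretching", deviation);
    return text;
}
```

For example, a value of 98.5 is reported as shrinking by 1.50 percent, while 100.5 falls inside the tolerance range and produces no report.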

Delay/Advancing of audio signal activity
This metric represents the shift of the signal in the test file relative to the original. It determines by
how much the active level of the test signal is delayed or advanced relative to the active level of the
reference (original) signal. When the signal is delayed, the analysis returns:
    Signal delayed by XX.XX ms.
where XX.XX is the delay time in milliseconds. Correspondingly, when the signal advances the original, the
returned string is:
    Signal advances the original by -XX.XX ms.
where XX.XX is the advance time.

Tolerance range for this value is ±50 ms.
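
The same kind of sketch works for the shift metric. The sign convention used here (positive for a delayed test signal, negative for an advanced one) is an assumption made for the example, as is the hypothetical function name; the message texts and the ±50 ms tolerance come from the description above.

```cpp
#include <cmath>
#include <cstdio>
#include <string>

// shiftMs > 0: test signal is delayed; shiftMs < 0: test signal advances.
// Returns an empty string when the shift is inside the tolerance range.
std::string describeShift(double shiftMs, double toleranceMs = 50.0) {
    if (std::fabs(shiftMs) <= toleranceMs) return std::string();

    char text[96];
    if (shiftMs > 0.0) {
        std::snprintf(text, sizeof(text), "Signal delayed by %.2f ms.", shiftMs);
    } else {
        std::snprintf(text, sizeof(text),
                      "Signal advances the original by -%.2f ms.", -shiftMs);
    }
    return text;
}
```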

Audio signal activity mistiming
This metric represents desynchronization of the active levels in the reference signal and the signal under
test. The original (reference) audio signal and the test signal are merged to determine the characteristics
of audio activity, and whenever the current characteristics of audio activity do not match, the system
increments a desynchronization counter. After processing, the final value is presented as the percentage
of cases in which desynchronization was detected.
If the metric value is not zero, the analysis result presents it as:
    Audio signal activity mistiming (unsynchronization) is XX.XX percent.
where XX.XX is the percentage of desynchronization. The value is ignored if it is less than 1%.

Corrupted signal spectrum
This is a set of metrics reflecting differences between the integral energy spectra of the original
signal and the audio under test. If the overall difference between the spectra is more than 15%, the
analysis returns the following string:
    Corrupted signal spectrum.
If the difference between the spectra is multidirectional (goes into both the positive and the negative
zone), the analysis returns the following string:
    Vibration along the whole spectrum [-XX.XX, YY.YY] %
where XX.XX and YY.YY are the deviations into the negative and positive zones respectively. The tolerance
range of the deviation is ±5%.
If the spectrum distortions are unidirectional (only negative or only positive), the analysis returns
    Amplification approaches YY.YY %
when the distortions are positive, or
    Attenuation approaches XX.XX %
when the distortions are negative.
The other metrics returned by the analysis correspond to distortions in different frequency groups.
The frequency bands are analyzed in the same manner as the whole spectrum. The frequency bands in
question are:
    Low frequencies – below 1000 Hz
    Medium frequencies – from 1000 Hz to 3000 Hz
    High frequencies – above 3000 Hz
Frequency bands use different tolerance ranges. Distortion is reported when it is greater than 5% in
low frequencies, 10% in medium frequencies and 30% in high frequencies.
Multidirectional spectrum changes (vibration) are reported when they are greater than 2.5% in low
frequencies, 7% in medium frequencies and 15% in high frequencies.
Unidirectional distortions (positive or negative) are reported when they are greater than
5% in low frequencies, 10% in medium frequencies and 25% in high frequencies.
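
The band limits and per-band thresholds listed above fit in a small lookup, which can help when interpreting the report. The sketch below only encodes the numbers from this chapter; the type and function names are hypothetical, and placing exactly 3000 Hz in the medium band follows the wording "from 1000 Hz to 3000 Hz".

```cpp
#include <cmath>

enum class Band { Low, Medium, High };

// Reporting thresholds for one band, all in percent.
struct BandThresholds {
    double distortion;      // overall distortion in the band
    double vibration;       // multidirectional change
    double unidirectional;  // amplification or attenuation
};

// Low: below 1000 Hz, Medium: 1000..3000 Hz, High: above 3000 Hz.
Band bandOf(double frequencyHz) {
    if (frequencyHz < 1000.0) return Band::Low;
    if (frequencyHz <= 3000.0) return Band::Medium;
    return Band::High;
}

BandThresholds thresholdsFor(Band band) {
    switch (band) {
        case Band::Low:    return {5.0, 2.5, 5.0};
        case Band::Medium: return {10.0, 7.0, 10.0};
        case Band::High:   return {30.0, 15.0, 25.0};
    }
    return {0.0, 0.0, 0.0};  // not reached for valid enum values
}

// A deviation is reported only when its magnitude exceeds the threshold.
bool exceeds(double deviationPercent, double thresholdPercent) {
    return std::fabs(deviationPercent) > thresholdPercent;
}
```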

AQuA software benefits
Among the benefits of AQuA:

- AQuA is suitable for developing server solutions and has no “per channel” limitations
- the AQuA license carries no annual royalty fee
- AQuA is suitable for both 32-bit and 64-bit systems
- AQuA is easy to deploy and to use in software product development
- AQuA provides perceptual estimation of audio quality and can be used in VoIP, PSTN, ISDN,
GSM and CDMA networks and combinations of those

Simple example of using AQuA DLL
#include "stdafx.h"
#include <stdio.h>

#include "AQuAdll.h"

int wmain(int argc, wchar_t* argv[])
{
    if (argc < 3)
    {
        printf("usage\nVQDLLTest <srcfilename> <codedfilename>");
        return 0;
    }

    if (SSA_InitLib())
    {
        // Library information
        TSSA_AQuA_Info * iPAQuAInfo = SSA_GetPAQuAInfo();
        wprintf(L"CopyrightString = \n%s\n", iPAQuAInfo->dCopyrightString);
        wprintf(L"VersionString = \n%s\n", iPAQuAInfo->dVersionString);

        void * iAnalyzerID = SSA_CreateAudioQualityAnalyzer();

        if (iAnalyzerID)
        {
            wprintf(L"SSA_IsFileFormatSupportable() = %i\n",
                (int)SSA_IsFileFormatSupportable(iAnalyzerID, argv[1]));
            wprintf(L"SSA_AreFilesComparable() = %i\n",
                (int)SSA_AreFilesComparable(iAnalyzerID, argv[1], argv[2]));

            // Analyzer settings: strings are passed directly, numbers and flags by address
            int iTmpI;
            bool iTmpB;
            iTmpI = 0;
            SSA_SetAny(iAnalyzerID, L"SrcAudioFileName", argv[1]);
            SSA_SetAny(iAnalyzerID, L"TstAudioFileName", argv[2]);
            iTmpB = false;
            SSA_SetAny(iAnalyzerID, L"EnergyNormalizationFlag", &iTmpB);
            iTmpI = 5;
            SSA_SetAny(iAnalyzerID, L"NumberOfLinkPoints", &iTmpI);
            iTmpI = 5;
            SSA_SetAny(iAnalyzerID, L"EnvelopeSmoothingLevel", &iTmpI);
            SSA_SetAny(iAnalyzerID, L"OutputEstimations", L"%%mp");
            SSA_SetAny(iAnalyzerID, L"FaultsReportFileName", L".report.txt");

            iTmpI = 8;
            SSA_SetAny(iAnalyzerID, L"SAPrecisionDegree", &iTmpI);

            if (SSA_OnTestAudioFiles(iAnalyzerID) == 0)
            {
                printf("SSA_OnTestAudioFiles() --> Ok!\n");

                // Test results as text; the buffer is zero-initialized so the string is always terminated
                int iResLen = SSA_GetQualityStringSize(iAnalyzerID);
                wchar_t * iResStr = new wchar_t[iResLen + 10]();

                SSA_FillQualityString(iAnalyzerID, iResStr);
                wprintf(L"iResStr = \n%s\n", iResStr);
                delete[] iResStr;

                // Spectrum of the original signal (size 0: not calculated, -1: error)
                iTmpI = SSA_GetSrcSignalSpecSize(iAnalyzerID);
                wprintf(L"SrcSpecSize = %i\n", iTmpI);
                if (iTmpI > 0)
                {
                    float * iPSpecArr = new float[iTmpI];
                    SSA_FillSrcSignalSpecArray(iAnalyzerID, iPSpecArr);
                    for (int i = 0; (i < 16) && (i < iTmpI); i++) wprintf(L"%f\t", iPSpecArr[i]);
                    wprintf(L"\n");
                    delete[] iPSpecArr;
                }

                // Spectrum of the signal under test
                iTmpI = SSA_GetTstSignalSpecSize(iAnalyzerID);
                wprintf(L"TstSpecSize = %i\n", iTmpI);
                if (iTmpI > 0)
                {
                    float * iPSpecArr = new float[iTmpI];
                    SSA_FillTstSignalSpecArray(iAnalyzerID, iPSpecArr);
                    for (int i = 0; (i < 16) && (i < iTmpI); i++) wprintf(L"%f\t", iPSpecArr[i]);
                    wprintf(L"\n");
                    delete[] iPSpecArr;
                }

                // Numeric quality estimations
                TSSA_AQuA_Results iResults;
                SSA_FillQualityResultsStruct(iAnalyzerID, &iResults);
                wprintf(L"dPercent = %f\n", iResults.dPercent);
                wprintf(L"dMOSLike = %f\n", iResults.dMOSLike);
                wprintf(L"dPESQLike = %f\n", iResults.dPESQLike);

                // Reasons for quality loss; the library writes no terminating 0,
                // so the buffer is zero-initialized here
                iResLen = SSA_GetFaultsAnalysisStringSize(iAnalyzerID);
                iResStr = new wchar_t[iResLen + 10]();

                SSA_FillFaultsAnalysisString(iAnalyzerID, iResStr);
                wprintf(L"iResStr = \n%s\n", iResStr);
                delete[] iResStr;
            }
            else
            {
                printf("SSA_OnTestAudioFiles() --> failed!\n");
            }

            SSA_ReleaseAudioQualityAnalyzer(iAnalyzerID);
        }

        SSA_ReleaseLib();
    }

    return(0);
}
