This repo is no longer in active development; please use the audiostack SDK to continue using api.audio.
apiaudio is the official api.audio Python 3 SDK. This SDK provides easy access to the api.audio API for applications written in python.
This repository is actively maintained by Aflorithmic Labs. For examples, recipes and api reference see the api.audio docs. Feel free to get in touch with any questions or feedback!
You can view our updated Changelog here.
Get started with our quickstart recipes.
You don't need this source code unless you want to modify it. If you want to use the package, just run:
pip install apiaudio -U
#or
pip3 install apiaudio -U
Install from source with:
python setup.py install
#or
python3 setup.py install
Python 3.6+
Create a file hello.py
touch hello.py
This library needs to be configured with your account's API key, which is available in your api.audio Console. Import the apiaudio package and set apiaudio.api_key with the API key you got from the console:
import apiaudio
apiaudio.api_key = "your-key"
Let's create our first audio asset.
✍️ Create a new script, our scriptText
will be the text that is later synthesized.
script = apiaudio.Script.create(scriptText="Hello world")
print(script)
🎤 Render the scriptText that was created in the previous step. Let's use the voice Aria.
response = apiaudio.Speech.create(scriptId=script["scriptId"], voice="Aria")
print(response)
🎧 Now let's join the speech we just created with a sound template.
response = apiaudio.Mastering.create(
    scriptId=script.get("scriptId"),
    soundTemplate="jakarta"
)
print(response)
Download the final audio asset to your current working directory:
filepath = apiaudio.Mastering.download(scriptId=script["scriptId"], destination=".")
print(filepath)
Easy right? 🔮 This is the final hello.py file.
import apiaudio
apiaudio.api_key = "your-key"
# script creation
script = apiaudio.Script.create(scriptText="Hello world")
# speech creation
response = apiaudio.Speech.create(scriptId=script["scriptId"], voice="Aria")
print(response)
# mastering process
response = apiaudio.Mastering.create(
    scriptId=script.get("scriptId"),
    soundTemplate="jakarta"
)
print(response)
# download
filepath = apiaudio.Mastering.download(scriptId=script["scriptId"], destination=".")
print(filepath)
Now let's run the code:
python hello.py
#or
python3 hello.py
Once this has completed, find the downloaded audio asset and play it! 🔉 🔉 🔉
import apiaudio
The library needs to be configured with your account's secret key, which is available in your Aflorithmic Dashboard. Set apiaudio.api_key with the API key you got from the dashboard:
apiaudio.api_key = "your-key"
You can also authenticate using the apiaudio_key environment variable, and the apiaudio SDK will automatically use it. To set it up, open the terminal and type:
export apiaudio_key=<your-key>
If you provide both the apiaudio_key environment variable and an apiaudio.api_key value, the apiaudio.api_key value will be used instead.
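This precedence can be sketched as follows (resolve_api_key is a hypothetical helper for illustration; the SDK applies this rule internally):

```python
import os

def resolve_api_key(explicit_key=None):
    """Return the API key, preferring an explicitly set key over the
    apiaudio_key environment variable (mirrors the SDK's precedence)."""
    if explicit_key is not None:
        return explicit_key
    return os.environ.get("apiaudio_key")

# The environment variable is used as a fallback...
os.environ["apiaudio_key"] = "env-key"
print(resolve_api_key())            # env-key
# ...but an explicit apiaudio.api_key-style value wins.
print(resolve_api_key("your-key"))  # your-key
```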
In order to control a child organization of yours, use the following method to assume that organization ID. Set your child organization ID to None to stop assuming an organization; subsequent calls to the API will use your own organization ID.
import apiaudio
apiaudio.set_assume_org_id('child_org_id')
# Stop using
apiaudio.set_assume_org_id(None)
See the organization resource for more operations you can perform on your organization.
There are two approaches to use the resources.
The recommended approach is to import all resources directly from apiaudio:
import apiaudio
apiaudio.Script.create()
Alternatively, you can import the resource classes you want to use first, and then use the resource methods. For example, to use Script, we could do:
from apiaudio import Script
Script.create()
The same logic applies for other resources (Speech, Voice, Sound, ...).
The Organization resource/class allows you to perform some data retrieval about your organization and your child organizations.
Organization methods are:
- get_org_data() - Get organization data, including orgId, orgName, etc.
  - Parameters: none.
  - Example:

    org_data = apiaudio.Organization.get_org_data()

- list_child_orgs() - List your child organizations.
  - Parameters: none.
  - Example:

    child_orgs = apiaudio.Organization.list_child_orgs()

- get_secrets() - Get your API key, webhook URL and webhook secret.
  - Parameters: none.
  - Example:

    secrets = apiaudio.Organization.get_secrets()
The Script resource/class allows you to create, retrieve and list scripts. Learn more about scripts here.
Script methods are:
- create() - Create a new script.
  - Parameters:
    - scriptText * [Required] (string) - Text for your script. A script can contain multiple sections and SSML tags. Learn more about scriptText details here.
    - projectName (string) - The name of your project. Default value is "default" (max 60 characters).
    - moduleName (string) - The name of your module. Default value is "default" (max 60 characters).
    - scriptName (string) - The name of your script. Default value is "default" (max 60 characters).
    - scriptId (string) - Custom identifier for your script. If the scriptId parameter is provided, then projectName, moduleName and scriptName are set to the same value as scriptId.
    - versions (dictionary) - A dictionary containing different versions of your script text, whereby the key is the version name and its value is the associated scriptText. Version name v0 is reserved as the default scriptText. Default value is "{}".
    - metadata (dictionary) - Metadata for your script. This is limited to 2kb in size.
  - Example:

    text = """
    <<sectionName::hello>> Hello {{username|buddy}}
    <<sectionName::bye>> Good bye from {{location|barcelona}}
    """
    script = apiaudio.Script.create(
        scriptText=text,
        projectName="myProject",
        moduleName="myModule",
        scriptName="myScript",
        scriptId="id-1234",
        metadata={"author": "sam", "tags": ["demo", "intro"]}
    )

    # example 2 with versions
    script = apiaudio.Script.create(
        scriptText="Default text",
        versions={"es": "Hola", "en": "hello"}
    )
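To see what the fallback track would contain, the {{parameter|fallback}} placeholders can be resolved locally. A minimal sketch (the fill_placeholders helper is hypothetical and not part of the SDK, which performs this substitution server-side):

```python
import re

def fill_placeholders(script_text, audience=None):
    """Locally mimic how {{param|fallback}} placeholders resolve:
    use the audience value when present, else the fallback after '|'."""
    audience = audience or {}

    def repl(match):
        name, _, fallback = match.group(1).partition("|")
        return str(audience.get(name, fallback))

    return re.sub(r"\{\{(.*?)\}\}", repl, script_text)

text = "Hello {{username|buddy}}, good bye from {{location|barcelona}}"
print(fill_placeholders(text, {"username": "sam"}))
# -> Hello sam, good bye from barcelona
print(fill_placeholders(text))
# -> Hello buddy, good bye from barcelona
```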
- retrieve() - Retrieve a script by ID.
  - Parameters:
    - scriptId * [Required] (string) - The script ID you want to retrieve. By default retrieves the main version (v0).
    - version (string) - The script version you want to retrieve.
  - Example:

    script = apiaudio.Script.retrieve(scriptId="id-1234", version="abc")
- preview() - Return a script with the dictionary highlighting applied. See Lexi for more examples of how to use the dictionary feature.
  - Parameters:
    - scriptId * [Required] (string) - The script ID you want to use.
    - voice * [Required] (string) - The voice that will be used to render speech. This is required as the output can depend on the voice, language code, or provider.
  - Example:

    text = """
    The author of this repo has lived in two places in the UK, <!location>Bude<!> and <!location>Bristol<!>.
    """
    r = apiaudio.Script.create(scriptText=text)
    scriptId = r["scriptId"]
    preview = apiaudio.Script.preview(scriptId=scriptId, language="en-gb")
- list() - List all scripts available in your organization. This method supports filtering.
  - Parameters:
    - projectName (string) - Return any scripts with this projectName.
    - moduleName (string) - Return any scripts with this moduleName; note projectName also needs to be supplied.
    - scriptName (string) - Return any scripts with this scriptName; note both projectName and moduleName need to be supplied.
    - scriptId (string) - Return any scripts with this scriptId.
    - verbose (bool) - List scripts in verbose mode (True by default). Set this to False to return only the projectName, moduleName, scriptName and scriptId fields.
  - Example:

    scripts = apiaudio.Script.list()
- delete() - Deletes a script. By default this will delete all versions of the script.
  - Parameters:
    - scriptId * [Required] (string) - The ID of the script to be deleted.
    - version (string) - Delete a specific version.
  - Example:

    # deletes version 'en' from scriptId 'myworkout'
    scripts = apiaudio.Script.delete(scriptId="myworkout", version="en")
- delete_multiple() - Delete multiple scripts.
  - Parameters:
    - projectName * [Required] (string) - Deletes all scripts within this projectName.
    - moduleName (string) - Deletes all scripts within this moduleName; note projectName also needs to be supplied.
    - scriptName (string) - Deletes all scripts within this scriptName; note both projectName and moduleName need to be supplied.
  - Example:

    # deletes all scripts within the project 'awesome_demos'
    scripts = apiaudio.Script.delete_multiple(projectName="awesome_demos")
- get_random_text() - Retrieve random text from a list of categories.
  - Parameters:
    - category (string) - The category from which the random text is retrieved. If no category is specified, the function defaults to "FunFact".
  - Example:

    text = apiaudio.Script.get_random_text(category="BibleVerse")

  - Categories currently available: "BibleVerse", "FunFact", "InspirationalQuote", "Joke", "MovieSynopsis", "Poem", "PhilosophicalQuestion", "Recipe", "TriviaQuestion".
Lists the directory structure of your scripts, i.e. projectName/moduleName/scriptName
Script.Directory methods are:
- list_projects() - Lists all projects.
  - Parameters: none.
  - Example:

    # Lists all project names
    projects = apiaudio.Script.Directory.list_projects()
    print(projects)
    # example output: ["projectX", "projectY"]
- list_modules() - Lists all modules within a projectPrefix.
  - Parameters:
    - projectPrefix * [Required] - Will list modules starting with this projectPrefix.
  - Example:

    # Lists all module names
    modules = apiaudio.Script.Directory.list_modules(projectPrefix="workout")
    print(modules)
    # example output: ["workout_1/over60s", "workout_2/morning_routine"]
- list_script_names() - Lists all script names within a projectPrefix and modulePrefix.
  - Parameters:
    - projectPrefix * [Required] - Will list scriptNames starting with this projectPrefix.
    - modulePrefix * [Required] - Will list scriptNames starting with this modulePrefix.
  - Example:

    # Lists all script names
    scriptNames = apiaudio.Script.Directory.list_script_names(projectPrefix="workout_1", modulePrefix="over60s")
    print(scriptNames)
    # example output: ["workout_1/over60s/routine_1", "workout_1/over60s/routine_2", "workout_1/over60s/routine_3"]
Speech allows you to do Text-To-Speech (TTS) with our API using all the voices available. Use it to create a speech audio file from your script.
Speech methods are:
- create() - Send a Text-To-Speech request to our Text-To-Speech service.
  - Parameters:
    - scriptId * [Required] (string) - The script ID.
    - version (string) - The version of the script to be produced. Default is "".
    - voice (string) - Voice name. See the list of available voices using the Voice resource. Default voice is "Joanna".
    - speed (string) - Voice speed. Default speed is 100.
    - effect (string) - Put a funny effect in your voice. You can try the following ones: dark_father, chewie, 88b, 2r2d.
    - silencePadding (integer) - Add a silence padding to your speech tracks (in milliseconds). Default is 0 (no padding).
    - audience (dict) - Specify the values of parameters in your script. For instance, if in the script resource you have scriptText="Hello {{name}} {{lastname}}, welcome to {{location}}", the audience should be: {"name": "Elon", "lastname": "Musk", "location": "Istanbul"}. If not provided, the fallback track will be created.
    - sync (boolean) - Allow sync or async speech creation. Default is True. If sync=False, the speech create call will return a success message when the speech creation is triggered. To retrieve the files, check the Speech.retrieve() method.
    - sections (dict) - Specify parameters for specific sections in the script. The key is a section name, and the value is another dictionary with the section configuration (valid parameters are: voice, speed, effect, silence_padding). If a section is not found here, the section will automatically inherit the voice, speed, effect and silence_padding values you defined above (or the default ones if you don't provide them). An example with 2 sections and different configuration parameters:

      sections={
          "firstsection": {
              "voice": "Matthew",
              "speed": 110,
              "silence_padding": 100
          },
          "anothersection": {
              "voice": "en-GB-RyanNeural",
              "speed": 100
          }
      }

    - useDictionary (bool) - Applies the pronunciation dictionary to the script text.
    - useTextNormalizer (bool) - Applies text normalization, which can help resolve grammatical errors with TTS pronunciations, for example 11:12 Uhr -> 11 Uhr 12. Note - this only works for German voices at present.
  - Simple example:

    response = apiaudio.Speech.create(
        scriptId="id-1234",
        voice="Joanna"
    )

  - Complete example:

    response = apiaudio.Speech.create(
        scriptId="id-1234",
        version="abc",
        voice="Matthew",
        speed=100,
        effect="dark_father",
        silencePadding=1000,
        sync=True,
        audience={"username": "Elon", "lastname": "Musk"},
        sections={
            "firstsection": {
                "voice": "Matthew",
                "speed": 110,
                "silence_padding": 100
            },
            "anothersection": {
                "voice": "Liam"
            }
        }
    )
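The inheritance rule for the sections parameter (section settings override the top-level voice, speed, effect and silence_padding) can be sketched locally. effective_section_config is a hypothetical helper for illustration, not the SDK's implementation:

```python
def effective_section_config(defaults, sections, section_name):
    """Start from the top-level defaults and overlay any per-section
    overrides; sections not listed inherit the defaults unchanged."""
    config = dict(defaults)
    config.update(sections.get(section_name, {}))
    return config

defaults = {"voice": "Joanna", "speed": 100, "silence_padding": 0}
sections = {"firstsection": {"voice": "Matthew", "speed": 110}}

print(effective_section_config(defaults, sections, "firstsection"))
# -> {'voice': 'Matthew', 'speed': 110, 'silence_padding': 0}
print(effective_section_config(defaults, sections, "anothersection"))
# -> {'voice': 'Joanna', 'speed': 100, 'silence_padding': 0}
```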
- retrieve() - Retrieve the speech file URLs.
  - Parameters:
    - scriptId * [Required] (string) - The script ID you want to retrieve.
    - version (string) - The version of the script to be retrieved. Default is "".
    - section (string) - The script section name you want to retrieve. If not provided, all the script sections will be returned.
    - parameters (dict) - Dict containing the personalisation parameters of your script. If not provided, the fallback track will be retrieved. This field depends on the parameters you used in your script's resource section. In order to retrieve a specific set of parameters, you need to create the speech with the same set of parameters.
  - Example:

    audio_files = apiaudio.Speech.retrieve(scriptId="id-1234")
- download() - Download the speech files to your preferred folder.
  - Parameters:
    - scriptId * [Required] (string) - The script ID you want to download.
    - version (string) - The version of the script to be downloaded. Default is "".
    - section (string) - The script section name you want to retrieve. If not provided, all the script sections will be returned.
    - parameters (dict) - Dict containing the personalisation parameters of your script. If not provided, the fallback track will be retrieved. This field depends on the parameters you used in your script's resource section. In order to retrieve a specific set of parameters, you need to create the speech with the same set of parameters.
    - destination (string) - The folder destination path. Default is "." (current folder).
  - Example:

    audio_files = apiaudio.Speech.download(scriptId="id-1234", destination=".")
Voice allows you to retrieve a list of the available voices from our API.
Voice methods are:
- list() - List all the available voices in our API. The parameters are all optional, and can be used in combination to get the perfect voice for your use case.
  - Parameters:
    - provider (string) - Try one of: google, polly, azure, msnr (aflorithmic), ibm, yandex, retro (aflorithmic), vocalid, resemble.
    - language (string) - e.g. english, spanish, french, german, etc.
    - accent (string) - e.g. american, british, neutral, portuguese/brazilian, american soft, mexican, australian.
    - gender (string) - Try one of: male, female.
    - ageBracket (string) - Try one of: adult, child, senior.
    - tags (string) - Try one or more (separated by commas) of: steady, confident, balanced, informative, serious, instructional, slow, storytelling, calm, clear, deep, formal, sad, thin, fast, upbeat, fun, energetic, tense, very fast, flat, low pitched, high pitched, low-pitched, sing-y, cooperative, kind, stable, monotonous, neutral, responsible, business man, straight to the point, knowledgeable, focused, newscastery, newsreader, interviewer, reliable, friendly, welcoming, good for handing out information, slightly friendly.
    - industryExamples (string) - Try one or more (separated by commas) of: fitness, business, commercial, fashion, travel, audiobook, real estate, faith, health industry, comercial, realestate, kids entertainment, games, customer service, education, storytelling, entertainment, kids, education audiobook.
    - timePerformance (string) - The time performance of the voice. There are three categories: slow, medium, fast.
    - sectionCharacterLimit (string) - The maximum number of characters that the voice can process per Script section. All of the supported providers, with the exception of VocalId, have a limit of 4000.
  - Example:

    all_voices = apiaudio.Voice.list()

  - Example:

    french_voices = apiaudio.Voice.list(language="french", tags="steady, fun")
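To see what combining these filters amounts to, here is a client-side sketch over made-up voice records (the real filtering happens server-side in Voice.list(); filter_voices and the sample data are illustrative assumptions):

```python
def filter_voices(voices, tags=None, language=None):
    """Keep only voices matching the given language and containing
    every comma-separated tag, mirroring how the list() filters combine."""
    wanted = {t.strip() for t in tags.split(",")} if tags else set()
    result = []
    for voice in voices:
        if language and voice.get("language") != language:
            continue
        if not wanted <= set(voice.get("tags", [])):
            continue
        result.append(voice)
    return result

voices = [
    {"name": "Aria", "language": "english", "tags": ["steady", "fun"]},
    {"name": "Lea", "language": "french", "tags": ["calm"]},
]
print(filter_voices(voices, tags="steady, fun", language="english"))
# -> [{'name': 'Aria', 'language': 'english', 'tags': ['steady', 'fun']}]
```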
- list_parameters() - This method lets you see which attributes you can filter the voices by, along with the allowed values for each attribute. You can later use these parameters and values to filter the voices you wish to list.
  - Parameters: none.
  - Example:

    parameters = apiaudio.Voice.list_parameters()
Sound allows you to design your own sound template from a script and a background track. In order to get a sound template/project, make sure you requested speech for your script resource first.
Sound methods are:
- list() - List all the available sound templates in our API. The parameters are all optional, and can be used in combination to get the perfect sound for your use case.
  - Parameters:
    - industryExamples (string) - Try one or more (separated by commas) of: news, travel, business, relaxation, fitness, relax, children stories.
    - contents (string) - Try one or more (separated by commas) of: intro, main, outro, effect1, effect2, main outro, droid_main, chewie_main, effect3, ambience, only effects.
    - genre (string) - Try one of: electronic, acoustic, atmospheric, abstract, rock.
    - tempo (string) - Try one of: mid, up, down, uptempo.
    - tags (string) - Try one or more (separated by commas) of: intense, minimal, reflective, melodic, happy, nostalgic, focus, energetic, uplifting, active, relaxed, ambience, mysterious, positive, informative, workout, work, meditation, travel, full silence.
  - Example:

    sound_templates = apiaudio.Sound.list()
- list_parameters() - This method lets you see which attributes you can filter the sound templates by, along with the allowed values for each attribute. You can later use these parameters and values to filter the sound templates you wish to list.
  - Parameters: none.
  - Example:

    parameters = apiaudio.Sound.list_parameters()
Mastering allows you to create and retrieve a mastered audio file of your script. A mastered version contains the speech of the script, a background track, personalised parameters for your audience and a mastering process to enhance the audio quality of the whole track. In order to get a mastered audio file, make sure you requested speech for your script resource first.
Mastering methods are:
- create() - Create a mastered version of your script and choose the audio format.
  - Parameters:
    - scriptId * [Required] (string) - The script resource ID.
    - version (string) - The version of the script to be produced. Default is "".
    - soundTemplate (string) - The sound template name. For the list of available sound templates check the apiaudio.Sound.list_sound_templates() call.
    - public (boolean) - Boolean flag that allows storing the mastered file in a public S3 folder. Default value is False. Warning - this will make your mastered files public to anyone on the internet. Use this at your own risk.
    - vast (boolean) - Boolean flag that allows creating a VAST file of your mastered file. The vast flag only works if public is True. Default value is False.
    - endFormat (list) - List of audio formats to be produced. Valid formats are: ["wav", "mp3" (default), "flac", "ogg", "mp3_very_low", "mp3_low", "mp3_medium", "mp3_high", "mp3_very_high", "mp3_alexa"].
    - forceLength (int) - Force the audio length of the mastered track (in seconds).
    - audience (dict) - Dictionary containing the personalisation parameters. This parameter depends on the number of parameters you used in your script resource. In the script documentation example above, we used 2 parameters: username and location, and in the example below we produce the script for username salih with location Barcelona. If audience is not provided, the fallback track will be created.
    - mediaFiles (list) - List of dicts containing the media files. This parameter depends on the media file tags used in the script resource and the media files you have in your account. For example, if the script contains <<media::myrecording>> plus <<media::mysong>>, and you want to attach myrecording to mediaId = "12345", and mysong to mediaId = "67890", then mediaFiles = [{"myrecording": "12345", "mysong": "67890"}].
    - mediaVolumeTrim (float) - Floating point variable that allows you to trim the volume of uploaded media files (in dB). This attribute has a valid range of -12 to 12 dB and applies to all media files included in a single mastering call. Clipping protection is not provided, so only make incremental adjustments.
    - connectors (list) - List of dicts specifying configuration for a particular 3rd party connection. For guidelines in the context of supported 3rd party applications, see the connectors documentation.
    - masteringPreset (string) - The mastering preset to use; this enables features such as sidechain compression (i.e. ducking). See apiaudio.Mastering.list_presets() for a list of presets and their descriptions.
    - share (boolean) - If you would like to have a shareable link created with your audio file, use this flag. If you set share=True, the response will have a shareUrl parameter returned. (Note: if you use this flag, your private files will be converted to public files.)
  - Example:

    response = apiaudio.Mastering.create(
        scriptId="id-1234",
        soundTemplate="jakarta",
        audience={"username": "salih", "location": "barcelona"}
    )
- create_media_timeline() - Creates a mastering request based purely on uploaded media files. Media files will need to be uploaded before calling this function. See media.
  - Parameters:
    - timeline * [Required] (list) - The timeline object. A timeline object is a list of dictionaries, whereby each represents a track of audio files. Each track must have the following two keys: files and contentType.
      - files * [Required] (list) - A list of dictionaries, whereby each entry must have the following 3 keys: mediaId, startAt and endAt.
        - mediaId * [Required] (str) - The mediaId of the file that has been uploaded.
        - startAt * [Required] (float) - The time that this media file should start at, in seconds.
        - endAt * [Required] (float) - The time that this media file should end at, in seconds.
      - contentType * [Required] (string) - The type of content that this track contains; should be either sound or speech.
    - endFormat (list) - List of audio formats to be produced. Valid formats are: ["wav", "mp3" (default), "flac", "ogg", "mp3_very_low", "mp3_low", "mp3_medium", "mp3_high", "mp3_very_high", "mp3_alexa"].
    - masteringPreset (string) - The mastering preset to use; this enables features such as sidechain compression (i.e. ducking). See apiaudio.Mastering.list_presets() for a list of presets and their descriptions.
  - Example:

    backgroundId = apiaudio.Media.upload(file_path="background.wav")["mediaId"]
    speechId = apiaudio.Media.upload(file_path="speech1.wav")["mediaId"]

    timeline = [
        {
            "files": [
                {"mediaId": speechId, "startAt": 2, "endAt": 14}
            ],
            "contentType": "speech"
        },
        {
            "files": [
                {"mediaId": backgroundId, "startAt": 0, "endAt": 45}
            ],
            "contentType": "sound"
        }
    ]

    response = apiaudio.Mastering.create_media_timeline(timeline=timeline, masteringPreset="lightducking")
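The structural rules above (each track needs files and contentType, each file entry needs mediaId, startAt and endAt) can be checked locally before calling the API. A sketch with a hypothetical validate_timeline helper, not part of the SDK:

```python
def validate_timeline(timeline):
    """Raise ValueError if the timeline violates the documented shape."""
    for track in timeline:
        if track.get("contentType") not in ("sound", "speech"):
            raise ValueError("contentType must be 'sound' or 'speech'")
        for entry in track.get("files", []):
            missing = {"mediaId", "startAt", "endAt"} - entry.keys()
            if missing:
                raise ValueError(f"file entry missing keys: {missing}")
            if entry["endAt"] <= entry["startAt"]:
                raise ValueError("endAt must be after startAt")
    return True

timeline = [
    {"files": [{"mediaId": "abc", "startAt": 2, "endAt": 14}], "contentType": "speech"},
    {"files": [{"mediaId": "xyz", "startAt": 0, "endAt": 45}], "contentType": "sound"},
]
print(validate_timeline(timeline))  # True
```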
- retrieve() - Retrieves the mastered file URLs.
  - Parameters:
    - scriptId * [Required] (string) - The script resource ID.
    - version (string) - The version of the script to be retrieved. Default is "".
    - parameters (dict) - Dictionary containing the audience item you want to retrieve. If parameters are not provided, the fallback track will be retrieved.
    - public (boolean) - Boolean flag that allows retrieving the mastered file from the public bucket. Use this if you want to retrieve a mastered file created using public=True. Default value is False.
    - vast (boolean) - Boolean flag that allows retrieving the VAST file of your mastered file. The vast flag only works if public is True. Default value is False.
    - endFormat (list) - List of audio formats to be retrieved. Valid formats are: ["wav", "mp3" (default), "flac", "ogg", "mp3_very_low", "mp3_low", "mp3_medium", "mp3_high", "mp3_very_high", "mp3_alexa"].
  - Example:

    mastered_files = apiaudio.Mastering.retrieve(
        scriptId="id-1234",
        parameters={"username": "salih", "location": "barcelona"}
    )
- download() - Download the mastered files to your preferred folder.
  - Parameters:
    - scriptId * [Required] (string) - The script resource ID.
    - version (string) - The version of the script to be downloaded. Default is "".
    - parameters (dict) - Dictionary containing the audience item you want to retrieve. If parameters are not provided, the fallback track will be downloaded.
    - destination (string) - The folder destination path. Default is "." (current folder).
    - public (boolean) - Boolean flag that allows retrieving the mastered file from the public bucket. Use this if you want to retrieve a mastered file created using public=True. Default value is False.
    - vast (boolean) - Boolean flag that allows retrieving the VAST file of your mastered file. The vast flag only works if public is True. Default value is False.
  - Example:

    mastered_files = apiaudio.Mastering.download(
        scriptId="id-1234",
        parameters={"username": "salih", "location": "barcelona"},
        destination="."
    )
- list_presets() - List the available mastering presets.
  - Parameters: none.
  - Example:

    presets = apiaudio.Mastering.list_presets()
    print(presets)
Media allows you to retrieve all the files available in api.audio for your organization.
Media methods are:
- upload() - Upload files to our databases.
  - Parameters:
    - file_path * [Required] (string) - Relative path to the audio file.
    - tags (string) - Comma-separated tags you want to add to your uploaded file. This will make retrieval easier.
  - Example:

    apiaudio.Media.upload(
        file_path="./my_file.mp3",
        tags="tag1,tag2,tag3"
    )
- list() - List all files within an org.
  - Parameters:
    - mediaId (string) - If passed, will only return that file, or an empty object if it does not exist.
    - tags (string) - Comma-separated tags to filter by. If passed, will return all files that at least contain those tags.
    - downloadUrl (boolean) - If True, a presigned URL is added to each item in the array. This is slow for a large number of files (around 1s each).
    - public (boolean) - If True, the media files listed will be the public media files provided by api.audio. Default is False.
  - Examples:

    # lists all files
    files = apiaudio.Media.list()
    # lists files with tag="tag1"
    files = apiaudio.Media.list(tags="tag1")
    # lists file with specific id
    files = apiaudio.Media.list(mediaId="some_mediaId")
    # lists files with tag="tag1" and with a download url
    files = apiaudio.Media.list(tags="tag1", downloadUrl=True)
- list_tags() - Returns a list of all unique user-defined tags.
  - Parameters: none.
  - Example:

    tags = apiaudio.Media.list_tags()
    print(tags)
- get_download_url() - This method returns a presigned URL for downloading a specific audio file.
  - Parameters:
    - mediaId * [Required] (string) - Media ID for the file to be downloaded.
  - Example:

    url = apiaudio.Media.get_download_url(mediaId="some-mediaId")
    print(url)
- download() - This method downloads a specific audio file.
  - Parameters:
    - mediaId * [Required] (string) - Media ID for the file to be downloaded.
    - destination (string) - Path to the directory where the file will be downloaded. Default is ".".
  - Example:

    apiaudio.Media.download(
        mediaId="some_mediaId",
        destination="/my_destination_folder"
    )
SyncTTS allows you to do Synchronous Text-To-Speech (TTS) with our API using all the voices available. Use it to create a speech audio file from a text and a voice name. The response contains wave bytes ready to be played or written to a file.
SyncTTS methods are:
- create() - Create a TTS speech file.
  - Parameters:
    - voice * [Required] (string) - The voice name. See the list of available voices using the Voice resource.
    - text * [Required] (string) - The text you want to do TTS with. The limit is 800 characters for wave files.
    - metadata [Optional] ("full" or "none") - The level of metadata you want. Returns phoneme lists (only available for some msnr voices).
  - Example:

    sync_tts = apiaudio.SyncTTS.create(
        voice="joanna",
        text="This is me creating synchronous text to speech",
        metadata="full"
    )
Birdcache is a caching service provided by API.audio that provides the caching layer for the customer by storing data in API.audio servers for future use. This allows you to retrieve your speech files on the fly.
Birdcache methods are:
- create() - Create a TTS speech file.
  - Parameters:
    - type * [Required] (string) - Type of the event. Supported types are mastering and speech.
    - text * [Required] (string) - The text you want to do speech/mastering with. See the example for personalisation parameters.
    - voice * [Required] (string) - The voice for speech creation.
    - audience [Optional] (dict) - The key pair object for personalisation parameters. See the example below.
    - soundTemplate [Optional] (string) - The sound template for mastering creation. Only needed when the type is mastering.
  - Example:

    birdcache = apiaudio.Birdcache.create(
        type="mastering",
        voice="linda",
        text="This is {{username|me}} creating synchronous text to speech",
        audience={"username": ["salih", "sam", "timo"]},
        soundTemplate="electronic"
    )
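A list-valued audience, as in the example above, implies one cached variant per value. A local sketch of that expansion (expand_variants is a hypothetical helper that handles a single list-valued parameter; the actual caching happens on api.audio servers):

```python
import re

def expand_variants(text, audience):
    """Render one personalised text per value of a list-valued
    {{param|fallback}} parameter, illustrating what gets cached."""
    pattern = re.compile(r"\{\{(\w+)\|?([^}]*)\}\}")
    name = next(iter(audience))  # single list-valued parameter
    variants = []
    for value in audience[name]:
        variants.append(
            pattern.sub(lambda m: value if m.group(1) == name else m.group(2), text)
        )
    return variants

text = "This is {{username|me}} creating synchronous text to speech"
print(expand_variants(text, {"username": ["salih", "sam", "timo"]}))
# -> ['This is salih creating synchronous text to speech',
#     'This is sam creating synchronous text to speech',
#     'This is timo creating synchronous text to speech']
```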
Often when working with TTS, the models can fail to accurately pronounce specific words, for example brands, names and locations are commonly mis-pronounced. As a first attempt to fix this we have introduced our lexi flag, which works in a similar way to SSML. For example, adding <!peadar> instead of Peadar (who is one of our founders) to your script will cause the model to produce an alternative pronunciation of this name. This is particularly useful in cases where words can have multiple pronunciations, for example the cities ‘reading’ and ‘nice’. In this instance placing <!reading> and <!nice> will ensure that these are pronounced correctly, given the script:
" The city of <!nice> is a really nice place in the south of france."
If this solution does not work for you, you can instead make use of our custom (self-serve) lexi feature.
This can be used to achieve one of two things, correcting single words, or expanding acronyms. For example, you can replace all occurrences of the word Aflorithmic with “af low rhythmic” or occurrences of the word ‘BMW’ with “Bayerische Motoren Werke”. Replacement words can be supplied as plain text or an IPA phonemisation.
Pronunciation dictionary methods are:
- list() - Lists the publicly available dictionaries and their words.
  - Parameters: none.
  - Example:

    # returns a list of public dictionaries
    dictionaries = apiaudio.Lexi.list()
- list_custom_dicts() - Lists the custom dictionaries and their respective words.
  - Parameters: none.
  - Example:

    # returns a list of custom dictionaries
    types = apiaudio.Lexi.list_custom_dicts()
register_custom_word
Adds a new word to a custom dictionary.lang
[required] (string) - Language family, e.g.en
ores
.dictionary - useglobal
to register a word globally.word
[required] (string) - The word that will be replacedreplacement
[required] (string) - The replacement token. Can be either a plain string or a IPA token.contentType
[optional] (string) - The content type of the supplied replacement, can be eitherbasic
(default) oripa
for phonetic replacements.specialization
[optional] (string) - by default the supplied replacement will apply regardless of the supplied voice, language code or provider. However edge cases can be supplied, these can be either a valid; provider name, language code (i.e. en-gb) or voice name.- Example:
# correct the word sapiens r = apiaudio.Lexi.register_custom_word(word="sapiens", replacement="saypeeoons", lang="en") print(r)
For each language, only a single word entry is permitted. However, each word can have multiple
specializations
. When a word is first registered adefault
specialization
is always created, which will match what is passed in. Subsequent calls with different specializations will only update the given specialization. The exact repacement that will be used is determined by the following order of preference:voice name > language dialect > provider name > default
For example, a replacement specified for voice name
sara
will be picked over a replacement specified for providerazure
. -
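That order of preference can be sketched as a simple lookup (illustrative only; the real resolution happens server-side, and the specialization values below are made up):

```python
def resolve_replacement(specializations: dict, voice=None, dialect=None, provider=None):
    """Pick a replacement following voice name > language dialect > provider name > default."""
    for key in (voice, dialect, provider, "default"):
        if key is not None and key in specializations:
            return specializations[key]
    return None

# hypothetical specializations registered for one word
specs = {"default": "saypeeoons", "azure": "say-pee-ons", "sara": "say-pienz"}
print(resolve_replacement(specs, voice="sara", provider="azure"))  # voice wins over provider
```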
- `list_custom_words()` - Lists all the words contained in a custom dictionary.
  - Parameters:
    - `lang` [required] (string) - Language family, e.g. `en` or `es`. Use `global` to list language-agnostic words.
  - Example:

    ```python
    # lists all words in the dictionary along with their replacements
    words = apiaudio.Lexi.list_custom_words(lang="en")
    ```
The effect of applying the pronunciation dictionary can be seen with the `Script.preview()` method. See the Script documentation for more details.

- Example:

  ```python
  text = """The author of this repo has lived in two places in the UK, <!Bude> and <!Bristol>"""
  r = apiaudio.Script.create(scriptText=text)
  scriptId = r["scriptId"]

  # preview the script in en-gb
  preview = apiaudio.Script.preview(scriptId=scriptId, voice="Joanna")
  print(preview)
  ```
- Response:

  ```python
  {"preview": "The author of this repo has lived in two places in the UK, bude and <phoneme alphabet=\"ipa\" ph=\"###\"> bristol </phoneme>"}
  ```
In this example `Bristol` will be phonemised to ensure it is correctly pronounced, but as `Bude` is not in our dictionaries it is left as is. The exact IPA tokens for words in our internal dictionaries are obfuscated.
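A quick way to check which words were phonemised in a preview response is to look for `<phoneme>` tags. A small sketch (the `phonemised_words` helper is hypothetical, operating on the response shape shown above):

```python
import re

def phonemised_words(preview_text: str):
    """Extract the words wrapped in <phoneme> tags in a preview string."""
    return [m.strip() for m in re.findall(r"<phoneme[^>]*>(.*?)</phoneme>", preview_text)]

response = {"preview": 'The author of this repo has lived in two places in the UK, bude and <phoneme alphabet="ipa" ph="###"> bristol </phoneme>'}
print(phonemised_words(response["preview"]))  # ['bristol']
```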
This resource is used for monitoring third-party integrations. The end results of the Mastering resource can be distributed to external applications through the `connectors` field. See the connectors documentation.
List of currently supported applications:
Available methods:
- `retrieve()` - After registering a connector in the api.audio console, use this method to check whether a connection was successful using the provided credentials.
  - Parameters:
    - `name` [required] (string) - The name of the connector specified in the console.
  - Example:

    ```python
    status = apiaudio.Connector.retrieve(name="julep")
    ```
- `connection()` - Check the status of a connection by providing the `connectionId` returned in a Mastering response.
  - Parameters:
    - `connection_id` [required] (string) - The connectionId returned by the Mastering resource.
  - Example:

    ```python
    status = apiaudio.Connector.connection(connection_id="af2fe14a-aa6b-4a97-b430-a072c38b11ff")
    ```
The orchestrator makes working with a range of audio services as easy as sending a single API request. Each route here is carefully configured to produce high-quality, easy-to-access audio assets.
Orchestrator methods are:
- `create_audio()` - Creates a simple TTS speech request and adds a sound template to it through mastering.
  - Parameters:
    - `scriptText` [required] (str) - Text to synthesize (TTS).
    - `voice` [required] (str) - Name of the voice to use.
    - `soundTemplate` (str) - Sound template to use.
- `create_three_sections()` - Creates a TTS speech request with 3 sections and adds a sound template to it through mastering.
  - Parameters:
    - `introText` [required] (str) - Text to synthesize in the intro section.
    - `mainText` [required] (str) - Text to synthesize in the main section.
    - `outroText` [required] (str) - Text to synthesize in the outro section.
    - `voice` [required] (str) - Name of the voice to use.
    - `soundTemplate` (str) - Sound template to use.
- `media_with_sound()` - Combines a pre-existing media file (i.e. a pre-recorded voice) with a sound template.
  - Parameters:
    - `mediaId` [required] (str) - The mediaId of the media file to use as input.
    - `soundTemplate` [required] (str) - Sound template to use.
This SDK provides an easy way of verifying the security headers of apiaudio webhook calls. It is highly recommended that you verify these headers in order to protect your server from malicious attacks.
The method is:
`apiaudio.Webhooks.verify(payload, sig_header, secret, tolerance)`
It returns `True` if the header is valid; otherwise it raises an error.
The parameters to pass are:

- `payload` - the body object sent by apiaudio
- `sig_header` - the `X-Aflr-Secret` value in the request headers sent by apiaudio
- `secret` - your webhook secret (you can get it in the apiaudio console)
- `tolerance` - the tolerance in seconds for the header checks; defaults to 300 seconds
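apiaudio's exact signing scheme isn't documented here, but as a generic illustration of how tolerance-based webhook verification typically works (this is NOT the SDK's implementation, just a sketch of the pattern):

```python
import hashlib
import hmac
import time

def verify_sketch(payload: bytes, timestamp: int, signature: str,
                  secret: str, tolerance: int = 300) -> bool:
    """Generic timestamped-HMAC check; not apiaudio's actual scheme."""
    # Reject requests whose timestamp falls outside the tolerance window
    if abs(time.time() - timestamp) > tolerance:
        return False
    expected = hmac.new(secret.encode(),
                        f"{timestamp}.".encode() + payload,
                        hashlib.sha256).hexdigest()
    # Constant-time comparison avoids timing side channels
    return hmac.compare_digest(expected, signature)
```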
By default, warnings issued by the API are logged in the console output. Additionally, some behaviors are logged on the informational level (e.g. "In progress..." indicators during longer processing times).
The level of logging can be controlled by choosing from the standard levels in Python's `logging` library.

- Decreasing the logging level for more detailed logs:

  ```python
  apiaudio.set_logger_level("INFO")
  # apiaudio.set_logger_level("CRITICAL")  # set the highest level to disable logs
  ```
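For reference, a helper like this plausibly just maps a standard level name onto the SDK's logger. A sketch (the logger name `"apiaudio"` is an assumption, not confirmed by this README):

```python
import logging

def set_logger_level(level_name: str) -> None:
    """Sketch: apply a standard logging level name to the SDK's logger.
    The logger name "apiaudio" is an assumption for illustration."""
    logging.getLogger("apiaudio").setLevel(level_name)

set_logger_level("INFO")
```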
There is a pre-commit hook that will run before you commit a file, to keep the code standards high. To enable it, run `make`; this sets up the pre-commit hook for git. That's all! Now, every time before you commit, it will run and tell you about the standards.
If you use VSCode for committing files, you may bump into a `pre-commit command not found` error. That is OK; just run `brew install pre-commit`, or use your favourite package manager from the list here.
If you bump into a `your pip version is old` error, just ignore it and use the terminal.
If there is a problem and you are in a rush, you can add `--no-verify` to the end of the commit command to skip the pre-commit hooks, e.g. `git commit -m 'your commit message' --no-verify`.
This project is licensed under the terms of the MIT license.