VNPT AI Ecosystem

    SmartVoice Engine Specifications

    Technical specifications of the SmartVoice Engine when it is supplied under license.

    Text to Speech

    Category                            Specification
    Integration capability              API and SDK
    Supported protocols                 gRPC, REST API
    Minimum text length                 1 character per conversion
    Maximum text length (synchronous)   1,800 characters per conversion
    Maximum text length (asynchronous)  10,000 characters per conversion
    Output                              Link to the audio file
    Other requirements                  Converted speech has emotional intonation and expressiveness, with male and female voices in 3 regional accents
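
The length limits above suggest a simple client-side routing rule before any request is sent. The following is a minimal sketch, not part of the SmartVoice SDK: the function names `choose_tts_mode` and `split_text` are hypothetical, and it assumes characters are counted as Python code points, which the specification does not state.

```python
from typing import List

# Limits taken from the Text to Speech table above.
MIN_CHARS = 1
MAX_SYNC_CHARS = 1_800
MAX_ASYNC_CHARS = 10_000


def choose_tts_mode(text: str) -> str:
    """Return which conversion mode a text fits into, based on its length.

    Assumption: len(text) matches how the service counts characters.
    """
    n = len(text)
    if n < MIN_CHARS:
        raise ValueError("text must contain at least 1 character")
    if n <= MAX_SYNC_CHARS:
        return "synchronous"
    if n <= MAX_ASYNC_CHARS:
        return "asynchronous"
    raise ValueError(f"text has {n} characters; split it into parts of at most {MAX_ASYNC_CHARS}")


def split_text(text: str, limit: int = MAX_ASYNC_CHARS) -> List[str]:
    """Split text into chunks of at most `limit` characters, preferring spaces."""
    chunks = []
    rest = text.strip()
    while len(rest) > limit:
        # Last space at or before the limit; fall back to a hard cut.
        cut = rest.rfind(" ", 0, limit + 1)
        if cut <= 0:
            cut = limit
        chunks.append(rest[:cut].rstrip())
        rest = rest[cut:].lstrip()
    if rest:
        chunks.append(rest)
    return chunks
```

Text longer than 10,000 characters can be passed through `split_text` first, and each chunk then routed with `choose_tts_mode`. Splitting on spaces is only one possible choice; splitting on sentence boundaries would likely give more natural intonation at the joins.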

    Speech to Text

    Category                            Specification
    Integration capability              API and SDK
    Supported protocols                 gRPC, REST API, WebSocket API
    Voice quality                       The speaker's voice must be clearly audible and not too quiet (tested with a microphone less than 30 cm away)
    Noise environment                   No excessive background noise
    File format                         WAV, MP3
    Stream format                       PCM 16-bit, mono channel
    Synchronous audio file processing   File size < 10 MB
    Asynchronous audio file processing  File size < 250 MB, maximum duration 2 hours
    Sample format                       PCM_16 (Int16, 2 bytes per sample)
    Sample rate                         16 kHz
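
The file and stream limits above can be checked locally before audio is submitted. The sketch below is illustrative only and uses nothing beyond Python's standard `wave` module. The names `choose_stt_mode` and `check_wav_format` are hypothetical. It assumes 1 MB means 1,000,000 bytes (the specification does not say), and it assumes the 16 kHz, 16-bit mono requirement applies to WAV files as well as to streams.

```python
import wave
from typing import List, Optional

MB = 1_000_000  # assumption: decimal megabytes, the stricter reading
MAX_SYNC_BYTES = 10 * MB
MAX_ASYNC_BYTES = 250 * MB
MAX_ASYNC_SECONDS = 2 * 60 * 60

SAMPLE_RATE_HZ = 16_000
SAMPLE_WIDTH_BYTES = 2  # PCM_16: Int16, 2 bytes per sample
CHANNELS = 1            # mono
BYTES_PER_SECOND = SAMPLE_RATE_HZ * SAMPLE_WIDTH_BYTES * CHANNELS  # 32,000


def choose_stt_mode(size_bytes: int, duration_s: Optional[float] = None) -> str:
    """Pick a processing mode from the file size and, if known, the duration."""
    if size_bytes < MAX_SYNC_BYTES:
        return "synchronous"
    within_duration = duration_s is None or duration_s <= MAX_ASYNC_SECONDS
    if size_bytes < MAX_ASYNC_BYTES and within_duration:
        return "asynchronous"
    raise ValueError("audio exceeds the asynchronous size or duration limit")


def check_wav_format(path: str) -> List[str]:
    """Return a list of mismatches against 16 kHz, 16-bit, mono PCM."""
    problems = []
    with wave.open(path, "rb") as wav:
        if wav.getnchannels() != CHANNELS:
            problems.append(f"expected mono, got {wav.getnchannels()} channels")
        if wav.getsampwidth() != SAMPLE_WIDTH_BYTES:
            problems.append(f"expected 16-bit samples, got {8 * wav.getsampwidth()}-bit")
        if wav.getframerate() != SAMPLE_RATE_HZ:
            problems.append(f"expected 16000 Hz, got {wav.getframerate()} Hz")
    return problems
```

At 32,000 bytes per second, 10 MB of raw PCM in this format is about 312 seconds of audio, and two hours is 230,400,000 bytes. Under the decimal-megabyte assumption, uncompressed audio therefore reaches the 2-hour limit before the 250 MB limit, while MP3 input is bounded by duration rather than size in most cases.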