SoundAI Voice Kit (SVK) is a full-chain intelligent speech interaction development system that integrates an acoustic distribution network, beamforming, sound source localization, directional pickup, noise suppression, dereverberation, echo cancellation, speech wake-up, speech recognition, speech comprehension, speech synthesis, duplex communication, and more. It is compatible with mainstream intelligent voice chips and hardware architectures and supports many AI content platforms, such as Baidu DuerOS, AliGenie, Xiao Ai, Tencent Dingdang, and Alexa. The system supports both the development and the volume production of intelligent speech products worldwide.
High Accuracy: Voiceprint recognition rate within 5 meters > 95%
Rapid Response: Full-chain speech response time < 1.3 seconds
Quality Service: 50+ products in volume production, SLA > 99.99%
Low Costs: Low resource usage and low power consumption
Compatible with mainstream hardware architectures and applicable to general application scenarios, with support for ring-shaped, linear, square, and L-shaped arrays of 1 to 6 microphones. The microphone array algorithms include beamforming, sound source localization, noise suppression, echo cancellation, speech enhancement, SSA (spatial situational awareness), SSP (spatial situational perception), VAN (vertical anti-noise recognition), and OpenAEC, which together enable accurate intelligent voice interaction in the local field, far field, distribution field, and very-far field.
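To make the beamforming idea concrete, here is a minimal sketch of textbook delay-and-sum beamforming for a uniform linear array. It is not SVK's implementation, which is not described here; the function names, the 343 m/s speed of sound, and the integer-sample alignment are all assumptions chosen for illustration.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed approximate value for air at room temperature


def steering_delays(num_mics, spacing_m, angle_deg, sample_rate_hz):
    """Arrival delay (in samples) at each microphone of a uniform linear array.

    angle_deg is the angle between the array axis (pointing from mic 0 to the
    last mic) and the direction the wave travels: 0 means the wave reaches
    mic 0 first, 90 means it reaches all microphones at the same time.
    """
    theta = math.radians(angle_deg)
    return [m * spacing_m * math.cos(theta) / SPEED_OF_SOUND * sample_rate_hz
            for m in range(num_mics)]


def delay_and_sum(channels, delays):
    """Advance each channel by its rounded delay, then average the channels."""
    shifts = [int(round(d)) for d in delays]
    base = min(shifts)
    shifts = [s - base for s in shifts]  # make every shift non-negative
    length = min(len(ch) - s for ch, s in zip(channels, shifts))
    return [sum(ch[n + s] for ch, s in zip(channels, shifts)) / len(channels)
            for n in range(length)]


# Hypothetical usage: three microphones, each hearing the same pulse
# two samples later than the previous one.
pulse = [0, 0, 0, 1, 0, 0, 0, 0, 0, 0]
channels = [pulse, [0, 0] + pulse[:-2], [0, 0, 0, 0] + pulse[:-4]]
delays = steering_delays(3, 0.042875, 0.0, 16000)
aligned = delay_and_sum(channels, delays)
```

Signals arriving from the steered direction add coherently after alignment, while sound from other directions is averaged down. A production system would typically use fractional delays or frequency-domain weights instead of whole-sample shifts.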
The wake-up rate exceeds 95%. Features such as Duel-wake, Free-cut, OpenAEC, and AKS are all supported. In everyday environments, there are fewer than 0.5 false wake-ups per day.
Highly customized for audio and video content, and able to meet the interaction demands of vertical scenarios such as office work, home control, and travel. Free-ask, One-shot, and VAN are all supported.
Precise speech understanding in vertical scenarios. A wide range of built-in content resources is available through the Baidu DuerOS system, and smart home control is available through the Xiao Ai platform. Similarly, other voice platforms provide additional services and voice-touch office features.
Provides a range of information analysis and mining services. Functions such as voiceprint recognition, age identification, emotion detection, gender identification, humming detection, and abnormal sound detection are supported. High-quality speech synthesis turns text into fluent speech and can generate natural-sounding voices, including those of celebrities and young children.
Far Field Voice Interaction
Local Field Voice Interaction
Distribution Field Voice Interaction
Very-far Field Voice Interaction
Full Chain of Voice Technology
Customization and Flexible Configuration
Rich Experience and Verification in Mass Production
Advanced Acoustic Technology
Support for Mainstream Intelligent Hardware Architectures
SLA of up to 99.99%
MI AI Speaker
MI AI Mini Speaker
Xiaodu AI Speaker