Sonic Intelligence: The Future of AI-Driven Audio
(Room Ballroom A)
05 Nov 25, 3:25 PM - 3:50 PM
Tracks: Embedded AI: Architectures & Applications
Embedded AI is accelerating the evolution of Audio AI into a key enabler of intelligent, real-time, and privacy-preserving interactions. Yet unlike computer vision, which benefits from robust datasets and standardized benchmarks, Audio AI remains fragmented and underdeveloped. As Audio AI expands beyond tasks like wake-word detection and noise suppression into areas such as environmental sound analysis, speech enhancement, and voice and music generation, its reliance on embedded AI at the edge increases. However, the field still lacks standardization and certification frameworks tailored to these emerging use cases.
Existing efforts, like AudioMark, focus on front-end performance but overlook the broader embedded AI audio stack. Common metrics such as PESQ, STOI, and DNSMOS often fall short in capturing perceptual quality across diverse environments, leading to inconsistent evaluations.
Through a speech enhancement case study, this talk exposes the limitations of current metrics and makes the case for fairer, more representative evaluation. It concludes with a call to action for a unified industry–academia initiative to establish shared benchmarks and standards, creating a true foundation for embedded Audio AI at the edge.