3D Audio Visualizer Code Using Python

Audio-Driven Emotion-Aware 3D Talking Face Generation from Single Image

Abstract: Audio-driven talking face generation from a single source image is a popular research topic. There still exist many challenges for its practical applications, e.g., diverse motion generation ...

IEEE

Sound Source Localization Using Multi-Dictionary Orthogonal Matching Pursuit in Reverberant Environments

Abstract: Sound source localization in reverberant environments remains a challenging problem, particularly when precise position estimation is required. Existing DOA estimation methods, while ...

GitHub

MMAR: A Challenging Benchmark for Deep Reasoning in Speech, Audio, Music, and Their Mix

We introduce MMAR, a new benchmark designed to evaluate the deep reasoning capabilities of Audio-Language Models (ALMs) across massive multi-disciplinary tasks. MMAR comprises 1,000 meticulously ...

GitHub

STream3R: Scalable Sequential 3D Reconstruction with Causal Transformer (ICLR 2026)

We present STream3R, a novel approach to 3D reconstruction that reformulates pointmap prediction as a decoder-only Transformer problem. Existing state-of-the-art methods for multi-view reconstruction ...
