[Market Report] China Automotive Multimodal Interaction Development (2025)

China Automotive Multimodal Interaction Development Research Report, 2025

Product code: 1892141

Publisher: ResearchInChina

Publication date: December 2025

Page info: English, 285 pages


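I. Closed-Loop Evolution of Multimodal Interaction: Progressive Evolution of L1-L4 Intelligent Cockpits

According to the "White Paper on Automotive Intelligent Cockpit Levels and Comprehensive Evaluation" jointly released by the China Society of Automotive Engineers (China-SAE), intelligent cockpits are defined in five levels, L0 to L4.

As the core of cockpit intelligence, multimodal interaction capability relies on the coordination of AI large models and multiple hardware components to fuse multi-source interaction data. This lets the cockpit accurately understand the intentions of drivers and passengers and provide scenario-based feedback, ultimately achieving natural, safe, and personalized human-machine interaction. The automotive intelligent cockpit industry is currently at roughly the L2 stage, with some leading manufacturers exploring the transition to L3.

The core feature of L2 intelligent cockpits is "strong perception, weak cognition". At the L2 stage, the cockpit's multimodal interaction function achieves signal-level fusion. Based on multimodal large model technology, it can "understand users' ambiguous intentions" and "process multiple commands simultaneously", executing users' immediate and explicit commands. Most mass-produced intelligent cockpits today have this capability.

The core feature of L3 intelligent cockpits is "strong perception, strong cognition". At the L3 stage, the cockpit's multimodal interaction function achieves cognitive-level fusion. Relying on large model capabilities, the cockpit system comprehensively understands the current scenario and proactively initiates appropriate services and suggestions without the user issuing an explicit command.

The core feature of L4 intelligent cockpits is "full-domain cognition and autonomous evolution", creating a "full-domain intelligent manager" for users. At the L4 stage, the intelligent cockpit goes far beyond being a mere tool and becomes a "digital twin partner" that predicts users' unspoken needs, shares memories with them, and mobilizes all resources on their behalf. The core experience is that the system has completed prediction and planning and is already executing before the user clearly perceives or expresses the need.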

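II. Multimodal AI Agent: Understand What You Need and Predict What You Think

The AI Agent is regarded as the core execution unit and key technical architecture for implementing concrete functions as intelligent cockpits evolve from L2 to L4. By integrating voice, vision, touch, and situational information, the AI Agent can not only "understand" commands but also "see" the environment and "perceive" the state, integrating previously discrete cockpit functions into a coherent, proactive, and personalized service process.

At the L2 level, Agent applications can be regarded as "enhanced command execution", the ultimate extension of L2 cockpit interaction capabilities. Based on large model technology, the cockpit system decomposes a user's complex command into multiple steps and calls different Agent tools in sequence to execute them.

Currently, Agent applications essentially respond to and execute a user's explicit, complex commands. The cockpit system does not do anything "actively"; it simply "completes the tasks assigned by the user" more intelligently.

This report surveys and analyzes China's automotive industry and provides information on the installation of interaction modalities in automotive cockpits, patents related to multimodal interaction, and the cockpit interaction solutions of automakers and suppliers.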

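Development Stages of Intelligent Cockpits
Definition of Multimodal Interaction
Development System of Multimodal Interaction
Introduction to Core Interaction Modality Technologies (1): Haptic Interaction
Introduction to Core Interaction Modality Technologies (2): Auditory Interaction
Introduction to Core Interaction Modality Technologies (3): Visual Interaction
Introduction to Core Interaction Modality Technologies (4): Olfactory Interaction
Application Scenarios of Large Models in Intelligent Cockpits
Vehicle-Human Interaction Functions Based on Multimodal AI Large Models
Industry Chain of Multimodal Interaction
Industry Chain of Multimodal AI Large Models
Policy Environment for Multimodal Interaction
Installation of Interaction Modalities in Cockpits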

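Chapter 2 Summary of Patents Related to Automotive Multimodal Interaction

Summary of Patents Related to Haptic Interaction
Summary of Patents Related to Auditory Interaction
Summary of Patents Related to Visual Interaction
Summary of Patents Related to Olfactory Interaction
Summary of Patents Related to Other Interaction Modalities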

Chapter 3 Multimodal Interaction Cockpit Solutions of OEMs

BYD
SAIC IM Motors
FAW Hongqi
Geely
Great Wall Motor
Chery
Changan
Voyah
Li Auto
NIO
Leapmotor
Xpeng
Xiaomi
BMW

Chapter 4 Multimodal Cockpit Solutions of Suppliers

Desay SV
Joyson Electronics
SenseTime
iFLYTEK
Thundersoft
Huawei
Baidu
Banma Zhixing

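Chapter 5 Application Cases of Multimodal Interaction Solutions for Typical Vehicle Models

Summary of Application Cases of Multimodal Interaction Solutions for Typical Vehicle Models (1)
Summary of Application Cases of Multimodal Interaction Solutions for Typical Vehicle Models (2)
Summary of Application Cases of Multimodal Interaction Solutions for Typical Vehicle Models (3)
Summary of Application Cases of Multimodal Interaction Solutions for Typical Vehicle Models (4)
Summary of Application Cases of Multimodal Interaction Solutions for Typical Vehicle Models (5)
All-New IM L6: Panoramic Summary of Multimodal Interaction Functions
All-New IM L6: Analysis of Featured Modal Interaction Capabilities
Fangchengbao Bao 8: Panoramic Summary of Multimodal Interaction Functions
Fangchengbao Bao 8: Analysis of Featured Modal Interaction Capabilities
Hongqi Jinkuihua Guoya: Panoramic Summary of Multimodal Interaction Functions
Hongqi Jinkuihua Guoya: Analysis of Featured Modal Interaction Capabilities (1)
Hongqi Jinkuihua Guoya: Analysis of Featured Modal Interaction Capabilities (2)
Hongqi Jinkuihua Guoya: Analysis of Featured Modal Interaction Capabilities (3)
Denza N9: Panoramic Summary of Multimodal Interaction Functions
Denza N9: Analysis of Featured Modal Interaction Capabilities (1)
Denza N9: Analysis of Featured Modal Interaction Capabilities (2)
Zeekr 9X: Panoramic Summary of Multimodal Interaction Functions
Zeekr 9X: Analysis of Featured Modal Interaction Capabilities
Geely Galaxy A7: Panoramic Summary of Multimodal Interaction Functions
Leapmotor B10: Panoramic Summary of Multimodal Interaction Functions
Li i6: Panoramic Summary of Multimodal Interaction Functions
Li i6: Analysis of Featured Modal Interaction Capabilities (1)
Li i6: Analysis of Featured Modal Interaction Capabilities (2)
Xpeng G7: Panoramic Summary of Multimodal Interaction Functions
Xpeng G7: Analysis of Featured Modal Interaction Capabilities
Xiaomi YU7: Panoramic Summary of Multimodal Interaction Functions
Xiaomi YU7: Analysis of Featured Modal Interaction Capabilities
MAEXTRO S800: Panoramic Summary of Multimodal Interaction Functions
MAEXTRO S800: Analysis of Featured Modal Interaction Capabilities (1)
MAEXTRO S800: Analysis of Featured Modal Interaction Capabilities (2)
MAEXTRO S800: Analysis of Featured Modal Interaction Capabilities (3)
2025 AITO M9: Panoramic Summary of Multimodal Interaction Functions
2025 AITO M9: Analysis of Featured Modal Interaction Capabilities (1)
2025 AITO M9: Analysis of Featured Modal Interaction Capabilities (2)
2025 AITO M9: Analysis of Featured Modal Interaction Capabilities (3)
2025 AITO M9: Analysis of Featured Modal Interaction Capabilities (4)
All-New BMW X3 M50: Panoramic Summary of Multimodal Interaction Functions
All-New BMW X3 M50: Analysis of Featured Modal Interaction Capabilities
2026 Audi E5 Sportback: Panoramic Summary of Multimodal Interaction Functions
2026 Audi E5 Sportback: Analysis of Featured Modal Interaction Capabilities (1)
2026 Audi E5 Sportback: Analysis of Featured Modal Interaction Capabilities (2)
All-New Mercedes-Benz Electric CLA: Panoramic Summary of Multimodal Interaction Functions
All-New Mercedes-Benz Electric CLA: Analysis of Featured Modal Interaction Capabilities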

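Chapter 6 Summary and Development Trends of Multimodal Interaction

Summary of Large Model Configuration Parameters of OEMs
Trend 1: Evolution of Multimodal Interaction Based on AI Large Models
Trend 2
Trend 3 (Voice Interaction)
Trend 4 (Visual Interaction)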


English Table of Contents

Research on Automotive Multimodal Interaction: The Interaction Evolution of L1~L4 Cockpits

ResearchInChina has released the "China Automotive Multimodal Interaction Development Research Report, 2025". The report reviews the installation of interaction modalities in automotive cockpits, multimodal interaction patents, mainstream cockpit interaction modes, the application of interaction modes in key vehicle models launched in 2025, the cockpit interaction solutions of automakers and suppliers, and integration trends in multimodal interaction.

I. Closed-Loop Evolution of Multimodal Interaction: Progressive Evolution of L1~L4 Intelligent Cockpits

According to the "White Paper on Automotive Intelligent Cockpit Levels and Comprehensive Evaluation" jointly released by the China Society of Automotive Engineers (China-SAE), five levels of intelligent cockpits are defined: L0-L4.

As a key driver of cockpit intelligence, multimodal interaction capability relies on the collaboration of AI large models and multiple hardware components to fuse multi-source interaction data. On this basis, it accurately understands the intentions of drivers and passengers and provides scenario-based feedback, ultimately achieving natural, safe, and personalized human-machine interaction. Currently, the automotive intelligent cockpit industry is generally at the L2 stage, with some leading manufacturers exploring and moving towards L3.

The core feature of L2 intelligent cockpits is "strong perception, weak cognition". At the L2 stage, the cockpit's multimodal interaction function achieves signal-level fusion. Based on multimodal large model technology, it can "understand users' ambiguous intentions" and "simultaneously process multiple commands" to execute users' immediate and explicit commands. At present, most mass-produced intelligent cockpits support this.

Take the Li i6 as an example. It is equipped with MindGPT-4o, the latest multimodal model, which offers understanding and response capabilities with ultra-long memory and ultra-low latency, along with more natural language generation. It supports multimodal "see and speak" (voice + vision fusion search, which lets children who cannot yet read select the cartoon they want to watch by describing the content of the video cover) and multimodal referential interaction (voice + gesture): 1. Voice reference to objects: while issuing a command, the user extends the index finger; pointing left, for example, can control the window, completing the vehicle control action. 2. Voice reference to people: passengers in the same row can direct a voice command at a designated person by combining gesture and voice, e.g., pointing right and saying "Turn on the seat heating for him".
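The referential interaction described above can be illustrated with a minimal sketch. It is not Li Auto's implementation: the seat names, the keyword matching that stands in for the large model, and the functions `resolve_pointed_seat` and `fuse_voice_and_gesture` are all hypothetical, chosen only to show how a pointing direction can resolve a pronoun such as "him" into a concrete target in the speaker's row.

```python
from typing import Dict, Optional


def resolve_pointed_seat(speaker_seat: str, pointing: str) -> str:
    """Map a pointing direction to a seat in the speaker's own row.

    Seat names such as "rear_left" are an assumption of this sketch.
    """
    row = speaker_seat.split("_")[0]  # "front" or "rear"
    if pointing not in ("left", "right"):
        raise ValueError(f"unsupported pointing direction: {pointing}")
    return f"{row}_{pointing}"


def fuse_voice_and_gesture(
    utterance: str, speaker_seat: str, pointing: Optional[str]
) -> Optional[Dict[str, str]]:
    """Combine a spoken command with a gesture into one vehicle action.

    Keyword matching is a stand-in for real intent understanding.
    """
    if pointing is None:
        return None  # no gesture, so the reference cannot be resolved here
    target = resolve_pointed_seat(speaker_seat, pointing)
    text = utterance.lower()
    if "seat heating" in text:
        return {"action": "seat_heating_on", "target": target}
    if "window" in text:
        return {"action": "window_control", "target": target}
    return None


command = fuse_voice_and_gesture(
    "Turn on the seat heating for him", speaker_seat="rear_left", pointing="right"
)
print(command)
```

The point of the sketch is that neither modality alone is sufficient: the utterance supplies the action, and the gesture supplies the target.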

The core feature of L3 intelligent cockpits is "strong perception, strong cognition". In the L3 stage, the multimodal interaction function of cockpits achieves cognitive-level fusion. Relying on large model capabilities, the cockpit system can comprehensively understand the complete current scenario and actively initiate reasonable services or suggestions without the user issuing explicit commands.

The core feature of L4 intelligent cockpits is "full-domain cognition and autonomous evolution", creating a "full-domain intelligent manager" for users. In the L4 stage, the application of intelligent cockpits will go far beyond the tool attribute and become a "digital twin partner" that can predict users' unspoken needs, have shared memories, and dispatch all resources for users. Its core experience is: before the user clearly perceives or expresses the need, the system has completed prediction and planning and entered the execution state.

II. Multimodal AI Agent: Understand What You Need and Predict What You Think

The AI Agent can be regarded as the core execution unit and key technical architecture for implementing concrete functions as intelligent cockpits evolve from L2 to L4. By integrating voice, vision, touch, and situational information, the AI Agent can not only "understand" commands but also "see" the environment and "perceive" the state, thereby integrating previously discrete cockpit functions into a coherent, proactive, and personalized service process.

Agent applications under L2 can be regarded as "enhanced command execution", which is the ultimate extension of L2 cockpit interaction capabilities. Based on large model technology, the cockpit system decomposes a user's complex command into multiple steps and then calls different Agent tools to execute them. For example, a passenger says: "I'm tired, help me buy a cup of coffee." The large model of the L2 cockpit system will understand this complex command and then call in sequence:

1. Voice Agent: Parse user needs in real time;

2. Food Ordering Agent: Recommend the best options according to user preferences, real-time location, and restaurant business status;

3. Payment Agent: Automatically complete seamless payment;

4. Delivery Agent: Dynamically plan the food delivery time using vehicle navigation data (e.g., "food arrives when the car arrives", so that the food is delivered just as the user reaches the destination).
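The four-step call sequence above can be sketched as a simple sequential pipeline. This is an illustrative assumption, not any vendor's architecture: the `TaskContext` class, the agent functions, and the keyword check that stands in for the large model's understanding are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class TaskContext:
    """Shared state passed from one agent to the next (hypothetical)."""
    utterance: str
    data: Dict[str, str] = field(default_factory=dict)
    log: List[str] = field(default_factory=list)


def voice_agent(ctx: TaskContext) -> TaskContext:
    # Keyword match stands in for large-model intent parsing.
    wants_coffee = "coffee" in ctx.utterance.lower()
    ctx.data["intent"] = "order_coffee" if wants_coffee else "unknown"
    ctx.log.append("voice")
    return ctx


def food_ordering_agent(ctx: TaskContext) -> TaskContext:
    # A real agent would use preferences, location, and opening hours.
    if ctx.data.get("intent") == "order_coffee":
        ctx.data["order"] = "coffee (placeholder recommendation)"
    ctx.log.append("food_ordering")
    return ctx


def payment_agent(ctx: TaskContext) -> TaskContext:
    ctx.data["payment"] = "paid" if "order" in ctx.data else "skipped"
    ctx.log.append("payment")
    return ctx


def delivery_agent(ctx: TaskContext) -> TaskContext:
    # A real agent would align delivery time with the navigation ETA.
    paid = ctx.data.get("payment") == "paid"
    ctx.data["delivery"] = "timed to arrival" if paid else "none"
    ctx.log.append("delivery")
    return ctx


PIPELINE: List[Callable[[TaskContext], TaskContext]] = [
    voice_agent, food_ordering_agent, payment_agent, delivery_agent,
]


def run_pipeline(utterance: str) -> TaskContext:
    """Call each agent in order, exactly as the user command dictates."""
    ctx = TaskContext(utterance)
    for agent in PIPELINE:
        ctx = agent(ctx)
    return ctx


result = run_pipeline("I'm tired, help me buy a cup of coffee.")
print(result.log, result.data)
```

Nothing in this pipeline starts on its own; it runs only when the user issues a command, which matches the "enhanced command execution" character of L2.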

Currently, Agent applications essentially respond to and execute a user's explicit, complex commands. The cockpit system does not do anything "actively"; it just "completes the tasks assigned by the user" more intelligently.

Case (1): IM Motors released the "IM AIOS Ecological Cockpit" jointly developed with Banma Zhixing. This cockpit is the first to implement Alibaba's ecosystem services in the form of AI Agents, creating a "No Touch & No App" human-vehicle interaction mode. The "AI Food Ordering Agent" and "AI Ticketing Agent" functions launched with the IM AIOS Ecological Cockpit allow users to complete food selection or ticketing and payment through voice interaction alone, without manual operation.

Case (2): On August 4, 2025, Denza officially launched the "Car Life Agent" intelligent service system at its brand press conference. The system debuts on two flagship models, the Denza Z9 and Z9GT. The "Car Life Agent" supports voice food ordering and enables payment by face using face recognition technology. After the order is completed, the system automatically plans the navigation route, forming a seamless "demand-service-closed loop" experience.

In the next level of intelligent cockpits, Agent applications will change from "you say, I do" to "I watch, I guess, I suggest, let's do it together". Users do not need to issue any explicit command. If they simply sigh and rub their temples, the system can use the large model to jointly evaluate data from the camera (tired micro-expressions), biological sensors (heart rate changes), navigation data (continuous driving for 2 hours), and the time (3 pm, the afternoon sleepiness period), and conclude that "the user is in the tired period of long-distance driving and needs to rest and refresh". Based on this, the system proactively initiates interaction: "You seem to need a rest. There is a service zone* kilometers ahead with your favorite ** coffee. Do you need me to turn on the navigation? At the same time, I can play refreshing music for you." After the user agrees, the system then calls navigation, entertainment, and other Agent tools.
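The proactive flow above (observe, infer, suggest, act on consent) can be sketched with a rule-based stand-in for the large model. All thresholds, field names, and the `CabinSnapshot`, `propose`, and `act` names are assumptions made for illustration; the report does not specify how the signals are weighted.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class CabinSnapshot:
    """One reading from each source named in the scenario (hypothetical)."""
    tired_expression: bool   # camera: tired micro-expressions
    heart_rate_delta: float  # biological sensor: change from baseline, bpm
    driving_hours: float     # navigation data: continuous driving time
    hour_of_day: int         # clock, 0-23


def fatigue_evidence(s: CabinSnapshot) -> int:
    """Count how many independent signals point to fatigue."""
    return sum([
        s.tired_expression,
        abs(s.heart_rate_delta) >= 10,  # assumed threshold
        s.driving_hours >= 2.0,         # "continuous driving for 2 hours"
        13 <= s.hour_of_day <= 15,      # assumed afternoon-sleepiness window
    ])


def propose(s: CabinSnapshot, min_evidence: int = 3) -> Optional[str]:
    """Suggest a rest only when enough signals agree; never act directly."""
    if fatigue_evidence(s) >= min_evidence:
        return "You seem to need a rest. Shall I navigate to a service area?"
    return None


def act(s: CabinSnapshot, user_agrees: bool) -> List[str]:
    """Call agent tools only after the user accepts the suggestion."""
    if propose(s) is None or not user_agrees:
        return []
    return ["navigation", "entertainment"]


tired = CabinSnapshot(True, 12.0, 2.0, 15)
print(propose(tired))
print(act(tired, user_agrees=True))
```

The difference from the L2 pipeline is the trigger: here the system starts from fused observations, yet still waits for the user's agreement before calling any tool.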

Foreword

Related Definitions

1 Overview of Multimodal Interaction in Automotive Cockpits

1.1 Development Stages of Intelligent Cockpits
1.2 Definition of Multimodal Interaction
1.3 Development System of Multimodal Interaction
1.4 Introduction to Core Interaction Modality Technologies (1): Haptic Interaction
Mainstream Contact Haptic Vibration Feedback Technology
Core Application Scenarios of Haptic Interaction
1.4 Introduction to Core Interaction Modality Technologies (2): Auditory Interaction
Core Application Scenarios of Voice Interaction
Voice Interaction - Voiceprint Recognition
1.4 Introduction to Core Interaction Modality Technologies (3): Visual Interaction
Visual Interaction: Face Recognition Technology Roadmap
Visual Interaction: DMS Technology Roadmap
Visual Interaction: Gesture Recognition Technology Roadmap
1.4 Introduction to Core Interaction Modality Technologies (4): Olfactory Interaction
1.5 Application Scenarios of Large Models in Intelligent Cockpits
1.6 Vehicle-Human Interaction Functions Based on Multimodal AI Large Models
1.7 Industry Chain of Multimodal Interaction
1.8 Industry Chain of Multimodal AI Large Models
1.9 Policy Environment for Multimodal Interaction
Summary of Regulations Concerning Network Data Security of Intelligent Connected Vehicles
Laws and Regulations on Multimodal Interaction (1): Data Security Law
Laws and Regulations on Multimodal Interaction (2): Several Provisions on the Administration of Automobile Data Security
Laws and Regulations on Multimodal Interaction (3): Measures for Security Assessment of Data Outbound Transfer
Latest Mandatory National Standards for Multimodal Interaction
1.10 Installation of Interaction Modalities in Cockpits
Installations & Installation Rate of In-vehicle Voice Recognition, 2025
Installations & Installation Rate of Vehicle External Voice Interaction, 2025
Installations & Installation Rate of In-vehicle Gesture Recognition, 2025
Installations & Installation Rate of Voice + Gesture Fusion Interaction, 2025
Installations & Installation Rate of In-vehicle Biometric Recognition, 2025
Installations & Installation Rate of In-vehicle DMS, 2025
Installations & Installation Rate of In-vehicle OMS, 2025

2 Summary of Patents Related to Automotive Multimodal Interaction

2.1 Summary of Patents Related to Haptic Interaction
Cockpit Haptic Interaction Patents
2.2 Summary of Patents Related to Auditory Interaction
Summary of Automotive Voice Interaction Patents (1): Automakers
Summary of Automotive Voice Interaction Patents (2): Suppliers
Summary of Automotive Voice Interaction Patents (3): Universities/Research Institutions
2.3 Summary of Patents Related to Visual Interaction
Patents Related to Gesture Recognition
Patents Related to Emotion Recognition
Patents Related to In-Cabin Monitoring (1): IMS (In-Cabin Monitoring System)
Patents Related to In-Cabin Monitoring (2): DMS (Driver Monitoring System)
Patents Related to In-Cabin Monitoring (3): OMS (Occupant Monitoring System)
In-Cabin Eye Tracking & Payment by Face
2.4 Summary of Patents Related to Olfactory Interaction
Summary of Patents Related to In-vehicle Fragrance System
2.5 Summary of Patents Related to Other Featured Interaction Modalities
Patents Related to Fingerprint Recognition
Patents Related to Heart Rate Recognition
Patents Related to Iris Recognition
Patents Related to Bioelectromyography Recognition

3 Multimodal Interaction Cockpit Solutions of OEMs

3.1 BYD
HMI Functions of BYD's Previous-Generation Intelligent Cockpit Systems
BYD's New-Generation DiLink Intelligent Cockpit
Featured Multimodal Interaction Applications of BYD DiLink Intelligent Cockpit
BYD Integrates DeepSeek R1 & Tongyi Series Large Models to Enhance Interaction Capabilities
BYD Launches Car Life Agent, Supporting Voice Ordering + Payment by Face
Summary of BYD's Interaction Modality OTA Content in Recent Years
Summary of Denza's Interaction Modality OTA Content in Recent Years
Summary of Fangchengbao's Interaction Modality OTA Content in Recent Years
Summary of Yangwang's Interaction Modality OTA Content in Recent Years
3.2 SAIC IM Motors
HMI Functions of Previous-Generation Intelligent Cockpit Systems
IM AIOS Cockpit Pioneers A Vehicle-Human Interaction Mode: "No Touch & No App"
Core Multimodal Interaction: Voice Interaction
Summary of Interaction Modality OTA Content in Recent Years
3.3 FAW Hongqi
HMI Functions of Previous-Generation Intelligent Cockpit Systems
The New "Lingshi Cockpit" Features Audio-Visual Interaction
Summary of HMI Functions of Lingshi Cockpit
3.4 Geely
HMI Functions of Geely's Previous-Generation Intelligent Cockpit Systems
HMI Functions of Lynk & Co's Previous-Generation Intelligent Cockpit Systems
HMI Functions of Zeekr's Previous-Generation Intelligent Cockpit Systems
Geely's AI Intelligent Cockpit Strategy: Fully Entering the AI Era to Achieve "One Geely, One Cockpit"
Geely's AI Intelligent Cockpit Technical Architecture
Geely's New-Generation AI Cockpit Operating System Flyme Auto 2 Leads Cockpit Interaction into a New Experience of "Services Finding Users"
Geely Launches Multimodal Agent Eva to Perceive User Emotions and Provide Proactive Care
The Intelligent Cockpit System of the Latest Zeekr AI OS 7 Launches AI Eva Agent
Summary of Geely's Interaction Modality OTA Content in Recent Years
Summary of Lynk & Co's Interaction Modality OTA Content in Recent Years
Summary of Zeekr's Interaction Modality OTA Content in Recent Years
3.5 Great Wall Motor
HMI Functions of WEY's Previous-Generation Intelligent Cockpit Systems
Coffee OS 3 Smart Space System
Coffee OS 3.1 Upgrades Voice Interaction Functions and Supports Digital Health Applications in IVI Linkage
Coffee OS 3.3 Continuously Optimizes Voice Interaction Functions
Summary of WEY's Interaction Modality OTA Content in Recent Years
Summary of Tank's Interaction Modality OTA Content in Recent Years
3.6 Chery
HMI Functions of Previous-Generation Intelligent Cockpit Systems
Lion Tech Intelligent Cockpit
Chery Cooperated with SenseTime to Build the Next-Generation AIOS, Enabling Proactive Services and Emotional Companionship in Intelligent Cockpits
Summary of Interaction Modality OTA Content in Recent Years
3.7 Changan
HMI Functions of Previous-Generation Intelligent Cockpit Systems
Tianshu Intelligent Cockpit Enhances Vehicle-Human Interaction and Health Protection Function Experiences
Summary of Interaction Modality OTA Content in Recent Years
3.8 Voyah
HMI Functions of Previous-Generation Intelligent Cockpit Systems
Xiaoyao Cockpit 2.0 Upgrades Five-Sense and Intelligent Experiences
Summary of Multimodal Interaction Capabilities of Xiaoyao Cockpit
Summary of Interaction Modality OTA Content in Recent Years
3.9 Li Auto
HMI Functions of Previous-Generation Intelligent Cockpit Systems
Intelligent Cockpit 7.0: Fully Upgrades the Lixiang Tongxue Function Based on Mind GPT
Intelligent Cockpit 7.4: Upgrades the Lixiang Tongxue Life Assistant Agent to Realize Food Delivery Ordering Function
Intelligent Cockpit 8.0: Fully Upgrades Lixiang Tongxue to Lixiang Tongxue Agent
Summary of Interaction Modality OTA Content in Recent Years
3.10 NIO
HMI Functions of Previous-Generation Intelligent Cockpit Systems
Featured Interaction: NOMI Voice Interaction System
Summary of Interaction Modality OTA Content in Recent Years
Summary of ONVO's Interaction Modality OTA Content in Recent Years
Summary of Firefly's Interaction Modality OTA Content in Recent Years
3.11 Leapmotor
HMI Functions of Previous-Generation Intelligent Cockpit Systems
Leapmotor OS 4.0 PLUS Intelligent Cockpit System Equipped with Dual AI Large Voice Models
Leapmotor Cooperated with Unity China to Create a New HMI Experience for the Next-Generation Intelligent Cockpit
Summary of Interaction Modality OTA Content in Recent Years
3.12 Xpeng
HMI Functions of Previous-Generation Intelligent Cockpit Systems
VLM Large Model Defines the Next-Generation Intelligent Cockpit Interaction Experience
Featured Multimodal Interaction Functions
Summary of Interaction Modality OTA Content in Recent Years
3.13 Xiaomi
Hyper Intelligent Cockpit
"Super Xiaoai" Multimodal Fusion Application
Xiaomi Adds AI Spatial Interaction Sensors to Achieve Air Gesture Control
Summary of Interaction Modality OTA Content in Recent Years
3.14 BMW
HMI Functions of Previous-Generation Intelligent Cockpit Systems
Panoramic iDrive Equipped with Ultra-Sensitive Quality Control Steering Wheel and AI Large Language Model
Typical Models with In-Vehicle Infotainment Systems: All-New BMW iX3

4 Multimodal Cockpit Solutions of Suppliers

4.1 Desay SV
Profile
Development Strategy
Multimodal Interaction Solution: Smart Solution 3.0
Multimodal Interaction Solution: Smart Solution 3.0 Innovative Scenario Applications
Desay SV and ModelBest Jointly Released On-device Large Model Voice Interaction Solution
4.2 Joyson Electronics
Profile
Evolution and Definition of JOYNEXT Intelligent Cockpit
JoySpace+ Immersive Intelligent Cockpit Solution Integrates Multiple Innovative Multimodal Interaction Technologies
JoySpace+ Immersive Intelligent Cockpit Solution: Sensory Interaction
JoySpace+ Immersive Intelligent Cockpit Solution: Light and Shadow Space
JoySpace+ Immersive Intelligent Cockpit Solution: Smart Space
4.3 SenseTime
Profile of SenseTime
SenseAuto Intelligent Cockpit Product System
Models-as-a-Service (MaaS) of SenseAuto On-device Multimodal Large Model
Open Model Atomic Capabilities of SenseAuto On-device Multimodal Large Model (1): Full-Cabin Scenario Perception
Open Model Atomic Capabilities of SenseAuto On-device Multimodal Large Model (2): Multimodal Fusion Capabilities
Open Model Atomic Capabilities of SenseAuto On-device Multimodal Large Model (3): Multi-Image Perception Capabilities
SenseAuto Multimodal Interaction Application Cases
4.4 iFLYTEK
Profile
Full-Stack Intelligent Interaction Technology
Spark Smart Cockpit
Spark Smart Cockpit 2.0
Spark Smart Cockpit 2.0: Applications
Characteristics of Multimodal Perception System: Safety Protection, Personalized Interaction, Multimodal Interaction
Multimodal Interaction Becomes a Key Direction of iFLYTEK Super Brain 2030 Plan
4.5 Thundersoft
Profile
AIDV Roadmap
Device-Edge-Cloud AI Cockpit Solution Creates Full-Link Multimodal Services
AIBOX+AIOS Integrated Solution Deeply Integrates Multimodal Interaction
AquaDrive OS 1.0 Evo, Offering Innovative Cockpit Interaction Experience
4.6 Huawei
Profile
HarmonyOS Evolution History
HarmonySpace 5 Achieves Immersive Interaction Based on Five-Sense Synergy Technology
HarmonySpace Reshapes Multimodal Interaction Functions Based on Qianwu Large Model
Qianwu Interaction Characteristics (1): Enhances Xiaoyi Voice Capabilities + In-vehicle Sensing + Visual Perception Capabilities to Achieve Unconscious Interaction
Qianwu Interaction Characteristics (2): Supports Millimeter-Level Precise Perception and Full-Cabin Multimodal Human Body Perception
Featured Interaction Function: Multimodal Monitoring System to Create Driver Incapacitation Assistance Function
4.7 Baidu
Profile
Apollo Super Cockpit: Building Agents with Full-Sense Fusion, Global Planning and Full-Domain Execution
Intelligent Cockpit Deeply Integrates End-to-End Cross-Modal AI Voice
Intelligent Cockpit Deeply Integrates End-to-End Cross-Modal AI Voice: Launches Xiaodu Ideal Agent
Intelligent Cockpit Deeply Integrates End-to-End Cross-Modal AI Voice: In-vehicle Application Case
4.8 Banma Zhixing
Profile
Banma Zhixing Released Intelligent Cockpit AI Technology Brand - Yan AI
Yan AI Released "One Rocket, Ten Satellites" Interactive Agents
Banma Zhixing Released Yan AI Hybrid E2E Framework
Banma Zhixing First Launched Full-Modal On-device Large Model Vehicle-Mounted Solution AutoOmni
Banma Zhixing Initiated the "AI Vehicle-Mounted Platform Service Ecological Alliance" with Ecosystem Partners

5 Application Cases of Multimodal Interaction Solutions for Typical Vehicle Models

5.1 Summary of Application Cases of Multimodal Interaction Solutions for Typical Vehicle Models (1)
5.1 Summary of Application Cases of Multimodal Interaction Solutions for Typical Vehicle Models (2)
5.1 Summary of Application Cases of Multimodal Interaction Solutions for Typical Vehicle Models (3)
5.1 Summary of Application Cases of Multimodal Interaction Solutions for Typical Vehicle Models (4)
5.1 Summary of Application Cases of Multimodal Interaction Solutions for Typical Vehicle Models (5)
5.2 All-New IM L6: Panoramic Summary of Multimodal Interaction Functions
5.2 All-New IM L6: Analysis of Featured Modal Interaction Capabilities
5.3 Fangchengbao Bao 8: Panoramic Summary of Multimodal Interaction Functions
5.3 Fangchengbao Bao 8: Analysis of Featured Modal Interaction Capabilities
5.4 Hongqi Jinkuihua Guoya: Panoramic Summary of Multimodal Interaction Functions
5.4 Hongqi Jinkuihua Guoya: Analysis of Featured Modal Interaction Capabilities (1)
5.4 Hongqi Jinkuihua Guoya: Analysis of Featured Modal Interaction Capabilities (2)
5.4 Hongqi Jinkuihua Guoya: Analysis of Featured Modal Interaction Capabilities (3)
5.5 Denza N9: Panoramic Summary of Multimodal Interaction Functions
5.5 Denza N9: Analysis of Featured Modal Interaction Capabilities (1)
5.5 Denza N9: Analysis of Featured Modal Interaction Capabilities (2)
5.6 Zeekr 9X: Panoramic Summary of Multimodal Interaction Functions
5.6 Zeekr 9X: Analysis of Featured Modal Interaction Capabilities
5.7 Geely Galaxy A7: Panoramic Summary of Multimodal Interaction Functions
5.8 Leapmotor B10: Panoramic Summary of Multimodal Interaction Functions
5.9 Li i6: Panoramic Summary of Multimodal Interaction Functions
5.9 Li i6: Analysis of Featured Modal Interaction Capabilities (1)
5.9 Li i6: Analysis of Featured Modal Interaction Capabilities (2)
5.10 Xpeng G7: Panoramic Summary of Multimodal Interaction Functions
5.10 Xpeng G7: Analysis of Featured Modal Interaction Capabilities
5.11 Xiaomi YU7: Panoramic Summary of Multimodal Interaction Functions
5.11 Xiaomi YU7: Analysis of Featured Modal Interaction Capabilities
5.12 MAEXTRO S800: Panoramic Summary of Multimodal Interaction Functions
5.12 MAEXTRO S800: Analysis of Featured Modal Interaction Capabilities (1)
5.12 MAEXTRO S800: Analysis of Featured Modal Interaction Capabilities (2)
5.12 MAEXTRO S800: Analysis of Featured Modal Interaction Capabilities (3)
5.13 2025 AITO M9: Panoramic Summary of Multimodal Interaction Functions
5.13 2025 AITO M9: Analysis of Featured Modal Interaction Capabilities (1)
5.13 2025 AITO M9: Analysis of Featured Modal Interaction Capabilities (2)
5.13 2025 AITO M9: Analysis of Featured Modal Interaction Capabilities (3)
5.13 2025 AITO M9: Analysis of Featured Modal Interaction Capabilities (4)
5.14 All-New BMW X3 M50: Panoramic Summary of Multimodal Interaction Functions
5.14 All-New BMW X3 M50: Analysis of Featured Modal Interaction Capabilities
5.15 2026 Audi E5 Sportback: Panoramic Summary of Multimodal Interaction Functions
5.15 2026 Audi E5 Sportback: Analysis of Featured Modal Interaction Capabilities (1)
5.15 2026 Audi E5 Sportback: Analysis of Featured Modal Interaction Capabilities (2)
5.16 All-New Mercedes-Benz Electric CLA: Panoramic Summary of Multimodal Interaction Functions
5.16 All-New Mercedes-Benz Electric CLA: Analysis of Featured Modal Interaction Capabilities

6 Summary and Development Trends of Multimodal Interaction

6.1 Summary of Large Model Configuration Parameters of OEMs
6.2 Trend 1: Evolution of Multimodal Interaction Based on AI Large Models
Vehicle Scenario Applications Under Multimodal Integration
Cases
6.3 Trend 2
Cockpit Scenario Application Cases
Application Cases
6.4 Trend 3 (Voice Interaction)
6.5 Trend 4 (Visual Interaction)