Cockpit AI Agent Research Report, 2024
Product code: 1518881
Publisher: ResearchInChina
Publication date: July 2024
Pages: 200 (English)

License & Pricing (VAT excluded)
US $2,500 (₩3,637,000): Unprintable PDF (Single User License)
A license for one user of the PDF report. Printing is not allowed, and copying and pasting text is disabled.
US $4,200 (₩6,110,000): Printable & Editable PDF (Enterprise-wide License)
A license for all members of the same company to use the PDF report. Printing is allowed, and printed copies may be used within the same scope as the PDF.


Through research and analysis of China's automotive industry, this report provides information on the technologies and applications of cockpit AI Agents.


Cockpit AI Agent: Autonomous scenario creation becomes the first step toward personalized cockpits

In AI Foundation Models' Impacts on Vehicle Intelligent Design and Development Research Report, 2024, ResearchInChina noted that an AI Agent uses a large language model (LLM) as its core computing engine (LLM OS). In the AI service framework, the LLM acts as the AI core and the Agent acts as an AI app. With the help of the reasoning and generation capabilities of AI foundation models, the Agent can create more cockpit scenarios while further improving multimodal interaction, voice processing and other technologies in the cockpit.

Compared with a traditional OS, an AI OS is programmed in natural language, which makes it better suited to human operation and more convenient for customization and personalized human-computer interaction. The Agent responds to user prompts or instructions in natural language and can perform a wide range of tasks through its interaction with physical or digital environments.

Foundation models spawn Agent customized scenarios

Since AI foundation models were first installed in cars, they have been widely used in multimodal interaction scenarios. OEMs have started from voice assistants to further improve the intelligent processing of user commands:

Before foundation models got on cars, the wake-up rate and the share of effective instructions of cockpit voice assistants were at a low level;

After foundation models became available in cars, not only has voice recognition become more intelligent, but applications such as expression monitoring and gesture control have also improved, with effective commands and daily active rates increasing significantly.

AI foundation models further enhance the user experience of smart cockpits through strong context processing and reasoning capabilities. Please refer to the following dialogue:

(Mobility scenario)

User: Tongxue! (the name of a voice assistant)
Tongxue: I am here!
User: There seems to be a prairie ahead. Please take a picture for me!
Tongxue: Wait a moment, and I will capture the beautiful scenery of the prairie for you. I'll let you know when the photo is taken. Is there anything else I can help you with?
User: Send it to Moments immediately. I want my friends to have a look at this beautiful scenery.
Tongxue: Ok. Using your authorized account, please choose the material and filter style.
User: Just this one, with a caption.
Tongxue: Ok, three captions have been generated for you. Which one do you like?
User: Just this one.

(You enter the office and receive a phone call)

Tongxue: There is a phone call from a customer, and I have transferred it to the IVI for you. Would you like to answer it?
User: Yes, and turn on office mode.
Tongxue: Ok, I will generate an avatar of the customer for you and start the conversation. Do you have any questions or things to discuss? I will record the conversation and provide you with the minutes of the meeting after it is over.

(The avatar is generated)

Tongxue: Now you can start the conversation.

The above scenarios only began to materialize in the cockpit in 2024, when some OEMs installed foundation models on vehicles.

For example, the IM L6 has built Carlog and City Drive scenarios to enable AI foundation models to proactively recommend food and attractions and allow users to post them on social media:

Carlog: actively perceives the scene during driving through an AI vision foundation model, mobilizes four cameras to take photos, automatically saves and edits them, and supports one-click sharing to Moments.

City Drive: cooperates with Volcengine to model nearby food, scenic spots and landmarks in real time on the digital screen, and pushes them according to users' habits and preferences.

The applicability of foundation models in various scenarios has stimulated users' demand for intelligent agents that can uniformly manage cockpit functions. In 2024, OEMs such as NIO, Li Auto, and Hozon successively launched Agent frameworks, using voice assistants as the starting point to manage functions and applications in cockpits.

Agent service frameworks can not only manage cockpit functions in a unified way, but also provide richer scenario modes according to customers' needs and preferences, in particular supporting user-customized scenarios, which accelerates the advent of the cockpit personalization era.

For example, NIO's NOMI GPT allows users to set up an AI scenario with just one sentence.

Core competence of cockpit Agents

AI Agents in the era of foundation models are built on LLMs: their powerful reasoning expands the scenarios in which AI Agents can be applied, and the feedback the Agents obtain during operation can in turn improve the thinking capability of the foundation models. In the cockpit, the Agent capability paradigm can be roughly divided into "Understanding" + "Planning" + "Tool Use" + "Reflection".
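The four-part paradigm can be sketched as a minimal agent loop. This is an illustrative sketch only, not any vendor's implementation; all class, function and tool names here are hypothetical:

```python
from dataclasses import dataclass, field

@dataclass
class Step:
    tool: str
    args: dict

@dataclass
class Memory:
    history: list = field(default_factory=list)
    def record(self, command, plan, results):
        # Reflection: keep feedback from each run so future runs can improve
        self.history.append((command, plan, results))

def understand(command: str) -> str:
    # Understanding: a stand-in for LLM intent parsing
    return "take_photo" if "picture" in command else "unknown"

def plan(intent: str) -> list:
    # Planning: map the intent to a sequence of tool-call steps
    if intent == "take_photo":
        return [Step("camera", {"mode": "landscape"}),
                Step("gallery", {"action": "save"})]
    return []

def run_agent(command: str, tools: dict, memory: Memory):
    intent = understand(command)                         # Understanding
    steps = plan(intent)                                 # Planning
    results = [tools[s.tool](**s.args) for s in steps]   # Tool Use
    memory.record(command, steps, results)               # Reflection feedback
    return results

tools = {
    "camera": lambda mode: f"photo:{mode}",
    "gallery": lambda action: f"gallery:{action}",
}
memory = Memory()
out = run_agent("Please take a picture!", tools, memory)
```

In a real cockpit, `understand` and `plan` would be LLM calls and the tools would be vehicle functions; the loop structure is what the paradigm describes.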

When Agents first get into cars, cognitive and planning abilities matter most. The understanding of task goals and the choice of implementation paths directly determine the accuracy of the results, which in turn affects the scenario utilization rate of Agents.

For example, in Xiaomi's voice interaction process, semantic understanding is the hardest part of the entire automotive voice processing pipeline; XiaoAi handles it with a semantic parsing model.

After Agents enter mass production, personalized cockpits that allow users to customize scenario modes become the highlight, and Reflection becomes the most important core competence at this stage, so it is necessary to build an Agentic Workflow that constantly learns and optimizes.

For example, Lixiang Tongxue, offered by Li Auto, supports the creation of one-sentence scenarios. It is backed by Mind GPT's built-in memory network and online reinforcement learning capabilities: Mind GPT can remember personalized preferences and habits based on historical conversations. When similar scenarios recur, it can automatically set scenario parameters from historical data to fit the user's original intent.
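The idea of reusing remembered parameters for a recurring scenario can be sketched as follows. This is a hypothetical illustration of the mechanism described above, not Mind GPT's actual design:

```python
# Memory-backed scenario parameterization: preferences learned from earlier
# interactions override the defaults when a similar scenario recurs.

class ScenarioMemory:
    def __init__(self):
        self._prefs = {}

    def remember(self, scenario: str, params: dict):
        # Merge newly observed preferences into the stored profile
        self._prefs.setdefault(scenario, {}).update(params)

    def configure(self, scenario: str, defaults: dict) -> dict:
        # Stored preferences take precedence over factory defaults
        return {**defaults, **self._prefs.get(scenario, {})}

mem = ScenarioMemory()
# Learned once from a past conversation ("recline my seat, 24 degrees")
mem.remember("nap_mode", {"seat": "recline", "ac_temp": 24})
# The next time "nap mode" is requested, defaults are overridden
settings = mem.configure("nap_mode",
                         {"seat": "upright", "ac_temp": 22, "music": "off"})
```

A production system would match "similar" scenarios with a learned model rather than an exact key, but the merge of remembered parameters over defaults is the core step.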

At the AI OS architecture level, take SAIC Z-One as an example:

At the kernel layer, Z-One integrates an LLM kernel (LLM OS) alongside the original microkernel, exposing interfaces to the AI OS SDK and the ASF respectively; the AI OS SDK is scheduled by the LLM to drive the Agent service framework at the application layer. The Z-One AI OS architecture highly integrates AI with the CPU, and through SOA atomic services the AI is connected to the vehicle's sensors, actuators and controllers. Built on a terminal-cloud foundation model, this architecture can strengthen the computing power of the terminal-side foundation model and reduce operational latency.
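The role of SOA atomic services can be illustrated with a small sketch: each vehicle capability is exposed as a small, uniformly addressable service, and the AI layer reaches sensors and actuators only through that service bus. All names here are hypothetical, not Z-One's actual API:

```python
# Minimal sketch of SOA atomic services behind an AI layer.

class AtomicService:
    def __init__(self, name, handler):
        self.name = name
        self.handler = handler

class ServiceBus:
    def __init__(self):
        self._services = {}

    def register(self, service: AtomicService):
        self._services[service.name] = service

    def call(self, name: str, **kwargs):
        # The AI layer invokes vehicle functions only through this bus,
        # never by touching hardware directly.
        return self._services[name].handler(**kwargs)

bus = ServiceBus()
bus.register(AtomicService("seat.heat", lambda level: f"seat heating set to {level}"))
bus.register(AtomicService("cabin.temp", lambda celsius: f"cabin temp set to {celsius}C"))
result = bus.call("seat.heat", level=2)
```

Because every capability sits behind the same call interface, an LLM-driven planner can compose arbitrary scenarios out of the registered atomic services.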

Application Difficulty of Cockpit AI Agents

Agents connect users to the vehicle and execute their commands. In application, beyond the technical difficulty of putting foundation models on cars, they also face scenario difficulties. Across the pipeline of command reception, semantic analysis, intention reasoning and task execution, the accuracy of the results and the latency of human-computer interaction directly affect the user's riding experience.
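The four-stage pipeline, with latency measured end to end, can be sketched as below. The stage functions are trivial stand-ins for what would be model calls in a real cockpit; all names are hypothetical:

```python
import time

def semantic_analysis(raw: str) -> dict:
    # Stand-in for speech/text parsing
    return {"tokens": raw.lower().split()}

def intention_reasoning(parsed: dict) -> str:
    # Stand-in for LLM intent inference
    return "open_window" if "window" in parsed["tokens"] else "unknown"

def execute(intent: str) -> str:
    # Stand-in for dispatching a vehicle function
    return {"open_window": "window opened"}.get(intent, "cannot comply")

def handle_command(raw: str):
    start = time.perf_counter()
    parsed = semantic_analysis(raw)        # command reception + analysis
    intent = intention_reasoning(parsed)   # intention reasoning
    result = execute(intent)               # task execution
    latency_ms = (time.perf_counter() - start) * 1000
    return result, latency_ms

result, latency_ms = handle_command("open the window please")
```

An error at any stage propagates to the final result, and each stage adds latency, which is why both accuracy and delay shape the riding experience.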

Humanization of interaction

For example, in the "emotional consultant" scenario, Agents should resonate emotionally with car owners and behave anthropomorphically. Generally, there are three forms of anthropomorphism in AI Agents: physical anthropomorphism, personality anthropomorphism, and emotional anthropomorphism.

Foundation model performance

In the "encyclopedia Q&A" scenario, Agents may be unable to answer users' questions accurately, especially open-ended ones, because of LLM hallucination arising during semantic analysis, database search, answer generation and the like.

Current solutions include advanced prompting, RAG plus knowledge graphs, ReAct, and CoT/ToT, but none of them can completely eliminate hallucination. In the cockpit, external databases, RAG, self-consistency and other methods are more often used to reduce the frequency of hallucinations.
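Among the methods mentioned, self-consistency is simple to sketch: sample several candidate answers and keep the majority vote, so that an occasional hallucinated sample is outvoted. The sampler here is mocked; in practice it would be repeated LLM calls at nonzero temperature:

```python
from collections import Counter

def self_consistent_answer(sample_fn, question: str, n: int = 5) -> str:
    # Draw n candidate answers and return the most frequent one
    answers = [sample_fn(question) for _ in range(n)]
    return Counter(answers).most_common(1)[0][0]

# Mocked sampler: one of the five samples "hallucinates"
samples = iter(["Paris", "Paris", "Lyon", "Paris", "Paris"])
answer = self_consistent_answer(lambda q: next(samples), "Capital of France?")
```

The cost is n model calls per question, which is one reason cockpit systems combine it with cheaper mitigations such as external databases.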

Some foundation model vendors have improved on the above solutions. For example, Meta has proposed reducing hallucination through Chain-of-Verification (CoVe). This method breaks fact-checking down into more detailed sub-questions to improve response accuracy, mirroring a human-driven fact-checking process, and can effectively improve the FACTSCORE metric in long-form generation tasks.
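The CoVe idea (draft, plan verification questions, check each claim, revise) can be sketched as below. The fact source and every function here are mocked for illustration; the real method issues LLM calls at each stage rather than dictionary lookups:

```python
# Toy Chain-of-Verification flow: draft -> plan checks -> verify -> revise.
# The "fact source" stands in for the verification LLM calls.

FACTS = {"IM L6 maker": "IM Motors", "NOMI GPT maker": "NIO"}

def draft(question: str) -> dict:
    # Baseline draft that contains one hallucinated claim
    return {"IM L6 maker": "NIO", "NOMI GPT maker": "NIO"}

def plan_verifications(answer: dict) -> list:
    # One fact-checking sub-question per claim in the draft
    return list(answer)

def verify_and_revise(answer: dict, checks: list) -> dict:
    # Replace any claim that the fact source contradicts
    return {k: FACTS.get(k, answer[k]) for k in checks}

q = "Who makes the IM L6 and NOMI GPT?"
final = verify_and_revise(draft(q), plan_verifications(draft(q)))
```

The key design point is that each sub-question is checked independently of the draft, so a single hallucinated claim can be caught and corrected without discarding the rest of the answer.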

Table of Contents

1 Introduction to Cockpit AI Agent

2 AI Agent Technology Implementation Path

3 Application Analysis of Cockpit AI Agent of Suppliers

4 Application Analysis of Cockpit AI Agent of OEMs

Global Information, Inc. (Korea) 02-2025-2992 kr-info@giikorea.co.kr
ⓒ Copyright Global Information, Inc. All rights reserved.