Research Report on Automotive Memory Chip Industry and Its Impact on Foundation Models, 2025
Product Code: 1721396
Publisher: ResearchInChina
Publication Date: April 2025
Pages: 580 (English)
License & Price (VAT excluded)
US $4,500 / ₩6,444,000
Unprintable PDF (Single User License)
A license for a single user of the PDF report. Printing is not permitted, nor is copying and pasting of the text.
US $6,800 / ₩9,738,000
Printable & Editable PDF (Enterprise-wide License)
A license for all members of the same company to use the PDF report. Printing is permitted, and printed copies may be used within the same scope as the PDF.



This report investigates and analyzes China's automotive memory chip industry, covering development trends, application trends, and technology trends of automotive memory chips, as well as wafer manufacturers and product manufacturers.



Research on automotive memory chips: driven by foundation models, the performance requirements and costs of automotive memory chips are rising sharply.

From 2D+CNN small models to BEV+Transformer foundation models, the number of model parameters has soared, making memory a performance bottleneck.

The global automotive memory chip market is expected to be worth over USD17 billion in 2030, compared with about USD4.3 billion in 2023, with a CAGR of up to 22% during the period. Automotive memory chips took an 8.2% share of automotive semiconductor value in 2023, a figure projected to rise to 17.4% in 2030, indicating a substantial increase in memory chip costs.
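As a quick sanity check, the implied growth rate can be recomputed from the 2023 and 2030 endpoints quoted above (a minimal sketch; only the two market-size figures are taken from the report):

```python
# Implied CAGR from the 2023 and 2030 market-size estimates.
start_value = 4.3    # USD billion, 2023
end_value = 17.0     # USD billion, 2030 ("over USD 17 billion")
years = 2030 - 2023

cagr = (end_value / start_value) ** (1 / years) - 1
print(f"Implied CAGR: {cagr:.1%}")   # ~21.7%, consistent with the stated "up to 22%"
```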

The main driver for the development of the automotive memory chip industry lies in the rapid rise of automotive LLMs. From the previous 2D+CNN small models to BEV+Transformer foundation models, the number of model parameters has significantly increased, leading to a surge in computing demands. CNN models typically have fewer than 10 million parameters, while foundation models (LLMs) generally range from 7 billion to 200 billion parameters. Even after distillation, automotive models can still have billions of parameters.
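To put these parameter counts in storage terms, the sketch below estimates raw weight footprints at precisions commonly used on vehicles; the specific model sizes and precision choices are illustrative assumptions rather than figures from the report:

```python
# Approximate weight-storage footprint = parameter count x bytes per parameter.
BYTES_PER_PARAM = {"FP16": 2, "INT8": 1, "INT4": 0.5}

def footprint_gb(params: float, precision: str) -> float:
    """Weight footprint in GB (1 GB = 1e9 bytes)."""
    return params * BYTES_PER_PARAM[precision] / 1e9

examples = [
    ("typical CNN model", 10e6),          # fewer than 10 million parameters
    ("distilled automotive model", 3e9),  # "billions" of parameters (3B assumed for illustration)
    ("cloud foundation model", 70e9),     # lower end of the 7B-200B range
]
for name, params in examples:
    sizes = ", ".join(f"{p} {footprint_gb(params, p):.2f} GB" for p in BYTES_PER_PARAM)
    print(f"{name}: {sizes}")
```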

From a computing perspective, in BEV+Transformer foundation models (typically those with a LLaMA decoder architecture), the Softmax operator plays a core role. Because it parallelizes less well than traditional convolution operators, memory becomes the bottleneck. Memory-intensive models such as GPT in particular place high demands on memory bandwidth, so common autonomous driving SoCs on the market often run into the "memory wall" problem.

An end-to-end system essentially embeds a small LLM. As the amount of data fed in grows, the parameters of the foundation model will continue to increase: the initial model size is around 10 billion parameters, and through continuous iteration it will eventually exceed 100 billion.

On April 15, 2025, at its AI sharing event, XPeng disclosed for the first time that it is developing the XPeng World Foundation Model, a 72-billion-parameter ultra-large autonomous driving model. XPeng's experimental results show that the scaling law effect is evident in models with 1 billion, 3 billion, 7 billion, and 72 billion parameters: the larger the parameter scale, the greater the model's capabilities. For models of the same size, the more training data, the better the model's performance.

The main bottleneck in multimodal model training is not only GPUs but also the efficiency of data access. XPeng has independently developed underlying data infrastructure (Data Infra), increasing data upload capacity by 22 times, and data bandwidth by 15 times in training. By optimizing both GPU/CPU and network I/O, the model training speed has been improved by 5 times. Currently, XPeng uses up to 20 million video clips to train its foundation model, a figure that will increase to 200 million this year.

In the future, XPeng will deploy the XPeng World Foundation Model to vehicles by distilling small models in the cloud. The parameter scale of automotive foundation models will only continue to grow, posing significant challenges to computing chips and memory. To address this, XPeng has self-developed the Turing AI chip, which boasts utilization 20% higher than general automotive high-performance chips and can handle foundation models with up to 30B (30 billion) parameters. In contrast, Li Auto's current VLM (Vision-Language Model) has about 2.2 billion parameters.

More model parameters often come with higher inference latency. How to solve the latency problem is crucial. It is expected that the Turing AI chip may offer big improvements in memory bandwidth through multi-channel design or advanced packaging technology, so as to support the local operation of 30B-parameter foundation models.

Memory bandwidth determines the upper limit of inference computing speed. LPDDR5X is widely adopted but still falls short. GDDR7 and HBM may be put on the agenda.

Memory bandwidth determines the upper limit of inference computing speed. Assuming a foundation model has 7 billion parameters, at the INT8 precision used in automotive applications it occupies 7GB of storage. Tesla's first-generation FSD chip has a memory bandwidth of 63.5GB/s, meaning it generates one token every 110 milliseconds, for a frame rate of less than 10Hz, compared with the typical image frame rate of 30Hz in the autonomous driving field. Nvidia Orin, with a memory bandwidth of 204.5GB/s, generates one token every 34 milliseconds (7GB ÷ 204.5GB/s = 0.0343s, about 34ms), barely reaching 30Hz (frame rate = 1 ÷ 0.0343s = 29Hz). Note that this accounts only for the time required for data transfer and completely ignores the time spent on actual computation, so the real speed will be much lower.
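The back-of-the-envelope figures above can be reproduced as follows; as in the text, the sketch assumes every weight is streamed from memory once per token and ignores compute time entirely:

```python
# Memory-bound token-rate estimate: each token requires streaming all weights once.
MODEL_GB = 7.0   # 7B parameters at INT8, 1 byte per parameter

def token_stats(model_gb: float, bandwidth_gb_s: float):
    token_time_s = model_gb / bandwidth_gb_s   # data-transfer time per token
    return token_time_s, 1.0 / token_time_s    # (seconds per token, tokens per second)

for chip, bandwidth in [("Tesla FSD gen 1", 63.5), ("Nvidia Orin", 204.5)]:
    t, rate = token_stats(MODEL_GB, bandwidth)
    print(f"{chip}: {t * 1000:.0f} ms per token, ~{rate:.0f} tokens/s")
# Tesla FSD gen 1: ~110 ms (<10 Hz); Nvidia Orin: ~34 ms (~29 Hz), against a typical 30 Hz camera frame rate.
```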

DRAM Selection Path (1): LPDDR5X will be widely adopted, and the LPDDR6 standard is still being formulated.

Apart from Tesla, all current automotive chips only support up to LPDDR5. The next step for the industry is to promote LPDDR5X. For example, Micron has launched an LPDDR5X + DLEP DRAM automotive solution, which has passed ISO 26262 ASIL-D certification and meets critical automotive functional safety (FuSa) requirements.

Nvidia Thor-X already supports automotive LPDDR5X and the PCIe 5.0 interface, with memory bandwidth increased to 273GB/s. Thor-X-Super has an astonishing memory bandwidth of 546GB/s and uses 512-bit-wide LPDDR5X memory to ensure extremely high data throughput. In reality, the Super version, like Apple's chip series, simply integrates two X chips into one package, but it is not expected to enter mass production in the short term.
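Both bandwidth figures are consistent with LPDDR5X running at 8533 MT/s per pin. The sketch below recomputes them; the 8533 MT/s rate and the 256-bit width for Thor-X are assumptions inferred from the quoted numbers (only Thor-X-Super's 512-bit width is stated above):

```python
# Peak DRAM bandwidth = per-pin data rate x bus width / 8 bits per byte.
def peak_bandwidth_gb_s(mt_per_s: float, bus_bits: int) -> float:
    return mt_per_s * 1e6 * bus_bits / 8 / 1e9

print(f"Thor-X (assumed 256-bit LPDDR5X-8533): {peak_bandwidth_gb_s(8533, 256):.0f} GB/s")  # ~273 GB/s
print(f"Thor-X-Super (512-bit LPDDR5X-8533):   {peak_bandwidth_gb_s(8533, 512):.0f} GB/s")  # ~546 GB/s
```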

Thor comes in multiple versions, five of which are currently known: Thor-Super (2,000 TOPS), Thor-X (1,000 TOPS), Thor-S (700 TOPS), Thor-U (500 TOPS), and Thor-Z (300 TOPS). Lenovo's Thor central computing unit, the world's first, plans to adopt dual Thor-X chips.

Micron's 9600 MT/s LPDDR5X is already sampling, targeting mobile devices, with no automotive-grade products available yet. Samsung's new LPDDR5X product, the K3KL9L90DM-MHCU, targets high performance across PCs, servers, and vehicles, as well as emerging on-device AI applications. It delivers 1.25 times the speed and 25% better power efficiency of the previous generation, and has a maximum operating temperature of 105°C. Mass production started in early 2025. A single K3KL9L90DM-MHCU provides 8GB over a x32 bus; eight chips total 64GB.

As LPDDR5X gradually enters the era of 9600Mbps or even 10Gbps, JEDEC has started developing the next-generation LPDDR6 standard, targeting 6G communications, L4 autonomous driving, and immersive AR/VR scenarios. LPDDR6, the next-generation memory technology, is expected to reach data rates above 10.7Gbps and possibly up to 14.4Gbps, with bandwidth and energy efficiency improvements of about 50% over current LPDDR5X. However, mass production of LPDDR6 memory may not occur until 2026. Qualcomm's next-generation flagship chip, Snapdragon 8 Elite Gen 2 (codenamed SM8850), will support LPDDR6. Automotive LPDDR6 may take even longer to arrive.

DRAM Selection Path (2): GDDR6 is already installed in vehicles but faces cost and power consumption issues. A GDDR7+LPDDR5X hybrid memory architecture may be viable.

Aside from LPDDR5X, another path is GDDR6 or GDDR7. Tesla's second-gen FSD chip already supports first-gen GDDR6. HW4.0 uses 32GB GDDR6 (model: MT61M512M32KPA-14) running at 1750MHz (the minimum LPDDR5 frequency is also above 3200MHz). Since it is the first-gen GDDR6, its speed is relatively low. Even with GDDR6, running 10 billion-parameter foundation models smoothly remains unfeasible, though it's currently the best available.

Tesla's third-gen FSD chip is likely under development and may be completed in late 2025, with support for at least GDDR6X.

The next-generation GDDR7 standard was officially released in March 2024, but Samsung had already unveiled the world's first GDDR7 in July 2023. Currently, both SK Hynix and Micron have introduced GDDR7 products. GDDR requires a dedicated physical layer and controller, so a chip must integrate a GDDR PHY and controller to use it. Companies like Rambus and Synopsys sell the relevant IP.

Future autonomous driving chips may adopt a hybrid memory architecture, for example using GDDR7 for high-load AI tasks and LPDDR5X for low-power general computing, balancing performance and cost.
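One way to picture such a hybrid architecture is a simple placement policy that routes bandwidth-critical AI buffers to the GDDR7 pool and everything else to LPDDR5X. The sketch below is purely illustrative: the pool capacities, bandwidths, and buffer names are hypothetical and do not describe any real SoC.

```python
# Illustrative placement policy for a hypothetical GDDR7 + LPDDR5X hybrid memory system.
from dataclasses import dataclass

@dataclass
class MemoryPool:
    name: str
    bandwidth_gb_s: float   # peak bandwidth, for documentation only
    capacity_gb: float
    used_gb: float = 0.0

    def try_allocate(self, size_gb: float) -> bool:
        if self.used_gb + size_gb > self.capacity_gb:
            return False
        self.used_gb += size_gb
        return True

gddr7   = MemoryPool("GDDR7",   bandwidth_gb_s=896.0, capacity_gb=16.0)  # hypothetical figures
lpddr5x = MemoryPool("LPDDR5X", bandwidth_gb_s=273.0, capacity_gb=32.0)

def place(buffer_name: str, size_gb: float, ai_hot_path: bool) -> str:
    """Prefer GDDR7 for bandwidth-critical AI buffers, LPDDR5X otherwise; fall back if full."""
    primary, fallback = (gddr7, lpddr5x) if ai_hot_path else (lpddr5x, gddr7)
    for pool in (primary, fallback):
        if pool.try_allocate(size_gb):
            return f"{buffer_name} -> {pool.name}"
    raise MemoryError(f"no room for {buffer_name}")

print(place("transformer weights (INT8)", 7.0, ai_hot_path=True))   # -> GDDR7
print(place("infotainment framebuffer", 0.5, ai_hot_path=False))    # -> LPDDR5X
```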

DRAM Selection Path (3): HBM2E is already deployed in L4 Robotaxis but remains far from production passenger cars. Memory chip vendors are working on migration of HBM technology from data centers to edge devices.

High bandwidth memory (HBM) is primarily used in servers. Stacking DRAM dies with TSV technology not only increases the cost of the memory itself but also requires TSMC's CoWoS packaging, which adds further cost; CoWoS capacity is currently tight and expensive. HBM is far more expensive than the LPDDR5X, LPDDR5, and LPDDR4X commonly used in production passenger cars, and is not economical.

SK Hynix's HBM2E is being exclusively used in Waymo's L4 Robotaxis, offering 8GB capacity, transmission rate of 3.2Gbps, and impressive bandwidth of 410GB/s, setting a new industry benchmark.
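The 410GB/s figure follows directly from the quoted 3.2Gbps per-pin rate and HBM's standard 1024-bit interface per stack:

```python
# HBM2E per-stack bandwidth = per-pin data rate x 1024-bit stack interface / 8 bits per byte.
pin_rate_gbps = 3.2
bus_bits = 1024                       # standard interface width of one HBM stack
bandwidth_gb_s = pin_rate_gbps * bus_bits / 8
print(f"{bandwidth_gb_s:.1f} GB/s")   # 409.6 GB/s, matching the ~410 GB/s quoted above
```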

SK Hynix is currently the only vendor capable of supplying HBMs that meet stringent AEC-Q automotive standards. SK Hynix is actively collaborating with autonomous driving solution giants like NVIDIA and Tesla to expand HBM applications from AI data centers to intelligent vehicles.

Both SK Hynix and Samsung are working to migrate HBM from data centers to edge devices like smartphones and cars. Adoption of HBMs in mobile devices will focus on improving edge AI performance and low-power design, driven by technological innovation and industry chain synergy. Cost and yield remain the primary short-term challenges, mainly involving HBM production process improvement.

Key Differences: Traditional data center HBM is a "high bandwidth, high power consumption" solution designed for high-performance computing, while on-device HBM is a "moderate bandwidth, low power consumption" solution tailored for mobile devices.

Technology Path: Traditional data center HBM relies on TSV and interposers, whereas on-device HBM achieves performance breakthroughs through packaging innovations (e.g., vertical wire bonding) and low-power DRAM technology.

For example, Samsung's LPW DRAM (Low-Power Wide I/O DRAM) uses similar technology, offering low latency and up to 128GB/s bandwidth while consuming only 1.2pJ/b. It is expected to enter mass production during 2025-2026.

LPW DRAM significantly increases I/O interfaces by stacking LPDDR DRAM to achieve the dual goals of improving performance and reducing power consumption. Its bandwidth can exceed 200GB/s, 166% higher than LPDDR5X. Its power consumption is reduced to 1.9pJ/bit, 54% lower than LPDDR5X.

UFS 3.1 has already been widely adopted in vehicles and will gradually iterate to UFS 4.0 and UFS 5.0, while PCIe SSD will become the preferred choice for L3/L4 high-level autonomous vehicles.

At present, high-level autonomous vehicles generally adopt UFS 3.1 storage. As vehicle sensors and computing power advance, higher-specification data transmission solutions are imperative, and UFS 4.0 products will become one of the mainstream options in the future. UFS 3.1 offers a maximum speed of 2.9GB/s, which is dozens of times lower than SSD. The next-generation version UFS 4.0 will reach 4.2GB/s, providing higher speed while reducing power consumption by 30% compared to UFS 3.1. By 2027, UFS 5.0 is expected to arrive with speeds of around 10GB/s, still much lower than SSD, but with the advantages of controllable costs and a stable supply chain.
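To give these sequential-read figures a concrete feel, the sketch below compares how long each storage tier would take to stream a 7GB blob (for instance, the INT8 weights of a 7B-parameter model from the earlier example); the eMMC figure is an assumed typical value, not one from the report:

```python
# Time to stream a 7 GB blob at peak sequential read speed for each storage tier.
payload_gb = 7.0
read_speeds_gb_s = {
    "eMMC (assumed ~0.4 GB/s)": 0.4,
    "UFS 3.1": 2.9,
    "UFS 4.0": 4.2,
    "UFS 5.0 (expected)": 10.0,
    "PCIe 5.0 x4 NVMe SSD": 14.5,
}
for tier, speed in read_speeds_gb_s.items():
    print(f"{tier}: {payload_gb / speed:.2f} s")
```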

Given the strong demand for foundation models from both the cockpit and autonomous driving, and to ensure sufficient performance headroom, SSDs should be adopted instead of the current mainstream UFS (which is not fast enough) or eMMC (which is even slower). Automotive SSDs use the PCIe standard, which offers tremendous flexibility and potential. JEDEC's automotive SSD standard JESD312 builds on PCIe 4.0, which covers multiple lane counts and rates: 4 lanes is the lowest configuration, and a 16-lane duplex link can reach 64GB/s. PCIe 5.0 was released in 2019, doubling the signaling rate to 32GT/s, with x16 full-duplex bandwidth approaching 128GB/s.
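These link-level numbers can be reconstructed from the per-lane signaling rate and the 128b/130b line encoding used since PCIe 3.0; the duplex figures simply count both directions:

```python
# Approximate PCIe link bandwidth = signaling rate x 128b/130b encoding x lane count.
def pcie_bandwidth_gb_s(gt_per_s: float, lanes: int, duplex: bool = False) -> float:
    encoding = 128 / 130                              # line-code efficiency, PCIe 3.0 and later
    per_direction = gt_per_s * encoding / 8 * lanes   # GB/s in one direction
    return per_direction * (2 if duplex else 1)

print(f"PCIe 4.0 x16 duplex: {pcie_bandwidth_gb_s(16, 16, duplex=True):.0f} GB/s")   # ~63 GB/s
print(f"PCIe 5.0 x16 duplex: {pcie_bandwidth_gb_s(32, 16, duplex=True):.0f} GB/s")   # ~126 GB/s
print(f"PCIe 5.0 x4, one direction: {pcie_bandwidth_gb_s(32, 4):.1f} GB/s")          # ~15.8 GB/s
```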

Currently, both Micron and Samsung offer automotive-grade SSD. Samsung AM9C1 Series ranges from 128GB to 1TB, while Micron 4150AT Series comes in 220GB, 440GB, 900GB, and 1800GB capacities. The 220GB version is suitable for standalone cockpit or intelligent driving, while cockpit-driving integration requires at least 440GB.

A multi-port BGA SSD can serve as a centralized storage and computing unit in vehicles, connecting via multiple ports to the SoCs for the cockpit, ADAS, gateways, and more, and efficiently processing and storing different types of data in designated areas. Because each port is independent, non-core SoCs cannot access critical data without authorization, preventing interference with, misidentification of, or corruption of core-SoC data. This maximizes the isolation and independence of data transmission and also reduces the storage hardware cost of each SoC.

For future L3/L4 high-level autonomous vehicles, PCIe 5.0 x4 + NVMe 2.0 will be the preferred choice for high-performance storage:

Ultra-high-speed transmission: Read speeds up to 14.5GB/s and write speeds up to 13.6GB/s, three times faster than UFS 4.0.

Low latency & high concurrency: Support higher queue depths (QD32+) for parallel processing of multiple data streams.

AI computing optimization: Combined with vehicle SoCs, accelerate AI inference computing to meet requirements of fully autonomous driving.

In autonomous driving applications, PCIe NVMe SSD can cache AI computing data, reducing memory access pressure and improving real-time processing capabilities. For example, Tesla's FSD system uses a high-speed NVMe solution to store autonomous driving training data to enhance perception and decision-making efficiency.

Synopsys has already launched the world's first automotive-grade PCIe 5.0 IP solution, which includes PCIe controller, security module, physical layer device (PHY), and verification IP, and complies with ISO 26262 and ISO/SAE 21434 standards. This means PCIe 5.0 will soon be available for automotive applications.

Table of Contents

1 Overview of Automotive Memory Chip Industry

2 Development Trends of Automotive Memory Chips in Various Application Scenarios

3 Production, Testing, Certification, and Localization Progress of Automotive Memory Chips

4 Technology Trends of Automotive Memory Chips by Product Segment

5 Automotive Memory Chip Wafer Manufacturers

6 Automotive Memory Chip Product Manufacturers

Global Information, Inc. | 02-2025-2992 | kr-info@giikorea.co.kr
ⓒ Copyright Global Information, Inc. All rights reserved.