At the Chengdu Auto Show, which opened today, Li Auto shared its latest progress on an autonomous driving architecture built on an end-to-end model, a vision language model (VLM), and a world model. At the press conference, Li Auto announced that recruitment for its 10,000-user end-to-end + VLM experience group starts today. The company also said that Li Auto OTA 6.2 will be fully rolled out today.

Li Auto's autonomous driving approach draws on the theory in "Thinking, Fast and Slow". In that book, Daniel Kahneman, winner of the Nobel Prize in economics, elaborated on the concepts of System 1 and System 2 from cognitive psychology, providing an important framework for understanding human cognition.

System 1 is the intuition people form from past experience and habit, and it makes quick decisions. System 2 is the ability to think and reason, which people need in order to solve complex problems and handle unfamiliar scenarios. In short, the two systems work together to form the basis on which humans perceive and understand the world and make decisions.

How are System 1 and System 2 applied to autonomous driving?

System 1 is implemented by an end-to-end (E2E) model and responds quickly to routine driving situations. System 2 is implemented by a vision language model (VLM), which provides the ability to think. Li Auto uses a world model in the cloud to verify the capabilities of System 1 and System 2. Together, these three components make up Li Auto's next-generation autonomous driving architecture.
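The division of labor between the two systems can be sketched in a few lines of Python. This is only an illustration, not Li Auto's implementation: the names `system1`, `system2`, `Plan` and `drive`, the speed-cap interface, and the rule that System 2 advice overrides the fast plan are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Plan:
    speeds: List[float]  # target speed (m/s) at successive time steps


def system1(sensor_frame: dict) -> Plan:
    """Hypothetical stand-in for the fast end-to-end model: sensors in, plan out."""
    return Plan(speeds=[sensor_frame["speed_limit"]] * 3)


def system2(sensor_frame: dict) -> Optional[float]:
    """Hypothetical stand-in for the slower VLM.

    Returns a speed cap (m/s) for an unusual scene, or None for routine driving.
    """
    return 5.0 if sensor_frame.get("scene") == "construction" else None


def drive(
    sensor_frame: dict,
    fast: Callable[[dict], Plan] = system1,
    slow: Callable[[dict], Optional[float]] = system2,
) -> Plan:
    """Assumed combination rule: System 1 always plans, System 2 may cap the speed."""
    plan = fast(sensor_frame)
    cap = slow(sensor_frame)
    if cap is not None:
        plan = Plan(speeds=[min(s, cap) for s in plan.speeds])
    return plan
```

In this sketch a routine frame passes through System 1 unchanged, while a frame tagged as a construction zone has its speeds capped by System 2.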

The evolution of Li Auto's System 1:

First generation: NPN. It adopted a modular design, including perception, localization, planning, navigation, and NPN. This generation of the architecture supported Li Auto in launching urban NOA in 100 cities across the country.

Second generation: map-free, segmented end-to-end. It consists of only two models, perception and planning. The biggest change is the removal of NPN, so the system no longer relies on prior information and can drive anywhere in the country where navigation is available.

Third generation: end-to-end, with a One Model structure. There is only one model: the input is sensor data and the output is the driving trajectory.
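The structural difference between the three generations can be shown as a rough Python sketch. Every function here is a hypothetical placeholder that returns dummy data. Only the shape of each pipeline (several modules, two models, one model) follows the description above.

```python
from typing import Callable, List, Tuple

Trajectory = List[Tuple[float, float]]  # (x, y) waypoints in metres


def compose(*stages: Callable) -> Callable:
    """Chain stages so that each one consumes the previous stage's output."""
    def pipeline(data):
        for stage in stages:
            data = stage(data)
        return data
    return pipeline


# Placeholder stages; real modules would be neural networks or planners.
def perceive(sensors: dict) -> dict:
    return {"objects": sensors.get("objects", [])}


def localize(scene: dict) -> dict:
    return {**scene, "pose": (0.0, 0.0)}


def apply_prior(scene: dict) -> dict:
    # Stand-in for the NPN stage, which supplies prior information.
    return {**scene, "prior": "placeholder"}


def plan(scene: dict) -> Trajectory:
    return [(0.0, 0.0), (1.0, 0.0)]


def one_model(sensors: dict) -> Trajectory:
    # Third generation: sensor data in, driving trajectory out.
    return [(0.0, 0.0), (1.0, 0.0)]


gen1 = compose(perceive, localize, apply_prior, plan)  # modular, with NPN
gen2 = compose(perceive, plan)                         # two models, no NPN
gen3 = one_model                                       # One Model
```

All three callables share the same interface, a dictionary of sensor data in and a trajectory out, which is what lets each later generation replace the earlier one.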

Editor in charge: Wu Haotian