National Data Bureau: Accelerating Construction in Key Fields Such as Intelligent Technologies, the Low-altitude Economy, and Biomanufacturing
【Great River Financial Network Message】On August 14th, the National Development and Reform Commission held a press conference to introduce the development achievements of China's digital infrastructure during the "14th Five-Year Plan" period.
Wang Sheng, a member of the Party Committee and Director of the National Data Bureau, cited experts' description of computing power as the skeleton, algorithms as the nerves, and data as the blood. As one of the three core elements of artificial intelligence development, data plays a key role in advancing "AI+", especially through the construction of high-quality data sets. In medicine and health, for example, annotated medical image data sets can improve the accuracy of disease diagnosis models by more than 15%. In the era of artificial intelligence, tokens, commonly called word units, are the smallest unit of data used in processing text, and token consumption measures AI usage much as traffic measures internet usage. As of June this year, China's daily token consumption had exceeded 300 billion, more than 300 times the level of half a year earlier, reflecting the rapid growth of AI applications in China.
China's rapid progress in artificial intelligence is inseparable from the high priority it places on data work. China is the first country to treat data as a factor of production and has taken multiple measures to promote the development and use of data resources. The National Data Bureau emphasizes that the "AI+" initiative can succeed only if high-quality data sets are built and widely adopted. It has issued documents on building high-quality data sets, and various departments are jointly advancing the work.
As of June this year, China had built more than 35,000 high-quality data sets, with a total volume exceeding 400PB (1PB can store approximately 500 million high-definition photos of 2MB each). The training of AI models has also driven growth in demand for data trading. As of June this year, the cumulative transaction value of high-quality data sets nationwide was nearly 4 billion yuan, and the total volume of high-quality data sets traded through institutions had reached 246PB. At Beijing's data exchange center, for example, the share of high-quality data sets in transactions rose from 10% last year to nearly 80% this year. Shanghai, Tianjin, Anhui, and other regions are piloting new models such as "data corpus valuation" and guiding enterprises to convert high-quality data sets into equity investments in relevant companies.
Chinese-language data plays a crucial role in improving the training performance of domestic large models. The proportion of Chinese data in training data sets is a matter of wide concern. After several years of effort, most models trained in China now use more than 60% Chinese data, with some models reaching 80%. The capacity to develop and supply high-quality Chinese data continues to grow, driving rapid improvement in the performance of AI models in China.
Wang Sheng said that, as a next step, the National Data Bureau will continue to promote the construction of high-quality data sets through systematic planning and accelerate construction in key fields such as intelligent technologies, the low-altitude economy, and biomanufacturing. It will also push for nationwide recognition of the value of data as a factor of production, accelerate the joint creation of that value, and cultivate a market consensus on "buying premium data."
Editor: Liu Anqi | Reviewer: Li Zhen | Supervisor: Guqin