Recent #GPU news in the semiconductor industry
➀ Nvidia's Ian Buck discusses the shift from focusing on single chips to integrated systems;
➁ The transformation of data centers into AI factories due to the rise of AI;
➂ The increasing power and cooling requirements of GPUs and Nvidia's involvement in the Coolerchips project.
➀ The evolution of rendering technology with GPU hardware upgrades from GeForce 256 to RTX with ray tracing;
➁ Why software developers who understand rendering from the API side also benefit from a deeper view of the underlying GPU architecture;
➂ Overview of GPU architecture, including global memory, streaming multiprocessors (SMs), and their components.
➀ AMD has detailed the Instinct MI300X at Hot Chips 2024, with MI325X expected to be released soon.
➁ MI300X is a significant revenue source for AMD with over $4 billion in sales in the AI industry.
➂ AMD has acquired ZT Systems, the manufacturer of the Microsoft Azure MI300X platform.
➃ MI300X is a multi-chiplet design with 192GB of HBM3 memory, targeting compute applications.
➄ AMD's CDNA 3 architecture has evolved with 8-stack HBM3 memory arrays, reaching 192GB in capacity.
➅ MI300X can operate as a single partition or across different memory and compute partitions.
➆ AMD's current major platform is the 8-way MI300X OAM platform.
➇ AMD also covered its ROCm software stack, which continues to improve.
➈ AMD's MI300X is competitive with NVIDIA's H100 on some workloads.
➉ AMD is expected to release MI325X this year and Instinct MI350 288GB GPU in 2025.
➀ Computing power is a key indicator of a computer's information-processing capability. AI computing power targets AI workloads, is commonly measured in TOPS and TFLOPS, and is supplied by dedicated chips such as GPUs, ASICs, and FPGAs for training and inference of algorithm models.
➁ Numerical precision is one way to characterize an AI chip's computing power: FP16 and FP32 are commonly used for model training, while FP16 and INT8 are used for model inference.
➂ AI chips typically use GPU and ASIC architectures. GPUs are the key components in AI computing due to their advantages in computation and parallel task processing.
➃ Tensor Cores build on the parallel computing of CUDA Cores with hardware specialized for deep learning, accelerating AI training and inference tasks through optimized matrix operations.
➄ TPUs, a type of ASIC designed for machine learning, offer notably higher energy efficiency on machine learning tasks than CPUs and GPUs.
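As a rough sketch of what the precision formats above mean in practice, the snippet below round-trips values through FP16 and FP32 using Python's standard `struct` module, and quantizes a weight to INT8; the `scale` and `w` values are made-up examples, not from the original text:

```python
import struct

def to_fp16(x: float) -> float:
    """Round-trip a Python float through IEEE 754 half precision
    (the 'e' format in the struct module)."""
    return struct.unpack("<e", struct.pack("<e", x))[0]

def to_fp32(x: float) -> float:
    """Round-trip through IEEE 754 single precision."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

# FP16 has a 10-bit mantissa, so small increments that FP32 keeps are lost.
print(to_fp32(1.0001))  # close to 1.0001, representable in FP32
print(to_fp16(1.0001))  # rounds to 1.0 in FP16

# INT8 covers only [-128, 127]; inference quantization maps floating-point
# weights into that range with a scale factor (hypothetical values here).
scale = 0.05
w = 1.7
q = max(-128, min(127, round(w / scale)))
print(q, q * scale)  # the quantized integer and its dequantized value
```

The precision loss is the trade-off behind using FP16/INT8 for inference: smaller values move and multiply faster, at the cost of rounding error.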
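The matrix operation that Tensor Cores accelerate is a fused multiply-accumulate, D = A x B + C, executed in hardware on small tiles. A naive pure-Python sketch of that operation (for illustration only, not how any GPU implements it):

```python
def matmul_accumulate(A, B, C):
    """D = A @ B + C, the fused multiply-accumulate that Tensor Cores
    perform on small matrix tiles, written naively for clarity."""
    n, k, m = len(A), len(B), len(B[0])
    # Start from the accumulator C, then add the A x B products.
    D = [[C[i][j] for j in range(m)] for i in range(n)]
    for i in range(n):
        for j in range(m):
            for p in range(k):
                D[i][j] += A[i][p] * B[p][j]
    return D

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
C = [[1, 0], [0, 1]]
print(matmul_accumulate(A, B, C))  # [[20, 22], [43, 51]]
```

One n x m x k matrix multiply costs roughly 2 x n x m x k floating-point operations, which is the count that TFLOPS ratings measure; performing many such multiply-accumulates per clock is what separates Tensor Cores from scalar CUDA Cores.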