Heterogeneous 3D System-on-Chip
Heterogeneous 3D integration makes it possible to combine different manufacturing technologies in a single SoC. This is beneficial for applications that integrate components with different technology requirements. Examples include 3D vision SoCs that integrate analog sensing, mixed-signal A/D conversion, and digital image processing, as well as high-performance processors that closely interconnect logic dies optimized for compute with dies optimized for memory. At HTI, we optimize computer architectures in a co-design with technology parameters for heterogeneous 3D SoCs. Please also see our work on NoCs in the section on interconnection networks.
As one example of an architectural optimization for 3D SoCs, we worked on hardware architectures for neural networks using 3D integration. With the ever-increasing relevance of machine learning in all fields of science and industry, more efficient compute resources for these workloads are required. Because CNN workloads are intrinsically massively parallel, their accelerators must be as well. An exemplary architecture is shown in the figure on the right. At HTI, we research novel CNN accelerator architectures enabled by 3D integration. This research is done in close collaboration with Prof. Krishna’s group at the Georgia Institute of Technology, Atlanta, GA.
HTI evaluated the potential performance benefits of going 3D, and they are substantial: with our architectures, a speedup of up to an order of magnitude over 2D was possible for state-of-the-art cloud workloads. However, 3D designs are often limited by thermal constraints, as the inner layers cannot dissipate heat. Area requirements are also highly relevant, e.g., due to the keep-out zones of vertical links. These technology issues affect the feasibility of architectures, so a technology-architecture co-design is essential. HTI found a power increase of less than 6% for 3D CNN accelerators. The proposed architecture stayed below a core temperature of 335 K at full utilization. HTI also identified that sufficient compute resources are required for the improved performance to outweigh the area costs compared to 2D.
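The area trade-off can be illustrated with a toy break-even model. The sketch below is not HTI's evaluation method; it only assumes that a 3D design pays a fixed area overhead (for example, keep-out zones of vertical links) and gains a constant per-element speedup, and asks how many processing elements (PEs) are needed before throughput per area exceeds the 2D baseline. All function names, parameters, and numbers are hypothetical.

```python
import math


def perf_per_area_ratio(n_pes, pe_area, fixed_overhead, speedup):
    """Ratio of 3D to 2D throughput per unit area (toy model).

    Assumptions (hypothetical, for illustration only):
      - the 2D design occupies n_pes * pe_area,
      - the 3D design adds a fixed area overhead on top of that,
      - the 3D design is `speedup` times faster at equal PE count.
    """
    area_2d = n_pes * pe_area
    area_3d = area_2d + fixed_overhead
    return speedup * area_2d / area_3d


def min_pes_for_benefit(pe_area, fixed_overhead, speedup):
    """Smallest PE count at which the 3D design wins on throughput per area.

    Returns None if the speedup is too small to ever amortize the overhead.
    """
    if speedup <= 1.0:
        return None
    # Break-even: speedup * n * a = n * a + overhead  =>  n = overhead / (a * (speedup - 1))
    break_even = fixed_overhead / (pe_area * (speedup - 1.0))
    return math.floor(break_even) + 1


if __name__ == "__main__":
    # Made-up example values: 0.01 mm^2 per PE, 2 mm^2 fixed overhead, 1.3x speedup.
    n = min_pes_for_benefit(pe_area=0.01, fixed_overhead=2.0, speedup=1.3)
    print("PEs needed:", n)
    print("ratio at that size:", perf_per_area_ratio(n, 0.01, 2.0, 1.3))
```

In this simplified model, a larger speedup or a smaller fixed overhead lowers the number of PEs required, which mirrors the qualitative observation that enough compute resources are needed to amortize the area costs of 3D.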