HC20 (2008)

Tutorials, Sunday, August 24, 2008

Chair: Chuck Moore, AMD
Author(s): Chuck Moore, AMD; Craig Hampel, Rambus; Jerry Bautista, Intel; Fritz Kruger, AMD
Presentation: High Bandwidth Memory Technology & Systems Implications.

Abstract: As Moore’s Law enables us to pack more CPUs and other computing devices onto future chips, addressing the “Memory Wall” takes on a whole new level of importance. The combination of larger working sets, multiple working sets, and bandwidth-hungry offload computing devices takes a difficult situation and makes it worse. This tutorial will introduce these challenges and present several potential technology solutions, as well as the associated system-level implications.

  • Introduction & Motivating Issues (Chuck Moore, AMD) PDF
  • Terabyte Bandwidth Initiative – Architectural Considerations for Next-Generation Memory Systems (Craig Hampel, Rambus) PDF
  • Tera-scale Computing and Interconnect Challenges – 3D Stacking Considerations (Jerry Bautista, Intel) PDF
  • System Architecture Implications and Perspective (Fritz Kruger, AMD) PDF
Afternoon Tutorial
Chair: John Nickolls
Author(s): Ian Buck, Michael Garland, Patrick Legresley, Massimiliano Fatica, NVIDIA; Wen-mei Hwu, Univ. of Illinois
Presentation: Scalable Parallel Programming with CUDA.

Abstract: As multicore CPUs and manycore GPUs continue to rapidly double their processor counts, the challenge is to develop mainstream application software that inherently scales its parallelism to leverage ever more processor cores. The CUDA scalable parallel programming model provides readily understood abstractions – a hierarchy of thread groups, shared memories, and barrier synchronization – that give a clear parallel structure to conventional C code written for one thread of the hierarchy. Learn how developers have made a wide range of CUDA applications scale transparently to hundreds of processor cores and thousands of concurrent threads. CUDA is a minimal extension of C/C++ applicable to both GPUs and CPUs.

  • CUDA Schedule PDF
  • Presenter’s Biographies PDF
  • Introduction and Scaling Demonstrations PDF
  • Scalable Parallel Programming PDF
  • Toolkit and Libraries PDF
  • Performance Optimizations PDF
  • Application Development Experience PDF
  • Directions PDF
  • Panel Discussions PDF
  • Demos PDF

Conference Day One

Monday, August 25, 2008
Opening Remarks

  • Message from the Chair PDF
  • Message from the Program Co-Chairs PDF
  • Computer History Museum PDF
  • Sponsors PDF
Session One: Multi-Core Technologies
Session Chair: Will Eatherton, Cisco
Presentations:

  • MicroNetwork-Based Coherency: Extending Coherency over Standard Networks PDF
    Author(s): Bob Quinn, 3Leaf Sys.
  • The Roofline Model: A tool for Auto-tuning Kernels on Multicore Architectures PDF
    Author(s): Samuel Williams, David Patterson, Leonid Oliker, John Shalf, Katherine Yelick, UC Berkeley
  • Power-performance Comparative Evaluation of Alternate Microarchitectures PDF
    Author(s): Pradip Bose, Alan Weger, Victor Zyuban, Hendrik Hamann, Hans Jacobson, Richard Eickemeyer, John Griswell, IBM
Keynote I: Cars that drive themselves PDF
Keynote Chair: Christos Kozyrakis
Presenter: Sebastian Thrun, Stanford University

Session Two: Video & Media
Session Chair: Pradeep Dubey
Presentations:

  • SpursEngine – A High-Performance Stream Processor Derived from Cell/B.E. for Media Processing Acceleration PDF
    Author(s): Hiroo Hayashi, Toshiba
  • A 167-processor Array for Efficient DSP & Embedded Application Processing PDF
    Author(s): Dean Truong, W. Cheng, T. Mohsenin, Z. Yu, T. Jacobson, G. Landge, M. Meeuwsen, C. Watnik, P. Mejia, A. Tran, J. Webb, E. Work, Z. Xiao, B. Baas, UC Davis
  • System Architecture and Applications of the PNX5100: A High-Performance Full HD 120Hz Progressive Post Processing Multicore Video Processor PDF
    Author(s): Johan Janssen, NXP Semi.
  • AMD mediaDSP: A Platform for Building Programmable Multicore Video Processor PDF
    Author(s): Richard Selvaggi, Larry Pearlstein, AMD
Session Three: Mobile Media Processing
Session Chair: Forest Baskett
Presentations:

  • A 300-mW Single-Chip NTSC/PAL Television for Mobile Applications PDF
    Author(s): Samuel Sheng, D. Yee, S. Stoiber, P. Chi, H. Huang, A. Abo, L. Lynn, R. S. Narayanaswami, R. Contreras, R. Gupta, E. Macdonald, Telegent
  • Voice Processor Based on Human Hearing System PDF
    Author(s): Lloyd Watts, Dana Massie, Allen Sansano, James Huey, Audience
  • NVIDIA Tegra: Enabling Stunning Handheld Graphics & HD Video PDF
    Author(s): Michael Toksvig, John Mathieson, Brian Cabral, Brian Smith, NVIDIA
Session Four: Supercomputing
Session Chair: Ralph Wittig
Presentations:

  • PowerXCell 8i: A Cell Broadband Engine Implementation Enhanced for Supercomputing PDF
    Author(s): Brian Flachs, D. Brokenshire, K. Imming, T. Ozguner, S. Mueller, H. J. Oh, M. Boersma, E. Doan, K. Hirairi, R. Krentler, C. Durham, A. Huynh, R. Berry, IBM
  • A Specialized ASIC for Molecular Dynamics PDF
    Author(s): Martin M. Deneroff, D. E. Shaw, R. O. Dror, J. Gagliardo, J. S. Kuskin, R. H. Larson, E. C. Priest, J. K. Salmon, C. Young, D. E. Shaw Research
Panel Discussion: Ready, Fire, Aim – 20 years of hits & misses at Hot Chips
Session Chair: Nick Tredennick PDF
Panelists:
Nathan Brookwood PDF
John R. Mashey PDF
David Patterson PDF
Dave Ditzel PDF
Howard Sachs PDF
Michael Slater PDF

Abstract: Is computer engineering a logic-driven profession on a path of inexorable monotonic progress? It isn’t; computer engineering is as subject to fads and foibles as are the fashion and toy industries. To prove it, we’ll take a humorous, sarcastic, controversial, and embarrassing look at twenty years of hits and misses from Hot Chips conferences, primarily in microprocessor design. We’ve collected panelists from outspoken one-time designers, professors, or pundits who aren’t shy about making fun of other people’s (or their own) life’s work.

Conference Day Two

Tuesday, August 26, 2008
Session Five: FPGAs
Session Chair: Krste Asanovic
Presentations:

  • Virtex-5 FXT, a New Field-Programmable Gate Array Platform PDF
    Author(s): Peter Alfke, Xilinx
  • New 40nm High Performance FPGA and ASIC Common Platform PDF
    Author(s): Dan Mansur, Altera
  • MAXware: Acceleration in HPC PDF
    Author(s): Michael Flynn, Rob Dimond, Oskar Mencer, Oliver Pell, Maxeler
Session Six: PC Chips
Session Chair: John Sell
Presentations:

  • AMD 780G, an x86 Chipset with Advanced Integrated GPU PDF
    Author(s): Niles Burbank, T. Asaro, D. Cherepacha, D. Sinclair, J. Chappel, M. Tressidar, P. Ng, C. Klement, J. Bruno, L. Sinclair, C. Kuan, AMD
  • Micro-architecture of Godson-3 Multi-Core Processor PDF
    Author(s): Weiwu Hu, Xiang Gao, Yunji Chen, Institute of Computing Technology, Chinese Academy of Sciences
  • Inside Intel® Core™ Microarchitecture (Nehalem) PDF
    Author(s): Ronak Singhal, Intel
Keynote II: SunPower’s History and Technology PDF
Keynote Chair: Forest Baskett
Presenter: Richard Swanson, SunPower
Session Seven: Networking
Session Chair: Jose Renau
Presentations:

  • Low Cost Chipset for Broadband Powerline Communications at 200 Mbps PDF
    Author(s): Chano Gomez, DS2
  • The QFP Packet Processing Chip Set PDF
    Author(s): Donald Steiss, Will Eatherton, James Markevitch, Cisco
Session Eight: Visual Computing
Session Chair: Marc Tremblay
Presentations:

  • NVIDIA GTX200: TeraFLOPS Visual Computing PDF
    Author(s): John Tynefield, Luke Chang, Stuart Oberman, NVIDIA
  • Larrabee: A Many-Core x86 Architecture for Visual Computing PDF
    Author(s): Doug Carmean, Intel
Session Nine: Server Chips
Session Chair: Alan Jay Smith
Presentations:

  • Tukwila: A Quad-Core Intel® Itanium® Processor PDF
    Author(s): Eric DeLano, Intel
  • SPARC64VII: Fujitsu’s Next Generation Quad-Core Processor PDF
    Author(s): Takumi Maruyama, Fujitsu
  • Rock: A Third Generation 65nm, 16-Core, 32 Thread + 32 Scout-Threads CMT SPARC Processor PDF
    Author(s): Shailender Chaudhry, Sun