Tutorials, Sunday, August 24, 2008
Chair: Chuck Moore, AMD
Author(s): Chuck Moore, AMD; Craig Hampel, Rambus; Jerry Bautista, Intel; Fritz Kruger, AMD
Presentation: High Bandwidth Memory Technology & Systems Implications.
Abstract: As Moore’s Law enables us to pack more CPUs and other computing devices onto future chips, addressing the “Memory Wall” takes on a whole new level of importance. The combination of larger working sets, multiple working sets, and bandwidth-hungry offload computing devices takes a difficult situation and makes it worse. This tutorial will introduce these challenges and present several potential technology solutions, as well as the associated system-level implications.
- Introduction & Motivating Issues (Chuck Moore, AMD)

- Terabyte Bandwidth Initiative – Architectural Considerations for Next-Generation Memory Systems (Craig Hampel, Rambus)

- Tera-scale Computing and Interconnect Challenges – 3D Stacking Considerations (Jerry Bautista, Intel)

- System Architecture Implications and Perspective (Fritz Kruger, AMD)

|
| Afternoon Tutorial |
Chair: John Nickolls
Author(s): Ian Buck, Michael Garland, Patrick Legresley, Massimiliano Fatica, NVIDIA; Wen-mei Hwu, Univ. of Illinois
Presentation: Scalable Parallel Programming with CUDA.
Abstract: As multicore CPUs and manycore GPUs continue to rapidly double their processor counts, the challenge is to develop mainstream application software that inherently scales its parallelism to leverage ever more processor cores. The CUDA scalable parallel programming model provides readily understood abstractions – a hierarchy of thread groups, shared memories, and barrier synchronization – that provide a clear parallel structure to conventional C code for one thread of the hierarchy. Learn how developers have made a wide range of CUDA applications scale transparently to hundreds of processor cores and thousands of concurrent threads. CUDA is a minimal extension of C/C++ applicable to both GPUs and CPUs.
- CUDA Schedule

- Presenter’s Biographies

- Introduction and Scaling Demonstrations

- Scalable Parallel Programming

- Toolkit and Libraries

- Performance Optimizations

- Application Development Experience

- Directions

- Panel Discussions

- Demos

|
Conference Day One
| Session |
Monday, August 25, 2008 |
| Opening Remarks |
Opening remarks
- Message from the Chair

- Message from the Program Co-Chairs

- Computer History Museum

- Sponsors

|
| Session 1 |
Session One: Multi-Core Technologies
Session Chair: Will Eatherton, Cisco
Presentations:
- MicroNetwork-Based Coherency: Extending Coherency over Standard Networks

Author(s): Bob Quinn, 3Leaf Sys.
- The Roofline Model: A tool for Auto-tuning Kernels on Multicore Architectures

Author(s): Samuel Williams, David Patterson, Leonid Oliker, John Shalf, Katherine Yelick, UC Berkeley
- Power-performance Comparative Evaluation of Alternate Microarchitectures

Author(s): Pradip Bose, Alan Weger, Victor Zyuban, Hendrik Hamann, Hans Jacobson, Richard Eickemeyer, John Griswell, IBM
|
| Keynote 1 |
Keynote I
Keynote Chair: Christos Kozyrakis
Presentation: Cars that drive themselves
Sebastian Thrun, Stanford University |
| Session 2 |
Session Two: Video & Media
Session Chair: Pradeep Dubey
Presentations:
- SpursEngine – A High-Performance Stream Processor Derived from Cell/B.E. for Media Processing Acceleration

Author(s): Hiroo Hayashi, Toshiba
- A 167-processor Array for Efficient DSP & Embedded Application Processing

Author(s): Dean Truong, W. Cheng, T. Mohsenin, Z. Yu, T. Jacobson, G. Landge, M. Meeuwsen, C. Watnik, P. Mejia, A. Tran, J. Webb, E. Work, Z. Xiao, B. Baas, UC Davis
- System Architecture and Applications of the PNX5100: A High-Performance Full HD 120Hz Progressive Post Processing Multicore Video Processor

Author(s): Johan Janssen, NXP Semi.
- AMD mediaDSP: A Platform for Building Programmable Multicore Video Processors

Author(s): Richard Selvaggi, Larry Pearlstein, AMD
|
| Session 3 |
Session Three: Mobile Media Processing
Session Chair: Forest Baskett
Presentations:
- A 300-mW Single-Chip NTSC/PAL Television for Mobile Applications

Author(s): Samuel Sheng, D. Yee, S. Stoiber, P. Chi, H. Huang, A. Abo, L. Lynn, R. S. Narayanaswami, R. Contreras, R. Gupta, E. Macdonald, Telegent
- Voice Processor Based on Human Hearing System

Author(s): Lloyd Watts, Dana Massie, Allen Sansano, James Huey, Audience
- NVIDIA Tegra: Enabling Stunning Handheld Graphics & HD Video

Author(s): Michael Toksvig, John Mathieson, Brian Cabral, Brian Smith, NVIDIA
|
| Session 4 |
Session Four: Supercomputing
Session Chair: Ralph Wittig
Presentations:
- PowerXCell 8i: A Cell Broadband Engine Implementation Enhanced for Supercomputing

Author(s): Brian Flachs, D. Brokenshire, K. Imming, T. Ozguner, S. Mueller, H. J. Oh, M. Boersma, E. Doan, K. Hirairi, R. Krentler, C. Durham, A. Huynh, R. Berry, IBM
- A Specialized ASIC for Molecular Dynamics

Author(s): Martin M. Deneroff, D. E. Shaw, R. O. Dror, J. Gagliardo, J. S. Kuskin, R. H. Larson, E. C. Priest, J. K. Salmon, C. Young, D. E. Shaw
|
| Panel Discussion |
Panel Discussion: Ready, Fire, Aim – 20 years of hits & misses at Hot Chips
Session Chair: Nick Tredennick
Panelists:
Nathan Brookwood 
John R. Mashey 
David Patterson 
Dave Ditzel 
Howard Sachs 
Michael Slater
Abstract: Is computer engineering a logic-driven profession on a path of inexorable monotonic progress? It isn’t; computer engineering is as subject to fads and foibles as are the fashion and toy industries. To prove it, we’ll take a humorous, sarcastic, controversial, and embarrassing look at twenty years of hits and misses from Hot Chips conferences, primarily in microprocessor design. We’ve collected panelists from outspoken one-time designers, professors, or pundits who aren’t shy about making fun of other people’s (or their own) life’s work. |
Conference Day Two
| Session |
Tuesday, August 26, 2008 |
| Session 5 |
Session Five: FPGAs
Session Chair: Krste Asanovic
Presentations:
- Virtex-5 FXT, a New Field-Programmable Gate Array Platform

Author(s): Peter Alfke, Xilinx
- New 40nm High Performance FPGA and ASIC Common Platform

Author(s): Dan Mansur, Altera
- MAXware: Acceleration in HPC

Author(s): Michael Flynn, Rob Dimond, Oskar Mencer, Oliver Pell, Maxeler
|
| Session 6 |
Session Six: PC Chips
Session Chair: John Sell
Presentations:
- AMD 780G, an x86 Chipset with Advanced Integrated GPU

Author(s): Niles Burbank, T. Asaro, D. Cherepacha, D. Sinclair, J. Chappel, M. Tressidar, P. Ng, C. Klement, J. Bruno, L. Sinclair, C. Kuan, AMD
- Micro-architecture of Godson-3 Multi-Core Processor

Author(s): Weiwu Hu, Xiang Gao, Yunji Chen, Institute of Computing Technology, Chinese Academy of Sciences
- Inside Intel® Core™ Microarchitecture (Nehalem)

Author(s): Ronak Singhal, Intel
|
| Keynote 2 |
Keynote II: SunPower’s History and Technology
Keynote Chair: Forest Baskett
Presenter: Richard Swanson, SunPower |
| Session 7 |
Session Seven: Networking
Session Chair: Jose Renau
Presentations:
- Low Cost Chipset for Broadband Powerline Communications at 200 Mbps

Author(s): Chano Gomez, DS2
- The QFP Packet Processing Chip Set

Author(s): Donald Steiss, Will Eatherton, James Markevitch, Cisco
|
| Session 8 |
Session Eight: Visual Computing
Session Chair: Marc Tremblay
Presentations:
- NVIDIA GTX200: TeraFLOPS Visual Computing

Author(s): John Tynefield, Luke Chang, Stuart Oberman, NVIDIA
- Larrabee: A Many-Core x86 Architecture for Visual Computing

Author(s): Doug Carmean, Intel
|
| Session 9 |
Session Nine: Server Chips
Session Chair: Alan Jay Smith
Presentations:
- Tukwila: A Quad-Core Intel® Itanium® Processor

Author(s): Eric DeLano, Intel
- SPARC64VII: Fujitsu’s Next Generation Quad-Core Processor

Author(s): Takumi Maruyama, Fujitsu
- Rock: A third Generation 65nm, 16-Core, 32 Thread + 32 Scout-Threads CMT SPARC Processor

Author(s): Shailender Chaudhry, Sun
|