Program

Please join us at the Flint Center for the Performing Arts, Cupertino, California, Sunday-Tuesday, August 20-22, 2017.
Register now, or see the registration page for more details.

At A Glance
  • Sunday 8/20: Tutorials
    • 8:00 AM – 9:00 AM: Breakfast
    • 9:00 AM – 12:20 PM: Tutorial 1: P4 for Software Defined Networks: Language and Hardware Implementation
    • 12:20 PM – 1:35 PM: Lunch
    • 1:35 PM – 4:30 PM: Tutorial 2: End-to-End Autonomous Vehicle Platform
    • 4:30 PM – 6:00 PM: Reception
  • Monday 8/21: Conference Day 1
    • 7:30 AM – 9:15 AM: Breakfast
    • 9:15 AM – 9:30 AM: Introduction
    • 9:30 AM – 10:00 AM: GPU & Gaming
    • 10:00 AM – 10:30 AM: Break (Eclipse Viewing)
    • 10:30 AM – 11:30 AM: GPU & Gaming (cont.)
    • 11:30 AM – 12:30 PM: IoT/Embedded
    • 12:30 PM – 1:45 PM: Lunch
    • 1:45 PM – 2:45 PM: Keynote 1: Direct Human/Machine Interface…
    • 2:45 PM – 3:45 PM: Automotive
    • 3:45 PM – 4:15 PM: Break
    • 4:15 PM – 6:15 PM: Processors
    • 6:15 PM – 7:15 PM: Reception (Wine & Snacks)
  • Tuesday 8/22: Conference Day 2
    • 7:15 AM – 8:15 AM: Breakfast
    • 8:15 AM – 10:15 AM: FPGA
    • 10:15 AM – 10:45 AM: Break
    • 10:45 AM – 11:45 AM: Neural Net
    • 11:45 AM – 12:45 PM: Keynote 2: Advances in AI…
    • 12:45 PM – 2:00 PM: Lunch
    • 2:00 PM – 3:30 PM: Neural Net (cont)
    • 3:30 PM – 4:30 PM: Architecture
    • 4:30 PM – 5:00 PM: Break
    • 5:00 PM – 7:00 PM: Server
    • 7:00 PM – 7:15 PM: Closing Remarks

Tutorials

Sun 8/20 | Tutorial | Title | Presenter (Affiliation)
8:00 AM | | Breakfast
Tutorial 1: P4 for Software Defined Networks: Language and Hardware Implementation

The flexibility offered by Software Defined Networks (SDNs) is appealing to network operators, since it allows network behavior to be customized for application-specific needs. In SDNs, this flexibility has traditionally come at the cost of running the network on general-purpose hardware, sacrificing the performance that specialization can deliver. A recent trend in the industry is to use specialized hardware (ASICs and FPGAs) to combine the programmability and flexibility of SDN with the execution efficiency of hardware. Programmable networking hardware allows both enhancing legacy protocols (e.g., adding monitoring to a traditional L2/L3 switch) and developing new protocols at a much faster pace.

To support programmability for network devices, P4 (www.p4.org) has been developed as a new programming language for describing how network packets should be processed on a variety of targets, ranging from general-purpose CPUs to NPUs, FPGAs, and custom ASICs. P4 was designed with three goals in mind: (i) protocol independence: devices should not “bake in” specific protocols; (ii) field reconfigurability: programmers should be able to modify the behavior of devices after they have been deployed; and (iii) portability: programs should not be tied to specific hardware targets. P4 is the first widely adopted domain-specific language for packet processing. Several vendors have developed FPGA-based implementations and, with the arrival of Tofino, there is already at least one domain-specific processor optimized as a compiler target for P4 programs – a processor with a small instruction set that can process one header per cycle. The P4 community has created – and continues to maintain and develop – the language specification, a set of open-source tools (compilers, debuggers, code analyzers, libraries, software P4 switches, etc.), and sample P4 programs, all with the goal of making it easy for P4 users to quickly and correctly author new data-plane behaviors. Specialized backend compilers and optimizers for vendor targets are built on this open-source framework. With P4 and the open-source tools, it is easy to prototype new ideas for networking hardware and applications, simply by augmenting the compiler with support for the new hardware.
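To give a flavor of the language, here is a hypothetical, minimal P4_16 sketch (illustrative only, not taken from the tutorial materials; architecture-specific parser and control arguments such as metadata are omitted for brevity). It declares an Ethernet header type, a parser state that extracts it, and a match-action table that forwards on the destination MAC address:

```p4
// Hypothetical minimal P4_16 sketch: parse an Ethernet header,
// then forward based on the destination MAC via a match-action table.
header ethernet_t {
    bit<48> dstAddr;
    bit<48> srcAddr;
    bit<16> etherType;
}

struct headers_t {
    ethernet_t ethernet;
}

parser MyParser(packet_in pkt, out headers_t hdr) {
    state start {
        pkt.extract(hdr.ethernet);   // populate the Ethernet header fields
        transition accept;
    }
}

control MyIngress(inout headers_t hdr) {
    action forward(bit<9> port) { /* set the egress port on the target */ }
    action drop()               { /* target-specific drop primitive */ }

    table l2_forward {
        key     = { hdr.ethernet.dstAddr : exact; }  // match on destination MAC
        actions = { forward; drop; }
        default_action = drop();
    }
    apply { l2_forward.apply(); }
}
```

The same program text can be compiled to a software switch, an FPGA, or an ASIC such as Tofino; only the backend compiler changes, which is the portability goal described above.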

The goal of the tutorial is to introduce attendees to the domain of specialized hardware and programming tools for SDN. We discuss the basic operations in networking and how they have influenced the design of the P4 language and of specialized, programmable networking hardware. We will show how the design goals of the P4 language are met through examples of programs that can run on a variety of architectures. We provide several examples of applications (e.g., monitoring) that are enabled only by the combination of programmable hardware and a flexible language such as P4. We expect that, by the end of the tutorial, attendees will be familiar with the application domain and with several implementations from different vendors that demonstrate various trade-offs. We aim to encourage researchers to consider programmable networking devices as an area where domain-specific specialization is already moving into commercial products, and to contribute to the development of the P4-based ecosystem.

9:00 AM | T1 | Background on Software Defined Networking | Gordon Brebner (Xilinx & P4.org), Calin Cascaval (Barefoot Networks & P4.org)
9:20 AM | T1 | P4 Language and Applications | Gordon Brebner (Xilinx & P4.org), Calin Cascaval (Barefoot Networks & P4.org)
10:20 AM | | Break
10:40 AM | T1 | Overview of the P4 Tools | Gordon Brebner (Xilinx & P4.org), Calin Cascaval (Barefoot Networks & P4.org)
11:10 AM | T1 | P4 Hardware Implementations: Barefoot, Netronome, Xilinx | Gordon Brebner (Xilinx & P4.org), Calin Cascaval (Barefoot Networks & P4.org)
12:10 PM | T1 | Future Directions: Research Problems, Getting Involved, and Resources | Gordon Brebner (Xilinx & P4.org), Calin Cascaval (Barefoot Networks & P4.org)
12:20 PM | | Lunch
Tutorial 2: End-to-End Autonomous Vehicle Platform

It is not hard to imagine a day in the near future where explaining the concept of driving would be analogous to explaining the concept of using a cassette player to play music today. The path to this scenario is paved by the rise of autonomous vehicles towards making our roads safer, and in doing so, redefining transportation as we know it.

The goal of this tutorial is to provide an overview of the autonomous vehicle landscape through NVIDIA’s platform. We will cover the advancements in hardware that facilitate optimized computation for self-driving tasks, including detecting objects and obstacles around the vehicle and precisely localizing the vehicle in the world. Until recently, the perception of the world around the vehicle was derived primarily from hand-crafted computer vision algorithms. While these hand-crafted algorithms were adequate for basic ADAS, it is simply impossible to manually write code for every possible scenario an autonomous vehicle (AV) might encounter. AVs require a new computing model: deep learning. Deep learning algorithms help address the issues of robustness, accuracy, and scalability, and they have become practical through advances in the availability of big data, compute power, and accelerated frameworks.

The knowledge of where objects/obstacles are located helps keep autonomous vehicles safe and is a crucial element for navigation. To begin navigation, an autonomous vehicle needs to recognize where it is in reference to its surroundings. By using objects detected by deep neural networks as points of reference, a high definition representation of the world can be created by the vehicle in the form of an HD map. This HD map can be utilized for route planning, and maintained through a continuous loop of perception, localization and mapping. In this tutorial we will highlight how deep neural networks are changing the autonomous vehicle landscape, and why mapping is important for autonomous vehicles.

1:35 PM | T2 | An Overview of NVIDIA’s Autonomous Vehicles Platform | Pradeep Gupta (NVIDIA)
2:35 PM | | Break
3:00 PM | T2 | Deep Neural Networks – Changing the Autonomous Vehicles Landscape | Dennis Lui (NVIDIA)
4:00 PM | T2 | Mapping for Autonomous Vehicles | Richard Albayaty (NVIDIA)
4:30 PM | | Reception
6:00 PM | | End of Reception

Conference Day 1

Mon 8/21 | Session | Title | Presenter (Affiliation)
7:30 AM | | Breakfast
9:15 AM | | Introduction
9:30 AM | GPU & Gaming | The Xbox One X Scorpio Engine | John Sell (Microsoft)
10:00 AM | | Break (Eclipse Viewing)
10:30 AM | GPU & Gaming (cont.) | AMD’s Radeon Next Generation GPU | Michael Mantor & Ben Sander (AMD)
 | | NVIDIA’s Volta GPU: Programmability and Performance for GPU Computing | Jack Choquette (NVIDIA)
11:30 AM | IoT/Embedded | SiFive Freedom SoCs: Industry’s First Open-Source RISC-V Chips | Yunsup Lee (SiFive)
 | | Self-timed ARM M3 Microcontroller for Energy Harvested Applications | David Baker (ETA Compute)
12:30 PM | | Lunch
1:45 PM | Keynote 1 | The Direct Human/Machine Interface and Hints of a General Artificial Intelligence | Dr. Phillip Alvelda (Wiseteachers.com, former DARPA PM)
Abstract: Dr. Alvelda will speak about the latest and future developments in brain-machine interfaces, and how new discoveries and interdisciplinary work in neuroscience are driving new extensions to information theory and computing architectures.
2:45 PM | Automotive | R-Car Gen3: Computing Platform for Autonomous Driving Era | Mitsuhiko Igarashi & Kazuki Fukuoka (Renesas Electronics Corporation)
 | | Localization for Next Generation Autonomous Vehicles | Fergus Noble (Swift Navigation)
3:45 PM | | Break
4:15 PM | Processors | XPU: A Programmable FPGA Accelerator for Diverse Workloads | Jian Ouyang (Baidu)
 | | Knights Mill: Intel Xeon Phi Processor for Machine Learning | Jesus Corbal (lead presenter), Ken Janik, Sundaram Chinthamani, Adhiraj Hassan (Intel)
 | | Celerity: An Open Source RISC-V Tiered Accelerator Fabric | Scott Davidson (UC San Diego), Khalid Al-Hawaj (Cornell), and Austin Rovinski (U. Michigan)
 | | Graph Streaming Processor (GSP): A Next-Generation Computing Architecture | Val Cook (ThinCI)
6:15 PM | | Reception (Wine & Snacks)
7:15 PM | | End of Reception

Conference Day 2

Tue 8/22 | Session | Title | Presenter (Affiliation)
7:15 AM | | Breakfast
8:15 AM | FPGA | Xilinx RFSoC: Monolithic Integration of RF Data Converters with All Programmable SoC in 16nm FinFET for Digital-RF Communications | Brendan Farley (Xilinx)
 | | Stratix 10: Intel’s 14nm Heterogeneous FPGA System-in-Package (SiP) Platform | Sergey Shumarayev (Altera/Intel)
 | | Xilinx 16nm Datacenter Device Family with In-Package HBM and CCIX Interconnect | Gaurav Singh & Sagheer Ahmad (Xilinx)
 | | FPGA Accelerated Computing Using AWS F1 Instances | David Pellerin (Amazon)
10:15 AM | | Break
10:45 AM | Neural Net 1 | A Dataflow Processing Chip for Training Deep Neural Networks | Chris Nicol (Wave Computing)
 | | Accelerating Persistent Neural Networks at Datacenter Scale | Eric Chung & Jeremy Fowers (Microsoft)
11:45 AM | Keynote 2 | Recent Advances in Artificial Intelligence via Machine Learning and the Implications for Computer System Design | Jeff Dean (Google)
12:45 PM | | Lunch
2:00 PM | Neural Net 2 | DNN ENGINE: A 16nm Sub-uJ Deep Neural Network Inference Accelerator for the Embedded Masses | Paul Whatmough (Harvard University/ARM Research)
 | | DNPU: An Energy-Efficient Deep Neural Network Processor with On-Chip Stereo Matching | Dongjoo Shin & Hoi-Jun Yoo (KAIST)
 | | Evaluation of the Tensor Processing Unit: A Deep Neural Network Accelerator for the Datacenter | Cliff Young (Google)
3:30 PM | Architecture | A 400Gbps Multi-Core Network Processor | James Markevitch & Srinivasa Malladi (Cisco)
 | | ARM DynamIQ: Intelligent Solutions Using Cluster Based Multi-Processing | Peter Greenhalgh (ARM)
4:30 PM | | Break
5:00 PM | Server | The Next Generation IBM Z Systems Processor | Christian Jacobi & Anthony Saporito (IBM)
 | | The Next Generation AMD Enterprise Server Product Architecture | Kevin Lepak (AMD)
 | | The New Intel® Xeon® Processor Scalable Family (Formerly Skylake-SP) | Akhilesh Kumar (Intel)
 | | Qualcomm Centriq 2400 Processor | Thomas Speier & Barry Wolford (Qualcomm)
7:00 PM | | Closing Remarks
7:15 PM | | End of Conference

Posters

Title | Presenter