
Stanford Fall 2018 Course: Hardware Accelerators for Machine Learning


[Overview] For Fall 2018, Stanford University has launched "Hardware Accelerators for Machine Learning," a course on the architectural techniques behind training and inference accelerators in machine learning systems. Both systematic and up to date, it is a rare course in this area and well worth a look.



Course Overview

This course provides in-depth coverage of the architectural techniques used to design accelerators for training and inference in machine learning systems. It covers classical ML algorithms such as linear regression and support vector machines, as well as DNN models such as convolutional and recurrent neural networks. Both training and inference are considered, along with the impact of parameters such as batch size, precision, sparsity, and compression on model accuracy. The course presents accelerator designs for both ML inference and training. Students will become familiar with hardware implementation techniques that use parallelism, locality, and low precision to implement the core computational kernels of ML, and will build intuition for trading off ML model parameters against hardware implementation techniques in order to design energy-efficient accelerators. Students will read recent research papers and complete a design project.
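To make the "parallelism and locality" theme concrete, here is a minimal illustrative sketch (our own example, not course material) of the blocking/tiling idea behind high-performance GEMM kernels: the matrices are multiplied in cache-sized tiles so that each loaded block is reused many times before being evicted. The tile size here is an arbitrary stand-in for on-chip buffer capacity.

```python
import numpy as np

def blocked_gemm(A, B, tile=64):
    """Tiled matrix multiply C = A @ B.

    Illustrates the locality idea behind GEMM accelerators:
    each (tile x tile) block of A and B is reused for an entire
    block of C before new data is fetched.
    """
    n, k = A.shape
    k2, m = B.shape
    assert k == k2, "inner dimensions must match"
    C = np.zeros((n, m), dtype=A.dtype)
    for i in range(0, n, tile):
        for j in range(0, m, tile):
            for p in range(0, k, tile):
                # Multiply-accumulate one pair of tiles.
                C[i:i+tile, j:j+tile] += (
                    A[i:i+tile, p:p+tile] @ B[p:p+tile, j:j+tile]
                )
    return C

A = np.random.rand(256, 256).astype(np.float32)
B = np.random.rand(256, 256).astype(np.float32)
assert np.allclose(blocked_gemm(A, B), A @ B, atol=1e-3)
```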

Course page:

https://cs217.github.io/



Instructors

Professor Kunle Olukotun:

http://arsenalfc.stanford.edu/kunle

Ardavan Pedram:

https://web.stanford.edu/~perdavan/


Course Schedule






Lecture 1: Introduction; the role of hardware accelerators in the post-Dennard and Moore era
  Reading: Is Dark Silicon Useful?; Hennessy & Patterson, Chapters 7.1-7.2

Lecture 2: Classical ML algorithms: regression, SVMs (what is the building block?)
  Reading: TABLA

Lecture 3: Linear algebra fundamentals and accelerating linear algebra BLAS operations; 20th-century techniques: systolic arrays, MIMDs, CGRAs
  Reading: Why Systolic Architectures?; Anatomy of High Performance GEMM
  Spatial Assignment: Linear Algebra Accelerators

Lecture 4: Evaluating performance, energy efficiency, parallelism, locality, memory hierarchy; the Roofline model (a small Roofline sketch follows this schedule)
  Reading: Dark Memory

Lecture 5: Real-world architectures, putting it into practice. Accelerating GEMM: custom, GPU, and TPU1 architectures and their GEMM performance
  Reading: Google TPU; Codesign Tradeoffs; NVIDIA Tesla V100

Lecture 6: Neural networks: MLP and CNN inference
  Reading: Viviense IEEE proceeding; Brooks's book (selected chapters)
  Spatial Assignment: CNN Inference Accelerators

Lecture 7: Accelerating inference for CNNs: blocking and parallelism in practice. DianNao, Eyeriss, TPU1
  Reading: Systematic Approach to Blocking; Eyeriss; Google TPU (see lecture 5)

Lecture 8: Modeling neural networks with Spatial; analyzing performance and energy with Spatial
  Reading: Spatial; one related work

Lecture 9: Training: SGD, back propagation, statistical efficiency, batch size
  Reading: NIPS workshop last year; Graphcore
  Spatial Assignment: Training Accelerators

Lecture 10: Resilience of DNNs: sparsity and low-precision networks (a quantization sketch follows this schedule)
  Reading: Some theory paper; EIE; Flexpoint of Nervana; Boris Ginsburg: paper, presentation; LSTM Block Compression by Baidu?

Lecture 11: Low-precision training
  Reading: HALP; ternary or binary networks; see Boris Ginsburg's work (lecture 10)

Lecture 12: Training in distributed and parallel systems: Hogwild!, asynchrony, and hardware efficiency
  Reading: Deep Gradient Compression; Hogwild!; Large Scale Distributed Deep Networks; Obstinate cache?

Lecture 13: FPGAs and CGRAs: Catapult, Brainwave, Plasticine
  Reading: Catapult; Brainwave; Plasticine

Lecture 14: ML benchmarks: DAWNBench, MLPerf
  Reading: DAWNBench; some other benchmark paper

Lecture 15: Project presentations
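As a taste of lecture 4, here is a minimal Roofline-model sketch (our own illustration with hypothetical accelerator numbers, not course material): attainable throughput is the minimum of peak compute and peak memory bandwidth times arithmetic intensity, so low-intensity kernels are bandwidth-bound no matter how much compute the chip has.

```python
def roofline_flops(peak_flops, peak_bw_bytes, arithmetic_intensity):
    """Attainable FLOP/s under the Roofline model.

    arithmetic_intensity: FLOPs performed per byte moved to/from memory.
    A kernel is memory-bound left of the ridge point
    (peak_flops / peak_bw_bytes) and compute-bound right of it.
    """
    return min(peak_flops, peak_bw_bytes * arithmetic_intensity)

# Hypothetical accelerator: 100 TFLOP/s peak, 900 GB/s DRAM bandwidth.
PEAK, BW = 100e12, 900e9

# Dot product: ~2 FLOPs per 8 bytes loaded -> intensity 0.25, memory-bound.
print(roofline_flops(PEAK, BW, 0.25) / 1e12, "TFLOP/s")   # 0.225

# Large tiled GEMM: high reuse, intensity ~200 -> compute-bound.
print(roofline_flops(PEAK, BW, 200.0) / 1e12, "TFLOP/s")  # 100.0
```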
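And for the low-precision theme of lectures 10 and 11, a generic sketch of symmetric linear quantization: weights are mapped to 8-bit integers and back, trading a small accuracy loss for cheaper storage and arithmetic. This is an illustration of the general idea only, not the specific schemes (EIE, Flexpoint, HALP) covered in the readings.

```python
import numpy as np

def quantize_int8(w):
    """Symmetric linear quantization of a float tensor to int8.

    scale maps the largest |w| to 127; dequantization is q * scale.
    """
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

w = np.random.randn(1000).astype(np.float32)
q, scale = quantize_int8(w)
err = np.abs(dequantize(q, scale) - w).max()
print(f"max abs error: {err:.4f} (4x smaller storage than float32)")
```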




Guest Lecturers

Course Slides

  • Lecture01: Deep Learning Challenge. Is There Theory? (Donoho/Monajemi/Papyan)

    https://cs217.github.io/assets/lectures/StanfordStats385-20170927-Lecture01-Donoho.pdf

  • Lecture02: Overview of Deep Learning From a Practical Point of View (Donoho/Monajemi/Papyan)

    https://cs217.github.io/assets/lectures/Lecture-02-AsCorrected.pdf

  • Lecture03: Harmonic Analysis of Deep Convolutional Neural Networks (Helmut Bolcskei)

    https://cs217.github.io/assets/lectures/bolcskei-stats385-slides.pdf

  • Lecture04: Convnets from First Principles: Generative Models, Dynamic Programming & EM (Ankit Patel)

    https://cs217.github.io/assets/lectures/2017%20Stanford%20Guest%20Lecture%20-%20Stats%20385%20-%20Oct%202017.pdf

  • Lecture05: When Can Deep Networks Avoid the Curse of Dimensionality and Other Theoretical Puzzles (Tomaso Poggio)

    https://cs217.github.io/assets/lectures/StanfordStats385-20171025-Lecture05-Poggio.pdf

  • Lecture06: Views of Deep Networks from Reproducing Kernel Hilbert Spaces (Zaid Harchaoui)

    https://cs217.github.io/assets/lectures/lecture6_stats385_stanford_nov17.pdf

  • Lecture07: Understanding and Improving Deep Learning With Random Matrix Theory (Jeffrey Pennington)

    https://cs217.github.io/assets/lectures/Understanding_and_improving_deep_learing_with_random_matrix_theory.pdf

  • Lecture08: Topology and Geometry of Half-Rectified Network Optimization (Joan Bruna)

    https://cs217.github.io/assets/lectures/stanford_nov15.pdf

  • Lecture09: What’s Missing from Deep Learning? (Bruno Olshausen)

    https://cs217.github.io/assets/lectures/lecture-09--20171129.pdf

  • Lecture10: Convolutional Neural Networks in View of Sparse Coding (Vardan Papyan)

    https://cs217.github.io/assets/lectures/lecture-10--20171206.pdf


Appendix: Lecture 01, Deep Learning Challenge. Is There Theory?




Reposted from: 专知


