18847F
18847F was a special topics course in computer systems, Foundations of Cloud and Machine Learning Infrastructure, taught by Professor Gauri Joshi.
The objective of this course was to introduce students to modern cloud and machine learning infrastructure and its theoretical foundations. The course was divided into two sections. The first covered distributed computing and storage systems, frameworks such as MapReduce and Spark, and the scheduling and load-balancing policies used in them.
In the context of distributed storage systems, the course discussed coding-theoretic techniques used to improve availability and repair failed nodes. This section covered papers such as ‘Sparrow’, ‘Attack of the Clones’, ‘Rateless coding’, ‘Gradient coding’, etc.
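As a rough illustration of the coding-theoretic idea, the sketch below uses a single XOR parity block so that any one lost block can be rebuilt from the survivors. This is a toy stand-in rather than a scheme taken from the course papers, and the names `xor_blocks`, `encode`, and `repair` are hypothetical.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equal-length byte strings."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def encode(data_blocks):
    """Store the data blocks plus one XOR parity block (one block per node)."""
    return list(data_blocks) + [xor_blocks(data_blocks)]


def repair(stored, failed_index):
    """Rebuild the block on a single failed node by XOR-ing all surviving blocks."""
    survivors = [block for i, block in enumerate(stored) if i != failed_index]
    return xor_blocks(survivors)


# Toy usage: three data nodes and one parity node; node 1 fails and is rebuilt.
data = [b"node", b"fail", b"safe"]
stored = encode(data)
recovered = repair(stored, failed_index=1)
```

A single parity block tolerates only one failure. Tolerating more failures, or reducing the amount of data read during repair, requires stronger codes.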
The second section of the course focused on machine learning infrastructure: stochastic gradient descent and its implementation in large-scale systems, coupled with adaptive communication strategies. Papers covered in this section include ‘DistBelief’, ‘Hogwild!’, ‘Slow and stale gradients’, ‘PipeDream’, ‘Cooperative SGD’, ‘AdaComm’, ‘Federated learning’, etc.
An extension to these two sections introduced a wider and somewhat more exotic range of SysML topics, from model compression and hyperparameter tuning to multi-armed bandits, Gaussian processes, and Bayesian optimization. This section covered papers such as ‘TernGrad’, ‘ATOMO’, ‘PowerSGD’, ‘HyperBand’, ‘Neural Architecture Search’, ‘Parallel Bayesian optimisation’, etc., along with guest lectures on multi-armed bandits and Gaussian processes.
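To give a flavour of the hyperparameter tuning material, here is a minimal sketch of successive halving, the resource-allocation routine that HyperBand builds on. Every candidate is evaluated on a small budget, only the best fraction is kept, and the budget grows each round. The loss function `toy_loss` and the candidate learning rates are made up for illustration and are not from the course.

```python
def successive_halving(configs, evaluate, min_budget=1, eta=3):
    """Keep the best 1/eta of the configs each round while multiplying the budget by eta."""
    budget = min_budget
    survivors = list(configs)
    while len(survivors) > 1:
        # Rank the remaining configs by loss at the current budget (lower is better).
        ranked = sorted(survivors, key=lambda config: evaluate(config, budget))
        keep = max(1, len(survivors) // eta)
        survivors = ranked[:keep]
        budget *= eta
    return survivors[0]


def toy_loss(learning_rate, budget):
    """Hypothetical loss: best near 0.1, and less pessimistic as the budget grows."""
    return abs(learning_rate - 0.1) + 1.0 / budget


candidates = [0.001, 0.003, 0.01, 0.03, 0.1, 0.3, 1.0, 3.0, 10.0]
best = successive_halving(candidates, toy_loss)
```

With nine candidates and `eta=3`, the search runs two rounds (nine configs, then three) before a single config remains. HyperBand itself runs several such brackets with different trade-offs between the number of configs and the budget per config.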