Large-Scale Data and Computation: Challenges and Opportunities

1. Replication

Why replication matters for large-scale systems

Data loss
- replicate the data on multiple disks/machines (GFS/Colossus)
Slow machines
- replicate the computation (MapReduce)
Too much load
- replicate for better throughput (nearly all of our services)
Bad latency
- utilize replicas to improve latency
- improved worldwide placement of data and services

Each of these problems can be addressed with appropriate replication.

Large-scale systems all run in shared environments, which has both benefits and drawbacks.

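Tolerating faults and tolerating variability are similar: both rely on extra resources, and both build a predictable whole out of unpredictable parts. They differ in time scale: faults occur at a rate of 10s per day, while variability shows up as 1000s of disruptions per second.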

Tolerating faults:
- rely on extra resources
  - RAIDed disks, ECC memory, dist. system components, etc. – make a reliable whole out of unreliable parts
Tolerating variability:
- use these same extra resources
- make a predictable whole out of unpredictable parts
Time scales are very different:
- variability: 1000s of disruptions/sec, scale of milliseconds
- faults: 10s of failures per day, scale of tens of seconds

There are two main kinds of latency-tolerating techniques, both defined relative to a request.

Cross request adaptation
- examine recent behavior
- take action to improve latency of future requests
- typically relate to balancing load across set of servers
- time scale: 10s of seconds to minutes
Within request adaptation
- cope with slow subsystems in context of higher level request
- time scale: right now, while user is waiting
Many such techniques
- [The Tail at Scale, Dean & Barroso, to appear in CACM Feb. 2013]
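One possible shape for the cross request adaptation described above is a load balancer that examines recent latencies and steers future requests toward the server that has recently been fastest. This is only a minimal sketch: the class name `LatencyAwareBalancer`, the exponentially weighted moving average, and the `alpha` parameter are illustrative assumptions, not something the talk specifies.

```python
class LatencyAwareBalancer:
    """Hypothetical balancer: route to the server with the lowest recent latency."""

    def __init__(self, servers, alpha=0.5):
        self.alpha = alpha                       # weight given to the newest sample
        self.ewma = {s: 0.0 for s in servers}    # smoothed latency per server, in ms

    def record(self, server, latency_ms):
        # Examine recent behavior: fold the new observation into the moving average.
        old = self.ewma[server]
        self.ewma[server] = self.alpha * latency_ms + (1 - self.alpha) * old

    def pick(self):
        # Take action for future requests: choose the currently fastest server.
        return min(self.ewma, key=self.ewma.get)


balancer = LatencyAwareBalancer(["s1", "s2"])
balancer.record("s1", 100.0)
balancer.record("s2", 10.0)
chosen = balancer.pick()   # "s2" has the lower smoothed latency at this point
```

The adaptation here only affects requests that have not been sent yet, which is what separates it from the within request techniques.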
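Tied Requests: a very simple technique. The client sends the same request to multiple replicas at once; when one replica starts executing it, that replica directly cancels the copies queued at the other replicas. Note: could this be complicated to implement? Having one replica directly cancel requests on other replicas seems likely to complicate the design.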
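The tied request mechanism can be sketched as a small single-process simulation. Everything named here (`Replica`, `send_tied`, the `step` method) is hypothetical and exists only for illustration. The sketch also ignores the case where both replicas start the request at the same moment, as well as all networking concerns.

```python
from collections import deque


class Replica:
    """Hypothetical replica with a FIFO queue of pending requests."""

    def __init__(self, name):
        self.name = name
        self.queue = deque()     # pending (request_id, peer) pairs
        self.cancelled = set()   # ids whose tied copy already started elsewhere
        self.executed = []       # ids this replica actually ran

    def enqueue(self, request_id, peer=None):
        self.queue.append((request_id, peer))

    def cancel(self, request_id):
        self.cancelled.add(request_id)

    def step(self):
        """Start the next non-cancelled request; return its id, or None if idle."""
        while self.queue:
            request_id, peer = self.queue.popleft()
            if request_id in self.cancelled:
                continue                     # the tied copy already started on the peer
            if peer is not None:
                peer.cancel(request_id)      # tell the peer to drop its copy
            self.executed.append(request_id)
            return request_id
        return None


def send_tied(request_id, a, b):
    # Each copy carries the identity of the replica holding the other copy.
    a.enqueue(request_id, peer=b)
    b.enqueue(request_id, peer=a)


a, b = Replica("a"), Replica("b")
a.enqueue("backlog")        # replica a is busy with earlier work
send_tied("r1", a, b)
b.step()                    # b is idle, starts r1, and cancels a's copy
a.step()                    # a runs its backlog
a.step()                    # a skips the cancelled r1 and finds nothing to do
```

In this sketch the cancellation is a single message keyed by request id, and the receiving replica only needs to check a set before starting work, which suggests the design cost raised in the note above may be modest.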

Providing cross-cluster services, e.g. the Spanner system

Our earliest systems made things easier within a cluster:

Solve many problems, but leave many cross-cluster issues to human-level operators

Spanner: Worldwide Storage

With these "just works" abstractions (GFS, MapReduce, BigTable, Spanner, tied requests, etc.) in place, we can go on to build far more impressive things, such as deep learning.

A Parameter Server is used to implement asynchronous distributed stochastic gradient descent.
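To make the asynchronous part concrete, here is a toy single-process sketch in which two simulated workers pull a parameter, compute a gradient on their own data shard, and push it back without waiting for each other, so each gradient is computed from a slightly stale copy. All names (`ParameterServer`, `pull`, `push`, `run`) are assumptions for illustration, the model is a one-parameter linear fit, and each worker uses its whole shard rather than a sampled mini-batch. None of this is the system described in the talk.

```python
class ParameterServer:
    """Hypothetical server holding a single scalar parameter w."""

    def __init__(self, w, lr):
        self.w = w
        self.lr = lr

    def pull(self):
        return self.w

    def push(self, grad):
        # Apply the update as soon as it arrives, with no synchronization barrier.
        self.w -= self.lr * grad


def gradient(w, shard):
    # Gradient of mean squared error for the model y = w * x.
    return sum(2 * (w * x - y) * x for x, y in shard) / len(shard)


def run(steps=200):
    data = [(x, 3.0 * x) for x in (1.0, 2.0, 3.0, 4.0)]   # true w is 3.0
    shards = [data[:2], data[2:]]                          # one shard per worker
    server = ParameterServer(w=0.0, lr=0.01)
    local = [server.pull() for _ in shards]                # each worker's local copy
    for step in range(steps):
        k = step % len(shards)                 # workers take turns in this simulation
        grad = gradient(local[k], shards[k])   # computed on a possibly stale copy
        server.push(grad)
        local[k] = server.pull()               # refresh only this worker's copy
    return server.w


w_final = run()
```

With this small learning rate the staleness does not prevent convergence toward the true value, which gives a sense of why the asynchronous scheme can work despite workers never agreeing on a single current parameter.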

Implementing this kind of deep learning system also involves many interesting tradeoffs.