学习：Distributed Programming - 分布系统

latency.png

分布式系统，就好像一个公司、群体、团体的抽象模型。

distributed programming
大部分内容来自：http://blog.zhuanxu.org/2017-01-19-Distributed-systems-for-fun-and-profit.html

information travels at the speed of light
independent things fail independently

让读者认识对 distance, time and consistency models 怎么交互有个更好的理解。

Distributed systems at a high level

Distributed programming is the art of solving the same problem that you can solve on a single computer useing multiple computers.

分布式系统遇到的问题和单机系统是一样的，只是解决同一个问题时，其系统的 constraints 不同了。

计算机系统主要解决两类问题： 1. 计算； 2. 存储

各种框架可以划分为两个领域的战争，一个是偏向底层存储的战争，一个是偏向计算的战争。

STORAGE VS PROCESSING WARS

存储：关系型数据库和非关系数据库(Relational vs NoSQL)

关系型数据库最大的特点是事务的一致性，读写操作都是事务的，具有 ACID 的特点，它在银行这样对已执行有要求的系统中应用广泛。
非关系型数据库一般对一致性要求不高，但支持高性能并发读写，海量数据访问，在微博、Facebook 这类 SNS 应用中广泛使用。

非关系型数据库内部也有战争，比如：注重一致性(Consistency)和可用性(Availability) 的 HBase，与提供可用性(Availability)和分区容错性(Partition tolerance) 的 Cassandra。

Redis 和 Memcached，都是内存内的 Key/Value 存储，但 Redis 还支持哈希表，有序集和链表等多种数据结构。

MongoDB, CouchDB 和 Couchbase 都是文档型数据库，MongoDB 更适用于需要动态查询的场景，CouchDB 偏向于预定义查询，Couchbase 比 CouchDB 有更强的一致性，而且还可以作为 Key/Value 存储。
(https://zhuanlan.zhihu.com/p/21587328?refer=bittiger)

一、为什么需要分布式？

理论上，如果单机系统有无限的内存，极致的计算速度，那是没必要将系统设计为分布式的，因为分布式引入了一系列问题。但是，我们没有 money 啊，只能妥协，用平价机器来实现高性能。

二、通过分布式，希望得到什么？

size-scalability

everything starts with the need to deal with size-scalability
everything starts with size-scalability

什么是 scalability?

Scalability is the ability of a system, network, or process, to handle a growing amount of work in a capable manner or its ability to be enlarged to accommodate that growth.

scalability: 伸缩性，系统伸缩性好意味着，当系统处理 growing amount of work 时，即使处理难度不是线性增长的，也应该是可预测的，而不是指数级增长的。

什么是 growing amount of work? For example,

Size scalability: adding more nodes should make the system linearly faster; growing the dataset should not increase latency
Geographic scalability: it should be possible to use multiple data centers to reduce the time it takes to respond to use queries, while dealing with cross-data center latency in some sensible manner.
Administrative scalability: adding more nodes should not increase the administrative costs of the system (e.g. the administrators-to-machines ratio).

可以看到每个例子都试着从存储和计算两方面进行解释，然后每个处理都是在time，distance和consistency的抉择。

随着系统size的增加，我们的设计需要不断满足实际的需求，而这些需求主要来自于

performance
availability
Performance (and latency)

而这两方面在实际系统中的测量方式也有很多，先来看performance

Performance is characterized by the amount of useful work accomplished by a computer system compared to the time and resources used.

性能主要是通过计算机系统有效工作占计算时间和资源的比例来衡量的，更具体来说：

Short response time/low latency for a given piece of work
High throughput (rate of processing work)
Low utilization of computing resource(s)
低延迟，高吞吐，低计算资源使用率

先看延迟

Latency：The state of being latent; delay, a period between the initiation of something and the occurrence.

Latent：From Latin latens, latentis, present participle of lateo (“lie hidden”). Existing or present but concealed or inactive.

延迟：一件事情发生到真正产生影响

举个例子：假设一个数据中心，有一个功能是通过计算所有数据，然后返回一个结果

1
result = query(all data in the system)
对于上面这个请求，影响延迟的因素是什么呢？

the amount of old data
the speed at which new data “takes eﬀect” in the system
是1还是2呢？

我们可以知道如果系统处理query的速度和新数据在系统中生效的速度一样，那么我们会发现query将永远不会返回，因此真正决定延迟的是2.

另外一点我们需要注意的是：延迟是因为有新数据的产生，那如果系统没有产生数据，那就没有延迟一说。

最后一点我们无法忽略的延迟是：

the speed of light limits how fast information can travel, and
hardware components have a minimum latency cost incurred per operation (think RAM and hard drives but also CPUs).
信息传输产生的延迟以及硬件本身产生的延迟。

因此决定延迟的因素有：

操作本身的耗时
信息传输延迟
看完性能后，我们看下一个分布式系统的主需求：可用性

Availability (and fault tolerance)

Performance is characterized by the amount of useful work accomplished by a computer system compared to the time and resources used.

单机系统相比较于分布式系统永远无法达到的一点是：高可用或者说容错

分布式系统可以通过一群不可靠的组件，最后达到一个高可用的系统，这就是分布式的魅力。

那我们来看下什么是容错（fault tolerance）

Fault tolerance：ability of a system to behave in a well-deﬁned manner once faults occur

从定义我们可以看到，首先我们要规定问题边界：要处理的错误，基于定义好的错误我们去开发算法去解决它。如果系统发生了我们从未考虑的错误，那何来谈容错一说。

What prevents us from achieving good things?

分布式系统受限于两个物理约束：

the number of nodes (which increases with the required storage and computation capacity)
the distance between nodes (information travels, at best, at the speed of light)
简单来说就是数量和距离，上面的物理限制在增加节点的时候会带来：

an increase in the number of independent nodes increases the probability of failure in a system (reducing availability and increasing administrative costs)【降低可用性和管理成本】
an increase in the number of independent nodes may increase the need for communication between nodes (reducing performance as scale increases)【通信成本增加降低系统性能】
an increase in geographic distance increases the minimum latency for communication between distant nodes (reducing performance for certain operations)【多数据中心增加通信成本】
上面说了这么多限制，那怎么定义系统好不好呢？总的来说是从 performance 和 availability 来看，具体衡量则是我们经常说的SLA(service level agreement)。

除了 performance 和 availability 外，还有一个很重要的点是： intelligibility，即系统好不好理解，如果系统设计的非常好，但是只有设计的人懂，那这种系统价值也是不大的。

Abstractions and models

抽象帮我们做减法，只剩下本质的东西，而模型则将抽象具现化，让我们能更精确的定义抽象的东西。

我们来举一些具体的模型的例子，这些也是之后章节会讲的东西：

System model (asynchronous / synchronous)
Failure model (crash-fail, partitions, Byzantine)
Consistency model (strong, eventual)
我们设计分布式系统的目标是：work like a single system，但是现实中我们有很多的节点需要处理，我们很难让使用者在无感知的情况下使用分布式系统，模型越是简单，意味着提供给用户的功能越是抽象，如果用户想要获得更好的性能，更高级的功能，只能穿过给用户提供的抽象，去看系统的实现细节。

Design techniques: partition and replicate

为了存储更多的数据，我们将数据保存到不同的节点上，为了让运算更快，我们将同一份数据copy多份，让多个实例进行计算，一个很好的总结是：

Divide and conquer - I mean, partition and replicate.

Partitioning

Partitioning improves performance by limiting the amount of data to be examined and by locating related data in the same partition【数据分片后减少数据大小从而提高性能】
Partitioning improves availability by allowing partitions to fail independently, increasing the number of nodes that need to fail before availability is sacriﬁced【每个数据片失败都是独立的，提供了可用性】
Replication

To replication! The cause of, and solution to all of life’s problems.

复制是应对延迟的有效手段

Replication improves performance by making additional computing power and bandwidth applicable to a new copy of the data
Replication improves availability by creating additional copies of the data, increasing the number of nodes that need to fail before availability is sacriﬁced
复制来到好处的同时，也带来了很多问题，最大的就是数据一致性问题，只有当模型是 strong consistency 的时候，我们才会得到一个简单的编程模型（和单机系统一致），其他模型我们都好去理解系统内部是怎么做的，这样子才能很好的满足我们的需求。

学习：Distributed Programming - 分布系统

推荐阅读更多精彩内容