Design patterns for container-based distributed systems (translation of the Google paper)

Brendan Burns David Oppenheimer

Google

Translated for study by: zackerycao

1 Introduction

In the late 1980s and early 1990s, object-oriented programming revolutionized software development, popularizing the approach of building applications as collections of modular components. Today we are seeing a similar revolution in distributed system development, with the increasing popularity of microservice architectures built from containerized software components. Containers [15] [22] [1] [2] are particularly well-suited as the fundamental “object” in distributed systems by virtue of the walls they erect at the container boundary. As this architectural style matures, we are seeing the emergence of design patterns, much as we did for object-oriented programs, and for the same reason: thinking in terms of objects (or containers) abstracts away the low-level details of code, eventually revealing higher-level patterns that are common to a variety of applications and algorithms.

This paper describes three types of design patterns that we have observed emerging in container-based distributed systems: single-container patterns for container management, single-node patterns of closely cooperating containers, and multi-node patterns for distributed algorithms. Like object-oriented patterns before them, these patterns for distributed computation encode best practices, simplify development, and make the systems where they are used more reliable.

2 Distributed system design patterns

After object-oriented programming had been used for some years, design patterns emerged and were documented [3]. These patterns codified and regularized general approaches to solving particular common programming problems. This codification further improved the general state of the art in programming because it made it easier for less experienced programmers to produce well-engineered code, and led to the development of reusable libraries that made code more reliable and faster to develop.

The state of the art in distributed system engineering today looks significantly more like the world of early 1980s programming than it does the world of object-oriented development. Yet it’s clear from the success of the MapReduce pattern [4] in bringing the power of “Big Data” programming to a broad set of fields and developers, that putting in place the right set of patterns can dramatically improve the quality, speed, and accessibility of distributed system programming. But even the success of MapReduce is largely limited to a single programming language, insofar as the Apache Hadoop [5] ecosystem is primarily written in and for Java. Developing a truly comprehensive suite of patterns for distributed system design requires a very generic, language-neutral vehicle to represent the atoms of the system.

Thus it is fortunate that the last two years have seen a dramatic rise in adoption of Linux container technology. The container and the container image are exactly the abstractions needed for the development of distributed systems patterns. To date, containers and container images have achieved a large measure of their popularity simply by being a better, more reliable method for delivering software from development all the way through production. By being hermetically sealed, carrying their dependencies with them, and providing an atomic deployment signal (“succeeded”/“failed”), they dramatically improve on the previous state of the art in deploying software in the datacenter or cloud. But containers have the potential to be much more than just a better deployment vehicle: we believe they are destined to become analogous to objects in object-oriented software systems, and as such will enable the development of distributed system design patterns. In the following sections we explain why we believe this to be the case, and describe some patterns that we see emerging to regularize and guide the engineering of distributed systems over the coming years.

3 Single-container management patterns

The container provides a natural boundary for defining an interface, much like the object boundary. Containers can expose not only application-specific functionality, but also hooks for management systems, via this interface.

The traditional container management interface is extremely limited. A container effectively exports three verbs: run(), pause(), and stop(). Though this interface is useful, a richer interface can provide even more utility to system developers and operators. And given the ubiquitous support for HTTP web servers in nearly every modern programming language and widespread support for data formats like JSON, it is easy to define an HTTP-based management API that can be “implemented” by having the container host a web server at specific endpoints, in addition to its main functionality.

In the “upward” direction, the container can expose a rich set of application information, including application-specific monitoring metrics (QPS, application health, etc.), profiling information of interest to developers (threads, stack, lock contention, network message statistics, etc.), component configuration information, and component logs. As a concrete example of this, Kubernetes [6], Aurora [7], Marathon [8], and other container management systems allow users to define health checks via specified HTTP endpoints (e.g. “/health”). Standardized support for the other elements of the “upward” API we have described is rarer.
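As a rough illustration of such an “upward” API, the sketch below hosts a small management web server next to the application. Only the “/health” path comes from the text; the “/metrics” path, the `APP_STATE` dictionary, the port, and all function names are assumptions made for this example, not part of any container manager’s API.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical application state; a real service would update it as it runs.
APP_STATE = {"healthy": True, "requests_served": 0}


def management_response(path, state):
    """Return (HTTP status, JSON-serializable body) for a management path."""
    if path == "/health":
        if state["healthy"]:
            return 200, {"status": "ok"}
        return 503, {"status": "unhealthy"}
    if path == "/metrics":  # assumed endpoint name, not a standard
        return 200, {"requests_served": state["requests_served"]}
    return 404, {"error": "unknown endpoint"}


class ManagementHandler(BaseHTTPRequestHandler):
    """Serves the management API alongside the container's main job."""

    def do_GET(self):
        status, body = management_response(self.path, APP_STATE)
        payload = json.dumps(body).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def start_management_server(port=8080):
    """Run the management API on a background thread and return the server."""
    server = HTTPServer(("0.0.0.0", port), ManagementHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

Keeping the routing in a plain function (`management_response`) makes the contract easy to test without opening a socket; a management system would then be pointed at the “/health” path.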

In the “downward” direction, the container interface provides a natural place to define a lifecycle that makes it easier to write software components that are controlled by a management system. For example, a cluster management system will typically assign “priorities” to tasks, with high-priority tasks guaranteed to run even when the cluster is oversubscribed. This guarantee is enforced by evicting already-running lower-priority tasks, which will then have to wait until resources become available. Eviction can be implemented by simply killing the lower-priority task, but this puts an undue burden on the developer to respond to arbitrary death anywhere in their code. If, instead, a formal lifecycle is defined between application and management system, then the application components become more manageable, since they conform to a defined contract, and the development of the system becomes easier, since the developer can rely on the contract. For example, Kubernetes uses a “graceful deletion” feature of Docker that warns a container, via the SIGTERM signal, that it is going to be terminated, an application-defined amount of time before it is sent the SIGKILL signal. This allows the application to terminate cleanly by finishing in-flight operations, flushing state to disk, etc. One can imagine extending such a mechanism to provide support for state serialization and recovery that makes state management significantly easier for stateful distributed systems.
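The SIGTERM warning described above can be handled with a few lines in the application. The following is a minimal sketch, assuming a worker that processes a list of jobs; `run_worker`, `process`, and `flush` are hypothetical names chosen for illustration.

```python
import signal
import threading

# Set when the management system warns us that termination is coming.
shutdown_requested = threading.Event()


def handle_sigterm(signum, frame):
    """Signal handler: only record the request, do the real work elsewhere."""
    shutdown_requested.set()


def install_handler():
    """Register the handler; must be called from the main thread."""
    signal.signal(signal.SIGTERM, handle_sigterm)


def run_worker(jobs, process, flush):
    """Process jobs until done or until SIGTERM arrives, then flush state."""
    completed = []
    for job in jobs:
        if shutdown_requested.is_set():
            break  # stop taking new work, but never abandon a job midway
        completed.append(process(job))
    flush(completed)  # e.g. write results to disk before SIGKILL arrives
    return completed
```

The handler itself does almost nothing; the main loop checks the flag between jobs, so in-flight work finishes and state is flushed within the grace period.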

As a concrete example of a more complex lifecycle, consider the Android Activity model [9], which features a series of callbacks (e.g. onCreate(), onStart(), onStop(), ...) and a formally defined state machine for how the system triggers these callbacks. Without this formal lifecycle, robust, reliable Android applications would be significantly harder to develop. In the context of container-based systems, this generalizes to application-defined hooks that are invoked when a container is created, when it is started, just before termination, etc. Another example of a “downward” API that a container might support is “replicate yourself” (to scale up the service).
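A formally defined lifecycle of this kind can be pictured as a small state machine with hooks. The sketch below is purely illustrative: the state names and the `Lifecycle` class are assumptions for this example and do not correspond to Android’s or any container manager’s actual API.

```python
# Allowed transitions of a hypothetical container lifecycle.
TRANSITIONS = {
    "new": {"created"},
    "created": {"started"},
    "started": {"terminating"},
    "terminating": {"stopped"},
    "stopped": set(),
}


class Lifecycle:
    """Tracks the current state and runs application-defined hooks."""

    def __init__(self):
        self.state = "new"
        self.hooks = {}

    def on(self, state, hook):
        """Register a callable to run whenever `state` is entered."""
        self.hooks.setdefault(state, []).append(hook)

    def advance(self, new_state):
        """Move to `new_state`, rejecting transitions the contract forbids."""
        if new_state not in TRANSITIONS[self.state]:
            raise ValueError("illegal transition %s -> %s" % (self.state, new_state))
        self.state = new_state
        for hook in self.hooks.get(new_state, []):
            hook()
```

Because illegal transitions raise an error, both the application and the management system can rely on hooks firing in a known order.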

4 Single-node, multi-container application patterns

Beyond the interface of a single container, we also see design patterns emerging that span containers. We have previously identified several such patterns [10]. These single-node patterns consist of symbiotic containers that are co-scheduled onto a single host machine. Container management system support for co-scheduling multiple containers as an atomic unit, an abstraction Kubernetes calls “Pods” and Nomad [11] calls “task groups,” is thus a required feature for enabling the patterns we describe in this section.

4.1 Sidecar pattern

The first and most common pattern for multi-container deployments is the sidecar pattern. Sidecars extend and enhance the main container. For example, the main container might be a web server, and it might be paired with a “logsaver” sidecar container that collects the web server’s logs from local disk and streams them to a cluster storage system. Figure 1 shows an example of the sidecar pattern. Another common example is a web server that serves from local disk content that is populated by a sidecar container that periodically synchronizes the content from a git repository, content management system, or other data source. Both of these examples are common at Google. Sidecars are possible because containers on the same machine can share a local disk volume.

image-20200405215643019.png

Figure 1: An example of a sidecar container augmenting an application with log saving.
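To make the log-saving sidecar concrete, here is a minimal sketch of its core loop body: read whatever the web server has appended to a log file on the shared volume and hand complete lines to a sink. The function name, the offset bookkeeping, and the `sink` callback (standing in for a cluster storage client) are assumptions for illustration.

```python
def ship_new_lines(log_path, offset, sink):
    """Send lines appended since `offset` to `sink`; return the new offset."""
    with open(log_path, "rb") as log_file:
        log_file.seek(offset)
        chunk = log_file.read()
    # Only ship complete lines; a partial trailing line waits for the next pass.
    end = chunk.rfind(b"\n") + 1
    for line in chunk[:end].splitlines():
        sink(line.decode("utf-8"))
    return offset + end
```

A sidecar would call this periodically with the offset returned by the previous call, so the web server never needs to know that its logs are being collected.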

While it is always possible to build the functionality of a sidecar container into the main container, there are several benefits to using separate containers. First, the container is the unit of resource accounting and allocation, so for example a web server container’s cgroup [15] can be configured so that it provides consistent low-latency responses to queries, while the logsaver container is configured to scavenge spare CPU cycles when the web server is not busy. Second, the container is the unit of packaging, so separating serving and log saving into different containers makes it easy to divide responsibility for their development between two separate programming teams, and allows them to be tested independently as well as together. Third, the container is the unit of reuse, so sidecar containers can be paired with numerous different “main” containers (e.g. a log saver container could be used with any component that produces logs). Fourth, the container provides a failure containment boundary, making it possible for the overall system to degrade gracefully (for example, the web server can continue serving even if the log saver has failed). Lastly, the container is the unit of deployment, which allows each piece of functionality to be upgraded and, when necessary, rolled back, independently. (Though it should be noted that this last benefit also comes with a downside: the test matrix for the overall system must consider all of the container version combinations that might be seen in production, which can be large since sets of containers generally can’t be upgraded atomically. Of course, while a monolithic application doesn’t have this issue, componentized systems are easier to test in some regards, since they are built from smaller units that can be independently tested.) Note that these five benefits apply to all of the container patterns we describe in the remainder of this paper.

4.2 Ambassador pattern

The next pattern that we have observed is the ambassador pattern. Ambassador containers proxy communication to and from a main container. For example, a developer might pair an application that is speaking the memcache protocol with a twemproxy ambassador. The application believes that it is simply talking to a single memcache on localhost, but in reality twemproxy is sharding the requests across a distributed installation of multiple memcache nodes elsewhere in the cluster.

image-20200406152210605.png

Figure 2: An example of the ambassador pattern applied to proxying to different memcache shards.

This container pattern simplifies the programmer’s life in three ways: they only have to think and program in terms of their application connecting to a single server on localhost, they can test their application standalone by running a real memcache instance on their local machine instead of the ambassador, and they can reuse the twemproxy ambassador with other applications that might even be coded in different languages. Ambassadors are possible because containers on the same machine share the same localhost network interface. An example of this pattern is shown in Figure 2.
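The sketch below illustrates the idea of an ambassador that looks like a single cache to the application while spreading keys over several backends. It uses a simple hash-modulo scheme chosen for brevity, which is an assumption of this example and not a description of how twemproxy distributes keys; the in-memory dictionaries stand in for remote memcache nodes.

```python
import hashlib


def pick_shard(key, shards):
    """Map a cache key to one backend; stable for a fixed shard list."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[index]


class Ambassador:
    """Looks like one cache to the application, fans out to many shards."""

    def __init__(self, shards):
        # Each shard is a plain dict standing in for a remote memcache node.
        self.shards = shards

    def set(self, key, value):
        pick_shard(key, self.shards)[key] = value

    def get(self, key):
        return pick_shard(key, self.shards).get(key)
```

The application only ever calls `set` and `get` on one object, which mirrors how the real application only ever talks to localhost.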

4.3 Adapter pattern

The final single-node pattern we have observed is the adapter pattern. In contrast to the ambassador pattern, which presents an application with a simplified view of the outside world, adapters present the outside world with a simplified, homogenized view of an application. They do this by standardizing output and interfaces across multiple containers. A concrete example of the adapter pattern is adapters that ensure all containers in a system have the same monitoring interface. Applications today use a wide variety of methods to export their metrics (e.g. JMX, statsd, etc.). But it is easier for a single monitoring tool to collect, aggregate, and present metrics from a heterogeneous set of applications if all the applications present a consistent monitoring interface. Within Google, we have achieved this via code convention, but this is only possible if you build your software from scratch. The adapter pattern enables the heterogeneous world of legacy and open-source applications to present a uniform interface without requiring modification of the original application. The main container can communicate with the adapter through localhost or a shared local volume. This is shown in Figure 3. Note that while some existing monitoring solutions are able to communicate with multiple types of back-ends, they use application-specific code in the monitoring system itself, which provides a less clean separation of concerns.

image-20200406170647446.png

Figure 3: An example of the adapter pattern applied to normalizing the monitoring interface.
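As a small illustration of a monitoring adapter, the sketch below converts statsd-style lines (`name:value|type`) into one uniform JSON shape. The output field names and the function names are assumptions made for this example; a real adapter would expose whatever format the chosen monitoring system expects.

```python
import json

# Short statsd type codes mapped to more readable names (assumed naming).
STATSD_TYPES = {"c": "counter", "g": "gauge", "ms": "timer"}


def adapt_statsd_line(line):
    """Convert one statsd-style line, e.g. "hits:3|c", to the common shape."""
    name, rest = line.strip().split(":", 1)
    value, kind = rest.split("|")[:2]  # ignore optional fields such as "@0.1"
    return {"name": name, "value": float(value), "type": STATSD_TYPES.get(kind, kind)}


def render_metrics(lines):
    """Body the adapter could serve on its uniform monitoring endpoint."""
    metrics = [adapt_statsd_line(line) for line in lines if line.strip()]
    return json.dumps(metrics, sort_keys=True)
```

The application keeps emitting its native format; only the adapter knows about both sides.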

5 Multi-node application patterns

Moving beyond cooperating containers on a single machine, modular containers make it easier to build coordinated multi-node distributed applications. We describe three of these distributed system patterns next. Like the patterns in the previous section, these also require system support for the Pod abstraction.

5.1 Leader election pattern

One of the most common problems in distributed systems is leader election (e.g. [20]). While replication is commonly used to share load among multiple identical instances of a component, another, more complex use of replication is in applications that need to distinguish one replica from a set as the “leader.” The other replicas are available to quickly take the place of the leader if it fails. A system may even run multiple leader elections in parallel, for example to determine the leader of each of multiple shards.

There are numerous libraries for performing leader election. They are generally complicated to understand and use correctly, and additionally, they are limited by being implemented in a particular programming language. An alternative to linking a leader election library into the application is to use a leader election container. A set of leader-election containers, each one co-scheduled with an instance of the application that requires leader election, can perform election amongst themselves, and they can present a simplified HTTP API over localhost to each application container that requires leader election (e.g. becomeLeader, renewLeadership, etc.). These leader election containers can be built once, by experts in this complicated area, and then the subsequent simplified interface can be re-used by application developers regardless of their choice of implementation language. This represents the best of abstraction and encapsulation in software engineering.
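The simplified interface mentioned above (becomeLeader, renewLeadership) can be sketched with a time-limited lease. This is only one possible approach and an assumption of this example, not the paper’s implementation: `LeaseStore` is an in-memory stand-in for a shared, consistent store, and a real election container would need that store to provide atomic updates across machines.

```python
import time


class LeaseStore:
    """Stand-in for a shared, consistent store holding one lease record."""

    def __init__(self, clock=time.monotonic):
        self.clock = clock
        self.holder = None
        self.expires_at = 0.0

    def try_acquire(self, candidate, ttl):
        """Grant the lease if it is free, expired, or already ours."""
        now = self.clock()
        if self.holder in (None, candidate) or now >= self.expires_at:
            self.holder = candidate
            self.expires_at = now + ttl
            return True
        return False


class Elector:
    """What one election container would expose to its application."""

    def __init__(self, name, store, ttl=10.0):
        self.name, self.store, self.ttl = name, store, ttl

    def become_leader(self):
        return self.store.try_acquire(self.name, self.ttl)

    def renew_leadership(self):
        # Renewal is refused once another replica has taken over the lease.
        if self.store.holder != self.name:
            return False
        return self.store.try_acquire(self.name, self.ttl)
```

If the leader stops renewing, its lease expires and another replica’s `become_leader` call succeeds, which is the failover behavior the text describes.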

5.2 Work queue pattern

Although work queues, like leader election, are a well-studied subject with many frameworks implementing them, they too are an example of a distributed system pattern that can benefit from container-oriented architectures. In previous systems, the framework limited programs to a single language environment (e.g. Celery for Python [13]), or the distribution of work and binary were exercises left to the implementer (e.g. Condor [21]). The availability of containers that implement the run() and mount() interfaces makes it fairly straightforward to implement a generic work queue framework that can take arbitrary processing code packaged as a container, and arbitrary data, and build a complete work queue system. The developer only has to build a container that can take an input data file on the filesystem, and transform it to an output file; this container would become one stage of the work queue. All of the other work involved in developing a complete work queue can be handled by the generic work queue framework that can be reused whenever such a system is needed. The manner in which a user’s code integrates into this shared work queue framework is illustrated in Figure 4.

image-20200407193230032.png

Figure 4: An illustration of the generic work queue. Reusable framework containers are shown in dark gray, while developer containers are shown in light gray.
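The division of labor in the work queue pattern can be sketched as follows: the reusable framework walks the input files, and the developer supplies only a stage that turns one input file into one output file. Here a Python callable stands in for the developer’s container, and the directory layout and function names are assumptions for illustration.

```python
import os


def run_work_queue(input_dir, output_dir, stage):
    """Apply `stage(in_path, out_path)` to every file in input_dir."""
    os.makedirs(output_dir, exist_ok=True)
    done = []
    for name in sorted(os.listdir(input_dir)):
        in_path = os.path.join(input_dir, name)
        out_path = os.path.join(output_dir, name)
        if os.path.exists(out_path):
            continue  # already processed; makes reruns idempotent
        stage(in_path, out_path)
        done.append(name)
    return done


def uppercase_stage(in_path, out_path):
    """Example developer-supplied stage: input file in, output file out."""
    with open(in_path) as src, open(out_path, "w") as dst:
        dst.write(src.read().upper())
```

Swapping `uppercase_stage` for another function changes what the queue computes without touching the framework code, which is the reuse the pattern aims for.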

5.3 Scatter/gather pattern

The last distributed systems pattern we highlight is scatter/gather. In such a system, an external client sends an initial request to a “root” or “parent” node. This root fans the request out to a large number of servers to perform computations in parallel. Each shard returns partial data, and the root gathers this data into a single response to the original request. This pattern is common in search engines. Developing such a distributed system involves a great deal of boilerplate code: fanning out the requests, gathering the responses, interacting with the client, etc. Much of this code is quite generic, and again, as in object-oriented programming, can be refactored in such a way that a single implementation can be provided that can be used with arbitrary containers so long as they implement a specific interface. In particular, to implement a scatter/gather system, a user is required to supply two containers. First, the container that implements the leaf node computation; this container performs the partial computation and returns the corresponding result. The second container is the merge container; this container takes the aggregated output of all of the leaf containers, and groups them into a single response. It is easy to see how a user can implement a scatter/gather system of arbitrary depth (including parents, in addition to roots, if necessary) simply by providing containers that implement these relatively simple interfaces. Such a system is illustrated in Figure 5.

image-20200408002913776.png

Figure 5: An illustration of the scatter/gather pattern. A reusable root container (dark gray) implements client interactions and request fan-out to developer-supplied leaf containers and to a developer-supplied container responsible for merging the results (all in light gray).
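The split between the reusable root and the two developer-supplied pieces can be sketched like this. Plain Python callables stand in for the leaf and merge containers, and the toy search over in-memory document lists is an assumption chosen only to make the example runnable.

```python
from concurrent.futures import ThreadPoolExecutor


def scatter_gather(request, leaves, merge):
    """Generic root: fan the request out to every leaf, then merge."""
    with ThreadPoolExecutor(max_workers=len(leaves)) as pool:
        partials = list(pool.map(lambda leaf: leaf(request), leaves))
    return merge(partials)


def make_leaf(documents):
    """Developer-supplied leaf: search one shard of the documents."""
    def leaf(query):
        return [doc for doc in documents if query in doc]
    return leaf


def merge_results(partials):
    """Developer-supplied merge: combine partial hits into one response."""
    return sorted(hit for partial in partials for hit in partial)
```

Only `make_leaf` and `merge_results` are specific to the application; `scatter_gather` is the part a framework could provide once and reuse. The sketch assumes at least one leaf is supplied.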

6 Related work

Service-oriented architectures (SOA) [16] pre-date, and share a number of characteristics with, container-based distributed systems. For example, both emphasize reusable components with well-defined interfaces that communicate over a network. On the other hand, components in SOA systems tend to be larger-grain and more loosely-coupled than the multi-container patterns we have described. Additionally, components in SOA often implement business activities, while the components we have focused on here are more akin to generic libraries that make it easier to build distributed systems. The term “microservice” has recently emerged to describe the types of components we have discussed in this paper.

The concept of standardized management interfaces to networked components dates back at least to SNMP [19]. SNMP focuses primarily on managing hardware components, and no standard has yet emerged for managing microservice/container-based systems. This has not prevented the development of numerous container management systems, including Aurora [7], ECS [17], Docker Swarm [18], Kubernetes [6], Marathon [8], and Nomad [11].

All of the distributed algorithms we mentioned in Section 5 have a long history. One can find a number of leader election implementations on GitHub, though they appear to be structured as libraries rather than standalone components. There are a number of popular work queue implementations, including Celery [13] and Amazon SQS [14]. Scatter-gather has been identified as an Enterprise Integration Pattern [12].

7 Conclusion

Much as object-oriented programming led to the emergence and codification of object-oriented “design patterns,” we see container architectures leading to design patterns for container-based distributed systems. In this paper we identified three types of patterns we have seen emerging: single-container patterns for system management, single-node patterns of closely-cooperating containers, and multi-node patterns for distributed algorithms. In all cases, containers provide many of the same benefits as objects in object-oriented systems, such as making it easy to divide implementation among multiple teams and to reuse components in new contexts. In addition, they provide some benefits unique to distributed systems, such as enabling components to be upgraded independently, to be written in a mixture of languages, and for the system as a whole to degrade gracefully. We believe that the set of container patterns will only grow, and that in the coming years they will revolutionize distributed systems programming much as object-oriented programming did in earlier decades, in this case by enabling a standardization and regularization of distributed system development.

8 Acknowledgements

Ideas don’t simply appear in our heads from a vacuum. The work in this paper has been influenced heavily by conversations with Brian Grant, Tim Hockin, Joe Beda and Craig McLuckie.

References

[1] Docker Engine http://www.docker.com

[2] rkt: a security-minded standards-based container engine https://coreos.com/rkt/

[3] Erich Gamma, John Vlissides, Ralph Johnson, Richard Helm, Design Patterns: Elements of Reusable Object-Oriented Software, Addison-Wesley, Massachusetts, 1994.

[4] Jeffrey Dean, Sanjay Ghemawat, MapReduce: Simplified Data Processing on Large Clusters, Sixth Symposium on Operating System Design and Implementation, San Francisco, CA 2004.

[5] Apache Hadoop, http://hadoop.apache.org

[6] Kubernetes, http://kubernetes.io

[7] Apache Aurora, https://aurora.apache.org.

[8] Marathon: A cluster-wide init and control system for services, https://mesosphere.github.io/marathon/

[9] Managing the Activity Lifecycle, http://developer.android.com/training/basics/activity-lifecycle/index.html

[10] Brendan Burns, The Distributed System ToolKit: Patterns for Composite Containers, http://blog.kubernetes.io/2015/06/the-distributed-system-toolkit-patterns.html

[11] Nomad by Hashicorp, https://www.nomadproject.io/

[12] Gregor Hohpe, Enterprise Integration Patterns, Addison-Wesley, Massachusetts, 2004.

[13] Celery: Distributed Task Queue, http://www.celeryproject.org/

[14] Amazon Simple Queue Service, https://aws.amazon.com/sqs/

[15] https://www.kernel.org/doc/Documentation/cgroup-v1/cgroups.txt

[16] Service Oriented Architecture, https://en.wikipedia.org/wiki/Service-oriented_architecture

[17] Amazon EC2 Container Service, https://aws.amazon.com/ecs/

[18] Docker Swarm https://docker.com/swarm

[19] J. Case, M. Fedor, M. Schoffstall, J. Davin, A Simple Network Management Protocol (SNMP), https://www.ietf.org/rfc/rfc1157.txt, 1990.

[20] R. G. Gallager, P. A. Humblet, P. M. Spira, A distributed algorithm for minimum-weight spanning trees, ACM Transactions on Programming Languages and Systems, January, 1983.

[21] M.J. Litzkow, M. Livny, M. W. Mutka, Condor: a hunter of idle workstations, IEEE Distributed Computing Systems, 1988.

[22] https://linuxcontainers.org/

Last edited: 2020.04.12 17:03:23