基础模式 托管生命周期

艾因技术K8s翻译大约 14 分钟

基础模式 托管生命周期

2.4 Managed Lifecycle  托管生命周期

Containerized applications managed by cloud-native platforms have no control over their lifecycle, and to be good cloud-native citizens, they have to listen to the events emitted by the managing platform and adapt their lifecycles accordingly. The Managed Lifecycle pattern describes how applications can and should react to these lifecycle events.

由云原生平台(K8s)管理的容器化应用程序无法控制其生命周期,要成为优秀的云原生公民,他们必须倾听管理平台发出的事件,并相应地调整其生命周期。托管生命周期模式描述了应用程序如何实现倾听并响应这些生命周期事件。

2.4.1 Problem  问题

In Chapter 4, Health Probe we explained why containers have to provide APIs for the different health checks. Health-check APIs are read-only endpoints the platform is continually probing to get application insight. It is a mechanism for the platform to extract information from the application.

在上一节,健康探针中,我们解释了为什么容器必须为不同的健康检查提供API。运行状况检查API是只读端点,平台正在不断探测以获取应用程序洞察力。它是平台从应用程序中提取信息的机制。

In addition to monitoring the state of a container, the platform sometimes may issue commands and expect the application to react on these. Driven by policies and external factors, a cloud-native platform may decide to start or stop the applications it is managing at any moment. It is up to the containerized application to determine which events are important to react to and how to react. But in effect, this is an API that the platform is using to communicate and send commands to the application. Also, applications are free to either benefit from lifecycle management or ignore it if they don’t need this service.

除了监控容器的状态外,平台有时可能会发出命令,并期望应用程序对这些命令做出反应。在政策和外部因素的驱动下,云原生平台可能随时决定启动或停止其正在管理的应用程序。由容器化应用程序决定哪些事件是重要的,以及如何作出反应。但实际上,这是平台用来与应用程序通信和向应用程序发送命令的API。此外,应用程序可以免费从生命周期管理中获益,也可以在不需要此服务的情况下忽略生命周期管理。

2.4.2 Solution  解决方案

We saw that checking only the process status is not a good enough indication of the health of an application. That is why there are different APIs for monitoring the health of a container.Similarly, using only the process model to run and stop a process is not good enough. Real-world applications require more fine-grained interactions and lifecycle management capabilities. Some applications need help to warm up, and some applications need a gentle and clean shutdown procedure. For this and other use cases, some events, as shown in Figure 5-1, are emitted by the platform that the container can listen to and react to if desired.

我们发现,仅检查进程状态并不能很好地指示应用程序的运行状况。这就是为什么有不同的API来监控容器的运行状况。类似地,仅使用进程模型来运行和停止进程是不够的。现实世界的应用程序需要更细粒度的交互和生命周期管理功能。有些应用程序需要帮助预热,有些应用程序需要温和清洁的关机程序。对于此用例和其他用例,一些事件(如图5-1所示)由平台发出,容器可以根据需要监听和响应这些事件。

image.png
image.png

Figure 5-1. Managed container lifecycle

The deployment unit of an application is a Pod. As you already know, a Pod is composed of one or more containers. At the Pod level, there are other constructs such as init containers, which we cover in Chapter 14, Init  Container (and defer-containers, which is still at the proposal stage as of this writing) that can help manage the container lifecycle. The events and hooks we describe in this chapter are all applied at an individual container level rather than Pod level.

应用程序的部署单元是Pod。正如你已经知道的,一个Pod由一个或多个容器组成。在Pod级别,还有其他的构造,比如我们在后面章节中介绍的初始化容器(以及延迟容器,在撰写本文时仍处于建议阶段)可以帮助管理容器生命周期。我们在本章中描述的事件和钩子都应用于单个容器级别,而不是Pod级别。

2.4.2.1 SIGTERM Signal   SIGTERM 信号

Whenever Kubernetes decides to shut down a container, whether that is because the Pod it belongs to is shutting down or simply a failed liveness probe causes the container to be restarted, the container receives a SIGTERM signal. SIGTERM is a gentle poke for the container to shut down cleanly before Kubernetes sends a more abrupt SIGKILL signal. Once a SIGTERM signal has been received, the application should shut down as quickly as possible. For some applications, this might be a quick termination, and some other applications may have to complete their in-flight requests, release open connections, and clean up temp files, which can take a slightly longer time. In all cases, reacting to SIGTERM is the right moment to shut down a container in a clean way.

每当Kubernetes决定关闭一个容器时,无论是因为它所属的Pod正在关闭,还是仅仅是由于liveness探测器故障导致容器重新启动,容器都会收到一个SIGTERM信号。SIGTERM是一个温和的信号,在Kubernetes发出更突然的SIGKILL信号之前,容器可以干净地关闭。一旦收到SIGTERM信号,应用程序应尽快关闭。对于某些应用程序,这可能是一个快速终止,而其他一些应用程序可能需要完成其进行中的请求、释放打开的连接并清理临时文件,这可能需要稍长的时间。在所有情况下,对SIGTERM作出反应是以干净(优雅)的方式关闭容器的正确时机。

2.4.2.2 SIGKILL Signal   SIGKILL  信号

If a container process has not shut down after a SIGTERM signal, it is shut down forcefully by the following SIGKILL signal. Kubernetes does not send the SIGKILL signal immediately but waits for a grace period of 30 seconds by default after it has issued a SIGTERM signal. This grace period can be defined per Pod using the .spec.terminationGracePeriodSeconds field, but cannot be guaranteed as it can be overridden while issuing commands to Kubernetes. The aim here should be to design and implement containerized applications to be ephemeral with quick startup and shutdown processes.

如果容器进程在收到SIGTERM信号后未关闭,就行了则会收到SIGKILL信号强制关闭。Kubernetes不会立即发送SIGKILL信号,但在发出SIGTERM信号后,默认情况下会等待30秒的宽限期。可以使用.spec.terminationGracePeriodSeconds字段为每个Pod定义此宽限期,但不能保证在Kubernetes发出命令时仍然可以覆盖此宽限期。这里的目标应该是设计和实现具有快速启动和关闭过程的容器化应用程序。

2.4.2.3 Poststart Hook  Poststart 钩子

Using only process signals for managing lifecycles is somewhat limited. That is why there are additional lifecycle hooks such as postStart and preStop provided by Kubernetes. A Pod manifest containing a postStart hook looks like the one in Example 5-1.

仅使用进程信号来管理生命周期在某种程度上能力是有限的。这就是为什么Kubernetes提供了额外的生命周期钩子,比如postStartpreStop。包含postStart钩子的Pod清单与示例5-1中的清单类似。

Example 5-1. A container with poststart hook

apiVersion: v1
kind: Pod
metadata:
  name: post-start-hook
spec:
  containers:
  - image: k8spatterns/random-generator:1.0
    name: random-generator
    lifecycle:
      postStart:
        exec:
          command:
          - sh
          - -c
          - sleep 30 && echo "Wake up!" > /tmp/postStart_done
#The postStart command waits here 30 seconds. sleep is just a simulation for any lengthy startup code that might run here. Also, it uses a trigger file here to sync with the main application, which starts in parallel.

The postStart command is executed after a container is created, asynchronously with the primary container’s process. Even if many of the application initialization and warm-up logic can be implemented as part of the container startup steps, postStart still covers some use cases. The postStart action is a blocking call, and the container status remains Waiting until the postStart handler completes, which in turn keeps the Pod status in the Pending state. This nature of postStart can be used to delay the startup state of the container while giving time to the main container process to initialize.

postStart命令在创建容器后与主容器的进程异步执行。即使许多应用程序初始化和预热逻辑可以作为容器启动步骤的一部分实现,postStart仍然涵盖一些用例。postStart操作是一个阻塞调用,容器状态保持等待状态,直到postStart处理程序完成,从而使Pod状态保持在挂起状态。postStart的这种性质可以用来延迟容器的启动状态,同时给主容器进程初始化时间。

Another use of postStart is to prevent a container from starting when the Pod does not fulfill certain preconditions. For example, when the postStart hook indicates an error by returning a nonzero exit code, the main container process gets killed by Kubernetes.

postStart的另一个用途是防止容器在Pod未满足某些先决条件时启动。例如,当postStart钩子通过返回非零的退出代码来指示错误时,主容器进程将被Kubernetes终止。

postStart and preStop hook invocation mechanisms are similar to the Health Probes described in Chapter 4 and support these handler types:

postStart和preStop钩子调用机制类似于上一章中描述的运行状况探测,并支持以下处理程序类型:

exec Runs a command directly in the container   在容器中直接运行一条命令

httpGet Executes an HTTP GET request against a port opened by one Pod container  对一个Pod容器打开的端口执行HTTP GET请求

You have to be very careful what critical logic you execute in the postStart hook as there are no guarantees for its execution. Since the hook is running in parallel with the container process, it is possible that the hook may be executed before the container has started. Also, the hook is intended to have at-least once semantics, so the implementation has to take care of duplicate executions. Another aspect to keep in mind is that the platform does not perform any retry attempts on failed HTTP requests that didn’t reach the handler.

你必须非常小心在postStart钩子中执行的是什么关键逻辑,因为它的执行没有任何保证。由于钩子与容器进程并行运行,所以钩子可能在容器启动之前执行。此外,钩子至少要有一次语义,因此实现必须处理重复执行。要记住的另一个方面是,平台不会对未到达处理程序的失败HTTP请求执行任何重试尝试。

2.4.2.4 Prestop Hook  Prestop 钩子

The preStop hook is a blocking call sent to a container before it is terminated. It has the same semantics as the SIGTERM signal and should be used to initiate a graceful shutdown of the container when reacting to SIGTERM is not possible. The preStop action in Example 5-2 must complete before the call to delete the container is sent to the container runtime, which triggers the SIGTERM notification.

preStop钩子是在容器终止之前发送给容器的阻塞调用。它与SIGTERM信号具有相同的语义,当无法对SIGTERM作出反应时,应使用它来启动容器的正常关闭。在将删除容器的调用发送到容器运行时(触发SIGTERM通知)之前,需要完成示例5-2中的预停止操作。

Example 5-2. A container with a preStop hook

apiVersion: v1
kind: Pod
metadata:
  name: pre-stop-hook
spec:
  containers:
  - image: k8spatterns/random-generator:1.0
    name: random-generator
    lifecycle:
      preStop:
        httpGet:
          port: 8080
          path: /shutdown
#Call out to a /shutdown endpoint running within the application

Even though preStop is blocking, holding on it or returning a nonsuccessful result does not prevent the container from being deleted and the process killed. preStop is only a convenient alternative to a SIGTERM signal for graceful application shutdown and nothing more. It also offers the same handler types and guarantees as the postStart hook we covered previously.

即使preStop正在阻塞,保持它或返回一个不成功的结果也不能阻止容器被删除和进程被终止。preStop只是SIGTERM信号的一种方便的替代方案,用于优雅的应用程序关闭,仅此而已。它还提供了与我们前面介绍的postStart钩子相同的处理程序类型和保证。

2.4.2.5 Other Lifecycle Controls  其他生命周期控制方法

In this chapter, so far we have focused on the hooks that allow executing commands when a container lifecycle event occurs. But another mechanism that is not at the container level but at a Pod level allows executing initialization instructions.

在本章中,到目前为止,我们主要关注在容器生命周期事件发生时允许执行命令的钩子。但另一种不是容器级而是Pod级的机制允许执行初始化指令。

We describe in Chapter 14, Init Container, in depth, but here we describe it briefly to compare it with lifecycle hooks. Unlike regular application containers, init containers run sequentially, run until completion, and run before any of the application containers in a Pod start up. These guarantees allow using init containers for Pod-level initialization tasks. Both lifecycle hooks and init containers operate at a different granularity (at container level and Pod-level, respectively) and could be used interchangeably in some instances, or complement each other in other cases. Table 5-1 summarizes the main differences between the two.

我们在后面的Init Container章节中进行了Pod级的机制深入的描述,但这里我们简要地描述了它,以将其与生命周期挂钩进行比较。与常规应用程序容器不同,init容器按顺序运行,运行到完成,并在Pod中的任何应用程序容器启动之前运行。这些保证允许对Pod级初始化任务使用init容器。lifecycle钩子和init容器都以不同的粒度(分别在容器级别和Pod级别)运行,在某些情况下可以互换使用,或者在其他情况下可以相互补充。表5-1总结了两者之间的主要差异。

Table 5-1. Lifecycle Hooks and Init Containers

AspectLifecycle hooksInit Containers
Activates onContainer lifecycle phasesPod lifecycle phases
Startup phase actionA postStart
commandA list of initContainers
to execute
Shutdown phase actionA preStop
commandNo equivalent feature exists yet
Timing guaranteesA postStart
command is executed at the same time as the container’s ENTRYPOINTAll init containers must be completed successfully before any application container can start
Use casesPerform noncritical startup/shutdown cleanups specific to a containerPerform workflow-like sequential operations using containers; reuse containers for task executions

There are no strict rules about which mechanism to use except when you require a specific timing guarantee. We could skip lifecycle hooks and init containers entirely and use a bash script to perform specific actions as part of a container’s startup or shutdown commands. That is possible, but it would tightly couple the container with the script and turn it into a maintenance nightmare.

对于使用哪种机制没有严格的规定,除非需要特定的时间保证。我们可以完全跳过生命周期挂钩和init容器,并使用bash脚本作为容器启动或关闭命令的一部分来执行特定操作。这是可能的,但它会将容器与脚本紧密耦合,并将其变成维护噩梦。

We could also use Kubernetes lifecycle hooks to perform some actions as described in this chapter. Alternatively, we could go even further and run containers that perform individual actions using init containers. In this sequence, the options require more effort increasingly, but at the same time offer stronger guarantees and enable reuse.

我们还可以使用Kubernetes生命周期挂钩来执行本章中描述的一些操作。或者,我们可以更进一步,使用init容器运行执行单个操作的容器。在这个序列中,选项需要越来越多的努力,但同时提供了更有力的保证并支持重用。

Understanding the stages and available hooks of containers and Pod lifecycles is crucial for creating applications that benefit from being managed by Kubernetes.

了解容器和Pod生命周期的阶段和可用钩子,对于创建受益于Kubernetes管理的应用程序至关重要。

2.4.3 Discussion  讨论

One of the main benefits the cloud-native platform provides is the ability to run and scale applications reliably and predictably on top of potentially unreliable cloud infrastructure. These platforms provide a set of constraints and contracts for an application running on them. It is in the interest of the application to honor these contracts to benefit from all of the capabilities offered by the cloud-native platform.

云原生平台的主要好处之一是提供了能够在不可靠的云基础设施之上构建可靠、可预测地运行和扩展应用程序的能力。云原生平台为这些运行在平台上的应用程序提供了一组约束和契约。为了获取云原生平台提供的这些优势和能力,应用程序需要遵守这些契约和约束。

Handling and reacting to these events ensures your application can gracefully start up and shut down with minimal impact on the consuming services. At the moment, in its basic form, that means the containers should behave as any well-designed POSIX process. In the future, there might be even more events giving hints to the application when it is about to be scaled up, or asked to release resources to prevent being shut down. It is essential to get into the mindset where the application lifecycle is no longer in the control of a person but fully automated by the platform.

处理和响应这些事件(通知信号和钩子)可确保应用程序能够正常启动和关闭,而对使用服务的影响最小。目前,在其基本形式中,这意味着容器应该像任何设计良好的POSIX进程一样工作。将来,可能会有更多的事件在应用程序即将扩展时向应用程序提供提示,或者请求释放资源以防止被关闭。必须进入这样的思维模式:应用程序生命周期不再由个人控制,而是由平台完全自动化。

Besides managing the application lifecycle, the other big duty of orchestration platforms like Kubernetes is to distribute containers over a fleet of nodes. The next pattern, Automated Placement, explains the options to influence the scheduling decisions from the outside.

除了管理应用程序生命周期外,Kubernetes等编排平台的另一项重要职责是在一组节点上分发容器。下一节中的模式是自动布局,它阐述了从外部影响调度决策的选项。