CoreOS and Nuxeo: How We Built nuxeo.io

How to Build a PaaS with CoreOS and Nuxeo

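Translator's notes:

Nuxeo: An open source content management platform that lets architects and developers build, deploy, and run content-centric applications. Nuxeo is an open source ECM vendor; its platform is standards-based and targets enterprise content management use cases such as document management, collaboration, records management, document-centric business process management, and web content management.

nuxeo.io: A PaaS for enterprise applications built on the Nuxeo platform.

CoreOS: A Linux distribution designed for large-scale cluster deployments. It aims to remove much of the tedious installation and update work found in most other distributions.

Docker: A technology that adds a high-level API on top of Linux Containers (LXC), providing a lightweight virtualization solution that runs Unix processes in isolation. It offers a way to automate software deployment in a secure, repeatable environment. CoreOS uses it to run containers.

etcd: A highly available key/value store, mainly used for shared configuration and service discovery. It fills that role in CoreOS.

systemd: An init system for Linux. Its goals are to provide a better framework for expressing dependencies between system services, to start services in parallel at boot, and to reduce shell overhead, ultimately replacing the System V and BSD style init programs. In CoreOS it is the default system and service manager.

fleet: Combines systemd and etcd into a distributed init system; it extends systemd to a cluster.

Gogeta: A dynamic reverse proxy developed by Nuxeo and backed by etcd. It reloads routing configuration at run time, with no service restart needed.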


Introduction

If you follow the Nuxeo blogs, you may have seen that we have been working with Docker for a few months now. We are not doing so because it is trendy, but because we will be using it in the infrastructure of nuxeo.io. In this post I will explain how we see our global infrastructure.

A Platform Designed for Failure

One of the things we learned while reading about building a cloud platform is that the system must be resilient to failure. A host may go down, a process may halt, and so on, and the platform must live with that. If a node of the cluster goes down, the service should not be affected, and the system must react automatically, and very quickly, to rebalance the missing services onto other nodes.

This is where Docker comes in. Since the kernel is shared by all the containers, a service starts very quickly; there is no need to provision a virtual machine, for instance. Starting a new Nuxeo container takes only about 30 seconds.

Cluster Management

When you need to start a lot of containers, you have to find a way to manage them. This means knowing whether they are started, where they are located, and so on. To do that, we found etcd, a registry distributed across a cluster. When we set a key in that registry from one host, it becomes accessible from all other hosts in the cluster. Moreover, we can watch for key modifications and set a TTL on a key. This last feature lets us set up a heartbeat mechanism very quickly.
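To make the TTL-based heartbeat idea concrete, here is a minimal sketch. It uses an in-memory stand-in rather than the real etcd API, and the class name `TTLRegistry`, the `heartbeat` helper, and the `/services/...` key path are all hypothetical names chosen for illustration. The point it shows: a service keeps refreshing its own key, and if it stops doing so the key expires and the service drops out of the registry.

```python
import time


class TTLRegistry:
    """In-memory stand-in for a key/value store with per-key TTLs (not the etcd API)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}  # key -> (value, expiry time)

    def set(self, key, value, ttl):
        """Store value under key; the entry expires ttl seconds from now."""
        self._entries[key] = (value, self._clock() + ttl)

    def get(self, key):
        """Return the value, or None if the key is missing or has expired."""
        entry = self._entries.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[key]
            return None
        return value


def heartbeat(registry, service_key, address, ttl=10):
    """Called periodically by a live service to refresh its own key."""
    registry.set(service_key, address, ttl)
```

A service would call `heartbeat` on an interval shorter than the TTL; anything reading the registry then treats a missing key as "service gone" without needing a separate health checker.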

As etcd is part of CoreOS, we looked at that distribution, a tiny one that embeds Docker, etcd and systemd. Systemd allows us to start Docker containers, but we don't use it directly. We use fleet, which can be seen as a "systemd over a cluster" and is also bundled with CoreOS.

With all of these pieces, we have a running, synchronized cluster on which we can run services. Each time we start a service, we register it in etcd with its attributes (IP, port). For instance, we have a Docker registry service that holds our private container images.

Data Free Runtime Containers

Remember that we want to be able to deal with failures. If a Nuxeo server goes down, freezes, or otherwise misbehaves, we destroy it and restart it with fleet. That means we lose any data that may be held in the container. We could restart the instance, analyze things and recover the data, but that takes too long and requires an administrator's intervention.

In our cluster configuration, every Docker container in the stopped state is destroyed. We treat a container as an execution part of the platform: it can run anywhere in the cluster. This means that the database and the binary manager must be externalized, and the logs as well.
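One common way to keep a container free of data is to hand it the locations of its external services at startup. The sketch below assumes that approach; the environment variable names (`NUXEO_DB_URL`, `NUXEO_BINARY_STORE`, `NUXEO_LOG_TARGET`) and the function name are hypothetical and are not taken from Nuxeo or from the article. It fails fast when an endpoint is missing, so a container never silently falls back to local storage.

```python
import os

# Hypothetical variable names: one per piece of state the article says must live
# outside the container (database, binary manager, logs).
REQUIRED = ("NUXEO_DB_URL", "NUXEO_BINARY_STORE", "NUXEO_LOG_TARGET")


def load_external_endpoints(env=None):
    """Return the external endpoints, raising if any of them is not configured."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError("missing external endpoints: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED}
```

Because nothing is written inside the container, destroying it and starting a fresh one elsewhere in the cluster only requires passing the same three values again.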

Dynamic Virtual Hosting and Load Balancing

The other part of the cluster is the public-facing part. To serve it, we need to route and balance the requests for a given host to the proper container.

A dynamic proxy doesn't route requests based on a configuration file, but on a route database that can be altered without having to restart the process.

In our cluster we have such a database: it is etcd. Each domain we want to serve has its own key, and each key references an environment id (i.e. a running Nuxeo instance).
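The difference between a file-based and a dynamic proxy can be sketched as a route table that is updated by change events instead of being reloaded from disk. This is an illustrative sketch only: the `RouteTable` class and its event shape (`"set"` / `"delete"`) are assumptions made for this example, not Gogeta's code or the etcd watch API.

```python
class RouteTable:
    """Domain-to-environment routes kept in memory and updated by change events."""

    def __init__(self):
        self._routes = {}  # domain -> environment id

    def on_change(self, action, domain, env_id=None):
        """Apply one change notification coming from the route database."""
        if action == "set":
            self._routes[domain] = env_id
        elif action == "delete":
            self._routes.pop(domain, None)
        else:
            raise ValueError(f"unknown action: {action}")

    def resolve(self, domain):
        """Return the environment id serving this domain, or None if unrouted."""
        return self._routes.get(domain)
```

A watcher on the route database would call `on_change` for every modified key, so a new domain becomes routable as soon as its key is written, with no proxy restart.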

There are already some dynamic proxy implementations, but none fit our needs:

  • Hipache: stores its data in Redis.
  • Strowger: uses some Flynn primitives that we don't want to depend on.
  • Active-proxy: was just a POC and hasn't evolved in a long time (meaning several months, which is very long in the Docker ecosystem ;-) ).
  • Boxcars: is based on a configuration file and covers some use cases we don't need.

As none of them met our needs, we decided to develop our own reverse proxy called Gogeta, reusing some basic ideas from several tools we looked at:

  • Written in Go: it compiles to a native executable, and the language has the basic primitives needed to run a proxy server.
  • Gets its configuration from etcd and watches it for changes.
  • Kept small and simple (KISS).

Big Infrastructure Picture

The following picture shows how a request reaches a container:

  • The user enters http://mydomain.nuxeo.io/.
  • The request lands on the front load balancer, which randomly sends it to the Gogeta endpoint on one of the CoreOS hosts.
  • Gogeta reads the /domains keys in etcd and learns that it must proxy to the NXIO-0001 container.
  • Gogeta gets the properties of NXIO-0001 from etcd, checks that the status is okay and proxies the request.
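The steps above can be sketched in a few lines. A plain dict stands in for etcd here, and only the `/domains` prefix and the `NXIO-0001` id come from the article: the `/envs/...` key path, the `ip`, `port` and `status` field names, and the `"started"` status value are assumptions made for this example.

```python
import random


def pick_host(hosts, rng=random):
    """Front load balancer: choose one CoreOS host at random."""
    return rng.choice(hosts)


def resolve_backend(store, domain):
    """Gogeta-style lookup: domain -> environment id -> backend URL.

    Returns None when the domain is unknown or the environment is not healthy.
    """
    env_id = store.get(f"/domains/{domain}")
    if env_id is None:
        return None
    env = store.get(f"/envs/{env_id}")  # hypothetical key layout
    if env is None or env.get("status") != "started":
        return None
    return f"http://{env['ip']}:{env['port']}"
```

For example, with `/domains/mydomain.nuxeo.io` pointing at `NXIO-0001` and that environment registered at `10.0.0.5` on port `8080`, `resolve_backend` yields the URL the proxy should forward to. Once the environment's status changes, the same call returns `None` and the proxy can answer with an error instead of forwarding to a dead container.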

At the bottom of the picture, fleet helps us start new containers in the cluster.

nuxeo_cluster_img

Conclusion

Docker is a good piece of software for building a fault-tolerant infrastructure. CoreOS integrates it nicely and provides the missing tools to manage the cluster.

nuxeo.io will soon be open-sourced (some parts of it already are), so you can play with it and give us some feedback. When we think it is ready for testing, we will announce it. Stay tuned!

About Damien Metzler

After several years as a Nuxeo customer (since 5.0!) and contributor, Damien joined Nuxeo as a software developer in 2013.

Original article: http://www.nuxeo.com/blog/development/2014/04/coreos-nuxeo-build-nuxeoio/
