Q & A from Traefik Online Meetup: Container Orchestration with Traefik on Docker Swarm by Jakub Hajek, Cometari Dedicated Solutions

Check out the YouTube video: https://www.youtube.com/watch?v=ga3cv0RHxQg

Question: Will the demo code be published to GitHub?

Answer: Yes. The code is available in this GitHub repo: https://github.com/jakubhajek/traefik-consul-swarm

Question: Can you say something about the placement of Consul? Are you using just the server cluster, or agents too? Are the agents running in the Docker VMs, or directly in the containers?

Answer: The entire stack, including Consul, is deployed in containers. One of the services, which acts as the Consul leader, is run with the -bootstrap CLI flag: https://github.com/jakubhajek/traefik-consul-swarm/blob/master/stack-consul.yml#L7

This service starts first, and the remaining Consul replicas then join the leader: https://github.com/jakubhajek/traefik-consul-swarm/blob/master/stack-consul.yml#L26

This stack should create a Consul cluster consisting of 4 nodes. Here is example output:

    / # consul members
    Node          Address          Status  Type    Build  Protocol  DC   Segment
    3fc5fa01fde4  10.0.38.10:8301  alive   server  1.5.3  2         dc1
    483b7a6c600d  10.0.38.3:8301   alive   server  1.5.3  2         dc1
    8043b85f0081  10.0.38.4:8301   alive   server  1.5.3  2         dc1
    f8cc0f5388bb  10.0.38.5:8301   alive   server  1.5.3  2         dc1

Question: Are you using Route53? Would it be better to inject SSL certificates via Route53 for high availability, and avoid a single point of failure on Traefik with its ACME storage?

Answer: Route53 is only used to manage the domain and to monitor, via health checks, the availability of the nodes that make up the Docker Swarm cluster. The suggested idea of terminating SSL on AWS seems interesting. However, the main purpose of this demo was to show how to generate SSL certificates with Traefik. The demo presented an environment with multiple Traefik instances. Consul, which stores the SSL certificates, is also replicated, but you need to deal with data persistence, which is a separate topic.

Question: Can you talk about what could be improved in terms of achieving a highly available Swarm?

Answer: Consider having dedicated nodes with the MANAGER role. Make sure that you still have a quorum if one of the nodes in the cluster fails. Make sure that you know how to back up the Raft database and that you have tested the procedure for restoring it. You can also use placement preferences, as described in this document: https://docs.docker.com/engine/swarm/services/#placement-preferences
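The quorum point can be made concrete with a little arithmetic. Raft needs a majority of the managers to agree, so the number of managers decides how many failures the cluster survives. The sketch below only illustrates that arithmetic; the function names are made up for this example and are not part of any Docker tooling.

```python
def quorum(managers: int) -> int:
    """Smallest majority of `managers` nodes."""
    if managers < 1:
        raise ValueError("a swarm needs at least one manager")
    return managers // 2 + 1


def tolerated_failures(managers: int) -> int:
    """Managers that can be lost while a majority is still reachable."""
    return managers - quorum(managers)


for n in (1, 3, 4, 5, 7):
    print(f"{n} managers: quorum {quorum(n)}, "
          f"tolerates {tolerated_failures(n)} failure(s)")
```

Note that 4 managers tolerate only one failure, the same as 3, which is why an odd number of managers is the usual choice.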

Question: Please confirm the name of the command line service tester used.

Answer: I used Slapper to generate network traffic.
GitHub repo: https://github.com/ikruglov/slapper
Docker image: https://cloud.docker.com/u/jakubhajek/repository/docker/jakubhajek/slapper
The following command was used to run Slapper:

    docker run -it -v${PWD}:/app jakubhajek/slapper:1 slapper -targets /app/node-app.target -minY 30ms -maxY 200ms -timeout 30s -rate 50

Question: Have you ever used swarm-cronjob to handle cron-based services on Swarm, or can Consul handle this behavior?

Answer: I haven't tried this project yet. It seems that it could be very helpful if you need to run tasks on a specific schedule.

Question: I manage many domains via Cloudflare. I use their SSL feature. Will there be a conflict if I use SSL certificates from CF, and SSL certificates from Traefik?

Answer: The SSL certificate for your application domain has to be installed at the outermost tier in front of your application. In this case you can use the SSL certificates installed on Cloudflare and then set up a DNS entry pointing to Traefik. However, this configuration requires further testing.

Question: Have you run any benchmarks on Traefik 1.7?

Answer: Such a benchmark does not make a lot of sense. Most of a reverse proxy's performance depends on the underlying host machine (is it bare metal, virtualized, containerized, or a mix of the three?). The underlying kernel configuration, particularly at the TCP level, also has a big impact (are TCP sockets reused aggressively? What are the timeouts? How is the OOM configured? What is the underlying I/O system for the sockets?). So we considered removing such a benchmark, given the cost of maintaining it.

Also, since Traefik aims to be a "cloud native" tool, we consider that the target is not a single machine dedicated only to Traefik, onto which you try to put as many requests as possible. The era of containers and clouds lets you easily add more resources when you need them, so benchmarks make less sense here. If the goal is to confirm that Traefik can handle "at least" a certain threshold of requests, the answer is "yes, it can."

Question: When is Traefik insufficient? i.e. when would you opt for an API gateway, or do you believe API gateway to be an anti-pattern?

Answer: Traefik is built to route requests from outside your infrastructure to an application inside your infrastructure (aka an "edge reverse proxy" or "ingress"). Routing requests from one internal application to another internal application works, but it is limited. This is where we draw the line as of today.

Question: How do you do rate limiting with Traefik? Is it possible to do dynamic rate limiting based on some condition?

Answer: The documentation is quite good on this one: https://docs.traefik.io/configuration/commons/#rate-limiting . Rate limiting is defined per frontend and uses what we call an "extractor function" to tell one user from another. The extractor can be the client's IP, the request's host, or a header of the request. RateLimit allows bursting for some period while limiting the average rate. It can be adapted "dynamically" using a file provider: as soon as the "condition" is met, your own process changes the TOML file holding the dynamic configuration, and Traefik, which can be configured to watch its config files, picks up the change. The Traefik API can also be used, but I don't know a lot about it, so that is outside my area of expertise.
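To illustrate how an extractor combines with an average rate and a burst, here is a minimal token-bucket sketch. It is not Traefik's implementation: the class, its parameters, the request dictionaries, and the choice of a token bucket are all assumptions made for illustration.

```python
import time
from typing import Callable, Dict, Mapping, Tuple


class PerClientRateLimiter:
    """One token bucket per extracted key (hypothetical helper, not a Traefik API)."""

    def __init__(self, extractor: Callable[[Mapping[str, str]], str],
                 average: float, burst: float, period: float = 1.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.extractor = extractor
        self.rate = average / period      # tokens refilled per second
        self.burst = float(burst)         # bucket capacity
        self.clock = clock
        self.buckets: Dict[str, Tuple[float, float]] = {}  # key -> (tokens, last seen)

    def allow(self, request: Mapping[str, str]) -> bool:
        key = self.extractor(request)
        now = self.clock()
        tokens, last = self.buckets.get(key, (self.burst, now))
        # Refill for the elapsed time, never above the burst size.
        tokens = min(self.burst, tokens + (now - last) * self.rate)
        allowed = tokens >= 1.0
        self.buckets[key] = (tokens - 1.0 if allowed else tokens, now)
        return allowed


def by_client_ip(request: Mapping[str, str]) -> str:
    """Extractor: separate users by client IP."""
    return request["client_ip"]


# A fake clock keeps the demo deterministic.
now = [0.0]
limiter = PerClientRateLimiter(by_client_ip, average=10, burst=20,
                               clock=lambda: now[0])
request = {"client_ip": "203.0.113.7", "host": "example.com"}

first_burst = sum(limiter.allow(request) for _ in range(25))
now[0] = 1.0  # one second later, 10 tokens have been refilled
after_one_second = sum(limiter.allow(request) for _ in range(15))
print(first_burst, after_one_second)  # 20 10
```

Swapping `by_client_ip` for a function that returns the request's host or a header value changes what counts as "one user" without touching the limiter itself.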

Important: if you scale Traefik horizontally by adding nodes, rate limiting is not cluster-wide; it is only "per node". Example: with a limit of 10 req/s on the client IP and 3 Traefik replicas, a client might end up limited to anywhere between 10 and 30 req/s globally. This is one of the "distributed" features we decided to embed in Traefik Enterprise Edition, where we solved this "cluster-wide coordination" problem (there is a webinar about this on July 31: https://zoom.us/webinar/register/WN_YIuRLKZPQky2JGVuRUIzhw)
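The 10 to 30 req/s range follows from how a client's requests happen to be spread over the replicas. The sketch below works through that arithmetic under a simplifying assumption: each replica independently admits at most its own limit and sees a fixed share of the client's traffic. The function name and the share values are hypothetical.

```python
from typing import Sequence


def admitted_rate(offered_rate: float, shares: Sequence[float],
                  per_node_limit: float) -> float:
    """Total req/s admitted when every replica enforces the limit on its own.

    `shares` is the fraction of the client's traffic that reaches each replica.
    """
    return sum(min(offered_rate * share, per_node_limit) for share in shares)


limit = 10      # req/s per client IP, enforced per replica
offered = 100   # req/s the client tries to send

# All traffic lands on one replica: the client sees the configured limit.
print(admitted_rate(offered, [1, 0, 0], limit))          # 10
# Traffic split over two replicas.
print(admitted_rate(offered, [0.5, 0.5, 0], limit))      # 20.0
# Traffic spread evenly over three replicas: three times the limit.
print(admitted_rate(offered, [1 / 3, 1 / 3, 1 / 3], limit))  # 30
```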

Question: How do you incorporate third-party authorization and authentication via Traefik, i.e. an external OAuth and OpenID provider?

Answer: To stay within the philosophy of "do one thing but do it well", Traefik can delegate authentication and authorization to a third-party system that does this correctly. This is called "forward authentication" in the docs (https://docs.traefik.io/configuration/entrypoints/#forward-authentication). Please note that Traefik supports htpasswd- and htdigest-based authentication for simple use cases. We want to mention https://github.com/thomseddon/traefik-forward-auth, a contributor's project that is totally awesome for this kind of use case: it can be used to integrate with any OAuth-compliant system, such as Google Auth or Dex (https://github.com/dexidp/dex).
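As a rough illustration of the forward-authentication idea, the sketch below is a tiny auth service that answers with 200 when a request carries a known bearer token and 401 otherwise. It rests on assumptions: that Traefik sends the check as a GET request with the original request headers, that a 2xx answer lets the request through, and that a static token set is good enough for a demo. `VALID_TOKENS`, `auth_status`, and `serve` are hypothetical names; for real OAuth use one of the projects mentioned above.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Mapping

VALID_TOKENS = {"demo-token"}  # hypothetical; a real service would ask an identity provider


def auth_status(headers: Mapping[str, str]) -> int:
    """Return 200 for a known bearer token, 401 for anything else."""
    scheme, _, token = headers.get("Authorization", "").partition(" ")
    if scheme.lower() == "bearer" and token in VALID_TOKENS:
        return 200
    return 401


class AuthHandler(BaseHTTPRequestHandler):
    def do_GET(self) -> None:
        # The status code alone carries the decision back to the proxy.
        self.send_response(auth_status(self.headers))
        self.end_headers()


def serve(host: str = "127.0.0.1", port: int = 8080) -> None:
    """Run the auth service until interrupted; not called automatically."""
    HTTPServer((host, port), AuthHandler).serve_forever()
```

Calling `serve()` starts the service; the forward-authentication address in Traefik would then point at it.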
