@phedoreanu
Last active August 31, 2017 09:47

Login

cf login -a https://api.run.pivotal.io -u adrian.fedoreanu@gmail.com

Push a single app (with or without a manifest):

cf push APP_NAME [-b BUILDPACK_NAME] [-c COMMAND] [-d DOMAIN]
    [-f MANIFEST_PATH] [--docker-image DOCKER_IMAGE]
    [-i NUM_INSTANCES] [-k DISK] [-m MEMORY] [--hostname HOST] [-p PATH]
    [-s STACK] [-t TIMEOUT] [-u (process | port | http)] [--route-path ROUTE_PATH]
    [--no-hostname] [--no-manifest] [--no-route] [--no-start] [--random-route]

Push multiple apps with a manifest:

cf push [-f MANIFEST_PATH]
  1. The CLI gathers your application’s files and uploads them to Cloud Foundry.
  2. Within Cloud Foundry:
    • The Cloud Controller finds the needed Buildpack.
    • The Buildpack builds an application image (Droplet).
    • Diego takes that Droplet and runs it in a Container on a Diego Cell.
  3. The CLI reports back success or any issues.

Example manifest.yml:

---
applications:
- name: web-app             # mandatory, unless set from the command line
  memory: 32M               # defaults to 1GB
  disk_quota: 256M          # defaults to 1GB
  random-route: true        # --random-route command line flag
  buildpack: ruby_buildpack # optional
  services:
  - existing-service

Services

The service broker takes the request and creates what is called a service instance. Once that instance is online, Cloud Foundry binds the service instance to the application. Binding hands the connection details (such as host and credentials) to the application, typically through the VCAP_SERVICES environment variable, so the application knows how to reach the service, for example a database.

  • Datastores
    • Relational Databases, like PostgreSQL and MySQL
    • Key-value and document stores, like Redis, Memcached, and MongoDB
  • Email (SMTP)
  • Monitoring, like NewRelic and Cedexis Radar
  • RabbitMQ
  • Elasticsearch

The basic process is:

  • Use cf marketplace and cf marketplace -s SERVICE to find the service and plan you need.
  • Use cf create-service to create an instance of the service.
  • Use cf bind-service to tell Cloud Foundry which app to connect the service to.
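
Once a service is bound, the application usually reads its connection details from the VCAP_SERVICES environment variable. The following is a minimal sketch of that lookup, not a definitive client: the service label `example-db` and the credential fields are made-up sample data (real content depends on the broker), and `find_credentials` is a hypothetical helper name. The instance name `existing-service` is the one from the manifest example.

```python
import json
import os

# Hypothetical sample of VCAP_SERVICES; the real structure's credential
# fields are defined by each service broker.
SAMPLE_VCAP_SERVICES = json.dumps({
    "example-db": [
        {
            "name": "existing-service",
            "credentials": {"uri": "postgres://user:secret@db.example.com:5432/app"},
        }
    ]
})


def find_credentials(instance_name, environ=os.environ):
    """Return the credentials dict of a bound instance, or None if not bound."""
    services = json.loads(environ.get("VCAP_SERVICES", "{}"))
    # VCAP_SERVICES maps each service label to a list of bound instances.
    for instances in services.values():
        for instance in instances:
            if instance.get("name") == instance_name:
                return instance.get("credentials", {})
    return None


if __name__ == "__main__":
    env = {"VCAP_SERVICES": SAMPLE_VCAP_SERVICES}
    print(find_credentials("existing-service", env))
```

Passing the environment as a parameter keeps the helper testable outside Cloud Foundry, where VCAP_SERVICES is not set.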

Web apps vs Workers

  • Web apps wait for a request, process it, then go back to waiting.
  • Worker applications just run constantly and you can program them to watch a datastore, the time, or just churn through data constantly.

The standard pattern for Web applications is to process the web request as fast as possible, and any slow requests get sent to a queue for a worker to do.

Worker apps are also useful for your 12 factor “one-off processes”. If you need to make large changes to the database, like update leaderboards, you can push a worker that will update the database for you.
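
The web/worker split can be sketched in a single process. This is only an illustration under simplifying assumptions: in a real deployment the web app and the worker would be separate Cloud Foundry apps sharing a queue service (RabbitMQ, for example), while here an in-memory `queue.Queue` and a thread stand in for both. All function names are hypothetical.

```python
import queue
import threading

jobs = queue.Queue()
results = []


def handle_request(user_id):
    """Web side: enqueue the slow part and answer immediately."""
    jobs.put(user_id)
    return "202 Accepted"


def worker():
    """Worker side: process queued jobs until a None sentinel arrives."""
    while True:
        user_id = jobs.get()
        if user_id is None:
            jobs.task_done()
            break
        # Stand-in for slow work such as a leaderboard update.
        results.append(f"leaderboard updated for {user_id}")
        jobs.task_done()


def run_demo():
    """Start one worker, send two requests, then shut the worker down."""
    results.clear()
    thread = threading.Thread(target=worker)
    thread.start()
    responses = [handle_request(uid) for uid in ("alice", "bob")]
    jobs.put(None)  # sentinel: tell the worker to stop
    thread.join()
    return responses, list(results)


if __name__ == "__main__":
    print(run_demo())
```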

Delete apps

$ cf delete APP_NAME

$ cf delete-route DOMAIN --hostname web-app-random-name

$ cf delete web-app -r

$ cf delete-service SERVICE_INSTANCE

Buildpacks

When Cloud Foundry detects that a Java application is being deployed, it figures out what it needs, like the JDK, and downloads the dependencies. Cloud Foundry does this with what is called a buildpack. The buildpack helps applications gather everything they need to build and run on Cloud Foundry.

Runtime = JVM

The release stage is when the droplet is copied to your application container. The droplet is unpacked and the command to start your application is run.

$ cf update-buildpack some-buildpack-name -i <new position number>

Special permissions are needed to add, move, and delete a buildpack. Using a buildpack and pushing with it are allowed by default.

The 12 Factor App (Adam Wiggins - Heroku co-founder)

  1. Codebase - not implemented by CF
  2. Dependencies - implemented in CF via buildpacks
  3. Configuration Data - implemented in CF via ENV vars
  4. Backing Services - anything the application consumes over the network for normal operation
  5. Build, Release, Run - implemented in CF via buildpacks and the Diego architecture
  6. Processes - stateless; a cf push will create your processes and attach your services automatically if you bound them already
  7. Port binding - tcp/http
  8. Concurrency - cf scale
  9. Disposability - cf push & zero downtime deployment
  10. Dev/Prod Parity - implemented in CF by the release management tool BOSH
  11. Logs - implemented in CF by event streams
  12. Admin Processes - implemented in CF either through BOSH errands or through one-off CF web applications

CF components

  • Load-balancer (HAProxy) -> Router
  • UAA -> Cloud Controller Database (CCDB)
  • DesiredLRP or desired long running process
  • ActualLRP or actual long running process
  • blobstore = a row stores a pointer to the BLOB on the filesystem
  • Diego Cell = VM

The Cloud Controller manages blobstores for the following:

  • Resources: Files that are uploaded to the Cloud Controller with a unique SHA, such that they can be reused without re-uploading the file
  • App Packages: Unstaged files that represent an application
  • Droplets: Result of taking an app package, staging it by processing a buildpack, and preparing it to run
  • Buildpacks: The buildpacks available to stage apps with
  • Buildpack Cache: Cached artifacts resulting from the staging process.

Variable precedence, from lowest to highest: Cloud Foundry default, provider default, currently used value, manifest, command line option.

Messaging - bulletin board system (BBS)

The BBS server handles messages coming from inside and outside the Diego system. This helps keep track of what work is being orchestrated across Diego at any given moment.

Log streaming = aggregation

CF does this by capturing logs and metrics from everything using a tool named the loggregator.

The logs and metrics start at their source, then move from one collector to another until they finally reach the top of the loggregator collection, called the firehose. From there, they can be consumed and filtered by components called nozzles.

Importance of Metrics and Logging

Using a system like the loggregator, storing the data it receives, and then using other analysis systems, like Splunk or Hadoop/Hive, allows for great power and flexibility for introspecting an app’s behavior over time, including:

  • Finding specific events in the past
  • Large-scale graphing of trends (such as requests per minute)
  • Active alerting according to user-defined heuristics (such as an alert when the quantity of errors per minute exceeds a certain threshold).

Resilience and Availability

Default values, in increasing order of precedence:

  • Cloud Foundry default
  • Provider default
  • Currently used
  • Manifest
  • Command line option

CF defaults for memory and disk usage are both 1GB. When you push an app, the default is one instance.
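
The layering of defaults can be pictured as a lookup through stacked dictionaries, where a later layer overrides an earlier one. This is a sketch of the idea only, not how the Cloud Controller is implemented. The 1G/one-instance defaults and the manifest values come from these notes; the "currently used" and command line values are invented for illustration, and `effective_settings` is a hypothetical name.

```python
from collections import ChainMap

# Layers listed from lowest to highest precedence.
cf_default = {"memory": "1G", "disk": "1G", "instances": 1}
provider_default = {}                          # assumed: provider changes nothing
currently_used = {"instances": 2}              # assumed: app was scaled earlier
manifest = {"memory": "32M", "disk": "256M"}   # values from the manifest example
command_line = {"memory": "64M"}               # assumed: cf push -m 64M


def effective_settings(*layers_lowest_first):
    """Resolve each setting from the highest-precedence layer that defines it."""
    # ChainMap searches its maps left to right, so reverse the order.
    return dict(ChainMap(*reversed(layers_lowest_first)))


if __name__ == "__main__":
    print(effective_settings(
        cf_default, provider_default, currently_used, manifest, command_line
    ))
```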

Application Health

CF tries to ensure your application is healthy. It has a health monitor that constantly checks your app and if the health check fails, Cloud Foundry will replace the application instance with a fresh one.

Healthchecks

  • Port - LISTEN (fast and easy)
  • Process - pid (fastest)
  • HTTP - expects a 200 response in under 1s; for Java apps, increase health-check-timeout to 180s.

$ cf push APP-NAME -u HEALTH-CHECK-TYPE -t HEALTH-CHECK-TIMEOUT

$ cf set-health-check APP-NAME (process | port | http [--endpoint PATH])
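
To make the port check concrete: it boils down to "can a TCP connection be opened to the app's port". The sketch below shows that idea with the standard library; it is an illustration, not Diego's actual health monitor, and `port_is_listening` is a hypothetical name.

```python
import socket


def port_is_listening(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # Stand-in for an app instance: listen on an ephemeral local port.
    with socket.socket() as server:
        server.bind(("127.0.0.1", 0))
        server.listen()
        port = server.getsockname()[1]
        print(port_is_listening("127.0.0.1", port))
```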

Scaling the Resources

$ cf push [-m MEMORY] [-k DISK] [-i NUM_INSTANCES]

$ cf scale APP-NAME [-m MEMORY] [-k DISK] [-i NUM_INSTANCES]

Orgs, Spaces, Roles, and Permissions

  • developer-centric
  • application-centric
  • development process-centric

Spaces

Spaces are the objects that contain your applications and services.

  • quotas
  • roles
  • permissions

Organizations

Orgs group all spaces together, and have the same traits as spaces, except for one: orgs cannot run applications or services. Orgs are for performing administrative tasks. Orgs affect spaces when they are being created (allow you to see user accounts).

Quotas

Quotas are the size constraints put on orgs and spaces.

Roles and Permissions

Permissions are the steps that can be taken by user accounts. These permissions directly map to the command operations in the cf CLI tool. Related cf CLI actions are grouped into a single permission. Roles are the job tasks. They are designed using the principle of least privilege.

Five fundamental roles exist in CF:

  • Administrator

Only a few user accounts are assigned this role, and it is given out only to the operations team. This role has all rights and privileges.

  • Manager

The manager role is for a person who needs to administer a group of user accounts, but not deploy applications.

  • Auditor

The auditor role is for those who need to review what is going on, but never modify anything.

  • Billing

The billing role is similar to the auditing role, but is more restrictive.

  • Developer

The developer role manages the applications and services it relates to.

Managers Have the Power

Quotas can be assigned to Orgs or Spaces. Quotas at the Org level affect all new Spaces and can be overridden by setting space quotas.

$ cf set-quota ORG ORG_QUOTA

$ cf set-space-quota SPACE_NAME SPACE_QUOTA_NAME

Orgs support manager, billing and auditing roles. You associate the role and the user account with the org:

$ cf set-org-role USERNAME ORG ROLE

Spaces support manager, developer, and auditing roles. You associate the role and a user account with the space:

$ cf set-space-role USERNAME ORG SPACE ROLE

Routes and Domains

Routers are like the local postal offices, with the routing tables as the sorting machines. Domain Name System (DNS) serves as your address book. A domain is like a zip code. A domain is a router.

A domain comes in two types:

  • TCP domain does not look at the data. It forwards the data as is. This allows secure (encrypted) data to be passed through to your application or service.
  • HTTP domain looks at the data before forwarding the message to your application or service. This inspection also means any secure data is decrypted at the router.

Shared vs Private Domains

The default domain name type for domains is a shared domain name. A shared domain name is used across multiple Organizations. The domain name prefix will be the same for applications and services. Your Cloud Foundry environment can support multiple shared domain names.

The private property only means you have your special unique external domain name. You can still share the Private Domain Name across Organizations. The Private Domain Names can only be used with HTTP Domains.

TCP and HTTP Routes

A route is the mapping between the application and its domain. All routes use the domain and the port number to find their Domain Router.

The TCP domain only looks at the TCP port number. The router then looks up that port number and then forwards that message to the appropriate host to continue the processing. The data is never looked at and is forwarded as it was received. Once that port number is used, no other application or service can use that port number within Cloud Foundry.

The HTTP domain only uses ports 80 and 443. All messages are received through those two port numbers. Something else needs to be used to figure out the routing, and that is the URL string. The URL is analyzed and compared with the registered routes. Once a match is found, the message can be forwarded to the proper application or service.

Many applications can be mapped to a single route, and a single application can be mapped to many routes. Routes are globally unique within CF, so they can associate applications with their Spaces.
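
HTTP route matching on the URL can be sketched as a table lookup on host and path. This is a simplified model, assuming the most specific (longest) matching path wins; the hosts, paths, and app names in `ROUTES` are invented, and `lookup` is a hypothetical helper, not GoRouter code.

```python
# Hypothetical route table: (host, path prefix) -> apps mapped to that route.
ROUTES = {
    ("www.example.com", ""): ["web-app"],
    ("www.example.com", "/api"): ["api-v1", "api-v2"],  # many apps, one route
    ("admin.example.com", ""): ["web-app"],             # one app, many routes
}


def lookup(host, path, routes=ROUTES):
    """Return the apps behind the most specific route matching host and path."""
    best = None
    for (route_host, prefix), apps in routes.items():
        if route_host != host:
            continue
        # A prefix matches the path itself or anything below it.
        matches = prefix == "" or path == prefix or path.startswith(prefix + "/")
        if matches and (best is None or len(prefix) > len(best[0])):
            best = (prefix, apps)
    return best[1] if best else []


if __name__ == "__main__":
    print(lookup("www.example.com", "/api/users"))
    print(lookup("www.example.com", "/about"))
```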

Zero Downtime Deployments

A zero downtime deployment is designed to deploy a new software release or restore the previous release without user interruption.

In an agile workflow, there is the concept of continuous deployment. The idea is to do many deployments with small changes. This concept reduces the risk of problems by making all the changes smaller and simpler to understand and review. Zero downtime deployments further reduce the risk by allowing the new version some traffic for verification before you remove the older version.

The concept of zero downtime deployment was designed to be used in staging and production environments. It is generally not necessary for your development and testing environments. Unit, integration, and system testing are items you never want to do in a production environment. In testing environments, inconsistent responsiveness during deploys is usually fine.

The 12-factor approach simplifies zero downtime deployments. The three key items to achieve this are stateless processes, process scaling, and attachable background services. If your processes are stateless, then the processes can be stopped once there are no open connections to them. Scaling allows processes to be added when adding the new deployment processes. The attachable background services allow for services to be connected to both versions during the deployment.

When a new release is created, new routes to and from processes and background services are created. For application reliability and scalability reasons, the same load balancer or network router should be used for both deployments. Both versions of the code are available and connected to the backing service, so all the work is focused solely on the changes to routing.

For Cloud Foundry zero downtime deployments, the new application gets a distinct temporary route that allows testing without exposing it to users. Only after the production route is switched over to the new application does it process user traffic.

Cost

There is a cost in doing zero downtime deployments. You must have enough resources in each space to handle two deployments for a short time period. Management needs to understand that uninterruptible production environments are worth the expense of having more than a single instance of your running application. Having end users not affected by system outages far outweighs the additional cost.

Manual Process

In CF, the GoRouter is the load balancer/network router. The GoRouter maintains the association between the application and its network address path. This tuple is called a mapping. Many applications can be mapped to the same network path. This multiple mapping is what makes zero downtime deployments possible since the old and new applications can use the same network path.

The only restriction on the network address path is that the path is restricted to a single space. So, you cannot deploy your new application to a different space from the old application without building some additional customised routing code.
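
The manual process can be modelled with a toy router in which each route maps to a set of apps. This is a sketch of the sequence only: on a real foundation the corresponding steps would be done with cf push, cf map-route and cf unmap-route, and the `Router` class, app names, and route names below are all hypothetical.

```python
from collections import defaultdict


class Router:
    """Toy stand-in for the GoRouter: each route maps to a set of apps."""

    def __init__(self):
        self.mappings = defaultdict(set)

    def map_route(self, app, route):
        self.mappings[route].add(app)

    def unmap_route(self, app, route):
        self.mappings[route].discard(app)

    def apps_for(self, route):
        return sorted(self.mappings[route])


def blue_green(router, old_app, new_app, prod_route, temp_route):
    """Swap old_app for new_app on prod_route; return who served prod at each step."""
    history = []
    router.map_route(new_app, temp_route)    # 1. new version reachable for testing only
    history.append(router.apps_for(prod_route))
    router.map_route(new_app, prod_route)    # 2. both versions share the production route
    history.append(router.apps_for(prod_route))
    router.unmap_route(old_app, prod_route)  # 3. old version stops receiving traffic
    history.append(router.apps_for(prod_route))
    router.unmap_route(new_app, temp_route)  # 4. temporary route no longer needed
    return history


if __name__ == "__main__":
    router = Router()
    router.map_route("blue", "www.example.com")
    print(blue_green(router, "blue", "green", "www.example.com", "temp.example.com"))
```

At no step is the production route left without an app, which is the property that makes the deployment zero downtime.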

$ cf repo-plugins|grep -i zero
blue-green-deploy            1.2.0     Zero downtime deploys with smoke test support
autopilot                    0.0.1     zero downtime deploy plugin for cf applications
bg-restage                   1.0.0     Perform a zero-downtime restage of an application
                                       over the top of an old one (highly inspired by autopilot)

Cloud Native Design Patterns

Design patterns are programming templates for solving common programming problems.

Cloud native, as mentioned previously, just means the design or code was tailored to the cloud infrastructures. So, a cloud-native design pattern is a programming paradigm applied to solving cloud-based programming issues.

Service Discovery Design Pattern

  • Datastore

The datastore will store the multiple network routes for microservice instances using a generic lookup key. Microservices and clients need to share this generic key.

  • Service Registry

This component handles both the microservice registration and the microservice discovery for clients. The registry determines which clients can access which microservice through an authentication process.

To find the registry service, its network route must be predetermined.

  • Auto-registration

A microservice needs to register itself with the Service Registry. Most Service Registries expect the service to refresh its registration periodically, to confirm it is still alive and up to date.

read-only access to the Configuration Service.

  • Client

The client needs to authenticate with the Service Registry before getting service information back. If an error occurs, the client side needs to report it and deal with the client side interface as well.

Implementation

The CF GoRouters are an implementation of this design pattern. They map network routes to applications. There are various other implementations of this design pattern, such as Etcd, Consul, and Zookeeper.
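
The registration and discovery parts of the pattern can be sketched as an in-memory registry with expiring entries. This is a minimal illustration, assuming a time-to-live heartbeat scheme and leaving out the authentication step described above; `ServiceRegistry`, its methods, and the 30 second TTL are hypothetical choices, not the API of any of the tools named here.

```python
import time


class ServiceRegistry:
    """In-memory registry: instances must re-register before their TTL expires."""

    def __init__(self, ttl_seconds=30.0, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._entries = {}  # lookup key -> {route: time of last registration}

    def register(self, key, route):
        """Called by a microservice at startup and again as a periodic refresh."""
        self._entries.setdefault(key, {})[route] = self.clock()

    def discover(self, key):
        """Return the routes whose registration is still fresh."""
        now = self.clock()
        routes = self._entries.get(key, {})
        return sorted(r for r, seen in routes.items() if now - seen <= self.ttl)


if __name__ == "__main__":
    registry = ServiceRegistry()
    registry.register("orders", "10.0.0.1:8080")
    registry.register("orders", "10.0.0.2:8080")
    print(registry.discover("orders"))
```

The injectable `clock` exists so that expiry can be exercised without waiting for real time to pass.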

Configuration Server Design Pattern

This design pattern is a service to store and retrieve your application configuration parameters. It is a pattern to meet the Twelve Factor principle for Configuration Data. This design pattern is more powerful since configuration data can be changed without needing to restart the application and the setting of the parameters could be done independently. It does not need to be tied to a cf push command.

Persistent datastore for storing and retrieving your application parameters.

Implementation

Spring Cloud Config is an implementation of the configuration service. It is written in Java and integrates quite cleanly with Java applications. It provides many features, such as multiple datastore support and data encryption.

Microservices in other languages can also interface with the Spring Cloud Config service, since the interface is a REST API.
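
As a rough sketch of what a non-Java client might do with the REST response: the server returns JSON containing a list of property sources, which the client flattens into one set of settings. The sample body below is invented (application `web-app`, profile `production`, and the property names are assumptions), no HTTP call is made, and the rule that earlier sources override later ones should be verified against the Spring Cloud Config documentation before relying on it.

```python
import json

# Hypothetical response body for a request such as GET /web-app/production.
SAMPLE_RESPONSE = json.dumps({
    "name": "web-app",
    "profiles": ["production"],
    "propertySources": [
        {"name": "web-app-production.yml", "source": {"cache.ttl": 60}},
        {"name": "web-app.yml", "source": {"cache.ttl": 10, "greeting": "hello"}},
    ],
})


def flatten(response_body):
    """Merge property sources, assuming earlier sources override later ones."""
    merged = {}
    # Apply the lowest-precedence source first so later updates override it.
    for source in reversed(json.loads(response_body)["propertySources"]):
        merged.update(source["source"])
    return merged


if __name__ == "__main__":
    print(flatten(SAMPLE_RESPONSE))
```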

Circuit Breaker Design Pattern

The circuit breaker design pattern is for dealing with remote communication failures and how to recover gracefully. It is named after the circuit breaker used in homes to protect electrical appliances. The idea is that if a failure occurs, that code path is marked, so future calls can report back a failure more quickly.

Client Call

A request is received. If the request requires other services to complete, it makes those calls. Now, what if an unrecoverable error occurs? The client call needs to fail, but does not want to take a very long time. Maybe the first few failures might take a long time, but the remaining calls should complete quickly.

Service Call

If the service call detects a failure, it needs to mark itself as disabled. In this way, future calls can fail right away. It then sets up a background process to check for when the service comes back up again.

Asynchronous Retry and Recovery Thread

The thread gets activated when a microservice call detects an unrecoverable error, like a connection failure. In this case, the failed service is checked periodically to see if it is back up. Once the failed service recovers, the failed service call interface is marked as enabled to allow all future calls to go through.

This background task logs when it initiates a service check and when the recovery occurs.

Implementation

The Hystrix library is an implementation of the circuit breaker design pattern. The Spring Cloud Services has a dashboard that displays the events coming from the Hystrix library. Cloud Foundry also provides a dashboard if the circuit breaker feature is enabled.
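
A minimal sketch of the pattern follows. It is not Hystrix and differs from the description above in one respect, stated plainly: instead of a background recovery thread, it lets a single trial call through once a retry window has elapsed. The class names, the failure threshold, and the retry interval are all hypothetical.

```python
import time


class CircuitOpenError(Exception):
    """Raised when the breaker fails fast instead of calling the service."""


class CircuitBreaker:
    def __init__(self, call, max_failures=3, retry_after=10.0, clock=time.monotonic):
        self.call = call
        self.max_failures = max_failures
        self.retry_after = retry_after
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the service call is enabled

    def __call__(self, *args):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.retry_after:
                # Service is marked as disabled: fail right away.
                raise CircuitOpenError("service marked as disabled")
            # Retry window elapsed: fall through and allow one trial call.
        try:
            result = self.call(*args)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()  # disable (or re-disable) the call
            raise
        # Success: mark the service call as enabled again.
        self.failures = 0
        self.opened_at = None
        return result
```

Wrapping a remote call as `breaker = CircuitBreaker(fetch_orders)` (with `fetch_orders` standing for any function that may raise) means the first few failures take as long as the underlying call, while later calls fail immediately until the retry window passes.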

Logging

Every log line contains four fields:

  • Timestamp
  • Log type (origin code)
  • Channel: either STDOUT or STDERR
  • Message

Loggregator assigns the timestamp when it receives log data.
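
The four fields can be pulled out of a line with a regular expression. This sketch assumes the layout seen in the GoRouter access log example in these notes (timestamp, bracketed log type, then OUT or ERR for the STDOUT/STDERR channel, then the message); `parse_log_line` is a hypothetical helper and the pattern may need adjusting for other output formats.

```python
import re

# timestamp [log type] channel message
LOG_LINE = re.compile(
    r"^(?P<timestamp>\S+) \[(?P<log_type>[^\]]+)\] (?P<channel>OUT|ERR) (?P<message>.*)$"
)


def parse_log_line(line):
    """Split a log line into its four fields, or return None if it does not match."""
    match = LOG_LINE.match(line)
    return match.groupdict() if match else None


if __name__ == "__main__":
    sample = '2016-11-23T16:04:01.49-0800 [RTR/0] OUT www.example.com - "GET / HTTP/1.1" 200'
    print(parse_log_line(sample))
```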

Debugging

$ cf app APP_NAME

$ cf ssh APP_NAME -i 0

$ APP_GUID=$(cf app APP_NAME --guid)

$ curl app.example.com -H "X-CF-APP-INSTANCE: ${APP_GUID}:YOUR-INSTANCE-INDEX"

Distributed Tracing

Distributed tracing helps diagnose application failures and cases where your application is not responding as quickly as you want.

The general concept behind distributed tracing is that you need a way to correlate the different logs being collected. This is commonly done by attaching a correlation identifier where the request is initiated. The rest of your application needs to pass the correlation identifier through all the various APIs. All messages with the same correlation identifier can be filtered together.

Cloud Foundry assists in this process by supporting the Zipkin distributed tracing facility. If enabled, Cloud Foundry will automatically log messages with the Zipkin correlation identifier.

Zipkin Trace Logging

If Zipkin trace logging is enabled in Cloud Foundry, then GoRouter access log messages will contain Zipkin HTTP headers and the GoRouter also adds or forwards Zipkin trace IDs and span IDs to HTTP headers.

The following is an example access log message containing Zipkin headers:

2016-11-23T16:04:01.49-0800 [RTR/0] OUT www.example.com - [24/11/2016:00:04:01.227 +0000] "GET /
HTTP/1.1" 200 0 109 "-" "curl/7.43.0" 10.0.2.150:4070 10.0.48.66:60815
x_forwarded_for:"198.51.100.120" x_forwarded_proto:"http"
vcap_request_id:87f9d899-c7a4-46cd-7b76-4ec35ce9921b response_time:0.263000966
app_id:8e5d6451-b369-4423-bce8-3a7a9e479dbb app_index:0 x_b3_traceid:"2d5610bf5e0f7241"
x_b3_spanid:"2d5610bf5e0f7241" x_b3_parentspanid:"-"

After adding Zipkin HTTP headers to application logs, developers can use cf logs APP-NAME to correlate the trace and span IDs logged by the GoRouter with the traceid logged by their app. To correlate traceids for a request through multiple applications, each application must forward appropriate values for the headers with requests to other applications.
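
Forwarding the headers to a downstream application might look like the sketch below. It assumes the B3 header names X-B3-TraceId, X-B3-SpanId and X-B3-ParentSpanId, uses exact-case dictionary lookups for brevity (real HTTP header names are case-insensitive), and `outgoing_headers` is a hypothetical helper rather than part of any Zipkin client library.

```python
import secrets

TRACE_ID = "X-B3-TraceId"
SPAN_ID = "X-B3-SpanId"
PARENT_SPAN_ID = "X-B3-ParentSpanId"


def outgoing_headers(incoming):
    """Build headers for a downstream call that stays in the caller's trace."""
    # If no span arrived with the request, start one (and with it a new trace).
    current_span = incoming.get(SPAN_ID) or secrets.token_hex(8)
    return {
        TRACE_ID: incoming.get(TRACE_ID) or current_span,  # same trace throughout
        PARENT_SPAN_ID: current_span,                      # caller becomes the parent
        SPAN_ID: secrets.token_hex(8),                     # new 64-bit id, 16 hex chars
    }


if __name__ == "__main__":
    incoming = {TRACE_ID: "2d5610bf5e0f7241", SPAN_ID: "2d5610bf5e0f7241"}
    print(outgoing_headers(incoming))
```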

Annotation

An Annotation is used to record an occurrence in time. There’s a set of core annotations used to define the beginning and end of an RPC request:

  • cs - Client Send: The client has made the request. This sets the beginning of the span.
  • sr - Server Receive: The server has received the request and will start processing it. The difference between this and cs will be a combination of network latency and clock jitter.
  • ss - Server Send: The server has completed processing and has sent the response back to the client. The difference between this and sr will be the amount of time it took the server to process the request.
  • cr - Client Receive: The client has received the response from the server. This sets the end of the span. The RPC is considered complete when this annotation is recorded.

When using message brokers instead of RPCs, the following annotations help clarify the direction of the flow:

  • ms - Message Send: The producer sends a message to a broker.
  • mr - Message Receive: A consumer received a message from a broker.

Unlike RPC, messaging spans never share a span ID. For example, each consumer of a message is a different child span of the producing span.

Other annotations can be recorded during the request’s lifetime in order to provide further insight. For instance, adding an annotation when a server begins and ends an expensive computation may provide insight into how much time is being spent pre- and post-processing the request versus how much time is spent running the calculation.
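
The arithmetic behind the four core RPC annotations can be written down directly. The sketch assumes each annotation carries a timestamp in microseconds; `rpc_timings` and the result keys are hypothetical names, and the sample numbers in the demo are invented.

```python
def rpc_timings(annotations):
    """Derive durations from core annotation timestamps (microseconds)."""
    cs, sr, ss, cr = (annotations[key] for key in ("cs", "sr", "ss", "cr"))
    return {
        "total": cr - cs,              # whole span, as seen by the client
        "server_processing": ss - sr,  # time the server spent on the request
        "request_transit": sr - cs,    # network latency plus clock jitter
        "response_transit": cr - ss,   # network latency plus clock jitter
    }


if __name__ == "__main__":
    print(rpc_timings({"cs": 1000, "sr": 1200, "ss": 1700, "cr": 2000}))
```

Because cs/cr and sr/ss are recorded on different machines, the two transit values are only as accurate as the clocks are synchronized; the total and the server processing time each come from a single clock.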

BinaryAnnotation

Binary annotations do not have a time component. They are meant to provide extra information about the RPC. For instance, when calling an HTTP service, providing the URI of the call will help with later analysis of requests coming into the service. Binary annotations can also be used for exact match search in the Zipkin API or UI.

Endpoint

Annotations and binary annotations have an endpoint associated with them. With two exceptions, this endpoint is associated with the traced process. For example, the service name drop-down in the Zipkin UI corresponds with Annotation.endpoint.serviceName or BinaryAnnotation.endpoint.serviceName. For the sake of usability, the cardinality of Endpoint.serviceName should be bounded. For example, it shouldn’t include variables or random numbers.

Span

A set of Annotations and BinaryAnnotations that correspond to a particular RPC. Spans contain identifying information such as traceId, spanId, parentId, and RPC name.

Spans are usually small. For example, the serialized form is often measured in KiB or less. When spans grow beyond orders of KiB, other problems occur, such as hitting limits like Kafka message size (1MiB). Even if you can raise message limits, large spans will increase the cost and decrease the usability of the tracing system. For this reason, be conscious to store data that helps explain system behavior, and don’t store data that doesn’t.

Trace

A set of spans that share a single root span. Traces are built by collecting all Spans that share a traceId. The spans are then arranged in a tree based on spanId and parentId thus providing an overview of the path a request takes through the system.
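
Arranging spans into a tree can be sketched in a few lines. The span dictionaries below use only the identifying fields named in these notes (traceId, spanId, parentId, plus a name); the sample services and IDs are invented, and `build_trace_tree` is a hypothetical helper that assumes all spans passed in belong to one trace with a single root.

```python
from collections import defaultdict


def build_trace_tree(spans):
    """Return an indented outline of the spans, rooted at the span without a parent."""
    children = defaultdict(list)
    root = None
    for span in spans:
        if span["parentId"] is None:
            root = span
        else:
            children[span["parentId"]].append(span)

    def render(span, depth=0):
        lines = ["  " * depth + span["name"]]
        for child in children[span["spanId"]]:
            lines.extend(render(child, depth + 1))
        return lines

    return render(root) if root else []


if __name__ == "__main__":
    spans = [
        {"traceId": "t1", "spanId": "a", "parentId": None, "name": "gateway"},
        {"traceId": "t1", "spanId": "b", "parentId": "a", "name": "orders"},
        {"traceId": "t1", "spanId": "c", "parentId": "b", "name": "db"},
        {"traceId": "t1", "spanId": "d", "parentId": "a", "name": "auth"},
    ]
    print("\n".join(build_trace_tree(spans)))
```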
