
Gearpump Continuous Integration Proposal

This proposal covers the Gearpump end-to-end integration test. For more information, please track issue 1243.

Background

Gearpump already has some integration tests, but they frequently fail on Travis-CI for unrelated reasons. As a result, the integration tests are currently performed manually and on an ad-hoc basis. The test effort is very high and not sustainable. As the project grows in complexity, any slight code change might break the build if we do not test the build entirely. The major challenge of creating automated integration tests is to set up a "Gearpump on Hadoop cluster" (AUT, application under test) in an easy way.

Approach

TL;DR: Create a scalable Gearpump cluster using Docker.

The long version: we will create a Docker image for Gearpump, so that we can instantly start a set of Docker containers to build up a Gearpump cluster at any scale. We can perform destructive operations on the test cluster (e.g. kill a worker, disconnect the network) without breaking our real machines. Another good reason is that Travis-CI supports the Docker service.

Here are the major items required:

  1. Build a Docker image for Gearpump like this. The key piece is the init_script. Since Gearpump has master and worker roles, the init_script specifies whether the container should start Gearpump as a master or as a worker.
  2. Create a test driver for integration tests. The test driver is a black box: developers treat it as a real Gearpump cluster, while internally it manages the Docker containers.

Technical Details

Build Docker Image

Prerequisites:

  • CentOS 7 64-bit (with Linux Kernel 3.10.x or higher)
  • Install Docker (TBD: add doc of docker basics, proxy settings, dockerui)

We will create a Docker image with Gearpump like this.

Not in this scope, but as a next step we will create more Docker images to simulate other test environments, considering these aspects:

  • Non-HA; HA
  • Basic Authz; Kerberos Authz

Test Driver

The test driver will actually execute Docker commands to manage a real Gearpump cluster. The test driver should expose a set of operations to test cases.

  • Valid operations:
    • Start/Stop a cluster
    • Add/Remove a worker
    • Query component runtime information
    • Submit/Kill application
  • Destructive operations:
    • Kill process
    • Block network communication

trait GearpumpTestCluster {
    def start(masterNum: Int, workerNum: Int): Unit
    def stop(): Unit
    def getGearpumpClient: GearpumpClient
    def getMasters: Array[String]
    def killMaster(masterAddress: String): Unit
    def killWorker(workerAddress: String): Unit
    // ...
}
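
As an illustration of how a test case could use the driver, here is a minimal sketch. It builds on the interface drafted above; the ScalaTest base classes, the way a concrete driver is obtained, and the listWorkers call on GearpumpClient are assumptions, not part of the current API.

import org.scalatest.{BeforeAndAfterAll, FlatSpec}

class WorkerRecoverySpec extends FlatSpec with BeforeAndAfterAll {

    // Obtain a concrete driver implementation; how it is constructed is left open here.
    private val cluster: GearpumpTestCluster = ???

    override def beforeAll(): Unit = cluster.start(masterNum = 1, workerNum = 2)
    override def afterAll(): Unit = cluster.stop()

    "the cluster" should "report one worker after the other one is killed" in {
        val workers = cluster.getGearpumpClient.listWorkers()  // hypothetical client call
        cluster.killWorker(workers.head)
        // a real test would wait for the cluster to settle before asserting
        assert(cluster.getGearpumpClient.listWorkers().size == 1)
    }
}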

Commands

Command to start a single-node Gearpump cluster. The dashboard and REST API will be exposed at http://127.0.0.1:8090.

docker run -d -p 8090:8090 --name master0 -i gearpump/gearpump
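
Inside the test driver, such a command could be issued through scala.sys.process. The following is only a sketch under that assumption; the container name and image come from the command above, everything else (function name, signature) is illustrative.

import scala.sys.process._

// Sketch: start a single-node cluster by shelling out to Docker.
// `docker run -d` prints the new container id, which !! captures.
def startMasterContainer(name: String = "master0"): String =
    Seq("docker", "run", "-d", "-p", "8090:8090", "--name", name, "-i", "gearpump/gearpump").!!.trim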

Command to stop a running Gearpump cluster:

docker stop master0

Command to start 2 worker instances (not implemented yet). Worker instances will communicate with the master automatically; there is no need to specify the hostname or port of the master.

docker run -d --name worker0 -i gearpump/gearpump
docker run -d --name worker1 -i gearpump/gearpump

Command to stop the second worker (not implemented yet)

docker stop worker1

Command to retrieve the number of workers

curl http://127.0.0.1:8090/api/v1/workers

Command to retrieve the master status

curl http://127.0.0.1:8090/api/v1/master
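
A test case could call the same endpoints from Scala. This is a minimal sketch that only returns the raw JSON; the helper name is hypothetical, and the endpoint paths are the ones listed above.

import scala.io.Source

// Sketch: fetch raw JSON from the REST API exposed by the master container.
def restGet(path: String): String =
    Source.fromURL(s"http://127.0.0.1:8090$path").mkString

val workersJson = restGet("/api/v1/workers")  // e.g. count the entries to get the worker number
val masterJson  = restGet("/api/v1/master")   // e.g. check the reported master status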

The test driver has to stop all Docker containers when tearDown is called.
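
A sketch of that tearDown, assuming the driver keeps track of the container names it started:

import scala.sys.process._

// Sketch: stop every container this driver has started.
def tearDown(containerNames: Seq[String]): Unit =
    containerNames.foreach(name => Seq("docker", "stop", name).!)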

Test Plan (moved out of scope)

Update #1: The proposal is to define a mini Gearpump cluster for testing. How the test cases are organized will not be part of the design.

We need to define two test suites.

Check-list Test

Checks whether the designed features behave as expected. Test cases are put into different test categories. Some test cases might only be enabled for a particular test environment. For instance, YARN-related tests will only be performed on a "Gearpump on Hadoop-YARN cluster".

Here is a draft of the test categories. The test cases are placeholders; a sketch of how such a category could be expressed follows the list.

  • Core Spec
    • Test case #1: All ports serve as expected
    • Test case #2: Query service component status, etc.
    • Test case #N: ...
  • Stability Spec
    • Test case #1: Kill an executor and wait for recovery
    • Test case #2: Kill an application and wait for recovery
    • Test case #N: ...
  • Example Spec
    • Test case #1: Word count related
    • Test case #2: Storm related
  • Scalability Spec
    • Test case #1: ...
  • HA Spec
    • Test case #1: ...
  • Dashboard Spec
    • Test case #1: ...
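
Although the test plan itself is out of scope, here is a rough sketch of how such a category could look with ScalaTest tags; the tag name, spec name, and test bodies are all placeholders.

import org.scalatest.{FlatSpec, Tag}

// Hypothetical tag marking cases that are only enabled on a YARN environment.
object OnYarn extends Tag("OnYarn")

class CoreSpec extends FlatSpec {

    "the REST API" should "serve on the expected port" in {
        // e.g. call /api/v1/master and expect a successful response
    }

    it should "list the registered workers when running on YARN" taggedAs (OnYarn) in {
        // enabled only for the "Gearpump on Hadoop-YARN cluster" environment
    }
}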

Regression Test

Ensures that no regression happens. Every test case has an issue id. Every test case will be put into one or more test categories.

Conclusion

Please give feedback.
