
SOEN 345 notes

Some definitions

Legacy code: code without tests. Without tests, we don't know if the code is getting better or worse

Unit test: tests that run fast and help localize problems. They test a specific part of the code

Integration test: tests that span multiple modules. Not fast. They ensure that a complete feature is working

Continuous integration: automatically build, test and analyze the software on every change to the source repo. New commit = building, testing and analyzing again. (Ex: Travis CI)

Continuous delivery: ensures a software change can be delivered, by testing in production-like environments (Ex: testing with Ruby 2.4, Ruby 2.5, ...)

Software is constrained by the people who wrote it before you. You want to code things as simply as possible and skip things that you are not going to need.

TDD (Test driven development)

  1. Write a failing test case
  2. Get it to compile
  3. Make it pass
  4. Remove duplication*
  5. Repeat

*At step 3, you might copy old code to use in your new code. After the test passes, it is a good idea to refactor to remove this duplication. There isn't always duplication.
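To make the loop concrete, here is a minimal sketch of one cycle with JUnit 4 (the PriceCalculator class and its discounted() method are made-up names for illustration):

import org.junit.Test;
import static org.junit.Assert.assertEquals;

public class PriceCalculatorTest {
	// Step 1: this test was written before PriceCalculator existed, so it
	// didn't even compile. Step 2: add an empty class/method so it compiles
	// (and fails). Step 3: implement discounted() so the assertion passes.
	@Test
	public void appliesTenPercentDiscount() {
		PriceCalculator calc = new PriceCalculator();
		assertEquals(90.0, calc.discounted(100.0, 0.10), 0.001);
	}
}

class PriceCalculator {
	double discounted(double price, double rate) {
		return price * (1 - rate);
	}
}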

TDD cons

  • A lot of discipline is needed to write tests before the code
  • Lots of small, seemingly useless tests
  • Those tests need to be maintained

The solution: you compromise!

  • On every commit, make sure there are tests covering the feature or code you added/modified
  • If you don't do it now, you will never do it. (Just be like Shia LaBeouf and do it)

TDD pros

It lets us focus on one thing at a time: you are either writing new code (code/tests) or refactoring, not both at the same time.

Characterization tests

WHY: to protect the existing behavior of legacy code against unintended changes.

WHAT: it characterizes the actual behavior of a piece of code.

HOW:

  1. Put the code in a test harness (automated testing)
  2. Write an assertion that you know will fail
  3. Let the failure tell you what the behavior is
  4. Update the test so that it tests that behavior
  5. Repeat

It's important to characterize important parts or code you think will change in a future update. This gives you a heads-up in the event that the behavior changes.

Example: if you use a Stack in Java, you will notice that iterating over it returns the elements in the wrong order (bottom to top) and does not pop them off the stack. This is because java.util.Stack extends Vector and inherits its iterator. It is a great idea to add characterization tests to your stack so that, if the stack ever gets updated to have proper iterator behavior, they notify you of the change.
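A characterization test for this quirk might look like the following sketch (JUnit 4 assumed):

import java.util.Iterator;
import java.util.Stack;
import org.junit.Test;
import static org.junit.Assert.assertEquals;

public class StackCharacterizationTest {
	// Characterizes the surprising behavior: the iterator inherited from
	// Vector walks the stack bottom-to-top instead of in LIFO order.
	@Test
	public void iteratorReturnsElementsInInsertionOrder() {
		Stack<Integer> stack = new Stack<>();
		stack.push(1);
		stack.push(2);
		stack.push(3);

		Iterator<Integer> it = stack.iterator();
		assertEquals(Integer.valueOf(1), it.next()); // bottom first, not 3!
		assertEquals(Integer.valueOf(2), it.next());
		assertEquals(Integer.valueOf(3), it.next());
		assertEquals(3, stack.size()); // iterating popped nothing
	}
}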

Breaking dependencies (the fun part)

Dependencies among classes make it difficult to get a cluster of objects under test. Sometimes you'll end up with the whole system in the test harness (you don't want that)

Two reasons why we might want to break dependencies:

  1. for sensing: when we can't access the values our code computes
  2. for separation: when we can't even get a piece of code into the test harness. We break the dependency so we can test the code.

Fake collaborator or fake object

An object that impersonates a collaborator of the class being tested.

(Before and after class diagrams omitted: screenshots in the original gist.)

We made a Display interface. The Sale object now holds a Display instead of an ArtR56Display. We can then instantiate it with a fake display to test the behavior of scan() and verify that showLines() is called with the proper value.
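A rough sketch of what that fake might look like (FakeDisplay and the exact signatures are assumptions; the notes only name Display, Sale, scan() and showLines()):

interface Display {
	void showLines(String line);
}

class FakeDisplay implements Display {
	private String lastLine = "";

	public void showLines(String line) {
		lastLine = line; // record the call instead of driving real hardware
	}

	String getLastLine() {
		return lastLine;
	}
}

class Sale {
	private final Display display;

	Sale(Display display) {
		this.display = display; // real ArtR56Display in prod, fake in tests
	}

	void scan(String barcode) {
		// ... item lookup elided; the point is that the display call now
		// goes through the interface and can be sensed in a test
		display.showLines("Milk $3.99");
	}
}

A test can then construct Sale with a FakeDisplay, call scan(), and assert on getLastLine().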

Mock

A mock is a dummy implementation of an interface or a class, with which we can:

  • Define the output of certain method calls
  • Configure to perform a certain behavior
  • Validate the interaction with the system

Mockito: a library enabling mock creation, verification and stubbing. To do so, it uses reflection and the proxy pattern.

Creating a mock

ObjectYouMade mockObject = mock(ObjectYouMade.class);

The mock will remember all the interactions, but will not run the real methods when you call it.

Verifying calls to a mock

CustomObject obj = mock(CustomObject.class);
obj.methodYouMade("Some param");
verify(obj).methodYouMade("Some param"); // passes


// other examples
obj.methodYouMade(3);
verify(obj).methodYouMade(3); // passes
verify(obj).methodYouMade(4); // fails: methodYouMade was never called with 4

// N.B. by default, verify does not check the order of the method calls

Verify the order of calls to a mock

To do so, you need to initialize an InOrder.

InOrder inOrder = inOrder(mock1, mock2, ...);
inOrder.verify(mock1).methodYouMade(ParamUsed);
inOrder.verify(mock2).methodYouMade(ParamUsed);
// to pass, the method must have been called on mock1 before mock2

Stubbing

Stubbing makes a mock return the value you define when a given method is called.

when(mockedObject.get(3)).thenReturn("ok");
mockedObject.get(3); // ok
when(mockedObject.get(3)).thenReturn("It's order dependent");
mockedObject.get(3); // It's order dependent (the most recent stub wins)
// you can also stub for any param value
when(mockedObject.get(anyInt())).thenReturn("Cool");

Stubbing consecutive calls

when(mockedStack.pop()).thenReturn(3,2,1);
mockedStack.pop(); // 3
mockedStack.pop(); // 2
mockedStack.pop(); // 1

ArgumentCaptor

Captures the value that was passed to a method. You can then use that value to verify that other objects or methods were called with it.

ArgumentCaptor<String> argCaptor = ArgumentCaptor.forClass(String.class);

You can set the type you want to capture. Then you can use the following 2 methods:

  1. argCaptor.capture(); to save the arg value
  2. argCaptor.getValue(); to return the arg value

N.B. you can capture multiple times; getValue() returns the most recent value (getAllValues() returns all of them)

Here's a better example:

// Object.java
public void setXY(int xy) {
	setZ(xy);
	this.xy = xy;
}

public void setZ(int z) {...}

// Main.java
ArgumentCaptor<Integer> argCaptor = ArgumentCaptor.forClass(Integer.class);
verify(spyObject).setXY(argCaptor.capture());
verify(spyObject).setZ(argCaptor.getValue());

Using Spy

A spy wraps a real object and lets you use verify + stubs on it if need be.

  1. Wrap the object in a spy:

List list = new LinkedList();
List spy = spy(list);

  2. Enjoy!

If you call the methods, they still work as intended. If you stub a method, its functionality is overridden by the stub.

Bisection

Bisection is used to find the commit where something broke when it wasn't covered by a test.

  1. Write a test
  2. Do a bisection (binary) search through the commit history to find the last good commit. The commit right after it is the one that broke it.

Flaky tests (oh he tweakin')

A flaky test is a non-deterministic test (it can exhibit different outcomes across runs of the same code)

2 types of non-determinism:

  1. Inherent non-determinism: noisy or complex tests, race conditions
  2. Accidental non-determinism: an old / out-of-date test introduces flakiness

How to handle flaky tests

  • Google's way: only report a failure if the test fails 3 times in a row.
  • Microsoft: run the test 1000 times; if its failure rate is below the flaky ratio, quarantine the test.
  • Ericsson: binomial fail ratio: the flaky ratio gives you the number of runs required (at worst, you run a test 384 times). If the observed failure ratio is above the baseline, it is considered a real failure.
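As an illustration of Google's policy, here is a minimal sketch of a JUnit 4 rule that only reports a failure after the test fails on every one of N attempts (RetryRule is a made-up name, not a library API):

import org.junit.rules.TestRule;
import org.junit.runner.Description;
import org.junit.runners.model.Statement;

public class RetryRule implements TestRule {
	private final int attempts;

	public RetryRule(int attempts) {
		this.attempts = attempts;
	}

	@Override
	public Statement apply(Statement base, Description description) {
		return new Statement() {
			@Override
			public void evaluate() throws Throwable {
				Throwable last = null;
				for (int i = 0; i < attempts; i++) {
					try {
						base.evaluate(); // run the test body
						return;          // one pass is enough
					} catch (Throwable t) {
						last = t;        // remember and retry
					}
				}
				throw last; // failed all attempts: report a real failure
			}
		};
	}
}

Use it with @Rule public RetryRule retry = new RetryRule(3); in the test class.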

Testing smells

1. Hidden dependencies (dependencies created inside the constructor)

Solution: parameterize the constructor. Make a constructor that accepts the dependency as a parameter and rewrite the original constructor to call the new one.

Before

public Sale(Display display, Storage storage) {
	this.display = display;
	this.storage = storage;
	this.interac = new Interac(42); // we can't mock that right now
}

After

public Sale(Display display, Storage storage, Interac interac) { // new parameterized constructor
	this.display = display;
	this.storage = storage;
	this.interac = interac;
}

public Sale(Display display, Storage storage) {
	this(display, storage, new Interac(42));
}

We parameterized the constructor so we can insert a mock for testing purposes.

2. Blob (object with so much shit inside)

If you have an object with a large number of instance variables, parameterizing the constructor might not be the best way. A quick fix is to supersede the instance variable of interest (create a setter for it so that we can dynamically insert mocks).

N.B. These methods should only be used for testing
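A minimal sketch of superseding an instance variable (Inventory and MailSender are invented names for illustration):

class MailSender {
	void send(String to, String body) { /* ... */ }
}

public class Inventory {
	private MailSender mailSender = new MailSender(); // hidden dependency
	// ... many more instance variables ...

	// FOR TESTING ONLY: supersede the instance variable so a test can
	// dynamically insert a mock without touching the huge constructor.
	void supersedeMailSender(MailSender sender) {
		this.mailSender = sender;
	}
}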

3. Globals (singleton pattern)

Globals and singletons are evil. Since the test depends on something that is globally modifiable, it can cause a lot of problems.

First, a good way to implement a global object is to use the singleton pattern.

Example

public class SingletonDemo {
	private static SingletonDemo singleton;
	private int x;
	private int y;

	private SingletonDemo() {
		x = 8;
		y = 10;
	}

	public synchronized static SingletonDemo getInstance() {
		if (singleton == null) singleton = new SingletonDemo();
		return singleton;
	}

	public int getX() {
		return x;
	}

	public int getY() {
		return y;
	}

	public void setX(int x) {
		this.x = x;
	}

	public void setY(int y) {
		this.y = y;
	}
}

With a singleton/global, the tests become order-dependent because they are coupled to the value of the global. A solution is to reset the singleton/global to its default value after every test. Doing so fixes the ordering problem but still doesn't allow the tests to run in parallel.
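One way to do that reset, sketched below: add a test-only method to SingletonDemo (resetForTesting() is an assumed addition, not part of the class above) and call it from a JUnit @After hook.

// added inside SingletonDemo.java
static void resetForTesting() {
	singleton = null; // the next getInstance() rebuilds the default state
}

// in the test class (org.junit.After)
@After
public void tearDown() {
	SingletonDemo.resetForTesting();
}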

Dependency injection

Let's say you have a Store object:

public class Store {
	private Interac terminal;
	private Register register;

	public Store(){
		terminal = new Interac(12);
		register = new CashRegister();
	}
}

There is currently no way to mock the hard dependencies on Interac and Register. With dependency injection, you won't need to explicitly set the dependencies with new in the constructor. They will be set from the outside and will be modifiable. A small refactoring is necessary.

1. Parameterize the constructor to include every parameter

public Store(Interac terminal, Register register) { ... }

done.

2. Add the @Inject annotation above every instance variable and constructor of the class you want to inject stuff in

@Inject
private Interac terminal;

@Inject
private Register register;

@Inject
public Store(Interac terminal, Register register) { ... }

done.

3. Create a new module extending AbstractModule to declare configurations

// StoreModule.java
public class StoreModule extends AbstractModule {
	@Override
	protected void configure(){
		bind(Interac.class).toInstance(new Interac(12));
		bind(Register.class).to(CashRegister.class);
	}
}

This sets the same configuration as the original code.

  • .toInstance(...) is used when you want to specify an instance built with a non-default constructor.
  • .to(Something.class) is used when the LHS is different from the RHS and the default constructor is used.

N.B. if the LHS and RHS are the same and use the default constructor, you do not need to bind it. Magic will happen automatically.

4. Add the injector in the class that creates the object

Injector injector = Guice.createInjector(new StoreModule());
Store store = injector.getInstance(Store.class);

Note: dependency injection uses inversion of control to let the framework specify the dependencies

Testing with dependency injection

1. Change the runner

@RunWith(MockitoJUnitRunner.class)

2. Set the instance variables in your test class with the @Mock annotation

@Mock
Interac terminal;

@Mock
Register register;

Store store; // we'll need it for step 3

3. Add a before step using the @Before annotation

@Before
public void anyMethodName() {
	store = new Store(terminal, register);
}

This will be run before every test. A before action is also a great place to reset singletons. After step 3, you are done: you can test like you used to.

Dependency inversion vs dependency injection

Dependency inversion principle: the code should depend only on abstractions (abstract classes and interfaces), not concrete implementations

Bad example

class SomeClass{
	private CandyStore store; // CandyStore is a class
}

Good example

class SomeClass {
	private Store store; // Store is an abstract class
}

Dependency injection is a dependency inversion enabler (it helps us achieve dependency inversion)

Consistency checking

  • There are things that you cannot test (or it would be way too expensive to test).
  • Ex: race conditions, hardware problems
  • Never assume your DB has the right data

You write a consistency checker to verify that the data is still valid (not outdated or corrupted). For example, you can quickly verify the integrity of a file by comparing its expected checksum with its actual checksum.
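A minimal sketch of such a file check, assuming SHA-256 and that the expected checksum was recorded when the file was written (all names are made up):

import java.nio.file.Files;
import java.nio.file.Paths;
import java.security.MessageDigest;

public class FileConsistencyCheck {
	static boolean isConsistent(String path, String expectedHex) throws Exception {
		byte[] bytes = Files.readAllBytes(Paths.get(path));
		byte[] digest = MessageDigest.getInstance("SHA-256").digest(bytes);
		StringBuilder actual = new StringBuilder();
		for (byte b : digest) {
			actual.append(String.format("%02x", b)); // digest bytes to hex
		}
		return actual.toString().equals(expectedHex); // mismatch = violation
	}
}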

How do you check your whole data set?

  1. Run a script on every partition
  2. Every partition has a set of units (more on that later)
  3. Run sanity checks on every unit (same hash, same size, ...)

What is a good unit?

Unit size varies: you can have really small ones and not-so-small ones.

  • Small: a value in a DB
  • Single larger unit: a file (use a hash)
  • Small composite unit: the size of a file, the total number of records
  • Large composite unit: a directory structure (use a hash)

Schedule the checker

  1. Assign a partition per check runner
  2. Set a cursor and record its position

If a violation is found, store it in a DB (we'll use it later)

Fix an inconsistency

When an inconsistency is found and stored in the DB, a script will run over all the violations and fix them. Run the checker again afterward.

Recap

  1. Start a checking job
  2. Compare the actual value against the recorded values
  3. Check for violations. If one is found, report it and persist it.
  4. If valid changes are made to the data, you need to update the checker.

Data migration

1. Start with a forklift

Take a snapshot of the data and start feeding it into the new datastore. Do this during off-peak hours.

2. Incremental Replication

Since we don't want to run the forklift over and over again, use incremental replication instead to move new/updated data to the new datastore. Store a flag on modified rows. This runs continuously so that the new datastore mirrors the old one.

3. Consistency checker

Since incremental replication might not always work, use a consistency checker to find violations between the two datastores. Keep track of any violations and update the outdated records in the new datastore.

4. Shadow writes

With shadow writes, we still read from the old datastore as the source of truth, but we write to both datastores. When a write happens on the old datastore, an asynchronous write is also done on the new one. We need to track the status of those writes; any failed writes will eventually be fixed by the consistency checker. You can also evaluate write performance on the new datastore.
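A hedged sketch of the idea (the KeyValueStore interface and all names are assumptions for illustration): reads hit the old store only; writes go to both, the new store asynchronously.

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

interface KeyValueStore {
	String read(String key);
	void write(String key, String value);
}

class ShadowWritingStore implements KeyValueStore {
	private final KeyValueStore oldStore; // source of truth
	private final KeyValueStore newStore; // being migrated to
	private final ExecutorService pool = Executors.newSingleThreadExecutor();

	ShadowWritingStore(KeyValueStore oldStore, KeyValueStore newStore) {
		this.oldStore = oldStore;
		this.newStore = newStore;
	}

	public String read(String key) {
		return oldStore.read(key); // old datastore stays the source of truth
	}

	public void write(String key, String value) {
		oldStore.write(key, value);
		pool.submit(() -> {
			try {
				newStore.write(key, value); // asynchronous shadow write
			} catch (Exception e) {
				// record the failure; the consistency checker fixes it later
			}
		});
	}
}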

5. Shadow reads

With shadow reads, you read from both datastores. When a request is made, both datastores are queried: the data from the old datastore is served to the user, while the data from the new datastore is used to verify consistency. Keep track of the results over a period of time and roll it out gradually to evaluate performance.

6. Perform full migration

When shadow reads show minimal data mismatch, the datastores are ready to be switched. Flip a flag, and the new datastore becomes the source of truth; the old one is not used anymore.

Feature toggles

If you work with release branches or feature branches and you ship broken code that breaks functionality, you'll have to do a revert or an emergency patch. This takes time and requires a recompile + redeploy of the application globally.

Solution: Feature toggles

Instead of shipping code that is instantly used in master, hide the feature behind a feature toggle. A feature toggle is just a fancy way of saying a flag: you ship the new code hidden behind the flag.

Example of a toggle

public class StoreToggles {
	public static boolean newSalesModule = true;
}

Example of the toggle being used

...
if (StoreToggles.newSalesModule) {
	// run the super cool new code
} else {
	// run the old boring code
}

If something does not behave properly with the new code, simply turn the toggle off. After the feature has been running for a while without any issue, you can delete the toggle and the old code.

Advantages of feature toggles

  1. Support for A/B testing (turning the flag on for a percentage of users); see the sketch after this list
  2. Canary releases (gradual rollout of a feature)
  3. Feature management (quickly turn a feature on/off)
  4. No need to redeploy
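A hedged sketch of how a percentage-based toggle for A/B tests or canary releases might look (all names assumed):

public class PercentageToggle {
	private final int rolloutPercent; // 0..100

	public PercentageToggle(int rolloutPercent) {
		this.rolloutPercent = rolloutPercent;
	}

	// Hashing the user id keeps the answer stable per user, so the same
	// users stay in the rollout as the percentage grows.
	public boolean isEnabledFor(long userId) {
		return Math.floorMod(Long.hashCode(userId), 100) < rolloutPercent;
	}
}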

Disadvantages of feature toggles

  1. Toggle debt (at some point you have so many toggles that you don't know what they do)
  2. Combination hell (some features might only work with a specific combination of toggles; removing one toggle might break everything)
  3. Dormant code (you are keeping code around that could potentially still be called; you need to be careful)

Loggers

Intro

// get root logger
Logger logger = LogManager.getLogger();

// get any other logger
Logger analytics = LogManager.getLogger("analytics");

// Example of logger usage
logger.debug("Debug log message");
logger.info("Info log message");
logger.error("Error log message");

Log level

  1. ALL
  2. FATAL
  3. ERROR
  4. WARN
  5. INFO
  6. DEBUG
  7. TRACE
  8. OFF

N.B. ALL shows everything and OFF shows nothing. FATAL messages are visible at every level except OFF, and in general a message is visible to every log level below its own in this list; for instance, INFO messages are visible when the level is INFO, DEBUG or TRACE.

Log hierarchy

The hierarchy of loggers uses a dot structure

root
root.parent // its parent is root
root.parent.child // its parent is "root.parent"

Small examples with log levels

In this example, only the root logger is defined:

| Logger Name | Assigned LoggerConfig | LoggerConfig Level |
| ----------- | --------------------- | ------------------ |
| root        | root                  | DEBUG              |
| x           | root                  | DEBUG              |
| x.y         | root                  | DEBUG              |

x and x.y end up with the same config as root because they are not defined and inherit from their closest defined ancestor. Since x is not defined, it cannot serve as the parent config of x.y; both fall back to root.

Another example, with x and x.y.z defined

| Logger Name | Assigned LoggerConfig | LoggerConfig Level |
| ----------- | --------------------- | ------------------ |
| root        | root                  | DEBUG              |
| x           | x                     | ERROR              |
| x.y         | x                     | ERROR              |
| x.y.z       | x.y.z                 | WARN               |

x.y has the same values as x, since x.y is not defined and x is its closest defined ancestor. That is not the case for x.y.z, which was explicitly defined.

Another example, with x and x.y defined

| Logger Name | Assigned LoggerConfig | LoggerConfig Level |
| ----------- | --------------------- | ------------------ |
| root        | root                  | DEBUG              |
| x           | x                     | ERROR              |
| x.y         | x.y                   | INFO               |
| x.yz        | x                     | ERROR              |

x.yz is not a child of x.y, as it is missing a period. Since x.yz is not defined, it takes the values of its parent, x.

Appenders and additivity

Appenders let a logger print to multiple destinations (sysout, file, DB, ...). They are inherited through the hierarchy: by default a logger gets the root appenders, its child gets that logger's appenders + the root appenders, and so on. To stop the forwarding of appenders, simply use additivity="false"; the logger will then not get the appenders of its parents.

N.B. Log requests are forwarded down the hierarchy

| Logger Name     | Specific Appenders | Additivity Flag | Active Appenders       |
| --------------- | ------------------ | --------------- | ---------------------- |
| root            | A1                 | n/a             | A1                     |
| x               | A-x1, A-x2         | true            | A1, A-x1, A-x2         |
| x.y             | none               | true            | A1, A-x1, A-x2         |
| x.y.z           | A-xyz1             | true            | A1, A-x1, A-x2, A-xyz1 |
| security        | S1                 | false           | S1                     |
| security.access | S1-access          | true            | S1, S1-access          |

We can see additivity at work with x, x.y and x.y.z. We can also see that turning additivity off on security stops it from inheriting A1 from root: only the appenders specified for security (and its children) are kept.

Pattern for logging

<Pattern>%d %p %c{1.} [%t] %m%n</Pattern>

  • %d = date
  • %p = level (e.g. ERROR)
  • %c = name of the logger
  • %t = name of the thread
  • %m = message
  • %n = newline

Modern Logging

  • Log as much as possible. Don't worry about performance issues when logging
  • Dump all of it in a DB
  • Log structured data (ex: JSON)
  • Log timestamp, method name, request, payloads, data, message, latency, ...
  • Aggregate those logs in a search engine (ex: Splunk)
  • Use a tool like Logstash to structure the log data and make it searchable
  • It is normally cheaper to pay for a log search provider than to build one