@vincent-pradeilles
Created September 25, 2017 17:26

A Pattern to easily enforce external data validation

The pattern described in this article is available as an iOS framework on my GitHub.

Whenever a program gets its data from an untrusted source, such as a user or an external web service, it is particularly important that this data be thoroughly validated before the program starts working with it. Otherwise, the program risks producing errors, corrupting data, or, worse, being vulnerable to a whole array of injection attacks.

Fortunately, those risks are now well known, and development teams usually try to make sure that external inputs are systematically validated before they are used. While such efforts are a good start, they tend to become harder to enforce as the code base grows. For instance, consider the following code:

https://gist.github.com/d6eb6ce6ff969c3dc8916de873e7ab42

The necessary validation is indeed performed before the external data is used, but it happens somewhere in a possibly large chunk of code. It could therefore easily be modified or deleted in the future, and nothing guarantees that such a mistake would be spotted during code review.
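As a rough illustration of the problem, here is a hypothetical sketch (not the code from the gist above). The names `handleSignUp` and `saveUser`, as well as the alphanumeric rule, are assumptions made up for this example.

```swift
import Foundation

// Hypothetical storage and persistence function, for illustration only.
var savedUsernames: [String] = []

func saveUser(named username: String) {
    savedUsernames.append(username)
}

func handleSignUp(rawUsername: String) {
    // ... imagine dozens of lines of unrelated setup here ...

    // The only thing protecting `saveUser` is this inline check.
    // Deleting or weakening it would still compile without any warning.
    let allowed = CharacterSet.alphanumerics
    guard !rawUsername.isEmpty,
          rawUsername.unicodeScalars.allSatisfy({ allowed.contains($0) }) else {
        return
    }

    // ... more unrelated code ...

    saveUser(named: rawUsername)
}
```

Because `saveUser` accepts a plain `String`, the compiler cannot tell a validated value from a raw one; only a careful reviewer can.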

Based on this example, it becomes clear that we need to lay down some kind of pattern to validate external data in a way that lets code review be easy and reliable.

Model-driven security

A good starting point is to find a way to clearly differentiate untrusted data from trusted data. Model-driven security is a sound approach to reach this goal. The idea behind it is to define business objects that wrap trusted data. This way, untrusted data is stored using primitive types and trusted data using those business objects. The line between them is clear and easy to enforce:

https://gist.github.com/192cbdafce9680ac37483d8e88120bcf

The downside of this approach is that the initializer has access to the external data whether or not it passes validation. While this concern might sound a bit extreme, remember that initializers tend to be written hastily, sometimes relying heavily on copy and paste. So there is definitely enough room for mistakes to happen.
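To make the concern concrete, here is a minimal sketch of such a business object. The `Username` type and its length rule are hypothetical and chosen only for illustration; the original gist may look different.

```swift
// Hypothetical business object wrapping a validated username.
struct Username {
    let value: String

    // Failable initializer: returns nil when the raw input is rejected.
    init?(rawValue: String) {
        // Nothing forces this check to exist. The initializer already holds
        // `rawValue` and could assign it without validating anything.
        guard (3...20).contains(rawValue.count) else { return nil }
        self.value = rawValue
    }
}
```

If the `guard` line were removed during a refactoring, the type would still compile and every caller would keep treating `Username` as trusted.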

A container for untrusted data

To address the issue, we are going to define a generic struct called Untrusted:

https://gist.github.com/486fbccca6f36085bd7f251292073b29

Its role is to act as a wrapper around external data. Whenever external data is handled, it should systematically and immediately be stored in an Untrusted container.

Notice that the value is stored with fileprivate visibility, which means that it cannot be read outside of the file where the property is defined. It might sound odd and impractical at first, but it is actually the cornerstone of the pattern.
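A minimal sketch of what such a container could look like is shown below. The unlabeled initializer and the generic parameter name `T` are assumptions; the key point is the `fileprivate` modifier on the stored value.

```swift
// Sketch of the container. It is assumed to live in its own source file,
// so that no other file can read `value` directly.
struct Untrusted<T> {
    // fileprivate: accessible only from code declared in this same file.
    fileprivate let value: T

    // Anyone can wrap a value; unwrapping is what is restricted.
    init(_ value: T) {
        self.value = value
    }
}
```

From any other file, an expression such as `Untrusted("abc").value` is rejected by the compiler, which is exactly the property the pattern relies on.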

In the same file, we now declare a protocol Validator:

https://gist.github.com/3150d39a8acb93a0a2779703343e84d2

This protocol expects conforming types to provide a validation(value:) function that will tell whether the data is correct or not. To actually get the data, the developer will have to call the function validate(untrusted:) which will return the value if it passes the validation, and nil otherwise.
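The description above could be sketched as follows. This is an assumption-laden illustration: whether the functions are static, the associated type name `T`, and the `NonEmptyStringValidator` example are all guesses made for this sketch, and `Untrusted` is repeated only to keep the snippet self-contained.

```swift
// Both declarations are assumed to share one source file, so that the
// protocol extension can read the fileprivate `value`.
struct Untrusted<T> {
    fileprivate let value: T
    init(_ value: T) { self.value = value }
}

protocol Validator {
    associatedtype T
    // Conforming types say whether a raw value is acceptable.
    static func validation(value: T) -> Bool
}

extension Validator {
    // The only way to get the wrapped value out: it is returned
    // when `validation(value:)` accepts it, and nil otherwise.
    static func validate(untrusted: Untrusted<T>) -> T? {
        return validation(value: untrusted.value) ? untrusted.value : nil
    }
}

// Hypothetical validator, for illustration only.
enum NonEmptyStringValidator: Validator {
    static func validation(value: String) -> Bool {
        return !value.isEmpty
    }
}
```

With this sketch, `NonEmptyStringValidator.validate(untrusted: Untrusted("hi"))` yields the wrapped string, while an empty string yields nil.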

Putting things together

We now have a pattern that looks pretty tight. Let's have a look at it in action:

https://gist.github.com/7bcac9e2a6813ddf70a0328c119229f1

This time, there is no room for mistakes: when all the external data is wrapped in Untrusted containers, the initializer cannot extract the actual value unless it passes the validation test.
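For readers who want to see the whole flow in one place, here is a self-contained sketch under stated assumptions: static validator functions, a hypothetical `Username` business object, and a made-up length rule. In a real project the first two declarations would sit in their own file so that `fileprivate` actually hides the value from the business object.

```swift
// --- Assumed to be in a dedicated file ---
struct Untrusted<T> {
    fileprivate let value: T
    init(_ value: T) { self.value = value }
}

protocol Validator {
    associatedtype T
    static func validation(value: T) -> Bool
}

extension Validator {
    static func validate(untrusted: Untrusted<T>) -> T? {
        return validation(value: untrusted.value) ? untrusted.value : nil
    }
}

// --- Assumed to be elsewhere in the code base ---

// Hypothetical rule: a username has between 3 and 20 characters.
enum UsernameValidator: Validator {
    static func validation(value: String) -> Bool {
        return (3...20).contains(value.count)
    }
}

// Hypothetical business object built only from untrusted input.
struct Username {
    let value: String

    init?(untrusted: Untrusted<String>) {
        // The raw string is only reachable through the validator.
        guard let validated = UsernameValidator.validate(untrusted: untrusted) else {
            return nil
        }
        self.value = validated
    }
}

// At the boundary, external input is wrapped right away.
let accepted = Username(untrusted: Untrusted("alice"))
let rejected = Username(untrusted: Untrusted("ab"))
```

Here `accepted` holds a `Username` and `rejected` is nil, because "ab" is shorter than the hypothetical minimum length.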

Conclusion

Now let's see if our initial goal of making external data validation systematic and easy to enforce has been reached:

  • Whenever external data is retrieved, it must be immediately stored in an Untrusted container => easy to enforce in code review ✅
  • Initializers of business objects can take as arguments either other business objects or Untrusted containers, never primitive types => easy to enforce in code review ✅
  • Business logic can only operate with business objects, never with primitive types => easy to enforce in code review ✅

The goal is indeed achieved: by using the Untrusted+Validator pattern along with the 3 rules above, we can guarantee that, as long as the validation functions are correct, the business logic will only deal with safe and validated data.

For more information around the topic of secure coding, I recommend this video from the 2016 GOTO conference: https://www.youtube.com/watch?v=oqd9bxy5Hvc
