Skip to content

Instantly share code, notes, and snippets.

@bkase
Last active July 22, 2020 07:21
Show Gist options
  • Save bkase/9a401450f57c48de960952aef4e13366 to your computer and use it in GitHub Desktop.
Save bkase/9a401450f57c48de960952aef4e13366 to your computer and use it in GitHub Desktop.
GPT3 was fed the first paragraph only (taken from https://bkase.dev/posts/data-abstraction )

One of the greatest tools to tame complexity in a growing codebase, growing either via lines of code or via people, is through strategic information hiding — you may know this as encapsulation, modularization, or data abstraction. In this post, we'll explore data abstraction's definition and capabilities from a formal perspective and show to what extent we can achieve the ideal through Swift, TypeScript, OCaml, ReasonML, Rust, and Haskell. Knowing how to use data abstraction in your language of choice is important in the short term, but understanding and naming the general concepts equips you to understand these tools as our languages evolve and change.

Data abstraction is about abstracting away the details of data structures and manipulating them at will, so that you don't have to worry about their internals anymore. It's not just about avoiding memory leaks, it's also about avoiding having to think about the underlying representation of those values in the first place. This allows for more flexibility when working with your data, which in turn makes it easier to reason about your code. The next time you're updating your favourite game engine and you need to move from 32-bit floats to 64-bit doubles, or when you're migrating a database from Postgres to Oracle, or if you just need to store some additional data that the original representation can't provide for, you can focus on the code and what it's doing rather than having to care about the low-level implementation details.

What this typically means is that we need to be able to treat values of one type as if they are of another. This is achieved through so-called abstract data types (ADTs). These are essentially data types that support operations that can work on any instance of that type, regardless of whether or not it's a concrete instance. The key feature of these operations is that they don't "look behind" the values at all and simply operate on the given parameters. We'll look at a few examples of such operations later.

When using abstract data types, it's important to understand that they don't "do" anything on their own. They're just building blocks we can use to build more complex entities, which is why it's common to see the term "abstract" used in this context. We'll explore this concept further with a couple of code examples in a moment.

A Simple Binary Tree

One of the simplest representations of an abstract data type is a binary tree. Such a tree consists of nodes that have some number of branches (often just called "children" regardless of whether they're left or right). Each of these nodes has some value and pointers to the two child nodes.

We're not interested in the internal details of a node here, just in what operations we can do with it. So, we have the following operations available:

// Given a node with value 'value' and children 'left' and 'right', return 'true' if // 'value' is less than 'left.value' and 'right.value'. bool less(value, left, right); // Returns a new binary tree with 'left' as the left child and 'right' as the right child. TreeNode createTree(value, left, right);

The first thing to note about these operations is that they don't rely on the actual type of the values ( int , string , whatever) but rather operate on the abstract values that they represent.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment