@DavidGinzberg
Created November 21, 2016 07:32
A persistence challenge focusing on DB implementation concerns

Persistence Challenge - Database Implementation

Unlike most assignments, this is not structured as a lab or project. This is a challenge, with components that range in difficulty from the very simple to the extremely difficult. You can pick and choose which elements to implement, and should consider returning to this challenge from time to time throughout your career to continue expanding your knowledge and skills.

This GitHub repository is provided as a starting point. As long as you are familiar with Spring Boot and Maven projects, it should be easy to start here.

Tasks

Basic

Familiarize yourself with the starter code by completing the implementation of a simple CRUD application that stores the names of people. The Twople and Threeple classes are provided to maintain simplicity, and you can use them to expand the application to store first and last names, which will be necessary for some of the later challenges.

  • Implement a 1-table RDBMS with CRUD functions
  • Optimize for access on 1 column via indexing
  • Generalize indexing so it can be switched per column (e.g. setIndexed(columnName, on/off))
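
As a starting point for the CRUD item above, here is a minimal in-memory sketch of a single table. The SimpleTable class and its method names are hypothetical and are not taken from the starter repository; rows are modelled as column-name to value maps purely for illustration.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical single-table store (an assumption, not part of the starter code).
class SimpleTable {
    // Row id -> (column name -> value).
    private final Map<Integer, Map<String, String>> rows = new LinkedHashMap<>();
    private int nextId = 0;

    // Create: store a defensive copy of the row and return its generated id.
    int create(Map<String, String> row) {
        int id = nextId++;
        rows.put(id, new HashMap<>(row));
        return id;
    }

    // Read: returns a copy so callers cannot mutate stored data; empty if the id is unknown.
    Optional<Map<String, String>> read(int id) {
        Map<String, String> row = rows.get(id);
        return row == null ? Optional.empty() : Optional.of(new HashMap<>(row));
    }

    // Update: set one column of an existing row; false if the row does not exist.
    boolean update(int id, String column, String value) {
        Map<String, String> row = rows.get(id);
        if (row == null) {
            return false;
        }
        row.put(column, value);
        return true;
    }

    // Delete: true if a row was actually removed.
    boolean delete(int id) {
        return rows.remove(id) != null;
    }
}
```

Ids are never reused in this sketch, which keeps any later index entries from pointing at the wrong row after a delete.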

Easy

  • Index a single column with a map of String -> set of row indexes (indicating all rows that have that value). Ensure that the CRUD operations continue to behave properly.
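
One way the String -> set-of-rows index could look is sketched below. The IndexedNames class is hypothetical, stores a single name column, and assumes names are never null. The point it illustrates is that every create, update, and delete has to touch the index as well as the rows.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical one-column table with a value -> row-id index (names assumed non-null).
class IndexedNames {
    private final Map<Integer, String> rows = new HashMap<>();
    private final Map<String, Set<Integer>> index = new HashMap<>();
    private int nextId = 0;

    int create(String name) {
        int id = nextId++;
        rows.put(id, name);
        index.computeIfAbsent(name, k -> new HashSet<>()).add(id);
        return id;
    }

    // Update must move the row id from the old value's set to the new value's set.
    boolean update(int id, String newName) {
        String oldName = rows.get(id);
        if (oldName == null) {
            return false;
        }
        removeFromIndex(oldName, id);
        rows.put(id, newName);
        index.computeIfAbsent(newName, k -> new HashSet<>()).add(id);
        return true;
    }

    boolean delete(int id) {
        String oldName = rows.remove(id);
        if (oldName == null) {
            return false;
        }
        removeFromIndex(oldName, id);
        return true;
    }

    // Lookup goes through the index only; no scan over the rows.
    Set<Integer> findByName(String name) {
        Set<Integer> ids = index.get(name);
        return ids == null ? Collections.emptySet() : new HashSet<>(ids);
    }

    private void removeFromIndex(String name, int id) {
        Set<Integer> ids = index.get(name);
        if (ids != null) {
            ids.remove(id);
            if (ids.isEmpty()) {
                index.remove(name); // drop empty sets so the index does not grow forever
            }
        }
    }
}
```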

Medium

  • Add configurable indexing on columns (multiple columns can be set to index themselves). Any number of indexes (up to the number of columns) should be viable, with the index being automatically consulted, if it exists, for any searches.
  • Add multiple table functionality to the database. This will require some refactoring of existing methods to accommodate the need to specify which table to search.
  • Store the entire database on disk upon shutdown and load it (if present) on startup.
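
For the save-on-shutdown and load-on-startup item, the following is a rough sketch that uses plain Java serialization and a JVM shutdown hook. The DiskBackedDatabase class, its table layout, and the file format are all assumptions made for illustration. In a Spring Boot application a lifecycle callback would be a reasonable alternative to the raw shutdown hook.

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;

// Hypothetical multi-table store that is written to and read from a single file.
class DiskBackedDatabase {
    // Table name -> (row id -> value). HashMap is used because it is Serializable.
    private HashMap<String, HashMap<Integer, String>> tables = new HashMap<>();
    private final Path file;

    // Startup: load the previous state if a database file is present.
    DiskBackedDatabase(Path file) {
        this.file = file;
        if (Files.exists(file)) {
            load();
        }
    }

    void put(String table, int id, String value) {
        tables.computeIfAbsent(table, k -> new HashMap<>()).put(id, value);
    }

    String get(String table, int id) {
        HashMap<Integer, String> rows = tables.get(table);
        return rows == null ? null : rows.get(id);
    }

    // Write the whole database in one go.
    void save() {
        try (ObjectOutputStream out = new ObjectOutputStream(Files.newOutputStream(file))) {
            out.writeObject(tables);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    @SuppressWarnings("unchecked")
    private void load() {
        try (ObjectInputStream in = new ObjectInputStream(Files.newInputStream(file))) {
            tables = (HashMap<String, HashMap<Integer, String>>) in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException("Unreadable database file: " + file, e);
        }
    }

    // Register a hook so a normal JVM shutdown saves the data (a crash will not run it).
    void saveOnShutdown() {
        Runtime.getRuntime().addShutdownHook(new Thread(this::save));
    }
}
```

A shutdown hook only covers orderly exits, which is the gap the harder tasks are meant to close.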

Hard

  • Replace your existing indexes with B-Trees. You should implement the B-Trees yourself.
  • Periodically write the database to disk, reducing the amount of data lost in case of a sudden shutdown.
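
To give a feel for the B-Tree item, here is a compact sketch of insertion and lookup with a minimum degree of 2 (each node holds one to three keys). It follows the common "split full nodes on the way down" approach. The BTree class is hypothetical, it stores keys only, and deletion is left out. A real index would also map each key to its set of row ids.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical B-Tree of unique String keys; values and deletion are omitted for brevity.
class BTree {
    private static final int T = 2; // minimum degree: a node holds at most 2*T - 1 keys

    private static class Node {
        final List<String> keys = new ArrayList<>();
        final List<Node> children = new ArrayList<>();
        boolean isLeaf() { return children.isEmpty(); }
    }

    private Node root = new Node();

    boolean contains(String key) {
        Node node = root;
        while (true) {
            int i = 0;
            while (i < node.keys.size() && key.compareTo(node.keys.get(i)) > 0) i++;
            if (i < node.keys.size() && key.equals(node.keys.get(i))) return true;
            if (node.isLeaf()) return false;
            node = node.children.get(i);
        }
    }

    void insert(String key) {
        if (contains(key)) return; // keep keys unique
        if (root.keys.size() == 2 * T - 1) {
            // A full root is split first, which is the only way the tree grows taller.
            Node newRoot = new Node();
            newRoot.children.add(root);
            splitChild(newRoot, 0);
            root = newRoot;
        }
        insertNonFull(root, key);
    }

    private void insertNonFull(Node node, String key) {
        int i = 0;
        while (i < node.keys.size() && key.compareTo(node.keys.get(i)) > 0) i++;
        if (node.isLeaf()) {
            node.keys.add(i, key);
            return;
        }
        if (node.children.get(i).keys.size() == 2 * T - 1) {
            splitChild(node, i);
            if (key.compareTo(node.keys.get(i)) > 0) i++; // median moved up; pick a side
        }
        insertNonFull(node.children.get(i), key);
    }

    // Split the full child at position i: the median key moves up into the parent.
    private void splitChild(Node parent, int i) {
        Node left = parent.children.get(i);
        Node right = new Node();
        String median = left.keys.get(T - 1);
        right.keys.addAll(left.keys.subList(T, left.keys.size()));
        left.keys.subList(T - 1, left.keys.size()).clear();
        if (!left.isLeaf()) {
            right.children.addAll(left.children.subList(T, left.children.size()));
            left.children.subList(T, left.children.size()).clear();
        }
        parent.keys.add(i, median);
        parent.children.add(i + 1, right);
    }

    // In-order traversal; returns the keys in sorted order.
    List<String> inOrder() {
        List<String> out = new ArrayList<>();
        walk(root, out);
        return out;
    }

    private void walk(Node node, List<String> out) {
        for (int i = 0; i < node.keys.size(); i++) {
            if (!node.isLeaf()) walk(node.children.get(i), out);
            out.add(node.keys.get(i));
        }
        if (!node.isLeaf()) walk(node.children.get(node.children.size() - 1), out);
    }
}
```

Because the keys stay sorted, this kind of index can also answer range queries, which the hash-map index from the easier tasks cannot do efficiently.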

Very hard

  • Implement partial-table loading from file. Only the portion of the database needed for a search should be loaded, and frequently accessed records should be cached to prevent excessive disk IO.
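
The caching half of this task could be sketched with a small least-recently-used cache built on LinkedHashMap in access order. The RecordCache class and its loader function are hypothetical. The loader stands in for whatever code reads a single record from the database file, and the sketch is not thread-safe.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical LRU cache in front of a slow record loader (for example, a disk read).
class RecordCache<K, V> {
    private final Function<K, V> loader;
    private final Map<K, V> cache;
    private int loads = 0;

    RecordCache(int capacity, Function<K, V> loader) {
        this.loader = loader;
        // accessOrder = true makes iteration order "least recently used first".
        this.cache = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > capacity; // evict once the cache is over capacity
            }
        };
    }

    // Return the cached record, or load it (assumed non-null) and remember it.
    V get(K key) {
        V value = cache.get(key);
        if (value == null) {
            value = loader.apply(key);
            loads++;
            cache.put(key, value);
        }
        return value;
    }

    // How many times the loader was called; useful for checking the hit rate.
    int loadCount() {
        return loads;
    }
}
```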

Extreme

  • Add multithreading and background consistency checking. At this point you should have a multi-stage file persistence scheme so that a sudden crash/loss of power cannot cause corrupted data (instead inconsistencies in main storage can be corrected based on temporary storage and vice versa).
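
One possible building block for a multi-stage persistence scheme is to write the new contents to a temporary file, force them to disk, and then rename the temporary file over the main one. The sketch below shows only that step. SafeWriter and the ".tmp" naming are assumptions, and whether an atomic move replaces an existing target is implementation specific according to the Java documentation, so verify the behaviour on your platform.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.nio.file.StandardOpenOption;

// Hypothetical helper: stage the data in a temp file, then swap it into place.
class SafeWriter {
    static void writeAtomically(Path target, String contents) {
        // Stage 1: the temp file lives next to the target so the rename stays on one filesystem.
        Path temp = target.resolveSibling(target.getFileName() + ".tmp");
        try {
            try (FileChannel channel = FileChannel.open(temp,
                    StandardOpenOption.CREATE,
                    StandardOpenOption.WRITE,
                    StandardOpenOption.TRUNCATE_EXISTING)) {
                ByteBuffer buffer = ByteBuffer.wrap(contents.getBytes(StandardCharsets.UTF_8));
                while (buffer.hasRemaining()) {
                    channel.write(buffer);
                }
                channel.force(true); // ask the OS to flush data and metadata to the device
            }
            // Stage 2: readers see either the old file or the new one, never a half-written mix.
            Files.move(temp, target, StandardCopyOption.ATOMIC_MOVE);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static String read(Path file) {
        try {
            return new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

If a crash happens during stage 1, the main file is untouched and a leftover temporary file can be discarded on the next startup. A background consistency checker could build on the same idea.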