DavidGinzberg/PersistenceChallenge.md

## PersistenceChallenge.md

      
Persistence Challenge - Database Implementation

Unlike most assignments, this is not structured as a lab or project. It is a challenge whose components range in difficulty from very simple to extremely difficult. You can pick and choose which elements to implement, and you should consider returning to this challenge from time to time throughout your career to keep expanding your knowledge and skills.
This GitHub repository is provided as a starting point. As long as you are familiar with Spring Boot and Maven projects, it should be easy to start here.
Tasks

Basic

Familiarize yourself with the starter code by completing the implementation of a simple CRUD application that stores the names of people. The Twople and Threeple classes are provided to maintain simplicity, and you can use them to expand the application to store first and last names, which will be necessary for some of the later challenges.

Implement a 1-table RDBMS with CRUD functions
Optimize for access on 1 column via indexing
Generalized indexing (e.g. setIndexed(columnName, on/off))
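
As a starting point for the CRUD task, here is a minimal in-memory sketch of a one-table store with a single name column. The class name `NameTable` and its method names are hypothetical and are not taken from the starter repository. The sketch also leaves out the provided Twople and Threeple classes, because their API is not described here.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Objects;
import java.util.Optional;

/** Hypothetical one-table store: each row is a generated id plus one name. */
class NameTable {
    private final Map<Integer, String> rows = new LinkedHashMap<>();
    private int nextId = 0;

    /** Create: stores the name and returns the generated row id. */
    int create(String name) {
        Objects.requireNonNull(name, "name");
        int id = nextId++;
        rows.put(id, name);
        return id;
    }

    /** Read: empty if no row has this id. */
    Optional<String> read(int id) {
        return Optional.ofNullable(rows.get(id));
    }

    /** Update: returns false if the row does not exist. */
    boolean update(int id, String newName) {
        Objects.requireNonNull(newName, "newName");
        return rows.replace(id, newName) != null;
    }

    /** Delete: returns false if the row does not exist. */
    boolean delete(int id) {
        return rows.remove(id) != null;
    }
}
```

Ids are never reused after a delete, which keeps references to rows stable once indexes are added on top.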

Easy


Index a single column with a map of String -> set of row indexes (identifying all rows that have that value). Ensure that the CRUD operations continue to behave properly.
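
One possible shape for this index is sketched below. `IndexedNameTable` and its methods are hypothetical names. The sketch assumes names are never null, and it marks deleted rows with a `null` tombstone so that the remaining row indexes stay valid.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.Set;

/** Hypothetical single-column table with a value -> row-indexes map. */
class IndexedNameTable {
    private final List<String> rows = new ArrayList<>();            // row index -> name (null = deleted)
    private final Map<String, Set<Integer>> index = new HashMap<>(); // name -> rows holding it

    int create(String name) {
        Objects.requireNonNull(name, "name");
        rows.add(name);
        int row = rows.size() - 1;
        index.computeIfAbsent(name, k -> new HashSet<>()).add(row);
        return row;
    }

    void update(int row, String newName) {
        Objects.requireNonNull(newName, "newName");
        removeFromIndex(rows.get(row), row);  // drop the stale index entry first
        rows.set(row, newName);
        index.computeIfAbsent(newName, k -> new HashSet<>()).add(row);
    }

    void delete(int row) {
        removeFromIndex(rows.get(row), row);
        rows.set(row, null);                  // tombstone keeps other row indexes stable
    }

    /** Looks up rows through the index instead of scanning the table. */
    Set<Integer> findRows(String name) {
        return Collections.unmodifiableSet(index.getOrDefault(name, Collections.emptySet()));
    }

    private void removeFromIndex(String name, int row) {
        if (name == null) return;             // row was already deleted
        Set<Integer> bucket = index.get(name);
        if (bucket == null) return;
        bucket.remove(row);
        if (bucket.isEmpty()) index.remove(name);
    }
}
```

Every write path has to touch both the row storage and the index. Forgetting the index in `update` or `delete` is the usual way CRUD behavior breaks once indexing is added.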

Medium


Add configurable indexing on columns (multiple columns can be set to index themselves). Any number of indexes (up to the number of columns) should be viable, with the index being automatically consulted, if it exists, for any searches.
Add multiple table functionality to the database. This will require some refactoring of existing methods to accommodate the need to specify which table to search.
Store the entire database on disk upon shutdown and load it (if present) on startup.
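
The configurable-indexing item above could be sketched as follows. `ConfigurableIndexTable` is a hypothetical class that models a row as a map from column name to value, and `setIndexed` follows the method name suggested in the Basic list. Only insert and search are shown, so update and delete would need the same index maintenance. Multiple tables and persistence are left out.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical table whose columns can each be switched to indexed or not. */
class ConfigurableIndexTable {
    private final List<Map<String, String>> rows = new ArrayList<>();
    // column -> (value -> row indexes); a column is indexed iff it has an entry here
    private final Map<String, Map<String, Set<Integer>>> indexes = new HashMap<>();

    int insert(Map<String, String> row) {
        rows.add(new HashMap<>(row));
        int rowId = rows.size() - 1;
        // Keep every active index up to date with the new row.
        for (Map.Entry<String, Map<String, Set<Integer>>> e : indexes.entrySet()) {
            String value = row.get(e.getKey());
            if (value != null) {
                e.getValue().computeIfAbsent(value, k -> new HashSet<>()).add(rowId);
            }
        }
        return rowId;
    }

    /** Turning an index on builds it from the existing rows; off discards it. */
    void setIndexed(String column, boolean on) {
        if (!on) {
            indexes.remove(column);
            return;
        }
        Map<String, Set<Integer>> index = new HashMap<>();
        for (int i = 0; i < rows.size(); i++) {
            String value = rows.get(i).get(column);
            if (value != null) {
                index.computeIfAbsent(value, k -> new HashSet<>()).add(i);
            }
        }
        indexes.put(column, index);
    }

    boolean isIndexed(String column) {
        return indexes.containsKey(column);
    }

    /** Consults the column's index if one exists, otherwise scans all rows. */
    Set<Integer> find(String column, String value) {
        Map<String, Set<Integer>> index = indexes.get(column);
        if (index != null) {
            return new TreeSet<>(index.getOrDefault(value, Collections.emptySet()));
        }
        Set<Integer> matches = new TreeSet<>();
        for (int i = 0; i < rows.size(); i++) {
            if (value.equals(rows.get(i).get(column))) matches.add(i);
        }
        return matches;
    }
}
```

Callers of `find` never need to know whether a column is indexed. The result is the same either way and only the cost differs.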

Hard


Replace your existing indexes with B-Trees. You should implement the B-Trees yourself.
Intermittently write the database to disk, reducing the amount of data lost in case of a sudden shutdown.
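
The B-Tree is left as the exercise it is meant to be. For the intermittent write, here is a rough sketch that uses a `ScheduledExecutorService` to flush a snapshot at a fixed interval. `PeriodicSnapshotter`, the one-name-per-line file format, and the choice of interval are all assumptions made for illustration. Names containing line breaks would not survive this format.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** Hypothetical store that writes all rows to a file every periodMillis. */
class PeriodicSnapshotter implements AutoCloseable {
    private final List<String> rows = new ArrayList<>();
    private final Path file;
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r, "snapshot-writer");
                t.setDaemon(true); // do not keep the JVM alive just for snapshots
                return t;
            });

    PeriodicSnapshotter(Path file, long periodMillis) {
        this.file = file;
        scheduler.scheduleAtFixedRate(this::flushQuietly, periodMillis, periodMillis,
                TimeUnit.MILLISECONDS);
    }

    synchronized void add(String name) {
        rows.add(name);
    }

    /** Writes the full table, one name per line; synchronized against add(). */
    synchronized void flush() throws IOException {
        Files.write(file, rows, StandardCharsets.UTF_8);
    }

    private void flushQuietly() {
        try {
            flush();
        } catch (IOException e) {
            // Report and carry on; the next tick will try again.
            System.err.println("snapshot failed: " + e.getMessage());
        }
    }

    /** Loads a previous snapshot, or an empty list if none exists. */
    static List<String> load(Path file) throws IOException {
        return Files.exists(file)
                ? Files.readAllLines(file, StandardCharsets.UTF_8)
                : new ArrayList<>();
    }

    /** Stops the timer and writes one final snapshot. */
    @Override
    public void close() throws IOException {
        scheduler.shutdown();
        flush();
    }
}
```

A shorter interval loses less data on a crash but rewrites the whole file more often. Writing only the changed rows is one of the refinements the later tiers point toward.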

Very hard


Implement partial-table loading from file. Only the portion of the database needed for a search should be loaded, and frequently accessed records should be cached to prevent excessive disk IO.
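
The caching half of this task can be sketched with a small least-recently-used cache built on `LinkedHashMap` in access order. `RecordCache` is a hypothetical name, and the `loader` function stands in for whatever code reads a single record from the file. LRU eviction is one reasonable policy here, not the only one.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.IntFunction;

/** Hypothetical LRU cache placed in front of a slow record loader. */
class RecordCache {
    private final int capacity;
    private final IntFunction<String> loader; // stand-in for a disk read
    private final LinkedHashMap<Integer, String> cache;
    private int diskReads = 0;

    RecordCache(int capacity, IntFunction<String> loader) {
        this.capacity = capacity;
        this.loader = loader;
        // accessOrder = true: iteration order runs from least to most recently used.
        this.cache = new LinkedHashMap<Integer, String>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<Integer, String> eldest) {
                return size() > RecordCache.this.capacity;
            }
        };
    }

    /** Returns the record, loading it only when it is not already cached. */
    String get(int rowId) {
        String cached = cache.get(rowId);
        if (cached != null) return cached;
        diskReads++;
        String loaded = loader.apply(rowId);
        cache.put(rowId, loaded); // may evict the least recently used entry
        return loaded;
    }

    /** Number of times the loader was invoked; useful for checking hit rates. */
    int diskReads() {
        return diskReads;
    }
}
```

Counting loader calls, as `diskReads()` does, gives a cheap way to confirm that frequently accessed records really are served from memory.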

Extreme


Add multithreading and background consistency checking. At this point you should have a multi-stage file persistence scheme so that a sudden crash or loss of power cannot cause corrupted data (instead, inconsistencies in main storage can be corrected based on temporary storage, and vice versa).
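
The simplest multi-stage scheme writes the new state to a temporary file and then renames it over the main file. The sketch below shows only that one idea. `TwoStageStore`, the `.tmp` suffix, and the recovery rule (discard any leftover temporary file) are assumptions. Multithreading and consistency checking are not covered. Note also that when `ATOMIC_MOVE` is requested, whether an existing target is replaced is implementation specific according to the `Files.move` documentation, and a production version would also force the data to disk before renaming.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical store that never overwrites the main file in place. */
class TwoStageStore {
    private final Path mainFile;
    private final Path tempFile;

    TwoStageStore(Path mainFile) {
        this.mainFile = mainFile;
        this.tempFile = mainFile.resolveSibling(mainFile.getFileName() + ".tmp");
    }

    /** Stage 1: write everything to the temp file. Stage 2: rename it into place. */
    void save(List<String> rows) throws IOException {
        Files.write(tempFile, rows, StandardCharsets.UTF_8);
        Files.move(tempFile, mainFile,
                StandardCopyOption.REPLACE_EXISTING, StandardCopyOption.ATOMIC_MOVE);
    }

    /**
     * A leftover temp file means a save was interrupted before the rename,
     * so the main file still holds the last complete state.
     */
    List<String> load() throws IOException {
        Files.deleteIfExists(tempFile);
        return Files.exists(mainFile)
                ? Files.readAllLines(mainFile, StandardCharsets.UTF_8)
                : new ArrayList<>();
    }
}
```

A crash during stage 1 leaves the main file untouched, and a crash after the rename leaves the new state fully in place, so a reader should never see a half-written main file.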