@candale
Last active May 15, 2020 08:01
FileIntegerListProblem

Docker

To start playing with Docker, we are going to build a toy problem: a very simple ToDo app that we'll take all the way to its conclusion, with a bunch of containers doing the work, using volumes, networking and the lot.

ToDo in a container

Develop a minimal CLI that offers the possibility to save, list and mark ToDos as done. The information for each todo will be stored in a simple CSV file, in the following way:

Do the laundry,not_done
Make coffee,done

The CLI should have the following commands:

$ todo add "Walk the dog"
Added todo "Walk the dog"

$ todo add "Do homework"
Added todo "Do homework"

$ todo list
(1) Walk the dog, Not Done
(2) Do homework, Not Done

$ todo done 1
Marked "Walk the dog" as done

$ todo list
(1) Walk the dog, Done
(2) Do homework, Not Done

The code should be as simple as possible, with no bells or whistles. You can choose to develop the CLI in any way you like, but preferably as simple as possible, so as to require little time.
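
One possible starting point for the storage layer, as a minimal sketch rather than a reference solution. It assumes Python, one `text,status` row per todo, and the function names `load`, `save`, `add`, `mark_done` and `listing`, all of which are made up for illustration. Argument parsing for the actual `todo` command is left out.

```python
import csv
from pathlib import Path
from typing import List

# Status strings as stored in the CSV file (assumed spelling; adjust to your format).
NOT_DONE, DONE = "not_done", "done"


def load(path: Path) -> List[List[str]]:
    """Return the rows of the CSV file, or an empty list if it does not exist."""
    if not path.exists():
        return []
    with path.open(newline="") as f:
        return [row for row in csv.reader(f) if row]


def save(path: Path, rows: List[List[str]]) -> None:
    """Rewrite the whole file; fine for a toy-sized list."""
    with path.open("w", newline="") as f:
        csv.writer(f).writerows(rows)


def add(path: Path, text: str) -> str:
    rows = load(path)
    rows.append([text, NOT_DONE])
    save(path, rows)
    return f'Added todo "{text}"'


def mark_done(path: Path, number: int) -> str:
    rows = load(path)
    rows[number - 1][1] = DONE  # numbers shown to the user start at 1
    save(path, rows)
    return f'Marked "{rows[number - 1][0]}" as done'


def listing(path: Path) -> List[str]:
    """Return one display line per todo, numbered from 1."""
    labels = {NOT_DONE: "Not Done", DONE: "Done"}
    return [
        f"({i}) {text}, {labels.get(status, status)}"
        for i, (text, status) in enumerate(load(path), start=1)
    ]
```

Using the csv module instead of splitting on commas keeps a todo such as "Buy milk, eggs" intact, because the writer quotes it.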

Create a Docker image that has everything set up so that one can run the todo command.

You need to create a Dockerfile that builds an image where one could simply say:

$ docker run -it --rm todo:latest todo add "Create Dockerfile"
Added todo "Create Dockerfile"

One could alias the command to run the container so things would be easier:

$ alias todo="docker run -it --rm todo:latest todo"
$ todo add "Create alias"

Please modify the command such that the file where the ToDos are stored is not lost after each container run.

Webserver to share the todo file

We want to be able to share the file that is used within our Docker container through an HTTP server. Python has a nice module that one can use to create a server that serves the files in a directory, like so:

python3 -m http.server --bind 0.0.0.0 8021

One can now go to http://localhost:8021 and get the list of all files from the directory where the server was run.

We need to be able to run our todo command at any time using the alias we defined above, but we should also be able to open a specific address in a web browser, at any time, and find the file that holds our information. Every component, the todo app as well as the webserver, must be containerized.

If one has no clue where to start and help is needed, one can find a hint here.

Introducing Redis

You're using a file as storage? What is this, the beginning of the 21st century?

Let's change the storage backend to be a Redis instance. Redis is a multi-purpose key-value store (well, much more than that actually, so read up on it) that offers sub-millisecond access to its data.

We need to rewrite our ToDo application so that it will keep the todos in Redis. We still have to run our ToDo application in a container of its own and we also have to run Redis in a container of its own.

One more thing to add is that Redis dumps its data to disk periodically (the time it chooses to do this can be configured so you might want to take a look at that as well). We want this data to be persistent, even though the engine runs in Docker, so please make sure of that.

While we're talking about persistent data, we should also keep the webserver we implemented in the step above, but now it should serve the data that is dumped on disk by Redis.

There are a couple of rules to abide by:

  • Everything that we run in this exercise is run within a container; that includes the Python code, the Redis engine and the webserver.
  • The only port that we are allowed to expose to the host (e.g. -p 8080:8080) is that of the webserver. Redis should not be accessible from the host.

Research:

  • redis data types

Docker-compose

Wrap the webserver and Redis in a docker-compose file while running the todo app with docker run.

Research:

  • docker-compose
  • networking
  • environment variables

Docker Swarm

Research:

  • secrets

Descriptors

Design a JSON validation system that works by defining models: classes that describe what an object will look like. A model will inherit from a BaseModel and have fields defined as class attributes.

The models will only work in an input-output fashion, i.e. a model can only receive a dict at initialization and can spit out the validated data. Setting values on the model is not in the scope of this problem.

from typing import Dict

def less_than_30(value):
    return value < 30

class CarOption(BaseModel):

    name = fields.CharField(max_length=30)
    version = fields.IntegerField(required=False, validators=[less_than_30])


class CarDescription(BaseModel):

    make = fields.CharField(max_length=30)
    model = fields.CharField(max_length=30)
    number_cylinders = fields.IntegerField(required=False)
    options = fields.ListField(model_class=CarOption)

    def validate_make(self, value: str) -> str:
        """Validate options here

        Raise validation error if something is not right

        Return the cleaned value. A validator can alter the value it returns
        to the caller. If that's the case, the returned value is what the
        user will see in the end.
        """

        return f'MAKE: {value}'

    def validate(self, validated_data):
        """Final validation over the whole data."""


my_data = {
    'make': 'Volkswagen',
    'model': 'Golf VI',
    'number_cylinders': 10,
    'options': [
        {'name': 'Radio'},
        {'name': 'Bluetooth', 'version': 10}
    ]
}

# Calling .is_valid will raise an exception if any of the fields are not valid
description = CarDescription(my_data)
# raise ValidationError if data is not valid
description.is_valid()


# One can get the value of a field by accessing it directly
>>> print(description.model)
Golf VI
>>> print(description.options)
[{'name': 'Radio'}, {'name': 'Bluetooth', 'version': 10}]


# Or the entire object is returned, as dict, if .data property is accessed.
>>> print(description.data)
{'make': 'MAKE: Volkswagen', ...}


# A method validate_<field_name> can be defined on the model. When a field is
# accessed, this method is called before the value is returned, and its
# return value becomes the final value. This method can throw
# a validation error if the value is incorrect.
>>> print(description.make)
MAKE: Volkswagen


# If a field is invalid, when it is accessed, a validation error is thrown
my_data['number_cylinders'] = 'asdf'
description = CarDescription(my_data)
print(description.number_cylinders)
# ^ ^ ^ -> raises validation error
description.is_valid()
# ^ ^ ^ -> raises validation error

BaseModel must have the following interface:

from typing import Any, Dict, Text


class BaseModel:

    def __init__(self, data: Dict[Text, Any]):
        """Instantiate a new model with the passed data"""

    def is_valid(self) -> None:
        """Throws a validation error if any of the fields has invalid data"""

    def validate(self, validated_data: Dict[Text, Any]) -> None:
        """Override this if custom validation is necessary over all fields"""

    @property
    def data(self) -> Dict[Text, Any]:
        """This will return a dict with all the validated data
        If .is_valid was not called, it will be called and then the data will
        be returned.

        If a caching mechanism for this property is installed, .is_valid
        must be also handled accordingly.
        """

A model is made out of fields, each of which represents a data type. Define the following fields as descriptors: CharField, ListField, IntegerField, URLField.

URLField will return whatever that resource points to, as a stream of bytes.
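
To make the descriptor idea concrete, here is a minimal sketch of how a field could validate on access. It is only one way to approach the exercise. The `Field` base class, its `clean` method, the `_raw` attribute, the `ValidationError` class and the `Car` model are all assumptions made for illustration; ListField, URLField, `.is_valid()` and `.data` are left out.

```python
class ValidationError(Exception):
    """Raised when a value does not satisfy its field (assumed exception name)."""


class Field:
    """Base descriptor: reads the raw value from the instance's input dict."""

    def __init__(self, required=True, validators=()):
        self.required = required
        self.validators = list(validators)

    def __set_name__(self, owner, name):
        # Called once at class creation; tells the field its attribute name.
        self.name = name

    def __get__(self, instance, owner=None):
        if instance is None:  # accessed on the class, not on an instance
            return self
        raw = instance._raw.get(self.name)
        if raw is None:
            if self.required:
                raise ValidationError(f"{self.name}: this field is required")
            return None
        value = self.clean(raw)
        for validator in self.validators:
            if not validator(value):
                raise ValidationError(f"{self.name}: {validator.__name__} failed")
        # Per-field hook such as validate_make may replace the value.
        hook = getattr(instance, f"validate_{self.name}", None)
        return hook(value) if hook else value

    def clean(self, raw):
        return raw


class IntegerField(Field):
    def clean(self, raw):
        # bool is a subclass of int, so it is rejected explicitly.
        if isinstance(raw, bool) or not isinstance(raw, int):
            raise ValidationError(f"{self.name}: expected an integer")
        return raw


class CharField(Field):
    def __init__(self, max_length, **kwargs):
        super().__init__(**kwargs)
        self.max_length = max_length

    def clean(self, raw):
        if not isinstance(raw, str) or len(raw) > self.max_length:
            raise ValidationError(
                f"{self.name}: expected a string of at most {self.max_length} characters"
            )
        return raw


class BaseModel:
    def __init__(self, data):
        self._raw = dict(data)  # validation happens lazily, on attribute access


class Car(BaseModel):
    make = CharField(max_length=30)
    number_cylinders = IntegerField(required=False)

    def validate_make(self, value):
        return f"MAKE: {value}"
```

Because the fields define only `__get__`, they are non-data descriptors. That works here because nothing is ever stored on the instance under the field's name.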

Decorators for classes

Let's say you have an introspection tool that builds some sort of documentation based on the docstrings of the methods of a class.

As usually happens in Django with views, we have a base class, let's say RetrieveUpdateDestroyAPIView, which has the generic methods get, put, patch and delete. We may create three views that inherit from this generic view without overriding any of the HTTP methods, yet the docstrings should be different for each of those specific views.

class UserView(RetrieveUpdateDestroyAPIView):

    serializer_class = UserSerializer
    queryset = User.objects.all()


class BallView(RetrieveUpdateDestroyAPIView):

    serializer_class = BallSerializer
    queryset = Ball.objects.all()


class CarView(RetrieveUpdateDestroyAPIView):

    serializer_class = CarSerializer
    queryset = Car.objects.all()

To override the docs for get and the other methods, I would have to implement each method in every view, add the docstring, and just call super, because there is nothing else I need to do there.

Another solution would be the following:

Create a decorator that, when applied to a class, changes the docstrings of the methods with the names given as kwargs, as follows:

@replace_docstrings(
    get='My other get docstring',
    put='My other put docstring'
)
class CarView(RetrieveUpdateDestroyAPIView):

    serializer_class = CarSerializer
    queryset = Car.objects.all()


print(CarView.get.__doc__)
My other get docstring

print(CarView.put.__doc__)
My other put docstring

It is important that the class CarView keeps the same name and module name it would have had without the decorator.
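
A minimal sketch of one way such a decorator could work, assuming only plain instance methods need new docstrings. `BaseView` is a hypothetical stand-in for the Django REST framework base class so the example runs on its own, and `_copy_with_doc` is a made-up helper name.

```python
import functools


def _copy_with_doc(func, doc):
    """Return a thin wrapper around `func` that carries its own docstring."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)

    wrapper.__doc__ = doc  # override the docstring copied by functools.wraps
    return wrapper


def replace_docstrings(**docs):
    """Class decorator: give the named methods new docstrings on this class only."""

    def decorate(cls):
        for name, doc in docs.items():
            inherited = getattr(cls, name)  # a plain function when looked up on the class
            # Setting the wrapper on the subclass shadows the inherited method,
            # so the base class and sibling views keep their original docstrings.
            setattr(cls, name, _copy_with_doc(inherited, doc))
        return cls  # same class object, so __name__ and __module__ are unchanged

    return decorate


class BaseView:  # hypothetical stand-in for the generic view
    def get(self):
        """Base get docstring"""
        return "got"


@replace_docstrings(get="My other get docstring")
class CarView(BaseView):
    pass
```

Writing to `inherited.__doc__` directly would look simpler, but it would change the docstring for every view that shares the base method. That is why the sketch wraps the method instead.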

Testing and Documenting Pipelining

Using pytest, test the two pieces of code below, Pipelining and Q object for Pipelining, covering the key features described by each of the assignments.

Write docstrings for relevant parts of the code considering best practices and readability.

Pipelining Q object

Write a class, Q, that acts as a filtering unit for an instance of Pipe (below). This class should provide an interface for creating complex boolean filtering logic that can be supplied to the method Pipe.filter

Examples for file with numbers [1, 2, 3, 4, 5, 6, 7, 8]:

"""
Can use binary or operator (`|`) to chain multiple Q instances and the filter
is valid if any of the filters are valid
"""
pipe = Pipe('/path/to/file')

# allow numbers which are three or numbers which are even
my_filter = Q(lambda x: x == 3) | Q(lambda x: x % 2 == 0)
new_pipe = pipe.filter(my_filter)

list(new_pipe)
[2, 3, 4, 6, 8]
"""
Can use binary and operator (`&`) to chain multiple Q instances and the filter
is valid if all the filters evaluate to true
"""
pipe = Pipe('/path/to/file')

# allow numbers which can be evenly divided by 4 and 2
my_filter = Q(lambda x: x % 4 == 0) & Q(lambda x: x % 2 == 0)
new_pipe = pipe.filter(my_filter)

list(new_pipe)
[4, 8]
"""
Can nest filters and they will be evaluated in the order that is dictated
by the nesting
"""
pipe = Pipe('/path/to/file')

# the order of evaluation for the filters will be x%2, x%4 and then x!=8
my_filter = (Q(lambda x: x % 2 == 0) | Q(lambda x: x % 4 == 0)) & Q(lambda x: x != 8)

new_pipe = pipe.filter(my_filter)

list(new_pipe)
[2, 4, 6]
"""
The operations have the same built-in short-circuiting behavior we see with
the normal `and` and `or` operations.

For this example we will consider that we only have the following numbers
in the list: [4, 5]
"""

def is_even(x):
    print(x, 'in is_even')
    return x % 2 == 0


def is_divided_by_four(x):
    print(x, 'in divided by four')
    return x % 4 == 0


pipe = Pipe('/path/to/file')

my_filter = Q(is_even) | Q(is_divided_by_four)
new_pipe = pipe.filter(my_filter)

list(new_pipe)

"""
will print
4 in is_even
5 in is_even
5 in divided by four

FOR 4: we call the function `is_even` and we get
true so it doesn't make sense to call `is_divided_by_four` because we know we
have a true expression.

FOR 5: we call the function `is_even` and we get false so we go on and
evaluate for `is_divided_by_four` as well.
"""

# Example for and operation

pipe = Pipe('/path/to/file')

my_filter = Q(is_even) & Q(is_divided_by_four)
new_pipe = pipe.filter(my_filter)

list(new_pipe)

"""
will print
4 in is_even
4 in divided by four
5 in is_even

FOR 4: we call the function `is_even` and we get true but we have an and
so we go further to evaluate the second condition, `is_divided_by_four`.

FOR 5: we call the function `is_even` and we get false so it makes
no sense to go on to evaluate the other condition
"""
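
The combining and short-circuiting behavior described above can be sketched as follows. This is a minimal illustration under the assumption that a Q instance is itself callable, so it can be handed to anything that expects a predicate (the built-in `filter` here, `Pipe.filter` in the exercise). It is not the only possible design.

```python
class Q:
    """Wrap a predicate so predicates can be combined with `|` and `&`."""

    def __init__(self, func):
        self.func = func

    def __call__(self, value):
        return bool(self.func(value))

    def __or__(self, other):
        # Python's `or` short-circuits: `other` runs only if `self` returned False.
        return Q(lambda value: self(value) or other(value))

    def __and__(self, other):
        # Python's `and` short-circuits: `other` runs only if `self` returned True.
        return Q(lambda value: self(value) and other(value))


numbers = [1, 2, 3, 4, 5, 6, 7, 8]
three_or_even = Q(lambda x: x == 3) | Q(lambda x: x % 2 == 0)
print(list(filter(three_or_even, numbers)))  # [2, 3, 4, 6, 8]
```

Each combination returns a new Q, so nesting with parentheses builds a tree of closures and the grouping decides the evaluation order.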

Pipelining

Write a class that receives as a single argument at init a path to a file that has an integer on each line. The class has the following interface:

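from __future__ import annotations  # lets `-> Pipe` be used inside the class body

from typing import Callable


class Pipe: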

    def __init__(self, path):
        ...

    def filter(self, func: Callable) -> Pipe:
        """Apply filter `func` to the integers in the file

        :param func: a function that takes as single argument an integer
            and returns True or False
        """
        pass

    def modifier(self, func: Callable) -> Pipe:
        """Apply modifier `func` to the integers in the file

        :param func: a function that takes as single argument an integer
            and returns an integer
        """
        pass

    def register_signal(self, func: Callable) -> Pipe:
        """Register `func` as signal

        When the file is iterated, for each integer that passes the filters
        and after the integer is modified, `func` gets called with the resulting
        integer.

        This method can be called any number of times to register multiple signals.

        :param func: a function that takes as a single argument an integer
        """

Requirements:

  • Implement in Python 3.x
  • An instance of Pipe must be iterable
    p = Pipe('/path/to/file')
    for item in p:
        print(item)
  • Every iteration in the inner workings of the class must use generators. When I execute the following code, each item in the file is read and processed one at a time:
    pipe = Pipe('/path/to/file')
    iterator = iter(pipe)
    # one item is fetched from the file, filters and modifiers are applied,
    # signals called and then the item is handed out
    next(iterator)
    # the second line is read from the file, filters and modifiers are applied,
    # signals called and then the item is handed out
    next(iterator)
  • The filters and modifiers are applied in the order they were set on the instance
    pipe = Pipe('/path/to/file')
    pipe = pipe.modifier(lambda x: x**2)
    pipe = pipe.filter(lambda x: x % 2 == 0)
    pipe = pipe.register_signal(lambda x: ...)
    pipe = pipe.modifier(lambda x: x * 10)

    # the first modifier is applied, the first filter is applied on the result,
    # the signal is called if the filter passed, and then the last modifier is applied
    next(iter(pipe))
  • No file data is held as state on an instance
  • An instance of Pipe must be immutable. For example, for numbers [1, 2, 3, 4]:
    pipe = Pipe('/path/to/file')
    list(pipe)
    [1, 2, 3, 4]
    new_pipe = pipe.filter(lambda x: x % 2 == 0)
    list(new_pipe)
    [2, 4]
    list(pipe)
    [1, 2, 3, 4]
  • Do this in as few lines as possible. The code must still look Pythonic and reasonable.
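
The requirements above can be met in several ways. The following is a minimal sketch of one of them, not the intended solution. It assumes the steps are kept as an immutable tuple of `(kind, func)` pairs, that `_with` is a made-up private helper, and that a signal fires at its position in the chain, so a value rejected by a later filter may already have triggered an earlier signal.

```python
from typing import Callable, Iterator, Tuple


class Pipe:
    def __init__(self, path: str, steps: Tuple[Tuple[str, Callable], ...] = ()):
        self._path = path
        self._steps = steps  # tuple of (kind, func); no file data is stored

    def _with(self, kind: str, func: Callable) -> "Pipe":
        # Every builder method returns a new Pipe and leaves this one untouched.
        return Pipe(self._path, self._steps + ((kind, func),))

    def filter(self, func: Callable) -> "Pipe":
        return self._with("filter", func)

    def modifier(self, func: Callable) -> "Pipe":
        return self._with("modifier", func)

    def register_signal(self, func: Callable) -> "Pipe":
        return self._with("signal", func)

    def __iter__(self) -> Iterator[int]:
        # A generator function: the file is opened on the first next() call
        # and read one line per item requested.
        with open(self._path) as lines:
            for line in lines:
                if not line.strip():
                    continue  # ignore blank lines
                value, keep = int(line), True
                for kind, func in self._steps:  # steps run in registration order
                    if kind == "filter":
                        keep = func(value)
                        if not keep:
                            break
                    elif kind == "modifier":
                        value = func(value)
                    else:  # signal
                        func(value)
                if keep:
                    yield value
```

Because `__iter__` is a generator function, every `iter(pipe)` call starts a fresh pass over the file, which is what lets the same instance be listed twice.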

Examples for file with numbers [1, 2, 3, 4]:

def write_to_file_and_print(number):
    with open('/path/to/other/file', 'a+') as f:
        # write() needs a string; store one number per line
        f.write(f'{number}\n')
    print(number)

pipe = Pipe('/path/to/file')
pipe = pipe.filter(lambda x: x % 2 == 0)
pipe = pipe.modifier(lambda x: x ** 2)
pipe = pipe.register_signal(write_to_file_and_print)

for i in pipe:
    print(i)
4
4
16
16
with open('/path/to/other/file') as f:
    for line in f:
        print(line.strip())
4
16

FileIntegerList Advanced Filtering

The instance that results from adding two or more other instances will apply each instance's filters individually:

# has numbers 3, 6, 9, 20
>>> file_list_1 = FileIntegerList('/path/to/file1')
# has numbers 10, 12, 15, 20
>>> file_list_2 = FileIntegerList('/path/to/file2')
>>> file_list_1 = file_list_1.filter(lambda x: x % 2 == 0)
>>> file_list_2 = file_list_2.filter(lambda x: x % 3 == 0)
>>> file_list = file_list_1 + file_list_2
>>> file_list = file_list.filter(lambda x: x % 5 == 0)

# This will print the numbers from file_list_1 filtered only by the filters
# applied to file_list_1 and the filters applied to file_list and the numbers
# from file_list_2 filtered only by the filters applied to file_list_2 and
# the filters applied to file_list
>>> print(list(file_list))
[20, 15]

FileIntegerList

Write a class FileIntegerList that gets at init a single parameter, a disk path to a file with an integer on each line, e.g.

4
3
1532
13
41

Considering the file above, an instance of FileIntegerList should behave in the following way:

When we print an instance of the file, only the first three numbers are shown

>>> file_list = FileIntegerList('/path/to/file')
>>> print(file_list)
# shows only the first three numbers, if any, when the instance is printed
FileIntegerList: 4, 3, 1532, ...

The file is opened and read only when the instance is evaluated

# the file is not read when the instance is instantiated
>>> file_list = FileIntegerList('/path/to/file')
# only now is the file opened and read
>>> numbers = list(file_list)
>>> print(numbers)
[4, 3, 1532, 13, 41]

>>> file_list = FileIntegerList('/path/to/file')
>>> for integer in file_list:
        print(integer)
4
3
1532
13
41

After the file has been read once, when the instance was first evaluated, it is not read again on a second evaluation

>>> file_list = FileIntegerList('/path/to/file')
# the file is opened and read
>>> for number in file_list:
        print(number)
# the file is not opened again
>>> for number in file_list:
        print(number)

The class has a .filter method that takes a single argument: a callable that receives an integer from the file and returns True if it should be kept or False if it should be omitted. The method .filter returns an instance of FileIntegerList, so filter operations can be chained as in the example below. The method .filter does not open the file for reading; the file is opened only when the instance is evaluated. Filtering can be done both before and after the file has been loaded into memory.

>>> file_list = FileIntegerList('/path/to/file')
>>> file_list = file_list.filter(lambda x: x % 2 == 0).filter(lambda x: x % 3 == 0)
>>> print(file_list)

An instance can be evaluated to True or False in an if statement

# when the file is empty OR when the file does not exist
>>> file_list = FileIntegerList('/path/to/empty/file')
>>> if not file_list:
        print('The list is empty')
The list is empty

# when the file has items
>>> file_list = FileIntegerList('/path/to/nonempty/file')
>>> if file_list:
        print('The list is not empty')
The list is not empty

Two instances can be added together to form a new list

>>> file_list_1 = FileIntegerList('/path/to/file1')
>>> file_list_2 = FileIntegerList('/path/to/file2')
# The files /path/to/file1 and /path/to/file2 are not opened when
# this operation is performed
>>> file_list = file_list_1 + file_list_2
# Only now the files /path/to/file1 and /path/to/file2 are opened and read
>>> numbers = list(file_list)

On ADD, when the instances already have filters, all filters will be applied to the final dataset

# has numbers 3, 6, 9
>>> file_list_1 = FileIntegerList('/path/to/file1')
# has numbers 10, 12, 15
>>> file_list_2 = FileIntegerList('/path/to/file2')
>>> file_list_1 = file_list_1.filter(lambda x: x % 2 == 0)
>>> file_list_2 = file_list_2.filter(lambda x: x % 3 == 0)
>>> file_list = file_list_1 + file_list_2

# This will print the numbers that are divisible by 2 and 3, FROM BOTH FILES
# The filters from both file_list_1 and file_list_2 will apply to the final
# dataset, the list of numbers from both file_list_1 and file_list_2
>>> print(list(file_list))
[6, 12]

I can use sum function on a FileIntegerList instance

>>> file_list = FileIntegerList('/path/to/file')
>>> print(sum(file_list))
1593

I can use the keyword in to test the membership of an element

>>> file_list = FileIntegerList('/path/to/file')
>>> print(10 in file_list)
True
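
Most of the behaviors listed in this section fall out of a lazy `__iter__` plus a cache. The sketch below is one possible shape, under a few assumptions: the constructor accepts several paths internally (a choice made here so `+` can stay lazy), `_load` and `_cache` are made-up names, and addition follows the basic rule of this section where all filters apply to the combined data. It is a starting point, not a complete answer.

```python
from itertools import islice


class FileIntegerList:
    def __init__(self, *paths, filters=()):
        self._paths = paths      # one path normally, several after `+`
        self._filters = filters  # predicates, applied lazily on iteration
        self._cache = None       # filled on first evaluation

    def _load(self):
        if self._cache is None:  # the files are read at most once per instance
            numbers = []
            for path in self._paths:
                try:
                    with open(path) as f:
                        numbers.extend(int(line) for line in f if line.strip())
                except FileNotFoundError:
                    pass  # a missing file behaves like an empty one
            self._cache = numbers
        return self._cache

    def __iter__(self):
        return (n for n in self._load() if all(f(n) for f in self._filters))

    def filter(self, func):
        new = FileIntegerList(*self._paths, filters=self._filters + (func,))
        new._cache = self._cache  # reuse numbers already in memory, if any
        return new

    def __add__(self, other):
        # Nothing is opened here; the combined instance reads on evaluation.
        return FileIntegerList(*self._paths, *other._paths,
                               filters=self._filters + other._filters)

    def __bool__(self):
        return any(True for _ in self)

    def __str__(self):
        first = list(islice(self, 4))  # one extra item tells us whether to add "..."
        shown = ", ".join(str(n) for n in first[:3])
        return f"FileIntegerList: {shown}" + (", ..." if len(first) > 3 else "")
```

`sum(...)` and `in` need no extra methods because both fall back to `__iter__`. One known gap: `+` does not carry over numbers that were already cached, so the combined instance reads the files again.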

Context Manager

Write a context manager that behaves exactly like with open(...), but when the file is not found, instead of raising FileNotFoundError, it should raise a custom exception you have defined, MyFileNotFound.

The context manager must be written in two ways:

  • as a function
  • as a class

The context manager should catch any ValueError thrown inside its context and not let it bubble up.

The usage must be as follows:

with my_open('path', 'r') as f:
    ...
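
For the function variant, a minimal sketch could lean on the standard library's `contextlib.contextmanager`. The class-based variant and the hand-written replacement for `@contextmanager` are still left as the exercise. The parameter defaults shown here are assumptions.

```python
from contextlib import contextmanager


class MyFileNotFound(Exception):
    """Raised by my_open instead of FileNotFoundError."""


@contextmanager
def my_open(path, mode="r", **kwargs):
    try:
        f = open(path, mode, **kwargs)
    except FileNotFoundError as exc:
        raise MyFileNotFound(path) from exc
    try:
        yield f  # the body of the `with` block runs here
    except ValueError:
        pass  # swallow ValueError raised inside the `with` body
    finally:
        f.close()
```

An exception raised in the `with` body is re-raised inside the generator at the `yield`. Catching it there and returning normally is what tells the context manager to suppress it.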

Write a decorator that does the same thing as @contextmanager

Decorators

Write a decorator that catches any ValueError raised by the function it decorates:

@catch
def do():
    raise ValueError

>>> do() # <- does not raise

Write a decorator that catches an exception given as a parameter:

@catch(ValueError)
def do():
    raise ValueError


>>> do() # <- does not raise
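
The parameterized form needs one more level of nesting than a plain decorator, because `catch(ValueError)` must first return the actual decorator. Here is a minimal sketch, assuming the wrapped function should return None when the exception is swallowed. It does not cover the bare `@catch` form from the first task.

```python
import functools


def catch(*exceptions):
    """Return a decorator that swallows the given exception types."""

    def decorator(func):
        @functools.wraps(func)  # keep the wrapped function's name and docstring
        def wrapper(*args, **kwargs):
            try:
                return func(*args, **kwargs)
            except exceptions:  # a tuple of exception classes is valid here
                return None  # the caught exception is silently dropped

        return wrapper

    return decorator


@catch(ValueError)
def do():
    raise ValueError
```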

Write a class Catch that can be used in the following way:

class Catch:
   ...

@Catch(ValueError)
def do():
    raise ValueError

>>> do() # <- does not raise