@ynonp
Last active January 19, 2024 09:41
Python Exercises

Learn Python With Me

My (Hebrew-speaking) Python course is available online at: https://www.tocode.co.il/bundles/python

If you speak the language, be sure to drop by and say hello.

Syntax Review

  1. Write a program that asks the user for a number (integer only) and prints the sum of its digits

  2. Write a program that takes a file name as a command-line argument, counts how many times each word appears in the file, and prints the word that appears most often (along with its count)
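As a rough sketch of the two building blocks these exercises rely on (the helper names `digit_sum` and `most_common_word` are made up for illustration, and treating a "word" as anything separated by whitespace is an assumption):

```python
from collections import Counter


def digit_sum(number):
    """Return the sum of the decimal digits of an integer (sign ignored)."""
    return sum(int(digit) for digit in str(abs(number)))


def most_common_word(text):
    """Return (word, count) for the most frequent whitespace-separated word."""
    counts = Counter(text.split())
    return counts.most_common(1)[0]


print(digit_sum(1234))                  # 10
print(most_common_word("a b a c b a"))  # ('a', 3)
```

Reading the number with `input()` and the file name from `sys.argv` is left to the actual programs.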

Text Processing

Part 1

  1. From the following text, print only the lines that start with a word that has only uppercase characters:
hello - don't print me
HELLO - but I'm ok
Im a line that shouldn't be printed
BUT I'm a line that should
  2. Given an input file of the form:
# one of these shells.

/bin/bash
/bin/csh

Translate each file path to windows style path and print:

# one of these shells.

C:\bin\bash
C:\bin\csh

Part 2

From the following log file:

167.86.115.113 - - [26/Dec/2019:00:25:59 +0200] "GET / HTTP/1.1" 200 1993 "-" "Mozilla/5.0 (Windows NT 5.1; rv:9.0.1) Gecko/20100101 Firefox/9.0.1"
167.86.115.113 - - [26/Dec/2019:00:26:01 +0200] "GET /HNAP1/ HTTP/1.1" 404 1772 "https://178.79.150.27/" "Mozilla/5.0 (Windows NT 5.1; rv:9.0.1) Gecko/20100101 Firefox/9.0.1"
167.86.115.113 - - [26/Dec/2019:00:26:01 +0200] "GET /hudson/script HTTP/1.1" 404 1772 "https://178.79.150.27/" "Mozilla/5.0 (Windows NT 5.1; rv:9.0.1) Gecko/20100101 Firefox/9.0.1"
167.86.115.113 - - [26/Dec/2019:00:26:02 +0200] "GET /script HTTP/1.1" 404 1772 "https://178.79.150.27/" "Mozilla/5.0 (Windows NT 5.1; rv:9.0.1) Gecko/20100101 Firefox/9.0.1"
84.110.208.186 - - [26/Dec/2019:00:37:57 +0200] "-" 408 5165 "-" "-"
77.124.5.143 - - [26/Dec/2019:01:32:58 +0200] "-" 408 543 "-" "-"
23.95.84.74 - - [26/Dec/2019:01:41:20 +0200] "-" 408 1373 "-" "-"
169.197.108.6 - - [26/Dec/2019:02:23:56 +0200] "GET / HTTP/1.1" 200 1814 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.113 Safari/537.36"
52.41.211.72 - - [26/Dec/2019:04:23:07 +0200] "HEAD / HTTP/1.1" 200 1899 "-" "Go-http-client/1.1"
66.249.66.10 - - [26/Dec/2019:04:57:14 +0200] "GET / HTTP/1.1" 200 2406 "-" "Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2272.96 Mobile Safari/537.36 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
157.55.39.2 - - [26/Dec/2019:05:22:21 +0200] "GET /robots.txt HTTP/1.1" 404 2274 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
157.55.39.2 - - [26/Dec/2019:05:22:22 +0200] "GET /robots.txt HTTP/1.1" 404 2274 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
157.55.39.70 - - [26/Dec/2019:05:22:31 +0200] "GET / HTTP/1.1" 200 2495 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
35.175.224.215 - - [26/Dec/2019:07:00:51 +0200] "GET / HTTP/1.0" 400 0 "-" "-"
77.124.127.86 - - [26/Dec/2019:07:59:28 +0200] "-" 408 543 "-" "-"
77.124.127.86 - - [26/Dec/2019:07:59:28 +0200] "-" 408 543 "-" "-"
66.249.66.40 - - [26/Dec/2019:08:04:46 +0200] "GET / HTTP/1.1" 200 2406 "-" "Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2272.96 Mobile Safari/537.36 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
84.110.208.186 - - [26/Dec/2019:08:06:07 +0200] "-" 408 543 "-" "-"
169.197.108.38 - - [26/Dec/2019:08:44:04 +0200] "GET / HTTP/1.1" 200 1814 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.113 Safari/537.36"
45.79.152.7 - - [26/Dec/2019:09:57:12 +0200] "GET / HTTP/1.0" 200 2977 "-" "-"
84.110.208.186 - - [26/Dec/2019:10:00:20 +0200] "-" 408 156 "-" "-"
169.197.108.42 - - [26/Dec/2019:10:02:20 +0200] "GET / HTTP/1.1" 200 1814 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.113 Safari/537.36"
84.110.208.186 - - [26/Dec/2019:10:03:56 +0200] "-" 408 543 "-" "-"
84.110.208.186 - - [26/Dec/2019:10:05:09 +0200] "-" 408 543 "-" "-"
45.79.152.7 - - [26/Dec/2019:10:11:50 +0200] "HEAD / HTTP/1.1" 200 2065 "-" "Mozilla"
45.79.152.7 - - [26/Dec/2019:10:12:07 +0200] "HEAD / HTTP/1.1" 200 2065 "-" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9a3pre) Gecko/20070330"
185.3.145.39 - - [26/Dec/2019:10:15:01 +0200] "-" 408 543 "-" "-"
84.110.208.186 - - [26/Dec/2019:10:24:10 +0200] "-" 408 543 "-" "-"
84.110.208.186 - - [26/Dec/2019:10:24:10 +0200] "-" 408 543 "-" "-"
84.110.208.186 - - [26/Dec/2019:10:25:40 +0200] "-" 408 156 "-" "-"
84.110.208.186 - - [26/Dec/2019:10:25:40 +0200] "-" 408 156 "-" "-"
84.110.208.186 - - [26/Dec/2019:10:25:40 +0200] "-" 408 156 "-" "-"
84.110.208.186 - - [26/Dec/2019:10:25:40 +0200] "-" 408 156 "-" "-"
84.110.208.186 - - [26/Dec/2019:10:32:12 +0200] "-" 408 156 "-" "-"
185.3.145.39 - - [26/Dec/2019:10:32:19 +0200] "-" 408 543 "-" "-"
82.81.7.123 - - [26/Dec/2019:10:32:24 +0200] "-" 408 4720 "-" "-"
84.110.208.186 - - [26/Dec/2019:10:34:22 +0200] "-" 408 543 "-" "-"
84.110.208.186 - - [26/Dec/2019:10:34:22 +0200] "-" 408 543 "-" "-"
185.3.145.39 - - [26/Dec/2019:10:44:27 +0200] "-" 408 543 "-" "-"
84.110.208.186 - - [26/Dec/2019:10:46:26 +0200] "-" 408 543 "-" "-"
  1. Print a list of all IP addresses that sent partial HTTP requests (server response was 408). For each address print how many times it tried to access the server.

  2. Print a list of all URLs for which the server returned a 404 error. Sort that list by URL access frequency.
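One possible starting point for both questions is a regular expression that pulls the IP, request and status code out of each line, combined with `collections.Counter`. This is only a sketch: the names `LINE_RE`, `parse_line` and `count_by_status` are hypothetical, and it assumes "how many times it tried to access the server" means the number of 408 entries per address.

```python
import re
from collections import Counter

# Captures the client IP, the quoted request and the status code of one entry.
LINE_RE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]+\] "([^"]*)" (\d{3}) ')


def parse_line(line):
    """Return (ip, request, status) or None if the line does not match."""
    match = LINE_RE.match(line)
    if match is None:
        return None
    ip, request, status = match.groups()
    return ip, request, int(status)


def count_by_status(lines, status):
    """Count entries per IP address for the given status code."""
    counts = Counter()
    for line in lines:
        parsed = parse_line(line)
        if parsed and parsed[2] == status:
            counts[parsed[0]] += 1
    return counts


sample = [
    '84.110.208.186 - - [26/Dec/2019:00:37:57 +0200] "-" 408 5165 "-" "-"',
    '77.124.5.143 - - [26/Dec/2019:01:32:58 +0200] "-" 408 543 "-" "-"',
    '84.110.208.186 - - [26/Dec/2019:08:06:07 +0200] "-" 408 543 "-" "-"',
    '35.175.224.215 - - [26/Dec/2019:07:00:51 +0200] "GET / HTTP/1.0" 400 0 "-" "-"',
]
for ip, hits in count_by_status(sample, 408).most_common():
    print(ip, hits)
```

The same `parse_line` output can feed the 404 question: split the request on whitespace, take the second field as the URL, and count it in another `Counter`.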

Part 3

Given the following input:

lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> mtu 16384
	options=1203<RXCSUM,TXCSUM,TXSTATUS,SW_TIMESTAMP>
	inet 127.0.0.1 netmask 0xff000000 
	inet6 ::1 prefixlen 128 
	inet6 fe80::1%lo0 prefixlen 64 scopeid 0x1 
	nd6 options=201<PERFORMNUD,DAD>
gif0: flags=8010<POINTOPOINT,MULTICAST> mtu 1280
en0: flags=8863<UP,BROADCAST,SMART,RUNNING,SIMPLEX,MULTICAST> mtu 1500
	ether f4:0f:24:29:df:4d 
	inet6 fe80::1cb5:1689:1826:cc7b%en0 prefixlen 64 secured scopeid 0x4 
	inet 10.176.85.19 netmask 0xffffff00 broadcast 10.176.85.255
	nd6 options=201<PERFORMNUD,DAD>
	media: autoselect
	status: active
en1: flags=963<UP,BROADCAST,SMART,RUNNING,PROMISC,SIMPLEX> mtu 1500
	options=60<TSO4,TSO6>
	ether 06:00:58:62:a3:00 
	media: autoselect <full-duplex>
	status: inactive
p2p0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 2304
	ether 06:0f:24:29:df:4d 
	media: autoselect
	status: inactive

Create a CSV file from the data as follows:

interface,inet,status
lo0,127.0.0.1,
gif0,,
en0,10.176.85.19,active
en1,,inactive
p2p0,,inactive
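A minimal sketch of one way to approach this, assuming that an interface block starts on an unindented line of the form `name: ...` and that the indented lines below it belong to that interface. The function names `interfaces_to_rows` and `rows_to_csv` are invented for the example.

```python
import csv
import io
import re


def interfaces_to_rows(text):
    """Turn ifconfig-style text into one dict per interface."""
    rows = []
    current = None
    for line in text.splitlines():
        header = re.match(r"^(\w+): ", line)
        if header:
            # An unindented "name: ..." line opens a new interface block.
            current = {"interface": header.group(1), "inet": "", "status": ""}
            rows.append(current)
            continue
        fields = line.split()
        if current is None or len(fields) < 2:
            continue
        if fields[0] == "inet":
            current["inet"] = fields[1]
        elif fields[0] == "status:":
            current["status"] = fields[1]
    return rows


def rows_to_csv(rows):
    """Render the rows as CSV text with a header line."""
    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=["interface", "inet", "status"], lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


sample = (
    "lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> mtu 16384\n"
    "\tinet 127.0.0.1 netmask 0xff000000 \n"
    "\tinet6 ::1 prefixlen 128 \n"
    "en1: flags=963<UP,BROADCAST,SMART,RUNNING,PROMISC,SIMPLEX> mtu 1500\n"
    "\tether 06:00:58:62:a3:00 \n"
    "\tstatus: inactive\n"
)
print(rows_to_csv(interfaces_to_rows(sample)), end="")
```

Writing to a real file only needs the `io.StringIO` replaced by `open('interfaces.csv', 'w', newline='')`.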

Part 4

Extract the columns: Sample Temp (Kelvin) and Samp HC (J/mole-K) from the following input. Ignore all lines before [Data] starts:

Input (excerpt; the expected output below also includes rows that are not shown here):

[Header]
TITLE,xxxxxxxx.dat
BYAPP,HeatCapacity,3.8.14,1.7
INFO,2.29,MASS:Sample Mass (mg)
INFO,,MASSERR:Sample Mass Error (mg)
STARTUPAXIS, X, 8, LINEAR, AUTO
STARTUPAXIS, Y1, 10, LINEAR, AUTO
[Data]
Time Stamp (Seconds),Comment (),Puck Temp (Kelvin),System Temp (Kelvin),Sample Temp (Kelvin),Temp Rise (Kelvin),Samp HC (J/mole-K),Samp HC Err (J/mole-K),Addenda HC (uJ/K),Addenda HC Err (uJ/K)
41022684.5,,300.013,299.45,302.94831,6.228222,124.32831,0.85,8872.3999,4.515
41022942.84,,300.066,299.351,302.70334,5.6470045,123.05087,0.8188,8867.6331,4.515
41093651.75,,6.944,7.7512,6.9778851,0.13327843,0.037876781,0.0009362,3.4701983,0.005059
41094250.64,,6.69529,7.3013,6.7421403,0.1326991,0.03380048,0.0008134,3.1694344,0.004567
41094277.06,,6.68978,7.2966,6.7338561,0.12852694,0.033751509,0.0007938,3.1593382,0.004551

Expected Output:

302.94831,124.32831
302.70334,123.05087
302.69056,122.93617
6.9761565,0.037976428
6.9772032,0.037621912
6.9778851,0.037876781
6.7421403,0.03380048
6.7338561,0.033751509

Part 5

Write a Python script for mass-renaming music files according to labels. The script takes the existing name format of the files in the current directory and the desired output format, and prints a list of old -> new file name pairs.

Format can be any string that contains any number of the labels: <artist>, <album>, <track>, <title>, <year>

Assume file names match input format.

Sample list of input files:

Bob Dylan - 01 You're No Good (1962).mp3
Bob Dylan - 02 Talkin' New York (1962).mp3
Bob Dylan - 03 In My Time of Dyin' (1962).mp3
Bob Dylan - 04 Man of Constant Sorrow (1962).mp3
Bob Dylan - 05 Fixin' to Die (1962).mp3
Bob Dylan - 06 Pretty Peggy-O (1962).mp3

Sample input format:

<album> - <track> <title> (<year>).mp3

Sample output format:

Bob Dylan/<year> <album>/<track> <title>.mp3

Expected output:

Bob Dylan - 01 You're No Good (1962).mp3 -> Bob Dylan/1962 Bob Dylan/01 You're No Good.mp3
Bob Dylan - 02 Talkin' New York (1962).mp3 -> Bob Dylan/1962 Bob Dylan/02 Talkin' New York.mp3
Bob Dylan - 03 In My Time of Dyin' (1962).mp3 -> Bob Dylan/1962 Bob Dylan/03 In My Time of Dyin'.mp3
Bob Dylan - 04 Man of Constant Sorrow (1962).mp3 -> Bob Dylan/1962 Bob Dylan/04 Man of Constant Sorrow.mp3
Bob Dylan - 05 Fixin' to Die (1962).mp3 -> Bob Dylan/1962 Bob Dylan/05 Fixin' to Die.mp3
Bob Dylan - 06 Pretty Peggy-O (1962).mp3 -> Bob Dylan/1962 Bob Dylan/06 Pretty Peggy-O.mp3

Bonus: also rename the files and create required directories along the way
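One way to think about the format strings is to translate the input format into a regular expression with one named group per label, then substitute the captured values into the output format. The sketch below assumes each label appears at most once in the input format and that labels match as little text as possible; `format_to_regex` and `new_name` are hypothetical helper names.

```python
import re

LABEL_RE = re.compile(r"<(\w+)>")


def format_to_regex(fmt):
    """Compile a format such as '<album> - <track>' into a regex with named groups."""
    # re.split with a capturing group alternates literal text and label names.
    parts = LABEL_RE.split(fmt)
    pattern = ""
    for index, part in enumerate(parts):
        if index % 2 == 0:
            pattern += re.escape(part)        # literal text between labels
        else:
            pattern += f"(?P<{part}>.+?)"     # a label becomes a lazy named group
    return re.compile(pattern)


def new_name(old_name, in_fmt, out_fmt):
    """Return the file name rewritten from in_fmt to out_fmt."""
    match = format_to_regex(in_fmt).fullmatch(old_name)
    if match is None:
        raise ValueError(f"{old_name!r} does not match {in_fmt!r}")
    # Unknown labels in out_fmt raise IndexError from match.group().
    return LABEL_RE.sub(lambda label: match.group(label.group(1)), out_fmt)


IN_FMT = "<album> - <track> <title> (<year>).mp3"
OUT_FMT = "Bob Dylan/<year> <album>/<track> <title>.mp3"

old = "Bob Dylan - 01 You're No Good (1962).mp3"
print(old, "->", new_name(old, IN_FMT, OUT_FMT))
```

Listing the directory (for example with `os.listdir`) and performing the actual rename for the bonus are left out on purpose.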

OS Integration

Part 1

Write a Python program that takes a list of file extensions and prints all the files in the current directory that match the given extensions. The following extensions and their meanings should be supported:

  1. c should find and print all .c and .h file names
  2. py should find and print all .py and .pyc file names
  3. pl should find and print all .pl and .pm file names

Bonus: Read the extensions and their meanings from a configuration file.

Part 2

Write a program that takes a list of ini file names and a list of keys and prints the values of the given keys. A sample run looks like this:

$ python fetch.py --key name --key email users.ini teachers.ini cities.ini

And the files:

// file: users.ini
name=ynon
web = www.tocode.co.il
likes = python and stuff

// file: teachers.ini
name = joe
email = joe@gmail.com
color = red

// file: cities.ini
size=10
name=foo bar

Expected Result:

name: ynon
email: joe@gmail.com

Part 3

Write a Python program that calls ifconfig and splits its output into files according to the network interfaces it finds.

For example given the following ifconfig output:

en3: flags=8963 mtu 1500
        options=60
        ether 32:00:18:24:c0:00
        media: autoselect 
        status: inactive
p2p0: flags=8843 mtu 2304
        ether 06:38:35:47:96:24
        media: autoselect
        status: inactive

The program should create 2 files named en3 and p2p0, saving the first block to file en3 and the second one to p2p0.

Functions

Part 1

  1. Write a function that takes a number and returns the sum of its digits. Raise an exception if an argument of the wrong type is passed
  2. Write a function that returns the product of all input arguments. The function should ignore non-numeric arguments. Example Usage:
# returns 200:
mymul('foo', 'bar', 10, 20)

# returns 1:
mymul()

# returns 7:
mymul(7)
  3. Write a function that takes a minimum length (number) followed by any number of strings and returns only the strings that are longer than the provided number. Example Usage:
# returns the list: ['baby', 'more', 'time']
longer_than(3, 'hit', 'me', 'baby', 'one', 'more', 'time')
  4. Write a function groupby that takes a key-function and a list of items. The function should call key-function on every item and return a dictionary whose keys are the results of key-function and whose values are lists of the items that produced each key. Example Usage:
# returns: {'h': ['hello', 'hi', 'help', 'here'], 'b': ['bye']}
groupby(lambda s: s[0], 'hello', 'hi', 'help', 'bye', 'here')

Part 2

  1. Write a decorator named after5 that will ignore the decorated function the first 5 times it is called. Example Usage:
@after5
def doit(): print("Yo!")

# ignore the first 5 calls
doit()
doit()
doit()
doit()
doit()

# so only print yo once
doit()
  2. Calculation in the following fib function may take a long time. Implement a decorator that remembers old calculations so the function won't calculate a fib value more than once. Program Code:
@memoize
def fib(n):
    print("fib(%d)" % n)
    if n <= 2:
        return 1
    else:
        return fib(n-1) + fib(n-2)

print(fib(10))

Expected Output:

fib(10)
fib(9)
fib(8)
fib(7)
fib(6)
fib(5)
fib(4)
fib(3)
fib(2)
fib(1)
55
  3. Write a non-recursive implementation of the presented fib function using generators.

  4. Write a decorator called accepts that checks if a function was called with correct argument types. Usage example:

# make sure function can only be called with a float and an int
@accepts(float, int)
def pow(base, exp):
  pass

# raise AssertionError
pow('x', 10)
  5. Write a decorator called returns that checks that a function returns the expected type. Usage example:
@returns(str)
def same(word):
  return word

# works:
same('hello')

# raise AssertionError:
same(10)
  6. Use the above decorators to create a function that accepts two integers and returns a list.

  7. Write a decorator with_cwd that adds a first parameter named cwd to the function it receives. That parameter holds the current working directory. Example usage:

from pathlib import Path

@with_cwd
def create_file(cwd, filename):
    (Path(cwd) / Path(filename)).touch()


create_file('hello.txt')
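For the memoizing decorator described in this part, a minimal sketch could keep a dictionary that maps argument tuples to results. It assumes positional, hashable arguments only; the `calls` list is added purely to make the effect visible and is not part of the exercise.

```python
import functools


def memoize(func):
    """Cache results by positional arguments (which must be hashable)."""
    cache = {}

    @functools.wraps(func)
    def wrapper(*args):
        if args not in cache:
            cache[args] = func(*args)
        return cache[args]

    return wrapper


calls = []  # records every n for which the body of fib really ran


@memoize
def fib(n):
    calls.append(n)
    if n <= 2:
        return 1
    return fib(n - 1) + fib(n - 2)


print(fib(10))  # 55
print(calls)    # [10, 9, 8, 7, 6, 5, 4, 3, 2, 1]
```

The standard library already ships `functools.lru_cache` for the same purpose; writing it by hand is mainly useful as decorator practice.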

Part 3

  • Create a generator named uniq that takes an input sequence and yields only the unique values from it
  • Create a generator named dup that takes a sequence and yields only the non-unique values from it
  • Create a generator that takes a path and an extension and yields all files with that extension
  • Create a generator that takes a number and yields the digits of that number
  • Create a web spider generator. It takes a URL and first yields the HTML at that initial URL. Then it fetches all the links it finds in the HTML, yields their pages one by one, and continues recursively until no more new links are found. Use that generator to save a copy of your favorite web page. (use the requests module)
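The first two generators can be sketched with a set of values seen so far. The interpretation used here is an assumption: `uniq` yields each distinct value once, in order of first appearance, and `dup` yields each repeated value once, at the moment it first repeats. Both require hashable items.

```python
def uniq(sequence):
    """Yield each distinct value once, in order of first appearance."""
    seen = set()
    for item in sequence:
        if item not in seen:
            seen.add(item)
            yield item


def dup(sequence):
    """Yield each value that occurs more than once, when it first repeats."""
    seen = set()
    reported = set()
    for item in sequence:
        if item in seen and item not in reported:
            reported.add(item)
            yield item
        seen.add(item)


print(list(uniq([1, 2, 1, 3, 2])))    # [1, 2, 3]
print(list(dup([1, 2, 1, 3, 2, 1])))  # [1, 2]
```

Because both are generators, they also work on endless or very large inputs without building the whole result in memory.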

Classes

Part 1

  1. The following code assumes a class named Summer exists. Create that class so the code will work:
s = Summer()
t = Summer()

s.add(10, 20)
t.add(50)
s.add(30)

# should print 60
print(s.total)

# should print 50
print(t.total)
  2. Update your code so the calculation is only performed when getting the total value.

  3. The following code assumes a class named Widget which represents a thing that needs to be built. Building a widget should automatically trigger a build on all its dependencies. Implement Widget so the following code works:

luke    = Widget("Luke")
hansolo = Widget("Han Solo")
leia    = Widget("Leia")
yoda    = Widget("Yoda")
padme   = Widget("Padme Amidala")
anakin  = Widget("Anakin Skywalker")
obi     = Widget("Obi-Wan")
darth   = Widget("Darth Vader")
_all    = Widget("")


luke.add_dependency(hansolo, leia, yoda)
leia.add_dependency(padme, anakin)
obi.add_dependency(yoda)
darth.add_dependency(anakin)

_all.add_dependency(luke, hansolo, leia, yoda, padme, anakin, obi, darth)
_all.build()
# code should print: Han Solo, Padme Amidala, Anakin Skywalker, Leia, Yoda, Luke, Obi-Wan, Darth Vader
# (can print with newlines in between modules)
  4. Given a text file with information about artists and songs:
Joy Division - Love Will Tear Us Apart
Joy Division - New Dawn Fades
Pixies - Where Is My Mind
Pixies - Hey
Genesis - Mama

Write the required classes so the following code works:

music = MusicFile('/Users/ynonperek/music.txt')
print(music.artist('Joy Division').songs)

Part 2

  1. The program /sbin/ifconfig prints out all network interfaces on the host. Write a class that calls ifconfig, reads the interface list and allows scanning them using a for loop, i.e. the following code should print interface data:
for nif in (nif for nif in MachineNetworkInterfaces() if nif.ip != '127.0.0.1'):
    print(nif)
  2. Write a class named AddressBook that saves names and email addresses in a file. The following code should work (and create the file if it does not already exist):
with AddressBook('contacts.txt') as ab:
    ab.add('Eve', 'eve@gmail.com')
    ab.add('Alice', 'alice@walla.co.il')

with AddressBook('contacts.txt') as ab:
    print(ab.email('Eve'))
  3. Modify the class so the following will also work (Hint: read about __getitem__):
with AddressBook('contacts.txt') as ab:
    print(ab['Eve'])
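A sketch of how the `with` support and the indexing could fit together is shown below. The on-disk format (one `name,email` pair per line) and the decision to save on every exit are assumptions made for the example, not requirements of the exercise.

```python
import os


class AddressBook:
    def __init__(self, path):
        self.path = path
        self.entries = {}

    def __enter__(self):
        # Load existing contacts, if the file is already there.
        if os.path.exists(self.path):
            with open(self.path, encoding="utf-8") as f:
                for line in f:
                    name, _, email = line.rstrip("\n").partition(",")
                    if name:
                        self.entries[name] = email
        return self

    def __exit__(self, exc_type, exc, traceback):
        # Write everything back; this also creates the file on first use.
        with open(self.path, "w", encoding="utf-8") as f:
            for name, email in self.entries.items():
                f.write(f"{name},{email}\n")
        return False  # do not swallow exceptions raised inside the with block

    def add(self, name, email):
        self.entries[name] = email

    def email(self, name):
        return self.entries[name]

    def __getitem__(self, name):
        # Enables ab['Eve'] as a shortcut for ab.email('Eve').
        return self.email(name)
```

With this class, the snippets from the exercise run unchanged; a name containing a comma would break the simple file format, so a real solution might prefer the `csv` or `json` module.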

Part 3

Read the following exercise: https://adventofcode.com/2018/day/4

  1. Create a Guard class and a MostSleepyGuardStrategy class to solve part 1. The strategy should take a list of guards and return the "best" guard according to its logic.

  2. Create a new class called MostFrequentlyAsleepInTheSameMinuteStrategy to solve part 2. Modify the Guard class if needed.

Part 4

Given a log file showing user visits to a website:

88.198.66.230 - - [20/Apr/2017:07:53:10 +0300] "GET /?feed=rss2&p=18 HTTP/1.1" 200 836 "-" "Mozilla/5.0 (compatible; MJ12bot/v1.4.7; http://mj12bot.com/)"
88.198.66.230 - - [20/Apr/2017:07:53:12 +0300] "GET /?feed=rss2&p=199 HTTP/1.1" 200 834 "-" "Mozilla/5.0 (compatible; MJ12bot/v1.4.7; http://mj12bot.com/)"
176.58.103.66 - - [20/Apr/2017:07:53:07 +0300] "POST /wp-cron.php?doing_wp_cron=1492663987.4470920562744140625000 HTTP/1.0" 200 166 "-" "WordPress/4.1.16; http://mysongbook.co.il"
88.198.66.230 - - [20/Apr/2017:07:53:14 +0300] "GET /?feed=rss2&page_id=2 HTTP/1.1" 200 828 "-" "Mozilla/5.0 (compatible; MJ12bot/v1.4.7; http://mj12bot.com/)"
88.198.66.230 - - [20/Apr/2017:07:53:16 +0300] "GET /?m=201605 HTTP/1.1" 200 3227 "-" "Mozilla/5.0 (compatible; MJ12bot/v1.4.7; http://mj12bot.com/)"
88.198.66.230 - - [20/Apr/2017:07:53:19 +0300] "GET /?p=162 HTTP/1.1" 200 3507 "-" "Mozilla/5.0 (compatible; MJ12bot/v1.4.7; http://mj12bot.com/)"

Write code that allows analyzing the log to answer the following questions:

  1. How many different IP addresses visited the website? On a given date? In a given date range?
  2. Which robots visited the site?
  3. Which URLs returned a 404 result?

We also want to be able to easily add more questions in the future, and in a way that if the format of the log file changes we won't need to change the analyzers' code.

Therefore your task is to implement the following classes:

  1. class LogParser that takes a log file
  2. class LogEntry that represents a single entry. Its __init__ should take a line from the log file.
  3. analyzer classes, each providing the functions:
  • update(entry) takes a log entry and updates the internal data structure of the analyzer
  • stats - prints a short summary of the statistics it collected
    For example RobotsTracker's stats will print "The following robots visited your site: mj12"
  4. Write code in LogParser that iterates over the log file, creates log entries and updates the analyzers.

Remember: if the log file format changes, you only need to change LogEntry to match the new format.
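The division of responsibilities described above could look roughly like this. It is a sketch under several assumptions: `LogParser` accepts any iterable of lines (an open file works), the analyzers are passed in as a list, and the class names `UniqueVisitors` and `NotFoundUrls` as well as the `run` method are invented for the example.

```python
import re
from collections import Counter


class LogEntry:
    """The only class that knows what a log line looks like."""

    _LINE_RE = re.compile(
        r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
        r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
    )

    def __init__(self, line):
        match = self._LINE_RE.match(line)
        if match is None:
            raise ValueError(f"unrecognized log line: {line!r}")
        self.ip = match["ip"]
        self.time = match["time"]
        self.status = int(match["status"])
        self.agent = match["agent"]
        parts = match["request"].split()
        self.url = parts[1] if len(parts) >= 2 else None


class UniqueVisitors:
    def __init__(self):
        self.ips = set()

    def update(self, entry):
        self.ips.add(entry.ip)

    def stats(self):
        print(f"{len(self.ips)} different IP addresses visited the site")


class NotFoundUrls:
    def __init__(self):
        self.urls = Counter()

    def update(self, entry):
        if entry.status == 404 and entry.url:
            self.urls[entry.url] += 1

    def stats(self):
        print("URLs that returned 404:", ", ".join(self.urls) or "none")


class LogParser:
    def __init__(self, lines, analyzers):
        self.lines = lines
        self.analyzers = analyzers

    def run(self):
        # Every analyzer sees every entry and never touches the raw line.
        for line in self.lines:
            entry = LogEntry(line)
            for analyzer in self.analyzers:
                analyzer.update(entry)
```

Adding a new question then means writing one more class with `update` and `stats`, without touching `LogEntry` or `LogParser`.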

Modules

  1. Given the test file test_stack.py, write the module required to make the test pass:
import unittest
from lib import mystack

class TestMyStack(unittest.TestCase):
    def setUp(self):
        mystack.add_item(10)
        mystack.add_item(20)
        mystack.add_item(22, 33)

    def test_flow(self):
        self.assertEqual(mystack.pop_item(), 33)
        self.assertEqual(mystack.pop_item(), 22)
        self.assertEqual(mystack.count_items(), 2)
        while mystack.pop_item(): pass

        self.assertEqual(mystack.count_items(), 0)
  2. Given the test file test_contacts.py, write the module required to make the test pass:
import unittest
from lib.contacts import Contacts

class TestContacts(unittest.TestCase):
    def setUp(self):
        self.home_book = Contacts()
        self.work_book = Contacts()

    def test_create_and_search(self):
        self.home_book.add('Tom', { 'lives_in': 'USA', 'email': 'tomthecat@gmail.com' })
        self.home_book.add('Bob', { 'lives_in': 'USA', 'email': 'bob@gmail.com' })
        self.work_book.add('Mike', { 'lives_in': 'Marks', 'email': 'mike@gmail.com' })

        results = self.home_book.contacts_by_lives_in('USA')
        self.assertTrue('Tom' in results)
        self.assertTrue('Bob' in results)
        self.assertEqual(len(results), 2)
  3. Write a module for finding anagrams in a list of words. Two words are anagrams if they have the same letters, for example add and dad are anagrams. The module should provide a class named Anagramer with the methods:

    • __init__(wordsfile) takes a words file and parses it into an anagram repository
    • get_random_anagram that returns a random anagram from the repository
    • list_anagrams(word) that lists all anagrams for a given word
  4. Write unit tests to test Anagramer.

Exceptions

  1. Write a program that takes a file name from the command line and prints the number of lines in that file. If the provided argument does not refer to a file, an error should be printed. Example Usage:
python countlines.py /etc/shells
11

python countlines.py /foo/bar
Sorry, file /foo/bar not found
  2. The following code assumes a class named ImageFile exists. Implement that class so the test passes:
import unittest

class TestImageFile(unittest.TestCase):
    def test_good_ext(self):
        try:
            img = ImageFile("file.png")
        except InvalidImageExt:
            self.fail("png should be a valid file extension")

    def test_bad_ext(self):
        with self.assertRaises(InvalidImageExt):
            img = ImageFile("file.mp3")

unittest.main()
  3. Write a module named exceptionlogger so each program importing that module will automatically create a file named exception.log and save the details of each unhandled exception in that file.

  4. Write a function named readtype that takes a type and reads data from the user until data of the given type is found. Example Usage:

x = readtype(int)
print(x)

t = readtype(str, prompt='Who are you? ')
print(t)
  5. Write the class AddressBook so the following code works:
c = AddressBook()

c.add(name='ynon', email='ynon@ynonperek.com', likes='red')
c.add(name='bob', email='bob@gmail.com', likes='blue')
c.add(name='ynon', email='ynon@gmail.com', likes='blue')

c.find_by(name='ynon')
# returns:
# [
#   {'name': 'ynon', 'email': 'ynon@ynonperek.com', 'likes': 'red'},
#   {'name': 'ynon', 'email': 'ynon@gmail.com', 'likes': 'blue'}
# ]

c.find_by(likes='blue')
# returns:
# [
#   {'name': 'bob', 'email': 'bob@gmail.com', 'likes': 'blue'},
#   {'name': 'ynon', 'email': 'ynon@gmail.com', 'likes': 'blue'}
# ]

Code should be generic enough so if new fields are added everything still works.

  6. The following code should have created a triplet of blank squares for a tic-tac-toe game, but it has a strange bug. What went wrong and how would you fix it?
BLANK = ['', '', '']
g = [BLANK] * 3
g[0][0] = 'x'

Multiprocessing

  1. Given the following passwords.md5 file that contains md5-hashed passwords, write a Python program that finds as many original passwords as you can. Can you speed it up using multiple processes? What's the ideal number of processes to use?

  2. Read documentation on Manager and afterwards update the program so after finding a password it'll also print how many passwords are left to search.

passwords.md5 file:

e10adc3949ba59abbe56e057f20f883e
5f4dcc3b5aa765d61d8327deb882cf99
827ccb0eea8a706c4c34a16891f84e7b
25d55ad283aa400af464c76d713c07ad
d8578edf8458ce06fbc5bb76a58c5ca4
9cdda062c5bd8e38e3a576c61208e1d4
25f9e794323b453885f5181f1b624d0b
81dc9bdb52d04dc20036dbd8313ed055
276f8db0b86edaa7fc805516c852c889
8621ffdbc5698829397d97767ac13db3
37b4e2d82900d5e94b8da524fbeb33c0
fcea920f7412b5da7be0cf42b8c93759
d0763edaa9d9bd2a9516280e9044d885
0d107d09f5bbe40cade3de5c71e9e9b7
8c62573d0982c91aad72e634ed6903f2
e99a18c428cb38d5f260853678922e03
96e79218965eb72c92a549dd5a330112
bee783ee2974595487357e195ef38ca2
d5bb5c0168e2952b6806d6a976b3d98a
9df3b01c60df20d13843841ff0d4482c
3bf1114a986ba87ed28fc1b5884fc2f8
eb0a191797624dd3a48fa681d3061212
c3add14b93c46349dfcb10e2d1c311a0
0acf4539a14b3aa27deeb4cbdf6e989f
84d961568a65073a3bcf0eb216b2a576
7d0710824ff191f6a0086a7e3891641e
e6c25c740b384de04637269b7715e067
4297f44b13955235245b2497399d7a93
ec0e2603172c73a8b644bb9456c1ff6e
5fcfd41e547a12215b173ff47fdd3739
  3. Write a web spider that will fetch an HTML page, then put all the links it finds in a Queue and fetch them too, slowly crawling the entire site. Keep the original directory structure, and make sure not to re-download files that were already fetched. Check your spider by downloading all of Python's PEP documents.

  4. What's the problem in the following code? Fix it!

import multiprocessing
import random

words = ['I', 'can', 'see', 'the', 'mountains']

def write_word_to_file(f):
    w = random.choice(words)
    f.write("{}\n".format(w))

def run(fl, n):
    for _ in range(n):
        write_word_to_file(fl)

q = multiprocessing.Pool()
f = open('out.txt', 'a')

procs = [ multiprocessing.Process(target=run, args=(f,5)) for _ in range(5) ]
for p in procs: p.start()
for p in procs: p.join()

f.close()
  5. Write a program that calculates all prime numbers from 1 to 100_000 and writes them to a file (not necessarily in order):
  • First use a single process
  • Then modify to use multiple processes, each writes to the file on its own
  • Finally modify to use a Queue where many processes find prime numbers, but only one process writes the results to a file.

Which option came out the fastest?

  6. The following code prints words to a file intertwined:
import multiprocessing
import random
import os

words = ['I', 'can', 'see', 'the', 'mountains']

def run(n, i):    
    with open('out.txt', 'a') as f:
        for _ in range(n):
            f.write(words[i] + '\n')
            f.flush()

q = multiprocessing.Pool()
f = open('out.txt', 'a')

procs = [ multiprocessing.Process(target=run, args=(400,i)) for i in range(5) ]
for p in procs: p.start()
for p in procs: p.join()

f.close()

Use a Lock to force each block of words to appear on its own. Can you think of a better way to achieve the same result?

AsyncIO

  1. Use aiofiles and write a program that searches a specified text in multiple files. When you find the text print the name of the file and each line that contains that text.

  2. Create an async generator from the program in (1) so we'll be able to write:

async for [filename, line] in find_in_files('data/*', 'tea'):
  print(f'{filename}: {line}', end='')
  3. Use aiohttp to create a web spider: The spider will fetch URLs, parse each HTML file to find the links and then download all the links too. Check your spider by downloading all of Python's PEP documents.

  4. Use a semaphore to limit the number of concurrent requests to the server (so they won't lock us out)

  5. Use asyncssh and write a Python script that copies itself to an external server
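The semaphore idea can be tried out without any network access first. In the sketch below `asyncio.sleep` stands in for the real HTTP request, and `MAX_CONCURRENT`, `fetch` and `crawl` are hypothetical names; the `stats` dictionary exists only to show that the limit holds.

```python
import asyncio

MAX_CONCURRENT = 3  # hypothetical limit on simultaneous requests


async def fetch(url, semaphore, stats):
    # Only MAX_CONCURRENT coroutines can be inside this block at once.
    async with semaphore:
        stats["active"] += 1
        stats["peak"] = max(stats["peak"], stats["active"])
        await asyncio.sleep(0.01)  # placeholder for the real request
        stats["active"] -= 1
        return f"contents of {url}"


async def crawl(urls):
    semaphore = asyncio.Semaphore(MAX_CONCURRENT)
    stats = {"active": 0, "peak": 0}
    pages = await asyncio.gather(*(fetch(url, semaphore, stats) for url in urls))
    return pages, stats["peak"]


urls = [f"https://example.com/{i}" for i in range(10)]
pages, peak = asyncio.run(crawl(urls))
print(len(pages), "pages fetched, at most", peak, "at the same time")
```

In the aiohttp spider the `asyncio.sleep` call would be replaced by the actual request made inside the same `async with semaphore:` block.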

Unit Tests

  1. Given the following class:

import os

class File:
    def __init__(self, name):
        self.path = name

    def create(self):
        open(self.path, 'a').close()

    def rename(self, new_name):
        os.rename(self.path, new_name)
        self.path = new_name

Write test cases to verify:

  • can create files
  • can rename files
  • exception is raised when trying to create files without permissions
  • exception is raised when trying to rename a directory using this class

Fix the code if needed to make all tests pass.

  2. Given the mass renamer you wrote in the Text Processing exercise, write test code to verify:
  • The functionality described in the exercise (Text Processing Part 5) works
  • Can rename files that have a different name format
  • Code raises an exception if output format includes a label that does not appear in input format
  3. Given the following code:
from __future__ import print_function
import requests

def download(url, filename):
    r = requests.get(url, stream=True)
    if r.status_code == 200:
        with open(filename, 'wb') as f:
            for chunk in r:
                f.write(chunk)
    else:
        raise Exception(r.reason)


if __name__ == '__main__':
    url = 'https://fthmb.tqn.com/fgT8_8SUVjRLrOKMJp7OBD7R30o=/3372x2248/filters:fill(auto,1)/about/kitten-looking-at-camera-521981437-57d840213df78c583374be3b.jpg'
    download(url, 'kitten.jpg')

Write unit tests to:

  • verify code raises an exception when download fails
  • verify code writes content to file when download succeeds

Unit test code should not cause any network activity (i.e. should work offline).

  4. The following class randomizes a number and allows the user to guess what the number is:
from __future__ import print_function
import random

class NumberGuessingGame(object):
    def __init__(self):
        self._value = self.get_new_number()

    def guess(self, guess):
        if guess < self._value:
            print('too low')
        elif guess > self._value:
            print('too high')
        else:
            print('Bravo!')

    def get_new_number(self):
        return random.randint(1,1000)

Write unit test code to:

  • verify code writes "too low" when guess is below selected number
  • verify code writes "too high" when guess is above selected number
  • verify code writes "Bravo!" when guess is exactly selected number
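For the number guessing game, one way to make the tests deterministic is to replace `get_new_number` with a mock before the game object is created, and to capture what `guess` prints. The sketch below repeats the class so that it runs on its own; the helper name `guess_output` is invented, and in a real test file the same steps would live inside `unittest.TestCase` methods.

```python
import io
import random
from contextlib import redirect_stdout
from unittest import mock


class NumberGuessingGame:
    def __init__(self):
        self._value = self.get_new_number()

    def guess(self, guess):
        if guess < self._value:
            print('too low')
        elif guess > self._value:
            print('too high')
        else:
            print('Bravo!')

    def get_new_number(self):
        return random.randint(1, 1000)


def guess_output(secret, guess):
    """Play one guess against a fixed secret number and return the printed text."""
    # Patch the method on the class so __init__ picks up the fixed value.
    with mock.patch.object(NumberGuessingGame, "get_new_number", return_value=secret):
        game = NumberGuessingGame()
    captured = io.StringIO()
    with redirect_stdout(captured):
        game.guess(guess)
    return captured.getvalue().strip()


print(guess_output(500, 10))   # too low
print(guess_output(500, 900))  # too high
print(guess_output(500, 500))  # Bravo!
```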
NumPy

  1. Build a 1d array with 100 elements with values [2, 4, 6, 8, ... 200] (Hint: arange)

  2. Build a 2d (10, 10) array representing the multiplication table: (Hint: broadcasting)

array([[  1,   2,   3,   4,   5,   6,   7,   8,   9,  10],
       [  2,   4,   6,   8,  10,  12,  14,  16,  18,  20],
       [  3,   6,   9,  12,  15,  18,  21,  24,  27,  30],
       [  4,   8,  12,  16,  20,  24,  28,  32,  36,  40],
       [  5,  10,  15,  20,  25,  30,  35,  40,  45,  50],
       [  6,  12,  18,  24,  30,  36,  42,  48,  54,  60],
       [  7,  14,  21,  28,  35,  42,  49,  56,  63,  70],
       [  8,  16,  24,  32,  40,  48,  56,  64,  72,  80],
       [  9,  18,  27,  36,  45,  54,  63,  72,  81,  90],
       [ 10,  20,  30,  40,  50,  60,  70,  80,  90, 100]])
  3. From the above array (multiplication table), extract only the numbers in the diagonal: 1, 4, 9, ... , 100.

  4. Read about the module np.random, then create a random 1d-array of 10 integer elements, then print only the odd values. (Hint: Boolean mask)

  5. Create a random 2d-array of (10, 10) integer elements, then replace all the odd values with 0 (Hint: np.where).

  6. Randomize a 1-d array with values between 0 and 100, and then find the smallest value that is larger than 50.

  7. Randomize a 1-d array with values between 0 and 100, and then find its 5 largest unique values. (Hint: np.unique)

  8. Given two arrays: np.array([1, 2]) and np.array(['a', 'b']) print their cartesian product i.e.:

array([['1', 'a'],
       ['2', 'a'],
       ['1', 'b'],
       ['2', 'b']],
      dtype='<U21')

Hint: read about np.transpose, np.tile and np.repeat.

(*) 9. Write a function that takes a 2d numpy array that represents a game in Tic-Tac-Toe and checks if there's a winner. Example:

a = np.array([
  ['X', ' ', 'O'],
  [' ', 'X', 'O'],
  ['X', ' ', 'O']
])

is_winner(a)
True

Hint: Read about np.diag and np.fliplr

  10. https://adventofcode.com/2016/day/8 Hint: Read about np.roll