@alexanderchuranov
Last active January 3, 2016 07:39
Reading lines from standard input in Python and C++
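#!/bin/sh
# makebigfile: build a large test input named "bigfile" by repeatedly
# concatenating readlines.cc.

ECHO=/bin/echo

# multiply_file SOURCE DEST COUNT UPDATE_FREQUENCY
# Write COUNT copies of SOURCE to DEST, printing the remaining count
# every UPDATE_FREQUENCY copies.
multiply_file()
{
    source="${1}"
    dest="${2}"
    count="${3}"
    update_frequency="${4}"

    # Start from an empty file so that a rerun does not append to old data.
    : > "${dest}"

    while [ "${count}" -gt 0 ]
    do
        cat "${source}" >> "${dest}"
        count=$(( ${count} - 1 ))
        if [ "$(( ${count} % ${update_frequency} ))" -eq 0 ]
        then
            ${ECHO} -n "${count}..."
        fi
    done

    ${ECHO} "finished"
}

# 1000 copies of the source, then 1000 copies of that result:
# 1,000,000 copies of readlines.cc in total.
multiply_file readlines.cc bigfile1 1000 100
multiply_file bigfile1 bigfile 1000 100
rm bigfile1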
# Makefile: build both C++ binaries and run the benchmark.
all: readlines.clang readlines.gcc

readlines.clang: readlines.cc
	clang++ -O4 -stdlib=libc++ -o readlines.clang readlines.cc

readlines.gcc: readlines.cc
	g++ -O3 -o readlines.gcc readlines.cc

clean:
	rm -f readlines.clang readlines.gcc bigfile

test: all bigfile
	./readlines.py < bigfile
	./readlines.clang < bigfile
	./readlines.gcc < bigfile

bigfile: readlines.cc
	./makebigfile

.PHONY: clean test bigfile

Reading Lines in Python and C++

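This gist compares how fast a Python script and a C++ program, built once with clang++ (libc++) and once with g++, read lines of text from standard input. Each program counts the lines it reads and reports lines per second (LPS), measured in whole seconds of elapsed time.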

Sample output:

> make test  
clang++ -O4 -stdlib=libc++ -o readlines.clang readlines.cc  
g++ -O3 -o readlines.gcc readlines.cc  
./makebigfile  
900...800...700...600...500...400...300...200...100...0...finished  
900...800...700...600...500...400...300...200...100...0...finished  
./readlines.py < bigfile  
Read 29000000 lines in 7 seconds. LPS: 4142857  
./readlines.clang < bigfile  
Read 29000000 lines in 13 seconds. LPS: 2230769  
./readlines.gcc < bigfile  
Read 29000000 lines in 3 seconds. LPS: 9666666  
>
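
The LPS figures in the sample output are the line count divided by the whole-second elapsed time, truncated to an integer. As a quick sanity check, the sketch below recomputes them from the numbers printed above. The `samples` table and the `lines_per_second` helper are names made up for this illustration; they are not part of the gist.

```python
# Recompute the LPS column of the sample output.
# The (lines, seconds) pairs are copied from the sample run above;
# the names used here are hypothetical, for illustration only.
samples = {
    "readlines.py": (29000000, 7),
    "readlines.clang": (29000000, 13),
    "readlines.gcc": (29000000, 3),
}


def lines_per_second(lines, seconds):
    """Integer lines-per-second, as the C++ program computes it."""
    return lines // seconds


for name, (lines, seconds) in samples.items():
    print("{0}: {1} LPS".format(name, lines_per_second(lines, seconds)))
```

Because the elapsed time is a whole number of seconds, a run reported as 3 seconds carries a large relative error, so the ratios between the three results are best read as rough indications.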
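// readlines.cc: count lines on standard input and report lines per second.
#include <ctime>
#include <iostream>
#include <string>

using namespace std;

int main()
{
    // Decouple C++ streams from C stdio; this usually speeds up cin.
    ios::sync_with_stdio(false);

    // Request a 1 MiB input buffer. Whether pubsetbuf() has any effect
    // on cin is implementation-defined.
    char buffer[1048576];
    cin.rdbuf()->pubsetbuf(buffer, sizeof(buffer));

    int count(0);
    string line;
    time_t start = time(0);

    while (getline(cin, line))
        ++count;

    // time() has one-second resolution, so runs shorter than a second
    // print nothing.
    time_t seconds = time(0) - start;

    if (seconds > 0)
    {
        cout << "Read " << count << " lines in " << seconds << " seconds."
             << " LPS: " << count / seconds
             << endl;
    }
}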
#!/usr/bin/env python
# readlines.py: count lines on standard input and report lines per second.
import sys
import time

count = 0
start_time = time.time()

for line in sys.stdin:
    count += 1

# Truncate to whole seconds so the report matches the C++ version.
delta_sec = int(time.time() - start_time)

if delta_sec > 0:
    lines_per_sec = int(round(count / delta_sec))
    print("Read {0:n} lines in {1:n} seconds. LPS: {2:n}".format(count, delta_sec, lines_per_sec))
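
Both programs time the run in whole seconds, so short inputs produce no report and fast runs are measured coarsely. One possible refinement, shown here only as a sketch and not as the gist's method, is to time the loop with `time.perf_counter()` (Python 3.3 or later). The helpers `count_lines` and `format_report` are hypothetical names introduced for this example, and the demo reads from an in-memory stream rather than standard input.

```python
import io
import time


def count_lines(stream):
    """Count lines in a text stream; return (count, elapsed_seconds)."""
    start = time.perf_counter()
    count = 0
    for _ in stream:
        count += 1
    elapsed = time.perf_counter() - start
    return count, elapsed


def format_report(count, elapsed):
    """Format a report similar to the original, with fractional seconds."""
    # Avoid dividing by zero if the timer reports no elapsed time.
    rate = count / elapsed if elapsed > 0 else float("inf")
    return "Read {0} lines in {1:.3f} seconds. LPS: {2:.0f}".format(
        count, elapsed, rate)


# Demo on a small in-memory stream (an assumption for illustration).
demo = io.StringIO("first\nsecond\nthird\n")
count, elapsed = count_lines(demo)
print(format_report(count, elapsed))
```

Passing `sys.stdin` to `count_lines` instead of the demo stream would reproduce the benchmark loop with sub-second timing.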