@pydemo
Created September 11, 2018 20:25
read file in Cython
from libc.stdio cimport FILE
from libc.stdlib cimport free

cdef extern from "stdio.h":
    # FILE *fopen(const char *filename, const char *mode)
    FILE *fopen(const char *, const char *)
    # int fclose(FILE *stream)
    int fclose(FILE *)
    # ssize_t getline(char **lineptr, size_t *n, FILE *stream)  (POSIX, not ISO C)
    ssize_t getline(char **, size_t *, FILE *)


def read_file_slow(filename):
    # Pure-Python baseline: read line by line and discard each line.
    with open(filename, "rb") as f:
        while True:
            line = f.readline()
            if not line:
                break
            # yield line
    return []


def read_file(filename):
    cdef FILE* cfile
    cdef char* line = NULL  # getline() allocates and grows this buffer
    cdef size_t l = 0
    cdef ssize_t read

    # Keep a reference to the bytes object so the char* stays valid.
    filename_byte_string = filename.encode("UTF-8")
    cdef char* fname = filename_byte_string

    cfile = fopen(fname, "rb")
    if cfile == NULL:
        raise FileNotFoundError(2, "No such file or directory: '%s'" % filename)

    while True:
        read = getline(&line, &l, cfile)
        if read == -1:
            break
        # yield line

    free(line)  # release the buffer allocated by getline()
    fclose(cfile)
    return []
@sylvain-ri

Thank you very much! I get a 2.5x reading speedup with your implementation. What are the commented lines for, though, for example #FILE * fopen ( const char * filename, const char * mode )?
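
A speedup figure like the one reported here depends on the file and the machine, so it is worth measuring locally. The following is a minimal sketch of a timing harness, not part of the gist: `read_lines_python` and `best_time` are hypothetical helper names, and the pure-Python reader stands in for `read_file_slow`.

```python
import time


def read_lines_python(path):
    """Pure-Python baseline: iterate over the file and count its lines."""
    count = 0
    with open(path, "rb") as f:
        for _ in f:
            count += 1
    return count


def best_time(reader, path, repeats=5):
    """Return the fastest wall-clock time in seconds over `repeats` runs."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        reader(path)
        timings.append(time.perf_counter() - start)
    return min(timings)
```

Once the Cython module is compiled, passing its `read_file` as `reader` and dividing the two results gives the ratio for a given file. Taking the minimum over several runs reduces noise from disk caching and other processes.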

@pydemo

pydemo commented Apr 13, 2021

They are just the C signatures of the functions being declared.
Warning: on the DevOps side, releasing Cython in the cloud will be a nightmare.

@sylvain-ri

I'm not at that point yet. How bad is it? Did you stop using Cython, or are there workarounds? Thanks for the warning.

@sylvain-ri

Hi, if you have some time, would you have a look at an issue? I tried to read 4 lines at the same time, and my Jupyter notebook keeps crashing.
can-getline-be-used-multiple-times

@sharon92

sharon92 commented Nov 22, 2022

@sylvain-ri I am new to Cython and I get an unresolved symbol error for getline. Did you have to include some header files in setup.py? Thanks!

@sylvain-ri

Hi @sharon92. Do you have more details? You could post on Stack Overflow.
From what I did 2 years ago, I think cdef extern from "stdio.h" is needed. Check https://github.com/sylvain-ri/PLoT-ME/blob/master/plot_me/cython_module/cyt_ext.pyx

# Related to the file reader. Can be replaced by from libc.stdio cimport fopen, fclose, getline ; +10% time
# from https://gist.github.com/pydemo/0b85bd5d1c017f6873422e02aeb9618a
cdef extern from "stdio.h":
    FILE *fopen(const char *, const char *)
    # int fclose ( FILE * stream )
    int fclose(FILE *)
    # ssize_t getline(char **lineptr, size_t *n, FILE *stream);
    ssize_t getline(char **, size_t *, FILE *)

@sharon92

sharon92 commented Nov 24, 2022

@sylvain-ri Hi, I am trying to compile the pyx file with the command python.exe setup.py build_ext --inplace, and my setup.py looks like this:

from distutils.core import setup, Extension
from Cython.Build import cythonize

setup(
      ext_modules=cythonize("meshReader.pyx",    
                            annotate=True,
                            compiler_directives={'language_level' : "3"}),

      )


The linker fails with an unresolved external symbol "getline":

c:\Projects\CythonProjects\FP-Creator\src\meshReader>c:\Py310\python.exe setup.py build_ext --inplace
Compiling meshReader.pyx because it changed.
[1/1] Cythonizing meshReader.pyx
running build_ext
building 'meshReader' extension
"C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.33.31629\bin\HostX86\x64\cl.exe" /c /nologo /O2 /W3 /GL /DNDEBUG /MD -Ic:\Py310\include -Ic:\Py310\Include "-IC:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.33.31629\include" "-IC:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.33.31629\ATLMFC\include" "-IC:\Program Files\Microsoft Visual Studio\2022\Community\VC\Auxiliary\VS\include" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\ucrt" "-IC:\Program Files (x86)\Windows Kits\10\\include\10.0.19041.0\\um" "-IC:\Program Files (x86)\Windows Kits\10\\include\10.0.19041.0\\shared" "-IC:\Program Files (x86)\Windows Kits\10\\include\10.0.19041.0\\winrt" "-IC:\Program Files (x86)\Windows Kits\10\\include\10.0.19041.0\\cppwinrt" "-IC:\Program Files (x86)\Windows Kits\NETFXSDK\4.8\include\um" /TcmeshReader.c /Fobuild\temp.win-amd64-3.10\Release\meshReader.obj
meshReader.c
meshReader.c(1674): warning C4013: 'getline' undefined; assuming extern returning int
creating c:\Projects\CythonProjects\FP-Creator\src\meshReader\build\lib.win-amd64-3.10
"C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.33.31629\bin\HostX86\x64\link.exe" /nologo /INCREMENTAL:NO /LTCG /DLL /MANIFEST:EMBED,ID=2 /MANIFESTUAC:NO /LIBPATH:c:\Py310\libs /LIBPATH:c:\Py310 /LIBPATH:c:\Py310\PCbuild\amd64 "/LIBPATH:C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.33.31629\ATLMFC\lib\x64" "/LIBPATH:C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.33.31629\lib\x64" "/LIBPATH:C:\Program Files (x86)\Windows Kits\NETFXSDK\4.8\lib\um\x64" "/LIBPATH:C:\Program Files (x86)\Windows Kits\10\lib\10.0.19041.0\ucrt\x64" "/LIBPATH:C:\Program Files (x86)\Windows Kits\10\\lib\10.0.19041.0\\um\x64" /EXPORT:PyInit_meshReader build\temp.win-amd64-3.10\Release\meshReader.obj /OUT:build\lib.win-amd64-3.10\meshReader.cp310-win_amd64.pyd /IMPLIB:build\temp.win-amd64-3.10\Release\meshReader.cp310-win_amd64.lib
Creating library build\temp.win-amd64-3.10\Release\meshReader.cp310-win_amd64.lib and object build\temp.win-amd64-3.10\Release\meshReader.cp310-win_amd64.exp
meshReader.obj : error LNK2001: unresolved external symbol getline
build\lib.win-amd64-3.10\meshReader.cp310-win_amd64.pyd : fatal error LNK1120: 1 unresolved externals
error: command 'C:\\Program Files\\Microsoft Visual Studio\\2022\\Community\\VC\\Tools\\MSVC\\14.33.31629\\bin\\HostX86\\x64\\link.exe' failed with exit code 1120

my pyx looks like this:

cimport cython
from libc.stdio cimport FILE, fopen, fclose, getline
from cython.parallel import prange

cdef extern from "stdio.h":
    # FILE * fopen ( const char * filename, const char * mode )
    FILE *fopen(const char *, const char *)
    # int fclose ( FILE * stream )
    int fclose(FILE *)
    # ssize_t getline(char **lineptr, size_t *n, FILE *stream);
    ssize_t getline(char **, size_t *, FILE *)
    
def read_file(filename):
    filename_byte_string = filename.encode("UTF-8")
    cdef char* fname = filename_byte_string

    cdef FILE* cfile
    cfile = fopen(fname, "rb")
    if cfile == NULL:
        raise FileNotFoundError(2, "No such file or directory: '%s'" % filename)


    cdef char * line = NULL
    cdef size_t l = 0
    cdef ssize_t read
 
    while True:
        read = getline(&line, &l, cfile)
        if read == -1: break

    fclose(cfile)
    return []
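
The C4013 warning and the LNK2001 error in the build log point to the same cause: getline is a POSIX function, not part of ISO C, and as far as I know Microsoft's C runtime does not provide it. Declaring it in a cdef extern block only tells Cython the signature; it cannot make the symbol exist at link time. On Windows, a loop over fgets (which is ISO C) with a fixed-size buffer is one possible replacement. The sketch below is a quick way to check from Python which symbols a C runtime exports. `c_runtime_exports` is a hypothetical helper name, and loading `msvcrt` on Windows is an assumption used as a stand-in for the runtime that the extension actually links against.

```python
import ctypes
import sys


def c_runtime_exports(symbol):
    """Return True if the C runtime visible to this process exports `symbol`."""
    if sys.platform == "win32":
        # Assumption: msvcrt stands in for the Windows C runtime.
        runtime = ctypes.CDLL("msvcrt")
    else:
        # CDLL(None) exposes the symbols already loaded into this process,
        # which includes libc on Linux and macOS.
        runtime = ctypes.CDLL(None)
    # ctypes raises AttributeError for a missing symbol, so hasattr() works.
    return hasattr(runtime, symbol)


if __name__ == "__main__":
    for name in ("fopen", "fgets", "getline"):
        print(name, c_runtime_exports(name))
```

If getline is reported as missing on the target platform, the extension has to supply its own line reader or switch to fgets.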

@sylvain-ri

I haven't used Cython for 2 years now. Did you check the tutorial? https://cython.readthedocs.io/en/latest/src/tutorial/external.html
I see an import cython first, before the example with from libc.stdlib cimport atoi (probably similar to our from libc.stdio cimport FILE, fopen, fclose, getline).

Which version of Cython do you use?

Looking at your error ("link failed with error code 1120"), I suggest you check this post: https://stackoverflow.com/questions/63750020/unresolved-external-symbol-error-when-linking-a-cython-extension-against-a-c-l
Or search for these terms to find more results.
All the best 😉
