zmughal/perl-bindings-etc.md Secret

## perl-bindings-etc.md

So, I have a bunch of projects in various states. Most of my projects that involve Perl + C in the same codebase are about bindings.

### Embedding R in Perl


Git: https://github.com/zmughal/embedding-r-in-perl-experiment
State: Almost finished. Just needs docs. It uses the new and shiny Inline::Module for packaging the Inline::C code.
TODO: There are still some more things listed in the issues that I want to handle, like dealing with bignums, complex numbers, etc.

### Embedding Octave in Perl


Git: https://github.com/zmughal/embedding-octave-experiment


State: Trying to figure out how to bind arbitrary C++ objects.


TODO: I'm keeping in touch with the author of Inline::CPP to figure out a way to cleanly bind all the different C++ objects.
It'd be great to get access to the Octave parser too for another project.
If this can be done with an FFI, that would be great. I'm looking at the options in EntropyOrg/p5-Graphics-VTK#4.


### Miscellaneous bindings for Perl


Waveform DB http://www.physionet.org/physiotools/wfdb.shtml: Just C. SWIG
bindings exist, but as usual, SWIG bindings are ugly.


Insight Toolkit http://www.itk.org/: Heavily templated C++. Current
bindings for Java, Python, and Tcl use a tool called CableSWIG + gccxml to
fill in the templates and compile them for certain types.


Visualization Toolkit http://www.vtk.org/ : Untemplated C++. Old Perl
bindings are on CPAN, but I contacted the current maintainer (works at
Raytheon!!) and told him I have plans to update it, so he gave me comaint
bits. I created a backend that uses Inline::Python just as a
demo (ugh...I use proxy objects here), but I want to create native bindings
https://github.com/zmughal/p5-Graphics-VTK.


Freetype https://github.com/zmughal/p5-Font-FreeType: Just C. I've been fixing up
this code along with a guy from Belarus. I need to look through the TODO and
start picking off bugs / updating the XS. I contacted the current author on
CPAN, but he hasn't responded yet, so I might have to get it transferred to me.


MuPDF http://mupdf.com/: Just C. I have a simple interface I made using SWIG, but
that was before I knew XS https://github.com/zmughal/p5-MuPDF.


pdfium https://code.google.com/p/pdfium/: C++. PDF rendering library from
FoxIt and Google.


Leptonica http://www.leptonica.com/. Just C. Image processing library. I have a
rudimentary binding up on CPAN https://metacpan.org/pod/Image::Leptonica.
It works. I extracted docs out of the comments using a program, but the API
is too C-like right now. I need to figure out how to wrap it nicely.


libsigrok http://sigrok.org/: Just C. Library to read from oscilloscopes
and multimeters. I have a small bit of code to read from a RadioShack
multimeter, but this library has more device support. Make my own LabVIEW,
maybe?


Csound http://www.csounds.com/: Just C. Csound6 came out with a simple API
to compile orchestras and control scores. Getting that done should be easy.
Later, I want to see if I can get direct access to the various unit
generators. I don't know what I'll do with it yet, but it would be nice to
rewire the orchestra in realtime.


Teem volumetric analysis libraries http://teem.sourceforge.net/. Just C.


There are some more computer vision and machine learning libraries I want to
bind too, so I'm game for anything.

### Code metadata extraction


I wanted to experiment with code extraction like in an IDE (or ctags,
cscope). So I created a small program that extracts comments and function
signatures from C code (using ctags) https://github.com/zmughal/p5-Quiver
and puts all that in a DB. I already used that to generate the documentation
for Image::Leptonica from the C sources, but the design needs an overhaul.
Later I want to play with the Marpa parser and add support for extracting code
from many languages. Marpa also has a neat feature where it can tell you how a
parse failed, so the parsers can easily be made robust to (some) syntax
errors.
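
The "comments and signatures into a DB" idea above can be illustrated with a rough sketch. This is not how p5-Quiver works: it is written in Python rather than Perl, it swaps ctags for a deliberately naive regular expression, and the C declarations, table layout, and names (`img_scale`, `img_free`, `functions`) are all made up for the example.

```python
import re
import sqlite3

# Hypothetical C header text; the functions are invented for this sketch.
C_SOURCE = """
/* Scale an image by the given factor. */
IMG *img_scale(IMG *src, float factor);

/* Free an image. */
void img_free(IMG **img);
"""

# Naive pattern: a block comment followed directly by a prototype.
# A real tool would use ctags or a proper parser instead.
PATTERN = re.compile(
    r"/\*\s*(?P<comment>.*?)\s*\*/\s*"
    r"(?P<signature>[\w\s\*]+?\b(?P<name>\w+)\s*\([^)]*\))\s*;",
    re.DOTALL,
)

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE functions (name TEXT PRIMARY KEY, signature TEXT, doc TEXT)")

for match in PATTERN.finditer(C_SOURCE):
    # Collapse runs of whitespace so multi-line prototypes are stored on one line.
    signature = " ".join(match["signature"].split())
    db.execute(
        "INSERT INTO functions VALUES (?, ?, ?)",
        (match["name"], signature, match["comment"]),
    )

rows = db.execute("SELECT name, signature, doc FROM functions ORDER BY name").fetchall()
```

Once the rows are in a table, generating per-function documentation is a matter of querying by name and formatting the `doc` column.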


I created an IPython language kernel for Perl, but it currently doesn't have
code completion https://github.com/zmughal/p5-Devel-IPerl. I would like to
add completion using PPI, but I also have some ideas of using the Perl
debugger to extract dynamic type information at runtime and use that to make
suggestions (possibly by running the tests).


Bindings for srclib would be nice https://github.com/sourcegraph/srclib.
This might mean creating an Inline::Go package.


Automatic memoisation like IncPy http://www.pgbovine.net/incpy.html,
http://academic.odysci.com/article/1010113015034564/using-automatic-persistent-memoization-to-facilitate-data-analysis-scripting.
Related: Panda http://infolab.stanford.edu/panda/,
StarFlow http://dash.harvard.edu/handle/1/4797264
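
To make the persistent memoisation idea concrete, here is a minimal sketch in Python (not Perl, and not taken from any of the projects above). The decorator name `persistent_memoize` and the on-disk layout are assumptions for illustration. It only keys on the function name and arguments, so it will not notice changed code or changed input files, which is the hard part a real system has to handle.

```python
import functools
import hashlib
import os
import pickle
import tempfile


def persistent_memoize(cache_dir):
    """Cache a function's results on disk, keyed by its name and arguments."""
    os.makedirs(cache_dir, exist_ok=True)

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Hash the pickled call signature to get a stable file name.
            key = pickle.dumps((func.__qualname__, args, sorted(kwargs.items())))
            path = os.path.join(cache_dir, hashlib.sha256(key).hexdigest() + ".pkl")
            if os.path.exists(path):
                with open(path, "rb") as fh:
                    return pickle.load(fh)
            result = func(*args, **kwargs)
            with open(path, "wb") as fh:
                pickle.dump(result, fh)
            return result

        return wrapper

    return decorator


calls = []


@persistent_memoize(tempfile.mkdtemp())
def slow_square(n):
    calls.append(n)  # record real executions so a cache hit is observable
    return n * n


first = slow_square(12)
second = slow_square(12)  # served from disk; the function body does not run again
```

Because the cache lives on disk, a later run of the same script pointed at the same directory would also skip the computation.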


### Social media scraping


I wrote an NNTP server that shows the Facebook feed as NNTP messages
https://github.com/zmughal/nntp-portal. It's currently read-only.
I realised that Facebook doesn't show everybody's posts via the API, so I
want to also scrape directly from the website. I found that doing that
without JS is painful to keep up with.
Later I want to add support for Slashdot, reddit, Hacker News, etc.
I already have a scraper for Slashdot https://github.com/zmughal/p5-WWW-Slashdot-Scraper,
but I need to rate-limit it.
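
Rate-limiting a scraper can be as simple as enforcing a minimum gap between requests. The sketch below is in Python rather than Perl and is not part of p5-WWW-Slashdot-Scraper; the `RateLimiter` class and its parameters are hypothetical. The clock and sleep functions are injectable so the behaviour can be checked without actually waiting.

```python
import time


class RateLimiter:
    """Allow at most one call every `min_interval` seconds."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self.clock = clock
        self.sleep = sleep
        self.last_call = None  # time of the previous permitted call

    def wait(self):
        """Block until enough time has passed since the previous call."""
        now = self.clock()
        if self.last_call is not None:
            remaining = self.min_interval - (now - self.last_call)
            if remaining > 0:
                self.sleep(remaining)
                now = self.clock()
        self.last_call = now
```

A scraper would create one limiter per site, for example `limiter = RateLimiter(2.0)`, and call `limiter.wait()` immediately before each HTTP request.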


### Units of measure


I have an experiment to add units of measure to arbitrary numerical types
https://github.com/zmughal/units-experiment. I want to do it properly by
modeling physical quantities and using udunits-2 to handle conversion constants.
This will need to know the difference between torque and energy http://en.wikipedia.org/wiki/Torque#Relationship_between_torque.2C_power.2C_and_energy

[t]he unit newton metre is dimensionally equivalent to the joule, which is the unit of energy. However, in the case of torque, the unit is assigned to a vector, whereas for energy, it is assigned to a scalar.
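
One way to model that distinction is to carry a "kind" tag alongside the dimension, so two quantities with identical units can still be incompatible. The following is a small Python sketch under that assumption; it is not the design of units-experiment, it does not use udunits-2, and the `Quantity` class and its `kind` field are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quantity:
    value: float
    dimension: tuple  # exponents of (mass, length, time)
    kind: str         # hypothetical tag such as "energy" or "torque"

    def __add__(self, other):
        if self.dimension != other.dimension:
            raise TypeError("dimension mismatch")
        if self.kind != other.kind:
            raise TypeError(f"cannot add {self.kind} to {other.kind}")
        return Quantity(self.value + other.value, self.dimension, self.kind)


# kg * m^2 / s^2, shared by the joule and the newton metre
NEWTON_METRE = (1, 2, -2)

work = Quantity(10.0, NEWTON_METRE, "energy")
heat = Quantity(5.0, NEWTON_METRE, "energy")
torque = Quantity(3.0, NEWTON_METRE, "torque")

total = work + heat  # same dimension and same kind: allowed

try:
    work + torque  # same dimension, different kind: rejected
    mixed_allowed = True
except TypeError:
    mixed_allowed = False
```

A plain dimension check would accept `work + torque`; the extra tag is what rejects it.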


### Date time parsing


I wrote a tool that will extract seminar info out of some websites
https://github.com/zmughal/seminar-extractor.
This is an initial stab at doing information extraction and date time
parsing, but I want to make it smarter so I don't have to worry about edge
cases. It's currently using DateTime::Format::Natural, but I want to look at
Probabilistic CFGs like the ones used in https://github.com/wit-ai/duckling.


### Logic programming


Implement miniKanren (http://minikanren.org/) in Perl.
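
For a sense of how small the core is, here is a sketch of a microKanren-style kernel (unification plus `eq`, `call_fresh`, `disj`, and `conj`). It is written in Python rather than Perl purely as an illustration, it omits the occurs check, and the helper names (`Var`, `walk_all`, `run`) are my own choices for this example.

```python
class Var:
    """A logic variable; object identity distinguishes variables."""

    def __init__(self, index):
        self.index = index

    def __repr__(self):
        return f"_{self.index}"


def walk(term, subst):
    """Follow bindings until reaching a value or an unbound variable."""
    while isinstance(term, Var) and term in subst:
        term = subst[term]
    return term


def walk_all(term, subst):
    """Like walk, but also resolves variables nested inside tuples."""
    term = walk(term, subst)
    if isinstance(term, tuple):
        return tuple(walk_all(t, subst) for t in term)
    return term


def unify(a, b, subst):
    """Return an extended substitution making a and b equal, or None."""
    a, b = walk(a, subst), walk(b, subst)
    if isinstance(a, Var) and a is b:
        return subst
    if isinstance(a, Var):
        return {**subst, a: b}
    if isinstance(b, Var):
        return {**subst, b: a}
    if isinstance(a, tuple) and isinstance(b, tuple) and len(a) == len(b):
        for x, y in zip(a, b):
            subst = unify(x, y, subst)
            if subst is None:
                return None
        return subst
    return subst if a == b else None


# A goal maps a state (substitution, next variable index) to a stream of states.
def eq(a, b):
    def goal(state):
        subst, counter = state
        extended = unify(a, b, subst)
        if extended is not None:
            yield (extended, counter)
    return goal


def call_fresh(make_goal):
    def goal(state):
        subst, counter = state
        yield from make_goal(Var(counter))((subst, counter + 1))
    return goal


def disj(g1, g2):
    def goal(state):
        # Interleave both streams so an infinite branch cannot starve the other.
        streams = [g1(state), g2(state)]
        while streams:
            stream = streams.pop(0)
            try:
                item = next(stream)
            except StopIteration:
                continue
            streams.append(stream)
            yield item
    return goal


def conj(g1, g2):
    def goal(state):
        for intermediate in g1(state):
            yield from g2(intermediate)
    return goal


def run(make_goal):
    """Return every value of the query variable for which the goal succeeds."""
    q = Var(0)
    return [walk_all(q, subst) for subst, _ in make_goal(q)(({}, 1))]


answers = run(lambda q: disj(eq(q, "tea"), eq(q, "coffee")))
pairs = run(lambda q: call_fresh(lambda x: conj(eq(x, 1), eq(q, (x, 2)))))
```

Goals here are generator functions, which gives lazy streams for free; a Perl port would need closures or iterators to play the same role.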