@auroraeosrose
Created June 19, 2014 23:22
Ideas for merged extension generator

An initial dump of ideas for a PHP generator

  1. use Composer and the Symfony Console tools for CLI manipulation
  2. pluggable architecture

Components needed: configuration, parsing/scanning (a helper for generating definitions), and templating.
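
To make the first idea concrete, the CLI shell could be little more than a Symfony Console application installed via Composer; the "generate" command and its wiring below are placeholders, not an agreed design:

<?php
// Hypothetical CLI entry point using Symfony Console (installed via Composer).
// Note: newer Symfony versions require execute() to declare an int return type.

use Symfony\Component\Console\Application;
use Symfony\Component\Console\Command\Command;
use Symfony\Component\Console\Input\InputInterface;
use Symfony\Component\Console\Output\OutputInterface;

require 'vendor/autoload.php';

class GenerateCommand extends Command
{
    protected function configure()
    {
        $this->setName('generate')
             ->setDescription('Generate extension source from the definition files');
    }

    protected function execute(InputInterface $input, OutputInterface $output)
    {
        $output->writeln('Generating...');

        return 0;
    }
}

$app = new Application('peg');
$app->add(new GenerateCommand());
$app->run();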

@auroraeosrose
Author

Hmmm - nitpick - we need to call it a "scanner" for generating your json definitions files from source

a "parser" is something different in computer science ;)

I think your idea of a "template" and mine are different

This is how gtk/gen is designed to work currently - in a very stereotypical lexer/parser style

  1. it reads in the definition file using the designated lexer class (say JSON or GIR XML) and creates "tokens" (actually PHP objects) filled with data describing each thing: a class has a name and maybe a comment, a method has a return value, arguments, etc.
  2. The resulting objects are fed into the parser - the parser says "I have a class, include the class template and pass it this data as variables to interpolate", "I have a method, include the method template and give it these variables to interpolate" - so as little logic as possible lives in the templates (maybe a foreach or some if/else)
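
A rough sketch of that flow - the ClassToken class, the JSON shape, and the template path are hypothetical, just to illustrate the token -> template hand-off:

<?php
// Hypothetical sketch of the lexer -> parser -> template flow described above.

class ClassToken
{
    public $name;
    public $comment;
    public $methods = [];

    public function __construct($name, $comment = '', array $methods = [])
    {
        $this->name = $name;
        $this->comment = $comment;
        $this->methods = $methods;
    }
}

// The "lexer": turn a definition file into token objects.
function lex_definitions($file)
{
    $data = json_decode(file_get_contents($file), true);

    $tokens = [];
    foreach ($data['classes'] as $class) {
        $tokens[] = new ClassToken($class['name'], $class['comment'], $class['methods']);
    }

    return $tokens;
}

// The "parser": walk the tokens, pick a template per token type, and expose
// the token's data to the template for interpolation.
function generate(array $tokens, $template_dir)
{
    $output = '';

    foreach ($tokens as $token) {
        $class = $token; // visible inside the included template
        ob_start();
        include $template_dir . '/class.php';
        $output .= ob_get_clean();
    }

    return $output;
}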

Both the lexer (what is the format of my definitions and how do I change it into an internal representation the parser understands) and possibly even the parser (how do I handle each object with its data) can be pluggable in this situation - the parser could possibly even be only partially pluggable, but there would be a default implementation of this, with default template files.
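
One way to keep both pieces pluggable is a pair of small interfaces with a default implementation behind them; the names below (LexerInterface, ParserInterface) are only a suggestion:

<?php
// Hypothetical interfaces for the pluggable lexer/parser idea above.

interface LexerInterface
{
    /** Turn a definition file into an array of internal token objects. */
    public function lex($definition_file);
}

interface ParserInterface
{
    /** Turn token objects into generated source, using a template directory. */
    public function parse(array $tokens, $template_dir);
}

// A JSON lexer and a GIR XML lexer would both implement LexerInterface, so the
// rest of the generator never has to care what the original format was.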

As for configuration - the storage mechanism shouldn't matter, as long as the actual configuration class's expectations are properly documented. An ini version, for example, would look like:

config.ini
[description]
name=my_extension
author[]=June
author[]=John
contributor[]=Jay
contributor[]=Maria
version=1.0
deps[]=cairo
deps[]=zlib

[definition]
type=GIR
location=/opt/my_extension/defs
parser=default

[output]
create = true
templates=/opt/my_extension/tpl
location=/opt/my_extension/src

[build]
engine=zend
no_windows=true ; maybe the extension doesn't support windows
require_lib=zlib
generate_pecl=false
generate_composer=true
generate_pickle=true
use_cpp=false
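
For illustration, an ini-backed loader for the sample above could be as small as this (the key names simply mirror the example config, nothing here is a fixed API):

<?php
// Hypothetical loader for the sample config.ini above. parse_ini_file()
// handles the author[]/deps[] array syntax natively, and the second argument
// keeps the [section] structure.

$config = parse_ini_file('config.ini', true);

$extension_name = $config['description']['name'];     // "my_extension"
$authors        = $config['description']['author'];   // ["June", "John"]
$definition_dir = $config['definition']['location'];  // "/opt/my_extension/defs"
$use_cpp        = (bool) $config['build']['use_cpp']; // unquoted false parses as ""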

@jgmdev

jgmdev commented Jun 23, 2014

Hmmm - nitpick - we need to call it a "scanner" for generating your json definitions files from source
a "parser" is something different in computer science ;)

Then let's call it scan xD. I used the term parser because in doxygen's case there are lots of XML files that contain the whole structure of a library, so I wrote an XML parser(?) to extract all definitions and store them in an easy-to-read JSON format with an established structure to feed the code generator (by easy to read I also mean JSON -> PHP array with json_decode).
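
A minimal sketch of that scan step, assuming a doxygen-style index.xml where each compound carries a kind attribute and a name element (the exact element names vary between doxygen versions, so treat this as illustrative):

<?php
// Illustrative scan step: walk a doxygen index.xml and dump class names into
// the intermediate JSON definition file. The compound/kind/name structure is
// an assumption about doxygen's index format.

$index = simplexml_load_file('doxygen/xml/index.xml');

$definitions = ['classes' => []];

foreach ($index->compound as $compound) {
    if ((string) $compound['kind'] === 'class') {
        $definitions['classes'][] = ['name' => (string) $compound->name];
    }
}

file_put_contents(
    'definitions/classes.json',
    json_encode($definitions, JSON_PRETTY_PRINT)
);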

I think your idea of a "template" and mine are different
This is how gtk/gen is designed to work currently - in very stereotypical lexer/parser style

it reads in the definition file using the designated lexer class (say json or GIR xml) and creates "tokens" (actually PHP objects) filled with data describing each thing, class has a name, maybe a comment, method has return value, arguments, etc.

Instead of lexer I call it the parser :D

The wxphp generator works the same way, except that instead of creating PHP objects it creates an associative array for classes, constants, enumerations, etc. But we should definitely use PHP objects to traverse the definitions/tokens once loaded in memory.

Also, I established a format for the JSON definition files so it doesn't matter whether the original definitions are in GIR, doxygen, etc. So I called "parser" the process of extracting definitions from doxygen/GIR and storing them in an established JSON format that can be read by the code generator and distributed as part of your source repository (so the definitions directory exists to store those JSON files in that established format).

This way, anyone is able to work with/regenerate the source code without needing the original GIR or doxygen definition files, and without leaving the extension's directory.
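
A hypothetical shape for such an intermediate format, shown here as the PHP array you would get back from json_decode (this is not the exact wxphp layout, just the idea of a source-format-independent structure):

<?php
// Hypothetical intermediate definition structure: whatever the original
// source was (GIR, doxygen, ...), the generator only ever sees this shape.
$classes = [
    'SomeClassName' => [
        'description' => 'What the class does',
        'methods'     => [
            'MethodName' => [
                'static'      => false,
                'const'       => true,
                'return_type' => 'const int*',
                'parameters'  => [
                    ['name' => 'value', 'type' => 'int', 'default' => '0'],
                ],
            ],
        ],
    ],
];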

The resulting objects are fed into the parser - the parser says "I have a class, include the class template and pass it this data as variables to interpolate" "I have a method, include the method template and give it these variables to interpolate" - so as little logic is used in the templates as possible (maybe a foreach or some if/else)

Same in wxphp, but instead of parser I called it the generator xD (so it seems I'm using the wrong terms). In my sucky English dictionary I used generator as the word referring to the process of reading the JSON definitions and creating the C/C++ source code and config.m4/config.w32 of the extension.

so peg generate would do something like this:

<?php
$class_definitions = json_decode(file_get_contents("definitions/classes.json"));
$classes = new ClassSymbols($class_definitions); // Helper class to easily traverse the definitions
unset($class_definitions);

$generated_code = "";

foreach($classes as $class)
{
    $authors = get_authors(); // Can be used from header.php

    ob_start();
    include("templates/classes/header.php");
    $classes_header_code = ob_get_contents();
    ob_end_clean();

    $generated_code .= $classes_header_code;

    // etc...

    file_put_contents("src/header1.h", $generated_code);
}

Obviously more code is required, but it is just a sample so I can better transmit my spaghetti ideas.

Both the lexer (what is the format of my definitions and how do I change it to an internal representation the parser understands) and possibly even the parser (how do I handle each object with it's data) can be pluggable in this situation - the parser could possibly even only be partially pluggable, but there would be a default implementation of this, with default template files

Exactly :)

as for configuration - the storage mechanism shouldn't matter, as long as what the actual configuration class expectations are properly documented, an ini version, for example, would look like

Ahhh, nice ideas - the [build] and deps options. We can surely make a great team xD

@auroraeosrose
Author

So you're basically using json as an intermediate "cached" format? That's... not a good format for it

the json parser is OK in PHP, but it's not as nice as other formats and escaping might be problematic (oy, the charset issues you could discover)

If you want to have an easy-to-use intermediate format (as you're using JSON) you should use something more accessible and less prone to eating memory

Might I suggest perhaps using sqlite as an intermediate cached format instead if you want to have that functionality? - it would be a single file and easily query-able, which would make updating WAY faster (you could store hashes of the original defs files in any format and check if they've changed since the last lexer pass with one query - tada, incremental lexing and parsing for free)
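
A sketch of that incremental check with PDO and SQLite (the table layout is invented for the example):

<?php
// Illustrative incremental pass: store a hash of each original definition
// file in SQLite and only re-lex the files whose hash has changed.

$db = new PDO('sqlite:definitions/cache.sqlite');
$db->exec('CREATE TABLE IF NOT EXISTS def_files (path TEXT PRIMARY KEY, hash TEXT)');

function needs_relex(PDO $db, $path)
{
    $hash = hash_file('sha1', $path);

    $stmt = $db->prepare('SELECT hash FROM def_files WHERE path = ?');
    $stmt->execute([$path]);

    if ($stmt->fetchColumn() === $hash) {
        return false; // unchanged since the last lexer pass
    }

    $update = $db->prepare('REPLACE INTO def_files (path, hash) VALUES (?, ?)');
    $update->execute([$path, $hash]);

    return true;
}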

Another option would be to even just write out PHP files with the definitions of the classes/methods embedded right in, then you could totally skip any kind of json_decode or anything else and just include the intermediate files to get the definitions

@jgmdev

jgmdev commented Jun 24, 2014

I love sqlite; the problem with sqlite is that I can't use a text editor in case I want to manually modify or fix something in the definitions/cache.

Before using JSON the original wxphp developer used serialize/unserialize, but if editing was required it was almost impossible.

A PHP format sounds good, but parsing a huge PHP file would surely take some time and memory too.

In my experience JSON hasn't been bad, and wxWidgets is a huge library - check https://github.com/wxphp/wxphp/blob/master/json/classes.json

Anyway, if we choose PHP I would be happy - everything would be PHP xD, even template files :D It would just require a little more work at first.

@jgmdev

jgmdev commented Jun 24, 2014

An example of definitions/classes.php could be:

<?php
$class = new ClassSymbol("SomeClassName");

$class->AddMethod(
    "MethodName",
    [
        "static" => false,
        "const" => true,
        "virtual" => true,
        "pure_virtual" => false,
        "protected" => false,
        "return_type" => "const int*",
        "description" => "This method does this and that",
        "parameters" => new Parameters() //Will think on this later
    ]
);

Peg\Application::GetSymbolsTable()->AddClass($class);

//etc...

@auroraeosrose
Author

I actually have quite a bit of that already done in one fashion - see https://github.com/gtkforphp/generator/tree/master/lib/objects - we'll need to set up the proper hierarchy: module -> package (actually this is namespace, but namespace is a reserved word - we'll have similar issues with class) -> class -> method, but also namespace -> function or just package -> function. Then both return values and parameters are subclasses of args, and we should probably do a "documentation" type with info to fill in docblocks, etc.
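
Roughly, the hierarchy being described - the names below are placeholders ("Package" for namespace, "Klass" for class, "PhpFunction" for function, since those words are reserved in PHP):

<?php
// Placeholder sketch of the object hierarchy above; names are stand-ins for
// the reserved words they represent.

class Module      { public $name; public $packages = []; }
class Package     { public $name; public $classes = []; public $functions = []; }
class Klass       { public $name; public $methods = []; public $doc; }
class Method      { public $name; public $returnValue; public $parameters = []; public $doc; }
class PhpFunction { public $name; public $returnValue; public $parameters = []; public $doc; }

// Return values and parameters share a common Arg base.
class Arg         { public $name; public $type; }
class Parameter   extends Arg { public $optional = false; public $defaultValue; }
class ReturnValue extends Arg {}

// A Documentation value object carries whatever ends up in the docblocks.
class Documentation { public $summary; public $longDescription; public $since; }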

Also I didn't mention it much but we should make this with a pluggable front-end
So we can have peg-cli and peg-gtk and peg-web versions for people to play with ;)

@jgmdev

jgmdev commented Dec 1, 2015

TODO (Reminder)

  • Use CastXML to parse C/C++ header files directly and convert them to an easy-to-parse XML file.
  • Add GIR support in order to parse GTK documentation.
  • Finish the code generator for PHP version 5 (generate classes).
  • Use the PHP version 5 code generator as a base to start writing a PHP 7 code generator.
  • Also add HHVM code generation support.
  • Implement C++ template support.
  • Write a web UI that could be launched from the command line (e.g. peg ui) in order to graphically modify definition files and enable/disable classes, methods, functions, etc.
