Tech Is Hard

Credibility = Talent x Years of experience + Proven hardcore accomplishment

Category Archives: sample code

PIM using REBOL 2.1 more parsing


Since REBOL is new to me, I continue to experiment with the syntax to see what feels right in terms of the fundamental criteria: low coupling and high cohesion. I simplified the high-level rule and moved some of the processing into the lower-level matching rules. I added some flexibility in the ordering of terms in the input, but the description has to come last; since desc is intended to accept any input, it would otherwise greedily match the date. I think the date format will eventually need to change. I’m ignoring the definitions of the lowest-level terms from here on (letter, digit, etc.) except to say that the rules for day and month now set local values:

    day:      [copy the-day [ [#"1" | #"2" | #"3"] [#"0" | #"1"] | 
                              [#"1" | #"2"] digit | digit ] ]
								
    month:    [copy the-month [ "jan" | "feb" | "mar" | "apr" | "may" | "jun" | 
		                "jul" | "aug" | "sep" | "oct" | "nov" | "dec" ] ]

It may not be proper style to be clearing things at the beginning of the sentence rule, but it works and is clear. I got rid of the subject handling we had before, and I’m trying out a handler concept for tags. All we do when the rule completes now is print the parsed terms.

sentence: [( the-date: now/date
             the-subject: copy ""
             the-desc: copy ""
             the-amount: none
             the-tags: copy [] )
						
    some [ [some ws] | ["on" some ws date] | subject | amount | copy the-desc desc] 
    (print [the-date the-desc the-subject the-amount the-tags])
]

I changed the main rule to use some around a choice, instead of a bunch of alternative choices. The terms are set by the individual rules like date and desc:

desc:    [some [tag | some name-char | ws]]
tag:     [copy a-tag [ #"#" some name-char ] (tag-handler a-tag append the-tags a-tag)]
subject: [copy the-subject [ #"@" letter any name-char ]]
amount:  [copy the-amount [ opt sign some digit opt [ "." 0 2 digit ] ] 
             (the-amount: to decimal! the-amount)]
date:    [[ day | month ] [ [some ws] | #"/" ] [ month | day ] (
             default: now
             the-date: either (current: to date! rejoin [the-day "-" the-month "-" default/year]) > (default/date + 200)
                 [to date! rejoin [ the-day "-" the-month "-" (default/year - 1) ]]
                 [current]
             clear the-day clear the-month
         )]

You can see the tag-handler called in the tag rule. We’ll get to that. The date rule calculates the year automatically; if it’s more than 200 days in advance, we assume last year. My default tag-handler is expecting the current tag, and the accumulated tags block.

tag-handler: func [
    tag [string!]
    tags [block!]
][
    print ["tag found" tag "/" tags "/"]
]

There are multiple ways in REBOL for us to supply that tag-handler, which we’ll look at as our needs become clearer. Run the rule with a reasonably human sentence:

>> parse/all "@visa on 19 Dec -230.21 #statefarm #insurance for #condo interior contents" iAm/sentence
tag found #statefarm / #statefarm /
tag found #insurance / #statefarm #insurance /
tag found #condo / #statefarm #insurance #condo /
19-Dec-2011 #statefarm #insurance for #condo interior contents @visa -230 #statefarm #insurance #condo
== true

Looking at it now, I might like moving the copy commands from the sub-rules back into sentence. I plan soon to import all my GnuCash transactions into a REBOL format I can query and add to. I’ll need to be able to query and edit them, and I’m hoping that it’s unnecessary to have a bunch of hierarchical expense and income accounts.

It hit me today that tagging allows “multiple inheritance” in contrast to hierarchies of organization.

Up until the last few years I did my own taxes, and the only useful information I can recall needing was totals for certain categories of spending. An account hierarchy forces me to file each transaction under only the one applicable account. I’d rather tag it with meaningful attributes, among them #tax #deduction.


PIM using REBOL 2


I’ve updated the parsing rules. I find the REBOL parse dialect to be a level above regular expressions. I know people who are regex Jedi masters, but it really isn’t comparing apples to oranges. In REBOL’s BNF-like parse rules, one can define a grammar. Recursive and self-referential productions are possible — just like in human language. The parse rules most often contain REBOL code, surrounded by (), to execute at important points.

sentence: [ 
    [subject | date | amount] some ws 
    [subject | date | amount] some ws 
    [subject | date | amount]
    (
        either found? ctx: find subjects the-subject [
            do ctx/2
        ][
            append subjects compose/only/deep [ 
                (the-subject) [print ( reform ["!!!! found" the-subject] )] 
            ]
        ]

        print [ "DATE:" the-month the-day "ACCOUNT" the-subject "AMOUNT" the-amount ]
    )
]

subject: [copy the-subject [ #"@" letter any name-char ]]
date: 	 ["on" some ws copy the-day day some ws copy the-month month]
amount:  [copy the-amount [ opt sign any digit opt ["."] 0 2 digit ]]

day: 	 [	[#"1" | #"2" | #"3"] [#"0" | #"1"] | 
		[#"1" | #"2"] digit | 
		digit 
	]
month: 	[ "jan" | "feb" | "mar" | "apr" | "may" | "jun" | 
	  "jul" | "aug" | "sep" | "oct" | "nov" | "dec" 
	]

ws: 	   charset reduce [tab newline #" "]
sign: 	   [ #"+" | #"-" | none ]
name-char: [ letter | digit ]
letter:	   charset [#"A" - #"Z" #"a" - #"z"]
digit:	   charset [#"0" - #"9"]

(I don’t like the way I implemented allowing the terms in any order, but it can wait.)

You can see copy in the definitions for subject, date and amount to set a word (each beginning with the-) to the value that matches the expression that follows. At the end we process or collect the-subject.  If already found we can execute some code that’s unique to that subject. Otherwise I’m saving it and some stupid code to print that it was found. This is the code that will get executed when we find it later. The really cool thing about this (and it could be done in REBOL a number of ways) is that code is built at run time, as specifically tailored as you need it to be.

form, reform, mold, remold, join, rejoin, reduce and compose still confuse me some, so I always have the language reference open :), but here we use compose to reduce (evaluate, sort of) the items in its argument that are surrounded by (). The /only refinement leaves blocks it finds as blocks. Inner blocks are processed with the /deep refinement.

 
compose/only/deep [ (the-subject) [print ( reform ["!!!! found" the-subject] )] ]

I’ll use a journal block to save an audit trail of the statements being parsed and a block to hold the accumulated accounts/contexts:

journal:  copy [] ; use copy to initialize
subjects: copy []

and a set of sample statements:

samples: [
	"@discover on 11 Dec -13.22"
	"@discover -56.47 on 17 Dec"
	"on 13 Jan -1.20 @visa"
	"on 4 Jan @checking -1.01"
]

Now call parse with each of the statements, and see that we can gather the information we’ve set up rules for. The formatted lines show the values being copied to the correct words.

>> foreach statement samples [
[    either parse/all statement sentence [repend journal [ now statement ]][ print ["*** Couldn't parse" statement ]]
[    ]
DATE: Dec 11 ACCOUNT @discover AMOUNT -13.22
!!!! found @discover
DATE: Dec 17 ACCOUNT @discover AMOUNT -56.47
DATE: Jan 13 ACCOUNT @visa AMOUNT -1.20
DATE: Jan 4 ACCOUNT @checking AMOUNT -1.01

Print the journal where we saved every statement with a timestamp.

>> forskip journal 2 [print ["timestamp" journal/1 "statement" journal/2]]
timestamp 29-Dec-2011/21:07:48-7:00 statement @discover on 11 Dec -13.22
timestamp 29-Dec-2011/21:07:48-7:00 statement @discover -56.47 on 17 Dec
timestamp 29-Dec-2011/21:07:48-7:00 statement on 13 Jan -1.20 @visa
timestamp 29-Dec-2011/21:07:48-7:00 statement on 4 Jan @checking -1.01

The subjects block contains the subjects that were encountered and a small code block to execute for each.

>> probe subjects
== ["@discover" [print "!!!! found @discover"] "@visa" [print "!!!! found @visa"] "@checking" [print "!!!! found @checking"]]

Let’s look at some of the other ways we can build the tiny code block that gets appended to subjects. It could be coded:

compose/only/deep [ (the-subject) [print ( reduce ["!!!! found" the-subject] )] ]
...
>> probe subjects
== ["@discover" [print ["!!!! found" "@discover"]] "@visa" [print ["!!!! found" "@visa"]] "@checking" [print ["!!!! found" "@checki..

The runtime output looks the same, e.g. !!!! found @discover. But if the block will be executed a lot, it seems intuitive that having print join the two strings each time takes a little longer. Or it could have been coded as:

compose/only/deep [ (the-subject) [print  [ "!!!! found" (the-subject) ]] ]
...
>> probe subjects
== ["@discover" [print ["!!!! found" "@discover"]] "@visa" [print ["!!!! found" "@visa"]] "@checking" [print ["!!!! found" "@checki..

Which creates the same block as well as the same output.

The trick to picturing the evaluation is the parentheses. Some of my background languages make heavy use of macros, and that’s how I think of the (expression). What’s great to me is the preprocessor language is the same REBOL command dialect I use for everything else (REBOL script source is actually a dialect, too. There are no keywords.) Compose and related block-evaluating functions let me customize code blocks that will be executed many times. I can eliminate any redundant conditionals.

PIM using REBOL 1


OK. Something new. Ideas beget [usually bigger] ideas, before they’re even fully hatched.
I took a look at finance41 this last weekend. Very cool. It gains both simplicity and freedom for the user in terms of input. You can see a resemblance between its syntax and other formats like tweets and RTM for GTD.

It makes me wonder about the financial software I use and the style I think we’re accustomed to: strict representations not only of specific bank accounts, et al., but also of categories of spending. Hierarchical categories (categorization is another topic to hit on big time) are arguably the least useful way to find things. In finance41, you decide what’s an account and what tags you want to apply to a transaction. In that environment, it makes sense to make accounts only for your assets and liabilities and use tags for expenses and incomes.

It allows a query by multiple tags or defining macro tags, allowing me to slice and dice the data any way (that’s not available in f41, yet).  An example transaction in f41 is

@visa -119.23 on 23 Jul tile cutter from ace hardware #home #diy #kitchen

(Anything that’s not a date, amount, account or tag goes in the description field — but the separate items can appear in any order).  I can’t for the life of me see now why I need to be so formal as I have about the expense and income categories. I know what you’re thinking, “won’t that record the transaction in multiple registers and cause integrity issues?” Not if there’s no register or journal representing an expense or income. Just scan the transactions based on tags, which could be predefined in sets for convenient reporting.

Another thing I’d add is allow the tags and description to interleave.

And vastly expand the idea using REBOL.

I wrote these parse rules:

sentence:    [ copy the-subject subject 
               date-rule amount-rule (find-ctx the-subject)]
subject:     [ #"@" letter-char any name-char ]
date-rule:   [ thru "on" 
               some ws copy the-day day-rule 
               some ws copy the-month month-rule 
             ]
amount-rule: [ opt sign-char any digit-char opt ["."] 0 2 digit-char ]
day-rule:    [ [#"1" | #"2" | #"3"] [#"0" | #"1"] |
               [#"1" | #"2"] digit-char |
               digit-char
             ]
month-rule:  [ "JAN" | "FEB" | "MAR" | "APR" | "MAY" | "JUN" |
               "JUL" | "AUG" | "SEP" | "OCT" | "NOV" | "DEC"
             ]
ws:          charset reduce [tab newline #" "]
name-char: [ letter-char | digit-char ]
letter-char: charset [#"A" - #"Z" #"a" - #"z"]
sign-char: [ #"+" | #"-" | none ]
digit-char: charset [#"0" - #"9"]

and a sample sentence of:

sampleStatement: "@visa on 19 Dec -230.21"

In the sentence rule, we call find-ctx with the value we found for @visa (or anything else beginning with ‘@’). find-ctx looks like

find-ctx: func [{find, insert or update context} ctx [string!]][
	either found? ctx: find myContexts new-ctx: copy ctx [
		do ctx/2
	][ 
		print ["new context" new-ctx] 
	]
]

(BTW the parse is returning false right now, which means it’s not completing properly, but it is getting all my data, and since we’ll be adding a bunch of stuff to this, I’m not going to sweat it.)

>> myContexts: [] parse/all sampleStatement sentence
new context @visa
== false
>> myContexts: ["@visa" [print "a visa charge"]] parse/all sampleStatement sentence
a visa charge
== false

When a context is found we can execute code specifically for it. I hope to gather information as they’re added to make some sort of meta connections.

PHP Property Manager: _pm class


class __pm {

    // internal meta-management

    /**
     * Unsets the indicated properties in client class
     *
     * @param  object  $z
     */
    static function init($z) {
        $oReflectClass = new ReflectionClass(get_class($z));

        // For each public property
        foreach ($oReflectClass->getProperties(ReflectionProperty::IS_PUBLIC) as $oReflectProperty) {
            if (!($sDocBlock = $oReflectProperty->getDocComment()) || false === strpos($sDocBlock, " @pm")) continue;

            $propertyInfo = self::getDocBlockInfo($sDocBlock, $pn = $oReflectProperty->getName());
            unset($z->{$pn});
        }
    }

    /**
     * Returns an array of info from a docblock
     *
     * @param  string $sDocComment  docblock
     * @param  string $pn           property name
     * @return array                parsed docblock
     */
    // static, because init() calls it as self::getDocBlockInfo()
    public static function getDocBlockInfo($sDocComment, $pn) {

        // get the "short description" from docblock (start in position 3)
        preg_match("/\.*\s\*\s*([^\n]*)/m", $sDocComment, $desc, 0, 3);
        $return = array('label' => isset($desc[1]) ? $desc[1] : $pn);

        preg_match_all("/[^@]+@(\S+)\s*(\S+)?\s*([^@]+)?\n/", $sDocComment, $matches, PREG_SET_ORDER, 3);
        foreach ($matches as $tokens) {
            // token name
            $token = $tokens[1];

            // if we only have the @token, then its value is true
            if (count($tokens) == 2) {
                $return[$token] = true;
            }
            // otherwise we need arguments to @token as the value
            else {
                array_shift($tokens); array_shift($tokens);
                $return[$token] = $tokens;
            }
        }
        return $return;
    }

    // property accessors

    /**
     * Public property accessor
     *
     * @param  mixed   $z
     * @param  string  $pn
     * @return mixed
     */
    static function get($z, $pn) {
        self::trace();
        return null;
    }

    /**
     * Public property accessor
     *
     * @param  object  $z
     * @param  string  $pn
     * @param  mixed   $pv
     * @return mixed   Passed value
     */
    static function set($z, $pn, $pv) {
        self::trace();
        return $pv;
    }

init should be called during construction of your object to set up everything for the properties in the calling class that are marked with @pm.

getDocBlockInfo extracts metadata about the property from the docblock and returns an array.  This metadata combined with other reflection will control how we set up this property. So far it looks like:

(
    [label] => Distribution Center
    [pm] => 1
    [var] => Array
        (
            [0] => DistCenter
        )
)
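
As a rough illustration of how an array like the one above can come out of a docblock, here is a minimal stand-alone sketch. It is not the getDocBlockInfo method itself: it swaps the two regular expressions for a line-by-line scan, and the function name sketchDocBlockInfo is made up for this example. It assumes PHP 7 or later.

```php
<?php
// Hypothetical helper, not part of __pm: scans a docblock line by line.
function sketchDocBlockInfo(string $doc, string $fallbackLabel): array
{
    $info = ['label' => $fallbackLabel];
    $haveLabel = false;

    foreach (preg_split('/\R/', $doc) as $line) {
        // Strip leading whitespace and the "/**", "*/" or "*" decoration.
        $text = trim(preg_replace('#^\s*(/\*\*|\*/|\*)\s?#', '', $line));
        if ($text === '') {
            continue;
        }
        if ($text[0] === '@') {
            // "@token arg1 arg2" becomes token => [arg1, arg2]; a bare "@token" becomes true.
            $parts = preg_split('/\s+/', substr($text, 1));
            $name = array_shift($parts);
            $info[$name] = $parts ? $parts : true;
        } elseif (!$haveLabel) {
            // The first plain text line is treated as the short description.
            $info['label'] = $text;
            $haveLabel = true;
        }
    }
    return $info;
}

$doc = <<<'DOC'
/**
 * Distribution Center
 *
 * An object reference to $this User's distribution center object
 *
 * @pm
 * @var DistCenter
 */
DOC;

$info = sketchDocBlockInfo($doc, 'DistCenter');
print_r($info);
```

The sketch ignores the long description entirely, which is enough for labels and for spotting the @pm marker.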

trace() will log a formatted stack trace using trace_line().  (I’m only putting this functionality in __pm for convenience.)

    // -- logging

    /**
     *
     * Logs a trace to the point where called
     *
     * @param  mixed  String to print | TODO function that returns
     *                a string.
     *                (Arguments and return values are
     *                AUTOMATICALLY printed.)
     * @return void
     */
    static function trace() {

        $argc = count($argv = func_get_args());

        $backtrace = debug_backtrace();
        // default order is inside out, so this puts it in "top to bottom"
        $backtrace = array_reverse($backtrace);

        $indent = 0;

        // get rid of weird google analytic request var
        $request = $_REQUEST;
//      unset($request['XXX']); // get rid of anything you don't always want logged

        $file = isset($backtrace[0]['file']) ? $backtrace[0]['file'] : '';
        $description = "<?php {$file}" . ($argc && count($backtrace) < 2 ? " [{$backtrace[0]['line']}] - {$argv[0]}" : '');

        if (!empty($_SERVER['HTTP_X_REQUESTED_WITH']) && strtolower($_SERVER['HTTP_X_REQUESTED_WITH']) == 'xmlhttprequest')
            $description = "AJAX request - " . $description;

        self::line("{$description} \$_REQUEST: " . self::prin_r($request, true));

        // get rid of the last stack entry for having called the log method, itself
        array_pop($backtrace);
        if (count($backtrace) > 0) {
            // all entries but the last; the last one is logged after the loop
            for ($i = 0, $stackCount = count($backtrace) - 1; $i < $stackCount; $i++) {
                self::line(self::trace_line($indent, $backtrace[$i]));
            }
            self::line($argc ? self::trace_line($indent, $backtrace[$i], $argv[0]) : self::trace_line($indent, $backtrace[$i]));
        }
    }

    /**
     * Returns a single entry.
     *
     * @param  integer  $indent
     * @param  array    $trace_entry
     * @param  string   $argv[2]
     * @return string
     */
    private static function trace_line(&$indent, $trace_entry) {

        $line = array_key_exists('line', $trace_entry) ? "[{$trace_entry['line']}] " : '';
        $verb = $trace_entry['function'];

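        if (array_key_exists('class', $trace_entry)) {
            $classfile = $trace_entry['class'];
            $verb = "->" . $verb;
            if (array_key_exists('object', $trace_entry) and $object = $trace_entry['object']) {
                $id = property_exists($object, 'id') ? $object->id : false;
                // The next two lines were garbled in the original listing; this is a
                // best-guess reconstruction of a "<ClassName id>" style label.
                $classname = get_class($object);
                $class = "<{$classname}" . ($id !== false ? " {$id}>" : '>');
                if ($classfile != $classname)
                    $class .= " {$classfile}";
                $verb = $class . $verb;
            }
            else
                $verb = $classfile . $verb;
        }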
        else {
            $file = isset($trace_entry['file']) ? substr($trace_entry['file'], self::$_baseDirLength) : 'unknown file';
            $verb = "{$file} " . $verb;
        }
        $s = str_repeat(" ", $indent) . "{$line}{$verb}(" . self::prin_r($trace_entry['args']) . ")";
        $indent += strlen($line);

        if (count($argv = func_get_args()) > 2)
            return $s . "=" . self::prin_r($argv[2], true);
        else
            return $s;
    }

With a couple of useful output and formatting functions:

    static function line($string) {
        $line_start = date("m/d/Y H:i:s ");
        error_log("{$line_start}{$string}\n", 3, "__pm.log");
    }

    static function prin_r($arg) {
        $oneLine = preg_replace("/\s*\n\s*/" , ' ', print_r($arg, true));
        return gettype($arg) == 'object' ? '<'.$oneLine.'>' : $oneLine;
    }
}
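
The "top to bottom" ordering that trace() builds can be tried out in isolation. The following is a small sketch under a few assumptions: sketchTrace, outer and inner are hypothetical names, arguments are left out of the output, and nothing is written to a log file. It assumes PHP 7 or later.

```php
<?php
// Hypothetical helper: returns one formatted line per stack frame,
// outermost call first, indented by depth.
function sketchTrace(): array
{
    // debug_backtrace() lists the innermost frame first, so reverse it.
    $frames = array_reverse(debug_backtrace(DEBUG_BACKTRACE_IGNORE_ARGS));
    array_pop($frames); // drop the frame for sketchTrace() itself

    $lines = [];
    foreach ($frames as $depth => $frame) {
        $line  = isset($frame['line']) ? "[{$frame['line']}] " : '';
        $class = isset($frame['class']) ? $frame['class'] . $frame['type'] : '';
        $lines[] = str_repeat('  ', $depth) . $line . $class . $frame['function'] . '()';
    }
    return $lines;
}

function outer() { return inner(); }
function inner() { return sketchTrace(); }

$lines = outer();
echo implode("\n", $lines), "\n";
```

The line numbers in the output depend on where the functions are called from, so they will differ from file to file.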

PHP Property Manager: User and Model client classes


It will make things go faster to pack a lot of new things into this. User now looks like this:

require_once 'class.Model.php';

class User extends Model {

    public $id;

    /**
     * Distribution Center
     *
     * An object reference to $this User's distribution center object
     *
     * @pm
     * @var DistCenter
     */
    public $DistCenter;

    function __construct($id) {
        parent::__construct();
        $this->id = $id;
    }
}

Assuming that your class hierarchy might have a base class, I’ve added a base class of Model to User. (Remember, in my original implementation, the base class held all the property management, but I don’t want that restriction.) We’ve marked $DistCenter as a managed property with @pm and given it a type. Model should be doing most of the interfacing with __pm; it looks like:

require_once 'class.__pm.php';

class Model {

    function __construct() {
        __pm::init($this);
    }

    /**
     * Generic getter
     *
     * Called every time a @pm property is referenced
     *
     * @param  string  $prop  Name of object property
     * @return mixed          Value of property.
     */
    public function __get($prop) {
    	try {
            return __pm::get($this, $prop);
    	}
        catch (Exception $e) {
            error_log($e);
            throw $e;
        }
    }

    /**
     * Generic setter
     *
     * Called every time a @pm property of an object is set
     *
     * @param  string   $prop   Name of object property
     * @param  mixed    $value  Value to set object property to
     * @return mixed
     */
    function __set($prop, $value) {
        try {
            return __pm::set($this, $prop, $value);
    	}
    	catch (Exception $e) {
            error_log($e);
            throw $e;
    	}
    }
}

By supplying __get and __set interceptor functions, your class hierarchy can make use of the __pm class. I’m adding logging of any Exception that’s thrown by __pm.
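
To see the delegation pattern work end to end, here is a minimal sketch with a stand-in for __pm that actually stores values. SketchStore, SketchModel and SketchUser are hypothetical names, and keeping the values in a static array keyed by spl_object_hash() is my assumption for the demo, not how __pm is implemented.

```php
<?php
// Hypothetical stand-in for __pm: keeps managed values outside the object.
class SketchStore
{
    private static $values = [];

    public static function init($obj, array $managed)
    {
        foreach ($managed as $name) {
            unset($obj->{$name});   // route later access through __get/__set
        }
    }

    public static function get($obj, $name)
    {
        $key = spl_object_hash($obj);
        return isset(self::$values[$key][$name]) ? self::$values[$key][$name] : null;
    }

    public static function set($obj, $name, $value)
    {
        self::$values[spl_object_hash($obj)][$name] = $value;
        return $value;
    }
}

class SketchModel
{
    public function __get($prop)         { return SketchStore::get($this, $prop); }
    public function __set($prop, $value) { SketchStore::set($this, $prop, $value); }
}

class SketchUser extends SketchModel
{
    public $id;
    public $DistCenter;

    public function __construct($id)
    {
        $this->id = $id;
        SketchStore::init($this, ['DistCenter']);
    }
}

$u = new SketchUser(3);
var_dump($u->DistCenter);    // nothing stored yet, answered by __get
$u->DistCenter = 'Denver';   // intercepted by __set
echo $u->DistCenter, "\n";
echo $u->id, "\n";           // unmanaged property, read directly
```

One caveat with this shortcut: spl_object_hash() values can be reused once an object is destroyed, so a longer-lived store would need to clean up after itself.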

REBOL interface to GnuCash: Working with the GnuCash XML


Now to take the file we created and start messing with it.

I found an option to turn off compressing the data file for GnuCash, so I want to focus on parsing the XML.

There are a few REBOL scripts that work with XML; I chose xml-parse.r. Initially I used its built-in block parsing methods to turn the XML into REBOL blocks that I could work with, but I decided that parsing the entire document into blocks first, just so I could try random REBOL series! functions against it, would be a huge waste. So I ended up writing a gnucash handler that looks for specific elements as they’re parsed and saves the child nodes of each as key/value pairs.

As a first try I want to supply the source XML and a request/response block!:
>> gnucash/list read %2011.xml ["gnc:transaction" []]

rebol [
  Title: "Process GnuCash data file"
  File: "%parse-gc-xml.r"
  Date:  13-Dec-2012
  Purpose: {
    interface to gnucash data
  }
]

do %xml-parse.r
gnucash: make xml-parse/parser [
    subjects: copy []
    content: copy ""
    row: none
    set-namespace-aware false

    list: func [ gc-xml [string!] subjects [block!] ][
        self/subjects: copy subjects
        parse-xml gc-xml
        return self/subjects
    ]
	
    handler: make xml-parse/xml-parse-handler [

        characters: func [ characters [string! none!] ][
            if all [ 
                found? row 
                found? characters 
            ][
                append content characters
            ]
        ]
        start-element: func [
            ns-uri [string! none!]
            local-name [string! none!] q-name [string!]
            attr-list [block!]
        ][
            clear head content
            if found? rowset: select subjects q-name [ 
                append/only rowset row: copy []
            ]
        ]
        end-element: func [
            ns-uri [string! none!]
            local-name [string! none!] q-name [string! ]
        ][
            if found? row [ 
                either select subjects q-name 
                    [ row: none ][ repend row [q-name copy content] ]
            ]
        ]
    ] ; handler
] ; gnucash

Try it out.
>> do %parse-gc-xml.r
>> trns: gnucash/list read %../2011.xml ["gnc:transaction" []]
== ["gnc:transaction" [["trn:id" "a1ddec4b4fe26e84745a8cddac018620" "cmdty:space" "ISO4217" "cmdty:id" "USD" "trn:currency" "" "ts:…
>> trns/1
== "gnc:transaction"
>> trns/2
== [["trn:id" "a1ddec4b4fe26e84745a8cddac018620" "cmdty:space" "ISO4217" "cmdty:id" "USD" "trn:currency" "" "ts:date" "2011-01-03 0…
>> trns/2/1
== ["trn:id" "a1ddec4b4fe26e84745a8cddac018620" "cmdty:space" "ISO4217" "cmdty:id" "USD" "trn:currency" "" "ts:date" "2011-01-03 00…
>>
>> trns/2/2
== ["trn:id" "1849b854cd0ad3fbbddc94f3d661672a" "cmdty:space" "ISO4217" "cmdty:id" "USD" "trn:currency" "" "ts:date" "2011-05-20 00…

So we can think of trns as having “rows” of gnc:transaction attributes.  Precisely, its second entry is the returned rowset.  If I want the date from the second transaction, I can.

>> select trns/2/2 "ts:date"
== "2011-05-20 00:00:00 -0600"

We can collect multiple rowsets with one call by adding additional element/block! pairs in the argument.

>> result: gnucash/list read %../2011.xml ["gnc:commodity" [] "gnc:account" []]
== ["gnc:commodity" [["cmdty:space" "ISO4217" "cmdty:id" "USD" "cmdty:get_quotes" "" "cmdty:quote_source" "currency" "cmdty:quote_t…
>> acts: find result "gnc:account"
== ["gnc:account" [["act:name" "Root Account" "act:id" "6197eee7c4c8c51c914cd3aa4114ef44" "act:type" "ROOT"] ["act:name" "Expenses"…
>> acts/1
== "gnc:account"
>> select acts/2/1 "act:name"
== "Root Account"

REBOL interface to GnuCash: Uncompressing the data


I first saw REBOL in about 1999 and immediately thought it was about the bitchinest thing since function pointers let me implement polymorphism in C (and assembler).  I am going to try once again, now that I have some free time, to learn it.  As part of a much bigger idea, I want to work with GnuCash files.

GnuCash (I am using version 2.4.8) puts all its data in a compressed file.  I used 7-Zip to check it out and that’s what I’ll use to uncompress it, since I couldn’t get the native REBOL scripts I found to work — but that’s OK, because I can call 7-Zip’s command line version from REBOL.

(Check out how to install on your machine on the REBOL site, please.)

And especially, before I start, I have frankly struggled with this language in the past.  I’ve read that REBOL can be harder to learn if you have a lot of traditional programming experience.

The script should let me select the GnuCash compressed data file to be used as 7-zip’s input and show me the uncompressed data.

[Screenshot: file request dialog asking for the GnuCash input file to uncompress]

rebol [
    Title: "Process GnuCash data file"
    File: "%rebcash.r"
    Date: 1-Dec-2012
    Purpose: {
        interface to gnucash data
    }
]

if gc-source-file:
    request-file/title/keep/only/filter
    "Select GnuCash File" "OK" "*.gc" [
    call/console reform [ 
        "C:\Program Files\7-Zip\7z.exe" 'e '-tgzip '-so '-y
            rejoin [ {"} to-local-file clean-path gc-source-file {"} ]
    ]
]

The -so option sends output to stdout.

The /console refinement on call lets me see its output in the REBOL interpreter.

The REBOL rejoin function evaluates the terms in the block and returns a concatenated string of results.  Some magic file system functions are used to turn the REBOL file! type that was returned by request-file into a path the host can be happy with.

So for a simple scripting language, that’s great. Here’s what starts dumping to my screen:

[Screenshot: output from 7-Zip in the console window]

What Happened to my ID?


So far, User looks like

class User {
   public $id;
   public $DistCenter;
   function __construct($id) {
      $this->id = $id;
      Foobar::init($this);
   }
}

Now Foobar is unsetting $id, which could be fixed by swapping the lines in User’s constructor. But I don’t feel comfortable that this is the only public property I’m going to want to protect and I think the programmer should explicitly declare which properties should be handled by Foobar. I also know I want to use PHPDoc in a disciplined way to make these classes self describing.

Let’s create one for User::DistCenter.

/**
 * Distribution Center
 *
 * An object reference to $this User's distribution center object
 *
 * @foo
 * @var DistCenter
 */

The first line, the short description, will be used for labels and error messages. @var is a standard tag and the type DistCenter will be our class name for distribution centers. @foo is going to be our special tag.
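
Here is a minimal sketch of how a marker tag like @foo could be used to pick out only the explicitly declared properties. SketchAccount and markedProperties are hypothetical names for this example; the Reflection calls are the standard ones. It assumes docblocks are kept at runtime, which can be turned off when opcache.save_comments is disabled.

```php
<?php
class SketchAccount   // hypothetical client class
{
    public $id;

    /**
     * Distribution Center
     *
     * @foo
     * @var DistCenter
     */
    public $DistCenter;
}

// Returns the names of public properties whose docblock contains $tag.
function markedProperties($object, $tag = '@foo')
{
    $names = [];
    $class = new ReflectionClass($object);
    foreach ($class->getProperties(ReflectionProperty::IS_PUBLIC) as $property) {
        $doc = $property->getDocComment();   // false when there is no docblock
        if ($doc !== false && strpos($doc, $tag) !== false) {
            $names[] = $property->getName();
        }
    }
    return $names;
}

$marked = markedProperties(new SketchAccount());
print_r($marked);
```

Since $id has no docblock, it is left alone. Note that the plain strpos() check would also match a longer tag such as @foobar, so a stricter match may be worth it later.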

What Am I Modeling?


One reason I like working on legacy systems is that the current intent can be pretty much gleaned from the code and its behavior.  You have to assume it’s working for the most part and when you run into things that look in error, you determine whether there’s a non-obvious reason for it being that way, or it gets confirmed as a problem.  I’m going to use an example business model, but keep in mind that eventually we move all the generic functionality away from the business domain.  I think the example classes might be interesting to look at also.

The classes I was pretty clear on were regions, distribution centers and users.  While there were separate data access modules and service layer modules, cohesion in the latter, a layer that should have been using powerful classes to do all the repetitive logic, was non-existent.  There was absolutely no meaningful state in these classes, they were just a loose collection of related functions.  Each function repeated many of the same data access.  With more than one data method to do the same thing, I had to go check the arguments each time and make sure I was using the best available method.  Terribly granular code, which made people take shortcuts and make assumptions where it was convenient.  Watching the data access log, it was also clear we queried for the same data items many times during a single page load.  I’ve always believed that if you make the common things very easy to do, then I can focus on making my application code really clear and robust.

The 3 entities I listed are in a container relationship: Users are in a single DistCenter and a DistCenter is in a Region.  Starting with a User, instead of explicitly fetching a row in every method that needs to reference the user’s distribution center, I asked something like: “why can’t I instantiate a User with the id parameter and reference $myUser->DistCenter ?”

Let’s start with User, then.

class User {
   public $id;    // user's id and key everywhere
   public $DistCenter;    // reference to $this User's Center
   function __construct($id) {
      $this->id = $id;
   }
   function __get($p) {
   }
   function __set($p, $v) {
   }
}

But when I write

$User = new User($id);
$Center = $User->DistCenter;

__get doesn’t execute. I want my properties defined explicitly in the classes, but how can I make __get think they’re undefined? After trying null and other tricks, I found that I could unset() them during construction and get my desired result.

class User {
...
   function __construct($id) {
      $this->id = $id;
      Foobar::init($this);
   }
}

class __pm {
   /**
    * Unsets the indicated properties in client class
    *
    * @param  object  $O
    */
   static function init($O) {
      $classType = get_class($O);
      $oReflectClass = new ReflectionClass($classType);

      // For each public property
      foreach ($oReflectClass->getProperties(ReflectionProperty::IS_PUBLIC) as $oReflectProperty) {
         $pName = $oReflectProperty->getName();
         unset($O->{$pName});
      }
   }
}

__pm is the new class to automate all the intelligent property handling. We will have to modify User’s inherent definition a little as we go along to let __pm work, and those changes will become the pattern for any class.
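
The unset() trick is easy to check on its own. This is a minimal sketch, assuming a hypothetical LazyUser class with a stand-in loadCenter() method in place of a real database lookup. It also defines no __set, which is what lets the assignment inside __get write the property directly.

```php
<?php
// Hypothetical demo class; LazyUser and loadCenter() are illustrative names.
class LazyUser
{
    public $id;
    public $DistCenter;   // declared so IDEs and generated docs can see it
    public $loads = 0;    // counts how often the "expensive" fetch runs

    public function __construct($id)
    {
        $this->id = $id;
        unset($this->DistCenter);   // makes the next read fall through to __get
    }

    public function __get($name)
    {
        if ($name === 'DistCenter') {
            $this->loads++;
            // Assigning here re-creates the property, so later reads bypass __get.
            return $this->DistCenter = $this->loadCenter();
        }
        return null;
    }

    private function loadCenter()
    {
        return "center-for-user-{$this->id}";   // stand-in for a DB lookup
    }
}

$user = new LazyUser(7);
echo $user->DistCenter, "\n";
echo $user->DistCenter, "\n";
echo $user->loads, "\n";   // 1: the second read never reached __get
```

The property stays declared in the class, yet the first read still goes through __get, which is the behavior I was after.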

Writing a Powerful PHP Property Manager Class


I’m going to recreate the process I went through in writing a side class that automates everything I could think of to intelligently support powerful business objects.  I think it does an awesome job of balancing performance with “language conformity” — the idea that any specialized library or framework should not require you to configure anything to make it work.  You shouldn’t have to “add this URL to that array” and “put the file in this particular directory”, etc., type of disjointedness.  No black box stuff, although sometimes using PHP interceptor functions, along with inheritance, can seem like magic.

“The only way to get a practical and high performing product is to fully exploit the available technology.”

After working with PHP for about 6 months in a generic “framework” (It was homegrown, but just like other popular frameworks in its organization, it assumed a non-business aware stance.  While I am 100% behind generic implementation, it is clear that this type of thinking as a system structure leads to a non-cohesive codebase), I decided as a way to learn OO PHP, I would try to write some classes that represented the business’ entities and hide a lot of the repeated data access in them.

Obviously I also wanted some way to only access the DB when necessary and to cache results during the server trip as object state.  I also wanted centralized validation.  Having each new method define its own validation rules, the way we were doing it, seemed insane to me.  The set of types to validate was finite, so why keep defining what a “retailer_id” must look like, right?  And with the rules being repeated and scattered, inconsistencies were guaranteed.

As bonus goals, I wanted auto-complete functionality to be available in supporting IDEs and I wanted to be able to generate detailed class documentation that really gave a picture of the business domain.

What I’m going to try here was originally implemented within the base class of all the business classes. I realized to make it useful it needed to be in an auxiliary class, but had only partially implemented it that way (but with a lot of new and improved features like memcache and validation built in). It was originally written using PHP 5.1.something, I think, so there might be more refinements that would make the interface cleaner. I always kept the magic to a minimum and only used Reflection for initialization (unlike Yii, for instance, which relies heavily on it — I have to think that’s not cheap).
