This time we’re going through our GnuCash data again, but without trying to build anything generic. Instead we’re going to set REBOL words to the values we mine and see how that works as an automatic dereferencing setup. REBOL has a lot in common with Lisp, or so I understand, and I took a cue from Paul Graham:
Lisp’s symbol type is useful in manipulating databases of words, because it lets you test for equality by just comparing pointers. (If you find yourself using checksums to identify words, it’s symbols you mean to be using.)
We’re going to parse the XML with the statement
gnucash/parse-xml read %2011.xml
So let’s look at this incarnation of the parser/handler:
gnucash: make xml-parse/parser [
    handler: make xml-parse/xml-parse-handler [
        ; text collected for the element currently being read
        content: copy ""
        characters: func [characters [string! none!] /local trimmedContent][
            if all [found? characters not empty? (trimmedContent: trim characters)][
                append content trimmedContent
            ]
        ]
        name: class: description: ""
        guid: parent: amount: act: none
        ; element name -> block of actions to run when that element closes
        end-states: [
            "act:id" [guid: to word! content]
            "act:name" [name: copy content]
            "act:type" [class: copy content]
            "act:parent" [parent: to word! content]
            "gnc:account" [set guid reduce [name class parent]]
        ]
        end-element: func [
            ns-uri [string! none!] ns-prefix [string! none!]
            local-name [string! none!] q-name [string!]
        ][
            switch q-name end-states
            clear head content
        ]
    ] ; handler
] ; gnucash
It’s a lot shorter. Granted, it’s specifically coded for certain elements, but it should be apparent from the definition of end-states how we could build it dynamically. end-states is just a block! that end-element uses as an argument to its switch function. And because of the schema, I can process everything in the end-element function. In other words I know the other elements in my end-states are all inside gnc:account. As end-element encounters any of the element names listed in end-states, it performs the corresponding block of actions. I used the fully qualified name, because many local-names are reused among GnuCash’s namespaces.
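To make the "build it dynamically" remark concrete, here is a minimal sketch of generating an end-states style block from a list of element names. The fields block and its layout are my own assumptions for illustration, and this runs standalone rather than inside the xml-parse handler.

```rebol
; Hypothetical mapping: element q-name followed by the word that
; should receive a copy of the collected content.
fields: ["act:name" name "act:type" class]

content: copy "Taxable"
name: class: none

; Build ["act:name" [name: copy content] ...] from the mapping.
end-states: copy []
foreach [q-name field] fields [
    append end-states q-name
    append/only end-states compose [(to set-word! field) copy content]
]

probe end-states

; Dispatch the same way end-element does.
switch "act:name" end-states
probe name
```

The generated block has the same shape as the hand-written one, so switch can use it unchanged.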
BTW, I really like how this works:
if all [found? characters not empty? (trimmedContent: trim characters)]
We make sure we have a string and then trim whitespace, appending if there’s something left over. Compare to your favorite language.
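Here is the same idiom pulled out into a small standalone function, as a sketch. The name keep-text is hypothetical, and I trim a copy so the caller’s string is left alone, which the handler above does not bother to do.

```rebol
; Returns the trimmed text, or none when the input is none or
; contains only whitespace. all stops at the first false or none.
keep-text: func [text [string! none!] /local trimmed][
    all [found? text not empty? (trimmed: trim copy text) trimmed]
]

probe keep-text none          ; none
probe keep-text "   "         ; none
probe keep-text "  Taxable "  ; "Taxable"
```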
For the string! values, I have to copy the content; otherwise every saved value would refer to the same string, which keeps changing as the parser reuses it. But for word! values I didn’t bother with copy; my deduction is that to word! gives me a new value of type word! rather than a reference to the string. And when we reach the end of “gnc:account” we know we have everything, so we take the word! made from this account’s guid and set it to a block of the account’s values with:
set guid reduce [name class parent]
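The reason for copy can be seen in isolation with a small sketch. The word names shared and saved are made up for the demonstration.

```rebol
content: copy "Taxable"

shared: content       ; another reference to the same string
saved: copy content   ; an independent copy

; What the handler does between elements: reuse the buffer in place.
clear head content
append content "INCOME"

probe shared  ; "INCOME"
probe saved   ; "Taxable"
```

Without copy, every account field would end up pointing at whatever the buffer held last.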
If I pick one of the account guids from my file and probe for it, I can see it’s been set in REBOL:
>> probe get to word! "2775e139d4404298cf73e6316db71cbd"
["Taxable" "INCOME" 921eddc8eb3349d0a818ef6e52417b81]
(We have to use to word! and enter the value as a string so REBOL’s console doesn’t try to interpret it as some other type just based on the characters.)
Where it gets interesting is if I print the value in a way that evaluates the block first, such as remold, which reduces the block before molding it. The guid “parent” reference (3rd item in the block) gets dereferenced for me automatically. It’s a word and it has a value.
>> print remold get to word! "2775e139d4404298cf73e6316db71cbd"
["Taxable" "INCOME" ["Investment" "INCOME" e84eba6fde334896d99a24c62d1162d3]]
I think that’s pretty neat.
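The same effect can be reproduced without the GnuCash file. This is a minimal sketch in which the short words child-acct, parent-acct and root-acct are hypothetical stand-ins for the real guids.

```rebol
; The parent's block still holds a plain word for its own parent.
set 'parent-acct ["Investment" "INCOME" root-acct]
set 'child-acct reduce ["Taxable" "INCOME" 'parent-acct]

; probe shows the stored block as is: the third item is just a word.
probe get 'child-acct
; → ["Taxable" "INCOME" parent-acct]

; remold reduces the block first, so the word is replaced by its value.
print remold get 'child-acct
; → ["Taxable" "INCOME" ["Investment" "INCOME" root-acct]]
```

Note that only one level is followed: the parent’s own parent word stays a word, because reduce evaluates the items of the outer block and not the contents of the block it finds there.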