Now to take the file we created and start messing with it.
I found an option to turn off compressing the data file for GnuCash, so I want to focus on parsing the XML.
There are a few REBOL scripts that work with XML; I chose xml-parse.r. Initially I used its built-in block parsing methods to turn the XML into REBOL blocks that I could work with, but I decided that parsing the entire document into blocks first, just so I could try random REBOL series! functions against it, would be a huge waste. So I ended up writing a gnucash handler that looks for specific elements as they’re parsed and saves the child nodes of each as key/value pairs.
As a first try I want to supply the source XML and a request/response block!:
>> gnucash/list read %2011.xml ["gnc:transaction" []]
rebol [
    Title: "Process GnuCash data file"
    File: %parse-gc-xml.r
    Date: 13-Dec-2012
    Purpose: {
        interface to gnucash data
    }
]

do %xml-parse.r

gnucash: make xml-parse/parser [
    subjects: copy []   ; request/response block: element name, then its rowset
    content: copy ""    ; character data collected for the current element
    row: none           ; row being filled, or none when outside a subject element

    set-namespace-aware false

    list: func [ gc-xml [string!] subjects [block!] ][
        self/subjects: copy subjects
        parse-xml gc-xml
        return self/subjects
    ]

    handler: make xml-parse/xml-parse-handler [
        ; Collect text only while we are inside a subject element.
        characters: func [ characters [string! none!] ][
            if all [
                found? row
                found? characters
            ][
                append content characters
            ]
        ]

        ; Reset the text buffer; open a new row when a subject element starts.
        start-element: func [
            ns-uri [string! none!]
            local-name [string! none!] q-name [string!]
            attr-list [block!]
            /local rowset
        ][
            clear head content
            if found? rowset: select subjects q-name [
                append/only rowset row: copy []
            ]
        ]

        ; Close the row at the end of a subject element;
        ; otherwise store the child element as a key/value pair.
        end-element: func [
            ns-uri [string! none!]
            local-name [string! none!] q-name [string!]
        ][
            if found? row [
                either select subjects q-name
                    [ row: none ][ repend row [q-name copy content] ]
            ]
        ]
    ] ; handler
] ; gnucash
Try it out.
>> do %parse-gc-xml.r
>> trns: gnucash/list read %../2011.xml ["gnc:transaction" []]
== ["gnc:transaction" [["trn:id" "a1ddec4b4fe26e84745a8cddac018620" "cmdty:space" "ISO4217" "cmdty:id" "USD" "trn:currency" "" "ts:…
>> trns/1
== "gnc:transaction"
>> trns/2
== [["trn:id" "a1ddec4b4fe26e84745a8cddac018620" "cmdty:space" "ISO4217" "cmdty:id" "USD" "trn:currency" "" "ts:date" "2011-01-03 0…
>> trns/2/1
== ["trn:id" "a1ddec4b4fe26e84745a8cddac018620" "cmdty:space" "ISO4217" "cmdty:id" "USD" "trn:currency" "" "ts:date" "2011-01-03 00…
>>
>> trns/2/2
== ["trn:id" "1849b854cd0ad3fbbddc94f3d661672a" "cmdty:space" "ISO4217" "cmdty:id" "USD" "trn:currency" "" "ts:date" "2011-05-20 00…
So we can think of trns as holding “rows” of gnc:transaction child values. More precisely, its second entry is the returned rowset, and each row in it is a flat block of key/value pairs. If I want the date from the second transaction, I can select it by key.
>> select trns/2/2 "ts:date"
== "2011-05-20 00:00:00 -0600"
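The same lookup extends to every row. Here is a minimal sketch, assuming a rowset shaped like the one above; the sample ids and dates are made up so the snippet runs on its own without the data file. Note that select returns the value after the first matching key, so a key that occurs more than once in a row yields only its first value.

```rebol
; Hypothetical sample in the same shape as the result of gnucash/list.
; The ids and dates are placeholders, not values from the real file.
trns: [
    "gnc:transaction" [
        ["trn:id" "id-1" "ts:date" "2011-01-03"]
        ["trn:id" "id-2" "ts:date" "2011-05-20"]
    ]
]

; Collect the first ts:date value of every transaction row.
dates: copy []
foreach row select trns "gnc:transaction" [
    append dates select row "ts:date"
]
probe dates
```

Using select trns "gnc:transaction" instead of trns/2 keeps the loop working when the result holds more than one rowset.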
We can collect multiple rowsets with one call by adding additional element/block! pairs in the argument.
>> result: gnucash/list read %../2011.xml ["gnc:commodity" [] "gnc:account" []]
== ["gnc:commodity" [["cmdty:space" "ISO4217" "cmdty:id" "USD" "cmdty:get_quotes" "" "cmdty:quote_source" "currency" "cmdty:quote_t…
>> acts: find result "gnc:account"
== ["gnc:account" [["act:name" "Root Account" "act:id" "6197eee7c4c8c51c914cd3aa4114ef44" "act:type" "ROOT"] ["act:name" "Expenses"…
>> acts/1
== "gnc:account"
>> select acts/2/1 "act:name"
== "Root Account"
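Since the result alternates element names and rowsets, it can also be walked two values at a time. The following is only a sketch under assumptions: the sample data is made up (same shape as the output above), and find-row is a hypothetical helper, not part of xml-parse.r or the gnucash object.

```rebol
; Hypothetical sample in the same shape as the result above; ids are placeholders.
result: [
    "gnc:commodity" [
        ["cmdty:space" "ISO4217" "cmdty:id" "USD"]
    ]
    "gnc:account" [
        ["act:name" "Root Account" "act:id" "id-root" "act:type" "ROOT"]
        ["act:name" "Expenses" "act:id" "id-exp" "act:type" "EXPENSE"]
    ]
]

; Walk the result two values at a time: element name, then its rowset.
foreach [name rows] result [
    print [name "has" length? rows "row(s)"]
]

; find-row (hypothetical helper): return the first row whose key holds
; the given value, or none when no row matches.
find-row: func [rows [block!] key [string!] value [string!]] [
    foreach row rows [
        if value = select row key [return row]
    ]
    none
]

expenses: find-row select result "gnc:account" "act:name" "Expenses"
print select expenses "act:type"
```

A linear scan like this is fine for a few hundred rows; for larger files it may be worth indexing the rows by id once instead.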