Tech Is Hard

Credibility = Talent x Years of experience + Proven hardcore accomplishment

Category Archives: sample code

How to Easily and Consistently Round Up with Integer Division


I’ve encountered this in different languages for different purposes.  The first time was some assembler code that calculated how many total bytes it would take to hold an arbitrary number of bits.  The problem is it’s usually coded the way we think about the problem: divide, then if there’s a remainder, add 1 to the quotient.

A quick, easy way to avoid the special check is to add (divisor - 1) to the dividend before doing the division.  In the bits-to-bytes example, I changed the code to do it this way:

howManyBytes = (howManyBits + 7)/8

instead of this way:

howManyBytes = howManyBits/8
if (howManyBytes*8 < howManyBits)
  howManyBytes ++

For a non-negative dividend and a positive divisor, you’re guaranteed to get the rounded-up number without any extra test or branch.
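Here is a minimal sketch of both versions in JavaScript rather than the original assembler. The names ceilDiv and ceilDivNaive are made up for this example, and Math.floor stands in for integer division, which JavaScript does not have as an operator. It assumes a non-negative integer dividend and a positive integer divisor.

```javascript
// Hypothetical helper names; assumes n >= 0 and d > 0, both integers.
function ceilDiv(n, d) {
  // Adding (d - 1) pushes any nonzero remainder past the next multiple of d.
  return Math.floor((n + d - 1) / d);
}

// The "divide, then patch up" version, for comparison.
function ceilDivNaive(n, d) {
  let q = Math.floor(n / d);
  if (q * d < n) q++;
  return q;
}

const howManyBytes = ceilDiv(13, 8); // 13 bits need 2 bytes
```

One thing to watch in languages with fixed-width integers: the addition can overflow when the dividend is already close to the maximum value, so the trick is safest when the inputs are known to be well inside the range.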


Re-key Your PHP Array with array_reduce


How much PHP code is dedicated to looping through an array of arrays (rows), comparing a search value against each row’s “column” of that name? Tons. What about re-indexing the array using the search column, so we can directly access the rows by search value?

Here’s a reducing function:

function keyBy ($reduced, $row) {
    $keyname = $reduced ['key'];
    $reduced ['data'] [$row [$keyname]] = $row;
    return $reduced;
}

The most important thing to keep in mind is the desire to obtain one value accumulated from an array. Inside keyBy, $reduced holds that value. The reduction works by accepting the currently reduced value and returning it with any updates you want. keyBy is extremely powerful because it will let me re-index an array of arrays using any column. Since an initial value for $reduced is required, I decided to make use of that argument to pass an arbitrary key name. To prevent any clashes with the data keys, I separated ‘key’ from ‘data’ in $reduced.

To “reduce” an array of rows into a direct-access array, I call keyBy by passing it to array_reduce, with the initial argument indicating which key to index by.

$customers = array (
	array ('id'    => 121, 'first' => 'Jane',	'last'  => 'Refrain'),
	array ('id'    => 290, 'first' => 'Bill',	'last'  => 'Swizzle'),
	array ('id'    => 001, 'first' => 'Fred',	'last'  => 'Founder')
);

$customersByLastName = array_reduce ($customers, "keyBy", array ('key' => 'last'));

print_r ($customersByLastName);
print $customersByLastName ['data']['Founder']['first'];

Array (
  [key] => last
  [data] => Array (
    [Refrain] => Array (
      [id] => 121
      [first] => Jane
      [last] => Refrain)

    [Swizzle] => Array (
      [id] => 290
      [first] => Bill
      [last] => Swizzle)

    [Founder] => Array (
      [id] => 1
      [first] => Fred
      [last] => Founder)
  )
)

Fred

Isn’t that an incredibly small amount of code that gets rid of a lot of code? If the array is searched more than once, it quickly becomes extremely efficient.
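The same idea carries over to other languages that have a reduce. This is only a sketch in JavaScript, not a port of the PHP above: the function name keyBy is reused for convenience, and the key name is captured in a closure instead of being smuggled in through the initial value, so the result needs no separate 'key' and 'data' parts.

```javascript
// Hypothetical JavaScript counterpart of keyBy; the key name lives in the
// closure rather than inside the accumulator.
function keyBy(rows, keyName) {
  return rows.reduce((byKey, row) => {
    byKey[row[keyName]] = row; // a later row with the same key overwrites an earlier one
    return byKey;
  }, {});
}

const customers = [
  { id: 121, first: "Jane", last: "Refrain" },
  { id: 290, first: "Bill", last: "Swizzle" },
  { id: 1, first: "Fred", last: "Founder" },
];

const customersByLastName = keyBy(customers, "last");
const founderFirst = customersByLastName["Founder"].first; // "Fred"
```

As in the PHP version, rows that share a key value collapse to the last one seen, so this only makes sense for columns that are unique.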

World’s Most Efficient Implementation of Discrete Timers in PHP


Well it may not be THE most efficient, but it’s pretty close. I’ve seen a lot of timer code in my time, and it always looks too elaborate for what it needs to do. After all, if I’m using timers then I’m probably very concerned about how long things take — I don’t want to add much overhead to track it. I came up with this code during an important project to allow me to record how long certain functionality was taking.

I wanted something that would be as simple and lightweight as possible. In order to maintain discrete timers that can’t interfere with each other, many timer implementations require instantiating a new instance of some class. Instead, I opted for a static array in my timer function, with the index of the array as a unique timer “name”; this serves the same purpose. With this function, I can print out how long the overall script has been running, at any time. I can create a new timer (which also starts it running), print how long since that timer’s been running and optionally keep the timer’s value or reset it each time I access its current value.

We need an empty static array to hold timers. The first step is to get the current system time. By passing true to microtime(), we get a float representing the number of seconds for the current time. When no timer name (the index in our static array) is passed in, we just need to subtract the script start time from current and return that.

If there is a timer name passed, that’s when things get creative. We have to get the time value from the last call with this timer, or null if the timer hasn’t been used before. If we’re initializing the timer, set it to the current system time and return 0.0 (to force the type) for its value. If the timer exists and we’re resetting it (which is the default), set it to the current system time.

Finally we return the difference between now and the last call.

I should note that when using elapsed() for the script’s overall execution time, the value is returned in seconds, otherwise the return value is milliseconds (prevents having extremely small numeric values).

/**
* Return the elapsed milliseconds since the timer was last reset.
*
* If a timer "name" is not specified, then returns the elapsed time since the
* current script started execution. This script timer can not be "reset".
* THIS DEFAULT SCRIPT TIMER RETURNS IN SECONDS
*
* Always "resets" the specified timer unless you specify false for the reset
* parameter. Of course, on the first call for a particular timer, it will always
* reset to the time of the call.
*
* examples:
* elapsed(__FUNCTION__); // using __FUNCTION__ or __METHOD__ makes it easy to be unique
* ...
* ...
* echo "It took " . elapsed(__FUNCTION__)/1000 . " seconds.";
*
* @param string $sTname Name of the timer
* @param boolean $bReset Do you want to reset this timer to 0?
* @return float Elapsed time since timer was last reset
*/
function elapsed($sTname = null, $bReset = true) {

    static $fTimers = array(); // To hold "now" from previous call
    $fNow = microtime(true); // Get "now" in seconds as a float

    if (is_null($sTname)) // REQUEST_TIME has whole-second resolution
        return ($fNow - $_SERVER['REQUEST_TIME']);

    $fThen = isset($fTimers[$sTname]) ? $fTimers[$sTname] : null; // Copy over the start time, so we can update to "now"

    if (is_null($fThen) || $bReset) {
        $fTimers[$sTname] = $fNow;
        if (is_null($fThen))
            return 0.0;
    }
    return 1000 * ($fNow - $fThen);
}

printf (
    "Since script started %f Create 2 new timers 'fred' %f and 'alice' %f\n", 
    elapsed(), elapsed('fred'), elapsed('alice'));

for ($i = 0; $i < 100; $i++) {
    printf (
        "Since script started %f 'fred' gets reset %f, but 'alice' doesn't %f\n", 
        elapsed(), elapsed('fred'), elapsed('alice', false));
}

Since script started 0.391466 Create 2 new timers ‘fred’ 0.000000 and ‘alice’ 0.000000
Since script started 0.391521 ‘fred’ gets reset 0.041962, but ‘alice’ doesn’t 0.037909
Since script started 0.391545 ‘fred’ gets reset 0.021935, but ‘alice’ doesn’t 0.056982
Since script started 0.391563 ‘fred’ gets reset 0.019073, but ‘alice’ doesn’t 0.075102
Since script started 0.391578 ‘fred’ gets reset 0.015020, but ‘alice’ doesn’t 0.090122
Since script started 0.391594 ‘fred’ gets reset 0.015020, but ‘alice’ doesn’t 0.104904
Since script started 0.391609 ‘fred’ gets reset 0.015974, but ‘alice’ doesn’t 0.119925
Since script started 0.391624 ‘fred’ gets reset 0.015020, but ‘alice’ doesn’t 0.134945
Since script started 0.391639 ‘fred’ gets reset 0.015020, but ‘alice’ doesn’t 0.149965
Since script started 0.391654 ‘fred’ gets reset 0.015020, but ‘alice’ doesn’t 0.164986
Since script started 0.391669 ‘fred’ gets reset 0.014782, but ‘alice’ doesn’t 0.180006
Since script started 0.391684 ‘fred’ gets reset 0.015020, but ‘alice’ doesn’t 0.195980
Since script started 0.391699 ‘fred’ gets reset 0.015020, but ‘alice’ doesn’t 0.211000

I think it’s convenient to use a pattern like

function foobar () {
    elapsed (__METHOD__);
    // lots of code
    $timeused = elapsed (__METHOD__);
}
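The table-of-named-timers idea is not tied to PHP. Below is a rough JavaScript sketch of the same logic, under a few assumptions of mine: makeElapsed is a made-up factory name, the Map inside the closure plays the part of the static array, and the clock is passed in so the behaviour can be checked with a fake one. Like the PHP function, the unnamed timer reports seconds and named timers report milliseconds.

```javascript
// Hypothetical sketch: makeElapsed and its injectable clock are not from the post.
// `now` must return the current time in milliseconds.
function makeElapsed(now = () => Date.now()) {
  const started = now();    // stands in for the script start time
  const timers = new Map(); // timer name -> time of the last reset

  return function elapsed(name, reset = true) {
    const current = now();
    if (name === undefined) return (current - started) / 1000; // seconds

    const then = timers.get(name);
    if (then === undefined || reset) timers.set(name, current);
    if (then === undefined) return 0; // the first call only starts the timer
    return current - then;            // milliseconds since the last reset
  };
}
```

Calling const elapsed = makeElapsed(); once at startup gives a function that can be used the same way as the pattern above.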

Powerful jQuery Shorthand to Move Data Into DOM


I realize this is basic, though not obvious, jQuery stuff, but it’s powerful in the amount of work that can be done.  One could devise a fairly elaborate convention to make rendering driven largely by the markup.  Assume some JSON object like

var myJson = { "name": "Grant", "age": "ancient", "position": "seated" };

and elements like

 
<p myns:field="name"></p>
<p myns:field="age"></p>
<p myns:field="position"></p>

With this jQuery call I can set the text of all the elements without any knowledge of the model’s contents.

    $('[myns\\:field]') .text(function(idx, txt) {
      return myJson [$(this) .attr ('myns:field')];
    });

There’s a lot of potential; I’ll have to see if it’s worth it under varied scenarios.
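For comparison, here is a sketch of the same convention without jQuery. The helper name fillFields is invented for this example, and it only relies on each element having getAttribute() and textContent, so it can be tried against plain stand-in objects as well as real DOM nodes. Unlike the jQuery call, it leaves an element untouched when the model has no matching field.

```javascript
// Hypothetical helper: works on anything exposing getAttribute() and textContent,
// so it can run against real DOM elements or plain stand-ins.
function fillFields(elements, model, attrName) {
  for (const el of elements) {
    const field = el.getAttribute(attrName);
    // Leave the element alone when the model has no such field.
    if (Object.prototype.hasOwnProperty.call(model, field)) {
      el.textContent = String(model[field]);
    }
  }
}
```

In a browser, something like fillFields(document.querySelectorAll('[myns\\:field]'), myJson, 'myns:field') should do the same job as the jQuery call above.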

Data-driving the Asynchronous Mongoose Updates in my Node Code


This was the first experiment to be able to ask for a sequence or set of Mongoose updates.  I wanted to be able to set up dependent and independent updates.  First I dealt with the asynchronous updates.  There is no rollback mechanism or anything, and the operation stops as soon as any of the updates fails, but we don’t get a good response unless they all succeed.  The secret sauce for the asynchronous updates was using Underscore.js’s after() function.

/**
 * Runs parallel, independent updates
 */
var asyncUpdate = function (successMsg, updates, resp) {
  var good = _.after (updates.length, function (doc) { 
    resp (successMsg + " update successful", 200); 
  });
  _.each (updates, function (arg, key, list) {
    arg.model 
      .update (arg.where, arg.update) 
      .run (function (e, doc) {
        if (e)  
          resp (arg.model.modelName + " update error: " + util .inspect (e), 500); 
        else {
          console .log (arg.model.modelName + " update successful");
          good(doc);
        }
      });
  });
};

Then call it with an array of objects of update control information.

 asyncUpdate ('Totally updated', [ 
   { model: Foo,    update: { $set: { field1: 'value1' } },         where: { _id: id } },
   { model: Bar,    update: { $set: { 'array.$.boolthing': true} }, where: { 'array.foo': id, 'array.boolthing': false } },
   { model: FooBar, update: { $set: { 'field': 'val'} },            where: { _id: id } }
 ], _.bind(r.send, r));

Obviously this isn’t very generic and it’s directly responding to HTTP calls in the wrong place, but it’s just an experiment. The net effect of this one is to stop if there’s an error, but to wait until all updates are successful to respond with success.
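One loose end, as far as I can tell from reading the code: if two updates fail, resp would be called once per failure. Here is a library-free sketch of the same countdown with a guard so the caller hears back exactly once. The name runAll and the task shape (a function taking a Node-style callback) are assumptions for the example, not anything from Mongoose or Underscore.

```javascript
// Hypothetical sketch: runAll and the task shape are made up for illustration.
// Each task is a function taking a Node-style callback (err, result).
function runAll(tasks, done) {
  let remaining = tasks.length;
  let finished = false; // guards against calling done more than once

  if (remaining === 0) return done(null);

  tasks.forEach((task) => {
    task((err) => {
      if (finished) return;    // a result arriving after the response is ignored
      if (err) {
        finished = true;
        return done(err);      // first error wins
      }
      if (--remaining === 0) { // the same countdown that _.after provides
        finished = true;
        done(null);
      }
    });
  });
}
```

Note that this still does not cancel updates already in flight; it only keeps the response to a single call.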

XSL 1.0 str-tolower Template


We’ll add a couple variables and a template to the strings.xsl stylesheet.

<!-- translation for lowercase-->
<xsl:variable name="lcletters" select="'abcdefghijklmnopqrstuvwxyz'"/>
<xsl:variable name="ucletters" select="'ABCDEFGHIJKLMNOPQRSTUVWXYZ'"/>

<xsl:template name="str-tolower">
  <xsl:param name="str"/>
  <xsl:value-of select="translate($str, $ucletters, $lcletters)"/>
</xsl:template>

And here’s a new unit test to add to strings.test.xsl:

<xsl:call-template name="assert">
  <xsl:with-param name="expected" select="'@apple1'"/>
  <xsl:with-param name="actual">
    <xsl:call-template name="str-tolower">
      <xsl:with-param name="str" select="'@ApPlE1'"/>
    </xsl:call-template>
  </xsl:with-param>
</xsl:call-template>

Self-Contained Data Tables in XSL Stylesheets


I’m going to share a trick I’ve been using in my stylesheets for years, because every time I do it someone gets amazed.

Lookups often need to be performed during transformation of a document. Let’s use the example of taking month numbers and turning them into the standard 3-character abbreviations. Instead of hard-coding the values in a choose, or reading a separate XML document, add your own working data to the stylesheet with a namespace.

<grant:stuff>
    <grant:month-name month="01">Jan</grant:month-name>
    <grant:month-name month="02">Feb</grant:month-name>
    <grant:month-name month="03">Mar</grant:month-name>
    <grant:month-name month="04">Apr</grant:month-name>
    <grant:month-name month="05">May</grant:month-name>
    <grant:month-name month="06">Jun</grant:month-name>
    <grant:month-name month="07">Jul</grant:month-name>
    <grant:month-name month="08">Aug</grant:month-name>
    <grant:month-name month="09">Sep</grant:month-name>
    <grant:month-name month="10">Oct</grant:month-name>
    <grant:month-name month="11">Nov</grant:month-name>
    <grant:month-name month="12">Dec</grant:month-name>
</grant:stuff>

For this to work inside your XSLT, you have to add a namespace prefix declaration for it:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:grant="https://techishard.wordpress.com"

Then you use the document() function without a document, to get this stylesheet:

<xsl:value-of select="document('')/xsl:stylesheet/grant:stuff/grant:month-name[@month = 10]"/>

This will get you
Oct

Difference ‘tween skip and pick


REBOL series are the foundation of all data structures, so there are a lot of functions to deal with them. Often, under a given circumstance, more than one will do what’s needed. In the REBOL core series documentation, there’s sort of a fluid discussion of functions like first and next. And in another section, a comparison between first/second/third/etc and pick. I was changing values in a series and encountered the distinction between pick and skip. Some functions, like first and pick, are extraction functions, giving the series item as its own thing, while others, like next and skip, are in the context of the series. Often it might not matter, but due to the general applicability of the change function to all types of series!, it did for me. I want to change the first item in a:

>> a: copy ["abc" 123]
== ["abc" 123]
>> a/1
== "abc"
>> change a/1 "b"
== "bc"
>> a
== ["bbc" 123]
>> change pick a 1 "c"
== "bc"
>> a
== ["cbc" 123]
>> change first a "d"
== "bc"
>> a
== ["dbc" 123]
>> change head a "e"
== [123]
>> a
== ["e" 123]
>>

Ah! Until I used head (or next or skip or back), I was referencing the first item itself (originally “abc”), so the change function saw change string! string! rather than change block! string!. Some of the items I need to change are blocks I need to append to:

>> a: copy ["abc" [123]]
== ["abc" [123]]
>> append skip a 1 'foobar
== ["abc" [123] foobar]

Oops, that’s not it. Get rid of that with remove.

>> remove last a
** Script Error: remove expected series argument of type: series port bitset none
** Where: halt-view
** Near: remove last a

Same issue here. Last is giving the word! foobar which isn’t something remove can operate on.

>> remove back tail a
== []
>> a
== ["abc" [123]]
>> append last a 'foobar
== [123 foobar]

There it is. Any of the “extraction” functions will do for the append:

>> a
== ["abc" [123 foobar]]
>> append pick a 2 'foo
== [123 foobar foo]
>> append a/2 'bar
== [123 foobar foo bar]
>> a
== ["abc" [123 foobar foo bar]]
>>

REBOL/GnuCash: Listing the Transactions


With a couple additions to the end-states block!, the end-element function now lists the transactions in my GnuCash file, with the account references (contained in the <trn:splits> element of each transaction) dereferenced:

>> do %gc2iam.r
3-Jan-2011/13:02:43-7:00 Hobby Lobby fix chair pic glass 
[[4106 100] ["Home Repair" "EXPENSE" 3b7ddd7ff3110b9bc94f6425bba8fc83] 
[-4106 100] ["Chase MC" "CREDIT" 74bee4c0f2e95ff570291a9d3ea3a3ba]]
12-Jun-2011/10:19:19-6:00 BEST carpet cleaning 
[[12000 100] ["Home Repair" "EXPENSE" 3b7ddd7ff3110b9bc94f6425bba8fc83] 
[-12000 100] ["Discover" "CREDIT" 74bee4c0f2e95ff570291a9d3ea3a3ba]]
10-Sep-2011/0:46:22-6:00 lunch meeting
[[200 100] ["Parking" "EXPENSE" b0b714a0163f90222c5a6769c78ca791] 
[-200 100] ["Cash" "CASH" 74bee4c0f2e95ff570291a9d3ea3a3ba]]
16-Mar-2011/21:02:47-6:00 Jenny's Market 
[[3255 100] ["Gas" "EXPENSE" b0b714a0163f90222c5a6769c78ca791] 
[-3255 100] ["Discover" "CREDIT" 74bee4c0f2e95ff570291a9d3ea3a3ba]]

Here’s the relevant part of the handler. I added a /local word to end-element, value, for some manipulation of the string that GnuCash stores, which is a fraction. For instance, $328.23 is represented as 32823/100. There is other currency metadata used to make conversions between these amounts and stock prices, for example. Right now I’m just going to be concerned about “normal” transactions like spending and transfers, but I don’t want to lose any precision until I know more, so I split the string using “/” and convert each component to an integer. I changed most uses of to into make; the implication is that make will give me a new item of my desired type, while to technically converts the argument. Remembering the admonition to use copy when initializing from something that will change, I think make must be safer here.

trnAccts: copy []
guid:               name:               class: 
parent:             trnDate:            description:  
none

end-states: [
    "act:id"          [guid:           make word! content]
    "act:name"        [name:                 copy content]
    "act:type"        [class:                copy content]
    "act:parent"      [parent:         make word! content]
    "trn:description" [description:          copy content]
    "ts:date"         [trnDate:        make date! content]
    "split:value"     [
        value: split content #"/" 
        append/only trnAccts reduce [make integer! value/1 make integer! value/2]
    ] 
    "split:account"   [append trnAccts to word! content]

    "gnc:account"     [set guid reduce [name class parent]]
    "gnc:transaction" [
        print [trnDate description remold trnAccts]
        clear head trnAccts
    ]
]
end-element: func [
    ns-uri		[string! none!]  ns-prefix [string! none!]
    local-name	[string! none!]  q-name    [string!]
    /local value
][ 
    switch q-name end-states
    clear head content
]
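The fraction handling in the "split:value" state is easy to picture in other languages too. As a small sketch in JavaScript, assuming the numerator/denominator form described above (parseFraction is a name I made up, not part of GnuCash or REBOL):

```javascript
// Hypothetical helper; assumes the "numerator/denominator" form described above.
function parseFraction(text) {
  const match = /^(-?\d+)\/(\d+)$/.exec(text.trim());
  if (!match) throw new Error("expected numerator/denominator, got: " + text);

  const numerator = Number(match[1]);
  const denominator = Number(match[2]);
  if (denominator === 0) throw new Error("zero denominator in: " + text);

  // Keep the pair as integers so no precision is lost before we know more.
  return [numerator, denominator];
}
```

Keeping the two integers apart, as the REBOL code does, puts off any decision about rounding until the amounts are actually displayed or compared.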

REBOL/GnuCash: using REBOL words


This work is going through our GnuCash data again, but not trying to be anything generic. Instead we’re going to set REBOL words to values that we mine and see how that works as an automatic dereferencing setup. REBOL has a lot in common with Lisp, or so is my understanding of the matter, and I took a cue from Paul Graham:

Lisp’s symbol type is useful in manipulating databases of words, because it lets you test for equality by just comparing pointers. (If you find yourself using checksums to identify words, it’s symbols you mean to be using.)

We’re going to parse the XML with the statement gnucash/parse-xml read %2011.xml. So let’s look at this incarnation of the parser/handler:

gnucash: make xml-parse/parser [
    set-namespace-aware true
    handler: make xml-parse/xml-parse-handler [

        content: copy ""

        characters: func [ characters [string! none!] /local trimmedContent][
            if all [found? characters not empty? (trimmedContent: trim characters)][
                append content trimmedContent
            ]
        ]

        name: class: description: ""
        guid: parent: amount: act: none

        end-states: [
            "act:id"          [guid:   to word! content]
            "act:name"        [name:       copy content]
            "act:type"        [class:      copy content]
            "act:parent"      [parent: to word! content]
            "gnc:account"     [set guid reduce [name class parent]]
        ]

        end-element: func [
	    ns-uri	[string! none!]  ns-prefix [string! none!]
	    local-name	[string! none!]  q-name    [string!]
        ][
            switch q-name end-states
            clear head content
        ]
    ] ; handler
] ; gnucash

It’s a lot shorter. Granted, it’s specifically coded for certain elements, but it should be apparent from the definition of end-states how we could build it dynamically. end-states is just a block! that end-element uses as an argument to its switch function. And because of the schema, I can process everything in the end-element function. In other words I know the other elements in my end-states are all inside gnc:account. As end-element encounters any of the element names listed in end-states, it performs the corresponding block of actions. I used the fully qualified name, because many local-names are reused among GnuCash’s namespaces.

BTW, I really like how this works: if all [found? characters not empty? (trimmedContent: trim characters)]; we make sure we have a string and then trim whitespace, appending if there’s something left over.  Compare to your favorite language.

For the string! values, I have to copy the content, otherwise what’s being referenced keeps changing.  But for word! values I didn’t bother with copy; my deduction is that to word! gives me a copy of the value as a word! type.  And when we reach the end of “gnc:account” we know we have everything, so we set a word! for this account’s guid and set that word! to the value of this account with:
set guid reduce [name class parent]
If I pick one of the account guids from my file and probe for it, I can see it’s been set in REBOL:

>> probe get to word! "2775e139d4404298cf73e6316db71cbd"
["Taxable" "INCOME" 921eddc8eb3349d0a818ef6e52417b81]

(We have to use to word! and enter the value as a string so REBOL’s console doesn’t try to interpret it as some other type just based on the characters.)
Where it gets interesting is if I print the value in certain ways. The guid “parent” reference (3rd item in the block) gets dereferenced for me automatically. It’s a word and it has a value.

>> print remold get to word! "2775e139d4404298cf73e6316db71cbd"
["Taxable" "INCOME" ["Investment" "INCOME" e84eba6fde334896d99a24c62d1162d3]]

I think that’s pretty neat.
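To compare with a language that has no words in the REBOL sense, here is a loose JavaScript analogy, nothing more. A Map stands in for the word table, Symbol.for() hands back the same symbol for the same guid string (so equality is an identity check, in the spirit of the Graham quote), and derefOnce imitates the one-level dereference seen above. All three names (accounts, setAccount, derefOnce) are invented for the sketch.

```javascript
// Hypothetical sketch: a Map plays the role of REBOL's word table, and
// Symbol.for() gives one shared symbol per guid string.
const accounts = new Map();

function setAccount(guid, name, type, parentGuid) {
  const parent = parentGuid ? Symbol.for(parentGuid) : null;
  accounts.set(Symbol.for(guid), [name, type, parent]);
}

// Replace each known symbol in a record with the record it names, one level deep.
function derefOnce(record) {
  return record.map((item) =>
    typeof item === "symbol" && accounts.has(item) ? accounts.get(item) : item
  );
}
```

The difference is that here the dereference has to be asked for explicitly, whereas in REBOL it falls out of how the block is evaluated.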
