Tech Is Hard

Credibility = Talent x Years of experience + Proven hardcore accomplishment


Re-key Your PHP Array with array_reduce


How much PHP code is dedicated to looping through an array of arrays (rows), comparing a search value against each row’s “column” of that name? Tons. What about re-indexing the array using the search column, so we can directly access the rows by search value?

Here’s a reducing function:

function keyBy ($reduced, $row) {
    $keyname = $reduced ['key'];
    $reduced ['data'] [$row [$keyname]] = $row;
    return $reduced;
}

The most important thing to keep in mind is the desire to obtain one value accumulated from an array. Inside keyBy, $reduced holds that value. The reduction works by accepting the currently reduced value and returning it with any updates you want. keyBy is extremely powerful because it will let me re-index an array of arrays using any column. Since an initial value for $reduced is required, I decided to make use of that argument to pass an arbitrary key name. To prevent any clashes with the data keys, I separated ‘key’ from ‘data’ in $reduced.

To “reduce” an array of rows into a direct-access array, I call keyBy by passing it to array_reduce, with the initial argument indicating which key to index by.

$customers = array (
	array ('id'    => 121, 'first' => 'Jane',	'last'  => 'Refrain'),
	array ('id'    => 290, 'first' => 'Bill',	'last'  => 'Swizzle'),
	array ('id'    => 1,   'first' => 'Fred',	'last'  => 'Founder')  // plain decimal; a leading zero would make PHP read the literal as octal
);

$customersByLastName = array_reduce ($customers, "keyBy", array ('key' => 'last'));

print_r ($customersByLastName);
print $customersByLastName ['data']['Founder']['first'];

Array (
  [key] => last
  [data] => Array (
    [Refrain] => Array (
      [id] => 121
      [first] => Jane
      [last] => Refrain)

    [Swizzle] => Array (
      [id] => 290
      [first] => Bill
      [last] => Swizzle)

    [Founder] => Array (
      [id] => 1
      [first] => Fred
      [last] => Founder)
  )
)

Fred

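Isn’t that an incredibly small amount of code that gets rid of a lot of code? Building the index costs one pass over the array; if the array is searched more than once, every later lookup is a direct key access instead of another loop. One caveat: if two rows share the same value in the key column, the later row overwrites the earlier one.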
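The same reduction carries over to other languages. Here is a minimal JavaScript sketch of the idea, not the PHP code above: the `keyBy` name and signature are assumptions for illustration, and the key name is captured by the closure instead of travelling in the initial value, so the result is the index itself with no 'key'/'data' wrapper.

```javascript
// Hypothetical JavaScript counterpart of the PHP keyBy reducer.
function keyBy(rows, keyName) {
  return rows.reduce(function (index, row) {
    // A later row with the same key value replaces an earlier one.
    index[row[keyName]] = row;
    return index;
  }, {});
}

const customers = [
  { id: 121, first: "Jane", last: "Refrain" },
  { id: 290, first: "Bill", last: "Swizzle" },
  { id: 1,   first: "Fred", last: "Founder" }
];

const customersByLastName = keyBy(customers, "last");
console.log(customersByLastName.Founder.first); // prints Fred
```

Whether to keep the key name inside the accumulator (as the PHP version does) or in a closure is a design choice; the closure keeps the returned structure smaller, at the cost of one extra wrapper function.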

World’s Most Efficient Implementation of Discrete Timers in PHP


Well it may not be THE most efficient, but it’s pretty close. I’ve seen a lot of timer code in my time, and it always looks too elaborate for what it needs to do. After all, if I’m using timers then I’m probably very concerned about how long things take — I don’t want to add much overhead to track it. I came up with this code during an important project to allow me to record how long certain functionality was taking.

I wanted something that would be as simple and lightweight as possible. In order to maintain discrete timers that can’t interfere with each other, many timer implementations require creating a new instance of some class. Instead, I opted for a static array in my timer function, with the index of the array as a unique timer “name”; this serves the same purpose. With this function, I can print out how long the overall script has been running, at any time. I can create a new timer (which also starts it running), print how long that timer has been running, and optionally keep the timer’s start point or reset it each time I read its current value.

We need an empty static array to hold timers. The first step is to get the current system time. By passing true to microtime(), we get a float representing the number of seconds for the current time. When no timer name (the index in our static array) is passed in, we just need to subtract the script start time from current and return that.

If there is a timer name passed, that’s when things get creative. We have to get the time value from the last call with this timer, or null if the timer hasn’t been used before. If we’re initializing the timer, set it to the current system time and return 0.0 (to force the type) for its value. If the timer exists and we’re resetting it (which is the default), set it to the current system time.

Finally we return the difference between now and the last call.

I should note that when using elapsed() for the script’s overall execution time, the value is returned in seconds; for named timers the return value is in milliseconds (which avoids extremely small numeric values).

/**
* Return the elapsed milliseconds since the timer was last reset.
*
* If a timer "name" is not specified, then returns the elapsed time since the
* current script started execution. This script timer can not be "reset".
* THIS DEFAULT SCRIPT TIMER RETURNS IN SECONDS
*
* Always "resets" the specified timer unless you specify false for the reset
* parameter. Of course, on the first call for a particular timer, it will always
* reset to the time of the call.
*
* examples:
* elapsed(__FUNCTION__); // using __FUNCTION__ or __METHOD__ makes it easy to be unique
* ...
* ...
* echo "It took " . elapsed(__FUNCTION__)/1000 . " seconds.";
*
* @param string $sTname Name of the timer
* @param boolean $bReset Do you want to reset this timer to 0?
* @return float Elapsed time since timer was last reset
*/
function elapsed($sTname = null, $bReset = true) {

    static $fTimers = array(); // To hold "now" from previous call
    $fNow = microtime(true); // Get "now" in seconds as a float

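    if (is_null($sTname)) {
        // REQUEST_TIME_FLOAT (PHP 5.4+) has microsecond precision; REQUEST_TIME is whole seconds only
        $fStart = isset($_SERVER['REQUEST_TIME_FLOAT']) ? $_SERVER['REQUEST_TIME_FLOAT'] : $_SERVER['REQUEST_TIME'];
        return ($fNow - $fStart);
    }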

    $fThen = isset($fTimers[$sTname]) ? $fTimers[$sTname] : null; // Copy over the start time, so we can update to "now"

    if (is_null($fThen) || $bReset) {
        $fTimers[$sTname] = $fNow;
        if (is_null($fThen))
            return 0.0;
    }
    return 1000 * ($fNow - $fThen);
}

printf (
    "Since script started %f Create 2 new timers 'fred' %f and 'alice' %f\n",
    elapsed(), elapsed('fred'), elapsed('alice'));

for ($i = 0; $i < 100; $i++) {
    printf (
        "Since script started %f 'fred' gets reset %f, but 'alice' doesn't %f\n",
        elapsed(), elapsed('fred'), elapsed('alice', false));
}

Since script started 0.391466 Create 2 new timers 'fred' 0.000000 and 'alice' 0.000000
Since script started 0.391521 'fred' gets reset 0.041962, but 'alice' doesn't 0.037909
Since script started 0.391545 'fred' gets reset 0.021935, but 'alice' doesn't 0.056982
Since script started 0.391563 'fred' gets reset 0.019073, but 'alice' doesn't 0.075102
Since script started 0.391578 'fred' gets reset 0.015020, but 'alice' doesn't 0.090122
Since script started 0.391594 'fred' gets reset 0.015020, but 'alice' doesn't 0.104904
Since script started 0.391609 'fred' gets reset 0.015974, but 'alice' doesn't 0.119925
Since script started 0.391624 'fred' gets reset 0.015020, but 'alice' doesn't 0.134945
Since script started 0.391639 'fred' gets reset 0.015020, but 'alice' doesn't 0.149965
Since script started 0.391654 'fred' gets reset 0.015020, but 'alice' doesn't 0.164986
Since script started 0.391669 'fred' gets reset 0.014782, but 'alice' doesn't 0.180006
Since script started 0.391684 'fred' gets reset 0.015020, but 'alice' doesn't 0.195980
Since script started 0.391699 'fred' gets reset 0.015020, but 'alice' doesn't 0.211000

I think it’s convenient to use a pattern like

function foobar () {
    elapsed (__METHOD__);
    // lots of code
    $timeused = elapsed (__METHOD__);
}
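The reset / no-reset bookkeeping is easier to check when the clock can be faked. The following is a JavaScript sketch of the same logic, not a port of the PHP function: `makeElapsed` and the injected `now` function are assumptions made for illustration, and a Map plays the role of the static array.

```javascript
// makeElapsed is a hypothetical factory; `now` must return the current time in seconds.
function makeElapsed(now) {
  const timers = new Map();   // plays the role of the static $fTimers array
  const started = now();      // stands in for the script start time

  return function elapsed(name, reset = true) {
    const current = now();
    if (name === undefined) {
      return current - started;          // overall timer, in seconds
    }
    const then = timers.get(name);       // undefined the first time a name is used
    if (then === undefined || reset) {
      timers.set(name, current);
      if (then === undefined) { return 0; }
    }
    return 1000 * (current - then);      // named timers, in milliseconds
  };
}

// A fake clock that advances exactly one second on every reading.
let tick = 0;
const elapsed = makeElapsed(function () { tick += 1; return tick; });
```

In real use the clock would be something like `() => Date.now() / 1000`; the fake clock only exists so that a timer that is reset and one that is not can be compared with exact numbers.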

Powerful jQuery Shorthand to Move Data Into DOM


I realize this is basic, though not obvious, jQuery stuff, but it’s powerful in the amount of work that can be done.  One could devise a fairly elaborate convention to make rendering driven largely by the markup.  Assume some JSON object like

var myJson = { "name": "Grant", "age": "ancient", "position": "seated" };

and elements like

 
<p myns:field="name"></p>
<p myns:field="age"></p>
<p myns:field="position"></p>

With this jQuery call I can set the text of all the elements without any knowledge of the model’s contents.

    $('[myns\\:field]') .text(function(idx, txt) {
      return myJson [$(this) .attr ('myns:field')];
    });

There’s a lot of potential; I’ll have to see if it’s worth it under varied scenarios.
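To make the convention concrete without a browser, here is a small sketch of the same markup-driven lookup using plain objects in place of DOM elements. The `fillFields` helper and the `elements` stand-ins are hypothetical, not part of jQuery, and blanking a field whose name is missing from the model is a choice made for this sketch; the post does not say what should happen in that case.

```javascript
const myJson = { name: "Grant", age: "ancient", position: "seated" };

// Hypothetical stand-ins for <p myns:field="..."></p> elements.
const elements = [
  { attrs: { "myns:field": "name" }, text: "" },
  { attrs: { "myns:field": "age" }, text: "" },
  { attrs: { "myns:field": "position" }, text: "" },
  { attrs: { "myns:field": "shoeSize" }, text: "stale" }  // no such key in the model
];

// fillFields is a hypothetical helper: the markup names the field, the model supplies the value.
function fillFields(model, els, attrName) {
  els.forEach(function (el) {
    const key = el.attrs[attrName];
    // Blank the element rather than showing "undefined" for a missing key.
    el.text = Object.prototype.hasOwnProperty.call(model, key) ? String(model[key]) : "";
  });
  return els;
}

fillFields(myJson, elements, "myns:field");
```

The point of the pattern is that `fillFields` never mentions "name", "age" or "position"; adding a field means adding markup and data, not code.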

How to do a Better Job at Meta-programming


Or

How I thought like a compiler 20 years ago to create a reporting system that processed millions of varying-length records of up to 32k bytes, on tape, per hour against hundreds of user-defined report packages.

Imagine, if you will, a young lad in front of a 3270 terminal, a green cursor blinking in his face, eager to do the impossible.  A couple of years before (1988 through 89) I had been a contractor for my now employer and had written a set of assembler macros that could be used to define reports against a proprietary “database” of varying-length documents, up to 32k in length.  In order to pack as much sparse data as possible onto tape cartridges, each record contained fixed and varying sections and embedded bitmaps used to drive data access code.  My manager had been the original author (there aren’t too many people who could have pulled it off, in retrospect; “hats off”, Jeff and Phil).  Each database had its own “schema” as a file of assembler macros that was compiled to a load module of control blocks.

Complementing the low-level code was a set of application macros to locate or manipulate data in the record and to examine the schema.  They should have allowed schema-independent programming, but in reality, coding for every possible data type and configuration in which the application data might lie was prohibitive, so some assumptions were usually made: “order records and item records may occur multiple times, always”, “an order will always have an amount and a date field”.

When I had done the original reporting system, it had to work for all schemas, so that meant putting a lot of limitations on the possible reports.  We were really coding to meet a certain set of specific reports, but in a generic way.  It worked pretty well, but the code had to be modified a lot to accommodate “unforeseen” reporting needs.

When I was tasked with the next generation, I knew I wanted the report definitions to be in a “grammar”: some type of structured way to define a report that took into account nesting, ranging, counting and summing.  I also knew that it had to handle any schema and any “reasonable” report.  I shocked the team lead when I threw the stack of new reports that we were to use as our requirements into the trash.  I said “let’s write a spreadsheet concept and forget the specifics.  We’ll check them from time to time to make sure what I’m doing satisfies, but I want my blinders off.”

To do all these things, even (maybe more so) in a modern database technology would be hard.  With our data store I realized I had to think of something new and spent weeks drawing boxes and arrows and writing little tests.  When I was finished, my programs could handle a great variety of reports and process data faster than code that was schema specific.  Marketing could define their own reports now to analyze the sea of demographic data we held.

Next: what a sample report’s data looked like.

Data-driving the Asynchronous Mongoose Updates in my Node Code


This was my first experiment in requesting a sequence or set of Mongoose updates.  I wanted to be able to set up both dependent and independent updates.  First I dealt with the asynchronous (independent) updates.  There is no rollback mechanism or anything: an error response is sent as soon as any update fails, and a success response is sent only once they all succeed.  The secret sauce for the asynchronous updates was using Underscore.js’s after() function.

/**
 * Runs parallel, independent updates
 */
var asyncUpdate = function (successMsg, updates, resp) {
  var good = _.after (updates.length, function (doc) { 
    resp (successMsg + " update successful", 200); 
  });
  _.each (updates, function (arg, key, list) {
    arg.model 
      .update (arg.where, arg.update) 
      .run (function (e, doc) {
        if (e)  
          resp (arg.model.modelName + " update error: " + util .inspect (e), 500); 
        else {
          console .log (arg.model.modelName + " update successful");
          good(doc);
        }
      });
  });
};

Then call it with an array of objects of update control information.

 asyncUpdate ('Totally updated', [ 
   { model: Foo,    update: { $set: { field1: 'value1' } },         where: { _id: id } },
   { model: Bar,    update: { $set: { 'array.$.boolthing': true} }, where: { 'array.foo': id, 'array.boolthing': false } },
   { model: FooBar, update: { $set: { 'field': 'val'} },            where: { _id: id } }
 ], _.bind(r.send, r));

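Obviously this isn’t very generic and it’s directly responding to HTTP calls in the wrong place, but it’s just an experiment. The net effect is to respond with an error as soon as any update fails, and otherwise to wait until all updates have succeeded before responding with success. Note that updates already in flight are not cancelled, and each failing update sends its own error response.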
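The counting trick behind after() can be sketched without Underscore or Mongoose. In the sketch below, `runAll` and its task signature (a function taking a Node-style callback) are assumptions for illustration; it also adds a `finished` flag, which the code above does not have, so that only one response is ever sent. The stand-in tasks call back synchronously to keep the example short, whereas real updates would call back later.

```javascript
// runAll is a hypothetical helper: done(err) is called once, on the first
// error or after every task has reported success.
function runAll(tasks, done) {
  let remaining = tasks.length;
  let finished = false;            // guards against responding more than once

  if (remaining === 0) { return done(null); }

  tasks.forEach(function (task) {
    task(function (err) {
      if (finished) { return; }    // a response has already been sent
      if (err) {
        finished = true;
        return done(err);
      }
      remaining -= 1;              // the same countdown _.after() performs
      if (remaining === 0) {
        finished = true;
        done(null);
      }
    });
  });
}

// Stand-in tasks instead of real model updates.
function ok(cb) { cb(null); }
function fail(name) {
  return function (cb) { cb(new Error(name + " update error")); };
}

const responses = [];
function report(err) { responses.push(err ? "500 " + err.message : "200 ok"); }

runAll([ok, ok, ok], report);
runAll([ok, fail("Bar"), fail("FooBar")], report);
```

After both calls, `responses` holds one entry per call: a success for the first batch and only the first error for the second.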

XSL 1.0 str-tolower Template


We’ll add a couple variables and a template to the strings.xsl stylesheet.

<!-- translation for lowercase-->
<xsl:variable name="lcletters" select="'abcdefghijklmnopqrstuvwxyz'"/>
<xsl:variable name="ucletters" select="'ABCDEFGHIJKLMNOPQRSTUVWXYZ'"/>

<xsl:template name="str-tolower">
  <xsl:param name="str"/>
  <xsl:value-of select="translate($str, $ucletters, $lcletters)"/>
</xsl:template>

And here’s a new unit test to add to strings.test.xsl:

<xsl:call-template name="assert">
  <xsl:with-param name="expected" select="'@apple1'"/>
  <xsl:with-param name="actual">
    <xsl:call-template name="str-tolower">
      <xsl:with-param name="str" select="'@ApPlE1'"/>
    </xsl:call-template>
  </xsl:with-param>
</xsl:call-template>

Simple XSL 1.0 String Templates and Very Simple XSL Unit Testing


In order to reference external examples that evolve, I’m going to start with a stylesheet containing simple string functions and build on it.  There’s also a simple, but useful methodology for unit testing the XSL templates.

Dealing with strings is notoriously annoying in XSL, so in my xsl directory I have strings.xsl, a stylesheet to do repetitive and recursive stuff.  The first function we’ll need is str-replace.  I recently updated this template to be pretty short and sweet.

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:template name="str-replace">
  <xsl:param name="haystack"/>
  <xsl:param name="needle"/>
  <xsl:param name="repl" select="''"/>
  <xsl:choose>
    <xsl:when test="contains ($haystack, $needle)">
      <xsl:value-of select="substring-before ($haystack, $needle)"/>
      <xsl:value-of select="$repl"/>
      <xsl:call-template name="str-replace">
        <xsl:with-param name="haystack" select="substring-after ($haystack, $needle)"/>
        <xsl:with-param name="needle" select="$needle"/>
        <xsl:with-param name="repl" select="$repl"/>
      </xsl:call-template>
    </xsl:when>
    <xsl:otherwise>
      <xsl:value-of select="$haystack"/>
    </xsl:otherwise>
  </xsl:choose>
</xsl:template>

<!-- translation for lowercase-->
<xsl:variable name="lcletters">abcdefghijklmnopqrstuvwxyz</xsl:variable>
<xsl:variable name="ucletters">ABCDEFGHIJKLMNOPQRSTUVWXYZ</xsl:variable>

<xsl:template name="str-tolower">
  <xsl:param name="str"/>
  <xsl:value-of select="translate($str, $ucletters, $lcletters)"/>
</xsl:template>

</xsl:stylesheet>

Hopefully it’s a fairly clear template: as long as the $haystack contains $needle, concatenate the string before $needle, the replacement value to use, and the result of calling the template with what comes after the first $needle occurrence.  When there’s no $needle, then it just results in $haystack.
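For readers who want to step through the recursion outside an XSLT processor, here is the same algorithm sketched in JavaScript. The `strReplace` name is hypothetical, and the empty-needle guard is an addition for this sketch: in XPath, contains() is true for an empty second argument and substring-after() then returns the whole string, so an empty $needle would never terminate.

```javascript
// Hypothetical JavaScript rendering of the str-replace recursion.
function strReplace(haystack, needle, repl = "") {
  // Added guard: an empty needle is "contained" in every string.
  if (needle === "") { return haystack; }

  const at = haystack.indexOf(needle);
  if (at === -1) {
    return haystack;                                  // the xsl:otherwise branch
  }
  const before = haystack.slice(0, at);               // substring-before
  const after = haystack.slice(at + needle.length);   // substring-after
  return before + repl + strReplace(after, needle, repl);
}
```

Calling `strReplace("111", "11", "eleven")` follows the same path as the template: the first match is consumed, and the remaining "1" no longer contains the needle.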

I want to use what I’ve learned about unit testing to prevent regression in the future. It can be extremely hard to figure out which change caused regression in XSL; you may not notice the one scenario that triggers it for a while. I wrote strings.test.xsl to call templates in strings.xsl.

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:import href="strings.xsl" />

<xsl:template match="/">
  <xsl:call-template name="assert">
    <xsl:with-param name="expected" select="'@apple1'"/>
    <xsl:with-param name="actual">
      <xsl:call-template name="str-tolower">
        <xsl:with-param name="str" select="'@ApPlE1'"/>
      </xsl:call-template>
    </xsl:with-param>
  </xsl:call-template>

  <xsl:call-template name="assert">
    <xsl:with-param name="expected" select="'abcBBB-BBBxyz'"/>
    <xsl:with-param name="actual">
      <xsl:call-template name="str-replace">
        <xsl:with-param name="haystack" select="'abcA-Axyz'"/>
        <xsl:with-param name="needle" select="'A'"/>
        <xsl:with-param name="repl" select="'BBB'"/>
      </xsl:call-template>
    </xsl:with-param>
  </xsl:call-template>

  <xsl:call-template name="assert">
    <xsl:with-param name="expected" select="'abc-xyz'"/>
    <xsl:with-param name="actual">
      <xsl:call-template name="str-replace">
        <xsl:with-param name="haystack" select="'aAbcA-AxyzA'"/>
        <xsl:with-param name="needle" select="'A'"/>
      </xsl:call-template>
    </xsl:with-param>
  </xsl:call-template>

  <xsl:call-template name="assert">
    <xsl:with-param name="expected" select="'eleven1'"/>
    <xsl:with-param name="actual">
      <xsl:call-template name="str-replace">
        <xsl:with-param name="haystack" select="'111'"/>
        <xsl:with-param name="needle" select="'11'"/>
        <xsl:with-param name="repl" select="'eleven'"/>
      </xsl:call-template>
    </xsl:with-param>
  </xsl:call-template>
</xsl:template>

<xsl:template name="assert">
  <xsl:param name="expected" select="'missing expected param'"/>
  <xsl:param name="actual" select="'missing actual param'"/>
  <xsl:if test="not ($actual = $expected)">
    <xsl:message terminate="yes">Expected <xsl:value-of select="$expected"/>; got <xsl:value-of select="$actual"/></xsl:message>
  </xsl:if>
</xsl:template>

</xsl:stylesheet>

I run strings.test.xsl against any XML input document, because it doesn’t actually use the input XML.  It might be interesting to come up with a unit testing stylesheet that took the stylesheet to be tested as its input document, and used the document() function to do something introspective as well.

I admit there’s nothing really descriptive explaining what the test is for, but repeating the assert calls is really easy and I just update the template I’m calling or use a new set of parameters for a new condition. It’s a brute force method that works for now.

Self-Contained Data Tables in XSL Stylesheets


I’m going to share a trick I’ve been using in my stylesheets for years, because every time I do it someone gets amazed.

Lookups often need to be performed during transformation of a document. Let’s use the example of taking month numbers and turning them into the standard 3-character abbreviations. Instead of hard-coding the values in a choose, or reading a separate XML document, add your own working data to the stylesheet with a namespace.

<grant:stuff>
    <grant:month-name month="01">Jan</grant:month-name>
    <grant:month-name month="02">Feb</grant:month-name>
    <grant:month-name month="03">Mar</grant:month-name>
    <grant:month-name month="04">Apr</grant:month-name>
    <grant:month-name month="05">May</grant:month-name>
    <grant:month-name month="06">Jun</grant:month-name>
    <grant:month-name month="07">Jul</grant:month-name>
    <grant:month-name month="08">Aug</grant:month-name>
    <grant:month-name month="09">Sep</grant:month-name>
    <grant:month-name month="10">Oct</grant:month-name>
    <grant:month-name month="11">Nov</grant:month-name>
    <grant:month-name month="12">Dec</grant:month-name>
</grant:stuff>

For this to work inside your XSLT, you have to add a namespace prefix declaration for it:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:grant="https://techishard.wordpress.com">

Then you call the document() function with an empty string, which refers to the stylesheet itself:

<xsl:value-of select="document('')/xsl:stylesheet/grant:stuff/grant:month-name[@month = 10]"/>

That will get you
Oct

Difference ’tween skip and pick


REBOL series are the foundation of all data structures, so there are a lot of functions to deal with them. Often, under a given circumstance, more than one will do what’s needed. In the REBOL core series documentation, there’s sort of a fluid discussion of functions like first and next. And in another section, a comparison between first/second/third/etc and pick. I was changing values in a series and encountered the distinction between pick and skip. Some functions, like first and pick, are extraction functions, giving the series item as its own thing, while others, like next and skip, are in the context of the series. Often it might not matter, but due to the general applicability of the change function to all types of series!, it did for me. I want to change the first item in a:

>> a: copy ["abc" 123]
== ["abc" 123]
>> a/1
== "abc"
>> change a/1 "b"
== "bc"
>> a
== ["bbc" 123]
>> change pick a 1 "c"
== "bc"
>> a
== ["cbc" 123]
>> change first a "d"
== "bc"
>> a
== ["dbc" 123]
>> change head a "e"
== [123]
>> a
== ["e" 123]
>>

Ah! Until I used head (or next or skip or back), I was referencing the first item itself (originally “abc”), so the change function saw change string! string! rather than a change to the block. Some of the items I need to change are blocks I need to append to:

>> a: copy ["abc" [123]]
== ["abc" [123]]
>> append skip a 1 'foobar
== ["abc" [123] foobar]

Oops, that’s not it. Get rid of that with remove.

>> remove last a
** Script Error: remove expected series argument of type: series port bitset none
** Where: halt-view
** Near: remove last a

Same issue here. Last is giving the word! foobar which isn’t something remove can operate on.

>> remove back tail a
== []
>> a
== ["abc" [123]]
>> append last a 'foobar
== [123 foobar]

There it is. Any of the “extraction” functions will do for the append:

>> a
== ["abc" [123 foobar]]
>> append pick a 2 'foo
== [123 foobar foo]
>> append a/2 'bar
== [123 foobar foo bar]
>> a
== ["abc" [123 foobar foo bar]]
>>

Populate a PHP Array in One Assignment


Imagine you are returning an array you need to populate with some values. This is typical:

    $myArray = array();
    $myArray['foo'] = $foo;
    $myArray['bar'] = $bar;

There are advantages to populating the array as a single assignment:

    $myArray = array(
        'foo' => $foo,
        'bar' => $bar
    );

The separation of assignment statements from the initialization (and the initialization to an empty array might be many lines previous), allows for easier corruption of the entries in the array; code might be added later in between the original. People may add things to $myArray without documenting it. And just like Where to Declare, I’ll have to scan more code to determine the effect of making changes to $myArray. The second way shows the _entire_ value of $myArray being set at once, instead of parts of it. It presents a visual of the return structure and makes it harder for $myArray to go awry. I don’t have to look any further to see what $myArray is at that point.
