Tech Is Hard

Credibility = Talent x Years of experience + Proven hardcore accomplishment

Rebuilding existing systems


My favorite thing is to rewrite a legacy system. Not a “rewrite the whole system, flip a switch and we’re using it” project, but an incremental process that might take 12 to 18 months before you have a new system, without ever migrating. I’m calling it “rebuilding” to keep with all the construction analogies we use, because it involves re-engineering, re-architecting and rewriting. We should admit that most legacy systems, if compared to a building, are like the Winchester house. “Remodeling” might be a better term, because you keep the system functioning the whole time it’s being rewritten. But in the process, we demolish entire rooms and floors, rebuilding new ones that make sense.

Sometimes I describe it as performing a complete organ transplant while keeping the patient alive. Actually the patient’s health keeps improving as we operate.

I’ve been doing it since the beginning of my programming career, because it’s in my nature to always see “what’s missing” from the current. When that natural inclination is combined with the universal theme “Don’t touch it if you don’t have to. Don’t fix it if it ain’t broke.”, a person learns how to incrementally shim new ways of doing things into a software system.

At two companies, at least, I completely changed the internal architecture of existing systems this way. Huge systems, some in assembler and C. During the process, the way new software was written changed a lot, with tools and libraries of [macros | functions | classes] to remove the drudgery, hide the housekeeping, bring new expressiveness and shrink code.

Smaller code is better. As you see a larger and larger view of your software system and as the file system structure and classes begin to describe the process of your business system, dramatic gains are made. Applications behave consistently and users are happy. Error handling gets the attention it deserves (things go wrong a lot more than we like to think). As old PITAs go away, hidden inside pre-written code, we begin to think and write more elegantly. We start to imagine practical implementations where it seemed impossible before.

What’s been my secret weapon at successfully doing this? I admit that in my mind, rebuilding an existing system is easier than something brand new. I have a running model of what it should do, at least mostly do. The transformation becomes a process of changing, checking regressions and validating new behavior in the case of something that never worked right to begin with.

I always work with a library mentality. Frameworks that hem everything into an arbitrary predetermined structure are for saps. They would also require a top-down approach, which gets you back to trying to rewrite the system from scratch. You have to work from the inside out, bottom up, to change the existing system. Wherever you create something that will be used for new development, you have to bounce back and forth between the most efficient and maintainable layering of internal functionality and making the application-level interface revolutionarily simpler and/or more powerful than existing ways of doing the job.

To inject the new into an old system, we also have to be ready to write things that we will throw away, and still devote full attention to their quality, lest we regress.

How to do a Better Job at Meta-programming


Or

How I thought like a compiler 20 years ago to create a reporting system processing millions of 32k-byte, varying-length records on tape per hour against hundreds of user-defined report packages.

Imagine, if you will, a young lad in front of a 3270 terminal, a green cursor blinking in his face, eager to do the impossible.  A couple of years before (1988 through 89) I had been a contractor for my now employer and had written a set of assembler macros that could be used to define reports against a proprietary “database” of varying-length documents, up to 32k in length.  In order to pack as much sparse data as possible onto tape cartridges, each record contained fixed and varying sections and embedded bitmaps used to drive data access code.  My manager had been the original author (there aren’t too many people who could have pulled it off, in retrospect, “hats off”, Jeff and Phil.)  Each database had its own “schema” as a file of assembler macros that was compiled to a load module of control blocks.

To complement the low level code was a set of application macros to locate or manipulate data in the record and to examine the schema.  They should have allowed schema-independent programming, but in reality, to code for every possible data type and configuration in which the application data may lie was prohibitive, so some assumptions were usually made: “order records and item records may occur multiple times, always“, “an order will always have an amount and a date field”.

When I had done the original reporting system, it had to work for all schemas, so that meant putting a lot of limitations on the possible reports.  We were really coding to meet a certain set of specific reports, but in a generic way.  It worked pretty well, but the code had to be modified a lot to accommodate “unforeseen” reporting needs.

When I was tasked with the next generation, I knew I wanted the report definitions to be in a “grammar”.  Some type of structured way to define a report, that took into account nesting, ranging, counting and summing.  I also knew that it had to handle any schema and any “reasonable” report.  I shocked the team lead when I threw the stack of new reports that we were to use as our requirements into the trash.  I said “let’s write a spreadsheet concept and forget the specifics.  We’ll check them from time to time to make sure what I’m doing satisfies, but I want my blinders off.”

To do all these things, even (maybe more so) in a modern database technology would be hard.  With our data store I realized I had to think of something new and spent weeks drawing boxes and arrows and writing little tests.  When I was finished, my programs could handle a great variety of reports and process data faster than code that was schema specific.  Marketing could define their own reports now to analyze the sea of demographic data we held.

Next: what a sample report’s data looked like.

Driving the Synchronous Mongoose Updates


This turned out to be a lot harder mentally.  And I still think I could serialize this more instead of the dynamic wrapping it does.

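/**
 * Runs synchronous, dependent updates.
 * Returns a function that takes the final callback, last (err, doc).
 */
var syncUpdate = function (updates) {
  // Reverse so the reduce builds the nest from the innermost (final) update outward.
  return _.chain (updates) .reverse() .reduce (function update(next, arg) {
    return function (last) {
      arg.model .update (arg.where, arg.update) .run (function (err, doc) {
        if (err)
          last (err);        // bail all the way out
        else if (next)
          next (last);       // run the next update, passing the final callback along
        else
          last (null, doc);  // no more updates: report the last result
      });
    };
  }, null) .value ();
};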

That reverse/reduce/return takes my updates and makes them inside out. We create callback wrapper code that checks for errors and if there’s another nested function to call. By passing “last”, the final callback function, to each nested invocation, we can jump all the way out on error. You can see that “next” is the accumulated function nest that we pass to reduce, and it gets evaluated inside the callback code. More of a macro type thing. If there’s something to call next, we continue to pass the “last” function along. And when there’s no more “next” to call, we call “last”.
Then I call it like:

  syncUpdate ([ 
   { model: Foo,    
     update: { $set: { field1: 'value1' } },         
     where: { _id: id } },
   { model: Bar,    
     update: { $set: { 'array.$.boolthing': true} }, 
     where: { 'array.foo': id, 'array.boolthing': false } },
   { model: FooBar, 
     update: { $set: { 'field': 'val'} },            
     where: { _id: id } } ]) 
     (
      function (err, doc) {
        if (err)
          r .send ("Error saving pick " + util .inspect (err), 500);
        else
          r .send ("saved pick " + doc, 201);
      });

One interesting difference is using syncUpdate() to return the function I want to call, which I immediately do, with my simple callback function as an argument. I think this is a mechanical improvement over passing a success message and having the callback code be black box-ish.

I’ve got some ideas. I know of course, this should handle things other than Mongoose updates. It should be any function, and we’re getting pretty used to calling things with a callback function. In a nested situation what I want to do is call the specified function with a callback function that lets me test the result and if appropriate, pass the result on to another specified function. Or in the event of an error, at least for now, jump all de way out mon.

I used to do stuff like this in assembler all the time, and it’s a lot easier there because you get to do anything you want. There are no rules about the way I allocate and initialize memory structures. In JavaScript, however, I have to come up with a syntax that scales infinitely, for the nesting capability, and is flexible in the types of functions it can call.
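
Here is a rough sketch of where that idea could go, not a finished design. It assumes each step is a plain function taking (input, callback), with the usual callback (err, result) convention. The name series and that step signature are assumptions made for illustration; neither is part of syncUpdate. It uses the same inside-out trick, with the built-in reduceRight standing in for reverse/reduce.

```javascript
// Hypothetical generalization of syncUpdate: any function with the shape
// step (input, callback) can be a step, where callback is (err, result).
function series(steps) {
  // reduceRight walks the steps from last to first, so the accumulated
  // "next" is always the nest of everything that runs after this step.
  var nested = steps.reduceRight(function (next, step) {
    return function (input, last) {
      step(input, function (err, result) {
        if (err) {
          last(err);            // jump all the way out
        } else if (next) {
          next(result, last);   // pass the result on to the next step
        } else {
          last(null, result);   // nothing left to call
        }
      });
    };
  }, null);

  // Like syncUpdate, return a function that takes the final callback.
  return function (last) {
    if (nested) {
      nested(undefined, last);
    } else {
      last(null, undefined);    // empty list of steps: nothing to do
    }
  };
}

// Usage: three dependent steps, each feeding its result to the next.
series([
  function (ignored, cb) { cb(null, 1); },
  function (n, cb) { cb(null, n + 1); },
  function (n, cb) { cb(null, n * 10); }
])(function (err, result) {
  if (err) {
    console.log("failed: " + err.message);
  } else {
    console.log("result: " + result);
  }
});
```

The difference from syncUpdate is that each step receives the previous step's result, so a step can test it before deciding what to hand on. An error in any step still goes straight to the final callback and skips the rest.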

Don’t Expect Me to Know What You’re Tweeting, Cuz I Don’t Read Your Wall


I saw something about a study that showed heavy Facebook users were more likely to feel unhappy with their own life/unfulfilled/envious, and it got me to thinking.

I used to say that TV, especially once there were more than 5 channels to choose from, was the beginning of the end.  I once read that “with television, people spend more time watching other — mostly fictional — people live, instead of living themselves”.  As much as I have it on myself, I try not to sit and watch unless it’s a movie I haven’t seen.  (Off topic sort of, but just yesterday I heard a psychologist say, “children have to know mommy and daddy need their time and they should just watch TV or something” — how automatic and horrendous.)

It’s what a lot of people do with Facebook, and to a lesser extent maybe Twitter and blogs.  The latter two might play into the phenomenon I see as unique to the Web media, of needing an audience to quench the aforementioned envy.  Most people have probably already heard of Facebook addiction and the like, but I think there’s something more subtle at work, too.

Facebook creates the perfect storm for wasting time observing and comparing.  I know many of the people, thus am under similar constraints of success — same school, same profession, same geo, etc — but of course there’s going to be people posting stuff that makes my life look less glamorous.  (And like chat rooms, is any of it pumped up?)

So we read this and become envious, we spend more time reading it and we do less for ourselves, so we achieve less, etc.

In my opinion, the reality show genre is finely tuned to this.  When we watch actors, although part of us is in the fantasy of the script/plot, we don’t expect ourselves to be “them” – this is a movie, they are actors.  But when the participants become everyman who can wait in line to audition without previous qualification, without having earned it…  Show participants aren’t actors, so the audience identifies much more (“I could/should…”), but the odds are still hugely against them just because of numbers, so we become more immersed in an observer, wannabe role.

I don’t know what to make of it, but I’m just sayin.  How come I have to leave messages on cell phones so much?  How come people want to use texts, where they don’t *have* to respond?  Why do people now want to broadcast on their “wall” and expect the world to know, instead of communicating directly with the individuals who need to know?  Seems like a time saver, huh?  Well now I also have to monitor the daily flow from all the people who broadcast to me to pick out what might be information.  I’m worse off.

So I might save time when planning my own event because I can just “post” it.  BUT I have to check everyone’s posts to see what I might be interested in.  Instead doesn’t it save everyone time when each of us spends more communicating, so that everyone doesn’t have to listen in and decide for themselves whether it’s relevant to them?  Is there an ego factor at work here?  Where we want to imagine the world wants to listen to our thoughts and plans?

Do we need an audience because everyone else is famous, because we have such an immediate media, and round and round it goes?

The Terminator Scenario


Machines that can build more of themselves are here.  It is a revolutionary concept to have something that, to some degree, can output itself.  We saw that in code years ago.  But this is stuff you can hold in your hand.

In answer to the detractors who ask “does the world need more plastic crap?”, I wonder if in the long run, such a system would mean less overall manufactured plastics.  Sort of a just-in-time-inventory-like concept.  Wouldn’t there be less mass-produced plastic?  And the supply chain for raw materials could become more streamlined, resulting in more efficiencies.

Ponderable, at least.

Simple XSL 1.0 String Templates and Very Simple XSL Unit Testing


In order to reference external examples that evolve, I’m going to start with a stylesheet containing simple string functions and build on it.  There’s also a simple, but useful methodology for unit testing the XSL templates.

Dealing with strings is notoriously annoying in XSL, so in my xsl directory I have strings.xsl, a stylesheet to do repetitive and recursive stuff.  The first function we’ll need is str-replace.  I recently updated this template to be pretty short and sweet.

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:template name="str-replace">
  <xsl:param name="haystack"/>
  <xsl:param name="needle"/>
  <xsl:param name="repl" select="''"/>
  <xsl:choose>
    <xsl:when test="contains ($haystack, $needle)">
      <xsl:value-of select="substring-before ($haystack, $needle)"/>
      <xsl:value-of select="$repl"/>
      <xsl:call-template name="str-replace">
        <xsl:with-param name="haystack" select="substring-after ($haystack, $needle)"/>
        <xsl:with-param name="needle" select="$needle"/>
        <xsl:with-param name="repl" select="$repl"/>
      </xsl:call-template>
    </xsl:when>
    <xsl:otherwise>
      <xsl:value-of select="$haystack"/>
    </xsl:otherwise>
  </xsl:choose>
</xsl:template>

<!-- translation for lowercase-->
<xsl:variable name="lcletters">abcdefghijklmnopqrstuvwxyz</xsl:variable>
<xsl:variable name="ucletters">ABCDEFGHIJKLMNOPQRSTUVWXYZ</xsl:variable>

<xsl:template name="str-tolower">
  <xsl:param name="str"/>
  <xsl:value-of select="translate($str, $ucletters, $lcletters)"/>
</xsl:template>

</xsl:stylesheet>

Hopefully it’s a fairly clear template: as long as the $haystack contains $needle, concatenate the string before $needle, the replacement value to use, and the result of calling the template with what comes after the first $needle occurrence.  When there’s no $needle, then it just results in $haystack.
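
For anyone who finds the recursion easier to follow outside of XSL, here is the same idea sketched in JavaScript. It is an illustration only: the function name strReplace is an assumption, and the empty-needle guard is something this sketch adds because an empty needle would never shrink the string being searched.

```javascript
// Sketch of the str-replace recursion, using the same parameter names.
function strReplace(haystack, needle, repl) {
  if (repl === undefined) {
    repl = '';                  // same default as the template's repl param
  }
  if (needle === '') {
    return haystack;            // guard added here: an empty needle would recurse forever
  }
  var at = haystack.indexOf(needle);
  if (at === -1) {
    return haystack;            // no needle left: the result is just the haystack
  }
  // What comes before the needle, then the replacement, then the result
  // of running the same replacement on what comes after the needle.
  return haystack.slice(0, at) +
         repl +
         strReplace(haystack.slice(at + needle.length), needle, repl);
}

console.log(strReplace('abcA-Axyz', 'A', 'BBB'));
console.log(strReplace('111', '11', 'eleven'));
```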

I want to use what I’ve learned about unit testing to prevent regression in the future. It can be extremely hard to figure out which change caused regression in XSL; you may not notice the one scenario that triggers it for a while. I wrote strings.test.xsl to call templates in strings.xsl.

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:import href="strings.xsl" />

<xsl:template match="/">
  <xsl:call-template name="assert">
    <xsl:with-param name="expected" select="'@apple1'"/>
    <xsl:with-param name="actual">
      <xsl:call-template name="str-tolower">
        <xsl:with-param name="str" select="'@ApPlE1'"/>
      </xsl:call-template>
    </xsl:with-param>
  </xsl:call-template>

  <xsl:call-template name="assert">
    <xsl:with-param name="expected" select="'abcBBB-BBBxyz'"/>
    <xsl:with-param name="actual">
      <xsl:call-template name="str-replace">
        <xsl:with-param name="haystack" select="'abcA-Axyz'"/>
        <xsl:with-param name="needle" select="'A'"/>
        <xsl:with-param name="repl" select="'BBB'"/>
      </xsl:call-template>
    </xsl:with-param>
  </xsl:call-template>

  <xsl:call-template name="assert">
    <xsl:with-param name="expected" select="'abc-xyz'"/>
    <xsl:with-param name="actual">
      <xsl:call-template name="str-replace">
        <xsl:with-param name="haystack" select="'aAbcA-AxyzA'"/>
        <xsl:with-param name="needle" select="'A'"/>
      </xsl:call-template>
    </xsl:with-param>
  </xsl:call-template>

  <xsl:call-template name="assert">
    <xsl:with-param name="expected" select="'eleven1'"/>
    <xsl:with-param name="actual">
      <xsl:call-template name="str-replace">
        <xsl:with-param name="haystack" select="'111'"/>
        <xsl:with-param name="needle" select="'11'"/>
        <xsl:with-param name="repl" select="'eleven'"/>
      </xsl:call-template>
    </xsl:with-param>
  </xsl:call-template>
</xsl:template>

<xsl:template name="assert">
  <xsl:param name="expected" select="'missing expected param'"/>
  <xsl:param name="actual" select="'missing actual param'"/>
  <xsl:if test="not ($actual = $expected)">
    <xsl:message terminate="yes">Expected <xsl:value-of select="$expected"/>; got <xsl:value-of select="$actual"/></xsl:message>
  </xsl:if>
</xsl:template>

</xsl:stylesheet>

The way I use strings.test.xsl is with any XML input document, because it doesn’t actually use the input XML.  It might be interesting to come up with a unit testing stylesheet that took the stylesheet to be tested as its input document.  And use the document() function to also do something introspective.

I admit there’s nothing really descriptive explaining what the test is for, but repeating the assert calls is really easy and I just update the template I’m calling or use a new set of parameters for a new condition. It’s a brute force method that works for now.

Populate a PHP Array in One Assignment


Imagine you are returning an array you need to populate with some values. This is typical:

    $myArray = array();
    $myArray['foo'] = $foo;
    $myArray['bar'] = $bar;

There are advantages to populating the array as a single assignment:

    $myArray = array(
        'foo' => $foo,
        'bar' => $bar
    );

Separating the assignment statements from the initialization (and the initialization to an empty array might be many lines earlier) makes it easier to corrupt the entries in the array; code might be added later in between the original statements. People may add things to $myArray without documenting it. And just like Where to Declare, I’ll have to scan more code to determine the effect of making changes to $myArray. The second way shows the _entire_ value of $myArray being set at once, instead of parts of it. It presents a visual of the return structure and makes it harder for $myArray to go awry. I don’t have to look any further to see what $myArray is at that point.

Leaders


This is a big subject. I think the terms [project] manager and team leader are misapplied so often, their significance is diminished. These should be titles of honor, seriously. But often their role turns out to be that of administrator. Status reports, reconciling time reports, scheduling time off… I had a client whose Director of one group (and when did IT get so many directors? Doesn’t a $100 million movie only have one?) was too busy to meet with the project manager because she was stuffing envelopes with documentation and complimentary ball point pens!

I’m talking about leadership. I’m talking about the traits of people like George Washington, George S. Patton and Vince Lombardi. Dammit, a software project is a battle! Your survival may depend on the outcome.

People follow a manager because they have to; they follow a leader because they want to.

Leaders inspire with their commitment to doing things well and being thorough. They know the details. They raise the bar for everyone. And I’m not just saying these are the traits you should look for in a leader, I’m saying they define leadership.

A leader is as a leader does.

But vital in our field is that they know technology. Do you think Pat Riley doesn’t know everything about basketball? Don’t fool yourself. How else can they set technical direction for the team or organization? Old, new, mundane, abstract, everything they can get their eyeballs on. Learn and love what we do. Systems are our product – know everything about yours, the competition’s. Look at what they’re cooking up in universities and the W3C.

Beware the Peter Principle

Bad: Staff Your Pilot Project with Seasoned Internal Veterans (only)


After all, they know the most about our business, right? Sorry. Keep in mind that when I say the following, I’m indicting decades of dysfunctional organizational behavior, not individuals: if you assign the people who gave you what you have now, you’re going to end up with a system that’s not too much different than what you’ve had. Just in a new language. While it’s great fun to use the project to learn a new technology, you need a consultant with real world experience on the team. Of course you want plenty of internal SMEs on the team, too, but don’t let how your company does things now influence the new too much. Remember, you’re writing something new to replace something old because it doesn’t satisfy your needs. Think ahead. Consider how much faster you’re going to learn if you can bypass most of the pitfalls and dead-ends with an expert’s guidance. You don’t have to know everything. We all learn fastest from a good teacher. Organizations should not ignore this common sense.

About the “Redefinition” of Computer Terms


STOP!  You are trying to justify/rationalize a lack of understanding.  By doing so, the usefulness of the language is diminished.  Be clear on what OO and MVC are.  They are not defined by the Web or frameworks.

Let me illustrate.  What is an “apple”?  We all know, even though there are many, many different strains.  But still, the essence of apple, its apple-ness, distinguishes it from an orange.  The skin is different, the fruit is different.

Now let me redefine, since I am not clear on what makes an apple, an apple, much like the popularly misunderstood object-oriented and model-view-controller.  I place an orange and apple together, they’re both round fruit, so they are probably both apples.  My misunderstanding of the distinctions has caused me to mistakenly conflate the two items and lose important knowledge.  And I will proceed to make things like apple marmalade and orange pie, content that I’m using apples to make apple recipes.

The ability to draw correct distinctions and categorize correctly is in the top 5 traits of an excellent programmer.
