Tech Is Hard

Credibility = Talent x Years of experience + Proven hardcore accomplishment

Tag Archives: CftLT

Rebuilding existing systems


My favorite thing is to rewrite a legacy system. Not a “rewrite the whole system and flip a switch and we’re using it” job, but an incremental process that might take 12 to 18 months before you have a new system, without ever migrating. I’m calling it “rebuilding” to keep with all the construction analogies we use, because it involves re-engineering, re-architecting and rewriting. We should admit that most legacy systems, if compared to a building, are like the Winchester house. “Remodeling” might be a better term, because you keep the system functioning the whole time it’s being rewritten. But in the process, we demolish entire rooms and floors, rebuilding new ones that make sense.

Sometimes I describe it as performing a complete organ transplant while keeping the patient alive. Actually the patient’s health keeps improving as we operate.

I’ve been doing it since the beginning of my programming career, because it’s in my nature to always see “what’s missing” from the current. When that natural inclination is combined with the universal theme “Don’t touch it if you don’t have to. Don’t fix it if it ain’t broke.”, a person learns how to incrementally shim new ways of doing things into a software system.

At no fewer than two companies, I completely changed the internal architecture of existing systems this way. Huge systems, some in assembler and C. During the process, the way new software was written changed a lot, with tools and libraries of [macros | functions | classes] to remove the drudgery, hide the housekeeping, bring new expressiveness and shrink code.

Smaller code is better. As you see a larger and larger view of your software system and as the file system structure and classes begin to describe the process of your business system, dramatic gains are made. Applications behave consistently and users are happy. Error handling gets the attention it deserves (things go wrong a lot more than we like to think). As old PITAs go away, hidden inside pre-written code, we begin to think and write more elegantly. We start to imagine practical implementations where it seemed impossible before.

What’s been my secret weapon at successfully doing this? I admit that in my mind, rebuilding an existing system is easier than something brand new. I have a running model of what it should do, at least mostly do. The transformation becomes a process of changing, checking regressions and validating new behavior in the case of something that never worked right to begin with.
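As a rough sketch of what “changing, checking regressions” can look like in practice, the snippet below runs an old routine and its replacement over the same inputs and reports where they disagree. Everything here is hypothetical and only for illustration: `legacy_format_price`, `format_price` and `find_regressions` are invented names, not code from any system described in this post.

```php
<?php
// Hypothetical legacy routine: the running system is our model of what
// the code should do, so its output is treated as the expected value.
function legacy_format_price($cents) {
    $dollars = (int) ($cents / 100);
    $rest = $cents % 100;
    if ($rest < 10) {
        $rest = '0' . $rest;
    }
    return '$' . $dollars . '.' . $rest;
}

// Hypothetical replacement, written as a small library function.
function format_price(int $cents): string {
    return sprintf('$%d.%02d', intdiv($cents, 100), $cents % 100);
}

// Runs both implementations over the same inputs and collects every
// input on which they disagree. An empty result means no regressions
// were found for these inputs (not that none exist).
function find_regressions(callable $old, callable $new, array $inputs): array {
    $mismatches = array();
    foreach ($inputs as $input) {
        $expected = $old($input);
        $actual = $new($input);
        if ($expected !== $actual) {
            $mismatches[] = array(
                'input' => $input,
                'old'   => $expected,
                'new'   => $actual,
            );
        }
    }
    return $mismatches;
}

$regressions = find_regressions(
    'legacy_format_price',
    'format_price',
    array(0, 5, 99, 100, 123456)
);
```

Where the old behavior never worked right to begin with, a mismatch is not automatically a bug in the new code; it is a prompt to decide which of the two answers is the one you want to keep.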

I always work with a library mentality. Frameworks that hem everything into an arbitrary predetermined structure are for saps, and they would require a top-down approach, which gets you back to trying to rewrite the system from scratch. You have to work from the inside out, bottom up, to change the existing system. Wherever you create something that will be used for new development, you have to bounce back and forth between two goals: the most efficient and maintainable layering of internal functionality, and an application-level interface that is revolutionarily simpler and/or more powerful than existing ways of doing the job.

To inject the new into an old system, we also have to be ready to write things that we will throw away, and still devote full attention to their quality, lest we regress.

Populate a PHP Array in One Assignment


Imagine you are returning an array you need to populate with some values. This is typical:

    $myArray = array();
    $myArray['foo'] = $foo;
    $myArray['bar'] = $bar;

There are advantages to populating the array as a single assignment:

    $myArray = array(
        'foo' => $foo,
        'bar' => $bar
    );

Separating the assignment statements from the initialization (and the initialization to an empty array might be many lines earlier) makes it easier to corrupt the entries in the array; code might later be added in between the original statements. People may add things to $myArray without documenting it. And just like Where to Declare, I’ll have to scan more code to determine the effect of making changes to $myArray. The second way shows the _entire_ value of $myArray being set at once, instead of parts of it. It presents a visual of the return structure and makes it harder for $myArray to go awry. I don’t have to look any further to see what $myArray is at that point.
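Here is a small sketch of the same idea inside a function that returns an array. The function names and the `name`/`email` keys are made up for the illustration. The first version builds the result piece by piece; the second computes what it needs first and then states the whole return structure in one place.

```php
<?php
// Hypothetical example: the piecemeal style. Any line between the
// initialization and the return can add to or overwrite $result.
function describe_user_piecemeal($name, $email) {
    $result = array();
    $result['name'] = $name;
    // ... imagine many unrelated lines here ...
    $result['email'] = strtolower($email);
    return $result;
}

// Hypothetical example: the single-assignment style. Values are worked
// out first, then the entire structure is visible at the return.
function describe_user($name, $email) {
    $normalizedEmail = strtolower($email);

    return array(
        'name'  => $name,
        'email' => $normalizedEmail,
    );
}
```

Both functions return the same array today. The difference is how much code I have to read to be sure of that a year from now.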

Where to Declare


I think it’s common practice to declare (and sometimes initialize) all of a function’s local variables “at the top”, but I usually declare them close to where they’re referenced. Declaring at the top can lead to what I call functional striation, and it makes refactoring more difficult.

An example:

function innocent_looking_foo() {
    var A, B, C;
	
    do (something with A)
	
    do (something with B)
	
    do (something with C)
	
    return
}

Right from the outset, it looks like this function may not be very cohesive, but the truth is, one sees a lot of code that looks like this in the real world. Let’s say someone adds D and ‘somethingelse’ now has to be done with a few of the variables:

function innocent_looking_foo() {
    var A, B, C, D;
	
    do (something with A)
	
    do (something with B)
    do (somethingelse with B)
	
    do (something with C)

    do (something with D)
    do (somethingelse with D)
	
    return
}

or worse(?)

function innocent_looking_foo() {
    var A, B, C, D;
	
    do (something with A)
    do (something with B)
    do (something with C)
    do (something with D)

    do (somethingelse with B)
    do (somethingelse with D)
	
    return
}

In some shops, this kind of growth continues for years until no one remembers whether we really need to do each thing to each variable. And we have people who add “do (anewthing with D)” in between A and B. Remember that the “do” pseudo statements are usually represented by more than one line of code in our function. If I am working on any part of this, I have to examine the preceding code, all the way to the top, for references to the variable I’m concerned with. There may be dozens or, let’s be honest, hundreds of lines where a variable can be corrupted. There is more chance that the order of execution matters, which is bad if we can avoid it.

Now read this:

function foo() {
    var A
    do (something with A)
	
    var B
    do (something with B)
    do (somethingelse with B)
	
    var C
    do (something with C)
	
    var D
    do (anewthing with D)
    do (something with D)
    do (somethingelse with D)	
	
    return
}

This aligns the code with the variables it’s working on, and would be easier to refactor as it gets more complex. It’s easier to see repeating patterns and turn them into callable functions.

This guideline is probably even more applicable to variable initialization, irrespective of where it’s declared. Try not to separate the initialization from the first reference.
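PHP has no separate declarations, so there the guideline comes down to initialization. A minimal sketch, with a made-up `summarize_orders` function and an assumed `amount` key on each order, might look like this:

```php
<?php
// Hypothetical function for illustration; the 'amount' key is an
// assumption about the shape of each order.
function summarize_orders(array $orders): array {
    // $total is initialized right before the only code that changes it.
    $total = 0;
    foreach ($orders as $order) {
        $total += $order['amount'];
    }

    // Same for $largest: nothing above this point can have touched it.
    $largest = 0;
    foreach ($orders as $order) {
        $largest = max($largest, $order['amount']);
    }

    return array(
        'count'   => count($orders),
        'total'   => $total,
        'largest' => $largest,
    );
}
```

Each variable lives in its own small block, so either block could be lifted out into its own function without hunting through the rest of the body for stray references.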

Coding for the Long Term


“Long term” here means that this, like probably most things I say, may not apply to those who don’t have responsibility for evolving a system with a medium or large team of developers over years. In other words, it’s what I call “enterprise”.

Programming is made up largely of a series of decisions. What’s the next step in the process? What’s the next refinement to make? What’s the interface to my software? But the most frequent decisions are made as we write the code itself. How to represent the current state so that our code can be written to achieve the desired state. Do I use an array for this? Do I write a loop here or spend the time looking for a built in function? Do I need a “default” case in this switch?

How we represent our data, or state, will constrain the style of the code, at both a macro and granular level of detail. The shape of the two together has an effect far beyond completing the task at hand. The decisions we make when coding should be influenced by this knowledge.

I’m not saying I know for sure, but when I’ve explained to others before why I emphasized a particular way of doing something, they’d say “That makes a lot of sense. I never thought of it before”. Actually, it’s usually just a discernment for which of two choices is the better overall, and why.

So I’m going to humbly share some of that. Some of it may only apply to one language, but often the concepts are transferable.