How I thought like a compiler 20 years ago to create a reporting system processing millions of 32 KB varying-length records on tape per hour against hundreds of user-defined report packages.
Imagine, if you will, a young lad in front of a 3270 terminal, a green cursor blinking in his face, eager to do the impossible. A couple of years before (1988–89) I had been a contractor for my now-employer and had written a set of assembler macros that could be used to define reports against a proprietary “database” of varying-length documents, up to 32 KB each. To pack as much sparse data as possible onto tape cartridges, each record contained fixed and varying sections, with embedded bitmaps used to drive the data access code. My manager had been the original author (there are not many people who could have pulled it off; in retrospect, hats off, Jeff and Phil). Each database had its own “schema”: a file of assembler macros that was compiled into a load module of control blocks.
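To make the bitmap idea concrete, here is a minimal sketch in Python of how a presence bitmap can drive access to varying sections. The layout is entirely hypothetical (an 8-byte fixed header, a 1-byte bitmap, then a length-prefixed varying section per set bit); the real proprietary format was different and more elaborate.

```python
import struct

# Hypothetical layout (the real format differed): an 8-byte fixed
# header, a 1-byte presence bitmap, then one 2-byte-length-prefixed
# varying section for each bit set in the bitmap.
SECTION_NAMES = ["order", "item", "ship_to", "bill_to",
                 "notes", "tax", "discount", "audit"]

def parse_record(buf: bytes) -> dict:
    """Walk a record, using its embedded bitmap to locate varying sections."""
    sections = {"fixed": buf[:8]}
    bitmap = buf[8]
    offset = 9
    for bit, name in enumerate(SECTION_NAMES):
        if bitmap & (1 << bit):                      # section present?
            (length,) = struct.unpack_from(">H", buf, offset)
            sections[name] = buf[offset + 2 : offset + 2 + length]
            offset += 2 + length                     # skip to next section
    return sections

# A record with only "order" (bit 0) and "notes" (bit 4) present:
rec = b"HEADER00" + bytes([0b00010001]) + b"\x00\x03ABC" + b"\x00\x02hi"
parsed = parse_record(rec)
```

Absent sections cost zero bytes on tape, which is the whole point: sparse data stays sparse, and the access code never scans for data that the bitmap says is not there.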
Complementing the low-level code was a set of application macros to locate or manipulate data in the record and to examine the schema. In principle they allowed schema-independent programming, but in reality, coding for every possible data type and every configuration in which the application data might lie was prohibitive, so some assumptions were usually made: “order records and item records may occur multiple times, always”; “an order will always have an amount and a date field”.
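The schema-driven style those macros aimed for can be sketched in Python. Here a plain dict stands in for the compiled control blocks, and a generic accessor consults it so the caller never hard-codes whether a segment repeats; all segment and field names below are made up for illustration.

```python
# Illustrative stand-in for the compiled schema control blocks: the
# schema tells generic code which segments exist and whether they
# repeat, so application code needn't assume a layout.
SCHEMA = {
    "order":    {"repeats": True,  "fields": ["amount", "date"]},
    "customer": {"repeats": False, "fields": ["name", "region"]},
}

def occurrences(record: dict, segment: str, schema: dict = SCHEMA):
    """Yield every occurrence of a segment, repeating or not."""
    spec = schema[segment]
    data = record.get(segment)
    if data is None:
        return                          # segment absent from this record
    if spec["repeats"]:
        yield from data                 # list of occurrences
    else:
        yield data                      # single occurrence

record = {
    "customer": {"name": "ACME", "region": "NE"},
    "order": [{"amount": 100, "date": "2003-01-05"},
              {"amount": 250, "date": "2003-02-11"}],
}
total = sum(occ["amount"] for occ in occurrences(record, "order"))
```

The prohibitive part, as the text says, was doing this for *every* data type and configuration; shortcuts like “an order always has an amount” crept in precisely because the fully general version was so expensive to write.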
When I built the original reporting system, it had to work for all schemas, which meant putting a lot of limitations on the possible reports. We were really coding to meet a certain set of specific reports, but in a generic way. It worked pretty well, but the code had to be modified constantly to accommodate “unforeseen” reporting needs.
When I was tasked with the next generation, I knew I wanted the report definitions to be in a “grammar”: some structured way to define a report that took into account nesting, ranging, counting, and summing. I also knew that it had to handle any schema and any “reasonable” report. I shocked the team lead when I threw the stack of new reports we were to use as our requirements into the trash. I said, “Let’s write a spreadsheet concept and forget the specifics. We’ll check them from time to time to make sure what I’m doing satisfies them, but I want my blinders off.”
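The grammar idea can be sketched as a declarative report spec interpreted by a generic engine. This single-level Python sketch shows only the ranging, counting, and summing pieces (the real grammar was assembler-macro based and supported nesting); every name here is invented for illustration.

```python
# A report as data, not code: a generic engine interprets the spec
# against any rows. Single-level grouping only; the real grammar
# also allowed nested groupings.
REPORT = {
    "group_by": "region",
    "range": ("amount", 50, 500),   # ranging: keep 50 <= amount <= 500
    "count": True,                  # counting: tally rows per group
    "sum": "amount",                # summing: total a field per group
}

def run_report(rows, spec):
    field, lo, hi = spec["range"]
    out = {}
    for row in rows:
        if not (lo <= row[field] <= hi):
            continue                           # out of range: drop row
        bucket = out.setdefault(row[spec["group_by"]],
                                {"count": 0, "sum": 0})
        if spec["count"]:
            bucket["count"] += 1
        bucket["sum"] += row[spec["sum"]]
    return out

rows = [{"region": "NE", "amount": 100},
        {"region": "NE", "amount": 900},       # out of range, dropped
        {"region": "SW", "amount": 250}]
result = run_report(rows, REPORT)
```

Because the report is just data, adding a new report means writing a new spec, not modifying the engine, which is exactly the escape from the “modify the code for every unforeseen report” trap described above.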
Doing all of these things would be hard even (maybe especially) with a modern database technology. With our data store I realized I had to think of something new, and I spent weeks drawing boxes and arrows and writing little tests. When I was finished, my programs could handle a great variety of reports and process data faster than schema-specific code. Marketing could now define their own reports to analyze the sea of demographic data we held.
Next: what a sample report’s data looked like.