Been making a point of fleshing this out & polishing it, while I’m actually working on stuff, for clarity’s sake. These are things I’ve had to learn the hard way, that they don’t (so far as I know) teach you in class…in fact, some of them are diametrically opposed to what they teach you in class. Well hey, those who can, do, those who can’t, teach…
It’s not just professors. Management has a tendency to “teach” the wrong stuff; they’re supposed to be all about producing positive results, doing more with less, but unfortunately they tend to gravitate toward making the job of managing easier. Which is not the same thing at all. I’ve noticed that the other job, my job, the designing & coding, is a young man’s game. There aren’t too many people who’ve stuck with it as long as I have, unless they’ve made it a point to avoid principal-engineer & design positions, and just do what they’re told. As long as it works for them, I won’t judge. Some of the young guys who held tech-lead positions over me & a lot of others, back in the day, went on, I see, to sell Amway or real estate just a few years later. So the institutional memory is lacking; it’s missing the advantage that masonry had, with journeymen & apprentices, while the cathedrals were being built hundreds of years ago. It isn’t common for someone in the coding business to actually jot down what they learned, unless they’re going into the book-writing business, in which case…yeah, they still quit what they were doing, and start writing books.
Well. This is what’s helped me, in the past, today, and probably will without much change in the foreseeable future. Take it for what it’s worth…the better job I do keeping them in mind, the better the results I see at the end…
1. Any proposed statement not specifically defined and validated as true must be presumed false. The only exceptions to this rule involve things that, by being false, would make your efforts easier. These must be presumed true. In short, presume Murphy. Presume everything is aligned against you until your tests prove it isn’t so…then, presume your tests are wrong.
2. Programmers create programs, and the purpose of a program is to define behavior. The job, therefore, is to define behavior. Bearing Rule #1 in mind, the mission becomes one of identifying and managing uncertainties. Any aspect of this left undone is failure, even if the shortfall is not recognized immediately.
3. Keep the machinery doing what machinery does, and keep the people doing what people do. When people have to act like automated processes in order to use your product, you built it wrong. When the automated process makes decisions by factoring in arcane, obscure and unpredictable experience & state data, like people do, you built it wrong. Either one of these sins will bring consequences in the form of diminished confidence felt by those who use it. The test is: does the user feel a sense of dread upon producing a stimulus, born of uncertainty about what the response will be? That should not be happening.
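A minimal sketch of the two sides of this rule, with names and behaviors entirely invented for illustration: the first parser quietly changes its response once some hidden internal counter crosses a threshold, which is the “machine acting like a person” sin; the second always maps the same stimulus to the same response.

```python
class MoodyParser:
    """Bad: the response depends on arcane internal state the user can't see."""
    def __init__(self):
        self.calls = 0

    def parse(self, text):
        self.calls += 1
        # After a few calls it silently switches delimiters -- the same
        # stimulus no longer produces the same response, and the user
        # learns to dread every call.
        if self.calls > 3:
            return text.split(";")
        return text.split(",")


def parse_csv_line(text):
    """Good: the same stimulus always produces the same response."""
    return [field.strip() for field in text.split(",")]
```

The dread described above is exactly what `MoodyParser` produces: the caller cannot predict the output of call number five from the output of call number one.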
4. People listen to speeches and machines run programs. Programs, therefore, are not speeches. It is said that a speech is like a skirt: it should be short enough to hold people’s attention but long enough to cover the subject. The program just has the job of making sure the subject is covered; all other objectives are secondary. Contrary to popular belief, there is no correlation between brevity in a computer program and the ease of its maintenance. That belief presumes sloppiness on the part of those who write long programs and neatness on the part of those who write short ones. It doesn’t hold, at least not with any logical certainty; it’s a myth propounded by those who consider themselves above the occasionally onerous task of grappling with details.
5. The product of my experience investigating situations where systems aren’t behaving correctly is a learned bias that the machines are doing exactly what they should be doing, and the people are the problem. That’s because mistakes have a tendency to originate with lack of definition (see rules 1 and 2). Machines and automated processes work according to complete definitions; people have the ability to work without complete definitions. That is a bug, and not a feature, with the people. The dysfunction in a system tends to start with the people, and with something they left undefined, or defined only inside their own heads and never communicated to the other people involved.
6. Error messages are unappreciated. A lot of people who might have been solid contributors in the field decide they’re not right for it and go do something else, because they find themselves confused a lot, and they’re confused because they’ve been reading bad error messages. The best-designed processes treat their mission as one of correctly reporting on whatever went wrong, as if a successful execution were the exception and not the rule. When fixing a bug that involves a malformed error message in the aftermath of something else that went wrong, always fix the error message FIRST, THEN proceed to the other condition that caused it. The rationale: the test with the malformed data but repaired error message is a valuable test, but the test with the repaired data and broken error message is worthless, because it effectively conceals an execution path known to be broken.
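A small sketch of treating the failure report as the mission (the function and its messages are mine, not from any particular codebase): the error message names the field, the offending value, and what was expected, so the reader is never left guessing. Note that per the rule above, the malformed-input paths get tested on the message itself, not just on the fact that something was raised.

```python
def parse_port(raw):
    """Return raw as a TCP port number, or raise with a report worth reading."""
    try:
        port = int(raw)
    except ValueError:
        # The failure path is the mission: say what was expected and what arrived.
        raise ValueError(f"port: expected an integer, got {raw!r}") from None
    if not 1 <= port <= 65535:
        raise ValueError(f"port: expected a value in 1..65535, got {port}")
    return port
```

Fixing the message first means a test feeding `"http"` into this function stays meaningful even while the upstream bug that produced `"http"` is still being hunted.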
7. A design can’t be good unless it solidly prioritizes its own objectives and then sticks to its knitting. These design objectives compete with each other. Example: A fragment of code can make use of a design pattern so it’s more maintainable across time, even with the introduction of new requirements, by engineers who are nominally familiar with the pattern. But it will be grossly unrecognizable and confusing to a coder who is not familiar with the pattern, even if he is experienced in the programming language. A decision not to use the pattern would result in code more readable to a new programmer, but more difficult to maintain. So there is mutual exclusivity here. Be aware. Choose your battles.
8. A great design takes testing into account, essentially beginning with the end in mind. Simple requirements translated into a complex suite of regression tests, manifest a mediocre design. A simple suite of tests, covering a complex patchwork of requirements, is a sign of a great design — assuming, of course, that the tests do indeed provide this coverage.
9. A good design delegates responsibility to as many layers as there are subjects to be addressed in the definition of behavior, with each layer having a substantial reason for being, but no layer taking on more than one subject within the definition. Each layer should be conceptualized and built with strict adherence to Design by Contract (DbC), Separation of Concerns (SoC), and the dictum that interfaces should be easy to use correctly and hard to use incorrectly. The design of these layers must define behavior in response to both success and failure of operations at run-time. The test of well-applied SoC is how much of the implementation has to change when a new requirement is introduced, or an existing requirement changes. If a relatively innocuous change causes a ripple effect throughout the application, that may reflect inadequate or ineffective separation. If the necessary change is contained, with the layer boundaries acting as a sort of “breakwater” so that the overwhelming majority of prior work escapes unmolested, that is a sign of strong, effective separation.
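A toy sketch of the “breakwater” effect, with all names invented: the service layer talks to storage only through a narrow contract, and the contract defines the failure behavior as well as the success behavior. Swapping the in-memory backend for a file or database store touches nothing above the boundary.

```python
from abc import ABC, abstractmethod

class UserStore(ABC):
    """The contract: one subject (persistence), nothing else."""
    @abstractmethod
    def save(self, user_id: str, name: str) -> None: ...
    @abstractmethod
    def load(self, user_id: str) -> str: ...

class MemoryStore(UserStore):
    """One backend. Replacing it changes no caller."""
    def __init__(self):
        self._data = {}
    def save(self, user_id, name):
        self._data[user_id] = name
    def load(self, user_id):
        if user_id not in self._data:   # failure behavior is part of the contract
            raise KeyError(f"no such user: {user_id!r}")
        return self._data[user_id]

def greet(store: UserStore, user_id: str) -> str:
    """Service layer: depends on the contract, never on the backend."""
    return f"Hello, {store.load(user_id)}!"
```

The layer boundary is the `UserStore` contract; a requirement like “persist to disk” lands entirely inside a new backend class, and `greet` escapes unmolested.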
10. If the most charismatic people are making all the decisions that matter, the project may already be in trouble. Making the definitions that have to be made for the project to succeed often comes at the expense of being interesting & fun; being interesting & fun often comes at the expense of making those vital definitions. Not always. But often. The litmus test: at the point these definitions are needed for work to continue, is it a common occurrence that guidance is already available because someone successfully anticipated the need? If not, refer back to Rule 5: people are the problem; they tend to spin new definitions out of whole cloth and proceed as if no one else could’ve arrived at a different definition. This is the point of team dysfunction, where the team starts to produce work inferior to what any one of the members could have produced working in solitude, or fails to address problems that would have easily been solved by any one of them working alone. In such a situation, the advantages of charismatic leadership are mostly neutralized.
11. “Technical debt” is a great term. If your project takes on a life of its own and becomes self-sustaining, manage T.D. just like real, corporate debt. Pay what you can against it, when you can, allow it to languish a bit only when you have no other choice, get back to reducing it again just as soon as you can, down to zero if possible. And if you can’t get to it, you’d better get busy finding out why.
12. Programmers are not system administrators, and sys admins are not programmers. The only time it makes sense to have the same people doing both of these things is when the operation is too small to practically divide the roles among separate personnel, in which case it’s best to think of it as administrator-less. There are many rationales for this. The first is that system admins and programmers labor toward different goals: the former toward continuity, the latter toward progress against time, which translates to frequent, invasive change. The second is operational security, which can be compromised if these roles are not separated.