Analysis of ISO 29119-2: Test Processes

This is the second post in a series following and analyzing the ISO 29119 standard. Most of the essential context references were covered in the first post, Analysis of ISO 29119-1. One thing that has changed since that first post is the AST Committee I proposed has been formalized. Watch for more from us soon!

So, what can we expect in Part 2 of the Standard?

ISO/IEC/IEEE 29119 supports dynamic testing, functional and non-functional testing, manual and automated testing, and scripted and unscripted testing. The processes defined in this series of international standards can be used in conjunction with any software development lifecycle model. Each process is defined…and covers the purpose, outcomes, activities, tasks and information items of each test process.

Can’t wait!

Please remember I am criticizing the standard (and the idea of a testing standard), not the people who worked on it. I believe that smart, experienced people attempted to lay out their view(s) of testing, hoping to help people test effectively. I think that in the right discussion about the many contexts software might be tested in, they might concede that no prescriptive standard can be relevant and useful in every context. In fact, some of them are already doing that. Whatever the shortcomings of 29119 (and there are plenty), it could never possibly satisfy its mission, even if it were a better standard than it actually is.

TL;DR

My best-practice, conform-ational approach is to summarize my primary conclusions at the top of my blog posts, sparing tens of readers the post’s full brilliance. Here are my “above the fold” takeaways from analyzing ISO 29119-2:

  • 29119 literally puts process (Part 2) before technique (promised in Part 4, still not published)
  • 29119 claims to be applicable to testing in *all* software development lifecycle models, despite heavy documentation and compliance burdens
  • 29119-2 has Conformance on page 1. To claim Conformance, there are 138 “Shalls” to conform to in this document. To claim “Tailored Conformance” without meeting every “Shall”, “justification shall be provided…whenever a process defined in…29119 is not followed”
  • Part 2’s vocabulary section has conflicts, revisions, and pointers to new terms relative to Part 1. This is not a “gotcha” – but is worth remembering when someone claims that with a test standard “At least there is a common vocabulary for testing”.
  • Conformance is driven by fear. Fear is the mind-killer.
  • Some of the “shalls” are highly specific. Some are vague and hard to understand. Some, through reference, contain multitudes. Some are nonsense.
  • The standard is not detailed enough to be very useful to someone who doesn’t already understand a fair amount about testing, yet an experienced tester would waste a lot of time and effort attempting to comply with it.

Conformance

29119-2 goes to Conformance very early – Page 1. Either Full or Tailored conformance can be claimed for the standard.

  • “Full conformance is achieved by demonstrating that all of the requirements (i.e. shall statements) of the full set of processes defined in this part of ISO/IEC/IEEE 29119 have been satisfied.”
  • “Tailored conformance is achieved by demonstrating that all of the requirements (i.e. shall statements) for the recorded subset of processes have been satisfied. Where tailoring occurs, justification shall be provided (either directly or by reference), whenever a process defined in…ISO 29119 is not followed. All tailoring decisions shall be recorded with their rationale, including the consideration of any applicable risks.”

I can find no guidance on what “the recorded subset of processes” means – nor on what the various nesting levels of “process” in the standard are. Are these the processes that reference record-keeping and documentation? I bet I can find a consultant to help not-interpret that…

There is a “Reference” example given for exclusion from the requirement for providing direct justification:

“Where organizations follow information item management processes in standards such as ISO 15489… ISO 9001…or use similar internal organizational processes, they can decide to use those processes in place of the information item management tasks defined in this part of ISO/IEC/IEEE 29119.”

So, no exclusion from the requirement to document and describe the justifications – just an exclusion from the requirement to provide a separate document including these justifications for ISO 29119, as long as they are in another document somewhere else.

After 10 months, the only defense the authors of the standard have raised to questions about the difficulty of compliance is to claim it is more flexible than what the standard actually says:

 

… and that’s the last message in the conversation. I suppose we could take the word of a standard author over the standard itself, which says with little ambiguity under Intended Usage: “The organization shall assert whether it is claiming full or tailored conformance to this part of ISO/IEC/IEEE 29119”.

Clashing Definitions

Section 2 spells out definitions for some terms. There is overlap with Section 1 – and some disagreement with what was found there.

For example, in Section 1, Feature Set meant “collection of items which contain the test conditions of the test item to be tested which can be collected from risks, requirements, functions, models, etc.” Section 2: “logical subset of the test item(s) that could be treated independently of other feature sets in the subsequent test design activities”. Additional differences, revisions, and pointers to new terms are found. This is not a “gotcha” – but it is worth remembering when someone claims that a test standard at least provides “a common vocabulary for testing”: ISO 29119 already diverges on critical definitions between its first two parts.

At least these terms are interesting to think about. It’s far less interesting to trace the relationships between test activity, test item, test condition, test requirement, test phase, test plan, test policy, test planning process, test procedure, test procedure specification, test process, test sub-process, test script, test set, test models, test technique, test specification, and test type. Yes, these are all separate things, but time spent debating their boundaries is time not spent “testing”.

Exploratory testing is again defined as “spontaneously designs and executes”, not “simultaneously” as we define it.

Process and Hierarchy

[Diagram: the standard’s hierarchy of test processes]

This diagram shows a hierarchy of test processes. It doesn’t actually cover all the processes referenced in the standard, despite the caption’s claim. The diagram does demonstrate the standard’s insistence on separating control processes from execution processes.

It is intended to illustrate that each vertical layer defines the one below it. At the top, the Organizational Test Process defines organizational test policies, strategies, processes, procedures, and “other assets”. Test Management Processes are defined at the project level, and Dynamic Test Processes are said to control a phase or a particular type of testing.

This seems tailored for adoption by the mid-level executive who wants to put their stamp on an organization’s entire testing practice. Over and over again, the standard lays out separate process nodes for each possible step of testing. This exhaustive documentation of the steps involved in one view of testing is way too much for an experienced tester, who would rather provide useful information to stakeholders. It’s still not enough to arm someone with no testing experience to plan and supervise good testing. So who is it for?

When Fear Drives Testing


Software testing is frequently perceived as a high-risk, low-reward activity by people who aren’t testers. It’s thought of as a cost center (“there is no ROI in testing”) and if anything goes wrong, someone’s in trouble. Over and over again, testing is blamed for poor quality, despite the fact that most people who work in software engineering know “you can’t test quality into the product”. Testing is often thought of as less intellectually rigorous than other parts of software engineering, frequently is not a prestigious area to work in, sometimes is led by people without real training, experience, and/or skill in testing, and is often a convenient scapegoat for quality issues – particularly by people who should know better.

Many people who work in testing fear the buck stopping on their desk after a quality failure, and for good reason. If you are likely to be blamed for a bug escape, the most rational response for a skilled person might be to interrogate the context and demand the tools and latitude to gather the most comprehensive and useful set of information about the system under test.

If you are controlled by fear, you might shy away from the responsibility, and look for some cover under best practices. After all, if you faithfully observed and obeyed someone else’s plan, you can’t be blamed if the plan fails, right? It wasn’t you, it was the plan!

If you don’t know what you are doing, you might be even more likely to seek the comfort of an externally defined standard that removes your responsibility to decide what to do. If you don’t trust your team (and yourself), you hand off control to someone or something else. Like a prescriptive standard, full of “shall statements” to replace “you thinking”.

The standard is still not detailed enough to be very useful to someone who doesn’t already understand a fair amount about testing, yet an experienced tester could waste a lot of time and effort trying to comply with it. Any discussion of actual techniques seems to be waiting for 29119-4 – at one point promised for late 2014, currently late in the approval process.

You Shall…

There are 138 instances of “shall” in this document. Some of them are highly specific. Some, by reference, contain multitudes. Some are simply nonsense. Some of them are too vague to be useful, though that may make them more applicable in multiple contexts. Some real wisdom can be found in here.
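
For the curious, the count is easy to reproduce. Here’s a minimal sketch, assuming you have the standard’s text extracted to a local file – the filename is my own invention, since ISO doesn’t hand these out for free:

```python
import re

# Count the requirement statements ("shall") in an extracted copy of
# the standard. The filename is hypothetical.
with open("iso29119-2.txt", encoding="utf-8") as f:
    text = f.read()

# Word-boundary match, so "marshall" and "shallow" don't inflate the count.
shall_count = len(re.findall(r"\bshall\b", text, flags=re.IGNORECASE))
print(shall_count)  # 138, by my count
```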

I spent some time pulling apart the various processes, sub-processes, dependencies, and circular references. Rather than try to further sketch out the overall shape of process (and documentation) requirements, I present my 10 most entertaining/concerning/Kafkaesque “Shall Statements” in ISO 29119-2:

  1. The person responsible for organizational test specifications shall implement the following activities and tasks in accordance with applicable organization policies and procedures with respect to the Organizational Test Process.
  2. The organizational test specification requirements shall be used to create the organizational test specification.
  3. Appropriate actions shall be taken to encourage alignment of stakeholders to the organizational test specification.
  4. The traceability between the test basis, feature sets, test conditions, test coverage items, test cases and test sets shall be recorded.
  5. The testing of the feature sets shall be prioritized using the risk exposure levels documented in the Identify and Analyze Risks activity (TP3).
  6. Any risks that have been previously identified shall be reviewed to identify those that relate to and/or can be treated by software testing.
  7. Each required test activity in the Test Strategy shall be scheduled based on the estimates, dependencies and staff availability.
  8. Those actions necessary to implement control directives received from higher level management processes shall be performed.
  9. Readiness for commencing any assigned test activity shall be established before commencing that activity, if not already done.
  10. The test coverage items to be exercised by the testing shall be derived by applying test design techniques to the test conditions to achieve the test completion coverage criteria specified in the Test Plan…
    NOTE 2 Where a test completion criterion for the test item is specified as less than 100% of a test coverage measure, a subset of the test coverage items required to achieve 100 % coverage needs to be selected to be exercised by the testing.
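
To make #10 and its NOTE 2 concrete, here’s a minimal sketch – my own illustration, not anything from the standard – of what “selecting a subset of coverage items” to meet a less-than-100% completion criterion amounts to:

```python
import math

def select_coverage_subset(coverage_items, target):
    """Pick enough coverage items to satisfy a completion criterion,
    e.g. target=0.8 for an 80% coverage requirement."""
    needed = math.ceil(len(coverage_items) * target)
    # NOTE 2 says a subset "needs to be selected" but not how. Which
    # items you pick -- by risk, by cost, alphabetically -- is exactly
    # where the actual testing judgment lives, and the standard is silent.
    return coverage_items[:needed]

items = [f"condition-{i}" for i in range(1, 21)]  # 20 derived coverage items
subset = select_coverage_subset(items, 0.80)      # Test Plan says 80%
print(f"{len(subset)} of {len(items)} items = {len(subset)/len(items):.0%}")
```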

It’s not all baffling. Here’s a richly meaningful shall statement that demonstrates something about the depth necessary to understand context:

A Test Strategy (comprising choices including test phases, test types, features to be tested, test design techniques, test completion criteria, and suspension and resumption criteria) shall be designed that considers test basis, risks, and organizational, project and product constraints…

NOTE 3 This takes into consideration the level of risk exposure to prioritise the test activities, the initial test estimates, the resources needed to perform actions (e.g. skills, tool support and environment needs), and organizational, project and product constraints, such as:
a) regulatory standards;
b) the requirements of the Organizational Test Policy, Organizational Test Strategy and the Project Test Plan (if designing a test strategy for a lower level of testing);
c) contractual requirements;
d) project time and cost constraints;
e) availability of appropriately-skilled testers;
f) availability of tools and environments;
g) technical, system or product limitations.

Mapping

The last third of 29119-2 is an Annex mapping clauses of other standards (ISO 12207, ISO 15288, ISO 17025, ISO 25051, BS 7925, and IEEE 1008) to 29119-2. Rather than critique these other standards, I will simply question the value and purpose of this exercise. Is it to justify the standard, or to prove that it equals or even supersedes the others?

Conclusion

We still have parts 3 (and 4? soon?) of 29119 to go. Having processes defined before considering what we want to accomplish will guarantee we end at our desired results (whatever those might be), right?

Debates, Rent-Seeking, Personal Attacks, and Apologies

Yesterday, XBOSoft hosted a webinar entitled “The ISO 29119 Software Testing Standard – Friend or Foe?”. Philip Lew moderated a somewhat formally constructed debate. The panelists were Jon Duncan Hagar, Rex Black, Jean Ann Harrison, and Griffin Jones.

As a close follower of this discussion, I’d say that not a lot of new ground was broken, but the significant issues were discussed, and it was the right size of Twitter event to participate in. Claire Moss provided detailed play-by-play. Iain McCowatt had some pointed comments. Lalitkumar Bhamare, Perze Ababa, Tim Western, Kate Falanaga, and @dsynadinos also contributed. The hashtag was #ISO29119, but some tweets are under #ISO29119Debate. I don’t intend to describe the whole event, but to talk about something specific.

One of the specific things that ignited discussion was the term “rent-seeking”. When Griffin mentioned the term during the debate, it definitely got a strong reaction. Even Phil, as the moderator, objected. Here is the resulting discussion:

Rex’s response today led to me writing this blog post, because I need more space than Twitter to properly respond.

About Rent-Seeking

I first heard this term in James Christie’s talk at CAST 2015. James borrowed the term from economics, his degree subject, using it in his talk about standards, certification, and regulation. His hypothesis was that the term “economic rent” was useful in discussing some of our community’s concerns, by providing language for describing how some of the actors/vendors in the testing market can exert control on how business is done in that market. I recommend watching his talk, because he certainly says it better.

The term “rent-seeking” resonates for me, in both professional and political contexts. Like many really delicious words, it is rich in meaning. It describes a complex process that takes place in many industries, in many countries, in many contexts. There are degrees; pointing to a financial benefit from an activity doesn’t mean that it is inherently unethical. It does mean that the interests of people who profit from these arrangements are judged differently.

Some vendors have used standards for financial gain. This should not really be in question.

The definition on the mobile Wikipedia site – the one Rex referenced – is even harsher in making an ethical and value judgment. It says “rent-seeking is expending resources on political activity to increase one’s share of existing wealth without creating wealth.”

All this makes Rent-Seeking a loaded term to use in a discussion. That doesn’t mean it is inaccurate or not useful. Still, I should find a less pejorative term here that describes how a market changes when requirements for entry are added.

The Personal and the Political

I respect and admire Rex Black, who has had a long and successful career in software testing and training. He has worked hard and achieved a lot. He’s smart and funny, and I’ve enjoyed our brief in-person interactions. I think that Rex can point to hundreds (thousands?) of people he’s trained in software testing, and say that he has helped many people’s careers. Rex has put a great deal of time and effort into ISTQB, ASTQB, and other testing organizations, and can say that he has given a lot back to testing. He’s spoken and consulted all over the place. It’s more than a little presumptuous for me to call him a colleague, given his accomplishments.

That being said, Rex and I have fundamental disagreements on some issues in software testing. He’s taken a lot of fire from all over the CDT community. I once joked about him trolling CDT, and he suggested the opposite is true. I think that is exactly correct: as the personification of Big Quality to many people in our community, he’s taken some pretty rough criticism, and some of it has been personal.

I am friends with and respect some of the people that Rex has blocked on Twitter, but I can also understand why he’s blocked them. Many professional disagreements end up as personal ones; you don’t have to look too far in our community to see formerly close friends/mentors/partners/collaborators who won’t speak with each other today. In some of these cases, I think the person choosing to disengage has made the right choice. I don’t think any of these people are terrible, just that we’re still basically monkeys that wear clothes and sometimes throw shit at each other.

I won’t adjudicate Twitter behavior – but no one comes out ahead when someone feels attacked. We have professional issues about the current and future states of software testing to debate, and it is unfortunate that we find ourselves here: many people disagreeing with Rex, and him expected to debate dozens of people by himself. I don’t know who to nominate to help him, but I empathize that it must be exhausting. I feel that his willingness to articulate another point of view and engage in debate makes him a resource for our community that we should value.

I think it’s problematic to tell people what they ought to be or are allowed to be offended by. It’s hard not to internalize criticism of work that’s important to you, so that criticism should be done with care. You should be very, very careful about assigning motives to someone, and be even more careful in questioning their ethics. You can make a contribution by being slow to take offense, and patient and forgiving with those who make mistakes, but you also need to take care of yourself.

My Role, and My Apology

I’ve made some pointed comments in debating Rex, trying to walk the line of criticizing specific things without being personal. He has objected to some of the things I’ve said that have been too harsh. When I have agreed, I have apologized. He has accepted these apologies, and has not blocked me. I greatly respect and appreciate that.

Yesterday, I implied that Rex offering ISTQB certification classes after helping define the ISTQB standards was a form of Rent-Seeking. I put a sharp edge on it, and threw it out in a tweet with about 5 seconds of thought. I was wrong, and I am sorry I said it.

I was wrong because the point could have been made without implying that Rex was trying to rig a market, intended to shut down competition, was seeking to purchase favorable regulation like a giant agribusiness, or was otherwise behaving unethically. The “unethical” charge should not be made casually – and in this case, it’s just not supportable. I should have been more respectful, and more precise in my criticism.

What I Should Have Said

I believe that all ISO standards are proposed as standards for governance, whether corporate or legal (in person, I might have added a sly comment referencing conservative political philosophy being opposed to needless regulation). These governance standards add overhead and barriers to entry for testing companies, particularly small ones, because they require study and compliance work. When a standard is proposed, its very nature is that it wants to become as widely observed as possible.

The claims that ISO29119 makes about correctness and applicability are very broad, and from what I’ve read so far, it does not include qualifications on when, where, or how it should NOT be applied. When the lead author was asked about this, he said you can forego the standard “If you work in your garage (and are) not working with any clients.” I disagree with this, to say the least.

Certifications make similar claims of applicability, and have similar effects in creating barriers to entry for individuals into the profession of testing. When companies require a certification for hiring – that is the goal for every certification, to become the standard measure of competence, right? – they are making it harder for testers to get and change jobs without investing time and study into achieving these certifications. By participating in writing these certifications, and then training people for the certification tests, one realizes a financial benefit while simultaneously imposing a cost of entry, reducing the efficiency of that market.

That is the extent of what I meant to say, and it’s damn close to the definition of Rent-Seeking. It’s still wrong to attack someone’s motives and ethics, and it’s toxic when you are trying to debate important issues.

Again, I am sorry.

Analysis of ISO 29119-1

The first sentence of the ISO 29119 Introduction states:

The purpose of the ISO/IEC/IEEE 29119 series of software testing standards is to define an internationally agreed set of standards for software testing that can be used by any organization when performing any form of software testing.

Don’t think I need to emphasize the “any”s, but go ahead and bold those in your mind if it helps.

TL;DR

Like my other blog posts, this one far exceeds widely-held standards for blog post length, using the best practice metric of word count. So, to summarize:

  • I have some experience with creating ISO Standards in another field. My experience was that it is a difficult and highly political process.
  • Of course our community objects to standards, because they cause crappy testing, here defined as not providing a meaningful, accurate account of quality risks of a project. ISO 29119-1 says it “is informative and no conformance with it is required”.
  • To ISO 29119-1, “Testing” is a “set of activities conducted to facilitate discovery and/or evaluation of properties of one or more test items”, noted to include “planning, preparation, execution, reporting, and management activities, insofar as they are directed towards testing.” It seems that the term “test item” (a system, a software item, a requirements document, a design specification, a user guide) is deliberately constructed to suggest these items can all be tested.
  • This particular document (ISO 29119-1) has a lot of definitions and Naming of Documents. The SDLC model referenced for context is Waterfall-y (Not Iterative).
  • Exploratory Testing, or as it has recently come to be called, “Testing”, is found buried under “Experience-Based Testing”. Further discussion is promised in 29119-2.
  • “Risk-based” is referenced multiple times, with strong language around claims of wide adoption. Given that choosing what to test should always(?) involve evaluating risk, this is hard to argue with: “…truisms…to those new to software testing they are a useful starting point”, indeed.
  • Overall? Several years behind the state of the art, overly focused on formality and control, and barely concerned with technique.

Context and Perspective

I spent a little more than two years in a Working Group like the one that produced ISO 29119. I worked on trusted archiving of digital data, specifically document and records management. Most of the stuff I wrote concerned storage requirements for permanent, reliable, auditable, and discoverable electronic storage of documents. Others worked on PDF standards – one great example of how and where standards are helpful. Of course, a certain vendor associated with PDF tools drove a lot of very specific format standards, so vendor capture is a real concern as well. The last point: standards get written by people who keep showing up.

It was occasionally interesting work to research and write about the reliability and trustworthiness of the different types of proprietary electronic WORM storage at a time when write-once optical media was generally being replaced by software. It was also very deliberative, highly political, entirely subject to who got to edit last, and all progress on publishing anything was blocked by one person’s two-handed, white-knuckled grip on every work item. They were very busy in their consulting business – having successfully marketed their committee position. It was with some relief that I resigned from the group, though there was some useful work being done there. Eventually, there could be guidance for how to properly store electronic documents and records so that data will be permanently preserved – a real problem that lacks clear guidance.

I believe standards can be very useful in some circumstances. A standard for what constitutes proper archival of scanned documents makes a lot of sense. Railways, communications protocols, food labeling, and a thousand more things are all good examples of areas where standards can be very valuable, even essential. I just don’t agree that software testing is one of them.

I strongly identify with the Context-Driven School of Testing (CDT), so it should be no surprise that my default position on any testing standard is against. I’ve said previously that my community could object to a standard without knowing its content, because a standard means that context follows process, whatever its prescriptions. By existing, it conflicts with the idea that context should inform method. As Fiona Charles says: “Where is the standard for writing software?” I’d add “Where are the standards for writing, editing, cooking, or painting?”

I came to my understanding of why making software is not manufacturing on my own, well before I found my way to CDT. To refine my problem statement for Sick Sigma, bug density metrics, standards, and anywhere else where someone is trying to force industrial quality processes and thinking on software: making software is not producing widgets. All software projects are one-offs created by developers, testers, and project staff of various skill levels and engagement, to solve different problems, using different tools and often shared components, based on variably incomplete and conflicting understandings of requirements. In a given context, at a given time, a team of (hopefully) smart and (always) flawed people work together to create something that barely works, and follow up with as much bug-fixing as they are allotted time, money, and energy for. Requirements, conditions, and the skill and engagement of the people on the team are all moving targets.

Criticism aside, there is a large contingent of people who worked on these standards for many years. I respect their effort, and once someone decided there needs to be a software testing standard, a lot of work went into making these happen. There is some good work in here. There are other things I disagree with. I consider the people that worked on this standard colleagues, and there is no personal animus. We disagree about software testing, and this is reasonable professional debate.

I know that there are people who want predictability and certainty in planning projects. Sometimes the stakes are high enough (or feared to be high enough) that a detailed structure such as this process definition will comfort a stakeholder just by virtue of its specificity and number of control metrics.  I also understand that some of the context in any situation is externally imposed, and not always available for debate.

I think that employing skilled and experienced testers who study context, asking them lots of questions and listening closely to the answers, and leaving broad latitude in implementation is a better strategy. I think that the approach to software testing I espouse can lead to better results than following a cookbook, but it is possible to screw up either (or any!) approach. A meal prepared by a skilled cook from the ingredients on hand is delicious, nutritious, and sustainable. I don’t eat McDonalds, and I don’t recommend anyone else does, either.

If cheap and fast are the objectives, you might make a different choice. There is enough room in the world for people to use whatever approach makes sense for them. The problem is that when a specific approach is published as an international standard, it is asserted to be the best way to test software, using prescription to trump skill. When we join projects that already have engineering managers (or even worse, test managers) invested in following standards, they are A). Almost certainly in dire need of our help, and B). Going to be hard to separate from the security blanket of Bestandardices. Our stakeholders and the craft of testing suffer as a result.

I will report on the standard from my perspective as a Context-Driven Tester. I intend to review the pieces of ISO 29119 separately at first, as they are a lot to digest. Also, these get long enough. Let’s start with Part 1, or ISO/IEC/IEEE 29119-1:2013 as it is formally known. I’ll try to extract points of interest, but when other parts (2-5) of the standard are explicitly referenced, I’ll save the discussion of those issues until we get to the appropriate part of the standard.

This is a lot of stuff to read, friends. I’ve been chipping away at this for weeks. I encourage you to do your own review if you have an interest.

Standard’s Introduction

As noted, the standard starts with a statement of purpose. Next, there is an acknowledgement of many contexts (domains, organizations, methodologies), yet the standard is claimed to be applicable across them all.

In the next part, words like “conformance” and “required” appear. No conformance is possible with this part; it is 29119 parts 2-4 that are standards “where conformance can be claimed”. There is something important about safety and liability to think about there.

Definitions

In the next section, almost a hundred terms are defined. Someday, our community should consider creating a dictionary of terms we can debate from.

“Test case(s)” appears 30 times. The definition of test case includes preconditions (state?), input, and expected results. “Test item” is referenced here as something to be executed, while defined elsewhere as an object of testing (system, piece of software, requirements, document, user documentation). Test cases are noted as the lowest level of test input (cannot be nested) for a “test sub-process”.

Exploratory testing uses “spontaneously” where we would typically use “simultaneously” in front of “designs and executes”; perhaps this is a simple error. “Unscripted testing” is dynamic testing where the tester’s actions are not prescribed by written instructions in a test case. So this implies that they usually are?

“Test coverage” is the percentage to which test coverage items have been exercised by a test case or test cases. This seems difficult to calculate. A test coverage item is an attribute (or attributes) derived from a test condition (or conditions) using an unspecified test design technique.
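
Here’s a minimal sketch (my own example, not the standard’s) of why the calculation is slippery: the percentage is only as meaningful as its denominator, and the denominator depends entirely on which technique enumerated the coverage items in the first place.

```python
# The same executed tests yield very different "test coverage"
# numbers depending on how the coverage items were enumerated.
exercised = {"age=17", "age=18"}

# Two plausible enumerations of coverage items for the same condition:
boundary_items = {"age=17", "age=18"}                        # boundary values only
partition_items = {"age=17", "age=18", "age=-1", "age=200"}  # equivalence classes

for name, universe in (("boundary", boundary_items),
                       ("partition", partition_items)):
    pct = len(exercised & universe) / len(universe)
    print(f"{name}: {pct:.0%}")  # boundary: 100%, partition: 50%
```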

Testing itself is a “set of activities conducted to facilitate discovery and/or evaluation of properties of one or more test items”, noted to include “planning, preparation, execution, reporting, and management activities, insofar as they are directed towards testing.” It seems that the term “test item” (which as defined here includes both software and documentation) is deliberately constructed to suggest these items can all be tested, and it is explicitly stated that it is not necessary to execute software to test. For example, a specification could be “tested” for correctness against requirements, and this would be considered a testing activity by this standard. It apparently would also be testing to discuss which team member will be responsible for reviewing the standard.

Other notes:

  • “Oracle” is missing altogether; pass/fail criteria is the substituted term.
  • Three separate terms are used to describe parts of equivalence partitioning. This is the only real testing skill described in this section (a quick sketch of the technique follows this list). Later, fuzz testing and a few sampling techniques (for choosing test cases) are referenced, but not described.
  • Black box testing is folded into “specification-based testing”.
  • Documents are Named for Several Things, including Describing the Status of each Test Data Requirement and a Separate Document describing Test Environment Requirements (caps party inspired by ISO 29119-1).
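
Since equivalence partitioning is the one technique that gets real treatment, here’s a minimal sketch of the idea for readers who haven’t met it – my own toy example, not the standard’s:

```python
# Equivalence partitioning in miniature: divide the input domain into
# classes whose members the software should treat identically, then
# test one representative per class instead of every possible value.

def ticket_price(age: int) -> float:
    """Toy system under test (my invention)."""
    if age < 0 or age > 130:
        raise ValueError("implausible age")
    if age < 18:
        return 5.00
    if age < 65:
        return 10.00
    return 7.50

# One representative value per partition.
partitions = {
    "invalid: negative": -1,
    "valid: minor (0-17)": 10,
    "valid: adult (18-64)": 35,
    "valid: senior (65-130)": 70,
    "invalid: too large": 500,
}

for label, age in partitions.items():
    try:
        print(f"{label}: price = {ticket_price(age)}")
    except ValueError as err:
        print(f"{label}: rejected ({err})")
```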

Testing Concepts

Next is background about what testing is and what it is intended to accomplish. This is definitely a target-rich environment, with several points worthy of extended discussion. I’ve pulled out a couple that seem worth deconstructing. My comments in italics.

– Every defect is believed to be introduced by a human’s error or mistake, either in understanding requirements specification or creating software code.

I disagree with this point, as an incomplete understanding of requirements is the only state of discovery I’ve ever seen, meaning, the specification is never “complete”. This is natural and expected when one considers the limitations of communication between any two people. One of the strengths of iterative development is exposing these gaps and eventually fixing them (usually). Thankfully, the standard includes room for “undocumented understanding of the required behaviour”.

– The primary goals of testing are to provide information about the quality of the test item and residual risk. This risk is said to be related to how much the test item has been tested.

If all test items (by the standard’s definition) were equal, there might be something to that, but my experience is that different pieces of software/documents/etc vary greatly in complexity, risk, and in any other meaningful attribute that must be ignored in order to say one piece of software is equivalent to another. What does “how much” mean? Relative to what?

ISO 25010 is referenced for its eight quality characteristics, but the standard later details slightly different characteristics: Functional Suitability, Performance Efficiency, Compatibility, Usability, Reliability, Security, Maintainability, and Portability.

There are many mentions of “Dynamic testing”, but this is said to include executable test items, preparation, and follow-up. More detail is promised in 29119-2.

Testing is said to be a subset of verification and validation. Other standards are referenced: ISO/IEC 12207 (software life cycle processes) and IEEE 1012-2012 (Standard for System and Software Verification and Validation).

At some future date, I would like to figure out what the distinction being made here is; since this standard includes evaluation of specifications as “testing” along with a broad definition of software quality characteristics, it’s hard to imagine what’s left.

There is some discussion of testing in an Organizational and Project context. It is said that an organization might supplement the standard, though it “would adopt” the standard. It is then said that conformity with the organization’s processes is more typical than conforming directly to the standard. It’s still said that if an organization does not have “an appropriate set of processes”, they should apply this standard directly. The case is being made for standardization more than strict adoption of the standard. A welcome bit of guidance, though: “The experience of the industry is that no single test strategy, plan, method or process will work in all situations. Hence organizations and projects should tailor and refine the details of testing with reference to standards such as this.”

Many artifacts are Named and Capitalized, from Organizational Test Policies and Strategies to Project Test Plan, down to “Test Sub-process Plan”, with two examples given as System Test Plan and Performance Test Plan.

Some of these documents may be appropriate in some contexts, but I find heavy documentation to be one of the most undesirable characteristics of industrial testing, and a major contributor to why the Agile community seems to think most testing (besides automated checks) is a waste of time. Time spent writing documentation no one reads or updates is time better spent on gathering information.

There is a very complicated diagram describing the relationship between standards, test strategy, test processes, test policy, etc.

The most important point I see is that standards are put in the same box as Regulations and Laws – an especially toxic outcome of pushing adoption of standards such as these. I’ve written about this before, but it is hard enough getting people to use modern testing techniques without having to give the straw man of liability the illusion of a brain.

“Dynamic Test Processes” appears again – in a diagram saying that test design, test specification, and the “Test Environment Readiness Report” are all necessary to get to Test Execution. The term “Issue(s) Notice” is used after Test Result, as a decision loop for whether or not to report issues; another pointer to 29119-2.

Software Life Cycle and Testing’s Place

The next part talks about the life cycle of software projects between conception and retirement. ISO/IEC 15288 is referenced as a source for life cycles, and ISO/IEC 25051 is mentioned for testing software written by another company. A Requirements…Design…Coding…Acceptance example is given, and then it is said that defining a development model is out of scope. Still, probably would have been better to use an iterative one.

Quality Assurance is described as a support process required to aid the SDLC:

a set of planned and systematic supporting processes and activities required to provide adequate confidence that a process or work product will fulfill established technical or quality requirements. This is achieved by the imposition of methods, standards, tools, and skills that are recognised as the appropriate practice for the context.

Actually, not that bad!

Measures should be collected during testing, as they can provide information about the quality of test processes and/or test items and the effectiveness of their application on each project.

Oh, there we go.

Testing information is to be provided to Project Management with completion reports and test measures. There is some talk about process improvement, and pushing process improvements across an organization.

“Risk-based Testing”

This is introduced as “Testing is a sampling activity”. The need to identify product, project, and organizational risks, and then to tailor the test processes to address those risks, is described in some detail. “Risk-based” is referenced multiple times, with strong language around claims of wide adoption. Given that all test choices involve evaluating risk, this is hard to argue with.

There is a distinction made between choosing test cases for requirements coverage and choosing them for risk, which is good. There is some talk about polling a wide group of stakeholders to develop a risk model, and most welcome is an introduction of context, using a medical-device compliance example.

Annexes

Five annexes are included. Some quick tastes:

Annex 1 is about testing as a part of “Verification and Validation”, and presents the following model. The idea of Metrics outside of Testing is worth more contemplation.

[Figure: the Annex 1 model of testing within Verification and Validation]

Annex 2, Metrics: “In order to monitor and control testing and provide timely information to stakeholders it is necessary to effectively measure the test process…Thus all test efforts need to define and use metrics and provide measures in respect of both products and processes.”


Annex 3 has our first discussion of Iterative Development processes, comparing them to “Sequential” and something called “Evolutionary” that seems to be trying to split the difference.

In Annex 4, there is a discussion of “test sub-processes”, defined earlier in the standard as “test management and dynamic (and static) test processes used to perform a specific test level (e.g. system testing, acceptance testing) or test type (e.g. usability testing, performance testing) normally within the context of an overall test process for a test project”.

This new term is used constantly throughout the standard, but it doesn’t seem to add a lot of value beyond discussing control of the overall test process. Examples given here include: Acceptance testing, Detailed design testing, Integration testing, Performance testing, Regression testing, Retesting, System testing, Component testing. There are tables for each type, listing objectives, claims of detailed processes, and techniques (usually followed by “As Appropriate”).

My last piece of criticism here is that this supposes these are all distinct and separate activities, as opposed to overlapping and concurrent ones. Perhaps I don’t yet fully understand the usage here.

Annex 5 is about Testing Roles. It names a Strategist who establishes process and ensures conformance, a Test Manager who manages to the process, and a Tester who executes the processes. There is an interesting discussion about the independence between who writes the code and who tests it that unfortunately concludes with “The intention is to achieve as much independence between those who design the tests and those who produced the test item as possible, within the project’s constraints of time, budget, quality and risk.” Sounds like designing as large a communication gap as possible.

A bibliography lists many ISO and IEEE standard documents, plus an ISTQB glossary of terms. Agile Testing (Crispin and Gregory) is listed, but no other books on testing that we would be familiar with are included, nor any contemporary sources.

In Conclusion…

We made it! It looks like we have the result of a committee, several years behind the state of the art, overly focused on formality and control, barely concerned with technique. So about as expected.

The other parts of the standard will be reviewed here soon, and by soon, I mean this year. I have not had any real progress on my attempt to formalize this work with AST, but I will continue to work on that, too.

Metrics Fixations: How People Feel About Numbers

The tweet that inspired this post:

Metrics that are not valid are dangerous.

TL;DR

My blog posts sometimes branch and overlap like legacy code that no one feels confident enough to refactor. So, new feature: the Too Long, Didn’t Read summary:

  1. Metrics are useful tools for helping evaluate and understand a situation. They have similar problems to other kinds of models.
  2. People believe metrics provide facts for reasoning, credibility in reporting, and safety in decision-making.
  3. Questioning metrics remains an important mission of our community.

Metrics are Models

A metric is a model. I see modeling here as a way of representing something so that we can more easily understand or describe it. Metrics have value in expressing a measurement of data, but they need context to become information.

[Image: a nerf herder. She Chooses Whoever Shoots Last?]

I could look into my pasture full of hundreds of nerfs grouped in their pods, and communicate what I see as “There sure are lots of them.” Or, I might say “There are 1138 of them in 82 pods. Well, there were 1138 when I counted them all up last week. Oh wait, there have been seven calves, one death, and two missing since then. Yes, 1142, definitely 1142. I think. Unless some died or came back. And there are a few pregnant females out there. Still, only males for meat until wool production recovers.”

Other people have dug into the validity of metrics in great detail previously, and I don’t want to get sidetracked into (just) validity. We will get to the use of metrics shortly, but to get us into the right state of mind:

  • If I were to say that after implementing goat pairing in one pod of nerfs as a trial, nerf losses were at 7%, is that a good or bad number?
  • If nerf losses were 14% in the period before introducing goat pairing, does that help? What if I point out that there are an average of 14 nerfs in a pod? Are you going to ask where in the sample period today is?
  • Did I mention that wool production is down 38% because of the goats snacking on nerf fur clumps?
  • Meat revenue is up 3% this season.
  • Per animal? No, overall.
  • Meat prices are down relative to wool prices lately, but still up 5% this year to about $5.25.
  • How many animals butchered? I record that separately, but usually just divide pounds sold by 600 and use that for investor reports and taxes.
  • “All models are flawed. Some are useful.”
  • Remember not to confuse models for what they represent, lest you get the metrics – as opposed to the results – that you are looking for.
  • Correlation is not causation. It’s especially suspect when you are trying to explain something in retrospect.
  • The last, hardly subtle point: make sure what you measure matters.
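
If those rhetorical questions feel slippery, the arithmetic behind the pod question makes it blunt – a tiny sketch using the numbers from my example:

```python
# Why "losses fell from 14% to 7%" can be nearly meaningless: with
# about 14 animals in a pod, each 7% step is a single animal.
pod_size = 14

losses_before = round(pod_size * 0.14)  # ~2 animals before goat pairing
losses_after = round(pod_size * 0.07)   # ~1 animal after
print(losses_before, "->", losses_after)

# The metric halved because one animal didn't die. That's not a
# trend; that's noise with a percent sign on it.
```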

And….time. It’s not that helpful to pick apart specific metrics – whether they measure something real, or if they are based on CMM Levels, KLOCs, defect densities, nerf herd finances, and other arbitrary/imaginary constructs. It’s not that helpful because it doesn’t necessarily change minds. Let’s instead discuss why people are so enamored with metrics, how they use them, and speculate on what they might be getting from them.

Quantifying With Measurement

By measuring something, we may feel like we are replacing feel with facts, emotions with knowledge, and uncertainty with determinism. By naming a thing, we abstract it; the constant march of computer science is to reduce complexity by setting aside information we don’t need, and simplify things to fewer descriptors. Everybody enjoys the idea of being a scientist.

894 story points, at 37 story points per dev per sprint, is….

Similarly, we feel more control when we can point to a number. We can say that a thing is a certain height, length, size, etc, and we feel like we understand it. We’ve reduced the complexity to where we can describe a thing, removing the need to try to transfer some bit of tacit knowledge if we understand what we are looking at, or deceiving ourselves about how much we actually understand if we don’t. Everyone likes to feel clever.

We can then discuss quantities, group things that seem to be similar, and so forth. This means we can put it in spreadsheets, we can talk about how many people are needed to produce certain quantities, etc.

Of course, once something is represented by a number, it invites dangerous extrapolation: “Once we implement goat pairing across all pods, we’ll make $252,000 more!”

You Can’t Argue With Facts

When we can cite a number, wherever it comes from, we might feel like we are making quantitative judgments, removing our judgment and opinions. Something that is a fact isn’t open for interpretation, right?


This provides us with cover and safety. Instead of stating an opinion, we can claim we’re simply pointing at reality. If you make a mistake in judgment, metrics can be the justification for why you did it. Wouldn’t anyone else have made that choice with those facts at hand?

Where did my facts come from? If they are measurements, how do I take them, and what do I discard? Why do they mean what I say they mean, and why do they mean that here and now? This is the slippery stuff that allows us to frame a discussion with our version of “facts” and interpretation of what they mean, inserting our biases and opinions while maintaining the illusion that we are making completely quantitative decisions, using only logic and reason, denying our influence in stacking the deck in the first place.

“Quantitatively, we’ve had the same experience as everyone else – goat pairing is essential for maximizing wool production.”

We Have to Measure Our Progress Somehow

If you do get a person pursuing metrics to admit problems with validity, a common deflection for reframing these conversations is to claim that, however flawed they might be, metrics are an external requirement that is not open for discussion. When the boss’ boss demands metrics – or when we say that they do – we are attempting to end the conversation about the validity of, or need for, metrics. Persisting with these questions past that signal is going to reduce your future influence, or worse.

Is this Accurate? Precisely.

This resolve comes from the experience of being asked to report status, which is essentially answering the following set of questions:

  • Is there progress being made?
  • Is the schedule still accurate?
  • Do you need help with anything?

If the answer is No, No, or Yes, there will need to be additional supporting detail. You are persuading another person to act or not act, committing personal credibility, and taking the risk that what you claim is correct enough that they won’t look foolish for endorsing it and you.

Reporting, Cloaked in Metrics

We often have limited opportunities to prove ourselves. We want our bosses, and our boss’ bosses, to believe that we are smart and capable. Presenting metrics to bolster our conclusion makes us feel more credible – and it can’t be denied that when the subject isn’t understood, almost any metrics are going to sound impressive and credible, making everyone involved feel smarter.

Many of us have found ourselves in discussions where a stakeholder is looking at a chart where the underlying measurements are barely – or not at all – understood, but they will still question the shape of curves and graph lines, asking for explanations when any troughs appear. This can be a powerful mechanism for having a discussion about the relevant issues, but there is a tradeoff in presenting a single metric – and having that become the standard.

Good reporting communicates facts, risks, context, and recommendations. Metrics that don’t support one of these are not in the mission of reporting.

What Does it All Mean?

Is it really true we can’t run a business without metrics? I don’t think I am advocating that, but I am suggesting we can help make it disreputable to manage to flat, two-dimensional metrics as if they were reality.

Managers have been simmered in the pot of using best practices to manage to metrics for at least a generation. Questioning metrics, both in formulation and usage, is an important mission of our community. We need to be thoughtful about when and how we raise these issues, but understanding the components of our reasoning is necessary to be confident that we are reasoning well.

Arguments to Moderation

In the last couple of months, other people outside of the Context-Driven Community have spoken up about the disagreements we’ve long had with certification and standards. One of the articles is here. Go ahead and read, it’s short and I’ll wait.

On first reading, the implication seemed to be that the Context-Driven Community’s approach to testing is from a single perspective – even though the editorial’s pronouncement is essentially CDT:

Limiting oneself to a single perspective is misguided and inefficient…It’s not either this or that. It’s all of the above. Testers must come at complex problems from a variety of ways, combining strategies that make sense in a given situation—whatever it takes to mitigate risks and ensure code and software quality.

Does the editorial writer know what CDT is about? This is something that could be said by any number of people in my community. My concern is that people who are not familiar will get the impression that CDT simply has different process or method prescriptions – a common fallacy amongst people who don’t (or won’t) understand what Context-Driven means. This is really frustrating, since this is the opposite of one of the most important things to us. We keep saying that our prescription is to examine the context, and select tools, methods, and reporting that are appropriate for the context. We have a bias against doing things that we see as wasteful, but we also acknowledge that these things may need to be done to satisfy some piece of the context.

Despite essentially agreeing with us, the mischaracterization of our point of view was necessary to serve the structure of the article as an Argument to Moderation. This is both a trope of modern “journalism” and a logical fallacy: selecting/characterizing two points of view as opposites, and then searching for some middle, compromise position, usually with pointed criticism directed at “both sides” to demonstrate how much more reasonable and wise the observer is.


This is a flawed model, though. Sometimes one position is simply correct. Often the two positions are not talking about the same reality, and the framing is important. There are typically more than two positions available on an issue, but as with politics, two seems to be the intellectual limit, with every point of view placed somewhere on a spectrum between slippery-slope extremes.

The debate – such as it is – about ISO 29119 is suffering from a lack of voices willing to take up for the standard’s content and mission. Even the authors of the standard are responding to criticism by defining down what “standard” means and what it’s for. No one seems to be speaking up against the things CDT says, but there are people who seem to be enjoying contradiction for its own sake, or taking on a bystander role, clucking about personal agendas without naming anyone or anything as an example.

Debate is appropriate for describing conversations about subjects where there is professional disagreement. That’s what’s here – and that’s all that’s here. We can disagree, as professionals, and it’s fine. “Can’t we all just get along” was first uttered as a call for peace during riots where people were being injured and killed. A professional debate is not a riot. I don’t hate people I disagree with. I consider them colleagues, and if we didn’t disagree, what would we talk about? If we didn’t feel passionately, why would we bother debating?

I’m not a fan of yelling at people on Twitter. It makes many people uncomfortable, nuance is lost, and often, the person doing the yelling just looks mean. These are all valid criticisms of communication style, but not of substance – both in the sense that it ignores the issues at hand, and in that complaining about the PR instead of the content is a transparent mechanism to claim the higher ground.

If you want to talk about how our community supports and nurtures young thinkers, discussion of this particular subject is valid and important. If you want to talk about twitter manners in order to not-so-subtly discredit a point of view without actually engaging with it, it’s not hard to see that.

People working within and profiting from a system are almost always going to think the system works well, despite whatever flaws they might acknowledge. Any criticism of the system is a challenge to the status quo, and will be opposed by the people working within it. Particularly when you profit from a system, you should not expect to be exempted from criticism of that system, or your role in it. It was ever thus, and there is no reason why this field, or this subject, should be any different.

I speak at conferences about the things I do and think that pertain to my field of study. I expect to encounter other experts, and be asked questions. If I didn’t get any questions, I probably didn’t say anything new, important, or relevant.

If you sell certification training or work on standards bodies, you nominate yourself as a spokesperson for the ideas you clearly support – or that support you, more like. If you claim expertise on a subject, or purport to accumulate anecdotes and then pass off your opaque classifications and conclusions from them as statistical evidence, you should expect to be asked questions and asked to provide more detail. If you are not willing to speak for and defend your ideas, maybe you shouldn’t be willing to profit from them, either?

If you’re an observer, you could add something to the discussion by debating the issues at hand. If your contribution is just to tone police, maybe sit this one out?

Standardization (is) for Dummies

A theme seems to have developed on this blog. There has been a lot of complaining about the control mechanisms of industrial-scale quality management. Let’s not change course now; we must stay committed to the process.

Very irony…much humor…wow

Today, I want to talk about the pathology of “Standardization” – the idea that the most efficient way for large organizations to “manage” work done by many people is to make the tools and/or processes the same across the organization – specifically as it applies to testing. I am not talking about rolling tool licenses up into an enterprise license, or even about reporting structures coalesced into “establishing centers of excellence” (more like “marketing scheme to justify the cost of enterprise tools of mediocrity”, amirite?)

Note the diversity of characters, even with “common tooling”

And of course, many things do benefit from standardization: railroad travel, measurement (Metric, ‘Murica!), Lego, hardware interfaces, operating systems, etc. Often, standardizing is exactly the right thing to do.

Achieving standardization through generalized process engineering is really an attempt to replace the variables of skill and experience with a perceived constant: documentation of some idealized process and a set of required artifacts.

The desire to model our world as full of easy equivalencies is understandable; we make decisions all the time between two or more choices – or at least “choices” as we have framed them (or allowed them to be framed for us). Reducing complexity to symbols is necessary for us to decide what to do, where to go, and how to get there without paralysis.

Testers are rich sources and frequent users of heuristics. Heuristics are very effective when used responsibly, skillfully, and in the right circumstances. What always matters is context. Nothing is truly abstract.

Choosing between an apple and an orange for a morning glass of juice is a matter of preference, and a very different choice than deciding which tree to plant in your yard, which requires considering climate, sun exposure, and soil. Apples versus oranges is not even really “Apples and Oranges” without understanding why the choice is being made, who is making it, and what the desired outcomes are.

Standardizing Test Practices

This weenie would like to see your test case documentation format

I believe that the process weenies who lead standardization efforts really believe most of what they say. They believe that if they can properly standardize, document, and implement the Amalgamated Consolidated, Inc. way of doing things, they will save the company money, shorten testing cycles, implement the proper metrics, and reduce hard and soft costs. I didn’t say they are right, but I am saying that they are not intending to mislead when they make those claims. Given how the Iron Triangle of Cost, Time, and Quality works, they can indeed move towards two of the three corners.

In addition to the very common pathology of presenting goal statements like “save money and improve efficiency” as a strategy, there are some other things that are said that I find troubling. Let’s unpack some of these.

“If we standardize, our training costs will be lower.”

“Standardizing will make it easy to transfer work and employees between groups.” 

Effective testers know the software and systems they test, and how best to work with the people on their team. If they move to a new team, that knowledge is lost to the old team, and the relocated tester will need to build up the same kind of contextual knowledge all over again. If they take on work from another team, they face the same challenge.

It is incrementally cheaper to maintain one set of process documentation (or software manuals) than two. Of course, that documentation will already have holes by the time it is published, and once processes have had a year or two to evolve, it will have accumulated notes, exceptions, and whole sections that are flat-out ignored. Oh, you are going to keep the documentation updated, you say? Does that ever really happen?

The truth is, new employees aren’t really going to learn any faster just because every department has been made to do things the same way. They are going to have to learn whatever process they need to know to be successful in the area where they were hired. Who cares whether Group A does things the same way as Group B? Not the new employee; they only care about how their group does things.

Every department/business unit/team will have a number of “local variables” – where data is stored, how to get equipment, how builds are refreshed – all of the contextual parts of the process that can only be learned through practice. It is also hugely important to learn how the group’s priorities are set, who gets mad, who is receptive, how this manager likes to run things, and what that director means when they say things cryptically in email; every new employee has to learn these things to be effective.

What portion of onboarding time is *really* spent on learning the steps of a process?

“HR says we have to formalize job descriptions/responsibilities/salary bands.” 

The 4 different shapes make them appear more natural…
These are the 4 testing role descriptions we will use across the company…

Sometimes, the organization is looking to rationalize job descriptions, salaries, and other things that make it easier to bracket employees against each other. This process is almost never good for workers. There might be some very pious talk about making sure pay is “fair” across the organization, but my experience is that it is far more likely that expensive salaries will be targeted for cutting than that low salaries will be raised.

Treating people like interchangeable, programmable cogs is not only dumb, it’s dehumanizing and demotivating. Smart, passionate people will be motivated to find somewhere else to work where they can use and grow their judgment and skill. If you are looking to commoditize testers across groups, you are likely to end up with a pool of McTesters with similar skill levels, and with results across projects that are similar in an equally undesirable fashion.

“It costs us money to have redundant tools in the organization.” 

Yes, this is absolutely true. But enough about middle management.

Tool-centric views of testing are somewhat less prevalent than they were a few years ago (though automation obsession remains a large problem). Open source and custom tooling seem to be pulling ahead – because there is always a bias towards, and a significant focus on, cost-cutting.

If you believe that all tools are equivalent, it becomes easy to make “dollars and sense” decisions at some remove from the front lines to reduce redundancy and merge the organization’s knowledge. Unfortunately, this is simply not true. Tools are not equally fit for the same purposes.

All tools have strengths and weaknesses – most of which matter less than the skill and judgment of the tool’s operator. If you take away the tool an experienced, skilled person is comfortable using and replace it with another, you are also discarding a great deal of experience and developed work – value that might be difficult to measure, but is really hard to replace without significant time and energy. Sure, some of it is crap – most automation code is. But the useful bits go into the garbage with the rest of it. A sunk cost, perhaps – but still discarded value.

Briefly, this trope: “We have wasted redundancy trapped inside more than one code base.” As if code were perfectly commented and interchangeable, ready to be pulled off a shelf and swapped into a waiting receptacle like a battery.

“All of our documentation looks different.”

Standardization of document templates is probably harmless, beyond giving anal-retentive ninnies some cover in the form of work product to justify their grab at “thought leadership”.

Perhaps the first question should be “So?” or “And?” Documentation is just a way to communicate important information to the people who need it. Who actually consumes the documentation? What information are they looking for?

“We’re all doing things differently.”

We Must Be In Lockstep

Different groups of people will choose different methods for attacking different problems – or perhaps even the same ones. The collective skills, experience, and inclinations of one group will be different than another’s – so of course they will come up with different ways to do things.

This “argument” is a great example of begging the question: why is it bad to do things differently? What is to be gained by forcing people to work at the same pace in the same way? Effective groups will develop ways to work together efficiently – a process of continuous improvement. It has to be asked: why is standardization so important, really? Does it make people feel safer, more secure, or less at risk? What is the real value of “consistency”? Are we solving real problems, or indulging neuroticism?

It was in a sales context (full of other buzzwords like “cadence”) that I first heard the phrase “in lockstep”. This led me to think of lockstep standardization as “Three-Legged Racing” – an old children’s activity that rewards careful synchronization, and is sometimes intended to deliver a lesson about teamwork. The two children could each run to the finish much faster separately, but tying them together induces a crippling limitation, forcing them to discard their natural instincts and abilities and stumble along, trying to get somewhere while struggling against their constraints – with grass-stained clothes, shortened tempers, and injured ankles sure to follow.

Human beings have natural skills and tendencies that vary from individual to individual, but in the same way children run freely and naturally, people think, talk, and work in ways that feel comfortable and effortless. The best work comes from people who aren’t wasting effort on process overhead and administrivia – people allowed to find their flow. When people aren’t allowed to work in natural ways, it’s much harder for them to accomplish anything at all, and they will be unhappy.

I remind myself – not quite frequently enough – that I must be careful about assigning motivations to others; that is how you end up begrudging and resenting people who don’t really spare you a second thought. Most people (everyone) are fumbling their way along with the rest of us, doing what they believe is the right thing, by some formulation of what they think the right thing to do is. People are never as sinister as an irritated person might think – even the worst people feel justified in what they are doing, trapped by circumstance into decisions that seem to them perfectly rational and logical.

“Train Tracks are SUPPOSED to be the same width!”

When we try to prevent mistakes by dictating future activities, we are using the fear of what truly incompetent people might do to force competent people to discard their judgment. We are harshly judging people we don’t know, and supposing that, without their context, we can still make better decisions than they can.

If we are arrogant enough to try to dehumanize people in the future by giving them questionable marching orders from the present, we create environments that are not healthy for thinkers and passionate people – and they will leave.