I live in Mountain View, California and work for SOASTA. I have worked in testing for 16 years, and been engaged with the CDT community for 11.
Of the work I’ve done in the testing community, I am most proud of being a Speak Easy mentor. I’m also an organizer of a long-running peer workshop (WOPR, the Workshop on Performance and Reliability), and a member of the Community Advisory Board for STPCon. I have been a member of AST for several years, first attending CAST in 2010.
After last year’s CAST, I set out to form the AST Committee on Standards and Professional Practices to help AST, as a professional trade organization, engage with issues affecting our profession. This work, and my experience working with AST to get this Committee started, led me to accept a nomination to run for the AST Board.
I am asking for your vote because there are things I hope to accomplish for our organization, our community, and our profession as an AST board member. I believe I have a track record of getting things done, by digging in and doing them myself. I think that’s what we need more of on the AST board. With urgency, focus, humility, and willingness to iterate, we can accomplish a lot.
The ideas of our community’s founders and leaders changed testing forever. I want to publicize our school’s approach to testing as viable, respectable, and highly effective to both the testing AND software development worlds, giving our membership support and resources for implementing modern testing principles. I also want to help solidify a launchpad for the next generation of our community’s leaders to lift off from. We have a bright, passionate, and diverse crop of talent in our community that will shine when they get their turn.
I am sorry I won’t see so many of my friends and colleagues in Grand Rapids next month. During CAST, I will be speaking at an Agile Development conference (Agile2015). I plan to talk with that community about testing, and how “automating everything” is sure to miss many bugs. From where I sit in the heart of Silicon Valley, this desire to turn testing into checking is an even greater risk to our profession’s future than factory certifications and testing standards – and unlike those economic tactics, it is not an empty house of cards.
I will push AST forward as an organization by challenging the organization to improve as an advocate for all testers, whether they identify as a member of our community or not. AST should continue to proudly be a Context-Driven organization, but the business and profession of testing has been defined by large commercial interests for too long. We must present a credible, experience-based alternative to the obsolete, ineffective command and control processes recommended by those seeking to profit from marketing McTesting process and “expertise”. We will win with our superior ideas, demonstrating that in a world of faster development iterations with smaller teams, it’s our community that is moving the practice forward.
I hope to help AST do more for its membership. My first set of tasks are to help members to find and collect tools and links for self-learning, aid them with job posting and searches, help them connect with peers for advice, and provide them with practice references when they set out to improve testing in their organizations. A key task for helping this happen is to revitalize the AST website so that others can easily contribute their research, experiences, and voices to a common body of knowledge. I will ask the board to allow me to revamp the AST website towards community engagement, and I will personally do much of the work to make it happen.
I want to help AST expand its footprint internationally. While much of our membership is American, there are thriving communities in Europe, Australia, New Zealand, and even Asia. AST can support and sponsor decentralized, loosely affiliated CDT conferences in places where there is demand and people on the ground.
Lastly, I will continue to work with the STP board to welcome CDT content and speakers into those conferences. If you take a look at our fall schedule, you can see that is going quite well.
To learn more about me, you could visit my blog here at contextdrivenperformancetesting.com, follow me @ericproegler, or chat with me sometime. I’d love to hear more ideas about how to increase AST’s accountability to the community, so that it can better earn the community’s attention and participation.
ISO/IEC/IEEE 29119 supports dynamic testing, functional and non-functional testing, manual and automated testing, and scripted and unscripted testing. The processes defined in this series of international standards can be used in conjunction with any software development lifecycle model. Each process is defined…and covers the purpose, outcomes, activities, tasks and information items of each test process.
Please remember I am criticizing the standard (and the idea of a testing standard), not the people who worked on it. I believe that smart, experienced people attempted to lay out their view(s) of testing, hoping to help people test effectively. I think that in the right discussion about the many contexts software might be tested in, they might concede that no prescriptive standard can be relevant and useful in every context. In fact, some of them are already doing that. Whatever the shortcomings of 29119 (and there are plenty) it could never possibly satisfy its mission, even if it was a better standard than it actually is.
My best-practice, conform-ational approach is to summarize my primary conclusions at the top of my blog posts, sparing tens of readers the post’s full brilliance. Here are my “above the fold” takeaways from analyzing ISO 29119-2:
29119 literally puts process (Part 2) before technique (promised in Part 4, still not published)
29119 claims to be applicable to testing in *all* software development lifecycle models, despite heavy documentation and compliance burdens
29119-2 has Conformance on page 1. To claim Conformance, there are 138 “Shalls” to conform to in this document. To claim “Tailored Conformance” without meeting every “Shall”, “justification shall be provided…whenever a process defined in…29119 is not followed”
Part 2’s vocabulary section has conflicts, revisions, and pointers to new terms relative to Part 1. This is not a “gotcha” – but is worth remembering when someone claims that with a test standard “At least there is a common vocabulary for testing”.
Conformance is driven by fear. Fear is the mind-killer.
Some of the “shalls” are highly specific. Some are vague and hard to understand. Some, through reference, contain multitudes. Some are nonsense.
The standard is not detailed enough to be very useful to someone who doesn’t already understand a fair amount about testing, yet an experienced tester would waste a lot of time and effort attempting to comply with it.
29119-2 goes to Conformance very early – Page 1. Either Full or Tailored conformance can be claimed for the standard.
“Full conformance is achieved by demonstrating that all of the requirements (i.e. shall statements) of the full set of processes defined in this part of ISO/IEC/IEEE 29119 have been satisfied.”
“Tailored conformance is achieved by demonstrating that all of the requirements (i.e. shall statements) for the recorded subset of processes have been satisfied. Where tailoring occurs, justification shall be provided (either directly or by reference), whenever a process defined in…ISO 29119 is not followed. All tailoring decisions shall be recorded with their rationale, including the consideration of any applicable risks.”
I can find no guidance on what “the recorded subset of processes” means – nor on what the various nesting levels of “process” in the standard are. Are these the processes that reference record-keeping and documentation? I bet I can find a consultant to help not-interpret that…
There is a “Reference” example given for exclusion from the requirement for providing direct justification:
“Where organizations follow information item management processes in standards such as ISO 15489… ISO 9001…or use similar internal organizational processes, they can decide to use those processes in place of the information item management tasks defined in this part of ISO/IEC/IEEE 29119.”
So, no exclusion from the requirement to document and describe the justifications – just an exclusion from the requirement to provide a separate document including these justifications for ISO 29119, as long as they are in another document somewhere else.
After 10 months, the only defense raised thus far by the authors of the standard to the questions about difficult compliance is to claim it is more flexible than what is actually said in the standard:
… and that’s the last message in the conversation. I suppose we could take the word of a standard author over the standard itself, which says with little ambiguity under Intended Usage: “The organization shall assert whether it is claiming full or tailored conformance to this part of ISO/IEC/IEEE 29119”.
Section 2 spells out definitions for some terms. There is overlap with Section 1 – and some disagreement with what was found there.
For example, in Section 1, Feature Set meant “collection of items which contain the test conditions of the test item to be tested which can be collected from risks, requirements, functions, models, etc.” Section 2 says: “logical subset of the test item(s) that could be treated independently of other feature sets in the subsequent test design activities”. Additional differences, revisions, and pointers to new terms are found. This is not a “gotcha” – but it is worth remembering when someone claims that with a test standard, “At least there is a common vocabulary for testing”: ISO 29119 already has divergence in critical definitions between its first two parts.
At least these terms are interesting to think about. It’s far less interesting to trace the relationships between test activity, test item, test condition, test requirement, test phase, test plan, test policy, test planning process, test procedure, test procedure specification, test process, test sub-process, test script, test set, test models, test technique, test specification, and test type. Yes, these are all separate things, but time spent debating their boundaries is time not spent “testing”.
Exploratory testing is again defined as “spontaneously designs and executes”, not “simultaneously” as we define it.
Process and Hierarchy
This diagram shows a hierarchy of test processes. It doesn’t actually cover all the processes referenced in the standard, despite the caption’s claim. The diagram does demonstrate the standard’s insistence on separating control processes from execution processes.
It is intended to illustrate that the vertical layers define each other downwards. First comes the organizational process, which defines organizational test policy, strategy, process, procedure, and “other assets”. Test Management Processes are defined at the project level, and Dynamic Test Processes are said to control a phase or particular type of testing.
This seems tailored for adoption by the mid-level executive who wants to put their stamp on an organization’s entire testing practice. Over and over again, the standard lays out separate process nodes for each possible step of testing. This exhaustive documentation of the steps involved in one view of testing is way too much for an experienced tester, who would rather provide useful information to stakeholders. It’s still not enough to arm someone with no testing experience to plan and supervise good testing. So who is it for?
When Fear Drives Testing
Software testing is frequently perceived as a high-risk, low-reward activity by people who aren’t testers. It’s thought of as a cost center (“there is no ROI in testing”) and if anything goes wrong, someone’s in trouble. Over and over again, testing is blamed for poor quality, despite the fact that most people who work in software engineering know “you can’t test quality into the product”. Testing is often thought of as less intellectually rigorous than other parts of software engineering, frequently is not a prestigious area to work in, sometimes is led by people without real training, experience, and/or skill in testing, and is often a convenient scapegoat for quality issues – particularly by people who should know better.
Many people who work in testing fear the buck stopping on their desk after a quality failure, and for good reason. If you are likely to be blamed for a bug escape, the most rational response for a skilled person might be to interrogate the context and demand the tools and latitude to gather the most comprehensive and useful set of information about the system under test.
If you are controlled by fear, you might shy away from the responsibility, and look for some cover under best practices. After all, if you faithfully observed and obeyed someone else’s plan, you can’t be blamed if the plan fails, right? It wasn’t you, it was the plan!
If you don’t know what you are doing, you might be even more likely to seek the comfort of an externally defined standard that removes your responsibility to decide what to do. If you don’t trust your team (and yourself), you hand off control to someone or something else. Like a prescriptive standard, full of “shall statements” to replace “you thinking”.
The standard is still not detailed enough to be very useful to someone who doesn’t already understand a fair amount about testing, yet an experienced tester could waste a lot of time and effort trying to comply with it. Any discussion of actual techniques seems to be waiting for 29119-4 – at one point promised for late 2014, currently late in the approval process.
There are 138 instances of “shall” in this document. Some of them are highly specific. Some, by reference, contain multitudes. Some are simply nonsense. Some of them are too vague to be useful, though that may make them more applicable in multiple contexts. Some real wisdom can be found in here.
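A count like this is easy to reproduce mechanically. Below is a minimal sketch; the sample string is a hypothetical stand-in for the standard’s text (the real document is paywalled), but the same pattern applied to the full text would yield the total.

```python
import re

# Hypothetical stand-in for the standard's text; a real count would read
# the full document's text instead of this three-sentence sample.
sample = (
    "The organization shall assert whether it is claiming full or tailored "
    "conformance. Justification shall be provided whenever a process is not "
    "followed. All tailoring decisions shall be recorded with their rationale."
)

# \b word boundaries keep the match to the whole word "shall"
shall_count = len(re.findall(r"\bshall\b", sample, re.IGNORECASE))
print(shall_count)  # 3
```

Crude as it is, a count like this is a useful first pass at the compliance surface area of any “shall”-driven document.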
I spent some time pulling apart the various processes, sub-processes, dependencies, and circular references. Rather than try to further sketch out the overall shape of process (and documentation) requirements, I present my 10 most entertaining/concerning/Kafkaesque “Shall Statements” in ISO 29119-2:
The person responsible for organizational test specifications shall implement the following activities and tasks in accordance with applicable organization policies and procedures with respect to the Organizational Test Process.
The organizational test specification requirements shall be used to create the organizational test specification.
Appropriate actions shall be taken to encourage alignment of stakeholders to the organizational test specification.
The traceability between the test basis, feature sets, test conditions, test coverage items, test cases and test sets shall be recorded.
The testing of the feature sets shall be prioritized using the risk exposure levels documented in the Identify and Analyze Risks activity (TP3).
Any risks that have been previously identified shall be reviewed to identify those that relate to and/or can be treated by software testing.
Each required test activity in the Test Strategy shall be scheduled based on the estimates, dependencies and staff availability.
Those actions necessary to implement control directives received from higher level management processes shall be performed.
Readiness for commencing any assigned test activity shall be established before commencing that activity, if not already done.
The test coverage items to be exercised by the testing shall be derived by applying test design techniques to the test conditions to achieve the test completion coverage criteria specified in the Test Plan…
NOTE 2 Where a test completion criterion for the test item is specified as less than 100% of a test coverage measure, a subset of the test coverage items required to achieve 100 % coverage needs to be selected to be exercised by the testing.
It’s not all baffling. Here’s a richly meaningful shall statement that demonstrates something about the depth necessary to understand context:
A Test Strategy (comprising choices including test phases, test types, features to be tested, test design techniques, test completion criteria, and suspension and resumption criteria) shall be designed that considers test basis, risks, and organizational, project and product constraints…
NOTE 3 This takes into consideration the level of risk exposure to prioritise the test activities, the initial test estimates, the resources needed to perform actions (e.g. skills, tool support and environment needs), and organizational, project and product constraints, such as:
a) regulatory standards; b) the requirements of the Organizational Test Policy, Organizational Test Strategy and the Project Test Plan (if designing a test strategy for a lower level of testing); c) contractual requirements; d) project time and cost constraints; e) availability of appropriately-skilled testers; f) availability of tools and environments; g) technical, system or product limitations.
The last third of 29119-2 is an Annex mapping clauses of other standards (ISO 12207, ISO 15288, ISO 17025, ISO 25051, BS 7925, and IEEE 1008) to 29119-2. Rather than critique these other standards, I will simply question the value and purpose of this exercise. Is it to justify the standard, or to prove that it equals or even supersedes the others?
We still have parts 3 (and 4? soon?) of 29119 to go. Having processes defined before considering what we want to accomplish will guarantee we end at our desired results (whatever those might be), right?
I recently volunteered to be a mentor for the Speak Easy program that Fiona Charles and Anne-Marie Charrett have started. Being accepted as a mentor is one of the proudest moments of my career. I’d like to talk about why this program matters, why I volunteered, and why this honor means so much to me.
First, the program:
Speak Easy came about when Fiona Charles and Anne-Marie Charrett decided to walk the talk about creating diversity at technical conferences. As conference organisers, they’ve seen the challenges in sourcing diverse speakers for their programs. As experienced speakers they’ve understood and seen the difficulty in getting experience and confidence to speak at a professional level.
My experience as a conference organizer bears this out. Women are not the only underrepresented group as speakers at most technical conferences; it’s essentially every demographic besides straight white dudes that has a less-than-proportional ratio of conference attendee to conference speaker at most conferences I’ve attended. Try that experiment – assess proportions of the audience relative to the proportions of the speakers – at the next conference you attend, or when looking at a board, executive team, government body, etc. You know, in case constantly being told this is an issue wasn’t enough, and you needed some independent verification.
Why Does It Matter?
Because speaking gigs improve careers. A lot. And these opportunities should be accessible to everyone who deserves them.
Many conferences don’t pay anything beyond a conference admission for the time spent preparing and practicing a track session. Some will help a little with hotel, meals, or other travel expenses. Keynote speakers are more likely to get a small honorarium. None of this compensation is significant compared to the raise a promotion can bring, or the revenue of a consulting gig.
Some people are lucky enough to speak at conferences on an expense account, but consultants give up billable time to attend and speak at conferences. The time spent preparing material is also significant, and is more non-billable time.
So why do we do it? We often call it marketing, because getting your name and your company’s name out there can be of real value, and it’s good to have Google results. Plus, that lets you use marketing budget for travel expenses. But that’s not the only reason – and for most people, it’s not the biggest value they personally get.
Not only are speaking gigs exposure to the world at large for you and your ideas, they are practice for stepping in front of a group of peers and confidently declaring a point of view. Only half of the phrase “thought leadership” is about ideas. That is an important part – study, research, and articulating what you know is a learning exercise that will help you grow. The other half is about confidence, persuading, connecting with an audience, and other skills that are necessary to lead effectively, both formally and technically. The experience of stepping out in front of a room that you think is staring at you, getting over your terror, fielding questions/challenges, and staking out your claim as an expert will increase your effectiveness in a variety of situations. I’ve known very few people self-possessed enough to confidently speak in front of a group. Everyone else has to get over their impostor syndrome and learn how to do this in order to advance their careers.
Speaking at a conference is one of the best ways to get this experience, and will help your career. It’s a valuable opportunity, and if you hope to lead, either formally or with your ideas, it is something you should strongly consider pursuing. If you would like a gentler on-ramp, consider addressing a Meet-up, or lightning talks/ “speed-geeking” at a conference.
My blog posts sometimes branch and overlap like legacy code that no one feels confident enough to refactor. So, new feature: the Too Long, Didn’t Read summary:
Metrics are useful tools for helping evaluate and understand a situation. They have similar problems to other kinds of models.
People believe metrics provide facts for reasoning, credibility in reporting, and safety in decision-making.
Questioning metrics remains an important mission of our community.
Metrics are Models
A metric is a model. I see modeling here as a way of representing something so that we can more easily understand or describe it. They have value in expressing a measurement of data, but they need context to be information.
I could look into my pasture full of hundreds of nerfs grouped in their pods, and communicate what I see as “There sure are lots of them.” Or, I might say “There are 1138 of them in 82 pods. Well, there were 1138 when I counted them all up last week. Oh wait, there have been seven calves, one death, and two missing since then. Yes, 1142, definitely 1142. I think. Unless some died or came back. And there are a few pregnant females out there. Still, only males for meat until wool production recovers.”
Other people have dug into the validity of metrics in great detail previously, and I don’t want to get sidetracked into (just) validity. We will get to the use of metrics shortly, but to get us into the right state of mind:
If I were to say that after implementing goat pairing in one pod of nerfs as a trial, nerf losses were at 7%, is that a good or bad number?
If nerf losses were 14% in the period before introducing goat pairing, does that help? What if I point out that there are an average of 14 nerfs in a pod? Are you going to ask where in the sample period today is?
Did I mention that wool production is down 38% because of the goats snacking on nerf fur clumps?
Meat revenue is up 3% this season.
Per animal? No, overall.
Meat prices are down relative to wool prices lately, but still up 5% this year to about $5.25.
How many animals butchered? I record that separately, but usually just divide pounds sold by 600 and use that for investor reports and taxes.
The last, hardly subtle point: make sure what you measure matters.
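The herd bookkeeping above can be sketched in a few lines (all numbers hypothetical, in the spirit of the nerf example): the same raw measurement yields different “facts” depending on an arbitrary modeling constant, and small pod sizes make percentage metrics jumpy.

```python
# All numbers hypothetical, in the spirit of the nerf-herd example above.

pounds_sold = 61_200        # the raw measurement actually recorded
assumed_weight = 600        # arbitrary constant used for investor reports
actual_head_count = 96      # what a real tally would have shown

reported_animals = pounds_sold / assumed_weight
print(reported_animals)     # 102.0 -- the "official" number, ~6% off reality

# In a 14-animal pod, a single death moves the loss metric by 7 points:
pod_size, losses = 14, 1
loss_pct = round(losses / pod_size * 100)
print(loss_pct)             # 7
```

The point isn’t the arithmetic; it’s that every derived number above quietly encodes a choice someone made, and the chart the investors see shows none of those choices.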
And….time. It’s not that helpful to pick apart specific metrics – whether they measure something real, or if they are based on CMM Levels, KLOCs, defect densities, nerf herd finances, and other arbitrary/imaginary constructs. It’s not that helpful because it doesn’t necessarily change minds. Let’s instead discuss why people are so enamored with metrics, how they use them, and speculate on what they might be getting from them.
Quantifying With Measurement
By measuring something, we may feel like we are replacing feel with facts, emotions with knowledge, and uncertainty with determinism. By naming a thing, we abstract it; the constant march of computer science is to reduce complexity by setting aside information we don’t need, and simplify things to fewer descriptors. Everybody enjoys the idea of being a scientist.
Similarly, we feel more control when we can point to a number. We can say that a thing is a certain height, length, size, etc, and we feel like we understand it. We’ve reduced the complexity to where we can describe a thing, removing the need to try to transfer some bit of tacit knowledge if we understand what we are looking at, or deceiving ourselves about how much we actually understand if we don’t. Everyone likes to feel clever.
We can then discuss quantities, group things that seem to be similar, and so forth. This means we can put it in spreadsheets, we can talk about how many people are needed to produce certain quantities, etc.
Of course, once something is represented by a number, it invites dangerous extrapolation: “Once we implement goat pairing across all pods, we’ll make $252,000 more!”
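That extrapolation takes two lines to commit (numbers hypothetical): a single trial pod’s gain multiplied straight across every pod, with variance, seasonality, and the wool losses all waved away.

```python
# Hypothetical numbers: one trial pod's result scaled to the whole herd.
trial_gain_per_pod = 3_073   # dollars gained in the single goat-pairing trial
pods = 82                    # total pods in the pasture
projection = trial_gain_per_pod * pods
print(projection)            # 251986 -- rounds up nicely to "$252,000 more!"
# The sample size (one pod), the 38% wool hit, and plain luck
# never make it onto the slide.
```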
You Can’t Argue With Facts
When we can cite a number, wherever it comes from, we might feel like we are making quantitative judgments, removing our judgment and opinions. Something that is a fact isn’t open for interpretation, right?
This provides us with cover and safety. Instead of stating an opinion, we can claim we’re simply pointing at reality. If you make a mistake in judgment, metrics can be the justification for why you did it. Wouldn’t anyone else have made that choice with those facts at hand?
Where did my facts come from? If they are measurements, how do I take them, and what do I discard? Why do they mean what I say they mean, and why do they mean that here and now? This is the slippery stuff that allows us to frame a discussion with our version of “facts” and interpretation of what they mean, inserting our biases and opinions while maintaining the illusion that we are making completely quantitative decisions, using only logic and reason, denying our influence in stacking the deck in the first place.
“Quantitatively, we’ve had the same experience as everyone else – goat pairing is essential for maximizing wool production.”
We Have to Measure Our Progress Somehow
If you do get a person pursuing metrics to admit problems with validity, a common deflection for reframing these conversations is to claim that, however flawed they might be, metrics are an external requirement that is not open for discussion. When the boss’ boss demands metrics – or when we say that they do – we are attempting to end the conversation about the validity of, or need for, metrics. Persisting with these questions past that signal will reduce your future influence, or worse.
This resolve comes from the experience of being asked to report status, which is essentially answering the following set of questions:
Is there progress being made?
Is the schedule still accurate?
Do you need help with anything?
If the answer is No, No, or Yes, there will need to be additional supporting detail. You are persuading another person to act or not act, committing personal credibility, and taking the risk that what you claim is correct enough that they won’t look foolish for endorsing it and you.
Reporting, Cloaked in Metrics
We often have limited opportunities to prove ourselves. We want our bosses, and our boss’ bosses, to believe that we are smart and capable. Presenting metrics to bolster our conclusion makes us feel more credible – and it can’t be denied that when the subject isn’t understood, almost any metrics are going to sound impressive and credible, making everyone involved feel smarter.
Many of us have found ourselves in discussions where a stakeholder is looking at a chart where the underlying measurements are barely – or not at all – understood, but they will still question the shape of curves and graph lines, asking for explanations when any troughs appear. This can be a powerful mechanism for having a discussion about the relevant issues, but there is a tradeoff in presenting a single metric – and having that become the standard.
Good reporting communicates facts, risks, context, and recommendations. Metrics that don’t support one of these are not in the mission of reporting.
What Does it All Mean?
Is it really true we can’t run a business without metrics? I don’t think I am advocating that, but I am suggesting we can help make it disreputable to manage to flat, two-dimensional metrics as if they were reality.
Managers have been simmered in the pot of using best practices to manage to metrics for at least a generation. Questioning metrics, both in formulation and usage, is an important mission of our community. We need to be thoughtful about when and how we raise these issues, but understanding the components of our reasoning is necessary to be confident that we are reasoning well.
In the last couple of months, other people outside of the Context-Driven Community have spoken up about the disagreements we’ve long had with certification and standards. One of the articles is here. Go ahead and read, it’s short and I’ll wait.
On first reading, the implication seemed to be that the Context-Driven Community’s approach to testing is from a single perspective – even though the editorial’s pronouncement is essentially CDT:
“Limiting oneself to a single perspective is misguided and inefficient…It’s not either this or that. It’s all of the above. Testers must come at complex problems from a variety of ways, combining strategies that make sense in a given situation—whatever it takes to mitigate risks and ensure code and software quality.“
Does the editorial writer know what CDT is about? This is something that could be said by any number of people in my community. My concern is that people who are not familiar will get the impression that CDT simply has different process or method prescriptions – a common fallacy amongst people who don’t (or won’t) understand what Context-Driven means. This is really frustrating, since this is the opposite of one of the most important things to us. We keep saying that our prescription is to examine the context, and select tools, methods, and reporting that are appropriate for the context. We have a bias against doing things that we see as wasteful, but we also acknowledge that these things may need to be done to satisfy some piece of the context.
Despite essentially agreeing with us, the mischaracterization of our point of view was necessary to serve the structure of the article as an Argument to Moderation. This is both a trope of modern “journalism” and a logical fallacy: selecting/characterizing two points of view as opposites, and then searching for some middle, compromise position, usually with pointed criticism directed at “both sides” to demonstrate how much more reasonable and wise the observer is.
This is a flawed model, though. Sometimes one position is simply correct. Often the two positions are not talking about the same reality, and the framing is important. There are typically more than two positions available on an issue, but as with politics, two seems to be the intellectual limit, with every point of view placed somewhere on a spectrum between slippery-slope extremes.
The debate – such as it is – about ISO 29119 is suffering from a lack of voices willing to take up for the standard’s content and mission. Even the authors of the standard are responding to criticism by defining down what “standard” means and what it’s for. No one seems to be speaking up against the things CDT says, but there are people who seem to be enjoying contradiction for its own sake, or taking on a bystander role, clucking about personal agendas without naming anyone or anything as an example.
Debate is appropriate for describing conversations about subjects where there is professional disagreement. That’s what’s here – and that’s all that’s here. We can disagree, as professionals, and it’s fine. “Can’t we all just get along” was first uttered as a call for peace during riots where people were being injured and killed. A professional debate is not a riot. I don’t hate people I disagree with. I consider them colleagues, and if we didn’t disagree, what would we talk about? If we didn’t feel passionately, why would we bother debating?
I’m not a fan of yelling at people on Twitter. It makes many people uncomfortable, nuance is lost, and often, the person doing the yelling just looks mean. These are all valid criticisms of communication style, but not of substance – both in the sense that they ignore the issues at hand, and in that complaining about the PR instead of the content is a transparent mechanism to claim the higher ground.
If you want to talk about how our community supports and nurtures young thinkers, discussion of this particular subject is valid and important. If you want to talk about twitter manners in order to not-so-subtly discredit a point of view without actually engaging with it, it’s not hard to see that.
People working within and profiting from a system are almost always going to think the system works well, despite whatever flaws they might acknowledge. Any criticism of the system is a challenge to the status quo, and will be opposed by the people working within it. Particularly when you profit from a system, you should not expect to be exempted from criticism of that system, or your role in it. It was ever thus, and there is no reason why this field, or this subject, should be any different.
I speak at conferences about the things I do and think that pertain to my field of study. I expect to encounter other experts, and be asked questions. If I didn’t get any questions, I probably didn’t say anything new, important, or relevant.
If you sell certification training or work on standards bodies, you nominate yourself as a spokesperson for the ideas you clearly support – or that support you, more like. If you claim expertise on a subject, or purport to accumulate anecdotes and then pass off your opaque classifications and conclusions from them as statistical evidence, you should expect to be asked questions and asked to provide more detail. If you are not willing to speak for and defend your ideas, maybe you shouldn’t be willing to profit from them, either?
If you’re an observer, you could add something to the discussion by debating the issues at hand. If your contribution is just to tone police, maybe sit this one out?
I made the mistake of clicking through. The article is titled “Best Practices of Context-Driven Testing”, and it is delicious, savory troll bait.
I had to leave a comment, which follows. Still too shallow – but that is appropriate for the subject:
(Update: a month later, comment not approved. As always, beware putting effort into comments and commenting interfaces.)
I’m not hung up on words. It’s just these two words really suck.
I understand that some people get tired of the CDT community digging in their heels on “Best Practices”, but it needs to be understood that this isn’t about being schoolmarms who pedantically correct “Who” with “Whom”. It’s about rejecting the idea that there is one correct answer in any situation; “Best Practices” are inextricably tied to the idea that you can replace judgment, skill, and experience with process weenery.
“Best Practices” cannot and will never accomplish their goal: replacing skill and experience with canned knowledge and prescriptive recipes. A skilled practitioner can explain when a practice makes sense, and when it doesn’t. An expert can evaluate a situation and solve a problem. An amateur risks believing they are an expert because they read a Wikipedia page – or can regurgitate a “Best Practice”.
People who think the “Best Practice” label is harmless have probably never run into a situation where someone is using that particular appeal to a nebulous, non-existent authority to assert control and dismiss disagreement. Many people have seen and experienced how dangerous it is to allow anyone to wave around whatever good-sounding practice they read about last and call it “Best”.
Good testing requires examining what you know, where it came from, how much you trust it, and how you might falsify it. When a professional tester dismisses criticism of “Best Practices”, they are suggesting that they are probably not very good at testing. Not because they don’t have a prescribed reaction to a blasphemous term, but because they seem dismissive of the deep thinking required to be good at testing.
A theme seems to have developed on this blog. There has been a lot of complaining about the control mechanisms of industrial-scale quality management. Let’s not change course now; we must stay committed to the process.
Today, I want to talk about the pathology of “Standardization” – the idea that the most efficient way for large organizations to “manage” work done by many people is to make the tools and/or processes the same across the organization, specifically for testing. I am not talking about rolling tool licenses up into an enterprise license, or even reporting structures coalesced into “establishing centers of excellence” (more like “marketing scheme to justify cost for enterprise tools of mediocrity”, amirite?)
And of course, many things do benefit from standardization. Railroad travel, measurement (Metric, ‘Murica!), Lego, hardware interfaces, operating systems, etc. Often, it is the right thing to standardize.
Achieving standardization through generalized process engineering is really an attempt to replace the variables of skill and experience with a perceived constant: documentation of some idealized process and a set of required artifacts.
The desire to model our world as full of easy equivalencies is easy to understand; we make decisions all the time between two or more choices – or at least “choices” as we have framed them (or allowed them to be framed for us). The reduction of complexity to symbols is necessary for us to decide what to do, where to go, and how to get there without paralysis.
Testers are rich sources and frequent users of heuristics. Heuristics are very effective when used responsibly, skillfully, and in the right circumstances. What always matters is context. Nothing is truly abstract.
Choosing between an apple or an orange for a morning glass of juice is a matter of preference, and a very different choice than deciding which tree to plant in your yard, which requires considering climate, sun exposure, and soil. Apples and oranges is not even really “Apples and Oranges” without understanding why the choice is being made, who is making it, and what the desired outcomes are.
Standardizing Test Practices
I believe that process weenies who lead standardization efforts really believe most of the things that they say. They believe that if they can properly standardize, document, and implement the Amalgamated Consolidated, Inc. way to do things, they will save the company money, shorten testing cycles, implement the proper metrics, and reduce hard and soft costs. I didn’t say they are right, but I am saying that they are not intending to mislead when they make those claims. Given how the Iron Triangle of Cost, Time, and Quality works, they can indeed move towards two of the corners.
In addition to the very common pathology of presenting goal statements like “save money and improve efficiency” as a strategy, there are some other things that are said that I find troubling. Let’s unpack some of these.
“If we standardize, our training costs will be lower.”
“Standardizing will make it easy to transfer work and employees between groups.”
Effective testers know the software and system they test, and how best to work with the people on their team. If they go to a new team, that knowledge will be lost from their old team. The relocated tester will need to build new contextual information again. If they get work from another team, they will have the same challenges.
It is incrementally cheaper to have one set of process documentation than two, or one set of software manuals. Of course, that documentation will already have holes by the time it is published, and by the time processes have a year or two to evolve, the documentation will have notes, exceptions, and whole sections that are flat-out ignored. Oh, you are going to keep the documentation updated, you say? Does that ever really happen?
The truth is, new employees aren’t really going to learn any faster because every department was made to do things the same way. They are going to have to learn whatever process they need to know to be successful in the area in which they are hired. Who cares whether Group A does things the same way as Group B? Not the new employee; they only care about how their own group does things.
Every department/business unit/team will have a number of “local variables” – where data is stored, how to get equipment, refreshing builds – all of the contextual parts of the process that can only be learned through practice. It is also hugely important to learn how the group’s priorities are set, who gets mad, who is receptive, how this manager likes to run things, what that director means when they say things cryptically in email; every new employee has to learn these to be effective.
What portion of onboarding time is *really* spent on learning the steps of a process?
“HR says we have to formalize job descriptions/responsibilities/salary bands.”
Sometimes, the organization is looking to rationalize job descriptions, salaries, and other things that make it easier to bracket employees against each other. This process is almost never good for workers. There might be some very pious talk about making sure pay is “fair” across the organization; but my experience is that it is far more likely that expensive salaries end up targeted for cutting than that low salaries are raised.
Treating people like interchangeable, programmable cogs is not only dumb, it’s dehumanizing and demotivating. Smart, passionate people will be motivated to find somewhere else to work where they can use and grow judgment and skill. If you are looking to commoditize testers between groups, you are likely to end up with a pool of McTesters with similar skill levels, and results across projects may also be similar in an equally undesirable fashion.
“It costs us money to have redundant tools in the organization.”
Yes, this is absolutely true. But enough about middle management.
Tool-centric views of testing are somewhat less prevalent than they were a few years ago (though automation-obsession is a large problem). Open source and custom tooling seems to be pulling ahead – because there is always a bias towards and a significant focus on cost-cutting.
If you believe that all tools are equivalent, it becomes easy to make “dollars and sense” decisions at some remove from the front lines to reduce redundancy and merge the organization’s knowledge. Unfortunately, this is simply not true. Tools are not equally fit for the same purposes.
All tools have strengths and weaknesses – most of which are less important than the skill and judgment of the tool operator. If you actually do take away the tool an experienced and skilled person is comfortable using and replace it with another, you are also discarding a great deal of experience and developed work – value that might be difficult to measure, but is really hard to replace without significant time and energy. Sure, some of it is crap – most automation code is. The useful bits would go into the garbage with the rest of it. A sunk cost, perhaps – but still discarded value.
Briefly, this trope: “We have wasted redundancy trapped inside more than one code base”. As if code were perfectly commented and interchangeable, ready to be pulled off a shelf and swapped into a waiting receptacle as if it were a battery.
“All of our documentation looks different.”
Standardization of document templates is probably harmless, beyond giving anal-retentive ninnies some cover in the form of work product to justify their grab at “thought leadership”.
Perhaps the first question should be “So?” or “And?” Documentation is just a way to communicate important information to the people who need it. Who actually consumes the documentation? What information are they looking for?
“We’re all doing things differently.”
Different groups of people will choose different methods for attacking different problems – or perhaps even the same ones. The collective skills, experience, and inclinations of one group will be different than another’s – so of course they will come up with different ways to do things.
This “argument” is a great example of begging the question: why is it bad to do things differently? What is to be gained by forcing people to work at the same pace in the same way? Effective groups will develop ways to work together efficiently – a process of continuing improvement. It has to be asked – why is standardization so important, really? Does it make people feel safer, more secure, or less at risk? What is the real value of “consistency”? Are we solving real problems, or indulging neuroticism?
It was in a sales context (full of other buzzwords like “cadence”) that I first heard the phrase “in lockstep”. This led me to call this “Three Legged Racing” – an old children’s activity that rewards careful synchronization, and is sometimes intended to deliver a teamwork lesson. The two children could run to the finish separately much faster, but tying them together induces a crippling limitation, forcing them to discard their natural instincts and abilities to stumble along, trying to get somewhere while struggling against their constraints, with grass-stained clothes, shortened tempers, and injured ankles sure to follow.
Human beings have some natural skills and tendencies – which vary from individual to individual, but in the same way children run freely and naturally, people think, talk, and work in ways that feel comfortable and effortless. The best work product results from people not wasting effort on process overhead and administrivia, allowing them to find their flow. When people aren’t allowed to work in natural ways, it’s much harder for them to accomplish anything at all, and they will be unhappy.
I remind myself not quite frequently enough that I must be careful assigning motivations to others; this is how you end up begrudging and resenting people who don’t really spare you a second thought. Most people (everyone) are trying to fumble their way along with the rest of us, and are doing what they think is the right thing to do, by some formulation of what they think the right thing to do is. People are never as sinister as an irritated person might think – though even the worst people feel justified in what they are doing, trapped by circumstance into making perfectly rational and logical decisions.
When we try to prevent mistakes by attempting to dictate future activities, we are using the fear of what truly incompetent people might do to force competent people to discard their judgment. We are harshly judging people we don’t know, and we are supposing that we can still make better decisions than them without context.
If we are arrogant enough to try to dehumanize people in the future by giving them questionable marching orders from the present, we create environments that are not healthy for thinkers and passionate people – and they will leave.
Once, someone irritated me about using the term “Best Practices”. I spouted off.
After reflection, I’ve realized that while comparing test case spreadsheets to Nickelback was good for getting people who already agree with me to snicker, it was not helping me make my point to people who did not already have that point of view.
Clearly, “Vancouver Creed” evokes strong emotions in many music fans, but lots of people like them, and I don’t just mean Avril Lavigne and other Canadian Mediocrities. They sell millions of records and sell out arenas, even today. These guys are rich beyond even *my* wildest dreams, and that is a pretty frightening Lynchian hip-hop video.
The core issue is not the wording of whatever label is used for defining helpful practices as an abstract concept, but with trying to control dialogue about practices by insisting that precedent is the most important concept in deciding what to do. Here, I’ve tried to spell out what I think the Worst of the “Best Practices” label really is: how it can be used to fool ourselves and each other into rushing to bad decisions without sufficient care or information.
1. Implied Framing
Anyone can point at anything and say that it is a “Best Practice”. I think that adding chicken stock before whipping is a good practice for mashed potatoes. I’ve made mashed potatoes that way and liked it, so I could decide to call that a “Best Practice”.
However, that would be dumb. My wife doesn’t like mashed potatoes that way. She, and many other people, have different techniques to reach different definitions of delicious, creamy mashed potatoes.
If I insist on a practice because it will create an outcome *I* want, I am assuming that the outcome I want is exactly the same one everyone else wants. There are other problems here to discuss, but that is the first – that my view of things is the only valid one, and that everything must serve that end.
Many recommended testing practices seem to be focused on the end of establishing central control and repeatable testing practices. That may serve the purposes of a person “managing” a large testing project, but most stakeholders don’t care about that.
2. Over-Simplification – Problem and Implementation
Mashed potatoes are a terrible analogue for testing software. In fact, any process that can be seen as deterministic from a handful of factors is the absolute opposite of a useful analogue.
All software projects are one-offs created by developers, testers, and project staff of various skill levels and engagement, to solve different problems, using different components, based on variably incomplete understandings of requirements. In a given context, at a given time, a team of (hopefully) smart and (always) flawed people work together to make JUST ONE thing that never existed before, and follow-up with as much bug-fixing as they have time, money, and energy for. They stumble through a project, splitting their time with other projects and non-work situations, and eventually call it done. Then they move on to create something else, with a different team addressing a different set of requirements, some a little more skilled, some a little more cynical or a little less interested. Requirements, conditions, and the skill and engagement of the people on the team, are all moving targets.
Software testing isn’t mashed potatoes. It isn’t even a restaurant; the closest I can get is an “Iron Chef” style competition: here are some ingredients and requirements you may or may not be familiar with and an arbitrary timeline – go!
That being said, indulge me. Let’s briefly consider the simple system of mashed potatoes:
– Mashed potatoes involve, first, potatoes. Selecting Yukon Gold, Redskin, Russet, Purple Peruvian, or another variety makes a huge difference. This woman is all wrong about types of potatoes, by the way. Redskin potatoes make awesome mashed potatoes. The age of the potatoes at picking, and how long they hung around before cooking, matter too. Finally, do you peel them first or not?
– Usually, the potatoes are boiled until soft. The mineral content of the water? Salt and garlic cloves in the water (no and yes for me)? How soft?
– Then they get whipped, often with a hand mixer, sometimes with a stand mixer, maybe with a food processor. Some amount of salt and butter is added. A splash of milk is a good idea, though I like sour cream or yogurt better. Some people like cheese.
– Then some amount of time passes, and the potatoes are eaten hot, reheated, or cooled off, and may or may not have gravy.
Are all mashed potatoes the same? I guess that depends on how much you care about the quality of the mashed potatoes. What’s good enough for a Tuesday night? What’s good enough for Christmas Dinner? If you wanted commodity potatoes, why not just get instant?
3. False Assertion of Authority/Expertise
“I’m an expert on potatoes. And mashing them. So let’s do it my way.”
If the person is saying that because they have a true love of potatoes and many years of experience, they should still be sensitive to the context: the available materials, time, budget, and skill.
This is far preferable to the next most likely situation: someone read something or watched a presentation, and is attempting to apply techniques they haven’t used. This turns on the worst assumption of a process engineer – that process trumps skill, experience, and context.
I love aphorisms, but one of my favorites is “Knows enough to be dangerous”. This is a place where it applies.
4. Shut Down Debate/Appeal to Fear
This is related to the previous. Once you believe that good software testing is simply a matter of selecting the right process, then you can do whatever it takes to advocate for a “pure” implementation of it.
One highly effective way to control people is with fear.
Whether you construct legal strawmen, invoke the boss’ name to try to access people’s fear buttons, or go straight for the fear of losing their jobs if the project fails, the easiest way to halt debate is to make people think that implementing processes that worked somewhere else has less risk – even if you have to make it personal. You can always claim later that you were just being cautious, which will have many people forgiving you for your attempted manipulation.
I don’t mean to minimize the importance of safety. If people don’t feel safe, of course they will be more conservative in decision making. My experience is that there is a lot more fear out there than is appropriate. I am saying that amplifying the level of fear is unkind at best, often cruel, and never fair as a technique for debate.
Sometimes, there is no debate. A consensus builds (or the HiPPO makes the call), and perceived risk is reduced; whether that is project or political risk is rarely untangled.
Somehow, I think we are a long way from being done with debating these issues. “Use Best Practices” is a mantra that has spread far and wide. We have a lot of work left to do.
Happy holidays to everyone. I hope that wherever you are, and whoever you’re with, that your mashed potatoes are satisfying to all.