On Certification

The Agile Alliance saved me a blog post, and said it much better.

It is the position of the board of the Agile Alliance that employers should have confidence only in certifications that are skill-based and difficult to achieve.

We also believe that employers should not require certification of employees.

And…

We are not a certification body and do not endorse any certification programs.

Seriously, click the link. It would be nice to see AST say something like this.

Metrics Fixations: How People Feel About Numbers

The tweet that inspired this post:

Metrics that are not valid are dangerous.

TL;DR

My blog posts sometimes branch and overlap like legacy code that no one feels confident enough to refactor. So, new feature: the Too Long, Didn’t Read summary:

  1. Metrics are useful tools for helping evaluate and understand a situation. They have similar problems to other kinds of models.
  2. People believe metrics provide facts for reasoning, credibility in reporting, and safety in decision-making.
  3. Questioning metrics remains an important mission of our community.

Metrics are Models

A metric is a model. I see modeling here as a way of representing something so that we can more easily understand or describe it. Metrics have value in expressing a measurement of data, but they need context to become information.

[Image: She Chooses Whoever Shoots Last?]

I could look into my pasture full of hundreds of nerfs grouped in their pods, and communicate what I see as “There sure are lots of them.” Or, I might say “There are 1138 of them in 82 pods. Well, there were 1138 when I counted them all up last week. Oh wait, there have been seven calves, one death, and two missing since then. Yes, 1142, definitely 1142. I think. Unless some died or came back. And there are a few pregnant females out there. Still, only males for meat until wool production recovers.”

Other people have dug into the validity of metrics in great detail previously, and I don’t want to get sidetracked into (just) validity. We will get to the use of metrics shortly, but to get us into the right state of mind:

  • If I were to say that after implementing goat pairing in one pod of nerfs as a trial, nerf losses were at 7%, is that a good or bad number?
  • If nerf losses were 14% in the period before introducing goat pairing, does that help? What if I point out that there is an average of 14 nerfs in a pod? Are you going to ask how far into the current sample period we are? (There is a small sketch of this arithmetic after this list.)
  • Did I mention that wool production is down 38% because of the goats snacking on nerf fur clumps?
  • Meat revenue is up 3% this season.
  • Per animal? No, overall.
  • Meat prices are down relative to wool prices lately, but still up 5% this year to about $5.25.
  • How many animals butchered? I record that separately, but usually just divide pounds sold by 600 and use that for investor reports and taxes.
  • “All models are flawed. Some are useful.”
  • Remember not to confuse models for what they represent, lest you get the metrics – as opposed to the results – that you are looking for.
  • Correlation is not causation. It’s especially suspect when you are trying to explain something in retrospect.
  • The last, hardly subtle point: make sure what you measure matters.
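
To make the arithmetic behind those percentage questions concrete, here is a minimal sketch. The counts are illustrative assumptions consistent with the story above (an average pod of 14, two losses before the trial, one after); the point is how little raw data can hide behind a confident-sounding percentage.

```python
# Illustrative numbers only: an average pod of 14 nerfs, two losses in the
# period before goat pairing, one loss in the trial period after.
pod_size = 14
losses_before = 2
losses_after = 1

pct_before = losses_before / pod_size * 100   # ~14%
pct_after = losses_after / pod_size * 100     # ~7%

print(f"Losses before goat pairing: {pct_before:.0f}%")
print(f"Losses after goat pairing:  {pct_after:.0f}%")
print("Headline version: goat pairing cut losses in half!")
print(f"Context version: {losses_before - losses_after} fewer animal lost, "
      "in one pod, over a sample period nobody has defined.")
```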

And….time. It’s not that helpful to pick apart specific metrics – whether they measure something real, or if they are based on CMM Levels, KLOCs, defect densities, nerf herd finances, and other arbitrary/imaginary constructs. It’s not that helpful because it doesn’t necessarily change minds. Let’s instead discuss why people are so enamored with metrics, how they use them, and speculate on what they might be getting from them.

Quantifying With Measurement

By measuring something, we may feel like we are replacing feel with facts, emotions with knowledge, and uncertainty with determinism. By naming a thing, we abstract it; the constant march of computer science is to reduce complexity by setting aside information we don’t need, and simplify things to fewer descriptors. Everybody enjoys the idea of being a scientist.

[Image: 894 story points, at 37 story points per dev per sprint, is….]

Similarly, we feel more in control when we can point to a number. We can say that a thing is a certain height, length, size, etc, and we feel like we understand it. We’ve reduced the complexity to where we can describe a thing, removing the need to try to transfer some bit of tacit knowledge if we understand what we are looking at, or deceiving ourselves about how much we actually understand if we don’t. Everyone likes to feel clever.

We can then discuss quantities, group things that seem to be similar, and so forth. This means we can put it in spreadsheets, we can talk about how many people are needed to produce certain quantities, etc.

Of course, once something is represented by a number, it invites dangerous extrapolation: “Once we implement goat pairing across all pods, we’ll make $252,000 more!”
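
Here is a minimal sketch of that extrapolation trap, reusing the story-point numbers from the image caption above; the team size and the alternative velocity guesses are my own illustrative assumptions. Dividing one estimate by another produces a number that looks like a fact, and the “fact” moves by whole sprints as soon as any input wobbles.

```python
# Illustrative numbers: 894 story points in the backlog, a claimed velocity
# of 37 points per dev per sprint, and an assumed team of 3 devs.
backlog_points = 894
velocity = 37
devs = 3

sprints = backlog_points / (velocity * devs)
print(f"Naive forecast: done in {sprints:.1f} sprints")   # ~8.1

# The single number hides its assumptions: stable velocity, a backlog that
# never grows, points that mean the same thing every sprint. Wiggle the
# velocity guess and the forecast swings by several sprints.
for v in (25, 37, 45):
    print(f"velocity {v}: {backlog_points / (v * devs):.1f} sprints")
```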

You Can’t Argue With Facts

When we can cite a number, wherever it comes from, we might feel like we are making quantitative judgments, removing our judgment and opinions. Something that is a fact isn’t open for interpretation, right?


This provides us with cover and safety. Instead of stating an opinion, we can claim we’re simply pointing at reality. If you make a mistake in judgment, metrics can be the justification for why you did it. Wouldn’t anyone else have made that choice with those facts at hand?

Where did my facts come from? If they are measurements, how do I take them, and what do I discard? Why do they mean what I say they mean, and why do they mean that here and now? This is the slippery stuff that allows us to frame a discussion with our version of “facts” and interpretation of what they mean, inserting our biases and opinions while maintaining the illusion that we are making completely quantitative decisions, using only logic and reason, denying our influence in stacking the deck in the first place.

“Quantitatively, we’ve had the same experience as everyone else – goat pairing is essential for maximizing wool production.”

We Have to Measure Our Progress Somehow

If you do get a person pursuing metrics to admit problems with validity, a common deflection for reframing these conversations is to claim that however flawed they might be, metrics are an external requirement that is not open for discussion. When the boss’s boss demands metrics – or when we say that they do – we are attempting to end the conversation about the validity of, or need for, metrics. Persisting with these questions past that signal is going to reduce your future influence, or worse.

[Image: Is this Accurate? Precisely.]

This resolve comes from the experience of being asked to report status, which is essentially answering the following set of questions:

  • Is there progress being made?
  • Is the schedule still accurate?
  • Do you need help with anything?

If the answer is No, No, or Yes, there will need to be additional supporting detail. You are persuading another person to act or not act, committing personal credibility, and taking the risk that what you claim is correct enough that they won’t look foolish for endorsing it and you.

Reporting, Cloaked in Metrics

We often have limited opportunities to prove ourselves. We want our bosses, and our boss’ bosses, to believe that we are smart and capable. Presenting metrics to bolster our conclusion makes us feel more credible – and it can’t be denied that when the subject isn’t understood, almost any metrics are going to sound impressive and credible, making everyone involved feel smarter.

Many of us have found ourselves in discussions where a stakeholder is looking at a chart where the underlying measurements are barely – or not at all – understood, but they will still question the shape of curves and graph lines, asking for explanations when any troughs appear. This can be a powerful mechanism for having a discussion about the relevant issues, but there is a tradeoff in presenting a single metric – and having that become the standard.

Good reporting communicates facts, risks, context, and recommendations. Metrics that don’t support one of these are not in the mission of reporting.

What Does it All Mean?

Is it really true we can’t run a business without metrics? I don’t think I am advocating that, but I am suggesting we can help make it disreputable to manage to flat, two-dimensional metrics as if they were reality.

Managers have been simmered in the pot of using best practices to manage to metrics for at least a generation. Questioning metrics, both in formulation and usage, is an important mission of our community. We need to be thoughtful about when and how we raise these issues, but understanding the components of our reasoning is necessary to be confident that we are reasoning well.

Arguments to Moderation

In the last couple of months, other people outside of the Context-Driven Community have spoken up about the disagreements we’ve long had with certification and standards. One of the articles is here. Go ahead and read, it’s short and I’ll wait.

On first reading, the implication seemed to be that the Context-Driven Community’s approach to testing is from a single perspective – even though the editorial’s pronouncement is essentially CDT:

Limiting oneself to a single perspective is misguided and inefficient…It’s not either this or that. It’s all of the above. Testers must come at complex problems from a variety of ways, combining strategies that make sense in a given situation—whatever it takes to mitigate risks and ensure code and software quality.

Does the editorial writer know what CDT is about? This is something that could be said by any number of people in my community. My concern is that people who are not familiar will get the impression that CDT simply has different process or method prescriptions – a common fallacy amongst people who don’t (or won’t) understand what Context-Driven means. This is really frustrating, since this is the opposite of one of the most important things to us. We keep saying that our prescription is to examine the context, and select tools, methods, and reporting that are appropriate for the context. We have a bias against doing things that we see as wasteful, but we also acknowledge that these things may need to be done to satisfy some piece of the context.

Despite essentially agreeing with us, the writer needed to mischaracterize our point of view to serve the structure of the article as an Argument to Moderation. This is both a trope of modern “journalism” and a logical fallacy: selecting/characterizing two points of view as opposites, and then searching for some middle, compromise position, usually with pointed criticism directed at “both sides” to demonstrate how much more reasonable and wise the observer is.


This is a flawed model, though. Sometimes one position is simply correct. Often the two positions are not talking about the same reality, and the framing is important. There are typically more than two positions available on an issue, but as with politics, two seems to be the intellectual limit, with every point of view placed somewhere on a spectrum between slippery-slope extremes.

The debate – such as it is – about ISO 29119 is suffering from a lack of voices willing to take up for the standard’s content and mission. Even the authors of the standard are responding to criticism by defining down what “standard” means and what it’s for. No one seems to be speaking up against the things CDT says, but there are people who seem to be enjoying contradiction for its own sake, or taking on a bystander role, clucking about personal agendas without naming anyone or anything as an example.

Debate is appropriate for describing conversations about subjects where there is professional disagreement. That’s what’s here – and that’s all that’s here. We can disagree, as professionals, and it’s fine. “Can’t we all just get along” was first uttered as a call for peace during riots where people were being injured and killed. A professional debate is not a riot. I don’t hate people I disagree with. I consider them colleagues, and if we didn’t disagree, what would we talk about? If we didn’t feel passionately, why would we bother debating?

I’m not a fan of yelling at people on Twitter. It makes many people uncomfortable, nuance is lost, and often, the person doing the yelling just looks mean. These are all valid criticisms of communication style, but not of substance – both in the sense that it ignores the issues at hand, and in that complaining about the PR instead of the content is a transparent mechanism to claim the higher ground.

If you want to talk about how our community supports and nurtures young thinkers, discussion of this particular subject is valid and important. If you want to talk about Twitter manners in order to not-so-subtly discredit a point of view without actually engaging with it, it’s not hard to see that.

People working within and profiting from a system are almost always going to think the system works well, despite whatever flaws they might acknowledge. Any criticism of the system is a challenge to the status quo, and will be opposed by the people working within it. Particularly when you profit from a system, you should not expect to be exempted from criticism of that system, or your role in it. It was ever thus, and there is no reason why this field, or this subject, should be any different.

I speak at conferences about the things I do and think that pertain to my field of study. I expect to encounter other experts, and be asked questions. If I didn’t get any questions, I probably didn’t say anything new, important, or relevant.

If you sell certification training or work on standards bodies, you nominate yourself as a spokesperson for the ideas you clearly support – or that support you, more like. If you claim expertise on a subject, or purport to accumulate anecdotes and then pass off your opaque classifications and conclusions from them as statistical evidence, you should expect to be asked questions and asked to provide more detail. If you are not willing to speak for and defend your ideas, maybe you shouldn’t be willing to profit from them, either?

If you’re an observer, you could add something to the discussion by debating the issues at hand. If your contribution is just to tone police, maybe sit this one out?

Proposal for AST SIG on Standards, Certification, and Regulation

Immediately after CAST 2014, I entered a proposal for an AST Special Interest Group (SIG). I’ve put the contents of that proposal below, as a dialogue with what AST presents as the process for creating a SIG.

I have not yet received feedback from the AST board on this proposal. In any case, I consider it a declaration of principles and intention. I hope to work with the AST board on formalizing positions that the AST will take on the important issues our community faces. I look forward to seeing AST’s response to my proposal.

AST SIG on Standards, Certification, and Regulation

“Starting a SIG (http://www.associationforsoftwaretesting.org/programs/sigs/ ): Below are items to be included when proposing a SIG. Please email your proposal to the President. The AST board of directors will review the proposal and respond.”

  1. What is the focus, need, and purpose of this SIG? (An example mission or charter is helpful)
  2. What activities will be done by this SIG? (publications, workshops, multimedia)

I have included a charter below that describes these activities.

  3. Who will lead the SIG? (They must be an AST member in good standing)

Eric Proegler, initially. Fiona Charles, James Christie, Lee Copeland, Iain McCowatt, Huib Schoots, and Mark Tomlinson are the other founding members.

  4. Is there a need for funding? What will it be used for and how much is needed?

Yes. We are asking for about $600 USD, for purchasing ISO 29119-1 through 3 (~$200 each). Sections 4 and 5 are not yet published. AST will own these documents.

Charter: AST SIG on Standards, Certification, and Regulation

Our industry and our craft thrive and grow when professional testers can apply their skills and experience to solving testing problems. The quality of our work will decline if constrained by standards and certifications, which will mean buggier software released into the wild and increased risk to the public.

Resistance in our community to testing standards and certifications is broad, but somewhat diffused. In this environment, a small number of people have taken the opportunity to create systems where they can present themselves as the arbiters of how testing should be conducted, and as the gatekeepers to the profession, creating a direct financial benefit for themselves. These standards and certifications are frequently marketed in terms of liability and risk incurred by organizations who do not employ and require them – and insist that their service providers do the same.

Our world essentially already runs on software. In the coming years, unmanned vehicles, robotics, advanced medical devices, and other complex algorithm-driven control systems are going to make this even more obvious to a public that does not have a deep understanding of how software engineering and testing are related. Quality issues such as healthcare.gov are typically seen by the public as testing failures, not software engineering failures. If we allow standards and certification vendors to define the dialog, we will someday soon find ourselves in a regulatory climate reacting to the latest Very Bad Thing with an imperative to Do Something – with vendors who stand to profit ready to describe what that should be.

By engaging directly with standards, certification, and regulation, we hope to defend our craft, protect our fellow testers, and educate the public on testing. We must provide detailed critiques, describe alternatives, and call attention to what works. We will publish commentary and criticism on Testing Standards, Tester Certification, and the Regulation of Software Testing and Software Quality. We will directly engage with and influence these issues by contacting standards bodies, legislative staff, and the press.  We will advance AST as a credible, trusted, cited, and authoritative source of information for people who do not work in software engineering when they think, write, and legislate about software testing and quality.

Publishing:

  • We will work with the board to articulate official AST positions on Testing Standards and Tester Certifications, reflecting the mission and values of AST.
  • We will publish critiques of specific standards and certifications, starting with ISO 29119 and ISTQB. We will describe where they fall short, and make it clear who is advancing them.
  • We will publish short background pieces on how modern software testing is performed for audiences outside of software testing, in order to inform and influence how they think about testing and verification.
  • We will publish about testing intelligently and effectively in regulated environments, such as FDA regulated testing, the JPL’s experience with Exploratory Testing on spacecraft software, and other examples that attest to the value and trustworthiness of modern testing techniques and approaches.
  • We will provide timely commentary when software issues become news stories.

Engagement:

  • We will present our positions and rhetoric at testing conferences, sharing our findings and inviting feedback to improve them.
  • We will publicly seek seats on standards bodies such as ISO, enter comments on standards under development, and contribute to new standards.
  • We will conduct workshops to finalize and gather support and signatories for the positions and critiques we develop.
  • We will monitor legislative activity pertaining to software testing, quality, and engineering, report on what is happening to AST at large, and attempt to intervene/engage with legislative investigations and actions.
  • We will cultivate contacts with media organizations to encourage them to contact AST Members for information about software testing and quality.

Defending Against Standards and Certification

In the few weeks since James Christie scared the crap out of CAST 2014, there have been a number of blog posts, a hashtag, a slow and large skeet target, a petition, and other actions taken. We now know about rent-seeking, some have properly contextualized ISO 29119, some of the lions in our field have spoken, and some of us have laid down markers for longer engagement with these issues.

Tough questions continue to be posed, and strong analyses of who is involved and what they’ve said in the past continue. A usual suspect has oscillated between poking fun at these concerns, saying that standards don’t really matter, and attacking our motives and how we identify ourselves (a “Greatest Hits” rotation rather than anything new, admittedly). Interesting how people who complain about “disrespectful and antagonistic rhetoric” have such constructive things to say.

Someone else is now unpopular (or newly so – James Bach has attempted to engage with him on multiple issues in the past), and there is enough of that to go around. There are sincere people who think standards can be helpful. Some have said that what has happened is an overreaction, but I think of it more as an overdue one.

[Image: The Uninteresting Fits of Capitalist Dreams]

It’s the most coherent and together I’ve ever seen my community on an issue. There are long-standing, deep divides between my community and others who work in software quality on standards and certification. We disagree on issues that have a significant impact on our profession. Tone policing doesn’t address the actual disagreement. Intellectually eviscerating the content and arguments is simply waved away. We could talk more about compliance, pathetic and otherwise, but it seems we are finding a collective voice, and hopefully, we will no longer be dismissed as a lunatic fringe.

Rather than try to talk about all of the issues or re-echo some of what has already been said, I want to talk about why these issues matter to me enough to seek action. What are the practical impacts of standards and certifications on people working in testing?

I am against the testing standards and certifications I’ve seen because I think their goals lead towards making testing a shitty job for a lot of people, and amplify the idea that testing and testers can be commoditized. When the experimentation and learning of testing is reduced to a series of reproducible steps, it is easier to “manage” (hire/fire/outsource) the people that do it, and to pay them less. Let’s not be mistaken: skilled testers are not desired in many of these situations. They are expensive, difficult to replace, hard to scale – and hard to commoditize.

Who Does This Affect?

For starters, everyone that works as a software tester. It’s difficult to know how many software testers there are in the world, and if that number is growing or shrinking. The US Bureau of Labor Statistics has a job code for testing, but seems to lump testers into a group called “Computer Occupations, All Other” with many other classifications. There were 196,000 US employees in that bucket in 2013.

I found a citation of 350,000 US testers in 2007. I heard Laurent Bossavit’s soft but steely voice in my head as I pursued this number (intersecting with the infamous NIST $59.5 billion annual cost of bugs legend), and found talk of calculations based on how many developers there were (~1.2m in 2007). The bucketed classification scheme I mentioned previously has only 183,000 for 2007 in the US. If we choose to base this on some custom calculation of available data, we could choose a fraction of the 1.4m developers as previously guessed at, or maybe even the 3.6m in IT.

ISTQB claims to have certified 336,000 testers through the end of 2013, with about 50,000 certifications issued in 2013. They claim 8.1% of these are in the Americas; that’s about 27,000 in the Western Hemisphere.

In any case, this gets us to a number somewhere between several hundred thousand and a few million testers in the world – and only a tiny percentage of those are flying the CDT flag.
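
As a rough sketch of how far apart those bases land, here is the arithmetic in one place. The cited figures are the ones quoted above; the tester-to-developer and tester-to-IT-worker ratios are my own illustrative guesses, which is exactly the problem.

```python
# Figures as cited in the text; the ratios applied to developer and IT
# head counts are illustrative guesses, not established numbers.
estimates = {
    "BLS 'Computer Occupations, All Other' bucket (2013)": 196_000,
    "2007 citation of US testers": 350_000,
    "1 tester per 4 of ~1.4m developers": 1_400_000 // 4,
    "1 tester per 10 of ~3.6m IT workers": 3_600_000 // 10,
}
for basis, count in estimates.items():
    print(f"{basis}: ~{count:,}")

# ISTQB's own numbers, for comparison.
certified_total = 336_000    # certifications claimed through end of 2013
americas_share = 0.081       # claimed share in the Americas
print(f"ISTQB-certified in the Americas: ~{certified_total * americas_share:,.0f}")
```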

The Testing Profession

My testing job – and the jobs of most of the people I know that work in testing – are decent jobs, but that’s because we have generally found our way to situations where we can apply judgement, experience, and skill to interesting and challenging work. These days, that circle is mostly limited to the people I know from testing conferences and Twitter, which is certainly not the testing community at large.

Many – and I nearly said most – of the jobs I’ve seen in testing outside of this circle have not been good jobs, particularly in larger organizations. They have the “lowest-rung” IT job, sometimes considered just above first-level help desk, and sometimes below. Many testers use these jobs as stepping stones to get to something that pays better and is more interesting. Others see it as a path to management. Very few people see testing as a viable career, particularly when faced with what the testing jobs in their organization are. These testing jobs are always about to be reorganized/outsourced/sacrificed as budget markers because of how people react to issues reaching deployments, and because these jobs are difficult to tie to revenue.

One reason these testers are not as effective is how they are forced to work; they spend a lot of time producing test case/test execution documentation – the primary criticism of (and prescribed “output” from) ISO 29119. Producing test case documentation might be interesting, and is an opportunity to find issues the same way writing automated checks is; you can find a problem while exploring the software in order to create test cases. Repeating the subset of activities that were actually documented from this process as manual checks is not the most effective way to find software problems, and usually doesn’t find new ones. It is checking to see if old problems reappear, and isn’t really testing. Documenting this low-value activity in detail is not useful, and further lowers whatever value and effectiveness the “testing” effort was going to have by increasing the time spent not testing. These distinctions are either hard for factory testers to understand – or deliberately avoided.

[Image: Only Until Automation Is Cheaper]

Perhaps more importantly, checking and documenting checking is shitty, boring work that is not enjoyable and has significant handicaps to finding bugs baked into it. It might give someone who has never seen the software before a guided tour, but testers who work with the software every day are not learning anything that will help them improve the software – or themselves. Testers stuck in these jobs spend their days doing uncreative, ineffective work, increasing the justification for – and the ease of – outsourcing or automating their jobs away, leaving them without marketable skills and prospects the next time a desperate executive pulls out his spreadsheets, looking for something to squeeze costs on.

Developers seized their profession and remade it so they could be more effective, stop wasting their time on low-value, irrelevant tasks, recover their craft from industrial thinking, and make their jobs more rewarding, in multiple senses. Couldn’t and shouldn’t we do that, too?

Liability

[Image: Liability Musical Chairs – Who’s The Least Cunning?]

Sometimes when things go very wrong, there is an insistence that liability land on someone. Sometimes not. One example of scapegoating people for failing to prove a negative is the six-year jail sentences, and the $10m in costs and fines, assigned to seven scientists in Italy who were held “responsible” for not warning the public about an earthquake. My experience is that testers are easy scapegoats for quality problems; the public sees bug escapes as testing failures instead of engineering failures.

My community’s resistance to ISO 29119 and other attempts to control how others test could be critical to some future fellow tester trying to defend intelligent testing – and themselves. In the US legal system, the rules for evaluating whether a technical or scientific methodology is relevant are directly tied to consensus and the maintenance of standards.

The Daubert Standard, from http://www.law.cornell.edu/wex/daubert_standard:

Standard used by a trial judge to make a preliminary assessment of whether an expert’s scientific testimony is based on reasoning or methodology that is scientifically valid and can properly be applied to the facts at issue. Under this standard, the factors that may be considered in determining whether the methodology is valid are: (1) whether the theory or technique in question can be and has been tested; (2) whether it has been subjected to peer review and publication; (3) its known or potential error rate; (4) the existence and maintenance of standards controlling its operation; and (5) whether it has attracted widespread acceptance within a relevant scientific community. See Daubert v. Merrell Dow Pharmaceuticals, Inc., 509 U.S. 579 (1993). The Daubert standard is the test currently used in the federal courts and some state courts.  In the federal courts, it replaced the Frye standard.

The rigor our community applies to evaluating claims is a good fit here. We need to keep loudly and consistently calling bullshit on hand-wavy best practice and standard assertions, or they will become the *standards* our work is *judged* by.

Legislation

Many people throw up their hands about government activities, considering it a waste of their time to follow, frustrated by how arbitrary, uninformed, and frequently ineffectual it is. Others rage over the sinister motivations assigned to the side they don’t agree with. There are always plenty, living in libertarian fantasies (redundant), who want to call the whole thing irrelevant.

Meanwhile, in the real world, government makes decisions every day, careening from one crisis to another, trying to solve every problem with lobbyist-authored, detailed instructions on how everything must work. Seems like a standards or certification board? Or at least the reality of what they want?

Here is my US Representative (lower Federal House) – the representative for all of Silicon Valley – trying to ask a question about a “Security Wall” for healthcare.gov. She is brushed off with the weakest of doubletalk.

It seems clear that we can choose to engage with this issue, or let other people define the terms for everyone. I want us to avoid becoming subject to standards and certifications chosen for us by people who don’t understand the issues involved. Can we educate them and the public? Or will we have them respond reactively in the heat of a moment?

The Future: Algorithms, Automation, and Robotics

[Image: Google self-driving car. Once we solve for potholes, and road debris, and weather, and construction…]

People get excited about hardware, but the future arrives in software.

I see these amazing test challenges all the time, mostly because this picture was taken within a mile of my house, and even closer to where these are being developed. Old-timey automakers are working on this now, too. There are other vehicles standing by.

The economic drive to automate shows no sign of letting up. Economic drives do not have consciences – they require people to pay attention and check abuses.

Our present already runs on algorithms that can cut communications, shut down power, crash the stock market, or even kill us when there are bugs. Our access to water, food, health care, education, employment, housing, and transportation are all subject to computer systems working correctly – even when people operate them “correctly”. This is a very good place for a reminder that software and testing are still very young fields, with a lot left to learn and a lot farther to go.

[Image: 2014 Hype Cycle. This is not Sci-Fi. And if it were better, it would have bio-hacking. Still!]

In the future, more and more complex tasks will be automated, to the extent that the important decisions can be automated. A few humans will be needed for exception cases, but software will be expected to handle more and more routine tasks. The economic effects of this trend are troubling enough. The risks of autonomous software issues in medicine, transportation, economics, energy, and “defense” need to be met by engaged, expert testers. We need thinking, exploring testers with time and space to do their best in order for our future to be safe.

The Bottom Line

I want people who work on software to demand skilled testing because of its superior risk mitigation. Modern testing is the way forward because it is more effective for exposing bugs. Our natural allies in the Agile Community will be happy to have us step forward and take control of the testing narrative. They are no more interested in us wasting time generating proof of compliance documentation than we are in doing it.

I will speak up for my values, and the values of my community. I will help amplify the ideas of my community, and look for ways we can influence how the world at large thinks about testing. I will push for our community’s treasures to be acknowledged as serious testers who are the very best in the world at what they do.

I want testing to be a skilled, well-paid profession. I want testing to be a profession that bright people can learn from, improve at, and advance in, not just something they pass through on their way to something else – unless they want to. As well-trained critical thinkers, they should be successful wherever they go.

Enough sitting out, being polite, and waiting for someone else to step forward. I will not stand by while consultants sell out our craft as a marketing exercise. We must rescue testing from compliance and documentation, and make it so that skilled testers are expected to choose the appropriate test methods and documentation for the stakeholders and context they find themselves in.

So far, I have helped set up and maintain professionaltestersmanifesto.org, which I hope you’ll consider signing. Karen Johnson has done a great thing for our community in helping articulate our principles.

I have talked about these issues in a podcast interview, though it has not posted yet. I have some writing to do here, and other places. I want to speak clearly and loudly about the benefits of modern testing, and how standards and certifications limit testers and testing.

Longer term, I have other work to do on this. Shortly after CAST, I made a formal proposal to AST for a Standards and Certification SIG. More on that soon.

Marketer Successfully Trolls Tester

I made the mistake of clicking through. The article is titled “Best Practices of Context-Driven Testing”, and it is delicious, savory troll bait.  

I had to leave a comment, which follows. Still too shallow – but that is appropriate for the subject:

(Update: a month later, comment not approved. As always, beware putting effort into comments and commenting interfaces.)

I’m not hung up on words. It’s just these two words really suck.

I understand that some people get tired of the CDT community digging in their heels on “Best Practices”, but it needs to be understood that this isn’t about being schoolmarms who pedantically correct “Who” with “Whom”. It’s about rejecting the idea that there is one correct answer in any situation, and that “Best Practices” are inextricably tied to the idea that you can replace judgment, skill, and experience with process weenery.

“Best Practices” can and will never accomplish their goal: replacing skill and experience with canned knowledge and prescriptive recipes. A skilled practitioner can explain when a practice makes sense, and when it doesn’t. An expert can evaluate a situation and solve a problem. An amateur risks believing they are an expert because they read a Wikipedia page – or can regurgitate a “Best Practice”.

People that think the “Best Practice” label is harmless have probably never run into a situation where someone is using that particular appeal to a nebulous, non-existent authority to assert control and dismiss disagreement. Many people have seen and experienced how dangerous it is to allow anyone to wave around whatever practice they read about last that sounds good and call it “Best”.

Good testing requires examining what you know, where it came from, how much you trust it, and how you might falsify it. When a professional tester dismisses criticism of “Best Practices”, they are suggesting that they are probably not very good at testing. Not because they don’t have a prescribed reaction to a blasphemous term, but because they seem dismissive of the deep thinking required to be good at testing.

Good use of #7, btw: http://www.campaignercrm.com/en/community/blog/crm/post/10-best-practices-for-corporate-blogging/

Standardization (is) for Dummies

A theme seems to have developed on this blog. There has been a lot of complaining about the control mechanisms of industrial-scale quality management. Let’s not change course now; we must stay committed to the process.

[Image: Very irony…much humor…wow]

Today, I want to talk about the pathology of “Standardization” – the idea that the most efficient way for large organizations to “manage” work done by many people is to make the tools and/or processes the same across the organization, specifically for testing. I am not talking about rolling tool licenses up into an enterprise license, or even reporting structures coalesced into “establishing centers of excellence” (more like “marketing scheme to justify cost for enterprise tools of mediocrity”, amirite?).

[Image: Note the diversity of characters, even with “common tooling”]

And of course, many things do benefit from standardization. Railroad travel, measurement (Metric, ‘Murica!), Lego, hardware interfaces, operating systems, etc. Often, it is the right thing to standardize.

Achieving standardization through generalized process engineering is really trying to replace the variables of skill and experience with a perceived constant provided by documentation of some idealized process and a set of required artifacts.

The desire to model our world as full of easy equivalencies is easy to understand; we make decisions all the time between two or more choices – or at least “choices” as we have framed them (or allowed them to be framed for us). The reduction of complexity to symbols is necessary for us to decide what to do, where to go, and how to get there without paralysis.

Testers are rich sources and frequent users of heuristics. Heuristics are very effective when used responsibly, skillfully, and in the right circumstances. What always matters is context. Nothing is truly abstract.

Choosing between an apple or an orange for a morning glass of juice is a matter of preference, and a very different choice than deciding which tree to plant in your yard, which requires considering climate, sun exposure, and soil. Apples and oranges is not even really “Apples and Oranges” without understanding why the choice is being made, who is making it, and what the desired outcome(s) is.

Standardizing Test Practices

[Image: This weenie would like to see your test case documentation format]

I believe that process weenies who lead standardization efforts really believe most of the things that they say. They believe that if they can properly standardize, document, and implement the Amalgamated Consolidated, Inc way to do things, they will save the company money, shorten testing cycles, implement the proper metrics, and reduce hard and soft costs. I didn’t say they are right, but I am saying that they are not intending to mislead when they make those claims. Given how the Iron Triangle of Cost, Time, and Quality works, they can indeed move towards two of the corners.

In addition to the very common pathology of presenting goal statements like “save money and improve efficiency” as a strategy, there are some other things that are said that I find troubling. Let’s unpack some of these.

“If we standardize, our training costs will be lower.”

“Standardizing will make it easy to transfer work and employees between groups.” 

Effective testers know the software and system they test, and how best to work with the people on their team. If they go to a new team, that knowledge will be lost from their old team. The relocated tester will need to build new contextual information again. If they get work from another team, they will have the same challenges.

It is incrementally cheaper to have one set of process documentation than two, or one set of software manuals. Of course, that documentation will already have holes by the time it is published, and by the time processes have a year or two to evolve, the documentation will have notes, exceptions, and whole sections that are flat-out ignored. Oh, you are going to keep the documentation updated, you say? Does that ever really happen?

The truth is, new employees aren’t really going to learn any faster from attempting to make every department do things the same way. They are going to have to learn whatever process they need to know to be successful in the area in which they are hired. Who cares whether Group A does things the same way as Group B? Not the new employee, they only care about how their group does things.

Every department/business unit/team will have a number of “local variables” – where data is stored, how to get equipment, refreshing builds – all of the contextual parts of the process that can only be learned through practice. It is also hugely important to learn how the group’s priorities are set, who gets mad, who is receptive, how this manager likes to run things, what that director means when they say things cryptically in email; every new employee has to learn these to be effective.

What portion of onboarding time is *really* spent on learning the steps of a process?

“HR says we have to formalize job descriptions/responsibilities/salary bands.” 

[Image: These are the 4 testing role descriptions we will use across the company… (the 4 different shapes make them appear more natural)]

Sometimes, the organization is looking to rationalize job descriptions, salaries, and other things that make it easier to bracket employees against each other. This process is almost never good for workers. There might be some very pious talk about making sure pay is “fair” across the organization, but my experience is that it is far more likely that expensive salaries end up targeted for cutting than that low salaries are raised.

Treating people like interchangeable, programmable cogs is not only dumb, it’s dehumanizing and demotivating. Smart, passionate people will be motivated to find somewhere else to work where they can use and grow judgment and skill. If you are looking to commoditize testers between groups, you are likely to end up with a pool of McTesters with similar skill levels, and results across projects may also be similar in an equally undesirable fashion.

“It costs us money to have redundant tools in the organization.” 

Yes, this is absolutely true. But enough about middle management.

Tool-centric views of testing are somewhat less prevalent than they were a few years ago (though automation-obsession is a large problem). Open source and custom tooling seems to be pulling ahead – because there is always a bias towards and a significant focus on cost-cutting.

If you believe that all tools are equivalent, it becomes easy to make “dollars and sense” decisions at some remove from the front lines to reduce redundancy and merge the organization’s knowledge. Unfortunately, this is simply not true. Tools are not equally fit for the same purposes.

All tools have strengths and weaknesses – most of which are less important than the skill and judgment of the tool operator. If you actually do take away the tool an experienced and skilled person is comfortable using and replace it with another, you are also discarding a great deal of experience and developed work – the value of which might be difficult to measure, but is really hard to replace without significant time and energy. Sure, some of it is crap – most automation code is. The useful bits would go into the garbage with the rest of it. A sunk cost, perhaps – but still discarded value.

Briefly, this trope: “We have wasted redundancy trapped inside more than one code base”.  As if code were perfectly commented and interchangeable, ready to be pulled off a shelf and swapped in to a waiting receptacle as if it were a battery.

“All of our documentation looks different.”

Standardization of document templates is probably harmless, beyond giving anal-retentive ninnies some cover in the form of work product to justify their grab at “thought leadership”.

Perhaps the first question should be “So?” or “And?” Documentation is just a way to communicate important information to the people who need it. Who actually consumes the documentation? What information are they looking for?

“We’re all doing things differently.”

[Image: We Must Be In Lockstep]

Different groups of people will choose different methods for attacking different problems – or perhaps even the same ones. The collective skills, experience, and inclinations of one group will be different than another’s – so of course they will come up with different ways to do things.

This “argument” is a great example of “begging the question;” why is it bad to do things differently? What is to be gained by forcing people to work at the same pace in the same way? Effective groups will develop ways to work together efficiently – a process of continuing improvement.  It has to be asked – why is standardization so important, really? Does it make people feel safer, more secure, or less at risk? What is the real value of “consistency”? Are we solving for real problems or neuroticism?

It was in a sales context (full of other buzzwords like “cadence”) that I first heard the phrase “in lockstep”. This led me to call this “Three Legged Racing” – an old children’s activity that rewards careful synchronization, and is sometimes intended to deliver a teamwork lesson. The two children could run to the finish separately much faster, but tying them together induces a crippling limitation, forcing them to discard their natural instincts and abilities and stumble along, trying to get somewhere while struggling against their constraints, with grass-stained clothes, shortened tempers, and injured ankles sure to follow.

Human beings have some natural skills and tendencies – which vary from individual to individual, but in the same way children run freely and naturally, people think, talk, and work in ways that feel comfortable and effortless. The best work product results from people not wasting effort on process overhead and administrivia, allowing them to find their flow. When people aren’t allowed to work in natural ways, it’s much harder for them to accomplish anything at all, and they will be unhappy.

I remind myself not quite frequently enough that I must be careful assigning motivations to others; this is how you end up begrudging and resenting people who don’t really spare you a second thought. Most people (everyone) are trying to fumble their way along with the rest of us, and are doing what they think is the right thing to do, by some formulation of what they think the right thing to do is. People are never as sinister as an irritated person might think – though even the worst people feel justified in what they are doing, trapped by circumstance into making perfectly rational and logical decisions.

[Image: “Train Tracks are SUPPOSED to be the same width!”]

When we try to prevent mistakes by attempting to dictate future activities, we are using the fear of what truly incompetent people might do to force competent people to discard their judgment. We are harshly judging people we don’t know, and we are supposing that we can still make better decisions than them without context.

If we are arrogant enough to try to dehumanize people in the future by giving them questionable marching orders from the present, we create environments that are not healthy for thinkers and passionate people – and they will leave.

Mashing “Best Practices”

Once, someone irritated me by using the term “Best Practices”, and I spouted off.

After reflection, I’ve realized that while comparing test case spreadsheets to Nickelback was good for getting people that already agree with me to snicker, it was not helping me make my point to people who did not already have that point of view.

[Image: Canadian Arena Rock Best Practices]

Clearly, “Vancouver Creed” evokes strong emotions in many music fans, but lots of people like them, and I don’t just mean Avril Lavigne and other Canadian Mediocrities. They sell millions of records and sell out arenas, even today. These guys are rich beyond even *my* wildest dreams, and that is a pretty frightening Lynchian hip hop video.

The core issue is not the wording of whatever label is used for defining helpful practices as an abstract concept, but the attempt to control dialogue about practices by insisting that precedent is the most important concept in deciding what to do. Here, I’ve tried to spell out what I think the Worst of the “Best Practices” label really is: how it can be used to fool ourselves and each other into rushing to bad decisions without sufficient care or information.

1. Implied Framing

Anyone can point at anything and say that it is a “Best Practice”. I think that adding chicken stock before whipping is a good practice for mashed potatoes. I’ve made mashed potatoes that way and liked it, so I could decide to call that a “Best Practice”.

However, that would be dumb. My wife doesn’t like mashed potatoes that way. She, and many other people, have different techniques to reach different definitions of delicious, creamy mashed potatoes.

If I insist on a practice because it will create an outcome *I* want, I am assuming that the outcome I want is exactly the same as everyone else wants. There are other problems here to discuss, but that is the first – that my view of things is the only valid one, and that everything must serve the end.

Many recommended testing practices seem to be focused on the end of establishing central control and repeatable testing practices. That may serve the purposes of a person “managing” a large testing project, but most stakeholders don’t care about that.

2. Over-Simplification – Problem and Implementation

Mashed potatoes are a terrible analogue for testing software. In fact, any process seen as deterministic from a handful of factors is the absolute opposite of useful.

All software projects are one-offs created by developers, testers, and project staff of various skill levels and engagement, to solve different problems, using different components, based on variably incomplete understandings of requirements. In a given context, at a given time, a team of (hopefully) smart and (always) flawed people work together to make JUST ONE thing that never existed before, and follow-up with as much bug-fixing as they have time, money, and energy for. They stumble through a project, splitting their time with other projects and non-work situations, and eventually call it done. Then they move on to create something else, with a different team addressing a different set of requirements, some a little more skilled, some a little more cynical or a little less interested. Requirements, conditions, and the skill and engagement of the people on the team, are all moving targets.

Software testing isn’t mashed potatoes. It isn’t even a restaurant; the closest I can get is an “Iron Chef” style competition: here are some ingredients and requirements you may or may not be familiar with and an arbitrary timeline – go!

[Image: A Demanding Stakeholder who doesn’t care about process, just results]

That being said, indulge me. Let’s briefly consider the simple system of mashed potatoes:

– Mashed potatoes involve, first, potatoes. Selecting Yukon Gold, Redskin, Russet, Purple Peruvian, or another variety makes a huge difference. This woman is all wrong about types of potatoes, by the way. Redskin potatoes make awesome mashed potatoes. The age of the potatoes at picking, and how long they hung around before cooking, matter too. Finally, do you peel them first or not?

– Usually, the potatoes are boiled until soft. The mineral content of the water? Salt and garlic cloves in the water (no and yes for me)? How soft?

– Then they get whipped, often with a hand mixer, sometimes with a stand mixer, maybe with a food processor. Some amount of salt and butter is added. A splash of milk is a good idea, though I like sour cream or yogurt better. Some people like cheese.

– Then some amount of time passes, and the potatoes are eaten hot, reheated, or cooled off, and may or may not have gravy.

Are all mashed potatoes the same? I guess that depends on how much you care about the quality of the mashed potatoes. What’s good enough for a Tuesday night? What’s good enough for Christmas Dinner? If you wanted commodity potatoes, why not just get instant?

3. False Assertion of Authority/Expertise

“I’m an expert on potatoes. And mashing them. So let’s do it my way.”

If the person is saying that because they have a true love of potatoes and many years of experience, they should still be sensitive to the context: the available materials, time, budget and skill.

This is far preferable to the next most likely situation: someone read something or watched a presentation, and is attempting to apply techniques they haven’t used. This turns on the worst assumption of a process engineer – that process trumps skill, experience, and context.

I love aphorisms, but one of my favorites is “Knows enough to be dangerous”. This is a place where it applies.

4. Shut Down Debate/Appeal to Fear

This is related to the previous point. Once you believe that good software testing is just a matter of selecting the right process, you can do whatever it takes to advocate for a “pure” implementation of it.

One highly effective way to control people is with fear.

"And then we will probably all get sued!"
“And then we will probably all get sued!”

Whether you construct legal strawmen, invoke the boss’s name to try to push people’s fear buttons, or go straight for the fear of losing their jobs if the project fails, the easiest way to halt debate is to make people think that implementing processes that worked somewhere else has less risk – even if you have to make it personal. You can always claim later that you were just being cautious, which will have many people forgiving you for your attempted manipulation.

I don’t mean to minimize the importance of safety. If people don’t feel safe, of course they will be more conservative in decision making. My experience is that there is a lot more fear out there than is appropriate. I am saying that amplifying the level of fear is unkind at best, often cruel, and never fair as a technique for debate.

Sometimes, there is no debate. A consensus builds (or the HiPPO makes the call), and perceived risk is reduced; whether that is project or political risk is rarely untangled.

[Image: I SAID TMap!!!]

Somehow, I think we are a long way from being done with debating these issues. “Use Best Practices” is a mantra that has spread far and wide. We have a lot of work left to do.

Happy holidays to everyone. I hope that wherever you are, and whoever you’re with, that your mashed potatoes are satisfying to all.

Managing Test Organization Transformation

Recently, I attended the 9th Workshop on Software Testing in Financial Services (STiFS9) in New York, hosted by Liquidnet. The Theme of the workshop was “Organizational Structure Models for Test Groups at Financial Firms.” One of the discussions that struck a rich vein of ideas was what is needed for successful transitions between types of organizational models in quality organizations.

While the experiences that the participants drew on tended to be with larger organizations in financial services industries, these ideas could be useful for any testing organization’s transformation – or maybe even outside of testing.  Specifically, we were talking about organizations that were moving from a centralized to decentralized testing organizational model – or from decentralized to centralized in organizations with dozens or even hundreds of testers.

STiFS is a LAWST-inspired peer workshop, meaning that the value of the workshop is the product of all of the participants in the workshop, which included Bernie Berger, Kaveri Biswas, Margaret Boisvert, Ross Collard, Joe Lopez, Mike Pearl, Don Pierce, Anna Royzman, Ben Weber, and myself. We captured pages and pages of ideas from these very smart and experienced testers and test managers. Bernie Berger, the CO of STiFS9 (and the driving force behind STiFS) has provided a report on the workshop here.

This post is not an official finding of STiFS9. It doesn’t include all of the discussion or notes and only represents my interpretation and reflection on what I heard. Other participants may have a different take on what was discussed and what it meant. This post is what it meant to me.

A decentralized testing organization, as I mean it here, has testers reporting into specific projects as part of the staff. This approach might be marketed as “(More) Agile”, though it is certainly not necessary for agile, Agile, Scrum, or Certified Pokemon Master processes.

Arguing Over User Stories
What I Think of When I Hear Scrum

A centralized testing organization usually has a reporting structure that includes most or all testers in an organization in one department, which allocates testers for providing testing services to specific projects. Testers report upwards in the testing department and may work on several different projects, depending on the needs and desires of the whole organization. One marketing term associated with this approach is “Testing Center of Excellence.”

2,857 Career Story Points
A True Centre of Excellence

For the testing organization’s transformation project to start successfully, there needs to be top-down support from the executive levels of the company. This support can be demonstrated and described by clear communication of the organization’s commitment to the transformation and its goals. These goals may be to respond to changing market conditions, updated regulations, concerns about the quality and speed of delivery in the organization, or some combination of these and other factors. They should be articulated as the mission that the transformation is designed to accomplish, with clear objectives and sufficient justification.

This communication of goals is usually understood as an essential step for aligning the organization, but less obvious is the need to sell the transition to the organization without jargon or “business speak.” This transparency builds trust with employees, who need to feel safe during transitional periods, need to feel included in the process and planning, and need to be optimistic about their jobs and the organization’s future after the transformation. Many people are uncomfortable with and even resistant to change; laying the groundwork will help them process the changes to come. Buy-in enlists people in the organization to support the transformation and its goals, and helps retain key employees through periods of uncertainty. This was called “bottom-up trust.”

This leads to another core requirement: an organization’s leadership may draw up a new org chart (and the new org chart should be clear to everyone), but that doesn’t explain *how* the organization changes successfully, doesn’t describe all of the details and requirements, and doesn’t fully define the new roles. In fact, almost none of these details will be understood and defined at the start of an organizational transformation, making broad buy-in even more essential.

The real hard work of solving problems and helping define new structures is likely to be undertaken by “Champions” – key contributors throughout the company who are influential and respected by their co-workers. Many of these champions should *not* be managers, so that their motivation and contributions are clear. Some influence could come from outside consultants, but that will vary by situation and consultant.

The champions should be known to, and perhaps recruited by, the person who owns the transformation – the transformation owner. This person must be able to solve technical and political problems, make decisions that stick, and manage an ever-mushrooming list of details. The organization should expect clear, credible, and authoritative communication on progress from this person – and receive it regularly. A trusted source of information helps keep the organization aligned, and may help counter-program the rumor mill that uncertain people will be paying extra attention to.

The transformation owner should drive collaboration and keep people aligned. One way to support this collaboration is impact analysis on stakeholders: identifying problems, and helping create patience during the periods of lower efficiency and missed deadlines that will surely occur while the organization changes direction. This allows the stakeholders to plan and adjust accordingly, by making it explicit that expectations should be renegotiated.

Communication is only one requirement of the role, though. The owner must be able to advocate upwards and across the organization for accountability and buy-in from everyone to the transformation mission – and get it. Short-term revenue and budget pressures will contend with the work needed to create new processes.

The transformation owner can sell leadership on the need for patience and tolerance for initial failures, and run pilot projects to help make the transformation smoother for the rest of the organization. Teams should be able to experiment – to test the transformation – and provide feedback. The willingness to adjust the plan based on this feedback increases the chance of success.

One thing that was noted as frequently missing in transformation plans was engagement with and consideration of risks. There are lots of different kinds of risks; by engaging with and enumerating them, mitigations can be found. Not only should success criteria be understood – failure criteria may be helpful, too! Frequently, rollback is not an option, but contingency planning may be very helpful in getting through rough patches, particularly if the “worst-case” scenario has been understood.

As the details of the plan are being built, sufficient effort should be allocated to workflow analysis. How does work get done today? How will it be done in the future? Some things will fall through the cracks, and the people who would have noticed previously may be doing different things at that point.

An important part of the transformation plan is knowledge transfer. A concept of an “Organizational Level of Knowledge” was described; the cumulative knowledge of the organization is an important asset, and its value should be recognized and, hopefully, preserved. Process documentation is easy to identify, but the unwritten rules are harder. Who can really get something done in a given area? How do you put an announcement out on the grapevine? These are not issues that can be resolved with an increased training budget; they took years to develop and discover.

Enter beer.

Wine is fine, liquor might indicate a problem...
The Secret Sauce of Transformation

Whether it is used for team building, to provide opportunities for horizontal communication, or as a coping mechanism, there were broad and strong recommendations for using beer to create social situations for team building, casual/informal communication, and reducing stress levels. This touched off a series of comments about the critical effects on employees going through and adjusting to transformation.

Perhaps the most critical issue is having well-defined new responsibilities. People need to know what is expected of them, and they need to trust that they have clear guidelines. This helps them feel secure, helps make reporting relationships clear, helps set expectations, and helps them adjust to their new position.

It was suggested that there should be additional focus given by HR (and overtime, if necessary) to understand and help mitigate impacts on specific people who are dealing with change. People working with new managers are more likely to take complaints or issues to HR instead of to the manager; they need to feel that they are heard and their concerns are addressed. This also gives HR a chance to focus attention and effort on keeping key people such as star performers happy and productive.

One of the strong signifiers (or the moment when it becomes real) is when people have to move their desks. Movers, phones, networking, etc. all have their own issues and schedules, and will need to be coordinated. Offices and workspaces may need to be redesigned.

Sitting with the new team is important, but there are real impacts beyond a change of scenery. If the geographic location changes and people have to move, there should be some care in how to help them understand it. Even changes like an impact to the commute are going to hit harder when surrounded by uncertainty.

People will need training and coaching for new roles and processes. Agile was cited as a commonly experienced part of a transformation, but even if the methodology isn’t changing, there are certainly other changes in workflows and/or in required domain or technical knowledge. Testers should feel supported, and feel that there is patience and tolerance while they skill up for their new roles.

Some of the technical issues around testing were collected by the group. Maintenance of automation and preservation of code and test assets were reported as often overlooked. There is a need to resolve the ownership of shared test code and data – or to make an explicit decision to abandon it. Test labs and equipment will have to find new owners, and it should be clear how to schedule shared resources. Some cleanup should take place after the project completes.
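
An aside from me rather than a workshop finding: one way to keep shared test assets from being orphaned during a reorganization is to write the ownership down somewhere explicit and checkable. Here is a minimal sketch in Python; every asset and team name in it is invented.

# Hypothetical sketch: an explicit registry of shared test assets and their owners,
# so a reorg surfaces orphaned automation, data, and lab gear instead of hiding it.
SHARED_TEST_ASSETS = {
    "regression-suite/ui":        "Payments Team",   # names are invented
    "regression-suite/api":       "Platform Team",
    "perf-lab/load-generators":   None,               # no owner agreed yet
    "test-data/masked-prod-copy": "Data Services",
    "device-lab/mobile-rack-3":   None,
}

def unowned(assets):
    """Return the shared assets that nobody has agreed to own after the transition."""
    return [name for name, owner in assets.items() if owner is None]

orphans = unowned(SHARED_TEST_ASSETS)
if orphans:
    print("Decide ownership (or retirement) before cutover:")
    for name in orphans:
        print("  - " + name)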

More subtle impacts include ownership and responsibility for compliance concerns and go-forward updates/forwards/changes/retirements of email aliases, message groups, and distribution lists. Managing configuration changes may need significant rethinking, and permissioning was mentioned as a specific problem that had been encountered.

Even when testing is broadly distributed, some testing functions such as Automation or Performance Testing may still be best supported horizontally. This should be considered as part of the plan, including how to request support, and who will lead these functions.

In addition to all of this input about planning, the group had suggestions for conducting the transformation project. Once the transformation project is underway, continuing updates against frequently occurring milestones will help maintain cohesion and focus. Phases of the project are helpful for keeping things on track.

The plan and progress should be highly visible, and some advocated for a metrics regime as part of understanding where the organization was before the transformation, how the organization is managing during the transformation, and how successful the transformation project was in improving the organization’s testing efficiency. There was a lot of energy around engaging with the measurements that would be used by the organization to evaluate the project. Some of the proposed uses of metrics included the awareness of strengths and weaknesses, and current efficiencies and costs. Skepticism about metrics led to a discussion about the need to avoid inherited bias and future gaming of the system by using clearly documented and tangible metrics.
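
To make that concrete, here is a minimal sketch (mine, not the group’s) of the before/during/after comparison being described. Every metric name and number is invented; real measurements would need the same clear documentation and the same skepticism discussed above.

# Hypothetical sketch: compare a few clearly documented measurements captured
# before, during, and after the transformation. Names and numbers are invented.
baseline = {"regression_cycle_days": 10.0, "automation_pass_rate": 0.82, "escaped_defects_per_release": 7}
midpoint = {"regression_cycle_days": 14.0, "automation_pass_rate": 0.75, "escaped_defects_per_release": 9}
final    = {"regression_cycle_days":  6.0, "automation_pass_rate": 0.88, "escaped_defects_per_release": 5}

def compare(label, snapshot, reference):
    """Print each metric's change against the baseline, without judging it."""
    print(label)
    for metric, value in snapshot.items():
        delta = value - reference[metric]
        print(f"  {metric}: {value} ({delta:+.2f} vs. baseline)")

compare("During the transformation (a dip is expected):", midpoint, baseline)
compare("After the transformation:", final, baseline)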

For project management, phases were suggested to keep the steps small enough to manage. There should be frequent check-ins to discuss how things are going, with different groups of people. Milestones should occur regularly, so that efforts remain bite-sized. This also helps build in flexibility by providing opportunities for assessment and adjustment.

It is also important to celebrate reaching these milestones. Celebration of successes should not wait until the entire project is complete, but occur throughout to help build positive momentum and maintain visibility.

Even with frequent phase events, patience will be necessary, both with schedule and results. Significant changes are being implemented, and everyone involved has to give them a chance to succeed. A couple of knowing comments came up here: “Projects unravel in unexpected ways.” and “Don’t panic because things are going perfectly.”

This patience has to include some tolerance for failure, and patience with understanding and correcting failure. There will be some risk for people to speak up and identify failures and their causes; transformation is frequently a top-down exercise. There is likely to be some disagreement about whether something has failed, or about the reasons for failure when the failure is acknowledged. When the organization is stuck, those responsible for planning are likely to blame the execution.

Failures must be engaged with thoughtfully, so that the organization can learn from them and not repeat them. Managing failure is important for the safety of the organization and everyone involved. There are first-order costs on others such as stakeholders, but the testers are going to be frustrated and potentially frightened by failures. The organization has to know when to re-evaluate the situation, when to put parts of the plan on hold, when problems are fixable or not, and – perhaps most importantly – to understand and expect that these will be tactical decisions made during the project, when more information is known.

Discussion about completing the transformation (or declaring it done) came up again and again. As one participant put it, “Sooner or later, you have to land the plane”. People need to understand that the new reality is here, and that they can exhale and focus on doing their new jobs well.

That *Best* Time on Twitter…

Here’s a story. It even has a message. What it doesn’t have is a villain – though he has a FANTASTIC name for one. I am definitely not a hero, but I’ll settle for thinker. Let’s start at the beginning.

Before I found Context-Driven Testing in 2004, nothing I had heard or read about testing was applicable or helpful to the work I was doing. The guidance I could find was highly prescriptive, and described testing as an exercise in keeping records of how you had done what you were supposed to do, which had been decided by someone smarter at some previous time. I knew it was wrong – and had done my part to challenge and defeat an ISO 9001 effort at my company – but I had a hard time explaining why.

Then…(take cover! incoming name drops!)…Ross Collard brought me to WOPR3, where I met Rob Sabourin, Bret Pettichord, Antony Marcano, Paul Holland, Roland Stens, Mike Kelly, Karen Johnson, Cem Kaner, Julian Harty, Richard Leeke, Dawn Haynes, and others. There, I heard a reference to schools, and found my way to an earlier draft of Bret Pettichord’s formulation of the schools of testing.

I had found thinking about testing that was relevant to me, and that made sense. It changed me forever. Judgement and skill were more important than citing process documentation or checked-off spreadsheets of test cases; providing information was more important than gate-keeping; adapting practice and process to the situation was more appropriate than imposing a “Best Practice”. Ever since, I have been happy and proud to be a context-driven tester.

Yes, I helped myself to a nice tall glass of Kool-Aid. It tasted pretty good, too. So I had seconds. Mmm, Kool-Aid. I’ll talk more about this in a future post.

The value in discussing testing “schools” today is to understand different approaches to testing – or quality – or quality assurance – or whatever it is called at the place they pay you money to get you to show up and interact with complex systems to mitigate risk. I do not see others who work in testing with different approaches as adversaries, though I do disagree with them, sometimes vigorously. Still, I rep my school aggressively enough that I use the term “Modern Testing” sometimes. More on this another time, as well. Enough exposition, let’s get to the story.

In April, I had seen some Twitter sniping at Rex Black (@RBCS). Rex is a successful testing consultant, speaker, and trainer. He has been around a long time – and he is credited as a contributor to Bret’s schools presentation.

The last few months, Rex has taken some tough shots from the CDT community on certification, metrics, and other issues. At that point, Keith Klain (@keithklain) had been challenging him on Twitter about the reliability coefficient of the ISTQB exam. If you don’t know about that, Keith launched a petition about it here. I don’t have a strong or informed opinion on the matter – but the only test certification I have is BBST Foundations.

So, at STPCon in April, I found myself sitting at a table with Rex before his keynote/panel discussion. I made some joke about him trolling the context-driven community, he suggested it might be the other way around, and it was all very friendly. Then he went on stage for a panel discussion, and that was that.

A week later, I was at the stage of waking where I was not out of bed yet, but I was reading Twitter. I saw this tweet:

‏@RBCS Question for “context-driven”/RSTs: Why the animus toward the common business mgmt phrase “best practice”? Even twitter has best practices.

I thought about it for a minute, and started tapping. 30 minutes later, I had replied with 9 tweets. 10 might have been more psychologically satisfying, but 9 turned out to be the right number.

1. Context matters. You wouldn’t test a free mobile game the same way you would test a medical record system, for example.
2. The idea of Best Practices attempts to substitute process for skill, which leads to crap results.
3. Substituting process for skill is dehumanizing. Do you tell an artisan he’s wrong for not making beer the way Budweiser does?
4. Best Practices are a B-School fairy tale executives think they can use to manage work they don’t understand.
5. Best Practices stifle innovation and improvement by tying processes to something that seemed to work once, and got good PR
6. Best Practices use command-and-control to deliver the efficiency of government and the consistency of fast food.
7. Best Practices don’t evolve well. How do you identify Better Practices when they come along from somewhere that innovates?
8. The term is problematic. “Best” implies there is nothing better – and imposes framing that is not explained.
9. Who decides what “Best” is? Someone sees a presentation, it sounds good? Why won’t everyone say theirs are best? When do we vote?

Rex and I went back and forth on whether I was constructing straw men (somewhat, it must be admitted), and whether oatmeal was a breakfast best practice. Others piped up; there was a lot of talking past each other, and it doesn’t seem like anyone’s mind got changed.

fascinated_cat

OK, if you insist. Some more things I am glad I said:

DDT was a best practice once. Waterfall was the only way to develop software once. Is Agile or Scrum best practice today? Who decided?
The tweet you replied to pointed out a problem in selecting the best practice: if we each do it differently, which is “best”?

On the claim that Best Practices are “the consensus of the majority of experienced professionals”:

How was that consensus arrived at? When do we review and update? Anyone can point to a best practice and say a lot of people agree

Where is this consensus published so that we can know we are talking about the same consensus? What if you have last year’s version?

And, since it is my blog post, I’ll repeat my favorite exchange:

‏@RBCS Example: it is a best practice to study defects found in sw dev in order to learn how to write better sw and how to test better.
‏@ericproegler @RBCS …In almost every case, reviewing bugs is valuable. But there are times it isn’t. Can I trust you to know when?

End result? Nothing much. I’m not disciplined or thoughtful enough to have leveraged it into anything beyond a self-aggrandizing (and now recursive) blog post several weeks later. I don’t tweet or blog much, but I did appreciate the feedback and RTs.

I’ve learned that Rex isn’t bothered by the use of the word “Best” the way I and others in my community are.  He has suggested that we are focused way too much on the meanings of words, and others have responded. Six weeks later, he’s getting the same kind of flak, and is saying things like “Best practices must allow adaptation.”

Rex is definitely a good sport, and I respect him for hanging in there while maintaining a couple of other debates. He’s still taking shots from people in my community on this and other subjects. Most seem to be addressing genuine items of disagreement with him, though some of the outrage could be dialed back a bit. I certainly reserve the right to take more shots myself – though some days on Twitter look like this:

Blog you later, test debaters.