On Calling People Out

Hello again! It’s been way too long. I’ve been very busy at The AST, which has been taking every spare moment I have (and some I don’t). I’m revving up for WOPR25, and have been hard at work at my day job.

My blogging career has suffered greatly. So, what’s brought me back (to fighting with Twitter embed code)? Events at a Testing Conference this week.

TL;DR

I wasn’t there. What I know comes from Twitter, supplemented by accounts from people who were there. I don’t claim to have full knowledge of what occurred, what was said, or the history.

I still know enough. I don’t need to know any more than I do to state unequivocally that this is not OK. I am writing here to state that loudly and clearly.

I am absolutely writing about James Bach, as clearly as I can. I admire his work and his accomplishments. I feel differently about his conduct as a leader in our community, and I am calling him out for that.

This problem is bigger than Tuesday and Maaret. Other women have had similar experiences. It’s time for our community to decide what we will tolerate and what we won’t. Will we look out for each other? Or continue to look the other way?

The world is awakening, or at least being woke. We must do better.

Principles and Framing

There are a few things that need unpacking before we continue.

First, I am a member of the AST Board of Directors. I am not writing in this capacity today – I am speaking only for myself, as someone involved in the Context-Driven community. I will still carry these opinions as part of a group that makes decisions about conducting the AST’s business.

Second, I wasn’t present. My information is second-hand, except for the majority of it that occurred on Twitter where everyone can see. I think I’ll still be fine, given what I want to talk about.

Third, I know Maaret a little. I greatly respect her, I like her and her husband, and am pleased to share the Introvert’s Nod when we see each other. I have not spoken to her about this. I won’t speak for her, just myself.

Fourth, I know James a little better. I admire and respect his work, his intellect, and his accomplishments. He is truly a giant in our field, and has done a great deal for the practice and profession of testing. I also know he can be an asshole sometimes, and gets plenty of practice.

Despite seeing this over the years, and despite sympathizing with people he’s crushed, I’ve hired him for workshops. I’ve even publicly defended him as worth it on the whole, within the last 60 days. I have made business decisions, and I have some regrets. I write today about what I am thinking today. I want to believe he is a good person, and people that I respect insist that he is.

Fifth, this is not the first time that James has been publicly confrontational towards Maaret. I won’t rehash here, but suffice it to say, James has pressed her in similar ways before. While the behavior and results are consistent, the existence of this history casts Tuesday’s events in a different light.

Sixth, I love debate. Statements that can’t withstand scrutiny or debate aren’t worthy of acceptance. It is good to deconstruct, fact check, turn over, pressure test, and evaluate statements, in proportion to the utility they claim to possess. Sometimes, statements are casual, and not worth deep examination. Sometimes they are offered very seriously, and should be vetted thoroughly. Like most testers, I enjoy this process.

Finally and most importantly, I refuse to engage with this as some referee who needs to “see both sides”. Very few things actually work that way – some fake balance or false equivalence doesn’t bring us closer to truth. It more often props up something that’s just wrong. There is global warming, and humans have significantly contributed to it. You should vaccinate your kids. The toilet paper should go over the top, not down the back. Hillary won the debate. Budweiser sucks. James was wrong to do what he did.

What Happened?

The story as I have it goes something like this: Maaret gives a keynote address at the conference. A few minutes into her keynote, James interrupts her to correct some point of phrasing.

At a minimum, this is rude in any circumstances. When the full context is considered – the inescapable, somewhat subjective (but only somewhat) combination of who said it, to whom, why, where, and how – it looks worse. An older man took it on himself to talk over a woman delivering a keynote address. This is a woman he has previously felt he needed to publicly correct, too. Much of the audience is nodding along right now, particularly women who’ve been treated this way.

I have no information about what the particular subject was, how that exchange went, or much else in the way of detail, other than that at the end of Maaret’s talk there was an opportunity for questions, and James asked none.

What I do know is that at the start of *his* talk, with Maaret in the room, James started with The Slide. Apparently, Maaret objected in real time, and started responding. Which seems to be what James wanted.

As I’ve heard the story, after 10 or 15 minutes, the other people in the room wanted to hear the advertised talk, and asked for that. I hear that eventually occurred, and the rest of the incident happened on Twitter, where it is easily followed.

There are a number of problems here. Let’s take a look:

Naming Maaret: Did Maaret consent to being part of James’ talk? Not as far as I can tell. Why would you start your talk by calling out a specific person? James seems to feel not only that he is justified in doing so, but he is proud of it.

I disagree. Making a discussion of our craft about specific individuals does not make the world better. It means we ask people to choose sides. We make it personal, in every sense.

James could have made whatever reactionary points from an earlier talk he wanted to make without making it personal. He chose to make it personal. This is the single worst thing he did, from my perspective. Personally, I’d have gone with the talk I prepared. Doing otherwise seems a little cavalier with other people’s time and attention.

Coming out against “nice”: A favorite inside joke between my wife and me is that the world often mistakes politeness for niceness, implying that we are thinking and judging others worse than they know. So let’s call it polite. Grownups are generally polite to each other, and that’s a good thing. Being polite and pleasant to each other makes the world a better place. Not being polite hurts people needlessly. If you disagree with someone, you can still do so respectfully. That’s not what happened here.

By contrasting niceness with authenticity, James strongly implies that people that are nice are inauthentic.

Treating people with respect is the minimum requirement for being a decent person. The United States has a vein of people right now who are frustrated with “political correctness”, which they would describe as inauthentic and performative (if they knew those words). I recommend substituting “treating people with respect” every time you hear that phrase – it is enlightening to think about “treating people with respect is destroying our country” or whatever.

Compassion is a great place to be, though. The world is awfully short on it. I know that every person I meet is struggling with things they can’t control but still suffer from. When I am frustrated with people who don’t get something, are boring me, are wasting energy and time – I am not proud of myself. I then shudder to think of how people judge me. That’s why you should be polite…or, “nice”. Because why make this world any worse, if you can help it?

Also, in case you missed it – since the slide is about Maaret, isn’t James essentially calling her inauthentic?

What about anyone else in that room who wants to be nice – or polite? Isn’t that also saying they are inauthentic?

Suggests Debate is Critical: I pretty much agree with this. Who wouldn’t? We need our bullshit detectors turned up high, particularly in testing. If we see or hear something we disagree with, we need to be able to process that…productively.

What I disagree with is the contention that James gets to unilaterally set the rules of engagement, which seem to be that he can challenge anyone at any time about anything he wants. There was a question and answer period after Maaret’s keynote, where James could have asked whatever challenging question he wanted. I hear he did not avail himself of that opportunity. Outside of that window, he needs her consent to have a debate. It’s only a “debate” if both parties want to participate.

If you write that you disagree with someone’s public statements, then no worries, as long as you are reasonably polite about it. If you physically direct your criticism at an unwilling participant, it’s just harassment. Twitter is in person, for some definition of in person, because of @’s. It’s not as confrontational as speaking directly to someone, but it still makes your criticisms known to their subject, usually in real time, and challenges them to defend themselves.

If a person doesn’t want to debate with you any more, that could be because they feel overwhelmed by your superior logic. Or, it could be because they feel they’ve explained their point of view already, and you’re not getting it. Or, they could feel overwhelmed by being criticized by someone they admire. Or, they may have something else they need to do right then and don’t have time to discuss it.

Or, they’ve seen this movie before and know how it ends. Not all “debate” or criticism is honorable in intent. In fact, much of it isn’t, and says much more about the needs of the person doing the criticizing than anything else. If you’ve been close to a person who is emotionally abusive, you’ve experienced criticism for the sake of criticism, designed to hurt, crafted to cut deep, deployed to destroy confidence. It is very difficult to hear criticism and process it at face value, even if you haven’t been abused (many people have been abused).

Even if we grant the positive intention and somehow make it so that everyone can easily assume good intentions, we’re not going to get there with just arguing. We’ll just figure out who is good at arguing in public. If you have to resort to bullying someone into debate as combat sport, then it isn’t about ideas anymore.

Others defending James cited academia. Of course, in academia, people are allowed to assemble their arguments calmly, taking their time to reflect deeply so that they can best represent themselves. A process of real-time showdown does not lend itself to deep reflection.

And sometimes, people are just tired of arguing with YOU.

Deciding that you are entitled to criticize someone in person, whenever you want, however you want, is bullshit. That is demanding that people absorb you being an asshole whenever you think you’re entitled – which is exactly the right word here.

And again, by “defining (his) position vs. Maaret’s”, he is implying she is the opposite. What, she is against examining statements and discussing them? Or is it that she doesn’t grant James his terms of debate – verbal combat any time, anywhere he feels like it?

Double Negative Something Something: Again, the formulation here seems to imply that since James is contrasting himself with Maaret, she must believe excellence is achievable without focusing attention and energy. Not James’ best work.

In Summary

The way James behaved Tuesday was poor enough that I feel some responsibility to my current and future colleagues in testing to publicly say something about it. Other people tried to tell James he had gone too far. Instead, we got belligerence and admonitions that we simply didn’t understand his points. Let’s bring this home with a “public debate”:

Trying to deal with the problem, though I have a different opinion about what exactly the problem is. I am speaking up here, and saying that this was too much and too far.

 

Hopefully, it is clear that it was some specific things, and some of it was context. It got personal, and it seems to have got personal enough that the arguments got flimsy. Just like you assume we’ll trust your good intentions, assume we really mean what we say.

Any uninvited, in-person debate, yes. That’s exactly right. The other words you might use here are harassment or abuse.

While some of my criticism here is based on the context of an older man speaking at a younger woman, rest assured that James has behaved in a similar fashion to people of different sexes, ages, and ethnicities. When I note the sexist subtext, it’s because when we tolerate this level of disrespect in public discourse, we signal that our community is not a safe place. I’m still not convinced James would be equally likely to interrupt and confront a man.

We understood. You were clear, in what you said, and in your choices around who/what/where/how. The why? We can give you credit for just wanting truth out. But when someone is emotional, it’s hard to take their word that there aren’t other motivations, whether they know them or not.

James, you’re being called out. Your behavior is not acceptable, and it’s time for people to tell you to stop. 

You’ve been a pillar of our community, with roots that go back farther than most of us have been in testing. Our work and our craft are immensely better because of the work you’ve done. But imagine how much farther we’d be without the handicap of our brightest light also “leading” us into infighting and extremism.

You greatly influence the intellectual climate of the community, and therefore have a lot to do with its size and current condition. You’ve built a history of treating people poorly. Much of what has happened is perceived as related to the age and gender of the people you have “debated” on your own terms, regardless of what they were comfortable with. I think there is a gender issue here, too. In any case, one-sided debates are just abuse.

People fear you, and not just because of imposter syndrome. Whether you realize it or not, some good people have been lost to our community because of your behavior. Some were chased away directly by you, and others quietly left after seeing what happens if anyone gets too far ahead of themselves around you.

Sure, you’ve embarrassed some people who “deserved it”. But you’ve also shut down plenty who have not. I won’t publicly list the people who won’t engage with you any more, but I think you can think of a few pretty smart people – mostly women – without trying too hard.

Worse, there are acolytes who model your behavior, and feel entitled to treat people the same way you do. When people feel they have the right to challenge anything anyone ever says in public at any time, they’re getting that from you. I am far more ready to point to the gender implications in this source of abuse.

And it is abuse – it should be clear that no one owes you a debate duel when and wherever you demand it. Our community creates appropriate environments to encourage the easy flow of debate/dialog/talking about stuff. Outside of these, you are back to the expectations of society at large – basic human decency and politeness. Or if that seems inauthentic: not being an asshole, particularly to young women. 

Old white guys all over the place are finding out that behavior people used to tolerate isn’t tolerated any more. It was always wrong, but it’s not being allowed to pass by any more. Time to evolve.

WOPR24 Experience Report

This blog post relays the findings of a peer workshop on performance and reliability testing and monitoring. I will not provide the usual TL;DR summary this time – if you want this gold, you will have to sift it yourself.

On 22-24 Oct, I attended and facilitated WOPR24, a LAWST-inspired workshop for performance engineers. I have been involved in organizing WOPRs for about 6 years now, and have attended 18(!) WOPRs over the last 11 years.

The attendees of WOPR24 were Ajay Davuluri, Oliver Erlewein, James Davis, Andy Hohenner, Yury Makedonov, Eric Proegler, Mais Tawfik Ashkar, Andreas Grabner, Doug Hoffman, John Meza, Michael Pearl, Ben Simo, and Alon Girmonsky. We were hosted by BlazeMeter in their Mountain View, California office, which was a fantastic venue. Adi BenNun of BlazeMeter took great care of us, making us very comfortable, feeding us very well, and making sure we had everything we could want.

WOPR24Group

I could talk for a long while about WOPR, WOPR’s history, our mission of advancing the practice and community building, how WOPRs are put together, and how Ross Collard changed the trajectory of my career and life by inviting me to WOPR3. At some future date, I will. But for now, so that we can talk about the great content of WOPR24, I will just drop a link about WOPR and let you surf that site.

The Workshop Theme

Each WOPR has a Theme, to help focus the experience reports and lay out what we want to explore. From http://www.performance-workshop.org/wopr24/:

Production is where performance matters most, as it directly impacts our end users and ultimately decides whether our software will be successful or not. Efforts to create test conditions and environments exactly like Production will always fall short; nothing compares to production!

Modern Application Performance Management (APM) solutions are capturing every transaction, all the time. Detailed monitoring has become a standard operations practice – but is it making an impact in the product development cycle? How can we find actionable information with these tools, and communicate our findings to development and testing? How might they improve our testing?

Content – Experience Reports

The primary format of WOPR is Experience Reports (ERs), which are narratives supported by charts, graphs, results, and other relevant data. Each ER is followed by facilitated discussion triggered by the ER. 7 attendees presented ERs over 3 days. Systems we discussed included the following:

  1. An online advertising auction system to algorithmically price and purchase ads for specific user profiles in < 40ms. Presenter used Splunk to model and characterize log events, web hits, application instrumentation, business metrics, and other content.
  2. An automotive parts sourcing SaaS application. Presenter discussed supplementing regular human-conducted load tests with CI automated tests. Lively discussion about thresholds and test environment control/resetting ensued.
  3. A mapping application company’s efforts to test with virtualized high-end video cards (https://www.cdw.com/shop/products/NVIDIA-GRID-K2-graphics-card-2-GPUs-GRID-K2-8-GB/3126398.aspx).
  4. An application that collects, rolls up to big data sets, and displays back to the system’s owner detailed operational metrics from a very large number of embedded systems, distributed around the world.
  5. An order-taking website for a large retailer. Discussion was about launch of a site, and the difficulty of enrolling reluctant development stakeholders in performance testing projects.
  6. A non-profit transportation industry service that publishes rate tables multiple times per day to and from all the vendors in that sector. Discussion of concurrency bugs and how they were reproduced.
  7. A large financial services SaaS provider shared some of the issues around testing mobile in-app video and chat. Generating traffic and evaluating results were some of the technical issues we discussed. 

Content – Exercises

Between ERs, we conducted several guided discussions to explore specific areas of interest. Findings from those are relayed here:

What should we alert on?

In this exercise, we just started calling out thresholds and notes. We definitely didn’t finish.

Monitor What | Threshold | Notes
CPU | > 75% Web/App, > 50-60% DB | User + System CPU: For Vus with 4 or less and hyperthreaded, Warning 50 alarm 75, critical 95. With > 4/metal with HT disabled warning 75 critical 99. Context-dependent. Physical and virtual CPUs. Per core and overall. 1 minute monitor – number of sequential observations to trigger (two at which level? 3 alarm? 1 critical)? One minute? 5 Minute?
Order Rate | Dynamic Baseline (Time/Day/Etc) | Oliver: Tues is Max, Friday is Min. Revenue, order rates, conversion rate, bounce rate. Business spend: Advertising money out
Failure Rate | http error rate against yesterday | Need to have a baseline. Beware of bots, synthetic measures, etc
Queue Length: Threads | Web/App, detectable at App? | Should have threads available, 2 threads/core, beware of thread contention/concurrency events. Alert on contention/concurrency?
Connection Pool Utilization | JMX: DB Connections, Outgoing Requests | DB Connections, external web service calls, depends on app server. Is app waiting for a connection?
TCP Inbound Queue | Anything increasing in subsequent samples |
Message Q | Messages creates/sec, size (no growth), message age, expiring messages present? |
CPU Q | Load Average/CPU Queue Length > 2 per core | Load Average: 4x number of cores
MIPS/Second | 3500/second, threshold 2900 |
Response Time | Several times higher than SLA | TCP/Response Time can indicate app server overload. Distribution and median/thresholds vs baseline
Errors (Particular) | Class/Types, Rates, and severity | Error clusters
CPU Ready Time | Any significant percentage |
GC Time | Percentage of time – what’s impact? Over 10% | Full collection? Partial? Age of objects leaving generation
Averaging http status | Compare to baseline average literal numbers | By request (type), or by page. Watch for dramatic changes. Watch for bots > 50%. Big difference, prod vs. test. Remove synthetics in test. Load balancer, keep-alive, etc
Throughput | Cloud $ rate, watch for bottleneck | Network throughput + context switch over CPU = ratio for sanity check, first bottleneck
GPU utilization in virtualization | 75-85% utilization = efficient virtualization |
Redirects | Count/rate. Redirects on keep-alives | Endless loops? Redirects per user, check for max
SAN I/O count | |
I/O latency | Write latency | Queue lengths let me see problem faster
Thread context switches | |
Wait time (DB) | Latches, locks, just look for increases | Also wait states
Log Volume (and by type) | |
Cache hit ratio | | Page life expectancy
DB Connections | |
Virtual active mem vs real memory | | Check for Disk swapping
Physical memory free | 66% | Managed memory. Leave small amount free
Network I/O | Based on network link rating |
Network errors/packet discards | |
Objects generated | |
Induced GCs | |
Live Sessions | |
Disk utilization – space and I/Os | |
Average SQL Statements/Request | Data driven pattern detection problem | n+1 query results
Disk space rate consumption | |
Starting/stopping monitoring agents | |
Connection Pool: Available | |
Monitoring Boxes | |
# restarts | |
Rate of change in connection pool count | |
Recycling of worker process/app pool | |
Log messages by severity | |
Thread status – deadlocks? | |
Business transaction rates – whatever those are | |
Revenue/sec | |
Availability | |
Cloud node thrashing | | Spin up spin down
Functional audit log/application log | | Specific events
Transaction span lifecycles | |
Scale of code change on updates | | This jar file is x times previous version
Page size (html) | Delta vs previous |
User/system ratio | |
Web Server Request Queue: IIS | |
Batches/sec | 200 |
%c 1/2/3 | | VMs in power-save mode? Detected measuring against baseline
Response time total for all requests (SUM) | | Saw things that were not obvious. Correlate against load
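
The CPU entry above leaves open how many sequential observations should trigger each level. Here is a minimal Python sketch of that idea, not something the group agreed on. It assumes the 50/75/95 warning/alarm/critical levels from the hyperthreaded case, and trigger counts of 2, 3, and 1 samples taken from the open questions in the notes. The names `LEVELS` and `evaluate` are hypothetical.

```python
# Hypothetical levels: (name, CPU % threshold, consecutive samples required).
# Thresholds follow the 50/75/95 note; the sample counts are assumptions.
LEVELS = [
    ("critical", 95.0, 1),
    ("alarm", 75.0, 3),
    ("warning", 50.0, 2),
]


def evaluate(samples):
    """Return the most severe level triggered by the latest samples, or None.

    `samples` is a list of CPU utilization percentages, oldest first,
    one per monitoring interval (for example, one per minute).
    """
    for name, threshold, needed in LEVELS:
        recent = samples[-needed:]
        # A level fires only when the last `needed` samples all reach it.
        if len(recent) == needed and all(s >= threshold for s in recent):
            return name
    return None


if __name__ == "__main__":
    print(evaluate([40, 55, 60]))      # -> warning
    print(evaluate([80, 60, 80, 80]))  # -> warning (only two of the last three reach 75)
    print(evaluate([30, 30, 96]))      # -> critical
```

Requiring several samples in a row trades detection speed for fewer false alarms from short spikes. The right counts are context-dependent, like the thresholds themselves.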

What Would an Ideal Dashboard Look Like?

For this activity, we broke up into groups to talk about different contexts. There are some metacomments in each section, reflecting when they came out.

SaaS Company Dashboard

SaaSCompanyDash

This dashboard has horizontal bars to show different metrics for different consumers. The audience is the whole company. The Checkbox/Xs on the left are to provide a two state indicator to easily, rapidly, and broadly detect when something is wrong, by functional area. Here is what each had:

  • CEO – For each metric, the current time period against a previous time period. Demos, Signups, subscriptions, Retain/Drop Rates, and some support metric such as call count
  • CFO – Cost per lead, cost per user, and a sparkline for Revenue
  • CMO – Social media engagement, Mailing opens/click rates, ad conversion
  • CIO – Availability, error rate, Response time, throughput
  • CTO – Deployments, key resource metrics

Each one of these bars is designed for drill-down. Remember, this is the top level.

E-Commerce

For E-Commerce, they decided not to draw a single dashboard. They defined four different functional areas they care about, and drew a Venn diagram to show that there are overlaps of interesting information, but very little that everyone wanted, needed, or would benefit from knowing.

Consider this continuum, generic to specific, from “easy for everyone to understand” to “doesn’t have to be understood except by the individual who uses the dashboard”. You could also think of these as spokes from the center outwards.

Company -> Department -> Team -> Individual

The questions, as they posed them:

  • Who is looking at the dashboard?
  • What is their role?
  • What do they want to see?
  • What actions would they take if they did see something?

The four areas they defined, and the notes on each:

  1. Performance Testing
  2. Business – Adoption Rate, Active/passive campaigns, usage by feature, dollars per aggregate
  3. Development – Separate dashboards for each scrum team
  4. OPS – Active support ticket counts

BigCorp’s Dashboard

BigCorpDash

Here is a dashboard imagined for a very large company. It is intentionally light on detail to avoid having visitors/contractors get information they shouldn’t have. Some of the resulting discussion was about dysfunctions, but these are real safety issues in some organizations.

Consider the following characteristics in deciding what to have in the dashboard:

  • Tooling – what can you show
  • Time – both for measuring effort, and for displaying data in context
  • Security – Don’t show something you shouldn’t – review content with Security officer?
  • Liability – Knowing specific things – or being responsible for exposing specific things – might put a person in a position they don’t want to be
  • Cost – Screens, software, etc.
  • Political dimensions of having potentially out-of-context data be embarrassing to specific executives or individuals. Is this public shaming? Are people stuck with metrics that will never go green? Is that deliberate? What if we move the goalposts so that someone can see green – are they still worth tracking? Should we turn off our sales dashboard when we are clearly going to miss goals, so that we don’t depress morale?
  • NO STOCK PRICES. Or other metrics that are not actually connected to current, actionable data.

Some specifics about this dashboard:

The “Christmas Tree” or dragstrip pattern is for a number of areas to indicate green/yellow/red. This is designed only to communicate whether something interesting is happening.

Other visualization methods – use vertical meters, or speedometers.

The other primary visual is a graph comparing a current time period against a previous, such as this day’s revenue against last year’s/quarter’s/week’s. The interesting suggestion here was to change the background color as an indicator.
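
The current-versus-previous comparison, with a background color as the indicator, could be sketched roughly like this in Python. The function name `compare_periods` and the 5% and 15% drop thresholds are assumptions for illustration; the workshop did not specify any values.

```python
def compare_periods(current, previous, yellow_drop=0.05, red_drop=0.15):
    """Compare two totals and return (fractional change, background color).

    `yellow_drop` and `red_drop` are hypothetical thresholds: the fraction
    by which the current period may trail the previous one before the
    background turns yellow or red.
    """
    if previous <= 0:
        raise ValueError("previous period must be positive to compute a change")
    change = (current - previous) / previous
    if change <= -red_drop:
        color = "red"
    elif change <= -yellow_drop:
        color = "yellow"
    else:
        color = "green"
    return change, color


if __name__ == "__main__":
    # Today's revenue so far against the same window last week.
    change, color = compare_periods(current=9000.0, previous=10000.0)
    print(f"{change:+.1%} vs previous period, background {color}")
```

Whatever thresholds are used, the comparison window should match (same hours, same day of week), or the color will mostly reflect normal daily and weekly cycles.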

Final Metacomments

Focus on the information needs: What is actionable? What is relevant? What is helpful?

Go to people consuming data – what do they need? Answer that question, instead of starting with what is easy to provide.

Revenue metrics are key

Iterate!

What metrics do we talk about? What metrics do we present? Those metrics already have a life, so we should try to reuse them.

Test for information completeness – establish that the dashboards are truthful and accurate. Make sure the team knows what the data means and where it comes from.

Dashboards provide transparency into current states – but require interpretation. They are a starting place for discussions across teams.

Leave space for temporary/rotating items

Bonus Screenshot of an Ops Dashboard Someone Built

Unclear what the news presenter sees on the dashboard that concerns her ;-p

NewsPresenter

What Would We Want to Monitor for Mobile Users?

In this exercise, we listed the attributes that would be interesting to capture for mobile monitoring.

On the device:

  1. Network conditions: latency, bandwidth, jitter, packet loss
  2. Geographic location
  3. Screen size/device
  4. OS version
  5. Battery State
  6. Memory – free/available
  7. Number (list?) of apps running
  8. Powersave mode
  9. Chaperoned?
  10. Carrier
  11. Is the screen cracked or broken?
  12. Signal strength (wireless/cellular)
  13. Accessories
  14. Recent movement/direction/accelerometer data
  15. Contention? Drops/retransmits
  16. Carrier Plan status?
  17. Time: last restart/power cycle
  18. Rooted/Jailbroken?
  19. Date in Service
  20. Accessibility Options
  21. Geographically similar to other devices? People travelling together in a train, for example

In the app:

  1. Screen load RT
  2. Memory footprint
  3. Data transfer
  4. CPU cycles
  5. Gestures captured
  6. Round trips/waterfall
  7. Ad activity
  8. Freemium status?
  9. Known customer
  10. Time of day
  11. Demographics
  12. Date of Install
  13. Version
  14. Date last used
  15. Other app requests
  16. Phone connection right now?
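
As a rough illustration of how a few of these attributes might travel together in one monitoring record, here is a Python sketch. Every class and field name is hypothetical, only a subset of the two lists is shown, and the fields are optional on the assumption that not every attribute can be collected on every device.

```python
from dataclasses import dataclass, field, asdict
from typing import Optional


@dataclass
class DeviceSnapshot:
    """A subset of the device-side attributes; names and units are assumptions."""
    latency_ms: Optional[float] = None
    bandwidth_kbps: Optional[float] = None
    packet_loss_pct: Optional[float] = None
    os_version: Optional[str] = None
    battery_pct: Optional[float] = None
    free_memory_mb: Optional[float] = None
    power_save: Optional[bool] = None
    carrier: Optional[str] = None


@dataclass
class AppSample:
    """App-side measurements paired with the device state at capture time."""
    app_version: str
    screen: str
    screen_load_ms: float
    memory_footprint_mb: Optional[float] = None
    device: DeviceSnapshot = field(default_factory=DeviceSnapshot)


if __name__ == "__main__":
    sample = AppSample(
        app_version="1.0",
        screen="checkout",
        screen_load_ms=850.0,
        device=DeviceSnapshot(carrier="ExampleTel", power_save=True),
    )
    # asdict() flattens the nested dataclasses into plain dicts for shipping.
    print(asdict(sample))
```

Keeping the device state next to each app measurement lets a slow screen load be read in context, for example on a device in power-save mode with a weak connection.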

Managing Test Organization Transformation

Recently, I attended the 9th Workshop on Software Testing in Financial Services (STiFS9) in New York, hosted by Liquidnet. The Theme of the workshop was “Organizational Structure Models for Test Groups at Financial Firms.” One of the discussions that struck a rich vein of ideas was what is needed for successful transitions between types of organizational models in quality organizations.

While the experiences that the participants drew on tended to be with larger organizations in financial services industries, these ideas could be useful for any testing organization’s transformation – or maybe even outside of testing. Specifically, we were talking about organizations with dozens or even hundreds of testers that were moving from a centralized to a decentralized testing organizational model – or from decentralized to centralized.

STiFS is a LAWST-inspired peer workshop, meaning that the value of the workshop is the product of all of the participants in the workshop, which included Bernie Berger, Kaveri Biswas, Margaret Boisvert, Ross Collard, Joe Lopez, Mike Pearl, Don Pierce, Anna Royzman, Ben Weber, and myself. We captured pages and pages of ideas from these very smart and experienced testers and test managers. Bernie Berger, the CO of STiFS9 (and the driving force behind STiFS) has provided a report on the workshop here.

This post is not an official finding of STiFS9. It doesn’t include all of the discussion or notes and only represents my interpretation and reflection on what I heard. Other participants may have a different take on what was discussed and what it meant. This post is what it meant to me.

A decentralized testing organization, as I mean it here, has testers reporting into specific projects as part of the staff. This approach might be marketed as “(More) Agile”, though it is certainly not necessary for agile, Agile, Scrum, or Certified Pokemon Master processes.

Arguing Over User Stories
What I Think of When I Hear Scrum

A centralized testing organization usually has a reporting structure that includes most or all testers in an organization in one department, which allocates testers for providing testing services to specific projects. Testers report upwards in the testing department and may work on several different projects, depending on the needs and desires of the whole organization. One marketing term associated with this approach is “Testing Center of Excellence.”

2,857 Career Story Points
A True Centre of Excellence

For the testing organization’s transformation project to start successfully, there needs to be top-down support, from the executive levels of the company. This support can be demonstrated and described by clear communication of the organization’s commitment to the transformation and the goals of the transformation. These goals may be to respond to changing market conditions, updated regulations, concerns about the quality and speed of delivery in the organization, or some combination of these and other factors. They should be articulated as the mission that the transformation is designed to accomplish with clear objectives and sufficient justification.

This communication of goals is usually understood as an essential step for aligning the organization, but less obvious is the need for selling the transition to the organization without jargon or “business speak.” This transparency builds trust with employees, who need to feel safe during transitional periods, need to feel included in the process and planning, and need to be optimistic about their jobs and the organization’s future after the transformation. Many people are uncomfortable with and even resistant to change; laying the groundwork will help them process the changes to come. Buy-in enlists people in the organization to support the transformation and its goals, and helps retain key employees through periods of uncertainty. This was called “bottom-up trust.”

This leads to another core requirement: an organization’s leadership may draw up a new org chart (and the new org chart should be clear to everyone), but that doesn’t explain *how* the organization changes successfully, can’t describe all of the details and requirements, or fully define the new roles. In fact, almost all of these details will not be understood and defined at the start of an organizational transformation, making broad buy-in even more essential.

The real hard work of solving problems and helping define new structures is likely to be undertaken by “Champions” – key contributors throughout the company who are influential and respected by their co-workers. Many of these champions should *not* be managers, so that their motivation and contributions are clear. Some influence could come from outside consultants, but that will vary by situation and consultant.

The champions should be known to and perhaps recruited by the person that owns the transformation – the transformation owner. This person must be able to solve technical and political problems, make decisions that stick, and manage an ever-mushrooming list of details. The organization should expect that they will get clear, credible, and authoritative communication on progress from this person – and receive it regularly. A trusted source of information helps keep the organization aligned, and may help counter-program the rumor mill that uncertain people will be paying extra attention to.

The transformation owner should drive collaboration and keep people aligned. One way to support this collaboration is impact analysis on stakeholders, to identify problems, and to help create patience during periods of lower efficiency and missed deadlines that will surely occur as the organization is changing direction. This allows the stakeholders to plan and adjust accordingly, by making it explicit that expectations should be renegotiated.

Communication is only one requirement of the role, though. The owner must be able to advocate upwards and across the organization for accountability and buy-in from everyone to the transformation mission – and get it. Short-term revenue and budget pressures will contend with the work needed to create new processes.

The transformation owner can sell leadership on the need for patience and tolerance for initial failures, and run pilot projects to help make the transformation smoother for the rest of the organization. Teams should be able to experiment – to test the transformation – and provide feedback. The willingness to adjust the plan based on this feedback increases the chance of success.

One thing that was noted as frequently missing in transformation plans was engagement with and consideration of risks. There are lots of different kinds of risks; by engaging and enumerating risks, mitigations can be found. Not only should success criteria be understood – failure criteria may be helpful, too! Frequently, rollback is not an option, but contingency planning may be very helpful in getting through rough patches, particularly if the “worst-case” scenario has been understood.

As the details of the plan are being built, sufficient effort should be allocated to workflow analysis. How does work get done today? How will it be done in the future? Some things will fall between the cracks, and the people who would have noticed previously may be doing different things at that point.

An important part of the transformation plan is knowledge transfer. A concept of an “Organizational Level of Knowledge” was described; the cumulative knowledge of the organization is an important asset, and its value should be recognized, and hopefully preserved. Process documentation is easy to identify, but the unwritten rules are harder. Who can really get something done in a given area? How do you put an announcement out on the grapevine? These are not issues that can get resolved with an increased training budget; they took years to develop and be discovered.

Enter beer.

Wine is fine, liquor might indicate a problem...
The Secret Sauce of Transformation

Whether it is used for team building, to provide opportunities for horizontal communication, or as a coping mechanism, there were broad and strong recommendations for using beer to create social situations for team building, casual/informal communication, and reducing stress levels. This touched off a series of comments about the critical effects on employees going through and adjusting to transformation.

Perhaps the most critical issue is having well-defined new responsibilities. People need to know what is expected of them, and they need to trust that they have clear guidelines. This helps them feel secure, helps make reporting relationships clear, helps set expectations, and helps them adjust to their new position.

It was suggested that there should be additional focus given by HR (and overtime, if necessary) to understand and help mitigate impacts on specific people who are dealing with change. People working with new managers are more likely to take complaints or issues to HR instead of the manager; they need to feel that they are heard and their concerns are addressed. This also gives HR a chance to focus attention and effort on keeping key people such as star performers happy and productive.

One of the strong signifiers (or moment where it becomes real) is when people have to move their desks. Movers, phones, networking, etc. all have their own issues and schedules and will need to be coordinated. Offices and workspaces may need to be redesigned.

Sitting with the new team is important, but there are real impacts beyond a change of scenery. If the geographic location changes and people have to move, there should be some care in how to help them understand it. Even changes like an impact to the commute are going to hit harder when surrounded by uncertainty.

People will need training and coaching for new roles and processes. Agile was cited as a commonly experienced part of a transformation, but even if methodology isn’t changing, there are certainly other changes in workflows, and/or required domain/technical knowledge. Testers should feel supported and that there is patience and tolerance for them to skill up to their new roles.

Some of the technical issues around testing were collected by the group. Maintenance of automation and preservation of code and test assets were reported as often overlooked. There is a need to resolve the ownership of shared test code and data – or a decision to abandon it. Test labs and equipment will have to find new owners, and it should be clear how to schedule shared resources. Some cleanup should take place after the project completes.

More subtle impacts include ownership and responsibility for compliance concerns and go-forward updates/forwards/changes/retirements of email aliases, message groups, and distribution lists. Managing configuration changes may need significant rethinking, and permissioning was mentioned as a specific problem that had been encountered.

Even when testing is broadly distributed, some testing functions such as Automation or Performance Testing may still be best supported horizontally. This should be considered as part of the plan, including how to request support, and who will lead these functions.

In addition to all of this input about planning, the group had suggestions for conducting the transformation project. Once the transformation project is underway, continuing updates against frequently occurring milestones will help maintain cohesion and focus. Phases of the project are helpful for keeping things on track.

The plan and progress should be highly visible, and some advocated for a metrics regime as part of understanding where the organization was before the transformation, how the organization is managing during the transformation, and how successful the transformation project was in improving the organization’s testing efficiency. There was a lot of energy around engaging with the measurements that would be used by the organization to evaluate the project. Some of the proposed uses of metrics included the awareness of strengths and weaknesses, and current efficiencies and costs. Skepticism about metrics led to a discussion about the need to avoid inherited bias and future gaming of the system by using clearly documented and tangible metrics.

For project management, phases were suggested to keep the steps small enough to manage. There should be frequent check-ins to discuss how things are going, with different groups of people. Milestones should occur regularly, so that efforts remain bite-sized. This also helps build in flexibility by providing opportunities for assessment and adjustment.

It is also important to celebrate reaching these milestones. Celebration of successes should not wait until the entire project is complete, but occur throughout to help build positive momentum and maintain visibility.

Even with frequent phase events, patience will be necessary, both with schedule and results. Significant changes are being implemented, and everyone involved has to give them a chance to succeed. A couple of knowing comments came up here: “Projects unravel in unexpected ways.” and “Don’t panic because things are going perfectly.”

This patience has to include some tolerance for failure, and patience with understanding and correcting failure. There will be some risk for people to speak up and identify failures and their causes; transformation is frequently a top-down exercise. There is likely to be some disagreement about whether something has failed, or about the reasons for failure when the failure is acknowledged. When the organization is stuck, those responsible for planning are likely to blame the execution.

Failures must be engaged with thoughtfully, so that the organization can learn from them and not repeat them. Managing failure is important for the safety of the organization, and everyone involved. There are first order costs on others such as stakeholders, but the testers are going to be frustrated and potentially frightened by failures. The organization has to know when to re-evaluate the situation, when to put parts of the plan on hold, when problems are fixable or not, and – perhaps most importantly – to understand and expect that these will be tactical decisions made during the project, when more information is known.

Discussion about completing the transformation (or declaring it done) came up again and again. As one participant put it, “Sooner or later, you have to land the plane”. People need to understand that the new reality is here, and that they can exhale and focus on doing their new jobs well.