Categories
Project Management

The Donald Rumsfeld School of Agile Software Delivery

By Jim Grey (about)

I often quip that when you plan and manage a sprint, as a leader you have to channel your inner Donald Rumsfeld.

You remember ol’ Don, don’t you? He was twice the Secretary of Defense in the United States, first in the Gerald Ford administration and later in the George W. Bush administration.

Rumsfeld is famous for having said, “There are known knowns; there are things we know we know. We also know there are known unknowns; that is to say we know there are some things we do not know. But there are also unknown unknowns; the ones we don’t know we don’t know.”

He said this in answer to a reporter’s question about a lack of evidence linking Iraq’s government with supplying weapons of mass destruction to terrorists. It sounded so absurd at the time that the media had a field day ridiculing him.

But over the years, we’ve come to realize how much sense his statement makes. In anything you set out to do, you know what you know, you might know or be able to guess some things that you don’t know, and you certainly don’t know many things that you don’t know.

In agile software delivery — heck, in life:

  • You plan for the known knowns.
  • You make contingency plans for the known unknowns.
  • When unknown unknowns happen, all you can do is manage through them.

On a team I once managed, we wanted to build a feature to email end users the day after they abandoned a transaction on our site. We hoped to reengage them to finish their transaction.

We wrote and groomed the stories. Most of them had a clear implementation — plenty of known knowns. They were easy to estimate.

A few stories had known unknowns. One was, “It’s been a while since we’ve been in that repo, and it was written in our wild-west days. I might find some cruft in there that I’d need to straighten out to be able to finish this story.”

Another known unknown was, “I’d have to do a deep dive into that module to be sure, but if it does this certain thing in this certain way, I’d have to do considerable extra work to stitch in the code I need to write there.”

In each case we discussed whether that work, were it necessary, would add meaningful complexity to the story and inflate its estimate. Sometimes it was yes, sometimes it was no. When it was yes, sometimes we wrote a deep-dive story to find out, and other times we thought the impact would be small enough that we could just accept the risk.

As we planned sprints with those stories, we believed we had accounted for everything we could think of, and were confident in our sprint plans. But two things went wrong that we didn’t foresee — unknown unknowns bit us hard.

One story involved writing a job that would check overnight for abandoned transactions. The engineer who picked up that story encountered five or six serious unexpected difficulties with writing the job and with getting the job runner to pick it up properly. A story that we thought he would wrap up in two days spilled well into the next sprint.

Then it turned out that our email service was never built to handle the volume we threw at it, and it failed immediately in Production. One engineer devoted all of his time, and another devoted part of his, to solving the problem. A lot of our tester’s time went to troubleshooting with the engineers. The situation was surprisingly thorny. The engineers implemented several successive fixes over several days before finally getting past the problem.

Thanks to these unknown unknowns, we missed a couple sprint plans by a mile. Here’s how we managed through them: I worked with the team to prioritize stories and figure out which ones we were likely not to deliver. The engineers watched for stories ready to be tested, and helped our tester by picking some of them up themselves to keep the revised plan on track. Finally, I reset expectations with my management about what we could actually deliver, and how that would affect delivery of other work that depended on it.

You could argue that these should not have been unknown unknowns. Writing a job should have been cut and dried. We should have thought about volume in our planning, and tested for it.

Hindsight is always 20/20. Retrospectives are where you explore that hindsight and make improvements for the future. After you experience unknown unknowns, improvements generally take two forms:

They become known knowns, or maybe known unknowns. We learned a lot about the complexities of jobs in our world. We knew we needed to write a few more in upcoming sprints, so we broke them down into several smaller stories that we could test incrementally. Also, we now expected that there might be challenges in writing jobs that we hadn’t encountered yet, and we knew to lay in contingency plans for them.

You can lay in plans to neutralize them as unknowns in the future. We found out that other teams had been having trouble with the job runner. We met with the team that owned that code to figure out how to make the job runner more reliable. They put tech-debt stories in their backlog to handle it. We also researched how we might make the email service more robust and investigated third-party email services. We ended up going with a third-party service.

Categories
Quality, The Business of Software

If you want your software to keep producing, be prepared to do some dirty work

By Jim Grey (about)

A woman named Verna built the house I live in. She landscaped it nicely; a sprawling flowerbed stretches in front of my front door and picture windows. Every spring, I eagerly await Verna’s spring color: yellow daffodils, purple hyacinths, red tulips, and finally the giant pink peonies.

I’ve added a few things: lilies, mums, lavender, coreopsis, phlox. I love phlox! But my eagerness to keep adding color petered out pretty quickly because it turns out I hate digging in the dirt.

I don’t much enjoy any of the other routine garden maintenance, either. Mulching. Deadheading. Dividing overgrown plants. Weeding – oh god, the weeding. Does it make me lazy that I just spray my weeds with Roundup and move on?

I just want to enjoy the flowers. But this ninth spring I’ve lived in my home, a few of my plants didn’t come back as strong. A couple didn’t come back at all.

So I asked my mom. She’s the gardener in our family. “When was the last time you fertilized?” she said. “Um, never,” I said. “Ah,” she said.

It turns out that you can’t just ignore the soil, or the plants themselves for that matter. Things growing in it year after year use up all the nutrients, and crowded plants compete with each other for what little is there. “I’m surprised your flowers didn’t stop coming back a few years ago,” Mom said.

I did some serious fertilizing this season. I also separated some overgrown hostas and moved some of Verna’s plants so they had some elbow room. Not fun, but necessary.

♦ ♦ ♦

I once worked for a software company whose flagship product sold briskly. Version 1.0 was five years in the past, and since then we’d added lots of new features so the product could continue to lead the market. And now here came the head of Product Management asking for more new features.

Dan, a quiet fellow, graying at the temples, led Development. “Well, yes, we can add all those features,” Dan said, adjusting his glasses. “This one will take six months. That one will take four. This other one, well, I think that’ll take a year.”

The Product Manager was dumbfounded. “Features of similar scope took far less time in the past, and you had fewer developers then. What gives?”

Dan looked up at the Product Manager kindly, and drew a breath. “Well, we’ve been under such pressure to quickly add features to this product that we’ve not focused enough on its overall design. We’ve also made no time to keep our underlying architecture up to date. These are things I’ve been pointing out all along the way. But we’ve just bolted features on wherever we thought we could get away with it. Now, to add any one of the features you’ve requested, we basically have to unbolt three or four other features, and blend the code all together. And we have to write complicated bridge code to do modern things with our aging architecture, and when that doesn’t work we will have to upgrade some parts of it and test the product well to make sure everything still works. It’s a slow process. And it’s just going to get slower and slower the longer we keep going like this.”

That the product’s design had become cancerous and the underlying architecture had gone out of date were not considered a crisis – but not being able to rapidly add new features sure was. It focused the company’s entire attention. Their response was to code up a “next generation” product from scratch, which was a disastrous idea for a whole bunch of reasons beyond the point of this story. When the dot-com bubble burst in 2001-2002, they had not yet successfully launched the next-generation product, and they still couldn’t add features to the old product fast enough. Revenue fell precipitously. Quarterly layoffs began, but they were not enough to keep the wolves from the door. That once-promising company was sold; the company that bought it outsourced development to China.

More recently I went to work for another promising software company. They had been in business for about a decade and had sold their software to a number of very large companies. But in the couple years before I’d been hired, the pace of new feature delivery had slowed to a crawl. Adding new features had become increasingly difficult and always broke existing features. As a result, it took longer and longer to test the product, but even then, major bugs were still being delivered to customers. Meanwhile, younger, more nimble competitors were stealing business away from us. As the rate of new revenue decreased, support costs skyrocketed. It was unsustainable, and that company, too, had to sell itself to another company to avoid collapse.

It was much the same story: the company had focused entirely on rapid new-feature delivery and not enough on ongoing design and architecture. After a decade, their soil had gone infertile and the code had become tangled. Nothing new would grow.

♦ ♦ ♦

Software as a garden: to be able to grow more software, to be able to grow revenue with it, you have to keep the soil fertile and give the roots room. The problem is, gardening projects are a hard sell. These are things like refactoring older parts of the code that no longer serve efficiently, or upgrading or replacing outdated parts of the architecture, or redesigning subsystems that work fine today but can’t adapt to things the company wants to do in the future. When you tell executives you need to do these things, what they hear is that they can’t have new features while you do it. New features fuel growing companies.

But if you don’t tend your garden, sooner or later it will stop producing.

Categories
Quality, The Business of Software

Don’t piss off your users by suddenly changing your UI

By Jim Grey (about)

Delivering software on the Web is great. Especially with continuous delivery, we can deliver changes large and small anytime we want. And then we can get quick feedback from our users and the market, adjust the software accordingly, and push those updates fast, too. It’s utopia and the Holy Grail rolled into one!

Except that users are not very excited when we change things. They want software to stay as it is. Well, mostly: they want us to fix the bugs that affect them, of course, or even to add this or that little feature. But please, they plead, don’t make it work differently than it does now.

Meanwhile, we face various pressures. Markets shift; new needs emerge and old needs become less important. Technologies shift; old frameworks become outdated, new frameworks enable us to keep pace. Today everything has to not only work on mobile, but feel native to mobile — and all run on a single codebase. This is shifting product direction across our industry.

That’s the backdrop against which WordPress, the content engine behind one out of every four Web sites, rolled out a new editor last week. It was part of a complete rewrite of all of WordPress.com. Their old technologies just couldn’t stretch to where the world was moving. So they threw it out and started from scratch. Their new editor is fast — fast! — and works fluidly, while looking great both in my browser and on my phone.

Spanking new editor in my browser…

But boy, were users pissed. Pissed! Check the WordPress.com forum: 19 pages of complaints and counting. Sometimes, I swear, users wouldn’t be happy if you sent them gold bars, because they preferred the silver bars you used to send them. But very often users have a point: they’ve gotten into the swing of your software, and now you’ve changed it and they have to learn it all again. Worse, maybe now something they used to be able to do isn’t there anymore, or if it is, they can’t find it.

…and on my phone

For the record, I was the first commenter on that thread, because I experienced some of those frustrations. I tried to be kind, but several features I use either went missing or were accessed in a way that I couldn’t easily discover. Argh! And I wished the editing space were wider; it felt awfully cramped. I wasn’t alone in any of these complaints.

I wanted to just edit a post. I didn’t want to learn a new interface. But I found that there’s no way to just revert to the previous editor. It is simply gone.

I understand what drives changes like this, and I know what a monumental achievement this is for WordPress. Still, because I’m a heavy WordPress user, more than anything else I feel frustrated. The new editor breaks all of my usage flows, and I’m having to rediscover everything. I didn’t want this.

It’s the same, by the way, with Microsoft Office’s ribbon, which replaced an older menu structure way back in Office 2007. That’s forever ago in software terms. Yet there are still features I can tell you exactly how to find in those old menus, but I have to Google where they are on the ribbon.

Users don’t give a rip about your business or the future of technology. They use your product to accomplish a thing. As long as they can consistently and easily accomplish that thing, they stay happy. Many users learn your product’s nuances and become quite adept with them. When you suddenly change the UI and all of their flows are interrupted, of course they’re frustrated.

So what would happen if you followed Basecamp’s model? Their software helps companies small and large manage projects. Last month, they released Basecamp 3, a ground-up rewrite — yet they received not a single complaint from existing users. That’s because Basecamp 2, and for that matter Basecamp 1, remain fully active. Existing users can upgrade if they want, or stay put if they don’t. There are compelling reasons to move to Basecamp 3. But if you’re a happy Basecamp 1 or 2 user, those products will be there, fully supported, for as long as you want to use them.

Maybe your company can’t do that. But what can you do to ease the transition for your users, so they can stay fully productive? Think this through. It’s more important than any technology or implementation decision you make.

Fortunately, WordPress does, for some reason, still provide back-door access to an even older editor.

Outdated but highly functional classic editor

I don’t care that this is based on outdated technology: it’s fully featured, and I know how to make it sing. I cut my blogging teeth on this editor when I started my personal blog in 2007. I’ve written over a thousand posts in it. I hope it never goes away.

Categories
Quality, Testing

Fast failure recovery lets you take more risk and increase speed

By Jim Grey (about)

I can just imagine the “oh no” moment that rippled through Feedly last week when they learned that their big new release for iOS 9 crashed on launch for iOS 7 and 8 users. Their first response: add a hastily-written warning to the App Store description.

Their second response: fix the bug, fast. It was ready the next day.

Apparently, Apple has a way for developers to rush the very occasional critical fix into the App Store, sidestepping the normal, weeks-long approval queue.

Apple’s App Store fast track provides a safety net, and I don’t blame them for using it. Because Feedly could recover fast, hardly anybody will remember this gaffe, and it won’t cost the company very much.

I actually feel slightly bad talking about it, because it keeps the memory of this bug alive. But it illustrates such a simple equation: the faster you can recover from failure, the less perfect your product has to be when you ship it, and therefore the less expensive it is to build and support, and the faster your company can move.

I’m not advocating for delivering buggy product on purpose. Follow good development practices and test for important risks to deliver the best software you can as fast as you can. But when you inevitably deliver a bad bug, being set up to recover fast means you can deliver without worry.

I remember when the best way to get software to users was to mail it to them on CDs. A bug of this magnitude was a much bigger deal then. It was harder to warn users of the problem, and lots slower and more expensive to correct it. In that world, Feedly’s bug could have damaged their reputation for a long time. So back then it made sense to test more thoroughly — and therefore spend more time and money before releasing.

But today, in a world of Web software and 24-hour emergency App Store turnaround, you’ll deliver faster and with less expense when you set yourself up for fast failure recovery. Continuous integration and continuous delivery are usually a part of that strategy.

Categories
Quality, Testing

Myths of test automation – debunked!

By Jim Grey (about)

I wrote a post last year criticizing test automation when it’s used to cover for piles of technical debt and poor development practices. But I still think there’s a place for automation in post-development testing. There are two keys to using it well: knowing what it’s good at, and counting the costs. Without those keys it’s easy to fall prey to several myths of test automation. I aim to debunk them here.

Myth: Automation is cheap and easy

It is seductive to think that just by recording your manual tests you can build a comprehensive regression-test suite. But it never seems to really work that way. Every time I’ve used record and playback, the resulting scripts wouldn’t perfectly execute the test, and I’ve had to write custom code to make it work.

What I’ve found is that it takes 3 to 10 times longer to automate one test than to execute it manually. And then, especially for automation that exercises the UI, the tests can be brittle: you have to keep modifying scripts to keep them running as the system under test changes.

I’ve done straight record and playback. I’ve created automated modules that can be arranged into specific checks. I’ve led a team that created tests on a keyword-driven framework. And I currently lead a team that writes code that directly exercises a product’s API. The amount of maintenance has decreased with each successive approach.

A side note: given the cost of automating one test, can you see why you want to automate only what you are going to run over and over again? Otherwise the investment doesn’t pay.

Myth: Automation can test anything, and is as good as human testing

Automation is really good at repeating sets of actions, performing calculations, iterating over many data sets, addressing APIs, and doing database reads and writes. I love to automate these things, because humans executing them over and over is a waste of their potential.
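
As an illustration, here’s the kind of data-driven API check I have in mind. Everything in it is hypothetical (the endpoint, the payloads, the expected status codes); the point is that a machine can grind through a table of cases like this tirelessly, so a person doesn’t have to.

    # A sketch of a data-driven API check. The endpoint and test data are made up;
    # substitute your product's real API and business rules.
    import pytest
    import requests

    BASE_URL = "https://example.test/api"  # hypothetical system under test

    CASES = [
        # (payload, expected_status)
        ({"email": "user@example.com", "amount": 25.00}, 201),
        ({"email": "user@example.com", "amount": 0}, 400),      # zero amount rejected
        ({"email": "not-an-email", "amount": 25.00}, 400),      # malformed email rejected
    ]

    @pytest.mark.parametrize("payload,expected_status", CASES)
    def test_create_transaction(payload, expected_status):
        # One check, run once per row in CASES -- exactly the kind of repetition
        # automation is good at and humans find tedious.
        response = requests.post(f"{BASE_URL}/transactions", json=payload, timeout=10)
        assert response.status_code == expected_status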

This gets at a whole philosophical discussion about what testing is. I think that running predetermined scripts, whether automated or not, is just checking, as in, “Let me check whether clicking Save actually saves the record.” This subset of testing just evaluates the software based on predefined criteria that were determined in the past, presumably based on the state of the software and/or its specification or set of user stories as they were then.

The rest of testing involves human testers experimenting and learning, evaluating the software in its context now. This is critical work if for no other reason than that the software and its context (environment, hardware, related software, customer needs, business needs, and so on) change. An exploring human can find critical problems that no automated test can.

I want human testers to be free to test creatively and deeply. I love automated checks because they take this boring, repetitive work away from humans so they have more time to explore.

Myth: When the automation passes, you can ship!

It’s seductive to think that if testing is automated, that passing automation is some sort of Seal of Approval that takes out all the risk. It’s as if “tested” is a final destination, an assurance that all bets are covered, a promise that nothing will go wrong with the software.

But automation is only as good as its coverage. And if nobody outside your automation team understands what the automation covers, saying “the automation passed” has no fixed meaning.

It’s hard to overcome this myth, but to the extent I have, it’s because as an automation lead and manager I’ve required engineers to write detailed coverage statements into each test. I’ve then aggregated them into broad, brief coverage statements over all of the parts of the software under test. Then I’ve shared that information — sometimes in meetings with PowerPoint decks, always in a central repository that others can access and to which I can link in an email when I inevitably need to explain why passing automation isn’t enough. Keeping this myth at bay takes constant upkeep and frequent reminders.
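
To make that concrete, here’s a toy example of a coverage statement written into a test, assuming a Python suite run with pytest. The function under test is stubbed in so the sketch stands on its own; the docstring is the part that matters.

    # The code under test is stubbed here so the example is self-contained.
    def apply_discount(total: float, percent: float) -> float:
        """Return the total after applying a percentage discount."""
        return round(total * (1 - percent / 100), 2)

    def test_percentage_discount_reduces_total():
        """
        COVERS: recalculating an order total when a percentage discount is applied.
        DOES NOT COVER: stacking multiple discounts, expired discount codes,
        currency rounding rules, or the UI that displays the total.
        """
        assert apply_discount(100.00, 10) == 90.00

A small script can then harvest those docstrings into the broad, brief summary that goes into the central repository.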

Myth: Automation is always ready to go

“Hey, we want to upgrade to the next version of the database in the sandbox environment. Can you run the automation against that and see what happens?”

My answer: “Let’s assume I can even run the automation in sandbox. If it passes, what do you think you will know about the software?” The answer almost always involves feelings: “Well, I’ll feel like things are basically okay.” See “When the automation passes, you can ship!” above.

Automation is software, full of tradeoffs aimed at meeting a set of implicit and explicit goals. Unless one of those goals was “must be able to run against any environment,” it probably won’t run in sandbox. The automation might count on particular test data existing (or not existing). It might not clean up after itself, leaving lots of data behind, and that might not be welcome in the target environment. It might depend on a particular configuration of the product and its environment that isn’t present.

Even in the environment the automation usually runs in, it might not be ready to go at a moment’s notice. Another goal would need to be, “must be able to run at any time.” There are often setup tasks to perform before the automation can run: a reset of the database the automation uses, or the execution of scripts that seed data that the automation needs.
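
Here’s a rough sketch of the kind of setup a suite might silently depend on, written as a pytest fixture that builds and seeds a throwaway database before tests run. The schema and data are invented; the point is that none of this exists in whatever environment someone wants to point the automation at on a moment’s notice.

    # Hypothetical setup the suite depends on: a known schema and known seed data.
    import sqlite3
    import pytest

    SEED_USERS = [("alice@example.com", "active"), ("bob@example.com", "suspended")]

    @pytest.fixture(scope="session")
    def seeded_db(tmp_path_factory):
        db_path = tmp_path_factory.mktemp("data") / "test.db"
        conn = sqlite3.connect(str(db_path))
        conn.execute("CREATE TABLE users (email TEXT PRIMARY KEY, status TEXT)")
        conn.executemany("INSERT INTO users VALUES (?, ?)", SEED_USERS)
        conn.commit()
        yield conn    # tests run against this known, freshly seeded state
        conn.close()  # teardown, so nothing is left behind

    def test_active_users_are_counted(seeded_db):
        count = seeded_db.execute(
            "SELECT COUNT(*) FROM users WHERE status = 'active'").fetchone()[0]
        assert count == 1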

Myth: Just running the automation is enough

When I run automated tests, part of me secretly hopes they all pass. That’s because when there’s a failure, I have to comb through the automation logs to find what happened, figure out what the automation was doing when it failed, and log into the software myself and try to recreate the problem manually. Sometimes the automation finds just the tip of a bug iceberg and I spend hours exploring to fully understand the problem. Some portion of the time, the failure is a bug in the automation that must be fixed. When it’s a legitimate product bug, then I have to write the bug in the bug tracker.

I am endlessly amused by how often I’ve had to explain that just running the automation isn’t the end of it: that if there are any failures, the automation doesn’t automatically generate bug reports. The standard response is some variation of “What? …ohhhhhh,” as it dawns on them. So far, thankfully, it has always dawned on them.

Myth: Automated tests can make up for years of bad development practices

I’ve just got to restate my point from my older post on this subject. If your development team doesn’t follow good practices such as writing lots of automated unit tests (to achieve about 80% code coverage), code reviews, paired testing, or test-driven development, automation from QA is not going to fix it. You can’t test in quality — you have to build it in.

If you’re sitting on a messy legacy codebase, one where your test team plays whack-a-mole with bugs every time you make changes to it, you are far, far better served investing in the code itself. Refactor, and write piles of automated unit tests.

You want on the order of thousands of automated unit tests, hundreds of automated business-rule tests (which hopefully exercise an API directly, rather than a UI, for resiliency and maintainability), and tens of automated checks to make sure the UI is functioning.

I’ll belabor this point: Invest in better code and better development practices first. When you deliver better quality to QA, you’ll keep the cost of testing as low as possible and more easily and reliably deliver better quality to your customers and users.

Categories
Career

Twenty-five years in the software salt mines

By Jim Grey (about)

Tomorrow it will have been 25 years since I started my career in the software industry.

It might seem odd that I remember the exact day, but only until you know that I started work on Monday, July 3, 1989, making my second day a paid holiday. The office was nearly deserted on my first day. My boss regretted not having me start on July 5 so he could have had an extra-long weekend too.

I was 21 years old when I joined that little software company in Terre Haute. I’m 46 now. I have worked more than half my life in and around the software industry.

I taught myself how to write computer programs when I was 15. When I was 16, my math teacher saw some of my programs and praised my work. He encouraged me to pursue software development as a career. He began to tell me about this tough engineering school in Terre Haute.

I graduated from that tough engineering school hoping to find work as a programmer. Jobs were hard to come by that year, so when a software company wanted to hire me as a technical writer I was thrilled just to work. And then it turned out I had a real knack for explaining software to people. I did it for twelve years, including a brief stint in technology publishing and five years managing writers.

I then returned to my technical roots, testing software and managing software testers. I learned to write automated functional and performance tests – code that tests code – and it has taken me places in my career that I could never have imagined.

My office at one of my career stops

I’ve worked for eight companies in 25 years. The longest I’ve stayed anywhere is five years. I left one company in which I was a poor fit after just 14 months. I’ve moved on voluntarily seven times, was laid off once, and was fired and un-fired once (which is quite a story; read it here). Changing jobs this often isn’t unusual in this industry and has given me rich experience I couldn’t have gained by staying with one company all this time.

I’ve worked on software that managed telephone networks, helped media buyers place advertising, helped manufacturers manage their business, ran Medicare call centers, helped small banks make more money, enabled very large companies to more effectively market their products, and gave various medical verticals insight so they could improve their operations and their business.

Some of these companies were private and others were public; so far, I’ve liked private companies better. Some of them made lots of money, some of them had good and bad years, and one of them folded. Some of them were well run and others had cheats and liars at the helm. Some were very difficult places to work, but those were crucibles in which I learned the most. Others have brought successes beyond anything I could have hoped for a quarter century ago.

I did, however, hope for a good, long run in this industry, and I got it. But I’m also having a hard time envisioning another 25 years. It’s not just because I’d be 71 then. I really like to work, and – right now at least – I plan to do so for as long as I am able. But I’m starting to have trouble imagining what mountains I might yet climb in this career. Maybe that’s part of reaching middle age – indeed, many of my similarly aged colleagues, some with careers far beyond mine, have gone into other lines of work.

I’m still having a lot of fun making software, though. I currently manage six software testers, one test-automation and performance-test developer, and one technical writer. I get to bring all of my experience to bear, and encourage my teams to reach and grow. I don’t want to stop just yet.


If this sounds familiar, it’s because it’s an update of an earlier post. Cross-posted to my personal blog, Down the Road.

Categories
Quality, Testing

When test automation is nothing more than turdpolishing

By Jim Grey (about)

I used to think that writing a fat suite of automated regression tests was the way to hold the line on software quality release over release. But after 12 years of pursuing that goal at various companies, I’ve given up. It was always doomed to fail.

In part, it’s because I’ve always had to automate tests through a UI. When I did straight record-and-playback automation, the tests were enormously fragile. Even when I designed the tests as reusable modules, and even when I worked with a keyword-driven framework, the tests were still pretty fragile. My automation teams always ended up spending more time maintaining the test suite than building new tests. It’s tedious and expensive to keep UI-level test automation running.

But the bigger reason is that I’ve made a fundamental shift in how I think about software quality. Namely, you can’t test in quality – you have to build it in. Once code reaches the test team, it’s garbage in, garbage out. The test team can’t polish a turd.

Writing an enormous pile of automated tests through the UI? Turdpolishing.

I’ve worked in some places where turdpolishing was the best that could be done. Company leadership couldn’t bear the thought of spending the time and money necessary to pay down years of technical debt, and hoped that building out a big pile of automated tests would hold the line on quality well enough. I’ve led the effort at a couple companies to do just that. We never developed the breadth and depth of coverage necessary to prevent every critical bug from reaching customers, but the automation did find some bugs and that made company leadership feel better. So I guess the automation had some value.

But if you want to deliver real value, you have to improve the quality of the code that reaches your test team. Even if the software you’re building is sitting on a mountain of technical debt, better new code can be delivered to the test team starting today. I’m a big believer in unit testing. If your software development team writes meaningful unit tests for all new code that cover 60, 70, 80 percent of the code, you will see initial code quality skyrocket. Other practices such as continuous integration, pair programming, test-driven development, and even good old code reviews can really help, too.
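
For a sense of what “meaningful” looks like, here’s a toy example in Python: a small new function and unit tests that pin down its normal behavior, its edge cases, and its error handling, rather than just touching lines to pad a coverage number.

    import pytest

    def chunk(items: list, size: int) -> list:
        """Split items into consecutive sublists of at most `size` elements."""
        if size < 1:
            raise ValueError("size must be at least 1")
        return [items[i:i + size] for i in range(0, len(items), size)]

    def test_chunk_splits_evenly():
        assert chunk([1, 2, 3, 4], 2) == [[1, 2], [3, 4]]

    def test_chunk_keeps_the_remainder():
        assert chunk([1, 2, 3], 2) == [[1, 2], [3]]

    def test_chunk_handles_empty_input():
        assert chunk([], 3) == []

    def test_chunk_rejects_a_nonsense_size():
        with pytest.raises(ValueError):
            chunk([1, 2, 3], 0)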

But whatever you do, don’t expect your software test team to be a magic filter through which working software passes. You will always be disappointed.

Categories
Project Management

Three-point estimation is a critical skill you should build right now

By Jim Grey (about)

Maybe you’re one of the lucky ones.

Maybe you work in a shop that executes scrum well. Your team keeps the backlog well groomed, understands its velocity, and delivers working software sprint after sprint. Or maybe you work in an open research-and-development shop where the journey is as important as the result, and projects are always open-ended.

If, like me, you have to deliver software on deadline using a less-than-perfectly executed methodology, this post is for you. It’s going to give you a technique for knowing where you stand against your deadlines – a very useful skill that will reduce the chaos you experience and help your boss steer projects more successfully, both of which are good for you.

It is a method for implementing continuous reestimation, which I wrote about in this post. Read it to know why this will make you and your boss more effective.

Teams that have worked for me and have estimated using this technique have seen their projects come in within estimate about 80% of the time. The other 20% of the time, they learned important things about their work that let them estimate more accurately the next time.

Here’s the technique:

  1. Break down your work into tasks.
  2. Estimate the tasks using a three-point formula.
  3. As you work, reestimate every day.

Break down your work into tasks

You might resist this step because in your core you know you will miss some tasks. It’s okay. This process honors that you don’t know what you don’t know. Just outline the tasks as best you can.

But think small, because you’ll be more accurate when you estimate more small things than fewer big things. Break things down as small as you reasonably can.

Estimate the tasks using a three-point formula

For each task, then, think of the following:

  • If everything goes right on that task, how long will it take? This is the best case, B.
  • If everything goes wrong on that task, how long will it take? This is the worst case, W.
  • If a reasonable amount of things go wrong on that task, how long will it take? This is the likely case, L.

Think for a minute or two about each task, but any more than that is probably overthinking it.

Important: Calculate your estimates in hours. Not days, not weeks, hours. There’s too much slush in days and weeks. Besides, in an 8-hour day you probably work just 6 hours on project tasks. You spend the other two hours in meetings, having hallway conversations, using the restroom, and being interrupted.

Here is where some magic comes in. Just go with me on this. For each task, drop B, W, and L into this formula and calculate the answer.

\displaystyle Estimate=\frac{B+4L+W}{6}

You’re weighting the likely case, but honoring the best and worst cases – and in one deft motion you take fear out of your estimates. What makes estimates bad is either unexpected things happening or overpadding for unexpected things happening. Thinking best/likely/worst helps you be more objective. And when you add up these three-point estimates for all your tasks, you get a reasonable cushion for the things that could go wrong.

Divide the number of hours by the number of hours per day or week you will work on the project. Lay that number of days out on a calendar; tell your boss that’s when you’ll be done.
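
If it helps to see the arithmetic end to end, here’s a small sketch with made-up tasks and hours, assuming the 6 productive hours per day mentioned above:

    # Three-point estimates for a few hypothetical tasks. Hours are invented;
    # plug in your own best (B), likely (L), and worst (W) values.
    TASKS = {
        # task: (best, likely, worst) in hours
        "Build the nightly query": (2, 4, 10),
        "Write the notification email": (3, 6, 16),
        "Wire up the scheduler": (4, 8, 24),
    }

    PRODUCTIVE_HOURS_PER_DAY = 6  # the other two go to meetings and interruptions

    def three_point(best, likely, worst):
        """Weighted estimate: (B + 4L + W) / 6."""
        return (best + 4 * likely + worst) / 6

    total_hours = 0
    for task, (b, l, w) in TASKS.items():
        estimate = three_point(b, l, w)
        total_hours += estimate
        print(f"{task}: {estimate:.1f} hours")

    print(f"Total: {total_hours:.1f} hours, "
          f"about {total_hours / PRODUCTIVE_HOURS_PER_DAY:.1f} working days")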

As you work, reestimate every day

You are going to discover tasks you didn’t think of at first. And some tasks are going to take longer than you originally thought. When this happens, estimate the new tasks and/or reestimate the task that’s taking longer. Recalculate your end date and break the news to your boss.

Your boss ought to kiss you right then and there, by the way, because you gave him or her information early enough to do something about it – adjust scope, add resources, move the date, or tell you that unfortunately you are looking at some late nights. Hopefully the answer will be one of the first three more often than the last.

And every time you have to reestimate, you’ve learned something about your work that helps you estimate more accurately in the future.

Categories
Process, Project Management, Quality

If you want to ship software, stay in touch with how much you suck

By Jim Grey (about)

My colleague Matt Block recently posted on his blog a link to an article about how a software shop’s business model affects how well agile scrum works for them. It breaks business models down into emergent, essentially meaning that the company builds product to meet goals such as selling ads or driving traffic, and convergent, essentially meaning that the company builds product that directly serves a target market. The article argues that agile is made for emergent and is a poor fit for convergent. That’s just a sketch of the article; go read it to get the full flavor.

Eminence says: Monrovia Sucks!
Graffiti found in the town neighboring Monrovia.

I’ve always worked for companies following convergent business models. We’ve made our money by selling the software we created, which made it always important to deliver a certain scope by a certain time. When those companies implemented agile scrum, they could never fully adopt a key principle of it: when it’s time to ship, you ship whatever is built. In a convergent world, scope is king; you ship when everything specified is built.

I e-mailed my brother, Rick Grey, a link to this article. It’s great to have a brother who does the same thing I do for a living, as we can talk endlessly about it. I thought we’d have a conversation about how to scope an agile project, but instead he had a brilliant insight: What if agile is good for convergent-model companies because it tells you sooner how much your project is off track? He gave me permission to share his e-mailed reply, which I’ve edited.

– – –

What if the companies we’ve worked for and all the other convergent-model teams of the world are doing agile just fine? By “just fine” I mean “as good as they do waterfall,” which may not be “just fine,” but we’ll get to that in a minute. Meanwhile, consider:

Long waterfall project:

  • No one pays real attention to progress (there’s always next month to catch up)
  • Engineers go dark, checking out huge sections of the codebase and not merging them back for long periods
  • Engineers (who are notoriously poor estimators) claim 50% done when it’s really about 25% – and then, as the code-complete milestone nears, they (usually innocently) claim 90% done when it’s really 70%
  • A couple of days before the code-complete milestone, engineering finally acknowledges they won’t hit the milestone and delays delivery to QA – “but we’re 95% done, for sure”
  • Under the pressure of already having missed a deadline, developers quietly take shortcuts to make it possible to hit the new QA delivery date
  • Weeks and months of unmerged changes come crashing in, creating conflicts and compile/deploy problems, further delaying delivery to QA
  • QA, now starting with a multiple-week handicap on an already-too-aggressive schedule, quietly takes its own shortcuts
  • QA finds hairy showstopper bugs and so the ship date gets moved
  • Management is livid, so QA goes into confirmatory testing mode just to get it out the door

Agile project of the same size:

  • Much of the above happens at a smaller scale, one iteration at a time
  • You fail to deliver everything planned starting with the first sprint
  • Instead of spending 80% of the project thinking you don’t suck as an organization and the last 20% realizing that you do, agile lets you feel like you suck every step of the way
  • Takeaway for management: “agile sucks” and/or “we suck at agile”

I assert that most teams are bad at delivering under a convergent business model. The hallmark pathologies of software delivery under a convergent model are too numerous and powerful for most teams to overcome, but their struggles are masked by waterfall until the end. Agile surfaces the problems every iteration. You feel like a loser by week 4 instead of week 40.

But this is actually a win. You get better project visibility and a tighter feedback loop, meaning you’ve got a better chance to make adjustments earlier to get the most out of the team you have. Embrace the feedback loop as a chance to make things better, and learn not to view it as proof of how much you (collectively) suck.

– – –

I will add that agile also helps you keep resetting expectations within your organization, because it makes it standard practice to keep reestimating what it will take to finish everything. This is just what I was talking about in my last post (read it here).

Categories
Quality, The Business of Software

Obamacare, healthcare.gov, and how government software gets made

By Jim Grey (about)

I was not surprised when I heard that the Obamacare Web site, healthcare.gov, crashed and burned right out of the gate.

But I was disappointed. Regardless of what I think of the Affordable Care Act, it’s the law. I wanted its implementation, including healthcare.gov, to go well.

Still, I wasn’t surprised because I know how government software gets made.

Several years ago I worked in middle management for a company that built a government Web application related to health-care customer service. I was in charge of testing it to make sure it worked. It is probably not going out on a limb to say that the people who built healthcare.gov experienced many of the same kinds of things I experienced on that project.

Let me be plain up front: I was a poor fit for government software development. I was too free-wheeling and entrepreneurial for the control-and-compliance environment that government contracting encourages. I find it difficult to write about the experience without showing my frustrations with its realities. But I think I understand those realities well and objectively.

The government doesn’t know how to do anything. They hire it all out, and then they manage and administer the process. As a result, on this project they relied heavily on compliance with “best practices,” as if those practices contained some sort of magic that delivered quality software. They don’t, of course; the government was shocked when Version 1.0 of our software had typical quality problems right out of the gate. Those practices served primarily to leave an audit trail the government could follow.

In the end, the project was a success. Despite Version 1.0’s glitches, which we quickly fixed, the software was immediately put to use and led to productivity improvements over an older, green-screen system. I spoke with many of the software’s users, and despite a few grumbles most of them liked using it.

But this was one mighty expensive piece of software to build, from winning the contract to defining what the software should do to building and maintaining the software. Here’s why.

The bid process

I was hired after we won the contract, but I heard stories about the bid process. We had no experience building software on this scale, but we wanted into the lucrative cost-plus government contracting business for its guaranteed profit margins. So we offered a lowball bid aimed at getting the government’s attention, not at what it actually was going to take to build the software. And then to our surprise we won the business. After the elation wore off, we were left with an “oh shit” feeling – we needed to actually build the software for that amount. How the heck would we pull that off?

We finished Version 1.0 on my watch, but I don’t know whether we delivered it within budget. It seemed to me, however, that the bid process encouraged underbidding and overspending.

The requirements process

When you make something for the government, they want to know exactly what they’re getting, in excruciating detail. So we started by writing the biggest, thickest requirements document I’ve ever seen. We weren’t building this software from scratch – we bought what was then the leading customer-relationship-management software platform and used its software-development toolkit to heavily customize it for our needs. But we had to write highly detailed specifications anyway.

Requirements gathering was more about navigating choppy political waters and brokering compromise than about specifying usable, stable, and scalable software. To develop the requirements, we flew in representatives from every company that would use the software and put them into a big room to hash out how it would work. But all of these companies were themselves government contractors. Their people all knew each other – and, frequently, competed against each other for contracts. Some of them had competed against us trying to win this contract. The room was thick with mistrust and agendas.

The building process

The government lives in constant fear of being screwed by its contractors. It goes back to Abraham Lincoln’s time, when rampant fraud among suppliers threatened the Civil War effort. (Seriously. Gunpowder cut with sawdust. Uniforms that dissolved in the rain. Read about it here.)

So not only did the government hire us to build the software, they hired another firm to watch us do it. This is called independent verification and validation, or IV&V. Their job was to make sure that we followed software-development “best practices” and that we built what we said we were going to build. But making matters worse, the company that won the IV&V contract, I’m told, also had bid on the project to build the software in the first place. It always seemed clear to me that they wanted to show us to be fools so that they could take over the project. They ran us ragged over every last minor detail.

The level of perfectionism in terms of “best practice” adherence was intense. Yet when we delivered the software, it had several usability challenges and outright bugs. Worse, it struggled to keep up with the load users placed on it. If you’ve ever built software, you know that these are typical challenges with Version 1.0 of anything. But the government was shocked, dismayed, and appalled. We spent the next several months issuing update releases to make it perform as it needed to. Of course, IV&V ran roughshod over us the whole way – but they were in the hot seat too because their “best practices” had failed to prevent these problems.

The process overhead

Process is tricky to apply well. Too little leads to chaos, too much adds needless cost and delay. I’m not anti-process – rather, I’ve built a career on bringing just the right level of process into a software development environment to make it effective. But most of the process we had to follow involved documenting our work to prove to the government that we had actually done it. This frequently hindered our ability to deliver software cost-effectively, and sometimes stood in the way of quality.

We bought a well-known software product that stored requirements and linked them to the code and the test cases so we could prove that we built and tested each requirement. This involved tracing every requirement to every line of code and every test case, an enormous task in and of itself. I personally created a traceability report each quarter and sent it to the government. All of this required a lot of work from skilled technical people, but in my judgment did not materially help us better build or test the software.

Our test cases were contractually required to be documented in such detail that a trained monkey could execute them. They were at the level of “Step 1. Type your username into the Login box. Expected result: Your username appears in the Login box. Step 2. Type your password into the Password box. Expected result: A row of asterisks appears in the Password box.” A test case that took fifteen minutes to execute could have taken two hours to write and could have been a dozen pages long. We had hundreds of test cases. Many of them didn’t belong in the regression suite that ran every release, so we spent a lot of time writing test cases that we executed only a small handful of times.

It was supposed to be against the rules to write a bug report that had no associated test case. Testers would often stumble upon a bug by accident or find one while doing ad-hoc testing – and then find themselves in a conundrum. Writing the test case that led to the bug and tracing it back to requirements took time we frequently lacked at that point in the game. When the bug was serious enough, everybody looked the other way when it wasn’t associated with a test case. I wonder whether any of the testers avoided writing test cases by falsely associating the bug with an existing test case.

We did get one big break. We lobbied for, and to our astonishment successfully won, an exception to a standard practice: we did not have to print screen shots of the results of every test step. Other projects for which we had contracts had to do this. As you can imagine, managing all that paper slowed progress considerably. Those projects collected those screen shots into boxes, which were sent to offsite storage.

The mounting costs

All of these process steps meant spending more money, mostly in the form of human effort. There were other ways in which the government’s way of making software added costs to the project. Here’s a short, incomplete list:

  • Frequent, ongoing training about compliance with standards, which, amusingly, is where I learned about the Civil War fraud.
  • Entering time worked on three separate software systems – one for the project-management tool, one for government accounting, and one my employer used to manage time off. I spent an hour a week entering time.
  • A prohibition on open-source software. The government wanted all software used to be “supported,” meaning that there had to be a phone number to call for help. So we spent money on commercial tools that sometimes weren’t as capable as open-source versions. In a couple cases, the only tool or component available for a task was open source, and we couldn’t build the application without it. We did get the government to bend the rule for us in those cases, but it took heavily documented justifications and layers of approvals to make it happen.
  • Strict separation of duties to protect the government from a rogue contract employee sabotaging the system. This meant, for example, that I couldn’t restart the computers we used for testing when they needed it. I knew how to do it, but I was not allowed to. I had to write a request for an infrastructure engineer to do it, and then wait sometimes for days for it to reach the top of his priority list.

As you can see, there was nothing easy or inexpensive about this project. Yet we got it done and the software worked. It’s still in use today. We showed that it’s possible – just slow and expensive – to build software the government’s way.

So I have great empathy for those who built healthcare.gov. No doubt about it: the site failed, and they built it. They must feel tremendous pressure right now as they scramble both to handle the heat they’re getting from the government and to rush fixes to the site so that it works well enough. But if their experience building that site was anything like my experience building government software, it’s hardly shocking that it launched with challenges.