2/07/2011

Whitepaper: Building functional safety into complex software systems

My colleague Chris Hobbs writes books, designs software, sings Schubert, teaches pilots, and, if all that isn't enough, pens papers on functional safety. Speaking of which, I've just started reading Chris's latest paper, "Building Functional Safety into Complex Software Systems, Part I," which contains the following anecdote:

    "Thirty-seven seconds after it was launched on June 4 1996, the European Space Agency’s (ESA) new Ariane 5 rocket rained back to earth in pieces. This failure was rather costly: some US $370 million, and a stinging embarrassment for ESA.

    It has become one of the best known instances of software that had been exhaustively tested and even field proven — in this case, more accurately, sky-proven — ceasing to function correctly though it had not been changed. What had changed was the context in which the software ran..."


This story highlights the paper's thesis: that the functional safety of today’s complex, multi-threaded software systems cannot be validated by traditional, state-based testing alone.

In theory, such systems are deterministic. And in theory, all of their states and state transitions can be identified. But in practice, these states and transitions are so numerous that they cannot be counted, let alone tested.
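
To get a feel for the numbers, here is a back-of-the-envelope sketch. It is not taken from Chris's paper, and it rests on a simplifying assumption: each thread executes a fixed sequence of atomic steps, and the only thing that varies is how the scheduler interleaves them. The function name `interleavings` is made up for this illustration. Under that assumption, the number of distinct execution orders is the multinomial coefficient (k·n)! / (n!)^k for k threads of n steps each.

```python
from math import factorial


def interleavings(threads: int, steps: int) -> int:
    """Count the distinct orderings of `threads` threads, each of which
    runs `steps` atomic steps in its own program order.

    Simplifying assumption: steps are atomic and every interleaving is
    possible. Real systems add interrupts, caches, and timing on top.
    """
    return factorial(threads * steps) // factorial(steps) ** threads


if __name__ == "__main__":
    # Watch the count grow as each of three threads gets longer.
    for steps in (1, 5, 10, 20):
        count = interleavings(3, steps)
        print(f"3 threads x {steps} steps: {count:,} interleavings")
```

Even this toy model, which ignores data values and hardware state entirely, outgrows any realistic test budget after a few dozen steps per thread.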

Does this mean we must throw up our collective hands in despair? Not at all, says Chris. He emphasizes that it is still possible to build functionally safe complex software systems — but since I don't want to spoil the story, I'll stop talking now and invite you to read the paper.

And while you're at it, I invite you to check out other papers Chris has written on safety-critical systems and software:

  • Fault Tree Analysis with Bayesian Belief Networks for Safety-Critical Software
  • Using an IEC 61508-Certified RTOS Kernel for Safety-Critical Systems
  • Protecting Applications Against Heisenbugs