Using TDD with Concurrent Applications

room: Conference D, M — time: Thursday 08:30-10:00, Thursday 10:30-12:00

Synopsis

Testing concurrent code is hard. Very hard. The number of things that can go wrong increases dramatically when going from one thread to two. Going from two threads to more adds further problems, but the jump is generally smaller than the one from one thread to two. Knowing what is safe and what is not, while important, is not enough. Understanding the theory behind concurrent programs and traditional algorithms, such as producer-consumer, readers-writers and the dining philosophers, while important, is still not enough. You need to know how to approach testing so that you have a better chance of finding concurrency issues before you release to production.

Content

This tutorial begins with some not-so-basic basics. We first discuss two seemingly simple examples of a badly written class; in both cases we’re only looking at one line of Java code. We tear apart the code and look at the bytecode to get a better understanding of how this can break at a fundamental level. From this, we begin to derive what an atomic operation is from the perspective of the Java Virtual Machine.
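The tutorial's own examples are not reproduced in this description. As a hedged illustration of the kind of one-line defect meant here, the hypothetical class below (the name IdGenerator and its field are assumptions, not the tutorial's code) shows a single statement that compiles to several bytecode instructions:

```java
// Hypothetical example; not the class used in the tutorial.
final class IdGenerator {
    private int lastId = 0;

    // One line of Java, but not one step for the JVM. Disassembling with
    // `javap -c IdGenerator` shows roughly:
    //   aload_0, dup, getfield, iconst_1, iadd, dup_x1, putfield, ireturn
    // A thread can be suspended between getfield and putfield, so two threads
    // may read the same value and both write back the same result.
    int nextId() {
        return ++lastId;
    }
}
```

Called from a single thread, nextId() returns 1, 2, 3 and so on. The defect only appears when two or more threads share one instance, which is why it is hard to spot by reading the source alone.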

Now that we know the code is broken, the next step is to demonstrate that it can/will/does break. How can we go about doing that? This leads us into a discussion of testing threaded code and, time permitting, we develop a test to demonstrate that the code fails. If time does not permit, we’ll work with one such test. Writing such tests is difficult, and building reliable tests that demonstrate failures in a “reasonable” amount of time is a very hard problem without a little help. After attempting to increase the percentage of runs in which we detect the failure, we’ll use a tool to make detection both fast and reliable.
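A minimal sketch of such a hand-written test follows. It is not the tutorial's test; the class names LostUpdateProbe and UnsafeCounter are hypothetical. Several threads are released at once by a latch and hammer an unsynchronized counter, and the method reports how many increments went missing:

```java
import java.util.concurrent.CountDownLatch;

// Hypothetical probe; names and structure are assumptions for illustration.
final class LostUpdateProbe {
    static final class UnsafeCounter {
        private int value;
        void increment() { value++; }   // read-modify-write, not atomic
        int get() { return value; }
    }

    // Runs `threads` workers that each call increment() `perThread` times.
    // Returns the number of lost increments (0 means none were observed).
    static int lostUpdates(int threads, int perThread) {
        UnsafeCounter counter = new UnsafeCounter();
        CountDownLatch start = new CountDownLatch(1);
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                try {
                    start.await();      // line all workers up before the race
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                for (int n = 0; n < perThread; n++) {
                    counter.increment();
                }
            });
            workers[i].start();
        }
        start.countDown();              // release every worker at once
        try {
            for (Thread t : workers) {
                t.join();               // join() makes the final value visible
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return threads * perThread - counter.get();
    }
}
```

A call such as LostUpdateProbe.lostUpdates(4, 100000) often returns a positive number, but nothing guarantees it: on some machines and runs it returns 0 even though the code is broken. That unreliability is exactly the problem the paragraph above describes.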

Our next task will be to take a single-threaded solution that takes too long and improve its performance using concurrent programming. To do so, we’ll test our way from a working, single-threaded solution to a working, multi-threaded solution. Along the way, we’ll apply what we just learned to make sure we did not introduce any threading issues. The particular problem involves redesign based on contention. We will predict what we should get in terms of a performance improvement. When we fail at meeting that performance goal, we will analyze the contention, remove it and move towards our predicted performance gain.
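The description does not say how the prediction is made. One common back-of-the-envelope estimate, assumed here purely for illustration, is Amdahl's law: if a fraction p of the work can run in parallel on n threads, the best possible speedup is 1 / ((1 - p) + p / n). The class name SpeedupEstimate is hypothetical.

```java
// Hypothetical helper; Amdahl's law is an assumption, not the tutorial's stated method.
final class SpeedupEstimate {
    // Upper bound on speedup when `parallelFraction` of the work (0.0 to 1.0)
    // is spread evenly over `threads` threads and the rest stays serial.
    static double amdahl(double parallelFraction, int threads) {
        if (parallelFraction < 0.0 || parallelFraction > 1.0 || threads < 1) {
            throw new IllegalArgumentException("need 0 <= parallelFraction <= 1 and threads >= 1");
        }
        double serialFraction = 1.0 - parallelFraction;
        return 1.0 / (serialFraction + parallelFraction / threads);
    }
}
```

For example, amdahl(0.9, 4) is about 3.08, not 4. Time spent waiting on a contended lock behaves like extra serial work, so a measured speedup well below the estimate is a hint to go looking for contention.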

We conclude with a traditional deadlock problem. We’ll review hand-coded tests written to be able to reliably demonstrate deadlock situations. We’ll then look at the tool we used earlier and see how it could help us to detect deadlock situations.
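As a hedged sketch of what a hand-coded deadlock test might look like (this is not the tutorial's test, and the class name DeadlockProbe is hypothetical), the code below forces two threads to take two monitors in opposite order and then asks the JVM, through the standard ThreadMXBean, whether it sees a monitor deadlock:

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;

// Hypothetical probe; names and structure are assumptions for illustration.
final class DeadlockProbe {
    // Starts two daemon threads that lock two monitors in opposite order,
    // then polls until the JVM reports a deadlock or the timeout expires.
    // Returns the number of deadlocked threads reported (0 if none in time).
    static int provokeAndDetect(long timeoutMillis) {
        Object left = new Object();
        Object right = new Object();
        CountDownLatch bothHoldFirstLock = new CountDownLatch(2);
        lockInOrder(left, right, bothHoldFirstLock).start();
        lockInOrder(right, left, bothHoldFirstLock).start();

        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (System.nanoTime() < deadline) {
            long[] ids = threads.findMonitorDeadlockedThreads(); // null if none
            if (ids != null) {
                return ids.length;
            }
            try {
                Thread.sleep(10);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return 0;
            }
        }
        return 0;
    }

    private static Thread lockInOrder(Object first, Object second, CountDownLatch latch) {
        Thread t = new Thread(() -> {
            synchronized (first) {
                latch.countDown();
                try {
                    latch.await();   // wait until the other thread holds its first lock
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                synchronized (second) {
                    // Never reached: the other thread holds `second` and wants `first`.
                }
            }
        });
        t.setDaemon(true);           // deadlocked threads must not keep the JVM alive
        return t;
    }
}
```

The latch removes the timing luck: neither thread asks for its second lock until both hold their first, so the deadlock happens on every run. The two threads stay blocked for the life of the JVM, which is why they are daemon threads.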

If time permits, we’ll spend some time reviewing some of the new features introduced in Java 5, not because we want to learn the library but because there are designs in the library that take advantage of modern processors. Specifically, we’ll look at developing non-blocking, thread-safe solutions as well as thread-safe collections.
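To give a flavor of the non-blocking style, here is a minimal sketch built on java.util.concurrent.atomic.AtomicInteger, which arrived with Java 5. The class NonBlockingCounter and its capped-increment behavior are hypothetical, chosen only to show the compare-and-set retry loop:

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical example; not taken from the tutorial materials.
final class NonBlockingCounter {
    private final AtomicInteger value = new AtomicInteger();

    // Increments the counter unless it has already reached `limit`.
    // No lock is taken: read the current value, compute the new one, and
    // publish it with compareAndSet. If another thread changed the value
    // in between, compareAndSet fails and the loop retries.
    boolean incrementIfBelow(int limit) {
        while (true) {
            int current = value.get();
            if (current >= limit) {
                return false;
            }
            if (value.compareAndSet(current, current + 1)) {
                return true;
            }
        }
    }

    int get() {
        return value.get();
    }
}
```

On most current hardware compareAndSet maps to a single atomic processor instruction, so a thread that loses the race retries immediately instead of being parked on a lock.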

Process/Mechanics

Methodology:

The development of the exercises uses experiential learning (http://schuchert.wikispaces.com/ExperientialLearningNotes) as the primary vehicle for delivery of all materials. Everything we do will begin with a question or an exercise. Through a series of learning cycles (http://reviewing.co.uk/research/learning.cycles.htm), we will develop a deeper understanding of how concurrent systems can fail and how we can better write tests to find those failures.

Target Audience

At the very least, audience members should be comfortable programming in some high-level language that has support for concurrent programming. The tutorial will use Java; however, most of the material that we’ll discuss applies to other languages as well.

This is an intermediate tutorial.

Value

  • A deep understanding of how/why concurrent programs can fail
  • How to write tests that have the potential to expose concurrency-related defects
  • Tools you can use to improve your chances of finding such defects

Proposed Agenda

The mix of assignment/discussion work will depend on the format. Half-day formats will involve less coding. A full-day tutorial will include the same discussions and also reserve more time for coding.

  • A First Simple Example: (provide code)

    • Question 1: This code is broken, how?
    • Assignment/Discussion: Write a test that demonstrates it is broken.
      • How long does your test take to demonstrate the failure?
      • How reliable is it at demonstrating the failure?
      • What happens when your test runs in other environments?
      • In what ways does your test allow for execution in different configurations?
      • What suggestions/conclusions can you offer from your experience writing this test?
    • Improving reliability:
      • Discussion - how can we improve the reliability of this test?
      • What tools might we use to do this?
    • Demonstration:
      • Use ConTest to improve reliability AND to speed up fault detection.
  • Improving Performance Using Concurrent Programming (provide code)

    • Evaluate the performance of this system.
    • Predict the speedup to be gained from a rewrite using concurrent programming techniques
    • Write a concurrent version
    • Questions
      • Did you write it correctly?
      • How can you write tests to verify it is correct using multiple threads?
    • Review discussion and improve testing approach.
    • Analyze performance - where’s the contention?
  • Deadlock Detection - provide an application that can experience deadlock

    • Evaluate the system
    • Write tests to verify whether or not it can exhibit deadlock
    • (Assuming at least one person/group is successful) Discuss your approach. If not, I’ll have one prepared.
    • Lessons learned.
  • Non-blocking approaches and modern processors.