Synopsis
Testing concurrent code is hard. Very hard. The number of things that can go wrong increases dramatically simply going from one thread to two. There is an additional increase going from two to more than two, but generally not as much as going from one to two. Knowing what is safe and not safe, while important, is not enough. Understanding the theory behind concurrent programs and traditional algorithms, such as producer-consumer, readers-writers and the dining philosophers, while important, is still not enough. You need to know how to approach testing so that you have a better chance of finding concurrency issues before you release to production.
Content
This tutorial begins with some not-so-basic basics. We first discuss two seemingly simple examples of a badly written class - in both cases we’re only looking at one line of Java code. We tear apart the code and look at the byte-code to get a better understanding of how this can break at a fundamental level. From this, we begin to derive what an atomic operation is from the perspective of the Java Virtual Machine.
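The proposal does not show the two one-line examples, so the class below is only an assumed illustration of the kind of line that looks atomic but is not. UnsafeCounter and its methods are hypothetical names. The byte-code in the comment is roughly what `javap -c` prints for such a method (constant-pool indices omitted).

```java
// Hypothetical example; not the tutorial's own code.
class UnsafeCounter {
    private int count;

    // One line of Java, but several JVM instructions. Roughly:
    //   aload_0
    //   dup
    //   getfield  count   // read the field
    //   iconst_1
    //   iadd              // add one to the value on the stack
    //   putfield  count   // write the field back
    // Another thread can run between the read and the write.
    void increment() {
        ++count;
    }

    int get() {
        return count;
    }

    public static void main(String[] args) {
        UnsafeCounter counter = new UnsafeCounter();
        for (int i = 0; i < 1000; i++) {
            counter.increment();
        }
        System.out.println(counter.get()); // single-threaded, so 1000
    }
}
```

With one thread the class behaves as expected; the problem only appears once two threads interleave between the read and the write.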
Now that we know the code is broken, the next step is to demonstrate that it can/will/does break. How can we go about doing that? This leads us into a discussion of testing threaded code and, time permitting, we develop a test to demonstrate that the code fails. If time does not permit, we’ll work with one such test. Writing such tests is difficult, and building reliable tests that demonstrate failures in a “reasonable” amount of time is a very hard problem without a little help. After attempting to increase how often we are able to detect failures, we’ll use a tool to make detection both fast and reliable.
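A hand-written test of this kind might look like the sketch below. It is an assumption about the general shape, not the tutorial's test: LostUpdateDemo, Counter, Unsafe, Safe and hammer are hypothetical names. Several threads are released together by a latch and each increments a shared counter; the unsynchronized version may lose updates, while the synchronized one should not.

```java
import java.util.concurrent.CountDownLatch;

// Hypothetical sketch of a test harness that tries to provoke lost updates.
class LostUpdateDemo {
    interface Counter {
        void increment();
        int get();
    }

    static class Unsafe implements Counter {
        private int count;
        public void increment() { ++count; }
        public int get() { return count; }
    }

    static class Safe implements Counter {
        private int count;
        public synchronized void increment() { ++count; }
        public synchronized int get() { return count; }
    }

    // Starts `threads` workers at the same moment; each increments `perThread` times.
    static int hammer(Counter counter, int threads, int perThread) {
        CountDownLatch start = new CountDownLatch(1);
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                try {
                    start.await(); // wait so that all workers begin together
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                for (int i = 0; i < perThread; i++) {
                    counter.increment();
                }
            });
            workers[t].start();
        }
        start.countDown();
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter.get();
    }

    public static void main(String[] args) {
        int expected = 4 * 50_000;
        System.out.println("expected: " + expected);
        System.out.println("safe:     " + hammer(new Safe(), 4, 50_000));
        System.out.println("unsafe:   " + hammer(new Unsafe(), 4, 50_000));
    }
}
```

The unsafe result is never above the expected total, but how far below it falls, and whether it falls short at all on a given run, depends on the machine and the scheduler. That unpredictability is why a test like this is not reliable on its own.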
Our next task will be to take a single-threaded solution that takes too long and improve its performance using concurrent programming. To do so, we’ll test our way from a working, single-threaded solution to a working, multi-threaded solution. Along the way, we’ll apply what we just learned to make sure we did not introduce any threading issues. The particular problem involves redesign based on contention. We will predict the performance improvement we should get. When we fail to meet that performance goal, we will analyze the contention, remove it and move towards our predicted performance gain.
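The proposal does not name the problem used in this exercise. As a stand-in, the sketch below sums an arbitrary CPU-bound function (all names and the workload are assumptions) in three ways: sequentially, with every worker doing its work while holding one shared lock, and with each worker keeping a local subtotal that is merged once at the end. All three must return the same total; only the amount of contention differs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch: same result, different amounts of lock contention.
class ContentionDemo {
    // Stand-in workload (an assumption): count the divisors of n.
    static long work(int n) {
        long divisors = 0;
        for (int d = 1; d <= n; d++) {
            if (n % d == 0) divisors++;
        }
        return divisors;
    }

    static long sequential(int limit) {
        long total = 0;
        for (int n = 1; n <= limit; n++) total += work(n);
        return total;
    }

    // contended == true: all work happens inside one shared lock.
    // contended == false: each worker sums locally and returns its subtotal.
    static long parallel(int limit, int threads, boolean contended) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        final Object lock = new Object();
        final long[] shared = new long[1];
        try {
            List<Future<Long>> parts = new ArrayList<>();
            for (int t = 0; t < threads; t++) {
                final int first = t + 1;
                parts.add(pool.submit(() -> {
                    long local = 0;
                    for (int n = first; n <= limit; n += threads) {
                        if (contended) {
                            synchronized (lock) {
                                shared[0] += work(n);
                            }
                        } else {
                            local += work(n);
                        }
                    }
                    return local;
                }));
            }
            long total = 0;
            for (Future<Long> part : parts) total += part.get();
            synchronized (lock) {
                return total + shared[0];
            }
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        int limit = 20_000;
        for (int mode = 0; mode < 3; mode++) {
            long begin = System.nanoTime();
            long total = mode == 0 ? sequential(limit) : parallel(limit, 4, mode == 1);
            long millis = (System.nanoTime() - begin) / 1_000_000;
            System.out.println("mode " + mode + ": total=" + total + " in " + millis + " ms");
        }
    }
}
```

Timings vary by machine, so none are quoted here. The point of the design is that the contended version serializes the work behind the lock, which is the kind of bottleneck the analysis step is meant to uncover.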
We conclude with a traditional deadlock problem. We’ll review hand-coded tests written to reliably demonstrate deadlock situations. We’ll then look at the tool we used earlier and see how it could help us detect deadlock situations.
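One way a hand-coded test can make a deadlock reliable is sketched below, under assumptions: DeadlockDemo and its methods are hypothetical names, and the detection step uses the JDK's own ThreadMXBean rather than the unnamed tool from the tutorial. Two threads take two locks in opposite order, and a latch ensures each holds its first lock before either tries for the second.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;

// Hypothetical sketch: force a lock-ordering deadlock, then detect it.
class DeadlockDemo {
    static Thread locker(Object first, Object second, CountDownLatch bothHoldFirst) {
        Thread thread = new Thread(() -> {
            synchronized (first) {
                bothHoldFirst.countDown();
                try {
                    bothHoldFirst.await(); // wait until the other thread holds its first lock
                } catch (InterruptedException e) {
                    return;
                }
                synchronized (second) {
                    // never reached: the other thread holds `second`
                }
            }
        });
        thread.setDaemon(true); // lets the JVM exit despite the stuck threads
        return thread;
    }

    static boolean contains(long[] ids, long id) {
        for (long candidate : ids) {
            if (candidate == id) return true;
        }
        return false;
    }

    // Returns true if both threads are reported as deadlocked before the timeout.
    static boolean provokeAndDetect(long timeoutMillis) {
        Object a = new Object();
        Object b = new Object();
        CountDownLatch bothHoldFirst = new CountDownLatch(2);
        Thread one = locker(a, b, bothHoldFirst);
        Thread two = locker(b, a, bothHoldFirst);
        one.start();
        two.start();

        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (System.currentTimeMillis() < deadline) {
            long[] ids = threads.findMonitorDeadlockedThreads(); // null when none found
            if (ids != null && contains(ids, one.getId()) && contains(ids, two.getId())) {
                return true;
            }
            try {
                Thread.sleep(10);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println("deadlock detected: " + provokeAndDetect(5_000));
    }
}
```

The latch removes the timing luck that usually makes deadlocks hard to reproduce. Without it, one thread could acquire both locks before the other starts, and the test would pass by accident.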
If time permits, we’ll spend some time reviewing some of the new features introduced in Java 5, not because we want to learn the library but because there are designs in the library that take advantage of modern processors. Specifically, we’ll look at developing non-blocking, thread-safe solutions as well as thread-safe collections.
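As a minimal sketch of a non-blocking approach, the counter below uses the compare-and-set operation on java.util.concurrent.atomic.AtomicInteger, which arrived with Java 5. NonBlockingCounter and hammer are hypothetical names. The retry loop is written out to show the mechanism; in practice AtomicInteger.incrementAndGet() does the same job.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of a lock-free counter built on compare-and-set.
class NonBlockingCounter {
    private final AtomicInteger value = new AtomicInteger();

    int increment() {
        while (true) {
            int current = value.get();
            int next = current + 1;
            // Succeeds only if no other thread changed the value since we read it;
            // otherwise loop and try again with the fresh value.
            if (value.compareAndSet(current, next)) {
                return next;
            }
        }
    }

    int get() {
        return value.get();
    }

    // Runs `threads` workers, each incrementing `perThread` times, and returns the total.
    static int hammer(int threads, int perThread) {
        NonBlockingCounter counter = new NonBlockingCounter();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    counter.increment();
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(hammer(4, 50_000)); // no lost updates, so 200000
    }
}
```

No thread ever blocks while holding a lock here. A thread that loses the race simply retries, which is why this style maps well onto processors that offer a hardware compare-and-swap instruction.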
Methodology
The development of the exercises uses experiential learning (http://schuchert.wikispaces.com/ExperientialLearningNotes) as the primary vehicle for delivery of all materials. Everything we do will begin with a question or an exercise. Through a series of learning cycles (http://reviewing.co.uk/research/learning.cycles.htm), we will develop a deeper understanding of how concurrent systems can fail and how we can better write tests to find those failures.
Target Audience
At the very least, audience members should be comfortable programming in some high-level language that has support for concurrent programming. The tutorial will use Java; however, most of the material that we’ll discuss applies to other languages as well.
This is an intermediate tutorial.
Value
Proposed Agenda
The mix of assignment/discussion work will depend on the format. Half-day formats will involve less coding. A full-day tutorial will include the same discussions and also reserve more time for coding.
A First Simple Example (provide code)
Improving Performance Using Concurrent Programming (provide code)
Deadlock Detection (provide an application that can experience deadlock)
Non-blocking Approaches and Modern Processors