34.1 Latency

Definition 34.1.1

A pipeline is balanced iff there exists C>0C>0 such that the latency of every task is at most CC.

The question then is when is a pipeline balanced? Let us consider some examples of pipelines.

Example 34.1.1

The pipeline composed of t1,t2t_{1},t_{2} where |t1|=1|t_{1}|=1 and |t2|=2|t_{2}|=2 is unbalanced.

Intuitively we can understand this by drawing a diagram of this pipeline, which looks something like this

t1t_{1}t2t_{2}t1t_{1}t2t_{2}t1t_{1}t2t_{2}t1t_{1}t2t_{2}t1t_{1}t2t_{2}t1t_{1}t2t_{2}

and from the diagram it is not unreasonable to guess that the latency LL for the nnth task follows the formula

L=|t1|+|t2|+(n1)L=|t_{1}|+|t_{2}|+(n-1) (34.1)

(assuming that we start at n=1n=1 for the first execution rather than n=0n=0). This clearly implies that LL is unbounded.

Why does the latency grow like that? We can keep scheduling executions of t1t_{1}, while we have to wait for those of t2t_{2}. How about if we swap t2t_{2} and t1t_{1} (it would not be wrong to guess that the pipeline becomes balanced).

Example 34.1.2

The pipeline from 34.1.1, but where our pipeline executes t2,t1t_{2},t_{1}, rather than t1,t2t_{1},t_{2}.

We now end up with a diagram which looks a bit like this

t2t_{2}t1t_{1}t2t_{2}t1t_{1}t2t_{2}t1t_{1}

And from the pattern, we can see that we can neatly slot each new task in. Previously we could very quickly execute the first step in the pipeline, but then had to wait a long time for the slow steps from previous executions to complete. Now we can execute the first step slowly, but because it took longer than the first step, we know that the subsequent machines will definitely be free for us to use.

This lends itself to the following theorem

Theorem 34.1.1

Let P=(t1,,tn)P=(t_{1},...,t_{n}) be a pipeline. PP is balanced (i.e. the latency of PP is bounded) if |t1||t2||t3||tn||t_{1}|\geq|t_{2}|\geq|t_{3}|\geq...\geq|t_{n}|.

Proof: we argue by induction on the number of stages in the pipeline.

The base case is simple; if P=(t1)P=(t_{1}) then the latency of PP is |t1||t_{1}|.

Next, suppose that the pipeline P=(t1,t2,,tn)P=(t_{1},t_{2},...,t_{n}) is balanced (and satisfies the condition |t1||t2||tn||t_{1}|\geq|t_{2}|\geq...\geq|t_{n}|). Let |tn+1||tn||t_{n+1}|\leq|t_{n}|.

The case that we essentially have so far is a pipeline which looks a bit like this

and if we slot tn+1t_{n+1} on the end, we end up with a balanced pipeline. This is because we know that the \sayspace on the end is at most t1t_{1} and tn+1t1t_{n+1}\leq t_{1}, so it (geometrically) will \sayfit.

Author’s note: this proof clearly needs some work.

Of course, it also makes sense to ask for when a pipeline is unbalanced. Intuitively, if one of the later stages takes more time than one of the earlier ones, then we will run into problems - the pipeline will take too long.