[,

]]
* is a variable of a type t, is a subset of t and *

~~ is
a set of PU numbers with |~~~~| =< |~~

~~|. If ~~~~==~~

~~ (as it is usually
the case), the "BY"-clause may be omitted. The ~~

~~. In each of these
parallel excutions, ~~* has a different value, taken from . If
|S|>|P| and |S|=<|P|*2, then the additional ", *

~~" indicates that
the loop is to be executed twice |P| times. If a variable is
declared within a phase, this implicitly creates one variable for
each parallel process.
// I suppose this is the general form of the PARDO used in the
// lecture
Parallel algorithm: An algorithm containing phases. The parts of the
algorithm outside phases are executed by the first PU available for
the algorithm.
Sequential algorithm: An algorithm, which is not parallel.
RAM, Random Access Machine: A PRAM with one PU.
Read: The process of one PU reading the value of one memory cell.
Write: The process of one PU writing the value of one memory cell.
Access: A read or a write.
Exclusive Access: The phenomenon that no two PUs access the same
memory cell in one phase.
Concurrent Access: The phenomenon that more than one PU may access
one memory cell in one phase. Not all PRAMs may allow concurrent
access.
EREW-PRAM: A PRAM which
* can only do exclusive read
* can only do exclusive write
CREW-PRAM: A PRAM which
* can do concurrent read
* can only do exclusive write
CRCW-PRAM: A PRAM which
* can do concurrent read
* can do concurrent write
Arbitrary concurrent write PRAM: A CRCW-PRAM, where it cannot be said
deterministically, which value a memory cell will contain after
different PUs have written different values to it in one phase.
Common concurrent write PRAM: A CRCW-PRAM, which only allows
concurrent writing if the same value is written.
Priority concurrent write PRAM: A CRCW-PRAM, in which PUs with a lower
number are considered more powerful. If a concurrent write is done,
the value of the PU with a lower number is put into the cell.
Machine model: A set of all PRAMs, which share the same support for
concurrent access.
n-is-a-Assumption: The assumption, that the length of an array has a
certain property (such as being even, being a power of 2 or being a
square number). This can always be achieved by prolonging the array by
some irrelevant entries: zeroes, the values EMPTY or INF, depending on
the problem.
Distinctness-assumption of a sequence S: The assumption that there are
no two elements of the sequence, which are equal. On a PRAM, this can
easily be achieved by creating a partner array for S, containing the
indices 0..S.length-1. Then, we understand each element of S as a pair
of the element of S and the corresponding element of the partner array.
Thus, all elements are distinct, without their relative order being
changed. After having treated the elements, we forget about the
partner array.
Example: S =<7,8,3,6,2,2,3,9>
S' =<0,1,2,3,4,5,6,7>
new S =<(7,0),(8,1),(3,2),(6,3),(2,4),(2,5),(3,6),(9,7)>
Parallel recursive procedure: A procedure, which executes two parallel
calls to itself. Each of the calls is assigned a memory area, which
can hold
* the parameters, if they are values
* the variables declared in the procedure
* the location of the call in the algorithm
* all of this for all possible calls, which the procedure might
execute, including all parallel calls to itself
A new notation is used:
~~```
This algorithm requires O(log(n)) time and n/log(n) PUs. Hence, it is
work optimal. The ANSV-problem can also be solved on a CRCW-PRAM in
time O(log(log(n))^2) with n/log(log(n))^2 PUs, if the problem is
split into sqrt(n) sub-problems, which are solved recursively. Then
there are log(log(n)) recursive calls, each of which runs time
O(log(log(n))). The recursion is stopped at a sub-problem size of
log(log(n)). These sub-problems are then solved sequentially. The
ANSV-problem can also be solved in time O(1) on a CRCW-PRAM with
n^2 PUs:
PROCEDURE ansvN2(A, rightDom : array[n=k^2] of Integer,
P : array[>n^2] of PU)
ASSERT all elements of A are distinct
VAR m : array[n][n] of Bit
FOR (i,j) in {(0,0),(0,1),...} PARDO BY PUs P
m[i][j] := A[i]j?1:0
FOR i in {0,...n-1} PARDO
rightDom[i] := firstbit(m[i][0..n-1],P[i*n # n])
This algorithm arranges the PUs in a matrix. Each looks at a pair of
two numbers of A and sets m[i][j] to 1, if A[i]>A[j]. Thereby, only
the upper right triangle of the matrix is filled. Then, the first
bit in a line i identifies the position of the right dominator of i.
The ANSV-problem cannot be solved faster than the Merging-problem,
because merging can be reduced to ANSV:
PROCEDURE mergeByANSV(X,Y, result : array[n] of Integer,
P : array[>2*n] of PU)
ASSERT X and Y are sorted, all elements distinct
VAR seq : array[2*n] of Integer :=
```

__
S describes a depth-first-search in the tree. A node v is entered
the first time via a link (u,v). All its child nodes are treated and
then v is left via (v,u) and entered no more. If the tree is given
by the advanced adjacancy list representation, S can be calculated
in constant time. As a result, an Euler circuit in a tree of n nodes
can be found in time O(log(n)) on a CREW-PRAM with n PUs as follows:
PROCEDURE euler(tree: AdvancedAdjacencyList,
result : array[n*2-1] of AdvancedEdge,
P : array[>n] of PU)
VAR way : array[2*n-1] of AdvancedEdge
VAR list : array[2*n-1] of Node
FOR e in Edges of tree PARDO BY PUs P, P
list[e]:=e.pair.cdr
way[e]:=e
unwringle(way,list,result,P)
This algorithm can be optimized by the sequential division strategy
to run with n/log(n) PUs.
// This algorithm makes explicit what I think is behind the abstract
// presentation in the lecture
Generic-Euler-Solution: The following algorithm template, which
runs in time O(log(n)) with n/log(n) PUs for trees with n nodes:
PROCEDURE genericEuler(tree : AdvancedAdjacencyList,
result : array[n] of Node,...,
P : array[>n] of PU)
ASSERT tree is undirected
VAR ec : array[2*n-1] of AdvancedEdge
euler(tree, ec, P)
VAR weight, prefixweight : array[2*n-1] of Integer
FOR i in {0,...2*n-3} PARDO BY PUs P, P
weight[i] := ec[i].weight
prefixSum(weight, prefixweight, P)
FOR i in {0,...2*n-3} PARDO BY PUs P, P
ec[i].weight:=prefixweight[i]
This algorithm gets a tree, finds an euler circuit, assigns weights to the
edges of the circuit, calculates the prefix sums of these weights and
replaces the edge weights by the prefix sums. Some code may be introduced
in between. The general idea is that the euler circuit in combination with
the prefix sums simulates a depth-first search with a counter. With each
edge passed, the counter is incremented by the corresponding edge weight.
As each edge is traversed exactly once, each edge maintains its specific
edge weight sum.
Tree-Root-problem: The problem of, given an undirected tree (S,R) and a
root node, determining the relation R'__

```
// Set edge weights to 1
FOR e in Edges of tree PARDO BY P, P
e.weight:=1
``````
// Create doubleec := ec + ec
VAR doubleec : array[4*n-2] of AdvancedEdge
VAR rootpos : Integer
FOR i in {0,...2*n-2} PARDO BY P, P
doubleec[i]:=ec[i]
doubleec[2*n-2+i]:=ec[i]
IF ec[i].node == root THEN rootpos := i
// Now, doubleec[rootpos # n*2-1] is an euler circuit
// starting at the root. Copy it to ec
FOR i in {0,...2*n-2} PARDO BY P, P
ec[i] := doubleec[rootpos+i]
``````
// Find for each node the latest outgoing edge
FOR e in Edges of tree PARDO BY P, P
IF e.weight > e.pair.car.weight THEN
result[e.start] := e.end
ELSE result[e.end] := e.start
This algorithm first circulates the euler circuit so that it starts at
the root. As each edge weight is 1, an edge is heavier if it is passed
later. Since a node is entered, worked on and then left and entered no
more, the edge used for leaving can be seen as part of the tree. Regarding
an edge
```__ and its inverse edge __, the one with the higher value
will be the one used for leaving. If e.g. __ has the higher weight
than __, then the successor of u is set to v. The result is an
array, which contains for each node its parent.
Example:
Edge weights: Prefix edge weights
<---1----- <---12-----
u v u v
----1----> ----99---->
=> v is the parent of u
Binary tree: A tree in which all nodes have either 0, 1 or 2 children.
One talks of the "left" and the "right" child of a node. If a node
has just one child, this is the "left child".
Complete binary tree: A binary tree, in which each node has either 2 or 0
children and where the number of nodes is a power of 2 minus 1.
Postordering of a binary tree (S,R): A function postorder:S->{1,...|S|},
such that postorder(leftchild(u))
// Add inverse edges, produce an undirected tree
inverse : Stack of AdvancedEdge
FOR e in Edges of tree PARDO BY PUs P
// Edges parent to child get weight 0
e.weight:=0
// Edges child to parent get weight 1
tree[e.end]:=(car:(start:e.end,end:e.start,
pair:e,weight:1),cdr:tree[e.end])
push(inverse,tree[e.end])
```
// Run through all inverse tree edges, give edge
// weight as postorder number to start node
FOR e in inverse PARDO BY PUs P
result[e.start] := e.weight
result[n-1]:=n
This algorithm first converts the tree to an undirected tree. While
doing so, it gives edge weight 0 to tree edges and edge weight 1 to
inverse tree edges. Then an Euler circuit is found and the prefix sums
of the edge weights are calculated. Now the prefix edge weight of an
inverse tree edge leaving v is the postorder number of v.
Example:
Edge weights: Prefix edge weights
parent parent
| ^ | ^
0 | 17 |
| 1 | 99
V | V |
child child
=> postorder(child)=99
Level-problem, BFS-depth-problem: The problem of, given a tree,
determining for each node its level. This problem can be solved
on a CREW-PRAM with n PUs in time O(1) by the genericEuler-procedure:
``````
// Add inverse edges, produce an undirected tree
inverse : Stack of AdvancedEdge
FOR e in Edges of tree PARDO BY PUs P
// Edges parent to child get weight 1
e.weight:=1
// Edges child to parent get weight -1
tree[e.end]:=(car:(start:e.end,end:e.start,
pair:e,weight:-1),cdr:tree[e.end])
push(inverse,tree[e.end])
``````
// Run through all inverse tree edges, give edge
// weight as bfs-depth to start node
FOR e in inverse PARDO BY PUs P
result[e.start] := e.weight
result[0]:=0
This algorithm increases the counter for each tree edge and decreases
the counter for each inverse tree edge. Thus, the counter reflects the
level of a node or its distance from the root.
Example:
Edge weights: Prefix edge weights
parent parent
| ^ | ^
1 | 3 |
| -1 | 2
V | V |
child child
=> bfsdepth(child)=3
Least significant 1-bit of an integer i: The index of the rightmost 1 in
the binary representation, noted ls1(i).
Inordering of a binary tree (S,R): A function inorder:S->{1,...|S|}, such
that inorder(leftchild(u))
```)
and the right child is
rightchild(v) = decimal(**)
where b is the binary representation of v. The i-th level is the set
of nodes where the last (#levels-i-1) bits are 0.
Example: 8
4 12
2 6 10 14
1 3 5 7 9 11 13 15
Left descendant of a node v in a binary tree: The left child of v or
one of its descendants. In an inorder tree, the binary representation
of the left descendants of v is of the form
binrep(leftdesc) = ****,
where b is the binary representation of v and x[j] are arbitrary bits,
which are not all 0.
Right descendant of a node v in a binary tree: The right child of v or
one of its descandants. In an inorder tree, the binary representation of
the right descendants of v is of the form
binrep(leftdesc) = ****
where b is the binary representation of v and x[j] are arbitrary bits,
which are not all 0.
Left baby of a tree node s: The leftmost baby of node s. In an inorder
tree, the left baby of a node v is
leftbaby(v) = decimal(****)
where b is the binary representation of v.
First appearance of a node v in a path p: The index i such that p[i]==v
and p[j]!=v for all j**n] of PU) : Integer
IF ec[0].start==node THEN RETURN 0
FOR i in {1,...2*n-1} PARDO BY PUs P, P
IF ec[i].start==node AND
bfsdepth[ec[i-1].start] and

```
will be
left away.
Last appearance of a node v in a path p: The index i such that p[i]==v
and p[j]!=v for all j>i. The last appearance of a node in an euler
circuit can be determined by help of the bfs-depth: i is the first
appearance of v if p[i]==v and p[i+1] has a smaller bfs-depth than
p[i]. Thus, the first appearance can be found in time O(1) on a
CREW-PRAM with n PUs, where n is the number of nodes.
Unrelated nodes of a tree: Two nodes, none of which is an ancestor of
the other.
LCA, lowest common ancestor of two tree nodes c1 and c2: The
node p=lca(c1,c2), such that p is an ancestor of c1 and c2 and that
no child of p is also an ancestor of c1 and c2. If the tree is an
inorder tree, the lca can be calculated as follows:
lca(u,v) = decimal(
```

**)
where b[1]...b[i-1] are the first equal bits of the binary
representations of inorder(u) and inorder(v).
Interval-minimum-problem: The problem of, given a sequence of
integers and two indices, determining the index of the smallest value
in the sequence between the two given indices. This problem requires
a precomputation of an inorder tree with as many leaves as
there are elements in the sequence. With each node of this tree, one
stores the prefix minima and the suffix minima of all baby descendants.
This tree can be calculated in time O(log(n)) by n PUs on a CREW-PRAM
as follows:
PROCEDURE intervalTree(s : array[n] of Integer,
result : array[n][n] of
(prefminInd,sufminInd:Integer)
P : array[>n] of PU)
// Set all leaves to the sequence elements
FOR i in {0,...n} PARDO BY PUs P
result[2*i+1][0] := i
// Walk up in the tree
VAR #levels := log2(n)
FOR lev := log2(n) TO 0 DO
FOR node in level(lev) PARDO
combinePref(result[leftchild(node)][].prefminInd,
result[rightchild(node)][].prefminInd,
result[node][].prefminInd,2^(#levels-lev-1),P[...])
combineSuf(result[leftchild(node)][].sufminInd,
result[rightchild(node)][].prefminInd,
result[node][].sufminInd,2^(#levels-lev-1),P[...])
This algorithm walks up in the tree and combines the prefix- and suffix-
minima of the two children of a node to a prefix- and suffix-minima.
The function combinePref combines the prefix minimum arrays of the two
children of by copying the left prefix array and copying the
right prefix array. But a value from the right prefix array is only copied
if it is smaller than the rightmost value of the left prefix array.
PROCEDURE combinePref(s : array[n] of Integer,
left, right:array[n] of Integer,
len : Integer,
result : array[n] of Integer,
P : array[len] of PU)
FOR i in {0,...len/2-1} PARDO BY PUs P
result[i]:=left[i]
result[len+i]:=s[right[i]]**