I am currently reading about the P-RAM (Parallel Random Access Machine), which is kind of the Turing machine for parallel algorithms.
This machine consists of a bunch of CPUs acting on a shared memory. It is very similar to current SIMD and multiprocessing implementations, apart from the memory model. Memory access on a P-RAM comes in different flavors: reads and writes can each be concurrent or exclusive. Usually, when you talk about the P-RAM, you mean the CREW P-RAM, which allows concurrent reads but only exclusive writes. This sounds rather natural: reads (from RAM, or rather from a cache) can happen concurrently on most real machines, but writes should be exclusive, since otherwise it is not clear which value will actually be stored. Real hardware usually does not enforce the exclusive-write guarantee, so a bit of trickery is needed to simulate it; there seem to be publications on this topic.
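To see why unrestricted concurrent writes are problematic, here is a tiny sketch (just CPython threads standing in for processors, nothing PRAM-specific; the setup is my own illustration) in which several writers hit the same memory cell and the surviving value depends entirely on scheduling:

import threading

cell = [0]  # a single shared memory cell

def write(value):
    # each "processor" stores its own value in the same cell
    cell[0] = value

threads = [threading.Thread(target=write, args=(v,)) for v in range(1, 5)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Which of the values 1..4 survives depends on thread scheduling;
# this is exactly the ambiguity that exclusive writes rule out.
print(cell[0])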
Much nicer would be a real CRCW P-RAM, which in the literature is also called the W-RAM. This is a very general parallel machine, and it allows for some very efficient algorithms. (Concurrent writes need a rule for resolving conflicts; in the program below all concurrent writers store the same value, so even the weakest, so-called common, CRCW variant suffices.) For example, take a look at the following program, which computes the minimum value of an array:
import random

minimum = None
n = 10
M = [0] * n  # M[j] == 1 will mean "C[j] is not the minimum"
C = [int(1000 * random.random()) for _ in range(n)]
print(C)

# Phase 1: mark every element that some other element beats.
for i in range(n):
    for j in range(n):
        if C[i] < C[j]:
            M[j] = 1

# Phase 2: if the minimum occurs several times, keep only its first occurrence.
for i in range(n):
    for j in range(n):
        if M[i] == 0 and i < j:
            M[j] = 1

# Phase 3: the single element left unmarked is the minimum.
for i in range(n):
    if M[i] == 0:
        minimum = C[i]
print(minimum)
On a sequential machine, this algorithm obviously runs in O(n^2) time and O(n) space, whereas the standard sequential array-minimum scan runs in O(n) time with O(1) space. However, if you had a CRCW P-RAM, you could turn all of the for-loops into parallel-for loops, similar to what you can do in OpenMP, with one processor per pair (i, j). Assuming that you have O(n^2) processors available, which is not that unthinkable nowadays, this reduces the whole time complexity to O(1), with O(n) space complexity. Awesome, isn’t it?
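For illustration, here is a rough sketch of how that parallel version could look in Python, with a thread pool standing in for the O(n^2) PRAM processors (an assumption for demonstration only: CPython threads give no real speedup here, and the pool shutdown plays the role of the barrier between the three constant-time PRAM steps):

import random
from concurrent.futures import ThreadPoolExecutor

n = 10
C = [int(1000 * random.random()) for _ in range(n)]
M = [0] * n
minimum = None

def compare(i, j):
    # Processor (i, j): a concurrent write, but every writer stores the
    # same value 1, so even the "common" CRCW variant handles it.
    if C[i] < C[j]:
        M[j] = 1

def tie_break(i, j):
    # Keep only the smallest index among the (possibly tied) minima.
    if M[i] == 0 and i < j:
        M[j] = 1

def collect(i):
    # Exactly one M[i] is still 0 here, so this write is even exclusive.
    global minimum
    if M[i] == 0:
        minimum = C[i]

# Each 'with' block waits for all of its tasks on exit, which acts as
# the barrier between the three O(1) PRAM steps.
with ThreadPoolExecutor() as pool:
    for i in range(n):
        for j in range(n):
            pool.submit(compare, i, j)
with ThreadPoolExecutor() as pool:
    for i in range(n):
        for j in range(n):
            pool.submit(tie_break, i, j)
with ThreadPoolExecutor() as pool:
    for i in range(n):
        pool.submit(collect, i)

print(C, minimum)

Of course, on a real machine each pool.submit costs far more than the comparison it wraps; the sketch only mirrors the structure of the PRAM algorithm, not its performance.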