Distributed parallel computing is a form of computing in which a task is divided into multiple sub-tasks that are then worked on at the same time, or in parallel, by a distributed computer network. Parallel computing is commonly used for large tasks, where processing the entire task sequentially would be prohibitively time-consuming. Distributed parallel computing is widely used in modern science and other fields that require large amounts of processing power.
A distributed computer network is a group of computers, each with its own individual memory and processor, connected by a network that allows them to communicate and work together. Individual computers in the network are commonly called nodes. The computers in a distributed network can be physically separated by large distances, though this is not always the case. Computing done by a group of computers in the same location can also be called distributed, provided the individual nodes are capable of operating autonomously, each has its own memory, and the processors share information with one another by message passing rather than through a single shared memory.
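The message-passing model can be made concrete with a minimal sketch in Python. The example below, which uses the standard multiprocessing module, is purely illustrative; the doubling task stands in for real work. The second process has its own private memory, and data moves between the two processes only as explicit messages over a pipe:

    # A minimal sketch of message passing between two processes, each
    # with its own private memory, using Python's standard library.
    from multiprocessing import Process, Pipe

    def worker(conn):
        # This process cannot read the parent's memory; the only way
        # to share data is to exchange messages over the connection.
        task = conn.recv()        # receive a message from the other node
        conn.send(task * 2)       # send a result back
        conn.close()

    if __name__ == "__main__":
        parent_conn, child_conn = Pipe()
        p = Process(target=worker, args=(child_conn,))
        p.start()
        parent_conn.send(21)      # message passing, not shared memory
        print(parent_conn.recv()) # prints 42
        p.join()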
An advantage of distributed parallel computing is that a distributed network can incorporate large numbers of individual computers that are not necessarily in the same location. This means that the amount of processing power that can be brought to bear on a large task, when an entire network computes different parts of it in parallel, can be extremely high. Distributed networks can also be more reliable than a single system, because the network can continue to operate even if some of its constituent nodes fail; there is no single point of failure that can bring down the entire network at once. A distributed network is also more scalable and easier to adapt to changing needs, because adding or removing individual nodes is usually easier than altering a single large system.
Distributed parallel computing is frequently used for advanced scientific computing in fields such as astrophysics and meteorology, which involve vast, complex calculations that require great processing power to complete in a reasonable amount of time. It is also frequently used for computer graphics rendering, which requires large amounts of processing resources and is very well suited to being broken down into many parallel tasks; individual frames of computer animation might be assigned to different nodes in the network, for example.
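The rendering case amounts to a simple parallel map, since each frame can be computed independently. In the hypothetical Python sketch below, render_frame is a stand-in for a real renderer, and a pool of worker processes plays the role of the render nodes:

    # A sketch of farming animation frames out to parallel workers.
    # render_frame is a hypothetical placeholder, not a real renderer.
    from multiprocessing import Pool

    def render_frame(frame_number):
        # Placeholder for an expensive, independent rendering job.
        return f"frame_{frame_number:04d}.png"

    if __name__ == "__main__":
        with Pool(processes=4) as pool:
            # Frames are independent, so they can render in parallel.
            finished = pool.map(render_frame, range(240))
        print(f"Rendered {len(finished)} frames")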
Several types of distributed parallel computing exist. Most commonly, the term is used to refer to a network of computers working on a problem in parallel while separated from each other geographically. Individual computers communicate via the Internet to form a loose network, often called a grid, and components of the problem the grid is working on are assigned to individual nodes and processed in parallel.
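The division of labor in a grid can be illustrated with a toy coordinator. In the Python sketch below, threads stand in for computers that would really be scattered across the Internet, and squaring a number stands in for a real work unit; the names are illustrative only:

    # A toy sketch of the grid pattern: a coordinator splits a problem
    # into work units, and each node takes units and returns results.
    import queue
    import threading

    work_units = queue.Queue()
    results = queue.Queue()

    for unit in range(12):        # split the problem into 12 pieces
        work_units.put(unit)

    def node(node_id):
        while True:
            try:
                unit = work_units.get_nowait()
            except queue.Empty:
                return            # no work left for this node
            results.put((node_id, unit, unit ** 2))  # stand-in computation

    threads = [threading.Thread(target=node, args=(i,)) for i in range(3)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    while not results.empty():
        print(results.get())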
Often, individual nodes are not working solely or even primarily on the problem assigned to the grid. Instead, each node devotes some of its processing power to the grid while using the rest for other, unrelated tasks. In many cases, distributed parallel processing works by taking advantage of moments when a computer's processor would otherwise be idle. Instruction cycles that would otherwise go unused, such as periods when the computer is on but not in use or moments when it is awaiting input from its user, are instead spent on tasks assigned by the grid.
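A volunteer client built on this idea might look, in heavily simplified and hypothetical form, like the Python loop below. Both machine_is_idle and fetch_and_run_grid_task are placeholders, since real clients use platform-specific idle detection and a network protocol for fetching work:

    # A heavily simplified, hypothetical sketch of cycle scavenging.
    import time

    def machine_is_idle():
        # Hypothetical placeholder: a real client would check CPU load,
        # recent keyboard and mouse activity, battery state, and so on.
        return True

    def fetch_and_run_grid_task():
        # Hypothetical placeholder: a real client would download a work
        # unit from the project's servers and upload the result.
        return sum(range(1_000_000))

    # A real client would loop forever as a background service; ten
    # iterations keep this demonstration bounded.
    for _ in range(10):
        if machine_is_idle():
            fetch_and_run_grid_task()  # spend spare cycles on grid work
        else:
            time.sleep(60)             # back off while the user needs the CPU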
Grids can vary greatly in size and power, from small groups of interconnected computers contributing spare instruction cycles to common tasks at a business office to worldwide networks whose processing power rivals or surpasses that of dedicated supercomputers. A number of large distributed parallel processing projects rely on volunteers who donate their personal computers' spare processing power, most often to some form of computationally intensive scientific research. One popular example is Folding@home, which runs simulations of protein folding for medical research on hundreds of thousands of individual computers.
Distributed parallel processing can also refer to cluster computing. Unlike the relatively loose connection of a computer grid, a cluster is a group of machines that work together very closely in parallel, acting in some respects as a single unit. The nodes are usually in the same location and connected by a local area network, but each node of a cluster still has its own separate memory, even though the cluster is not separated geographically. The individual computers that make up the cluster are usually identical, because this makes distributing tasks among them simpler, though this is not always the case. Clusters are frequently used for scientific computing, because they can provide high processing power using relatively inexpensive off-the-shelf hardware.
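A typical cluster computation can be sketched with MPI, the message-passing standard commonly used on clusters. The Python example below assumes the mpi4py library and an MPI runtime are installed, and would be launched with something like mpiexec -n 4; each rank is a separate process with its own memory, typically one per node:

    # A minimal cluster-style sketch, assuming mpi4py is available.
    # Run with: mpiexec -n 4 python this_script.py
    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    rank = comm.Get_rank()
    size = comm.Get_size()

    if rank == 0:
        # The root node splits the data into one chunk per node.
        chunks = [list(range(i * 5, (i + 1) * 5)) for i in range(size)]
    else:
        chunks = None

    chunk = comm.scatter(chunks, root=0)   # each node gets its own chunk
    partial = sum(x * x for x in chunk)    # local work on local memory
    totals = comm.gather(partial, root=0)  # results return to the root

    if rank == 0:
        print("Total:", sum(totals))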