
Grid computing

Article Id: WHEBN0000049373

Title: Grid computing  
Author: World Heritage Encyclopedia
Language: English
Subject: Quasi-opportunistic supercomputing, SLinCA@Home, Supercomputer architecture, NorduGrid, Computer cluster
Collection: Grid Computing
Publisher: World Heritage Encyclopedia

Grid computing

Grid computing is the collection of computer resources from multiple locations to reach a common goal. The grid can be thought of as a distributed system with non-interactive workloads that involve a large number of files. Grid computing is distinguished from conventional high performance computing systems such as cluster computing in that grid computers have each node set to perform a different task/application.[1] Grid computers also tend to be more heterogeneous and geographically dispersed (thus not physically coupled) than cluster computers.[2] Although a single grid can be dedicated to a particular application, commonly a grid is used for a variety of purposes. Grids are often constructed with general-purpose grid middleware software libraries.

Grid size can vary considerably. Grids are a form of distributed computing whereby a “super virtual computer” is composed of many networked, loosely coupled computers acting together to perform large tasks. For certain applications, “distributed” or “grid” computing can be seen as a special type of parallel computing that relies on complete computers (with onboard CPUs, storage, power supplies, network interfaces, etc.) connected to a network (private, public or the Internet) by a conventional network interface, such as Ethernet. This is in contrast to the traditional notion of a supercomputer, which has many processors connected by a local high-speed computer bus.


  • 1 Overview
  • 2 Comparison of grids and conventional supercomputers
  • 3 Design considerations and variations
  • 4 Market segmentation of the grid computing market
    • 4.1 The provider side
    • 4.2 The user side
  • 5 CPU scavenging
  • 6 History
  • 7 Fastest virtual supercomputers
  • 8 Projects and applications
    • 8.1 Definitions
  • 9 See also
    • 9.1 Related concepts
    • 9.2 Alliances and organizations
    • 9.3 Production grids
    • 9.4 International projects
    • 9.5 National projects
    • 9.6 Standards and APIs
    • 9.7 Software implementations and middleware
    • 9.8 Monitoring frameworks
  • 10 See also
  • 11 References
    • 11.1 Bibliography
  • 12 External links


Overview

Grid computing combines computers from multiple administrative domains to reach a common goal[3] or to solve a single task, and such a grid may then disappear just as quickly as it was formed.

One of the main strategies of grid computing is to use middleware to divide and apportion pieces of a program among several computers, sometimes up to many thousands. Grid computing involves computation in a distributed fashion, which may also involve the aggregation of large-scale clusters.
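As a rough illustration of the divide-and-apportion strategy described above, the following Python sketch cuts a job into fixed-size work units and deals them out to nodes in round-robin order. The function names (`split_into_units`, `apportion`) and the node names are hypothetical; they are not the API of any real grid middleware, and real schedulers typically weigh node speed and availability rather than using plain round-robin.

```python
from collections import defaultdict
from itertools import cycle


def split_into_units(items, unit_size):
    """Cut a list of input items into consecutive work units of at most unit_size items."""
    if unit_size <= 0:
        raise ValueError("unit_size must be positive")
    return [items[i:i + unit_size] for i in range(0, len(items), unit_size)]


def apportion(units, nodes):
    """Assign work units to nodes in round-robin order (a deliberately simple policy)."""
    if not nodes:
        raise ValueError("at least one node is required")
    plan = defaultdict(list)
    for unit, node in zip(units, cycle(nodes)):
        plan[node].append(unit)
    return dict(plan)


if __name__ == "__main__":
    # Hypothetical job: ten input items, three items per work unit, two nodes.
    units = split_into_units(list(range(10)), unit_size=3)
    plan = apportion(units, ["node-a", "node-b"])
    for node, assigned in plan.items():
        print(node, assigned)
```

With ten items and a unit size of three, the last work unit is shorter than the others; a scheduler built along these lines would have to tolerate such uneven units.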

The size of a grid may vary from small—confined to a network of computer workstations within a corporation, for example—to large, public collaborations across many companies and networks. "The notion of a confined grid may also be known as an intra-nodes cooperation whilst the notion of a larger, wider grid may thus refer to an inter-nodes cooperation".[4]

Grids are a form of distributed computing whereby a “super virtual computer” is composed of many networked, loosely coupled computers acting together to perform very large tasks. This technology has been applied to computationally intensive scientific, mathematical, and academic problems through volunteer computing, and it is used in commercial enterprises for such diverse applications as drug discovery, economic forecasting, seismic analysis, and back office data processing in support of e-commerce and Web services.

Coordinating applications on Grids can be a complex task, especially when coordinating the flow of information across distributed computing resources. Grid workflow systems have been developed as a specialized form of a workflow management system designed specifically to compose and execute a series of computational or data manipulation steps, or a workflow, in the Grid context.

Comparison of grids and conventional supercomputers

“Distributed” or “grid” computing in general is a special type of parallel computing that relies on complete computers (with onboard CPUs, storage, power supplies, network interfaces, etc.) connected to a network (private, public or the Internet) by a conventional network interface. Because each node can be bought as commodity hardware, this approach benefits from the economies of scale of producing commodity hardware, compared to the lower efficiency of designing and constructing a small number of custom supercomputers. The primary performance disadvantage is that the various processors and local storage areas do not have high-speed connections. This arrangement is thus well-suited to applications in which multiple parallel computations can take place independently, without the need to communicate intermediate results between processors.[5] The high-end scalability of geographically dispersed grids is generally favorable, due to the low need for connectivity between nodes relative to the capacity of the public Internet.

There are also some differences in programming and deployment. It can be costly and difficult to write programs that can run in the environment of a supercomputer, which may have a custom operating system, or require the program to address concurrency issues. If a problem can be adequately parallelized, a “thin” layer of “grid” infrastructure can allow conventional, standalone programs, given a different part of the same problem, to run on multiple machines. This makes it possible to write and debug on a single conventional machine, and eliminates complications due to multiple instances of the same program running in the same shared memory and storage space at the same time.
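The idea of handing a conventional, standalone program a different part of the same problem can be sketched as follows. This is only an illustrative Python example: prime counting is a stand-in workload chosen for this sketch, the names `count_primes` and `partition` are hypothetical, and the sub-ranges are processed in a local loop rather than being shipped to separate machines.

```python
def count_primes(start, stop):
    """Standalone task: count the primes in the half-open range [start, stop)."""
    def is_prime(n):
        if n < 2:
            return False
        i = 2
        while i * i <= n:
            if n % i == 0:
                return False
            i += 1
        return True

    return sum(1 for n in range(start, stop) if is_prime(n))


def partition(start, stop, parts):
    """Split [start, stop) into `parts` contiguous sub-ranges of near-equal size."""
    if parts <= 0:
        raise ValueError("parts must be positive")
    step, extra = divmod(stop - start, parts)
    bounds, lo = [], start
    for k in range(parts):
        hi = lo + step + (1 if k < extra else 0)
        bounds.append((lo, hi))
        lo = hi
    return bounds


if __name__ == "__main__":
    # Each sub-range could be given to a different machine; no piece needs
    # intermediate results from any other piece.
    pieces = partition(0, 100, parts=4)
    partial_counts = [count_primes(lo, hi) for lo, hi in pieces]
    print(pieces)
    print(partial_counts, "total:", sum(partial_counts))
```

Because `count_primes` keeps no shared state, it can be written and debugged on one machine over the full range, and the partial counts from the pieces add up to the same total.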

Design considerations and variations

One feature of distributed grids is that they can be formed from computing resources belonging to multiple individuals or organizations (known as multiple administrative domains). This can facilitate commercial transactions, as in utility computing, or make it easier to assemble volunteer computing networks.

One disadvantage of this feature is that the computers which are actually performing the calculations might not be entirely trustworthy. The designers of the system must thus introduce measures to prevent malfunctions or malicious participants from producing false, misleading, or erroneous results, and from using the system as an attack vector. This often involves assigning work randomly to different nodes (presumably with different owners) and checking that at least two different nodes report the same answer for a given work unit. Discrepancies would identify malfunctioning and malicious nodes. However, due to the lack of central control over the hardware, there is no way to guarantee that nodes will not drop out of the network at random times. Some nodes (like laptops or dial-up Internet customers) may also be available for computation but not network communications for unpredictable periods. These variations can be accommodated by assigning large work units (thus reducing the need for continuous network connectivity) and reassigning work units when a given node fails to report its results in the expected time.
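The redundancy check described above, in which a result is trusted only when at least two different nodes agree on it, might be sketched in Python as follows. The function name `validate_work_unit`, the `quorum` parameter, and the node identifiers are assumptions made for this example, not taken from any particular grid system.

```python
from collections import Counter


def validate_work_unit(reports, quorum=2):
    """Accept a result once at least `quorum` distinct nodes agree on it.

    `reports` maps a node id to the result that node reported for one work unit.
    Returns (accepted_result, suspect_nodes). accepted_result is None while no
    result has reached the quorum, in which case no node is flagged yet.
    If two different results both reach the quorum, the one reported first wins;
    a real system would need a stricter tie-breaking rule.
    """
    tally = Counter(reports.values())
    if not tally:
        return None, []
    result, votes = tally.most_common(1)[0]
    if votes < quorum:
        return None, []
    suspects = sorted(node for node, value in reports.items() if value != result)
    return result, suspects


if __name__ == "__main__":
    # Two nodes agree, one disagrees: the majority answer is accepted
    # and the dissenting node is flagged for closer inspection.
    print(validate_work_unit({"node-a": 42, "node-b": 42, "node-c": 41}))
    # Only one report per answer so far: nothing can be accepted yet.
    print(validate_work_unit({"node-a": 1, "node-b": 2}))
```

A flagged node is not necessarily malicious; it may simply be faulty, so a scheduler would probably track repeated disagreements before excluding it.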

The impacts of trust and availability on performance and development difficulty can influence the choice of whether to deploy onto a dedicated cluster, to idle machines internal to the developing organization, or to an open external network of volunteers or contractors. In many cases, the participating nodes must trust the central system not to abuse the access that is being granted, by interfering with the operation of other programs, mangling stored information, transmitting private data, or creating new security holes. Other systems employ measures to reduce the amount of trust “client” nodes must place in the central system such as placing applications in virtual machines.

Public systems or those crossing administrative domains (including different departments in the same organization) often result in the need to run on heterogeneous systems, using different operating systems and hardware architectures. With many languages, there is a trade-off between investment in software development and the number of platforms that can be supported (and thus the size of the resulting network). Cross-platform languages can reduce the need to make this trade-off, though potentially at the expense of high performance on any given node (due to run-time interpretation or lack of optimization for the particular platform). Diverse scientific and commercial projects exist to harness a particular associated grid or to set up new grids. BOINC is a common one for various academic projects seeking public volunteers; more are listed at the end of the article.

In fact, the middleware can be seen as a layer between the hardware and the software. On top of the middleware, a number of technical areas have to be considered, and these may or may not be middleware independent. Example areas include Virtual organization management, License Management, Portals and Data Management. These technical areas may be taken care of in a commercial solution, though the cutting edge of each area is often found within specific research projects examining the field.

Market segmentation of the grid computing market

For the segmentation of the grid computing market, two perspectives need to be considered: the provider side and the user side.

The provider side

The overall grid market comprises several specific markets. These are the grid middleware market, the market for grid-enabled applications, the utility computing market, and the software-as-a-service (SaaS) market.

Major grid middleware packages include the Globus Toolkit, gLite, and UNICORE.

Utility computing refers to the provision of grid computing and applications as a service, either as an open grid utility or as a hosting solution for one organization or a virtual organization (VO). Major players in the utility computing market are Sun Microsystems, IBM, and HP.

Grid-enabled applications are specific software applications that can utilize grid infrastructure. This is made possible by the use of grid middleware, as pointed out above.

Software as a service (SaaS) is “software that is owned, delivered and managed remotely by one or more providers.” (Gartner 2007) Additionally, SaaS applications are based on a single set of common code and data definitions. They are consumed in a one-to-many model, and SaaS uses a Pay As You Go (PAYG) model or a subscription model that is based on usage. Providers of SaaS do not necessarily own the computing resources required to run their SaaS. SaaS providers may therefore draw upon the utility computing market, which provides computing resources for them.

The user side

For companies on the demand or user side of the grid computing market, the different segments have significant implications for their IT deployment strategy. The IT deployment strategy as well as the type of IT investments made are relevant aspects for potential grid users and play an important role for grid adoption.

CPU scavenging

CPU-scavenging, cycle-scavenging, or shared computing creates a “grid” from the unused resources in a network of participants (whether worldwide or internal to an organization). Typically this technique uses desktop computer instruction cycles that would otherwise be wasted at night, during lunch, or even in the scattered seconds throughout the day when the computer is waiting for user input or slow devices. In practice, participating computers also donate some supporting amount of disk storage space, RAM, and network bandwidth, in addition to raw CPU power.

Many volunteer computing projects, such as BOINC, use the CPU scavenging model. Since nodes are likely to go "offline" from time to time, as their owners use their resources for their primary purpose, this model must be designed to handle such contingencies.
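One way to handle nodes that go offline is to give every assigned work unit a deadline and hand overdue units to another node. The Python sketch below illustrates that bookkeeping only; the class `WorkUnitTracker`, its methods, and the explicit `now` argument (used instead of a real clock so the behavior is easy to follow) are hypothetical and are not drawn from BOINC or any other project.

```python
class WorkUnitTracker:
    """Track which node holds each work unit and when its result is due."""

    def __init__(self, timeout):
        self.timeout = timeout
        self.assignments = {}  # unit id -> (node id, deadline)
        self.completed = {}    # unit id -> result

    def assign(self, unit_id, node_id, now):
        """Give a unit to a node; the result is due `timeout` time units from now."""
        self.assignments[unit_id] = (node_id, now + self.timeout)

    def report(self, unit_id, node_id, result):
        """Record a result; ignore reports from nodes that no longer hold the unit."""
        holder = self.assignments.get(unit_id)
        if holder is None or holder[0] != node_id:
            return False
        del self.assignments[unit_id]
        self.completed[unit_id] = result
        return True

    def expired(self, now):
        """Return the ids of units whose deadline has passed without a report."""
        return sorted(u for u, (_, deadline) in self.assignments.items() if now > deadline)

    def reassign_expired(self, now, pick_node):
        """Hand every overdue unit to a new node chosen by `pick_node(unit_id)`."""
        moved = []
        for unit_id in self.expired(now):
            self.assign(unit_id, pick_node(unit_id), now)
            moved.append(unit_id)
        return moved


if __name__ == "__main__":
    tracker = WorkUnitTracker(timeout=10)
    tracker.assign("unit-1", "laptop", now=0)
    tracker.assign("unit-2", "desktop", now=0)
    tracker.report("unit-2", "desktop", result=7)
    # The laptop went offline and never reported; at time 11 its unit is overdue.
    print(tracker.reassign_expired(now=11, pick_node=lambda unit_id: "server"))
```

In this sketch a late report from the original node is simply discarded after reassignment; a real system might prefer to accept it and cancel the duplicate work.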


History

The term grid computing originated in the early 1990s as a metaphor for making computer power as easy to access as an electric power grid. The power grid metaphor for accessible computing quickly became canonical when Ian Foster and Carl Kesselman published their seminal work, "The Grid: Blueprint for a new computing infrastructure" (1999).

CPU scavenging and volunteer computing were popularized beginning in 1997 by distributed.net and later in 1999 by SETI@home to harness the power of networked PCs worldwide, in order to solve CPU-intensive research problems.

The ideas of the grid (including those from distributed computing, object-oriented programming, and Web services) were brought together by Ian Foster, Carl Kesselman, and Steve Tuecke, widely regarded as the "fathers of the grid".[6] They led the effort to create the Globus Toolkit incorporating not just computation management but also storage management, security provisioning, data movement, monitoring, and a toolkit for developing additional services based on the same infrastructure, including agreement negotiation, notification mechanisms, trigger services, and information aggregation. While the Globus Toolkit remains the de facto standard for building grid solutions, a number of other tools have been built that answer some subset of services needed to create an enterprise or global grid.

In 2007 the term cloud computing came into popularity; it is conceptually similar to the canonical Foster definition of grid computing (in terms of computing resources being consumed as electricity is from the power grid). Indeed, grid computing is often (but not always) associated with the delivery of cloud computing systems, as exemplified by the AppLogic system from 3tera.

Fastest virtual supercomputers

  • As of June 2014, Bitcoin Network – 1166652 PFLOPS.[7]
  • As of April 2013, Folding@home – 11.4 x86-equivalent (5.8 "native") PFLOPS.[8]
  • As of March 2013, BOINC – processing on average 9.2 PFLOPS.[9]
  • As of April 2010, MilkyWay@Home computes at over 1.6 PFLOPS, with a large amount of this work coming from GPUs.[10]
  • As of April 2010, SETI@Home averages more than 730 TFLOPS.[11]
  • As of April 2010, Einstein@Home is crunching more than 210 TFLOPS.[12]
  • As of June 2011, GIMPS is sustaining 61 TFLOPS.[13]

Projects and applications

Grid computing offers a way to provide computing as a utility for commercial and noncommercial clients, with those clients paying only for what they use, as with electricity or water.

Grid computing is being applied by the National Science Foundation's National Technology Grid, NASA's Information Power Grid, Pratt & Whitney, Bristol-Myers Squibb Co., and American Express.

One cycle-scavenging network is SETI@home, which was using more than 3 million computers to achieve 23.37 sustained teraflops (979 lifetime teraflops) as of September 2001.[14]

As of August 2009, Folding@home achieved more than 4 petaflops on over 350,000 machines.
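To put the two figures above in perspective, the aggregate throughput can be divided by the number of participating machines. The short Python sketch below does this arithmetic, treating the quoted "more than" values as if they were exact, so the results are only rough averages; the helper name `per_machine_flops` is hypothetical.

```python
# Decimal multipliers for the FLOPS units used in this article.
UNITS = {"MFLOPS": 1e6, "GFLOPS": 1e9, "TFLOPS": 1e12, "PFLOPS": 1e15}


def per_machine_flops(total, unit, machines):
    """Average floating-point operations per second contributed by one machine."""
    if machines <= 0:
        raise ValueError("machines must be positive")
    return total * UNITS[unit] / machines


if __name__ == "__main__":
    # SETI@home, September 2001: 23.37 sustained teraflops on about 3 million computers.
    seti = per_machine_flops(23.37, "TFLOPS", 3_000_000)
    print(f"SETI@home: {seti / UNITS['MFLOPS']:.2f} MFLOPS per machine")      # 7.79

    # Folding@home, August 2009: about 4 petaflops on about 350,000 machines.
    folding = per_machine_flops(4, "PFLOPS", 350_000)
    print(f"Folding@home: {folding / UNITS['GFLOPS']:.2f} GFLOPS per machine")  # 11.43
```

The two averages are not directly comparable: they are eight years apart, and the SETI@home machine count may include computers that were not continuously active.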

The European Union funded projects through the framework programmes of the European Commission. BEinGRID (Business Experiments in Grid) was a research project funded by the European Commission[15] as an Integrated Project under the Sixth Framework Programme (FP6) sponsorship program. Started on June 1, 2006, the project ran 42 months, until November 2009. The project was coordinated by Atos Origin. According to the project fact sheet, its mission was “to establish effective routes to foster the adoption of grid computing across the EU and to stimulate research into innovative business models using Grid technologies”. To extract best practice and common themes from the experimental implementations, two groups of consultants, one technical and one business, analyzed a series of pilots. The project was significant not only for its long duration but also for its budget, which at 24.8 million euros was the largest of any FP6 integrated project. Of this, 15.7 million was provided by the European Commission and the remainder by its 98 contributing partner companies. Since the end of the project, the results of BEinGRID have been taken up and carried forward by

The Enabling Grids for E-sciencE project, which was based in the European Union and included sites in Asia and the United States, was a follow-up project to the European DataGrid (EDG) and evolved into the European Grid Infrastructure. This, along with the LHC Computing Grid[16] (LCG), was developed to support experiments using the CERN Large Hadron Collider. A list of active sites participating within LCG can be found online,[17] as can real-time monitoring of the EGEE infrastructure.[18] The relevant software and documentation is also publicly accessible.[19] There is speculation that dedicated fiber optic links, such as those installed by CERN to address the LCG's data-intensive needs, may one day be available to home users, thereby providing internet services at speeds up to 10,000 times faster than a traditional broadband connection.[20] The European Grid Infrastructure has also been used for other research activities and experiments such as the simulation of oncological clinical trials.[21]

The distributed.net project was started in 1997. The NASA Advanced Supercomputing facility (NAS) ran genetic algorithms using the Condor cycle scavenger running on about 350 Sun Microsystems and SGI workstations.

In 2001, United Devices operated the United Devices Cancer Research Project based on its Grid MP product, which cycle-scavenges on volunteer PCs connected to the Internet. The project ran on about 3.1 million machines before its close in 2007.[22]

As of 2011, over 6.2 million machines running the open-source Berkeley Open Infrastructure for Network Computing (BOINC) platform are members of the World Community Grid, which tops the processing power of the current fastest supercomputer system (China's Tianhe-I).[23]


Definitions

Today there are many definitions of grid computing:

  • Plaszczak/Wellner[24] define grid technology as "the technology that enables resource virtualization, on-demand provisioning, and service (resource) sharing between organizations."
  • IBM defines grid computing as “the ability, using a set of open standards and protocols, to gain access to applications and data, processing power, storage capacity and a vast array of other computing resources over the Internet. A grid is a type of parallel and distributed system that enables the sharing, selection, and aggregation of resources distributed across ‘multiple’ administrative domains based on their (resources) availability, capacity, performance, cost and users' quality-of-service requirements”.[25]
  • An earlier example of the notion of computing as utility was in 1965 by MIT's Fernando Corbató. Corbató and the other designers of the Multics operating system envisioned a computer facility operating “like a power company or water company”.[26]
  • Buyya/Venugopal[27] define grid as "a type of parallel and distributed system that enables the sharing, selection, and aggregation of geographically distributed autonomous resources dynamically at runtime depending on their availability, capability, performance, cost, and users' quality-of-service requirements".
  • CERN, one of the largest users of grid technology, talks of The Grid: “a service for sharing computer power and data storage capacity over the Internet.”[28]

See also

Related concepts

Alliances and organizations

Production grids

International projects

| Name | Region | Start | End |
|---|---|---|---|
| European Grid Infrastructure (EGI) | Europe | May 2010 | Dec 2014 |
| Open Middleware Infrastructure Institute Europe (OMII-Europe) | Europe | May 2006 | May 2008 |
| Enabling Grids for E-sciencE (EGEE, EGEE II and EGEE III) | Europe | March 2004 | April 2010 |
| Grid enabled Remote Instrumentation with Distributed Control and Computation (GridCC) | Europe | September 2005 | September 2008 |
| European Middleware Initiative (EMI) | Europe | May 2010 | active |
| KnowARC | Europe | June 2006 | November 2009 |
| Nordic Data Grid Facility | Scandinavia and Finland | June 2006 | December 2012 |
| World Community Grid | Global | November 2004 | active |
| XtreemOS | Europe | June 2006 | (May 2010) ext. to September 2010 |
| OurGrid | Brazil | December 2004 | active |

National projects

Standards and APIs

Software implementations and middleware

Monitoring frameworks

See also


References

  1. Grid vs cluster computing
  2. What is grid computing? - Gridcafe. Retrieved 2013-09-18.
  3. "What is the Grid? A Three Point Checklist".
  4. "Pervasive and Artificial Intelligence Group :: publications [Pervasive and Artificial Intelligence Research Group]". May 18, 2009. Retrieved July 29, 2010.
  5. Computational problems - Gridcafe. Retrieved 2013-09-18.
  6. "Father of the Grid".
  7. (15 June 2014). "Bitcoin Network Statistics". Bitcoin. Staffordshire University. Retrieved June 15, 2014.
  8. Pande lab. "Client Statistics by OS". Folding@home. Stanford University. Retrieved April 23, 2013.
  9. "BOINCstats – BOINC combined credit overview". Retrieved March 3, 2013.
  10. "MilkyWay@Home Credit overview". BOINC. Retrieved April 21, 2010.
  11. "SETI@Home Credit overview". BOINC. Retrieved April 21, 2010.
  12. "Einstein@Home Credit overview". BOINC. Retrieved April 21, 2010.
  13. "Internet PrimeNet Server Distributed Computing Technology for the Great Internet Mersenne Prime Search". GIMPS. Retrieved June 6, 2011.
  14. [1]
  15. Home page of BEinGRID
  16. Large Hadron Collider Computing Grid official homepage
  17. "GStat 2.0 – Summary View – GRID EGEE". Retrieved July 29, 2010.
  18. "Real Time Monitor". Retrieved July 29, 2010.
  19. "LCG – Deployment". Retrieved July 29, 2010.
  20. "Coming soon: superfast internet"
  21. Athanaileas, Theodoros, et al. (2011). "Exploiting grid technologies for the simulation of clinical trials: the paradigm of in silico radiation oncology". SIMULATION: Transactions of The Society for Modeling and Simulation International (Sage Publications) 87 (10): 893–910.
  22. [2]
  23. BOINCstats
  24. P Plaszczak, R Wellner, Grid computing, 2005, Elsevier/Morgan Kaufmann, San Francisco
  25. IBM Solutions Grid for Business Partners: Helping IBM Business Partners to Grid-enable applications for the next phase of e-business on demand
  26. Structure of the Multics Supervisor. Retrieved 2013-09-18.
  27. "A Gentle Introduction to Grid Computing and Technologies" (PDF). Retrieved May 6, 2005.
  28. "The Grid Café – The place for everybody to learn about grid computing".


Bibliography

  • Benedict, Shajulin; Vasudevan (2008). "A Niched Pareto GA approach for scheduling scientific workflows in wireless Grids". Journal of Computing and Information Technology 16: 101.
  • Smith, Roger (2005). "Grid Computing: A Brief Technology Analysis" (PDF). CTO Network Library. 
  • Buyya, Rajkumar (July 2005). "Grid Computing: Making the Global Cyberinfrastructure for eScience a Reality" (PDF). CSI Communications (Mumbai, India: Computer Society of India (CSI)) 29 (1). 
  • Berstis, Viktors. "Fundamentals of Grid Computing". IBM. 
  • Elkhatib, Yehia (2011). Monitoring, Analysing and Predicting Network Performance in Grids (Ph.D.). Lancaster University. 
  • Ferreira, Luis; et al. "Grid Computing Products and Services". IBM. 
  • Ferreira, Luis; et al. "Introduction to Grid Computing with Globus". IBM. 
  • Jacob, Bart; et al. "Enabling Applications for Grid Computing". IBM. 
  • Ferreira, Luis; et al. "Grid Services Programming and Application Enablement". IBM. 
  • Jacob, Bart; et al. "Introduction to Grid Computing". IBM. 
  • Ferreira, Luis; et al. "Grid Computing in Research and Education". IBM. 
  • Ferreira, Luis; et al. "Globus Toolkit 3.0 Quick Start". IBM. 
  • Surridge, Mike; et al. "Experiences with GRIA – Industrial applications on a Web Services Grid" (PDF). IEEE. 
  • Global Grids and Software Toolkits: A Study of Four Grid Middleware Technologies
  • The Grid Technology Cookbook
  • Lelli, Francesco; Frizziero, Eric; Gulmini, Michele; Maron, Gaetano; Orlando, Salvatore; Petrucci, Andrea; Squizzato, Silvano (2007). "The many faces of the integration of instruments and the grid". International Journal of Web and Grid Services 3 (3): 239–266.

External links

  • GridCafé: a layperson's introduction to grid computing and how it works
  • SuGI-Portal—more on grids.
This article was sourced from Creative Commons Attribution-ShareAlike License; additional terms may apply.