FAQ
General Grid Questions
(Sections adapted from an article originally published in Symmetry Breaking by Katie Yurkewicz.)
Grid computing revolutionizes the way scientists share and analyse data by enabling researchers to share computer power and data storage over the Internet. Grid projects already help researchers search for new wheat genes, predict storms, or simulate the Sun’s interior. The 7000-odd physicists working on experiments at the Large Hadron Collider will rely entirely on grid computing, specifically on the Worldwide LHC Computing Grid, to connect them with LHC data.
But some reports on “the Grid” promise much more than the technology can now – and in some cases will ever – deliver. Below we separate fact from fiction and address some technical questions.
-
Will the Grid replace the Internet?
-
No. Grid computing, like the World Wide Web, is an application of the Internet. When the LHC turns on, data will be transferred from CERN to 11 large computing centres around the world at rates of up to 10 gigabits per second. Those large centres will then exchange data with 200 smaller centres worldwide. All this data transfer will take place over the Internet. Dedicated fibre-optic links are used between CERN and the large centres; the smaller centres connect to one another through research networks and sometimes the standard public Internet.
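To give a feel for the 10 gigabits per second figure quoted above, the short Python sketch below estimates how long an ideal link at that rate would need to move a given amount of data. The 1 terabyte dataset size and the function name are assumptions chosen only for illustration, and protocol overhead is ignored, so real transfers would take longer.

```python
def transfer_time_seconds(size_bytes: float, rate_bits_per_second: float) -> float:
    """Ideal time to move size_bytes over a link, ignoring protocol overhead."""
    return size_bytes * 8 / rate_bits_per_second


LINK_RATE = 10e9      # 10 gigabits per second, the peak rate quoted in the text
ONE_TERABYTE = 1e12   # decimal terabyte; an illustrative size, not from the source

seconds = transfer_time_seconds(ONE_TERABYTE, LINK_RATE)
print(f"{seconds:.0f} s (about {seconds / 60:.1f} minutes)")
```

At that rate, a terabyte takes 800 seconds on paper, a little over 13 minutes.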
-
Will I be able to download movies 10,000 times faster using the Grid?
-
No. First, in order to get such data-transfer rates, individuals would have to do what the large particle physics computing centres have done, and set up (or lease) a dedicated fibre-optic link between their home and the source of their data. Second, today’s grid computing technologies and projects are geared toward research and businesses with highly specific needs, such as vast amounts of data to process and analyse within large, worldwide collaborations. While other computer users may benefit from grid computing through better weather prediction or more effective medications, they may not be logging onto a computing grid anytime soon. (Something called “cloud computing”, where your programs are run in a central location rather than on your own computer, may also be on the horizon.)
-
Was the Grid invented at CERN?
-
No. The first pioneering steps in grid computing were taken in the US. The term “grid computing” was first used in a book by Grid pioneers Ian Foster and Carl Kesselman, as a metaphor for making computing power accessible in the same way as electrical power. The LHC Computing Grid Project, led by CERN, uses resources contributed by grid projects around the globe. The Enabling Grids for E-sciencE project in Europe (also led by CERN), the Open Science Grid in the US, GridPP in the UK, and the INFN Grid in Italy are some of the independent grid projects that provide support for the computing needs of many areas of research and contribute to the LHC Computing Grid.
Much more information about grid computing and its uses in particle physics and other areas of research is available at www.isgtw.org and gridcafe.web.cern.ch.
-
What is a Cluster Manager?
-
The cluster manager (CM) is a service that runs on each node of a VRS cluster. The CM is primarily responsible for maintaining information about nodes in the system. All nodes are equal; specifically, the CM process on each node is (for now) essentially identical. No one node or CM has special privileges or responsibilities above the others.
Each CM should maintain a list of authentication information (e.g. public keys) for nodes that may join the cluster; this list is static during runtime. The lists can be hand-edited when a cluster administrator wants to allow a new node or remove (disallow) an existing node. The administrator should ensure that all nodes have the same authentication list, so that when a machine wants to sign on to the cluster, it can connect to and authenticate with any of the nodes currently in the cluster. Adding a new node requires generating its authentication keys, installing the private key on the node, and adding the public key to the authentication information list.
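Because the authentication lists are edited by hand, one simple way for an administrator to check that every node holds the same list is to compare a digest of each copy. The Python sketch below is only an illustration of that idea: the node names, the key strings, and both function names are hypothetical, and nothing here is part of the CM itself.

```python
from __future__ import annotations

import hashlib


def auth_list_digest(auth_list: dict[str, str]) -> str:
    """SHA-256 digest of a {node name: public key} list, independent of entry order."""
    h = hashlib.sha256()
    for node_name in sorted(auth_list):
        # A NUL byte separates fields so adjacent entries cannot run together.
        h.update(node_name.encode("utf-8") + b"\0")
        h.update(auth_list[node_name].encode("utf-8") + b"\0")
    return h.hexdigest()


def inconsistent_nodes(lists_by_node: dict[str, dict[str, str]], reference: str) -> list[str]:
    """Names of nodes whose list differs from the one held by the reference node."""
    expected = auth_list_digest(lists_by_node[reference])
    return sorted(
        name for name, auth_list in lists_by_node.items()
        if auth_list_digest(auth_list) != expected
    )


# Hypothetical copies of the list as found on three nodes.
copies = {
    "node1": {"node1": "PUBKEY-1", "node2": "PUBKEY-2", "node3": "PUBKEY-3"},
    "node2": {"node3": "PUBKEY-3", "node2": "PUBKEY-2", "node1": "PUBKEY-1"},
    "node3": {"node1": "PUBKEY-1", "node2": "PUBKEY-2"},  # node3 entry missing
}
print(inconsistent_nodes(copies, reference="node1"))
```

Here node2 holds the same entries in a different order and is treated as consistent, while node3 is reported because its copy lacks an entry.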
The CM should also maintain a (dynamic) list of nodes which are currently active in the cluster. This list must always be consistent among the various nodes currently in the cluster, as various services (in particular, the resource manager) will use transactions that depend on *all* the nodes in the cluster agreeing on an update. If there is any ambiguity in terms of which nodes are currently connected to the cluster and which are not, the transaction manager won't know which nodes to contact and from which to require a 'commit' signal.
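The requirement that every node agree on an update can be sketched as a simplified two-phase update: each node first accepts a proposed list, and the change is committed only if all of them accepted it. The two-phase structure, the `Node` class, and the `reachable` flag are assumptions made for this in-process illustration; the text above only states that all nodes must agree and send a 'commit' signal, and a real implementation would also have to handle network failures during the commit phase.

```python
class Node:
    """In-memory stand-in for a cluster node's view of the active node list."""

    def __init__(self, name):
        self.name = name
        self.active = set()     # committed active node list
        self.pending = None     # proposed list awaiting commit
        self.reachable = True   # set to False to simulate a failed node

    def prepare(self, proposed):
        """Stage the proposed list; return False if this node cannot take part."""
        if not self.reachable:
            return False
        self.pending = set(proposed)
        return True

    def commit(self):
        """Make the staged list the committed one."""
        self.active, self.pending = self.pending, None

    def abort(self):
        """Discard the staged list and keep the committed one."""
        self.pending = None


def update_active_list(participants, proposed):
    """Apply the proposed list on every participant, or on none of them."""
    prepared = []
    for node in participants:
        if node.prepare(proposed):
            prepared.append(node)
        else:
            # One node did not agree: roll back everyone who had staged the change.
            for staged in prepared:
                staged.abort()
            return False
    for node in participants:
        node.commit()
    return True


a, b, c = Node("a"), Node("b"), Node("c")
a.active = b.active = {"a", "b"}

# Node c joins; including the joiner among the participants is an assumption.
print(update_active_list([a, b, c], {"a", "b", "c"}), sorted(a.active))

# With b unreachable, an attempt to change the list leaves every copy untouched.
b.reachable = False
print(update_active_list([a, b, c], {"a", "c"}), sorted(a.active))
```

The second call shows why ambiguity about membership is a problem: the update cannot proceed unless the coordinator knows exactly which nodes must take part and agree.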
When a machine (call it A) wants to sign in to the cluster, it must contact one of the nodes currently in the cluster (call it B) and send its authentication information. Assuming that machine A is on the cluster authentication information list and its authentication information is valid, node B will transactionally update the active node list on ALL machines currently on the active node list.
-
What is Gigabit Ethernet (GE)?
-
Gigabit Ethernet is the Ethernet standard offering data rates of 1 gigabit per second. The standard typically employs fibre but can also be supported on Cat 6 cable. Until now this technology has been used for campus-style backbone networks; it is now finding its way to the desktop, for high-end servers and graphics-intensive applications.