MICC JINR Multifunctional
Information and Computing
Complex

PROOF

PROOF (Parallel ROOT Facility) is an extension of ROOT that enables parallel analysis of large sets of ROOT files on multiprocessor machines. More generally, PROOF can parallelize any task that can be formulated as a set of independent subtasks.


PROOF provides:

  • Transparency: there are no (or very few) differences between an analysis session running in local ROOT and a remote parallel PROOF session. A typical analysis script should work the same way in both.
  • Scalability: there is no built-in limit on the number of processors that can be used in parallel.
  • Adaptability: the session adapts to changes in the remote environment (load changes on the cluster nodes, network interruptions, etc.).

In the PROOF implementation, the slave servers ("workers") are the active components: each one asks the master server for new work whenever it is ready. PROOF aims to minimize the total execution time of a job by having all workers finish their assigned work at about the same time. In this scheme, parallel performance depends on the time needed to process each small unit of work, the "package" (the amount of work assigned to a worker in one request; "packet" in ROOT terminology), and on the network bandwidth. The main tuning parameter is the package size. If the package is too small, the gain from parallelism is lost to communication overhead, because many packages travel over the network between the master and the slave servers. If the package is too large, differences in performance between nodes are not evened out sufficiently.

Another very important factor is the location of the data. In most cases, a large number of data files must be processed, and these files are distributed across different cluster nodes or are geographically separated. To group them together, a chain is used, which provides a single logical view of many physical files.
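The idea of a chain as a single logical view of many files can be illustrated with a small, ROOT-independent sketch. The names FileEntry and SimpleChain and the example URLs are hypothetical and are not part of ROOT (the real class is TChain); the sketch only shows how a global event number might be mapped to a file and to an event number inside that file.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for one physical file and the number of events it holds.
struct FileEntry {
    std::string url;
    std::int64_t nEvents;
};

// Hypothetical chain: presents many files as one contiguous range of events.
class SimpleChain {
public:
    void Add(const std::string& url, std::int64_t nEvents) {
        assert(nEvents >= 0);  // a file cannot hold a negative number of events
        files_.push_back({url, nEvents});
        total_ += nEvents;
    }

    std::int64_t GetEntries() const { return total_; }

    // Map a global event number to (file index, event number inside that file).
    // Returns {-1, -1} if the global number is out of range.
    std::pair<int, std::int64_t> Locate(std::int64_t global) const {
        if (global < 0 || global >= total_) return {-1, -1};
        for (std::size_t i = 0; i < files_.size(); ++i) {
            if (global < files_[i].nEvents) return {static_cast<int>(i), global};
            global -= files_[i].nEvents;  // skip this file and continue with the next
        }
        return {-1, -1};  // not reached for a valid global number
    }

private:
    std::vector<FileEntry> files_;
    std::int64_t total_ = 0;
};
```

With two files of 100 and 50 events added in that order, global event 100 is located as event 0 of the second file, which is the kind of lookup a master needs when it turns a range of events into work for a particular node.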

A package is a simple data structure of two numbers: the "initial event" and the "number of events". The package size is calculated dynamically, at run time, after each worker request. When a slave server asks for work, the main server ("master") generates a package, taking into account the time spent on processing the previous package and the size of the files in the chain. The master keeps a list of all packages generated for each slave server, so if a worker dies during processing, all its packages can be reprocessed by the remaining workers. A package can be as small as the basic unit of processing, i.e. a single event.
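The bookkeeping described above can be sketched in plain C++. This is only an illustration under stated assumptions, not the ROOT implementation: the names Package, MasterBook and NextSize are hypothetical, and the sizing rule (aim for a fixed processing time per package) is an assumed simplification that ignores file sizes.

```cpp
#include <cassert>
#include <cstdint>
#include <deque>
#include <map>
#include <vector>

// A package as described in the text: first event and number of events.
struct Package {
    std::int64_t first;
    std::int64_t num;
};

// Hypothetical sizing rule: target a fixed processing time per package,
// using the rate measured for the previous package of the same worker.
inline std::int64_t NextSize(std::int64_t prevEvents, double prevSeconds, double targetSeconds) {
    if (prevEvents <= 0 || prevSeconds <= 0.0) return 1;  // no measurement yet: start small
    double rate = static_cast<double>(prevEvents) / prevSeconds;  // events per second
    std::int64_t n = static_cast<std::int64_t>(rate * targetSeconds);
    return n > 0 ? n : 1;
}

// Hypothetical master-side bookkeeping: one list of packages per worker.
class MasterBook {
public:
    explicit MasterBook(std::int64_t totalEvents) : total_(totalEvents) {
        assert(totalEvents >= 0);
    }

    // Hand out the next package; `size` is the number of events requested.
    // Packages recovered from dead workers are reissued first, with their original size.
    // Returns a package with num == 0 when no work is left.
    Package Next(int worker, std::int64_t size) {
        assert(size > 0);
        Package p{0, 0};
        if (!recovered_.empty()) {
            p = recovered_.front();
            recovered_.pop_front();
        } else if (next_ < total_) {
            p.first = next_;
            p.num = (total_ - next_ < size) ? (total_ - next_) : size;
            next_ += p.num;
        }
        if (p.num > 0) assigned_[worker].push_back(p);
        return p;
    }

    // A worker died: every package it was given goes back to the queue.
    void WorkerDied(int worker) {
        auto it = assigned_.find(worker);
        if (it == assigned_.end()) return;
        for (const Package& p : it->second) recovered_.push_back(p);
        assigned_.erase(it);
    }

private:
    std::int64_t total_;
    std::int64_t next_ = 0;
    std::deque<Package> recovered_;
    std::map<int, std::vector<Package>> assigned_;
};
```

In this sketch a dead worker's packages are reissued in full, including any that it had already finished; that is the simplest policy consistent with the description, since partial results of a lost worker cannot be trusted to have reached the master.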

To use PROOF, a user (client) must start a PROOF session. In practice, this means creating a TProof object (or obtaining a pointer to one):

root[] TProof *proof = TProof::Open("url")

The TProof class manages the Parallel ROOT Facility on a cluster. It starts the worker servers, keeps track of their work, status and other properties, sends messages to all workers, collects results, etc. A full description of the TProof class can be found here. We will consider only the main methods of TProof.

The method

TProof * TProof::Open(const char * cluster = 0, const char * conffile = 0, const char * confdir = 0, Int_t loglevel = 0)

starts a PROOF session on the specified cluster. The first parameter, cluster, is a URL giving the network address of the cluster master. The second parameter, conffile, is the name of the configuration file that describes the remote PROOF cluster (this argument makes it possible to describe different cluster configurations); the default file is proof.conf. The third parameter, confdir, is the directory containing the configuration file and other PROOF-related files (for example, the motd and noproof files). The fourth parameter, loglevel, sets the logging level.

The method

TProof::Print(Option_t * option = "")

prints the status of the session. By default, it shows the following information: client, master, ROOT version, platform, location of user directories, number of active and inactive workers, real time, CPU time and I/O used in the session, etc.

The method

TProof::Exec(const char * cmd)

allows you to execute ROOT commands on the workers or on the master. PROOF recognizes CINT commands that require a file (.L, .x and .X) and makes sure that an up-to-date version of the file exists on the nodes. The file is transferred only if necessary.

The method

TProof::SetParallel(Int_t nodes)

tells PROOF how many worker nodes to use in parallel.

When you run a macro for the first time, you may notice some delay while it is distributed to the workers. On the second and subsequent runs, if the macro has not changed, it starts much faster. By default, a command is executed only on the workers, not on the master. To execute it on the master, do the following:

root [3] proof->SetParallel(0)

The Process method has several overloads:

Long64_t TProof::Process(TDSet * dset, const char * selector, Option_t * option = "", Long64_t nentries = -1, Long64_t first = 0)

Processes a data set, represented in ROOT by the special class TDSet, using the specified selector file (.C) or TSelector object.

Long64_t TProof::Process(TFileCollection * fc, const char * selector, Option_t * option = "", Long64_t nentries = -1, Long64_t first = 0)

Processes a data set (TFileCollection) using the specified selector file (.C) or the TSelector object.

Long64_t TProof::Process(const char * selector, Long64_t n, Option_t * option = "")

Generic selector processing, not based on a predefined data set: the Process() method of the specified selector (a .C file or a TSelector object) is called n times. But what is a selector?

Selector

To parallelize at the event level, PROOF must be in control of the event loop. This requires that the code be defined in advance, but with a flexible structure. In ROOT, this requirement is met by the Selector framework, defined by the abstract TSelector class, which specifies three logical steps:

  • Begin: the input data, parameters and output file are set up; runs on the client and on the workers.
  • Process: the actual work is done; called for every event on the workers. This is the part that can be parallelized.
  • Terminate: the results are processed (fitted, visualized, ...); called on the client and on the workers.

Schematically, the structure of a Selector can be represented as follows:

void Begin() - client initialization, performed once on the client

void SlaveBegin() - node initialization, performed once on each worker node

Bool_t Process(Long64_t entry) - the computation, called once for every entry assigned to a worker node

void SlaveTerminate() - final processing on the node, performed once on each worker node

void Terminate() - processing and output of the results on the client, performed once on the client
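The order of these calls can be made concrete with a small, ROOT-independent sketch. The classes MiniSelector and TracingSelector and the function RunLocally are hypothetical stand-ins, not ROOT's TSelector API; the sketch assumes a single process that plays the role of both the client and the only worker.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the selector interface (not ROOT's TSelector).
class MiniSelector {
public:
    virtual ~MiniSelector() = default;
    virtual void Begin() {}
    virtual void SlaveBegin() {}
    virtual bool Process(long long entry) = 0;
    virtual void SlaveTerminate() {}
    virtual void Terminate() {}
};

// Records the order of the life-cycle calls so that it can be inspected.
class TracingSelector : public MiniSelector {
public:
    std::vector<std::string> calls;
    long long processed = 0;

    void Begin() override { calls.push_back("Begin"); }
    void SlaveBegin() override { calls.push_back("SlaveBegin"); }
    bool Process(long long) override { ++processed; return true; }
    void SlaveTerminate() override { calls.push_back("SlaveTerminate"); }
    void Terminate() override { calls.push_back("Terminate"); }
};

// Drives a selector in one process: initialization, one Process call per entry,
// then finalization. The return value of Process is ignored in this sketch.
inline void RunLocally(MiniSelector& sel, long long nEntries) {
    assert(nEntries >= 0);
    sel.Begin();           // client side
    sel.SlaveBegin();      // worker side
    for (long long i = 0; i < nEntries; ++i) {
        sel.Process(i);    // worker side, once per entry
    }
    sel.SlaveTerminate();  // worker side
    sel.Terminate();       // client side
}
```

In a real parallel session the middle three steps would run on every worker, each with its own range of entries, while Begin and Terminate would run once on the client.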

Packetizer

The packetizer is responsible for load balancing between workers. It decides where each piece of work (package) is to be processed. The packetizer object is created on the master node. The performance of the workers, as well as the transfer speed of the various files, can vary considerably. To balance the distribution of work dynamically, the packetizer uses a pull architecture: when a worker is ready for more work, it asks the packetizer for the next package.

Let us consider an example in which a one-dimensional histogram is created and filled with random numbers drawn from a Gaussian distribution. First, we write a script for an ordinary ROOT session, without PROOF.
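The effect of the pull architecture and of the package size can be explored with a small deterministic simulation. It is a sketch under simplifying assumptions, not the ROOT packetizer: the names PullResult and SimulatePull are hypothetical, the package size is fixed, each worker has a constant speed, and every package carries the same fixed overhead.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct PullResult {
    double makespan;                   // time at which the last worker finishes
    std::vector<std::int64_t> events;  // number of events processed by each worker
};

// Simulate pull scheduling: whichever worker becomes free first asks for the next package.
// speeds[i] is the speed of worker i in events per second;
// overhead is a fixed cost in seconds paid for every package.
inline PullResult SimulatePull(std::int64_t totalEvents, std::int64_t packageSize,
                               const std::vector<double>& speeds, double overhead) {
    assert(packageSize > 0 && !speeds.empty());
    std::vector<double> freeAt(speeds.size(), 0.0);  // time at which each worker is free
    PullResult r{0.0, std::vector<std::int64_t>(speeds.size(), 0)};
    std::int64_t next = 0;
    while (next < totalEvents) {
        // The worker that is ready earliest pulls the next package.
        std::size_t w = static_cast<std::size_t>(
            std::min_element(freeAt.begin(), freeAt.end()) - freeAt.begin());
        std::int64_t n = std::min(packageSize, totalEvents - next);
        freeAt[w] += overhead + static_cast<double>(n) / speeds[w];
        r.events[w] += n;
        next += n;
    }
    r.makespan = *std::max_element(freeAt.begin(), freeAt.end());
    return r;
}
```

For example, with 400 events, a slow worker (1 event/s) listed before a fast one (3 events/s) and an overhead of 1 s per package, a package size of 10 finishes sooner than a package size of 1 (overhead dominates) and sooner than a package size of 400 (one worker receives all the work). The faster worker also ends up processing more events, without the master knowing the speeds in advance.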

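#include "TH1F.h"
#include "TRandom3.h"
#include "TCanvas.h"

void gauss(Int_t n = 100000) {
   // Definition of the histogram and the random number generator
   TH1F *fH1F = new TH1F("FirstH1F", "First TH1F in PROOF", 100, -10, 10);
   TRandom *fRandom = new TRandom3(0);

   // The work: generate random numbers and fill the histogram
   for (Int_t i = 0; i < n; i++) {
      Double_t x = fRandom->Gaus(0., 1.);
      fH1F->Fill(x);
   }

   // Output: create a canvas and draw the histogram on it
   TCanvas *c1 = new TCanvas("c1", "Proof ProofFirst canvas", 200, 10, 400, 400);
   fH1F->Draw();
   c1->Update();
}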

Red highlights the definition of the histogram and the random number generator. Blue marks the actual work of the script: in this case, generating random numbers and filling the histogram with them. Green marks the output: creating a canvas and drawing the histogram on it.

Now, to do the same work in parallel with PROOF, let us create the ProofFirst class, derived from the abstract TSelector class. A description of the TSelector class and its methods can be found here. The header file ProofFirst.h looks like this:

#ifndef ProofFirst_h
#define ProofFirst_h

#include <TSelector.h>
#include "TH1F.h"
#include "TRandom.h"
#include "TCanvas.h"

class TH1F;
class TRandom;

class ProofFirst : public TSelector {
public :

   TH1F    *fH1F;      // histogram filled on the workers
   TRandom *fRandom;   // random number generator

   ProofFirst();
   virtual ~ProofFirst();
   virtual Int_t   Version() const { return 2; }
   virtual void    Begin(TTree *tree);
   virtual void    SlaveBegin(TTree *tree);
   virtual Bool_t  Process(Long64_t entry);
   virtual void    SetOption(const char *option) { fOption = option; }
   virtual void    SetObject(TObject *obj) { fObject = obj; }
   virtual void    SetInputList(TList *input) { fInput = input; }
   virtual TList  *GetOutputList() const { return fOutput; }
   virtual void    SlaveTerminate();
   virtual void    Terminate();

   ClassDef(ProofFirst,2);
};

#endif

and the implementation file ProofFirst.C:

#include "ProofFirst.h"
#include "TH1F.h"
#include "TRandom3.h"

//_____________________________________________________________________________
ProofFirst::ProofFirst()
{
   // Constructor
   fH1F = 0;
   fRandom = 0;
}

//_____________________________________________________________________________
ProofFirst::~ProofFirst()
{
   // Destructor: the histogram is owned by the output list, so only the generator is deleted
   if (fRandom) delete fRandom;
}

//_____________________________________________________________________________
void ProofFirst::Begin(TTree * /*tree*/)
{
}

//_____________________________________________________________________________
void ProofFirst::SlaveBegin(TTree * /*tree*/)
{
   // Create the histogram and the random number generator on each worker
   fH1F = new TH1F("FirstH1F", "First TH1F in PROOF", 100, -10., 10.);
   fOutput->Add(fH1F);
   fRandom = new TRandom3(0);
}

//_____________________________________________________________________________
Bool_t ProofFirst::Process(Long64_t)
{
   // Generate one random number and fill the histogram with it
   if (fRandom && fH1F) {
      Double_t x = fRandom->Gaus(0.,1.);
      fH1F->Fill(x);
   }
   return kTRUE;
}

//_____________________________________________________________________________
void ProofFirst::SlaveTerminate()
{
}

//_____________________________________________________________________________
void ProofFirst::Terminate()
{
   // Retrieve the histogram from the output list and draw it on the client
   TCanvas *c1 = new TCanvas("c1", "Proof ProofFirst canvas",200,10,400,400);
   fH1F = dynamic_cast<TH1F*>(fOutput->FindObject("FirstH1F"));
   if (fH1F) fH1F->Draw();
   c1->Update();
}

We declare the histogram and the random number generator as data members of the ProofFirst class (red). In the class constructor, we initialize the histogram and generator pointers to 0. In the destructor, we delete the random number generator. The histogram belongs to the output list, so it is not deleted there. The SlaveBegin() method creates the histogram and the random number generator (red). The main work is done by the Process() method: in this case, generating a random number and filling the histogram with it (blue). Finally, the Terminate() method draws the result on a canvas (green).

We are now ready to run this selector. First, we start PROOF

TProof* proof=TProof::Open("lite://","workers=12")

and call the method

proof->Process("ProofFirst.C+", 1000000000)

Here is what you should see on your screen:

A more complicated example, which includes reading a tree, performing calculations, filling histograms and fitting, can be found at the following link.