
Thursday, October 18, 2012

DataStage Configuration file : Explained - 3




[Diagram: sample resource allocation for a 1-node and a 4-node configuration]


 

 

Sample configuration files

 

Configuration file for a simple SMP

 

A basic configuration file for a single two-CPU server running two nodes is described here. The file defines 2 nodes (node1 and node2) on one server, dev (an IP address may be given instead of a hostname), with 3 disk resources: d1 and d2 for data, and Scratch as scratch space. Note that node2 uses only d1 for data.

The configuration file is shown below:



{
    node "node1"
    {
        fastname "dev"
        pools ""
        resource disk "/IIS/Config/d1" {}
        resource disk "/IIS/Config/d2" {}
        resource scratchdisk "/IIS/Config/Scratch" {}
    }

    node "node2"
    {
        fastname "dev"
        pools ""
        resource disk "/IIS/Config/d1" {}
        resource scratchdisk "/IIS/Config/Scratch" {}
    }
}
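Because every node on an SMP server shares the same fastname and the same directories, a file like this one can be produced mechanically for any number of nodes. The following Python function is only a minimal sketch of that idea: the name render_smp_config and its parameters are hypothetical, and the output uses the `pools` keyword and the enclosing braces of IBM's documented configuration syntax.

```python
def render_smp_config(host, node_count, disk_paths, scratch_path):
    """Build configuration text for `node_count` logical nodes on one host.

    Hypothetical helper: every node gets the same fastname, the default
    node pool "", the same data disks and the same scratch directory.
    """
    lines = ["{"]  # the whole file is wrapped in one pair of braces
    for i in range(1, node_count + 1):
        lines.append(f'    node "node{i}"')
        lines.append("    {")
        lines.append(f'        fastname "{host}"')
        lines.append('        pools ""')  # default node pool
        for path in disk_paths:
            lines.append(f'        resource disk "{path}" {{}}')
        lines.append(f'        resource scratchdisk "{scratch_path}" {{}}')
        lines.append("    }")
    lines.append("}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_smp_config("dev", 2,
                            ["/IIS/Config/d1", "/IIS/Config/d2"],
                            "/IIS/Config/Scratch"))
```

Changing node_count from 2 to 4 gives a four-node file for the same server. The number of nodes is a logical degree of parallelism and does not have to equal the number of CPUs.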
         

 

 

Configuration file for a cluster / MPP / grid


A sample configuration file for a cluster or grid of 4 machines is shown below.
The configuration defines 4 nodes (node[1-4]), node pools (n[1-4] and s[1-4]), resource pools bigdata and sort, and scratch space for temporary data.



{
    node "node1"
    {
        fastname "dev1"
        pools "" "n1" "s1" "sort"
        resource disk "/IIS/Config1/d1" {}
        resource disk "/IIS/Config1/d2" {pools "bigdata"}
        resource scratchdisk "/IIS/Config1/Scratch" {pools "sort"}
    }

    node "node2"
    {
        fastname "dev2"
        pools "" "n2" "s2"
        resource disk "/IIS/Config2/d1" {}
        resource disk "/IIS/Config2/d2" {pools "bigdata"}
        resource scratchdisk "/IIS/Config2/Scratch" {}
    }

    node "node3"
    {
        fastname "dev3"
        pools "" "n3" "s3"
        resource disk "/IIS/Config3/d1" {}
        resource scratchdisk "/IIS/Config3/Scratch" {}
    }

    node "node4"
    {
        fastname "dev4"
        pools "n4" "s4"
        resource disk "/IIS/Config4/d1" {}
        resource scratchdisk "/IIS/Config4/Scratch" {}
    }
}
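To see which nodes end up in which node pool, the file can be scanned line by line. The Python sketch below is an illustration only, not an official parser: the function name node_pools and the shortened SAMPLE text are assumptions made for this example, and it relies on each node's pool list sitting on its own line, as in the listing above.

```python
import re
from collections import defaultdict


def node_pools(config_text):
    """Map each node pool name to the list of nodes that belong to it."""
    members = defaultdict(list)
    current = None
    for line in config_text.splitlines():
        node = re.match(r'\s*node\s+"([^"]+)"', line)
        if node:
            current = node.group(1)
            continue
        # Only lines that START with the keyword are node pools; pool names
        # inside the braces of a "resource" line are resource pools.
        pools = re.match(r'\s*pools?\s+(.*)', line)  # accepts pool or pools
        if pools and current:
            for name in re.findall(r'"([^"]*)"', pools.group(1)):
                members[name].append(current)
    return dict(members)


# Shortened copy of the cluster example (node2 omitted for brevity).
SAMPLE = '''
node "node1"
{
    fastname "dev1"
    pools "" "n1" "s1" "sort"
    resource scratchdisk "/IIS/Config1/Scratch" {pools "sort"}
}
node "node3"
{
    fastname "dev3"
    pools "" "n3" "s3"
}
node "node4"
{
    fastname "dev4"
    pools "n4" "s4"
}
'''

if __name__ == "__main__":
    for pool, nodes in sorted(node_pools(SAMPLE).items()):
        print(repr(pool), nodes)
```

In the sample, node4 does not list the default pool "", so it appears only under n4 and s4. As far as the default behaviour goes, a stage runs on the nodes of the default pool unless it is constrained to a named pool, so node4 would be used only by stages constrained to n4 or s4.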




Resource disk: a directory path. The data files of a dataset are stored in the resource disk.

Resource scratch disk: also a directory path. Parallel job stages use this path to buffer temporary data while the job runs.
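Since both resource types are plain directory paths, a quick sanity check is to confirm that each path exists, is writable, and has free space. The Python sketch below assumes it is run on the machine that owns the paths (in a cluster, once per host named by fastname); resource_paths and check_paths are hypothetical helper names, not part of DataStage.

```python
import os
import re
import shutil


def resource_paths(config_text):
    """Return (kind, path) pairs for every resource disk / scratchdisk line."""
    return re.findall(r'resource\s+(disk|scratchdisk)\s+"([^"]+)"', config_text)


def check_paths(config_text):
    """Report, per distinct path, whether it is a writable directory
    and how much free space (in GB) its file system has."""
    report = []
    for kind, path in sorted(set(resource_paths(config_text))):
        ok = os.path.isdir(path) and os.access(path, os.W_OK)
        free_gb = shutil.disk_usage(path).free / 1e9 if ok else None
        report.append((kind, path, ok, free_gb))
    return report


if __name__ == "__main__":
    config = '''
    resource disk "/IIS/Config/d1" {}
    resource scratchdisk "/IIS/Config/Scratch" {}
    '''
    for kind, path, ok, free_gb in check_paths(config):
        status = f"{free_gb:.1f} GB free" if ok else "missing or not writable"
        print(f"{kind:12} {path}: {status}")
```

A path reported as missing or not writable is worth fixing before running a job, since datasets and temporary buffers are written to these directories at run time.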



njoy the simplicity.......
Atul Singh
