We have moved to www.dataGenX.net, Keep Learning with us.

Saturday, July 28, 2012

DataStage Documentation Best Practices


This document contains the Data Stage Best practises and recommendations which could be used to improve the quality of data stage jobs. This document would further be enhanced to include the specific Data Stage problems and there troubleshooting.


Friday, July 27, 2012

DataStage huge logs removal resolution

Problem – While running the Jobs in DataStage, sometimes the log becomes so huge that the jobs hangs down. Sometimes when the Job is aborted, even then the Warnings keep on generating for a long time. And this increases the log messages for that job, subsequently increasing the table saving information in the Metadata database.
This increases the size of the table and the database tremendously.

Tuesday, July 24, 2012

XMeta DB : Datastage Repository

Prior to Datastage version 8, the datastage repository information was stored in files(uv). From version 8, the repository information is also stored in database. Since the information is available in database, it is easier to write queries and find the details. During the installation you will have the option to install the repository either in DB2 or in Oracle.

Usually, the datastage repository is built on DB2 database. because Db2 comes with the Information Server Software by defalt.

A word of caution: Do not try to alter the XMETA repository contents as it may have adverse effects.

Reclaiming disk space from DB2 XMeta tablespace after deleting projects

The DB2 XMeta database has grown over time as more development projects were created. After deleting all the old projects, the tablesize of the XMeta database has not reduced. The disk space occupied by the DB2 XMeta database is still the same. What DB2 commands can be run to reduce the XMeta tablespace on the hard disk?

Deleting projects from the DS Administrator frees space on the XMeta database but this does not translate into a reduction in the XMeta tablespace occupied on the hard disk.

Monday, July 23, 2012

Sunday, July 22, 2012

Setting environment variables for the parallel engine in DataStage

You set environment variables to ensure smooth operation of the parallel engine. Environment variables are set on a per-project basis from the Administrator client.

1. Click Start > All Programs > IBM Information Server > IBM InfoSphere DataStage and QualityStage Administrator, and log in to the Administrator client.
2. Click the Project tab, and select a project.
3. Click Properties.
4. On the General tab, click Environment.
5. Set the values for the environment variables as necessary.


Saturday, July 21, 2012

How to reduce the current size of IBM InfoSphere Information Server XMETA database

Xmeta DB is used by DataStage itself for doing its work. But sometimes when there is lot of project on DataStage then it need to be monitored. 
In this case, Xmeta size is growing rapidly so here is some steps which can solve this problem a bit.

The first method is from within db2 (the system should be quiesced for this approach):

   1.  Start the db2 command line processor:

    cd /opt/IBM/db2/V9/bin

Thursday, July 19, 2012

Toad shortcuts

Today,Toad is a software application from Quest Software used for development and administration of various relational databases. Sharing some Toad tool shortcut…

Tuesday, July 17, 2012

Locate the DB2 connection port number on Linux/Unix

Hey guys, here we see how to get the port no which is assigned to DB2 Services

1)Open a DB2 command prompt
2)Verify you are on the correct instance

db2 get instance

Monday, July 16, 2012

DB2 Errors_1

SQL1397N The DB2 service failed to logon.

When trying to start a DB2 service from Windows Services, the following error occurs:
Error 1069: The service did not start due to a logon failure.

Saturday, July 14, 2012

How to deploy a configuration file in DataStage

Hi Friends
Now how to Deploy/Apply the Conf file.

Deploying the new configuration file


Now that you have created a new configuration file, you use this new file instead of the default file. You use the Administrator client to deploy the new file. You must have DataStage® Administrator privileges to use the Administrator client for this purpose.

How to Create a Configuration File in DataStage

Hi Friends 
Configuration file is playing a most important role in Parallel DataStage job. This is the file which have provide parallel environment to job to run parallel.

Today I am going to share how to Create/Edit the Conf file.

Creating a configuration file

            Open the Designer Client and follow the below steps

How to Restore/backup the PuTTy settings…

Hello guys,
I don’t know if this is useful or not but sharing here a small tricks of taking a backup/restore the PuTTy sessions settings.

Putty stores its settings in the Windows registry. To save a backup of your Putty settings, you'll need to export this registry key to a file.


1.      Click Start->Run and type "regedit" in the "Open" dialog. Click "Ok"

2.      Once Registry Editor starts, you'll be presented with an application which looks something like:

3.      Press "Ctrl+F" to bring up the Find dialog. Enter the name of the key, "SimonTratham" in the "Find What" field, and make sure only "Keys" is checked in the "Look At" section of the dialog. Finally, click "Find Next"

4.      The search may take a while. When the search completes we'll see the key name for which we're looking Or Go to this path  HKEY_CURRENT_USER à Software à SimonTatham

5.      Select “SimonTatham” and Click File->Export. Give your file an appropriate name like, "putty.reg" and click "Save"

6.      Save the putty.reg file somewhere safe. The file doesn't contain any passwords or actual SSH key values so no need to worry about if someone stole your settings

7.      Whenever you want to restore these settings, you have to Right Click on reg file and select the option Merge. This will add these settings to your Putty.

8.      Don’t double click on the Reg file otherwise it will overwrite your existing settings with the Reg files settings.

We are done here.

njoy the simplicity.......
Atul Singh

Friday, July 13, 2012

DataStage tip for beginners: developer short cuts

Hi guys,
Last night I am googling for some shortcut to execute jobs, some tool tricks. I found this another great Article by Vincent McBurney. So sharing that here with you all…will join you soon with the DS command line tricks.

Import Export

* When you do an export cut and paste the export file name. When you go to your project and run an import paste the file name instead of having to browse for it. While export and import independently remember the last file name used they do not share that name between each other.

* When you switch export type between category and individual job it is quick to switch the type, close the export form and open it again. That way the job name or category you have highlighted will be automatically picked.
There is an Export option to export by individual job name or export by category name. This is on the second tab in the export form. Often when you go to export something it is on the wrong option,

Thursday, July 12, 2012

Tips for debugging a datastage job

Some tips for the beginners of datastage. Hope this will help you to debug the job in datastage.

Enable the following environment variables in DataStage Administrator:
  • APT_PM_PLAYER_TIMING – shows how much CPU time each stage uses
  • APT_PM_SHOW_PIDS – show process ID of each stage
  • APT_RECORD_COUNTS – shows record counts in log

How to restart DataStage ?

Hi Friends, today I am going to share how to restart the server, what are the steps we have to take care while performing this task.

The order in which to stop all services is:

   1. Stop DataStage Engine
   2. Stop the ASB Agent
   3. Stop the WebSphere Application Server (WAS)

Here are the step-by-step instructions for each of the above.

Wednesday, July 11, 2012

Vmstat : Monitoring Virtual Memory

Virtual Memory :
Using a disk as an extension of RAM so that the effective size of usable memory grows correspondingly

How Virtual Memory Works

Physical memory—the actual RAM installed—is a finite resource on any system. As we know All processes use memory, but each process doesn't need all its allocated memory all the time. Taking advantage of this fact, the kernel frees up physical memory by writing some or all of a process' memory to disk until it's needed again.
When pages are written to disk, the event is called a page-out, and when pages are returned to physical memory, the event is called a page-in. A page fault occurs when the kernel needs a page, finds it doesn't exist in physical memory because it has been paged-out, and re-reads it in from disk.
Page ins are usual, so no need to worry about these, but Page out have. When Kernel got to know that we are running low on Physical memory then it start searching defined virtual space in Hard Disk and write the processes in there. Sometimes Kernel is quite busy with these task, fetching n inserting pages, and unable to give more resource to process running in front.

For tracking the memory status we use “VMSTAT” command.
By default vmstat displays the memory information in kb

A) If we simply execute the “vmstat” command on the terminal

$ vmstat

This will give the current status of Virtual Memory and quit.
·         kthr – r: Total number of processes waiting to run
·         kthr – b: Total number of busy processes
·         Memory – avm: Active virtual pages
·         Memory – free: Free virtual memory
·         Page – re: Pager input/output list
·         Page – pi: Pages paged in from paging space
·         Page – po: Pages paged out to paging space
·         Page – fr: Pages freed (page replacement)
·         Page – sr: Pages scanned by page-replacement algorithm
·         Page – cy: Clock cycles by page-replacement algorithm
·         Faults – in : Device interrupts
·         Faults – sy : System calls
·         Faults – cs : Kernel thread context switches
·         Cpu – us : User time
·         Cpu – sy : System time
·         Cpu – id : CPU idle time
·         Cpu – wa : CPU idle time during which the system had I/O or disk request
·         Cpu – pc : Number of physical processors consumed
·         Cpu – ec : The percentage of entitled capacity consumed

B) To get the Virtual memory status in every 3 seconds

$ vmstat 3

This command will generate the result in every 3 sec for infinity times. For stopping the output we can use Ctrl-C

C) To get the Virtual memory status in every 2 seconds with 10 update only

$ vmstat 2 10

D) Display number of forks since last boot

$ vmstat –f

E) Displays the number of interrupts taken by each device since system startup

$ vmstat –i

F) Display the Sum of event in readable format

$ vmstat –s
display the sum of events, which contains an absolute count of paging events since system startup.

G) Display the various statistics

$ vmstat –v

H) To increase the width of output  

$ vmstat –w
The default output without increasing the width

For more detailed information about “VMSTAT” go to this link

njoy the simplicity.......
Atul Singh

Tuesday, July 10, 2012

Backspace Key Not Working in PuTTY

If you access your NIX machines from a Windows workstations by SSH, most probably you are using PuTTY..
One example is the backspace key. It just won’t work in PuTTY connecting to SUSE server. I have to put the cursor before the character I want to delete and press the DELETE key.

Go to your PuTTY configuration ---> Terminal ---> Keyboard
Look for Change Sequences Sent By ---> The Backspace Key
From there, select Control-H.
Save your session and try it.

njoy the simplicity.......
Atul Singh

File System Monitoring Script in NIX

Hi Friends,
Here I am come up with second FS monitoring script, this script sent you the list of FS which
Have the FS size more then the THRESHOLD value you defined.

njoy the simplicity.......
Atul Singh

how to grant select privileges on all tables for DB2

Take these steps to generate SELECT privileges for a given USER on all tables in a SCHEMA.
We can either create a script or execute the commands one by one on Terminal.

DBTABLES=`db2 -x "SELECT tabname FROM syscat.tables WHERE tabschema=UPPER('$MYSCHEMA')"`

njoy the simplicity.......
Atul Singh

Verify DB2 is available or Not

In this post, telling you the 3 method to know if DB2 is available or not.


Method 1

Log onto the server and execute db2start as the instance owner. If the instance is running you’ll see , something like:
$ db2start
07/10/2012 23:36:14     0   0   SQL1026N  The database manager is already active.SQL1026N  The database manager is already active.


Method 2

Log onto the server a run a netstat –a/ NETSTAT -a. This lists all TCP/IP connections. Look for the instance names or the ports , such as :
$ netstat –a | grep –i db2
tcp        0      0 *:db2c_db2a               *:*                         LISTEN
tcp        0      0 *:db2c_db2b               *:*                         LISTEN
tcp        0      0 *:db2c_db2c               *:*                         LISTEN
tcp        0      0 *:db2c_db2d              *:*                         LISTEN
tcp        0      0 *:db2c_db2e              *:*                         LISTEN

Method 3

 On the Linux command line , issue the command :
 $ ps aux | less | grep 'db2sysc'
If you see similar to  this reponse – then the DB2 servers are running.
db2test   13474  0.1 11.3 8860292 3506408 ?     Sl   Jul07  24:49 db2sysc 0     
db2dba  13823  0.0  0.9 1858144 279532 ?      Sl   Jul07   9:09 db2sysc 0

njoy the simplicity.......
Atul Singh

Sunday, July 08, 2012

A few keyboard shortcuts that every Linux user should know

A few keyboard shortcuts that every Linux user should know. Here are a few examples of Linux shortcuts that will help anyone who uses Linux.
Switch to the first text terminal. Under Linux you can have several (6 in standard setup) terminals opened at the same time.

<Ctrl><Alt><Fn> (n=1..6)
Switch to the nth text terminal.

Print the name of the terminal in which you are typing this command.

Switch to the first GUI terminal (if X-windows is running on this terminal).

<Ctrl><Alt><Fn> (n=7..12)
Switch to the nth GUI terminal (if a GUI terminal is running on screen n-1). By default, nothing is running on terminals 8 to 12, but you can run another server there.

(In a text terminal) Autocomplete the command  if there is only one option, or else show all the available options.

Scroll and edit the command history. Press <Enter> to execute.

Scroll terminal output up. Work also at the login prompt, so you can scroll through your bootup messages.

Scroll terminal output down.

(in X-windows) Change to the next X-server resolution (if you set up the X-server to more than one resolution). For multiple resolutions on my standard SVGA card/monitor, I have the following line in the file /etc/X11/XF86Config (the first resolution starts on default, the largest determines the size of the “virtual screen”):
Modes “1024×768″ “800×600″ “640×480″ “512×384″ “480×300″ “400×300″ “1152×864″

(in X-windows) Change to the previous X-server resolution.

(in X-windows) Kill the current X-windows server. Use if the X-windows server crushes and cannot be exited normally.

Shut down the system and reboot. This is the normal shutdown command for a user at the text-mode console. Don’t just press the “reset” button for shutdown!

Kill the current process (mostly in the text mode for small applications).

Log out from the current terminal.  See also the next command.

Send [End-of-File] to the current process. Don’t press it twice else you also log out (see the previous command).

njoy the simplicity.......
Atul Singh

Friday, July 06, 2012

What is Shell, Shell type & Shell script ?

What is a Shell?

·         User interface to AIX/UNIX/LINUX
·         Command interpreter
·         Programming language

Shells Type:

There are lot of shell types, some are listed here

Korn(88)                       - ksh
Korn (93)                      - ksh93
Bourne                          - bsh
bash                              - bash
Restricted                      - Rsh
C                                  - csh
Trusted                         - tsh
POSIX                            - psh
Default                          - sh
Remote                         - rsh

What Is a Shell Script?

·         A readable text file which can be edited with a text editor
-- /usr/bin/vi shell_prog
·         Anything that you can do from the Shell prompt
·         A program, containing:
-- System commands
-- Variable assignments
-- Flow control syntax
-- Shell commands
·         And Comments !

njoy the simplicity.......
Atul Singh

History Of Datastage

DataStage was conceived at VMark, a spin off from Prime Computers that developed two notable products: UniVerse database and the DataStage ETL tool. The first VMark ETL prototype was built by Lee Scheffler in the first half of 1996[1]. Peter Weyman was VMark VP of Strategy and identified the ETL market as an opportunity. He appointed Lee Scheffler as the architect and conceived the product brand name "Stage" to signify modularity and component-orientation[2]. This tag was used to name DataStage and subsequently used in related products QualityStage, ProfileStage, MetaStage and AuditStage. Lee Scheffler presented the DataStage product overview to the board of VMark in June 1996 and it was approved for development. The product was in alpha testing in October, beta testing in November and was generally available in January of 1997.

VMark acquired UniData in October of 1997 and renamed itself to Ardent Software[3]. In 1999 Ardent Software was acquired by Informix[4] the database software vendor. In April of 2001 IBM acquired Informix and took just the database business leaving the data integration tools to be spun off as an independent software company called Ascential Software[5]. In March of 2005 IBM acquired Ascential Software[6] and made DataStage part of the WebSphere family as WebSphere DataStage. In 2006 the product was released as part of the IBM Information Server under the Information Management family but was still known as WebSphere DataStage. In 2008 the suite was renamed to InfoSphere Information Server and the product was renamed to InfoSphere DataStage.

DataStage Editions
Enterprise Edition: a name give to the version of DataStage that had a parallel processing architecture and parallel ETL jobs.
Server Edition: the name of the original version of DataStage representing Server Jobs. Early DataStage versions only contained Server Jobs. DataStage 5 added Sequence Jobs and DataStage 6 added Parallel Jobs via Enterprise Edition.
MVS Edition: mainframe jobs, developed on a Windows or Unix/Linux platform and transferred to the mainframe as compiled mainframe jobs.
DataStage for PeopleSoft: a server edition with prebuilt PeopleSoft EPM jobs under an OEM arrangement with PeopeSoft and Oracle Corporation.
DataStage TX: for processing complex transactions and messages, formerly known as Mercator.
DataStage SOA: Real Time Integration pack can turn server or parallel jobs into SOA services.

njoy the simplicity.......
Atul Singh

Thursday, July 05, 2012

BAT Script to execute a DataStage Job from Command line in Windows

Hi Guys,
Here I am sharing a BAT script for executing the DS job on Windows Server.
Please go through with it and let me know if caught in any issue or problem

@echo off
:: -----------------------------------------------------------------
:: Name: CommonProviderExtract.bat
:: -----------------------------------------------------------------
:: Description:
:: Run the DataStage job from command line using the parameters.
:: -----------------------------------------------------------------
:: Created by: Sreeram Makam, April 29,2010
:: -----------------------------------------------------------------
:: Todo:
::    Add more functionality like job reports.
:: -----------------------------------------------------------------
:: Required Components:
::     dsjob.exe (Windows Version)
:: -----------------------------------------------------------------
:: Command Line Parameters:
:: 1. Host
:: 2. User
:: 3. Password
:: 4. Project
:: -----------------------------------------------------------------
:: Ensure that everything that are set here are not permanent.
:: -----------------------------------------------------------------
:: -----------------------------------------------------------------
:: Test for command line parameters.
:: -----------------------------------------------------------------
    IF "%1"=="" GOTO Syntax
    IF "%2"=="" GOTO Syntax
    IF "%3"=="" GOTO Syntax
    IF "%4"=="" GOTO Syntax
:: -----------------------------------------------------------------
:: Set paramters.
:: -----------------------------------------------------------------
    SET Host=%1
    SET User=%2
    SET Password=%3
    SET Project=%4
 SET TGTSODSPassword=<Password>
 SET EnterpriseInterfacesDSN=<Target DSN>
 SET EnterpriseInterfacesUID=<Username>
 SET EnterpriseInterfacesPWD=<Password>
 SET JobName=Common_ProviderExtract
:: -----------------------------------------------------------------
:: Hard-coded values.  Dependent on each computer.
:: -----------------------------------------------------------------
    SET Designer=C:\Progra~1\Ascential\DataStage\dsdesign.exe
    SET DsJob=C:\IBM\InformationServer\Clients\Classic\dsjob
    SET JobList=DsJobReportList.txt
    SET ProjectList=ProjectList.txt
    SET DSLog=DsJobReportLog
    SET BackupDir=C:\tmp\log
:: -----------------------------------------------------------------
:: Get the current Date
:: -----------------------------------------------------------------
    FOR /f "tokens=2-4 delims=/ " %%a in ('DATE/T') do SET DsxDate=%%c%%a%%b
:: -----------------------------------------------------------------
:: Get the current Time
:: -----------------------------------------------------------------
    FOR /f "tokens=1* delims=:" %%a in ('ECHO.^|TIME^|FINDSTR "[0-9]"') do (SET DsxTime=%%b)
:: -----------------------------------------------------------------
:: Set delimeters so that current time can be broken down into components
:: then execute FOR loop to parse the DsxTime variable into Hr/Min/Sec/Hun.
:: -----------------------------------------------------------------
    SET delim1=%DsxTime:~3,1%
    SET delim2=%DsxTime:~9,1%
    FOR /f "tokens=1-4 delims=%delim1%%delim2% " %%a in ('echo %DsxTime%') do (
        set DsxHr=%%a
        set DsxMin=%%b
        set DsxSec=%%c
        set DsxHun=%%d
:: -----------------------------------------------------------------
:: If provided directory is missing an ending \, append it.
:: Validate %BackupDir%'s existance.
:: -----------------------------------------------------------------
    if exist %BackupDir%\ set BackupDir=%BackupDir%\
    if NOT exist %BackupDir% GOTO BadMain
:: -----------------------------------------------------------------
:: Set the log file name to improve readability of code.
    SET LogFileName=%BackupDir%%DSLog%-%DsxDate%-%DsxHr%_%DsxMin%_%DsxSec%.log
:: -----------------------------------------------------------------
:: Announce to log of this program's run.
:: -----------------------------------------------------------------
    ECHO. > %LogFileName%
    ECHO DsJobReport ran on %DsxDate% %DsxHr%:%DsxMin%:%DsxSec% with the following parameters >> %LogFileName%
    ECHO Host=%Host% >> %LogFileName%
    ECHO User=%User% >> %LogFileName%
    ECHO BackupDir=%BackupDir%%DsxDate%\ >> %LogFileName%
    ECHO Designer=%Designer% >> %LogFileName%
    ECHO DsJob=%DsJob% >> %LogFileName%
    ECHO ProjectList=%ProjectList% >> %LogFileName%
    ECHO JobList=%JobList% >> %LogFileName%
    ECHO DSLog=%DSLog% >> %LogFileName%
    ECHO. >> %LogFileName%
:: -----------------------------------------------------------------
:: Begin Job Execution
:: NOTE:  %ERRORLEVEL% does not work for some reason.
:: -----------------------------------------------------------------
        ECHO %DsJob% -server %Host% -user %User% -password %Password% -run -wait -jobstatus -param $TGTSSODSDSN=%TGTSSODSDSN% -param $TGTSODSUserID=%TGTSODSUserID% -param $TGTSODSPassword=%TGTSODSPassword% %Project% %JobName% >> %LogFileName%
    %DsJob% -server %Host% -user %User% -password %Password% -run -wait -jobstatus -param $TGTSSODSDSN=%TGTSSODSDSN% -param $TGTSODSUserID=%TGTSODSUserID% -param $TGTSODSPassword=%TGTSODSPassword% %Project% %JobName%
        IF NOT %ERRORLEVEL%==0 GOTO ProjFail
        ECHO. >> %LogFileName%
        ECHO *** Completed Job execution for Job: %%i on Host: %Host% Project: %Project% >> %LogFileName%
        ECHO     to File: %BackupDir%%DsxDate%\%%i.html >> %LogFileName%
        ECHO. >> %LogFileName%
:: -----------------------------------------------------------------
::    ECHO *** Export completed successfully for projects:
::    type %TempFile%
:: -----------------------------------------------------------------
:: -----------------------------------------------------------------
:: a job failed to be exported.
:: -----------------------------------------------------------------
    ECHO *** ERROR:  Errorlevel did not return a value 0: %%i on Host: %Host% on Project: %Project%
    ECHO. >> %LogFileName%
    ECHO *** ERROR:  Errorlevel did not return a value 0: %%i on Host: %Host% on Project: %Project% >> %LogFileName%
    ECHO. >> %LogFileName%
:: -----------------------------------------------------------------
:: Report that paramters are not valid on screen and log file.
:: Note: Password are not reported for better security.
:: -----------------------------------------------------------------
    echo Invalid parameters - Unable to access Server.
    echo. >> %LogFileName%
    echo Invalid parameters - Unable to access Server. >> %LogFileName%
    echo. >> %LogFileName%
GOTO Syntax
:: -----------------------------------------------------------------
:: Report that directory is non-existant.
:: -----------------------------------------------------------------
    echo Bad/Non-existing directory: %BackupDir%
    echo Please ensure that you have permission to access/create directories
    echo and files.  Also ensure that directory listed exists.
    echo. >> %LogFileName%
    echo Bad/Non-existing directory: %BackupDir% >> %LogFileName%
    echo. >> %LogFileName%
:: -----------------------------------------------------------------
:: Report that program does not have privs to create directory.
:: -----------------------------------------------------------------
    echo Unable to create subdirectory: %BackupDir%%DsxDate%\
    echo Please ensure that you have permission to access/create directories
    echo and files.
    echo. >> %LogFileName%
    echo Unable to create subdirectory: %BackupDir%%DsxDate%\ >> %LogFileName%
    echo. >> %LogFileName%
:: -----------------------------------------------------------------
:: Report proper syntax usage.
:: -----------------------------------------------------------------
    echo DataStage DsJobReport Routine
    echo Based on design by Sreeram Makam
    echo Usage: CommonProviderExtract Server User Password Project
:: -----------------------------------------------------------------
    ECHO. >> %LogFileName%
:: -----------------------------------------------------------------

njoy the simplicity.......
Atul Singh