Welcome to plsql4all.blogspot.com SQL, MYSQL, ORACLE, TERADATA, MONGODB, MARIADB, GREENPLUM, DB2, POSTGRESQL.

Wednesday 13 August 2014

Multi Load (MLoad) In Teradata

MLoad:-

Teradata Mload, Fload and Bteq stand for Teradata multi load, Teradata Fast load and basic Teradata queries respectively. These are the tool provided by Teradata itself, which can be use for various purposes. We can use multi load and fast load to upload data for external files to Teradata database. External files can be text file, excel files, csv files. These are command driven utilities and can be used only with the help of command prompt in windows. Teradata basic query (Bteq) can be use to execute various DML and DDL command at a time. Multi load and fast load are the highly secured commands and require appropriate privileges to perform insert or update or delete operations. Let’s discuss each of these in details:-

    1)       Teradata Multi load :-

Teradata multi load is also called as Teradata Mload or ML. It’s a command driven utility provided by Teradata. It can be use for high volume data maintenance on multiple Teradata table or views. Multi load job can be use to perform a number of different import and delete jobs on database tables.

         1)        Can do multiple data insert, update, and delete functions on up to five different tables.
         2)        Can have up to 100 DML steps; 
         3)        Can remove large numbers of rows from a single table.  
         4)        Host and LAN support.
         5)        Full Restart capability.
         6)        Error reporting via error tables.

Data source of the input file:-

Teradata multi load can import data from various locations:-

     1)       Disk or tape files 
     2)       Input files on local machine
     3)       Input files on a network-attached workstation.
     4)       Many other sources

Format of the input file:-

We can specify format of the input file. The format could be:-

      ·         FASTLOAD 
      ·         BINARY
      ·         TEXT
      ·         UNFORMAT

      ·         VARTEXT

We can understand more about Teradata multi load utility with the help of below diagram:-

When to use Teradata multi load?

We can use Teradata multi load when we need to perform DML operation on multiple tables in one script. That table may or may not be empty.

Advantages of Mload:-

1)       Batch mode utility that runs on the host system.
2)       Supports up to five populated tables.
3)       Host and LAN support.
4)       Error reporting via error tables.
5)       Supports up to five populated tables.

Limitations of Teradata Multiload Utility:-

1)       MultiLoad Utility doesn’t support SELECT statement.
2)       Concatenation of multiple input data files is not allowed.
3)       MultiLoad doesn’t support Arithmetic functions i.e. ABS, LOG etc. in Mload Script.
4)       MultiLoad doesn’t support Exponentiation and Aggregator Operators i.e. AVG, SUM etc. in Mload Script.
5)       MultiLoad doesn’t support USIs (Unique Secondary Indexes), Referential Integrity, Join Indexes, Hash indexes and triggers.
6)       Import task requires use of primary index (PI).

Operating modes:-

Teradata Multi load runs in the following operating modes: 
• Interactive mode; 
• Batch mode.

Interactive mode: - Teradata Multi load uses terminal screen and keyboard as the standard output and input streams.

Batch mode: - Multi load uses > and < to redirect the standard output / input streams.
To invoke Teradata Multi load. In interactive mode, enter the following command (we are using on Windows):

C:\Documents and Settings\chanchal>MLOAD_SCRIPT_PATH

To invoke Multi Load in batch mode, use the command below (we are using on Windows):-

C:\Documents and Settings\chanchal> MultiLoad [options] < infile > out_file 

Here, the infile is a Teradata MultiLoad job script file which includes all the required Multi load commands and Teradata SQL statements, whereas the out_file is the output stream file which will contain log for that script. 
Let’s try to create sample Mload script and run it.

Before we performed Mload, we would need to make sure that we have already created table in which we need to insert data or we have specify the table creation script in Mload script itself.  For this example we are going to create employees table explicitly before performing the Mload. So let’s create a table as shown below page and below is the sample Mload script. In this script we are using DDL as well:-


Note:-
In logon command, you need to specify the Teradata server name or Teradata IP address in blank section above.


Also note, you would need to alter the path of the infile and outfile according to your infile and outfile path.

Above Mload script is having five phases, called as 5 phases of Mload which are listed below:-


Preliminary Phase: -

This is the first phase and it looks into the basic set up of Mload script. It includes validate all statement in Mload and all SQL statement, start all session whatever specified, create work tables, create error tables, create restart log, apply locks to target tables.

DML Phase:-

This is the second phase in Mload and it looks into the DML transaction of the Mload script. It includes store DML steps in work tables, host data to be filled in from input file and allows link between DML and transaction record. It can also include the table’s creation script which we can use to load data in.

Acquisition Phase:-

This is the third phase in the Mload and it looks into the acquisition of data from input file to appropriate AMP’s. It get’s the data from host and apply it to appropriate AMP worktables, reblock and store in worktable of target table and set up transition to the Application phase.

Application phase:-       

This is the fourth phase in the Mload and it applies the input data to the appropriate Target Tables with the help of AMPs. It receives data from acquisition phase and applies it on target tables.

Clean-up phase:-


This is 5 and last phase in Teradata Mload and it looks into the basic clean up. It releases all locks on the target tables and views. It also drops the temporary work tables and all empty error tables from database.

The most important thing in Teradata utilities that they return particular code. Teradata Mload, Fload and Bteq returns code after executing all the script. Multiload returns the returns codes (0, 4, 8, and 12) to the operating system or calling program and sometimes its call log file. Let’s look at what are the return codes in Teradata.

As discuss above Teradata can return different codes as discuss above. Teradata return:-

0:- When all the process execute successfully.

4:- When there is any warning to the user. For example, if you are trying to create table which is already there 
in the database then it will show you return code as 4.

8:- When user specifies wrong code. For example, when you are trying to access the table or view which is not present in database then it will return you return code 8.

12:- When server fatal error occurs. For example, when you specify wrong number of parameters then it will show you return code 12.

Let’s try to insert below shown excel file into the Teradata database using Mload.

Format of the infile (input file):-

We have a excel file and we need to insert the data which is present in excel file. For that we need to follow below steps:-

                  1)       Screenshot of the excel file



 Note:-

You will have to look into the format of the date as Teradata allows ‘YYYY-MM-DD’ format.

2)   Point your mouse on save as and click on “Other Format”


                    3)       Save excel file as Tab delimited file.




                 4)       Save the file on a particular location


                   5)       Your sample tab delimited file will look like below


Above data is separated by tab (Tab on keyboard) and hence it looks like this (Big gap between two values).

Note:-

      While converting file from excel to tab delimited file just ignore first row as it holds the column header.

After performing above 5 steps, you will have Mload script and input file on a particular location. Now you need   to execute a script so the data from the input file will get uploaded into the database table. Do the following step on command prompt:- 


Above Mload command looks little bit complicated but nothing is complicated in that. I just have specified the name of the Mload script and the name of the log file with full path where it resides (Log file will get generated on the specified path above).

A brief explanation about the above command is below:-
1)       Open command prompt
2)       Go to the path where your Mload script resides (optional)
3)       Type Mload “< >”
4)       Between “< >” type the name of the Mload script.
5)       After “>” type the name of the log file. (This file will contain the info as what has happened in the script.)
6)       And hit enter.

Suppose I have saved the above file as mload_sample, while executing this script I will enter below code on the command prompt:-

Mload < mload_sample> mload_sample_log.log

Above command in command prompt would look like below:-


After executing above Mload command, you will have log file on specified location. Sample log file is attached herewith (double click on the below name). After checking log file, you can say you have imported data successfully.


The phases we have seen above will look like below in the log file. Below is the part of the log file, you need to check full log file to confirm you have uploaded data correctly and the data is present in the table as well.


     ========================================================================
     =                                                                      =
     =          MultiLoad Initial Phase                        =
     =                                                                      =
     ========================================================================
     ========================================================================
     =                                                                      =
     =          MultiLoad DML Transaction Phase    =
     =                                                                      =
     ========================================================================
     ========================================================================
     =                                                                      =
     =          MultiLoad Acquisition Phase               =
     =                                                                      =
     ========================================================================
     ========================================================================
     =                                                                      =
     =          MultiLoad Application Phase              =
     =                                                                      =
     ========================================================================
     ========================================================================
     =                                                                      =
     =          MultiLoad Task Cleanup                     =
     =                                                                      =
     ========================================================================

Last 6 lines will be like below:-

**** 11:53:54 UTY6216 The restart log table has been dropped.
**** 11:53:54 UTY6212 A successful disconnect was made from the RDBMS.
**** 11:53:54 UTY2410 Total processor time used = '0.140625 Seconds'
     .       Start : 11:52:29 - THU DEC 05, 2013
     .       End   : 11:53:54 - THU DEC 05, 2013
     .       Highest return code encountered = '0'.


Look at the above highest return code encounter, its 0. Also you need to cross verify that whether the records inserted are actually inserted in a table or not. You can do it with the help of BTEQ.

2 comments:

Please provide your feedback in the comments section above. Please don't forget to follow.