FastLoad as name suggest can load vast amounts of data from flat files from a host into empty tables in Teradata with lightning-like speed. FastLoad was developed to loads millions of record into Empty Teradata tables. In my earlier post i have briefly explained about the most used Teradata Data Loading Utilities Features, Uses and Best Practices. Now i am going to explain detail post about Fastload utility which is very simple but very powerful if there is requirement to just load the data in an empty Teradata table.
Fastload is a command line utility through which you can load data from Disk on a channel attached system,Flat files on a network attached system and also from any source which is properly formatted. FastLoad typically offers higher performance levels than a standard application program written to load data to an empty table because data allocation, conversion, movement, and loading are automatic and performed in parallel. Also as Fastload does not use Transient Journal while loading the data, so faster the speed. Fastload is only used for Inserting the data, it cannot be used for any other DML operation.
A typical Fastload script consist Fastload commands and Teradata SQL statement, out of which Fastload commands provide the session control and data handling specifications for the data load operations, and Teradata SQL statements perform the actual data load functions on the Teradata Tables. Using a single Fastload script you can load a single table only, so to load multiple table you need to submit multiple fastload jobs, one for each table.
Automatic Data Conversion in Fastload
Fastload support automatic data conversion and can convert appropriate data type according to the definition of the destination table. The FastLoad can convert data as the following rules
- Numeric – to – numeric
- Character – to – numeric
- Character – to – date
- Date – to – character
Fastload Operating Modes
Fastload supports two operating modes
Interactive Mode:- In interactive mode, FastLoad use terminal screen and keyboard as the standard output and input streams. If you
want to invoke FastLoad in interactive mode, use command :
Batch Mode:- In batch mode, FastLoad uses > and < to redirect the standard output / input streams.
To invoke FastLoad in batch mode, use the command below (for Microsoft Windows):
c:\teradata\bin\fastload [options] < infile > outfile
Here, the infile is a Teradata FastLoad job script file which includes all the required FastLoad commands and Teradata SQL statements, whereas the outfile is the FastLoad output stream file.
Fast Load Phase
Fast job is divided into two phases
Phase 1- Acquisition Phase:- The primary function of Phase 1 is to transfer data from the host computer to the Access Module Processors (AMPs) as quickly as possible. Much like a post office package being first sent to a shipping hub in a seemingly unrelated city and then, from that hub it is sent to the destination city.
- FastLoad uses one SQL session to define AMP steps and another SQL session for log table restart operations.
- FastLoad sends data in form of 64k block of records to each AMP which has a “Load” session.
- The deblocking task within the AMP hashes each record and redistributes the rows.
- Each and every AMP has a receiving task which collects the rows and writes a block of unsorted data rows to disk.
- At the end of Phase 1, each AMP has the rows it should have, but the rows are not in row hash sequence.
Phase 2- Application Phase:- Phase 2 can be compared to preparing the package for arrival at the destination city, it must first be sorted by street and zip code, placed onto local trucks and be driven to their final, local destinations. Similarly, FastLoad’s Phase 2 is mission critical for getting every row of data to its final address (i.e., where it will be stored on disk).
- When the FastLoad job receives the END LOADING statement, only then does the FastLoad start the End Loading (sort phase).
- Each AMP simultaneously sorts the Target table, puts the rows into their blocks, and then writes the blocks to the AMPs assigned disk.
- If the table has Fallback the rows are generated and written.
- Table data is now available after the Application Phase completes.
Prerequisite for executing Fastload Script
Empty Table :- Fastload can be used to load only empty tables and due to that there should not be any record prior to running the fastload script.If the table is not empty then fastload will pause and show the error.
Log Table :- Log table is used to store all the progress during execution of fastload.
Error Table :- Fastload requires two error tables to catch any error which happens during execution, if we don’t create error tables then this utility automatically create two tables with below names
- ET_TableName //ERROR Table
- UV_TableName //UNIQUE VIOLATION Table
First error table captures below types of errors
- Data conversion error
- Constraint validation
- Unavailable AMP error
Second error table captures error related to Unique Primary Index. Suppose the table which you are loading has UPI defined and your source file contains duplicates data then those record will be moved into UV table.
These table comes very useful in error handling.
Sample FastLoad Script to Load Data From Delimited Flat File
This script is will INSERT data into an empty Teradata table called TEST_FL. This table exists in the database DEV.
FastLoad will first logon, then it will build the table structure (unless it already exists and is empty). After that it will begin loading, but it will always define two error tables. Here we just need to give two error table names, Fastload will automatically create the structure.
If everything works fine then the INSERT is performed and we are done. Fastload will automatically drop the error tables once load is successful.
Fastload script with explanation given in comment
.logmech ldap /* if you are using LDAP mechanism for authentication otherwise optional*/ .SESSIONS 4; .LOGON servername/userid,password; /*Teradata server, Your User ID, Password For better security mention the login details in another file and use RUN logon.txt command in FastLoad*/ DROP TABLE DEV.TEST_FL; /*To drop existing table*/ DROP TABLE DEV.ERR_FL1; /*To Drop error table*/ DROP TABLE DEV.ERR_FL2; /*To Drop error table*/ CREATE MULTISET TABLE DEV.TEST_FL ,NO FALLBACK , NO BEFORE JOURNAL, NO AFTER JOURNAL, CHECKSUM = DEFAULT, DEFAULT MERGEBLOCKRATIO ( EID VARCHAR(6), NAME VARCHAR(30) CHARACTER SET LATIN NOT CASESPECIFIC) PRIMARY INDEX ( EID ); /*SET RECORD is used to define the delimiter used in file, here | is the delimiter for current flat file*/ SET RECORD VARTEXT “|”; /*to define the structure of the file use DEFINE and mention the file path from which, table will be loaded*/ DEFINE EID (VARCHAR(6)) ,NAME (VARCHAR(20)) FILE = /tmp/employee; /*flat file path*/ /*BEGIN will start the load process */ BEGIN LOADING DEV.TEST_FL ERRORFILES DEV.ERR_FL1,DEV.ERR_FL2; INSERT INTO DEV.TEST_FL VALUES( :EID, :NAME); END LOADING; LOGOFF; /*last statement in the fastload script*/
I have given very simple script though which you can understand the loading process and how fastload script works.
Limitation with FastLoad Utility
Below are some points which you need to consider before using fastload for data loading
- Target Table must be empty
- A single Fastload script can load a single empty table at a time
- No Secondary indexes are allowed on Target Table, because of the way this utility is designed.
- No referential integrity allowed on Target Table. If there is RI on table then it must be dropped before loading and then later it can be created.
- No trigger allowed at the load time. If there is trigger defined then it must be disabled prior to loading the data.
- No duplicate data will be loaded even if the table is multi-set.
- No more than one data type conversion is allowed per column during fast load.
I have tried to give a quick explanation about Teradata FastLoad utility. Please feel free to share the post and also give your opinion and feedback in the comment.
Reference : Teradata Load Utilities