Author: Sanjamala Nayeem
Original Publication Date: April 3, 2015
- MapR 4.0.1
- sqoop 1.4.5
When using the Teradata as database from which data is imported to mfs, and user do not have create table privileges, import will fail if more than one map is opted in parallel to be executed.
This Article is to show how to handle import if user do not have privileges of create table while importing data from Teradata.
By default, temporary tables are used to ensure atomicity. During import operation, a temporary table is created with a copy of the original data divided into a number of partitions. The number of partitions in the temporary table is the same as the number of mappers in Hadoop (as specified by the user). Then each individual mapper pulls data from one single partition, and multiple mappers run in parallel to one another. After data are successfully imported, the temporary table is deleted.
In the opposite direction, during export operation, each individual mapper creates a temporary table to load data into. Similar to import operation, multiple mappers runs in parallel to one another. At the end, data from all temporary tables are merged into the target table, and then the temporary tables are deleted as well. If for any reason export operation cannot be completed successfully, the target table will remain intact.
Disabling the temporary tables
While temporary tables provide the benefit of atomicity, it also takes up additional time and space. Hence, there is a switch available to turn off the temporary table usage as needed depending on the use case scenario. To disable the temporary table usage during import operation, for example,
sqoop import -D sqoop.teradata.import.use.temporary.table=false --connect jdbc:teradata://localhost/DATABASE=sqooptest --username sqooptest --password xxxxx --table MY_TABLE --num-mappers 3 --target-dir /user/sqooptest/MY_TABLE
In this case, all mappers pull data from the source table MY_TABLE directly at the same time.