How can I do a bulk import from Oracle into Hive quickly? At the moment it takes almost 10 hours for 1 billion rows.
Has anyone managed a fast bulk load from Oracle?
Hi PRAKHAR SHUKLA,
Have you tried increasing the number of mappers in the Sqoop script? By default Sqoop runs 4 mappers. You can raise that with -m (--num-mappers) and see how it performs.
I have tried with 6 mappers, but it did not help much.
Are you pulling data from a single table, or is the query joining multiple tables to create a record set?
Are you dumping the table(s) completely or are you looking for data that has only recently been updated?
Is your data partitioned on Oracle and are you pulling data from each of the partitions?
Are you querying on an indexed column?
Is your data evenly spread across your range scans?
(each mapper is running a separate query against a specific range)
How busy is your Oracle database? Your network? Your Hadoop cluster?
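
To make the "evenly spread" question concrete, here is a rough sketch of why extra mappers may not help. It assumes a simplified model of the split: take the min and max of an integer split column and cut that span into equal-width ranges, one per mapper. The real Sqoop splitter differs in its details, and the function names and sample keys below are made up for illustration.

```python
from bisect import bisect_right


def split_ranges(lo, hi, num_mappers):
    """Cut the inclusive span [lo, hi] into num_mappers contiguous ranges.

    Simplified model of an integer range splitter: ranges are near-equal
    in WIDTH, regardless of how many rows actually fall into each one.
    """
    span = hi - lo + 1
    bounds = [lo + (span * i) // num_mappers for i in range(num_mappers + 1)]
    return [(bounds[i], bounds[i + 1] - 1) for i in range(num_mappers)]


def rows_per_mapper(keys, num_mappers):
    """Count how many keys land in each mapper's range."""
    ranges = split_ranges(min(keys), max(keys), num_mappers)
    starts = [start for start, _ in ranges]
    counts = [0] * num_mappers
    for key in keys:
        # Index of the last range whose start is <= key.
        counts[bisect_right(starts, key) - 1] += 1
    return counts


# Hypothetical key distributions, 1,200 rows each.
even_keys = list(range(1, 1201))
skewed_keys = list(range(1, 1001)) + list(range(5_999_801, 6_000_001))

print(rows_per_mapper(even_keys, 6))    # [200, 200, 200, 200, 200, 200]
print(rows_per_mapper(skewed_keys, 6))  # [1000, 0, 0, 0, 0, 200]
```

In the skewed case one mapper pulls most of the rows while four sit idle, so going from 1 to 6 mappers would barely change the wall-clock time. If your split column looks like that, a different split column (or splitting by partition) is probably worth trying before adding more mappers.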