I have a single table in MySQL that contains around 24,000,000 records. I need a way to import this data into a table in MapR DB with multiple column families. I initially chose Sqoop for the import, but later found that I cannot use it to load the data directly, because Sqoop does not yet support importing into multiple column families.
I have already used Sqoop to copy the data from the MySQL database into MapR FS.
What are my options for importing this data from MapR FS into a MapR DB table with 3 column families?
It seems that for a bulk import I have two choices:
- The ImportTSV tool: this probably requires the source data to be in TSV format, but the data that Sqoop imported from MySQL into MapR FS appears to be in CSV format. What is the standard way to handle this?
- Write a custom MapReduce program to translate the data in MapR FS into HFiles and load them into MapR DB.
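For the first option, as far as I can tell ImportTsv accepts a custom field separator through `-Dimporttsv.separator`, so comma-separated input might not need converting at all. Below is a minimal sketch of how I would assemble the command. The table path, input directory, column names, and the family names `cf1`, `cf2`, `cf3` are hypothetical placeholders, and it assumes no field contains a quoted or embedded comma.

```python
def build_importtsv_args(table, input_dir, field_map, separator=","):
    """Build an argv list for HBase's ImportTsv tool.

    field_map has one entry per input field, in file order:
    None marks the row-key field, otherwise (family, qualifier).
    """
    specs = []
    for entry in field_map:
        if entry is None:
            specs.append("HBASE_ROW_KEY")
        else:
            family, qualifier = entry
            specs.append(f"{family}:{qualifier}")
    if specs.count("HBASE_ROW_KEY") != 1:
        raise ValueError("exactly one field must be the row key")
    return [
        "hbase",
        "org.apache.hadoop.hbase.mapreduce.ImportTsv",
        f"-Dimporttsv.separator={separator}",
        "-Dimporttsv.columns=" + ",".join(specs),
        table,
        input_dir,
    ]


# Hypothetical layout: id is the row key, one column in each of three families.
args = build_importtsv_args(
    "/tables/customers",          # assumed MapR DB table path
    "/user/me/customers_csv",     # assumed Sqoop output directory
    [None, ("cf1", "name"), ("cf2", "email"), ("cf3", "city")],
)
print(" ".join(args))
```

The printed line is the command I would then run on the cluster. Whether this works unchanged against a MapR DB table is exactly what I am unsure about.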
I want to confirm whether these are the only two options for loading the data. That seems rather restrictive, given that this is a very basic requirement in any system.
If a custom MapReduce program is the way to go, an example or working sample would be really helpful.
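To make clear what I mean by the translation step, here is a rough sketch of the per-record logic I would expect the mapper to perform: turning one CSV line into (row key, family, qualifier, value) cells. It is plain Python rather than an actual MapReduce job, and the field layout and family names are hypothetical. A real job would presumably emit these cells through the HBase API so they end up in HFiles.

```python
import csv
import io

# Hypothetical layout: None marks the row-key field,
# every other field goes to (family, qualifier).
FIELD_MAP = [None, ("cf1", "name"), ("cf2", "email"), ("cf3", "city")]


def row_to_cells(line, field_map=FIELD_MAP):
    """Split one CSV line into (row_key, family, qualifier, value) tuples."""
    # csv.reader copes with quoted fields that contain commas.
    fields = next(csv.reader(io.StringIO(line)))
    if len(fields) != len(field_map):
        raise ValueError(
            f"expected {len(field_map)} fields, got {len(fields)}"
        )
    row_key = fields[field_map.index(None)]
    cells = []
    for entry, value in zip(field_map, fields):
        if entry is None:
            continue  # the row key is not stored as a cell
        family, qualifier = entry
        cells.append((row_key, family, qualifier, value))
    return cells


for cell in row_to_cells('42,Ada,ada@example.com,London'):
    print(cell)
```

What I am missing is the surrounding job: how to write these cells out as HFiles and load them into the MapR DB table.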