I am working on a project that involves analyzing health care claims data (structured) together with data from EMRs (Electronic Medical Records), which is usually unstructured. Following the traditional approach, we currently write complex ETL jobs to import the data into an RDBMS. Every time the structure of the data changes, the ETL jobs have to change as well.
I understand that I can load the data into Hadoop as a flat file, but that alone would not solve my problem. Ideally, I am looking for a solution that can ingest semi-structured data and let me import it and join it with the structured data without days of coding.
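To make the goal concrete, here is a rough sketch of the behaviour I am after, in plain Python. It is only an illustration, not a proposed solution: the field names (`patient_id`, `claim_id`, `vitals`, and so on), the sample records, and the helpers `flatten` and `join_on` are all made up for this example. The point is that a new or nested field in the EMR data should show up in the joined result without anyone touching the code.

```python
import json
from collections import defaultdict


def flatten(record, prefix=""):
    """Flatten nested dicts into one dict with dotted keys (lists are kept as-is)."""
    flat = {}
    for key, value in record.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=f"{name}."))
        else:
            flat[name] = value
    return flat


def join_on(claims, emr_records, key):
    """Left-join structured claim rows with flattened EMR records on `key`."""
    # Index the flattened EMR records by the join key.
    emr_by_key = defaultdict(list)
    for record in emr_records:
        flat = flatten(record)
        emr_by_key[flat.get(key)].append(flat)

    joined = []
    for claim in claims:
        # Claims without any EMR match are kept, with no EMR columns added.
        matches = emr_by_key.get(claim[key]) or [{}]
        for emr in matches:
            row = dict(claim)
            # Prefix EMR fields so they cannot overwrite claim columns.
            row.update({f"emr.{k}": v for k, v in emr.items() if k != key})
            joined.append(row)
    return joined


# Hypothetical structured claims, as they might come out of the RDBMS.
claims = [
    {"patient_id": "P1", "claim_id": "C100", "amount": 250.0},
    {"patient_id": "P2", "claim_id": "C101", "amount": 90.0},
]

# Hypothetical semi-structured EMR records; each one has a different shape.
emr_json = """[
  {"patient_id": "P1", "vitals": {"bp": "120/80"}, "note": "follow-up in 2 weeks"},
  {"patient_id": "P1", "allergies": ["penicillin"]}
]"""
emr_records = json.loads(emr_json)

rows = join_on(claims, emr_records, "patient_id")
for row in rows:
    print(row)
```

Here the two EMR records for the same patient have different fields, yet both join to the claim without any schema being declared up front. That is the property I would like to get from a tool at scale, rather than from hand-written code like this.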
All suggestions are welcome.