We are setting up a MapR cluster for security analytics, and as a first step we need to survey all the data available for analysis. One part of that data is logs: WMI logs (Windows Event Logs) and machine logs (usually in XML or JSON).
As a data engineer, I need to support our data scientist, who has asked me to parse these XML files into a structured format (tables or CSV). Are there any tools available in MapR or Hadoop for parsing these files? I am trying to minimize the time I would otherwise spend building this from scratch (my fallback plan is a Python script that runs on Spark).
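To make the from-scratch option concrete, here is a minimal sketch of the kind of parsing I have in mind, using only the Python standard library. The sample XML, the field list, and the function names (`parse_events`, `rows_to_csv`) are all hypothetical placeholders; real Windows Event Log XML uses namespaces and many more fields, and a Spark job would presumably apply a function like `parse_events` to each file rather than to one string.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical sample record set; real event log XML is namespaced and richer.
SAMPLE_XML = """<Events>
  <Event><EventID>4624</EventID><Computer>host-a</Computer><Level>0</Level></Event>
  <Event><EventID>4625</EventID><Computer>host-b</Computer><Level>0</Level></Event>
</Events>"""

# Assumed set of columns the data scientist wants; adjust to the real schema.
FIELDS = ["EventID", "Computer", "Level"]


def parse_events(xml_text, fields=FIELDS):
    """Turn each <Event> element into a flat dict with one key per field."""
    root = ET.fromstring(xml_text)
    rows = []
    for event in root.iter("Event"):
        # Missing child elements become empty strings instead of None.
        rows.append({name: event.findtext(name, default="") for name in fields})
    return rows


def rows_to_csv(rows, fields=FIELDS):
    """Serialize the flat dicts as CSV text with a header row."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=fields, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


rows = parse_events(SAMPLE_XML)
csv_text = rows_to_csv(rows)
print(csv_text)
```

This works for small, flat records, but it does not handle nested elements, attributes, or files too large to hold in memory, which is why I would prefer an existing tool if one fits.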
Any input is appreciated.