I just want to clarify the "Modify input data" part of Lab 8.1 using standard Hadoop commands. Is the point of this to show us that MapR supports standard Hadoop commands on MapR-FS?
The bit of confusion I have comes from the earlier discussion of the limitations of HDFS, one being that data is immutable: to modify a file, you actually have to copy it locally, modify it, and then put it back onto the cluster. HDFS will not replace a file in place; it just adds the new file.
In our lab exercise we:
1) put data onto the cluster
2) copy the file we want to change (resolve.conf) to the local filesystem using hadoop fs -get
3) modify /tmp/resolve.conf locally
4) remove the existing file from the cluster using hadoop fs -rm /h-input/resolve.conf (because if we don't remove it, the next step fails). Would we actually be able to do this step in an HDFS environment, or does it only work because we are using MapR-FS?
5) put the modified file back onto the cluster using hadoop fs -put /tmp/resolve.conf /h-input
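For reference, here is the command sequence for steps 2-5 as I understand it. The paths are the ones from the lab; the sed line is just an illustrative way to edit the local copy (any editor works), and of course this all assumes a running cluster:

```shell
# 2) copy the file from the cluster to local disk
hadoop fs -get /h-input/resolve.conf /tmp/resolve.conf

# 3) edit the local copy (sed shown only as an example edit)
sed -i 's/old-entry/new-entry/' /tmp/resolve.conf

# 4) remove the existing file from the cluster first,
#    since a plain -put will not overwrite an existing file
hadoop fs -rm /h-input/resolve.conf

# 5) put the modified copy back onto the cluster
hadoop fs -put /tmp/resolve.conf /h-input
```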
Just trying to further clarify the differences between HDFS and MapR-FS.