I am going through the MapR Certified Data Analyst guide PDF. It has 10 sample questions, and I have a doubt about question 10. The question is below.
You have a 20 node cluster that includes 10 nodes dedicated for data analysis. These are all data nodes in the topology, /data/analysis/. To best take advantage of data locality with your queries, you will install Drill on:
a) every node in the /data/analysis/ topology, and all of the control nodes.
b) every node outside of the /data/analysis/ topology.
c) every node in the /data/analysis/ topology.
d) every node in the cluster.
I would like to discuss the answer options and understand why option C is correct.
My general understanding is that, to get data locality, Drill must be installed on the nodes that hold the data being queried. The question says that 10 of the 20 nodes are data nodes in the /data/analysis/ topology. That rules out option B: installing Drill only on nodes outside that topology gives no data locality for the analysis data. Options A and D both extend the installation beyond the /data/analysis/ topology (A adds the control nodes, D adds every remaining node). Either would still achieve data locality, because the 10 analysis nodes are covered, but I believe we would end up with redundant Drill installations on control nodes and other nodes that do not hold the analysis data. So installing Drill only on the 10 nodes in the /data/analysis/ topology is enough to achieve data locality, and hence option C is right.
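To make my reasoning concrete, here is a rough Python sketch of how I am comparing the options. It only models set membership, not Drill itself. The node names, the number of control nodes (3), and the split of the remaining nodes are my own assumptions, since the question only gives the totals 20 and 10.

```python
# Hypothetical node names; the question only states 20 nodes total, 10 in /data/analysis/.
analysis = {f"analysis-{i}" for i in range(10)}  # nodes in the /data/analysis/ topology
control = {f"control-{i}" for i in range(3)}     # assumed count of control nodes
other = {f"other-{i}" for i in range(7)}         # assumed remaining nodes
cluster = analysis | control | other

# Where each answer option would install Drill.
options = {
    "a": analysis | control,
    "b": cluster - analysis,
    "c": analysis,
    "d": cluster,
}

def evaluate(install_set):
    """Return (covers every analysis node?, number of installs outside /data/analysis/)."""
    covers_all = analysis <= install_set
    redundant = len(install_set - analysis)
    return covers_all, redundant

results = {name: evaluate(nodes) for name, nodes in options.items()}

# Keep only options that cover every analysis node, then pick the fewest extra installs.
candidates = {name: extra for name, (ok, extra) in results.items() if ok}
best = min(candidates, key=candidates.get)

for name in sorted(results):
    print(name, results[name])
print("best:", best)
```

Under these assumptions, options A, C and D all cover the analysis nodes, and C is the only one with no installs outside the /data/analysis/ topology.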
Is my understanding right?