Apache Drill - Where to install drillbit?

Discussion created by Lokesh on Aug 8, 2017
Hi All,


I am going through the MapR Certified Data Analyst guide PDF. It has 10 sample questions, and I have a doubt about question 10, which is quoted below.


You have a 20 node cluster that includes 10 nodes dedicated for data analysis. These are all data nodes in the topology, /data/analysis/. To best take advantage of data locality with your queries, you will install Drill on: 


a) every node in the /data/analysis/ topology, and all of the control nodes.

b) every node outside of the /data/analysis/ topology.

c) every node in the /data/analysis/ topology.

d) every node in the cluster. 


I would like to discuss the answer options and understand why option C is correct.


My understanding is that, for data locality, Drill must be installed on the nodes that hold the data. The question says that 10 of the 20 nodes are data nodes dedicated to analysis, all in the /data/analysis/ topology. Option B is ruled out because installing Drill only on nodes outside that topology gives no data locality for the analysis data. Options A and D both install Drill on every node in /data/analysis/, so they would achieve data locality, but A also adds the control nodes and D adds every other node in the cluster, which I believe leaves us with redundant Drill installations on nodes that do not hold the data. So installing Drill only on the 10 nodes in the /data/analysis/ topology is enough to achieve data locality, and hence option C is right.
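
The comparison of the four options can be sketched in a few lines of Python. This is only an illustrative model, not anything taken from Drill or MapR: the node names are made up, the number of control nodes (three here) is an assumption because the question does not state it, and the model assumes the queried data lives only on the /data/analysis/ nodes.

```python
# Hypothetical model of the 20-node cluster described in the question.
# Node names and the control-node count are assumptions for illustration.
analysis_nodes = {f"node{i:02d}" for i in range(1, 11)}   # /data/analysis/ topology
other_nodes = {f"node{i:02d}" for i in range(11, 21)}     # rest of the cluster
control_nodes = {"node11", "node12", "node13"}            # assumed subset of other_nodes
all_nodes = analysis_nodes | other_nodes

# Where each answer option would install a Drillbit.
options = {
    "a": analysis_nodes | control_nodes,
    "b": all_nodes - analysis_nodes,
    "c": set(analysis_nodes),
    "d": set(all_nodes),
}

def evaluate(drill_nodes):
    """Return (fraction of data nodes with a local Drillbit,
    number of Drillbits on nodes that hold no analysis data)."""
    covered = len(drill_nodes & analysis_nodes) / len(analysis_nodes)
    redundant = len(drill_nodes - analysis_nodes)
    return covered, redundant

results = {name: evaluate(nodes) for name, nodes in options.items()}

for name, (covered, redundant) in sorted(results.items()):
    print(f"option {name}: local coverage {covered:.0%}, "
          f"Drillbits off the data nodes: {redundant}")

# Among options with full local coverage, prefer the fewest extra Drillbits.
best = min(
    (name for name, (covered, _) in results.items() if covered == 1.0),
    key=lambda name: results[name][1],
)
print("best option under these assumptions:", best)
```

Under these assumptions, options A, C and D all put a Drillbit next to every block of analysis data, and C is the only one of them that adds no Drillbits elsewhere.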


Is my understanding right?