
StreamSets sdc can't detect directories in user-created volumes of an NFS-mounted cluster

Question asked by reedv on Jan 3, 2018
Latest reply on Jan 25, 2018 by Mufy

I have installed StreamSets and its prerequisites according to the StreamSets docs, set the user env. vars. so that the default user and group are both called "mapr" (see here), and restarted the sdc service. The problem is that when I use a Local FS origin that points to a directory in a volume on the cluster (with access granted to the "mapr" user), which is mounted on the node via NFS, previewing the pipeline with just the origin fails with an error that the file does not exist. Yet moving that directory to the top level of the NFS mount, /mapr/mycluster.cluster.local/myorigindir, seems to clear the problem, and the pipeline can be previewed without validation errors.

Further inspection in the MCS showed that the directory I was originally trying to load from was in a custom volume that only had read/write permissions for the "mapr" user, while the root volume "/" has public read/write permissions by default. So OK, but I set the sdc user env. vars. to use the name "mapr", so why was I rejected(?) when trying to access the custom volume? What actually happens when I set the env. vars. for sdc (because they do not seem to be doing anything)?

This is a snippet of what my /opt/streamsets-datacollector/libexec/sdcd-env.sh file looks like:

# user that will run the data collector, it must exist in the system
#
export SDC_USER=mapr

# group of the user that will run the data collector, it must exist in the system
#
export SDC_GROUP=mapr
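
To check whether these settings actually took effect, one option is to look at which user the running sdc process belongs to. The following is only a minimal sketch, assuming a Linux node with /proc available; the helper names `proc_uid` and `runs_as` and the "streamsets" search pattern are hypothetical and not part of StreamSets.

```shell
#!/bin/sh
# Sketch only: helper names and the "streamsets" pattern are illustrative
# assumptions, not anything provided by StreamSets or MapR.

# Print the numeric effective UID of the process with PID $1.
# In /proc/<pid>/status the "Uid:" line lists real, effective, saved, fs UIDs.
proc_uid() {
    awk '/^Uid:/ { print $3 }' "/proc/$1/status"
}

# Succeed if the process with PID $1 runs under the UID of user name $2.
runs_as() {
    [ "$(proc_uid "$1")" = "$(id -u "$2")" ]
}

# Example usage (assumes the sdc command line contains "streamsets"):
#   pid=$(pgrep -f streamsets | head -n 1)
#   if runs_as "$pid" mapr; then
#       echo "sdc runs as mapr"
#   else
#       echo "sdc runs as UID $(proc_uid "$pid"), not mapr"
#   fi
```

If the reported UID differs from `id -u mapr`, that would suggest the env. vars. in this file are not being applied by the way the service is started, which could explain the rejected access to the custom volume.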

(This is similar to a question I posted here, except that this one deals specifically with what setting the user env. vars. for sdc really does.)
