How does the hadoop CLI work with the FileServer for local data?

Question asked by coderfi on Jan 5, 2015
Latest reply on Jan 6, 2015 by coderfi
I am trying to understand why a vanilla `hadoop fs -copyFromLocal` command is failing for me.
I am running the command on a host that is running the FileServer.
The file has part of its blocks residing locally on the same physical host.

What is unusual about my setup is that I am running the hadoop CLI inside a Docker container on that same host. The container shares the host's networking stack, so from the FileServer's point of view the client appears to be connecting locally.

As far as I can determine, when the client and FileServer are local to one another, something in the MapR libraries appears to set up some sort of direct memory access between the client process and the FileServer.

However, something like this would fail (and does fail!), because Docker keeps the host processes isolated from whatever is running inside the container.
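One way to check the isolation part of this guess is to compare Linux namespace IDs as seen from the host and from inside the container. The sketch below is only an illustration: the helper names `ns_id` and `compare_ns` are made up for this post, and it does not prove that the MapR client actually uses shared memory; it only shows whether the container shares the host's network namespace while having its own IPC namespace.

```shell
#!/bin/sh
# Hypothetical helpers (not part of MapR or Docker).

# Print the kernel namespace ID of the calling process.
# $1 is a namespace name found under /proc/self/ns (net, ipc, pid, ...).
ns_id() {
    readlink "/proc/self/ns/$1"
}

# Compare two namespace IDs collected by hand.
# $1 = namespace name, $2 = ID seen on the host, $3 = ID seen in the container
compare_ns() {
    if [ "$2" = "$3" ]; then
        echo "$1: shared"
    else
        echo "$1: isolated"
    fi
}

# Intended use (run manually):
#   on the host:          ns_id net; ns_id ipc
#   inside the container: ns_id net; ns_id ipc
#   then:                 compare_ns net "<host id>" "<container id>"
#                         compare_ns ipc "<host id>" "<container id>"
```

With host networking I would expect `net` to come out as shared and `ipc` as isolated, which would at least be consistent with the theory. Docker also has an `--ipc=host` option for sharing the host's IPC namespace; whether that is enough for the MapR client is an assumption I have not verified.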

Do you think this is what is happening? Basically, the client and FileServer are trying to 'talk' to one another directly but cannot?

Note that when the data resides on other physical hosts, the command works fine.
Writing data (i.e. `hadoop fs -copyFromLocal`) also works while blocks are being written to other hosts, but fails as soon as it tries to write a block to the local host.
`hadoop fs -ls` also seems to work perfectly fine (I guess because those commands flow through the CLDB, which is remote).