
NFS Write Failures in 3.0.2

Question asked by impermisha on Jan 21, 2014
Latest reply on Jan 23, 2014 by mufeed
Copying files into the cluster via NFS is sporadically failing. This happens regardless of the host writing to the NFS.

The error message:
cp: closing `/mnt/c1/files/file.gz': Input/output error

A hadoop put of the same file goes in fine:
hadoop dfs -put file.gz /files/

If you delete the 0-byte file that is left behind and retry the cp over the mount, it fails consistently.
I have also restarted the entire cluster for kicks, with no change.

From the CLDB log:
2014-01-21 13:01:30,093 INFO  com.mapr.fs.cldb.CLDBServer [RPC-12]: ContainerAssign ignore Writer 10.100.0.100:0 for volume, incoming with max size, 2861669 and with 432 containers

From the NFS log:
2014-01-21 13:01:30,1414 ERROR nfsserver[25623] fs/nfsd/fileops.cc:3769 10.100.0.29[0x86eb91d] Write failed: status=22 [nfsfh=0.3037188352.2323.1167.3551098] fs=10.100.0.111:5660
2014-01-21 13:01:30,1439 ERROR nfsserver[25623] fs/nfsd/fileops.cc:3769 10.100.0.29[0x76eb91d] Write failed: status=22 [nfsfh=0.3037188352.2323.1167.3551098] fs=10.100.0.111:5660
2014-01-21 13:01:30,1467 ERROR nfsserver[25623] fs/nfsd/fileops.cc:3769 10.100.0.29[0x66eb91d] Write failed: status=22 [nfsfh=0.3037188352.2323.1167.3551098] fs=10.100.0.111:5660
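
To see whether the failures cluster on one fileserver or one file handle, the "Write failed" lines can be tallied with a short script. This is only a sketch: the regular expression assumes the line layout quoted above, and `summarize_write_failures` is a name made up for illustration, not part of any MapR tooling.

```python
import re
from collections import Counter

# Assumes the "Write failed" layout seen in the NFS log lines quoted above.
WRITE_FAILED = re.compile(
    r"Write failed: status=(?P<status>\d+) "
    r"\[nfsfh=(?P<nfsfh>[^\]]+)\] fs=(?P<fs>\S+)"
)


def summarize_write_failures(lines):
    """Count write failures per (status, fileserver, NFS file handle)."""
    counts = Counter()
    for line in lines:
        match = WRITE_FAILED.search(line)
        if match:
            key = (int(match["status"]), match["fs"], match["nfsfh"])
            counts[key] += 1
    return counts


# The three log lines from this post, used as sample input.
sample = [
    "2014-01-21 13:01:30,1414 ERROR nfsserver[25623] fs/nfsd/fileops.cc:3769 "
    "10.100.0.29[0x86eb91d] Write failed: status=22 "
    "[nfsfh=0.3037188352.2323.1167.3551098] fs=10.100.0.111:5660",
    "2014-01-21 13:01:30,1439 ERROR nfsserver[25623] fs/nfsd/fileops.cc:3769 "
    "10.100.0.29[0x76eb91d] Write failed: status=22 "
    "[nfsfh=0.3037188352.2323.1167.3551098] fs=10.100.0.111:5660",
    "2014-01-21 13:01:30,1467 ERROR nfsserver[25623] fs/nfsd/fileops.cc:3769 "
    "10.100.0.29[0x66eb91d] Write failed: status=22 "
    "[nfsfh=0.3037188352.2323.1167.3551098] fs=10.100.0.111:5660",
]

for (status, fs, nfsfh), n in summarize_write_failures(sample).items():
    print(f"status={status} fs={fs} nfsfh={nfsfh} failures={n}")
```

In practice the input would be the lines of the NFS server log file rather than the hard-coded sample. For the three lines above, all failures land on the same fileserver (10.100.0.111:5660) and the same file handle.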

If you take that node offline, the error just moves:
2014-01-21 13:05:47,927 INFO  com.mapr.fs.cldb.Containers [RPC-8]: ContainerFailure reported by FileServer 10.100.0.101 for container 3117 on StoragePool b1e34b2f9a49ab0d00528bcbc304a34a on failed fileserver 10.100.0.103

2014-01-21 13:01:30,1414 ERROR nfsserver[25623] fs/nfsd/fileops.cc:3769 10.100.0.29[0x86eb91d] Write failed: status=22 [nfsfh=0.3037188352.2323.1167.3551098] fs=10.100.0.111:5660
2014-01-21 13:01:30,1439 ERROR nfsserver[25623] fs/nfsd/fileops.cc:3769 10.100.0.29[0x76eb91d] Write failed: status=22 [nfsfh=0.3037188352.2323.1167.3551098] fs=10.100.0.111:5660
2014-01-21 13:01:30,1467 ERROR nfsserver[25623] fs/nfsd/fileops.cc:3769 10.100.0.29[0x66eb91d] Write failed: status=22 [nfsfh=0.3037188352.2323.1167.3551098] fs=10.100.0.111:5660
