
Access to S3 for input & output files

Question asked by jldupont on Aug 31, 2012
Latest reply on Sep 4, 2012 by gera
I am using Amazon EMR with MapR. I have tried specifying S3 file paths for the input and output, to no avail. I always get an error like this:

    Caused by: java.io.IOException: Could not resolve path: s3://some_other_bucket/out
     at com.mapr.fs.MapRFileSystem.lookupClient(MapRFileSystem.java:223)
     at com.mapr.fs.MapRFileSystem.delete(MapRFileSystem.java:389)

I use the following JAR arguments (example):

`ClassName -input s3n://some_bucket/ -output s3n://some_other_bucket/out -mapper 20 -reducer 4`

Please help! I am a total Hadoop newbie!

**Update**: After SSHing into the master node, I get:


    hadoop@ip-10-70-7-195:/mnt/var/log/hadoop/steps/1$ hadoop fs -ls s3n://some_bucket/
    12/09/02 20:01:24 INFO metrics.MetricsSaver: MetricsSaver FsShell root:hdfs:///mnt/var/lib/hadoop/metrics/ period:60 instanceId:i-d0d50baa jobflow:j-25TKD097Y72O6
    12/09/02 20:01:24 INFO metrics.MetricsUtil: supported product mapr-m3
    12/09/02 20:01:24 INFO metrics.MetricsSaver: Disable MetricsSaver due to MapR cluster
    12/09/02 20:01:24 INFO metrics.MetricsSaver: Inside MetricsSaver Shutdown Hook

**Update2**: I have also tried supplying a custom `core-config.xml` file during the bootstrapping process:

    <property>
            <name>fs.default.name</name>
            <value>s3n://</value>
    </property>
    
    <property>
            <name>dfs.name.default</name>
            <value>s3n://</value>
    </property>

This time I get a different error:

    Caused by: org.apache.hadoop.mapred.InvalidInputException: Input path does not exist: /user/hadoop/some_bucket
     at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:225)
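
To compare the paths that appear in the arguments and in the two errors, here is a minimal sketch that splits each string into scheme, bucket, and path. It uses only `java.net.URI` from the JDK, not the Hadoop or MapR API, so it says nothing about how either of them resolves paths; the class name `PathCheck` and the method `describe` are hypothetical names chosen for illustration.

```java
import java.net.URI;

public class PathCheck {
    // Splits a path string into its URI parts so that a fully qualified
    // path (scheme + bucket) can be told apart from a scheme-less one.
    // getAuthority() is used instead of getHost() because bucket names
    // containing underscores are not valid hostnames for java.net.URI.
    static String describe(String raw) {
        URI uri = URI.create(raw);
        return "scheme=" + uri.getScheme()
                + " bucket=" + uri.getAuthority()
                + " path=" + uri.getPath();
    }

    public static void main(String[] args) {
        // The two paths passed as JAR arguments.
        System.out.println(describe("s3n://some_bucket/"));
        System.out.println(describe("s3n://some_other_bucket/out"));
        // The path reported in the second error: no scheme, no bucket.
        System.out.println(describe("/user/hadoop/some_bucket"));
    }
}
```

The last path has neither a scheme nor a bucket, which suggests that by the time the input format sees it, the `s3n://` qualification has already been lost.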

 
