
bad page to cache page when using hadoop fs -text

Question asked by fmdataservices on Jun 12, 2012
Latest reply on Jun 15, 2012 by Ted Dunning
We are migrating to MapR from Cloudera and are getting "Returning bad page to cache page" errors when using hadoop fs -text.

The process uses Hive to overwrite a directory in MapR FS with the results of a select query. This part seems to work fine. The next step uses hadoop fs -text to read the output, changes the delimiter, compresses the stream with bzip2, and puts the result back into MapR FS as follows:

hadoop fs -text $temp_dir/* | sed "s/[\cA]/${OUTPUT_DELIM}/g" | bzip2 -c | hadoop fs -put - ${final_file}
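
To rule out the delimiter step as the source of the problem, the sed stage can be tried locally with no Hadoop involved. The following is only a minimal sketch: `replace_ctrl_a` is a hypothetical helper name, and it assumes the output delimiter contains no characters that are special to sed (such as `/`, `&` or `\`). It builds the Ctrl-A byte (0x01, Hive's default field delimiter) with printf instead of relying on the `\cA` escape.

```shell
# Hypothetical helper: read stdin and replace every Ctrl-A (0x01) byte
# with the delimiter given as the first argument.
# Assumption: the delimiter has no sed-special characters (/, &, \).
replace_ctrl_a() {
  local delim="$1"
  local ctrl_a
  ctrl_a=$(printf '\001')   # literal 0x01 byte, independent of sed's \cA support
  sed "s/${ctrl_a}/${delim}/g"
}

# Local check on a made-up record with two Ctrl-A separated fields:
printf 'url1\001url2\n' | replace_ctrl_a '|'
```

If this behaves as expected on sample data, the sed stage is unlikely to be involved and attention can stay on the hadoop fs -text stage.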

--
The output (errors at the end):
--

    Hive history file=/tmp/datasrv/hive_job_log_datasrv_201206121621_1540010305.txt
    OK
    Time taken: 1.719 seconds
    Added /opt/mapr/hive/hive-0.7.1/lib/hive-contrib-0.7.1.jar to class path
    Added resource: /opt/mapr/hive/hive-0.7.1/lib/hive-contrib-0.7.1.jar
    Total MapReduce jobs = 2
    Launching Job 1 out of 2
    Number of reduce tasks is set to 0 since there's no reduce operator
    Starting Job = job_201205032010_1165, Tracking URL = http://hadoop1.tor.fmpub.net:50030/jobdetails.jsp?jobid=job_201205032010_1165
    Kill Command = /opt/mapr/hadoop/hadoop-0.20.2/bin/../bin/hadoop job  -Dmapred.job.tracker=maprfs:/// -kill job_201205032010_1165
    2012-06-12 16:21:43,373 Stage-1 map = 0%,  reduce = 0%
    2012-06-12 16:21:46,394 Stage-1 map = 100%,  reduce = 100%
    Ended Job = job_201205032010_1165
    Ended Job = 537408545, job is filtered out (removed at runtime).
    Moving data to: maprfs://172.16.103.121:7222/tmp/hive-datasrv/hive_2012-06-12_16-21-41_401_6237668420297388701/-ext-10000
    Moving data to: /user/datasrv/jobruns/tz-hourly/job-tz-hourly_2011083017_2011083017/temp_tenzing-long-urls-url-mappings_2011083017_2011083017
    OK
    Time taken: 5.024 seconds
    12/06/12 16:21:48 ERROR fs.Inode: 2049.20382.241108 /user/datasrv/jobruns/tz-hourly/job-tz-hourly_2011083017_2011083017/temp_tenzing-long-urls-url-mappings_2011083017_2011083017/000000_0 Returning bad page to cache page: (2049.20382.241108 0, id: 1, state Invalid)
    12/06/12 16:21:48 ERROR fs.Inode: 2049.20382.241108 /user/datasrv/jobruns/tz-hourly/job-tz-hourly_2011083017_2011083017/temp_tenzing-long-urls-url-mappings_2011083017_2011083017/000000_0 Returning bad page to cache page: (2049.20382.241108 0, id: 1, state Invalid)
    text: null
