
How do I configure Flume to send a log file and save it to HDFS (M3)?

Question asked by hyejin on Feb 16, 2014
Latest reply on Mar 3, 2014 by Ted Dunning
Hello.
I tried sending a log file to MapR-FS, but I couldn't find the file I sent.
Please give me some tips!

After installing mapr-flume, I modified flume-avro.conf as shown below.


----------

    agent.sources = reader
    agent.channels = fileChannel
    agent.sinks = avro-forward-sink

    # For each one of the sources, the type is defined
    agent.sources.reader.type = exec
    agent.sources.reader.command = tail -f /opt/mapr/logs/configure.log

    # stderr is simply discarded, unless logStdErr=true

    # If the process exits for any reason, the source also exits and will produce no further data.
    agent.sources.reader.logStdErr = true
    agent.sources.reader.restart = true
      
    # The channel can be defined as follows.
    agent.sources.reader.channels = fileChannel
       
    # Each sink's type must be defined
    agent.sinks.avro-forward-sink.type = avro
    agent.sinks.avro-forward-sink.hostname = localhost
    agent.sinks.avro-forward-sink.port = 41414
        
    #Specify the channel the sink should use
    agent.sinks.avro-forward-sink.channel = fileChannel
         
    # Each channel's type is defined.
    agent.channels.fileChannel.type = FILE
          
    # Other config values specific to each type of channel(sink or source)
    
    # can be defined as well
    agent.channels.fileChannel.type = FILE
    agent.channels.fileChannel.transactionCapacity = 1000000
    agent.channels.fileChannel.checkpointInterval 30000
    agent.channels.fileChannel.maxFileSize = 2146435071
    agent.channels.fileChannel.capacity 10000000
    agent.sources = avro-collection-source
    
    agent.channels = channel1
    agent.sinks = hdfs-sink
     
    # For each one of the sources, the type is defined
    agent.sources.avro-collection-source.type = avro
    agent.sources.avro-collection-source.bind = 0.0.0.0
    agent.sources.avro-collection-source.port = 41414
    
    # The channel can be defined as follows.
    agent.sources.avro-collection-source.channels = channel1
    agent.sinks.hdfs-sink.channel = channel1
    
    #properties of hdfs-cluster 1-sink
    agent.sinks.hdfs-Cluster1-sink.type=hdfs
    agent.sinks.hdfs-Cluster1-sink.hdfs.path=hdfs://user/mapr/flume
    
    # Each sink's type must be defined
    agent.sinks.hdfs-sink.type = hdfs
    
    agent.sinks.hdfs-sink.kerberosPrincipal = flume/qa-node133.qa.lab@QA.LAB
    agent.sinks.hdfs-sink.kerberosKeytab = /opt/mapr/conf/flume.keytab
    
    agent.sinks.hdfs-sink.path = /user/mapr/flume/
    agent.sinks.hdfs-sink.filePrefix = LogCreateTest
    agent.sinks.hdfs-sink.rollInterval = 6
    agent.sinks.hdfs-sink.rollSize = 0
    agent.sinks.hdfs-sink.rollCount = 10000
    agent.sinks.hdfs-sink.batchSize = 10000
    agent.sinks.hdfs-sink.txnEventMax = 40000
    agent.sinks.hdfs-sink.fileType = DataStream
    agent.sinks.hdfs-sink.maxOpenFiles=50
    agent.sinks.hdfs-sink.appendTimeout = 10000
    agent.sinks.hdfs-sink.callTimeout = 10000
    agent.sinks.hdfs-sink.threadsPoolSize=100
    agent.sinks.hdfs-sink.rollTimerPoolSize = 1
    
    #Specify the channel the source and sink should use
    agent.sources.avro-collection-source.channels = channel1
    agent.sinks.hdfs-sink.channel = channel1
    agent.channels.channel1.type = FILE
    agent.channels.channel1.transactionCapacity = 1000000
    agent.channels.channel1.checkpointInterval 30000
    agent.channels.channel1.maxFileSize = 2146435071
    agent.channels.channel1.capacity 10000000
    
    **agent.channels.channel1.dataDirs = hdfs://user/mapr/flume/data
    agent.channels.channel1.checkpointDir = hdfs://user/mapr/flume/checkpoint**


----------


 
All of this is the same as what was installed originally, except for the last two lines, which I added myself. I also created a new volume in the MapR web UI.

Then I started the agent:

    ../bin/flume-ng agent -c conf -f flume-avro.conf -n agent -Dflume.root.logger=INFO,console

Then I opened another session and ran the Avro client:

    bin/flume-ng avro-client -H localhost -p 41414 -F /home/mapr/1.log

It seemed to run fine. However, I couldn't see any change in disk utilization in the MapR web UI.

Do you know what I did wrong?
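
For comparison, here is a minimal sketch of how a single collector agent (Avro source, file channel, HDFS sink) is usually laid out. It reuses the component names from the file above and is not a verified fix. The local channel directories under `/tmp/flume` and the `maprfs:///` URI form are assumptions to adapt to your cluster. It rests on three points: the HDFS sink reads its settings from keys with an `hdfs.` prefix (for example `hdfs.path`), the file channel expects `checkpointDir` and `dataDirs` to be local filesystem directories, and when a properties file assigns the same key twice (as `agent.sources`, `agent.channels`, and `agent.sinks` are above), the later value is normally the one that takes effect.

```properties
# Sketch only: one agent named "agent", one source, one channel, one sink.
agent.sources = avro-collection-source
agent.channels = channel1
agent.sinks = hdfs-sink

# Avro source listening on the port the avro-client sends to
agent.sources.avro-collection-source.type = avro
agent.sources.avro-collection-source.bind = 0.0.0.0
agent.sources.avro-collection-source.port = 41414
agent.sources.avro-collection-source.channels = channel1

# File channel: directories on the local filesystem (assumed example paths)
agent.channels.channel1.type = file
agent.channels.channel1.checkpointDir = /tmp/flume/checkpoint
agent.channels.channel1.dataDirs = /tmp/flume/data

# HDFS sink: sink-specific settings use the "hdfs." prefix
agent.sinks.hdfs-sink.type = hdfs
agent.sinks.hdfs-sink.channel = channel1
# Assumed MapR-FS URI form; the target directory is the one from the question
agent.sinks.hdfs-sink.hdfs.path = maprfs:///user/mapr/flume
agent.sinks.hdfs-sink.hdfs.filePrefix = LogCreateTest
agent.sinks.hdfs-sink.hdfs.fileType = DataStream
agent.sinks.hdfs-sink.hdfs.rollInterval = 6
agent.sinks.hdfs-sink.hdfs.rollSize = 0
agent.sinks.hdfs-sink.hdfs.rollCount = 10000
```

With a layout like this, the same `flume-ng agent ... -n agent` and `flume-ng avro-client` commands apply unchanged, and any output should appear as files prefixed `LogCreateTest` under the configured path.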
