Has anybody tried to work with Divolte Collector on MapR?

Question asked by anthony.kalinde on Mar 1, 2015
Latest reply on Jun 22, 2015 by anthony.kalinde
I am not sure if I can get help for this on here, but I thought it was worth a try.

I have a 3-node cluster on AWS running MapR M3, with Storm, Kafka, Divolte Collector and Cassandra installed. I would like to try some of the clickstream examples, and I am running into an issue with the tcp-consumer example. Being quite new to Java and distributed processing, I also have some clarification questions. I am not sure where to post this, because it feels specific to Divolte Collector and I also have gaps in my understanding of the Javadoc concept and of building and running jar files, but I figured someone could point me to some resources or help with some clarifications. The problem is that I can't get the JSON string to appear in the console where netcat is listening on a socket for clicks:

Divolte tcp-kafka-consumer example

Everything seems fine until step 7, and my knowledge gap is with step 6.

    Step 1: install and configure Divolte Collector

The install works, and the hello-world click collection looks promising :-)

    Step 2: download, unpack and run Kafka
    # In one terminal session
    cd kafka_2.10-
    ./ ../config/

I already have a MapR ZooKeeper instance running. Could this be an issue?
    # Leave Zookeeper running and in another terminal session, do:
    cd kafka_2.10-
    ./ ../config/

No errors, and I tested the Kafka examples, so this seems to be working as well.

    Step 3: start Divolte Collector
    Go into the bin directory of your installation and run:
        cd divolte-collector-0.2/bin

Step 3 went without a hitch; I can load the default Divolte Collector test page.

    Step 4: host your Javadoc files
    Setup a HTTP server that serves the Javadoc files that you generated or downloaded for the examples. If you have Python installed, you can use this:
        cd <your-javadoc-directory>
        python -m SimpleHTTPServer

OK, so I can reach the Javadoc pages.


    Step 5: listen on TCP port 1234
    nc -kl 1234
    Note: when using netcat (nc) as TCP server, make sure that you configure the Kafka consumer to use only 1 thread, because nc won't handle multiple incoming connections.

I tested netcat by opening the port and sending messages, so I figure I don't have any port issues on AWS.
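
For reference, the netcat check described here can also be scripted. This is a minimal sketch, assuming Python 3 is available on the sending machine; `send_test_line` is a hypothetical helper name, not part of Divolte or Kafka, and the host and port in the usage comment are placeholders for wherever `nc -kl 1234` is listening.

```python
import socket


def send_test_line(host, port, message, timeout=5.0):
    """Open a TCP connection, send one newline-terminated line, then close.

    Returns the number of bytes sent. Raises OSError (for example
    ConnectionRefusedError) if nothing is listening on host:port.
    """
    payload = (message + "\n").encode("utf-8")
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(payload)
    return len(payload)


# Example (assumed host/port; adjust to the machine running netcat):
#   send_test_line("localhost", 1234, "hello from python")
```

If the line shows up in the netcat window, the port is reachable from that machine. It says nothing about whether the Kafka consumer itself is connecting.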

    Step 6: run the example
        cd divolte-examples/tcp-kafka-consumer
        mvn clean package
        java -jar target/tcp-kafka-consumer-*-jar-with-dependencies.jar
    Note: for this to work, you need to have the avro-schema project installed into your local Maven repository.

I installed avro-schema by running mvn clean install in the avro project that comes with the examples, as per the instructions [here][1].

    Step 7: click around and check that you see events being flushed to the console where you run netcat
    When you click around the Javadoc pages, your console should show events in JSON format.

I don't see the clicks in my netcat window :( While investigating, I looked at the Console and Network tabs in Chrome Developer Tools. It seems Divolte is running, but I am not sure how to dig further. This is the console view. Any ideas or pointers?

Thanks anyway.

     Initializing Divolte.
    divolte.js:140 Divolte base URL detected
    divolte.js:280 Divolte party/session/pageview identifiers ["0:i6i3g0jy:nxGMDVdU9~f1wF3RGqwmCKKICn4d1Sb9", "0:i6qx4rmi:IXc1i6Qcr17pespL5lIlQZql956XOqzk", "0:6ZIHf9BHzVt_vVNj76KFjKmknXJixquh"]
    divolte.js:307 Module initialized. Object {partyId: "0:i6i3g0jy:nxGMDVdU9~f1wF3RGqwmCKKICn4d1Sb9", sessionId: "0:i6qx4rmi:IXc1i6Qcr17pespL5lIlQZql956XOqzk", pageViewId: "0:6ZIHf9BHzVt_vVNj76KFjKmknXJixquh", isNewPartyId: false, isFirstInSession: false…}
    divolte.js:21 Signalling event: pageView 0:6ZIHf9BHzVt_vVNj76KFjKmknXJixquh0
    allclasses-frame.html:9 GET
    overview-summary.html:200 GET http://localhost:8290/divolte.js net::ERR_CONNECTION_REFUSED
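
The last console line shows the browser being refused a connection for `http://localhost:8290/divolte.js`. One way to dig further is to check whether that host and port accept TCP connections from the machine where the browser runs. This is a minimal sketch, assuming Python 3 is installed there; `can_connect` is a hypothetical helper, not a Divolte tool.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds, else False."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unresolvable hosts.
        return False


# Example (values taken from the failing URL in the console output):
#   print(can_connect("localhost", 8290))
```

Running it once with `localhost` and once with the address of the node that hosts the collector would show which of the two the browser can actually reach.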