
Best strategy for Spark M7 read

Question asked by bhardwaj_rajesh on Jul 24, 2014
Latest reply on Jul 28, 2014 by bhardwaj_rajesh
Hello,
This is how we can ask Spark to read from an HBase table (a similar example applies to an M7 table):
import org.apache.hadoop.hbase.HBaseConfiguration
import org.apache.hadoop.hbase.mapreduce.TableInputFormat
import org.apache.spark.{SparkConf, SparkContext}

val sparkConf = new SparkConf().setAppName("HBaseTest")
val sc = new SparkContext(sparkConf)
val conf = HBaseConfiguration.create()
// Other options for configuring scan behavior are available. More information is available at
// http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/mapreduce/TableInputFormat.html
conf.set(TableInputFormat.INPUT_TABLE, args(0))

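For completeness, the same Spark example then hands this conf to newAPIHadoopRDD to get the actual RDD; a minimal sketch of that step (class names as in the stock HBaseTest example):

import org.apache.hadoop.hbase.client.Result
import org.apache.hadoop.hbase.io.ImmutableBytesWritable

// Each region (tablet) backing the table becomes one partition of this RDD.
val hBaseRDD = sc.newAPIHadoopRDD(conf, classOf[TableInputFormat],
  classOf[ImmutableBytesWritable], classOf[Result])
println("rows read: " + hBaseRDD.count())
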
Now, when you read about Apache Phoenix, they talk about doing parallel scans from the client side and using INCLUDE_AND_SEEK_NEXT_USING_HINT. Can you please guide me on how I can do this with the HBase conf? See the sketch below for the direction I have in mind.
Say my key is {String - 8 characters}{UnixEpochTime}.
I am interested in getting 1000 random strings with dates (> date1 and < date2).
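
To make the question concrete, here is roughly what I mean by parallel client-side scans. The sketch assumes the 1000 8-character strings are known up front, the epoch is stored as a big-endian 8-byte long right after the prefix, and the table name ("my_m7_table") is just a placeholder; each prefix then gets its own narrow range scan, run in parallel by Spark. Is something like this the right approach, or can the seeking be pushed down to the server side?

import org.apache.hadoop.hbase.HBaseConfiguration
import org.apache.hadoop.hbase.client.{HTable, Scan}
import org.apache.hadoop.hbase.util.Bytes
import scala.collection.JavaConverters._

// Hypothetical inputs: the 8-character strings of interest and the epoch bounds.
val prefixes = Seq("AAAABBBB", "CCCCDDDD")        // stand-in for the ~1000 strings
val (date1, date2) = (1404172800L, 1406764800L)   // example unix epoch bounds

// One narrow range scan per prefix, distributed over the cluster.
val rows = sc.parallelize(prefixes, 32).mapPartitions { part =>
  val table = new HTable(HBaseConfiguration.create(), "my_m7_table")  // hypothetical table name
  val results = part.flatMap { p =>
    // Start row = prefix + date1, stop row = prefix + date2
    // (assumes the epoch is encoded as a big-endian 8-byte long after the prefix).
    val scan = new Scan(Bytes.add(Bytes.toBytes(p), Bytes.toBytes(date1)),
                        Bytes.add(Bytes.toBytes(p), Bytes.toBytes(date2)))
    table.getScanner(scan).iterator().asScala.toList
  }.toList
  table.close()
  results.iterator
}

I understand a single Scan can also be serialized into the conf (e.g. via TableMapReduceUtil.convertScanToString and the TableInputFormat.SCAN key), but one Scan cannot express "any prefix, only this time window", which is why I am looking at per-prefix scans.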

As I will be interested in doing aggregation, I am OK with running parallel scans. (Another question: since M7 has the concept of regions (tablets), do I even have to worry about parallel scans, or is that automatically taken care of?)
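
For reference, my current understanding (please correct me if it is wrong) is that TableInputFormat creates one input split per region, so the RDD returned by newAPIHadoopRDD already has one partition per region/tablet and a plain Spark aggregation scans them in parallel. A rough sketch building on hBaseRDD above; the prefix extraction assumes the 8-character string sits at the start of the row key as UTF-8 bytes:

import org.apache.spark.SparkContext._   // pair-RDD implicits for reduceByKey
import org.apache.hadoop.hbase.util.Bytes

// One partition per region/tablet of the table.
println("partitions = " + hBaseRDD.partitions.length)

// Hypothetical aggregation: row count per 8-character key prefix.
val countsByPrefix = hBaseRDD
  .map { case (rowKey, _) => Bytes.toString(rowKey.get(), rowKey.getOffset(), 8) -> 1L }
  .reduceByKey(_ + _)
countsByPrefix.take(10).foreach(println)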
