What types of data sources do you plan on using with Spark: CSV, Parquet, JSON? Do you plan on taking advantage of DataFrames with these data sources?
We just kicked off a Converge Blog discussion of the week related to this topic: Using Apache Spark DataFrames for Processing of Tabular Data - Let's Discuss
Parquet and CSV, but does Spark allow ORC?
Hi saurabh agrawal,
Yes, you can use ORC files with Spark. I found a few Stack Overflow threads which may be helpful:
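Apart from those threads, here is a minimal PySpark sketch of how ORC can be loaded into a DataFrame alongside the other formats mentioned above. It is only an illustration: the helper names (`format_for`, `load_dataframe`, `demo`) and the file paths are made up for this example, and it assumes a Spark installation where the built-in `orc` data source is available (older Spark versions may need a build with Hive support).

```python
import os

# Format names accepted by spark.read.format(...) for the file types
# discussed in this thread. The extension mapping itself is an assumption
# made for this example, not something Spark enforces.
FORMATS = {".csv": "csv", ".parquet": "parquet", ".json": "json", ".orc": "orc"}


def format_for(path):
    """Return the Spark reader format name implied by the file extension."""
    ext = os.path.splitext(path)[1].lower()
    if ext not in FORMATS:
        raise ValueError(f"unsupported file type: {ext!r}")
    return FORMATS[ext]


def load_dataframe(spark, path):
    """Load a file into a DataFrame using the format implied by its extension."""
    fmt = format_for(path)
    reader = spark.read.format(fmt)
    if fmt == "csv":
        # Assumes the CSV file has a header row.
        reader = reader.option("header", "true")
    return reader.load(path)


def demo():
    """Read an ORC file and write a copy back out (paths are hypothetical)."""
    # Imported here so the helpers above can be used without Spark installed.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("orc-example").getOrCreate()
    df = load_dataframe(spark, "data/events.orc")
    df.printSchema()
    df.write.mode("overwrite").orc("data/events_copy.orc")
    spark.stop()
```

Calling `demo()` on a machine with Spark and a real ORC file at that path should print the schema and write a copy. The shorthand `spark.read.orc(path)` is equivalent to `spark.read.format("orc").load(path)`.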
I hope this helps!