Performance of DataFrames vs Datasets in Spark

Question asked by john.humphreys on Apr 9, 2018
Latest reply on Apr 10, 2018 by cmcdonald

I'm reading an advanced Spark 2.x book and have also read numerous online sources, forum replies, and Stack Overflow answers. I'm still having trouble getting a straight answer to this, though.


Are Datasets faster than, slower than, or as performant as DataFrames?


I understand the other benefits (compile-time type checking, etc.), but the textbook seems to imply that Datasets can be optimized more than DataFrames. Online reading implies that they are equally performant. Many forum replies seem to indicate that DataFrames are actually faster than Datasets.


I don't know which one to believe.