What do you find useful about the Spark Web UI for monitoring Spark applications?
Hi Suzanne Ferry,
I'd like to understand Spark's partitioning decisions when parallel applications are running.
Each stage contains multiple tasks, and each task runs on one partition, so I am assuming the number of tasks equals the number of partitions, and that each task requires one CPU core of the cluster (please correct me if my understanding is wrong).
So if parallel applications are running in my cluster, will Spark decide the number of partitions based on the available CPU cores, given that CPU utilisation will differ depending on how many applications are running?
Also, when should I use repartition() versus coalesce()? What are the scenarios where Spark's default partitioning is not ideal?
Thanks for your inquiry about Spark's partition decisions. This is an interesting question for the Answers community, where you will get much more traction on technical details. Please re-ask it there! Best, Suzanne
Sure. Thank you.