How can you calculate the number of executors?
Consider the following cluster information:
Per cluster:
- nodes = 10
Per node:
- cores = 16 (minus 1 for the OS, leaving 15)
- RAM = 61 GB (minus 1 GB for the OS, leaving 60 GB)
First, identify the number of cores per executor:
The number of cores is the number of concurrent tasks an executor can run in parallel.
Rule of thumb: 5 cores per executor.
Then calculate the number of executors:
executors per node = usable cores per node / concurrent tasks per executor
                   = 15 / 5
                   = 3
total executors = number of nodes * executors per node
                = 10 * 3
                = 30
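As a sanity check, here is the same arithmetic as a minimal Scala sketch (the object and value names are just illustrative):

```scala
// Worked executor-count arithmetic from the example above (plain Scala, no Spark needed).
object ExecutorSizing extends App {
  val nodes            = 10
  val coresPerNode     = 16 - 1 // reserve 1 core per node for the OS
  val coresPerExecutor = 5      // rule of thumb: 5 concurrent tasks per executor

  val executorsPerNode = coresPerNode / coresPerExecutor // 15 / 5 = 3
  val totalExecutors   = nodes * executorsPerNode        // 10 * 3 = 30

  println(s"executors per node = $executorsPerNode, total executors = $totalExecutors")
}
```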
Out of Memory Error
Spark is a memory-intensive distributed system with access to several GB of memory, yet jobs can still fail with out-of-memory (OOM) errors.
The driver and the executors are JVM processes; each has a fixed amount of memory allocated.
A Spark job is made of a driver and a set of executors. Both use large amounts of memory, and both are allocated a certain finite amount. When either needs more than what is allocated, we get an OOM error.
When a Spark job fails, identify what is failing: check whether the driver failed, an executor failed, or a task inside an executor failed.
A Spark job consists of a number of tasks.
Tasks run inside executor containers; several tasks can run in parallel inside one container (executor).
Executors are the containers with CPU and memory allocated, and multiple executors (containers) can exist for a single job.
Errors like OOM or "container killed" show up when a memory issue happens.
The driver orchestrates the execution across the executors.
Both memory allocations can be controlled by:
- spark.executor.memory
- spark.driver.memory
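For example, both can be set when building the session (the 4g values are just illustrative; note that in client mode the driver JVM is already running, so spark.driver.memory is usually passed to spark-submit instead):

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("memory-config-demo")
  .config("spark.executor.memory", "4g") // heap for each executor JVM
  .config("spark.driver.memory", "4g")   // heap for the driver JVM (effective in cluster mode)
  .getOrCreate()
```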
If 4 GB is allocated to an executor:
- 300 MB is reserved.
- 60% of (4 GB - 300 MB) goes to Spark: this is "spark memory".
- The rest goes to user memory.
- Within spark memory, some is used for storage (cached objects) and the rest for execution.
Spark memory is dynamically managed (unified memory management, available since version 1.6).
Initially, storage and execution each get 50% of spark memory. For the 4 GB executor:
- reserved: 300 MB
- user memory: ~1.5 GB (40% of 4 GB - 300 MB)
- spark memory: ~2.2 GB (60% of 4 GB - 300 MB)
  - storage: ~1.1 GB
  - execution: ~1.1 GB
When memory for execution is not enough, Spark will take memory from storage; if storage holds objects, they are evicted to disk.
The storage share inside spark memory is managed by spark.memory.storageFraction, which defaults to 50%.
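These numbers can be reproduced from the defaults (a minimal sketch; spark.memory.fraction defaults to 0.6 and spark.memory.storageFraction to 0.5):

```scala
// Reproducing the 4 GB executor memory breakdown above (sizes in MB).
object MemoryBreakdown extends App {
  val heapMb          = 4 * 1024 // spark.executor.memory = 4g
  val reservedMb      = 300.0    // fixed reserved memory
  val memoryFraction  = 0.6      // spark.memory.fraction (default)
  val storageFraction = 0.5      // spark.memory.storageFraction (default)

  val usableMb    = heapMb - reservedMb          // 3796 MB
  val sparkMemMb  = usableMb * memoryFraction    // ~2278 MB (~2.2 GB)
  val userMemMb   = usableMb - sparkMemMb        // ~1518 MB (~1.5 GB)
  val storageMb   = sparkMemMb * storageFraction // ~1139 MB (~1.1 GB)
  val executionMb = sparkMemMb - storageMb       // ~1139 MB (~1.1 GB)

  println(f"spark=$sparkMemMb%.0f MB, user=$userMemMb%.0f MB, storage=$storageMb%.0f MB, execution=$executionMb%.0f MB")
}
```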
How many cores are allocated to the executor?
The number of cores is the number of tasks we can run in parallel.
If we have 2 CPU cores, then we can run 2 tasks in parallel:
# CPU cores = # tasks in parallel
Rule of thumb: plan for 2 or 3 tasks per CPU core.
The number of partitions decides the number of tasks.
So if we have a big data set with 100 partitions, Spark will create 100 tasks.
If each task has to deal with a lot of data, there is a potential OOM problem:
too few partitions means each task is overloaded.
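You can inspect and adjust the partition count like this (a minimal sketch; the range DataFrame and the target of 100 are just illustrative):

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("partitions-demo")
  .master("local[*]")
  .getOrCreate()

// A toy DataFrame; for real data the initial count comes from the input splits.
val df = spark.range(0, 1000000)
println(s"initial partitions = ${df.rdd.getNumPartitions}")

// Too few partitions overloads each task (OOM risk); repartition spreads the work.
val wider = df.repartition(100) // Spark will now create 100 tasks for the next stage
println(s"after repartition = ${wider.rdd.getNumPartitions}")
```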
- Partitions are how Spark processes a file in parallel.
- Each partition is processed independently of the other partitions, up to a shuffle boundary (which we talk about later).
- Partitions are assigned to Executors for processing. A given Executor in a given (a) cluster and (b) application will always process the same initial partitions (where "initial" means the partitions of the base DataFrame).
- The Executor processes each partition by allocating (or waiting for) an available thread in its pool of threads. Once a thread is available, it is assigned the processing of the partition, which is what we call a task. The thread processes the partition completely, up until either (a) a shuffle boundary or (b) the end of the job, whichever comes first. Then it persists its output (somewhere) and makes itself available to process another waiting partition.
- On Community Edition, which (currently) uses a local-mode cluster with 8 threads ("cores"), you'll typically see 8 partitions.
- On a cluster with 8 nodes (8 executors), each with 4 threads ("cores"), for a total of 32 threads ("cores"), you'll typically see 32 partitions (see the sketch after this list).
- https://files.training.databricks.com/courses/ilt/Spark-ILT/Spark-ILT-5.1.1/amazon/instructor-notes/Spark%20Architecture.html#:~:text=Spark%20uses%20the%20term%20to,CPU%20cores%20on%20each%20machine.
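A quick way to observe this relationship yourself (a minimal sketch; the local[8] master and the range DataFrame are just stand-ins for the 8-thread example above):

```scala
import org.apache.spark.sql.SparkSession

// Mimic an 8-thread ("8-core") local-mode cluster, as in the Community Edition example.
val spark = SparkSession.builder()
  .appName("parallelism-demo")
  .master("local[8]")
  .getOrCreate()

// Default parallelism tracks the total threads/cores available to the application...
println(spark.sparkContext.defaultParallelism) // 8

// ...and a freshly created base DataFrame typically picks up that many partitions.
val df = spark.range(0, 1000000)
println(df.rdd.getNumPartitions) // typically 8 here
```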