Sunday, 11 July 2021

Spark Memory allocation

How can you calculate the number of executors?

Consider the following cluster:

  nodes = 10

  each node:

    cores = 16 (minus 1 for the OS, leaving 15 usable)

    RAM = 61 GB (minus 1 GB for the OS)


First, identify the number of cores per executor:

  The number of cores is the number of concurrent tasks an executor can run in parallel.

  Rule of thumb: 5 cores (concurrent tasks) per executor.

Then calculate the number of executors:

  no. of executors per node = usable cores per node / concurrent tasks per executor
                            = 15 / 5
                            = 3

  total no. of executors = no. of nodes * executors per node
                         = 10 * 3
                         = 30
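
This sizing is usually applied through standard Spark configuration properties. A minimal Scala sketch, where the property names are standard but the values simply mirror the 10-node / 15-usable-core example above (not a recommendation for every cluster):

  import org.apache.spark.sql.SparkSession

  // Rough sketch: 10 nodes * 3 executors per node = 30 executors,
  // each executor running ~5 concurrent tasks (cores).
  val spark = SparkSession.builder()
    .appName("executor-sizing-example")
    .config("spark.executor.instances", "30") // 10 nodes * 3 executors per node
    .config("spark.executor.cores", "5")      // concurrent tasks per executor
    .getOrCreate()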


Out of Memory Error


Spark is a memory-intensive distributed system with access to several GB of memory, and yet jobs still fail with out-of-memory (OOM) errors.

The driver and the executors are JVM processes, and each has a fixed amount of memory allocated to it.


When a Spark job fails, identify what is failing. A Spark job is made up of a driver and a set of executors. Both use large amounts of memory, and both are allocated a certain finite amount of it. When they need more than what is allocated, we get an OOM error.

Check whether the driver failed, an executor failed, or a task inside an executor failed.



A Spark job consists of a number of tasks. Tasks run inside an executor container, and several tasks can run inside one container (executor) at the same time.

Executors are containers with CPU and memory allocated to them, and a job can have many such executors (containers). Errors like OOM or "container killed" show up when a memory issue happens.


The driver orchestrates execution across the executors.

Both memory sizes can be controlled with:

spark.executor.memory

spark.driver.memory
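
A minimal sketch of setting both properties when building a session (the 2 GB / 4 GB sizes are placeholders, not recommendations):

  import org.apache.spark.sql.SparkSession

  // Sketch only: fix the JVM heap for the driver and each executor.
  // Note: spark.driver.memory normally has to be set before the driver JVM
  // starts (spark-submit or spark-defaults.conf); it is shown here only to
  // illustrate the property name.
  val spark = SparkSession.builder()
    .appName("memory-settings-example")
    .config("spark.driver.memory", "2g")   // driver JVM heap
    .config("spark.executor.memory", "4g") // heap for each executor JVM
    .getOrCreate()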


If 4 GB is allocated to an executor:

            300 MB is reserved memory

            60% of (4 GB - 300 MB) goes to Spark = Spark memory

            the rest goes to user memory

Out of that 60% of (4 GB - 300 MB) (Spark memory), some is used for storage (cached objects) and the rest for execution.


Spark memory is dynamically managed (unified memory management, available since version 1.6). Initially, storage and execution each get 50% of Spark memory (60% of 4 GB - 300 MB).


4 GB =  300 MB   reserved memory

        1.5 GB   user memory  (40% of 4 GB - 300 MB)

        2.2 GB   Spark memory (60% of 4 GB - 300 MB)

           1.1 GB   storage

           1.1 GB   execution
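
As a quick sanity check, here is the same arithmetic written out as a small Scala sketch (assuming the default 60% Spark memory fraction and a 50/50 initial split):

  // Sketch of the breakdown above for a 4 GB executor heap.
  val heapMb        = 4 * 1024                     // 4096 MB total heap
  val reservedMb    = 300                          // fixed reserved memory
  val usableMb      = heapMb - reservedMb          // 3796 MB
  val sparkMemoryMb = (usableMb * 0.6).toInt       // ~2277 MB (~2.2 GB)
  val userMemoryMb  = usableMb - sparkMemoryMb     // ~1519 MB (~1.5 GB)
  val storageMb     = sparkMemoryMb / 2            // ~1.1 GB initially
  val executionMb   = sparkMemoryMb - storageMb    // ~1.1 GB initially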


When memory for execution is not enough, Spark will take memory from storage. If storage holds cached objects, they are moved to disk.

The storage portion inside Spark memory is controlled by spark.memory.storageFraction, which defaults to 50%.
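
Both fractions are configurable. A hedged sketch using the standard property names, with the default values shown (changing them is rarely necessary):

  import org.apache.spark.sql.SparkSession

  // Sketch: the two properties behind the split described above.
  val spark = SparkSession.builder()
    .appName("unified-memory-example")
    .config("spark.memory.fraction", "0.6")        // share of (heap - 300 MB) given to Spark memory
    .config("spark.memory.storageFraction", "0.5") // share of Spark memory immune to eviction (storage)
    .getOrCreate()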



How many cores are allocated to an executor?


The number of cores means the number of tasks we can run in parallel.

If we have 2 CPU cores, we can run 2 tasks in parallel.

# CPU cores = # tasks in parallel


Rule of thumb: plan for 2 or 3 tasks per CPU core over the life of the job (see the sketch below).

The number of partitions decides the number of tasks.
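
A small sketch of that rule of thumb, reusing the 30-executor / 5-core sizing from earlier (the multiplier of 2-3 is a guideline, not a hard rule):

  // Aim for roughly 2-3 tasks (partitions) per CPU core so individual tasks stay small.
  val totalCores       = 30 * 5                       // executors * cores per executor
  val tasksPerCore     = 3                            // guideline only
  val targetPartitions = totalCores * tasksPerCore    // 450 partitions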



So if we have a big data set with 100 partitions, Spark will create 100 tasks.

If each task has to deal with a lot of data, there is a potential OOM problem.

Too few partitions mean each task is overloaded.
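
To see how many partitions (and therefore tasks) a DataFrame has, and to spread the same data over more, smaller partitions, a sketch like the following can be used (the DataFrame here is synthetic; substitute your own source and target count):

  import org.apache.spark.sql.SparkSession

  // Sketch: inspect the current partition count and repartition if each
  // task would otherwise handle too much data.
  val spark = SparkSession.builder().appName("partition-check").getOrCreate()
  val df = spark.range(0, 1000000).toDF("id")

  println(s"partitions (tasks): ${df.rdd.getNumPartitions}")

  // Hypothetical target of 100 partitions: same data, more and smaller tasks.
  val dfRepartitioned = df.repartition(100)
  println(s"after repartition: ${dfRepartitioned.rdd.getNumPartitions}")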







  • Partitions are how Spark processes a file in parallel.
  • Each partition is processed independently of the other partitions—up to a shuffle boundary, which we talk about later.
  • Partitions are assigned to Executors for processing. A given Executor in a given (a) cluster and (b) application will always process the same initial partitions (where "initial" means the partitions of the base DataFrame).
  • The Executor processes each partition by allocating (or waiting for) an available thread in its pool of threads. Once a thread is available, it is assigned the processing of the partition, which is what we call a task. The thread processes the partition completely, up until either (a) a shuffle boundary, or (b) the end of the job, whichever comes first. Then, it persists its output (somewhere) and makes itself available to process another waiting partition.

    • On Community Edition, which (currently) uses a local-mode cluster with 8 threads ("cores"), you'll typically see 8 partitions (see the sketch after this list).

    • On a cluster with 8 nodes (8 executors), each with 4 threads ("cores"), for a total of 32 threads ("cores"), you'll typically see 32 partitions.
  • https://files.training.databricks.com/courses/ilt/Spark-ILT/Spark-ILT-5.1.1/amazon/instructor-notes/Spark%20Architecture.html#:~:text=Spark%20uses%20the%20term%20to,CPU%20cores%20on%20each%20machine.
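
A quick way to confirm the thread/partition relationship described above, mimicking the 8-thread local-mode cluster mentioned in the notes (a sketch, not the Databricks setup itself):

  import org.apache.spark.sql.SparkSession

  // Sketch: with 8 local threads, default parallelism is 8, so a
  // parallelized dataset typically ends up with 8 partitions.
  val spark = SparkSession.builder()
    .master("local[8]")
    .appName("local-mode-partitions")
    .getOrCreate()

  println(spark.sparkContext.defaultParallelism)     // 8
  println(spark.range(0, 1000).rdd.getNumPartitions) // typically 8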
