Sunday, 11 July 2021

Spark Memory allocation

How can you calculate the number of executors?

Consider the following cluster:

  nodes = 10

  each node:

    cores = 16 (minus 1 for the OS, leaving 15 usable)

    RAM = 61 GB (minus 1 GB for the OS)


First, identify the number of cores per executor:

  The number of cores is the number of concurrent tasks an executor can run in parallel.

  Rule of thumb: 5 cores (concurrent tasks) per executor.

Then calculate the number of executors:

  no. of executors per node = usable cores per node / concurrent tasks per executor
                            = 15 / 5
                            = 3

  total no. of executors = no. of nodes * executors per node
                         = 10 * 3
                         = 30
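
This sizing is usually applied through standard Spark configuration properties. A minimal Scala sketch, where the property names are standard but the values simply mirror the 10-node / 15-usable-core example above (not a recommendation for every cluster):

  import org.apache.spark.sql.SparkSession

  // Rough sketch: 10 nodes * 3 executors per node = 30 executors,
  // each executor running ~5 concurrent tasks (cores).
  val spark = SparkSession.builder()
    .appName("executor-sizing-example")
    .config("spark.executor.instances", "30") // 10 nodes * 3 executors per node
    .config("spark.executor.cores", "5")      // concurrent tasks per executor
    .getOrCreate()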


Out of Memory Error


Spark is a memory-intensive distributed system with access to several GB of memory, and yet jobs still fail with out-of-memory (OOM) errors.

The driver and the executors are JVM processes, and each has a fixed amount of memory allocated to it.


When a Spark job fails, identify what is failing. A Spark job is made up of a driver and a set of executors. Both use large amounts of memory, and both are allocated a certain finite amount of it. When they need more than what is allocated, we get an OOM error.

Check whether the driver failed, an executor failed, or a task inside an executor failed.



A Spark job consists of a number of tasks. Tasks run inside an executor container, and several tasks can run inside one container (executor) at the same time.

Executors are containers with CPU and memory allocated to them, and a job can have many such executors (containers). Errors like OOM or "container killed" show up when a memory issue happens.


The driver orchestrates execution across the executors.

Both memory sizes can be controlled with:

spark.executor.memory

spark.driver.memory
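
A minimal sketch of setting both properties when building a session (the 2 GB / 4 GB sizes are placeholders, not recommendations):

  import org.apache.spark.sql.SparkSession

  // Sketch only: fix the JVM heap for the driver and each executor.
  // Note: spark.driver.memory normally has to be set before the driver JVM
  // starts (spark-submit or spark-defaults.conf); it is shown here only to
  // illustrate the property name.
  val spark = SparkSession.builder()
    .appName("memory-settings-example")
    .config("spark.driver.memory", "2g")   // driver JVM heap
    .config("spark.executor.memory", "4g") // heap for each executor JVM
    .getOrCreate()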


If 4 GB is allocated to an executor:

            300 MB is reserved memory

            60% of (4 GB - 300 MB) goes to Spark = Spark memory

            the rest goes to user memory

Out of that 60% of (4 GB - 300 MB) (Spark memory), some is used for storage (cached objects) and the rest for execution.


Spark memory is dynamically managed (unified memory management, available since version 1.6). Initially, storage and execution each get 50% of Spark memory (60% of 4 GB - 300 MB).


4 GB =  300 MB   reserved memory

        1.5 GB   user memory  (40% of 4 GB - 300 MB)

        2.2 GB   Spark memory (60% of 4 GB - 300 MB)

           1.1 GB   storage

           1.1 GB   execution
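
As a quick sanity check, here is the same arithmetic written out as a small Scala sketch (assuming the default 60% Spark memory fraction and a 50/50 initial split):

  // Sketch of the breakdown above for a 4 GB executor heap.
  val heapMb        = 4 * 1024                     // 4096 MB total heap
  val reservedMb    = 300                          // fixed reserved memory
  val usableMb      = heapMb - reservedMb          // 3796 MB
  val sparkMemoryMb = (usableMb * 0.6).toInt       // ~2277 MB (~2.2 GB)
  val userMemoryMb  = usableMb - sparkMemoryMb     // ~1519 MB (~1.5 GB)
  val storageMb     = sparkMemoryMb / 2            // ~1.1 GB initially
  val executionMb   = sparkMemoryMb - storageMb    // ~1.1 GB initially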


When memory for execution is not enough, Spark will take memory from storage. If storage holds cached objects, they are moved to disk.

The storage portion inside Spark memory is controlled by spark.memory.storageFraction, which defaults to 50%.
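
Both fractions are configurable. A hedged sketch using the standard property names, with the default values shown (changing them is rarely necessary):

  import org.apache.spark.sql.SparkSession

  // Sketch: the two properties behind the split described above.
  val spark = SparkSession.builder()
    .appName("unified-memory-example")
    .config("spark.memory.fraction", "0.6")        // share of (heap - 300 MB) given to Spark memory
    .config("spark.memory.storageFraction", "0.5") // share of Spark memory immune to eviction (storage)
    .getOrCreate()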



How many cores are allocated to an executor?


The number of cores means the number of tasks we can run in parallel.

If we have 2 CPU cores, we can run 2 tasks in parallel.

# CPU cores = # tasks in parallel


Rule of thumb: plan for 2 or 3 tasks per CPU core over the life of the job (see the sketch below).

The number of partitions decides the number of tasks.
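
A small sketch of that rule of thumb, reusing the 30-executor / 5-core sizing from earlier (the multiplier of 2-3 is a guideline, not a hard rule):

  // Aim for roughly 2-3 tasks (partitions) per CPU core so individual tasks stay small.
  val totalCores       = 30 * 5                       // executors * cores per executor
  val tasksPerCore     = 3                            // guideline only
  val targetPartitions = totalCores * tasksPerCore    // 450 partitions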



So if we have a big data set with 100 partitions, Spark will create 100 tasks.

If each task has to deal with a lot of data, there is a potential OOM problem.

Too few partitions mean each task is overloaded.
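
To see how many partitions (and therefore tasks) a DataFrame has, and to spread the same data over more, smaller partitions, a sketch like the following can be used (the DataFrame here is synthetic; substitute your own source and target count):

  import org.apache.spark.sql.SparkSession

  // Sketch: inspect the current partition count and repartition if each
  // task would otherwise handle too much data.
  val spark = SparkSession.builder().appName("partition-check").getOrCreate()
  val df = spark.range(0, 1000000).toDF("id")

  println(s"partitions (tasks): ${df.rdd.getNumPartitions}")

  // Hypothetical target of 100 partitions: same data, more and smaller tasks.
  val dfRepartitioned = df.repartition(100)
  println(s"after repartition: ${dfRepartitioned.rdd.getNumPartitions}")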







  • Partitions are how Spark processes a file in parallel.
  • Each partition is processed independently of the other partitions—up to a shuffle boundary, which we talk about later.
  • Partitions are assigned to Executors for processing. A given Executor in a given (a) cluster and (b) application will always process the same initial partitions (where "initial" means the partitions of the base DataFrame).
  • The Executor processes each partition by allocating (or waiting for) an available thread in its pool of threads. Once a thread is available, it is assigned the processing of the partition, which is what we call a task. The thread processes the partition completely, up until either (a) a shuffle boundary, or (b) the end of the job, whichever comes first. Then, it persists its output (somewhere) and makes itself available to process another waiting partition.

    • On Community Edition, which (currently) uses a local-mode cluster with 8 threads ("cores"), you'll typically see 8 partitions (see the sketch after this list).

    • On a cluster with 8 nodes (8 executors), each with 4 threads ("cores"), for a total of 32 threads ("cores"), you'll typically see 32 partitions.
  • https://files.training.databricks.com/courses/ilt/Spark-ILT/Spark-ILT-5.1.1/amazon/instructor-notes/Spark%20Architecture.html#:~:text=Spark%20uses%20the%20term%20to,CPU%20cores%20on%20each%20machine.
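
A quick way to confirm the thread/partition relationship described above, mimicking the 8-thread local-mode cluster mentioned in the notes (a sketch, not the Databricks setup itself):

  import org.apache.spark.sql.SparkSession

  // Sketch: with 8 local threads, default parallelism is 8, so a
  // parallelized dataset typically ends up with 8 partitions.
  val spark = SparkSession.builder()
    .master("local[8]")
    .appName("local-mode-partitions")
    .getOrCreate()

  println(spark.sparkContext.defaultParallelism)     // 8
  println(spark.range(0, 1000).rdd.getNumPartitions) // typically 8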
