Unified Memory Management
Shared memory for execution and caching instead of exclusive division of the regions.
Memory Management in Spark < 1.6
· Two separate memory managers:
o Execution memory: computation in shuffles, joins, sorts, and aggregations
o Storage memory: caching and propagating internal data across the cluster
· Issues with this:
o Manual intervention is needed to avoid unnecessary spilling
o There are no defaults that suit all workloads
o Users must partition memory into execution (shuffle) and cache fractions by hand (see the sketch after this list)
· Goal: unify these two memory regions and let each borrow from the other
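For reference, a minimal sketch of how the static split had to be tuned before 1.6, using the legacy spark.shuffle.memoryFraction and spark.storage.memoryFraction settings (the fraction values here are illustrative, not recommendations):
import org.apache.spark.{SparkConf, SparkContext}
// Pre-1.6: execution (shuffle) and storage (cache) each get a fixed slice of
// the heap and cannot borrow from one another, so both fractions must be
// tuned by hand for each workload.
val conf = new SparkConf()
  .setAppName("legacy-static-memory-split")
  .set("spark.shuffle.memoryFraction", "0.4") // execution region (default 0.2)
  .set("spark.storage.memoryFraction", "0.5") // storage region (default 0.6)
val sc = new SparkContext(conf)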
Unified Memory Management in Spark 1.6
· Execution and storage memory can borrow from each other
o When execution memory exceeds its own region, it can borrow as much of the storage space as is free, and vice versa
o Borrowed storage memory can be evicted at any time (though borrowed execution memory cannot be, at this time)
· New configurations are introduced (see the configuration sketch after this list)
· Under memory pressure:
o Evict cached data
o Evict storage memory
o Evict execution memory
· Notes:
o A dynamically allocated reserved storage region exists that execution memory cannot borrow from
o Cached data is evicted only if actual storage use exceeds that reserved storage region
· Result:
o Better memory management for multi-tenancy and for applications that rely heavily on caching
o No hard cap on storage memory nor on execution memory
o Dynamic allocation of the reserved storage region requires no user configuration
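The configuration sketch below shows the unified-memory settings introduced in 1.6; the values are my understanding of the 1.6 defaults, so treat them as illustrative rather than tuning advice:
import org.apache.spark.{SparkConf, SparkContext}
// Spark 1.6: a single unified region is shared by execution and storage,
// and either side may borrow free space from the other.
val conf = new SparkConf()
  .setAppName("unified-memory")
  .set("spark.memory.fraction", "0.75")       // unified region's share of the heap (1.6 default)
  .set("spark.memory.storageFraction", "0.5") // storage region immune to eviction (default)
  .set("spark.memory.useLegacyMode", "false") // true falls back to the pre-1.6 static split
val sc = new SparkContext(conf)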
// Execute a reduceByKey that will cause a spill (DBC cluster with 3 nodes, 90GB RAM)
val result = sc.parallelize(0 until 200000000).map { i => (i / 2, i) }.reduceByKey(math.max).collect()
result: Array[(Int, Int)] = Array((33966328,67932657), (3035008,6070017), (52605688,105211377), (74887208,149774417), (41864592,83729185), (67568488,135136977), (40664112,81328225), (88144280,176288561), (91081800,182163601), (93161072,186322145), (26738200,53476401), (22998384,45996769), (4177592,8355185), (63269776,126539553), (95411600,190823201), (18442528,36885057), (92182440,184364881), (67090736,134181473), (59179944,118359889))
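As a follow-up sketch (not from the original run; sizes are illustrative), caching an RDD before running the same shuffle exercises both sides of the unified region: the shuffle can borrow whatever cache space is free, and cached blocks beyond the reserved storage region may be evicted under pressure.
import org.apache.spark.storage.StorageLevel
// Cache an RDD first so storage memory is in use, then run the shuffle.
val cached = sc.parallelize(0 until 50000000).persist(StorageLevel.MEMORY_ONLY)
cached.count() // materialize the cached blocks
// The shuffle below may borrow free storage space for its execution memory;
// counting distinct keys avoids collecting a large result to the driver.
val distinctKeys = cached.map { i => (i / 2, i) }.reduceByKey(math.max).count()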
Review Stage Details for the reduceByKey job
· Spark 1.6 completes the job faster than a Spark 1.5 cluster of the same size
· Note that Spark 1.5 spills to both memory and disk