Unified Memory Management
Shared memory for execution and caching instead of exclusive division of the regions.
Memory Management in Spark < 1.6
· Two separate memory managers:
o Execution memory: computation in shuffles, joins, sorts, and aggregations
o Storage memory: caching and propagating internal data across the cluster
· Issues with this:
o Manual intervention is needed to avoid unnecessary spilling
o There are no defaults that suit all workloads
o Users must partition memory into execution (shuffle) and cache fractions by hand (see the sketch after this list)
· Goal: unify these two memory regions and let each borrow from the other
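For reference, a minimal sketch of how the static split had to be tuned before 1.6, using the legacy spark.shuffle.memoryFraction and spark.storage.memoryFraction settings (the fraction values here are illustrative, not recommendations):
import org.apache.spark.{SparkConf, SparkContext}
// Pre-1.6: execution (shuffle) and storage (cache) each get a fixed slice of
// the heap and cannot borrow from one another, so both fractions must be
// tuned by hand for each workload.
val conf = new SparkConf()
  .setAppName("legacy-static-memory-split")
  .set("spark.shuffle.memoryFraction", "0.4") // execution region (default 0.2)
  .set("spark.storage.memoryFraction", "0.5") // storage region (default 0.6)
val sc = new SparkContext(conf)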
Unified Memory Management in Spark 1.6
· Execution and storage memory can borrow from each other
o When execution memory exceeds its own region, it can borrow as much of the storage space as is free, and vice versa
o Borrowed storage memory can be evicted at any time (though borrowed execution memory cannot be, at this time)
· New configurations are introduced (see the configuration sketch after this list)
· Under memory pressure:
o Evict cached data
o Evict storage memory
o Evict execution memory
· Notes:
o A dynamically allocated reserved storage region exists that execution memory cannot borrow from
o Cached data is evicted only if actual storage use exceeds that reserved storage region
· Result:
o Better memory management for multi-tenancy and for applications that rely heavily on caching
o No hard cap on storage memory nor on execution memory
o Dynamic allocation of the reserved storage region requires no user configuration
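The configuration sketch below shows the unified-memory settings introduced in 1.6; the values are my understanding of the 1.6 defaults, so treat them as illustrative rather than tuning advice:
import org.apache.spark.{SparkConf, SparkContext}
// Spark 1.6: a single unified region is shared by execution and storage,
// and either side may borrow free space from the other.
val conf = new SparkConf()
  .setAppName("unified-memory")
  .set("spark.memory.fraction", "0.75")       // unified region's share of the heap (1.6 default)
  .set("spark.memory.storageFraction", "0.5") // storage region immune to eviction (default)
  .set("spark.memory.useLegacyMode", "false") // true falls back to the pre-1.6 static split
val sc = new SparkContext(conf)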
// Execute a reduceByKey that will cause a spill (DBC cluster with 3 nodes, 90GB RAM)
val result = sc.parallelize(0 until 200000000).map { i => (i / 2, i) }.reduceByKey(math.max).collect()
result: Array[(Int, Int)] = Array((33966328,67932657), (3035008,6070017), (52605688,105211377), (74887208,149774417), (41864592,83729185), (67568488,135136977), (40664112,81328225), (88144280,176288561), (91081800,182163601), (93161072,186322145), (26738200,53476401), (22998384,45996769), (4177592,8355185), (63269776,126539553), (95411600,190823201), (18442528,36885057), (92182440,184364881), (67090736,134181473), (59179944,118359889))
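As a follow-up sketch (not from the original run; sizes are illustrative), caching an RDD before running the same shuffle exercises both sides of the unified region: the shuffle can borrow whatever cache space is free, and cached blocks beyond the reserved storage region may be evicted under pressure.
import org.apache.spark.storage.StorageLevel
// Cache an RDD first so storage memory is in use, then run the shuffle.
val cached = sc.parallelize(0 until 50000000).persist(StorageLevel.MEMORY_ONLY)
cached.count() // materialize the cached blocks
// The shuffle below may borrow free storage space for its execution memory;
// counting distinct keys avoids collecting a large result to the driver.
val distinctKeys = cached.map { i => (i / 2, i) }.reduceByKey(math.max).count()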
Review Stage Details for the reduceByKey job
· Spark 1.6 completes the job faster than a Spark 1.5 cluster of the same size
· Note that Spark 1.5 spills to both memory and disk