List of Algorithmsxvii List of Figuresxxi List of Tablesxxv Prefacexxvii Acknowledgementsxxxiii Authorsxxxv 1Introduction1 1.1Explicit deallocation2 1.2Automatic dynamic memory management4 1.3Comparing garbage collection algorithms6 Safety6 Throughput and cycles consumed6 Completeness and promptness7 Pause time and latency7 Space overhead8 Energy use8 Optimisations for specific languages9 Scalability and portability10 1.4A performance disadvantage?10 1.5Experimental methodology11 1.6Terminology and notation12 The heap13 The mutator and the collector13 The mutator roots14 References, fields and addresses14 Liveness, correctness and reachability15 Pseudocode15 The allocator15 Mutator read and write operations16 Atomic operations16 Sets, multisets, sequences and tuples17 2Mark-sweep gar…
List of Algorithmsxvii List of Figuresxxi List of Tablesxxv Prefacexxvii Acknowledgementsxxxiii Authorsxxxv 1Introduction1 1.1Explicit deallocation2 1.2Automatic dynamic memory management4 1.3Comparing garbage collection algorithms6 Safety6 Throughput and cycles consumed6 Completeness and promptness7 Pause time and latency7 Space overhead8 Energy use8 Optimisations for specific languages9 Scalability and portability10 1.4A performance disadvantage?10 1.5Experimental methodology11 1.6Terminology and notation12 The heap13 The mutator and the collector13 The mutator roots14 References, fields and addresses14 Liveness, correctness and reachability15 Pseudocode15 The allocator15 Mutator read and write operations16 Atomic operations16 Sets, multisets, sequences and tuples17 2Mark-sweep garbage collection19 2.1The mark-sweep algorithm20 2.2The tricolour abstraction23 2.3Improving mark-sweep24 2.4Bitmap marking24 2.5Lazy sweeping27 2.6Cache misses in the marking loop29 2.7Issues to consider31 Mutator overhead31 Throughput32 Space usage32 To move or not to move?32 3Mark-compact garbage collection35 3.1Two-finger compaction36 3.2The Lisp 2 algorithm38 3.3Threaded compaction40 3.4One-pass algorithms42 3.5Issues to consider45 Is compaction necessary?45 Throughput costs of compaction45 Long-lived data45 Locality46 Limitations of mark-compact algorithms46 4Copying garbage collection47 4.1Semispace copying collection47 Work-list implementations48 An example50 4.2Traversal order and locality50 4.3Issues to consider57 Allocation57 Space and locality58 Moving objects59 5Reference counting61 5.1Advantages and disadvantages of reference counting62 5.2Improving efficiency64 5.3Deferred reference counting65 5.4Coalesced reference counting68 5.5Cyclic reference counting71 5.6Limited-field reference counting77 5.7Issues to consider78 The environment78 Advanced solutions78 6Comparing garbage collectors81 6.1Throughput81 6.2Pause time and latency82 6.3Space82 6.4Implementation83 6.5Adaptive systems84 6.6A unified theory of garbage collection85 Abstract garbage collection85 Tracing garbage collection85 Reference counting garbage collection87 7Allocation91 7.1Sequential allocation91 7.2Free-list allocation93 First-fit allocation93 Next-fit allocation93 Best-fit allocation94 Speeding free-list allocation95 7.3Fragmentation97 7.4Segregated-fits allocation98 Fragmentation99 Populating size classes100 7.5Combining segregated-fits with first-, best- and next-fit101 7.6Additional considerations101 Alignment101 Size constraints102 Boundary tags102 Heap parsability103 Locality104 Wilderness preservation105 Crossing maps105 7.7Allocation in concurrent systems105 7.8Issues to consider106 8Partitioning the heap109 8.1Terminology109 8.2Why to partition109 Partitioning by mobility110 Partitioning by size110 Partitioning for space110 Partitioning by kind111 Partitioning for yield111 Partitioning for responsiveness112 Partitioning for locality112 Partitioning by thread113 Partitioning by availability113 Partitioning by mutability114 8.3How to partition114 8.4When to partition116 9Generational garbage collection117 9.1Example118 9.2Measuring time119 9.3Generational hypotheses119 9.4Generations and heap layout120 9.5Multiple generations121 9.6Age recording122 En masse promotion122 Aging semispaces124 Survivor spaces and flexibility126 9.7Adapting to program behaviour127 Appel-style garbage collection127 Feedback-controlled promotion129 9.8Inter-generational pointers130 Remembered sets130 Pointer direction131 9.9Space management132 9.10Older-first garbage collection133 9.11Beltway136 9.12Analytic support for generational collection138 9.13Issues to consider139 9.14Abstract generational garbage collection141 10Other partitioned schemes143 10.1Large object spaces143 The Treadmill garbage collector144 Moving objects with operating system support145 Pointer-free objects146 10.2Topological collectors146 Mature Object Space garbage collection146 Connectivity-based garbage collection149 Thread-local garbage collection150 Stack allocation155 Region inferencing156 10.3Hybrid mark-sweep, copying collectors157 10.4Garbage-First: collecting young regions158 10.5Trading space and time161 10.6Mark-region collection: immix162 10.7Copying collection in a constrained memory space164 10.8Bookmarking garbage collection166 10.9Ulterior reference counting167 10.10Issues to consider168 11Run-time interface171 11.1Interface to allocation171 Speeding allocation174 Zeroing175 11.2Finding pointers176 Conservative pointer finding176 Accurate pointer finding using tagged values179 Accurate pointer finding in objects180 Accurate pointer finding in global roots181 Accurate pointer finding in stacks and registers182 Accurate pointer finding in code192 Handling interior pointers193 Handling derived pointers194 11.3Object tables195 11.4References from external code196 11.5Stack barriers197 11.6GC safe-points and mutator suspension198 11.7Garbage collecting code201 11.8Read and write barriers202 Engineering202 Precision of write barriers203 Hash tables205 Sequential store buffers206 Overflow action208 Cards and card tables208 Crossing maps210 Summarising cards212 Hardware and virtual memory techniques213 Write barrier mechanisms: in summary214 Chunked lists214 11.9Managing address space215 11.10Applications of virtual memory page protection217 Double mapping217 Applications of no-access pages218 11.11Choosing heap size220 11.12Issues to consider222 12Language-specific concerns225 12.1Finalisation225 When do finalisers run?226 Which thread runs a finaliser?227 Can finalisers run concurrently with each other?228 Can finalisers access the object that became unreachable?228 When are finalised objects reclaimed?229 What happens if there is an error in a finaliser?229 Is there any guaranteed order to finalisation?229 The finalisation race problem230 Finalisers and locks231 Finalisation and weak references231 Finalisation in particular languages232 For further study233 12.2Weak references233 Additional motivations235 Weak references in Java: the short story235 Weak references in Java: the full story236 Race in weak pointer clearing239 Weak pointers in other languages240 12.3Changing object layout241 12.4Issues to consider243 13Concurrency preliminaries245 13.1Hardware245 Processors and threads245 Interconnect246 Memory247 Caches247 Coherence247 Cache coherence performance example: spin locks248 13.2Hardware memory consistency250 Fences and happens-before251 Consistency models252 13.3Hardware primitives253 Compare-and-swap253 Load-linked/store-conditionally254 Atomic arithmetic primitives255 Test then test-and-set257 More powerful primitives257 Overheads of atomic primitives258 13.4Progress guarantees259 Progress guarantees and concurrent collection260 13.5Notation used for concurrent algorithms261 13.6Mutual exclusion262 13.7Work sharing and termination detection264 Rendezvous barriers269 13.8Concurrent data structures269 Concurrent stacks272 Concurrent queue implemented with singly linked list272 Concurrent queue implemented with array276 Concurrent deque for work stealing281 13.9Transactional memory282 Using transactional memory to help implement collection286 Supporting transactional memory in the presence of garbage collection288 13.10Issues to consider289 14Parallel garbage collection291 14.1Is there sufficient work to parallelise?292 14.2Load balancing293 14.3Synchronisation294 14.4Taxonomy295 14.5Parallel marking295 14.6Parallel copying304 Processor-centric techniques305 Memory-centric techniques310 14.7Parallel sweeping315 14.8Parallel compaction316 14.9Garbage collection on the GPU?319 GPU background319 Heap reference graphs321 Marking on the GPU322 A problem not yet solved324 14.10Issues to consider324 Terminology324 Is parallel collection worthwhile?324 Strategies for balancing loads325 Managing tracing325 Low-level synchronisation327 Sweeping and compaction327 Termination328 15Concurrent garbage collection329 15.1Correctness of concurrent collection331 The tricolour abstraction, revisited331 The lost object problem332 The strong and weak tricolour invariants334 Precision335 Mutator colour335 Allocation colour336 Pointer tagging336 Incremental update solutions336 Snapshot-at-the-beginning solutions337 15.2Barrier techniques for concurrent collection337 Grey mutator techniques337 Black mutator techniques339 Completeness of barrier techniques340 Load barriers341 Concurrent write barrier mechanisms342 One-level card tables343 Two-level card tables343 Reducing work343 Eliminating read barriers344 15.3Ragged phase changes344 15.4Issues to consider346 16Concurrent mark-sweep349 16.1Initialisation349 16.2Termination351 16.3Allocation352 16.4Concurrent marking and sweeping353 16.5Garbage-First: collecting young and old regions354 16.6On-the-fly marking357 Write barriers for on-the-fly collection357 Doligez-Leroy-Gonthier358 Doligez-Leroy-Gonthier for Java359 Sliding views360 16.7Abstract concurrent collection360 The collector wavefront361 Adding origins363 Mutator barriers363 Precision363 Instantiating collectors364 16.8Issues to consider364 17Concurrent copying and compaction367 17.1Mostly-concurrent copying367 Baker’s algorithm368 Mostly-concurrent, mostly-copying collection370 17.2Brooks’s indirection barrier370 17.3Self-erasing read barriers371 17.4Replication copying372 Multi-version copying372 Extensions to avoid copy-on-write374 Sapphire376 Transactional Sapphire376 Platinum383 17.5Concurrent compaction384 Compressor384 Pauseless and C4386 Collie395 ZGC396 Shenandoah401 Reducing stop-the-world time in Shenandoah and ZGC403 17.6Issues to consider404 18Concurrent reference counting405 18.1LXR: Latency-critical ImmiX with Reference counting405 18.2Simple reference counting revisited409 18.3Biased reference counting411 18.4Buffered reference counting412 18.5Concurrent, cyclic reference counting413 18.6Taking a snapshot of the heap414 18.7Sliding views reference counting416 Age-oriented collection416 Sliding views cycle reclamation419 Memory consistency420 18.8Issues to consider420 Safe memory reclamation421 19Real-time garbage collection423 19.1Real-time systems423 19.2Scheduling real-time collection424 19.3Work-based real-time collection425 Parallel, concurrent replication425 Uneven work and its impact on work-based scheduling432 19.4Slack-based real-time collection434 Scheduling the collector work436 Execution overheads438 Programmer input439 19.5Time-based real-time collection: Metronome439 Mutator utilisation439 Supporting predictability441 Analysis443 Robustness447 19.6Combining scheduling approaches: Tax-and-Spend448 Tax-and-Spend scheduling448 Tax-and-Spend prerequisites450 19.7Controlling fragmentation452 Incremental compaction in Metronome452 Incremental replication on uniprocessors453 Stopless: lock-free garbage collection454 Staccato: best-effort compaction with mutator wait-freedom455 Chicken: best-effort compaction with mutator wait-freedom for x86458 Clover: guaranteed compaction with probabilistic mutator lock-freedom458 Stopless versus Chicken versus Clover459 Fragmented allocation461 19.8Issues to consider463 20Energy-aware garbage collection465 20.1Issues to consider469 21Persistence and garbage collection471 21.1Persistence: concepts, issues and implementation471 Providing durability472 Issues raised by persistence473 21.2Impacts of persistence on garbage collection475 21.3Barriers in support of persistence476 Software barriers476 Hardware barriers477 Software versus hardware barriers478 21.4Implemented collectors for persistent heaps478 Persistent reference counting478 Persistent mark-sweep and mark-compact479 Persistent copying480 21.5Issues to consider481 Glossary485 Bibliography503 Index551