JVM G1源码分析——记忆集(RSet)

编程入门 行业动态 更新时间:2024-10-13 14:23:35

JVM G1<a href=https://www.elefans.com/category/jswz/34/1770099.html style=源码分析——记忆集(RSet)"/>

JVM G1源码分析——记忆集(RSet)

试想一下,当在ygc时,我们需要扫描整个新生代。当新生代的对象被老年代引用,则此对象就不能回收。那么怎么判断这点呢,总不能扫描老年代吧。所以这里需要使用空间换时间的数据结构:RSet和卡表。
卡表:一种points-out结构,每个card记载有没有引用别的heap分区,它是全局表,对于一些热点card,会放入Hot card cache,它也是全局表;
RSet:一种points-in结构,在卡表基础上实现,每个Heap一个,用来记录别的Heap指向自己的指针,并标记这些指针在卡表哪些范围内。

记忆集RSet是一种抽象概念,记录对象在不同代际之间的引用关系,目的是为了加速垃圾回收的速度。JVM使用根对象引用的收集算法,即从根集合出发,标记所有存活的对象,然后遍历对象的每一个字段继续标记,直到所有的对象标记完毕。在分代GC中,我们知道新生代和老生代处于不同的收集阶段,如果还是按照这样的标记方法,既不合理也没必要。假设我们只收集新生代,我们把老生代全部标记,但是并没有收集老生代,浪费了时间。同理,在收集老生代时有同样的问题。当且仅当,我们要进行Full GC才需要做全部标记。所以算法设计者做了这样的设计,用一个RSet记录从非收集部分指向收集部分的指针的集合,而这个集合描述对象的引用关系。

记忆集的维度应该是什么?针对新生代和老年代各搞一个?还是针对region,每个region都搞一个?

对于G1来说,它是以region为最小内存管理维度的,它的RSet记忆集的维度是对每一个region,都搞一块儿内存,存储region里面所有的对象被引用的引用关系。

针对region这个维度,是因为,每次回收之后,老年代,新生代,大对象区域的region可能都会变化,所以,如果说,对每个分代都搞一份儿的话,不太合理,因为region不断的在变化,同时也会有并发问题,效率问题。

同时,除了新生代的回收是需要选择所有新生代的region,老年代的回收,是需要找性价比高的region来回收的,也就是选择一部分去回收,那么选择一部分回收的时候,还要去整个分代对应的这么一大块儿引用关系数据,去做遍历,筛选,才能拿到需要的数据。

通常有两种方法记录引用关系,第一成为Point Out,第二是Point in。如ObjA.Field=ObjB,对于Point out来说在ObjA的RSet中记录ObjB的地址,对于Point in来说在ObjB的RSet中记录ObjA的地方,这相当于一种反向引用。这两者的区别在于处理有所不同,Point Out记录操作简单,但是需要对RSet做全部扫描;Point In记录操作复杂,但是在标记扫描时直接可以找到有用和无用的对象,不需要额外的扫描,因为RSet里面的对象可以认为就是根对象。

在G1中提供了3种收集算法。新生代回收、混合回收和Full GC。新生代回收总是收集所有新生代分区,混合回收会收集所有的新生代分区以及部分老生代分区,而Full GC则是对所有的分区处理。根据这个思路,我们首先分析一些不同分区之间的RSet应该如何设计。分区之间的引用关系可以归纳为:

  • 分区内部有引用关系。
  • 新生代分区到新生代分区之间有引用关系。
  • 新生代分区到老生代分区之间有引用关系。
  • 老生代分区到新生代分区之间有引用关系。
  • 老生代分区到老生代分区之间有引用关系。

这里的引用关系指的是分区里面有一个对象存在一个指针指向另一个分区的对象。针对这5种情况,最简单的方式就是在RSet中记录所有的引用关系,但这并不是最优的设计方案。因为使用RSet进行回收实际上有两个重大的缺点:

  • 需要额外内存空间;这一部分通常是JVM最大的额外开销,一般在1%~20%之间。
  • 可能导致浮动垃圾;由于根据RSet回收,而RSet里面的对象可能已经死亡,这个时候被引用对象会被认为是活跃对象,实质上它是浮动垃圾。

所以有必要对RSet进行优化,根据垃圾回收的原理,我们来逐一分析哪些引用关系是需要记录在RSet中:

  • 分区内部有引用关系,无论是新生代分区还是老生代分区内部的引用,都无需记录引用关系,因为回收的时候是针对一个分区而言,即这个分区要么被回收要么不回收,回收的时候会遍历整个分区,所以无需记录这种额外的引用关系。
  • 新生代分区到新生代分区之间有引用关系,这个无需记录,原因在于G1的这3中回收算法都会全量处理新生代分区,所以它们都会被遍历,所以无需记录新生代到新生代之间的引用。
  • 新生代分区到老生代分区之间有引用关系,这个无需记录,对于G1中YGC针对的新生代分区,无需这个引用关系,混合GC发生的时候,G1会使用新生代分区作为根,那么遍历新生代分区的时候自然能找到老生代分区,所以也无需这个引用,对于FGC来说更无需这个引用关系,所有的分区都会被处理。
  • 老生代分区到新生代分区之间有引用关系,这个需要记录,在YGC的时候有两种根,一个就是栈空间/全局空间变量的引用,另外一个就是老生代分区到新生代分区的引用。
  • 老生代分区到老生代分区之间有引用关系,这个需要记录,在混合GC的时候可能只有部分分区被回收,所以必须记录引用关系,快速找到哪些对象是活跃的。

前面已经介绍过卡表和位图,那么RSet与卡表、位图又是什么关系?我们已经知道RSet记录引用者的地址。我们可以使用RSet直接记录对象的地址,带来的问题就是RSet会急剧膨胀,一个位可以表示512个字节区域到被引用区的关系。RSet用分区的起始地址和位图表示一个分区所有的引用信息。这里结合RSet从一个具体的例子来看看他们是如何工作。假定有两个新生代分区YHR,两个老生代分区OHR。为了方便,定义:obj1_YHR1.Field1=Obj2_HYR1,表示对象obj1在新生代分区YHR1,它有一个字段Field1指向对象obj2在新生代分区YHR2。

下图是G1中关于RSet和卡表的整体概述:

在这里需要注意一点。卡表是一个全局表,这个卡表的作用并不是记录引用关系,而是记录该区域中对象垃圾回收过程中的状态信息,且能描述对象所处的内存区域块,它能快速描述内存的使用情况,卡表在后文中还会有涉及。RSet里面有足够的信息定位到引用对象所在分区的块中,下面将详细介绍RSet。在G1回收器里面,使用了Point In的方法。算法可以简化为找到需要收集的分区HeapRegion集合,所以YGC扫描Root Set和RSet就可以了。

在G1回收器里面,使用了Point In的方法。算法可以简化为找到需要收集的分区HeapRegion集合,所以YGC扫描Root Set和RSet就可以了。在线程运行过程中,如果对象的引用发生了变化(通常就是赋值操作),就必须要通知RSet,更改其中的记录,但对于一个分区来说,里面的对象有可能被很多分区所引用,这就要求这个分区记录所有引用者的信息。为此G1回收器使用了一种新的数据结构PRT(Per region Table)来记录这种变化。每个HeapRegion都包含了一个PRT,它是通过HeapRegion里面的一个结构HeapRegionRemSet获得,而HeapRegionRemSet包含了一个OtherRegionsTable,也就是我们所说的PRT,代码如下所示:

src\share\vm\gc_implementation\g1\heapRegion.hppclass HeapRegion: public G1OffsetTableContigSpace {friend class VMStructs;private:// The remembered set for this region.// (Might want to make this "inline" later, to avoid some alloc failure// issues.)HeapRegionRemSet* _rem_set;…………………………………………
}
src\share\vm\gc_implementation\g1\heapRegionRemSet.hppclass HeapRegionRemSet : public CHeapObj<mtGC> {friend class VMStructs;friend class HeapRegionRemSetIterator;public:enum Event {Event_EvacStart, Event_EvacEnd, Event_RSUpdateEnd, Event_illegal};private:G1BlockOffsetSharedArray* _bosa;G1BlockOffsetSharedArray* bosa() const { return _bosa; }// A set of code blobs (nmethods) whose code contains pointers into// the region that owns this RSet.G1CodeRootSet _code_roots;Mutex _m;OtherRegionsTable _other_regions;enum ParIterState { Unclaimed, Claimed, Complete };volatile ParIterState _iter_state;volatile jlong _iter_claimed;// Unused unless G1RecordHRRSOops is true.static const int MaxRecorded = 1000000;static OopOrNarrowOopStar* _recorded_oops;static HeapWord**          _recorded_cards;static HeapRegion**        _recorded_regions;static int                 _n_recorded;static const int MaxRecordedEvents = 1000;static Event*       _recorded_events;static int*         _recorded_event_index;static int          _n_recorded_events;…………………………………………………………
}
src\share\vm\gc_implementation\g1\heapRegionRemSet.hpp// The "_coarse_map" is a bitmap with one bit for each region, where set
// bits indicate that the corresponding region may contain some pointer
// into the owning region.// The "_fine_grain_entries" array is an open hash table of PerRegionTables
// (PRTs), indicating regions for which we're keeping the RS as a set of
// cards.  The strategy is to cap the size of the fine-grain table,
// deleting an entry and setting the corresponding coarse-grained bit when
// we would overflow this cap.// We use a mixture of locking and lock-free techniques here.  We allow
// threads to locate PRTs without locking, but threads attempting to alter
// a bucket list obtain a lock.  This means that any failing attempt to
// find a PRT must be retried with the lock.  It might seem dangerous that
// a read can find a PRT that is concurrently deleted.  This is all right,
// because:
//
//   1) We only actually free PRT's at safe points (though we reuse them at
//      other times).
//   2) We find PRT's in an attempt to add entries.  If a PRT is deleted,
//      it's _coarse_map bit is set, so the that we were attempting to add
//      is represented.  If a deleted PRT is re-used, a thread adding a bit,
//      thinking the PRT is for a different region, does no harm.class OtherRegionsTable VALUE_OBJ_CLASS_SPEC {friend class HeapRegionRemSetIterator;G1CollectedHeap* _g1h;Mutex*           _m;HeapRegion*      _hr;// These are protected by "_m".BitMap      _coarse_map;size_t      _n_coarse_entries;static jint _n_coarsenings;PerRegionTable** _fine_grain_regions;size_t           _n_fine_entries;// The fine grain remembered sets are doubly linked together using// their 'next' and 'prev' fields.// This allows fast bulk freeing of all the fine grain remembered// set entries, and fast finding of all of them without iterating// over the _fine_grain_regions table.PerRegionTable * _first_all_fine_prts;PerRegionTable * _last_all_fine_prts;// Used to sample a subset of the fine grain PRTs to determine which// PRT to evict and coarsen.size_t        _fine_eviction_start;static size_t _fine_eviction_stride;static size_t _fine_eviction_sample_size;SparsePRT   _sparse_table;// These are static after init.static size_t _max_fine_entries;static size_t _mod_max_fine_entries_mask;// Requires "prt" to be the first element of the bucket list appropriate// for "hr".  If this list contains an entry for "hr", return it,// otherwise return "NULL".PerRegionTable* find_region_table(size_t ind, HeapRegion* hr) const;// Find, delete, and return a candidate PerRegionTable, if any exists,// adding the deleted region to the coarse bitmap.  Requires the caller// to hold _m, and the fine-grain table to be full.PerRegionTable* delete_region_table();// If a PRT for "hr" is in the bucket list indicated by "ind" (which must// be the correct index for "hr"), delete it and return true; else return// false.bool del_single_region_table(size_t ind, HeapRegion* hr);// link/add the given fine grain remembered set into the "all" listvoid link_to_all(PerRegionTable * prt);// unlink/remove the given fine grain remembered set into the "all" listvoid unlink_from_all(PerRegionTable * prt);public:OtherRegionsTable(HeapRegion* hr, Mutex* m);HeapRegion* hr() const { return _hr; }// For now.  Could "expand" some tables in the future, so that this made// sense.void add_reference(OopOrNarrowOopStar from, int tid);// Returns whether this remembered set (and all sub-sets) have an occupancy// that is less or equal than the given occupancy.bool occupancy_less_or_equal_than(size_t limit) const;// Removes any entries shown by the given bitmaps to contain only dead// objects.void scrub(CardTableModRefBS* ctbs, BitMap* region_bm, BitMap* card_bm);// Returns whether this remembered set (and all sub-sets) contain no entries.bool is_empty() const;size_t occupied() const;size_t occ_fine() const;size_t occ_coarse() const;size_t occ_sparse() const;static jint n_coarsenings() { return _n_coarsenings; }// Returns size in bytes.// Not const because it takes a lock.size_t mem_size() const;static size_t static_mem_size();static size_t fl_mem_size();bool contains_reference(OopOrNarrowOopStar from) const;bool contains_reference_locked(OopOrNarrowOopStar from) const;void clear();// Specifically clear the from_card_cache.void clear_fcc();void do_cleanup_work(HRRSCleanupTask* hrrs_cleanup_task);// Declare the heap size (in # of regions) to the OtherRegionsTable.// (Uses it to initialize from_card_cache).static void initialize(uint max_regions);// Declares that regions between start_idx <= i < start_idx + num_regions are// not in use. Make sure that any entries for these regions are invalid.static void invalidate(uint start_idx, size_t num_regions);static void print_from_card_cache();
};

OtherRegionsTable使用了三种不同的粒度来描述引用,如下图所示。

原因是前面提到的Point In的缺点,一个对象可能被引用的次数不固定。引用的次数可能很多也可能很少,为了提高效率,才用了动态化的数据结构存储。主要有以下三种粒度:

  • 稀疏PRT:通过哈希表方式来存储。默认长度为4。
  • 细粒度PRT:通过PRT指针的指针,所以可以简单地理解为PRT指针的数组。其数组长度可以指定也可以自动计算得到。
  • 粗粒度:通过位图来指示,每一位表示对应的分区有引用到该分区数据结构。

我们来看一下RSet是如何管理引用的,主要代码在add_reference中,即把引用者对象对应的卡表地址存放在RSet里面。在RSet里面记录一个区域到这个对象所在分区的引用。代码如下所示:

src\share\vm\gc_implementation\g1\heapRegionRemSet.cpp// 添加引用者对象对应的卡表地址到RSet中
void OtherRegionsTable::add_reference(OopOrNarrowOopStar from, int tid) {uint cur_hrm_ind = hr()->hrm_index();if (G1TraceHeapRegionRememberedSet) {gclog_or_tty->print_cr("ORT::add_reference_work(" PTR_FORMAT "->" PTR_FORMAT ").",from,UseCompressedOops? (void *)oopDesc::load_decode_heap_oop((narrowOop*)from): (void *)oopDesc::load_decode_heap_oop((oop*)from));}int from_card = (int)(uintptr_t(from) >> CardTableModRefBS::card_shift);if (G1TraceHeapRegionRememberedSet) {gclog_or_tty->print_cr("Table for [" PTR_FORMAT "...): card %d (cache = "INT32_FORMAT")",hr()->bottom(), from_card,FromCardCache::at((uint)tid, cur_hrm_ind));}// 提高处理效率,使用卡表缓存,缓存中有表示已处理if (FromCardCache::contains_or_replace((uint)tid, cur_hrm_ind, from_card)) {if (G1TraceHeapRegionRememberedSet) {gclog_or_tty->print_cr("  from-card cache hit.");}assert(contains_reference(from), "We just added it!");return;}// Note that this may be a continued H region.HeapRegion* from_hr = _g1h->heap_region_containing_raw(from);RegionIdx_t from_hrm_ind = (RegionIdx_t) from_hr->hrm_index();// If the region is already coarsened, return.//如果Rset中已经是粗粒度关系,也就是说RSet中记录是对应的分区,直接返回if (_coarse_map.at(from_hrm_ind)) {if (G1TraceHeapRegionRememberedSet) {gclog_or_tty->print_cr("  coarse map hit.");}assert(contains_reference(from), "We just added it!");return;}// Otherwise find a per-region table to add it to.// 添加RPT引用关系到RSet中size_t ind = from_hrm_ind & _mod_max_fine_entries_mask;PerRegionTable* prt = find_region_table(ind, from_hr);if (prt == NULL) {//加锁,多线程访问MutexLockerEx x(_m, Mutex::_no_safepoint_check_flag);// Confirm that it's really not there...prt = find_region_table(ind, from_hr);if (prt == NULL) {//使用稀疏矩阵存储uintptr_t from_hr_bot_card_index =uintptr_t(from_hr->bottom())>> CardTableModRefBS::card_shift;CardIdx_t card_index = from_card - from_hr_bot_card_index;assert(0 <= card_index && (size_t)card_index < HeapRegion::CardsPerRegion,"Must be in range.");if (G1HRRSUseSparseTable &&_sparse_table.add_card(from_hrm_ind, card_index)) {if (G1RecordHRRSOops) {HeapRegionRemSet::record(hr(), from);if (G1TraceHeapRegionRememberedSet) {gclog_or_tty->print("   Added card " PTR_FORMAT " to region ""[" PTR_FORMAT "...) for ref " PTR_FORMAT ".\n",align_size_down(uintptr_t(from),CardTableModRefBS::card_size),hr()->bottom(), from);}}if (G1TraceHeapRegionRememberedSet) {gclog_or_tty->print_cr("   added card to sparse table.");}assert(contains_reference_locked(from), "We just added it!");return;} else {if (G1TraceHeapRegionRememberedSet) {gclog_or_tty->print_cr("   [tid %d] sparse table entry ""overflow(f: %d, t: %u)",tid, from_hrm_ind, cur_hrm_ind);}}// 细粒度卡表满了,删除PRT并放入粗粒度卡表if (_n_fine_entries == _max_fine_entries) {prt = delete_region_table();// There is no need to clear the links to the 'all' list here:// prt will be reused immediately, i.e. remain in the 'all' list.prt->init(from_hr, false /* clear_links_to_all_list */);} else {//稀疏矩阵满了,需要再分配一个细粒度prt = PerRegionTable::alloc(from_hr);link_to_all(prt);}//把稀疏矩阵的数据迁移到细粒度卡表中,成功后删除稀疏矩阵PerRegionTable* first_prt = _fine_grain_regions[ind];prt->set_collision_list_next(first_prt);_fine_grain_regions[ind] = prt;_n_fine_entries++;if (G1HRRSUseSparseTable) {// Transfer from sparse to fine-grain.SparsePRTEntry *sprt_entry = _sparse_table.get_entry(from_hrm_ind);assert(sprt_entry != NULL, "There should have been an entry");for (int i = 0; i < SparsePRTEntry::cards_num(); i++) {CardIdx_t c = sprt_entry->card(i);if (c != SparsePRTEntry::NullEntry) {prt->add_card(c);}}// Now we can delete the sparse entry.bool res = _sparse_table.delete_entry(from_hrm_ind);assert(res, "It should have been there.");}}assert(prt != NULL && prt->hr() == from_hr, "consequence");}//添加引用prt->add_reference(from);if (G1RecordHRRSOops) {HeapRegionRemSet::record(hr(), from);if (G1TraceHeapRegionRememberedSet) {gclog_or_tty->print("Added card " PTR_FORMAT " to region ""[" PTR_FORMAT "...) for ref " PTR_FORMAT ".\n",align_size_down(uintptr_t(from),CardTableModRefBS::card_size),hr()->bottom(), from);}}assert(contains_reference(from), "We just added it!");
}

前面我们提到RSet记录的是引用者的地址,这只是一个概述,实际上在细粒度表中,每一项都是PRT,这个PRT使用的是HeapRegion的起始地址加上一个位图,这个位图描述这一个分区的引用情况,所以它的大小为HeapRegionSize%512。不直接记录地址而是通过一个起始地址和位图,这样可以使用更少的内存存储更多的引用关系,记录引用的代码如下:

hotspot/src/share/vm/gc_implementation/g1/heapRegionRemSet.cppvoid PerRegionTable::add_reference_work(OopOrNarrowOopStar from, bool par) {HeapRegion* loc_hr = hr();if (loc_hr->is_in_reserved_raw(from)) {size_t hw_offset = pointer_delta((HeapWord*)from, loc_hr->bottom());CardIdx_t from_card = (CardIdx_t)  hw_offset >> (CardTableModRefBS::card_shift - LogHeapWordSize);add_card_work(from_card, par);}
}void PerRegionTable::add_card_work(CardIdx_t from_card, bool par) {if (!_bm.at(from_card)) {if (par) {if (_bm.par_at_put(from_card, 1)) {Atomic::inc(&_occupied);}} else {_bm.at_put(from_card, 1);_occupied++;}}
}

FromCardCache源码如下:

// The FromCardCache remembers the most recently processed card on the heap on
// a per-region and per-thread basis.
class FromCardCache : public AllStatic {private:// Array of card indices. Indexed by thread X and heap region to minimize// thread contention.static int** _cache;static uint _max_regions;static size_t _static_mem_size;public:enum {InvalidCard = -1 // Card value of an invalid card, i.e. a card index not otherwise used.};static void clear(uint region_idx);// Returns true if the given card is in the cache at the given location, or// replaces the card at that location and returns false.static bool contains_or_replace(uint worker_id, uint region_idx, int card) {int card_in_cache = at(worker_id, region_idx);if (card_in_cache == card) {return true;} else {set(worker_id, region_idx, card);return false;}}static int at(uint worker_id, uint region_idx) {return _cache[worker_id][region_idx];}static void set(uint worker_id, uint region_idx, int val) {_cache[worker_id][region_idx] = val;}static void initialize(uint n_par_rs, uint max_num_regions);static void invalidate(uint start_idx, size_t num_regions);static void print(outputStream* out = gclog_or_tty) PRODUCT_RETURN;static size_t static_mem_size() {return _static_mem_size;}
};int**  FromCardCache::_cache = NULL;
uint   FromCardCache::_max_regions = 0;
size_t FromCardCache::_static_mem_size = 0;void FromCardCache::initialize(uint n_par_rs, uint max_num_regions) {guarantee(_cache == NULL, "Should not call this multiple times");_max_regions = max_num_regions;_cache = Padded2DArray<int, mtGC>::create_unfreeable(n_par_rs,_max_regions,&_static_mem_size);invalidate(0, _max_regions);
}void FromCardCache::invalidate(uint start_idx, size_t new_num_regions) {guarantee((size_t)start_idx + new_num_regions <= max_uintx,err_msg("Trying to invalidate beyond maximum region, from %u size " SIZE_FORMAT,start_idx, new_num_regions));for (uint i = 0; i < HeapRegionRemSet::num_par_rem_sets(); i++) {uint end_idx = (start_idx + (uint)new_num_regions);assert(end_idx <= _max_regions, "Must be within max.");for (uint j = start_idx; j < end_idx; j++) {set(i, j, InvalidCard);}}
}void FromCardCache::clear(uint region_idx) {uint num_par_remsets = HeapRegionRemSet::num_par_rem_sets();for (uint i = 0; i < num_par_remsets; i++) {set(i, region_idx, InvalidCard);}
}

更多推荐

JVM G1源码分析——记忆集(RSet)

本文发布于:2024-03-06 02:32:59,感谢您对本站的认可!
本文链接:https://www.elefans.com/category/jswz/34/1714124.html
版权声明:本站内容均来自互联网,仅供演示用,请勿用于商业和其他非法用途。如果侵犯了您的权益请与我们联系,我们将在24小时内删除。
本文标签:源码   记忆   JVM   RSet

发布评论

评论列表 (有 0 条评论)
草根站长

>www.elefans.com

编程频道|电子爱好者 - 技术资讯及电子产品介绍!