This article walks through how caches are implemented in the Sniper simulator, file by file.
Cache-related files in Sniper
- config folder
  - gainestown.cfg holds the L3 cache configuration. It includes nehalem.cfg.
  - nehalem.cfg holds the L2 cache and L1 cache configuration.
  - Sniper uses gainestown.cfg as its default parameter file.
############ nehalem.cfg
[perf_model/l1_icache]
perfect = false
cache_size = 32
associativity = 4
address_hash = mask
replacement_policy = lru
data_access_time = 4
tags_access_time = 1
perf_model_type = parallel
writethrough = 0
shared_cores = 1

[perf_model/l1_dcache]
perfect = false
cache_size = 32
associativity = 8
address_hash = mask
replacement_policy = lru
data_access_time = 4
tags_access_time = 1
perf_model_type = parallel
writethrough = 0
shared_cores = 1

[perf_model/l2_cache]
perfect = false
cache_size = 256
associativity = 8
address_hash = mask
replacement_policy = lru
data_access_time = 8 # 8.something according to membench, -1 cycle L1 tag access time
tags_access_time = 3
# Total neighbor L1/L2 access time is around 40/70 cycles (60-70 when it's coming out of L1)
writeback_time = 50 # L3 hit time will be added
perf_model_type = parallel
writethrough = 0
shared_cores = 1

[perf_model/l2_cache/srrip]
bits[] = 3,3,3,3,3 # extra setting required by the SRRIP policy: the number of additional bits each cache line needs

############ gainestown.cfg
[perf_model/l3_cache]
perfect = false
cache_block_size = 64
cache_size = 8192
associativity = 16
address_hash = mask
replacement_policy = lru
data_access_time = 30 # 35 cycles total according to membench, +L1+L2 tag times
tags_access_time = 10
perf_model_type = parallel
writethrough = 0
shared_cores = 4
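- sniper\common\misc\fixed_types.h
  - Contains the type definitions used across the whole Sniper system, for example UInt64 and the IntPtr address type.
  - To add a global variable to Sniper, you can declare it in this file and define it in the .cc file that uses it. Every file includes this header.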
typedef int64_t SInt64;
typedef int32_t SInt32;
typedef int16_t SInt16;
typedef int8_t SInt8;
typedef UInt8 Byte;
typedef UInt8 Boolean;
typedef uintptr_t IntPtr;
extern UInt64 PC;
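The declare-in-the-header, define-in-one-.cc pattern described above can be sketched in a single file as follows. This is only an illustration: the three sections would normally live in separate files, the `UInt64` typedef is written out here so the sketch compiles on its own, and `recordPC` is a hypothetical helper that is not part of Sniper.

```cpp
#include <cassert>
#include <cstdint>

// --- would live in fixed_types.h: declaration only, visible to every file ---
typedef uint64_t UInt64;   // written out here so the sketch is self-contained
extern UInt64 PC;

// --- would live in exactly one .cc file: the single definition ---
UInt64 PC = 0;

// --- any other .cc file that includes the header can now read and write PC ---
// recordPC is a hypothetical helper used only for this illustration.
UInt64 recordPC(UInt64 new_pc)
{
   UInt64 old_pc = PC;   // remember the previous value
   PC = new_pc;          // update the shared global
   return old_pc;
}
```

Keeping the definition in a single .cc file avoids duplicate-symbol errors at link time, because the header itself only declares the variable.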
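- \sniper\common\core\memory_subsystem contains the definition and implementation of Sniper's memory system
(1) .\parametric_dram_directory_msi\cache_cntlr.cc
This file decides whether the current memory operation hits or misses in the cache. On a hit it accesses the cache (either writing to it or reading from it). On a miss it inserts the block into the cache.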
/* Accepts a load or store request, determines whether the access hits or misses in this cache, and calls the matching handler */
HitWhere::where_t
CacheCntlr::processMemOpFromCore(Core::lock_signal_t lock_signal,
   Core::mem_op_t mem_op_type, IntPtr ca_address, UInt32 offset,
   Byte* data_buf, UInt32 data_length, bool modeled, bool count);

/* Called from the method above when the cache misses. Its main job is to find a cache block to replace */
SharedCacheBlockInfo* CacheCntlr::insertCacheBlock(IntPtr address,
   CacheState::cstate_t cstate, Byte* data_buf,
   core_id_t requester, ShmemPerfModel::Thread_t thread_num);

/* Handles the cache-hit case and is also called from processMemOpFromCore. It covers two operations: reading the cache and writing the cache */
void CacheCntlr::accessCache(
   Core::mem_op_t mem_op_type, IntPtr ca_address, UInt32 offset,
   Byte* data_buf, UInt32 data_length, bool update_replacement);
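The hit/miss dispatch described above can be illustrated with a toy controller. This is a deliberately simplified sketch, not Sniper's CacheCntlr: `ToyCacheCntlr`, `ToyHit`, `ToyMemOp` and the 64-byte block size are all assumptions made for the example, and coherence, timing, and eviction are left out. The sketch also assumes that the access is replayed once the missing line has been filled.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_set>

// Hypothetical stand-ins for HitWhere::where_t and Core::mem_op_t.
enum class ToyHit { HIT, MISS };
enum class ToyMemOp { READ, WRITE };

class ToyCacheCntlr
{
public:
   // Plays the role of processMemOpFromCore: classify the access, then dispatch.
   ToyHit processMemOp(ToyMemOp op, uint64_t address)
   {
      uint64_t line = address / kBlockSize;
      ToyHit where = ToyHit::HIT;
      if (m_lines.count(line) == 0)
      {
         where = ToyHit::MISS;
         insertCacheBlock(line);   // miss path: fill the line first
      }
      accessCache(op, line);       // the line is resident now: do the read or write
      return where;
   }

   unsigned inserts() const { return m_inserts; }
   unsigned reads() const { return m_reads; }
   unsigned writes() const { return m_writes; }

private:
   static constexpr uint64_t kBlockSize = 64;   // assumed block size in bytes

   void insertCacheBlock(uint64_t line)
   {
      m_lines.insert(line);   // unbounded capacity, so no victim is ever chosen here
      ++m_inserts;
   }

   void accessCache(ToyMemOp op, uint64_t line)
   {
      assert(m_lines.count(line) == 1);   // only called for resident lines
      if (op == ToyMemOp::READ) ++m_reads; else ++m_writes;
   }

   std::unordered_set<uint64_t> m_lines;
   unsigned m_inserts = 0, m_reads = 0, m_writes = 0;
};
```

A first read of address 0x1000 returns `ToyHit::MISS` and fills the line. A later write to 0x1004 falls in the same 64-byte block and returns `ToyHit::HIT`.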
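(2) .\cache\cache.cc and cache.h
Every actual cache, for example the L1 I-cache, is an object of the Cache class. Cache holds the basic properties of the cache (size, type, associativity and so on) together with accessors for them. It also provides the two methods that access and fill the cache, accessSingleLine and insertSingleLine. Both are called from CacheCntlr.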
/* Selected members of the Cache class */
// Cache counters
UInt64 m_num_accesses;
UInt64 m_num_hits;
// Generic Cache Info
cache_t m_cache_type;
CacheSet** m_sets;
CacheSetInfo* m_set_info;
/* Cache class constructor */
Cache(String name, String cfgname, core_id_t core_id,
   UInt32 num_sets, UInt32 associativity, UInt32 cache_block_size,
   String replacement_policy, cache_t cache_type,
   hash_t hash = CacheBase::HASH_MASK,
   FaultInjector *fault_injector = NULL,
   AddressHomeLookup *ahl = NULL);

/* accessSingleLine */
// On a cache hit the cache controller calls accessCache, which in turn calls this method of the Cache class. It handles both reads from and writes to the cache.
CacheBlockInfo* accessSingleLine(IntPtr addr, access_t access_type,
   Byte* buff, UInt32 bytes, SubsecondTime now, bool update_replacement);

/* insertSingleLine */
// On a cache miss the cache controller calls insertCacheBlock, which in turn calls this method of the Cache class.
void insertSingleLine(IntPtr addr, Byte* fill_buff,
   bool* eviction, IntPtr* evict_addr,
   CacheBlockInfo* evict_block_info, Byte* evict_buff,
   SubsecondTime now, CacheCntlr *cntlr = NULL);
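Both methods take a raw address, so the cache first has to work out which set the address belongs to. The sketch below shows one common way to do this with mask hashing (the `address_hash = mask` setting). `ToySplit` and `splitAddress` are hypothetical names for this example. Keeping the whole block number as the tag is a simplifying assumption, and the excerpt does not show how Sniper itself splits the address.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical result type for this illustration.
struct ToySplit
{
   uint64_t tag;        // identifies the block within its set
   uint32_t set_index;  // which set the block maps to
   uint32_t offset;     // byte offset inside the block
};

// Mask-based split. Assumes block_size and num_sets are powers of two.
ToySplit splitAddress(uint64_t addr, uint32_t block_size, uint32_t num_sets)
{
   assert(block_size != 0 && (block_size & (block_size - 1)) == 0);
   assert(num_sets != 0 && (num_sets & (num_sets - 1)) == 0);

   ToySplit s;
   s.offset = static_cast<uint32_t>(addr & (block_size - 1));
   uint64_t block_number = addr / block_size;
   s.set_index = static_cast<uint32_t>(block_number & (num_sets - 1));
   s.tag = block_number;   // assumption: the tag keeps the full block number
   return s;
}
```

With 64-byte blocks and 64 sets, address 0x12345 has byte offset 5, block number 1165, and set index 13.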
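(3) .\cache\cache_base.h
The CacheBase class holds basic cache properties such as associativity and cache size. It also holds type definitions such as the replacement policy enum, so a new replacement policy has to be added here.
// Enumeration of replacement policy types
enum ReplacementPolicy
{
   ROUND_ROBIN = 0,
   LRU,
   LRU_QBS,
   NRU,
   MRU,
   NMRU,
   PLRU,
   SRRIP,
   SHCT_SRRIP,
   SRRIP_QBS,
   RANDOM,
   NUM_REPLACEMENT_POLICIES,
   SHCT_LRU   // listed after NUM_REPLACEMENT_POLICIES, so it is not included in that count
};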
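(4) .\cache\cache_set.cc and cache_set.h
Replacement works per set. A set is a group of cache lines, and the number of lines in a set equals the associativity. The replacement policy picks a suitable cache line within the set to evict. Every set is an object of the CacheSet class, which implements the lowest-level cache operations. accessSingleLine calls read_line and write_line, and insertCacheBlock reaches insert through insertSingleLine.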
/* On a cache hit: access the cache and read data */
void read_line(UInt32 line_index, UInt32 offset, Byte *out_buff, UInt32 bytes, bool update_replacement);

/* On a cache hit: write data into the cache */
void write_line(UInt32 line_index, UInt32 offset, Byte *in_buff, UInt32 bytes, bool update_replacement);

/* On a cache miss: insert data into the cache. This is where the replacement policy takes effect */
void insert(CacheBlockInfo* cache_block_info, Byte* fill_buff,
   bool* eviction, CacheBlockInfo* evict_block_info,
   Byte* evict_buff, CacheCntlr *cntlr = NULL);
Besides the cache access methods, CacheSet has three more methods that must be changed when you add your own replacement policy:
/* Creates the cache_set object that matches the replacement policy */
CacheSet* CacheSet::createCacheSet(String cfgname, core_id_t core_id,
   String replacement_policy, CacheBase::cache_t cache_type,
   UInt32 associativity, UInt32 blocksize, CacheSetInfo* set_info);

/* Creates the cachesetinfo object that matches the replacement policy */
CacheSetInfo* CacheSet::createCacheSetInfo(String name, String cfgname,
   core_id_t core_id, String replacement_policy, UInt32 associativity);

/* Maps the replacement policy string from the configuration to a policy type */
CacheBase::ReplacementPolicy CacheSet::parsePolicyType(String policy);
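To make the roles of these methods concrete, here is a minimal sketch of the same parse-then-create pattern with toy types. Nothing here is Sniper code: `ToyPolicy`, `ToyCacheSet`, the `"my_policy"` string and both free functions are hypothetical, and the real methods take more parameters, as the signatures above show.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for CacheBase::ReplacementPolicy.
enum class ToyPolicy { LRU, MY_POLICY };

// Hypothetical stand-in for the CacheSet base class.
class ToyCacheSet
{
public:
   virtual ~ToyCacheSet() {}
   virtual std::string name() const = 0;
};

class ToyCacheSetLRU : public ToyCacheSet
{
public:
   std::string name() const override { return "lru"; }
};

// The newly added policy gets its own subclass.
class ToyCacheSetMyPolicy : public ToyCacheSet
{
public:
   std::string name() const override { return "my_policy"; }
};

// Step 1: map the configuration string to an enum value.
ToyPolicy parsePolicyType(const std::string& policy)
{
   if (policy == "lru") return ToyPolicy::LRU;
   if (policy == "my_policy") return ToyPolicy::MY_POLICY;   // new entry
   throw std::invalid_argument("Unknown replacement policy: " + policy);
}

// Step 2: create the set object that implements the chosen policy.
std::unique_ptr<ToyCacheSet> createCacheSet(const std::string& policy)
{
   switch (parsePolicyType(policy))
   {
      case ToyPolicy::LRU:
         return std::unique_ptr<ToyCacheSet>(new ToyCacheSetLRU());
      case ToyPolicy::MY_POLICY:   // new entry
         return std::unique_ptr<ToyCacheSet>(new ToyCacheSetMyPolicy());
   }
   throw std::logic_error("Unhandled replacement policy");
}
```

In this sketch a new policy touches three places: the enum, the string parser, and the factory. That matches the places the article says need changing.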
(5) .\cache\cache_block_info.cc and cache_block_info.h
Every cache line has a CacheBlockInfo object that stores its metadata, such as the tag and the used bits. If a new replacement policy needs extra per-line state, you can add it here or one level up in CacheSet.
IntPtr m_tag;
CacheState::cstate_t m_cstate;
UInt64 m_owner;
BitsUsedType m_used;
UInt8 m_options; // large enough to hold a bitfield for all available option_t's
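(6) .\cache\cache_set_lru.cc and cache_set_lru.h
This is the LRU policy that ships with Sniper. Like every replacement policy it derives from the CacheSet class and implements the base class's getReplacementIndex and updateReplacementIndex methods. The former runs when a victim is needed and picks the cache line to replace according to the policy. The latter runs whenever a cache line is accessed (read, written, or inserted) and updates the policy's own bookkeeping, such as the LRU access history.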
/* Virtual functions in the cache set base class */
virtual UInt32 getReplacementIndex(CacheCntlr *cntlr) = 0;
virtual void updateReplacementIndex(UInt32) = 0;

/* Implementation in the LRU replacement policy */
UInt32 CacheSetLRU::getReplacementIndex(CacheCntlr *cntlr)
{
   // First try to find an invalid block
   // (an unused cache line is replaced first)
   for (UInt32 i = 0; i < m_associativity; i++)
   {
      if (!m_cache_block_info_array[i]->isValid())
      {
         // Mark our newly-inserted line as most-recently used
         moveToMRU(i);
         return i;
      }
   }

   // Make m_num_attempts attempts at evicting the block at LRU position
   for (UInt8 attempt = 0; attempt < m_num_attempts; ++attempt)
   {
      UInt32 index = 0;
      UInt8 max_bits = 0;
      // Find the line that has gone unaccessed the longest
      for (UInt32 i = 0; i < m_associativity; i++)
      {
         if (m_lru_bits[i] > max_bits && isValidReplacement(i))
         {
            index = i;
            max_bits = m_lru_bits[i];
         }
      }
      LOG_ASSERT_ERROR(index < m_associativity, "Error Finding LRU bits");

      bool qbs_reject = false;
      if (attempt < m_num_attempts - 1) // the number of attempts is fixed and passed in as a parameter
      {
         LOG_ASSERT_ERROR(cntlr != NULL, "CacheCntlr == NULL, QBS can only be used when cntlr is passed in");
         qbs_reject = cntlr->isInLowerLevelCache(m_cache_block_info_array[index]);
      }

      if (qbs_reject) // the data is also held in a lower-level cache, so pick another cache line
      {
         // Block is contained in lower-level cache, and we have more tries remaining.
         // Move this block to MRU and try again
         moveToMRU(index);
         cntlr->incrementQBSLookupCost();
         continue;
      }
      else
      {
         // Mark our newly-inserted line as most-recently used
         moveToMRU(index); // update step
         m_set_info->incrementAttempt(attempt);
         return index;
      }
   }

   LOG_PRINT_ERROR("Should not reach here");
}

// Update step
void CacheSetLRU::updateReplacementIndex(UInt32 accessed_index)
{
   m_set_info->increment(m_lru_bits[accessed_index]);
   moveToMRU(accessed_index);
}

void CacheSetLRU::moveToMRU(UInt32 accessed_index)
{
   // The larger m_lru_bits is, the longer ago the cache line was last accessed
   for (UInt32 i = 0; i < m_associativity; i++)
   {
      if (m_lru_bits[i] < m_lru_bits[accessed_index])
         m_lru_bits[i] ++;
   }
   // Set m_lru_bits of the accessed line to 0, meaning it was just accessed
   m_lru_bits[accessed_index] = 0;
}
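The aging rule in moveToMRU is easier to follow on a small standalone model. The sketch below reuses the same update rule and the same "largest age wins" victim search, but drops the validity check, the QBS retries, and the statistics. `ToyLRUSet` is a hypothetical class. Starting the ages as 0, 1, ..., associativity-1 is an assumption made so that every way has a distinct age.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

class ToyLRUSet
{
public:
   explicit ToyLRUSet(uint32_t associativity)
      : m_lru_bits(associativity)
   {
      // Assumption: ages start as a permutation 0..associativity-1.
      for (uint32_t i = 0; i < associativity; i++)
         m_lru_bits[i] = static_cast<uint8_t>(i);
   }

   // Same rule as moveToMRU: every way younger than the accessed one
   // ages by one step, and the accessed way becomes age 0.
   void touch(uint32_t way)
   {
      assert(way < m_lru_bits.size());
      for (uint32_t i = 0; i < m_lru_bits.size(); i++)
      {
         if (m_lru_bits[i] < m_lru_bits[way])
            m_lru_bits[i]++;
      }
      m_lru_bits[way] = 0;
   }

   // Victim search: the way with the largest age.
   // Validity and QBS checks from the real code are omitted.
   uint32_t victim() const
   {
      uint32_t index = 0;
      uint8_t max_bits = 0;
      for (uint32_t i = 0; i < m_lru_bits.size(); i++)
      {
         if (m_lru_bits[i] > max_bits)
         {
            index = i;
            max_bits = m_lru_bits[i];
         }
      }
      return index;
   }

   uint8_t age(uint32_t way) const { return m_lru_bits[way]; }

private:
   std::vector<uint8_t> m_lru_bits;
};
```

For a 4-way set the ages start as [0, 1, 2, 3], so way 3 is the first victim. After way 3 is touched the ages become [1, 2, 3, 0] and way 2 is next in line. Because only ways younger than the accessed one are aged, the ages stay a permutation of 0 to 3 and never overflow.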
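(7) An example of the relationship between Cache and CacheSet
Example: 32KB, 8-way set-associative, 64B blocks
One Cache object contains 32KB / (8 * 64B) = 64 CacheSet objects. Each CacheSet object holds 8 cache lines, and each cache line has a CacheBlockInfo object that stores its metadata.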
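The same arithmetic can be written as a small helper and applied to the other cache levels. `numSets` is a hypothetical function for this example. The sizes are taken from the configuration files listed earlier, read as kilobytes, with a 64-byte block assumed for every level.

```cpp
#include <cassert>
#include <cstdint>

// Number of sets = capacity / (associativity * block size).
// size_kb is the cache capacity in kilobytes.
uint32_t numSets(uint32_t size_kb, uint32_t associativity, uint32_t block_bytes)
{
   assert(associativity > 0 && block_bytes > 0);
   return (size_kb * 1024u) / (associativity * block_bytes);
}
```

Under those assumptions, the 32KB 8-way L1 D-cache has 64 sets, the 32KB 4-way L1 I-cache has 128, the 256KB 8-way L2 has 512, and the 8192KB 16-way L3 has 8192.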
(8) Diagram
[Figure not included.]