Study Notes on "Windows NT File System Internals": An Introduction to Physical Memory Management


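Page Frames and the Page Frame Database

The NT VMM must manage the physical memory available in the system. The method it uses is similar to the methods used by other modern commercial operating systems.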

 

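The NT VMM divides the available RAM into fixed-size page frames. The page frame size can range from 4K to 64K; on the Intel x86 architecture the page size is 4K. Each page frame has a corresponding entry in the page frame database (PFN database). The PFN database resides in nonpaged pool and is an array of page frame entries. For each page frame, the PFN database maintains the following information: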

 

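• The physical address of the page frame. This field is 20 bits wide; combined with a 12-bit offset, it can address 4GB of physical address space.

• A set of attributes for the page frame:

- A modified bit, which marks whether the page contents have been modified

- A state flag indicating whether a read or a write operation is in progress on the page

- The page color associated with the page

- Information marking the page as shared or as private to a process

• A back pointer to the PTE or prototype PTE (Prototype Page Table Entry, PPTE) that points to this page. This pointer is used to map a physical address back to the corresponding virtual address.

• The reference count for the page. The VMM uses this value to determine whether any PTE refers to this page frame.

• An event pointer. While paging I/O is in progress for the page (for example, while data is being read from disk into memory), this pointer refers to an event.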
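The fields listed above can be pictured as a small C structure. This is only a sketch: the type name `PfnEntry`, its field names and widths (other than the 20-bit frame number), and the helper `pfn_to_physical` are hypothetical and do not reproduce the real NT PFN entry layout. The helper shows how a 20-bit frame number plus a 12-bit offset covers a 4GB physical address space.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12u                              /* 4K pages, as on x86 */
#define SKETCH_OFFSET_MASK ((1u << SKETCH_PAGE_SHIFT) - 1u)

/* Hypothetical layout for illustration; not the real NT structure. */
typedef struct PfnEntry {
    unsigned int frame_number   : 20; /* physical page frame number          */
    unsigned int modified       : 1;  /* page contents have been changed     */
    unsigned int io_in_progress : 1;  /* a read or write is in progress      */
    unsigned int shared         : 1;  /* shared page vs. process-private     */
    unsigned int page_color     : 5;  /* page color (width is an assumption) */
    void        *pte_back_pointer;    /* PTE or prototype PTE mapping page   */
    uint16_t     reference_count;     /* nonzero while the frame is in use   */
    void        *io_event;            /* event used while paging I/O runs    */
} PfnEntry;

/* Combine a 20-bit frame number with a 12-bit byte offset. */
static uint32_t pfn_to_physical(uint32_t frame_number, uint32_t offset)
{
    assert(frame_number < (1u << 20));
    assert(offset <= SKETCH_OFFSET_MASK);
    return (frame_number << SKETCH_PAGE_SHIFT) | offset;
}
```

With the largest frame number and the largest offset, the result is the last byte of a 32-bit physical address space, which is where the 4GB figure comes from.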

 

 

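A page frame whose reference count is nonzero is a valid, in-use page frame. Each time a PTE stops pointing to the frame, the reference count is decremented by 1. When the reference count reaches 0, the page frame is no longer in use. Depending on its state, every unused page frame is on one of the following five lists: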

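• Bad page list: page frames that have parity errors

• Free page list: pages that can be reused immediately but have not been initialized to zero

• Zeroed list: pages that have been zeroed and can be reused immediately

• Modified list: page frames that are no longer referenced but cannot be reclaimed until their contents have been written to disk. The Modified Page Writer/Mapped Page Writer normally writes modified pages to disk asynchronously.

• Standby list: page frames that have been removed from a process's working set.

Based on the memory access pattern of each process, the NT VMM tries to minimize the number of page frames allocated to any one process. The pages allocated to a process at a given instant are called the working set of that process. By trimming working sets as far as it can, the NT VMM improves the utilization of physical memory. When a page frame is taken away from a process for this reason, the VMM does not reclaim it immediately; it puts the frame on the standby list, which gives the process a chance to reuse it. A page frame placed on the standby list is marked as being in the transition state, because it has not been freed, yet it no longer belongs to any process.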
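As a rough illustration of the five lists, the sketch below picks a list for a frame whose reference count has dropped to zero. It is a simplified model, not the VMM's actual logic: the `FrameState` flags, the `PageList` enum, and the order of the checks are all assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum PageList {
    LIST_BAD, LIST_FREE, LIST_ZEROED, LIST_MODIFIED, LIST_STANDBY
} PageList;

/* Hypothetical per-frame flags used only by this sketch. */
typedef struct FrameState {
    bool parity_error;     /* hardware reported a parity error            */
    bool modified;         /* contents must be written to disk first      */
    bool trimmed_from_ws;  /* removed from a working set, contents intact */
    bool zeroed;           /* contents have been filled with zeros        */
} FrameState;

/* Choose the list for an unused frame (reference count == 0). */
static PageList list_for_unused_frame(const FrameState *f)
{
    assert(f != NULL);
    if (f->parity_error)    return LIST_BAD;      /* never reuse            */
    if (f->modified)        return LIST_MODIFIED; /* write to disk first    */
    if (f->trimmed_from_ws) return LIST_STANDBY;  /* owner may reclaim it   */
    return f->zeroed ? LIST_ZEROED : LIST_FREE;   /* reusable immediately   */
}
```

The point of the standby branch is that a clean, trimmed page keeps its contents, so the original process can take it back without any disk I/O.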

 

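The NT VMM maintains a minimum and a maximum value for the total number of page frames on the free and standby lists. When a page frame is placed on the free or standby list and the total lies between the minimum and the maximum, a VMM global event is set to the signaled state. The VMM uses these events to determine whether enough physical pages are available in the system.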

 

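The VMM frequently calls an internal routine to check whether enough memory is available to satisfy a request. For example, suppose your driver calls MmAllocateNonCachedMemory(), which needs a certain number of free page frames. That function calls MiEnsureAvailablePageOrWait() to check whether enough pages are available on the free and standby lists. If there are not, the routine blocks on one of two events until enough pages become available. If the event is not signaled within a fixed timeout, the system bug checks (KeBugCheck()).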
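The decision of whether to wait, and on which event, can be modeled in a few lines. The sketch follows the checks at the top of the MiEnsureAvailablePageOrWait listing in these notes, but it is a user-mode model: `choose_wait`, the `WaitChoice` enum, and the two threshold values are hypothetical stand-ins for MM_LOW_LIMIT, MM_HIGH_LIMIT and the two VMM events.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative thresholds; the real limit values are not given in the text. */
static const unsigned LOW_LIMIT  = 2u;
static const unsigned HIGH_LIMIT = 19u;

typedef enum WaitChoice {
    NO_WAIT,          /* a page is available, return immediately    */
    WAIT_LOW_EVENT,   /* system-space fault: wait for the low limit */
    WAIT_HIGH_EVENT   /* other faults: wait for the high limit      */
} WaitChoice;

/* Decide whether the caller must wait for pages, and on which event. */
static WaitChoice choose_wait(unsigned available_pages, bool system_fault)
{
    assert(LOW_LIMIT <= HIGH_LIMIT);
    if (available_pages >= HIGH_LIMIT) {
        return NO_WAIT;
    }
    if (system_fault) {
        /* Faults in pageable system space may dip into the last pages. */
        return (available_pages >= LOW_LIMIT) ? NO_WAIT : WAIT_LOW_EVENT;
    }
    return WAIT_HIGH_EVENT;
}
```

Keeping a lower threshold for system-space faults means the system can still make progress when user-mode requests are already being held back.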

 

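The VMM uses a global spin lock to synchronize access to the page frame database. Code that accesses the PFN database must acquire this spin lock at an appropriate IRQL (<= DISPATCH_LEVEL).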
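The acquire/release pattern behind a lock like this can be sketched in portable C. This is a user-mode model only: the IRQL is a plain variable, the lock is a C11 `atomic_flag`, and the names `lock_pfn`, `unlock_pfn`, `current_irql` and the `*_SIM` constants are invented for the example. It shows the usual spin lock discipline of raising to DISPATCH_LEVEL, remembering the previous IRQL, and restoring it on release.

```c
#include <assert.h>
#include <stdatomic.h>

typedef unsigned char Irql;
enum { PASSIVE_LEVEL_SIM = 0, APC_LEVEL_SIM = 1, DISPATCH_LEVEL_SIM = 2 };

static Irql current_irql = PASSIVE_LEVEL_SIM;   /* simulated processor IRQL */
static atomic_flag pfn_lock = ATOMIC_FLAG_INIT; /* simulated PFN spin lock  */

/* Raise to DISPATCH_LEVEL, then spin until the lock is ours. */
static Irql lock_pfn(void)
{
    Irql old = current_irql;
    assert(old <= DISPATCH_LEVEL_SIM);  /* callers must be at or below it */
    current_irql = DISPATCH_LEVEL_SIM;
    while (atomic_flag_test_and_set_explicit(&pfn_lock, memory_order_acquire)) {
        /* busy-wait: another holder owns the lock */
    }
    return old;
}

/* Release the lock, then drop back to the caller's previous IRQL. */
static void unlock_pfn(Irql old)
{
    atomic_flag_clear_explicit(&pfn_lock, memory_order_release);
    current_irql = old;
}
```

The saved value plays the role of the `OldIrql` argument that LOCK_PFN and UNLOCK_PFN take in the source listings.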

 

 

The Win2K source code for MmAllocateNonCachedMemory() is as follows:

PVOID MmAllocateNonCachedMemory (
    IN SIZE_T NumberOfBytes
    )
/*++

Routine Description:

    This function allocates a range of noncached memory in
    the non-paged portion of the system address space.

    This routine is designed to be used by a driver's initialization
    routine to allocate a noncached block of virtual memory for
    various device specific buffers.

Arguments:

    NumberOfBytes - Supplies the number of bytes to allocate.

Return Value:

    NON-NULL - Returns a pointer (virtual address in the nonpaged portion
               of the system) to the allocated physically contiguous
               memory.

    NULL - The specified request could not be satisfied.

Environment:

    Kernel mode, IRQL of APC_LEVEL or below.

--*/
{
    PMMPTE PointerPte;
    MMPTE TempPte;
    PFN_NUMBER NumberOfPages;
    PFN_NUMBER PageFrameIndex;
    PVOID BaseAddress;
    KIRQL OldIrql;

    ASSERT (NumberOfBytes != 0);

    NumberOfPages = BYTES_TO_PAGES(NumberOfBytes);

    //
    // Obtain enough virtual space to map the pages.
    //

    PointerPte = MiReserveSystemPtes ((ULONG)NumberOfPages,
                                      SystemPteSpace,
                                      0,
                                      0,
                                      FALSE);

    if (PointerPte == NULL) {
        return NULL;
    }

    //
    // Obtain backing commitment for the pages.
    //

    if (MiChargeCommitmentCantExpand (NumberOfPages, FALSE) == FALSE) {
        MiReleaseSystemPtes (PointerPte, (ULONG)NumberOfPages, SystemPteSpace);
        return NULL;
    }

    MM_TRACK_COMMIT (MM_DBG_COMMIT_NONCACHED_PAGES, NumberOfPages);

    MmLockPagableSectionByHandle (ExPageLockHandle);

    //
    // Acquire the PFN mutex to synchronize access to the PFN database.
    //

    LOCK_PFN (OldIrql);

    //
    // Obtain enough pages to contain the allocation.
    // Check to make sure the physical pages are available.
    //

    if ((SPFN_NUMBER)NumberOfPages > MI_NONPAGABLE_MEMORY_AVAILABLE()) {
        UNLOCK_PFN (OldIrql);
        MmUnlockPagableImageSection (ExPageLockHandle);
        MiReleaseSystemPtes (PointerPte, (ULONG)NumberOfPages, SystemPteSpace);
        MiReturnCommitment (NumberOfPages);
        return NULL;
    }

#if defined(_IA64_)
    KeFlushEntireTb(FALSE, TRUE);
#endif

    MmResidentAvailablePages -= NumberOfPages;
    MM_BUMP_COUNTER(4, NumberOfPages);

    BaseAddress = (PVOID)MiGetVirtualAddressMappedByPte (PointerPte);

    do {
        ASSERT (PointerPte->u.Hard.Valid == 0);
        MiEnsureAvailablePageOrWait (NULL, NULL);
        PageFrameIndex = MiRemoveAnyPage (MI_GET_PAGE_COLOR_FROM_PTE (PointerPte));

        MI_MAKE_VALID_PTE (TempPte,
                           PageFrameIndex,
                           MM_READWRITE,
                           PointerPte);

        MI_SET_PTE_DIRTY (TempPte);
        MI_DISABLE_CACHING (TempPte);
        MI_WRITE_VALID_PTE (PointerPte, TempPte);
        MiInitializePfn (PageFrameIndex, PointerPte, 1);

        PointerPte += 1;
        NumberOfPages -= 1;
    } while (NumberOfPages != 0);

    //
    // Flush any data for this page out of the dcaches.
    //

#if !defined(_IA64_)
    //
    // Flush any data for this page out of the dcaches.
    //

    KeSweepDcache (TRUE);
#else
    MiSweepCacheMachineDependent(BaseAddress, NumberOfBytes, MmNonCached);
#endif

    UNLOCK_PFN (OldIrql);

    MmUnlockPagableImageSection (ExPageLockHandle);

    return BaseAddress;
}
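The first step of the routine, converting a byte count into a page count with BYTES_TO_PAGES, is simple rounding-up arithmetic. A minimal sketch, assuming 4K pages; the function `bytes_to_pages` and the `SKETCH_*` constants are hypothetical stand-ins, not the kernel macro itself:

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE_SHIFT 12u                           /* 4K pages assumed */
#define SKETCH_PAGE_SIZE  ((size_t)1 << SKETCH_PAGE_SHIFT)

/* Number of whole pages needed to hold `bytes` bytes (rounds up). */
static size_t bytes_to_pages(size_t bytes)
{
    assert(bytes != 0);  /* mirrors the ASSERT at the top of the routine */
    return (bytes >> SKETCH_PAGE_SHIFT) +
           ((bytes & (SKETCH_PAGE_SIZE - 1)) != 0);
}
```

A request of 4097 bytes therefore consumes two page frames, and the routine's do/while loop runs once per page computed here.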

The Win2K source code for MiEnsureAvailablePageOrWait is as follows:

 

ULONG FASTCALL
MiEnsureAvailablePageOrWait (
    IN PEPROCESS Process,
    IN PVOID VirtualAddress
    )
/*++

Routine Description:

    This procedure ensures that a physical page is available on
    the zeroed, free or standby list such that the next call the remove a
    page absolutely will not block.  This is necessary as blocking would
    require a wait which could cause a deadlock condition.

    If a page is available the function returns immediately with a value
    of FALSE indicating no wait operation was performed.  If no physical
    page is available, the thread enters a wait state and the function
    returns the value TRUE when the wait operation completes.

Arguments:

    Process - Supplies a pointer to the current process if, and only if,
              the working set mutex is held currently held and should
              be released if a wait operation is issued.  Supplies
              the value NULL otherwise.

    VirtualAddress - Supplies the virtual address for the faulting page.
                     If the value is NULL, the page is treated as a
                     user mode address.

Return Value:

    FALSE - if a page was immediately available.

    TRUE - if a wait operation occurred before a page became available.

Environment:

    Must be holding the PFN database mutex with APCs disabled.

--*/
{
    PVOID Event;
    NTSTATUS Status;
    KIRQL OldIrql;
    KIRQL Ignore;
    ULONG Limit;
    ULONG Relock;
    PFN_NUMBER StrandedPages;
    LOGICAL WsHeldSafe;
    PMMPFN Pfn1;
    PMMPFN EndPfn;
    LARGE_INTEGER WaitBegin;
    LARGE_INTEGER WaitEnd;

    MM_PFN_LOCK_ASSERT();

    if (MmAvailablePages >= MM_HIGH_LIMIT) {

        //
        // Pages are available.
        //

        return FALSE;
    }

    //
    // If this fault is for paged pool (or pagable kernel space,
    // including page table pages), let it use the last page.
    //

#if defined(_IA64_)
    if (MI_IS_SYSTEM_ADDRESS(VirtualAddress) ||
        (MI_IS_HYPER_SPACE_ADDRESS(VirtualAddress))) {
#else
    if (((PMMPTE)VirtualAddress > MiGetPteAddress(HYPER_SPACE)) ||
        ((VirtualAddress > MM_HIGHEST_USER_ADDRESS) &&
         (VirtualAddress < (PVOID)PTE_BASE))) {
#endif

        //
        // This fault is in the system, use 1 page as the limit.
        //

        if (MmAvailablePages >= MM_LOW_LIMIT) {

            //
            // Pages are available.
            //

            return FALSE;
        }

        Limit = MM_LOW_LIMIT;
        Event = (PVOID)&MmAvailablePagesEvent;
    } else {
        Limit = MM_HIGH_LIMIT;
        Event = (PVOID)&MmAvailablePagesEventHigh;
    }

    while (MmAvailablePages < Limit) {
        KeClearEvent ((PKEVENT)Event);

        UNLOCK_PFN (APC_LEVEL);

        if (Process == HYDRA_PROCESS) {
            UNLOCK_SESSION_SPACE_WS (APC_LEVEL);
        }
        else if (Process != NULL) {

            //
            // The working set lock may have been acquired safely or unsafely
            // by our caller.  Handle both cases here and below.
            //

            UNLOCK_WS_REGARDLESS (Process, WsHeldSafe);
        }
        else {
            Relock = FALSE;
            if (MmSystemLockOwner == PsGetCurrentThread()) {
                UNLOCK_SYSTEM_WS (APC_LEVEL);
                Relock = TRUE;
            }
        }

        KiQueryInterruptTime(&WaitBegin);

        //
        // Wait 7 minutes for pages to become available.
        //

        Status = KeWaitForSingleObject(Event,
                                       WrFreePage,
                                       KernelMode,
                                       FALSE,
                                       (PLARGE_INTEGER)&MmSevenMinutes);

        if (Status == STATUS_TIMEOUT) {

            KiQueryInterruptTime(&WaitEnd);

            //
            // See how many transition pages have nonzero reference counts as
            // these indicate drivers that aren't unlocking the pages in their
            // MDLs.
            //

            Limit = 0;
            StrandedPages = 0;

            do {
                Pfn1 = MI_PFN_ELEMENT (MmPhysicalMemoryBlock->Run[Limit].BasePage);
                EndPfn = Pfn1 + MmPhysicalMemoryBlock->Run[Limit].PageCount;

                while (Pfn1 < EndPfn) {
                    if ((Pfn1->u3.e1.PageLocation == TransitionPage) &&
                        (Pfn1->u3.e2.ReferenceCount != 0)) {
                            StrandedPages += 1;
                    }
                    Pfn1 += 1;
                }
                Limit += 1;
            } while (Limit != MmPhysicalMemoryBlock->NumberOfRuns);

            //
            // This bugcheck can occur for the following reasons:
            //
            // A driver has blocked, deadlocking the modified or mapped
            // page writers.  Examples of this include mutex deadlocks or
            // accesses to paged out memory in filesystem drivers, filter
            // drivers, etc.  This indicates a driver bug.
            //
            // The storage driver(s) are not processing requests.  Examples
            // of this are stranded queues, non-responding drives, etc.  This
            // indicates a driver bug.
            //
            // Not enough pool is available for the storage stack to write out
            // modified pages.  This indicates a driver bug.
            //
            // A high priority realtime thread has starved the balance set
            // manager from trimming pages and/or starved the modified writer
            // from writing them out.  This indicates a bug in the component
            // that created this thread.
            //
            // All the processes have been trimmed to their minimums and all
            // modified pages written, but still no memory is available.  The
            // freed memory must be stuck in transition pages with non-zero
            // reference counts - thus they cannot be put on the freelist.
            // A driver is neglecting to unlock the pages preventing the
            // reference counts from going to zero which would free the pages.
            // This may be due to transfers that never finish and the driver
            // never aborts or other driver bugs.
            //

            KeBugCheckEx (NO_PAGES_AVAILABLE,
                          MmModifiedPageListHead.Total,
                          MmTotalPagesForPagingFile,
                          (MmMaximumNonPagedPoolInBytes >> PAGE_SHIFT) - MmAllocatedNonPagedPool,
                          StrandedPages);

            if (!KdDebuggerNotPresent) {
                DbgPrint ("MmEnsureAvailablePageOrWait: 7 min timeout %x %x %x %x\n", WaitEnd.HighPart, WaitEnd.LowPart, WaitBegin.HighPart, WaitBegin.LowPart);
                DbgBreakPoint ();
            }
        }

        if (Process == HYDRA_PROCESS) {
            LOCK_SESSION_SPACE_WS (Ignore);
        }
        else if (Process != NULL) {

            //
            // The working set lock may have been acquired safely or unsafely
            // by our caller.  Reacquire it in the same manner our caller did.
            //

            LOCK_WS_REGARDLESS (Process, WsHeldSafe);
        }
        else {
            if (Relock) {
                LOCK_SYSTEM_WS (Ignore);
            }
        }

        LOCK_PFN (OldIrql);
    }

    return TRUE;
}
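The timeout path walks every run of physical memory and counts transition pages that still have a nonzero reference count. The same scan can be sketched in plain C over an ordinary array. The types `Frame` and `Run`, the `Location` enum, and the function `count_stranded` are hypothetical simplifications of the PFN database and the physical memory descriptor used in the listing.

```c
#include <assert.h>
#include <stddef.h>

typedef enum Location {
    LOC_FREE, LOC_STANDBY, LOC_MODIFIED, LOC_TRANSITION, LOC_ACTIVE
} Location;

/* Simplified stand-in for one PFN database entry. */
typedef struct Frame {
    Location location;
    unsigned reference_count;
} Frame;

/* One contiguous run of physical pages: [base_page, base_page + page_count). */
typedef struct Run {
    size_t base_page;
    size_t page_count;
} Run;

/* Count transition pages that are still referenced (for example, still locked). */
static size_t count_stranded(const Frame *frames, const Run *runs, size_t run_count)
{
    size_t stranded = 0;
    assert(frames != NULL && runs != NULL);
    for (size_t r = 0; r < run_count; r++) {
        const Frame *f = frames + runs[r].base_page;
        const Frame *end = f + runs[r].page_count;
        for (; f < end; f++) {
            if (f->location == LOC_TRANSITION && f->reference_count != 0) {
                stranded++;
            }
        }
    }
    return stranded;
}
```

Only frames that fall inside a run are examined, so gaps in the physical address map are skipped. In the listing, the resulting count is passed as the last argument of the NO_PAGES_AVAILABLE bug check.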

 
 




http://www.chinasem.cn/article/686455
