AMBA Questions Roundup

2024-04-10 17:58

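While rereading the AMBA protocol I ran into some questions and fuzzy points. I am recording them here to reread from time to time as a reminder. Many of the questions and answers are reposted from the AMBA community.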

Question 1:

Regarding AMBA AXI4 Write Strobe port

Hi All,

       I am completely new to AMBA AXI4 protocol. I am unclear about exact functionality of Write Strobe (WSTRB) port in write data channel. According to Specification,
we are calculating Lower_Byte_Lane and Upper_Byte_Lane (on page A3-47) to identify valid bytes present on write data bus (WDATA).

       For Example, if we have aligned address 0x00, transfer size = 32 bits and data bus of 64-bits, we have valid data present on WDATA[31:0] and WDATA[63:32] will
be ignored during first transfer in a burst. So in this case, Master should drive 0x0F on WSTRB port as we are expecting data on lower 32-bits according to Lower_Byte_Lane
and Upper_Byte_Lane calculation.

       My question is that we are determining valid byte lanes based on Lower_Byte_Lane and Upper_Byte_Lane calculation. So what is the significance of WSTRB port?

       When we have to consider WSTRB port value in AXI4 slave? And at the same time, do we need to ignore Lower_Byte_Lane and Upper_Byte_Lane values?

       One more query: During read data phase, do we need to send read data in particular byte lanes on RDATA bus depending upon Lower_Byte_Lane and
Upper_Byte_Lane values? Please confirm.


Thanks & Regards,

Tejas

Answer 1:

> My question is that we are determining valid byte lanes based on Lower_Byte_Lane and Upper_Byte_Lane calculation. So what is the significance of WSTRB port?

Page A3-49 of the AXI4 specification states: "There is one write strobe for each eight bits of the write data bus, therefore WSTRB[n] corresponds to WDATA[(8n)+7:(8n)]." That is, on a 64-bit bus WSTRB[7], the most significant strobe bit, validates WDATA[63:56], and WSTRB[0], the least significant, validates WDATA[7:0].

Moreover, although the master signals a 32-bit transfer on AWSIZE, WSTRB then tells you whether all 4 bytes of that 32-bit transfer contain valid data. For example, AWSIZE=3'b010 signals a 32-bit transfer on this 64-bit data bus, and WSTRB could be 8'b00001001, indicating that only WDATA[31:24] and WDATA[7:0] contain valid bytes to be transferred.

> When we have to consider WSTRB port value in AXI4 slave? And at the same time, do we need to ignore Lower_Byte_Lane and Upper_Byte_Lane values?

The slave needs both the AWSIZE and the WSTRB information. WSTRB tells the slave which byte lanes contain active data (something it cannot work out solely from the AWADDR/AWSIZE/AWBURST combination), but AWSIZE (together with AWADDR/AWBURST) is still needed to tell the slave how much the address increments for subsequent transfers in a burst.

> One more query: During read data phase, do we need to send read data in particular byte lanes on RDATA bus depending upon Lower_Byte_Lane and Upper_Byte_Lane values? Please confirm.

Yes, correct. Lower_Byte_Lane and Upper_Byte_Lane depend on the first transfer in a burst, i.e. on whether the read address is 32-bit aligned or 64-bit aligned to your 64-bit data bus.
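The byte-lane arithmetic discussed in this answer can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `first_transfer_byte_lanes` and `lane_mask` are made up for this note, and the formulas are the first-transfer equations for Lower_Byte_Lane and Upper_Byte_Lane as given in the AXI specification, so they are worth checking against your own copy of the spec.

```python
def first_transfer_byte_lanes(start_address, number_bytes, data_bus_bytes):
    """Return (Lower_Byte_Lane, Upper_Byte_Lane) for the first transfer of a burst.

    start_address  : address of the first transfer (AxADDR)
    number_bytes   : bytes per transfer, i.e. 2**AxSIZE
    data_bus_bytes : width of the data bus in bytes
    """
    # Address rounded down to a multiple of the transfer size.
    aligned_address = (start_address // number_bytes) * number_bytes
    # Address rounded down to a multiple of the bus width.
    bus_base = (start_address // data_bus_bytes) * data_bus_bytes
    lower = start_address - bus_base
    upper = aligned_address + (number_bytes - 1) - bus_base
    return lower, upper


def lane_mask(lower, upper):
    """Bit mask with one bit set per byte lane in [lower, upper]."""
    mask = 0
    for lane in range(lower, upper + 1):
        mask |= 1 << lane
    return mask


# The example from the question: address 0x00, 32-bit transfer, 64-bit bus.
lanes = first_transfer_byte_lanes(0x00, 4, 8)   # (0, 3)
full_strobe = lane_mask(*lanes)                 # 0x0F

# The sparse strobe from the answer stays inside those lanes.
sparse_strobe = 0b00001001
sparse_is_inside = (sparse_strobe & ~full_strobe) == 0   # True
```

The mask is the widest strobe a master could drive for that transfer; the actual WSTRB may be any subset of it. For reads there is no strobe, so the same lane range simply tells the slave where on RDATA the data is expected.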



Question 2:

AXI4 Burst Length

In AXI4 the burst length is increased from 16 to 256.

If the burst type is WRAP, does this mean we can do wrap bursts of 2, 4, 8, 16, 32, 64, 128 and 256?

Also, if the burst type is FIXED, can the burst length be more than 16?

Answer 2:
AXI4 only extends the burst length for INCR type transactions to 256; the maximum burst length of FIXED and WRAP transactions is still 16. (Page 44 of the specification states this clearly.)
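The limits in this answer can be written down as a small checker. This is a sketch, not a reference implementation: the function name `burst_length_is_legal` and the string-valued arguments are assumptions made for illustration. The restriction of WRAP bursts to lengths 2, 4, 8 and 16 comes from the AXI specification rather than from the answer above.

```python
def burst_length_is_legal(burst_type, length, protocol="AXI4"):
    """Check a burst length (number of transfers) against the AXI limits.

    burst_type : "FIXED", "INCR" or "WRAP"
    length     : number of transfers in the burst (AxLEN + 1)
    protocol   : "AXI3" or "AXI4"
    """
    if burst_type == "WRAP":
        # WRAP bursts are limited to these four lengths in AXI3 and AXI4.
        return length in (2, 4, 8, 16)
    if burst_type == "FIXED":
        return 1 <= length <= 16
    if burst_type == "INCR":
        # Only INCR bursts were extended to 256 transfers in AXI4.
        max_length = 256 if protocol == "AXI4" else 16
        return 1 <= length <= max_length
    raise ValueError(f"unknown burst type: {burst_type!r}")


wrap_32_ok = burst_length_is_legal("WRAP", 32)     # False
fixed_17_ok = burst_length_is_legal("FIXED", 17)   # False
incr_256_ok = burst_length_is_legal("INCR", 256)   # True
```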

Question 3:

AXI - USER and REGION Signal Width

Could you please share some information on the USER and REGION sideband signals, namely their usage and signal widths?

Also can write data width and read width be different or is it always the same ?

Answer 3:

The USER signals are optional 'sideband' signals that can accompany each AXI channel. The width of them is completely user-defined, as is their purpose. Since their functionality is not defined by the protocol, the use of User signals is generally not recommended as this can lead to interoperability issues if two components use the same User signals in an incompatible manner.

 

The AWREGION/ARREGION signals are both 4 bits wide and can be used to indicate which 'region' of a slave a transaction is targeted at. This high-level explanation is probably better clarified with a simple example of how they could be used: if a slave has (for example) an area of control registers and a separate area of data registers, a master could indicate that a transaction is for the data area using the appropriate AxREGION signal, and the slave will not have as much decoding to do as a result.

 

Your last question is an interesting one because (as far as I'm aware) nothing in the protocol specification says that read and write data widths have to be the same. It would however need to be a pretty unusual design to actually benefit from doing this, and it would likely increase the complexity of working with existing components/interconnects as this isn't something that I've heard of as being supported.


Question 4:

Difference between AXI transfer and transaction

I've been studying the AXI protocol and I've come up against a small roadblock. My questions are:

1. What is an AXI transfer?
2. What is an AXI transaction?
3. What is the difference between a transfer and a transaction?
4. Is a transfer made of multiple transactions, or is it the other way around?
5. Is 1 beat 1 transfer or 1 transaction?
6. Can a single transaction (transfer) occupy more than one burst (because sometimes data can be greater than 256 transfers, with each transfer having 64 bits)?
7. Finally (phew!), can a burst have more than one transaction (because sometimes the data to be moved will be only a few bytes, so more than one transaction could be accommodated in one burst)?

Answer 4:
Hi Metalhead0202,

1. An AXI transfer is defined in some of the ARM documentation as

  "A single exchange of information. That is, with one xVALID/xREADY handshake"

2. And an AXI transaction is then described as

  "An entire burst of transfers, comprising an address, one or more data transfers and a response transfer (writes only)."

3. So from these you can see that a "transaction" is made up of lots of "transfers", with a write "transaction" including an AW "transfer", one or more W "transfers" and finally a B "transfer". A read "transaction" starts with an AR "transfer" and is followed by one or more R "transfers".

4. It is "the other way round"  :)

5. A beat would be 1 data channel transfer.

6. No. A "transaction" contains a "burst" of data "transfers", so a "burst" is limited to the maximum number of "transfers" allowed for that transaction type (16 for FIXED and WRAP, 256 for INCR in AXI4 or 16 for AXI3).

7. There is a "Glossary" of the terms used at the end of the AXI spec, and it describes a "Burst" as

  "In an AXI transaction, the payload data is transferred in a single burst, that can comprise multiple beats, or individual data transfers."

Maybe section A1.4 in the version of the AXI spec you are looking at will help clarify all of your questions. It describes the terminology used in the spec and answers most of the questions above.

Question 5:

AXI write strobes

the AXI spec says:

10.1 About unaligned transfers
[...]
For any burst that is made up of data transfers wider than one byte, it is possible that the first bytes that have to be accessed do not align with the natural data width boundary. For example, a 32-bit (four-byte) data packet that starts at a byte address of 0x1002 is not aligned to a 32-bit boundary.


and then shows some examples of bursts with unaligned first bytes.

i also see references to disabling all strobes on any beat of a burst write.

but, what about unaligned ending bytes?  for example, a burst of 1kB starting at address 0x1 would have both an unaligned starting and ending byte.  is this allowed?

do the bytes of a burst have to be contiguous?  could the writes strobes have holes in them, for example, 0x5, 0xa, 0x9, etc.?

also, i was wondering what AXI masters ARM has that makes use of this feature?  do ARM processors ever generate unaligned bursts for instruction or data accesses, or is it only the DMA controller that issues unaligned bursts?  and in what scenario would a master disable all the strobes after starting a burst write (something like interrupting a dirty line castout?)?
Answer 5:
Hi James,

>but, what about unaligned ending bytes?  

Wouldn't be allowed. It is only the first transfer in a burst that is unaligned, all the remaining transfers are aligned.

However for a write transaction you could use the WSTRB bits to signal which of the final bytes is valid, that way having the same effect as an unaligned final transfer in the burst. But you cannot do this for reads.

> for example, a burst of 1kB starting at address 0x1 would 
> have both an unaligned starting and ending byte.  is this allowed?

No. The final transfer would be aligned.

> do the bytes of a burst have to be contiguous?  could the writes
>  strobes have holes in them, for example, 0x5, 0xa, 0x9, etc.?

The write strobes can change for each transfer of a burst, so you could see the above sequence.

> also, i was wondering what AXI masters ARM has that makes
> use of this feature?

I am not aware of any current ARM masters that use the WSTRBs to indicate sparse transfers, but maybe someone else will know more about specific ARM master designs.

==
Look at section 9.3 where it states "In a fixed burst, the address remains constant, and the byte lanes that CAN be used also remain constant".

The important word here is CAN. The AWADDR and AWSIZE signals tell you the range of byte lanes that CAN be used, but the WSTRB bits would say which specific possible byte lanes ARE being used in each beat of the FIXED burst.

So WSTRB could be used to make the FIXED burst appear to be unaligned (even though the AWADDR value actually IS aligned).
==
A start address of 0x1 means that you can use any of the byte lanes WSTRB[7:1] on the 64 bit data bus (assuming AWSIZE indicated a 64 bit transfer), but not WSTRB[0].

Obviously if AWSIZE indicated a 32 bit transfer, even with a bus width of 64 bits we could only use WSTRB[3:1] for this transfer to AWADDR=0x1.
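The distinction between lanes that CAN be used and lanes that ARE used can be sketched as a mask check. The helper names `usable_lanes_mask` and `wstrb_is_allowed` are hypothetical, and the lane formulas are the first-transfer equations from the AXI specification. The check applies to the first transfer of a burst only, since later transfers are aligned.

```python
def usable_lanes_mask(awaddr, size_bytes, bus_bytes):
    """Mask of WSTRB bits that CAN be set for the first transfer.

    awaddr     : start address of the burst
    size_bytes : bytes per transfer, i.e. 2**AWSIZE
    bus_bytes  : data bus width in bytes
    """
    bus_base = (awaddr // bus_bytes) * bus_bytes
    aligned = (awaddr // size_bytes) * size_bytes
    lower = awaddr - bus_base
    upper = aligned + size_bytes - 1 - bus_base
    return sum(1 << lane for lane in range(lower, upper + 1))


def wstrb_is_allowed(wstrb, awaddr, size_bytes, bus_bytes):
    """True if every asserted strobe bit lies in a usable lane.

    Sparse patterns and an all-zero strobe both pass this check.
    """
    return (wstrb & ~usable_lanes_mask(awaddr, size_bytes, bus_bytes)) == 0


# Start address 0x1 on a 64-bit bus, as in the answer above.
mask_64bit_transfer = usable_lanes_mask(0x1, 8, 8)   # 0xFE -> WSTRB[7:1]
mask_32bit_transfer = usable_lanes_mask(0x1, 4, 8)   # 0x0E -> WSTRB[3:1]

# The "holes" from the question, on an aligned 32-bit transfer and bus.
holes_ok = all(wstrb_is_allowed(s, 0x0, 4, 4) for s in (0x5, 0xA, 0x9))   # True
```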

Question 6:

Why is the address the same for each transfer of a transaction in a FIXED burst in AXI3 or AXI4?

In AXI3/AXI4, let us assume a transaction has 4 transfers.

For example, the 1st transfer of a write transaction goes to address 2000 with data 20.

The 2nd transfer of the write transaction also goes to address 2000 (as per the spec) with data 40.

My question is whether the 1st transfer's data is overwritten by the 2nd transfer of the write transaction, i.e. only data 40 is available at address 2000. Is that correct?

After sending two transfers of a write transaction to the same address, can I read the data of the 1st transfer using the same address?

In a FIXED burst, do we need to do a write transfer first and then a read transfer immediately for the operation to work? Is that correct?

Can we do continuous write transfers first and then continuous read transfers in a FIXED burst?

In the above case (continuous writes, then continuous reads) only the last write transfer of the transaction will be at the address, and it can be read multiple times. Is that correct?

Answer 6:

You're correct about what the FIXED burst does (multiple reads or writes to the same address).  The question is why it would be a useful thing to do.  The typical example is the one I mentioned in the other thread: accessing a FIFO.

 

Imagine you had a piece of hardware that acted as a message queue.  Every time you write to it you push another message onto the queue.  Every time you read from it you pop the next message off the queue.  You could use a FIXED write to push multiple messages onto the queue, or use a FIXED read to pop multiple messages off the queue.
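A toy software model makes the contrast with an ordinary memory location concrete. Everything here is hypothetical (the class names, the address 0x2000 and the data values 20 and 40 are taken from the question or invented for illustration); it only models the data behaviour of a FIXED burst, not the AXI handshakes.

```python
from collections import deque


class RegisterTarget:
    """A plain storage location: every write to the address overwrites it."""

    def __init__(self):
        self._value = None

    def fixed_write(self, address, beats):
        for data in beats:
            self._value = data          # the last beat wins

    def fixed_read(self, address, length):
        return [self._value] * length   # the same value on every beat


class MessageQueueTarget:
    """A FIFO port: each write beat pushes, each read beat pops."""

    def __init__(self):
        self._fifo = deque()

    def fixed_write(self, address, beats):
        for data in beats:
            self._fifo.append(data)

    def fixed_read(self, address, length):
        return [self._fifo.popleft() for _ in range(length)]


register = RegisterTarget()
register.fixed_write(0x2000, [20, 40])
register_data = register.fixed_read(0x2000, 2)   # [40, 40]

queue = MessageQueueTarget()
queue.fixed_write(0x2000, [20, 40])
queue_data = queue.fixed_read(0x2000, 2)         # [20, 40]
```

With the register model the questioner's reading is right: only the last value survives. With the FIFO model every beat is kept, which is why a FIXED burst is useful there.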


Question 7:
In the case of wrapping bursts, we first need to calculate the Aligned_Address, using:

Suppose the start address is 55, assuming a 32-bit bus and a burst length of 4.

Aligned_Address = (INT(Start_Address / Number_Bytes)) * Number_Bytes;

Is the value 52 or 56, i.e. do we round to the lower or the upper value?

Then we calculate the wrap boundary, using
Wrap_Boundary = (INT(Start_Address / (Number_Bytes * Burst_Length))) * (Number_Bytes * Burst_Length);

What does this wrap boundary actually indicate?
1. The address from where wrapping will take place.
2. The address value after wrap.


Also, if anyone can let me know in a step-wise manner how the addresses are calculated using the same scenario above, that would be great.

Hope to see the replies soon.
Answer 7:
A wrap start address **must** be beat-aligned. It is illegal for a master to make an unaligned wrap request - it wouldn't be AXI conformant.

There are two kinds of alignment in a wrap:

Start_Address needs to be aligned on the beat-size, so for a 32-bit beat it needs 4-byte aligned, 16-bit beat needs to be 2-byte aligned, etc. In this case your start address (55) is not a valid Start_Address for a 32-bit beat  - ..., 52, 56, 60, 64, ... would be OK for example.

> Wrap_Boundary = (INT(Start_Address / (Number_Bytes * Burst_Length))) x (Number_Bytes * Burst_Length).

Wrap_Boundary is the lowest address accessed by the wrap, and it is aligned on a boundary which matches the total number of bytes in the burst, that is, beat-size * number-of-beats. For a burst with 8 x 32-bit beats the Wrap_Boundary would be 256-bit (32-byte) aligned. (The Aligned_Address formula, INT(Start_Address / Number_Bytes) x Number_Bytes, only aligns to the beat size; for a wrap it equals Start_Address, because the start must already be beat-aligned.)

Using my example earlier. If you had a 32-bit wrap of 4 beats starting at address 0x4 then you would get the following access pattern:
0x4, 0x8, 0xc, 0x0

Start Address = 0x4
- this is allowed as it is aligned on a beat boundary.

Wrap_Boundary = (INT(0x4 / 0x10)) * 0x10 = INT(0.25) * 0x10 = 0 * 0x10 = 0
- this is aligned on a total_bytes boundary, and it is the address the burst wraps back to.

Note that INT() is a C style cast - it always truncates decimal parts (rounds down).
==
As I mentioned before, your example is not a valid AXI transfer - it can never happen.

The formula in the spec is for calculating the Aligned address in the wrap, not for aligning the start address - that must be aligned by the master making the access.

For a 4-byte beat your start address must be 4-byte aligned. 55 is not. To repeat what I said before:

[snip]
Start_Address needs to be aligned on the beat-size, so for a 32-bit beat it needs 4-byte aligned, 16-bit beat needs to be 2-byte aligned, etc. In this case your start address (55) is not a valid Start_Address for a 32-bit beat - ..., 52, 56, 60, 64, ... would be OK for example.
[/snip]

However, assuming a start address of 56 the following calculation would apply:

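Total burst size = 4-byte beat * 4 beats = 16 bytes

Wrap_Boundary (lowest address accessed in the burst):
= INT(56 / 16) * 16
= INT(3.5) * 16
= 3 * 16
= 48

Highest byte accessed in burst:
= 48 + 16 - 1
= 63

The address wraps back to 48 when the incremented address would reach 48 + 16 = 64.

Access Pattern:
= 56, 60, 48, 52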
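The step-by-step calculation can be reproduced with a short function. This is a sketch under stated assumptions: the name `wrap_burst_addresses` is made up for this note, `wrap_boundary` follows the Wrap_Boundary formula quoted in the question (the lowest address of the wrapping window, 48 in the example), and the restriction to burst lengths 2, 4, 8 and 16 is taken from the AXI specification.

```python
def wrap_burst_addresses(start_address, number_bytes, burst_length):
    """Return the address of every beat in a WRAP burst.

    start_address : address of the first beat, must be beat-aligned
    number_bytes  : bytes per beat, i.e. 2**AxSIZE
    burst_length  : number of beats, must be 2, 4, 8 or 16
    """
    if burst_length not in (2, 4, 8, 16):
        raise ValueError("WRAP burst length must be 2, 4, 8 or 16")
    if start_address % number_bytes != 0:
        raise ValueError("WRAP start address must be aligned to the beat size")

    total_bytes = number_bytes * burst_length
    # INT() in the spec truncates, which is floor division for addresses.
    wrap_boundary = (start_address // total_bytes) * total_bytes

    addresses = []
    address = start_address
    for _ in range(burst_length):
        addresses.append(address)
        address += number_bytes
        # Wrap back to the lowest address once the window is exceeded.
        if address == wrap_boundary + total_bytes:
            address = wrap_boundary
    return addresses


pattern_56 = wrap_burst_addresses(56, 4, 4)     # [56, 60, 48, 52]
pattern_0x4 = wrap_burst_addresses(0x4, 4, 4)   # [4, 8, 12, 0]
```

Calling the function with the start address 55 from the question raises `ValueError`, matching the point that such a request is not a valid wrap.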

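Question 8:

An AXI burst question!

AXI bus, burst operations: the rule that a burst must not cross a 4K boundary!
In the design of Master_A, suppose Master_A only accesses a single 64M SDRAM (Master_A does not access any other slave), and the amount of data read and written is far larger than 4K. Some burst may therefore land on a 4K boundary.
Question: in this situation, do the bursts issued by Master_A need to obey the 4K boundary rule?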

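Answer 8:
The protocol forbids a burst from crossing a 4K boundary so that a single burst transaction can never access two slaves (each slave's address space is 4K/1K aligned). Suppose one burst transaction accessed two slaves, A and B (A first, then B). Only A would receive the address and control information; B would not, so only A would respond and B would not. The burst transaction could then never complete, because B cannot return the last transfer. So, if you really must do this, determine which slave in the system has the smallest address space, say 1M; your burst boundary could then be larger than 4K, up to 1M. This is still not recommended. After all, the probability of a burst crossing a 4K boundary is fairly low, and if one really does cross, split it.

The main reason for 4K alignment is that the system defines a page size of 4K. So, to make it easier to set each slave's access attributes, each slave is allocated a 4K space.
With 4K alignment, taking 32-bit addresses as an example, addresses whose bits [31:12] are equal are in the same page and do not cross a 4K boundary; bits [11:0] can be 0 to 0xFFF. For example, 0x1000 and 0x2000 are in different pages, so they are across a 4K boundary. 0x1000 and 0x1FFF are in the same page, so they are not across a 4K boundary. Likewise, 0x1FFF and 0x2000 are across a 4K boundary even though they are adjacent bytes.
As for a single burst being smaller than 4K: if the start address is 0x1FFC in INCR mode, the burst would still cross the boundary, wouldn't it?

1K alignment means that addresses whose bits [31:10] are equal lie in the same 1K-aligned space.

As for splitting when a burst really does cross: for example, when a processor loads multiple data items across a page, the interface control module splits that one access into two. Everything on the interface is then a protocol-compliant transaction. For example, if the core wants to access 32 bytes of data at 0x1FF0-0x200C (4 bytes per beat), the system automatically splits the access into two transactions, 0x1FF0-0x1FFC and 0x2000-0x200C.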
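The boundary check and the split described above can be sketched as follows. The function names `crosses_boundary` and `split_at_boundary` are hypothetical, and the sketch only handles the address arithmetic: it does not enforce burst length limits or beat alignment, which a real master would also have to respect.

```python
BOUNDARY = 4096  # 4K boundary in bytes


def crosses_boundary(start_address, total_bytes, boundary=BOUNDARY):
    """True if [start_address, start_address + total_bytes) spans two pages.

    Equivalent to comparing address bits [31:12] of the first and last byte.
    """
    last_byte = start_address + total_bytes - 1
    return (start_address // boundary) != (last_byte // boundary)


def split_at_boundary(start_address, total_bytes, boundary=BOUNDARY):
    """Split an access into (address, byte_count) chunks that stay in one page."""
    chunks = []
    address = start_address
    remaining = total_bytes
    while remaining > 0:
        room = boundary - (address % boundary)   # bytes left in this page
        count = min(room, remaining)
        chunks.append((address, count))
        address += count
        remaining -= count
    return chunks


# Two 4-byte beats starting at 0x1FFC cross into the next page.
incr_crosses = crosses_boundary(0x1FFC, 8)        # True

# The 32-byte access from the example: 0x1FF0-0x1FFC and 0x2000-0x200C.
two_transactions = split_at_boundary(0x1FF0, 32)  # [(0x1FF0, 16), (0x2000, 16)]
```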

Question 9:

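If the start address is unaligned (that is, its low-order bits are not aligned to the transfer size), then when the first data beat is sent, are all the strobe bits counted from the preceding size-aligned address asserted, or only the bits that are meaningful for that first transfer?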


Answer 9:

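It should be the latter: the first transfer asserts only the meaningful strobe bits, as in the example below. From the second transfer onward the accesses are aligned again, so nothing special applies.

size = 2 (4-byte transfers): the address ends in 0xA, which falls in byte lanes 8, 9, A and B, but only lanes A and B are used, so the strobe for those four lanes is 0xC.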




http://www.chinasem.cn/article/891749
