Question about armv7_icache_sync_range function

Jeunder Yu
Hello,

I have a question about the armv7_icache_sync_range function in
sys/arch/arm/arm/cpufunc_asm_armv7.S:

ENTRY(armv7_icache_sync_range)
        ldr     ip, .Larmv7_line_size
        cmp     r1, #0x8000
        movcs   r1, #0x8000

Register r1 holds the size of the range to sync.
My question is: why is it limited to 0x8000 (32 KB)?

A similar limit also exists in the following functions:
armv7_dcache_wb_range
armv7_idcache_wbinv_range
armv7_dcache_wbinv_range
armv7_dcache_inv_range

It's so strange!
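
To make the effect of the cmp/movcs pair concrete, here is a small C model of just those two instructions. It is only a sketch: the names clamp_sync_len and unsynced_bytes are invented for illustration and do not exist in the OpenBSD tree, and the model says nothing about what the rest of the routine does with r1.

```c
#include <stddef.h>

#define SYNC_LIMIT 0x8000u  /* the immediate used by cmp/movcs above */

/*
 * Model of "cmp r1, #0x8000; movcs r1, #0x8000".
 * movcs executes when r1 >= 0x8000 (unsigned compare), so the
 * net effect is r1 = min(r1, 0x8000).
 */
static size_t clamp_sync_len(size_t len)
{
    return len >= SYNC_LIMIT ? SYNC_LIMIT : len;
}

/* Bytes at the tail of the requested range that the clamp drops. */
static size_t unsynced_bytes(size_t len)
{
    return len - clamp_sync_len(len);
}
```

Under this model a 64 KB request (0x10000) is cut down to 0x8000, so the last 0x8000 bytes of the range would not be covered unless the caller splits the request itself.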

Compare the function armv5_ec_icache_sync_range in
sys/arch/arm/arm/cpufunc_asm_armv5_ec.S:

ENTRY_NP(armv5_ec_icache_sync_range)
        ldr     ip, .Larmv5_ec_line_size
        cmp     r1, #0x4000
        bcs     .Larmv5_ec_icache_sync_all

It falls back to the icache_sync_all function if the size is 0x4000 or
greater. Is this logic suitable for the armv7_icache_sync_range function?
Is there any performance consideration?
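
The armv5_ec dispatch can be modelled in C as below. This is a hedged sketch, not the kernel code: choose_sync_path and lines_in_range are hypothetical helpers, and line_size is assumed to be a power of two. The second helper counts how many per-line operations a range walk would need, which is one plausible reason for switching to a whole-cache operation above some size; the thread does not confirm that this is the actual motivation.

```c
#include <stddef.h>

#define SYNC_ALL_THRESHOLD 0x4000u  /* immediate from the armv5_ec code above */

enum sync_path { SYNC_RANGE, SYNC_ALL };

/*
 * Model of "cmp r1, #0x4000; bcs .Larmv5_ec_icache_sync_all".
 * bcs is taken when r1 >= 0x4000 (unsigned), so a length of exactly
 * 0x4000 already takes the sync_all path.
 */
static enum sync_path choose_sync_path(size_t len)
{
    return len >= SYNC_ALL_THRESHOLD ? SYNC_ALL : SYNC_RANGE;
}

/*
 * Number of cache lines overlapping [addr, addr + len).
 * Assumes line_size is a power of two.
 */
static size_t lines_in_range(size_t addr, size_t len, size_t line_size)
{
    if (len == 0)
        return 0;
    size_t first = addr & ~(line_size - 1);
    size_t last = (addr + len - 1) & ~(line_size - 1);
    return (last - first) / line_size + 1;
}
```

With a 32-byte line, a 0x4000-byte aligned range already means 512 per-line operations, and the count keeps growing with the length, while a whole-cache operation is bounded by the size of the cache.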

Any comments?

BRs,
Joey

Re: Question about armv7_icache_sync_range function

Patrick Wildt
Hi,

Some might call it an unimplemented feature, others might call it a bug.

In essence this code is wrong.  The value has to match the cache line
size used in the specific CPU that we’re running on.

If you look at NetBSD, you’ll see that they always read out the size
prior to doing that sync.  FreeBSD reads it out once on bootup, stores
it in a variable and uses that.  OpenBSD hardcodes this value because
no one has gotten around to doing it right yet.

Patrick

> On 20.11.2015 at 07:21, 游俊德 <[hidden email]> wrote:

Re: Question about armv7_icache_sync_range function

Jeunder Yu
On 20.11.2015 at 19:07, "Patrick Wildt" <[hidden email]> wrote:

>
> Hi,
>
> Some might call it an unimplemented feature, others might call it a bug.
>
> In essence this code is wrong.  The value has to match the cache line
> size used in the specific CPU that we’re running on.
>
> If you look at NetBSD, you’ll see that they always read out the size
> prior to doing that sync.  FreeBSD reads it out once on bootup, stores
> it in a variable and uses that.  OpenBSD hardcodes this value because
> no one has gotten around to doing it right yet.
>
> Patrick
>

The cache line size is not hardcoded: it is read from a global variable,
which is initialized by reading the cache type register. I think the
limit is unnecessary.
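
For reference, here is a minimal sketch of how line sizes can be derived from the ARMv7 Cache Type Register (CTR, read with "mrc p15, 0, <Rt>, c0, c0, 1"). The function names are invented for illustration, and whether the OpenBSD initialization code uses exactly these fields is an assumption; this message only says that the variable is filled from the cache type register.

```c
#include <stdint.h>

/*
 * ARMv7 CTR layout (format 0b100 in bits [31:29]):
 *   IminLine = bits [3:0], DminLine = bits [19:16].
 * Each field holds log2 of the number of 4-byte words in the
 * smallest cache line, so bytes = 4 << field.
 */
static uint32_t ctr_icache_line_bytes(uint32_t ctr)
{
    return 4u << (ctr & 0xfu);
}

static uint32_t ctr_dcache_line_bytes(uint32_t ctr)
{
    return 4u << ((ctr >> 16) & 0xfu);
}
```

For a made-up register value of 0x80040003 (IminLine = 3, DminLine = 4), this gives 32 bytes for the instruction side and 64 bytes for the data side.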

I also found that one is subtracted from the size of the range (r1); this
seems unnecessary as well.

It seems that a coherent walk of the translation tables is not supported and
the page tables are non-cacheable (why not write-back and flush?).

May I send a patch to correct the size of the range and to support coherent
walks of the translation tables?

Re: Question about armv7_icache_sync_range function

Patrick Wildt
Sorry for replying late. Yes, you’re right. I’m not sure why I
read it like that.

I think the PTEs are write-through by default.  Where write-through is
not available we wanted to use write-back, but unfortunately that
triggered bugs, so we made those PTEs uncached instead.

Having them write-back with coherent walk (or flush) does not sound
too bad. Feel free to send a diff to tech@ if you're still interested.

There had been some external effort to rewrite the PMAP based on
the PowerPC one. It actually runs on a platform, but it's not stable
enough. I wish there was more time to work on that.

> On 10.12.2015 at 13:16, Jeunder Yu <[hidden email]> wrote:

Re: Question about armv7_icache_sync_range function

Daniel Bolgheroni
On Fri, Dec 18, 2015 at 01:27:59PM +0100, Patrick Wildt wrote:
> There had been some external effort to rewrite the PMAP based on
> the PowerPC one. It actually runs on a platform, but it's not stable
> enough. I wish there was more time to work on that.

What are the advantages of the PowerPC-based pmap implementation and
what are the critical problems with the current one?

Thank you.

--
db