How to shrink the SIZE memory?

How to shrink the SIZE memory?

Peter J. Philipp-3
Hi misc@,

I have a program, a DNS server.  It has a database to hold internal data.
Right now it's very inefficient in the way it uses memory.  Let me explain.

If you know what an RRSET is: it's all the RRs under one name.  For example,
under the OpenBSD.ORG name there are SOA, NS, and A RRs, and so on.  That's
an RRSET.

Internally my program reserves space in each RRSET for the maximum number of
records (around 15 A RRs or so), and there are 15 or so RR types supported by
my program.  As you can guess, the RRSET becomes huge.

When designing the database for this I thought "hey, what if I mmap this space
and only those parts that get written to will actually be used".  And that's
what I did.  Only I ran into a wall.  With close to 30000 RRSETs I get this
in top:

Memory: Real: 1233M/3468M act/tot Free: 28G Cache: 1197M Swap: 0K/32G

  PID USERNAME PRI NICE  SIZE   RES STATE     WAIT      TIME    CPU COMMAND
74465 _ddd       2    0   13G  163M sleep/1   select    0:01  0.00% delphinusdn
31148 _ddd       2    0   13G 1116K sleep/1   select    0:01  0.00% delphinusdn
...

So you see the SIZE is 13 GB, but the actual resident memory is 163M.  So now
I come to my final question:

        Is there a way to claim back part of that 13G?

If not, is there a way to overcommit?  I googled this and many people think
it's bad, and I don't want to start a religious war right now.

In the worst case scenario I'll have to rewrite my underlying database, and I'm
a little lazy; if I do have to do that, it won't be for another year or more.

I looked at madvise(2) and I'm a little confused as to what I can use for this.

Best regards,
-peter


Re: How to shrink the SIZE memory?

Otto Moerbeek
On Sat, Feb 09, 2019 at 11:34:41AM +0100, Peter J. Philipp wrote:

> Hi misc@,
>
> [...]
>
> When designing the database for this I thought "hey what if I mmap this space
> and only those parts that get written to it will actually be used".  And that's
> what I did.  Only I run into a wall.  With close to 30000 RRSET's I get this
> in top:
>
> Memory: Real: 1233M/3468M act/tot Free: 28G Cache: 1197M Swap: 0K/32G
>
>   PID USERNAME PRI NICE  SIZE   RES STATE     WAIT      TIME    CPU COMMAND
> 74465 _ddd       2    0   13G  163M sleep/1   select    0:01  0.00% delphinusdn
>
> Is there a way to claim back part of that 13G?
>
> If not is there a way to overcommit?
>

Why is this a wall? Do your mmaps start failing? With what error code?

You should be able (given ulimits) to mmap up to MAXDSIZ (32G on
amd64) per process.

If you want to reduce SIZE, call munmap(). That's the only way. You
cannot have virtual memory pages that are not accounted for in SIZE.

        -Otto


Re: How to shrink the SIZE memory?

Peter J. Philipp-3
On Sat, Feb 09, 2019 at 12:01:39PM +0100, Otto Moerbeek wrote:
> Why is this a wall? Do your mmaps start failing? With what error code?

Well, 13G isn't the wall, but I had tried the entire /usr/share/dict/words as
A records, which would have given more than 200K RRSETs and blown up the
SIZE considerably, since the 30K RRSETs were already 13G.

The mmaps failed, but due to a stupidity in my own code I could not see the
real errno value (I'll show you why):

----->
                /* does not exist, create it */

                map = (char *)mmap(NULL, SIZENODE, PROT_READ|PROT_WRITE,
                    MAP_PRIVATE | MAP_ANON, -1, 0);
                if (map == MAP_FAILED) {
                        errno = EINVAL;
                        return -1;
                }

<-----

I should have logged it before overwriting errno; I'll do some tests and get
back to you.  I am currently not at my workstation, as I went to visit my
parents after writing the original mail, and I won't be back until Monday or
so, so any changes to this code will have to wait until then.

> You should be able (given ulimits) to mmap up to MAXDSIZ (32G on
> amd64) per process.

I noticed that the RLIMIT_DATA limit had given me problems too.  Raising it to
16G allowed me to work with the 30K RRSETs.

> If you want to reduce SIZE, call munmap(). That's the only way. You
> cannot have virtual memory pages that are not accounted for in SIZE.
>
> -Otto

Ahh, ok, thanks for clearing that up.  It looks like I'll have to rewrite the
way I store internal data then, if I want to use a LOT of RRSETs in the future.
It may be better for me too.

Thanks Otto!

Best Regards,
-peter


Re: How to shrink the SIZE memory?

Otto Moerbeek
On Sat, Feb 09, 2019 at 12:39:37PM +0100, Peter J. Philipp wrote:

> On Sat, Feb 09, 2019 at 12:01:39PM +0100, Otto Moerbeek wrote:
> > Why is this a wall? Do your mmaps start failing? With what error code?
>
> Well 13G isn't the wall, but I had tried the entire /usr/share/dict/words as
> A records which would have given more than 200K RRSET's which would have
> blown up this SIZE considerably since the 30K RRSET's were 13G.  

So you're using around 433k per RRSET.  That's a lot, given that a
typical RRSET in the wild is often smaller than 100 bytes (no k
there).  I understand the advantages of fixed-size data structures, but
in this case it's not the right way.

        -Otto


Re: How to shrink the SIZE memory?

Peter J. Philipp-3
On Sat, Feb 09, 2019 at 03:15:30PM +0100, Otto Moerbeek wrote:

> On Sat, Feb 09, 2019 at 12:39:37PM +0100, Peter J. Philipp wrote:
>
> [...]
>
> So you're using around 433k per RRSET.  That's a lot given that a
> typical RRSET in the wild is often smaller than 100 bytes (no k
> there).  I understand the advantages of fixed-size data structures, but
> in this case it's not the right way.
>
> -Otto

Thanks Otto.  I doodled up a plan in the last few hours; here is the result:
       
        https://centroid.eu/blog/c?article=1549722619

I'm not going to be rash about this, but I may start next week with this
new concept.  Do you think an RRSET like this would be space-saving and still
be fast enough?  I know I'm losing a bit of speed with all the TAILQs, but I
think I'll work this out with the tree(3) and queue(3) macros.

It's been a long road getting the database right over the last 15 years on
this daemon.  I started with a BerkeleyDB backend and replaced the BDB about
2 years ago with tree(3).  Now hopefully I'll get it right, finally.  This
year is already gonna rock in terms of milestones, hehe.

Best Regards,
-peter



Re: How to shrink the SIZE memory?

Peter J. Philipp-3
On Sat, Feb 09, 2019 at 03:15:30PM +0100, Otto Moerbeek wrote:

> [...]
>
> So you're using around 433k per RRSET.  That's a lot given that a
> typical RRSET in the wild is often smaller than 100 bytes (no k
> there).  I understand the advantages of fixed-size data structures, but
> in this case it's not the right way.
>
> -Otto

Hi Otto & Misc,

Good news: I was programming all week long and managed to compact the database.
I still have to test it thoroughly, as I'm sure there are a few bugs lurking,
but at first glance it looks like I was able to shrink the size of an RRSET
down to ~1800 bytes, and it's pretty fast too.  I added all of
/usr/share/dict/words as A RRs, and thus as their own RRSETs, to the existing
30K I had, and the memory footprint was around 456M virtual size.  I'm very
happy I did this instead of keeping the problematic design from the beginning
of this thread.

Thanks Otto!

Regards,
-peter