Teach du(1) the -m flag, disk usage in megabytes

David Goerger
This diff teaches du(1) the -m flag, which reports disk usage in megabytes.
This brings us in line with implementations in the other BSDs, Linux,
and Illumos.

Other base utilities where this flag might be useful include df(1)
and quot(8), although the flag doesn't appear to be universally adopted
for them among the other implementations. That said, I can definitely cook
up a diff if others would find the flag useful elsewhere. In
particular, I think the flag would be useful in quot(8), but that
tool has an unfortunate legacy, discouraged option "-h" which
conflicts with -k/-m/-h semantics elsewhere in base, such that
adding "-m" to quot(8) might only invite confusion.

Many thanks to florian@ for a first-pass review on bsd.network, and
for encouraging me to check out what the other BSDs and utilities in
base do, so that we maintain consistency across the ecosystem.

This is my first patch submission---any pointers for improvement would
be greatly appreciated! Thanks!

(diff below and also attached as "du-megabytes.diff" in case my mail
client mangles formatting; hopefully I got this right!)

---

Index: du.1
===================================================================
RCS file: /cvs/src/usr.bin/du/du.1,v
retrieving revision 1.35
diff -u -p -r1.35 du.1
--- du.1 2 Sep 2019 21:18:41 -0000 1.35
+++ du.1 25 Jan 2020 20:52:11 -0000
@@ -38,7 +38,7 @@
 .Nd display disk usage statistics
 .Sh SYNOPSIS
 .Nm du
-.Op Fl achkrsx
+.Op Fl achkmrsx
 .Op Fl H | L | P
 .Op Fl d Ar depth
 .Op Ar
@@ -86,6 +86,10 @@ By default, all sizes are reported in 51
 The
 .Fl k
 option causes the numbers to be reported in kilobyte counts.
+.It Fl m
+Similar to
+.Fl k ,
+but report disk usage in megabytes.
 .It Fl L
 All symbolic links are followed.
 .It Fl P
Index: du.c
===================================================================
RCS file: /cvs/src/usr.bin/du/du.c,v
retrieving revision 1.32
diff -u -p -r1.32 du.c
--- du.c 24 Aug 2016 03:13:45 -0000 1.32
+++ du.c 25 Jan 2020 20:52:31 -0000
@@ -61,7 +61,7 @@ main(int argc, char *argv[])
        long blocksize;
        int64_t totalblocks;
        int ftsoptions, listfiles, maxdepth;
-       int Hflag, Lflag, cflag, hflag, kflag;
+       int Hflag, Lflag, cflag, hflag, kflag, mflag;
        int ch, notused, rval;
        char **save;
        const char *errstr;
@@ -70,11 +70,11 @@ main(int argc, char *argv[])
                err(1, "pledge");

        save = argv;
-       Hflag = Lflag = cflag = hflag = kflag = listfiles = 0;
+       Hflag = Lflag = cflag = hflag = kflag = listfiles = mflag = 0;
        totalblocks = 0;
        ftsoptions = FTS_PHYSICAL;
        maxdepth = -1;
-       while ((ch = getopt(argc, argv, "HLPacd:hkrsx")) != -1)
+       while ((ch = getopt(argc, argv, "HLPacd:hkmrsx")) != -1)
                switch (ch) {
                case 'H':
                        Hflag = 1;
@@ -103,10 +103,17 @@ main(int argc, char *argv[])
                case 'h':
                        hflag = 1;
                        kflag = 0;
+                       mflag = 0;
                        break;
                case 'k':
                        kflag = 1;
                        hflag = 0;
+                       mflag = 0;
+                       break;
+               case 'm':
+                       kflag = 0;
+                       hflag = 0;
+                       mflag = 1;
                        break;
                case 's':
                        maxdepth = 0;
@@ -155,6 +162,8 @@ main(int argc, char *argv[])
                blocksize = 512;
        else if (kflag)
                blocksize = 1024;
+       else if (mflag)
+               blocksize = 1048576;
        else
                (void)getbsize(&notused, &blocksize);
        blocksize /= 512;
@@ -320,6 +329,6 @@ usage(void)
 {

        (void)fprintf(stderr,
-           "usage: du [-achkrsx] [-H | -L | -P] [-d depth] [file ...]\n");
+           "usage: du [-achkmrsx] [-H | -L | -P] [-d depth] [file ...]\n");
        exit(1);
 }
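
As a reading aid, the option handling in this diff can be condensed into a small standalone sketch. The function name select_blocksize and its string-of-flags argument are illustrative assumptions and are not part of du.c. The sketch only mirrors the patched logic, in which the last of -h, -k, and -m on the command line wins and the chosen unit is stored as a block size in bytes.

```c
/*
 * Hypothetical helper (not in du.c): walk a string of option letters
 * the way the patched getopt(3) loop does and return the reporting
 * block size in bytes.  default_bsize stands in for getbsize(3).
 */
static long
select_blocksize(const char *flags, long default_bsize)
{
	int hflag = 0, kflag = 0, mflag = 0;

	for (; *flags != '\0'; flags++) {
		switch (*flags) {
		case 'h':
			hflag = 1;
			kflag = mflag = 0;
			break;
		case 'k':
			kflag = 1;
			hflag = mflag = 0;
			break;
		case 'm':
			mflag = 1;
			hflag = kflag = 0;
			break;
		default:
			break;	/* other options do not affect the unit */
		}
	}

	/* Same precedence chain as the if/else ladder in the diff. */
	if (hflag)
		return 512;
	else if (kflag)
		return 1024;
	else if (mflag)
		return 1048576;
	return default_bsize;
}
```

Under these assumptions, select_blocksize("km", 512) yields 1048576 and select_blocksize("mk", 512) yields 1024, since each unit flag clears the other two.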

Re: Teach du(1) the -m flag, disk usage in megabytes

Sebastian Benoit-3
Maybe the manpage text could be better, but I'll leave that to jmc@.

ok benno@

David Goerger([hidden email]) on 2020.01.26 11:59:33 -0500:

> This diff teaches du(1) the -m flag, report disk usage in megabytes.
> This brings us in line with implementations in the other BSDs, Linux,
> and Illumos.
>
> Other base utilities where this flag might be useful include df(1)
> and quot(8), although it doesn't appear to be universally adopted
> among the other implementations. That said I can definitely cook
> up a diff if others would find the flag useful elsewhere. In
> particular I think the flag would be useful in quot(8), but that
> tool has an unfortunate legacy, discouraged option "-h" which
> conflicts with -k/-m/-h semantics elsewhere in base, such that
> adding "-m" to quot(8) might only invite confusion.
>
> Many thanks to florian@ for a first-pass review on bsd.network, and
> for encouraging me to check out what the other BSDs and utilities in
> base do, so that we maintain consistency across the ecosystem.
>
> This is my first patch submission---any pointers for improvement would
> be greatly appreciated! Thanks!
>
> (diff below and also attached as "du-megabytes.diff" in case my mail
> client mangles formatting; hopefully I got this right!)
>
> ---
>
> Index: du.1
> ===================================================================
> RCS file: /cvs/src/usr.bin/du/du.1,v
> retrieving revision 1.35
> diff -u -p -r1.35 du.1
> --- du.1 2 Sep 2019 21:18:41 -0000 1.35
> +++ du.1 25 Jan 2020 20:52:11 -0000
> @@ -38,7 +38,7 @@
>  .Nd display disk usage statistics
>  .Sh SYNOPSIS
>  .Nm du
> -.Op Fl achkrsx
> +.Op Fl achkmrsx
>  .Op Fl H | L | P
>  .Op Fl d Ar depth
>  .Op Ar
> @@ -86,6 +86,10 @@ By default, all sizes are reported in 51
>  The
>  .Fl k
>  option causes the numbers to be reported in kilobyte counts.
> +.It Fl m
> +Similar to
> +.Fl k ,
> +but report disk usage in megabytes.
>  .It Fl L
>  All symbolic links are followed.
>  .It Fl P
> Index: du.c
> ===================================================================
> RCS file: /cvs/src/usr.bin/du/du.c,v
> retrieving revision 1.32
> diff -u -p -r1.32 du.c
> --- du.c 24 Aug 2016 03:13:45 -0000 1.32
> +++ du.c 25 Jan 2020 20:52:31 -0000
> @@ -61,7 +61,7 @@ main(int argc, char *argv[])
>         long blocksize;
>         int64_t totalblocks;
>         int ftsoptions, listfiles, maxdepth;
> -       int Hflag, Lflag, cflag, hflag, kflag;
> +       int Hflag, Lflag, cflag, hflag, kflag, mflag;
>         int ch, notused, rval;
>         char **save;
>         const char *errstr;
> @@ -70,11 +70,11 @@ main(int argc, char *argv[])
>                 err(1, "pledge");
>
>         save = argv;
> -       Hflag = Lflag = cflag = hflag = kflag = listfiles = 0;
> +       Hflag = Lflag = cflag = hflag = kflag = listfiles = mflag = 0;
>         totalblocks = 0;
>         ftsoptions = FTS_PHYSICAL;
>         maxdepth = -1;
> -       while ((ch = getopt(argc, argv, "HLPacd:hkrsx")) != -1)
> +       while ((ch = getopt(argc, argv, "HLPacd:hkmrsx")) != -1)
>                 switch (ch) {
>                 case 'H':
>                         Hflag = 1;
> @@ -103,10 +103,17 @@ main(int argc, char *argv[])
>                 case 'h':
>                         hflag = 1;
>                         kflag = 0;
> +                       mflag = 0;
>                         break;
>                 case 'k':
>                         kflag = 1;
>                         hflag = 0;
> +                       mflag = 0;
> +                       break;
> +               case 'm':
> +                       kflag = 0;
> +                       hflag = 0;
> +                       mflag = 1;
>                         break;
>                 case 's':
>                         maxdepth = 0;
> @@ -155,6 +162,8 @@ main(int argc, char *argv[])
>                 blocksize = 512;
>         else if (kflag)
>                 blocksize = 1024;
> +       else if (mflag)
> +               blocksize = 1048576;
>         else
>                 (void)getbsize(&notused, &blocksize);
>         blocksize /= 512;
> @@ -320,6 +329,6 @@ usage(void)
>  {
>
>         (void)fprintf(stderr,
> -           "usage: du [-achkrsx] [-H | -L | -P] [-d depth] [file ...]\n");
> +           "usage: du [-achkmrsx] [-H | -L | -P] [-d depth] [file ...]\n");
>         exit(1);
>  }

Re: Teach du(1) the -m flag, disk usage in megabytes

Jonathan Gray-11
In reply to this post by David Goerger
On Sun, Jan 26, 2020 at 11:59:33AM -0500, David Goerger wrote:
> This diff teaches du(1) the -m flag, report disk usage in megabytes.
> This brings us in line with implementations in the other BSDs, Linux,
> and Illumos.

Why is it needed?  -k is required by POSIX; adding arguments for
megabytes, gigabytes, terabytes, petabytes, etc. seems silly when
there are already 512-byte blocks, kilobytes, and -h output.

>
> Other base utilities where this flag might be useful include df(1)
> and quot(8), although it doesn't appear to be universally adopted
> among the other implementations. That said I can definitely cook
> up a diff if others would find the flag useful elsewhere. In
> particular I think the flag would be useful in quot(8), but that
> tool has an unfortunate legacy, discouraged option "-h" which
> conflicts with -k/-m/-h semantics elsewhere in base, such that
> adding "-m" to quot(8) might only invite confusion.
>
> Many thanks to florian@ for a first-pass review on bsd.network, and
> for encouraging me to check out what the other BSDs and utilities in
> base do, so that we maintain consistency across the ecosystem.
>
> This is my first patch submission---any pointers for improvement would
> be greatly appreciated! Thanks!
>
> (diff below and also attached as "du-megabytes.diff" in case my mail
> client mangles formatting; hopefully I got this right!)
>
> ---
>
> Index: du.1
> ===================================================================
> RCS file: /cvs/src/usr.bin/du/du.1,v
> retrieving revision 1.35
> diff -u -p -r1.35 du.1
> --- du.1 2 Sep 2019 21:18:41 -0000 1.35
> +++ du.1 25 Jan 2020 20:52:11 -0000
> @@ -38,7 +38,7 @@
>  .Nd display disk usage statistics
>  .Sh SYNOPSIS
>  .Nm du
> -.Op Fl achkrsx
> +.Op Fl achkmrsx
>  .Op Fl H | L | P
>  .Op Fl d Ar depth
>  .Op Ar
> @@ -86,6 +86,10 @@ By default, all sizes are reported in 51
>  The
>  .Fl k
>  option causes the numbers to be reported in kilobyte counts.
> +.It Fl m
> +Similar to
> +.Fl k ,
> +but report disk usage in megabytes.
>  .It Fl L
>  All symbolic links are followed.
>  .It Fl P
> Index: du.c
> ===================================================================
> RCS file: /cvs/src/usr.bin/du/du.c,v
> retrieving revision 1.32
> diff -u -p -r1.32 du.c
> --- du.c 24 Aug 2016 03:13:45 -0000 1.32
> +++ du.c 25 Jan 2020 20:52:31 -0000
> @@ -61,7 +61,7 @@ main(int argc, char *argv[])
>         long blocksize;
>         int64_t totalblocks;
>         int ftsoptions, listfiles, maxdepth;
> -       int Hflag, Lflag, cflag, hflag, kflag;
> +       int Hflag, Lflag, cflag, hflag, kflag, mflag;
>         int ch, notused, rval;
>         char **save;
>         const char *errstr;
> @@ -70,11 +70,11 @@ main(int argc, char *argv[])
>                 err(1, "pledge");
>
>         save = argv;
> -       Hflag = Lflag = cflag = hflag = kflag = listfiles = 0;
> +       Hflag = Lflag = cflag = hflag = kflag = listfiles = mflag = 0;
>         totalblocks = 0;
>         ftsoptions = FTS_PHYSICAL;
>         maxdepth = -1;
> -       while ((ch = getopt(argc, argv, "HLPacd:hkrsx")) != -1)
> +       while ((ch = getopt(argc, argv, "HLPacd:hkmrsx")) != -1)
>                 switch (ch) {
>                 case 'H':
>                         Hflag = 1;
> @@ -103,10 +103,17 @@ main(int argc, char *argv[])
>                 case 'h':
>                         hflag = 1;
>                         kflag = 0;
> +                       mflag = 0;
>                         break;
>                 case 'k':
>                         kflag = 1;
>                         hflag = 0;
> +                       mflag = 0;
> +                       break;
> +               case 'm':
> +                       kflag = 0;
> +                       hflag = 0;
> +                       mflag = 1;
>                         break;
>                 case 's':
>                         maxdepth = 0;
> @@ -155,6 +162,8 @@ main(int argc, char *argv[])
>                 blocksize = 512;
>         else if (kflag)
>                 blocksize = 1024;
> +       else if (mflag)
> +               blocksize = 1048576;
>         else
>                 (void)getbsize(&notused, &blocksize);
>         blocksize /= 512;
> @@ -320,6 +329,6 @@ usage(void)
>  {
>
>         (void)fprintf(stderr,
> -           "usage: du [-achkrsx] [-H | -L | -P] [-d depth] [file ...]\n");
> +           "usage: du [-achkmrsx] [-H | -L | -P] [-d depth] [file ...]\n");
>         exit(1);
>  }

Re: Teach du(1) the -m flag, disk usage in megabytes

Ted Unangst-6
Jonathan Gray wrote:
> On Sun, Jan 26, 2020 at 11:59:33AM -0500, David Goerger wrote:
> > This diff teaches du(1) the -m flag, report disk usage in megabytes.
> > This brings us in line with implementations in the other BSDs, Linux,
> > and Illumos.
>
> Why is it needed?  -k is required by POSIX, adding arguments for
> megabytes, gigabytes, terabytes, petabytes etc seems silly when
> there is already 512 byte blocks, kilobytes and -h output.

There is also env BLOCKSIZE=1m, although it's unwieldy to use the environment.

FreeBSD and Linux both have -B blocksize, which would be more flexible.

Re: Teach du(1) the -m flag, disk usage in megabytes

David Goerger
In reply to this post by Jonathan Gray-11
Monday, 20200127 10:05+1100, Jonathan Gray wrote:
> On Sun, Jan 26, 2020 at 11:59:33AM -0500, David Goerger wrote:
> > This diff teaches du(1) the -m flag, report disk usage in megabytes.
> > This brings us in line with implementations in the other BSDs, Linux,
> > and Illumos.
>
> Why is it needed?  -k is required by POSIX, adding arguments for
> megabytes, gigabytes, terabytes, petabytes etc seems silly when
> there is already 512 byte blocks, kilobytes and -h output.

It's a fair question. My reasoning is two-fold:

(1) FreeBSD, NetBSD, Linux, and Illumos all support the "-m" flag, and
it's helpful when one can use the same flags/scripts across different
systems without surprises.

While both FreeBSD and NetBSD also implement "-g" (gigabytes), other
systems don't, and it's not an itch I have to scratch. However it's
easy to add if we decide we want it.

Presently only Linux easily supports block sizes larger than a
gigabyte, via e.g. "-B 1T" as noted by tedu@. FreeBSD also supports
"-B", but the argument must be an integer number of bytes[1], e.g. "-B
1099511627776" (1024^4) for 1 terabyte, which is a bit different and
feels less natural for everyday use. My feeling is that simple,
common, and hard-to-misuse flags like "-m" and "-g" are more useful
than an interface which requires most users to first open a calculator.

(2) We currently support 512-byte (default), kilobyte (-k), arbitrary
BLOCKSIZE up to 1g per environ(7), and human-readable (-h) output, but
only the first three can be piped to sort(1), e.g. when investigating a
full disk and trying to determine the largest files/directories on it
(human-readable output doesn't use the same scale for every line, so it
can't be sorted as easily). In this case I simply find it easier to
conceptualize thousands of megabytes than millions of kilobytes. For
example:

$ BLOCKSIZE=1k du -d0 * | sort -nr | head -n 5
17541678        Audiobooks
9513850 Music
1991678 Documents
1638976 Books
223872  mbox
$ BLOCKSIZE=1m du -d0 * | sort -nr | head -n 5
17131   Audiobooks
9291    Music
1945    Documents
1601    Books
219     mbox

The environment variable is a bit clunky to pass around and I think a
flag like "-m" would represent a nice usability improvement.
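
The two listings above are consistent with du counting in 512-byte blocks and rounding up to the requested unit. A minimal sketch of that arithmetic follows. It assumes round-up division and a block size that is a multiple of 512, the helper name blocks_to_units is hypothetical, and the usage note assumes each directory occupies exactly twice as many 512-byte blocks as its kilobyte figure.

```c
#include <stdint.h>

/*
 * Hypothetical helper: convert a count of 512-byte blocks into
 * reporting units of blocksize bytes, rounding up.  blocksize is
 * assumed to be a positive multiple of 512.
 */
static int64_t
blocks_to_units(int64_t blocks512, long blocksize)
{
	int64_t per_unit = blocksize / 512;	/* 512-byte blocks per unit */

	return (blocks512 + per_unit - 1) / per_unit;	/* round up */
}
```

With these assumptions, blocks_to_units(2 * 17541678, 1048576) gives 17131 and blocks_to_units(2 * 223872, 1048576) gives 219, matching the Audiobooks and mbox lines in the BLOCKSIZE=1m listing.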

Thanks!
David

---

references

[1]
https://svnweb.freebsd.org/base/head/usr.bin/du/du.c?revision=326025&view=markup#l130

Re: Teach du(1) the -m flag, disk usage in megabytes

Todd C. Miller-3
In reply to this post by Jonathan Gray-11
On Mon, 27 Jan 2020 10:05:41 +1100, Jonathan Gray wrote:

> On Sun, Jan 26, 2020 at 11:59:33AM -0500, David Goerger wrote:
> > This diff teaches du(1) the -m flag, report disk usage in megabytes.
> > This brings us in line with implementations in the other BSDs, Linux,
> > and Illumos.
>
> Why is it needed?  -k is required by POSIX, adding arguments for
> megabytes, gigabytes, terabytes, petabytes etc seems silly when
> there is already 512 byte blocks, kilobytes and -h output.

It is useful in conjunction with sort.  For example, I often do:

    du -sk * | sort -rn | head

to see the largest disk users.

However, output in kilobytes is less useful than it used to be due
to larger files now being common.  Using the BLOCKSIZE env var is
more flexible but is cumbersome to use and not portable.

 - todd

Re: Teach du(1) the -m flag, disk usage in megabytes

Florian Obser-2
On Mon, Jan 27, 2020 at 10:33:49AM -0700, Todd C. Miller wrote:

> On Mon, 27 Jan 2020 10:05:41 +1100, Jonathan Gray wrote:
>
> > On Sun, Jan 26, 2020 at 11:59:33AM -0500, David Goerger wrote:
> > > This diff teaches du(1) the -m flag, report disk usage in megabytes.
> > > This brings us in line with implementations in the other BSDs, Linux,
> > > and Illumos.
> >
> > Why is it needed?  -k is required by POSIX, adding arguments for
> > megabytes, gigabytes, terabytes, petabytes etc seems silly when
> > there is already 512 byte blocks, kilobytes and -h output.
>
> It is useful in conjunction with sort.  For example, I often do:
>
>     du -sk * | sort -rn | head
>
> to see the largest disk users.

Me, too. Given its widespread adoption, I think we should implement it.
OK florian@

>
> However, output in kilobytes is less useful than it used to be due
> to larger files now being common.  Using the BLOCKSIZE env var is
> more flexible but is cumbersome to use and not portable.
>
>  - todd
>

--
I'm not entirely sure you are real.

Re: Teach du(1) the -m flag, disk usage in megabytes

Sebastian Benoit-3
Florian Obser([hidden email]) on 2020.01.27 19:57:41 +0100:

> On Mon, Jan 27, 2020 at 10:33:49AM -0700, Todd C. Miller wrote:
> > On Mon, 27 Jan 2020 10:05:41 +1100, Jonathan Gray wrote:
> >
> > > On Sun, Jan 26, 2020 at 11:59:33AM -0500, David Goerger wrote:
> > > > This diff teaches du(1) the -m flag, report disk usage in megabytes.
> > > > This brings us in line with implementations in the other BSDs, Linux,
> > > > and Illumos.
> > >
> > > Why is it needed?  -k is required by POSIX, adding arguments for
> > > megabytes, gigabytes, terabytes, petabytes etc seems silly when
> > > there is already 512 byte blocks, kilobytes and -h output.
> >
> > It is useful in conjunction with sort.  For example, I often do:
> >
> >     du -sk * | sort -rn | head
> >
> > to see the largest disk users.
>
> Me, too. Given its widespread adoption, I think we should implement it.
> OK florian@
>
> >
> > However, output in kilobytes is less useful than it used to be due
> > to larger files now being common.  Using the BLOCKSIZE env var is
> > more flexible but is cumbersome to use and not portable.
> >
> >  - todd

Every time my laptop disk gets full I use a horrible awk construct
to make things readable. I would certainly like to see this.

Do we need -B for that? I don't think so...

I'm happy to commit this if nobody complains ;)

Re: Teach du(1) the -m flag, disk usage in megabytes

Daniel Jakots-6
In reply to this post by Todd C. Miller-3
On Mon, 27 Jan 2020 10:33:49 -0700, Todd C. Miller
<[hidden email]> wrote:

> For example, I often do:
>
>     du -sk * | sort -rn | head
>
> to see the largest disk users.
>
> However, output in kilobytes is less useful than it used to be due
> to larger files now being common.

Can't you achieve what you want with `du -sh * | sort -h`? du(1)'s -h
option will automatically select the best suffix and sort(1)'s -h
will sort first using the suffix then the numerical value.

Also if you don't sort -r, you don't need to `| head`.

Cheers,
Daniel
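
The ordering described above can be illustrated with a small comparator: rank by unit suffix first, and compare the numbers only when the suffixes match. This is a simplified sketch, not the sort(1) implementation. The function names are hypothetical, and it assumes non-negative sizes written as a number followed by at most one unit letter, as in du -h output.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Rank of a unit suffix: 0 for none or bytes, 1 for K, 2 for M, ... */
static int
suffix_rank(char c)
{
	static const char order[] = "KMGTPEZY";
	const char *p;

	if (c == '\0')
		return 0;
	p = strchr(order, toupper((unsigned char)c));
	return p == NULL ? 0 : (int)(p - order) + 1;
}

/*
 * Compare two human-readable sizes such as "900M" and "1.2G".
 * Returns <0, 0 or >0.  The suffix decides first; the numeric
 * part is only consulted when both suffixes are the same.
 */
static int
human_cmp(const char *a, const char *b)
{
	char *ea, *eb;
	double va = strtod(a, &ea);
	double vb = strtod(b, &eb);
	int ra = suffix_rank(*ea);
	int rb = suffix_rank(*eb);

	if (ra != rb)
		return ra < rb ? -1 : 1;
	return (va > vb) - (va < vb);
}
```

Because the suffix is compared before the number, this sketch orders "900M" before "1.2G" even though 900 is numerically larger than 1.2, which is the property a plain sort -n lacks.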

Re: Teach du(1) the -m flag, disk usage in megabytes

Jonathan Gray-11
In reply to this post by Sebastian Benoit-3
On Mon, Jan 27, 2020 at 10:34:56PM +0100, Sebastian Benoit wrote:

> Florian Obser([hidden email]) on 2020.01.27 19:57:41 +0100:
> > On Mon, Jan 27, 2020 at 10:33:49AM -0700, Todd C. Miller wrote:
> > > On Mon, 27 Jan 2020 10:05:41 +1100, Jonathan Gray wrote:
> > >
> > > > On Sun, Jan 26, 2020 at 11:59:33AM -0500, David Goerger wrote:
> > > > > This diff teaches du(1) the -m flag, report disk usage in megabytes.
> > > > > This brings us in line with implementations in the other BSDs, Linux,
> > > > > and Illumos.
> > > >
> > > > Why is it needed?  -k is required by POSIX, adding arguments for
> > > > megabytes, gigabytes, terabytes, petabytes etc seems silly when
> > > > there is already 512 byte blocks, kilobytes and -h output.
> > >
> > > It is useful in conjunction with sort.  For example, I often do:
> > >
> > >     du -sk * | sort -rn | head
> > >
> > > to see the largest disk users.
> >
> > Me, too. Given its widespread adoption, I think we should implement it.
> > OK florian@
> >
> > >
> > > However, output in kilobytes is less useful than it used to be due
> > > to larger files now being common.  Using the BLOCKSIZE env var is
> > > more flexible but is cumbersome to use and not portable.
> > >
> > >  - todd
>
> Every time my laptop disk gets full I use a horrible awk construct
> to make things readable. I would certainly like to see this.
>
> Do we need -B for that? I don't think so...
>
> I'm happy to commit this if nobody complains ;)

There are several commands which have a -k flag for scaling to
kilobytes.

For example:
df
du
ls
pstat
quot
swapctl

Going down the list of unit names kmgtpezy, some of these flags are
already used.  So it would be hard to consistently use -m where -k is
used.  Adding a flag to take a scaling factor, even if just using an
additional char like 'm' to index into a table of scaling factors, would
use fewer flags.

Though not all of these combinations and uses make sense, and commands
could be left inconsistent and not use additional scaling factors other
commands use.
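
The table-driven idea can be sketched as follows. This only illustrates indexing a table by unit letter. The function name, the accepted letters, and the choice to stop at 'e' (because 2^70 does not fit in a signed 64-bit integer) are assumptions, not an existing du(1) interface.

```c
#include <ctype.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: map a unit letter to its scaling factor in
 * bytes by looking up its position in a table of unit names.
 * Returns -1 for letters that are not in the table.
 */
static int64_t
scale_factor(char unit)
{
	static const char units[] = "kmgtpe";	/* z and y overflow int64_t */
	const char *p;

	if (unit == '\0')
		return -1;
	p = strchr(units, tolower((unsigned char)unit));
	if (p == NULL)
		return -1;
	/* Position 0 is 'k' (2^10), position 1 is 'm' (2^20), and so on. */
	return (int64_t)1 << (10 * (int)(p - units + 1));
}
```

A single option taking such a letter would then cover every unit with one flag, at the cost of an argument to parse and validate.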

Re: Teach du(1) the -m flag, disk usage in megabytes

Todd C. Miller-3
On Tue, 28 Jan 2020 15:00:39 +1100, Jonathan Gray wrote:

> There are several commands which have a -k flag for scaling to
> kilobytes.
>
> For example:
> df
> du
> ls
> pstat
> quot
> swapctl
>
> Going down the list of unit names kmgtpezy, some of these flags are
> already used.  So it would be hard to consistently use -m where -k is
> used.  Adding a flag to take a scaling factor, even if just using an
> additional char like 'm' to index into a table of scaling factors, would
> use fewer flags.

That's a fair point.  It's not really reasonable to use a separate
option letter for every possible scaling factor.  The Linux du -B
option seems close to what we want, though it also appends a unit
for things like "du -Bm" which breaks sorting (though du -B1m works).

 - todd

Re: Teach du(1) the -m flag, disk usage in megabytes

Todd C. Miller-3
In reply to this post by Daniel Jakots-6
On Mon, 27 Jan 2020 18:29:39 -0500, Daniel Jakots wrote:

> Can't you achieve what you want with `du -sh * | sort -h`? du(1)'s -h
> options will automatically select the best suffix and sort(1)'s -h
> will sort first using the suffix then the numerical value.

Yes, I forgot about "sort -h".  Old habits die hard :-)

 - todd

Re: Teach du(1) the -m flag, disk usage in megabytes

Florian Obser-2
On Tue, Jan 28, 2020 at 09:58:40AM -0700, Todd C. Miller wrote:
> On Mon, 27 Jan 2020 18:29:39 -0500, Daniel Jakots wrote:
>
> > Can't you achieve what you want with `du -sh * | sort -h`? du(1)'s -h
> > options will automatically select the best suffix and sort(1)'s -h
> > will sort first using the suffix then the numerical value.
>
> Yes, I forgot about "sort -h".  Old habits die hard :-)

... which is not in POSIX, NetBSD, or illumos.


>
>  - todd
>

--
I'm not entirely sure you are real.

Re: Teach du(1) the -m flag, disk usage in megabytes

Lauri Tirkkonen-3
On Tue, Jan 28 2020 18:03:19 +0100, Florian Obser wrote:

> On Tue, Jan 28, 2020 at 09:58:40AM -0700, Todd C. Miller wrote:
> > On Mon, 27 Jan 2020 18:29:39 -0500, Daniel Jakots wrote:
> >
> > > Can't you achieve what you want with `du -sh * | sort -h`? du(1)'s -h
> > > options will automatically select the best suffix and sort(1)'s -h
> > > will sort first using the suffix then the numerical value.
> >
> > Yes, I forgot about "sort -h".  Old habits die hard :-)
>
> ... which is not in posix, netbsd nor illumos.

So, do you think that 'du -m' will be in all those then? POSIX doesn't
have it [0].

The way I see it, the entire conversation in this thread is about doing
things that might be useful to people. IMO, arguing about where
extensions are or aren't implemented isn't productive.

[0]: https://pubs.opengroup.org/onlinepubs/9699919799/utilities/du.html

--
Lauri Tirkkonen | lotheac @ IRCnet

Re: Teach du(1) the -m flag, disk usage in megabytes

Sebastian Benoit-3
Lauri Tirkkonen([hidden email]) on 2020.01.29 01:31:56 +0200:

> On Tue, Jan 28 2020 18:03:19 +0100, Florian Obser wrote:
> > On Tue, Jan 28, 2020 at 09:58:40AM -0700, Todd C. Miller wrote:
> > > On Mon, 27 Jan 2020 18:29:39 -0500, Daniel Jakots wrote:
> > >
> > > > Can't you achieve what you want with `du -sh * | sort -h`? du(1)'s -h
> > > > options will automatically select the best suffix and sort(1)'s -h
> > > > will sort first using the suffix then the numerical value.
> > >
> > > Yes, I forgot about "sort -h".  Old habits die hard :-)
> >
> > ... which is not in posix, netbsd nor illumos.
>
> So, do you think that 'du -m' will be in all those then? POSIX doesn't
> have it [0].
>
> The way I see it, the entire conversation in this thread is about doing
> things that might be useful to people. IMO, arguing about where
> extensions are or aren't implemented isn't productive.

Yes it is. We try to avoid adding options wherever we can, because

* every option letter we grab that's not used for the same purpose elsewhere
  can create problems down the road
* more options mean more bugs

B

>
> [0]: https://pubs.opengroup.org/onlinepubs/9699919799/utilities/du.html
>
> --
> Lauri Tirkkonen | lotheac @ IRCnet
>

Re: Teach du(1) the -m flag, disk usage in megabytes

Lauri Tirkkonen-3
On Wed, Jan 29 2020 12:25:34 +0100, Sebastian Benoit wrote:

> Lauri Tirkkonen([hidden email]) on 2020.01.29 01:31:56 +0200:
> > On Tue, Jan 28 2020 18:03:19 +0100, Florian Obser wrote:
> > > On Tue, Jan 28, 2020 at 09:58:40AM -0700, Todd C. Miller wrote:
> > > > On Mon, 27 Jan 2020 18:29:39 -0500, Daniel Jakots wrote:
> > > >
> > > > > Can't you achieve what you want with `du -sh * | sort -h`? du(1)'s -h
> > > > > options will automatically select the best suffix and sort(1)'s -h
> > > > > will sort first using the suffix then the numerical value.
> > > >
> > > > Yes, I forgot about "sort -h".  Old habits die hard :-)
> > >
> > > ... which is not in posix, netbsd nor illumos.
> >
> > So, do you think that 'du -m' will be in all those then? POSIX doesn't
> > have it [0].
> >
> > The way I see it, the entire conversation in this thread is about doing
> > things that might be useful to people. IMO, arguing about where
> > extensions are or aren't implemented isn't productive.
>
> Yes it is. We try to avoid adding options wherever we can, because
>
> * every option letter we grab that's not used for the same purpose elsewhere
>   can create problems down the road
> * more options means more bugs

Yeah, I agree with this completely -- it was in fact kind of my point:
why add -m to du, if the use case it helps with is already handled by
sort -h? I think it's a bit of a moot point whether sort -h is supported
elsewhere or not.

I concede that I was not speaking clearly enough; there might have been
some beer that was doing the talking. Apologies for that and for
shouting from the gallery :)

--
Lauri Tirkkonen | lotheac @ IRCnet


Re: Teach du(1) the -m flag, disk usage in megabytes

David Goerger
In reply to this post by Daniel Jakots-6
Monday, 20200127 18:29-0500, Daniel Jakots wrote:
> Can't you achieve what you want with `du -sh * | sort -h`? du(1)'s
> -h option will automatically select the best suffix and sort(1)'s
> -h will sort first using the suffix then the numerical value.

Thanks! I didn't know about "sort -h". That indeed does what I want,
and is a bit more readable (e.g. 8G instead of the quick mental math
in evaluating 8192M). Like Todd said, old habits die hard. And at
least in my case, I'm pleasantly surprised any time a tool features
smart extensions and I don't have to manipulate arrays of raw
integers. :)
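
To illustrate why the suffix-aware sort matters, here is a minimal sketch.
It uses made-up sample lines in the "size, tab, name" format that du -sh
prints, rather than real du output, so the sizes and names are assumptions:

```shell
# Made-up sample lines in the format du -sh prints: size, a tab, a name.
sample=$(printf '8G\tvideos\n512K\tnotes\n30M\tphotos\n')

# Plain numeric sort compares only the leading number and ignores the
# suffix, so 8G ends up before 30M and 512K.
printf '%s\n' "$sample" | sort -n

# sort -h takes the unit suffix into account: 512K, then 30M, then 8G.
printf '%s\n' "$sample" | sort -h
```

Note that sort -h is an extension, so it may be missing on some systems.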

Actually, I think you've convinced me that using "sort -h" is better.
In particular, I like that it future-proofs us up to and including
yottabytes. What about something like this, to highlight this common
use case?

---
Index: du.1
===================================================================
RCS file: /cvs/src/usr.bin/du/du.1,v
retrieving revision 1.35
diff -u -p -r1.35 du.1
--- du.1        2 Sep 2019 21:18:41 -0000       1.35
+++ du.1        29 Jan 2020 16:02:45 -0000
@@ -147,6 +147,16 @@ option is specified.
 .El
 .Sh EXIT STATUS
 .Ex -std du
+.Sh EXAMPLES
+To sort human-readable output by size, one might use the human-readable
+extension to
+.Xr sort 1 ,
+for example:
+.Pp
+.Dl du -sh * | sort -h
+.Pp
+This is useful to quickly identify large files and folders consuming
+disk space.
 .Sh SEE ALSO
 .Xr df 1 ,
 .Xr fts_open 3 ,


Re: Teach du(1) the -m flag, disk usage in megabytes

Jason McIntyre-2
On Wed, Jan 29, 2020 at 11:12:56AM -0500, David Goerger wrote:

> Monday, 20200127 18:29-0500, Daniel Jakots wrote:
> > Can't you achieve what you want with `du -sh * | sort -h`? du(1)'s
> > -h option will automatically select the best suffix and sort(1)'s
> > -h will sort first using the suffix then the numerical value.
>
> Thanks! I didn't know about "sort -h". That indeed does what I want,
> and is a bit more readable (e.g. 8G instead of the quick mental math
> in evaluating 8192M). Like Todd said, old habits die hard. And at
> least in my case, I'm pleasantly surprised any time a tool features
> smart extensions and I don't have to manipulate arrays of raw
> integers. :)
>
> Actually, I think you've convinced me that using "sort -h" is better.
> In particular, I like that it future-proofs us up to and including
> yottabytes. What about something like this, to highlight this common
> use case?
>
> ---
> Index: du.1
> ===================================================================
> RCS file: /cvs/src/usr.bin/du/du.1,v
> retrieving revision 1.35
> diff -u -p -r1.35 du.1
> --- du.1        2 Sep 2019 21:18:41 -0000       1.35
> +++ du.1        29 Jan 2020 16:02:45 -0000
> @@ -147,6 +147,16 @@ option is specified.
>  .El
>  .Sh EXIT STATUS
>  .Ex -std du
> +.Sh EXAMPLES
> +To sort human-readable output by size, one might use the human-readable
> +extension to
> +.Xr sort 1 ,
> +for example:
> +.Pp
> +.Dl du -sh * | sort -h
> +.Pp
> +This is useful to quickly identify large files and folders consuming
> +disk space.
>  .Sh SEE ALSO
>  .Xr df 1 ,
>  .Xr fts_open 3 ,
>

hi.

i don't think it would be such a bad thing for du to have an example or
two, so i'm ok with this.

couple of points:

- the description is way too wordy. it is definitely better to try and
  avoid this structure for simple cases:

  1st part of description:

        $ blah

  2nd part of description.

  so sth like:

  Display a summary of files and folders in the current directory,
  sorted by size:

        $ blah

- i guess if you have a lot of stuff it makes sense to have the biggest
  files displayed at the end of the list. but generally wouldn't you
  want your biggest files listed first? we could add -r to sort.

jmc


Re: Teach du(1) the -m flag, disk usage in megabytes

Scott Cheloha
> On Jan 29, 2020, at 10:26 AM, Jason McIntyre <[hidden email]> wrote:
>
> On Wed, Jan 29, 2020 at 11:12:56AM -0500, David Goerger wrote:
>> Monday, 20200127 18:29-0500, Daniel Jakots wrote:
>>> Can't you achieve what you want with `du -sh * | sort -h`? du(1)'s
>>> -h option will automatically select the best suffix and sort(1)'s
>>> -h will sort first using the suffix then the numerical value.
>>
>> Thanks! I didn't know about "sort -h". That indeed does what I want,
>> and is a bit more readable (e.g. 8G instead of the quick mental math
>> in evaluating 8192M). Like Todd said, old habits die hard. And at
>> least in my case, I'm pleasantly surprised any time a tool features
>> smart extensions and I don't have to manipulate arrays of raw
>> integers. :)
>>
>> Actually, I think you've convinced me that using "sort -h" is better.
>> In particular, I like that it future-proofs us up to and including
>> yottabytes. What about something like this, to highlight this common
>> use case?
>>
>> ---
>> Index: du.1
>> ===================================================================
>> RCS file: /cvs/src/usr.bin/du/du.1,v
>> retrieving revision 1.35
>> diff -u -p -r1.35 du.1
>> --- du.1        2 Sep 2019 21:18:41 -0000       1.35
>> +++ du.1        29 Jan 2020 16:02:45 -0000
>> @@ -147,6 +147,16 @@ option is specified.
>> .El
>> .Sh EXIT STATUS
>> .Ex -std du
>> +.Sh EXAMPLES
>> +To sort human-readable output by size, one might use the human-readable
>> +extension to
>> +.Xr sort 1 ,
>> +for example:
>> +.Pp
>> +.Dl du -sh * | sort -h
>> +.Pp
>> +This is useful to quickly identify large files and folders consuming
>> +disk space.
>> .Sh SEE ALSO
>> .Xr df 1 ,
>> .Xr fts_open 3 ,
>>
>
> [...]
>
> - i guess if you have a lot of stuff it makes sense to have the biggest
>  files displayed at the end of the list. but generally wouldn't you
>  want your biggest files listed first? we could add -r to sort.

Our sort(1) has an '-r' flag for "reverse".
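
A minimal sketch of combining the two flags, again on made-up sample
lines rather than real du output:

```shell
# Made-up sample lines in du -sh format (size, tab, name).
# -h sorts by size with the unit suffix, -r reverses the order,
# so the largest entry (8G) comes first and 512K comes last.
printf '512K\tnotes\n8G\tvideos\n30M\tphotos\n' | sort -rh
```

On a real directory the same idea would read "du -sh * | sort -rh".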


Re: Teach du(1) the -m flag, disk usage in megabytes

Ingo Schwarze
In reply to this post by Jason McIntyre-2
Hi Jason,

Jason McIntyre wrote on Wed, Jan 29, 2020 at 04:26:42PM +0000:

> i don't think it would be such a bad thing for du to have an example or
> two, so i'm ok with this.
>
>   so sth like:
>
>   Display a summary of files and folders in the current directory,
> sorted by size:
>
> $ blah

That wording seems very nice to me, see below for a patch.

I also included .??* because having large hidden directories around
and desperately searching for the waste of disk space elsewhere
seems like a common trap to me.
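
For reference, a small sketch of what the two patterns expand to. It runs
in a throwaway directory, and all file names in it are made up:

```shell
# Work in a throwaway directory so nothing real is touched.
dir=$(mktemp -d) && cd "$dir" || exit 1
touch visible .a .ab .cache

# Fix the locale so the shell sorts the expansion in plain byte order.
LC_ALL=C
export LC_ALL

# * skips every name that starts with a dot.
echo *        # prints: visible

# .??* needs a dot plus at least two more characters, so it skips
# "." and ".." but also misses two-character names such as ".a".
echo .??*     # prints: .ab .cache

cd / && rm -rf "$dir"
```

So the pattern keeps "." and ".." out of the du arguments, at the price of
missing hidden entries whose names are only two characters long.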

> - i guess if you have a lot of stuff it makes sense to have the biggest
>   files displayed at the end of the list. but generally wouldn't you
>   want your biggest files listed first? we could add -r to sort.

Actually, i consider it more versatile without the -r.
You are right, it really matters for directories with lots of entries.
But even with few entries, what's wrong with having the most relevant
entries closest to the subsequent shell prompt?
Everything is visible on the screen in that case either way.

Feel free to commit something like this (with or without the .??*,
with or without the -r) based on OK schwarze@, or send an OK to me.

Yours,
  Ingo


Index: du.1
===================================================================
RCS file: /cvs/src/usr.bin/du/du.1,v
retrieving revision 1.35
diff -u -r1.35 du.1
--- du.1 2 Sep 2019 21:18:41 -0000 1.35
+++ du.1 29 Jan 2020 16:40:13 -0000
@@ -147,6 +147,11 @@
 .El
 .Sh EXIT STATUS
 .Ex -std du
+.Sh EXAMPLES
+Display a summary of files and folders in the current directory,
+sorted by size:
+.Pp
+.Dl du -sh * .??* | sort -h
 .Sh SEE ALSO
 .Xr df 1 ,
 .Xr fts_open 3 ,
