List Files by Size – Unix Command Line Philosophy

There are a few tips in here that explain how to list files by size, but for my money, awk is overkill for the job and ls is slightly inadequate on its own. Of course, Linux/Unix/OS X comes with a rich toolkit for doing lots of things quickly and easily, like listing files by size and reformatting the output however you like. Here are two other ways that follow the Unix philosophy of combining simple commands to do complex things, or in this case, a relatively trivial thing.

To see the size of each file, you can just use ls -l. (See man ls for more information.)

Of course, that gives you a lot more information than just size. ls can also sort by a variety of criteria, print its listing in a lot of different formats, and list either the current directory alone or everything beneath it recursively.

For now, let’s start with ls. It lists files, and with the -l argument it produces a long listing with lots of information, like this:

-rwx------ 1 john john 10113 2006-11-05 17:43 SigGraphMeetingNotes.txt

Now, since our stated objective is to list just file names and sizes, sorted, maybe there is some way to drop everything in front of the size part of each line before sorting. It happens there is, and that is the cut command. (See man cut for more information.)

So, on my machine, and your mileage may vary, I can just pipe the ls -l command to the cut command with the proper argument and get a listing formatted like this:

10113 2006-11-05 17:43 SigGraphMeetingNotes.txt

The command to do that is:

ls -l | cut -c 24-
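To see why 24 is the magic number here, it helps to run cut on the sample line by itself. This is only a sketch using the one line shown above as fixed text; on a real listing the size column may start somewhere else.

```shell
# The sample long-listing line from above, stored as plain text.
line='-rwx------ 1 john john 10113 2006-11-05 17:43 SigGraphMeetingNotes.txt'

# Keep character 24 through the end of the line.
# In this sample the size field begins at character 24.
trimmed=$(printf '%s\n' "$line" | cut -c 24-)

printf '%s\n' "$trimmed"
# prints: 10113 2006-11-05 17:43 SigGraphMeetingNotes.txt
```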

Now at this point, the keen observer may note that directories and symbolic links are included in this listing. How do we get rid of those? We certainly don’t want to count them as ordinary files. First of all, we can dereference the symbolic links using the -L option of ls, so that each link is listed as the file it points to, like this:

ls -lL | cut -c 24-
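A quick way to convince yourself of what -L does is to make a link in a scratch directory and compare the two listings. The file names below are made up for the illustration, and the sketch assumes mktemp is available.

```shell
# Scratch directory with one regular file and one symbolic link to it.
dir=$(mktemp -d)
printf 'hello\n' > "$dir/target.txt"
ln -s target.txt "$dir/link.txt"

plain=$(ls -l "$dir/link.txt")    # describes the link itself
deref=$(ls -lL "$dir/link.txt")   # describes the file the link points to

# The first character of a long listing is the file type:
# l = symbolic link, - = regular file.
plain_type=$(printf '%s\n' "$plain" | cut -c 1)
deref_type=$(printf '%s\n' "$deref" | cut -c 1)
printf 'ls -l: %s   ls -lL: %s\n' "$plain_type" "$deref_type"

rm -rf "$dir"
```

With -L the size column also reports the size of the target file rather than the size of the link.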

There, that works nicely. Now let’s remove directories from the listing. To do that, I am going to use grep and a regular expression, along with another pipe. (Again, see man grep for more information.)

ls -lL | grep -v '^d' | cut -c 24-
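Here is a small sketch of what the grep step keeps and drops, run against a canned listing instead of a live directory. The entries other than SigGraphMeetingNotes.txt are invented for the example. It also shows one possible variation, grep '^-', which keeps only regular files and so happens to drop the "total" line that ls -l prints first.

```shell
# A canned ls -l listing; all names except the first file are hypothetical.
listing='total 24
drwxr-xr-x 2 john john  4096 2006-11-05 17:40 notes
-rwx------ 1 john john 10113 2006-11-05 17:43 SigGraphMeetingNotes.txt
-rw-r--r-- 1 john john   512 2006-11-04 09:12 todo.txt'

# Drop every line that starts with d (directories).
no_dirs=$(printf '%s\n' "$listing" | grep -v '^d')

# Variation: keep only lines that start with - (regular files).
only_files=$(printf '%s\n' "$listing" | grep '^-')

printf '%s\n' "$no_dirs"
printf '%s\n' '---'
printf '%s\n' "$only_files"
```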

All that is left is to sort and maybe do some more column trimming. To sort, I am going to just pipe my current command to the sort command and tell it to do a numerical sort like this:

ls -lL | grep -v '^d' | cut -c 24- | sort -nr
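The same chain can be tried on a canned listing, which makes it easy to see that sort -nr compares the leading numbers as numbers, largest first. As before, this is a sketch: the extra file names are made up and the columns are laid out so that the size starts at character 24.

```shell
# Canned listing; sizes are right-aligned so the size field starts at column 24.
listing='-rw-r--r-- 1 john john   512 2006-11-04 09:12 todo.txt
drwxr-xr-x 2 john john  4096 2006-11-05 17:40 notes
-rwx------ 1 john john 10113 2006-11-05 17:43 SigGraphMeetingNotes.txt
-rw-r--r-- 1 john john 98304 2006-11-03 08:00 photo.jpg'

# Drop directories, trim to the size column, sort numerically in reverse.
sorted=$(printf '%s\n' "$listing" | grep -v '^d' | cut -c 24- | sort -nr)

printf '%s\n' "$sorted"
# prints:
# 98304 2006-11-03 08:00 photo.jpg
# 10113 2006-11-05 17:43 SigGraphMeetingNotes.txt
#   512 2006-11-04 09:12 todo.txt
```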

Then, I am going to remove the date and time stamps using the colrm command and another pipe. That will look like this:

ls -lL | grep -v '^d' | cut -c 24- | sort -nr | colrm 10 25

And its output will look like this:

10113 SigGraphMeetingNotes.txt

(As usual, look at man sort and man colrm to get more information about these little utilities.)
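If colrm is not installed on your machine, cut can remove the same columns by keeping everything around them. The sketch below assumes a line where the size is padded out to column 9 and the date and time occupy columns 10 through 25, which is the layout that colrm 10 25 expects; the exact padding is an assumption, so adjust the numbers to your own listing.

```shell
# A trimmed line whose date and time sit in columns 10-25 (assumed layout).
line='   10113 2006-11-05 17:43 SigGraphMeetingNotes.txt'

# Keep columns 1-9 and 26 onward; same effect as: colrm 10 25
short=$(printf '%s\n' "$line" | cut -c 1-9,26-)

printf '%s\n' "$short"
# prints:    10113  SigGraphMeetingNotes.txt
```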

Of course, in the interest of brevity, I haven’t included the whole listing for the directory in question, but only a single file. You will doubtless have more than one file in your directory and so you will be filling up your screen with files and sizes sorted in descending order using the command line above.

So, is the way I describe in this tip any better than the awk method or one of the other ls ways described in other tips? Nope, it is just different, and depending on what you want to do and how much time and effort you want to expend, you may prefer one approach over another. The important thing is learning how to do this stuff.

In fact, there are other ways I could use the same Unix utilities to do this. For instance, instead of cutting first and then sorting, I could have sorted the full listing on a defined key, which in this case is field 5, the size column. To see how that works, just try this at the shell prompt:

ls -lL | sort -k 5 -nr | grep -v '^d' | cut -c 24- | colrm 10 25
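Sorting on a key can be tried on a canned listing too. This sketch uses the slightly stricter form -k 5,5nr, which limits the key to field 5 alone and sorts it numerically in reverse; plain -k 5 starts the key at field 5 and runs to the end of the line, which gives the same order here because a numeric sort only reads the leading number. The file names other than SigGraphMeetingNotes.txt are made up.

```shell
# Canned ls -l lines; field 5 is the size.
listing='-rw-r--r-- 1 john john   512 2006-11-04 09:12 todo.txt
-rwx------ 1 john john 10113 2006-11-05 17:43 SigGraphMeetingNotes.txt
-rw-r--r-- 1 john john 98304 2006-11-03 08:00 photo.jpg'

# Sort on field 5 only, numerically, largest first.
by_size=$(printf '%s\n' "$listing" | sort -k 5,5nr)

printf '%s\n' "$by_size"
# photo.jpg comes first, then SigGraphMeetingNotes.txt, then todo.txt
```

Because the key is a field rather than a character position, this step does not care how wide the owner and group columns are.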

Here is yet another way to list the contents of your directory sorted by size and without using either awk or ls:

du -Sba --max-depth=1 --exclude='./.*' | sort -nr

(Yep, take a look at man du for more information.)
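Here is a sketch of the du approach in a scratch directory, so you can see the shape of the output before trying it on real files. It assumes GNU du, since -b, -S, --max-depth and --exclude are GNU options and other du implementations may not accept them. The file names are invented for the example.

```shell
# Scratch directory with two files of known size (names are hypothetical).
dir=$(mktemp -d)
printf '%300s' '' > "$dir/big.txt"     # 300 bytes of spaces
printf '%20s'  '' > "$dir/small.txt"   # 20 bytes of spaces

# -a: list files as well as directories
# -b: report apparent size in bytes
# -S: do not add subdirectory sizes to their parent
# --exclude='./.*': skip hidden entries in the current directory
report=$(cd "$dir" && du -Sba --max-depth=1 --exclude='./.*' | sort -nr)

printf '%s\n' "$report"
rm -rf "$dir"
```

Each line is a size, a tab, and a path, so big.txt shows up as 300 and small.txt as 20. The directory itself is listed as "." with a size that depends on the file system.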

I have no doubt I could come up with even more ways to do the same thing. The point is that even if you are an OS X user and absolutely addicted to GUIs, it behooves you to learn more about the Unix command line. It will pay dividends in ways I cannot even begin to describe when you want to do something and there doesn’t seem to be a simple tool for it, or an elaborate Aqua application for that matter.

Unix only looks difficult. Actually, it is simplicity itself, but it does require some familiarity to get the most out of it.

NOTE: The cut parameter and the colrm parameters may have to be adjusted based on how ls -l formats its output on your machine. Like I said, your mileage may vary.

 

Submitter: John Carey