Welcome to End Point’s blog

Ongoing observations by End Point people

dstat: better system resource monitoring

I recently came across a useful tool I hadn't heard of before: dstat, by Dag Wieers (of DAG RPM-building fame). He describes it as "a versatile replacement for vmstat, iostat, netstat, nfsstat and ifstat."

The most immediate benefit I found is that it collates CPU, disk, network, paging, and system statistics on a single line per sample, removing the need to watch the output of several monitors at once. The coloring helps readability too:

% dstat
----total-cpu-usage---- -dsk/total- -net/total- ---paging-- ---system--
usr sys idl wai hiq siq| read  writ| recv  send|  in   out | int   csw
  4   1  92   3   0   0|   56   84k|   0     0 |   94  188B|1264  1369
  3   7  43  44   1   1|  368   11M|  151  222B|   0   260k|1453  1565
  3   2  46  48   1   0|  432 5784k|   0     0 |   0     0 |1421  1584
  2   2  47  49   0   0| 592k    0 |   0     0 |   0     0 |1513  1763
  6   2  44  49   1   0|  448  248k|   0     0 |   0     0 |1398  1640
  8   4  41  45   3   0| 456k    0 |  135  222B|   0     0 |1530  2102
 18   4  38  41   0   0|  408  128k|   0    47B|   0     0 |1261  1977
 10   4  44  43   0   0|  728  208k|   0     0 |   0     0 |1445  2203
  6   3  39  51   0   0|  648  256k| 3607 4124B|   0     0 |1496  2180
  7   7  34  53   0   0|1088k    0 | 1234  582B|   0     0 |1465  2057
 14   8  28  49   0   0| 2856  104k|   0     0 |   0    52k|1610  2995
  6   6  43  45   0   0|1992k    0 | 5964 4836B|   0     0 |1493  2391
  9  14  34  44   0   0| 2432  112k| 7854  726B|   0     0 |1527  2190
  9  11  40  41   1   0|2680k    0 | 1382  972B|   0     0 |1550  2298
  5   4  68  22   0   0|  576 1096k|   12 4628B|   0     0 |1522  1731 ^C

(Textual screenshot by script of util-linux and Perl module HTML::FromANSI.)

Its default one-line-per-timeslice output makes it good for collecting data samples over time, as opposed to full-screen top-like utilities such as atop, which give much more detailed information at each snapshot, but don't show history.

Since dstat is a standard package available in RHEL/CentOS and Debian/Ubuntu, it is reasonably easy to add to most systems.

dstat also supports plugins, and the most recent release, just last month, added new plugins "for showing NTP time, power usage, fan speed, remaining battery time, memcache hits and misses, process count, top process total and average latency, top process total and average CPU timeslice, and per disk utilization rates."
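To see which plugins a given installation actually ships, `dstat --list` prints them, and a plugin is enabled by passing its name as a long option. Here is a minimal sketch, assuming a dstat build that includes the `disk-util` and `top-latency` plugins (availability varies by version). The `dstat_plugins` helper and the `DRY_RUN` variable are hypothetical names introduced only for illustration:

```shell
# Hypothetical helper (not part of dstat): run dstat with a timestamp
# column plus one long option per plugin name given as an argument.
# Set DRY_RUN=1 to print the command line instead of running it.
dstat_plugins() {
    args="-t"
    for plugin in "$@"; do
        # Each plugin is switched on by a long option of the same name.
        args="$args --$plugin"
    done
    if [ -n "${DRY_RUN:-}" ]; then
        echo "dstat $args"
    else
        # Word splitting of $args is intentional here.
        dstat $args
    fi
}

# List the plugins this installation knows about:
#   dstat --list
# Watch per-disk utilization and the process with the highest latency:
#   dstat_plugins disk-util top-latency
```

Running `DRY_RUN=1 dstat_plugins disk-util top-latency` first is a cheap way to check the generated command before pointing it at a busy machine.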

It sounds like it'll grow even more useful over time and is worth keeping an eye on.

2 comments:

Greg Smith said...

The best thing about dstat is that you can configure it to include a timestamp in the output, so that if you capture data from it you can later line it up against other system events. It's great for questions like "was this I/O spike during *some event*?" I'll use this:

dstat -tcdm

And then save the output into a CSV file for that sort of work.

Being able to break out individual drives is great for database work too. I'll usually monitor the system drive, the database drive, and the WAL drive separately on larger PostgreSQL systems where those are split.
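A minimal sketch of that per-drive breakdown, assuming dstat's `-D` option with a comma-separated device list. The device names `sda`, `sdb`, and `sdc` are hypothetical placeholders for the system, database, and WAL drives, and `dstat_disks` and `DRY_RUN` are illustrative names, not part of dstat:

```shell
# Hypothetical helper: show timestamped disk statistics for each
# device named on the command line.
# Set DRY_RUN=1 to print the command line instead of running it.
dstat_disks() {
    if [ "$#" -eq 0 ]; then
        echo "usage: dstat_disks DEVICE..." >&2
        return 1
    fi
    # Join the device names with commas: "sda sdb sdc" -> "sda,sdb,sdc".
    devices=$(printf '%s,' "$@")
    devices=${devices%,}
    if [ -n "${DRY_RUN:-}" ]; then
        echo "dstat -t -d -D $devices"
    else
        dstat -t -d -D "$devices"
    fi
}

# Example (device names are placeholders):
#   dstat_disks sda sdb sdc
```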

Jon Jensen said...

Thanks for pointing that out, Greg. I hadn't noticed it before, but as you say, the -t option is great for logging data over time.
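The logging workflow discussed in these comments might look something like the sketch below. It assumes dstat's `--output` option for writing CSV, and that each data row in that file begins with a timestamp whose last space-separated token is a time of day in `HH:MM:SS` form. That is an assumption about the CSV layout worth checking against a real log. The function names `dstat_log` and `dstat_between` are hypothetical:

```shell
# Log timestamped CPU, disk, and memory samples to a CSV file.
# Arguments: output file, delay in seconds, number of samples.
dstat_log() {
    dstat -tcdm --output "$1" "$2" "$3"
}

# Print the data rows of a dstat CSV log whose time of day falls
# between START and END (inclusive, both given as HH:MM:SS).
# ASSUMPTION: the first CSV field of each data row ends in HH:MM:SS;
# header and blank lines do not, so they are skipped.
# Arguments: csv file, START, END.
dstat_between() {
    awk -F, -v start="$2" -v end="$3" '
        {
            stamp = $1
            gsub(/"/, "", stamp)          # tolerate a quoted first field
            n = split(stamp, parts, " ")
            tod = parts[n]                # last token: time of day
        }
        tod ~ /^[0-9][0-9]:[0-9][0-9]:[0-9][0-9]$/ && tod >= start && tod <= end
    ' "$1"
}

# Example: sample every 5 seconds for an hour, then look at one minute.
#   dstat_log /tmp/dstat.csv 5 720
#   dstat_between /tmp/dstat.csv 14:30:00 14:31:00
```

Comparing the times as strings works because zero-padded `HH:MM:SS` values sort in chronological order within a single day. A log that spans midnight would need the date part of the timestamp as well.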