Why does apt-get not use 100% of CPU, disk, or network -- or even close to it? Even on a slow system (Raspberry Pi 2+) I'm getting at most 30% CPU load. I'm just thinking that either it's being artificially throttled, or it should max out something while it's working... or it should be able to do its thing faster than it does.

Edit: I'm just measuring roughly via cpu/disk/net monitors in my panel, and the System Monitor app of Ubuntu MATE.

Please explain why I'm wrong. :-)

Update: I understand that apt-get needs to fetch its updates (and may be limited by upstream/provider bandwidth). But once it's "unpacking" and so on, the CPU usage should at least go up (if not max out). On my fairly decent home workstation, which uses an SSD for its main drive and a ramdisk for /tmp, this is not the case.

Or maybe I need to take a closer look.
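For what it's worth, here's one way to take that closer look than panel widgets allow (a sketch; pidstat is from the sysstat package, and hello is just a stand-in package name):

    # Terminal 1: trigger some download and unpack work
    sudo apt-get install --reinstall hello

    # Terminal 2: per-process CPU (-u) and disk (-d) usage, once per second
    pidstat -u -d -C 'apt|dpkg' 1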

How are you measuring disk and network load? – JigglyNaga 12 hours ago
Disk IO is just like network IO, though. It will still block the app, preventing it from using the CPU. Alas, apt-get isn't particularly good at optimizing this. I imagine it could install as it downloads so that by the time your download is finished most of your payload could already be installed, but, unfortunately, it doesn't. In any case, standalone installs mostly just extract data to disk. Those operations are inherently IO bound, and there's simply not much else to do but wait on the disk drive to finish reading or writing. – PSkocik 11 hours ago
How did you get the 30% CPU load number? – A.L 9 hours ago
@PSkocik "I imagine it could install as it downloads" -- apt-get just downloads; dpkg installs. And dpkg is smarter than apt-get about the order in which a bunch of packages should be installed, which may not be the same order in which apt-get downloads them. – Braiam 5 hours ago
Accepted answer (14 votes)

Apps will only max out the CPU if they are CPU-bound. An app is CPU-bound if it can get all of its data quickly and the only thing it waits on is the processor to process that data.

apt-get, on the other hand, is IO-bound. That means it can process its data rather quickly, but loading the data (from disk or from the network) takes time, during which the processor can either do other work or sit idle if no other processes need it.

Typically, all IO requests (disk, network) are slow, and whenever an application thread issues one, the kernel removes the thread from the processor until the data has been loaded into the kernel (such IO requests are called blocking requests).
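You can see this from the outside with GNU time, which reports how much of a job's wall-clock time was actually spent on the CPU (a sketch; hello is just a stand-in package, and apt-get download needs no root since it only fetches the .deb):

    # GNU time (not the shell builtin) reports the job's CPU usage;
    # an IO-bound run comes out far below 100%.
    /usr/bin/time -v apt-get download hello 2>&1 | grep 'Percent of CPU'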

With apt commands, it's aggravated by the fact that many files are opened in sync mode, or with frequent explicit flushes to disk requested, to guarantee that the data on disk stays in a consistent state, as a system crash could otherwise have serious consequences. Running apt commands with eatmydata can often dramatically improve performance at the expense of reduced reliability (not to mention that services started as part of package installations will inherit the eatmydata settings). – Stéphane Chazelas 9 hours ago
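A sketch of that suggestion in practice (eatmydata is an LD_PRELOAD shim from the package of the same name; hello is just a stand-in package):

    sudo apt-get install eatmydata
    # eatmydata turns fsync()/sync() and friends into no-ops: faster,
    # but a crash mid-install can leave the dpkg database inconsistent.
    sudo eatmydata apt-get install hello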
Lol at that last point :). Does anyone have numbers for eatmydata since the 2010 commit in bugs.debian.org/cgi-bin/bugreport.cgi?bug=578635 ? I don't know if "dramatically" is still the right word. – sourcejedi 8 hours ago
Ah, maybe it is (at least on some cloud providers) bugs.launchpad.net/cloud-init/+bug/1236531/comments/6 – sourcejedi 8 hours ago
@sourcejedi On a Raspberry Pi2 with a relatively high-end SD card (but still an SD card, not a high-end SSD), I consider “dramatically” to be a bit of an understatement. The performance of dpkg on flash media really sucks. – Gilles 1 hour ago

Even on a slow system (Raspberry Pi 2+) I'm getting at most 30% CPU load.

The Raspberry Pi 2+ has 4 cores. For some monitoring tools, 100% usage corresponds to all cores being used at 100%, so a 30% CPU load is roughly one fully used core (100 / 4 = 25%) plus some background processes.


Here is an example on my 8-core machine running Ubuntu. I launched one thread with the command cat /dev/urandom > /dev/null to create an endless process that fully utilizes one core.

Now if we take a look at the graph from htop, we can see that the average load is 15.6%, which corresponds to one fully used core (100 / 8 = 12.5%) plus some background processes ≃ 15.6%.

[htop screenshot showing an average CPU load of 15.6%]
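To reproduce this yourself (a sketch; mpstat is from the sysstat package):

    nproc                            # number of cores, e.g. 8
    cat /dev/urandom > /dev/null &   # pin one core at ~100%
    mpstat -P ALL 1                  # per-core view: one core busy, the rest idle
    kill %1                          # stop the background job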

+1, a usage % close to a multiple of (100 / nCores) should always trigger further scrutiny. This can be checked - and indeed precluded - by using a monitor able to show usage per core, where 0 <= the % <= 100 * nCores. – underscore_d 1 hour ago

I think you're actually not measuring IO %. I haven't seen a Linux IO% widget. (I'm very envious of the Windows 10 task manager :). Check using the iotop command and you will see 100% IO.

top should show 100% across user+system+iowait, with 100% divided by your core count as described by A.L. I'm not saying top is 100% helpful, but it can be a really useful all-around tool to learn.

Throughput will be lower than the maximum because you're unpacking lots of small files, a.k.a. "random IO". There are also some disk syncs / cache flushes, although since 2010, on Linux, there are only a few of them for each package installed. (It used to be one per file.)
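For example (a sketch; iotop needs root for per-process figures, while iostat from the sysstat package does not):

    sudo iotop --only    # per-process IO, limited to tasks actually doing IO
    iostat -x 1          # per-device statistics including %util, no root needed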

Use iotop --only; the --only option shows only processes or threads actually doing I/O. – A.L 9 hours ago
iostat, dstat, atop... will show per-disk utilisation without needing privileges. It's for the per-task utilisation that you need privileges. – Stéphane Chazelas 9 hours ago
@StéphaneChazelas absolutely correct. The point I was trying to make (ninja edit) is that the OP mentions a couple of GUI tools. And the particular GUI tools I've seen, like Gnome System Monitor, show throughput but no IO%. – sourcejedi 8 hours ago

Actually, IO/network requests are really slow compared to CPU operations. This means that while your network card is fetching data, or your disk is writing it, your CPU does absolutely nothing (for this process, anyway).

If your hard drive is faster than your network connection (which is probably the case), it won't write more than it receives, so the disk won't be saturated either.

Finally, the network percentage corresponds to the maximum possible usage of your network card, not of your connection. So while you may have a 1 Gb/s network adapter, you're really unlikely to have an internet connection that reaches that bandwidth.
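As a concrete check (eth0 is just an example interface name -- see ip link for yours -- and the 50 Mb/s figure is illustrative):

    cat /sys/class/net/eth0/speed   # link speed in Mb/s, e.g. 1000
    # A 50 Mb/s internet downlink can then drive the graph to at most
    # 50 / 1000 = 5%, even with the download running flat out.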
