I posted about ZRAM before, but I am bringing it up again because of my totally unscientific experiment, my personal experience, and the common question of which Linux to run on potatoes…
First, I tweaked ZRAM for my use-case(s) on my hardware; these settings might not be right for your use-cases or your hardware!
My hardware is a netbook with an Intel Celeron N4120 and 4G RAM (3.64G usable).
When I recently played around with ZRAM settings, it felt like the zstd algorithm made my netbook noticeably more sluggish. It never felt sluggish with lzo-rle or lz4.
In a totally unscientific way, I rebooted the computer several times (after a complete update of everything), executed my backup script several times, and measured the last 3 executions. (Didn’t touch the netbook during the runs.) The bottleneck of the backup script should not be ZRAM, but it is some reproducible workload that I could execute and measure.
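A rough sketch of how such repeated runs could be timed, in case anyone wants to reproduce this. It is not the script I actually used; the function name `time_command` and the `./backup.sh` path in the usage comment are placeholders. It records real, user and sys time per run, similar to what `time` reports.

```python
import resource
import subprocess
import time


def time_command(cmd, runs=3):
    """Run cmd `runs` times and return a list of (real, user, sys) seconds."""
    results = []
    for _ in range(runs):
        # CPU time of already-finished child processes before this run
        before = resource.getrusage(resource.RUSAGE_CHILDREN)
        start = time.perf_counter()
        subprocess.run(cmd, check=True)
        real = time.perf_counter() - start
        after = resource.getrusage(resource.RUSAGE_CHILDREN)
        results.append((
            real,
            after.ru_utime - before.ru_utime,  # user time of this run
            after.ru_stime - before.ru_stime,  # sys time of this run
        ))
    return results


# Usage (placeholder path, replace with your own workload):
# for real, user, sys_ in time_command(["./backup.sh"], runs=3):
#     print(f"real={real:.2f}s user={user:.2f}s sys={sys_:.2f}s")
```

Rebooting between configurations and discarding warm-up runs still has to be done by hand.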
To my surprise, I could measure a performance difference for my backup scripts: lz4 was consistently fastest in real and sys time w/o tweaks to vm.page-cluster!
Changing the vm.page-cluster to 0 further enhanced the speed for lz4, but with this one toggle, all of a sudden zstd is as fast as lz4 in my benchmark and runs with a more consistent runtime.
Changing vm.swappiness to 180 decreased the speed for lz4, to my surprise.
Obviously the benchmarks are not 100% clean, although the trend for my workload was clearly in favor of lz4/zstd.
To the best of my knowledge, I ended up with nearly the same tweaks that Google makes for ChromeOS:
- zstd as algorithm (I think ChromeOS uses lzo-rle)
- 2*ram as zram-size
- vm.page-cluster = 0
- Install/enable systemd-oomd
vm.page-cluster = 0 seems like a no-brainer when using ZRAM; on my netbook it is literally the switch for ‘fast’ mode.
In summary: ZRAM makes my netbook totally usable for everyday tasks, and with the above settings tweaked I run Gnome 3, VS Code and Firefox/Evolution w/o trouble. (Of course, Xfce4 on the same machine is still noticeably more performant.)
I wonder if we should recommend to people asking for a lightweight distribution for potatoes to check/tweak their ZRAM settings by default.
Anyway, I would be interested in experiences from other people:
- Any other tweaks on my ZRAM or sysctl for potatoes which made a measurable difference for you?
- Any other tips to improve quality of life on potato machines? (Besides switching to KDE, LXDE, Xfce, etc. ;-))
- Any idea why vm.swappiness didn’t improve my measurements? To my understanding it should basically have cached more of my files in ZRAM, making the backup run faster. Instead it slowed the backup down, which I don’t understand.
Edit:
- zstd beats lz4 on my machine for my benchmark when vm.page-cluster=0!
I wrote this years ago when I was doing a bunch of work with low ram (1gb) potato SBCs and I use it everywhere, including my 32/64gb SFF proxmox nodes: https://github.com/foundObjects/zram-swap
You might find the comments re: swap sizing and compression ratios handy, I’ve found that lz4 approximates to a 2.5:1 compression ratio during most workloads. On your 4gb potato I’d run something like ~2GB lz4 zram, which would work out to a ~5GB zram device. I never bothered with sysctl tuning, you generally don’t need to.
Edit: just about every Chromebook under the sun, and like 90%+ of all Android devices, runs lzo/lzo-rle zram swap at ~(½ramsize*3). Change that to *2.5 for lz4 and you’re set.
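The sizing rule of thumb above (half of RAM times the expected compression ratio) can be written out as a small calculation. This is only a sketch of that arithmetic; the function name `zram_device_size` is made up for illustration, and the ratios are the approximate ones quoted above (3 for lzo/lzo-rle, 2.5 for lz4), not measured values.

```python
GIB = 1024 ** 3


def zram_device_size(ram_bytes, ratio=2.5, ram_fraction=0.5):
    """Uncompressed zram device size in bytes.

    ram_fraction: share of physical RAM the compressed data may occupy
    ratio: assumed compression ratio (about 2.5 for lz4, about 3 for lzo-rle)
    """
    return int(ram_bytes * ram_fraction * ratio)


# 4 GiB machine with lz4: 2 GiB of RAM budget backs a 5 GiB zram device
print(zram_device_size(4 * GIB) / GIB)           # 5.0
# same machine with the lzo/lzo-rle ratio of 3
print(zram_device_size(4 * GIB, ratio=3) / GIB)  # 6.0
```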
Thank you for your answer and your insights.
In my unscientific tests, sysctl/vm.page-cluster made a measurable difference (15% faster when setting it to 0), and it seems everyone else (PopOS, ChromeOS) tweaks at least this setting with ZRAM. I would assume the engineers at PopOS/ChromeOS also did some benchmarks before using these settings.
Now I would really be interested in whether you can measure a difference on your 1gb potato SBCs, because IMHO it should have an even bigger impact for them. (Of course, your workload/use cases might make any difference irrelevant, and of course potato SBCs have other bottlenecks like WiFi/IO, which might make this totally irrelevant.)
I don’t have my potato lab up and running at the moment but my android devices and sff hypervisors are all using page-cluster=0. That’s the default setting on android and ChromeOS I think, I probably tuned it on the proxmox machines years ago and forgot about it.
Edit: that’s basically swap read ahead right? Ie: number of pages to read from swap at a time.
To my understanding it is swap read-ahead, and the number is an exponent to the base 2. This means the default of 3 reads 2^3 = 8 pages at a time. According to what I read, the default of 3 was set in the age of rotating discs and never adapted for RAM swap devices.
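To make the power-of-two relationship concrete, here is a tiny sketch. The function name `swap_readahead` is hypothetical, and the 4096-byte page size is an assumption (typical for x86-64, but other architectures can differ).

```python
def swap_readahead(page_cluster, page_size=4096):
    """Return (pages, bytes) read from swap per attempt for a vm.page-cluster value."""
    pages = 2 ** page_cluster
    return pages, pages * page_size


print(swap_readahead(3))  # (8, 32768): the default, 8 pages = 32 KiB
print(swap_readahead(0))  # (1, 4096): a single page per swap-in
```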
Yeah, that’s my understanding of that sysctl too. If IOPS are cheap (and they are when dealing with ram or high IOPS NVMe) there’s no real point in performing extra read ahead.
You likely saw this already, but if you haven’t: https://www.reddit.com/r/Fedora/comments/mzun99/new_zram_tuning_benchmarks/
Thanks a lot! You are right, I saw this already.
I can confirm the findings with my benchmarks: zstd has the best compression, lz4 is the fastest.
Here is what I ended up using for my sysctl conf, iirc I got some of these from popos default config:
vm.swappiness = 180
vm.page-cluster = 0
vm.watermark_boost_factor = 0
vm.watermark_scale_factor = 125
vm.dirty_bytes = 268435456
vm.dirty_background_bytes = 134217728
vm.max_map_count = 2147483642
vm.dirtytime_expire_seconds = 1800
vm.transparent_hugepages = madvise
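For scale, the two dirty_* values are plain byte counts. A minimal sketch that parses `key = value` lines like the ones above and converts those two to MiB; the helper names `parse_sysctl` and `to_mib` are made up for illustration and are not part of any sysctl tooling.

```python
def parse_sysctl(text):
    """Parse 'key = value' lines into a dict, skipping blanks and comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", ";")):
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings


def to_mib(value):
    """Convert a byte count (string or int) to MiB."""
    return int(value) / (1024 * 1024)


conf = """
vm.dirty_bytes = 268435456
vm.dirty_background_bytes = 134217728
"""
settings = parse_sysctl(conf)
print(to_mib(settings["vm.dirty_bytes"]))             # 256.0
print(to_mib(settings["vm.dirty_background_bytes"]))  # 128.0
```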
Could you ELI5 the last five settings? I saw that Chrome OS sets vm.overcommit_memory = 1, it seems to make sense but is missing here.
I really don’t know lol
Increasing the max_map_count is needed for some Steam games, iirc Arch is now doing this by default.
iirc the dirty_bytes settings prevent the system from hanging if there is too much disk IO
And setting transparent_hugepages to madvise was something I did when archlinux had this bug in the kernel: https://old.reddit.com/r/archlinux/comments/1atueo0/higher_ram_usage_since_kernel_67_and_the_solution/
It was eventually fixed but I later ran into the issue again and I decided to keep it on madvise.
Nice, thanks a lot, especially the dirty_bytes settings are interesting to me, because I experience hangs with too much disk IO :-P.
Cheers!
I’m definitely not a “potato expert”, but what I use (on my orange pi zero 3 w/ 1 GiB of ram, at least) is simply:
zram size = 100% of available ram, zstd, priority set at 100%. Because apparently if there’s more zram swap than available ram, it’ll lead to memory leaks and/or slowdowns.