I believe this is a slightly controversial topic, at least from what I have gathered so far. Some say its best to leave the server on to spare the life time of the spinning rust. Other seem to prefer to save power and boot the server off each night. So wanted to chip in and hear what folks here do and why do what you do.
Bonus question; Do you guys have a UPS? Is it a must have for a homelab, or does it just depend on the usecase?
On/off:
I have 5 main chassis excluding desktops. Prod cluster is all flash, standalone host has one flash array, one spinning rust array, NAS is all spinning rust. I have a big enough server disk array that spinning it up is actually a power sink and the Dell firmware takes a looong time to get all the drives up on reboot.
TLDR: Not off as a matter of day/night, off as a matter of summer/winter for heat.
Winter: all on
Summer:
VMs are a different story. Normally I just turn them on and off as needed regardless of season, though I will typically turn off more of my “optional” VMs to reduce summer workload in addition to powering off the one server. Rough goal is to reduce thermal load as to not kill my AC as quickly which is probably running above its duty cycle to keep up. Physical wise, these servers are virtualized so this on/off load doesn’t cycle the array.
Because all four of my main servers are the same hypervisor (for now, VMware ESXi), VMs can move among the prod cluster to balance load autonomously, and I can move VMs on or off the standalone host by drag-and-drop. When the standalone host is off, I usually move turn it’s VMs off and move them onto the prod cluster so I don’t get daily “backup failure” emails from the NAS.
UPS: Power in my area is pretty stable, but has a few phase hiccups in the summer. (I know it’s a phase hiccup because I mapped out which wall plus are on which phase, confirmed with a multimeter than I’m on two legs of a 3-phase grid hand-off, and watched which devices blip off during an event) For something like a light that will just flicker or a laptop/phone charger that has a high capacitance, such blips are a non issue. Smaller ones can even be eaten by the massive power supplies my Dell servers have. But, my Cisco switches are a bit sensitive to it and tend to sing me the song of their people when the power flickers - aka fan speed 100% boot up whining. Larger blips will also boop the Dell servers, but I don’t usually see breaks more than 3-5m.
Current UPS setup is:
With all that setup, the gauges in the front of the 3 UPSes all show roughly 15-20m run time in summer, and 20-25m in winter. I know one may be lower than displayed because it’s battery is older, but even if it fails and dumps it’s redundant load onto the main newer UPS I’ll still have 7-10m of battery at worst case and that’s all I really need to weather most power related issues at my location.