Software RAID check - slow system issues
This is an issue that we, as an IT services company, run into quite frequently. Many servers running in small businesses do not have a hardware RAID controller installed, so we are stuck dealing with the speed issues that come with software RAID. If you have a software RAID running on your Linux server, you've probably noticed the system slowing down from time to time. This is caused by the periodic redundancy checks of the MD devices.
The check is a cron job that runs weekly and scans the MD devices for consistency. It can have a serious effect on the system, especially if you are running I/O-intensive applications such as virtual machines (running Microsoft Exchange or SQL, for example).
One way to solve this problem is to lower dev.raid.speed_limit_max. By default, the limits are set to:
dev.raid.speed_limit_min = 1000
dev.raid.speed_limit_max = 200000
(Speed is no less than 1000K/sec and no more than 200000K/sec)
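Before changing anything, it helps to see what the limits currently are on your own server. Here is a minimal sketch, assuming the md driver exposes them under /proc/sys/dev/raid (the same values that sysctl reports). The function name show_raid_limits and its optional directory argument are our own, added only for illustration:

```shell
#!/bin/sh
# Print the current md check/resync speed limits.
# show_raid_limits is a hypothetical helper name; the optional first
# argument lets you point it at another directory (handy for testing).
show_raid_limits() {
    dir="${1:-/proc/sys/dev/raid}"
    for name in speed_limit_min speed_limit_max; do
        if [ -r "$dir/$name" ]; then
            printf 'dev.raid.%s = %s\n' "$name" "$(cat "$dir/$name")"
        else
            printf 'dev.raid.%s is not available here\n' "$name" >&2
        fi
    done
}

# Usage on a live system:
#   show_raid_limits
```

Running sysctl dev.raid.speed_limit_min dev.raid.speed_limit_max should report the same two values.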
dev.raid.speed_limit_max is the one you are interested in, and it should be tuned on a per-system basis. If you set it too low, your system will spend much of its time checking the RAID. In the output below, the check runs at about 11000K/sec and still needs 2052 minutes to finish. A full pass at that speed takes roughly 2 days, so with a weekly check the array is being scanned 2 days out of every 7:
cat /proc/mdstat
Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10]
md2 : active raid1 sda2[0] sdb2[1]
1945121728 blocks [2/2] [UU]
[======>..............] check = 30.1% (586144768/1945121728) finish=2052.8min speed=11032K/sec
bitmap: 4/15 pages [16KB], 65536KB chunk
unused devices: <none>
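The arithmetic behind those numbers is easy to redo for your own array. The sketch below assumes the progress line has the same layout as the one above (block counts in parentheses, followed by speed=...K/sec). The function name mdstat_check_eta is hypothetical, and the result is only an estimate, since the speed changes while the check runs:

```shell
#!/bin/sh
# Estimate remaining time and full-pass time from an mdstat progress line.
# mdstat_check_eta is a hypothetical helper; it reads mdstat text on stdin.
mdstat_check_eta() {
    awk '
        /check *=|resync *=|recovery *=/ {
            speed = 0
            done_blocks = 0
            total_blocks = 0
            # Block counts look like "(586144768/1945121728)".
            if (match($0, /\([0-9]+\/[0-9]+\)/)) {
                split(substr($0, RSTART + 1, RLENGTH - 2), blocks, "/")
                done_blocks = blocks[1] + 0
                total_blocks = blocks[2] + 0
            }
            # Speed looks like "speed=11032K/sec"; keep only the digits.
            if (match($0, /speed=[0-9]+/)) {
                speed = substr($0, RSTART + 6, RLENGTH - 6) + 0
            }
            if (speed > 0 && total_blocks > 0) {
                remaining_min = (total_blocks - done_blocks) / speed / 60
                full_pass_hr = total_blocks / speed / 3600
                printf "remaining: %.0f min, full pass at this speed: %.1f h\n", remaining_min, full_pass_hr
            }
        }
    '
}

# Usage on a live system:
#   mdstat_check_eta < /proc/mdstat
```

For the output shown above, this works out to about 2053 minutes remaining and about 49 hours for a full pass, which is where the "2 days" figure comes from.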
If you set it too high, the check will finish faster, but it can cause some of your processes to hang or crash.
To make the changes permanent, add these variables to /etc/sysctl.conf. We've set our maximum to 10000. Yes, the check will take 2 days to run, but at least the performance impact is not significant.
dev.raid.speed_limit_min = 1000
dev.raid.speed_limit_max = 10000
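If you manage several servers, you may prefer to script the edit instead of opening the file by hand. This is only a sketch of one way to do it: set_sysctl_conf is a hypothetical helper that removes any existing line for the given key and appends the new value, so running it twice does not leave duplicates. It rewrites the file in place, so try it on a copy first.

```shell
#!/bin/sh
# Set "key = value" in a sysctl-style config file, replacing any earlier
# assignment of the same key. set_sysctl_conf is a hypothetical helper name.
set_sysctl_conf() {
    key=$1
    value=$2
    file=${3:-/etc/sysctl.conf}
    tmp=$(mktemp) || return 1
    # Start from an empty file if it does not exist yet.
    [ -f "$file" ] || : > "$file"
    # Keep every line whose key (text before "=", spaces removed) differs.
    awk -F= -v key="$key" '{
        k = $1
        gsub(/[ \t]/, "", k)
        if (k != key) print
    }' "$file" > "$tmp"
    printf '%s = %s\n' "$key" "$value" >> "$tmp"
    cat "$tmp" > "$file"
    rm -f "$tmp"
}

# Usage (as root):
#   set_sysctl_conf dev.raid.speed_limit_min 1000
#   set_sysctl_conf dev.raid.speed_limit_max 10000
```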
To make the changes without rebooting:
sysctl -w dev.raid.speed_limit_max=10000
Test them out and set the limit to whatever works for you. You can also change these values at runtime, so if the server load is not high and you want the check to finish sooner, you can set the limit to a higher number.
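Raising the limit while the server is quiet can also be scripted. The sketch below picks a limit from the 1-minute load average. The function name pick_raid_limit, the 1.0 load threshold and the 50000 "fast" value are all assumptions made for illustration, not tested recommendations, so adjust them to your hardware:

```shell
#!/bin/sh
# Choose a speed_limit_max value from the current load average.
# pick_raid_limit is a hypothetical helper; the threshold (default 1.0)
# and the 50000 / 10000 values are illustrative assumptions.
pick_raid_limit() {
    load=$1
    threshold=${2:-1.0}
    awk -v load="$load" -v thr="$threshold" 'BEGIN {
        # Below the threshold the server is considered idle enough
        # to let the check run faster.
        limit = (load + 0 < thr + 0) ? 50000 : 10000
        print limit
    }'
}

# Usage (as root), for example from a cron job while a check is running:
#   load=$(cut -d" " -f1 /proc/loadavg)
#   sysctl -w dev.raid.speed_limit_max="$(pick_raid_limit "$load")"
```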
ALT Consulting is an IT consulting and support company with offices in the US and Canada. ALT specializes in providing support for Linux, Apple/Mac and Windows systems for businesses and organizations. Contact us if you are looking for reliable and experienced system support specialists.
The original article was published on ALT Support Miami.