CentOS 7 grub virtio error migrating to KVM

When migrating CentOS 7 from physical servers or from VMware / Hyper-V to KVM, the image typically does not have the virtio drivers built into its initramfs. This often results in grub or initramfs errors at boot.

Once the image has been migrated, you have three choices:

  1. Boot using rescue mode from a recent CentOS or Rocky Linux (v8+) install ISO and choose to mount the existing installation (option 1). Version 8+ has the virtio drivers built in, so it will see the drive no problem.
  2. Boot using recovery mode. This is usually a grub menu option, and we find it typically has virtio built in.
  3. Change the emulation for the drive to IDE and boot as normal.

Once booted into the OS (or mounted via the rescue CD and chroot /mnt/sysimage), change to the root user if you aren’t already.
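
If you go the rescue-media route, the sequence from the rescue shell is roughly this (assuming the rescue environment has mounted the installation under /mnt/sysimage, which is its default behaviour):

chroot /mnt/sysimage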

Run the following command to rebuild the initramfs (note: if you are in a rescue chroot, $(uname -r) reports the rescue kernel, so substitute the installed kernel version explicitly):

mkinitrd -f --allow-missing --with=virtio_blk --preload=virtio_blk --with=virtio_net --preload=virtio_net --with=virtio_console --preload=virtio_console /boot/initramfs-$(uname -r).img $(uname -r)
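
On CentOS 7, mkinitrd is a wrapper around dracut, so the equivalent dracut call (same drivers, shown here only as an alternative) should also work:

dracut --force --add-drivers "virtio_blk virtio_net virtio_console" /boot/initramfs-$(uname -r).img $(uname -r)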

Then just shut down the VM and boot it back up (switching the drive back to virtio emulation if you used option 3). The new initramfs will have the virtio drivers and be able to see the disk.
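
If you want to double-check before rebooting, lsinitrd (part of dracut) should show the virtio modules inside the new image:

lsinitrd /boot/initramfs-$(uname -r).img | grep virtio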


chronyd NTP server for local network

Configuration on Red Hat / CentOS / Rocky Linux / AlmaLinux

yum install chrony

These are the important bits in your /etc/chrony.conf file:

local stratum 10
manual
allow 192.168.0.0/16
allow 10.10.0.0/16
ratelimit interval 3 burst 16

local stratum is a bit like a trust score; lower is more trusted. 10 is high enough that you won’t affect much if your particular server goes horribly wrong.

The manual keyword means you can use chronyc on the command line to manually set the time. I always leave this enabled, but you can choose not to include it if you prefer.
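
For example, with manual enabled you can step the clock by hand from chronyc (the time below is just an example):

chronyc settime 16:30
chronyc makestep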

The allow directive specifies the networks that are allowed to query this server. Specify it multiple times to allow multiple networks. Alternatively you can use a bare allow (no subnet) to open it to everyone, but please do read about NTP reflection DDoS attacks first.

ratelimit allows rate limiting replies on a per-IP-address basis. I always specify this just in case some client software goes haywire. interval is not in seconds, but 2 to the power of X seconds, so an interval of 3 actually means 8 seconds. burst is how many responses are allowed above the threshold before the interval is enforced.

Don’t forget to restart chronyd:

systemctl restart chronyd
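
Once it’s back up, a quick sanity check from the server itself (serverstats only becomes interesting once clients start talking to it):

chronyc sources -v
chronyc tracking
chronyc serverstats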

Example chrony.conf configuration file

# Use public servers from the pool.ntp.org project.
# Please consider joining the pool (https://www.pool.ntp.org/join.html).
pool 2.rocky.pool.ntp.org iburst

# Use NTP servers from DHCP.
sourcedir /run/chrony-dhcp

# Record the rate at which the system clock gains/loses time.
driftfile /var/lib/chrony/drift

# Allow the system clock to be stepped in the first three updates
# if its offset is larger than 1 second.
makestep 1.0 3

# Enable kernel synchronization of the real-time clock (RTC).
rtcsync

# Enable hardware timestamping on all interfaces that support it.
#hwtimestamp *

# Increase the minimum number of selectable sources required to adjust
# the system clock.
#minsources 2

# Rate limit responses
ratelimit interval 3 burst 6

# Allow NTP client access from local network.
allow 10.0.0.0/8

# Serve time even if not synchronized to a time source.
local stratum 10
manual
# Require authentication (nts or key option) for all NTP sources.
#authselectmode require

# Specify file containing keys for NTP authentication.
keyfile /etc/chrony.keys

# Save NTS keys and cookies.
ntsdumpdir /var/lib/chrony

# Insert/delete leap seconds by slewing instead of stepping.
#leapsecmode slew

# Get TAI-UTC offset and leap seconds from the system tz database.
leapsectz right/UTC

# Specify directory for log files.
logdir /var/log/chrony

# Select which information is logged.
#log measurements statistics tracking

nftables installation

You can choose your own firewall policy implementation, but we use nftables:

yum install nftables
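
Don’t forget to enable the service so the ruleset loads at boot (assuming systemd, as on the distros above):

systemctl enable --now nftables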

I usually edit this file:

[root@XXXXXXXXXXXXXXXXX admin]# cat /etc/sysconfig/nftables.conf
# Uncomment the include statement here to load the default config sample
# in /etc/nftables for nftables service.

#include "/etc/nftables/main.nft"

I swap out the commented-out include line for the following:

include "/etc/nftables/nftables.nft"

And then inside that config file I put all my rules:

[root@XXXXXXXXXXXXXXXXX admin]# cat /etc/nftables/nftables.nft
table inet filter {
    chain INPUT {
        type filter hook input priority 0; policy accept;
        iif "lo" accept
        ct state established,related accept
        ip protocol icmp icmp type echo-request accept
        ip6 nexthdr ipv6-icmp icmpv6 type 1 counter accept comment "accept ICMPv6 dest unreachable"
        ip6 nexthdr ipv6-icmp icmpv6 type 2 counter accept comment "accept ICMPv6 packet too big"
        ip6 nexthdr ipv6-icmp icmpv6 type 3 counter accept comment "accept ICMPv6 time exceeded"
        ip6 nexthdr ipv6-icmp icmpv6 type 4 counter accept comment "accept ICMPv6 parameter problem"
        ip6 nexthdr ipv6-icmp icmpv6 type 128 icmpv6 code 0 counter accept comment "accept ICMPv6 echo request"
        ip6 nexthdr ipv6-icmp icmpv6 type 129 icmpv6 code 0 counter accept comment "accept ICMPv6 echo reply"
        ip6 nexthdr ipv6-icmp icmpv6 type 133 icmpv6 code 0 counter accept comment "accept ICMPv6 router solicitation"
        ip6 nexthdr ipv6-icmp icmpv6 type 134 icmpv6 code 0 counter accept comment "accept ICMPv6 router advertisement"
        ip6 nexthdr ipv6-icmp icmpv6 type 135 icmpv6 code 0 counter accept comment "accept ICMPv6 neighbor solicitation"
        ip6 nexthdr ipv6-icmp icmpv6 type 136 icmpv6 code 0 counter accept comment "accept ICMPv6 neighbor advertisement"

        tcp dport 22 ip saddr X.X.X.X accept
        udp dport 123 accept
        drop
    }

    chain OUTPUT {
        type filter hook output priority 0; policy accept;
    }
}

This will allow SSH port 22 access to your system from a predefined X.X.X.X IP, and open access to NTP from anywhere. The latter could be dangerous if you’re putting this on a public network, so either restrict it to your local IPs by adding ip saddr X.X.X.X/X to the NTP rule, or know what you’re opening yourself up to by reading up on NTP software compromises and NTP reflection DDoS attacks.
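
After editing the rules, I do a dry-run syntax check and then reload. For example, restricting NTP to the internal range used earlier would make that rule something like udp dport 123 ip saddr 10.0.0.0/8 accept:

nft -c -f /etc/nftables/nftables.nft
systemctl restart nftables
nft list ruleset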

Testing using ntpdate

And of course we need to do some testing…

Testing the NTP server using ntpdate:

ntpdate -q 103.43.119.204
server 103.43.119.204, stratum 3, offset 0.000072, delay 0.02623
29 Oct 08:03:48 ntpdate[14770]: adjust time server 103.43.119.204 offset 0.000072 sec

As long as the offset is tiny, it should be good to go.
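
ntpdate isn’t always installed these days; a one-shot query using chronyd itself should give you similar information (run as root on a client, substituting your server’s address):

chronyd -Q 'server 103.43.119.204 iburst'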


Multiple A DNS records and browser retries failover / redundancy

I wanted to know how Firefox, Chrome, etc. behave when there are two A DNS records for a website and one of those IPs does not respond. The use case for this is to have the same website hosted in two locations, so that when one server or network goes offline, the other can take over.

I couldn’t find any up-to-date info about this, so I thought I’d just do some tests and compile my own. The test case used only two A records for the same hostname.
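
For reference, the setup was simply one hostname resolving to two addresses, i.e. something like this (hostname and IPs below are placeholders):

dig +short www.example.com A
203.0.113.10
198.51.100.20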

 

Browser              Version   Failover OK   Failover Seconds
Chrome               75        Yes           30
Internet Explorer    11        Yes           20
Firefox              68        Yes           20
MS Edge              17        Yes           8

 

NOTE: If the connection returns an error sooner, presumably these browsers will fail over faster. The test here was a complete drop of traffic, with no TCP reset or ICMP error coming back from the first attempted IP.

Using multiple DNS A records is a plausible failover mechanism in modern browsers!

PROBLEM: Browsers seem to switch between IPs after a period of inactivity (even just a few minutes). If your site requires state, you will need to store that state inside the browser (preferably) or in some kind of shared-state system between the servers.

Pros:
  • Multi-server failover is relatively easy, and fast in modern browsers
  • Traffic is balanced between both locations

Cons:
  • If one A record server stops responding, the user may experience a delay when loading the site
  • If one A record server stops responding while the user was already on the site, they may see an error until they reload the page
  • If state is required, it needs to be stored in the browser or in a shared-state system


End of Life – PHP version 4 webhosting

PHP version 4 was officially retired 8 years ago and no longer receives any kind of official patches or support. To preserve functionality for resellers whose customers have services requiring PHP v4, we have maintained our own patch subset for PHP v4 since 2008.

Our volume of PHP v4 websites has now reached a low enough level that it no longer makes business sense to maintain our PHP v4 webhosting services. Effective 1 August 2017, we will no longer offer or support PHP v4 websites.

We recommend that all customers upgrade their websites to PHP v5 or PHP v7. You can select a different version from our webhosting panel to temporarily change your PHP version and see how it affects your website. We would recommend trying PHP 5.2 first.

If you need help with migration, we recommend lodging a RackCorp support ticket. Any sites still operating on PHP v4 will be automatically upgraded to PHP v5.2 on 1 August. Please note there is a VERY HIGH chance that this will break sites that were built for PHP v4.


Virtual Server Image Backups – Performance Improvement

Tonight we’ve deployed the latest version of our Virtual Server backup manager. This software is responsible for:
1) Creating LVM snapshots of Virtual Servers
2) Reading entire snapshots, 1MB at a time, and generating data hashes (roughly sketched below)
3) Comparing those data hashes to our backup repositories (usually in off-site datacenters)
4) Copying any changed blocks to our backup repositories
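
To illustrate steps 2–4, here is a very rough shell equivalent of the changed-block detection idea; the real backup manager is not a shell script, and the device and file names below are made up:

DEV=/dev/vg0/vm101-snap                                  # hypothetical LVM snapshot device
BLOCKS=$(( $(blockdev --getsize64 "$DEV") / 1048576 ))   # whole 1MB blocks (partial tail ignored for brevity)
for i in $(seq 0 $((BLOCKS - 1))); do
    dd if="$DEV" bs=1M skip="$i" count=1 2>/dev/null | sha256sum | awk -v b="$i" '{print b, $1}'
done > /tmp/vm101.hashes
# Blocks whose hash differs from the repository's copy are the ones that get shipped off-site
diff /tmp/vm101.hashes /backup-repo/vm101.hashes | grep '^<'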

Traditionally, most of our customers run on small VMs (20GB–30GB), so the above process has worked quite well, albeit not optimally. This year we’ve seen a significant uptake of large 200GB+ VMs, which has made us re-evaluate our backup manager. Tonight’s update is a significant one in terms of performance:

  • Change 20150516-001: Backup Repository no longer re-calculates hashes during a backup. They’re calculated once when the backup is first written, used for comparison purposes, and updated as new blocks come in.
  • Change 20150601-001: snapshot blkio limiting. Previously, backup processes were allowed to utilise full sequential read speeds due to an oversight in the blkio settings. This has now been remedied, with sequential reads limited to 60% of the virtual server’s allocated device speed (see the sketch after this list).
  • Change 20150601-002: posix_fadvise(POSIX_FADV_SEQUENTIAL) utilisation. Added libc binding to make a call to posix_fadvise to set the POSIX_FADV_SEQUENTIAL flag. This should mean the Linux Disk Cache doesn’t get blown away during backups.
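
For the curious, the blkio limiting in change 20150601-001 is conceptually similar to the cgroup-v1 knob below; the device major:minor and the bytes/s cap are placeholders, and our manager applies this programmatically rather than from a shell:

mkdir -p /sys/fs/cgroup/blkio/vmbackup
echo "253:4 62914560" > /sys/fs/cgroup/blkio/vmbackup/blkio.throttle.read_bps_device   # ~60MB/s read cap on that device
echo $$ > /sys/fs/cgroup/blkio/vmbackup/cgroup.procs                                   # run the backup reader from this shell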

Coming up in the future is the much-requested feature addition to allow file-level mounts of backups!
