backup2l troubleshooting – skipcond not ignoring paths

Something to watch out for…

backup2l, the popular Unix backup software, has a setting in /etc/backup2l.conf called skipcond that lets you ignore files/paths.  It uses find syntax.

You might have written something like this if you don’t want to bother backing up old logs, say:

SKIPCOND=(-path "/var/www/mysite/system/cms/logs/log*" -o -name "*.o")

But when you run backup2l -e to simulate the backup, it says:

856 / 43210 file(s), 13 / 5003 dir(s), 2.2GB / 4.3GB (uncompressed)
skipping: 0 file(s), 0 dir(s), 0 B (uncompressed)

You know this is wrong for two reasons: the overall size is too big and it’s not skipping anything.

This could be because of symlinks.  To debug, run a real backup (with the -b switch), and use backup2l -l [pattern] (where pattern is some file(s) you know are in the directory you want to exclude) to see what’s there.

You may find it’s actually backing up /usr/share/nginx/mysite, because /var/www is a symlink to /usr/share/nginx/:

lrwxrwxrwx  1 root root    17 Aug  2  2015 www -> /usr/share/nginx/
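
To see the mismatch for yourself, the following sketch recreates it in a scratch directory. The names used (real, link, mysite/logs and the log file) are made up for the demonstration, with real standing in for /usr/share/nginx and link for /var/www. It assumes a readlink that supports -f, as GNU readlink does.

```shell
#!/bin/sh
# Scratch layout (hypothetical names): "real" plays /usr/share/nginx, "link" plays /var/www.
tmp=$(cd "$(mktemp -d)" && pwd -P)
mkdir -p "$tmp/real/mysite/logs"
touch "$tmp/real/mysite/logs/log-2016-01-01.php"
ln -s "$tmp/real" "$tmp/link"

# Pattern written against the symlinked name: find reports files under their
# real location, so nothing matches and nothing would be skipped.
via_link=$(find "$tmp/real" -path "$tmp/link/mysite/logs/log*" | wc -l)

# Resolve the symlink first, then build the pattern from the real path.
resolved=$(readlink -f "$tmp/link/mysite/logs")
via_real=$(find "$tmp/real" -path "$resolved/log*" | wc -l)

echo "matches using the symlinked path: $via_link"
echo "matches using the resolved path:  $via_real"
rm -rf "$tmp"
```

In practice that means running readlink -f on the path in your SKIPCOND and using whatever it prints instead.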

Remember backup2l has a purge (-p) option that lets you remove individual differential or incremental backups by specifying the number. So if you’ve just run all.1375, say, you can delete that with -p 1375 and when you run it again it’ll reuse the number.

 

Why developers should turn off smart quotes

Occasionally I write commands in my notes app before copying and pasting them into iTerm.  It’s very easy to type “regular quotes” and not notice macOS has converted them into “smart quotes”.

Then your command doesn’t run, and you get an error message that doesn’t make any sense because the arguments aren’t being parsed correctly, but you don’t notice at first it’s the quotes that are wrong because the font is too small and your mind is fixated on looking for spelling mistakes and syntax problems…

To turn them off, on a Mac go to:

System Preferences > Keyboard > Text > uncheck “Use smart quotes and dashes”
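
If you’d rather catch the problem before it bites, a small check along these lines can flag curly quotes in a command before you run it. This is only a sketch: the function name has_smart_quotes and the sample command are made up for illustration.

```shell
#!/bin/sh
# Succeeds (exit 0) if the text on stdin contains any curly quote characters.
has_smart_quotes() {
    grep -q -F -e '“' -e '”' -e '‘' -e '’'
}

# Example: a command that was mangled by the notes app.
if printf '%s\n' 'grep “error” app.log' | has_smart_quotes; then
    echo "smart quotes found - fix before running"
fi
```

On a Mac you could feed it the clipboard with pbpaste | has_smart_quotes before pasting into the terminal.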

Let’s Encrypt news – client name change and auto renewals

Updated Wed 14 Sep 2016 (new installation guide URL, clarify name change.)

Let’s Encrypt have updated their getting started page, but the following may help anyone trying to understand the latest changes.

If you’re installing from scratch, use Certbot (see below) and start here.  You’ll get custom instructions for your operating system and web server – the client can now be installed via a package on newer systems.

The name

The client (now at version 0.6.0) has been renamed from letsencrypt-auto to Certbot – to be precise, the project/ecosystem is still called Let’s Encrypt, while Certbot is the EFF’s certificate deployment client.

You’ll find that certbot-auto (a shell script) is an exact copy of letsencrypt-auto, so all previous commands will still work.

The git repository has also been renamed – the old one is redirected.

Old: https://github.com/letsencrypt/letsencrypt
New: https://github.com/certbot/certbot

You can update the location of your ‘origin’ remote in .git/config
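
Rather than editing .git/config by hand, git can make the same change for you. The sketch below uses a throwaway repository as a stand-in for your existing clone, so it is safe to run anywhere; in your real checkout only the set-url line is needed.

```shell
#!/bin/sh
# Scratch repository standing in for your existing clone (hypothetical).
repo=$(mktemp -d)
git init -q "$repo"
git -C "$repo" remote add origin https://github.com/letsencrypt/letsencrypt

# The actual change: point 'origin' at the renamed repository.
git -C "$repo" remote set-url origin https://github.com/certbot/certbot

# Confirm what .git/config now holds.
git -C "$repo" config --get remote.origin.url
```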

Renewals

You can now renew all your certificates at once.

# to test
~/letsencrypt/certbot-auto renew --dry-run

# to actually do it
~/letsencrypt/certbot-auto renew

There’s some clever stuff going on here:

  • It uses all your previous settings.
  • It renews any certificates that will expire within 30 days.
  • Afterwards you get a list of which were renewed and which were skipped (“not due for renewal yet”).
  • --dry-run uses the staging server, so it doesn’t count towards API rate limits.
  • “renew” is designed for unattended use.
  • Remember you still need to reload apache/nginx afterwards.

Example of a cron.weekly script you could use:

#!/bin/bash
/path/to/letsencrypt/certbot-auto renew
service nginx reload

How to turn off Fail2Ban email notifications

Updated 23 Mar 2016 with corrections.
(These instructions are based on a CentOS machine I’m responsible for.)

You may find yourself getting multiple emails per day from a server running Fail2Ban, each and every time it blocks an IP address after several failed SSH logins, e.g.

Subject: [Fail2Ban] SSH: banned 123.123.123.123 from myserver

It’s not terribly obvious how to disable these – you’ll find plenty of threads from people asking how to turn Fail2Ban notifications on, not so many asking how to turn them off. The concepts and syntax also take a bit of getting used to…

In /etc/fail2ban/jail.conf there’s a section that describes various actions – look for action_, action_mw and action_mwl.  You’ll see they vary in scope, from just writing to the logfile to emailing the sysadmin (or even administrators identified in whois lookups) or automatically banning IPs from 3rd-party services like CloudFlare.

Further down is this:

# Choose default action.  To change, just override value of 'action' with the
# interpolation to the chosen action shortcut (e.g.  action_mw, action_mwl, etc) in jail.local
# globally (section [DEFAULT]) or per specific section
action = %(action_)s

In other words, you can have a single definition in /etc/fail2ban/jail.conf and reuse it in jail.local without writing it out again in full.  It will need to go in the correct [section] (or “jail”) or under [DEFAULT].

I’d recommend changing one thing at a time – many of the checks (FTP etc.) will be disabled by default anyway.

Note: your jail.local file may have the actions written out in full as well (mine did), in which case you can just manually remove the sendmail line.  Adding a duplicate action won’t produce a warning anywhere; fail2ban will just use the last one.

But there’s no [ssh] section? Which of these “jails” do I use?

[ssh-iptables]
[ssh-tcpwrapper]
[ssh-route]
[ssh-iptables-ipset4]
[ssh-iptables-ipset6]

Check fail2ban’s status to get a list of which jails it’s using, e.g.

sudo service fail2ban status
fail2ban-server (pid  9427) is running...
Status
|- Number of jail:    1
`- Jail list:    ssh-iptables

Your default jail.local will likely already have enabled=true or false lines for each jail too.

Remember to restart the service.

sudo service fail2ban restart

Checking what Fail2Ban is doing now you no longer have email alerts

See the entries in /var/log/messages, such as:

Mar 21 13:41:54 myserver fail2ban.filter[3306]: INFO [ssh-iptables] Found 123.123.123.123
Mar 21 13:41:55 myserver fail2ban.filter[3306]: INFO [ssh-iptables] Found 123.123.123.123
Mar 21 13:41:56 myserver fail2ban.filter[3306]: INFO [ssh-iptables] Found 123.123.123.123
Mar 21 13:41:57 myserver fail2ban.filter[3306]: INFO [ssh-iptables] Found 123.123.123.123
Mar 21 13:41:58 myserver fail2ban.filter[3306]: INFO [ssh-iptables] Found 123.123.123.123
Mar 21 13:41:59 myserver fail2ban.actions[3306]: NOTICE [ssh-iptables] Ban 123.123.123.123
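
If you want a quick summary rather than reading the raw log, something like the following sketch counts bans per IP address. The function name ban_counts is made up, and the parsing assumes the log format shown above (address as the last field on each Ban line); adjust it if your version logs differently.

```shell
#!/bin/sh
# Print "count ip" for every address on a fail2ban "Ban" line, most-banned first.
ban_counts() {  # usage: ban_counts LOGFILE
    awk '/fail2ban\.actions/ && / Ban / { n[$NF]++ }
         END { for (ip in n) print n[ip], ip }' "$1" | sort -rn
}

# Sample data in the format shown above; on a real server pass /var/log/messages.
sample=$(mktemp)
cat > "$sample" <<'EOF'
Mar 21 13:41:58 myserver fail2ban.filter[3306]: INFO [ssh-iptables] Found 123.123.123.123
Mar 21 13:41:59 myserver fail2ban.actions[3306]: NOTICE [ssh-iptables] Ban 123.123.123.123
EOF
ban_counts "$sample"
```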

Can’t get Exim4 to DKIM sign outgoing mail?

DKIM isn’t too hard to set up, but there’s a crucial typo in several tutorials – including this otherwise excellent one for Debian – which may leave you scratching your head as to why the header with the signature is missing from your outgoing emails (and with no error messages in Exim’s log).

Wrong:

DKIM_FILE = /etc/exim4/dkim/example.com-private.pem

Right:

DKIM_PRIVATE_KEY = /etc/exim4/dkim/example.com-private.pem

If you look closely in the remote_smtp config, you’ll see which constants it reads in (dkim_private_key = DKIM_PRIVATE_KEY) – but it’s easy to miss.  Or to put it another way, the names of the constants used don’t matter, provided code elsewhere in the configuration files is looking for the matching definitions.

Other tips:

On Debian, when you run sudo update-exim4.conf, the output is written to /var/lib/exim4/config.autogenerated

If something’s not working, check your changes have been copied there.
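
One way to spot a mismatched macro name is to check whether anything other than its own definition refers to it. The sketch below does that against a two-line sample; macro_is_used is a made-up helper, and the sample file is a cut-down stand-in for the generated config.

```shell
#!/bin/sh
# Succeeds if macro NAME is referenced somewhere other than its own definition line.
macro_is_used() {  # usage: macro_is_used NAME CONFIG_FILE
    grep -v "^[[:space:]]*$1[[:space:]]*=" "$2" | grep -q -w "$1"
}

# Cut-down sample; on Debian you would check /var/lib/exim4/config.autogenerated.
sample=$(mktemp)
cat > "$sample" <<'EOF'
DKIM_FILE = /etc/exim4/dkim/example.com-private.pem
dkim_private_key = DKIM_PRIVATE_KEY
EOF

for name in DKIM_FILE DKIM_PRIVATE_KEY; do
    if macro_is_used "$name" "$sample"; then
        echo "$name: referenced"
    else
        echo "$name: defined but never read"
    fi
done
```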

You can have a situation where all the split config files (the directories under /etc/exim4/conf.d/) exist, but Exim is running in unsplit mode, so only /etc/exim4/exim4.conf.template will actually be read.  Run sudo dpkg-reconfigure exim4-config to fix this (or check the db_use_split_config line in /etc/exim4/update-exim4.conf.conf)

Introduction to Let’s Encrypt (and a few tips)

Note: this isn’t a How To or Getting Started guide – be sure to use the official documentation.  Updated Sun 25 Mar 2018.

13 May 2016 – see my summary of the latest changes.

Simpler installation process

Let’s Encrypt has been packaged on newer Linux systems – you no longer need to do manual installation. E.g. on Ubuntu 16.04 (Xenial) you can now simply do:

sudo apt-get install letsencrypt

The Certbot installation page prompts you to choose your web server and operating system/version and provides tailored instructions.

What is Let’s Encrypt?

Let’s Encrypt is a free and automated Certificate Authority.  Providing you run your own virtual or dedicated Linux server, you can quickly create and renew certificates for any domains (or subdomains) hosted on it.

Why use it (and why use https)?

Let’s Encrypt is completely free, whereas the cheapest certificates I’m aware of are currently £30+VAT per year for a single (sub)domain certificate or £80+VAT/year for a wildcard (*.example.com)

Until now, that’s made it common to avoid using https:// unless absolutely necessary – e.g. you have a customer database.  As the old process, described below, often involves clients, it can add the burden of justifying to them why spending the extra money is worthwhile.

Now, you can use TLS (https://) everywhere as standard and take advantage of the features of SPDY and HTTP/2, including faster page loads on mobile and a slight boost in your search ranking.  A secure connection also prevents mobile networks or other proxy servers from altering (or breaking) your HTML, changing your cache headers or compressing images.

Getting into the habit of building sites for https:// from the very beginning makes sense as you don’t have to check all your existing pages for insecure content as you do when migrating at a later date – I’d recommend always using a self-signed certificate on your development server.

Once you’re set up and organised – that is, you’ve installed Let’s Encrypt, written and tested a standard set of Apache or Nginx settings and been through the whole process for one domain – adding certificates for subsequent sites is going to take you little more than a minute or so.  Renewing them (note Let’s Encrypt certificates only last 90 days) takes just a few seconds.

By contrast, buying certificates the conventional way is tedious and error prone. You must:

  • pay (which also generates invoices you have to enter as part of your book-keeping)
  • generate and upload a CSR
  • (at a minimum) verify an email address on the domain name (which often involves other people and testing mailboxes in advance)
  • wait for approval and certificate generation (delays of an hour or two are common whenever I do it)
  • manually copy and paste your certificate
  • identify the correct intermediate certificate and add it to your bundle
  • install everything on the server, check the configuration, reload and hope for the best

Browser support

At the time of writing, Let’s Encrypt is still technically in beta.

Until now, certificates haven’t worked in Windows XP, but this should be resolved by late March 2016. Because of that there are some sites I’ve not migrated from paid certificates yet, and some where I’ve only forced https:// for the admin pages.  For what it’s worth, my approach has been to check analytics data for existing sites but generally just go ahead with new ones – fortunately this won’t be a problem for much longer.

Tips and Troubleshooting

Minimising downtime, keep your existing server running

Originally, it was necessary to use the Let’s Encrypt standalone webserver to generate or renew certificates, which meant turning your existing server off temporarily to free up the port.

This is no longer the case.  The webroot plugin creates a hidden .well-known directory in your docroot and places static files there as part of an ACME challenge to confirm ownership of the domain – therefore it’s compatible with Apache, Nginx or any other web server.

Directories beginning with a period (.) are usually protected (to stop people browsing .git or .htaccess files) so a brief bit of config is necessary. Nginx example:

location ~ /.well-known {
  allow all;
}

Recommendation: create this directory manually, stick a file in it and verify you can view it in the browser. If you get 403 Forbidden:

  • make sure the directory and files have the correct owner/group and permissions (i.e. can your nginx/www-data etc. user access them?)
  • remember Nginx’s rules about location matching. Are there location blocks earlier in the config that will take precedence over this one?
  • Remember to disable password protection, e.g. with auth_basic off;
  • If your directory is mapped to an alias, rather than a relative location under docroot, try beginning the block with location ^~ instead.
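
The manual check recommended above can be scripted roughly as follows. This is a sketch: make_acme_test_file and test.txt are made-up names, the scratch directory stands in for your real docroot, and the acme-challenge subdirectory is where the ACME HTTP challenge files are served from.

```shell
#!/bin/sh
# Create a test file where the webroot plugin would place its challenge files
# and print the URL path to request. DOCROOT is whatever your site uses.
make_acme_test_file() {  # usage: make_acme_test_file DOCROOT
    dir="$1/.well-known/acme-challenge"
    mkdir -p "$dir" && echo "ok" > "$dir/test.txt" \
        && echo "/.well-known/acme-challenge/test.txt"
}

# Demonstration against a scratch directory standing in for the real docroot.
docroot=$(mktemp -d)
url_path=$(make_acme_test_file "$docroot")
echo "now fetch: http://example.com$url_path"
# e.g. curl -i "http://example.com$url_path"   (you want a 200 here, not a 403)
```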

Even if you’re using the standalone server, use the command line switches to save yourself the time of manually responding to the prompts.

Why can’t you run the Let’s Encrypt standalone server on some other port so you can leave your web server running? Security.  Ports 80 and 443 are privileged and it would be dangerous for just any user of the server to be able to generate certificates.

Moving servers – remember to turn Certbot off on original machine

If you move a domain from one server to another, you’ll probably remember to install/configure certbot on the new machine, but then may forget to turn it off on the old one.  This doesn’t do any harm – but it does create confusion, because at some point you’ll get an error message like this:

The client lacks sufficient authorization – 404

You may then, as I have, spend a while debugging the old server, before you realise the domain is no longer even mapped to it.

Testing certificate security and compatibility

The Qualys online test is by far the most widely used at the moment.

Opinion: given the variation in the care with which people configure their security certificates, perhaps it’s time the padlock icon changed or degraded in some way to indicate a server where encryption is valid, but vulnerable (RC4 support, for example).  We have taught non-technical users that if the padlock is there, everything is fine, which is often not the case.

Choosing the right encryption protocols

It’s very hard for most people to keep up with the frequent OpenSSL security alerts (and new developments in cryptography such as elliptic curve). Worse, there’s a lot of partial advice in circulation – some of it dangerously out of date – about which settings to use.

Remy van Elst has written an excellent guide to maximising your SSL score. <- This is probably the most useful link in the entire post.  Bookmark this site.   He also explains many advanced topics like OCSP Stapling.

Troubleshooting OCSP Stapling

Remember, OCSP stapling won’t work on the very first request after restarting the server; you must allow Nginx a chance to asynchronously query the upstream certificates and cache that data.  So if you’re testing with the openssl command (see below) and it fails straight away, give it a few seconds and repeat.

Similarly, if you are only protecting, say the admin pages of the site, the Qualys SSL Labs test on the https:// version of the homepage can show “OCSP Stapling: No” when you run it, if no-one has visited an https:// protected page recently.

Note the OCSP cache is per Nginx worker, with no sharing between processes.

openssl and SNI

Brief reminder: Server Name Indication allows one server to handle lots of certificates for different domains without needing a separate IP address for each. But it doesn’t work in Internet Explorer on Windows XP or earlier (other browsers on Windows XP are fine, and IE itself supports it as of Windows 7.)

SNI is pretty standard now, of course.   If you’re running openssl from the Linux CLI, e.g. to test the server’s OCSP response…

openssl s_client -connect example.com:443 -servername example.com -tls1 -tlsextdebug -status | more

…be aware the -servername switch is essential, as it indicates which of the available certificates you want. If you don’t specify it, Nginx will send back the first one it finds (typically whichever domain is at the beginning of the alphabet; it seems to use the first listen 443; directive it finds in /etc/nginx/sites-enabled).
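
Since stapling often fails on the first request after a restart, it can help to wrap the openssl test in a retry loop. The following is a sketch only: ocsp_stapled and check_stapling are made-up names, the five second pause is arbitrary, and it leaves out -tls1 because many servers no longer accept TLS 1.0. It relies on s_client printing "OCSP Response Status: successful" when a response is stapled.

```shell
#!/bin/sh
# Succeeds if s_client output on stdin contains a successful stapled OCSP response.
ocsp_stapled() {
    grep -q 'OCSP Response Status: successful'
}

# Try a few times, pausing between attempts so the server can fill its cache.
check_stapling() {  # usage: check_stapling HOST [ATTEMPTS]
    host=$1
    attempts=${2:-3}
    i=1
    while [ "$i" -le "$attempts" ]; do
        if openssl s_client -connect "$host:443" -servername "$host" -status \
               </dev/null 2>/dev/null | ocsp_stapled; then
            echo "stapled OCSP response received on attempt $i"
            return 0
        fi
        sleep 5
        i=$((i + 1))
    done
    echo "no stapled OCSP response after $attempts attempts"
    return 1
}

# Usage (makes a real network connection, so it is not run here):
#   check_stapling example.com
```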

Monitoring certificate expiry

Let’s Encrypt will email you shortly before a certificate expires, but if you’re a Nagios user you can test validity yourself using the check_ssl_cert plugin.  Example config:

define service {
    service_description     Certificate - https://wturrell.co.uk
    host_name               turtle2
    check_command           check_ssl_cert!-H turtle2.wturrell.co.uk -n wturrell.co.uk -A -c 7
    use                     daily
}

where -c is the number of days before expiry a critical alert is issued, and daily is a service with a normal_check_interval of 1440 (1440 mins = 24 hrs)
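
If you don’t run Nagios, the same kind of threshold check can be approximated with openssl alone. This is a sketch, not a replacement for the plugin: cert_expires_within is a made-up helper built on openssl x509 -checkend, and the throwaway self-signed certificate (CN example.test, valid for 30 days) exists only so the example runs on its own.

```shell
#!/bin/sh
# Succeeds if the certificate expires within DAYS (or cannot be read at all).
cert_expires_within() {  # usage: cert_expires_within CERT_FILE DAYS
    if openssl x509 -in "$1" -noout -checkend "$(( $2 * 86400 ))" >/dev/null 2>&1; then
        return 1   # still valid that far ahead
    fi
    return 0       # expiring soon, or the file could not be parsed
}

# Throwaway 30-day self-signed certificate, purely for the demonstration.
tmp=$(mktemp -d)
openssl req -x509 -newkey rsa:2048 -nodes -days 30 -subj "/CN=example.test" \
    -keyout "$tmp/key.pem" -out "$tmp/cert.pem" 2>/dev/null

if cert_expires_within "$tmp/cert.pem" 7; then
    echo "CRITICAL: certificate expires within 7 days"
else
    echo "OK: certificate has more than 7 days left"
fi
```

For a Let’s Encrypt certificate you would point it at the relevant file under /etc/letsencrypt instead of the temporary one.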

Test output:

SSL_CERT OK - X.509 certificate for 'wturrell.co.uk' from 'AlphaSSL CA - SHA256 - G2' valid until Jul 16 07:25:54 2019 GMT (expires in 1231 days)

(the example I’ve given is a commercially bought certificate, hence the long duration.)

As per the documentation, Let’s Encrypt say they’ll be streamlining the renewal process, and provide an example of a script you could use in the meantime.  I’d still advocate taking responsibility for your own monitoring if you can.

Delete unwanted/old certificates

If you no longer use a domain (say it was a test domain before the site went live), there’s no longer any need to delete files manually from the various /etc/letsencrypt directories.  First:

certbot-auto delete

…then choose the certificate you want to remove from the list.

Troubleshooting Let’s Encrypt error messages

The client sent an unacceptable anti-replay nonce :: JWS has invalid anti-replay nonce

Run it again until it works (this was a bug earlier on due to a problem with Let’s Encrypt’s CDN – they’ve fixed that and I haven’t seen it since.)

could not find cert file

Check /var/log/letsencrypt/letsencrypt.log for the precise problem – for example, I’d manually (but only partially) deleted an unwanted certificate and it was still looking for the remainder of that, regardless of which domain I was attempting to renew.

Patching servers for glibc (CVE-2015-7547) – remember to reboot!

This is a significant (though it’s not yet clear how dangerous in practice) buffer overflow bug in glibc, the GNU C library used on Linux. The library handles system calls, for example, reading/writing to files.

The bug is in getaddrinfo (used to look up the DNS entry for a domain name).  Full explanation (Google)

Security updates have been available since 16 Feb 2016.

It’s easy to update but… remember to reboot afterwards! That’s the main reason I wrote this up – there’s a note to this effect in the security advisories, but utilities like aptitude won’t remind you. All sorts of services could be using glibc and remain vulnerable until they’ve restarted. Dan Kaminsky, at length, on how bad it could be
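
If you’re curious which processes are still holding on to the old library before you reboot, one approach is to look for a deleted libc in each process’s memory map. This is a rough sketch, assuming a Linux /proc filesystem and a package upgrade that replaced the library file on disk; maps_has_deleted_libc is a made-up helper name.

```shell
#!/bin/sh
# Succeeds if a /proc/PID/maps-style file still references a deleted copy of libc.
maps_has_deleted_libc() {  # usage: maps_has_deleted_libc MAPS_FILE
    grep -q '/libc[-.][0-9.]*so[^ ]* (deleted)' "$1" 2>/dev/null
}

# Walk every process; run as root so other users' maps are readable.
for maps in /proc/[0-9]*/maps; do
    if maps_has_deleted_libc "$maps"; then
        pid=${maps#/proc/}
        pid=${pid%/maps}
        echo "PID $pid ($(cat "/proc/$pid/comm" 2>/dev/null)) is still using the old libc"
    fi
done
```

An empty result after a reboot is what you’re hoping for; it doesn’t replace the reboot itself.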

Much as it’s nice to accumulate a large uptime value, a forced reboot is also a useful opportunity to check the relevant services on your machine will come up after a power outage and, for example, your firewall settings are being loaded correctly.

Recommendation: check your existing version before you update, so you can verify it’s changed afterwards:

sudo aptitude show libc6 (Debian)
or execute the library itself, e.g.
/lib64/libc.so.6
(look for the line: Compiled on a Linux 2.6.32 system on [date])

How To:

Use your usual update procedure (e.g. apt-get update/upgrade on Debian/Ubuntu, yum update on CentOS/RedHat) then reboot the server.

Remember to do your VMs (e.g. Vagrant boxes) too.

Note packages are backported, so for Debian you need to pay attention to the uXX suffix after the version number, not the version itself.

Does it affect Mac OS X?

No. (OS X doesn’t have glibc.)