Sysadmin's Basic Guide to SSL Certificates and Authorities
Note: this is a post from 2006, recently reposted with fixed markup; I'd like to think I've improved as a writer in the past six years, but I'm only going to make very light edits, nothing substantive. No changes to reflect improved understanding or developments in the software (eg, OpenSSH now has its own kind of PKI). Some changes for clarity, and [ed: additions] if I really think something needed to be added. Some of my opinions have changed since I wrote this. Occasional corrections of typos were made to the original post, but for sanity's sake I'm starting here from my pre-markup document which lacks those updates. Do not be surprised (only dismayed) to see an updated version as a new post sometime.
Intended audience: system administrators who know roughly what SSL/TLS is and can use SSH and OpenPGP products (such as GnuPG) who now want to know more and perhaps issue local certificates. You should know what public-key cryptography is, but are not expected to be able to follow any math (no equations herein) — this is about using the math's results, not understanding the math itself. You understand that ‘encrypt’ is scrambling and ‘decrypt’ is descrambling.
The explanations are by comparison amongst existing protocols, pointing out similarities and differences and the reasons for the differences, so along the way you'll read refreshers of material not directly related to SSL or CAs.
This document will not be sufficient reading for running a public Certificate Authority (CA), especially not for remuneration, but may be adequate for internal use.
You'll need access to an install of the OpenSSL software; access to OpenSSH and GnuPG software may be beneficial for poking at or understanding explanations of background material which you're not completely familiar with. Some path-names will be used, with values which match common defaults. It's assumed that anyone ready for this document is capable of figuring out the correct values without prompting.
Finally, be aware that the author of this document is not an expert on these technologies, merely a practising sysadmin. There may be grievous mistakes scattered amongst the opinionated rants. No warranties, either express or implied, are associated with this text.
Terminology
CA | Certificate Authority, someone who attests to identity bindings |
PKI | Public-Key Infrastructure, trying to make this stuff scale |
PGP | Pretty Good Privacy, protocol OpenPGP |
SSH | Secure SHell, for remote access to systems |
SSL | Secure Sockets Layer |
S/MIME | Secure MIME; email crypto like PGP but using PKI |
TLS | Transport Layer Security; SSL got renamed |
WoT | Web of Trust (see refresher) |
Key Refresher and Identity Primer
So you have a private key and a public key, and the public key can be recovered from the private key. The private key can be used to do some math operation on something like a checksum such that anyone with the public key can verify that the result could only come from something with access to the private key. Anyone with the public key can transform data so that only something with the private key can recover it.
Hey presto, that's the basis of OpenPGP, of S/MIME and of SSH's public-key ("pubkey") authentication. If the preceding paragraph left you bewildered, then you'll need to read some introductory public-key crypto material before you're ready to come back to this document, or to run a CA.
Shared secret (symmetric) crypto is the stuff where the same key is used at either end. In fact, pubkey crypto is so slow that really it's just used as a key distribution system for shared secret crypto. When you encrypt with OpenPGP, the software generates a shared secret, encrypts the text with that, then encrypts the much smaller shared secret using the public key crypto and sends it along. Shared secret crypto is also used for handling passphrases to protect the private keys on local storage, so that someone who gets access to the disk still needs to know your passphrase to be able to get to the key. The other bit of glue is the cryptographic digest; amongst other things, digests are used for making fingerprints of data, so that any small change in the data severely changes the fingerprint. A signature is the fingerprint with verification data made by the private key. A key fingerprint is the fingerprint of the public key itself (and in current OpenPGP, the keyid is the end bit of the fingerprint).
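To make that concrete, here's a rough sketch of the same hybrid approach driven by hand with the openssl command; the filenames are purely illustrative and a real OpenPGP implementation does all of this (and more, safely) for you:

# Generate a random shared secret, encrypt the message with it (symmetric),
# then encrypt the small shared secret with the recipient's public key:
openssl rand -out session.key 32
openssl enc -aes-256-cbc -in message.txt -out message.enc -pass file:session.key
openssl rsautl -encrypt -pubin -inkey recipient-pubkey.pem -in session.key -out session.key.enc
# And a digest (fingerprint) of the message:
openssl dgst -sha256 message.txt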
But, here the public key is just the numbers used in the math. Nothing binds this to an identity. The possibilities for CAs and Webs-of-Trust (WoTs) arise from how to associate this public key with an identity with some degree of proof.
SSH
SSH is a protocol geared towards providing remote login across IP networks.
Some details included here are specific to OpenSSH, which is perhaps the predominant implementation. The concepts are the same for all implementations, but by including the OpenSSH specifics, you can go take a look and see. Various extensions might add features dismissed below as not being present, such as PKI support.
The public key is put into a file listing ‘authorized_keys’ for access to a system account. Each file can contain multiple keys, one per line. The file might be in ~/.ssh/authorized_keys (common), or in a directory with a file named for the account, where the user can't modify the file themselves. Authorisation rules can be associated with the key by including them on the same line, at the start, indicating that access is only authorised for particular commands, or only if coming from particular remote hosts.
There is no intrinsic bond between a key and an identity. The public key grants access to an account by virtue of being in the right file at the right time. At most, there may be an unverified comment attached to the key at the end of the line it's on, which might be used to describe the account@source.host which the key came from.
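For illustration, a single authorized_keys line with options at the start and a comment at the end might look like this (key material truncated, names invented):

from="192.0.2.0/24",command="/usr/local/bin/run-backup",no-pty ssh-rsa AAAAB3NzaC1yc2E...== backup@client.example.org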
A remote host has its own private/public key, used for verifying that the correct host is being talked to, without being subject to a Man-in-the-Middle attack. Without per-site extra work, this does not prove a host identity; instead, the client caches the public key in a "known_hosts" file; this is trusted to be safe from tampering. If the host public key changes, the client raises an alert. [ed: term is "trust on first use"]
This works, in the hands of competent or semi-competent people. It provides what people most need with a minimum of fuss and overhead. Pubkey authentication removes the threat of online dictionary attacks against the server. Confidentiality negotiated for the network stream (TCP connection) provides protection against eavesdroppers [ed: and integrity protection is supplied too, to protect against modification; that's not inherently part of confidentiality and encryption]. People need some care the first time that they connect, but thereafter the fact that it's the same host is sufficient for all identity concerns. Host identity fingerprints can be published for verification against what's seen when you first connect; extract them with something like:
$ ssh-keygen -f /etc/ssh/ssh_host_rsa_key.pub -l
$ ssh-keygen -f /etc/ssh/ssh_host_dsa_key.pub -l
and publish, perhaps in a PGP-signed text document to immediately get a WoT protection. The handling of keys is simple enough that distribution mechanisms can be scripted as needed.
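For example, having collected the fingerprints into a file, a clearsigned copy can be published as-is (filename illustrative):

$ gpg --clearsign host-fingerprints.txt
# produces host-fingerprints.txt.asc: readable text plus an OpenPGP signature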
In part this works because remote shell access implies a great deal of trust, so access grants are comparatively limited, generally not for the open public. An administrator doesn't care whether or not everyone can verify their host's identity, only that their user-base can. An administrator needs only care about the identities of a small set of users (small being less than a few thousand) and can set their own policies about expiration, verification, etc. Issues of scaling trust were side-stepped by realising that for both host and user verification for trusted and/or critical maintenance access, trust need not, indeed should not, scale out of the administrative control of the organisation using it.
For scaling up within an organisation, various hacks can be used, such as storing keys in LDAP. Alternatives such as the shared-secret system ‘Kerberos’ (which is also the authentication infrastructure in Active Directory) can provide smoother scaling with better centralised control against issues such as empty passphrases/passwords, enforced server verification (so that users don't blindly accept changed host identities) and improved audit trails for shared accounts, together with coverage of many more protocols than just shell access: IMAP, SMTP, POP3, NNTP, HTTP — all these (and more) can use Kerberos authentication. Oh, and SSH too.
But Kerberos is a bitch the moment you need to step outside the boundaries of one organisation, so SSH comes back.
This author's current preferred shell access system is SSH using GSSAPI (Kerberos, effectively) authentication within an organisation but SSH Pubkey authentication elsewhere.
OpenPGP
The simplest way to associate an identity with a public key is for someone to use their private key to sign a binding of the public key to a representation of the identity. That's a PGP key signature. To confirm that the identity at least has a hope of being valid, all such identities should include a signature from the private key corresponding to the public key. This is a self-signature. The identity should be a name, an email address and perhaps a comment.
If you bundle the cryptographic public key up with the bindings and call this the public key, as a unit of distribution, the public key can collect extra bindings (key signatures) from other private keys. If you can verify any of the signatures and if you trust the entity which made the signature, then you've just verified the identity attached to the key. These public keys are commonly uploaded to public keyservers from which they can be retrieved by anyone.
If you then allow for transitive trust, then you can start weaving a web of trust. Although you might want to look at the pictures of webs resulting from scientific experiments examining the results of treating spiders to various hallucinogenic compounds to get a better idea of the weaving involved.
A signature on a key is a public attestation as to the identity concerned. Think of a legal affidavit stating that the key is associated with someone or something with that identity. That's what you're effectively doing when you sign someone else's key and publish the results. This is why you should use local (non-exportable) signatures until you understand what you're doing, so that you can only fool yourself. Various guides talk about techniques for verifying identity when signing keys; try to understand the issues before making publicly distributable signatures affirming human identities which others might rely upon. This branches off into interesting and important issues about the nature of identity, use of pseudonyms, etc.
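With GnuPG, the distinction is between a local signature and an exportable one; the key id here is just a placeholder:

$ gpg --lsign-key 0x12345678    # local, non-exportable signature: you can only fool yourself
$ gpg --sign-key 0x12345678     # exportable signature: only once you know what you're attesting to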
In PGP, there are now multiple trust models. The old, classic, model has trust levels associated with identities and thresholds for minimum number of signatures by fully-trusted introducers (intermediaries) and semi-trusted introducers. There are the keys you've signed; you can associate a trust level for how confident you are in the identity checking performed by the owner of that key. For keys signed by those keys, if sufficiently signed by some combination of fully trusted and semi-trusted introducers, the key becomes considered verified; you can then associate trust levels with those. Trust path verification also has a maximum depth.
There is no central authority, but there is no certainty of being able to verify the identity of a PGP key. You can use local signatures to confirm that you're seeing the same key each time, but that's about it. Within an organisation you can have some competent security-aware people who check signatures, etc, and act as trusted introducers; the introducers sign the keys of other people (after appropriate verification) and get those people to, at least, locally-sign as trusted the introducer's key. A concerted effort can ensure that there's a strong trust path within the organisation. On the open Internet, the nearest similar approach is the concept of the "strong set", the largest group of keys which can "reach" other keys within the group. In practice, this is centred around the creator of PGP, Phil Zimmermann. For people in the strong set, PGP works well for identity verification of other people in the strong set whom they haven't met. If you're not in the strong set, the lack of trust on so many keys leads to an overload of spurious warnings and unfortunately people quickly learn to not pay any attention to the trust level associated with a key.
It doesn't take many signatures to get into the strong set and there are many benefits to doing so; being able to verify that software really does come from an organisation is good. Those organisations who use software signing keys for which there are no signatures on the keys in the public keyservers are basically getting "checksum plus verifying that the next release comes from the same place". Those clueful organisations, such as ISC (maintainers of INN and Bind), who ensure that their signing key has signatures from several people in the strong set are able to release software and have the sysadmin really verify that the software did come from them, to the extent that they trust intermediate introducers. Next time there's a security update in Bind, it's comforting at least to be able to verify that the new release really did come from the people it should be coming from and that the push to update isn't an elaborate social engineering attack to get people to install a compromised fake new release of the software. Warm fuzzies in the midst of frustration are remarkably soothing.
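As a sketch of what that verification looks like in practice (the filenames and key id are purely illustrative):

$ gpg --recv-keys 0xDEADBEEF                      # fetch the claimed signing key from a keyserver
$ gpg --verify software-1.2.3.tar.gz.asc software-1.2.3.tar.gz
# then ask: is the signing key reachable from keys you trust?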
SSL Certificates
In the SSL world politics has worked to ensure that a small number of organisations can make a lot of money and there are some rather artificial restrictions.
You make a private key; the public key isn't normally put on disk. You can then make a Certificate Signing Request, or CSR; this is like the public key with a self-signature. No client software will directly use this, though. Instead, you send the CSR to a Certificate Authority. The CA's job is to verify that you are who you claim to be, in the real world, and after that to process the CSR, signing the data with their CA key, to produce a public certificate. They can then get this back to you in any number of ways — confidentiality is a nicety, but if it truly matters then the public-key crypto is already so severely broken that you're in trouble. The public key is provided by a server to every client that tries to start an SSL session with it, so is freely given out anyway.
Here, the whole job of the CA is to verify the identity of the requester. The CA serves no other purpose than to attest to identity, both at the time of signing and maintaining lists of expired/compromised keys. For this, they get to charge money each year for once-off full verification and double-checking at the time of renewal. Any CA which tries to remove humans from this loop is asking for trouble.
Identity here consists of a name which is addressable by the relevant computer systems (email address or hostname) and descriptions about the person involved, in a structured format.
Unfortunately, because the identity is an intrinsic part of the certificate, rather than one of a set of optional bolt-ons, a certificate can only have one signed binding. Thus someone buying a certificate needs to choose one CA which is most likely to be supported by all their clients. This goes directly against free market principles and ultimately feedback-loops into a very few CAs having most of the market: http://www.securityspace.com/s_survey/data/man.200609/casurvey.html
In theory, a server can use multiple certificates and just keep retrying as each one fails; this is slow, scales poorly, and good luck finding servers and clients which support it reliably.[ed: theory that's very stupid] A TLS client can suggest which CAs it handles, thus letting every visited SSL server know every CA supported, including local private CAs [ed: for clarity, this is a defined extension], instead of just having the server offer the fingerprints of the CAs from which it actively offers certificates. That would be too easy and scalable and respect privacy too much.
Also, any degree of real extensibility requires TLS; one neat feature which makes TLS desirable is the server_name extension of RFC 3546 §3.1 [ed: obsoleted by RFC 4366]; that one allows you to use virtual-hosting of SSL sites on one IP address with a different certificate per site. Unfortunately, trying to negotiate TLS breaks some old sites which are still SSLv2-only. What's truly pathetic is that the sites still using old cryptographically unsound SSLv2 are uniformly banking websites.
[ed: SSLv2 is almost certainly dead, server_name has caught on, the ServerNameIndication extension is now commonly TLS SNI, Microsoft led the way in introducing browser support for it. OpenSSL and GnuTLS now support TLSv1.1 and TLSv1.2 and the TLS world is actually getting better!]
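If you want to poke at SNI yourself, a reasonably modern openssl s_client will send the extension for you (hostname illustrative):

$ openssl s_client -connect www.example.org:443 -servername www.example.org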
SSL Certificates, Theory
Structured data has been around for quite some time. There's one standard approach which was designed to create a flexible framework for describing data and mapping it into a wire presentation, with the presentation mapping being distinct from the data structure definitions. That's ASN.1, Abstract Syntax Notation 1, ITU-T specification X.680, for describing the data. Then X.690 defines the BER, CER and DER encodings. There's also now XER, for an XML encoding. ASN.1 with one of the X.690 encodings is found in Kerberos v5, in LDAP, in SNMP, in H.323 and also in cryptographic certificates. Think of them as nested Type-Length-Value representations, handling arbitrarily large numbers and structured groups of types.
Data types are agreed interpretations of a given type label. A type label is an OID, Object IDentifier. OIDs are sequences of numbers, where at any point authority can be delegated to someone else, much as in DNS domains. But instead of federating names, which people attach value to (whether sentimental or business, such as trademarks), it's ‘just’ numbers. The top level is managed by the ITU and various arcs lead to various other organisations; some organisations charge money for an arc, others offer them for free, so there's NEVER a reason to hijack a number. No arc has more importance than another. IANA Enterprise numbers are under the "1.3.6.1.4.1." arc, "iso.org.dod.internet.private.enterprises." and these are free; in Britain, any registered company can take their Companies House registration number and use it without further formality (or money) as the base of an OID arc; iso(1).member-body(2).uk(826).0.1.<regnum> for companies registered in England and Wales, 1.2.826.0.2.<regnum> for companies registered in Scotland (after stripping off the SC prefix).
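To get a feel for the nested TLV structures and the OIDs inside them, openssl can dump any certificate's ASN.1 for you:

$ openssl asn1parse -in cert.pem | less
$ openssl asn1parse -inform der -in cert.crt | less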
CSRs are in the PKCS#10 format, specification available at time of writing from: http://www.rsasecurity.com/rsalabs/node.asp?id=2124, which offers all of the PKCS series specifications.
The certificate definition is called X.509 (another ITU-T specification); the version which finally became widespread is X.509v3 which is extensible. In binary form, this often has the filename extension ".crt" and MIME type "application/x-x509-ca-cert". The other format seen is PEM (for Privacy Enhanced Mail), which is basically just storing the exact same data, with the exact same binary representation, but encoded into base64 with BEGIN and END tag lines.
Various standard extensions are commonly expected on certificates you'll encounter. They refine how the certificate is used without changing that it is a certificate — something binding the public key to the identity, under a signature. These include, amongst others:
- basicConstraints, which can set CA:TRUE or CA:FALSE for whether the certificate can itself be used as a CA; it must be present in a CA cert.
- keyUsage, which has values relating to infrastructure and email usage
- extendedKeyUsage, which has values relating to use with HTTPS and code-signing
- subjectAltName, see below.
- crlDistributionPoints, for where to look to verify that a certificate is still valid
When talking to an application server (IMAP, HTTP, whatever), the hostname supplied by the user (not refined by any kind of insecure DNS resolution) is compared to the Common Name, CN, field of the certificate. That allows for a simple match. It is also matched against the subjectAltName extension, trying to find a match. No real clients today will ignore subjectAltName. subjectAltName is a comma-separated list of names, tagged with their type ("type:name"). Tags include, amongst others, "email", "DNS" and "IP". First rule is to always include the CN field as a DNS subjectAltName, because otherwise you'll be discovering the corner cases in how implementations behave when a value is only present in one or the other place. So for "www.example.org", you'd have:
field "CN", value "www.example.org" field "subjectAltName", value "DNS:www.example.org".
The subjectAltName can list several names and can include wildcards. So you might have:
field "CN", value "www.example.net" field "subjectAltName", value: "DNS:www.example.net, DNS:docs.example.net, DNS:webmail.example.net" field "CN", value "www.example.org" field "subjectAltName", value: "DNS:www.example.org, DNS:example.org, DNS:*.example.org"
It may be possible to also use a wildcard in the CN field, although I'm not sure how widespread this is.
Also, a CA certificate can include nameConstraints, which restricts what names the CA claims authority over. Most CAs claim global authority and most clients don't appear to offer a way to impose name constraints by local policy instead of based on the certificate. As a result, a CA in Brazil can issue an SSL cert for www.barcleys-bank.co.uk and it's up to you to spot the mis-spelling. Equally, a British CA can issue certificates for sites under .br. This is insane.
If creating your own CA, give serious consideration to creating the CA certificate with a nameConstraints field:
nameConstraints=permitted;DNS:*.uk
Given any X.509 certificate, you can take a look at the content to get used to this with "openssl x509". Eg:
$ openssl x509 -in cert.pem -noout -text | less
$ openssl x509 -inform der -in cert.crt -noout -text | less
SSL Certificate Generation
The openssl(1) command is a top-level meta-command, providing various sub-commands which do the actual work. It ships with manual-pages, both for itself and for the individual sub-commands. All sub-commands take the "-help" option [ed: or don't take it, but give out usage information in response to the error anyway].
Most commands will let you use “-in <file> -noout -text” to decode the file stored.
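For example, to decode a CSR rather than a certificate:

$ openssl req -in svcname.csr -noout -text | less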
You first need to create a key. This key can be stored protected by a passphrase (via symmetric cryptography); ALWAYS use a passphrase for a CA key; decide for yourself whether you need a server to start unattended or whether the server should require a passphrase to start. Applications such as Apache will prompt for a passphrase at start-up if one is needed. Local laws and regulations, or business partner requirements, may mandate the use of a passphrase in certain circumstances, such as accepting credit-card numbers.
# KEY GENERATION
# No passphrase:
openssl genrsa -out svcname.key.pem 1024
# Passphrase:
openssl genrsa -out svcname.key.pem -des3 1024
# Examining uses a different sub-command:
openssl rsa -in svcname.key.pem -noout -text | less
Then you need to make the CSR; this is the first point when identifying information is attached, and will use “/etc/ssl/openssl.cnf” for templating the answer.
# MAKING THE CSR (CERTIFICATE SIGNING REQUEST)
openssl req -new -key svcname.key.pem -out svcname.csr
You can then make a self-signed temporary certificate, if you need to wait on a slow CA, again using the req(1) sub-command:
# TEMPORARY CERTIFICATE
openssl req -x509 -key svcname.key.pem -in svcname.csr -out cert.pem
After receiving the certificate back from the CA, just put it in place and use it, no extra commands should be needed. If you got the certificate back in DER form, transform it to PEM with:
# DER to PEM:
openssl x509 -inform der -outform pem -in cert.crt -out cert.pem
SSL Certificate Authority
Around about this point, consider paying for software or using a free public CA such as CAcert.org; one certificate to install, no messing with openssl.cnf. Taking a look at http://www.cacert.org/ is strongly recommended. Running your own CA can scale further if you provide a lot of services and can help ameliorate some concerns of the more security conscious.
The main problem is that managing the v3 extensions requires rewriting parts of the config file or writing temporary config files which pull in others. Whilst theoretically you can pull in environment variables in the config files, this has proven fragile in my experience. I got it to work on the CA machine, but when, in personal use, the CA machine was also the server machine (very naughty, but it's a personal box and I don't issue certs for anyone else), it interfered with use of the req command. Feedback from those with good solutions for this is appreciated.
[ed: define SUBJECTALTNAME = email:copy in the defaults at the top, then you can safely use ${ENV::SUBJECTALTNAME} below, even when ‘SUBJECTALTNAME’ does not exist in the environment. ]
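A minimal sketch of that workaround; the section name assumes the ‘usr_cert’ defaults discussed below and the hostnames are illustrative:

# near the top of openssl.cnf, before any [section], as the fallback default:
SUBJECTALTNAME = email:copy
# then, in the usr_cert section:
subjectAltName = ${ENV::SUBJECTALTNAME}
# and at signing time, override from the environment:
SUBJECTALTNAME="DNS:www.example.org, DNS:example.org" \
    openssl ca -policy policy_anything -days 370 -in svcname.csr -out svcname.crt.pem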
If running your own private CA, consider using nameConstraints as described above. I didn't know about it when setting up the two I've run so haven't yet tried it myself. [ed: also, I have a tendency to start using new domains for various purposes every now and then; this feature makes more sense inside a corporation, and not much then, since an acquisition would need a new CA cert issued]
Also consider paying an established CA to sign your CA cert, instead of having it be self-signed. If the established CA's certificate is already installed in all the client software which you use then you've just boot-strapped your CA into existence. Because of the money involved, it's probably worth first running small trials with a self-signed cert as described below, so that you can gain experience and be less likely to make expensive mistakes.
Your /etc/ssl/openssl.cnf will have a key ‘default_ca’ in the section ‘ca’, which defines another section which has attributes of the CA. One of those will be the directory where the CA lives.
[ ca ]
default_ca      = CA_default            # The default ca section

[ CA_default ]
dir             = ./demoCA              # Where everything is kept
certs           = $dir/certs            # Where the issued certs are kept
crl_dir         = $dir/crl              # Where the issued crl are kept
#....
x509_extensions = usr_cert
#....
Change the ‘dir’ to an appropriate name (‘orgCA’ or whatever); note the other values, as they all modify every filename used below; I've stuck to the defaults, other than ‘demoCA’. 1100 days gives you three years and a few slack days. Also, look in the “[ v3_ca ]” section and consider setting nsCertType; this is (probably) where nameConstraints goes too.
# MAKE DIRECTORY LAYOUT
mkdir /etc/ssl/orgCA
cd /etc/ssl/orgCA
mkdir certs crl newcerts private
openssl rand -hex 10 > serial
touch index.txt
chmod -R go-rwx .
# MAKE CA KEY:
openssl genrsa -des3 -out private/cakey.pem 1024
# MAKE CA CERTIFICATE (self-signed):
openssl req -new -x509 -days 1100 -extensions v3_ca -key private/cakey.pem -out cacert.pem
[ed: the above used to use old '01' advice for initializing the serial number; don't do that, serial prediction attacks have moved on]
You'll want a non-PEM copy of the cert, because when in DER form and served with MIME type ‘application/x-x509-ca-cert’ applications such as web-browsers will prompt the user to install the certificate when a link to it is clicked on.
# CONVERT TO DER:
openssl x509 -inform pem -outform der -in cacert.pem -out orgCA.crt
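If serving that file over HTTP from Apache, the MIME type can be mapped with something like this (assuming mod_mime and a stock configuration):

AddType application/x-x509-ca-cert .crt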
Get that certificate installed into the browsers you use, into the mail-clients, etc. This is the point where using a custom CA can bite. If you don't have direct administrative control over the software and settings of your client hosts, this is where getting your CA certificate signed as a CA by an established CA can help.
Many tools which are linked against OpenSSL will pull in the system-wide certs; this is a directory, typically /etc/ssl/certs/, where all the .pem certs live. To quickly locate the correct certificate, the client library hashes the CA cert identity supplied by the server and looks for a filename such as abcdef01.0 in the directory. These are best maintained as symlinks, using the tool c_rehash. Unfortunately, some OSes don't bother installing that tool. Install it. Use it. Find the options in your software for setting the dir or using the default system dir and enable it.
[ed: beware that GnuTLS does not use this directory scheme, and OpenSSL 1.0.0 changed the hash algorithm used for these hashes; I modified my c_rehash to generate two links, one with -subject_hash_old in the openssl x509 invocation, so that programs using the old library could still find the certificates even after I had started using OpenSSL 1+. ]
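For reference, the hash used in the lookup can be printed directly, and c_rehash just maintains the symlinks for you; the paths are illustrative:

$ openssl x509 -noout -hash -in /etc/ssl/orgCA/cacert.pem
# prints, say, "abcdef01"; the library then looks for /etc/ssl/certs/abcdef01.0
$ c_rehash /etc/ssl/certs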
Then, you come to actually using the CA certificate to sign the server CSRs.
The CA section (CA_default) defines in the x509_extensions attribute the name of the section with the attributes; by default, ‘usr_cert’. So, in the usr_cert section:
basicConstraints        = CA:FALSE
nsComment               = "MyOrganisation SSL Certificate"
subjectKeyIdentifier    = hash
authorityKeyIdentifier  = keyid,issuer:always
crlDistributionPoints   = URI:https://www.security.example.org/CA/orgCA.crl
nsCertType              = server
subjectAltName          = DNS:foo, DNS:bar
That subjectAltName needs changing for each certificate used. You can try: subjectAltName = ${ENV::SUBJECTALTNAME} but beware of it breaking elsewhere [ed: see above ed-note for workaround]. Also change nsCertType if needed; it's a comma-separated list, the other values being ‘client’ and ‘email’.
# CERT SIGNING:
# modify subjectAltName and then:
openssl ca -policy policy_anything -days 370 -in svcname.csr -out svcname.crt.pem
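Once signed, it's worth checking that the new certificate chains back to the CA, and eyeballing the extensions, before deploying it:

$ openssl verify -CAfile cacert.pem svcname.crt.pem
$ openssl x509 -in svcname.crt.pem -noout -text | less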
Meta
That's it for now.
With thanks to Roger Burton West, for his guide which helped me get started and from which, even now, I cut&paste the command-names when I'm tired.
Copyright © 2006 Phil Pennock, All Rights Reserved.