The Heartbleed bug spurred server administrators worldwide to work closely with Certification Authorities (CAs) to rekey and reissue potentially vulnerable SSL certificates. Part of this effort included revoking the existing certificates used on vulnerable servers, so that any private keys an attacker obtained cannot later be used in a man-in-the-middle attack against the website.
Unfortunately, in recent days, certain news reports and blogs addressing certificate revocation and checking for revoked certificates online have failed to discuss the benefits of revocation, instead focusing on the minority of circumstances where widely deployed revocation is not perfect. In the interest of providing balanced information to the public, we, as members of the CA community and as individuals generally interested in a high level of Internet security, would like to help clarify some of the issues confused by these reports and blogs.
What is certificate revocation checking and how does it work?
Certificate revocation is the process CAs use to indicate that a certificate should no longer be considered trustworthy. Revocation checking happens in various ways, depending on the software relying on the certificate. Some software, like Internet Explorer, uses the Online Certificate Status Protocol (OCSP) to verify a certificate’s validity. Other software can check certificate revocation lists (CRLs) provided by the CA or, in the case of Google’s Chrome, check a compilation of CRLs using an out-of-band process. The newest technology, OCSP stapling, provides the revocation information from the website directly to the browser when creating the secure connection. Despite these many different avenues, revocation checking essentially boils down to getting data that tells the browser whether the certificate can still be trusted.
As leading CAs, we feel strongly that revocation checking serves a useful and beneficial purpose in online trust and security. These benefits have become more apparent recently in response to the Heartbleed bug. As Larry Seltzer points out in his article (http://www.zdnet.com/internet-slowed-by-heartbleed-identity-crisis-7000028506/), traditional OCSP responses and CRLs cannot prevent a sophisticated and direct attack against one user. Even so, traditional revocation information can minimize the potential devastation caused by compromised certificates. Revocation checking takes these certificates out of circulation to protect and preserve online trust.
Although Heartbleed was big enough to warrant a large-scale revocation of potentially compromised certificates, certificate revocation usually occurs on a smaller scale for a variety of reasons, including key compromise, change of control over a domain name, or loss and wrongdoing by the certificate holder. In the case of Heartbleed, revocation has invalidated hundreds of thousands of certificates, removing them from continued use. Regardless of the reason for revocation, without revocation checking users would not be adequately warned about malicious sites soliciting credit card information, nor would they be able to tell that they may not be transacting with the intended party. In all cases, browsers that properly check revocation information provide users a valuable service by preventing unwitting access to sites deemed compromised or fraudulent.
“Soft fail” vs. “Hard fail”
Certificate-using software typically encounters one of three outcomes when it requests revocation information. A “good” response is clear and indicates that the certificate is valid and unrevoked. A “revoked” response is also clear and indicates that there is something untrustworthy about the certificate. A revoked response should always cause software to reject the certificate. For SSL/TLS, this should cause the browser to refuse to create a secure connection. The third outcome is a non-response, where no reply is ever received from the applicable status server. A non-response may be the result of typical Internet disruptions or an indication that the software operator is under a direct attack. Although browsers could overcome normal Internet interruptions by re-attempting to retrieve status information, most browsers attempt to access revocation information only once during the handshake, making it difficult to determine why the information was unavailable (whether because of an attack or just normal Internet disruptions). A simple retry for revocation information could significantly help separate real attacks from Internet errors.
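The retry idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `fetch` is a hypothetical callable standing in for an OCSP or CRL lookup, the status strings are invented for the example, and the attempt count and delay are arbitrary defaults rather than values used by any browser.

```python
import time
from typing import Callable, Optional


def fetch_status_with_retry(
    fetch: Callable[[], Optional[str]],
    attempts: int = 3,
    delay_seconds: float = 0.0,
) -> str:
    """Return "good" or "revoked" if any attempt succeeds, else "no_response".

    `fetch` is a hypothetical stand-in for a revocation lookup. It returns a
    status string, returns None, or raises OSError on a network failure.
    """
    for attempt in range(attempts):
        try:
            status = fetch()
        except OSError:
            status = None  # treat a network error like a missing response
        if status in ("good", "revoked"):
            return status
        if attempt < attempts - 1:
            time.sleep(delay_seconds)  # brief pause before the next attempt
    return "no_response"


# Usage: a lookup that fails once, then succeeds on the retry.
responses = iter([None, "good"])
print(fetch_status_with_retry(lambda: next(responses)))  # prints "good"
```

A transient outage is likely to clear within a retry or two, whereas a lookup that stays unavailable across several attempts is a stronger hint that something is actively blocking the status server.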
Although revocation servers from major CAs are generally globally distributed through systems that achieve near-100-percent uptime, standard Internet connectivity cannot achieve this level of performance. Because of these disruptions, software takes one of two different approaches to the non-response. Software using “hard fail” rejects the certificate, assuming the disruption is enough of a risk that the certificate cannot be trusted. On the other hand, software using “soft fail” accepts the certificate, often without any warning that the certificate status information was not properly received. This lack of clear information about the certificate status is unfortunate since, according to recent studies, users actually find browser messages beneficial (see http://www.cs.berkeley.edu/~devdatta/papers/alice-in-warningland.pdf).
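The difference between the two policies comes down to a single branch, as the following minimal Python sketch shows. The `Status` enum and `accept_certificate` function are hypothetical names chosen for illustration and do not correspond to any browser's actual API.

```python
from enum import Enum


class Status(Enum):
    GOOD = "good"
    REVOKED = "revoked"
    NO_RESPONSE = "no_response"


def accept_certificate(status: Status, hard_fail: bool) -> bool:
    """Decide whether to accept a certificate given its revocation status."""
    if status is Status.GOOD:
        return True
    if status is Status.REVOKED:
        return False  # a revoked certificate is rejected under both policies
    # No response: hard fail rejects, soft fail silently accepts.
    return not hard_fail


# Usage: the same missing response leads to opposite decisions.
print(accept_certificate(Status.NO_RESPONSE, hard_fail=True))   # prints False
print(accept_certificate(Status.NO_RESPONSE, hard_fail=False))  # prints True
```

The two policies agree whenever a response actually arrives. They differ only in the non-response case, which is also the case an attacker who can block traffic to the status server is able to force.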
To accommodate the normal flux in Internet availability, most browsers cache certificate status information for a set period of time, updating it only when it is considered stale. Concerns about the possibility of “denial of service” attacks due to large CRL files or DDoS attacks on OCSP responders are legitimate, but CAs have taken considerable measures to minimize these risks by using distributed response servers. Browser caching and geographically distributed status information reduce the number of users who benefit from a soft-fail approach.
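A cache of this kind might look like the Python sketch below. It assumes a single fixed maximum age for every entry, which is a simplification: real OCSP responses carry their own validity window. The class name, the `fetch` callable, and the serial-number keys are all hypothetical.

```python
import time
from typing import Callable, Dict, Tuple


class StatusCache:
    """Cache revocation statuses and refetch them only once they are stale."""

    def __init__(
        self,
        max_age_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._max_age = max_age_seconds
        self._clock = clock  # injectable so the cache can be tested without waiting
        self._entries: Dict[str, Tuple[str, float]] = {}

    def get(self, serial: str, fetch: Callable[[str], str]) -> str:
        """Return the cached status for `serial`, calling `fetch` only if stale."""
        now = self._clock()
        entry = self._entries.get(serial)
        if entry is not None and now - entry[1] < self._max_age:
            return entry[0]  # still fresh: no network lookup needed
        status = fetch(serial)  # hypothetical lookup against the CA's server
        self._entries[serial] = (status, now)
        return status


# Usage: the second call within the freshness window is served from the cache.
cache = StatusCache(max_age_seconds=3600)
print(cache.get("01", lambda serial: "good"))     # prints "good" (fetched)
print(cache.get("01", lambda serial: "revoked"))  # prints "good" (cached)
```

The second call shows the trade-off: while an entry is fresh, a brief outage of the status server goes unnoticed, but so does a revocation issued in the meantime.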
Google takes an innovative approach designed to overcome the possibility of a directed attack. By compiling CRLs into a single CRLSet, Google can have each browser download a list of revoked certificates separately from the secure connection. Unfortunately, to keep the size small, Google chose to limit the number of CRLs included in each CRLSet. As a result, the status check fails to detect most revoked certificates, leaving many users unaware of the actual status of a relied-on certificate. We are eager to see whether Google can improve CRLSets in a way that protects all of the web, rather than a limited subset. So far, CRLSets appear unable to perform sufficiently fast without sacrificing security.
Revocation checking is one of many tools to protect users
Revocation checking is just one arrow in the quiver CAs have to handle all sorts of bad actors. When combined with other advancements like caching, stapling, and pinning, certificates are one of the strongest segments of the security infrastructure. This layered approach to security and improved infrastructure is why CASC has promoted the use of OCSP with stapling as a default option in both browser and server software for more than a year. CASC members believe OCSP stapling enhanced with “Must Staple”, which directs supporting software to hard-fail if a stapled response is not provided with the certificate, is the standards-based solution that the entire ecosystem should push to adopt. With OCSP Must Staple, software can hard-fail without concern about unintentionally failing to obtain revocation information. More information about stapling and the CASC’s efforts to assist deployment is available at https://casecurity.org/2013/02/14/certificate-revocation-and-ocsp-stapling/.
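The client-side decision that Must Staple enables can be sketched in Python as follows. This is a simplified model, not an implementation of any TLS library: the `Handshake` type and its fields are hypothetical, and a client without a stapled response is assumed here to fall back to soft-fail behavior.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Handshake:
    must_staple: bool              # certificate asserts Must Staple
    stapled_status: Optional[str]  # "good", "revoked", or None if nothing was stapled


def accept(handshake: Handshake) -> bool:
    """Decide whether to proceed with the connection."""
    if handshake.stapled_status == "revoked":
        return False
    if handshake.stapled_status == "good":
        return True
    # Nothing was stapled. With Must Staple the absence itself is a failure,
    # so the client can hard-fail. Otherwise this sketch assumes soft fail.
    return not handshake.must_staple


# Usage: a missing staple is fatal only when the certificate demands one.
print(accept(Handshake(must_staple=True, stapled_status=None)))   # prints False
print(accept(Handshake(must_staple=False, stapled_status=None)))  # prints True
```

Because the server delivers the status inside the handshake, the client never has to reach a status server itself, and an attacker can no longer turn a blocked lookup into a silently accepted certificate.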
As any security professional knows, a system can never be 100-percent secure, especially any system connected to the Internet. However, any system that is protected in layers and works in the majority of cases is definitely an improvement over no security. With revocation working in nearly every case, we should not let the perfect be the enemy of the good. In a world where nanoseconds matter for performance, many browsers are now focused on streamlining the user experience, sometimes sacrificing security. Because of this, a performance decision not to check revocation may leave Internet users vulnerable to events like Heartbleed well past the date when other software is “fixed” and secure.