Transport Layer Security (TLS) encryption may be commonplace in 2020, but this wasn’t always the case. Back in 2014, our business owner storefront traffic wasn’t encrypted. We manually provisioned the few TLS certificates that were in production. In this post, we’ll cover Shopify’s journey from manually provisioning TLS certificates to the fully automated system that supports over 1M business owners today.
In the Beginning
Up to 2014, only business owner shop administration and checkout traffic were encrypted. All checkouts were on the checkout.shopify.com domain. Shop administration was secured with a wildcard *.myshopify.com certificate, and checkout used a single-domain certificate for checkout.shopify.com. Our Operations team renewed the certificates manually as needed. During this time, teams began researching what it would take to offer TLS encryption to all business owners in an automated fashion.
We launched Shopify Plus in early 2014. One of Plus’s earliest features was TLS-encrypted storefronts. We manually provisioned certificates, adding new domains to the Subject Alternative Name (SAN) list as required. Because our certificate authority limited the number of domains per certificate, we added new certificates as more domains were onboarded. At the time, a significant number of users still ran Internet Explorer on Windows XP, which doesn’t support the Server Name Indication (SNI) extension, so we couldn’t rely on SNI to select a certificate by hostname.
While this addressed our immediate needs, there were several drawbacks:
- Manual certificate updates and provisioning were labor-intensive and needed to be handled with care.
- Additional IP addresses were needed to support new certificates.
- Having domains for non-related shops in a single certificate wasn’t ideal.
The pace of onboarding was manageable at first. As we onboarded more merchants, it was apparent that this process wasn’t sustainable. At this point, there were dozens of certificates that all had to be manually provisioned and renewed. For each Plus account onboarded, the new domains had to be manually added. This was labor-intensive and error-prone. We worked on a fully automated system during Shopify’s Hack Days, and it became a fully staffed project in May 2015.
Shopify’s Notary System
Automating TLS certificates meant addressing several facets of the process:
- How are the certificates provisioned from the certificate authority?
- How are the certificates served at scale?
- What other considerations are there for offering encrypted storefronts?
Our Notary system provisions certificates. When a business owner adds a domain to their shop, the system receives a request for a certificate to be provisioned. The certificate provisioning is fully automated via Application Programming Interface (API) calls to the certificate authority. This includes the order request, domain ownership verification, and certificate/private key pair delivery. Certificate renewals are performed automatically in the same fashion.
While it would make sense to group all of a shop’s domains into one certificate, the system handles each domain separately for simplicity. Each certificate covers one domain and has its own unique private key. The certificate and private key are stored in a relational database that the load balancers can access to terminate TLS connections.
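As an illustration of the one-certificate-per-domain storage model and automatic renewal described above, here is a minimal Python sketch using SQLite. The table layout, the function names, and the 30-day renewal window are assumptions made for this example; the post doesn't describe Notary's actual schema or renewal schedule.

```python
import sqlite3
from datetime import datetime, timedelta, timezone

# Hypothetical schema: one row, and therefore one certificate, per domain.
SCHEMA = """
CREATE TABLE IF NOT EXISTS certificates (
    domain          TEXT PRIMARY KEY,
    certificate_pem TEXT NOT NULL,
    private_key_pem TEXT NOT NULL,   -- unique key per certificate
    not_after       INTEGER NOT NULL -- expiry as Unix epoch seconds (UTC)
);
"""


def store_certificate(db, domain, certificate_pem, private_key_pem, not_after):
    """Insert or replace the single certificate row for a domain."""
    db.execute(
        "INSERT OR REPLACE INTO certificates VALUES (?, ?, ?, ?)",
        (domain.lower(), certificate_pem, private_key_pem,
         int(not_after.timestamp())),
    )
    db.commit()


def domains_due_for_renewal(db, now, window=timedelta(days=30)):
    """Return domains whose certificate expires within the renewal window.

    The 30-day default is an assumption for illustration only.
    """
    cutoff = int((now + window).timestamp())
    rows = db.execute(
        "SELECT domain FROM certificates WHERE not_after <= ? ORDER BY domain",
        (cutoff,),
    )
    return [row[0] for row in rows]


if __name__ == "__main__":
    db = sqlite3.connect(":memory:")
    db.executescript(SCHEMA)
    now = datetime(2020, 1, 1, tzinfo=timezone.utc)
    store_certificate(db, "shop.example.com", "CERT", "KEY",
                      now + timedelta(days=10))
    print(domains_due_for_renewal(db, now))
```

A periodic job could feed the result of `domains_due_for_renewal` back into the same provisioning path used for new domains, which matches the idea that renewals happen "in the same fashion" as initial orders.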
Scaling Up Certificate Provisioning
At the time, we hosted our nginx load balancers at our datacenters. Storing the TLS certificates on disk and reloading nginx when certificates changed wasn’t feasible. In a past article, we talked about our use of nginx and OpenResty Lua modules. Using OpenResty allowed us to script nginx to serve dynamic content outside of the nginx configuration. In addition, browser support for the TLS SNI extension was almost universal by then. By leveraging the TLS SNI extension, we dynamically load TLS certificates from our database in a Lua middleware via OpenResty’s ssl_certificate_by_lua directive. Certificates and private keys are directly accessible from the relational database via a single SQL query. An in-memory Least Recently Used (LRU) cache reduces the latency of TLS handshakes for frequently accessed domains.
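The lookup pattern described above (SNI hostname in, certificate and key out, with an LRU cache in front of a single SQL query) can be sketched as follows. The production code is Lua running inside OpenResty; this Python version is only an illustration of the caching logic, and the class name, table layout, and cache capacity are assumptions.

```python
import sqlite3
from collections import OrderedDict


class CertificateCache:
    """LRU cache in front of a certificate table keyed by domain."""

    def __init__(self, db, capacity=1024):
        self.db = db
        self.capacity = capacity
        self._entries = OrderedDict()  # least recently used entry first
        self.hits = 0
        self.misses = 0

    def lookup(self, server_name):
        """Return (certificate_pem, private_key_pem) for an SNI name, or None."""
        key = server_name.lower()
        if key in self._entries:
            self._entries.move_to_end(key)  # mark as most recently used
            self.hits += 1
            return self._entries[key]
        self.misses += 1
        row = self.db.execute(
            "SELECT certificate_pem, private_key_pem "
            "FROM certificates WHERE domain = ?",
            (key,),
        ).fetchone()
        if row is None:
            return None  # unknown domain: nothing is cached
        self._entries[key] = row
        if len(self._entries) > self.capacity:
            self._entries.popitem(last=False)  # evict least recently used
        return row


def make_demo_db():
    """Build an in-memory table with placeholder data (hypothetical schema)."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE certificates (domain TEXT PRIMARY KEY, "
        "certificate_pem TEXT, private_key_pem TEXT)"
    )
    db.executemany(
        "INSERT INTO certificates VALUES (?, ?, ?)",
        [(f"{n}.example", f"cert-{n}", f"key-{n}") for n in "abc"],
    )
    return db


if __name__ == "__main__":
    cache = CertificateCache(make_demo_db(), capacity=2)
    cache.lookup("a.example")  # miss: loaded from the database
    cache.lookup("a.example")  # hit: served from memory
    print(cache.hits, cache.misses)
```

Only the first handshake for a domain pays for the database round trip; later handshakes are served from memory until the entry is evicted. A real deployment would also need to expire or invalidate entries when a certificate is renewed, which this sketch leaves out.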
Solving Mixed Content Warnings
With TLS certificates in place for business owner shop domains, we could offer encrypted storefronts for all shops. However, there was still a significant hurdle to overcome. Each shop’s theme could have images or assets referencing non-encrypted Uniform Resource Locators (URLs). Mixing of encrypted and unencrypted content would cause the browser to display a Mixed Content warning, denoting that some resources on the page are not encrypted. To resolve this problem, we had to process all the shop themes to replace references to HTTP with HTTPS.
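A minimal sketch of that kind of rewrite is shown below. It is not the tooling Shopify used; the function name and the idea of an allowlist of hosts known to serve HTTPS are assumptions for illustration. The allowlist avoids upgrading URLs on hosts that may not support HTTPS at all.

```python
import re

# Matches "http://" followed by a hostname; the hostname is captured so it can
# be checked against the allowlist.
_HTTP_URL = re.compile(r"\bhttp://([A-Za-z0-9.-]+)", re.IGNORECASE)


def upgrade_asset_urls(theme_source, https_hosts):
    """Rewrite http:// URLs to https:// for hosts known to serve HTTPS.

    `https_hosts` is a set of lowercase hostnames; URLs on any other host are
    left untouched.
    """
    def replace(match):
        host = match.group(1)
        if host.lower() in https_hosts:
            return "https://" + host
        return match.group(0)

    return _HTTP_URL.sub(replace, theme_source)


if __name__ == "__main__":
    source = (
        '<img src="http://cdn.example.com/logo.png">'
        '<a href="http://legacy.example.net/">old site</a>'
    )
    print(upgrade_asset_urls(source, {"cdn.example.com"}))
```

In this example only the image URL on `cdn.example.com` is upgraded; the link to `legacy.example.net` stays on HTTP because that host isn't in the allowlist.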
With all the infrastructure in place, we realized the goal of supporting encrypted storefronts for all merchants in February 2016. The same system is still in place and has scaled to provide TLS certificates for all of our 1M+ merchants.
Let’s Encrypt is a non-profit certificate authority that provides TLS certificates at no charge. Shopify has been and is currently a sponsor. The service launched in April 2016, shortly after our Notary went into production. With the exception of Extended Validation (EV) certificates and other special cases, we’ve migrated away from our paid certificate authority in favor of Let’s Encrypt.
Move to the Cloud
In June 2019, our network edge moved from our datacenter to a cloud provider. The number of TLS certificates we needed to support drastically reduced the list of viable vendors. Once the cloud provider was selected, our TLS provisioning system had to be adapted to work with their system. There were two paths forward: using the cloud provider’s managed certificates, or continuing to provision Let’s Encrypt certificates and uploading them. The initial migration leveraged the provider’s certificate provisioning.
Using managed certificates from the cloud provider has the advantage of being maintenance-free after they’ve been provisioned. There are no storage concerns for certificates and private keys. In addition, certificates are automatically renewed by the vendor. Administrative work was required during the migration to guide merchants to modify their domain’s Certification Authority Authorization (CAA) Domain Name System (DNS) records as needed. Backfilling the certificates for our 1M+ merchants took several weeks to complete.
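To show why CAA records mattered during the migration, here is a simplified sketch of the check a certificate authority performs before issuing: if a domain publishes CAA `issue` records, only the CAs named there may issue for it. The function name and the CA identifiers (`ca.example`, `other-ca.example`) are placeholders. The sketch ignores parts of the CAA rules such as climbing the DNS tree to find the relevant record set, `issuewild`, and critical flags.

```python
def ca_may_issue(caa_records, ca_domain):
    """Decide whether a CA may issue, given a domain's relevant CAA record set.

    `caa_records` is a list of (tag, value) tuples, for example
    [("issue", "ca.example")]. This is a simplified reading of the CAA rules,
    not a complete implementation.
    """
    issue_values = [value for tag, value in caa_records
                    if tag.lower() == "issue"]
    if not issue_values:
        # No "issue" property in the record set: issuance isn't restricted.
        return True
    for value in issue_values:
        # The value may carry parameters after ";", e.g. "ca.example; k=v".
        issuer = value.split(";", 1)[0].strip().lower()
        if issuer == ca_domain.lower():
            return True
    # An "issue" value of ";" names no CA and therefore forbids all issuance.
    return False


if __name__ == "__main__":
    records = [("issue", "other-ca.example")]
    print(ca_may_issue(records, "ca.example"))
```

A merchant whose domain only authorizes `other-ca.example` would need to add an `issue` record for the new provider's CA before a managed certificate could be provisioned, which is the kind of change merchants had to be guided through.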
After the initial successful migration to our cloud provider, we revisited the certificate provisioning strategy. As we maintain an alternate edge network for contingency, the Notary infrastructure is still in place to provide certificates for that infrastructure. The intent of using provider-managed certificates was for them to be a stepping stone toward deprecating Notary in the future. While the cloud provider-provisioned certificates worked well for us, there were now two sets of certificates to keep synchronized. To simplify certificate state and reduce operational load, we now use the Notary-provisioned certificates for both edge networks. Instead of provisioning certificates on our cloud provider, certificates from Notary are uploaded as new ones are required.
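One way to picture uploading Notary certificates "as new ones are required" is a comparison between what Notary holds and what the provider already has. The sketch below is an assumption about how such a sync could be planned, not a description of Shopify's implementation. For simplicity it fingerprints the PEM text directly, and the function names are hypothetical.

```python
import hashlib


def fingerprint(certificate_pem):
    """SHA-256 over the PEM text (a simplification; real tooling would
    typically hash the DER-encoded certificate)."""
    return hashlib.sha256(certificate_pem.encode("utf-8")).hexdigest()


def plan_uploads(notary_certs, provider_fingerprints):
    """Return the domains whose certificate is missing or stale on the provider.

    `notary_certs` maps domain -> PEM held by Notary.
    `provider_fingerprints` maps domain -> fingerprint currently uploaded.
    """
    return sorted(
        domain
        for domain, pem in notary_certs.items()
        if provider_fingerprints.get(domain) != fingerprint(pem)
    )


if __name__ == "__main__":
    notary = {"a.example": "PEM-A2", "b.example": "PEM-B", "c.example": "PEM-C"}
    provider = {
        "a.example": fingerprint("PEM-A1"),  # renewed in Notary since upload
        "b.example": fingerprint("PEM-B"),   # already up to date
    }
    print(plan_uploads(notary, provider))
```

With Notary as the single source of truth, the provider side only ever needs to catch up, which is simpler than reconciling two independently provisioned sets of certificates.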
Outside of our business owner shop storefronts, we rely on nginx for other services that are part of our cloud infrastructure. Some of our Lua middleware, including the dynamic TLS certificate loading code, was contributed to the ingress-nginx Kubernetes project.
Our TLS certificate journey took us from a handful of manually provisioned certificates to a fully automated system that can scale up to support over 1M merchants. If building systems from the ground up to solve real-world problems interests you, our Engineering blog has stories about other challenges we have encountered. Visit our Engineering career page to find out about our open positions. Learn about the actions we’re taking as we continue to hire during COVID‑19.