Building Shopify’s Application Security Program
Shopify builds products for an industry based on trust. From product discovery to purchase, we act as a broker of trust between the 800,000+ merchants who run their business on our platform and their customers, who come from anywhere in the world. That’s why it’s critical that everyone at Shopify understands the importance of trust in everything we build.
Security is a non-negotiable priority, and we’ve purposefully built a security mindset into our culture. It gives our security team a huge advantage because we start with engaged, talented, and security-minded members across our product teams. But we also know how important it is that every business on our platform has access to the latest and most innovative features to help them be successful. So the question is: how do we build an application security program that encourages safety at high speed, removes complexities, and fosters an environment for creative problem solving so that everyone can focus on delivering amazing products to our merchants?
There are three parts to our program that I will outline in this post: scaling secure applications, scaling security teams, and scaling security interactions. When I started at Shopify 7 years ago, I was the lone employee focused on security. Since then, we have grown to a team of dozens of security engineers, covering the breadth of Shopify’s applications, infrastructure, integrations, and hardware platform.
Scaling Secure Applications
As your company grows, the number of applications and services you deploy will inevitably increase. For a small team, it can be daunting to think about providing security for many more services than there are team members, but there are ways to wrangle this sprawl and set your company up with trust at scale.
The first recommendation is to work across R&D disciplines (engineering, data, and UX) and decide on a homogeneous technical baseline you’ll use for your services. There are a lot of non-security advantages to doing this, so the appetite for standardization should be present already. For Shopify, deciding that we would default to all of our products being built in Ruby on Rails meant that our security tooling could go deep on the security concerns for Rails, without thinking about any other web application frameworks. We made similar technical choices up and down the stack (databases, routing, caching, and configuration management), which simplified the developer experience and also allowed us to ignore security concerns in anything other than the things we knew we ran.
Knowing what you are running is a lot harder than it sounds, but it is key to achieving security success at high speed. The way this is done will look different in every organization, but the objective will be the same: visibility. When a new vulnerability is announced, you need visibility into what needs to be patched and the ability to notify the responsible team or automatically kickstart the patching process for every affected service. At Shopify, our security team joined our Production Engineering team’s service tracking project and got a massive head start on observability into the services, dependencies, and code of everything running in our environment, including the ability to automatically update application dependencies.
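To make the visibility idea concrete, here is a minimal sketch of the lookup that happens when an advisory lands: given an inventory of services and their locked dependency versions, find every service running a vulnerable version and who owns it. The inventory contents, the `Service` struct, and the `affected_services` helper are all hypothetical names for illustration, not Shopify's actual tooling; only `Gem::Version` and `Gem::Requirement` are real RubyGems classes.

```ruby
require "rubygems" # Gem::Version and Gem::Requirement ship with Ruby

# Hypothetical inventory record. In practice this data would come from a
# service-tracking system rather than a hard-coded list.
Service = Struct.new(:name, :owner, :dependencies, keyword_init: true)

INVENTORY = [
  Service.new(name: "storefront", owner: "team-storefront",
              dependencies: { "rails" => "5.2.1", "nokogiri" => "1.8.4" }),
  Service.new(name: "billing", owner: "team-billing",
              dependencies: { "rails" => "5.2.3", "nokogiri" => "1.10.4" }),
  Service.new(name: "reports", owner: "team-data",
              dependencies: { "sinatra" => "2.0.5" }),
].freeze

# Returns the services whose locked version of `gem_name` falls inside the
# vulnerable range, e.g. "< 1.10.4" or [">= 1.0", "< 1.10.4"].
def affected_services(inventory, gem_name, vulnerable_range)
  requirement = Gem::Requirement.new(vulnerable_range)
  inventory.select do |service|
    version = service.dependencies[gem_name]
    version && requirement.satisfied_by?(Gem::Version.new(version))
  end
end

# Example: an advisory affecting nokogiri versions below 1.10.4.
affected_services(INVENTORY, "nokogiri", "< 1.10.4").each do |service|
  puts "#{service.name}: notify #{service.owner}"
end
```

The same query result could feed a notification or an automated dependency-update job; the important part is that the inventory exists and is queryable before the vulnerability is announced.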
Additionally, every new application gets to start with the best defaults we have come up with to this point because we have collectively started hundreds of new projects with the same framework, in the same environment, and using the same technology.
Scaling Security Teams
In a start-up, product direction must be fluid and adapt quickly based on the discovery of new information to keep the company growing. Unless security features are differentiating your product from competitors, investing in a security team isn’t usually the top growth priority. For me, it took over a year before we hired our second security team member. This meant I wore a lot of hats and used some of the tactics described above to ensure a security foundation was included in all new product development.
Growing our security team meant carving off specializations to the first few people we hired. Fraud, application security, infrastructure security, networking, and anti-abuse all started as one-person teams going deep into a particular aspect of the overall security program and feeding their lessons back into the teams across the company.
You also need to understand your options for targeted activities and where third-party services can be used to advance your security agenda. Things like penetration testing, bug bounty programs, and auditing can be used as external validation on a time- and budget-limited basis.
No matter the size of the security team, any security incident is everyone’s responsibility to respond to. Having relationships with teams across the organization will help get the right people moving quickly when you’re faced with an urgent situation or a high-severity risk to mitigate. The security team should never be left with only its own members to fix high-priority issues. Outside of incidents, there are always ways that security priorities can be embedded within other projects being worked on. Maintaining a list of long-term security enhancements that are ready to be worked on is an invaluable way to make things better without the overhead of staffing an entire team.
Scaling Security Interactions
Security teams have a reputation for being slow, inconsistent, and risk-averse. To defeat each of those stereotypes, the path to success is to be fast, automated, and risk-aware. The way your security team interacts with the rest of the company is the most important part of consistently building secure products for the long term.
Deploying security tripwires at the testing and code repository levels allows your team to define dangerous methods and detect unwanted patterns as they are committed. The time when a developer is writing code is the best time to course-correct towards a more secure implementation. To make this effective, flagging a security risk should be designed like any good production alert: timely, high-fidelity, and actionable, with a low false positive rate.
Because of the standardization discussed so far, we can build these tactics once and deploy them across all of our codebases. With these tactics in place, you gain confidence that even when an application is totally off your radar, it’s being built in line with your security standards. An example of this approach at Shopify is how we handle html_safe. In Rails, html_safe is a confusingly named method: it does not make a string safe, it marks the string as already safe so that it is rendered as unescaped HTML, which can lead to cross-site scripting (XSS) vulnerabilities. Our approach to solving this problem consists of renaming this method to dangerously_output_as_html so it’s clear what it does, adding a comment to any pull request using this method that links to our training materials on mitigating XSS, and triggering an alert to our Application Security (appsec) team so they can review the proposed code change and suggest an alternative and safer approach. This allows our application security team to focus on the exponential benefit of automation rather than the linear benefit of human reviews.
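As a rough illustration of the html_safe tripwire described above, the sketch below scans the added lines of a unified diff for the risky method names and produces the two follow-up actions: a pull request comment and an appsec alert flag. This is an assumption-laden sketch, not Shopify's implementation: `scan_diff`, `review_actions`, the `Finding` struct, and the training URL are all hypothetical, and a real integration would post the comment and alert through your code host's API.

```ruby
# Hypothetical link to internal training material (placeholder URL).
TRAINING_URL = "https://example.com/xss-training"

# Method names treated as tripwires: the original Rails name and the
# renamed one from the post.
TRIPWIRE = /\.(?:html_safe|dangerously_output_as_html)\b/

MESSAGE = "This change outputs unescaped HTML, which can lead to XSS. " \
          "See #{TRAINING_URL} for safer alternatives."

Finding = Struct.new(:file, :line, :message)

# Walks a unified diff and returns a Finding for every added line that
# matches the tripwire. Line numbers refer to the new version of the file.
def scan_diff(diff_text)
  findings = []
  file = nil
  line_no = 0
  diff_text.each_line do |raw|
    line = raw.chomp
    if line.start_with?("+++ ")
      # New-file header, e.g. "+++ b/app/helpers/foo.rb"
      file = line.sub(/\A\+\+\+ (?:b\/)?/, "")
    elsif (hunk = line.match(/\A@@ -\d+(?:,\d+)? \+(\d+)/))
      # Hunk header gives the starting line number in the new file
      line_no = hunk[1].to_i
    elsif line.start_with?("+")
      findings << Finding.new(file, line_no, MESSAGE) if line.match?(TRIPWIRE)
      line_no += 1
    elsif line.start_with?(" ")
      line_no += 1 # unchanged context line
    end
    # Removed lines ("-") do not advance the new-file line number.
  end
  findings
end

# Turns findings into the two actions from the post: comments for the
# pull request and a flag telling us whether to alert the appsec team.
def review_actions(findings)
  {
    comments: findings.map { |f| "#{f.file}:#{f.line}: #{f.message}" },
    alert_appsec: !findings.empty?,
  }
end
```

Scanning only added lines keeps the signal timely and the false positive rate low: existing uses are not re-flagged on every change, and the developer sees the comment while the code is still fresh.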
Finally, our best security interactions are the ones we don’t need to have. For example, by making risk decisions at the infrastructure level, we can provide a trustworthy security baseline with our built-in safeguards and tripwires to the teams deploying applications running in that infrastructure without them even knowing those protections are there.
These are just a few of the ways we are tackling the problem of security at scale. Our team is always on the lookout for new ideas and people to join our team to help protect the hundreds of thousands of businesses running on our platform. If these sound like the kinds of problems you want to solve, check out these available positions: Director of Security Engineering, Security Engineering Manager, and Lead Software Engineer - Mobile Security.