Security

Hubble: Our Tool for Encapsulating and Extending Security Tools

Fundamentally, Shopify is a company that thrives by building simplicity. We take hard, risky, and complex things and make them easy, safe, and simple.

Trust is Shopify’s team responsible for making commerce secure for everyone. First and foremost, that means securing our internal systems and IT resources, and maintaining a strong cybersecurity posture. If you’ve worked in these spaces before, you know that it takes a laundry list of tools to effectively manage and secure a large fleet of computers. Not only does it take tons of tools, it also takes training, access provisioning and deprovisioning, and constant patching. In any large or growing company, these problems compound and can become exponential costs if they aren’t controlled and solved.

You either pay that cost by spending countless human hours on menial processes and task switching, or you accept the risk of shadow IT—employees developing their own processes and workarounds rather than following best practices. You either get choked by bureaucracy, or you create such a low-trust environment that people don’t feel their company is interested in solving their problems.

Shopify is a global company that, in 2020, embraced being Digital by Design—in essence, the firm belief that our people have the greatest impact when we support them to work whenever and wherever they like. As you can imagine, this only magnified the problems described above. With the end of office centricity, suddenly the work of securing our devices got a lot more important, and a lot more difficult. Network environments got more varied, the possibility of in-person patching or remediation went out the window—the list goes on. Faced with these challenges, we searched for off-the-shelf solutions, but couldn’t find anything that fully fit our needs.

Hubble Logo

So, We Built Hubble.

An evolution of previous internal solutions, Hubble is a tool that encapsulates and extends many of the common tools used in security. Mobile device management services and more are all fully integrated into Hubble. For IT staff, Hubble is a one-stop shop for inventory, device management, and security. Rather than granting hundreds of employees access to multiple admin panels, they access Hubble—which ingests and standardizes data from other systems, and then sends commands back to those systems. We also specify levels of granularity in access (a specialist might have more access than an entry-level worker, for instance). On the back end, we also track and audit access in one central location with a consistent set of fields—making incident response and investigation less of a rabbit hole.
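
To make that shape concrete, here’s a minimal sketch of an encapsulation layer along these lines. It’s illustrative only, not Hubble’s actual code: the adapter, policy, and audit interfaces are hypothetical stand-ins for whatever systems sit behind the facade.

    # Illustrative sketch, not Hubble's actual code. Adapters normalize each
    # vendor's data into one shape; a single entry point enforces access
    # policy and audit logging.
    DeviceRecord = Struct.new(:serial, :owner, :os_version, :last_seen, keyword_init: true)

    class MdmAdapter
      # Each adapter translates one vendor's API into DeviceRecord and back.
      def devices
        raise NotImplementedError
      end

      def lock(serial)
        raise NotImplementedError
      end
    end

    class Hubble
      def initialize(adapters:, policy:, audit_log:)
        @adapters, @policy, @audit_log = adapters, policy, audit_log
      end

      # One audited, access-controlled path instead of one admin panel per vendor.
      def lock_device(serial, actor:)
        raise 'access denied' unless @policy.allows?(actor, :lock_device)

        adapter = @adapters.find { |a| a.devices.any? { |d| d.serial == serial } }
        @audit_log.record(actor: actor, action: :lock_device, target: serial)
        adapter&.lock(serial)
      end
    end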

Hubble’s status screen on a user’s machine

For everyone else at Shopify, Hubble is a tool to manage and view the devices that belong to them. At a glance, they can review the health of their device and its compliance: not an arbitrary set of metrics, but ones we define and find valuable, like OS/patch compliance, VPN usage, and more. Folks don’t need to ask IT, or just wonder, whether their device is secure. Hubble informs them, either via the website or device notification pings. And if their device isn’t secure, Hubble provides them with actionable information on how to fix it. Users can also specify test devices, or opt in to betas that we run. This enables us to easily build beta cohorts for any testing we might be running. When you give people the tools to be proactive about their security, and show that you support that proactivity, you help build a culture of ownership.

And, perhaps most importantly, Hubble is a single source of truth for all the data it consumes. This makes it easier for other teams to develop automations and security processes. They don’t have to worry about standardizing data, or making calls to 100 different services. They can access Hubble, and trust that the data is reliable and standardized.

Now, why should you care about this? Hubble is an internal tool for Shopify, and unfortunately it isn’t open source at this time. But the two lessons we learned building and running Hubble are valuable and applicable anywhere.

1. When the conversation is centered on encapsulation, the result is a partnership in creating a thoughtful and comprehensive solution.

Building and maintaining Hubble requires a lot of teams talking to each other. Developers talk to support staff, security engineers, and compliance managers. While these folks often work near each other, they rarely work directly together. This kind of collaboration is super valuable and can help you identify a lot of opportunities for automation and development. Plus, it presents the opportunity for team members to expand their skills, and maybe get an idea of what their next role could be. Even if you don’t plan to build a tool like this, consider involving frontline staff in the design and engineering processes in your organization. They bring valuable context to the table, and can help surface the real problems that your organization faces.

2. It’s worth fighting for investment.

IT and Cybersecurity are often reactive, ad hoc teams. In the worst cases, this field lends itself to unhealthy cultures and an erratic work-life balance. Incident response teams and frontline support staff often have unmanageable workloads and expectations, in large part due to outdated tooling and processes. We strive to make sure it isn’t like that at Shopify, and it doesn’t have to be that way where you work. We’ve been able to use Hubble as a platform for identifying automation opportunities. By having engineering teams connected to support staff via Hubble, we encourage a culture of proactivity. Teams don’t just accept processes as broken and outdated; they know that there’s talent and resources available for solving problems and making things better. Beyond culture and work-life balance, consider the financial benefits and risk minimization that this strategy realizes.

For each new employee onboarded to your IT or Cybersecurity teams, you spend weeks if not months helping them ramp up and safely access systems. This incurs certification and training costs (which can easily run into the thousands of dollars per employee if you pay for their certifications) and makes for a more difficult search for the right candidate. Then you take on the risk of all these people having direct access to sensitive systems. And finally, you take on the audit and tracking burden of all of this.

With each tool you add to your environment, you increase complexity exponentially. But there’s a reason those tools exist, and complexity on its own isn’t a good enough reason to reject a tool. This is a field where costs want to grow exponentially. It seems like the default is to either accept that cost and the administrative overhead it brings, or ignore the cost and just eat the risk. It doesn’t have to be that way.

We chose to invest and to build Hubble to solve these problems at Shopify. Encapsulation can keep you secure while keeping everyone sane at the same time.

Tony is a Senior Engineering Program Manager and leads a team focused on automation and internal support technologies. He’s journaled daily for more than 9 years, and uses it as a fun corpus for natural language analysis. He likes finding old bread recipes and seeing how baking has evolved over time!


Wherever you are, your next journey starts here! If building systems from the ground up to solve real-world problems interests you, our Engineering blog has stories about other challenges we have encountered. Intrigued? Visit our Engineering career page to find out about our open positions and learn about Digital by Design.

Continue reading

Making Open Source Safer for Everyone with Shopify’s Bug Bounty Program

Zack Deveau, Senior Application Security Engineer at Shopify, shares the details behind a recent contribution to the Rails library, inspired by a bug bounty report we received. He'll go over the report and its root cause, how we fixed it in our system, and how we took it a step further to make Rails more secure by updating the default serializer for a few classes to use safe defaults.

Continue reading

Let’s Encrypt x Shopify: Securing the Web 4.5 Million Domains at a Time

On June 30, 2021, Shipit!, our monthly event series, presented Let’s Encrypt and Shopify: Securing Shopify’s 4.5 Million Domains. Learn about how we secure over 4.5 million Shopify domains and team up to foster a safer Internet for everyone. The video is now available.

It’s already been six years since Shopify became a sponsor of Let’s Encrypt.

In 2016, the SSL team started transitioning all of our merchants’ stores to HTTPS. When we started exploring the concept a few years earlier, it was a daunting task. There were few providers that could let us integrate a certificate authority programmatically, and the few that did had names like “Reseller API.” The idea that you would give away certificates for free, with no human involved, was completely alien in this market. Everything was designed around the idea that a user would be purchasing the certificate, downloading it, and installing it somehow. That’s a lot more problematic than you might think. For example, a lot of those APIs returned human-readable error messages instead of defined error codes. Normally, they would expect the implementer to send the message back to the user trying to purchase a certificate, but in a fully automated system there is no user to read anything. For Shopify, all 650,000 domains would get a certificate, and they would be provisioned and renewed without any interaction from our merchants.

I first heard about Let’s Encrypt in 2014. A lot of the chatter online was about the fact that they would become a certificate authority providing free certificates (they were pretty expensive until then), but a bit less about the other part of the project: the ACME protocol. The idea was to fully automate certificate authorities using standardized APIs.

In the summer of 2015 they still hadn’t launched, but I started writing a Ruby implementation of the ACME client protocol on weekends to get a feel for it. I’d already been through this exercise a few times with other providers. Working from a specification was pretty refreshing. They’re boring documents, but when trying to automate hundreds of thousands of domains that you don’t really control, you want to know that you have all your exceptions accounted for. That’s when we reached out to them to figure out how Shopify could help and agreed on a sponsorship. We didn’t intend to make use of their service, at least not in the immediate future, but we shared values around the open web and the importance of removing barriers to entry using technology.
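
For a feel of what that automation looks like in Ruby today, here’s a rough sketch using the acme-client gem’s v2 API. The domain and contact address are placeholders, and error handling is omitted; consult the gem’s documentation for the full flow.

    require 'acme-client'
    require 'openssl'

    # The account key identifies us to the certificate authority.
    client = Acme::Client.new(
      private_key: OpenSSL::PKey::RSA.new(2048),
      directory: 'https://acme-v02.api.letsencrypt.org/directory'
    )
    client.new_account(contact: 'mailto:admin@example.com', terms_of_service_agreed: true)

    order = client.new_order(identifiers: ['shop.example.com'])
    challenge = order.authorizations.first.http

    # Serve challenge.file_content at the path in challenge.filename, then:
    challenge.request_validation
    while challenge.status == 'pending'
      sleep 1
      challenge.reload
    end

    csr = Acme::Client::CertificateRequest.new(common_name: 'shop.example.com')
    order.finalize(csr: csr)
    pem_chain = order.certificate # the issued certificate chain, ready to install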

Interacting with a small organization that does its work fully in the open was also quite refreshing. My experience dealing with certificate authorities had been working with an account manager who forwarded my questions to a technical team. The software they run is usually not implemented by them, so there’s a limit to how much they can answer. Let’s Encrypt being fully open changes the dynamic. I asked questions on IRC and they answered with GitHub links pointing at the actual implementation. I reported bugs or inconsistencies in the specification, and they tagged me in the pull request that fixed them.

In late November, we started rolling out our shiny new automated provisioning system. We immediately ran into scalability issues with our initial providers. Some napkin math with the throttling they were imposing on us showed we would need about 100 days to provision every domain (roughly 6,500 certificates a day across 650,000 domains). We let it run over the holidays and launched in February 2016.

The team was already engaged in its next mission, but in the back of our minds we knew we needed to revisit this. Now that the bulk of the domains were done, new domains would come at a slower pace, and eventually renewals, so the setup would be good for a while at our current growth projection. Our main concern was emergency rotation. If for some reason we had to rotate our private keys, or the certificate chain was compromised somehow, we’d be in trouble. A hundred days is too slow to react to an incident.

We needed to be more responsive for our merchants, and that’s why we decided to add Let’s Encrypt as a backup option. We were able to roll Let’s Encrypt out in a few hours, compared to months with our original providers. The errors we ran into were predictable because their specification and server implementation are open source, so we could refer directly to them to debug unexpected behaviour. It was so reliable that we decided to make them our main certificate authority.

Let’s Encrypt is a game changer for the industry. For a big software-as-a-service company like Shopify, it saves time because the implementation is built around an open specification. You can even change or add a certificate authority that supports the ACME protocol without redesigning your entire infrastructure. It’s more reliable than the APIs of the past because it was designed to be fully automated from the beginning.

Shipit! Presents Let’s Encrypt and Shopify: Securing Shopify’s 4.5 Million Domains

Shipit! welcomes Josh Aas, co-founder and Executive Director of Let’s Encrypt and Shopify’s Charles Barbier, Application Security Development Manager, to talk about securing over 4.5 million Shopify domains and teaming up to foster a safer Internet for everyone.

Additional Information

Charles Barbier is a Developer Lead for the Application security team. You can connect with him on Twitter.


We're planning to DOUBLE our engineering team in 2021 by hiring 2,021 new technical roles (see what we did there?). Our platform handled record-breaking sales over BFCM and commerce isn't slowing down. Help us scale & make commerce better for everyone.

Continue reading

Schematizing Deletion at Scale

At Shopify, we analyze a variety of events from our buyers and merchants to improve their experience and the platform, and empower their decision making. These events are collected via our streaming platform (Kafka) and stored in our data warehouse that houses event data at a rate of tens of billions of events per day. The image below depicts how these events have historically been collected, stored in our data warehouse, and used in other online dashboards.

How events were collected, stored in our data warehouse, and used in other online dashboards in the old system.

We set out to improve the technical organization of our systems and enhance the reliability, performance, and efficiency of data processing for the purpose of effecting deletion. The Privacy team and the Data Science & Engineering teams collaborated and addressed those challenges together, achieving long-term benefits. The rest of this blog post focuses on our collaboration efforts and the technical challenges we faced when addressing these issues in a large organization such as Shopify.

Context Collection

The lack of guaranteed schemas for events was the root cause of a lot of our challenges. To address this, we designed a schematization system that specifies the structure of each event, including the type of each field, evolution (version) context, ownership, and privacy context. The privacy context specifically covers marking sensitive data, identifying data subjects, and handling PII (that is, what to do with PII).

Schemas are designed by data scientists or developers interested in capturing a new kind of event (or changing an existing one). They’re proposed in a human readable JSON format and then reviewed by team members for accuracy and privacy reasons. As of today, we have more than 4500 active schemas. This schema information is then used to enforce and guarantee the structure of every single event going through our pipeline at generation time.
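
To make this concrete, below is an illustrative sketch of what such a schema might look like. The field names and exact keys here are hypothetical, reconstructed from the description that follows:

    {
      "name": "signup_event",
      "version": 2,
      "owner": "identity-team",
      "privacy_setting": {
        "data_controller": "shopify",
        "data_subject": "email"
      },
      "fields": {
        "email": {
          "type": "string",
          "doc": "Email address the account was created with",
          "privacy": { "pii_type": "email", "handling": "tokenize" }
        },
        "source_ip": {
          "type": "string",
          "doc": "IP address the signup request came from",
          "privacy": { "pii_type": "ip_address", "handling": "obfuscate" }
        },
        "shop_plan": {
          "type": "string",
          "doc": "Plan selected at signup"
        }
      }
    }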

Let’s read through this schema and see what we learn from it:

The privacy_setting section specifies whose PII this event includes by defining a data controller and a data subject. The data controller indicates the entity that decides why and how personal data is processed (Shopify in this example). The data subject designates whose data is being processed; in this schema, that person is tracked via their email address.

Every field in a schema has a data-type and doc field, and a privacy block indicating if it contains sensitive data. The privacy block indicates what kind of PII is being collected under this field and how to handle that PII.

Our new schematization platform was successful in capturing the aforementioned context and it significantly increased privacy education and awareness among our data scientists and developers because of discussions on schema proposals about identifying personal data fields. This platform also helped with reusability, observability, and streamlining common tasks for the data scientists too. Our schematization platform signified the importance of capitalizing on shared goals across different teams in a large organization.

Personal Data Handling

At this point, we have schemas that gather all the context we need regarding structure, ownership, and privacy for our analytical events. The next question is how to handle and track personal information accurately in our data warehouse.

We perform two types of transformation on personal data before entering our data warehouse. These transformations convert personal (identifying) data to non-personal (non-identifying) data. In particular, we employ two types of pseudonymisation techniques: Obfuscation and Tokenization.

Obfuscation and Enrichment

When we obfuscate an IP address, we mask half of the bytes but include geolocation data at the city and country level. In most cases, this is what the raw IP address was intended to be used for in the first place. This had a big impact on the adoption of our new platform, and in some cases offered added value too.

In obfuscation, identifying parts of data are either masked or removed so the people whom the data describe remain anonymous. This often removes the need for storing personal data at all. However, a crucial point is to preserve the analytical value of these records in order for them to stay useful.

Looking at different types of PII and how they’re used, we quickly observed patterns. For instance, the main use case of a full user agent string is to determine operating system, device type, and major version that are shared among many users. But a user agent can contain very detailed identifying information including screen resolution, fonts installed, clock skew, and other bits that can identify a data subject. So, during obfuscation, all identifying bits are removed and replaced with generalized aggregate level data that data analysts seek. The table below shows some examples of different PII types and how they’re obfuscated.

PII Type           | Raw Form                | Obfuscated
IP Address         | 207.164.33.12           | {"masked": "207.164.0.0", "geo_country": "Canada"}
User agent         | CPU iPhone OS 9_3_2 like Mac OS X) AppleWebKit/601.1.46 (KHTML, like Gecko) Mobile/13F69 Instagram 8.4.0 (iPhone7,2; iPhone OS 9_3_2; nb_NO; nb-NO; scale=2.00; 750x1334 | {"Family": "Instagram", "Major": "8", "Os.Family": "iOS", "Os.Major": "9", "Device.Brand": "Apple", "Device.Model": "iPhone7"}
Latitude/Longitude | 45.4215° N, 75.6972° W  | 45.4° N, 75.6° W
Email              | john@gmail.com          | REDACTED@gmail.com
Email              | behrooz@example.com     | REDACTED@REDACTED.com

Tokenization

Obfuscation is irreversible (the original PII is gone forever) and doesn’t suit every use case. There are times when data scientists require access to the actual raw data. To address these needs, we built a tokenization engine that exchanges PII for a consistent random token. We then store tokens in the data warehouse. A separate secured vault service is in charge of storing the token-to-PII mapping. This way, to effect deletion, only the mapping in the vault service needs removing, and all copies of the corresponding token across the data warehouse become effectively non-detokenizable (in other words, just random strings).

To understand the tokenization process better let’s go through an example. Let’s say Hooman is a big fan of AllBirds and GymShark products, and he purchases two pairs of shoes from AllBirds and a pair of shorts from GymShark to hit the trails! His purchase data might look like the table below before tokenization:

Email            | Shop     | Product       | ...
hooman@gmail.com | allbirds | Sneaker       |
hooman@gmail.com | Gymshark | Shorts        |
hooman@gmail.com | allbirds | Running Shoes |
After tokenization is applied, the table above will look like the one below:

Email    | Shop     | Product       | ...
Token123 | allbirds | Sneaker       |
Token456 | Gymshark | Shorts        |
Token123 | allbirds | Running Shoes |
There are two important observations in the after tokenization table:

  1. The same PII (hooman@gmail.com) was replaced by the same token (Token123) under the same data controller (the allbirds shop) and data subject (Hooman). This is the consistency property of tokens.
  2. On the other hand, the same PII (hooman@gmail.com) got a different token (Token456) under a different data controller (merchant shop) even though the actual PII remained the same. This is the multi-controller property of tokens and allows data subjects to exercise their rights independently among different data controllers (merchant shops). For instance, if Hooman wants to be forgotten or deleted from allbirds, that shouldn’t affect his history with Gymshark.

Now let’s take a look inside how all of this information is stored within our tokenization vault service shown in table below.

Data Subject     | Controller | Token    | PII
hooman@gmail.com | allbirds   | Token123 | hooman@gmail.com
hooman@gmail.com | Gymshark   | Token456 | hooman@gmail.com
...              | ...        | ...      | ...

The vault service holds token to PII mapping. It uses this context to decide whether to generate a new token for the given PII or reuse the existing one. The consistency property of tokens allows data scientists to perform analysis without requiring access to the raw data. For example, all orders of Hooman from GymShark could be tracked only by looking for Token456 across the orders tokenized dataset.

Now back to our original goal, let’s review how all of this helps with deletion of PII in our data warehouse. If the data in our warehouse is obfuscated and tokenized, essentially there will be nothing left in the data warehouse to delete after removing the mapping from the tokenization vault. To understand this let’s go through some examples of deletion requests and how it will affect our datasets as well as tokenization vault.

Data Subject     | Controller | Token    | PII
hooman@gmail.com | allbirds   | Token123 | hooman@gmail.com
hooman@gmail.com | Gymshark   | Token456 | hooman@gmail.com
hooman@gmail.com | Gymshark   | Token789 | 222-333-4444
eva@hotmail.com  | Gymshark   | Token011 | IP 76.44.55.33

Assume the table above shows the current content of our tokenization vault, and these tokens are stored across our data warehouse in multiple datasets. Now Hooman sends a deletion request to Gymshark (the controller) and subsequently Shopify (the data processor) receives it. At this point, all that’s required to delete Hooman’s PII under Gymshark is to locate rows matching the following condition and delete their token-to-PII mappings:

DataSubject == 'hooman@gmail.com' AND Controller == 'Gymshark'

Which results in deletion of the rows identified with a star (*) in the table below:

  Data Subject     | Controller | Token    | PII
  hooman@gmail.com | allbirds   | Token123 | hooman@gmail.com
* hooman@gmail.com | Gymshark   | Token456 | hooman@gmail.com
* hooman@gmail.com | Gymshark   | Token789 | 222-333-4444
  eva@hotmail.com  | Gymshark   | Token011 | IP 76.44.55.33

Similarly, if Shopify needed to delete all Hooman’s PII across all shops, it would need to only look for rows that have Hooman as the data subject, highlighted below:
  Data Subject     | Controller | Token    | PII
* hooman@gmail.com | allbirds   | Token123 | hooman@gmail.com
* hooman@gmail.com | Gymshark   | Token456 | hooman@gmail.com
* hooman@gmail.com | Gymshark   | Token789 | 222-333-4444
  eva@hotmail.com  | Gymshark   | Token011 | IP 76.44.55.33

Notice that in all of these examples, there was nothing to do in the actual data warehouse: once the token ↔ PII mapping is deleted, tokens effectively become meaningless random strings. In addition, all of these operations complete in fractions of a second, whereas doing any task in a petabyte-scale data warehouse can be very challenging, and time- and resource-consuming.
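
A toy version of the vault makes the consistency, multi-controller, and deletion properties above concrete. This is only a sketch (the real vault is a hardened service with its own datastore), but it captures the logic:

    require 'securerandom'

    class TokenizationVault
      def initialize
        @mappings = {} # (data_subject, controller, pii) -> token
      end

      # Consistency: the same PII under the same subject and controller always
      # yields the same token. Multi-controller: a different controller yields
      # a different token for the same PII.
      def tokenize(pii, subject:, controller:)
        @mappings[[subject, controller, pii]] ||= "Token#{SecureRandom.hex(8)}"
      end

      # Forget one subject under one controller: every copy of the dropped
      # tokens in the warehouse becomes a meaningless random string.
      def delete_for(subject:, controller:)
        @mappings.delete_if { |(s, c, _), _| s == subject && c == controller }
      end

      # Forget a subject across all controllers.
      def delete_subject(subject)
        @mappings.delete_if { |(s, _, _), _| s == subject }
      end
    end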

Schematization Platform Overview

So far we’ve learned about details of schematization, obfuscation, and tokenization. Now it’s time to put all of these pieces together in our analytical platform. The image below shows an overview of the journey of an event from when it’s fired until it’s stored in the data warehouse:

An overview of the journey of an event from when it’s fired until it’s stored in the data warehouse.

In this example:

  1. A SignUp event is triggered into the messaging pipeline (Kafka)
  2. A tool, Scrubber, intercepts the message in the pipeline and applies pseudonymisation on the content using the predefined schema fetched from the Schema Repository for that message
  3. The Scrubber identifies that the SignUp event contains tokenization operations too. It then sends the raw PII and Privacy Context to the Tokenization Vault.
  4. Tokenization Vault exchanges PII and Privacy Context for a Token and sends it back to the Scrubber
  5. Scrubber replaces PII in the content of the SignUp event with the Token
  6. The new anonymized and tokenized SignUp event is put back onto the message pipeline.
  7. The new anonymized and tokenized SignUp event is stored in the Data warehouse.
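
Steps 2 through 5 amount to a schema-driven dispatch per event. A sketch of that loop (method and field names are hypothetical):

    # Walk the event's schema; apply the handling each privacy block declares.
    def scrub(event, schema, vault)
      schema.fields.each do |name, field|
        next unless field.privacy # non-PII fields pass through untouched

        event[name] =
          case field.privacy.handling
          when :obfuscate
            obfuscate(event[name], field.privacy.pii_type)
          when :tokenize
            vault.tokenize(event[name],
                           subject: event.data_subject,
                           controller: event.data_controller)
          end
      end
      event
    end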

Lessons from Managing PII at Shopify Scale

Despite having a technical solution for classifying and handling PII in our data warehouse, Shopify’s scale made adoption and reprocessing of our historic data a difficult task. Here are some lessons that helped us in this journey.

Adoption

Having a solution versus adopting it are two different problems. Given the scale of Shopify, collaborating with all stakeholders to implement this new tooling required intentional engagement and productive communication, particularly in light of the significant changes proposed. Let’s review a few factors that significantly helped us.

Make the Wrong Thing the Hard Thing

Make the right thing the default option. A big factor in the success and adoption of our tooling was making it the default and easy option. Nowadays, creating and collecting unstructured analytical events at Shopify is difficult and goes through a tedious process with several layers of approval, whereas creating structured, privacy-aware events is a quick, well-documented, and automated task.

“Trust Me, It Will Work” Isn’t Enough!

Proving the scalability and accuracy of the proposed tooling was critical to building trust in our approach. To prove correctness, we used reconciliation, the same tooling and mechanisms the Data Science & Engineering team uses. We showed the scalability of our tooling by testing it on real datasets and stress testing under orders-of-magnitude higher load.

Make Sure the Tooling Brings Added Value

Our new tooling is not only the default and easy way to collect events, but also offers added value and benefits such as:

  • Shared privacy education: Our new schematization platform encourages asking about and discussing privacy concerns. They range from what’s PII to other topics like what can or can’t be done with PII. It brings clarity and education that wasn’t easily available before.
  • Increased dataset discoverability: Schemas for events allow us to automatically integrate with query engines and existing tooling, making datasets quick to be used and explored.

These examples are a big driver and encouragement in the adoption of our new tooling.

Capitalizing on Shared Goals

Schematization isn’t only useful for privacy reasons, it helps with reusability and observability, reduces storage cost, and streamlines common tasks for the data scientists too. Both privacy and data teams are important stakeholders in this project and it made collaboration and adoption a lot easier because we capitalized on shared goals across different teams in a large organization.

Historic Datasets

Historic datasets are the several petabytes of events collected in our data warehouse prior to the schematization platform.

There are intricate interdependencies among the analytical jobs that depend on these datasets. As with adoption, there’s no easy solution for this problem, but here are some practices that helped us mitigate the challenge.

Organizational Alignment

Any task of this scale goes beyond the affected individuals, projects, or even teams. Hence an organizational commitment and alignment is required to get it done. People, teams, priorities, and projects might change, but if there’s organizational support and commitment for addressing privacy issues, the task can survive. Organizational alignment helped us to put out consistent messaging to various team leads that meant everyone understood the importance of the work. With this alignment in place, it was usually just a matter of working with leads to find the right balance of completing their contributions in a timely fashion without completely disrupting their roadmap.

Dedicated Task Force

These kinds of projects are slow and time consuming. We understood the importance of having a team and project, so we didn’t depend on individuals. People come and go, but the project must carry on.

Tooling, Documentation, and Support

One of our goals was to minimize the amount of effort individual dataset owners and users needed to migrate their datasets to the new platform. We documented the required steps, built automation for tedious tasks, and created integrations with tooling that data scientists and librarians were already familiar with. In addition, having Engineering support available for hurdles was important. For instance, on many occasions when performance or other technical issues came up, Engineering support was available to solve the problem. Time spent on building the tooling, documentation, and support procedures easily paid off in the long run.

Regular Progress Monitoring

Regularly questioning dependencies, priorities, and blockers paid off because we often found better ways forward. For instance, in a situation where x is considered a blocker for y, maybe:

  • we can ask the team working on x to reprioritize and unblock y earlier.
  • both x and y can happen at the same time if the teams owning them align on some shared design choices.
  • there's a way to reframe x or y or both so that the dependency disappears.

We were able to do this kind of reevaluation because we had regular and constant progress monitoring to identify blockers.

New Platform Operational Statistics

Our new platform has been in production use for over two years. Nowadays, we have over 4500 distinct analytical schemas for events, each designed to capture certain metrics or analytics, and each with its own unique privacy context. On average, these schemas generate roughly 20 billion events per day, or approximately 230K events per second, with peaks of over 1 million events per second during busy times. Every single one of these events is processed by our obfuscation and tokenization tools in accordance with its privacy context before being accessible in the data warehouse or other places.

Our tokenization vault holds more than 500 billion distinct PII-to-token mappings (approximately 200 terabytes), from which tens to hundreds of millions are deleted daily in response to deletion requests. The magical part of this platform is that deletion happens instantaneously in the tokenization vault without requiring any operation in the data warehouse. This is the superpower that enables us to delete data that used to be very difficult to identify. These metrics proved the efficiency and scalability of our approach and new tooling.

As part of onboarding our historic datasets onto our new platform, we rebuilt roughly 100 distinct datasets (tens of petabytes of data in total) feeding hundreds of jobs in our analytical platform. Development, rollout, and reprocessing of our historical data altogether took about three years, with help from 94 different individuals, signifying the scale of effort and commitment that we put into this project.

We believe sharing the story of a metamorphosis in our data analytics platform is valuable because when we looked for industry examples, there were very few available. In our experience, schematization and a platform to capture the context including privacy and evolution is beneficial in analytical event collection systems. They enable a variety of opportunities in treating sensitive information and educating developers and data scientists on data privacy. In fact, our adoption story showed that people are highly motivated to respect privacy when they have the right tooling at their disposal.

Tokenization and obfuscation proved to be effective tools in helping with handling, tracking and deletion of personal information. They enabled us to efficiently delete data at a very large scale.

Finally, we learned that solving technical challenges isn’t the entire problem. It remains a tough problem to address organizational challenges such as adoption and dealing with historic datasets. We learned that bringing new value, capitalizing on shared goals, streamlining and automating processes, and having a dedicated task force to champion these kinds of big cross team initiatives are effective and helpful techniques.

Additional Information

Behrooz is a staff privacy engineer at Shopify where he works on building scalable privacy tooling and helps teams respect privacy. He received his MSc in Computer Science at the University of Waterloo in 2015. Outside of the binary world, he enjoys being upside down (gymnastics) 🤸🏻, on a bike 🚵🏻‍♂️, on skis ⛷, or in the woods. Twitter: @behroozshafiee

Shipit! Presents: Deleting the Undeletable

On September 29, 2021, Shipit!, our monthly event series, presented Deleting the Undeletable. Watch Behrooz Shafiee and Jason White as they discuss the new schematization platform and answer your questions.


Wherever you are, your next journey starts here! If building systems from the ground up to solve real-world problems interests you, our Engineering blog has stories about other challenges we have encountered. Intrigued? Visit our Engineering career page to find out about our open positions and learn about Digital by Default.

Continue reading

Updates on Shopify’s Bug Bounty Program

For three years we, Shopify’s Application Security team, have set aside time to reflect on our bug bounty program and share recent insights. This past year has been quite a ride, as our program has been busier than ever! We’re excited to share what we have learned, and share some of the great things we have planned!

Continue reading

Vouching for Docker Images

If you were using computers in the ‘90s and the early 2000s, you probably had the experience of installing a piece of software you downloaded from the internet, only to discover that someone had put something nasty into it, and now you’re dragging your computer to IT to beg them to save your data. To remedy this, software developers started “signing” their software in a way that proved both who they were and that nobody tampered with the software after they released it. Every major operating system now supports code or application signature verification, and it’s a backbone of every app store.

But what about Kubernetes? How do we know that our Docker images aren’t secret bitcoin miners, stealing traffic away from customers to make somebody rich? That’s where Binary Authorization comes in. It’s a way to apply the code signing and verification that modern systems now rely on to the cloud. Coupled with Voucher, an open source project started by my team at Shopify, we’ve created a way to prevent malicious software from being installed without making developers miserable or forcing them to learn cryptography.

Why Block Untrusted Applications?

Your personal or work computer getting compromised is a huge deal! Your personal data being stolen or your computer becoming unusable due to popup ads or background processes doing tasks you don’t know about is incredibly upsetting.

But imagine if you used a compromised service. Imagine if your email host ran in Docker containers in a cluster alongside a malicious service that wanted to access the contents of the email databases. This isn’t just your data, but the data of everyone around you.

This is something we care about deeply at Shopify, since trust is a core component of our relationship with our merchants and our merchants’ relationships with their customers. This is why Binary Authorization has been a priority for Shopify since our move to Kubernetes.

What is Code Signing?

Code signing starts by taking a hash of your application. Hashes are made with hashing algorithms that take the contents of something (such as the binary code that makes up an application) and make a short, reproducible value that represents that version. A part of the appeal of hashing algorithms is that it takes an almost insurmountable amount of work (provided you’re using newer algorithms) to find two pieces of data that produce the same hash value.

For example, if you have a file that has the text:

Hello World

The hash representation of that (using the “sha256” hashing algorithm) is:

d2a84f4b8b650937ec8f73cd8be2c74add5a911ba64df27458ed8229da804a26

Adding an exclamation mark to the end of our file:

Hello World!

Results in a completely different hash:

03ba204e50d126e4674c005e04d82e84c21366780af1f43bd54a37816b6ab340
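
You can reproduce these values yourself; assuming the text lives in a file with a trailing newline (the newline is part of what gets hashed), on most systems something like:

    $ printf 'Hello World\n' > hello.txt
    $ shasum -a 256 hello.txt
    d2a84f4b8b650937ec8f73cd8be2c74add5a911ba64df27458ed8229da804a26  hello.txt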

Once you have a hash of an application, you can run the same hashing algorithm on it to ensure that it hasn’t changed. While this is an important part of code signing, most signing applications will automatically create a hash of what you are signing, rather than requiring you to hash and then sign the hash separately. This makes hash creation and verification transparent to developers and their users.

Once the initial release is ready, the developer that’s signing the application creates a public and private key for signing it, and shares the public key with their future users. The developer then uses the private part of their signing key and the hash of the application to create a value that can be verified with the public part of the key.

For example, with Minisign, a tool for creating signatures quickly, first we create our signing key:
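
    $ minisign -G   # generates a new key pair, prompting for a password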

The public half of the key is now:

RWSs3jHbeTsmYhWlyqpDEufCe5QSGHsb1fFnglZItPwDfJ3wEZzSGyBJ

And the private half remains private, living in /Users/caj/.minisign/minisign.key.

Now, if our application was named “hello” we can create a signature with that private key:
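
    $ minisign -Sm hello   # writes the signature to hello.minisig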

And then your users could verify that “hello” hasn’t been tampered with by running:
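
    $ minisign -Vm hello -P RWSs3jHbeTsmYhWlyqpDEufCe5QSGHsb1fFnglZItPwDfJ3wEZzSGyBJ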

Unless you’re a software developer or power user, you likely have never consciously verified a signature, but that’s what’s happening behind the scenes if you’re using a modern operating system. 

Where is Code Signing Used?

Two of the biggest users of code signing are Apple and Google. They use code signing to ensure that you don’t accidentally install software updates that have been tampered with or malicious apps from the internet. Signatures are usually verified in the background, and you only get notified if something is wrong. In Android, you can turn this off by allowing unknown apps in the phone's settings, whereas iOS requires the device be jailbroken to allow unsigned applications to be installed.

A macOS dialog window showing that Firefox is damaged and can’t be opened, giving users the option of moving it to the Trash.

In macOS, applications that are missing their developer signatures or don’t have valid signatures are blocked by the operating system and advise users to move them to the Trash.

Most Linux package managers (such as APT/dpkg in Debian and Ubuntu, or Pacman in Arch Linux) use code signing to ensure that you’re installing packages from the distribution maintainer, and they verify those packages at install time.

Docker Hub showing a docker image created by the author.

Unfortunately, Kubernetes doesn’t have this by default. There are features that allow you to leverage code signing, but chances are you haven’t used them.

And at the end of the day, do you really trust some rando on the internet to actually give you a container that does what it says it does? Do you want to trust that for your organization? Your customers?

What is Binary Authorization?

Binary Authorization is a series of components that work together: 

  • A metadata service: a service that stores signatures and other image metadata
  • A Binary Authorization Enforcer: a service that blocks images that it can’t find valid signatures for
  • A signing service: a system that signs new images and stores those signatures in the metadata service.

Google provides the first two services for their Kubernetes servers, which Shopify uses, based on two open source projects:

  • Grafeas, a metadata service
  • Kritis, a Binary Authorization Enforcer

When using Kritis and Grafeas or the Binary Authorization feature in Google Kubernetes Engine (GKE), infrastructure developers will configure policies for their clusters, listing the keys (also referred to as attestors) that must have signed the container images before they can run.
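
On GKE, such a policy is expressed in YAML. A minimal sketch might look like this (the project, attestor, and image names here are hypothetical):

    # Illustrative Binary Authorization policy; names are hypothetical.
    admissionWhitelistPatterns:
      - namePattern: gcr.io/my-project/exempt-image*  # images exempt from the policy
    defaultAdmissionRule:
      evaluationMode: REQUIRE_ATTESTATION
      enforcementMode: ENFORCED_BLOCK_AND_AUDIT_LOG
      requireAttestationsBy:
        - projects/my-project/attestors/built-by-ci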

When new resources are started in a Kubernetes cluster, the images they reference are sent to the Binary Authorization Enforcer. The Enforcer connects to the metadata service to verify the existence of valid signatures for the image in question and then compares those signatures to the policy for the cluster it runs in. If the image doesn’t have the required signatures, it’s blocked, and any containers that would use it won’t start.

You can see how these two systems work together to provide the same security that one gets in one’s operating system! However, there’s one piece that wasn’t provided by Google until recently: the signing service.

Voucher: The Missing Piece

Voucher serves as the last piece for Binary Authorization, the signing service. Voucher allows Shopify to run security checks against our Docker images and sign them depending on how secure they are, without requiring that non-security teams manage their signing keys.

Using Voucher's client software to check an image with the 'is_shopify' check, which verifies if the image was from a Shopify owned repository.
Using Voucher's client software to check an image with the 'is_shopify' check, which verifies if the image was from a Shopify owned repository.

The way it works is simple:

  1. Voucher runs in Google Cloud Run or Kubernetes and is accessible as a REST endpoint
  2. Every build pipeline automatically calls to Voucher with the path to the image it built
  3. Voucher reviews the image, signs it, and pushes that signature to the metadata service

On top of the basic code signing workflow discussed previously, Voucher also supports validating more complicated requirements, using separate security checks and associated signing keys. Required signatures can be mixed and matched on a per-cluster basis to create distinct policies based on each cluster’s requirements.

For example, do you want to block images that weren’t built internally? Voucher has a distinct check that verifies an image is associated with a Git commit in a GitHub repo you own, and it signs those images with a separate key.

Alternatively, do you need to be able to prove that every change was approved by multiple people? Voucher can support that, creating signatures based on the existence of approvals in Github (with support for other code hosting services coming soon). This would allow you to use Binary Authorization to block images that would violate that requirement.

Voucher also has support for verifying the identity of the container builder, blocking images with a high number of vulnerabilities, and so on. And Voucher was designed to be extensible, allowing for the creation of new checks as needed.

By combining Voucher’s checks and Binary Authorization policies, infrastructure teams can create a layered approach to securing their organization’s Kubernetes clusters. Compliance clusters can be configured to require approvals and block images with vulnerabilities, while clusters for experiments and staging can use less strict policies to allow developers to move faster, all with minimal work from non-security-focused developers.

Voucher Joins Grafeas

As mentioned earlier, Voucher serves a need that wasn’t provided by Google until recently. That’s because Voucher has moved into the Grafeas organization and is now a service provided by Google to Google Kubernetes Engine users going forward.

Since our move to Kubernetes, Shopify’s security team has been working with Google’s Binary Authorization team to plan out how we’ll roll out Binary Authorization and design Voucher. We also released Voucher as an open source project in December 2018. This move to the Grafeas project simplifies things, putting it in the same place as the other open source Binary Authorization components.

Improving the security of the infrastructure we build makes everyone safer. And making Voucher a community project will put it in front of more teams which will be able to leverage it to further secure their Kubernetes clusters, and if we’re lucky, will result in a better, more powerful Voucher! Of course, Shopify’s Software Supply Chain Security team will continue our work on Voucher, and we want you to join us!

Please help us!

If you’re a developer or writer who has time and interest in helping out, please take a look at the open issues or fork the project and open a PR! We can always use more documentation changes, tutorials, and third party integrations!

And if you’re using Voucher, let us know! We’d love to hear how it’s going and how we can do a better job of making Kubernetes more secure for everyone!


Wherever you are, your next journey starts here! If building systems from the ground up to solve real-world problems interests you, visit our Engineering career page to find out about our open positions and learn about Digital by Default.

Continue reading

Bug Bounty Year in Review 2019

For the third year in a row, we’ve taken time to reflect on our Bug Bounty program. This past year was an exciting one for us because we ran multiple experiments and made a number of process improvements to increase our program speed. 

2020 Program Improvements

Building on our program’s continued success in 2019, we’re excited to announce more improvements. 

Bounties Paid in Full Within 7 Days

As of today, we pay bounties in full within 7 days of a report being triaged. Paying our program minimum on triage has been a resounding success for us and our hackers. After having experimented with paying full bounties on triage in Shopify-Experiments (described below), we’ve decided to make the same change to our public program.

Maximum Bounty is Now $50,000

We are increasing our maximum bounty amount to $50,000. Beginning today, we are doubling the bounty amounts for valid reports of Arbitrary Code Execution, now $20K–$50K, SQL Injection, now $20K–$40K, and Privilege Escalation to Shop Owner, now $10K–$30K. Trust and security is our number one priority at Shopify and these new amounts demonstrate our commitment to both.

Surfacing More Information About Duplicate Reports

Finally, we know how important it is for hackers to trust the programs they choose to work with. We value that trust. So, beginning today, anyone who files a duplicate report to our program will be added to the original report, when it exists within HackerOne. We're continuing to explore ways to share information about internally known issues with hackers and hope to have a similar announcement later this year.

Learning from Bug Bounty Peers

Towards the end of 2018, we reached out to other bug bounty programs to share experiences and lessons learned. This was amazing. We learned so much chatting with our peers and those conversations gave us better insight into improving our data analytics and experimenting with a private program.

Improving Our Analytics

At Shopify, we make data-informed decisions and our bug bounty program is no exception. However, HackerOne platform data only gives us insight into what hackers are reporting and when; it doesn’t tell us who is testing what and how often. Discussing this problem with other programs revealed how some had already tackled this obstacle; they were leveraging provisioned accounts to understand their program funnel, from invitation, to registration, to account creation, and finally testing. Hearing this, we realized we could do the same.

To participate in our bug bounty program, we have always required hackers to register for an account with a specific identifier (currently a @wearehackerone.com email address). Historically, we used that registration requirement for investigating reports of suspicious platform activity. However, we realized that the same data could tell us how often people are testing our applications. Furthermore, with improvements to the HackerOne API and the ability to export all of our report data regularly, we have all the data necessary to create exciting activity reports and program trends. It’s also given us more knowledge to share in our monthly program recap tweets.

Shopify-Experiments, A Private Bug Bounty Program

Chatting with other programs, we also shared ideas about what is and isn’t working. We heard about some having success running additional private programs. Naturally, we launched a private bug bounty program to test the return on investment. We started Shopify-Experiments in mid-2019 and invited high signal, high impact hackers who have reported to our program previously or who have a proven track record on the HackerOne platform. The program allowed us to run controlled experiments aimed at improving our public program. For example, in 2019, we experimented with:

  • expanding the scope to help us better understand the workload implications
  • paying bounties in full after validating and triaging a report
  • making report disclosure mandatory and adding hackers to duplicate reports
  • allowing for self-closing reports that were submitted in good faith, but were false positives
  • increasing opportunities to collaborate with Shopify third party developers to test their apps.

These experiments had immediate benefits for our Application Security Team and the Shopify public program. For instance, after running a controlled experiment with an expanded scope, we understood the workload it would entail in our public program. So, on September 11, 2019, we added all but a few Shopify-developed apps into the scope of our public program. Since then, we’ve received great reports about these new assets, such as Report 740989 from Vulnh0lic, which identified a misconfiguration in our OAuth implementation for the Shopify Stocky app. If you’re interested in being added to the program, all it takes is 3 resolved Shopify reports with an overall signal of 3.0 or more in our program.

Improving Response Times with Automation

In 2018, our average initial response time was 17 hours. In 2019, we wanted to do better. Since we use a dedicated Slack channel to manage incoming reports, it made sense to develop a chatbot and use the HackerOne API. In January last year, we implemented HackerOne API calls to change report states, assign reports, post public and private comments as well as suggest bounty amounts.

Immediately, this gave us the ability to respond to reports from mobile devices. However, our chosen syntax was difficult to remember. For example, changing a report state was done via the command hackerone change_state <report_id> <state>. Responding with an auto response was hackerone auto_respond <report_id> <state> <response_id>. To make things easier, we introduced shorthands and emoji responses. Now, instead of typing hackerone change_state 123456 not-applicable, we can use h1 change_state 123456 na. For common invalid reports, we react with emojis which post the appropriate common response and close the report as not applicable.

2019 Bug Bounty Statistics

Knowing how important communication is to our hackers, we continue to pride ourselves on all of our response metrics being among the best on HackerOne. For another straight year, we reduced our communication times. Including weekends, our average time to first response was 16 hours compared to 1 day and 9 hours in 2018. This was largely a result of being able to quickly close invalid reports on weekends with Slack. We reduced our average time to triage from 3 days and 6 hours in 2018 to 2 days and 13 hours in 2019.

We were quicker to pay bounties and resolve bugs; our average time to bounty from submission was 7 days and 1 hour in 2019 versus 14 days in 2018. Our average resolution time from time of triage was down to 20 days and 3 hours from 48 days and 15 hours in 2018. Lastly, we thanked 88 hackers in 2019, compared to 86 in 2018.

Average Shopify Response Times - Hours vs. Years


We continued to request disclosure on our resolved bugs. In 2019, we disclosed 74 bugs, up from 37 in 2018. We continue to believe it’s extremely important that we build a resource library to enable ethical hackers to grow in our program. We strongly encourage other companies to do the same.

Reports Disclosed - Number of Reports vs. Year

In contrast to our speed improvements and disclosures, our bounty related statistics were down from 2018, largely a result of having hosted H1-514 in October 2018, which paid out over $130,000 to hackers. Our total amount paid to hackers was down to $126,100 versus $296,400 in 2018, despite having received approximately the same number of reports; 1,379 in 2019 compared to 1,306 in 2018.

Bounties Paid - Bounties Awarded vs. Year

Number of Reports by Year - Number of Reports vs. Year

Report States by Year - Number of Reports vs. Year

Similarly, our average bounty awarded was also down in 2019, $1,139 compared to $2,052 in 2018. This is partly attributed to the amazing bugs found at H1-514 in October 2018 and our decision to merge the Shopify Scripts bounty program, which had a minimum bounty of $100, to our core bounty program in 2019. We rewarded bounties to fewer reports; 107 in 2019 versus 182 in 2018.

After another successful year in 2019, we’re excited to work with more hackers in 2020. If you’re interested in helping to make commerce more secure, visit hackerone.com/shopify to start hacking or our careers page to check out our open Trust and Security positions.

Happy Hacking.
- Shopify Trust and Security

Continue reading

Building Shopify’s Application Security Program

Building Shopify’s Application Security Program

Shopify builds products for an industry based on trust. From product discovery to purchase, we act as a broker of trust between the 800,000+ merchants who run their business on our platform and their customers, who come from anywhere in the world. That’s why it’s critical that everyone at Shopify understands the importance of trust in everything we build.

Security is a non-negotiable priority, and we’ve purposefully built a security mindset into our culture. It gives our security team a huge advantage because we start with engaged, talented, and security-minded members across our product teams. But, we also know how important it is that every business on our platform has access to the latest and most innovative features to help them be successful. So, the question is: How do we build an application security program that encourages safety at high speed, removes complexities, and fosters an environment for creative problem solving so that everyone can focus on delivering amazing products to our merchants?

There are three parts to our program that I will outline in this post: scaling secure applications, scaling security teams, and scaling security interactions. When I started at Shopify 7 years ago, I was the lone employee focused on security. Since then, we have grown to a team of dozens of security engineers, covering the breadth of Shopify’s applications, infrastructure, integrations, and hardware platform.

Scaling Secure Applications

As your company grows, the number of different applications and services that will be deployed will inevitably increase. For a small team, it can be daunting to think about providing security for many more services than there are team members, but there are ways to wrangle this sprawl and set your company up with trust at scale.

The first recommendation is to work across R&D disciplines (engineering, data, and UX) and decide on a homogeneous technical baseline for your services. There are plenty of non-security advantages to doing this, so the appetite for standardization should be present already. For Shopify, deciding that we would default to building all of our products in Ruby on Rails meant that our security tooling could go deep on the security concerns of Rails without thinking about any other web application framework. We made similar technical choices up and down the stack (databases, routing, caching, and configuration management), which simplified the developer experience and allowed us to ignore security concerns everywhere except in the things we knew we ran.

Knowing what you are running is a lot harder than it sounds, but it is key to achieving security success at high speed. The way this is done will look different in every organization, but the objective will be the same: visibility. When a new vulnerability is announced, you need visibility into what needs to be patched and the ability to notify the responsible team or automatically kickstart the patching process for every affected service. At Shopify, our security team joined our Production Engineering team’s service tracking project and got a massive head start into having observability of the services, dependencies, and code of everything running in our environment, including the ability to automatically update application dependencies.
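To make "visibility" concrete, here's the kind of query that service tracking enables when an advisory lands. This is a hypothetical sketch: ServiceRegistry, its fields, and notify are stand-ins for whatever tracking system you build, not real tooling.

    require "rubygems" # Gem::Version for proper version comparison

    # Hypothetical: given an advisory against a dependency, find every
    # service running a vulnerable version and notify its owning team.
    PATCHED = Gem::Version.new("5.2.1")

    affected = ServiceRegistry.services.select do |service|
      service.dependencies.any? do |dep|
        dep.name == "rails" && Gem::Version.new(dep.version) < PATCHED
      end
    end

    affected.each do |service|
      notify(service.owning_team, "Advisory: upgrade rails to >= 5.2.1 in #{service.name}")
    end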

Additionally, every new application gets to start with the best defaults we have come up with to this point because we have collectively started hundreds of new projects with the same framework, in the same environment, and using the same technology.

Scaling Security Teams

In a start-up, product direction must be fluid and adapt quickly based on the discovery of new information to keep the company growing. Unless security features are differentiating your product from competitors, investing in a security team isn’t usually the top growth priority. For me, it took over a year before we hired our second security team member. This meant I wore a lot of hats and used some of the tactics described above to ensure a security foundation was included in all new product development.

Growing our security team meant carving off specializations to the first few people we hired. Fraud, application security, infrastructure security, networking, and anti-abuse all started as one-person teams going deep into a particular aspect of the overall security program and feeding their lessons back into the teams across the company.

You also need to understand your options for targeted activities and where third-party services can be used to advance your security agenda. Things like penetration testing, bug bounty programs, and auditing can be used as external validation on a time- and budget-limited basis.

No matter the size of the security team, responding to a security incident is everyone's responsibility. Having relationships with teams across the organization helps get the right people moving quickly when you're faced with an urgent situation or a high severity risk to mitigate. The security team should never be left to fix high priority issues on its own. There are always ways to embed security priorities within projects already being worked on, and maintaining a list of long-term security enhancements that are ready to be picked up is an invaluable way to make things better without the overhead of staffing an entire team.

Scaling Security Interactions

Security teams are renowned for being slow, inconsistent, and risk-averse. In trying to defeat each of those stereotypes, the path to success is to be fast, automated, and risk-aware. The way your security team interacts with the rest of the company is the most important part of consistently building secure products for the long-term.

Deploying security tripwires at the testing and code repository levels allows your team to define dangerous methods and detect unwanted patterns as they are committed. The moment a developer is writing code is the best time to course-correct towards a more secure implementation. To be effective, flagging a security risk should be designed like any good production alert: timely, high-fidelity, actionable, and with a low false positive rate.

Helped by the success of the approaches discussed so far, we can build these tactics once and deploy them across all of our codebases. With these tactics in place, you gain confidence that even when an application is totally off your radar, it's being built in line with your security standards. An example of this approach at Shopify is how we handle html_safe. In Rails, html_safe is a confusingly named method that renders a given string as unescaped HTML, which can be quite unsafe and lead to cross-site scripting (XSS) vulnerabilities. Our approach to solving this problem consists of renaming the method to dangerously_output_as_html so it's clear what it does, adding a comment to any pull request using the method that links to our training materials on mitigating XSS, and triggering an alert to our Application Security (appsec) team so they can review the proposed change and suggest a safer alternative. This allows our application security team to focus on the exponential benefit of automation rather than the linear benefit of human reviews.
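As a rough sketch, the pull request half of that tripwire can be as simple as scanning a diff. The helper functions and training link below are illustrative placeholders, not our internal tooling:

    # Scan the lines a pull request adds for the renamed method and, on a
    # hit, comment with training material and alert the appsec team.
    # comment_on_pull_request and alert_appsec_team are hypothetical helpers.
    TRAINING_URL = "https://internal.example.com/xss-training" # placeholder

    added_lines = `git diff origin/master...HEAD --unified=0`.lines
                    .select { |line| line.start_with?("+") }
    hits = added_lines.grep(/dangerously_output_as_html/)

    unless hits.empty?
      comment_on_pull_request(
        "This change uses dangerously_output_as_html, which bypasses HTML " \
        "escaping. Please review #{TRAINING_URL} on mitigating XSS."
      )
      alert_appsec_team(hits) # a human reviews and suggests a safer approach
    end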

Finally, our best security interactions are the ones we don’t need to have. For example, by making risk decisions at the infrastructure level, we can provide a trustworthy security baseline with our built-in safeguards and tripwires to the teams deploying applications running in that infrastructure without them even knowing those protections are there.

These are just a few of the ways we are tackling the problem of security at scale. Our team is always on the lookout for new ideas and people to join our team to help protect the hundreds of thousands of businesses running on our platform. If these sound like the kinds of problems you want to solve, check out these available positions: Director of Security Engineering, Security Engineering Manager, and Lead Software Engineer - Mobile Security.

Continue reading

One Million Dollars in Bug Bounties

One Million Dollars in Bug Bounties

Today, we’re excited to announce that we’ve awarded over $1M USD in bounties through our bounty programs. At Shopify, bounty programs complement our security strategy and allow us to leverage a community of researchers who help secure our platform. They each bring their own perspective and specialties, and can evaluate our platform from thousands of different viewpoints to create a better Shopify product and a better user experience for the 800,000+ businesses we safeguard. Our ongoing investment is a clear indication that we’re committed to security and to making commerce secure for everyone.

Some Bug Bounty Stats

Shopify is the fifth public program, out of 176, to reach the $1M USD milestone on HackerOne, our bug bounty platform. We’ve had some amazing reports and worked with awesome hackers over the last four years. Here are some stats to put it into perspective:

Shopify's Bug Bounty Program Stats: Highest Bounty Award $25K. 400+ Hackers Thanked. 950+ Bugs Resolved. 750+ Bounties Awarded. 375+ Public Disclosures. Held 2 Live Events
Statistics about Shopify's Bug Bounty Programs Since Inception

Top Three Interesting Bugs

Shopify is dedicated to publicly disclosing all vulnerability reports discovered through our program to propel industry education and we strongly encourage other companies to do the same. Three of our most interesting resolved bugs over the years are:

1. SSRF in Exchange leads to ROOT access in all instances - Bounty: $25,000 

Shopify’s infrastructure is isolated into subsets. @0xacb reported it was possible to gain root access to any container in one particular subset by exploiting a server-side request forgery bug in the screenshotting functionality of Shopify Exchange. Within an hour of receiving the report, we disabled the vulnerable service, began auditing applications in all subsets, and started remediating across all our infrastructure. The vulnerable subset did not include Shopify core. After auditing all services, we fixed the bug by deploying a metadata concealment proxy to disable access to metadata information. We also disabled access to internal IPs on all infrastructure subsets.
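To illustrate the class of mitigation (this is a minimal sketch, not Shopify's actual proxy), a screenshotting service can refuse to fetch anything that resolves to internal address space, including the cloud metadata address at 169.254.169.254:

    require "resolv"
    require "ipaddr"
    require "uri"

    # Address ranges a screenshot fetcher should never reach: loopback,
    # RFC 1918 private space, and link-local (home of 169.254.169.254,
    # the cloud metadata endpoint).
    BLOCKED_RANGES = %w[
      127.0.0.0/8 10.0.0.0/8 172.16.0.0/12 192.168.0.0/16 169.254.0.0/16
    ].map { |cidr| IPAddr.new(cidr) }

    def safe_to_fetch?(url)
      host = URI.parse(url).host
      address = IPAddr.new(Resolv.getaddress(host)) # resolve before fetching
      BLOCKED_RANGES.none? { |range| range.include?(address) }
    rescue Resolv::ResolvError, IPAddr::InvalidAddressError
      false # if we can't resolve or parse it, don't fetch it
    end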

2. Shopify admin authentication bypass using partners.shopify.com - Bounty: $20,000

@uzsunny reported that by creating two partner accounts sharing the same business email, it was possible to be granted “collaborator” access to a store. We tracked down the bug to incorrect logic in a piece of code that was meant to automatically convert an existing normal user account into a collaborator account. The intention was that, when a partner had a valid user account on the store, their collaborator account request could be accepted automatically, with the user account converted into a collaborator account. We fixed this issue by properly verifying that the existing account is in fact a user account.
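In spirit, the fix boils down to checking what kind of account was matched before converting it. This is a hypothetical sketch of that logic, not the actual Shopify code:

    # Hypothetical sketch: only auto-convert when the matched record really
    # is an existing user account on this store, not just any record that
    # shares the partner's business email.
    def grant_collaborator_access(store, partner)
      account = store.accounts.find_by(email: partner.business_email)

      # The bug was matching on email alone; the fix verifies the type.
      return request_manual_approval(store, partner) unless account&.user_account?

      account.convert_to_collaborator!
    end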

3. Stored cross-site scripting in Shopify admin and partner pages - Bounty: $5,000

@bored-engineer found we were incorrectly sanitizing sales channel icon SVG files uploaded by Partner accounts. During our remediation, we noted the XSS would execute in partners.shopify.com and the Shopify admin panel, which increased the impact of this bug. The admin functionality was not required, so it was removed. Additionally, we verified that the bug had not been exploited by any other users.
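Illustrating the general shape of the fix (a minimal sketch, not the exact sanitizer we deployed), uploaded SVGs can be stripped of script elements and event-handler attributes before they are ever rendered:

    require "nokogiri"

    # Illustrative SVG sanitizer: drop <script> elements, on* event
    # handlers, and javascript: URLs from an uploaded icon.
    def sanitize_svg(svg_xml)
      doc = Nokogiri::XML(svg_xml)
      doc.css("script").each(&:unlink) # remove <script> outright
      doc.traverse do |node|
        next unless node.element?
        node.attribute_nodes.each do |attr|
          dangerous = attr.name.downcase.start_with?("on") ||
                      attr.value.to_s.strip.downcase.start_with?("javascript:")
          node.remove_attribute(attr.name) if dangerous
        end
      end
      doc.to_xml
    end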

Shopify x HackerOne H1-514

Having reached the $1M in awarded bounties, we’re still looking for ways to ensure our program remains competitive and attractive to hackers. This year we’ll be experimenting with new ways to drive hacker engagement and make Shopify’s bug bounty program more lucrative and attractive to hack on.

Happy Hacking!


If you’re interested in helping to make commerce more secure, visit Shopify on HackerOne to start hacking or our career page to check out our open Trust and Security positions.

Continue reading

Bug Bounty Year in Review 2018

Bug Bounty Year in Review 2018

With 2018 coming to a close, we thought it a good opportunity to once again reflect on our Bug Bounty program. At Shopify, our bounty program complements our security strategy and allows us to leverage a community of thousands of researchers who help secure our platform and create a better Shopify user experience. This was the fifth year we operated a bug bounty program, the third on HackerOne and our most successful to date (you can read about last year’s results here). We reduced our time to triage by days, got hackers paid quicker, worked with HackerOne to host the most innovative live hacking event to date and continued contributing disclosed reports for the bug bounty community to learn from.

Our Triage Process

In 2017, our average time to triage was four days. In 2018, we shaved that down to 10 hours, despite receiving largely the same volume of reports. This reduction was driven by our program's core commitment to speed. With 14 members on the Application Security team, we're able to dedicate one team member each week to HackerOne triage.

When someone is the dedicated “triager” for the week at Shopify, that becomes their primary responsibility with other projects becoming secondary. Their job is to ensure we quickly review and respond to reports during regular business hours. However, having a dedicated triager doesn't preclude others from watching the queue and picking up a report.

When we receive a report that isn't N/A or spam, we validate it before triaging and open an internal issue, since we pay $500 as soon as a report is triaged on HackerOne. We self-assign reports on the HackerOne platform so other team members know a report is being worked on. The actual validation process depends on the severity of the issue:

  • Critical: We replicate the behavior and confirm the vulnerability, page the on-call team responsible and triage the report on HackerOne. This means the on-call team will be notified immediately of the bug and Shopify works to address it as soon as possible.
  • High: We replicate the behavior and ping the development team responsible. This is less intrusive than paging but still a priority. Collaboratively, we review the code for the issue to confirm it's new and triage the report on HackerOne.
  • Medium and Low: We’ll either replicate the behavior and review the code, or just review the code, to confirm the issue. Next, we review open issues and pull requests to ensure the bug isn't a known issue. If there are clear security implications, we'll open an issue internally and triage the report on HackerOne. If the security implications aren't clear, we'll err on the side of caution and discuss with the responsible team to get their input about whether we should triage the report on HackerOne.

This approach allows us to quickly act on reports and mitigate critical and high impact reports within hours. Medium and Low reports can take a little longer, especially where the security implications aren't clear. Development teams are responsible for prioritizing fixes for Medium and Low reports within their existing workloads, though we occasionally check in and help out.
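Sketched as code, the routing looks roughly like this (the method names are hypothetical, not our internal tooling):

    # Hypothetical sketch of severity-based triage routing.
    def validate_and_route(report)
      case report.severity
      when :critical
        replicate!(report)                    # confirm the vulnerability
        page_on_call(report.responsible_team) # immediate page; fix ASAP
        triage_on_hackerone(report)
      when :high
        replicate!(report)
        ping(report.responsible_team)         # less intrusive than paging
        review_code_with_team(report)         # confirm the issue is new
        triage_on_hackerone(report)
      else # :medium and :low
        confirm_via_code_review(report)
        return unless new_issue?(report)      # check open issues and PRs
        if clear_security_implications?(report)
          open_internal_issue(report)
          triage_on_hackerone(report)
        else
          discuss_with_responsible_team(report) # err on the side of caution
        end
      end
    end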

H1-514

Shopify x HackerOne H1-514
H1-514 in Montreal

In October, we hosted our second live hacking event, H1-514, the first live hacking event held at our office in Montreal, Quebec. We welcomed over 40 hackers to our office to test our systems. To build on our program's core principles of responsiveness, transparency, and timely payouts, we wanted to do things differently than other HackerOne live hacking events. As such, we worked with HackerOne on a few firsts for live hacking events:

  • While other events opened submissions the morning of the event, we opened submissions when the target was announced to be able to pay hackers as soon as the event started and avoid a flood of reports
  • We disclosed resolved reports to participants during the event to spark creativity instead of leaving this to the end of the event when hacking was finished
  • We used innovative bonuses to reward creative thinking and hard work from hackers testing systems that are very important to Shopify (e.g. GraphQL, race conditions, oldest bug, regression bonuses, etc.) instead of awarding more money for the number of bugs people found
  • We gave hackers shell access to our infrastructure and asked them to report any bugs they found. While none were reported at the event, the experience and feedback informed a continued Shopify infrastructure bounty program and the Kubernetes product security team's exploration of their own bounty program.

Shopify x HackerOne H1-514
H1-514 in Montreal

When we signed on to host H1-514, we weren't sure what value we'd get in return, since we already run an open bounty program with competitive bounties. The hackers didn't disappoint: we received over 50 valid vulnerability reports, a few of which were critical. Reflecting on this, we attribute the success to a few factors:

  • We ship code all the time. Our platform is constantly evolving so there's always something new to test; it's just a matter of knowing how to incentivize the effort for hackers (You can check the Product Updates and Shopify News blogs if you want to see our latest updates).
  • There were new public disclosures affecting software we use. For example, Tavis Ormandy's disclosure of a Ghostscript remote code execution vulnerability reachable through ImageMagick, which hacker Frans Rosén used in a report during the event.
  • Using bonuses to incentivize hackers to explore the more complex and challenging areas of the bounty program. Bonuses included GraphQL bugs, race conditions and the oldest bug, to name a few.
  • Accepting submissions early allowed us to keep hackers focused on eligible vulnerability types and avoid them spending time on bugs that wouldn't be rewarded. This helped us manage expectations throughout the two weeks, keep hackers engaged and make sure everyone was using their time effectively.
  • We increased our scope. We wanted to see what hackers could do if we added all of our properties into the scope of the bounty program and whether they'd flock to new applications looking for easier-to-find bugs. However, despite the expanded scope, we still received a good number of reports targeting mature applications from our public program.

H1-514 in Montreal. Photo courtesy of HackerOne

Stats (as of Dec 6, 2018)

2018 was the most successful year to date for our bounty program. Not including the stats from H1-514, our average bounty increased again, to $1,790 from $1,100 in 2017. The total amount paid to hackers was also up by $90,200 over the previous year, to $155,750, with 60% of all resolved reports receiving a bounty. We also went from one five-figure bounty awarded in 2017 to five in 2018, marked by the spikes in the following graph.

Bounty Payouts by Date

As mentioned, the team committed to quick communication, recognizing how important it is to our hackers. We pride ourselves on all of our timing metrics being among the best in the category on HackerOne. While our initial response time slipped by 5 hours to 9 hours, our triage time dropped by over 3 days, to 10 hours (it was 4 days in 2017). Both our time to bounty and our resolution times also dropped: time to bounty to 30 days, and resolution to 19 days, down from about a month.

Response Time by Date

Reports Submitted by Date

In 2018 we received 1,010 reports; 58.7% were closed as not applicable, compared to 63.1% in 2017. This was accompanied by an increase of almost one percentage point in the share of resolved reports: 11.3%, up from 10.5% in 2017. The drop in not-applicable reports and rise in informative ones (reports which contain useful information but don't warrant immediate action) is likely the result of the team's commitment to only close bugs as not applicable when the reported issue appears in our tables of known issues and ineligible vulnerability types, or lacks evidence of a vulnerability.

Types of Bugs Closed

We also disclosed 24 bugs on our program, one less than the previous year, but we tried to maintain our commitment to requesting disclosure for every bug resolved in our program. We continue to believe it’s extremely important that we build a resource library to enable ethical hackers to grow in our program. We strongly encourage other companies to do the same.

Despite a very successful 2018, we know there are still areas to improve if we're to remain competitive. Our total number of resolved reports was down again, 113 compared to 121, despite our having added new properties and functionality to the program. We resolved reports from only 62 hackers, compared to 71 in 2017. Lastly, some low severity reports continue to sit in a triaged state well beyond our target of one-month resolution. The implications for hackers are mitigated since we changed our policy earlier in the year to pay the first $500 of a bounty immediately; because low severity reports are unlikely to receive an additional bounty, most are paid entirely up front. HackerOne also made platform changes to award hackers their reputation when we triage reports rather than when we resolve them, as was previously the case.

We're planning new changes, experiments, and new properties in 2019, so make sure to watch our program for updates.

Happy hacking!


If you're interested in helping to make commerce more secure, visit Shopify on HackerOne to start hacking or our career page to check out our open Trust and Security positions.

Continue reading

2017 Bug Bounty Year in Review

2017 Bug Bounty Year in Review


At Shopify, our bounty program complements our security strategy and allows us to leverage a community of thousands of researchers who help secure our platform and create a better Shopify user experience. We first launched the program in 2013 and moved to the HackerOne platform in 2015 to increase hacker awareness. Since then, we've continued to see increasing value in the reports submitted, and 2017 was no exception.

Continue reading

How Shopify Governs Containers at Scale with Grafeas and Kritis

How Shopify Governs Containers at Scale with Grafeas and Kritis

Today, Google and its contributors launched Grafeas, an open source initiative to define a uniform way for auditing and governing the modern software supply chain. At Shopify, we’re excited to be part of this announcement.

Grafeas, or “scribe” in Greek, enables us to store critical software component metadata during our build and integration pipelines. With over 6,000 container builds per day and 330,000 images in our primary container registry, the security team was eager to implement an appropriate auditing strategy to be able to answer questions such as:

  • Is this container deployed to production?
  • When was the last time this container was pulled (downloaded) from our registry?
  • What packages are installed in this container?
  • Does this container contain any security vulnerabilities?
  • Does this container meet our security controls?

Using Grafeas as the central source of truth for container metadata has allowed the security team to answer these questions and flesh out appropriate auditing and lifecycling strategies for the software we deliver to users at Shopify.

Here's a sample of the container introspection we gain from Grafeas: for any given container, we can see details about its origin, including its build details, its base image, and the operations that resulted in each of its layers.

As part of Grafeas, Google also introduced Kritis, or “judge” in Greek, which allows us to use the metadata stored in Grafeas to build and enforce real-time deployment policies with Kubernetes. During CI, a number of audits are performed against our containers and attestations are generated. These attestations back the policies we enforce with Kritis on Kubernetes.

At Shopify we use PGP to digitally sign our attestations, ensuring the identity of our builder and other attestation authorities.


The two key concepts of Kritis are attestation authorities and policies. An attestation authority is a named entity with the capability to create attestations. A policy then names one or more attestation authorities whose attestations are required in order to deploy a container to a particular cluster.

For example, given the two attestation authorities built-by-us and tested, we can deploy a policy that requires both. Such a policy would preclude the deployment of any container that does not have signed attestations from both authorities.
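As a toy model of that admission decision (an illustration of the concept only, not Kritis's actual API):

    # Toy model: a policy lists required attestation authorities, and a
    # container is admitted only if it carries a verifiable signed
    # attestation from every one of them.
    REQUIRED_AUTHORITIES = %w[built-by-us tested].freeze

    def admit?(container)
      REQUIRED_AUTHORITIES.all? do |authority|
        container.attestations.any? do |attestation|
          attestation.authority == authority && attestation.signature_valid?
        end
      end
    end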

Given this model, we can then create a number of attestation authorities, each satisfying a particular security control.

Attestation Examples:

  • This container has been built by us
  • This container comes from our (or a trusted) container repository
  • This container does not run as root
  • This container passes CI tests
  • This container does not introduce any new vulnerabilities (scanned)
  • This container is deployed with the appropriate security context

Given the attestation examples above, we can enable Kritis enforcement on our Kubernetes clusters that ensures we only run containers which are free from known vulnerabilities, have passed our CI tests, do not run as root, and have been built by us!

In addition to build-time container security controls, we can also generate Kritis attestations for Kubernetes workload manifests using the results of kubeaudit during CI. This means we can ensure there are no regressions in runtime security controls before a container is even deployed.

Using tools like Grafeas and Kritis has allowed us to inject security controls into the DNA of Shopify's cloud platform, providing software governance at scale alongside our developers and unlocking the velocity of all our teams.

We’re really excited about these new tools and hope you are too! Here are some of the ways you can learn more about the projects and get involved:

Try Grafeas now and join the GitHub project: https://github.com/Grafeas

Attend Shopify’s talks at Google Cloud Summit in Toronto on 10/17 and KubeCon in December.

See grafeas.io for documentation and examples.

Continue reading

Sharing the Philosophy Behind Shopify's Bug Bounty

Sharing the Philosophy Behind Shopify's Bug Bounty


Bug bounties have become commonplace as companies realize the advantages of distributing the hunt for flaws and vulnerabilities among talented people around the world. We're no different: we launched a security response program in 2012 before evolving it into a bug bounty with HackerOne in 2015. Since then, we've seen meaningful results, including nearly 400 fixes from 250 researchers and bounties totalling over half a million dollars.

Security is vital for us. With the number of shops and volume of info on our platform, it's about maintaining trust with our merchants. Entrepreneurs are running their businesses and they don't want to worry about security, so anything we can do to protect them is how we measure our success. As Tobi recently mentioned on Hacker News, “We host the livelihoods of hundreds of thousands of other businesses. If we are down or compromised all of them can't make money.” So, we have to ensure any issue gets addressed.

Continue reading

Session Hijacking Protection

Session Hijacking Protection

There’s been a lot of talk in the past few weeks about “Firesheep”, a new program that lets users hijack other users’ accounts on many different websites. But there’s no need to worry about your Shopify account — we’ve taken steps to ensure your account can’t be hijacked and your data is safe.

Firesheep is a Firefox plugin (a program that integrates right into the Firefox browser) that makes it easy to perform HTTP session cookie hijacks when using an insecure connection on an untrusted network. This kind of attack is nothing new, but Firesheep makes it dead simple and shows how prevalent it is.

The attack consists of stealing cookie data over an untrusted network and using that data to log in to other people’s user accounts. Many websites that you use daily, including Shopify, are susceptible to this kind of attack.

Naturally we reacted to this by taking measures to ensure that this can’t happen to our users. All of your Shopify admin data is now fully secure, encrypted, and protected from Firesheep attacks.

Technical Details

The only way to ensure that cookie data, or any data sent over HTTP for that matter, is not being spied upon is end-to-end encryption. Currently the solution for this is SSL.

Last week we made the switch to all SSL in the Shopify admin area. This has been applied to all URLs and all subscription plans, which means that any request made to the Shopify admin is now forced to use SSL for secure encryption.
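Forcing SSL in this way is typically a small piece of Rack middleware. Here's a generic sketch of the technique, not Shopify's actual code:

    require "rack"

    # Generic sketch: redirect any plain-HTTP admin request to HTTPS.
    class ForceSSL
      def initialize(app)
        @app = app
      end

      def call(env)
        request = Rack::Request.new(env)
        if request.ssl? || !request.path.start_with?("/admin")
          @app.call(env) # already secure, or outside the admin area
        else
          location = "https://#{request.host}#{request.fullpath}"
          [301, { "Location" => location, "Content-Type" => "text/html" }, []]
        end
      end
    end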

But this alone is not quite enough to ensure that cookie data can't be hijacked. By default, HTTP cookies are sent over secured as well as unsecured connections. Without the extra step of securing the HTTP cookie itself, your session is still vulnerable.

The Problem

In Shopify's case we weren't able to use SSL for all traffic on the site. There are two main areas of Shopify: the shop frontend and the shop backend. The backend is where a shop's employees manage product data, fulfill orders, and so on. The frontend is where products are viewed, carts are filled, and checkout happens. All backend traffic happens under one domain, *.myshopify.com, with each account having a unique subdomain. One wildcard SSL cert therefore allows us to protect the entire backend.

We can't apply the same strategy to the shop frontends because we allow our merchants to use custom domains for their shops. There are literally thousands of different domain names pointing at the Shopify servers, each of which would require its own SSL cert. An unsecured frontend is not too worrisome, though, since no sensitive data is being passed around there, just information about what's stored in the cart.

However, this meant that we needed two different session cookies: one for the backend, sent over encrypted connections only, and one for the frontend, sent unencrypted.

Using two different session stores based on routes isn't something that Ruby on Rails supports out of the box: you set one session store for your application, which gets inserted into the middleware chain and handles all of the application's sessions.

The Solution

So we came up with a MultiSessionStore that delegates to multiple session stores based on the PATH_INFO. Shopify still has only one session store handling all of its sessions, but if a request comes in under the /admin path we use the secure cookie, and if it comes in under any other path we use the unsecured cookie.

Here is our implementation in its entirety: https://gist.github.com/704099
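The core of the idea fits in a few lines. Here's a simplified sketch (the gist above has the real implementation):

    require "action_dispatch" # assumes Rails is loaded

    # Simplified sketch: wrap the app in two cookie session stores and pick
    # one per request. Admin requests get a cookie flagged Secure, which
    # browsers will only ever send over SSL.
    class MultiSessionStore
      def initialize(app)
        @secure   = ActionDispatch::Session::CookieStore.new(app, key: "_secure_session", secure: true)
        @insecure = ActionDispatch::Session::CookieStore.new(app, key: "_session")
      end

      def call(env)
        store = env["PATH_INFO"].start_with?("/admin") ? @secure : @insecure
        store.call(env)
      end
    end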

This last step, the secured cookie, ensures that session cookie data is never available for hijacking.

Continue reading
