How Shopify Uses Recommender Systems to Empower Entrepreneurs

Authors: Dóra Jámbor and Chen Karako 

There is a good chance you have come across a “recommended for you” statement somewhere in our data-driven world. This may be while shopping on Amazon, hunting for new tracks on Spotify, deciding which restaurant to go to on Yelp, or browsing through your Facebook feed — ranking and recommender systems are an extremely important feature of our day-to-day interactions.

This is no different at Shopify, a cloud-based, multi-channel commerce platform that powers over 600,000 businesses of all sizes in approximately 175 countries. Our customers are merchants that use our platform to design, set up, and manage their stores across multiple sales channels, including web, mobile, social media, marketplaces, brick-and-mortar locations, and pop-up shops.

Shopify builds many different features in order to empower merchants throughout their entrepreneurial lifecycle. But with the diversity of merchant needs and the variety of features that Shopify provides, it can quickly become difficult for people to filter out what’s relevant to them. We use recommender systems to suggest personalized insights, actions, tools and resources to our merchants that can help their businesses succeed. Every choice a merchant makes has consequences for their business and having the right recommendation at the right time can make a big difference.

In this post, we’ll describe how we design and implement our recommender system platform.

Methodology

Collaborative Filtering (CF) is a common technique to generate user recommendations for a set of items. For Shopify, users are merchants, and items are business insights, apps, themes, blog posts, and other resources and content that merchants can interact with. CF allows us to leverage past user-item interactions to predict the relevance of each item to a given user. This is based on the assumption that users with similar past behavior will show similar preferences for items in the future.

The first step of designing our recommender system is choosing the right representation for user preferences. One way to represent preferences is with user-item interactions, derived from implicit signals like the user’s past purchases, installations, clicks, views, and so on. For example, in the Shopify App Store, we could use 1 to indicate an app installation and 0 to represent an unobserved interaction with the given app.

User-item interaction

These user-item interactions can be collected across all items, producing a user preference vector.

User preference vector

This user preference vector allows us to see the past behavior of a given user across a set of items. Our goal is now to predict the relevance of items that the user hasn’t yet interacted with, denoted by the red 0s. A simple way of achieving our goal is to treat this as a binary classification problem. That is, based on a user’s past item interactions, we want to estimate the probability that the user will find an item relevant.

User Preference (left) and Predicted Relevance (right)

We do this binary classification by learning the relationship between each item and all other items. We first create a training matrix of all user-item interactions by stacking users’ preference vectors. Each row in this matrix serves as an individual training example. Our goal is to reconstruct our training matrix in a way that predicts relevance for unobserved interactions.

There are a variety of machine learning methods that can achieve this task including linear models such as Sparse Linear Methods (SLIM), linear method variations (e.g., LRec), autoencoders, and matrix factorization. Despite the differences in how these models recover item relevance, they can all be used to reconstruct the original training matrix.

At Shopify, we often use linear models because of the benefits they offer in real-world applications. For the remainder of this post, we’ll focus on these techniques.

Linear methods like LRec and its variations solve this optimization problem by directly learning an item-item similarity matrix. Each column in this item-item similarity matrix corresponds to an individual item’s model coefficients.

We put these pieces together in the figure below. On the left, we have all user-item interactions, our training matrix. In the middle, we have the learned item-item similarity matrix where each column corresponds to a single item. Finally, on the right, we have the predicted relevance scores. The animation illustrates our earlier discussion of the prediction process.

User-item Interactions (left), Item-item Similarity (middle), and Predicted Relevance (right)

To generate the final user recommendations, we take the items that the user has not yet interacted with and sort their predicted scores (in red). The top-scored items are then the most relevant items for the user and can be shown as recommendations, as seen below.

Personalized app recommendations on the Shopify App Store

Linear methods and this simple binary framework are commonly used in industry as they offer a number of desirable features for serving personalized content to users. The binary nature of the input signals and classification allows us to maintain simplicity in scaling a recommender system to new domains, while also offering flexibility in our model choice.
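
To make the approach concrete, below is a minimal sketch of a per-item linear recommender in the spirit of the methods described above, using synthetic data and scikit-learn. This is an illustration rather than our production code; the matrix size, hyperparameters, and helper names are made up.

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    # Synthetic binary user-item interactions: 1 = installed, 0 = unobserved.
    rng = np.random.default_rng(0)
    interactions = (rng.random((1000, 50)) < 0.1).astype(int)
    n_users, n_items = interactions.shape

    # Train one binary classifier per item: predict column j from all other columns.
    scores = np.zeros(interactions.shape)
    for j in range(n_items):
        y = interactions[:, j]
        X = interactions.copy()
        X[:, j] = 0  # hide the target item from its own model
        model = LogisticRegression(max_iter=1000)
        model.fit(X, y)
        scores[:, j] = model.predict_proba(X)[:, 1]

    # Recommend the top-k items a user hasn't interacted with yet.
    def recommend(user, k=5):
        candidates = np.where(interactions[user] == 0, scores[user], -np.inf)
        return np.argsort(candidates)[::-1][:k]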

Scalability and Parallelizability

As shown in the figure above, we train one model per item on all user-item interactions. While the training matrix is shared across all models, the models can be trained independently from one another. This allows us to run our model training in a task-parallel manner, cutting overall training time. Additionally, as the number of users and items grows, this parallel treatment helps our models scale.
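
Continuing the sketch above, the per-item loop is embarrassingly parallel. One way to exploit that (assuming the interactions matrix and model setup from the earlier sketch; joblib is just one possible task runner) is to hand the independent fits to a worker pool:

    from joblib import Parallel, delayed

    def fit_item(j):
        y = interactions[:, j]
        X = interactions.copy()
        X[:, j] = 0
        model = LogisticRegression(max_iter=1000)
        model.fit(X, y)
        return model.coef_.ravel()  # one column of the item-item similarity matrix

    # Each fit only reads the shared training matrix, so the tasks are independent
    # and can be spread across all available cores.
    columns = Parallel(n_jobs=-1)(delayed(fit_item)(j) for j in range(n_items))
    similarity = np.column_stack(columns)  # items x items model coefficients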

Interpretability

When building recommender systems, it’s important that we can interpret a model and explain the recommendations. This is useful when developing, evaluating, and iterating on a model, but is also helpful when surfacing recommendations to users.

The item-item similarity matrix produced by the linear recommender provides a handy tool for interpretability. Each entry in this matrix corresponds to a model coefficient that reflects the learned relationship of two items. We can use this item-item similarity to derive which coefficients are responsible for a produced set of user recommendations.

Coefficients are especially helpful for recommenders that include other user features, in addition to the user-item interactions. For example, we can include merchant industry as a user feature in the model. In this case, the coefficient for a given item-user feature allows us to share with the user how their industry shaped the recommendations they see. Showing personalized explanations with recommendations is a great way of establishing trust with users.
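
As a rough sketch of how coefficients can drive such explanations (reusing the interactions and similarity matrices from the earlier sketches, with a hypothetical item_names list), the items a user already has, weighted by their learned similarity to the recommended item, point to the past interactions that contributed most:

    def explain(user, item, item_names, top_n=3):
        # Contribution of each of the user's past interactions to this item's score.
        contributions = interactions[user] * similarity[:, item]
        top = np.argsort(contributions)[::-1][:top_n]
        return [(item_names[i], contributions[i]) for i in top if contributions[i] > 0]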

For example, merchants’ home feeds, shown below, contain personalized insights along with explanations for why those insights are relevant to them.

Shopify Home Feed: Showing Merchants how Their Business is Doing, Along With Personalized Insights

Extensibility

Beyond explanations, user features are also useful for enriching the model with additional user-specific signals such as shop industry, location, product types, target audience and so on. These can also help us tackle cold-start problems for new users or items, where we don’t yet have much item interaction data. For example, using a user feature enriched model, a new merchant who has not yet interacted with any apps could now also benefit from personalized content in the App Store.

Performance

A recommender system must yield high-quality results to be useful. Quality can be defined in various ways depending on the problem at hand. There are several recommender metrics to reflect different notions of quality like precision, diversity, novelty, and serendipity. Precision can be used to measure the relevance of recommended items. However, if we solely optimize for precision, we might appeal to the majority of our users by simply recommending the most popular items to everyone, but would fail to capture subtleties of individual user preferences.

For example, the Shopify Services Marketplace, shown below, allows merchants to hire third-party experts to help with various aspects of their business.

Shopify Services Marketplace, Where Merchants can Hire Third-party Experts

To maximize the chance of fruitful collaboration, we want to match merchants with experts who can help with their unique problems. On the other hand, we also want to ensure that our recommendations are diverse and fair to avoid scenarios in which a handful of experts get an overwhelming amount of merchant requests, preventing other experts from getting exposure. This is one example where precision alone isn’t enough to evaluate the quality of our recommender system. Instead, quality metrics need to be carefully selected in order to reflect the key business metric that we hope to optimize.

While recommendations across various areas of Shopify optimize different quality metrics, they’re ultimately all built with the goal of helping our merchants get the most out of our platform. Therefore, when developing a recommender system, we have to identify the metric, or a proxy for it, that allows us to determine whether the system is aligned with this goal.

Conclusion

Having a simple and flexible base model reduces the effort needed for Shopify Data Science team members to extend into new domains of Shopify. Instead, we can spend more time deepening our understanding of the merchant problems we are solving, refining key model elements, and experimenting with ways to extend the capabilities of the base model.

Moreover, having a framework of binary input signals and classification allows us to easily experiment with different models that enrich our recommendations beyond the capabilities of the linear model we presented above.

We applied this approach to provide recommendations to our merchants in a variety of contexts across Shopify. When we initially launched our recommendations through A/B tests, we observed the following results:

  • Merchants receiving personalized app recommendations on the Shopify App Store had a 50% higher app install rate compared to those who didn’t receive recommendations.
  • Merchants with a personalized home feed were up to 12% more likely to report that the content of their feed was useful, compared to those whose feeds were ranked by a non-personalized algorithm.
  • Merchants who received personalized matches with experts in the Expert Marketplace had a higher response rate and had overall increased collaboration between merchants and third-party experts.
  • Merchants who received personalized theme recommendations on the Shopify Theme Store, seen below, were over 10% more likely to launch their online store, compared to those receiving non-personalized or no recommendations.

Shopify Theme Store: Where Merchants can Select Themes for Their Online Store

This post was originally published on Medium.


We’re always working on challenging new problems on the Shopify Data team. If you’re passionate about leveraging data to help entrepreneurs, check out our open positions in Data Science and Engineering.


iOS Application Testing Strategies at Shopify

At Shopify, we use a monorepo architecture where multiple app projects coexist in one Git repository. With hundreds of commits per week, the fast pace of evolution demands a commitment to testing at all levels of an app in order to quickly identify and fix regression bugs.

This article presents the ways we test the various components of an iOS application: Models, Views, ViewModels, View Controllers, and Flows. For brevity, we ignore the details of the Continuous Integration infrastructure where these tests are run, but you can learn more from the Building a Dynamic Mobile CI System blog post.

Testing Applications, Like Building a Car

Consider the process of building a reliable car: base components like cylinders and pistons are individually tested to comply with design specifications (Model & View tests). Then these parts are assembled into an engine, which is also tested to ensure the components fit and function well together (View Controller tests). Finally, the major subsystems like the engine, transmission, and cooling systems are connected and the entire car is test-driven by a user (Flow tests).

The complexity and slowness of a test increase as we go from unit to manual tests, so it’s important to choose the right type and amount of tests for each component hierarchy. The image below shows the kind of tests we use for each type of app component; it reads bottom-up, e.g., a Model is tested with regular unit tests.

Types of Tests Used for App Components

Testing Models

A Model represents a business entity like a Customer, Order, or Cart. As the foundation of all other application constructs, it’s crucial to test that the properties and methods of a model conform with their business rules. The example below shows a unit test for the Customer model where we test the rule that, for a customer with multiple addresses, the billingAddress must be the first default address.

A Word on Extensions

Changing existing APIs in a large codebase is an expensive operation, so we often introduce new functionality as Extensions. For example, the function below enables two String Arrays to be merged without duplicates.

We follow a few conventions. Each test name follows a compact and descriptive format: test<Function><Goal>. Test steps are about 15 lines max; otherwise, the test is broken down into separate cases. Overall, each test is very simple and requires minimal cognitive load to understand what it’s checking.

Testing Views

Developers aim to implement exactly what the designers intend under various circumstances and avoid introducing visual regression bugs. To achieve this, we use Snapshot Testing to record an image of a view; subsequent tests compare that view with the recorded snapshot and fail if they differ.

For example, consider a UITableViewCell for Ping Pong players with the user’s name, country, and rank. What happens when the user has a very long name? Does the name wrap to a second line, truncate, or does it push the rank away? We can record our design decisions as snapshot tests so we are confident the view gracefully handles such edge cases.

UITableViewCell Snapshot Test

Testing View Models

A ViewModel represents the state of a View component and decouples business models from Views—it’s the state of the UI. So, they store information like the default value of a slider or segmented control and the validation logic of a Customer creation form. The example below shows the CustomerEntryViewModel being tested to ensure its taxExempt property is false by default, and that its state validation function works correctly given an invalid phone number.

Testing View Controllers

The ViewController is the top of the component composition hierarchy. It brings together multiple Views and ViewModels in one cohesive page to accomplish a business use case. So, we check whether the overall view meets the design specification and whether components are disabled or hidden based on Model state. The example below shows a Customer Details ViewController where the Recent orders section is hidden if a customer has no orders, and the ‘edit’ button is disabled if the device is offline. To achieve this, we use snapshot tests as follows.

Snapshot Testing the ViewController

Testing Workflows

A Workflow uses multiple ViewControllers to achieve a use case. It’s the highest level of functionality from the user’s perspective. Flow tests aim to answer specific user questions like: can I log in with valid credentials?, can I reset my password?, and can I check out items in my cart?

We use UI Automation Tests powered by the XCUITest framework to simulate a user performing actions like entering text and clicking buttons. These tests are used to ensure all user-facing features behave as expected. The process for developing them is as follows.

  1. Identify the core user-facing features of the app—features without which users cannot productively use the app. For example, a user should be able to view their inventory by logging in with valid credentials, and a user should be able to add products to their shopping cart and checkout.
  2. Decompose the feature into steps and note how each step can be automated: button clicks, view controller transitions, error and confirmation alerts. This process helps to identify bottlenecks in the workflow so they can be streamlined.
  3. Write code to automate the steps, then compose these steps to automate the feature test.

The example below shows a UI Test checking that only a user with valid credentials can log in to the app. The testLogin() function is the main entry point of the test. It sets up a fresh instance of the app by calling setUpForFreshInstall(), then it calls the login() function, which simulates user actions like entering the email and password and clicking the login button.

Considering Accessibility

One useful side effect of writing UI Automation Tests is that they improve the accessibility of the app, and this is very important for visually impaired users. Unlike Unit Tests, UI Tests don’t assume knowledge of the internal structure of the app, so you select an element to manipulate by specifying its accessibility label or string. These labels are read aloud when users turn on iOS accessibility features on their devices. For more information about the use of accessibility labels in UI Tests, watch this Xcode UI Testing - Live Tutorial Session video.

Manual Testing

Although we aim to automate as many flow tests as possible, the tools available aren’t mature enough to completely exclude manual testing. Issues like animation glitches and rendering bugs are only discovered through manual testing…some would even argue that so long as applications are built for users, manual user testing is indispensable. However, we are becoming increasingly dependent on UI Automation tests to replace manual tests.

Conclusion

Testing at all levels of the app gives us the confidence to release applications frequently. But each test also adds a maintenance liability. So, testing each part of an app with the right amount and type of test is important. Here are some tips to guide your decision.

  • The speed of executing a test decreases as you go from Unit to Manual tests.
  • The human effort required to execute and maintain a test increases from Unit tests to Manual tests.
  • An app has more subcomponents than major components.
  • Expect to write a lot more Unit tests for subcomponents and fewer, more targeted tests as you move up to UI Automation and Manual tests...a concept known as the Test Pyramid.

Finally, remember that tests are there to ensure your app complies with business requirements, but these requirements will change over time. So, developers must consistently remove tests for features that no longer exist, modify existing tests to comply with new business rules, and add new tests to maintain code coverage.

If you'd like to continue talking about application testing strategies, please find me on Medium at @u.zziah


If you are passionate about iOS development and excellent user experience, the Shopify POS team is hiring a Lead iOS Developer! Have a look at the job posting.


The Unreasonable Effectiveness of Test Retries: An Android Monorepo Case Study

At Shopify, we don't have a QA team, we have a QA culture which means we rely on automated testing to ensure the quality of our mobile apps. A reliable Continuous Integration (CI) system allows our developers to focus on building a high-quality system, knowing with confidence that if something is wrong, our test suite will catch it. To create this confidence, we have extensive test suites that include integration tests, unit tests, screenshot tests, instrumentation tests, and linting. But every large test suite has an enemy: flakiness.

A flaky test can exhibit both a passing and failing result with the same code and requires a resilient system that can recover from those failures. Tests can fail for different reasons that aren’t related to the test itself: network or infrastructure problems, bugs in the software that runs the tests, or even cosmic rays.

Last year, we moved our Android apps and libraries to a monorepo and increased the size of our Android team. This meant more people working in the same codebase and more tests executed when a commit merged to master (we only run the entire test suite on the master branch; for other branches, only the tests related to what has changed are run). It’s only logical that the pass rate of our test suites took a hit.

Let’s assume that all the tests we execute are independent of one another (events like network flakiness affect all tests, but we’re not taking that into account here) and each passes 99.95% of the time. We execute pipelines that each contain 100 tests. Given the per-test pass probability, we can estimate that a pipeline will pass 0.9995^100 ≈ 95% of the time. However, the entire test suite is made up of 20 pipelines with the same pass probability, so it will pass 0.95^20 ≈ 35% of the time.

This wasn’t good and we had to improve our CI pass rate.

Developers lose trust in the test infrastructure when CI is red most of the time due to test flakiness or infrastructure issues. They’ll start assuming that every failure is a false positive caused by flakiness. Once this happens, we’ve lost the battle and gaining that developer’s trust back is difficult. So, we decided to tackle this problem in the simplest way: retrying failures.

Retries are a simple, yet powerful mechanism to increase the pass rate of our test suite. When executing tests, we believe in a fail-fast system. The earlier we get feedback, the faster we can move and that’s our end goal. Using retries may sound counterintuitive, but almost always, a slightly slower build is preferable over a user having to manually retry a build because of a flaky failure.

When retrying tests once, failing CI due to a single test requires that test to fail twice. Using the same assumptions as before, the chances of that happening are 0.05% · 0.05% = 0.000025% for each test. That translates to a 99.999975% pass rate for each test. Performing the same calculation as before, for each pipeline we would expect a pass rate of 0.99999975^100 ≈ 99.9975%, and for the entire CI suite, 0.999975^20 ≈ 99.95%. Simply by retrying failing tests, the theoretical pass rate of our full CI suite increases from 35% to 99.95%.
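
For the curious, the arithmetic can be checked in a few lines of Python, carrying the post’s rounded figures from one step to the next:

    p = 0.9995                          # single-test pass rate
    pipeline = 0.9995 ** 100            # ~0.951: one 100-test pipeline passing
    suite = 0.95 ** 20                  # ~0.358: all 20 pipelines passing (the ~35% above)

    # With one retry, a test only fails CI if it fails twice in a row:
    fail_twice = 0.0005 ** 2            # 0.00000025, i.e. a 99.999975% per-test pass rate
    pipeline_retry = 0.99999975 ** 100  # ~0.999975
    suite_retry = 0.999975 ** 20        # ~0.9995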

In each of our builds, many different systems are involved and things can go wrong while setting up the test environment. Docker can fail to load the container, bundler can fail while installing some dependencies, and so can git fetch. All of those failures can be retried. We have identified some of them as retriable failures, which means they can be retried within the same job, so we don’t need to initialize the entire test environment again.

Some other failures aren’t as easy to retry in the same job because of their side effects. Those are known as fatal failures, and we need to reload the test environment altogether. This is slower than a retriable failure, but it’s definitely faster than waiting for the developer to retry the job manually, or spending time trying to figure out why a certain task failed, only to realize that the solution was retrying.

Finally, we have test failures. As we have seen, a test can be flaky. They can fail for multiple reasons, and based on our data, screenshot tests are flakier than the rest. If we detect a failure in a test, that single test is retried up to three times.

The message displayed when a test fails and it’s retried

Retries in general, and test retries in particular, aren’t ideal. They work, but they make CI slower and can hide reliability issues. At the end of the day, we want our developers to have a reliable CI while encouraging them to fix test flakiness where possible. For this reason, we detect all the tests that pass after a retry and notify the developers so the problem doesn’t go unnoticed. We think that a test that passes on a second attempt shouldn’t be treated as a failure, but as a warning that something can be improved. Besides retry mechanisms, these are the tips we recommend to reduce the flakiness of builds:

  • Don't depend on unreliable components in your builds. Try to identify the unreliable components of your system and don’t depend on them if possible. Unfortunately, most of the time this is not possible and we need those unreliable components.
  • Work on making the component more reliable. Try to understand why the component isn’t reliable enough for your use case. If that component is under your control, make changes to increase reliability.
  • Apply caching to invoke the unreliable component less often. We need to interact with external services for different reasons. A common case is downloading dependencies. Instead of downloading them for every build, we can build a cache to reduce our interactions with this external service, thereby gaining resiliency.

These tips are exactly what we applied from an infrastructure point of view. When this project started, the pass rate in our Android app pipeline was 31%. After identifying and applying retry mechanisms to the sources of flakiness, and adding some caching to the Gradle builds, we managed to increase it to almost 90%.

Pass rate plot from March to September

Something similar happened in our iOS repository. After improving our CI infrastructure, adding the previously discussed retry mechanisms and applying the tips to reduce flakiness, the pass rate grew from 67% to 97%.

It may sound counterintuitive, but thanks to retries we can move faster even with slower builds.

We love to talk about mobile tooling. Feel free to reach out to us in the comments if you want to know more or share your solution to this problem.


Intrigued? Shopify is hiring and we’d love to hear from you. Please take a look at our open positions on the Engineering career page


Preparing Shopify for Black Friday and Cyber Monday

Making commerce better for everyone is a challenge we face on a daily basis. For our Production Engineering team, it means ensuring that our 600,000+ merchants have a reliable and scalable platform to support their business needs. We need to be able to support everything our merchants throw at us—including the influx of holiday traffic during Black Friday and Cyber Monday (BFCM). All of this needs to happen without an interruption in service. We’re proud to say that the effort we put in to deploying, scaling, and launching new projects on a daily basis gives our merchants access to a platform with 99.98% uptime.

Black Friday Cyber Monday 2018 by the numbers

To put the impact of this into perspective, Black Friday and Cyber Monday is what we refer to as our World Cup. Each year, our merchants push the boundaries of our platform to handle more traffic and more sales. This year alone, merchants sold over $1.5 billion USD in sales throughout the holiday weekend.

What people may not realize is that Shopify is made up of many different internal services and interaction points with third-party providers, like payment gateways and shipping carriers. The performance and reliability of each of these dependencies can affect our merchants and buyers in different ways. That’s why our Production Engineering team’s preparations for BFCM run the entire gamut.

To increase the chances of success on BFCM, Production Engineering runs “game days” on our systems and their dependencies. Game days are a form of fault injection where we test our assumptions about the system by degrading its dependencies under controlled conditions. For example, we’ll introduce artificial latency into the code paths that interact with shipping providers to ensure that the system keeps working and does something reasonable. That could be, for instance, falling back to another third party or to hard-coded defaults if a third-party dependency becomes slow for any reason, or verifying that a particular service responds as expected to a problem with its main datastore.

Besides fault-injection work, Production Engineering also runs load-testing exercises where volumes similar to what we expect during BFCM are created synthetically and sent to the different applications, ensuring that the system and its components behave well under the onslaught of requests they’ll serve on BFCM.

At Shopify, we pride ourselves on continuous and fast deploys to deliver features and fixes as fast as we can; however, a high rate of change on a system increases the probability of issues that can affect our users. During the ramp-up period for BFCM, we manage the normal cadence of the company by establishing both a feature freeze and a code freeze. The feature freeze starts several weeks before BFCM and means no meaningful changes to user-facing features are deployed, to prevent changes to merchants’ workflows. At that point in the year, changes, even improvements, can have an unacceptable learning curve for merchants who are diligently getting ready for the big event.

A few days before BFCM and during the event, an actual code freeze is in effect, meaning that only critical fixes can be deployed and everything else must remain in stasis. The idea is to reduce the possibility of introducing bugs and unexpected system interactions that could compromise the service during the peak days of the holiday season.

Did all of our preparations work out? With BFCM in the rearview mirror, we can say yes. This BFCM weekend was a record breaker for Shopify. We saw nearly 11,000 orders created per minute and around 100,000 requests per second being served for extended periods during the weekend. All in all, most system metrics followed a pattern of 1.8 times what they were in 2017.

The somewhat unsurprising conclusion is that running towards the risk by injecting faults, load testing, and role-playing possible disaster scenarios pays off. Also, reliability goes beyond your “own” system: most complex platforms these days have to deal with third parties to provide the best service possible. We have learned to trust our partners but also understand that any system can have downtime, and in the end, Shopify is responsible to our merchants and buyers.


Bug Bounty Year in Review 2018

With 2018 coming to a close, we thought it a good opportunity to once again reflect on our Bug Bounty program. At Shopify, our bounty program complements our security strategy and allows us to leverage a community of thousands of researchers who help secure our platform and create a better Shopify user experience. This was the fifth year we operated a bug bounty program, the third on HackerOne and our most successful to date (you can read about last year’s results here). We reduced our time to triage by days, got hackers paid quicker, worked with HackerOne to host the most innovative live hacking event to date and continued contributing disclosed reports for the bug bounty community to learn from.

Our Triage Process

In 2017, our average time to triage was four days. In 2018, we shaved that down to 10 hours, despite largely receiving the same volume of reports. This reduction was driven by our core program commitment to speed. With 14 members on the Application Security team, we're able to dedicate one team member a week to HackerOne triage.

When someone is the dedicated “triager” for the week at Shopify, that becomes their primary responsibility with other projects becoming secondary. Their job is to ensure we quickly review and respond to reports during regular business hours. However, having a dedicated triager doesn't preclude others from watching the queue and picking up a report.

When we receive reports that aren't N/A or Spam, we validate before triaging and open an issue internally since we pay $500 when reports are triaged on HackerOne. We self-assign reports on the HackerOne platform so other team members know the report is being worked on. The actual validation process we use depends on the severity of the issue:

  • Critical: We replicate the behavior and confirm the vulnerability, page the on-call team responsible and triage the report on HackerOne. This means the on-call team will be notified immediately of the bug and Shopify works to address it as soon as possible.
  • High: We replicate the behavior and ping the development team responsible. This is less intrusive than paging but still a priority. Collaboratively, we review the code for the issue to confirm it's new and triage the report on HackerOne.
  • Medium and Low: We’ll either replicate the behavior and review the code, or just review the code, to confirm the issue. Next, we review open issues and pull requests to ensure the bug isn't a known issue. If there are clear security implications, we'll open an issue internally and triage the report on HackerOne. If the security implications aren't clear, we'll err on the side of caution and discuss with the responsible team to get their input about whether we should triage the report on HackerOne.

This approach allows us to quickly act on reports and mitigate critical and high impact reports within hours. Medium and Low reports can take a little longer, especially where the security implications aren't clear. Development teams are responsible for prioritizing fixes for Medium and Low reports within their existing workloads, though we occasionally check in and help out.

H1-514

H1-514 in Montreal

In October, we hosted our second live hacking event, H1-514, the first hacking event held in our office in Montreal, Quebec. We welcomed over 40 hackers to our office to test our systems. To build on our program's core principles of responsiveness, transparency, and timely payouts, we wanted to do things differently than other HackerOne live hacking events. As such, we worked with HackerOne on a few firsts for live hacking events:

  • While other events opened submissions the morning of the event, we opened submissions when the target was announced to be able to pay hackers as soon as the event started and avoid a flood of reports
  • We disclosed resolved reports to participants during the event to spark creativity instead of leaving this to the end of the event when hacking was finished
  • We used innovative bonuses to reward creative thinking and hard work from hackers testing systems that are very important to Shopify (e.g. GraphQL, race conditions, oldest bug, regression bonuses, etc.) instead of awarding more money for the number of bugs people found
  • We gave hackers shell access to our infrastructure and asked them to report any bugs they found. While none were reported at the event, the experience and feedback informed a continued Shopify infrastructure bounty program and the Kubernetes product security team's exploration of their own bounty program.

H1-514 in Montreal

When we signed on to host H1-514, we weren't sure what value we'd get in return since we run an open bounty program with competitive bounties. However, the hackers didn't disappoint and we received over 50 valid vulnerability reports, a few of which were critical. Reflecting on this, the success can be attributed to a few factors:

  • We ship code all the time. Our platform is constantly evolving so there's always something new to test; it's just a matter of knowing how to incentivize the effort for hackers (You can check the Product Updates and Shopify News blogs if you want to see our latest updates).
  • There were new public disclosures affecting software we use. For example, Tavis Ormandy's disclosure of Ghostscript remote code execution in ImageMagick, which was used in a report during the event by hacker Frans Rosen.
  • Using bonuses to incentivize hackers to explore the more complex and challenging areas of the bounty program. Bonuses included GraphQL bugs, race conditions and the oldest bug, to name a few.
  • Accepting submissions early allowed us to keep hackers focused on eligible vulnerability types and avoid them spending time on bugs that wouldn't be rewarded. This helped us manage expectations throughout the two weeks, keep hackers engaged and make sure everyone was using their time effectively.
  • We increased our scope. We wanted to see what hackers could do if we added all of our properties into the scope of the bounty program and whether they'd flock to new applications looking for easier-to-find bugs. However, despite the expanded scope, we still received a good number of reports targeting mature applications from our public program.

H1-514 in Montreal. Photo courtesy of HackerOne

Stats (as of Dec 6, 2018)

2018 was the most successful year to date for our bounty program. Not including the stats from H1-514, we saw our average bounty increase again, this time to $1,790 from $1,100 in 2017. The total amount paid to hackers was also up $90,200 compared to the previous year, to $155,750 with 60% of all resolved reports having received a bounty. We also went from one five-figure bounty awarded in 2017, to five in 2018 marked by the spikes in the following graph.

Bounty Payouts by Date

As mentioned, the team committed to quick communication, recognizing how important it is to our hackers. We pride ourselves on all of our timing metrics being among the best in the category on HackerOne. While our initial response time slipped by 5 hours to 9 hours, our triage time was reduced by over 3 days to 10 hours (it was 4 days in 2017). Both our time to bounty and time to resolution also dropped: time to bounty to 30 days and resolution to 19 days, each down from about a month.

Response Time by Date

Reports Submitted by Date

In 2018 we received 1,010 reports. 58.7% were closed as not applicable compared to 63.1% in 2017. This was accompanied by an almost one percent increase in the number of resolved reports, 11.3%, up from 10.5% in 2017. The drop in not applicable and rise in informatives (reports which contain useful information but don't warrant immediate action) is likely the result of the team's commitment to only close bugs as not applicable when the issue reported is in our tables of known issues and ineligible vulnerabilities types or lacks evidence of a vulnerability.

Types of Bugs Closed

We also disclosed 24 bugs on our program, one less than the previous year, but we tried to maintain our commitment to requesting disclosure for every bug resolved in our program. We continue to believe it’s extremely important that we build a resource library to enable ethical hackers to grow in our program. We strongly encourage other companies to do the same.

Despite a very successful 2018, we know there are still areas to improve upon to remain competitive. Our total number of resolved reports was down again, 113 compared to 121, despite having added new properties and functionality to our program. We resolved reports from only 62 hackers compared to 71 in 2017. Lastly, we continue to have some low severity reports remain in a triaged state well beyond our target of 1-month resolution. The implications of this are mitigated for hackers since we changed our policy earlier in the year to pay the first $500 of a bounty immediately. Since low severity reports are unlikely to receive an additional bounty, most low-severity reports are paid entirely up front. HackerOne also made platform changes to award hackers their reputation when we triage reports rather than when we resolve them, as was previously the case.

We're planning new changes, experiments and adding new properties in 2019 so make sure to watch our program for updates.

Happy hacking!


If you're interested in helping to make commerce more secure, visit Shopify on HackerOne to start hacking or our career page to check out our open Trust and Security positions


How an Intern Released 3 Terabytes Worth of Storage Before BFCM

Hi there! I’m Gurpreet and currently finishing up my second internship at Shopify. I was part of the Products team during both of my internships. The team is responsible for building and maintaining the products area of Shopify admin. As a developer, every day is another opportunity to learn something new. Although I worked on many tasks during my internship, today I will be talking about one particular problem I solved.

The Problem

As part of the Black Friday Cyber Monday (BFCM) preparations, we wanted to make sure our database was resilient enough to smoothly handle increased traffic during flash sales. After completing an analysis of our top SQL queries, we realized that the database was scanning a large number of fixed-size storage units, called InnoDB pages, just to return a single record. We identified the records, historically kept for reporting purposes, that caused this excess scanning. After talking among different teams and making sure that these records were safe to delete, the team decided to write a background job to delete them.

So how did we accomplish this task which could have potentially taken our database down, resulting in downtime for our merchants?

The Background Job

I built the Rails background job using existing libraries that Shopify built to avoid overloading the database while performing different operations, including deletion. A naive way to perform deletions is sending either a batch delete query or one delete query per record. It’s not easy to interrupt MySQL operations, and the naive approach would easily overload the database with thousands of operations. The job-iteration library allows background jobs to run in iterations, and it’s one of the Shopify libraries I leveraged to overcome the issue. The job runs in small chunks and can be paused between iterations to let other, higher-priority jobs run first or to perform certain checks.

There are two parts to the job: the enumerator and the iterator. The enumerator fetches records in batches and passes one batch to the iterator at a time. The iterator then fetches the records in the given batch and deletes them. While this made sure that we weren’t deleting a large number of records in a single SQL query, we still needed to make sure we weren’t deleting the batches too fast. Deleting batches too fast results in a high replication lag and can affect the availability of the database. Thankfully, we have an existing internal throttling enumerator, which I also leveraged writing the job.

After each iteration, the throttling enumerator checks if we’re starting to overload the database. If so, it automatically pauses the job until the database is back in a healthy state. We ensured our fetch queries used proper indexes and the enumerator used a proper cursor for batches to avoid timeouts. A cursor can be thought of as flagging the last record in the previous batch. This allows fetching records for the next batch by using the flagged record as the pivot. It avoids having to re-fetch previous records and only including the new ones in the current batch.
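
The real implementation is a Rails job built on the job-iteration library, but the cursor-plus-throttle idea can be sketched in a few lines of Python. The table, column, and db helper names below are hypothetical:

    import time

    BATCH_SIZE = 1000
    MAX_REPLICA_LAG = 5  # seconds

    def delete_in_batches(db):
        cursor = 0  # id of the last record handled in the previous batch
        while True:
            # Enumerator step: fetch the next batch past the cursor via an indexed query.
            ids = db.fetch_column(
                "SELECT id FROM records WHERE id > %s ORDER BY id LIMIT %s",
                (cursor, BATCH_SIZE))
            if not ids:
                break
            # Iterator step: delete only this small batch in a single statement.
            db.execute("DELETE FROM records WHERE id IN %s", (tuple(ids),))
            cursor = ids[-1]  # the next fetch skips everything already processed
            # Throttle: pause between iterations while replicas catch up.
            while db.replication_lag() > MAX_REPLICA_LAG:
                time.sleep(1)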

The Aftermath

We ran the background job approximately two weeks before BFCM. It was a big deal because not only did it free up three terabytes of storage and result in large cost savings, it also made our database more resilient to flash sales.

For example, after the deletion, as seen in the chart below, our database was scanning around 3x fewer pages to return a single record. Since the database was reading fewer pages per record, it could serve an increased number of requests during flash sales without getting overloaded by unnecessary page scans. This also meant we were making sure our merchants get the best BFCM experience with minimal technical issues during flash sales.

Database Scanning After Deletion

Truth be told, I was very nervous watching the background job run, because if anything went wrong, that meant downtime for merchants, which is the last thing we want. And man, what a horrible intern experience that would have been. At the peak, we were deleting approximately six million records a minute. The Shopify libraries I leveraged helped make deleting over 🔥5 billion records🔥 look like a piece of cake 🎂.

5 billion records deleted

What I Learned

I learned so much from this project. I got vital experience with open source projects when using Shopify’s job-iteration library. I also did independent research to better understand MySQL indexes and how cursors work.

For example, I didn’t know about partial indexes and how they worked. MySQL will pick a subset of prefix keys, based on the longest prefix match with predicates in the WHERE clause, to be used by the partial index to evaluate the query. Suppose we have an index on (A,B,C). A query with predicates (A,C) in the WHERE clause will only use the key A from the index, but a query with predicates (A,B) in the WHERE clause will use the keys A and B. I also learned how to use SQL EXPLAIN to analyze SQL queries. It shows exactly which indexes the database considered using, which index it ended up using, how many pages were scanned, and a lot of other useful information.

Apart from improving my technical skills, working on this project made me realize the importance of collecting as much context as one can before even attempting to solve the problem. My mentor helped me with cross-team communication. Overall, context gathering allowed me to identify any possible complications ahead of time and make sure the background job ran smoothly.


Can you see yourself as one of our interns? Applications for the Summer 2019 term will be available at shopify.com/careers/interns from January 7, 2019. The deadline for applications is Monday, January 21, 2019, at 9:00 AM EST!


Director of Engineering, Lawrence Mandel Talks Road to Leadership, Growth, and Finding Balance

Lawrence Mandel is a Director of Production Engineering leading Shopify’s Developer Acceleration team and has been at Shopify for over a year. He previously worked at IBM and Mozilla where he started as a software developer before transitioning into leadership roles. Through all his work experience, he’s learned to understand the meaning of time management and to prioritize the most important things in his life, which are his family, health, and work.  


Developer Talks: How the Command Line Can Empower You (Webinar)

On Tuesday, 27 November 2018, Eric Fung, Senior Data Scientist, presented "How the Command Line Can Empower You."

You can watch this presentation on Zoom.us and download the speaker notes at Speakerdeck.com

Developer Talks: How the Command Line Can Empower You - November 27, 2018. 1-2pm EST

As a developer, you probably use a modern IDE that lets you write, debug, test, and deploy your code quickly and easily. However, your job often includes activities performed outside your IDE, such as working with APIs, creating screenshots, or massaging data. Eric wants to show you how the command-line can simplify, improve, or even automate some of these tasks.

Eric provides an overview of utilities that give you more ways to get your work done and get it done faster. Using real-world examples, you'll learn how to type less in the terminal, search your files with ease, manipulate images and JSON files, write code automatically, and more. All without needing to point, click, or swipe!

If you are curious about command-line tools or want to learn more about their impressive capabilities, this talk is for you. This presentation focuses on software available for macOS computers, but Linux and Windows users can benefit, as many of the tools mentioned are cross-platform.

This presentation is a 45-minute talk with 15 minutes dedicated to Q&A.

Couldn't make the presentation? Here is a link to view the presentation.

About Eric Fung

Since 2010, Eric has worked on many mobile apps and games and spent five years at Shopify as an Android developer before recently transitioning to a data scientist role. At the beginning of his career, he spent a lot of time in Linux and the command-line. Eric caught the public speaking bug a few years ago and is an organizer of GDG Toronto Android. In addition to coding, he enjoys making and eating elaborate desserts.



Handling Addresses from All Around the World

Four months ago, I joined the International Growth team at Shopify. The mission of the INTL team (as we call it) is to help Shopify conquer international markets. Our team builds tools, services and enhances Shopify’s platform to make it scale to different markets where we need to tailor the experience locally to a country: add new shipping patterns, new payment paradigms, and be compliant with local laws.

As a senior web developer, the first problem I tackled was making sure addresses are formatted correctly for everyone, everywhere. Addresses are a core part of our merchants’ businesses, crucial when delivering products and dealing with customers. At the same time, they are a key part of a customer's journey. Entering an address in a form seems obvious, but there are essential details that you need to get right when going international. Details that might not seem obvious if you haven't thought about it or never lived abroad.

I’m going to take you through some of the problems the team encountered when dealing with addresses and how we solved some of those problems.

The Problem with Addresses

Definition

Let’s start with a simple definition. At Shopify, we describe an address with the following fields:

  • First name
  • Last name
  • Address line 1
  • Address line 2
  • Zone code
  • Postal code
  • City
  • Country code
  • Phone

Zones are administrative divisions by country (see Wikipedia’s article), they are States in the US, provinces in Canada, etc. Some of these fields may be optional or required depending on the country.

Ordering

When looking at the fields listed above, I’m assuming that for some readers the order of the fields makes sense. Well, it’s not the case for most people in the world. For example:

  • In Japan, people start their address by entering their postal code. Postal codes are very precise, so with just seven digits, a whole address can be auto-completed. The last name comes first; anything else is considered rude
  • In France, the postal code comes before the city while in Canada it’s the opposite

As you can imagine, the list goes on and on. None of these details can be overlooked if we want a properly localized experience for customers connecting from everywhere in the world. At the same time, creating one version of the form for every country leads to unnecessary code duplication, something to avoid if the code is to scale and remain maintainable.

Wording

Let's talk about wording. What is address1? What is zone? Parts of an address aren’t the same around the world, so how do we name the labels of forms when building them? The tough part of these differences, from a developer’s perspective, is that we had variations per country as well as per locale. For example:

  • Zone can refer to "states", "provinces", "regions" or even "departments" in certain countries (such as Peru) 
  • Postal code can be called "ZIP code" or "postcode" or even "postal code"
  • address2 might refer to "apartment number", "unit number" or "suite"
  • In Japan, when displaying an address, the symbol 〒 is prepended to the postal code so, if a user enters 153-0062, it displays as 〒153-0062

Translations

Translation is the most obvious problem: form labels need translation, but so do country and zone names. Canada is written the same way in most languages, but it’s カナダ in Japanese and كندا in Arabic. Canada is bilingual, so province labels are language-specific: British Columbia in English becomes Colombie-Britannique in French, etc.

Our Solution (So Far)

We’re at the beginning of our journey to go international. Solutions we come up with are never finished; we iterate and evolve as we learn more. That being said, here’s what we're doing so far.

A Database for Countries

The one thing we needed was a database storing all the data representing every country. Thankfully, we already built it at the beginning of our Internationalization journey (phew!) and had every country represented with YAML files in a GitHub repository. The database stored every country’s basic information such as country code, name, currency code, and a list of zones, where applicable.

Normalization

The same way we have formats to represent dates, we created formats to describe addresses per country. With our database for countries, we can store these formats for every country.

Form Representation

In what order do we want to show input fields when presenting an address form? We came up with the following format to make it easier to reuse:

  • {fieldName}: Name of the field
  • _: line break

Here’s an example with Canada and Japan:

Japan
{company}_{lastName}{firstName}_{zip}_{country}_{province}{city}_{address1}_{address2}_{phone}

Form Representation Japan

Canada
{firstName}{lastName}_{company}_{address1}_{address2}_{city}_{country}{province}{zip}_{phone}

Form Representation Canada

Now, with a format for every country, we dynamically reorder the fields of an address form based on the selected country. When the page loads, we already know which country the shop is located in and where the user is connecting from, so we can prepopulate the form with the country and show the fields in the right order. If the user changes the country, we reorder the form on the fly. And since we store the data on provinces, we can also prepopulate the zone dropdown on the fly.

Display Representation

We use the same representation to display an address as above; the only difference is that the extra characters used to represent an address in different locales are displayed. Here’s another example with Japan and Canada:

Japan
{country}_〒{zip}{province}{city}{address1}{address2}_{company}_{lastName} {firstName}様_{phone}
Canada
{firstName} {lastName}_{company}_{address1} {address2}_{city} {province} {zip}_{country}_{phone}


The thing to note here is that for Japan, we add characters such as 〒 to indicate that what follows is a postal code, and we add 様 (“sama”) after the first name, which is a formal and respectful form of address like Miss/Mr/Mrs. For other countries, we can add commas if necessary and account for spaces.
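
To illustrate, here's a small Python sketch that renders an address from one of these format strings, where each {field} token names an address field and _ marks a line break. The helper and the sample data are hypothetical, not the actual implementation:

    import re

    JAPAN_DISPLAY = ("{country}_〒{zip}{province}{city}{address1}{address2}_"
                     "{company}_{lastName} {firstName}様_{phone}")

    def format_address(fmt, fields):
        lines = []
        for line in fmt.split("_"):  # "_" separates the lines of the address
            # Swap each {field} token for its value, or an empty string if missing.
            rendered = re.sub(r"\{(\w+)\}",
                              lambda m: fields.get(m.group(1), ""), line)
            if rendered.strip():
                lines.append(rendered.strip())
        return "\n".join(lines)

    print(format_address(JAPAN_DISPLAY, {
        "country": "日本", "zip": "153-0062", "province": "東京都",
        "city": "目黒区", "address1": "青葉台3-1-1", "lastName": "山田",
        "firstName": "太郎", "phone": "03-1234-5678"}))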

Labels and Translations

The other problem to resolve was the name of the labels we use to display address data. Remember, the label for postal code can be different in different countries. To solve this, we created a list of keys for certain fields. Our implementation approach is to make changes incrementally instead of taking on the enormous task (it would probably take forever!) of having our address forms work for all countries from the get-go. Based on our most popular countries, we came up with specific label keys that we translate in our front end.

So, as in our previous example, zones are provinces in Canada and prefectures in Japan. In our YAML file for Canada we've added zone_key: province, and in Japan's we've added zone_key: prefecture; we translate these keys in our front end. We've applied the same logic to other countries and fields where needed. For example, we have zip_key: postcode for certain countries and zip_key: pincode for others. We include default values for all our label keys since we don't yet have a value for every country. An abridged country file might look like the sketch below.
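Here's an illustrative, abridged version of such a country file; treat the exact schema and key names as guesses rather than the real thing:

# ca.yml (illustrative; not the exact schema)
code: CA
name: Canada
currency: CAD
zone_key: province   # translated in the front end
zones:
  - code: BC
    name: British Columbia
  - code: QC
    name: Quebec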

Screenshot of the checkout in Japanese and English

 

Translations

As mentioned earlier, country names and zone names need translation, so we store most of them per language. We translate country names into all of our supported locales, but we only translate zones when necessary, based on usage and locale. Canada, for example, has English and French translations for now, so by default the provinces render in English unless your locale is fr. We'll grow our translations over time.

API Endpoint

Shopify is an ecosystem where many apps live. To ensure our data is up to date everywhere at the same time, we created an API endpoint to access it. This way, our iOS, Android, and front-end applications stay in sync when we introduce new formats for new countries; there's no need to update the information everywhere since every app uses the endpoint. Another advantage of this approach: in the future we might realize that some formatting isn't only country related but also locale related. For example, firstName and lastName are reversed when the locale is Japanese, regardless of whether the address is in Japan or Canada. Since the endpoint receives the locale with each request, this will be transparent to the client.

Creating Abstraction / Libraries

To make the lives of developers easier, we've created abstraction libraries. Yes, we want to localize our apps, but we also want to keep our developers happy, and asking them to query a graph endpoint and parse the formats we came up with is… maybe a bit much. So we've created abstractions to overcome this:

  • @shopify/address: a library that queries the endpoint and handles ordering fields and formatting addresses for a given country and locale
  • Other non-public components built on top of @shopify/address, such as an AddressForm and an Address, which add another easy abstraction for developers and display the address form as easily as the sketch below
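As a rough sketch (the component and prop names are hypothetical, not the actual internal API), using such a component could look like:

// Hypothetical usage of the (non-public) AddressForm component.
import React from 'react';
import AddressForm from '@shopify/address-form'; // illustrative name

function ShippingStep({address, onAddressChange}) {
  return (
    <AddressForm
      address={address}      // current field values
      locale="ja"            // drives labels and field ordering
      onChange={onAddressChange}
    />
  );
}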

The Future

This is the current state of how we're solving these problems. There are drawbacks we're still tackling, such as the fact that we need to fetch information before we can render an address; implementing a caching solution, for instance, would prevent a network call every time we want to render an address or an address form. But this will evolve as we gain more context and knowledge and grow our tooling around going international.


Intrigued by Internationalization? Shopify is hiring and we’d love to hear from you. Please take a look at our open positions on the Engineering career page.

Running Apache Kafka on Kubernetes at Shopify

In the Beginning, There Was the Data Center

Shopify is a leading multi-channel commerce platform that powers over 600,000 businesses in approximately 175 countries. We first adopted Apache Kafka as our data bus for reliable messaging in 2014, mainly using it for collecting events and log aggregation across our systems.

In that first year, our primary focus was building trust in the platform with our data analysts and developers by automating all aspects of cluster management, creating the proper in-house tooling needed for our daily operations, and helping them use it with minimum friction. Initially, our deployment was a single regional Kafka cluster in each of our data centers and one aggregate cluster for our data warehouse. The regional clusters mirrored their data to the aggregate cluster using Kafka's MirrorMaker.


Apache Kafka deployment in the data center

Fast forward to 2016, and we were managing many multi-tenant clusters in all our regions. These clusters are the backbone of our data superhighway — delivering billions of messages every day to our data warehouse and other application-specific Kafka consumers. Chef provisioned, configured, and managed our Kafka infrastructure in the data center, and we deployed a configuration change to all clusters at once by updating one or more files in our Chef GitHub repository.

Moving to the Cloud

In 2017, Shopify started moving some services from our data centers to the cloud. We took on the task of migrating our Kafka infrastructure to the cloud with zero downtime. Our target was to achieve reliable cluster deployment with predictable and scalable performance and do all this without sacrificing ease of use and security. Migration was a three-step process:

  1. Deploy one regional Kafka cluster in each cloud region we use, and deploy an aggregate Kafka cluster in one of the regions.
  2. Mirror all regional clusters in the data center and in the cloud to both aggregate clusters in the data center and in the cloud. This guarantees both aggregate clusters will have the same data.
  3. Move Kafka clients (both producers and consumers) from the data center clusters and configure them to point to the cloud clusters.

Apache Kafka deployment during our move to the cloud

By the time we migrated all clients to the cloud clusters, the regional clusters in the data center had zero incoming traffic and we could safely shut them down. That was followed by a safe shutdown of the aggregate Kafka cluster in the data center as no more clients were reading from it.

Virtual-Machines or Kubernetes?

We compared running Kafka brokers in Google Cloud Platform (GCP) as Virtual Machines (VMs) with running them in containers managed by Kubernetes, and we decided to use Kubernetes for the following reasons.

The first option, using GCP VMs, is closer in concept to how we managed physical machines in the data center. There, we have full control of the individual servers, but we also need to write our own tooling to monitor and manage the state of the cluster as a whole, and to execute deployments in a way that doesn't impact Kafka availability. For example, we can't perform a configuration change and restart all Kafka brokers at once, as this results in a service outage.

Kubernetes, on the other hand, offers abstract constructs to manage a set of containers together as a stateless or stateful cluster. Kubernetes manages a set of Pods, where each Pod is a set of functionally related containers deployed together on a server called a Node. To manage a stateful set of nodes like a Kafka cluster, we used Kubernetes StatefulSets to control the deployment and scaling of containers, with ordered and graceful rollout of changes and guarantees that prevent compromising overall service availability. And to implement custom behavior that Kubernetes doesn't provide, we extended it using Custom Resources and Controllers, an extension of the Kubernetes API that lets you create user-defined resources and implement actions when these resources are updated.

This is an example of a Kubernetes StatefulSet template used to configure a Kafka cluster of 30 nodes:

Kubernetes StatefulSet template
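The sketch below gives the general shape of such a StatefulSet; all names, sizes, and the image reference are placeholders, not Shopify's actual template:

# Illustrative Kafka StatefulSet sketch (values are placeholders).
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: kafka
spec:
  serviceName: kafka
  replicas: 30                       # one broker per Pod
  selector:
    matchLabels:
      app: kafka
  template:
    metadata:
      labels:
        app: kafka
    spec:
      affinity:
        podAntiAffinity:             # never co-locate two brokers on one node
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchLabels:
                  app: kafka
              topologyKey: kubernetes.io/hostname
      containers:
        - name: kafka
          image: registry.example.com/kafka:latest   # self-hosted image
          ports:
            - containerPort: 9092
          readinessProbe:            # gate rolling restarts on broker health
            tcpSocket:
              port: 9092
          volumeMounts:
            - name: data
              mountPath: /var/lib/kafka
  volumeClaimTemplates:              # persistent disk per broker
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 1Ti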

Containerizing Kafka

Running Kafka in a Docker container is straightforward. The simplest setup stores the Kafka server configuration in a Kubernetes ConfigMap and mounts the configuration file in the container by referencing the proper ConfigMap key. But pulling a third-party Kafka image is risky: depending on an image from an external registry risks application failure if the image is changed or removed. In a critical software environment where you want to minimize sources of failure, it's more reliable to build the image yourself and host it in your own registry, giving you more control over its content and availability, so we highly recommend doing so.

Best Practices

Our Kafka Pods contain the Kafka container itself and another resource-monitoring container. Kafka isn't friendly to frequent server restarts because restarting a Kafka broker or container means terabytes of data shuffling around the cluster, and restarting many brokers at the same time risks offline partitions and, consequently, data loss. These are some of the best practices we learned and implemented to tune cluster availability:

  • Node Affinity and Taints: Node affinity schedules Kafka containers on nodes with the required specifications, while taints guarantee that other applications can't use the nodes reserved for Kafka containers.
  • Inter-pod Affinity and Anti-Affinity: Anti-affinity prevents the Kubernetes scheduler from placing two Kafka containers on the same node (see the podAntiAffinity stanza in the sketch above).
  • Persistent Volumes: Persistent storage for Kafka Pods that guarantees a Pod always mounts the same disk volume when it restarts.
  • Kubernetes Custom Resources: Extend Kubernetes functionality; we use them to automate and manage Kafka topic provisioning, cluster discovery, and SSL certificate distribution.
  • Kafka broker rack-awareness: Reduces the impact of a single Kubernetes zone failure by mapping Kafka containers to multiple Kubernetes zones.
  • Readiness Probes: Control how fast we roll configuration changes out to cluster nodes, ensuring a broker is ready before the rollout proceeds.

We successfully migrated all our Kafka clusters to the cloud. We run multiple regional Kafka clusters and an aggregate one to mirror all other clusters before feeding its data into our data warehouse. Today, we stream billions of events daily across all clusters — these events are key to our developers, data analysts, and data scientists to build a world-class, data-driven commerce platform.


If you are excited about working on similar systems join our Production-Engineering team at Shopify here: Careers at Shopify 

 

Building Shopify POS for Android Using MVVM

There are many architectures out there for structuring your app. The one we use in Shopify's Point of Sale (POS) for Android app is the Model-View-ViewModel (MVVM) pattern, based on Google's App Architecture Guide announced last year at Google I/O 2017.

Shopify's Point of Sale (POS) for Android app

History

Our POS app is three and a half years old, and we didn't build it using MVVM from scratch. Before the move to MVVM, we had two competing architectures in our codebase: Model View Controller (MVC) and Model View Presenter (MVP). Both did the job, but they created inconsistency within the codebase. The developers on the team had difficulty switching between the two options, and we didn't have good answers for questions about which architecture to use when developing new screens and features. The primary advantages of adopting MVVM are a consistent architecture, automatic retention of state across configuration changes, and a clearer separation of concerns that leads to easier testing. MVVM helped new members of the team get up to speed during onboarding, as they can now find consistent, functional examples throughout the codebase and consult the official Android documentation, which the team uses as a blueprint. Google is actively maintaining the Android Architecture Components, so we get peace of mind knowing that we'll continue to reap the benefits as the library improves.

With a significant amount of code using legacy MVC and MVP architectures, we knew we couldn’t make the switch all at once. Instead, the team committed to writing all new screens using MVVM and converting older screens when making significant changes. Though we still have a few screens using MVC and MVP, there isn’t confusion anymore because everyone now knows there is one standard and how to incorporate it into our existing and future codebase.

Architecture

I’ll explain the basic idea and flow of this architecture by describing the following components of MVVM.

Flows in a Model-View-ViewModel Architecture

View: The View provides an interface for the user to interact with the app. In Shopify's POS app, a Fragment holds the View, and the View holds different sub-views which handle all the user interface (UI) interactions. When the user takes an action on the UI (for example, a button click or text change), the View tells the ViewModel about it via an interface callback. All of our MVVM setups use interfaces/contracts to interact with one another; we never hold references to the actual instance. For example, the View won't keep a reference to the actual ViewModel object, but instead to an instance of the contract object (described below in the example). The View's other task is to listen for LiveData changes posted by the ViewModel and update its UI with the new data content from LiveData.

ViewModel: The ViewModel is responsible for fetching data and providing the updated data back to the UI. The ViewModel is notified of UI actions via events generated by the View, for example, onButtonPressed(). Based on a particular action, it fetches the data state, mutates it as per the business logic, and tells the View about the new data by posting it to LiveData. The ViewModel instance survives configuration changes, such as screen rotations, so when the Activity or Fragment instance is re-created, it re-connects to the existing ViewModel instance and the data held by the ViewModel object remains available. The ViewModel dies when the associated Activity dies or the Fragment is detached.

ViewModelProvider: This is the class responsible for providing ViewModel to the UI component and retaining that ViewModel instance while the scope of the given Activity or Fragment is alive.

Model: The components that represent the data sources (e.g., the persistent model, web service, and cache). They're responsible for handling the data for the app. For example, if our app needs to get a list of users, it fetches it from the local database if available; otherwise, it fetches the data from the network and saves it in the database for later use.

LiveData: LiveData is an observable class that acts as a container for holding data. View subscribes to LiveData objects to get notified of any data updates. LiveData respects the lifecycle states of the app components, and it only passes the updates about data when the Fragment is in the active state, i.e., only the active observers get the updates.

Let me run through a simple example to demonstrate the flow of MVVM architecture:

1. The user interacts with the View by pressing Add Product button.

2. View tells ViewModel that a UI action happened by calling onAddProductPressed() method of ViewModel.


3. ViewModel fetches related data from the DB, mutates it as per the business logic and then posts the new data to LiveData.

4. The View which earlier subscribed to listen for the changes in LiveData now gets the updated data and asks other sub-views to update their UI with the new data.

Benefits of Using MVVM Architecture

Since Shopify moved to MVVM, we've taken advantage of the benefits this architecture has to offer. MVVM offers separation of concerns: the View is only responsible for UI-related logic, like displaying UI data and reacting to user actions, while the ViewModel handles data preparation and mutation tasks. Using contracts between the View and ViewModel provides a strong separation of concerns and well-defined responsibilities. Driving the UI from a ViewModel also means our data survives configuration changes; the data state is retained thanks to ViewModel caching.

Testing the business logic and UI interactions is easier and more efficient with MVVM. Because of the strong separation of concerns, we can test the business logic and the different view states of the app independently. We can perform screenshot testing on the View to check the UI, since it has no business logic, and similarly we can unit test the ViewModel without having to create Fragments and Views. You can read more in this article about creating verifiable Android apps on Shopify Mobile's Medium page.

LiveData takes care of complex Android lifecycle issues that happen when the user navigates through, out of, and back into the application. When updating the UI, LiveData only sends updates while the app is in an active state; when the app is in an inactive state, it doesn't send any updates, saving the app from crashes and other lifecycle issues.

Finally, keeping UI code and business logic separate makes the codebase easier to modify and manage for developers as we follow a consistent architecture pattern throughout the app.


Intrigued? Shopify is hiring and we’d love to hear from you. Please take a look at our open positions on the Engineering career page.

Creating Locale-aware Number and Currency Condensing

It’s easy to transform a long English number into an abbreviated one. Two thousand turns into 2K, 1,000,000 becomes 1M and 10,000,000,000 is 10B. But when multiple languages are involved, condensing numbers stops being so straightforward.

I discovered that hard truth earlier this year as Shopify went multilingual, allowing our 600,000+ merchants to use the Shopify admin in six additional languages (French, German, Japanese, Italian, Brazilian Portuguese, and Spanish).

My team is responsible for the front-end web development of Shopify Home and Analytics within the admin, which merchants see when they’re logged in. Shopify Home and Analytics are the windows into every merchant's customers and sales. One of the internationalization challenges we faced was condensing numbers worldwide for graphs displaying essential information, including sales, visits and customer data. Without shortening numbers, many merchants would see long numbers taking up too much space on a graph’s axis, throwing off the design of Shopify’s Admin.

Without condense-number

With condense-number

Team member Andy Mockler and I wrapped up most of the project in June, over Shopify's quarterly Hack Days, which give Shopifolk a two-day break from regular work to hack uninterrupted on a project of their choice. We realized that Hack Days presented the ideal opportunity to deliver this functionality and make it available to other developers in Shopify working on their internationalization goals.

Initially, we looked around to see if there was an existing JavaScript solution that worked for us. (Spoiler alert: there wasn't.) There's a built-in JavaScript Intl API for language-sensitive formatting, but the proposal to add number condensing to it isn't implemented yet. We found a couple of existing libraries that do a range of international formatting, but they either did more than we needed or were incompatible with our stack.

Ideally, we wanted to be able to take a number, like 3,000, and display an abbreviated version according to the audience’s locale. While 3,000 becomes 3K in English, it’s 3 mil in Portuguese, for example. Another consideration was different counting systems; India uses lakhs (1,00,000) and crores (1,00,00,000) instead of some Western increments like millions.

Through our research ahead of Hack Days, we stumbled across a treasure trove of international formatting data: the Unicode Common Locale Data Repository (CLDR). Unicode describes CLDR as the "largest and most extensive standard repository of locale data available." It's used by companies including Apple, Google, IBM, and Microsoft, and it contains information about how to format dates, times, timezones, numbers, currencies, places, and time periods. Most importantly for Andy and me, it contained almost all the information we needed about abbreviating numbers. Once we combined that data with currency information from Intl.js, we were able to write a small set of functions to condense both numbers and currencies according to locale.

Andy has more experience with open source packages than I do, and he quickly realized our code would be useful to other developers. Since our solution could help across Shopify and beyond, we decided to open it up for others to use. In July 2018, we released our package, condense-number, on npm. If you have any international number formatting needs, we'd love for you to give it a try. If we're missing a language or feature you'd like us to support, file an issue in the condense-number repository.
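Usage looks roughly like the sketch below; the exact export names and options are assumptions on my part, so check the condense-number README for the real API:

// Hypothetical usage sketch; verify names against the package README.
import {condenseNumber} from 'condense-number';

condenseNumber(3000, 'en');     // "3K"
condenseNumber(3000, 'pt');     // "3 mil"
condenseNumber(10000000, 'hi'); // condensed using crores for Indian locales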
Intrigued? Shopify is hiring and we’d love to hear from you. Please take a look at our open positions on the Engineering career page.

Building a Data Table Component in React

I’m a front-end developer at Shopify, the leading commerce platform for over 600,000 merchants across the globe. I started in web development when the industry used tables for layout (nearly 20 years ago) and have learned my way through different web frameworks and platforms as web technology evolved. I now work on Polaris, Shopify’s design system that contains design guidelines, content guidelines, and a React component library that Shopify developers use to build the main platform and third-party app developers use to create apps on the App store.

When I started learning React, its main advantage (especially for the component library of a design system) was obvious: everything in React is a component, intended to be reused. React props make it possible to choose which component attributes and behaviors to expose and which to hard-code. So, the design system can standardize design while making customization easier.

But when it came to manipulating the DOM in React, I admit I initially felt frustrated because my background was heavy in jQuery. It’s easy to target an element in jQuery using a selector, pull a value from that element using a baked-in method, and then use another method to apply that value. My initial opinion was that React over-engineered DOM manipulation until I understood the bigger picture.

As developers, we tend to read more code than we write and I’ve inherited my fair share of legacy code. I’ve wasted many hours searching through jQuery files for that elusive piece of code that’s creating that darn animation I need to change. jQuery event listeners are often in different files than the files containing the markup of the elements they’re targeting, making it all too easy to hide the source of animations or style changes.

However, a React component controls its behavior, so you can predict exactly what it’s meant to do. There are no surprises because there is no indirection. It’s also easier to tear down event listeners in React, resulting in better performance.

The first component I worked on with the Polaris team was the data table component, and it helped me realize what makes React such a powerful library. React’s component approach made it easy to create a stateful data table component and a stateless functional cell subcomponent. Its built-in lifecycle methods also provided more control over when to re-render the data table's cell heights.

Here are the basic steps we took to build the Polaris data table component in React.

The Challenge

Building a good data table is a common design challenge most of us have had to solve at least once. By nature, a table has an inflexible grid shape with a nearly infinite potential to grow both vertically and horizontally, but it still needs to be flexible to work well on all screen sizes and orientations. The data table needs to fulfill a few requirements at once: it must be responsive, readable, contextual, and accessible.

Must Be Responsive

For a data table to fit all screen sizes and orientations, it needs to accommodate the potential for several columns of data that surpass the horizontal edges of the screen. Typically, responsive designs either stack or collapse elements at narrow widths, but these solutions break the grid structure of a data table, so it requires a different design solution.

Responsive Design Stacking
Responsive Design Stacking


Responsive Design Collapsing
Responsive Design Collapsing

Must Be Readable

A typical use case for a data table is presenting product data to a merchant who wants to see which of their products earned the most income. The purpose of the data table is to organize the information in a way that makes it easy for the merchant, in Shopify's case, to compare and analyze, so proper alignment is important. A flexible data table solution can account for long strings of data without creating misalignment or compromising readability.

Must Be Contextual

A good experience for the user is a well-designed data table that provides context around the information, preventing the user from getting confused by seemingly random cell values. This means keeping headings visible at all times so that whichever data a user is seeing, it still has meaning.

Must Be Accessible

Finally, to accommodate users with screen readers a data table needs to have proper semantic markup and attributes.

Building a Data Table

Here’s how to create a stripped down version of the data table we built for Polaris using React (note: This post requires polaris-react. Polaris uses TypeScript, and for this example, I’ve used JavaScript). I’ve left out some features like a totals row, a footer, and sortable columns for the sake of simplicity.

Start With a Basic React Data Table

First, create a basic data table component that receives as props an array of headings and an array of rows. Map over these two arrays to extract cell content, then break <Cell /> out into its own subcomponent and pass content to it, as in the sketch below.
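Here's a minimal sketch of the idea; prop and class names are illustrative, not the Polaris source:

// A stripped-down data table: a Cell subcomponent plus a DataTable
// that maps headings and rows into table markup.
import React from 'react';

function Cell({content, header}) {
  const CellTag = header ? 'th' : 'td';
  return <CellTag className="Cell">{content}</CellTag>;
}

export default function DataTable({headings, rows}) {
  return (
    <table className="DataTable">
      <thead>
        <tr>
          {headings.map((heading, index) => (
            <Cell key={`heading-${index}`} content={heading} header />
          ))}
        </tr>
      </thead>
      <tbody>
        {rows.map((row, rowIndex) => (
          <tr key={`row-${rowIndex}`}>
            {row.map((content, cellIndex) => (
              <Cell key={`cell-${cellIndex}`} content={content} />
            ))}
          </tr>
        ))}
      </tbody>
    </table>
  );
}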





Basic Data Table Component

You can see the first problem in the image. With this many columns, the width of the table exceeds the screen width and scrolls the entire document horizontally, which isn’t ideal.

Basic Data Table Component Scrolling

One way to handle a wide table is to collapse the columns and make them expandable, but this solution only works with a limited number of columns. Beyond a certain number, the collapsed width of each column still exceeds the total screen width, especially in portrait orientation. The columns are also awkward to expand and collapse, which is a poor experience for users. To solve this, restrict the width of the table.

Making it Responsive: Add Max-width

Wrap the entire table in a div element with max-width: 100vw and give the table itself width: 100%.
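A sketch of that wrapper (inline styles keep the example short):

// Constrain the table to the viewport width.
function DataTableWrapper({children}) {
  return (
    <div className="DataTable-wrapper" style={{maxWidth: '100vw'}}>
      <table className="DataTable" style={{width: '100%'}}>
        {children}
      </table>
    </div>
  );
}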



Unfortunately, this doesn’t work properly at very narrow screen widths when the cell content contains long words. The longest word forces the cell width to expand and pushes the table width beyond the screen’s right edge.

Basic Data Table Component - Max Width

Sure, you can solve this with word-break: break-all, but that violates the design requirement to keep the data readable.

 

Basic Data Table Component - word-break: break-all


So, the next thing to do is force only the table to scroll instead of the entire document.

Making it Responsive and Readable: Create a Scroll Container

Wrap the table in a div element with overflow-x: auto so the overflow content scrolls within the container.
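Sketch of the scroll container, again with illustrative names:

// Let the table scroll horizontally inside its own container
// instead of scrolling the whole document.
function ScrollContainer({children}) {
  return (
    <div className="ScrollContainer" style={{overflowX: 'auto'}}>
      {children}
    </div>
  );
}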


Scroll all the way right to the last column, and you see the next problem: the data is difficult to understand without the context of the first column, which holds the product names in this example.

Basic Data Table Component - Missing First Column Context


With several rows of data to compare, it's difficult to remember which row corresponds to which product, and repeatedly scrolling left and right is a terrible experience for the user. As a solution, we chose to keep the first column visible at all times by fixing it in place and preventing it from scrolling along with the other columns.

Adding Context: Create a Fixed First Column

Give each cell in the first column an explicit width, then position them with position: absolute and left: 0. Then add margin-left: 145px to the remaining columns’ cells (the value must be equal to the width of the first column cells).

Add className="Cell-fixed" to the first cell of each row. The component maps through each row (not each column), so for simplicity we pass a boolean prop called fixed to the cell component, set to true if the current item is first in the array being mapped over. The cell component then adds the class name Cell-fixed to the cell it renders if fixed is true, as in the sketch below.
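A sketch of the fixed prop wiring; the CSS values follow the prose above and are otherwise illustrative:

/* Cell-fixed pins the first column; the 145px offset matches its width.
   .Cell-fixed { position: absolute; left: 0; width: 145px; }            */

function Cell({content, header, fixed}) {
  const CellTag = header ? 'th' : 'td';
  return (
    <CellTag className={fixed ? 'Cell Cell-fixed' : 'Cell'}>
      {content}
    </CellTag>
  );
}

function renderRow(row, rowIndex) {
  return (
    <tr key={`row-${rowIndex}`}>
      {row.map((content, cellIndex) => (
        <Cell
          key={`cell-${cellIndex}`}
          content={content}
          fixed={cellIndex === 0} // first item in the mapped array
        />
      ))}
    </tr>
  );
}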




Basic Data Table Component - Fixed Column


Using an absolute position on each first-column cell gives us a fixed first column but creates another problem.

Basic Data Table Component - Fixed Column Issue


Typically, the DOM renders each cell height to match the height of the tallest cell in the same row, but this behavior breaks when the cells are positioned absolutely, so cell heights need to be adjusted manually.

Fixing a Bug: Adjust Cell Heights

Create a state variable called cellHeights.


Set a ref on the table element that calls a function called setTable.


Then write a function called getTallestCellHeights() that targets the table ref and creates an array of all of its <tr> elements, using getElementsByTagName.

Absolute positioning converts the fixed column to a block and breaks the natural behavior of the table, so the cell heights no longer adjust to the height of the other cells in their row. To fix this, pull the clientHeight value from the fixed cell and from the remaining cells of each row, then use Math.max to find the tallest height in each row and return an array of those values.


Create a function called handleCellHeightResize() that calls getTallestCellHeights() and sets the heights state from the returned array.


The table needs to render first for the DOM to have clientHeight values to fetch, so call handleCellHeightResize() in the componentDidMount() lifecycle method and re-render the component. The sketch below pulls these steps together.
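Method names follow the prose above; the rest is an illustrative reconstruction rather than the actual Polaris source:

class DataTable extends React.Component {
  state = {cellHeights: []};

  componentDidMount() {
    // The table must render before clientHeight values exist.
    this.handleCellHeightResize();
  }

  setTable = (table) => {
    this.table = table;
  };

  getTallestCellHeights = () => {
    const rows = Array.from(this.table.getElementsByTagName('tr'));
    // For each row, compare the fixed cell and the remaining cells
    // and keep the tallest clientHeight.
    return rows.map((row) => {
      const heights = Array.from(row.children).map((cell) => cell.clientHeight);
      return Math.max(...heights);
    });
  };

  handleCellHeightResize = () => {
    this.setState({cellHeights: this.getTallestCellHeights()});
  };

  render() {
    const {headings, rows} = this.props;
    const {cellHeights} = this.state;
    return (
      <table ref={this.setTable} className="DataTable">
        <thead>
          <tr>
            {headings.map((heading, index) => (
              <Cell key={index} content={heading} height={cellHeights[0]} header />
            ))}
          </tr>
        </thead>
        <tbody>
          {rows.map((row, rowIndex) => (
            <tr key={rowIndex}>
              {row.map((content, cellIndex) => (
                <Cell
                  key={cellIndex}
                  content={content}
                  fixed={cellIndex === 0}
                  // +1 skips the headings row's entry in cellHeights
                  height={cellHeights[rowIndex + 1]}
                />
              ))}
            </tr>
          ))}
        </tbody>
      </table>
    );
  }
}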


When mapping over the headings and rows arrays, use the same index to target the correct value in the heights array and pass it to each <Cell /> as a height prop. Because the heights array contains the heights of all rows, and there are two separate calls to <Cell /> (one for the headings and one for the table body), increment the row index by 1 in renderRow() to skip past the headings row's value.



We're close now, but there's one final bug to solve: handleCellHeightResize() is called after the component mounts and is never called again unless the page is refreshed. This means the height values for each cell stay the same even if the window is resized.

 

Set up an event listener and call the function any time the window is resized so the cell heights readjust. In this example, I've used the event listener component already in Polaris.
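The post uses Polaris's own event listener component; a plain window listener, sketched below, accomplishes the same thing:

componentDidMount() {
  this.handleCellHeightResize();
  // Recalculate heights whenever the window is resized.
  window.addEventListener('resize', this.handleCellHeightResize);
}

componentWillUnmount() {
  window.removeEventListener('resize', this.handleCellHeightResize);
}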


Making it Accessible

Two important attributes make a data table accessible: a caption that a screen reader will read, and a scope attribute for each cell. For more details, the a11y project has an article about how to create accessible data tables.
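In markup, that amounts to something like this (the caption text and data are illustrative):

<table className="DataTable">
  {/* Screen readers announce the caption before reading the table. */}
  <caption>Sales by product</caption>
  <thead>
    <tr>
      <th scope="col">Product</th>
      <th scope="col">Price</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      {/* scope="row" ties the row's data cells to this header. */}
      <th scope="row">Navy merino wool blazer</th>
      <td>$445.00</td>
    </tr>
  </tbody>
</table>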





A Responsive, Accessible Data Table Component


And there you have it, a responsive, accessible data table component in React that can be used to compare and analyze a data set. Check out the full version of the Polaris React data table component in the Polaris component library that includes a full set of features like truncation, a totals row, sortable columns, and navigation.

If you are passionate about design systems and excellent user experience, check out our job openings! Reach out to me on Twitter or have a look at the job posting.

Lost in Translations: Bringing the World to Shopify

At Shopify, the leading multi-channel commerce platform that powers over 600,000 businesses in approximately 175 countries, we aim to make commerce better for everyone, everywhere. Since Shopify's early days, it's been possible to provide customers with a localized, translated experience, but merchants had to understand English to use the platform. Fortunately, things have started to change. For the past few months, my team and I have focused on international expansion: new shipping patterns, new payment paradigms, compliance with local laws, and much more to explore. The biggest challenge, however, has been preparing the platform for our translation efforts.


I speak French. Growing up, I learned that things have genders. A pencil is masculine, but a feather is feminine. A wall is a he, but a chair is a she. Even countries have genders, too — le Canada, but la France. It’s a construct of the language native speakers learned to deal with. It’s so natural, one can usually guess the gender of unknown or new things without even knowing what they are.

Did you know that in English, zero dictates the plural form? We'd say zero cars, car being plural. But in French, zero is always singular, as in zéro voiture. Singular, no s. Why? I don't know, but each language has its quirks. Sometimes they're obvious, like genders; sometimes they're more subtle, like a special pluralization rule.

Shopify employs hundreds of developers working on millions of lines of code. For the past twelve years, we collectively hardcoded thousands and thousands of English strings scattered across all our products, oblivious to our future international growth. It would be great if we could simply replace words from one language with another, but differences like gender and pluralization force us to rethink established patterns.

We had to educate ourselves, build new tools, and refactor entire parts of our codebase. We made mistakes, tried different things, and failed many times. But now, six months after we started, Shopify is available in a variety of languages. What you’ll find below is a small collection of thoughts and patterns that have helped us succeed.

Stop the Bleeding

The first step, as with any significant refactoring effort, is to stop the bleeding. At Shopify, we deploy hundreds of hardcoded English words daily. If we were to translate everything that exists today, we'd have to do it again tomorrow, and again the day after, because we're always deploying new hardcoded words. As brilliantly explained by my colleague Simon Hørup Eskildsen, it's unrealistic to think you can align everyone with an email or fix everything with a single pull request.

Fortunately, Shopify relies on automated tooling (cops, linters, and tests) to communicate best practices and correct violations. It’s the perfect medium to tell developers about new patterns and guide them with contextual insights as they learn about new practices. We built cops and linters to detect a variety of violations:

  • Hardcoded strings in HTML files
  • Hardcoded strings in specific method arguments
  • Hardcoded date and time formats

How we built the cops and linters could be a post on its own, but the concept is what matters here: we knew a pattern to avoid, so we built tools to inform and correct. These tools gave developers a strong feedback loop, prevented the addition of new violations, and gave us an estimate of the size of the task in front of us.

Automate the Mundane

Shopify has, relatively speaking, quite a big codebase. Thanks to our cops and linters, we build all new features with translation in mind. However, all the hardcoded content that existed before our intervention still had to be extracted and moved to dictionaries. So we did what any engineer would do: we built tools.

Linters made identifying violations easy. We ran them against every single file of our application and found a significant number of items in need of translation. After identification, we opted for the simplest approach: create a file named after the current module, move the actual content there, and reference it through a key created from a combination of the file path and the content itself. Slowly but surely, all the content moved to dictionaries. The results weren't perfect; there was duplicated content, and the reference names weren't always intuitive. Despite this, we extracted most of the basic and mundane stuff, like static content and documentation. What was left were edge cases like complex interpolations, which I like to call fun challenges.

Pseudolocalization to the Rescue

Distinguishing the extracted content from everything else quickly became a challenge. Yes, some sentences were now in dictionaries, but the product looked exactly the same as before. We needed to tell hardcoded and extracted content apart, all while keeping the product in a usable state so that translators, content writers, and product managers could stay informed about our progress. Enter pseudolocalization.

Pseudolocalization (or pseudo-localization, or pseudo-translation) is a software testing method used for examining internationalization aspects of software. Instead of translating the text of the software into a foreign language, as in the process of localization, an altered version of the original language replaces the textual elements of an application.

We created a new backend built on top of Rails I18n, the default Rails framework for internationalization, that hijacked all translation calls and swapped the resulting characters with altered yet similar alternatives: a became α, b became ḅ, and so on.

Word lengths differ from one language to another. On average, German words are 30% longer, which has the potential to seriously mess up a UI built without this knowledge. In French, a simple word like "Save" translates to "Sauvegarder", which is almost 200% longer. Since our pseudotranslation module intercepted all translation calls anyway, we took the opportunity to double all vowels in an attempt to mimic languages with longer words. The result remained remarkably readable: we easily distinguished extracted content from hardcoded content and could visually test the UI against longer words.
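The actual backend hooks into Rails I18n, but the transformation itself is simple; here's the gist of it in JavaScript (the character map is abridged, and case handling is omitted for brevity):

// Swap characters for accented look-alikes and double vowels
// to mimic longer languages.
const CHAR_MAP = {a: 'α', b: 'ḅ', e: 'é', o: 'ö', s: 'ş'};
const VOWELS = new Set(['a', 'e', 'i', 'o', 'u']);

function pseudolocalize(text) {
  return Array.from(text.toLowerCase())
    .map((char) => {
      const swapped = CHAR_MAP[char] || char;
      return VOWELS.has(char) ? swapped + swapped : swapped;
    })
    .join('');
}

pseudolocalize('Save'); // "şααvéé": clearly marked as extracted, and longer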

Pseudotranslation in Action on Shopify

ASCII is Dead, Long Live UTF8

Character sets also proved to be a fun challenge. Shopify runs on MySQL. Unfortunately, MySQL's default utf8 isn't really UTF-8: it only stores up to three bytes per code point, which means no support for hentaigana, emoji, and other characters outside the Basic Multilingual Plane. Unless explicitly told otherwise, most of our tables didn't support emoji characters and thus needed migration.

On the application side, Rails isn't perfect either. Popular methods such as parameterize and ordinalize don't come with international support built in.

Identifying and fixing all of these broken behaviors wasn't an easy task, and we're still finding occurrences here and there. There's no secret sauce or truly generic approach: some bugs were fixed right away, others were simply deprecated, and some fixes were only rolled out to new customers.

If anything, one trick to try is introducing UTF-8 characters into your fixtures and other data seeds. The more exposed you are to other character sets, the more likely you are to stumble on broken behavior.

Translation Platform

Preparing content for translation is one thing; getting it actually translated is another. Now that everything was in dictionaries, we had to find a way for developers and product managers to request new translations and to talk to translators in a lean, simple, and automated way.

Managing translations isn’t part of our core expertise and other companies do this more elegantly than we ever could. Translators and other linguists rely on specialized tools that empower them with glossaries, memories, automated suggestions, and so on.

So, on one side of this equation we have GitHub and our developers, and on the other, translators and their translation management system. Could GitHub's API, coupled with our translation management system's API, help bridge the gap between developers and translators? We bet that it could.

Leveraging APIs from both sides, we built an internal tool called "Translation Platform". It's a simple and efficient way for developers and translators to collaborate in a streamlined, automated manner. The concept is quite simple: each repository defines a configuration file that indicates where to find the language files, what the source language is, and what the targeted languages are. A basic example would look as follows:
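A hypothetical configuration in that spirit (the key names are guesses, not the real schema):

# translation.yml (illustrative; key names are not the real schema)
source_language: en
target_languages:
  - fr
  - de
  - ja
  - it
  - pt-BR
  - es
paths:
  - config/locales/   # where the language files live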

Once the configuration file is in place, the Translation Platform starts listening to GitHub webhooks and automatically detects when a change impacts one of the repository's language files. If one does, it uses the translation management system's API to issue a new translation request, one per targeted language. From a translator's standpoint, the tool works similarly: it listens to the translation management system's webhooks, detects when translations are ready and approved, then automatically creates a new commit or pull request with the newly translated content.

Shopify's Translation Platform

The Translation Platform made gathering translations a seamless process, similar to running tests on our continuous integration environment. It gives us visibility into the entire flow while letting us gather logs, metrics, and data we can later use to provide SLAs and guarantees on translation requests. The simplicity of the Translation Platform was key to successfully introducing our new translation processes across the company.

Future Challenges

Localization challenges don’t stop with words. Every single UX element needs examination through an international lens. For example, shipping and payment are two concepts that vary significantly from one market to another. The iconography that accompanies them must acknowledge these differences and cultural gaps that may exist. A mailbox doesn’t look the same in Japan as it does in France. A credit card isn’t used as much in India as it is in North America.

Maps and geography represent another intriguing challenge. Centering a world map over Japan instead of the UK can go a long way with our Japanese merchants. The team needs to take special care with regions like Taiwan and Macau, which can lead to serious conflicts if not labeled correctly, especially when what's considered "correct" changes depending on whom we ask.

Number formatting, addresses, and phone numbers are all things that change from one region or language to another. If something requires formatting for display purposes, the format will change with the country or the language.

We’re only at the beginning of our journey. The internationalization and globalization of a platform isn’t a small task but an ongoing effort. The same way our security experts never sleep, we expect to always be around, informing our peers about language specificities, market subtleties, and local requirements.


My name is Christian and I lead the engineering team responsible for internationalization and localization at Shopify. If these types of challenges are appealing to you, feel free to reach out to me on twitter or through our career page.

Mohammed Ridwanul Islam: How Mentorship, the T Model and a Pen Are the Keys to His Success
Mohammed’s feature is part of our series called Behind The Code, where we share the stories of our employees and how they’re solving meaningful problems at Shopify and beyond.

Mohammed Ridwanul is a software engineer on the Eventscale team and joined Shopify a year and a half ago.

Mohammed was born in Noakhali, a small village in Bangladesh, and grew up in Dubai after moving there when he was five. The village was far removed from technology: most areas had no electricity, and you could count the number of TVs on one hand. The people of Noakhali were extremely practical and had ingenious solutions to the problems that arose, and adults with an engineering education or background were highly regarded for how they improved the quality of life in the village. This inspired and motivated Mohammed to pursue a career in engineering, and he hopes, eventually, to impact communities the way those individuals did his.

What has your career path looked like?
I’ve had the opportunity to work in different industries including sales, advertising, and design. Also, I’m an avid musician and love making my own music and doing shows with my band. With all these different skills, I thought perhaps I could make my own game. While trying to learn everything I could about game development, I wrote my first line of code which was in C#.

All my experiences have one thing in common; I love to face tough challenges and see a rapid manifestation of the things I do or build. So, I studied engineering and got an internship working at Shopify during my undergrad which turned into my current full-time role.

What type of Engineering did you study?
I went to the University of Waterloo and took a Bachelor of Applied Sciences in Electrical Engineering.

What does your team do at Shopify?
The Eventscale team is part of the Data Platform Engineering organization. Shopify receives an immense amount of data. Acquiring such large amounts of data so that we can clean it, process it, store it reliably, and provide easy access for analysis requires highly performant, specialized tools and infrastructure. The Data Platform Engineering team is responsible for building these tools.

The Eventscale team builds the tools, libraries, and infrastructure to collect event-oriented streaming data. This data is used for both internal and merchant analytics and other operational needs. We build for all platforms at Shopify including web, backend, and mobile.

What was something difficult for you to learn, and how did you go about acquiring it?
During my first time leading a team project, I had some challenges learning useful team-management principles. Understanding the needs of each team member and aligning everyone to a shared vision and goal required a different set of skills that took time and experience to learn. Luckily, my senior co-workers consistently mentored me and taught me concepts such as project cost estimates, team-management strategies, success metrics, and other fundamental project-management principles. My team lead also guided me towards several books and whitepapers from other companies, which have helped me develop strong opinions on project management and strategy. Check out my Goodreads profile for a list of those books and read Ben Thompson's work on Stratechery.com.

How does your daily routine help you cultivate a good work ethic?
Mohammed Ridwanul Islam's Daily Routine
Habits, in my opinion, are useful in navigating life. I believe humans are creatures of habit; it's challenging to carry the constant cognitive load of telling yourself to do x, y, and z tasks that are good for you. Instead, by building a habit, you reduce the load as your body and mind start to realize that this is a way of life. My daily routine helped me achieve this habit formation.

What’s your favorite dev tool?
VIM. It has a learning curve, but you can have so much fun with it once you learn it. VIM is an editor you can mold into your own little product; personalized for you with custom configurations using dotfiles. You can pretty much make it behave however you want. I love it! If you’re interested, feel free to check out my custom VIM settings.

What’s your favorite language and why?
Java, mostly because it’s a strongly typed language, and to this day I prefer explicitly defining types without having the language make assumptions on types.

Are you working on any side projects?
Yes, I’m working on an enterprise project management software that can be used by a consulting team to manage a large number of projects in parallel. Essentially, it’s a centralized repository for all the current projects that the consultants are handling, along with the cost breakdown and timeline details. Also, it allows the user to dig into each project further and keep records of how human resources are applied. The software tries to enforce a framework of thinking about resource management and project strategy which I have developed over the years.

What are some ways you think through challenging work?
Writing things down on paper has been my go-to method to work through challenging things. I don’t start writing code until I’ve designed the overall larger components on paper. Similarly, for any other situations in life, writing has always helped me tackle challenges.

What book(s) are you currently reading?
Designing Data-Intensive Applications by Martin Kleppmann and The Essential Rumi by Rumi.

What is the best career advice you’ve gotten?
It doesn’t matter what you do as long as it meets two criteria: 1. It positively impacts society and is aligned with your values, and 2. It allows you to push and grow yourself by doing work to the best of your abilities.

What kind of advice would you give to someone trying to break into the technology industry?
I’m a big fan of the “T” model of learning, which essentially states that you should try and be competent in a few different things (small horizontal line), but you should strive to be the authoritative figure for at least one thing (longer vertical line). Programming might be the tool used to solve tough engineering problems, but the ability to solve problems is the more critical skill. So focus on chiseling that ability which comes with exposure and specialization in one specific area.

If you’d like to get in touch with Mohammed, check out his website www.mohammedri.com.

We’re hiring! If you’re interested in joining our team, check out our Engineering career page for available opportunities.

Dev Degree - A Big Bet on Software Education

“Tell me and I forget, teach me and I may remember, involve me and I learn.”
- Benjamin Franklin


When I decided to study computer science at university, my parents were skeptical. They didn’t know anyone who had chosen this as a career. Computer science was, and still is, in its infancy. Software development isn’t pure science or pure engineering — it’s a combination of the two, mixed with a remarkable amount of artistic flare. It's a profession where you grow by learning the theory and then doing. A lot of doing. It’s a profession that’s increasingly in demand. And it’s a profession so new that schools are still learning how to teach it. The supply isn’t matching the demand; not even close.

Our industry is fraught with critical shortages of skills and diversity — software developers are more valuable to companies than money [1]. It's pretty obvious: we have to invest aggressively in growing and developing software professionals, more than ever.

Shopify has figured out an important part of how to solve these problems. We call it Dev Degree — a work-integrated learning (WIL) program that combines an accredited university degree with double the experience of a traditional co-op. The program is already in its 3rd year, and it’s time to talk about why it’s a big deal to us.

The Beginnings of Dev Degree

While I was living and working in Australia, my company invested in hiring hundreds of graduate developers. The graduates were intelligent and knew their theory, but they lacked the fundamental skills and experience required for software development. This held them back from making a quick impact at our small but growing company.

To fill in the gaps, we developed an internal training program for new graduates. It helped them level up faster and turned the best practices they learned in school into practical skills for the world of software development. It wasn't long before I recognized that this knowledge gap wasn't an isolated incident. It wasn't just one university churning out students ill-prepared for the workforce; it was a systemic issue.

I decided to tour Australian universities and talk to their Computer Science departments. I pitched the idea of adding pieces of our training program to their curriculum to better prepare students for their careers. My company even offered to pay to develop the program. The universities loved the idea, but they didn't know how to make it a reality within their academic frameworks. I saw many nods of agreement on that tour, but no action.

Dev Degree started, in earnest, when I returned to Canada and joined Shopify. The main lesson I learned from Australia was that universities couldn’t implement a WIL curriculum without industry partners in a true long-term arrangement. Shopify seemed born to step into that role. When I approached Tobi with this embryo of an idea, he was on board to make it a reality. Tobi had his own positive experience with apprenticeships in Germany. Our shared passion for software development and Canada motivated us to give this idea another shot, and we started searching for a university partner.

Canadian universities were eager to get involved, but again, most weren’t sure how to make it happen. For many, the question was: how is this different from our co-op program?

The co-op model is straightforward. Students alternate between a school term and a work term throughout their program. In this structure, students are thrown over the wall of academia into an industry with no connection to their curriculum. WIL, on the other hand, requires a structural change to the education system that creates a fully integrated, deep learning experience for students. To do this properly, we needed to change the curriculum and assessments, fully integrate universities and companies, launch new learning programs, and provide additional student support. This was a multi-dimensional problem.

Carleton University rose to the challenge, becoming the first and founding university partner of Dev Degree. Their team understood the value of WIL and were already exploring ways to incorporate this style of learning when we met. It was clear to both sides that we had found the perfect partner to make WIL a successful reality. We were both eager to innovate and weren’t afraid to make huge structural changes to our operations.

Carleton didn't just talk about being involved; they developed an entirely separate stream of their Bachelor of Computer Science program that allocates over 20% of credits to student practicums. This required Carleton's Senate approval, which was granted after thoughtful debate. Our first strong partnership was formed, and we were ready to get started.

Inside Dev Degree

The Dev Degree Family


The core of the Dev Degree model is building tighter feedback loops between theory and practice while layering in programming and personal growth skills early on. Each semester, students take three courses at university and spend 25 hours a week at Shopify.

Because K-12 software education is lacking, we wanted to turbo-boost students to be able to write and deploy production software, solving real problems, before they even graduate. Our bet was that this model would better engage a more diverse set of students, empower deeper understanding, and foster more critical thought when building software.

Dev Degree - Hands-On Learning

These types of challenges are not part of the university curriculum — students can only get this experience in an industry setting. Thomas Edison said that genius is 1% inspiration and 99% perspiration. By that measure, Dev Degree is a real-time training program in experimental perspiration.

But there's also a strong link to validating that competencies are acquired. The partner university allocates at least 20% of the degree's credits to students' work with Shopify development teams. Students write a practicum report at the end of every term (every four months) and submit it to the university, describing how they have achieved specific learning outcomes. The learning outcomes used in the Dev Degree program were influenced by standards from the Association for Computing Machinery (ACM) and the IEEE Computer Society.

During the first two years, we learned a lot. It wasn't a smooth ride as we ironed out how best to deliver this program with the university, the students, and teams inside Shopify. Here are some of the most important lessons we've learned.

Key Lesson #1: Re-Learn True Collaboration

During our school career, we learn that the final mark is most important. We strive to deliver the perfect assignment to get that A+. This is the complete opposite of how to get good results in the real world. The best students, and the most successful people, are the ones who share their ideas early, get feedback, experiment, explore, re-compose, and iterate. They embrace failure and keep trying.

The end result is important, but you have to cheat to get the best version of it. Sounds counterintuitive, I know. But by “cheating,” I mean asking people for help and incorporating the lessons they teach you into your own work. Collaboration is a prerequisite for true learning and growth. The Lone Wolf mentality instilled in students by years of schooling is more difficult to change than we anticipated, but working directly alongside other developers and pairing regularly allowed us to break down those habits over time.

Key Lesson #2: Start with Development Skills

Our first cohort joined Shopify after three months of Developer Skills Training, based on the ACM framework I mentioned. This was quite ambitious on our end, but we hoped it was enough time to prepare them for the real-world work they would do with our teams.

It wasn’t. After the three months, our students still didn’t have enough knowledge to make a strong impact at Shopify. To better support them, our Dev Degree team hosted additional workshops on various developer tools and technologies to get them up to speed, but we knew there was more to be done.

It was clear that we needed to pivot the first year of our program to focus more heavily on Developer Skills Training. Our students needed to be better prepared to enter a fast-paced team building impactful products. Now, Dev Degree students participate in Developer Skills Training for their entire first year at Shopify. By tripling the amount of time they spend in training, we’ve seen Dev Degree students create earlier and more positive impacts on Shopify teams.

Key Lesson #3: Mentorship Comes in Many Forms

In 2016, students were paired with technical mentors once they joined a development team. The technical mentor is a software developer who guides their mentee on a daily basis by giving direction, reviewing work, offering feedback, and answering questions. While this was successful, we identified a gap where we weren’t equipping students with the tools and support they needed to transition into the workforce. We were giving them tons of technical support, but that didn’t necessarily help them conquer the social aspects of the job.

Now, Dev Degree students receive an additional layer of mentorship. Each student is paired with two people: a technical mentor and a Life@Shopify mentor. The Life@Shopify mentor is a trusted supporter, friend, and guide who provides a listening ear and supports the student’s growth. It’s a big leap to go from high school to being a trusted member of a company. We’ve found that this combination provides students with a diverse range of support throughout their time at Shopify.

The Results

To put it bluntly, the Dev Degree model works.

We see above-average retention rates compared to traditional academia. Generally, 20-50% of students drop out of their initial program or leave postsecondary education completely. In Dev Degree, our retention rate is 95%. We’ve increased gender diversity in the program, with women accounting for over 50% of Shopify Dev Degree students, a dramatic rise from the 19% of computer science graduates who are women.

Companies have been directing 66% of their philanthropic tech-education funding to K-12 programs, with only 3% going to post-secondary programs. But we need to look at the entire education system to solve the skills shortage and lack of diversity in STEM programs. And it needs to happen faster.

Traditionally, new graduates hired at Shopify take anywhere from six months to two years to fully complete onboarding and start making an impact on development teams. Skill acquisition in our WIL program happens three times faster than in the average developer education: Dev Degree students become productive members of their teams only nine months into the program, instead of up to two years after graduation.

We have a lot more to learn, and we’re not done yet. While we’re excited by our early results, a true measure of success will be seeing more universities and industry partners adopt this model. We’re working to scale the program with our partners so that the Dev Degree model starts popping up all over Canada.

That’s why we’re excited to announce the expansion of our Dev Degree program to York University’s Lassonde School of Engineering! Our first Toronto-based students have started their journey with Dev Degree, and we’re excited to see what challenging problems they’ll solve.

None of this would be possible without our academic partners at Carleton and York who worked relentlessly to get Senate approval for new WIL computer science streams and design the model itself. We truly believe that if more universities worked hand-in-hand with industry to better prepare students for the workforce, Canada would become the leader in talent development for years to come.

Continue reading

Introducing the Deprecation Toolkit

Shopify is happy to announce that we’ve open sourced the Deprecation Toolkit, a Ruby gem that efficiently keeps track of deprecations in your codebase.


At Shopify, the leading cloud-based, multi-channel commerce platform with 600,000+ merchants in over 175 countries, upgrading our dependencies is a frequently applied best practice. We even have bots that automatically upgrade dependencies when a minor version is released.

However, more complex upgrades require human intervention, and the time required varies from dependency to dependency, with some taking years. We realized that we could speed up this process if our application used as little deprecated code as possible.

The motivation for building the Deprecation Toolkit came after a few unsuccessful attempts to prevent the hundreds of developers working on our monolith from accidentally using deprecated code, whether from libraries or from our own codebase.

Why Should You Use This Gem and How Can It Help?

“Did I just call a new deprecated method?” 🤔

If you are the creator or maintainer of a library, or if you’d like to deprecate methods in your application, you have a couple of options to notify consumers of your code about a future API change. The most common option is to output a warning message on the standard output explaining the change happening in the next release.

This approach has a major caveat: it doesn’t prevent developers from using the deprecated code by accident. The only warning is the deprecation message, which is very easy to miss and becomes impossible to spot if there are already a lot of them.

The second option is to provide a callback mechanism that runs whenever a deprecation is triggered. If you are familiar with Ruby on Rails or Active Support, you might have heard of the ActiveSupport::Deprecation module, which allows you to configure a behavior of your choice that gets called whenever a deprecation is triggered. Active Support provides a few behavior options by default; the two most common are log and raise.
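
As a rough illustration, here’s what switching behaviors looks like, assuming the Active Support API as it existed around the time of writing (in a Rails app this would typically live in a config/environments file):

require "active_support/deprecation"

# Log every deprecation message:
ActiveSupport::Deprecation.behavior = :log

# Or raise as soon as deprecated code is called:
ActiveSupport::Deprecation.behavior = :raise

# Custom behaviors are plain callables taking the message and callstack:
ActiveSupport::Deprecation.behavior = ->(message, callstack) { $stderr.puts(message) }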


Raising an error when deprecated code is triggered looked like a solution, but it would mean we’d have to fix every single deprecation before activating the configuration; otherwise, our CI wouldn’t pass and that would block developers from doing their daily tasks. We needed an approach that didn’t require fixing all deprecations at once: existing deprecations would be treated as “acceptable,” giving us time to fix them gradually, while new deprecations would raise errors. This is the approach we took with the Deprecation Toolkit.

Internally, we called this process “shitlist-driven development.” My colleague Flo gave an amazing talk on it at Red Dot Ruby Conference 2017, which you can view, called “Shitlist-driven development and other tricks for working on large codebases.”

How Does It Work?


The Deprecation Toolkit uses a whitelist approach. First, you record all existing deprecations in your application by running your test suite, either locally or on CI. The toolkit writes each deprecation triggered by a given test into YAML files. These YAML files constitute your whitelist of acceptable deprecations.

The next time your tests run, the toolkit compares all the deprecations triggered during the test run against the ones marked as acceptable. If a mismatch is found, it means a deprecation was either introduced or removed; either way, the Deprecation Toolkit triggers the behavior of your choice, which by default raises an error.
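
To make the comparison concrete, here’s a toy illustration of that check, not the gem’s actual internals; the file path and messages are invented:

require "yaml"

# Deprecations previously recorded for this test (the whitelist),
# stored as a YAML array of messages.
recorded = YAML.load_file("test/deprecations/checkout_test.yml")

# Deprecations triggered during the current test run.
triggered = ["`foo` is deprecated and will be removed in 1.0"]

introduced = triggered - recorded # new deprecations: should fail the build
removed    = recorded - triggered # fixed deprecations: re-record the file

unless introduced.empty? && removed.empty?
  raise "Deprecation mismatch! introduced=#{introduced.inspect} removed=#{removed.inspect}"
end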

The toolkit has many configuration options; however, if the default configuration suits your needs, all you need to do is add the gem to your Gemfile. The Deprecation Toolkit README has a detailed configuration reference to help you set up the toolkit the way you need. You can, for example, configure the toolkit to ignore some deprecations, dynamically determine where deprecations should be recorded, or even create custom behaviors for when new deprecations are introduced.
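
As a sketch of a non-default setup, loosely based on the gem’s README at the time of writing (double-check the option names there before copying), a test_helper.rb might contain:

require "deprecation_toolkit"

# Where the recorded YAML whitelist files live.
DeprecationToolkit::Configuration.deprecation_path = "test/deprecations"

# Deprecations to ignore entirely, matched against the message.
DeprecationToolkit::Configuration.allowed_deprecations = [/SomeNoisyGem/]

# Raise when a test triggers a deprecation that isn't recorded.
DeprecationToolkit::Configuration.behavior = DeprecationToolkit::Behaviors::Raise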

Deprecation Toolkit in Action

Keeping your system free of deprecations is part of having a sane codebase, whether those deprecations come from libraries or from your own code. We’ve used the Deprecation Toolkit in our core application for about a year now. It helped us significantly reduce the number of deprecations in our system and contributed to speeding up our dependency upgrade process. It’s also instrumental in getting every developer involved in fixing deprecations, since pull requests can’t be merged if they introduce new deprecations.

Last but not least, we gamified fixing existing deprecations amongst developers. All deprecations were grouped by component and assigned an owner, usually a team lead, to help fix them. Over time, we counted the failures and progression of each team. All participating teams viewed their results in a shared Google sheet. Splitting the deprecated code into chunks and assigning each one to a different owner made the process super smooth and even faster.

Give the Deprecation Toolkit a try; we are looking forward to hearing if it helped you and how we can improve it! If the current workflow doesn’t work for you or if you’d like to see a new feature in this gem, feel free to open an issue in our issue tracker.

Continue reading

Mobile Tophatting at Shopify

At Shopify, the leading cloud-based, multi-channel commerce platform for 600,000+ merchants in over 175 countries, it’s crucial to test and verify the functionality of new features introduced to the platform. Since the company doesn’t have a QA team by design, testing features is the developers’ responsibility. Each project contains automated test steps that execute on our continuous integration (CI) infrastructure, and developers perform additional manual checks.


One of those manual checks is trying out the changes before merging them into the codebase. We call this process “tophatting” after the 🎩 emoji. Back when GitHub didn’t have support for code review requests, Shopify relied on emojis to easily communicate the state of the code review process. 🎩 indicates that the reviewer not only looked at the code but also ran it locally to make sure everything works as expected, especially when the changes affect the user interface.

The tophat process requires the developer to save their current work, check out a different git branch, set up their local environment for that branch, and build the app. For mobile developers, this process is tedious because changing the git branch often invalidates the cache, increasing the build time in Xcode and Android Studio. Depending on the project, it can take up to 15 minutes to build the app, during which developers can’t do any other work in the same project.

To eliminate their pain points and facilitate best practices, we’ve created a fast and frictionless tophatting process which integrates seamlessly with our CI infrastructure and dev, Shopify's all-purpose development tool that all mobile developers have running in their environments. In this post, I’ll describe how we built our frictionless tophatting process and show you an example of what it looks like.

Setting up Projects for Tophat

The slowest part of the mobile tophatting process is compilation, so we skip it entirely: we already build the apps on CI, and the application binaries are available in the disposable environments created for running the PR builds. We updated the projects’ pipelines to export the binaries so that we can list and access them through the CI API. Depending on the platform (iOS or Android), the exported app has a different format:

  • iOS: Apps are folders, so we zip them using a naming convention that includes the app’s name and version. For example, version 3.2.1 of the exported Shopify app would be named Shopify-3.2.1.app.zip
  • Android: APK files are already zip archives, so we export them under their existing names. 

Once the apps are exported we leverage GitHub commit statuses to let developers know that their PRs have tophattable builds:

Tophat Github Commit Status

Command Line Interface

Dev is an internal tool that provides a set of standard commands across all the projects at the company (you can read more about it on devproductivity.io). One of the commands that backend developers use is tophat and we extended its use to support mobile projects.

The command looks like:

dev platform tophat resource


Where platform can be either ios or android and the resource can be any of the following:

  • Pull request URL: For tophatting other developers’ work
  • Repository URL: For tophatting the main branch of a repository
  • Repository branch URL: For tophatting a specific branch
  • Build URL: For tophatting a specific build from CI

For example, to tophat pull request 35 of the android project, a developer could run:

dev android tophat https://github.com/shopify/android/pull/35

Under the Hood

When the tophat command is run, the following steps are executed:

  1. The user is authenticated on the Buildkite and GitHub API if they aren’t already authenticated. The access token is persisted in the macOS keychain to be reused in future API calls.
  2. If the given resource is a GitHub URL, we use commit statuses to get the URL of the build.
  3. Since the list of artifacts might contain resources that can’t go through tophatting, we filter them out and only show the valid ones. If there’s more than one app, the developer can select which one they’d like to tophat.
  4. After selecting the app:
    1. For iOS projects, we list the system simulators and boot the one the user selects. Developers tend to use the same simulator most of the time, so the command remembers it and suggests it as the default
    2. For Android projects, we list the emulators available in the environment and a few more default ones in case the developer doesn’t have any emulators configured locally yet.
  5. Once the simulator is booted, we install the app and launch it (a rough sketch of these final steps follows this list). 
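
dev’s internals aren’t public, but for iOS the final install-and-launch steps boil down to a handful of simctl calls. Here’s a minimal Ruby sketch under that assumption; the app path, simulator name, and bundle identifier are all illustrative:

def run!(*cmd)
  system(*cmd) || raise("command failed: #{cmd.join(' ')}")
end

app_path  = "Shopify-3.2.1.app"     # artifact downloaded from CI and unzipped
simulator = "iPhone X"              # the developer's (remembered) choice
bundle_id = "com.example.shopify"   # hypothetical bundle identifier

run!("xcrun", "simctl", "boot", simulator)             # boot the simulator (errors if already booted)
run!("open", "-a", "Simulator")                        # bring up the Simulator UI (macOS only)
run!("xcrun", "simctl", "install", "booted", app_path) # install the build
run!("xcrun", "simctl", "launch", "booted", bundle_id) # launch the app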

The example below shows the process of tophatting Shopify Mobile for iOS:

An example of the mobile tophatting process

Future Improvements

We’re thrilled with the response received from our mobile developers; they love the feature. Since launch, Shopifolk have enthusiastically submitted bug reports and proposals with many ideas about how we can keep improving the tophatting process. Some of the improvements we’re currently incorporating are:

  • Caching: Every time we start the tophat process, we pull artifacts from Buildkite, even if we already tophatted the build. Adding a local cache will let us copy previously downloaded artifacts from the cache instead of downloading them again.
  • Real devices: Developers usually try the apps on real devices and we’d like to facilitate this. For iOS, the builds need to be signed with a valid certificate that allows installing the app on the testing devices.
  • Tophat from commit and branch: Rather than requiring the whole GitHub URL, we’ll simplify the input by letting developers specify just the repository and the branch or commit they’d like to tophat.

Testing someone else’s work is now easier than ever. Our developers don’t need to know how to set up the environment or compile the apps they are tophatting. They can run a single command and the tool does the rest. The Mobile Tooling team is committed to gathering feedback and working with our mobile developers to add improvements and bug fixes that facilitate their workflows.

Continue reading

Shaping the Future of Payments in the Browser

Part 1: Setting up Our Experiment with the Payment Request API

By Anna Gyergyai and Krystian Czesak

At Shopify, the leading multi-channel commerce platform that powers over 600,000 businesses in approximately 175 countries, we aim to make commerce better for everyone. This sometimes means investing in new technologies and giving back what we learned to the community, especially when we think a technology will drastically change the status quo. To that end, we joined the World Wide Web Consortium's (W3C) Web Payments Working Group in 2016 to take part in shaping the future of native browser payments. Since then, we’ve engaged in opinionated discussions and participated in a few hack-a-thons (the Interledger Payment App, for example) as a result of this collaborative and innovative working environment.

The W3C aims to develop protocols and guidelines that ensure the long-term growth of the Web. The Web Payments Working Group’s goal is to make payments easier and more secure on the Web. The first specification they introduced was Payment Request: a JavaScript API that replaces traditional checkout forms and vastly simplifies the checkout experience for users. The first iteration of this specification was recently finalized and integrated into a few browsers, most notably Chrome.

Although the specification has reached Candidate Recommendation, Payment Request’s adoption by platforms and developers alike is still in the early stages. We found this to be a perfect opportunity to test it out and explore this new technology. The benefits of such an initiative are threefold: we gather data that helps the W3C and browser vendors grow this technology, we continue to contribute to the working group, and we encourage participation through further experimentation.

Defining the Project

To present detailed findings to the community, we first needed a properly formulated hypothesis. We wanted to have at least one qualitative and one quantitative success metric, and we came up with the following:

We believe that Payment Request is market ready for all users of our platform (of currently supported browsers). We’ll know this to be true when we see that the checkout completion rate for select merchants remains unchanged or gets better, and the purchase experience is better and faster.

This was our main driving success metric. We define checkout completion rate (CCR) as the number of people that completed a purchase vs the total number of people that demonstrated an intent to purchase. An intent to purchase is indicated by buyers who clicked the “checkout” button on the cart page. In addition, we monitored time to completion of the purchase and drop-off rates.

For our qualitative metric, we spent time comparing Payment Request’s checkout experience with Shopify’s existing purchase experience. This metric was mostly driven by user experience research and was less of a data-driven comparison. We’ll cover this in a follow-up post.

We set off to launch an A/B experiment with a group of select merchants that showed interest in the potential this technology had to offer. We built this experiment outside of our core platform, which allowed us to:

  • Iterate fast and in isolation
  • Leverage our own platform’s existing APIs
  • Release the app to our app marketplace for everyone to use, if valuable

Payment Request API Terminology

The Payment Request API has interesting value propositions: it surfaces more than one payment method, it’s agnostic of the payment method used, and it can indicate back upstream whether the buyer is able to proceed with a purchase. This last feature is the canMakePayment() method call, which returns a boolean indicating whether the browser supports any of the payment methods the merchant accepts.

Most browsers that implement Payment Request allow processing credit card payments through it (this payment method is referenced as basic-card in the specification). At the time of writing, basic-card was the only payment method widely implemented in browsers, and as a result, we ran our experiment with credit card transactions in mind only.

In the case of basic-card, canMakePayment() would return true if the end user already had a credit card provisioned. On Chrome, for example, the method returning true means the user already had a credit card on file in their browser, whether through one of Chrome’s services, autofill, or a previous pass through the Payment Request experience.

Payment Request demo on Chrome Android

Finally, the UI presented to the buyer during their purchase journey through Payment Request is called the payment sheet. Its implementation depends on the browser vendor, which means that the experience might differ from one browser to another. As seen in the demo above, it usually contains the buyer’s contact information, shipping address and payment method. Once the shipping address is selected, the buyer is allowed to select their shipping method (if applicable).

Defining our A/B Experiment

Our A/B experiment ran on select merchants and tested buyer behaviour. The conditions of the experiment are as follows:

Merchant Qualification


Since most Payment Request implementations in browsers only support the basic-card payment method, we were limited to merchants who accept direct credit card payments as their primary method of payment. With this limitation, one of the primary merchant qualifications was the use of a credit card based payment processor.

Audience Eligibility

Our experiment audience is buyers. A buyer is eligible to be part of the experiment if their browser supports Payment Request. At the time of writing, Payment Request is available with the latest Chrome (on all devices), Microsoft Edge on desktop and Samsung Browser (available on Samsung mobile devices). We were only able to gather experiment data on Chrome. We experienced minimal browser traffic through Samsung Browser, and Microsoft Edge's Payment Request implementation only supports North American credit cards.

Experiment Segmentation

When a qualified buyer clicks the “checkout” button on the cart page, they are placed in either a control group (50%) or an experiment group (50%). Buyers in the control group don’t see the payment sheet and continue through our regular checkout; buyers in the experiment group go through the Payment Request purchase experience and see the payment sheet.
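
One common way to implement such a split (not necessarily what our experiment used) is to hash a stable per-buyer token so the same buyer always lands in the same group. A hypothetical Ruby sketch, where the token and experiment name are invented:

require "digest"

# Deterministically bucket a buyer using a stable token such as a
# cart or session ID (hypothetical); hashing gives a ~50/50 split.
def experiment_group?(buyer_token)
  Digest::MD5.hexdigest("payment-request-exp:#{buyer_token}").to_i(16).even?
end

puts experiment_group?("session-abc123") ? "experiment" : "control"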

Payment Request Platform Integration

In order to build our experiment in an isolated manner, we leveraged our current app ecosystem. The experiment ran in a simple Ruby app that uses our existing Rails engine for Shopify Apps. We used our existing infrastructure to quickly deploy to Google Cloud (more on our move to the cloud here). In conjunction with our existing Shipit deployment tool, we were able to set up a pipeline in a matter of minutes, making deployment a breeze.

After setting up our continuous delivery, we then shifted our focus towards the app lifecycle, which is best explained in two phases: merchant-facing app installation and the buyer’s storefront experience.

App Installation

The installation process is pretty straightforward: once the merchant gives permission to run the experiment on their storefront, we install our app in their backend. Upon installation, our app injects a script tag on the merchant’s storefront. This JavaScript file contains our experiment logic and runs for every buyer visiting that merchant’s shop.
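
Creating such a script tag through the REST Admin API is a one-liner with the shopify_api gem. A sketch assuming the gem’s interface of that era, with hypothetical credentials and script host:

require "shopify_api"

# Authenticate with (hypothetical) private app credentials.
ShopifyAPI::Base.site = "https://API_KEY:PASSWORD@example-shop.myshopify.com/admin"

# Inject the experiment script into every storefront page.
ShopifyAPI::ScriptTag.create(
  event: "onload",
  src:   "https://payment-request-experiment.example.com/experiment.js"
)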

Storefront Experience

The buyer’s storefront experience is split into two processes: binding the experiment logic and surfacing the right purchase experience.

Binding the experiment logic

Every time a buyer visits the cart page, our front-end logic first determines whether the buyer is eligible for our experiment. If so, the JavaScript code pings our app backend, which in turn gathers the shop’s information through our REST Admin API. This ping determines whether the shop still has a credit card based processor and whether the merchant supports discount codes or gift cards. This information determines the shop’s eligibility for the experiment and which alternative flow to display if gift cards or discount codes are accepted. When both the buyer and the merchant are eligible for the experiment, we override the “checkout” button on the cart page. We usually discourage this practice, as it can adversely affect the checkout experience; for our purposes, we allowed it for the duration of the experiment only.

Surfacing the purchase experience

Upon clicking the checkout button, buyers in our control group would get redirected to Shopify’s existing web checkout. Buyers in our experiment group would enter the Payment Request experimental flow via the payment sheet, and the JavaScript would interact with Shopify’s Checkout API to complete the payment.

Alternative Payment Flows

Since the majority of merchants on the Shopify platform accept discount codes and gift cards as part of their purchase flow, it was important not to negatively impact merchants’ businesses during this experiment, given that the Payment Request API doesn’t support discount code entry.

Shopify only supports this feature in the regular checkout flow, and implementing it on the cart page prior to checkout would involve a non-trivial effort. Therefore, we needed to give buyers a way to opt out of the experiment if they wanted to use a discount code. We included a link under the checkout button that read “Discount or gift card?”. Clicking this link redirected the buyer to our normal checkout flow, where they could use those items, and they would never see the payment sheet.

Finally, if the buyer cancelled the payment sheet purchase flow or an exception occurred, we’d show a link under the checkout button that reads: “Continue with regular checkout”.

What’s Next

The Payment Request API can provide a better purchase experience by eliminating checkout forms. Shopify is extremely interested in this technology and ran an experiment to see if Payment Request was market ready. Now that we've talked about how the experiment was set up, we’re excited to share the experiment’s data points and lessons in the second part of Shaping the Future of Payments in the Browser. It includes breakdowns of time to completion, lessons from user flows and buyer interactions, and Payment Request’s overall effect on the purchase experience (both quantitative and qualitative).

Part 2: The Results of Our Experiment with the Payment Request API

In Part 1, we dove into how we ran an experiment to test the readiness of Payment Request. The goal was to invest in this new technology and share what we learned with the W3C and browser vendors in order to improve web payments as a whole. Regardless of the conclusion of the experiment on our platform, we continue to invest in the current and future web payments specifications.

As a reminder, our hypothesis was as follows:

We believe that Payment Request is market ready for all users on our platform (of currently supported browsers). We’ll know this to be true when we see that the checkout completion rate for select merchants remains unchanged or gets better, and the purchase experience is better and faster.

We define checkout completion rate (CCR) as the number of people that completed a purchase vs the total number of people that demonstrated an intent to purchase. An intent to purchase is indicated by buyers who clicked the “checkout” button on the cart page.

In this post, we investigate and analyze the data gathered during the experiment, including checkout completion rates, checkout completion times, and drop-off rates. This data provides insight on future Payment Request features, UX guidelines, and buyer behaviour.

Data Insights

We ran our experiment for over two months with 30 merchants participating. At its peak, there were around 15,000 payment sheet opens per week. The sample size gave us high confidence in our data, with a standard error of ±1%.

Time to Completion

Form Factor | canMakePayment() | 10th percentile | Median time | 90th percentile
Desktop     | true             | 0:54            | 2:16        | 6:23
Desktop     | false            | 1:33            | 3:13        | 7:57
Mobile      | true             | 0:56            | 2:35        | 6:29
Mobile      | false            | 1:35            | 3:22        | 8:08

Time to completion by device form factor

The time to completion is defined as the time from when the buyer clicks the “checkout” button until their purchase is completed (i.e., they reach the order status page). The value of canMakePayment() indicates whether the buyer has a credit card provisioned. On Chrome, for example, the method returning true means the buyer already had a credit card on file in their browser, whether through one of Chrome’s services, autofill, or a previous pass through the Payment Request experience.

The median time for buyers with canMakePayment() = false is 3:17 whereas the median time for buyers with canMakePayment() = true is 2:25. This is promising, as both medians are faster than our standard checkout. We can also take a look at the 10th percentile with canMakePayment() = true and see that the checkout completion times are under a minute.

Checkout Completion Rates

As mentioned previously, we define checkout completion rate (CCR) as the number of people that completed a purchase vs the total number of people that demonstrated an intent to purchase. Comparing the control group to the experiment group, we saw an average 7% drop in CCR (with a standard error of ±1%), regardless of canMakePayment().

It is important to put this 7% into perspective. The Payment Request API is still in its infancy: the purchase experience it’s leveraging (through the payment sheet) is something buyers are still getting accustomed to. A CCR drop in the context of our experiment is to be expected, as buyers on our platform are familiar with a very specific and tailored process.

Our experiment did not adversely affect the merchants’ overall CCR, since it ran on only a very small subset of buyer traffic. Looking at all eligible merchants, the experiment represented roughly 5% of their traffic, as seen in the following graph:

Overall experiment traffic relative to normal site traffic

We started by slowly ramping up the experiment to select eligible merchants. This explains the low traffic percentage at the beginning of the graph above.

User Flow Analysis

The graph below documents the buyer’s journey through the payment sheet by listing all possible events in the order they occurred during the purchase session. An event is a user interaction, like clicking the checkout button or selecting a shipping method. All the possible events can be seen on the right side of the graph below. Not shown on the graph: 10% of buyers clicked the provided “Discount or gift card?” link rather than the “checkout” button, and so never entered the experiment.

The ideal user flow for the experiment is:

  1. The buyer clicks the “checkout” button
  2. The payment sheet opens
  3. The buyer selects a shipping address
  4. The buyer selects a shipping method
  5. The buyer clicks “pay”
  6. The payment is successful

The numbers at the top of the bars indicate the percentage of events that occurred at each step relative to step 1. For example, by step 6, only 43% as many events were emitted as at step 1.

Payment sheet event breakdown by order of occurrence

Here are some ways the user flows break down:

  • [Step #1 to Step #2] Not all buyers who click the button will see the payment sheet. This is due to conflicting JavaScript on the merchants’ storefronts, which leads to exceptions
  • [Step #3] Upon seeing the payment sheet, 60% of buyers will drop out without interacting with their shipping contact information or provided shipping methods
  • [Step #4] After exiting the sheet, 35% of buyers click one of the other links provided; 84% of these click the “Discount or gift card?” link, while the rest click the “Continue with regular checkout” link. A small percentage of buyers retry the payment sheet.
  • [Step #5] 32% of buyers will initiate a payment in the payment sheet by clicking the “Pay” call to action
  • [Step #6] At this point, 28% of buyers are able to complete their checkout. The rest will have to retry a payment because of a processing error such as an invalid credit card, insufficient funds, etc...

Of the buyers that don’t make it through the payment sheet, only 30% retry Payment Request once or twice, and 7% retry two or more times.

Furthermore, we don’t know why 60% of buyers drop out of the payment sheet, as the Payment Request API doesn’t provide event listeners on all sheet interactions. However, we think that the payment sheet being fairly foreign to buyers might be part of the cause. This 60% drop-out rate certainly accounts for the 7% CCR drop we mentioned earlier. This is not to say that the purchase experience is subpar; rather, it will take time for buyers to get accustomed to it. As this new specification gains traction and adoption broadens, we think the number of buyers who drop out will decrease significantly. Our merchant feedback seems to support this hypothesis:

“I found the pop-up really surprising and confusing because it doesn't go with the rest of our website.”

“[...] it comes up when you are still on the cart page even though you expect to be taken to checkout. It's just not what you are used to seeing as a standard checkout process [...]”

“My initial thoughts on it is that the UI/UX is harshly different than the rest of our site and shopify [...]”

Merchants were definitely apprehensive about Payment Request, but were quite excited by the prospect of a streamlined purchase experience that could leverage the buyer’s information securely stored in the browser. This is best reflected in the nuanced feedback we received after our experiment ended:

"I just wanted to check in and see if there was any update with this. We’d really love to try out the new checkout again."

“[...] I love the test, it’s just a pretty drastic change from what online shoppers are used to in terms of checkout process.”

Finally, to better understand merchant feedback, we performed user experience research on the different payment sheet UIs implemented by browser vendors. We’ll share specific research insights with the concerned browser vendors, but the lessons listed below can be applied to all payment sheets and are recurring throughout implementations.

We found that in order to create more familiarity for the buyer as they navigate from the storefront to the payment sheet, it’s useful to surface the merchant’s name or logo as part of it. Furthermore, it’s important to keep calls to action with negative connotations (e.g., cancel or abort) in the same area on every payment sheet screen. This helps set the proper expectations for the buyer. An example of what to avoid is having the “Pay” call to action in the bottom right of the very first screen, then a “Cancel” call to action in the bottom right of the next screen.

As for the user experience, it’s preferable not to surface grayed-out fields unless they are preselected. An example is surfacing a grayed-out shipping address to the buyer on the very first screen of the payment sheet without it being preselected. The buyer might think they don’t have to select a shipping address, as one is already presented to them. This leads to confusion for the buyer and relates well to merchant feedback we’ve received:

“When this pops up, it's really unclear how to proceed so much so that it was jarring to see "Pay" as the CTA button [...]”

Finally, to prevent unnecessary back and forth between screens, surface validation errors as soon as possible in the flow (ideally in the form, near the fields).

Experiment Conclusion

Reiterating our initial hypothesis:

We believe that Payment Request is market ready for all users on our platform (of currently supported browsers). We will know this to be true when we see that the checkout completion rate for select merchants remains unchanged or gets better, and the purchase experience is better and faster.

Even though merchants were interested in the prospect of Payment Request, we don’t believe that Payment Request is a good fit for them yet. We pride ourselves on offering a highly optimized checkout across all browsers. We constantly tweak it by running extensive UX research, testing against multiple devices, and regularly offering new features and interesting value propositions for merchants and buyers alike. These include Google Autocomplete for Shopify, Shopify Pay, and Dynamic Checkout, which allow us to streamline the purchase experience.

As buyer recognition of the feature grows and browsers tweak their UIs to improve the payment sheet, we believe that the aforementioned 7% checkout completion rate (CCR) drop and the 60% drop-off of buyers at the payment sheet will greatly diminish. Paired with the very promising time to completion medians, we are excited to see how the specification will grow in the upcoming months.

What’s Next

Payment Request has a bright future ahead of it, as both the W3C and browser vendors show interest in pushing this technology forward. The next major milestone for Payment Request is accepting third-party payment methods through the new Payment Handler API, which will definitely help adoption of this technology. Until recently it was only available behind a feature flag in Chrome, but Google has officially rolled it out as part of v68. We’ve already started experimenting with this next specification and are quite excited by its possibilities. You can find several demos we recorded for the W3C here: Shopify Web Payments Demos. We chose Affirm and iDeal as payment methods for the exploration, and the results are promising.

Shopify’s excited to be part of the Web Payments Working Group and thrilled to hear your comments. We invite you to explore the specification by implementing it on your own website. Then join the discussion over at the Web Payments Slack group or at the W3C’s wiki page, where you’ll find resources to comment, discuss, and help us develop this new standard.

We do believe Payment Request has great potential to shift the status quo in web payments, and we’re excited to see the upcoming changes. Shopify remains very keen on the technology and active in W3C discussions regarding web payments.

 

Continue reading

Iterating Towards a More Scalable Ingress

Shopify, the leading cloud-based, multi-channel commerce platform, is growing at an incredibly fast pace. Since the beginning of 2016, the number of merchants on the platform increased from 375,000 to 600,000+. As the platform scales, we face new and exciting challenges such as implementing Shopify’s Pod architecture and future-proofing our cloud storage usage. Shopify’s infrastructure relies heavily on Kubernetes to serve millions of requests every minute. An essential component of any Kubernetes cluster is its ingress, the first point of entry in a cluster that routes incoming requests to the corresponding services. The ingress controller implementation we adopted at the beginning of the year is ingress-nginx, an open source project.

Before ingress-nginx, we used the Google Cloud Load Balancer Controller (glbc). We moved away from glbc because it underperformed at Shopify’s scale: we observed subpar load balancing and request queueing, particularly during deployments. Shopify currently deploys around 40 times per day without scheduling downtime. At the time we identified these problems, glbc wasn’t endpoint aware, while ingress-nginx was. Endpoint awareness allows the ingress to implement alternative load balancing solutions rather than relying on the solution offered by Kubernetes Services through kube-proxy. These reasons, together with the NGINX expertise Shopify acquired by running and maintaining its NGINX (supercharged with Lua) edge load balancers, led the Edgescale team to migrate the ingress on our Kubernetes clusters from glbc to ingress-nginx.

Even though we now leverage endpoint awareness through ingress-nginx to enhance our load balancing solution, there are still additional performance issues that arise at our scale. The Edgescale team, which is in charge of architecting, building and maintaining Shopify’s edge infrastructure, began contributing optimizations to the ingress-nginx project to ensure it performs well at Shopify’s scale and as a way to give back to the ingress-nginx community. This post focuses on the dynamic configuration optimization we contributed to the project which allowed us to reduce the number of NGINX reloads throughout the day.

Now’s the perfect time to introduce myself 😎— my name is Francisco Mejia, and I’m a Production Engineering Intern on the Edgescale team. One of my major goals for this internship was to learn and become familiar with Kubernetes at scale, but little did I know that I would spend most of my internship contributing to a Kubernetes project!

One of the first performance bottlenecks we identified when using ingress-nginx was the high frequency of NGINX reloads during application deployments. Whenever application deployments occurred on the cluster, we observed increased latencies for end users, which led us to investigate and find a solution to this problem.

NGINX uses a configuration file to store the active endpoints for every service it routes traffic to. During deployments to our clusters, Pods running the older version are killed and replaced with Pods running the updated version. A single deployment may trigger multiple reloads as the controller receives updates for the endpoint changes. Any time NGINX reloads, it reads the configuration file into memory, starts new worker processes, and signals the old worker processes to shut down gracefully.

Although NGINX reloads gracefully, reloads are still detrimental from a performance perspective. Shutting down old worker processes increases memory consumption and resets keepalive connections and load balancing state. Clients that previously had open keepalive connections with the old worker processes now need to open new connections with the new worker processes. In addition, opening connections at a faster rate means the server needs to allocate more resources to handle connection requests. We addressed this issue by introducing dynamic configuration to the ingress controller.

To reduce the number of NGINX reloads during deployments, we added the ability for ingress-nginx to update application endpoints by maintaining them in memory, eliminating the need for NGINX to regenerate the configuration file and issue a reload. We accomplished this by creating an HTTP endpoint inside NGINX using lua-nginx-module that receives endpoint configuration updates from the ingress controller and modifies an internal Lua shared dictionary that stores the endpoint configuration for all services. This mechanism enabled us to skip NGINX reloads during deployments and significantly improve request latencies, especially during deploys.

Here’s a more granular look at the general flow when we instruct the controller to dynamically configure endpoints:

  1. A Kubernetes resource is modified, created or deleted.
  2. The ingress controller sees the changes and sends a POST request to /configuration/backends containing the up-to-date list of endpoints for every service (the shape of this request is sketched after the list).
  3. NGINX receives a POST request to /configuration/backends which is served by our Lua configuration module.
  4. The module handles the request by receiving the list of endpoints for all services and updating a shared dictionary that keeps track of the endpoints for all backends.
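
As an illustration of the protocol, here’s roughly what the update in step 2 could look like. The real controller is written in Go, and the JSON schema and port below are invented; only the /configuration/backends path comes from the flow above:

require "net/http"
require "json"
require "uri"

# Hypothetical payload: the full list of endpoints for each backend.
payload = [
  {
    name: "storefront-backend",
    endpoints: [
      { address: "10.0.0.12", port: "8080" },
      { address: "10.0.0.34", port: "8080" },
    ],
  },
]

uri = URI("http://127.0.0.1:18080/configuration/backends")
res = Net::HTTP.post(uri, JSON.dump(payload), "Content-Type" => "application/json")
raise "dynamic configuration update failed: #{res.code}" unless res.is_a?(Net::HTTPSuccess)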

My team carried out tests to compare the latency of requests between glbc and ingress-nginx with dynamic configuration enabled. The test consisted of the following:

  1. Find a request rate for the load generator where the average request latency is under 100ms when using glbc to access an endpoint.
  2. Use the same rate to generate load on an endpoint behind ingress-nginx and compare latencies, standard deviation and throughput.
  3. Repeat step 1, but this time carry out application deploys while load is being generated to endpoints.

The latencies were distributed as follows:

Latency by percentile distribution glbc vs dynamic

Up until the 99.9th percentile of request latencies, both ingresses are very similar, but at the 99.99th percentile or greater, ingress-nginx outperforms glbc by multiple orders of magnitude. It’s vital to minimize request latency as much as possible, as it directly impacts merchants’ success.

We also compared the request latencies when running the ingress controller with and without dynamic configuration. The results were the following:

Latency by percentile distribution - Dynamic configuration enabled vs disabled

From the graph, we can see that the 99th percentile of latencies when using dynamic configuration is comparable to that of the vanilla ingress controller, with roughly similar results overall.

We also carried out the previous test, but this time during application deploys; this is where we really see the impact of the dynamic configuration feature. The results are depicted below:

Latency by percentile distribution deploys - dynamic vs vanilla

It’s clear from the graph that ingress-nginx with dynamic configuration performs far better beyond the 80th percentile.

When operating at Shopify’s scale a whole new world of engineering challenges and opportunities arise. Together with my teammates, we have the opportunity to find creative ways to solve optimization problems involving both Kubernetes and NGINX. We contributed our NGINX expertise to the ingress-nginx project and will continue doing so. The contribution explained throughout this post wouldn’t have been possible without the support of the ingress-nginx community, massive kudos to them 🎉! Keep an eye out for more ingress-nginx updates on its GitHub page!

Continue reading

E-Commerce at Scale: Inside Shopify's Tech Stack - Stackshare.io


Before 2015, we had an Operations and Performance team. Around this time, we decided to create the Production Engineering department and merge the teams. The department is responsible for building and maintaining common infrastructure that allows the rest of the product development teams to run their code. Both Production Engineering and all the product development teams share responsibility for the ongoing operation of our end user applications. This means all technical roles share monitoring and incident response, with escalation happening laterally to bring in any skill set required to restore service in case of problems.

Continue reading

Behind The Code: Jennice Colaco, Backend Developer

Behind the Code is an initiative with the purpose of sharing the various career stories of our engineers at Shopify, to show that the path to development is non-linear and quite interesting. The features will showcase people just starting their careers, those who made career switches, and those who've been in the industry for many years. Enjoy!

Continue reading

Scaling iOS CI with Anka

Shopify has a growing number of software developers working on mobile apps such as Shopify, Shopify POS, and Frenzy. As a result, the demand for a scalable and stable build system has increased. Our Developer Acceleration team decided to invest in creating a single unified build system for all continuous integration and delivery (CI/CD) pipelines across Shopify, which includes support for Android and iOS.

We want our developers to build and test code in a reliable way, as often as they want, and we want a CI system that makes this effortless. The result is that we can deliver new features quickly and with confidence, without sacrificing the stability of our products.

Shopify’s Build System

We have built our own CI system at Shopify, which we call Shopify Build. It’s based on Buildkite, and we run it on our own infrastructure. We’ve deployed our own version of the job bootstrap script that sets up the CI environment, rather than the one that ships with Buildkite. This allows us to accomplish the following goals:

  • Provide a standard way to define general purpose build pipelines
  • Ensure the build environment integrates well with our other developer tools and is consistent with our production environment
  • Ensure builds are resilient against infrastructure failures and flakiness of third-party dependencies
  • Provide disposable build environments so that subsequent jobs can’t interfere with each other
  • Support selective builds for monorepos, or repositories with multiple projects in them

Initially, Shopify Build only supported Linux environments using Docker to provide disposable environments, and it works extremely well for backend and Android projects. Previously, we had separate CI systems for iOS projects, but we wanted to provide our iOS developers with the same benefits as our other developers by integrating iOS into Shopify Build.

Build Infrastructure for iOS

Building infrastructure for iOS comes with its own unique set of challenges; it’s the only piece of infrastructure at Shopify that doesn’t run on top of Linux. For our Android build nodes, we can leverage the same Google Cloud infrastructure we already use in production. Unfortunately, cloud providers such as Amazon Web Services (AWS) and Google Cloud Platform (GCP) don’t provide infrastructure that can run macOS. The only feasible option for us is a non-cloud provider like MacStadium, but the tradeoff is that we can’t auto-scale the infrastructure based on demand.

2017: VMware ESXi and a Storage Area Network

Since we published our blog post on the VMware-based CI for Android and iOS, we’ve learned many lessons. We had a cluster of Mac Pros running ephemeral VMs on top of VMware ESXi. Although it served us well in 2017, it was a maintenance burden on our small team. We relied on tools such as Packer and ovftool, but still had to build many custom provisioning scripts to create and distribute VMware virtual machines.

On top of being difficult to maintain, the setup had a single point of failure: the Storage Area Network (SAN). Every Mac Pro shared this solid-state infrastructure. By the end of 2017, we exceeded its write throughput, degrading build stability and speed for all of our mobile developers. Given our write-heavy CI workload, the only fix was upgrading to a substantially more expensive dedicated storage solution, and even that would only push us a bit farther; the system still wouldn’t be horizontally scalable.

2018: Disposable Infrastructure with Anka

While we were having these challenges with VMware, Veertu released a new virtualization technology called Anka. Anka provides a Docker-like command line interface for spinning up lightweight macOS virtual machines, built on top of Apple’s Hypervisor.framework.

Anka has the concept of a container registry similar to Docker with push and pull functionality, fast boot times, and easy provisioning provided through a command line interface. With Anka, we can quickly provision a virtual machine with the preferred macOS version, disk, memory, CPU configuration and Xcode version.

Mac Minis Versus Mac Pros

Our VMware-based setup ran a small cluster of 12-core Mac Pros in MacStadium. The Mac Pros provided high bandwidth to the shared storage and ran multiple VMs in parallel. For that reason, they were the only viable choice for a SAN-based setup. However, Anka runs on local storage, so it doesn’t require a SAN.

After further experimentation, we realized a cluster of Core i7 Mac Minis would be a better fit to run with Anka. They are more cost-effective than Mac Pros while providing the same or higher per-core CPU performance. For the price of a single Mac Pro, we could run about 6 Mac Minis. Mac Minis don’t provide 10 Gbit networking, but that isn’t a deal breaker in our Anka setup as we no longer need a SAN. We’re running only one Anka VM per Mac Mini, giving us four cores and up to 16 GB memory per build node. Running a single VM also avoids the performance degradation that we found when running multiple VMs on the same host, as they need to share resources.

Distributing Anka Images to Nodes in Different Regions

We use a separate Mac Mini as a controller node that provisions an Anka VM with all dependencies such as Xcode, iOS simulators and Ruby. The command anka create generates the base macOS image in about 30 minutes and only needs a macOS installer (.app) from the Mac App Store as input.

Anka’s VM image management optimizes disk space usage and data transfer times when pushing and pulling the VMs on the Mac nodes. Our images build automatically in multiple layers to benefit from this mechanism. Multiple layers allow us to make small changes to an image quickly. By re-using previous layers, changing a small number of files in an image across our nodes can be done in under 10 minutes, and upgrading the Xcode version in about an hour.

After the provisioning completes, our controller node continues by suspending the VM and pushes it to our Anka registries. The image is tagged with its unique git revision. We host the Anka Registry on machines with 10 Gbps networking. Since all nodes run Anka independently, we can run our cluster in two MacStadium data centers in parallel. If a regional outage occurs, we offload builds to just one of the two clusters, giving us extra resiliency.

The final step of the image distribution is a parallel pull performed on the Mac Minis, with each pulling only the new layers of the available images from their respective Anka Registry to speed up the process. Each Mac Mini has 500 GB of SSD storage, which is enough to store all our macOS image variants. We allow build pipelines to specify images with both name and tags, such as macos-xcode92:latest or macos-xcode93:<git-revision>, similar to how Docker manages images.

The Anka Image Distribution Process

Running Builds With Anka and Buildkite

We use Buildkite as the scheduler and front-end for CI at Shopify. It allows for fine-grained customization of pipelines and build scripts, which makes it a good fit for our needs.

We run a single Buildkite Agent on each Mac Mini and keep our git repositories cached on each of the hosts, for a fast git checkout. We also support shallow clones. We found that with a single large repository and many git submodules, a local cache gives the best performance. As mentioned before, we maintain copies of suspended Anka images on each of the Mac Minis. Suspended Anka VMs, rather than stopped ones, can boot in under a second, which is a significant improvement over our VMware VMs, which took about one minute to boot even from a suspended state.

As part of running a build, a sequence of Anka commands is invoked. First, we clone the base image to a temporary snapshot using anka clone. We then start the VM, wait for it to boot, and continue by mounting volumes to expose artifacts. With anka run we execute the command corresponding to the Buildkite step and wait for it to finish. Artifacts are uploaded to cloud storage, and the Anka VM is deleted afterwards with anka delete.

The Lifecycle of a Build Job Using Anka Containers
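
Scripted, that lifecycle looks roughly like the following Ruby sketch; the image name, build command, and exact CLI flags are illustrative rather than our production agent’s invocation (which, for example, also mounts volumes):

def run!(*cmd)
  system(*cmd) || raise("command failed: #{cmd.join(' ')}")
end

base_image = "macos-xcode93"          # suspended base VM image (illustrative name)
vm_name    = "build-#{Process.pid}"   # temporary per-job clone

begin
  run!("anka", "clone", base_image, vm_name)        # snapshot the suspended image
  run!("anka", "start", vm_name)                    # resumes in under a second
  run!("anka", "run", vm_name, "buildkite/step.sh") # run the Buildkite step inside the VM
ensure
  system("anka", "delete", vm_name)                 # always tear the clone down
end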

We monitor the demand for build nodes and work with MacStadium to scale the number of Mac Minis in both data centers. It’s easier than managing Macs ourselves, but it’s still a challenge as we can’t scale our cluster dynamically. In the graph below, you can see the green line indicating the total number of available build nodes and the required agent count in orange.

Our workload is quite spiky, with high load exceeding our capacity at moments during the day. During those moments, our queue time will increase. We expect to add more Mac Minis to our cluster as we grow our developer teams to keep our queue times under control.

 A Graph Showing Shopify's iOS CI Workload over 4 Hours During Our Work Day

Summary

It took us about four months to implement the new infrastructure on top of Anka with a small team. Building your own CI system requires an investment in engineering time and infrastructure, and at Shopify, we believe it’s worth it for companies that plan to scale while continuing to iterate at a high pace on their iOS apps.

By using Anka, we substantially improved the maintainability and scalability of our iOS build infrastructure. We recommend it to anyone looking for macOS virtualization in a Docker-like fashion. During the day, our team of about 60 iOS developers runs about 350 iOS build jobs per hour. Anka provides superior boot times, reducing the setup time of a build step. Upgrading to new versions of macOS and Xcode is easier than before. We have eliminated shared storage as a single point of failure, thereby increasing the reliability of our CI system. It also means the system is horizontally scalable, so we can easily scale with the growth of our engineering team.

Finally, the system is easier to use for our developers by being part of Shopify Build, sharing the same interface we use for CI across Shopify.

If you would like to chat with us about CI infrastructure or other developer productivity topics, join us at Developer Productivity on Slack.

Continue reading

Introducing the Merge Queue

Introducing the Merge Queue

Scaling a Majestic Monolith

Shopify’s primary application is a monolithic Rails application that powers our cloud-based, multi-channel commerce platform for 600,000+ merchants in over 175 countries. As we continue to grow the number of developers working on the app, our tooling has grown with them. At Shopify, we mostly follow a trunk-based development workflow, and every week more developers write more code, open more pull requests, and merge more commits to master. Occasionally, master merges go wrong: two unrelated merges can affect one another, a new flaky test can be introduced, or work in progress can be merged accidentally. Even a low failure percentage of a growing number of merges eventually becomes too big to ignore, so we needed to improve our tooling around merging pull requests.

Shipit is our open source deployment coordination tool. It’s our source of truth for what is deployed, what’s being deployed (if anything), and what’s about to be deployed. There are times we don’t want any more commits merged to master (e.g. if CI on master is failing, if there’s an ongoing incident and we don’t want more changes introduced to the master branch, or if the batch size of undeployed commits is too high), and Shipit is also the source of truth for this. Originally, we expected developers to check the status of master by hand before merging. This quickly became unsustainable, so Shipit has a browser extension which tells the developer the status of the stack right on their pull request:

Introducing the Merge Queue - Stack Status Clear

If for some reason, it’s unsafe to merge, then the developer is asked to hold off:

Introducing the Merge Queue - Please Hold Off Merging

Developers had to manually revisit their pull request to see if it was safe to merge. Large batches of undeployed commits are also considered unsafe for more merges, a condition Shipit considers ‘backlogged’:

Introducing the Merge Queue - Backlogged Status

A rapidly growing development team brings scaling challenges (and lots of frustration) because when a stack returned to a mergeable state, developers rushed to get their changes merged before the pipeline became backlogged again. As we continued to grow, this became more and more disruptive, and so the merge queue idea was born.

The Merge Queue

Shipit was the obvious candidate to house this new automation — it’s the source of truth for the state of master and deploys, and it’s already integrated with GitHub. We added the ability to enqueue pull requests for merge directly within Shipit (you can see how it’s configured here in the Shipit GitHub repo). Once a pull request is queued and the state of master is good, it’s merged very quickly. We didn’t want our developers to have to leave GitHub to enqueue pull requests, so we looked at the browser extension to solve that problem!

Introducing the Merge Queue - Merge Pull Request

If a stack has the merge queue enabled, we inject an ‘Add to merge queue’ button. Integrating the button with the normal development flow was important for developer adoption. During testing, we discovered that people still merged directly to master for routine merges; interviews revealed that they instinctively ‘pressed the big green button to merge’. We wanted the merge queue to become the default mechanism for merges, so we tweaked our extension to de-emphasize the default ‘Merge pull request’ button by turning it gray, and we saw a further boost in adoption.

By bringing the merge event into the regular deploy pipeline, we’re able to codify some other things we consider best practices — for example, the merge queue can be configured to reject pull requests if they’ve diverged from their merge base beyond configurable thresholds. Short-lived branches are very important for trunk-based development, so old branches (both in terms of age and number of commits diverged) represent an increased risk and need to be discouraged. The merge queue is configured inside shipit.yml, so the discussions that inform these decisions are all traceable back to a pull request!
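For illustration, the merge section of a shipit.yml might look something like the sketch below. The keys shown follow options documented in the Shipit README; treat the exact schema and values as illustrative and check the repo for the current format.

merge:
  require:
    - ci/circleci        # CI statuses a pull request must pass before merging
  max_divergence:
    commits: 65          # reject PRs more than 65 commits behind their merge base...
    age: 72h             # ...or whose merge base is older than 72 hours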

It’s important to stress that the merge queue is highly encouraged, but not enforced. At Shopify, we trust our developers to override the automation, if they feel it’s required, and merge directly to master.

After launching the merge queue, we quickly learned that the queue wasn’t always behaving as developers expected. We configured the queue to require certain CI statuses before merging and if a pull request wasn’t ready, Shipit would eject it from the queue, making the developer re-enqueue the pull request later. There are some common situations where this causes frustrations for developers. For example, after a code review, some small tweaks are made to satisfy reviewers, and the pull request is ready to merge pending CI passing. The developer wants to queue the pull request for merging and move on to their next task but needs to monitor CI. Similarly, this also happened with minor changes (readme updates and the like) and developers would save a lot of time if they could queue-and-forget, so that’s what we did! If CI is pending on a queued pull request, Shipit will wait for CI to pass or fail, and merge or reject as appropriate.

We received a lot of positive feedback for that small adjustment, and for the merge queue in general. By getting automation involved earlier in the pipeline, we’re able to take some of the load off our developers, making them happier and more productive. Over 90% of pull requests to Shopify’s core application now go through Shipit’s merge queue. That makes Shipit itself the largest contributor to our monolith!

Unsafe Commits

A passing, regularly exercised CI pipeline gives you high confidence that a given changeset won’t cause any negative impacts once it reaches production. Ultimately, the only way to see the impact of your changes is to ship them, and sometimes that results in a breaking change reaching your users. You quickly roll back the deploy, stop the shipping pipeline, and investigate what caused the break. Once you identify the bad commit, it can be reverted on master, and the pipeline can resume, right? Consider this batch of commits on master, waiting to deploy:

  • Good commit A
  • Bad commit B
  • Good commit C
  • Good commit D
  • Revert of commit B

How does your deployment tool know that deploying commit C or D is unsafe? Up until recently, we relied on our developers to manage this situation by hand, manually deploying the first safe commit before unlocking the pipeline. We’d rather our developers focus on adding value elsewhere, so we decided to have Shipit manage this automatically where possible. Our solution comes in two parts:

Marking Commits as Unsafe for Deploy

Introducing the Merge Queue - Marking Commits as Unsafe for Deploy

If a commit is marked as unsafe, Shipit will not deploy that ref in isolation. In the above example, the bottom (oldest) commit might be deployed, followed by the remaining two commits together. This is the functionality we want but still requires manual configuration, so we complement this with automatic revert detection.

Automatic Revert Detection

If Shipit detects a revert of an undeployed commit, it will mark the original commit (and any intermediate commits between it and the revert) as unsafe for deploy:

Introducing the Merge Queue - Automatic Revert Detection

This removes the need for any manual intervention when doing git revert as Shipit can continue to deploy automatically and safely.

In Conclusion

These new Shipit features allow us to ship faster and safer, and they hand valuable time back to our developers. Shipit is open source, so you can benefit from these features yourself — check out the setup guide to get started. We’re actively exploring open sourcing the browser extension mentioned above; stay tuned for more updates on that!

Continue reading

Shopify Interns Share Their Tips for Success

Shopify Interns Share Their Tips for Success

At Shopify, we pride ourselves on our people. Shopifolk come from all kinds of backgrounds and experiences — our interns are no exception. So, we gathered some of our current and past interns to chat with them about their careers so far. They share insights about work, education, and tips for interviewing and succeeding at Shopify.

Natalie Dunbar (Backend Developer Intern, Marketing Technology)

Office: Toronto

Education: University of Toronto
Natalie Dunbar (Backend Developer Intern, Marketing Technology)
Get to know Natalie:
  • Studied as a philosophy major for three years, then switched to Computer Science
  • Former camp counselor and sailing instructor
  • Best tip when stuck on a problem? Pair programming.

What does your day-to-day look like?
Once I get to work I immediately open GitHub and Slack. Our team does a daily stand-up through Slack to review our tasks from yesterday and today. I usually spend my morning responding to PR comments and Slack messages, and working on my assigned tasks. After lunch, I work on projects and usually try to merge my work from earlier in the day so I can monitor it in production in the afternoon. Finally, before I leave I try to update my PRs so that our team in San Francisco can view them before the end of their day.

What do you feel was the hardest part of your interview?
I've done many technical interviews before, and the “Life Story” step in the Shopify interview process is unlike anything at other companies, so I was unsure what to expect. Looking back, I realize it’s not something to worry about: it's an incredibly comfortable conversation with your recruiter, and it gave them the knowledge to place me on a team that was the best possible fit.

Dream career if you weren’t working in tech?
Philosophy professor (specializing in either logic, philosophy of language, or continental philosophy).

Best piece of advice you’ve ever gotten?
Always be open with your mentor/lead. They want to make your internship experience great so always help them do this for you. This means both requesting and giving feedback frequently.

What are your tips for future Shopify applicants?
Be yourself! And if you are applying for a role that requires a personal project, show one that is targeted at what you’re interested in working on. I made a completely new project over the few days before my internship, which is in no way necessary, and my interviewer (and now lead) was able to determine my technical fit from that.

Gurpreet Gill (Developer Intern, Products)

Office: Ottawa

Education: University of Waterloo
Gurpreet Gill (Developer Intern, Products)
Get to know Gurpreet:
  • Had no experience with the technical stack used at Shopify when hired
  • Can move ears on command
  • Best tip when stuck on a problem? Take a break.

What does your day-to-day look like?
I’m usually in the office by 8:30, and I try not to miss breakfast. I typically avoid coding in the morning. Instead, I review and address feedback on my PRs; read emails; and catch up on work. My team and I head to lunch, then I start coding in the afternoon. I like taking a coffee break in the afternoon at Cody’s Cafe (yes, Ottawa has its own cafe) and make myself a latte with terrible latte art. I also like to play ping-pong as a break!

Dream career if you weren’t in tech?
A chef, or police officer.

Best piece of advice you’ve ever gotten?
Asking for help is okay - it doesn’t make you look weaker, and it’s never too late to reach out for it.

What are your tips for future Shopify applicants?
I believe Shopify is a unique company. Having “people skills” is as important as having technical skills here. So just be yourself during interviews. Don’t pretend to be someone you are not. Be passionate about what you do. Ask questions, don’t be afraid to crack jokes, and be ready to meet some dope people.

Joyce Lee (Solutions Engineering Intern, Shopify Plus)

Office: Waterloo

Education: University of Western Ontario
Joyce Lee (Solutions Engineering Intern, Shopify Plus)
Get to know Joyce:
  • Started interning at Shopify in September 2017
  • Spent 8 months at Shopify in a sales-focused role, but will spend the next 4 in a technical one
  • Once tried to sell composting worms online, but inventory sourcing and fulfilment ended up being really complicated.

What’s your day to day like?
Grab a bottle of cold-pressed juice, then go to a full day of meetings selling Shopify Plus to prospective merchants. On days with fewer meetings, I’m building proofs-of-concept for merchants, and working on small projects to level up the revenue organization.

The hardest part of your interview?
I had a slightly different technical interview than other engineers. I was given a business problem and asked to propose a technical solution for it, then explain it twice to two different audiences: a CEO and a developer.

Any tips for future Shopify applicants?
Complete every part of the application. For interns, it’s typically quite long, so start early; the application actually helps you get to know Shopify better, which is a great experience. Shopify is worth the long application process, trust me.

How do you succeed within Shopify?
Ask dumb questions, and ask them quickly. The more you ask, the less dumb they’ll get.

Yash Mathur (Developer Intern, Shopify Exchange)

Office: Toronto

Education: University of Waterloo
Yash Mathur (Developer Intern, Shopify Exchange)
Get to know Yash:
  • Has done two work terms with Shopify
  • Demoed an Android app when interviewing for a front-end developer role at Shopify (Spoiler alert: it all worked out!)
  • Favourite language is C++ but has learned to love Ruby for its simplicity (and because Shopify has great Ruby on Rails developers to learn from!)

What does your day-to-day look like?
I come in to the office around 10am. I prefer that because I like to spend my mornings running or swimming, and the rest of my team usually comes in around then too. I start off the day by grabbing breakfast and going through emails and messages. Our team does a daily stand-up where we review what we’ll be working on that day. Then, I like to grab lunch with my team or the other interns. Each day is a mix of coding and meetings to discuss projects or pair-programming. During my breaks, I love playing FIFA or ping pong with others.

Dream career if you weren’t in tech?
Astronaut.

Any tips for future Shopify applicants?
Shopify looks for people who are passionate and willing to expand their skill set. Make sure you bring that point across each phase of the interview.

How do you succeed within Shopify?
Take initiative. Shopify has a startup culture - people won’t tell you what to do, so you have to look for ways to contribute and be valuable. Also, talk to people outside your team. It’s important to understand how your team fits within the rest of the company.

Jenna Blumenthal (Developer, Channels Apps)

Office: Toronto
Education: McGill University
Jenna Blumenthal (Developer, Channels Apps)
Get to know Jenna:
  • Former Shopify intern
  • Started as an intern in January 2017, and was hired full-time in May 2017
  • Studied Physics and Physiology in undergrad, later completing a master’s degree in Industrial Engineering

What’s your day-to-day look like?
Most of the day is spent working on net-new code that will contribute to whatever feature or project we are building. The rest is spent reviewing other team members’ code, investigating bugs that come in through support, and pairing with others (devs or not) on issues.

Any tips for future Shopify applicants?
Play up your non-traditional background. Whoever you meet with, explain why your experiences have shaped the person you are and the way you work. Shopify thrives on people with diverse skills and opinions.

How do you succeed within Shopify?
One of the core tenets you hear a lot at Shopify is, “strong opinions, weakly held.” Don’t think that because you’re just an intern, or new, that you don’t have a valuable opinion. Sometimes fresh eyes see the root of a problem the fastest. Be confident, but also be willing to drive consensus even if it doesn’t turn out your way.

Jack McCracken (Developer Intern, Application Security)

Office: Ottawa
Education: Carleton University
Jack McCracken (Developer Intern, Application Security)
Get to know Jack:
  • Has a red-belt-black-stripe in Taekwondo. Almost a black belt!
  • Sells snacks and drinks to students at Carleton using Shopify’s POS system
  • Has worked at Shopify consistently since May 2015 and as of April 2018 is now full time...that’s six internships!

The hardest part of your interview?
The hardest part of my interview was admitting what I didn’t know. When I got to my second interview, I was so nervous that I completely blanked! After a while of trying to graciously “fake it till I made it,” I worked up the courage to tell the person I was interviewing with that I totally had no idea. That was hard, but I still believe it got me the job to this day.

Dream career if you weren’t in tech?
If I wasn’t working in tech, I like to think I’d be an author. I love writing stories and explaining complex things to people.

Best piece of advice you’ve ever gotten?
If you ask a question that saves you an hour of hard work and takes the person you’re asking 5 minutes to explain, you just saved your company 55 minutes in development time.

How do you succeed at Shopify?
Succeeding at Shopify is slightly different than your average tech company. It’s very self-driven, so you need to ask questions. It’s hard to succeed (actually pretty much impossible) in a large organization without any context, and it’s much easier to learn by talking to your team lead, a mentor, or some random person you found on Slack than to laboriously read through code or wiki pages.

Ariel Scott-Dicker (iOS Developer, Mobile Foundations)

Office: Ottawa 

Education: Flatiron School
Ariel Scott-Dicker (iOS Developer, Mobile Foundations)
Get to know Ariel:
  • Was doing a degree in Cultural Anthropology with a minor in Music; he didn’t finish the university degree, but did a software development bootcamp and a developer internship before coming to Shopify
  • Never wrote in Swift before coming to Shopify. Now, it’s his favourite programming language!

What’s your day-to-day like?
We release new versions and updates to our iOS app every three weeks. This makes our day-to-day consist of working our way through the various tasks we’ve designated for the current three-week period. Sometimes for me, that’s one large task or several smaller ones.

The hardest part of your interview?
I didn’t progress past Life Story the first time. I think it was because I didn’t relate the course of my life thus far to how I could be successful at Shopify. Other than that, the hardest part (which I thought was really fun) was solving conceptual problems verbally, not through coding terms.

Any tips for future Shopify applicants?
During your interview, be yourself, stay calm and confident, and breathe. Make sure whatever you mention speaks for itself, and that it demonstrates how you can succeed at and contribute to Shopify.

Dream career if not in tech?
Working in a big cat sanctuary or experimental agriculture.

How do you succeed at Shopify?
For me, a huge tip for succeeding at Shopify is being selfish with your education and development. This means asking questions, using the smart people around you as resources, and taking the time to understand something practically or theoretically.

A huge thanks to our Winter 2018 interns for all they have contributed this term. We’re so proud of the work you’ve done and can’t wait to see what’s next for all of you! Think you can see yourself as one of our interns? We’re currently hiring for the Fall 2018 term. Find the application at shopify.com/careers/interns and make sure you apply before the deadline on May 11, 2018 at 9:00 AM EST.

Want to learn more about Shopify's Engineering intern program? Check out these posts:

Continue reading

Solving the N+1 Problem for GraphQL through Batching

Solving the N+1 Problem for GraphQL through Batching

Authors: Leanne Shapton, Dylan Thacker-Smith, & Scott Walkinshaw

When Shopify merchants build their businesses on our platform, they trust that we’ll provide them with a seamless experience. A huge part of that is creating scalable back-end solutions that allow us to manage the millions of requests reaching our servers each day.

When a storefront app makes a request to our servers, it’s interacting with the Storefront API. Historically, REST has been the standard choice when designing APIs, but Shopify uses GraphQL.

GraphQL is an increasingly popular query language in the developer community, because it avoids the classic over-fetching problem associated with REST. In REST, the endpoint determines the type and amount of data returned. GraphQL, however, permits highly specific client-side queries that return only the data requested.

Over-fetching occurs when the server returns more data than needed. REST is especially prone to it, due to its endpoint design. Conversely, if a particular endpoint does not yield enough information (under-fetching), clients need to make additional queries to reach nested data. Both over-fetching and under-fetching waste valuable computing power and bandwidth.

In this REST example, the client requests all ‘authors’, and receives a response, including fields for name, id, number of publications, and country. The client may not have originally wanted all that information; the server has over-fetched the data.

REST Query and Response

Conversely, in this GraphQL version, the client makes a query specifically for all authors’ names, and receives only that information in the response.

GraphQL Query

GraphQL Response
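For example, a query for the authors’ names and the corresponding response might look like this (author names here are made up for illustration):

query {
  authors {
    name
  }
}

{
  "data": {
    "authors": [
      { "name": "Author One" },
      { "name": "Author Two" }
    ]
  }
}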

GraphQL queries are made to a single endpoint, as opposed to multiple endpoints in REST. Because of this, clients need to know how to structure their requests to reach the data, rather than simply targeting endpoints. GraphQL back-end developers share this information by creating schemas. Schemas are like maps; they describe all the data and their relationships within a server.

A schema for the above example might look as follows.
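In GraphQL’s schema definition language, that could be:

type Author {
  name: String!   # a non-nullable string
  id: ID!         # a unique, non-nullable identifier
}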

The schema defines the type ‘author’, for which two fields of information are available: name and id. The schema indicates that for each author, there’s a non-nullable string value for the ‘name’ field, and a unique, non-nullable identifier for the ‘id’ field. For more information, visit the schema section on the official GraphQL website.

How does GraphQL return data to those fields? It uses resolvers. A resolver is a field-specific function that hunts for the requested data in the server. The server processes the query and the resolvers return data for each field, until it has fetched all the data in the query. Data is returned in the same format and order as the query, in a JSON file.

GraphQL’s major benefits are its straightforwardness and ease of use. It’s solved our biggest problems by reducing both the bandwidth used and the latency in retrieving data for our apps.

As great as GraphQL is, it’s prone to an issue known as the N+1 problem. The N+1 problem arises because GraphQL executes a separate resolver function for every field, whereas REST has one resolver per endpoint. These additional resolvers mean that GraphQL risks making more round trips to the database than necessary for a request.

The N+1 problem means that the server executes multiple unnecessary round trips to datastores for nested data. Consider a query that asks for a set of authors and each author’s address: the server makes one round trip to a datastore to fetch the authors, then makes N round trips to fetch the address of each of the N authors. If there were fifty authors, it would make fifty-one round trips for all the data. It should be able to fetch all the addresses together in a single round trip, for only two round trips to datastores in total, regardless of the number of authors. The computing expenditure of these extra round trips is massive when applied to large requests, like asking for fifty different colours of fifty t-shirts.

The N+1 problem is further exacerbated in GraphQL because neither clients nor servers can predict how expensive a request is until after it’s executed. In REST, costs are predictable because there’s one trip per endpoint requested. In GraphQL, there’s only one endpoint, and it’s not indicative of the potential size of incoming requests. At Shopify, where thousands of merchants interact with the Storefront API each day, we needed a solution that allowed us to minimize the cost of each request.

Facebook previously introduced a solution to the N+1 issue by creating DataLoader, a library that batches requests specifically for JavaScript. Dylan Thacker-Smith, a developer at Shopify, used DataLoader as inspiration and built the GraphQL Batch Ruby library specifically for the GraphQL Ruby library. This library reduces the overall number of datastore queries required when fulfilling requests with the GraphQL Ruby library. Instead of the server expecting each field resolver to return a value, the library allows the resolver to request data and return a promise for that data. For GraphQL, a promise represents the eventual, rather than immediate, resolution of a field. Therefore, instead of resolver functions executing immediately, the server waits before returning the data.

GraphQL Batch allows applications to define batch loaders that specify how to group and load similar data. The field resolvers can use one of the loaders to load data, which is grouped with similar loads, and returns a promise for the result. The GraphQL request executes by first trying to resolve all the fields, which may be resolved with promises. GraphQL Batch iterates through the grouped loads, uses their corresponding batch loader to load all the promises together, and replaces the promises with the loaded result. When an object field loads, fields nested on those objects resolve using their field resolvers (which may themselves use batch loaders), and then they’re grouped with similar loads that haven't executed. The benefits for Shopify are huge, as it massively reduces the amount of computing power required to process the same requests.
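GraphQL Batch itself is a Ruby library, but the core batching idea is small enough to sketch in a few lines of Python. Everything below is illustrative rather than the library’s actual API:

# Toy sketch of batch loading: resolvers enqueue keys and receive promises
# (here, plain thunks); one dispatch then loads every queued key together.
class BatchLoader:
    def __init__(self, batch_fn):
        self.batch_fn = batch_fn    # performs one round trip for many keys
        self.queue, self.cache = [], {}

    def load(self, key):
        self.queue.append(key)      # defer the fetch instead of running it now
        return lambda: self.cache[key]

    def dispatch(self):
        self.cache.update(self.batch_fn(self.queue))  # single round trip
        self.queue.clear()

def fetch_addresses(author_ids):
    # Stand-in for one datastore query that fetches many addresses at once.
    return {i: "address for author %d" % i for i in author_ids}

loader = BatchLoader(fetch_addresses)
promises = [loader.load(i) for i in (1, 2, 3)]  # field resolvers enqueue loads
loader.dispatch()    # one round trip for all three addresses, instead of three
print([p() for p in promises])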

GraphQL Batch is now considered general best-practice for all GraphQL work at Shopify. We believe great tools should be shared with peers. The GraphQL Batch library is simple, but solves a major complaint within the GraphQL Ruby community. We believe the tool is flexible and has the potential to solve problems beyond just Shopify’s scope. As such, we chose to make GraphQL Batch open-source.

Many Shopify developers are already active individual GraphQL contributors, but Shopify is still constantly exploring ways to interact more meaningfully with the vibrant GraphQL developer community. Sharing the source code for GraphQL Batch is just a first step. As GraphQL adoption increases, we look forward to sharing our learnings and collaborating externally to build tools that improve the GraphQL developing experience.

Learn More About GraphQL at Shopify

Continue reading

Shopify’s Infrastructure Collaboration with Google

Shopify’s Infrastructure Collaboration with Google

We’re always working to deliver the best commerce experience to our merchants and their customers. We provide a seamless merchant experience while shaping the future of retail by building a platform that can handle the traffic of a Kylie Cosmetics flash sale (they sell out in 20 seconds), ship new features into production hundreds of times a day, and process more than double the amount of orders year over year.

For Production Engineering to meet these needs, we regularly review our technology stack to ensure we’re using the best tools for the job, and our journey to the Cloud is a perfect example. That’s why we’re excited to share that Shopify is now building our Cloud with Google. But before sharing the details of this announcement, we want to provide some context on our journey.

Shopify has been a cloud company since day one. We provide a commerce cloud to our merchants, solving their worries about hiring full-time IT staff to manage the infrastructure side of the business. Cloud is part of our DNA and our public cloud connection goes back to 2006, the same year both Shopify and Amazon Web Services (AWS) launched. Early on, we leveraged the public cloud as a small piece of our commerce cloud. It was great for hosting some of our smaller services, but we found the public cloud wasn’t a great fit for our main Rails monolith.

We’re pragmatic about how to evolve and invest in our infrastructure. In our startup days - with a small team - we valued simplicity and chose to focus on shipping the foundations of a commerce platform by deferring more complex infrastructure like database sharding. As we grew in scale and engineering expertise, we took on solving more complex patterns. With each major infrastructure scalability feature we shipped, like database sharding, application sharding, and production load testing, we continued to revisit how to horizontally scale our Rails application across thousands of servers. Over the years, we moved more and more of our supporting services to the Cloud, gaining additional context which fed into our developing monolith Cloud strategy.

Our latest push to the Cloud started over two years ago, when Google launched Google Kubernetes Engine (GKE) (formerly Google Container Engine) just as we finished production-hardening Docker. In 2014, Shopify invested in Docker to capitalize on the benefits of immutable infrastructure: predictable, repeatable builds and deployments; simpler and more robust rollbacks; and elimination of configuration management drift. Once you’re running containers, the next natural step is to take inspiration from Google’s Borg and start building out a dynamic container management and orchestration system. Being early adopters of Docker meant there weren’t many open-source options available; the community and codebase were in their infancy and changing rapidly, so we decided to build minimal container management features ourselves. Building these features allowed us to focus on application scalability and resilience while avoiding additional complexity as the Docker community matured.

In 2016, internal discussions began around what Shopify would look like in the future. The infrastructure changes from 2012 to 2016 allowed us to lay the foundation for using the Cloud in a pragmatic way via database sharding, application sharding, perf testing and automated failovers, but we were still missing an orchestration solution. Luckily, several exciting developments were happening, and the most promising one for Shopify was Kubernetes, an open-source container management system created by the teams at Google that built Borg and GKE.

After 12 years of building and running the foundation of our own commerce cloud with our own data centers, we are excited to build our Cloud with Google. We are working with a company who shares our values in open-source, security, performance and scale. We are better positioned to change the face of global commerce while providing more opportunities to the 600,000+ merchants on our platform today.

Since we began our Google Cloud migration, we have:

  • Built our Shop Mover, a selective database data migration tool, that lets us rebalance shops between database shards with an average of 2.5s of downtime per shop
  • Migrated over 50% of our data center workloads, and counting, to Google Cloud
  • Contributed to and leveraged Grafeas, Google’s open source initiative to define a uniform way of auditing and governing the modern software supply chain
  • Grown to over 400 production services and built a platform as a service (PaaS) to consolidate all production services on Kubernetes
  • Joined the Cloud Native Computing Foundation (CNCF) and participated in the Kubernetes Apps Special Interest Group and Application Definition Working Group

By leveraging Google’s deep understanding of global infrastructure at scale, we’re able to ensure that every engineer we hire focuses on building and shaping the future of commerce on a global scale.

Stay tuned. We’re excited to share more stories about Shopify’s journey to Google Cloud with you.

Dale Neufeld, VP of Production Engineering

Continue reading

A Pods Architecture To Allow Shopify To Scale

A Pods Architecture To Allow Shopify To Scale

In 2015, it was no longer possible to continue buying a larger database server for Shopify. We finally had no choice but to shard the database, which allowed us to horizontally scale our databases and continue our growth. However, what we gained in performance and scalability we lost in resilience. Throughout the Shopify codebase was code like this:

Sharding.with_each_shard do
  some_action
end

If any of our shards went down, that entire action would be unavailable across the platform. We realized this would become a major problem as the number of shards continued to increase. In 2016 we sat down to reorganize Shopify’s runtime architecture.

Continue reading

Accelerating Android Talent Through Community Bootcamps

Accelerating Android Talent Through Community Bootcamps

6 minute read

The mobile team knew they needed developers, particularly Android developers. A few years ago, Shopify pivoted to mobile-first, which led to the launches of Shopify Mobile, Shopify Pay, Frenzy, and others. To maintain momentum, Shopify had to keep building up its mobile talent.

Back when Shopify's mobile teams spun up, many of our early mobile developers had never done any mobile development before, instead teaching themselves how to do it on the job. From this observation, we had an insight: what if we could teach developers how to build an Android app via a Shopify-hosted workshop?

The benefits were obvious: this educational initiative could help our local developer community pick up some new skills, while potentially allowing us to meet exciting new talent. The idea for Android Bootcamp was born.

Continue reading

Future Proofing Our Cloud Storage Usage

Future Proofing Our Cloud Storage Usage

How we reduced error rates, and dropped latencies across merchants’ flows

Reading Time: 6 Minutes

Shopify merchants trust that when they build their stores on our platform, we’ve got their back. They can focus on their business, while we handle everything else. Any failures or degradations that happen put our promise of a sturdy, battle-tested platform at risk.

To keep that promise, we need to ensure that the platform stays up and stays reliable. Since 2016, Shopify has grown from 375,000 merchants to over 600,000. Today, an average of 450,000 S3 operations per second are made through our platform. However, that rapid growth also came with an increased S3 error rate and increased read and write latencies.

While we use S3 at Shopify, if your application uses any flavor of cloud storage, and its use of cloud storage strongly correlates with the growth of your user base—whether it’s storing user or event data—I’m hoping this post provides some insight into how to optimize your cloud storage!

Continue reading

2017 Bug Bounty Year in Review

2017 Bug Bounty Year in Review

7 minute read

At Shopify, our bounty program complements our security strategy and allows us to leverage a community of thousands of researchers who help secure our platform and create a better Shopify user experience. We first launched the program in 2013 and moved to the HackerOne platform in 2015 to increase hacker awareness. Since then, we've continued to see increasing value in the reports submitted, and 2017 was no exception.

Continue reading

Implementing ChatOps into our Incident Management Procedure

Implementing ChatOps into our Incident Management Procedure

8 minute read

Production engineers (PE) are expected to be incident management experts. Still, incident handling is difficult, often messy, and exhausting. We encounter new incidents, search high and low for possible explanations, sometimes tunnel on symptoms, and, under pressure, forget some best practices.

At Shopify, we care not only about handling incidents quickly and efficiently, but also PE well-being. We have a special IMOC (incident manager on call) rotation and an incident chatbot to assist IMOCs. This post provides an overview of incident management at Shopify, the responsibility of different roles during an incident, and how our chatbot works to support our team.

Continue reading

How Shopify Merchants can Measure Retention

How Shopify Merchants can Measure Retention

At Shopify, our business depends upon understanding the businesses of the more than 500,000 merchants who rely on our platform. Customers are at the heart of any business, and deciphering their behavior helps entrepreneurs effectively allocate their time and money. To help, we set out to tackle the nontrivial problem of measuring customer retention for our merchants.

When a customer stops buying from a business, we call that churn. In a contractual business (like software as a service), it’s easy to see when a customer leaves because they dissolve their contract. By comparison, in a non-contractual business (like a clothing store), it’s more difficult: the customer simply stops purchasing without any direct notification, so the business never knows for sure, and churn can’t be treated as deterministic. Entrepreneurs running non-contractual businesses can better define churn using probability.

Correctly describing customer churn is important: picking the wrong churn model means your analysis will be either full of arbitrary assumptions or misguided. Far too often businesses define churn as no purchases after N days; typically N is a multiple of 7 or 30 days. Because of this time-limit, it arbitrarily buckets customers into two states: active or inactive. Two customers in the active state may look incredibly different and have different propensities to buy, so it’s unnatural to treat them the same. For example, a customer who buys groceries in bulk should be treated differently than a customer who buys groceries every day. This binary model has clear limitations.

Our Data team recognized the limitation of defining churn incorrectly, and that we had to do better. Using probability, we have a new way to think about customer churn. Imagine a few hypothetical customers visit a store, visualized in the below figure. Customer A is reliable. They are a long-time customer and buy from your store every week. It’s been three days since you last saw them in your store but chances are they’ll be back. Customer B’s history is short-lived. When they first found your store, they made purchases almost daily, but now you haven’t seen them in months, so there’s a low chance of them still being considered active. Customer C has a slower history. They buy something from your store a couple times a year, and you last saw them 10 months ago. What can you say about Customer C’s probability of being active? It’s likely somewhere in the middle.

How Shopify Merchants can Measure Retention
We can formalize this intuition about probabilistic customers in a model. We’ll consider a simple model for now. Suppose each customer has two intrinsic parameters: a rate of purchasing, \(\lambda\), and a probability of a churn event, \(p\). From the business’s point of view, even if a customer churns, we don’t see the churn event; we can only infer churn from their purchase history. Given a customer’s rate of purchase, the times between their purchases are exponentially distributed with rate \(\lambda\), meaning their purchasing follows a Poisson process. After each future purchase, the customer has a \(p\) chance of churning. Rather than trying to estimate every customer’s parameters individually, we can think of an individual customer’s parameters as coming from a probability distribution. We can then estimate the distributions that generate the parameters and, hence, the customers’ behavior. Altogether this is known as a hierarchical model, where unobservables (the customer behaviors) are created from probability distributions.
The probability distributions for \(\lambda\) and \(p\) are different for each business. The first step in applying this model is to estimate your specific business’s distributions for these quantities. Let’s assume that a customer’s \(\lambda\) comes from a Gamma distribution (with currently unknown parameters), and \(p\) comes from a Beta distribution (also with currently unknown parameters). This is the model the authors of “Counting Your Customers the Easy Way: An Alternative to the Pareto/NBD Model” propose. They call it the BG/NBD (Beta Geometric / Negative Binomial Distribution) model.

Further detail on implementing the BG/NBD model is given below, but what’s interesting is that after writing down the likelihood of the model, the sufficient statistics turn out to be:

  • Age: the duration between the customer’s first purchase and now
  • Recency: what was the Age of the customer at their last purchase?
  • Frequency: how many repeat purchases have they made?

Because the above statistics (age, frequency, recency) contain all the relevant information needed, we only need to know these three quantities per customer as input to the model. These three statistics are easily computed from the raw purchase data. Using these new statistics, we can redescribe our customers above:

  • Customer A has a large Age, Frequency, and Recency.
  • Customer B has a large Age and Frequency, but much smaller Recency.
  • Customer C has a large Age, low Frequency, and moderate Recency.

Being able to statistically determine the behaviors of Customers A, B and C means an entrepreneur can better run targeted ad campaigns, introduce discount codes, and project customer lifetime value.
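For illustration, here’s a small sketch that computes these three statistics from a made-up purchase log, using the open-source lifetimes package discussed later in this post:

# Sketch: derive each customer's frequency, recency, and age (T) from raw
# order data. The orders below are made up for illustration.
import pandas as pd
from lifetimes.utils import summary_data_from_transaction_data

orders = pd.DataFrame({
    "customer_id": ["A", "A", "A", "B", "C", "C"],
    "ordered_at": pd.to_datetime([
        "2018-01-01", "2018-01-08", "2018-01-15",  # A: a weekly regular
        "2017-06-01",                              # B: one purchase, long ago
        "2017-03-01", "2017-09-15",                # C: a couple of times a year
    ]),
})

summary = summary_data_from_transaction_data(
    orders, "customer_id", "ordered_at", observation_period_end="2018-04-01")
print(summary)  # one row per customer: frequency, recency, T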

The individual-customer data can be plugged into a likelihood function and fed to a standard optimization routine to find the Gamma distribution and Beta distribution parameters \((r, \alpha)\), and \((a, b)\), respectively. You can use the likelihood function derived in the BG/NBD paper for this:
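In the paper’s notation, for a customer with frequency \(x\), recency \(t_x\), and age \(T\), the per-customer likelihood is:

\[
L(r, \alpha, a, b \mid x, t_x, T) = A_1 \, A_2 \left( A_3 + \delta_{x > 0} \, A_4 \right)
\]

where

\[
A_1 = \frac{\Gamma(r+x)\,\alpha^r}{\Gamma(r)}, \qquad
A_2 = \frac{\Gamma(a+b)\,\Gamma(b+x)}{\Gamma(b)\,\Gamma(a+b+x)},
\]
\[
A_3 = \left(\frac{1}{\alpha+T}\right)^{r+x}, \qquad
A_4 = \frac{a}{b+x-1}\left(\frac{1}{\alpha+t_x}\right)^{r+x}.
\]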


We use optimization routines in Python, but the paper describes how to do this in a spreadsheet if you prefer.
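Continuing the sketch above, the fit is a couple of lines with the lifetimes package:

# Sketch: fit the Gamma (r, alpha) and Beta (a, b) parameters by maximum
# likelihood, then query per-customer metrics from the fitted model.
from lifetimes import BetaGeoFitter

bgf = BetaGeoFitter()
bgf.fit(summary["frequency"], summary["recency"], summary["T"])
print(bgf.params_)  # fitted r, alpha, a, b

# Probability each customer is still active, given their history:
p_alive = bgf.conditional_probability_alive(
    summary["frequency"], summary["recency"], summary["T"])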

Once these distribution parameters are known \((\alpha, r, a, b)\), we can look at metrics like the probability of a customer being active given their purchase history. Organizing this as a distribution is useful as a proxy for the health of a customer base. Another view is to look at the heatmap of the customer base. As we vary the recency of a customer, we expect the probability of being active to increase. And as we vary the frequency, we expect the probability to increase given a high recency too. Below we plot the probability of being active given varying frequency and recency:

How Shopify Merchants Can Measure Retention - Probability of Being Active, by Frequency and Recency

The figure reassures us that the model behaves as we expect. Similarly, we can look at the expected number of future purchases in a single unit of time: 

Expected Number of Future Purchases for 1 Unit of Time
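Both heatmaps above can be reproduced from a fitted model with lifetimes’ plotting helpers (which use matplotlib under the hood):

from lifetimes.plotting import plot_probability_alive_matrix, plot_frequency_recency_matrix

plot_probability_alive_matrix(bgf)   # P(active) over frequency and recency
plot_frequency_recency_matrix(bgf)   # expected purchases in one unit of time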

At Shopify, we’re using a modified BG/NBD model implemented in lifetimes, an open-source package maintained by the author and the Shopify Data team. The resulting analysis is sent to our reporting infrastructure to display in customer reports. We can train the BG/NBD model on all 500K+ merchants in under an hour. We do this by using Apache Spark’s DataFrames to pick up the raw data, group rows by shop, and apply a Python user-defined function (UDF) to each partition. The UDF contains the lifetimes estimation algorithm. For performance reasons, we subsample to 50k customers per shop, since estimation beyond that yielded diminishing returns. After fitting the BG/NBD model’s parameters for a shop, we apply the model to each customer in that shop and yield the results. In all, we infer churn probabilities and expected values for over 500 million historical merchant customers.
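Stripped way down, the job has roughly this shape; the table, columns, and Spark mechanics here are illustrative rather than our production code:

# Illustrative sketch of per-shop training with Spark; names are made up.
def fit_shop(shop_id, rows):
    import pandas as pd
    from lifetimes import BetaGeoFitter
    df = pd.DataFrame(rows, columns=["customer_id", "frequency", "recency", "T"])
    if len(df) > 50000:
        df = df.sample(50000)  # subsample very large shops
    bgf = BetaGeoFitter()
    bgf.fit(df["frequency"], df["recency"], df["T"])
    p = bgf.conditional_probability_alive(df["frequency"], df["recency"], df["T"])
    return [(shop_id, c, float(x)) for c, x in zip(df["customer_id"], p)]

scored = (summaries.rdd  # `summaries`: a DataFrame of per-customer statistics
    .map(lambda r: (r.shop_id, (r.customer_id, r.frequency, r.recency, r.T)))
    .groupByKey()        # gather each shop's customers into one group
    .flatMap(lambda kv: fit_shop(kv[0], list(kv[1])))
    .toDF(["shop_id", "customer_id", "p_alive"]))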

One reason for choosing the BG/NBD model is its easy interpretability. Because we display the end results to shop owners, we didn’t want a black-box model that would make it difficult to explain why a customer is at-risk or loyal. Recall that the variables the BG/NBD model requires are age, frequency, and recency, each easily understood by non-technical individuals. The BG/NBD model codifies the interactions between these three variables and provides quantitative measures based on them. On the other hand, the BG/NBD model does suffer from being overly simple. It doesn’t handle seasonal trends well: for example, the frequency term collapses all purchases into a single value, ignoring any seasonality in the purchase behaviour. Another limitation is that you can’t easily add extra customer variables (e.g. country, products purchased) to the model.

Once we’ve fitted a model for a store, we rank customers from highest to lowest probability of being active. The customers at the top are the reliable ones; those at the bottom are unlikely to come back. Customers around a 50% probability are at risk of churning, so targeted campaigns could be made to entice them back, possibly reviving the relationship and potentially gaining a life-long customer. By providing these statistics, our merchants are in a position to drive smarter marketing campaigns, order fulfillment prioritization, and customer support.

Continue reading

How We Enable Our Interns to Make an Impact

How We Enable Our Interns to Make an Impact

Making an Impact

When interns join Shopify for their internship term, they work on projects that will impact our merchants, partners, and even their fellow developers. Some of these projects alleviate a merchant’s pain points, like making it possible to sell their products on different channels, while others simplify a complicated process for our developers. We want interns to leave knowing they worked on real projects with real impact.

Continue reading

Tell Your Stories: The Benefits of Strategic Engineering Communications

Tell Your Stories: The Benefits of Strategic Engineering Communications

In early 2016, we faced a problem at Shopify. We were growing quickly, and decisions could no longer be made across the room, so to speak. Four offices became five, and accommodating that growth raised interesting questions: how would new people know the history of the company, and how could existing Shopifolk keep up with new developments? In addition to sharing knowledge inside the company, we also wanted to let people outside Shopify know what we were working on, to give back to the community and to support recruitment efforts.

Engineering communications was born to solve a specific problem. A valued saying here is “do things, tell people,” but, while we’re very good at the first part, we weren’t living up to expectations on the second. An ad hoc approach worked when we were smaller, but with technical stories now coming from teams as varied as production engineering, mobile, front-end development, and data engineering, we needed something more formalized. Strong communications inside the engineering team could help prevent the overlap of work by different teams or the duplication of mistakes, and it could support cross-pollination of ideas.

Continue reading

How Shopify Governs Containers at Scale with Grafeas and Kritis

How Shopify Governs Containers at Scale with Grafeas and Kritis

Today, Google and its contributors launched Grafeas, an open source initiative to define a uniform way for auditing and governing the modern software supply chain. At Shopify, we’re excited to be part of this announcement.

Grafeas, or “scribe” in Greek, enables us to store critical software component metadata during our build and integration pipelines. With over 6,000 container builds per day and 330,000 images in our primary container registry, the security team was eager to implement an appropriate auditing strategy to be able to answer questions such as:

  • Is this container deployed to production?
  • When was the last time this container was pulled (downloaded) from our registry?
  • What packages are installed in this container?
  • Does this container contain any security vulnerabilities?
  • Does this container meet our security controls?

Using Grafeas as the central source of truth for container metadata has allowed the security team to answer these questions and flesh out appropriate auditing and lifecycling strategies for the software we deliver to users at Shopify.

Here’s a sample of the container introspection we gain from Grafeas. In this example, we have details surrounding the origin of a container, including its build details, base image, and the operations that resulted in the container’s layers.

Build Details:

Image Basis:

As part of Grafeas, Google also introduced Kritis, or “judge” in Greek, which allows us to use the metadata stored in Grafeas to build and enforce real-time deployment policies with Kubernetes. During CI, a number of audits are performed against the containers and attestations are generated. These attestations make up the policies we can enforce with Kritis on Kubernetes.

At Shopify we use PGP to digitally sign our attestations, ensuring the identity of our builder and other attestation authorities.

Here’s an example of a signed attestation:

The two key concepts of Kritis are attestation authorities and policies. An attestation authority is a named entity with the capability to create attestations. A policy then names one or more attestation authorities whose attestations are required in order to deploy a container to a particular cluster. Here’s an example of what that might look like:

Given the above attestation authorities (built-by-us and tested) we can deploy a policy similar to this example:

This policy would preclude the deployment of any container that does not have signed attestations from both authorities.

Given this model, we can create a number of attestation authorities, each satisfying a particular security control.

Attestation Examples:

  • This container has been built by us
  • This container comes from our (or a trusted) container repository
  • This container does not run as root
  • This container passes CI tests
  • This container does not introduce any new vulnerabilities (scanned)
  • This container is deployed with the appropriate security context

Given the attestation examples above, we can enable Kritis enforcement on our Kubernetes clusters that ensures we only run containers which are free from known vulnerabilities, have passed our CI tests, do not run as root, and have been built by us!

In addition to build time container security controls we can also generate Kritis attestations for the Kubernetes workload manifests with the results of kubeaudit during CI. This means we can ensure there are no regressions in the runtime security controls before the container is even deployed.

Using tools like Grafeas and Kritis has allowed us to inject security controls into the DNA of Shopify’s cloud platform, providing software governance techniques at scale alongside our developers and unlocking the velocity of all our teams.

We’re really excited about these new tools and hope you are too! Here are some of the ways you can learn more about the projects and get involved:

Try Grafeas now and join the GitHub project: https://github.com/Grafeas

Attend Shopify’s talks at Google Cloud Summit in Toronto on 10/17 and KubeCon in December.

See grafeas.io for documentation and examples.

Continue reading

Building Shopify Mobile with Native and Web Technology

Building Shopify Mobile with Native and Web Technology

For mobile apps to have an excellent user experience, they should be fast, use the network sparingly, and use visual and behavioural conventions native to the platform. To achieve this, the Shopify Mobile apps are native iOS and Android, and they're powered by GraphQL. This ensures our apps are consistent with the platforms they run on, are performant, and use the network efficiently.

This essentially means developing Shopify on each platform: iOS, Android, and web. As Shopify has far more web developers than mobile developers, it’s almost impossible to keep pace with the feature releases on the web admin. Since Shopify has invested in making the web admin responsive, we often leverage parts of the web to maintain feature parity between mobile and desktop platforms.

Core parts of the app that are used most are native to give the best experience on a small screen. A feature that is data-entry intensive or has high information density is also a good candidate for a native implementation that can be optimized for a smaller screen and for reduced user input. For secondary activities in the app, web views are used. Several of the settings pages, as well as reports, which are found in the Store tab, are web views.  This allows us to focus on creating a mobile-optimized version of the most used parts of our product, while still allowing our users to have access to all of Shopify on the go.

With this mixed-architecture approach, not only can a user go from a native view to a web view, using deep-links the user can also be presented a native view from a web view. For example, tapping a product link in a web view will present the native product detail view.

At Unite, our developer conference, Shopify announced Polaris, a design language that we use internally for our web and mobile applications. A common design language ensures our products are familiar to our users, as well as helping to facilitate a mixed architecture where web pages can be used in conjunction with native views.

Third Party Apps

In addition to the features that are built internally, Shopify has an app platform, which allows third party developers to create (web) applications that extend the functionality of Shopify. In fact, we have an entire App Store dedicated to showcasing these apps. These apps authenticate to Shopify using OAuth and consume our REST APIs. We also offer a JavaScript SDK called the Embedded App SDK (EASDK) that allows apps to be launched within an iframe of the Shopify Admin (instead of opening the app in another tab) and to use Shopify’s navigation bars, buttons, pop ups, and status messages. Apps that use the EASDK are called “embedded apps,” and most of the applications developed for Shopify today are embedded.

Our users rely on these third party apps to run their business, and they’re doing so increasingly from their mobile devices. When our team was tasked with bringing these apps to mobile, we quickly found that the apps use too much vertical real estate for their navigation when loaded in a web view, and that doing so would introduce inconsistencies between the native app navigation bars and their web counterparts. It was clear this would be a sub-par experience. Additionally, since these apps are maintained by third-party developers, it wasn’t possible to update them all to be responsive.

Our goal was to have apps optimize their screen usage and look and behave like the rest of the mobile app, without requiring existing apps to make any code changes. This meant that on the day we released the feature, our users would keep all the apps they already use and gain access to the thousands of apps available on the Shopify App Store.

Content size highlighted in an app rendered in a web view (left) vs. in Shopify Mobile (right).  


The screenshots above illustrate what an app looks like rendered in a web view as-is vs. how it looks now, optimized for mobile. Most of the navigation bar elements have been collapsed into the native nav bar, which lets the app reclaim that vertical space for content instead of displaying a redundant navigation bar. The web back button has also been folded into the native navigation back stack, so tapping back through the web app is the same as navigating back through native views. These changes allowed the apps to reclaim more than 40% more vertical real estate.

I'll now go through how we incorporated each element.

Building the JavaScript bridge

The EASDK is what apps use to configure their UI within Shopify. We wanted to position the Shopify Mobile app on the receiving end of this API, much like the Shopify web admin is today. This would allow existing apps to use the EASDK with no changes. The EASDK contains several methods to configure the navigation bar, which can consist of buttons, a title, breadcrumbs, and pagination. We looked at reducing the number of items the navigation bar needed to render, and started pruning. We found that the breadcrumbs and pagination buttons were not necessary, and not a common pattern in mobile apps; they were the first to be cut. The next step was to collapse the web navigation bar into the native bar. To do this, we had to intercept the JavaScript calls to the EASDK methods.

To allow interception of calls to the EASDK, we created a middleware system on Shopify web that can be injected by the mobile apps. This allows Shopify Mobile to augment messages before they hit their final destination, or to suppress them entirely. The approach is flexible and generic: clients can natively implement features piecemeal without the need for versioning between client and server.

This middleware is implemented in JavaScript and bundled with the mobile apps. A single shared JavaScript file contains the methods common to both platforms, and separate platform-specific files contain the iOS- and Android-specific native bridging.

High-level overview of the data flow from an embedded app to native code in Shopify Mobile

The shared JavaScript file injects itself into the main context, extends the Shopify.EmbeddedApp class, and overrides the methods that are to be intercepted on the mobile app. The methods in this shared file simply forward the calls to another object, Mobile, which is implemented in the separate files for iOS and Android.
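
The original code figures aren’t reproduced here, so below is a minimal sketch of what the shared file might look like. The Shopify.EmbeddedApp class and the Mobile object are named above; the specific method names (setTitle, setButtons) and everything else are assumptions for illustration.

// Shared JS file (sketch): override the EASDK methods we want to
// intercept and forward each call to the platform-specific Mobile
// object defined in the iOS and Android JS files.
(function() {
  var proto = Shopify.EmbeddedApp.prototype;

  proto.setTitle = function(title) {
    Mobile.setTitle(title);
  };

  proto.setButtons = function(buttons) {
    Mobile.setButtons(buttons);
  };
})();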


Shared JS File

On iOS, WKWebView relies on postMessage to allow the web page to communicate with native Swift code. The two JavaScript files are injected into the WKWebView using WKUserScript. The iOS-specific JavaScript file forwards the EASDK method calls as postMessages, which are intercepted by a WKScriptMessageHandler.
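
Here is a sketch of what the iOS-specific file might look like, assuming the native side registers a script message handler under the name mobileBridge (the handler name and message shape are illustrative):

// iOS JS file (sketch): implement Mobile by forwarding each call as a
// postMessage to the WKScriptMessageHandler registered by the native app.
var Mobile = {
  setTitle: function(title) {
    window.webkit.messageHandlers.mobileBridge.postMessage(
      { method: 'setTitle', payload: { title: title } });
  },
  setButtons: function(buttons) {
    window.webkit.messageHandlers.mobileBridge.postMessage(
      { method: 'setButtons', payload: { buttons: buttons } });
  }
};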


iOS JS File


iOS native message handling

On Android, a Java Object can be injected into the WebView, which gives the JavaScript access to its methods.
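
A corresponding sketch of the Android-specific file, assuming the injected Java object is exposed to JavaScript as MobileBridge (the name is illustrative):

// Android JS file (sketch): the injected Java object's methods can be
// called directly. Structured arguments are passed as JSON strings,
// since injected methods accept primitives and Strings.
var Mobile = {
  setTitle: function(title) {
    MobileBridge.setTitle(title);
  },
  setButtons: function(buttons) {
    MobileBridge.setButtons(JSON.stringify(buttons));
  }
};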



Android JS File

 

Android native message handling

When an embedded app is launched from the mobile app, we inject a URL parameter to tell Shopify not to render the web nav bar, since we render it natively. As calls to the EASDK methods are intercepted, the mobile apps render titles, buttons, and activity indicators natively. This makes better use of the screen space and required no changes to the third-party apps, so all the existing apps work as-is!

Communicating from native to web

App with native primary button, and secondary buttons in the overflow menu


In addition to intercepting calls from the web, the mobile apps need to communicate user interactions back to the web. For instance, when a user taps a native button, we need to trigger the appropriate behaviour as defined in the embedded app. The middleware facilitates communicating from native to web via HTML postMessages: each button has an associated message name, which we post back to the web app when the button is tapped.

Alternatively, a button can be defined to load a URL, in which case we can simply load the target URL in the web view. A button can also be configured to emit a postMessage.
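
To make the postMessage path concrete, here is a sketch of how the middleware might deliver a native button tap back to the embedded app; the iframe selector and message shape are assumptions for illustration:

// Native-to-web (sketch): deliver the tapped button's message name to
// the embedded app's iframe as an HTML postMessage.
function dispatchButtonTap(messageName) {
  var appFrame = document.querySelector('iframe[name="app-iframe"]');
  appFrame.contentWindow.postMessage(
    JSON.stringify({ message: messageName }),
    '*'); // a production implementation would restrict the target origin
}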


 

iOS implementation of button handling

Android implementation of button handling

Summary

By embracing the web in our mobile apps, we’re able to keep pace with feature releases in the rest of Shopify while complementing the web with native versions of the features merchants use most. It also allows us to extend Shopify Mobile with apps created by our third-party developers, with no changes to the EASDK. By complementing the web view with a JavaScript bridge, we were able to optimize the screen real estate and make embedded apps consistent with the rest of the mobile app.

With multiple teams contributing features to Shopify Mobile concurrently, our mobile app is the closest it’s been to reaching feature parity with the web admin, while ensuring the frequently used parts of the app are optimized for mobile by writing them natively.

To learn more about creating apps for Shopify Mobile, check out our developer resources.

Continue reading

Code Style Consistency for Shopify’s Decade-Old Codebase

5 minute read

Over the course of Shopify's 13-year codebase history, the core platform has never been rewritten. That meant a slew of outdated code styles piling atop one another, without much consistency. In 2012, our CEO Tobi created a draft Ruby style guide to keep up with the growth. Unfortunately, it never became embedded in our programming culture, and many people didn't even know it existed.

Continue reading

Integrating with Amazon: How We Bridged Two Different Commerce Domain Models

Over the past decade, the internet and mobile devices have become the dominant computing platforms. In parallel, the family of software architecture styles that support distributed computing has become the way we build systems to tie these platforms together. Styles fall in and out of favor as technologies evolve and as we, the community of software developers, gain experience building ever more deeply connected systems.

If you’re building an app to integrate two or more systems, you’ll need to bridge between two different domain models, communication protocols, and/or messaging styles. This is the situation that our team found itself in as we were building an application to integrate with Amazon’s online marketplace. This post talks about some of our experiences integrating two well-established but very different commerce platforms.

Shopify is a multi-channel commerce platform enabling merchants to sell online, in stores, via social channels (Facebook, Messenger and Pinterest), and on marketplaces like Amazon from within a single app. Our goals for the Amazon channel were to enable merchants to use Shopify to:

  • Publish products from Shopify to Amazon
  • Automatically sync orders that were placed on Amazon back to Shopify
  • Manage synced orders by pushing updates such as fulfillments and refunds back to Amazon

At Shopify, we deal with enormous scale as the number of merchants on our platform grows. In the beginning, to limit the scale that our Amazon app would face, we set several design constraints including:

  • Ensure the data is in sync to enable merchants to meet Amazon’s SLAs
  • Limit the growth of the data our app stores by not saving order data

In theory, the number of orders our app processes is unbounded and increases with usage. By not storing order data, we believed that we could limit the rate of growth of our database, deferring the need to build complex scaling solutions such as database sharding.

That was our plan, but we discovered during implementation that the differences between the Amazon and Shopify systems required our app to do more work and store more data. Here’s how it played out.

Integrating Domain Woes

In an ideal world, where both systems use a similar messaging style (such as REST with webhooks for event notification), the syncing of an order placed on Amazon to the Shopify system might look something like this:

Ideal order-syncing flow: each system notifies our app of events via webhooks, and our app relays them to the other system

Each system notifies our app, via a webhook, of a sale or fulfillment. Our app transforms the data into the format required by Amazon or Shopify and creates a new resource on that system by using an HTTP POST request.

Reality wasn’t this clean. While Shopify and Amazon have mature APIs, each has a different approach to the design of these APIs. The following chart lists the major differences:

Shopify API:
  • uses representational state transfer (REST)
  • synchronous data write requests
  • uses webhooks for event notification

Amazon’s Marketplace Web Service (MWS) API:
  • uses a remote procedure call (RPC) messaging style
  • asynchronous data write requests
  • uses polling for event discovery, including completion of asynchronous write operations

To accommodate these differences, the actual sequence of operations our app makes is the following (sketched in code after the list):

  1. Request new orders from Amazon
  2. Request order items for new orders
  3. Create an order on Shopify
  4. Acknowledge receipt of the order to Amazon
  5. Confirm that the acknowledgement was successfully processed
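
As a sketch, the sequence might look like the following. ListOrders, ListOrderItems, SubmitFeed, and GetFeedSubmissionResult are real MWS operations, but the mws and shopify client wrappers and the helper functions are illustrative assumptions:

// Pseudocode sketch of the five-step sync.
async function syncNewOrders(mws, shopify, lastSyncTime) {
  // 1. Poll Amazon for new orders (MWS ListOrders; there are no webhooks).
  const orders = await mws.listOrders({ createdAfter: lastSyncTime });

  for (const order of orders) {
    // 2. Request the items for each new order (MWS ListOrderItems).
    const items = await mws.listOrderItems(order.amazonOrderId);

    // 3. Create the order on Shopify (synchronous REST write).
    await shopify.createOrder(toShopifyOrder(order, items));

    // 4. Acknowledge receipt to Amazon via an asynchronous feed (SubmitFeed).
    const feedId = await mws.submitFeed(
      '_POST_ORDER_ACKNOWLEDGEMENT_DATA_', acknowledgementXml(order));

    // 5. Poll GetFeedSubmissionResult until the feed is processed.
    await mws.waitForFeedResult(feedId);
  }
}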

When the merchant subsequently fulfills the order on Shopify, we receive a webhook notification and post the fulfillment to Amazon. The entire flow looks like this:

The full six-message flow for syncing and fulfilling an Amazon order

When our app started receiving an odd error from Amazon when posting fulfillment requests, we knew the design wasn’t fully worked out. It turned out that our app received the fulfillment webhook from Shopify before the order acknowledgement had been sent to Amazon, so the attempt to send the fulfillment to Amazon failed.

Shopify has a rich ecosystem of third-party apps for merchants’ shops. Many of these apps help automate fulfillment by watching for new orders and automatically initiating a shipment. We had to be careful because one of these apps could trigger a fulfillment request before our app sent the order acknowledgement back to Amazon.

Shopify uses a synchronous messaging protocol requiring two messages for order creation and fulfillment. Amazon’s messaging protocol is a mix of synchronous (retrieving the order and order items) and asynchronous messages (acknowledging and then fulfilling the order), which requires four messages. All six of these messages need to be sent and processed in the correct sequence. This is a message ordering problem: we can’t send the fulfillment request to Amazon until the acknowledgement request has been sent and successfully processed even if we get a fulfillment notification from Shopify. We solved the message ordering problem by holding the fulfillment notification from Shopify until the order acknowledgement is processed by Amazon.
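
A sketch of that holding logic, with all names illustrative:

// Fulfillment webhooks that arrive before Amazon has processed the
// order acknowledgement are parked, then replayed once it completes.
const pendingFulfillments = new Map(); // orderId -> queued fulfillments

function onShopifyFulfillmentWebhook(orderId, fulfillment) {
  if (!isAcknowledged(orderId)) {
    const queue = pendingFulfillments.get(orderId) || [];
    queue.push(fulfillment);
    pendingFulfillments.set(orderId, queue);
    return; // hold until the acknowledgement is processed
  }
  sendFulfillmentToAmazon(orderId, fulfillment);
}

function onAmazonAcknowledgementProcessed(orderId) {
  markAcknowledged(orderId);
  // Flush any fulfillments that arrived while we were waiting.
  for (const fulfillment of pendingFulfillments.get(orderId) || []) {
    sendFulfillmentToAmazon(orderId, fulfillment);
  }
  pendingFulfillments.delete(orderId);
}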

Another issue cropped up when we started processing refunds. The commerce domain model implemented by Amazon requires refunds to be associated with an item sold, while Shopify allows for more flexibility. Neither model is wrong; they simply reflect the different choices made by the respective teams when choosing which commerce use-cases to support.

To illustrate, consider a simplified representation of an order received from Amazon.
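
The original post showed the order as a data figure; the following is an illustrative reconstruction consistent with the description below (the field names are assumptions):

{
  "amazon_order_id": "123-4567890-1234567",
  "line_items": [
    {
      "title": "Sports Jersey",
      "price": "50.00",
      "shipping_price": "5.00",
      "tax": "6.50"
    },
    {
      "title": "Baseball Cap",
      "price": "25.00",
      "shipping_price": "5.00",
      "tax": "3.25"
    }
  ]
}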

This order contains two items, a jersey and a cap. The item and shipping prices for each are just below the item title. When creating the order in Shopify, we send this data with the same level of detail, transformed to JSON from the XML received from Amazon.

Shopify is flexible and allows the merchant to submit the refund either quickly, by entering a refund amount, or with a more detailed method specifying the individual items and prices. If the merchant takes the quicker approach, Shopify sends the following data to our app when the refund is created:
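
That figure is also missing; illustratively, with assumed field names, the quick-refund payload might look like this:

{
  "refund": {
    "order_id": 1234567890,
    "amount": "30.00",
    "note": "Refund for returned jersey"
  }
}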

Notice that we didn’t get an item-by-item breakdown of the item or shipping prices from Shopify. This causes a problem because we’re required to send Amazon values for price, shipping costs, and taxes for each item. We solved this by retaining the original order detail retrieved from Amazon and using this to fill in missing data when sending the refund details back.
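
A sketch of that backfill, assuming we persisted the Amazon order verbatim when it was first retrieved (all names are illustrative):

// Merge the lump-sum Shopify refund with per-item detail from the
// stored copy of the original Amazon order, since Amazon requires
// price, shipping, and tax values for each item.
function buildAmazonRefund(shopifyRefund, storedOrder) {
  return {
    amazon_order_id: storedOrder.amazon_order_id,
    amount: shopifyRefund.refund.amount,
    items: storedOrder.line_items.map(function(item) {
      return {
        title: item.title,
        price: item.price,
        shipping_price: item.shipping_price,
        tax: item.tax
      };
    })
  };
}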

Lessons Learned

Our choices ended up violating the design constraint we initially set: don’t persist order data. Deciding to persist orders, with all the detail retrieved from Amazon, in our app’s database is what enabled us to solve the problems of integrating the different domain models. Looking back, here are a few things we learned:

  • It’s never wrong to go back and revisit assumptions, decisions, or constraints put in place early in a project. You’ll learn something more about your problem with every step you take towards shipping a feature. This is how we work at Shopify, and this project highlighted why this flexibility is important
  • Understand the patterns and architectural style of the systems with which you’re integrating. When you don’t fully account for these patterns, it can cause implementation difficulties later on. Keep an eye open for this
  • Common integration problems include message ordering and differences in message granularity. A persistence mechanism can be used to overcome these. In our case, we needed the durability of an on-disk database

By revisiting assumptions, being flexible, and taking into account the patterns and architectural style of Amazon, the team successfully integrated these two very different commerce domains in a way that benefits our merchants and makes their lives easier.

Continue reading

How Shopify Capital Uses Quantile Regression To Help Merchants Succeed

6 minute read

Shopify Capital provides funding to help merchants on Shopify grow their businesses. But how does Shopify Capital award these merchant cash advances? In this post, I'll dive deep into the machine-learning technique our Risk-Algorithms team uses to decide eligibility for cash advances.

The exact features that go into the predictive model that powers Shopify Capital are secret, but I can share the key technique we use: quantile regression.

Continue reading

Upgrading Shopify to Rails 5

Today, Shopify runs on Rails 5.0, the latest version. Staying on the latest version lets us improve the performance and stability of the application without incurring the maintenance cost of applying monkey patches. It also guarantees we’re always on a version maintained by the community, and that we get access to new features sooner.

Upgrading the Shopify monolith, one of the oldest and largest Rails applications in the industry, from Rails 4.2 to 5.0 took us nearly a year. In this post, I’ll share our upgrade story and the lessons we learned. If you're wondering what the Shopify scale looks like, or you're planning a major Rails upgrade, this post is for you.

Continue reading

Maintaining a Swift and Objective-C Hybrid Codebase

6 minute read

Swift is gaining popularity among iOS developers, which is no surprise. It's strictly typed, which means you can prove the correctness of your program at compile time, given that your type system describes the domain well. It's a modern, expressive language whose syntax constructs encourage developers to write better architecture using fewer lines of code. It's more fun to work with, and all the new Cocoa projects are being written in Swift.

At Shopify, we want to adopt Swift where it makes sense, while understanding that many existing projects have extensive codebases (some written years ago) in Objective-C (OBJC) that are still actively supported. It's tempting to write new code in Swift, but we can't migrate the existing OBJC codebase quickly. And sometimes it just isn't worth the effort.

Continue reading

How 17 Lines of Code Improved Shopify.com Loading by 50%

3 minute read

Big improvements don't have to be hard or take a long time to implement. It took, for example, only 17 lines of code to decrease the time to display text on Shopify.com by 50%. That saved visitors 1.2 seconds: each second matters, given that 40% of users expect a website to load within two seconds, and those same users will abandon a site if it takes longer than three.

Continue reading

Bootsnap: Optimizing Ruby App Boot Time

8 minute read

Hundreds of Shopify developers work on our largest codebase, the monolithic Rails application that powers most of our product offering. There are various benefits to having a “majestic monolith,” but also a few downsides. Chief among them is the amount of time people spend waiting for Rails to boot.

During development, two of the most common tasks are running a development server and running a unit test file. By improving the performance of these tasks, we also improve the experience for developers working on this codebase and achieve higher iteration speed. We started measuring and profiling the following code paths:

  • Development server: time to first request
  • Unit testing: time to first unit test

Continue reading

Building a Dynamic Mobile CI System

18 minute read

The mobile space has changed quickly, even within the past few years. At Shopify, home of the world’s largest Rails application, we have seen the growth and potential of the mobile market and set a goal of becoming a mobile-first company. Today, over 130,000 merchants are using Shopify Mobile to set up and run their stores from their smartphones. Through the inherent simplicity and flexibility of the mobile platform, many mobile-focused products have found success.

 

This post was co-written with Arham Ahmed, and shout-outs to Sean Corcoran of MacStadium and Tim Lucas of Buildkite.

Continue reading

The Side Hustle: Building a Quadcopter Controller for iOS

Our engineering blog is home to our stories sharing technical knowledge and lessons learned. But that's only part of the story: we hire passionate people who love what they do and are invested in mastering their craft. Today we launch "The Side Hustle," an occasional series highlighting some side projects from our devs while off the Shopify clock.

When Gabriel O'Flaherty-Chan noticed quadcopter controllers on mobile mostly translated analog controls to digital, he took it upon himself to find a better design.

7 minute read

For under $50, you can get ahold of a loud little flying piece of plastic from Amazon, and they’re a lot of fun. Some of them even come with cameras and Wi-Fi for control via a mobile app.

Unfortunately, these apps are pretty low quality — they’re unreliable and frustrating to use, and look out of place in 2017. The more I used these apps, the more frustrated I got, so I started thinking about ways I could provide a better solution, and two months later I emerged with two things:

1. An iOS app for flying quadcopters called SCARAB, and

2. An open-source project for building RC apps called QuadKit

Continue reading

Sharing the Philosophy Behind Shopify's Bug Bounty

2 minute read

Bug bounties have become commonplace as companies realize the advantages to distributing the hunt for flaws and vulnerabilities among talented people around the world. We're no different, launching a security response program in 2012 before evolving it into a bug bounty with HackerOne in 2015. Since then, we've seen meaningful results including nearly 400 fixes from 250 researchers, to the tune of bounties totalling over half a million dollars.

Security is vital for us. With the number of shops and volume of info on our platform, it's about maintaining trust with our merchants. Entrepreneurs are running their businesses and they don't want to worry about security, so anything we can do to protect them is how we measure our success. As Tobi recently mentioned on Hacker News, “We host the livelihoods of hundreds of thousands of other businesses. If we are down or compromised all of them can't make money.” So, we have to ensure any issue gets addressed.

Continue reading

Surviving Flashes of High-Write Traffic Using Scriptable Load Balancers (Part II)

7 minute read

In the first post of this series, I outlined Shopify’s history with flash sales, our move to Nginx and Lua to help manage traffic, and our initial attempt at throttling traffic, which didn’t sufficiently account for the customer experience. We had underestimated the impact of not giving preference to customers who’d entered the queue at the beginning of the sale, and now we needed to find another way to protect the platform without ruining the customer experience.

 

Continue reading

Surviving Flashes of High-Write Traffic Using Scriptable Load Balancers (Part I)

7 minute read

This Sunday, over 100 million viewers will watch the Super Bowl. Whether they’re catching the match-up between the Falcons and the Patriots, or there for the commercials between the action, that’s a lot of eyeballs—and that’s only counting America. But all that attention doesn’t just stay on the screen, it gets directed to the web, and if you’re not prepared curious visitors could be rewarded with a sad error page.

The Super Bowl makes us misty-eyed because our first big flash sale happened in 2007, after the Colts beat the Bears. Fans rushed online for T-shirts celebrating the win, giving us a taste of what can happen when a flood of people converges on one site in a very short period of time. Since then, we’ve been continually levelling up our ability to handle flash sales, and our merchants have put us to the test: on any given day, they’ll hurl Super Bowl-sized traffic at us, often without notice.

 

Continue reading
