Development

The Case Against Monkey Patching, From a Rails Core Team Member

Monkey patching is considered one of the more powerful features of the Ruby programming language. However, by the end of this post I’m hoping to convince you that monkey patches should be used sparingly, if at all, because they are brittle, dangerous, and often unnecessary. I’ll also share tips on how to use them as safely as possible in the rare cases where you do need to monkey patch.


Making Your React Native Gestures Feel Natural

When working with draggable elements in React Native mobile apps, I’ve learned that there are some simple ways to help gestures and animations feel better and more natural.

Let’s look at the Shop app’s Sheet component as an example:

A gif showing a sample Shop app store that shows the Sheet Component being dragged and closed on the screen
The Sheet component being dragged open and closed by the user’s gestures

This component can be dragged by the user. Once the drag completes, it either animates back to the open position or down to the bottom of the screen to close.

To implement this, we can start by using a gesture handler which sets yPosition to move the sheet with the user’s finger:
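The original snippet isn’t reproduced here, so the sketch below shows one way this could look with react-native-gesture-handler and Reanimated 2. The name yPosition and the 600 ms duration come from the post; OPEN_Y, CLOSED_Y, and the snap threshold are assumptions.

import React from 'react';
import {
  PanGestureHandler,
  PanGestureHandlerGestureEvent,
} from 'react-native-gesture-handler';
import Animated, {
  useAnimatedGestureHandler,
  useAnimatedStyle,
  useSharedValue,
  withTiming,
} from 'react-native-reanimated';

const OPEN_Y = 0; // assumed y-offset of the open position
const CLOSED_Y = 600; // assumed y-offset of the closed (off-screen) position

export function Sheet({children}: {children: React.ReactNode}) {
  const yPosition = useSharedValue(OPEN_Y);

  const gestureHandler = useAnimatedGestureHandler<
    PanGestureHandlerGestureEvent,
    {startY: number}
  >({
    onStart: (_event, context) => {
      // Remember where the sheet was when the drag began.
      context.startY = yPosition.value;
    },
    onActive: (event, context) => {
      // Move the sheet with the user's finger.
      yPosition.value = context.startY + event.translationY;
    },
    onEnd: () => {
      // Naive version: snap open or closed based on position alone,
      // always over a fixed 600 ms duration.
      const target = yPosition.value > CLOSED_Y / 2 ? CLOSED_Y : OPEN_Y;
      yPosition.value = withTiming(target, {duration: 600});
    },
  });

  const animatedStyle = useAnimatedStyle(() => ({
    transform: [{translateY: yPosition.value}],
  }));

  return (
    <PanGestureHandler onGestureEvent={gestureHandler}>
      <Animated.View style={animatedStyle}>{children}</Animated.View>
    </PanGestureHandler>
  );
}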

When the drag ends and the user lifts their finger, we animate to either the closed or open position based on the finger's position, as implemented in onEnd above. This works but there are some issues.

Problem 1: Speed of Drag

If we drag down quickly from the top, shouldn’t it close? We only take the position into account when determining whether it opens or closes. Shouldn’t we also take into account the speed of the drag when it ends?

A gif showing a sample Shop app store that shows the Sheet Component screen being flicked by a user but not closing
The user tries to drag the Sheet closed by quickly flicking it down, but it does not close

In this example above, the user may feel frustrated that they are flicking the sheet down hard, yet it won’t close.

Problem 2: Position Animation

No matter what the distance is from the end position, the animation after the drag ends always takes 600 ms. If it’s closer, shouldn’t it take less time to get there? If you drag it with more force before letting go, shouldn’t that momentum make it go to the destination faster?

A gif showing a sample Shop app store that shows the Sheet Component being dragged to the open position on the screen
The Sheet takes the same amount of time to move to the open position regardless of the distance it has to move

Springs and Velocity

To address problem number one, we use event.velocityY from onEnd and add it to the position to determine whether to close or open. We also apply a multiplier to adjust how much we want the velocity to count towards where the sheet ends up.

For problem number two, we use a spring animation rather than a fixed-duration one! Spring animations don’t necessarily need to have an elastic bounce back. withSpring takes distance and velocity into account to animate in a physically realistic way.
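A sketch of how those two changes might look in the handler from earlier (the multiplier value is an assumption; withSpring also accepts other configuration such as damping):

// Replaces onEnd in the handler sketched earlier.
onEnd: (event) => {
  // Let the release velocity count towards the open/close decision.
  const VELOCITY_MULTIPLIER = 0.2; // assumed value
  const projectedY = yPosition.value + event.velocityY * VELOCITY_MULTIPLIER;
  const target = projectedY > CLOSED_Y / 2 ? CLOSED_Y : OPEN_Y;
  // A spring animation uses the remaining distance and the gesture velocity
  // instead of a fixed 600 ms duration.
  yPosition.value = withSpring(target, {velocity: event.velocityY});
},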

A gif showing a sample Shop app store that shows the Sheet Component being dragged and closed on the screen
The Sheet can now be quickly flicked open or closed easily. It animates to the open or closed position in a way that takes distance and drag velocity into account.

In the example above, it’s now easy to flick it quickly closed or open, and the animations to the open or closed position behave in a more realistic and natural way by taking distance and drag velocity into account.

Elasticity and Resistance

The next time you drag down a photo or story to minimize or close it, try doing it slowly and watch what’s happening. Is the element that’s being dragged matching your finger position exactly? Or is it moving slower than your finger?

When the dragged element moves slower than your finger, it can create a feeling of elasticity, as if you’re pulling against a rubber band that resists the drag.

In the Sheet example below, what if the user drags it up instead of down while the sheet is already open?

A gif showing a sample Shop app store that shows the Sheet Component being dragged up the screen
The Sheet stays directly under the user’s finger as it’s dragged further up while open

Notice that the Sheet matches the finger position perfectly as the finger moves up. As a result, it feels very easy to continue dragging it up. However, dragging it up further has no functionality since the Sheet is already open. To teach the user that it can’t be dragged up further, we can add a feeling of resistance to the drag. We can do so by dividing the distance dragged so the element only moves a fraction of the distance of the finger:
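In the handler from earlier, that might look like the following sketch (the divisor of 3 is an assumption; a larger value gives a stronger feeling of resistance):

// Replaces onActive in the handler sketched earlier.
onActive: (event, context) => {
  const rawY = context.startY + event.translationY;
  // Above the open position there is nothing more to reveal, so apply only
  // a fraction of the drag distance to create a feeling of resistance.
  yPosition.value = rawY < OPEN_Y ? OPEN_Y + (rawY - OPEN_Y) / 3 : rawY;
},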

Instead of moving directly under the user’s finger, the sheet is dragged up by a fraction of the distance the finger has moved, giving a sense of resistance to the drag gesture.

The user will now feel that the Sheet is resisting being dragged up further, intuitively teaching them more about how the UI works.

Make Gestures Better for Everyone

This is the final gesture handler with all the above techniques included:
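The original handler isn’t reproduced here; this sketch combines the pieces above under the same assumptions:

const VELOCITY_MULTIPLIER = 0.2; // assumed value

const gestureHandler = useAnimatedGestureHandler<
  PanGestureHandlerGestureEvent,
  {startY: number}
>({
  onStart: (_event, context) => {
    context.startY = yPosition.value;
  },
  onActive: (event, context) => {
    const rawY = context.startY + event.translationY;
    // Resistance: above the open position, move only a fraction of the drag.
    yPosition.value = rawY < OPEN_Y ? OPEN_Y + (rawY - OPEN_Y) / 3 : rawY;
  },
  onEnd: (event) => {
    // Let the release velocity count towards the open/close decision...
    const projectedY = yPosition.value + event.velocityY * VELOCITY_MULTIPLIER;
    const target = projectedY > CLOSED_Y / 2 ? CLOSED_Y : OPEN_Y;
    // ...and spring to the target based on distance and velocity.
    yPosition.value = withSpring(target, {velocity: event.velocityY});
  },
});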

As user interface developers, we have an amazing opportunity to delight people and make their experiences better.

If we care about and nail these details, they’ll combine together to form a holistic user experience that feels good to touch and interact with.

I hope that you have as much fun working on gestures as I do!

The videos above were taken in the simulator in order to show the simulated touches. To test the gestures yourself, however, I recommend trying the examples on a real device.

Andrew Lo is a Staff Front End Developer on Shop’s Design Systems team. He works remotely from Toronto, Canada.


We all get shit done, ship fast, and learn. We operate on low process and high trust, and trade on impact. You have to care deeply about what you’re doing, and commit to continuously developing your craft, to keep pace here. If you’re seeking hypergrowth, can solve complex problems, and can thrive on change (and a bit of chaos), you’ve found the right place. Visit our Engineering career page to find your role.


Caching Without Marshal Part 2: The Path to MessagePack

In part one of Caching Without Marshal, we dove into the internals of Marshal, Ruby’s built-in binary serialization format. Marshal is the black box that Rails uses under the hood to transform almost any object into binary data and back. Caching, in particular, depends heavily on Marshal: Rails uses it to cache pretty much everything, be it actions, pages, partials, or anything else.

Marshal’s magic is convenient, but it comes with risks. Part one presented a deep dive into some of the little-documented internals of Marshal with the goal of ultimately replacing it with a more robust cache format. In particular, we wanted a cache format that would not blow up when we shipped code changes.

Part two is all about MessagePack, the format that did this for us. It’s a binary serialization format, and in this sense it’s similar to Marshal. Its key difference is that whereas Marshal is a Ruby-specific format, MessagePack is generic by default. There are MessagePack libraries for Java, Python, and many other languages.

You may not know MessagePack, but if you’re using Rails chances are you’ve got it in your Gemfile because it’s a dependency of Bootsnap.

The MessagePack Format

On the surface, MessagePack is similar to Marshal: just replace .dump with .pack and .load with .unpack. For many payloads, the two are interchangeable.

Here’s an example of using MessagePack to encode and decode a hash:
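A minimal sketch using the msgpack gem (the hash contents are arbitrary):

require "msgpack"

data = { "title" => "Caching Without Marshal", "part" => 2 }

payload = MessagePack.pack(data) # => compact binary string
MessagePack.unpack(payload)      # => {"title"=>"Caching Without Marshal", "part"=>2}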

MessagePack supports a set of core types that are similar to those of Marshal: nil, integers, booleans, floats, and a type called raw, covering strings and binary data. It also has composite types for array and map (that is, a hash).

Notice, however, that the Ruby-specific types that Marshal supports, like Object and instance variable, aren’t in that list. This isn’t surprising since MessagePack is a generic format and not a Ruby format. But for us, this is a big advantage since it’s exactly the encoding of Ruby-specific types that caused our original problems (recall the beta flag class names in cache payloads from Part One).

Let’s take a closer look at the encoded data of Marshal and MessagePack. Suppose we encode the string "foo" with Marshal; this is what we get:

Visual representation of the encoded data: Marshal.dump("foo") = 0408 4922 0866 6f6f 063a 0645 54
Encoded data from Marshal for Marshal.dump("foo")

Let’s look at the payload: 0408 4922 0866 6f6f 063a 0645 54. We see that the payload "foo" is encoded in hex as 666f6f and prefixed by 08 representing a length of 3 (f-o-o). Marshal wraps this string payload in a TYPE_IVAR, which as mentioned in part 1 is used to attach instance variables to types that aren’t strictly implemented as objects, like strings. In this case, the instance variable (3a 0645) is named :E. This is a special instance variable used by Ruby to represent the string’s encoding, which is T (54) for true, that is, this is a UTF-8 encoded string. So Marshal uses a Ruby-native idea to encode the string’s encoding.

In MessagePack, the payload (a366 6f6f) is much shorter:

Visual representation of the encoded data: MessagePack.pack("foo") = a366 6f6f
Encoded data from MessagePack for MessagePack.pack("foo")

The first thing you’ll notice is that there isn’t an encoding. MessagePack’s default encoding is UTF-8, so there’s no need to include it in the payload. Also note that the payload type (10100011), String, is encoded together with its length: the bits 101 encode a string of at most 31 bytes, and 00011 says the actual length is 3 bytes. Altogether this makes for a very compact encoding of a string.

Extension Types

After deciding to give MessagePack a try, we did a search for Rails.cache.write and Rails.cache.read in the codebase of our core monolith, to figure out roughly what was going into the cache. We found a bunch of stuff that wasn’t among the types MessagePack supported out of the box.

Luckily for us, MessagePack has a killer feature that came in handy: extension types. Extension types are custom types that you can define by calling register_type on an instance of MessagePack::Factory, like this:
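A minimal sketch of the API, using the Symbol support that ships with msgpack-ruby (the type code 0x00 is an arbitrary choice here):

require "msgpack"

factory = MessagePack::Factory.new

# An extension type is a code, a class, and a packer/unpacker pair.
factory.register_type(
  0x00,
  Symbol,
  packer: :to_msgpack_ext,    # instance method used to serialize
  unpacker: :from_msgpack_ext # class method used to deserialize
)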

An extension type is made up of the type code (a number from 0 to 127—there’s a maximum of 128 extension types), the class of the type, and a serializer and deserializer, referred to as packer and unpacker. Note that the type is also applied to subclasses of the type’s class. Now, this is usually what you want, but it’s something to be aware of and can come back to bite you if you’re not careful.

Here’s the Date extension type, the simplest of the extension types we use in the core monolith in production:
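The original snippet isn’t reproduced here, but based on the description that follows, it would look roughly like this (using the factory from above):

require "date"

factory.register_type(
  3,
  Date,
  packer: ->(date) { [date.year, date.month, date.day].pack("s< C C") },
  unpacker: ->(data) { Date.new(*data.unpack("s< C C")) }
)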

As you can see, the code for this type is 3, and its class is Date. Its packer takes a date and extracts the date’s year, month, and day. It then packs them into a binary string using the Array#pack method with the format string "s< C C": the year as a 16-bit signed integer, and the month and day as 8-bit unsigned integers. The type’s unpacker goes the other way: it takes a string and, using the same format string, extracts the year, month, and day using String#unpack, then passes them to Date.new to create a new date object.

Here’s how we would encode an actual date with this factory:
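Something like this, using the factory from above (the specific date is chosen to match the bytes discussed next):

encoded = factory.packer.write(Date.new(2022, 9, 9)).to_s
encoded.unpack1("H*") # => "d603e6070909"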

Converting the result to hex, we get d603 e607 0909, which corresponds to the date (e607 0909) prefixed by the extension type (d603):

Visual breakdown of hex results d603 e607 0909
Encoded date from the factory

As you can see, the encoded date is compact. Extension types give us the flexibility to encode pretty much anything we might want to put into the cache in a format that suits our needs.

Just Say No

If this were the end of the story, though, we wouldn’t really have had enough to go with MessagePack in our cache. Remember our original problem: we had a payload containing objects whose classes changed, breaking on deploy when they were loaded into old code that didn’t have those classes defined. To prevent that problem from happening again, we need to stop those classes from going into the cache in the first place.

We need MessagePack, in other words, to refuse to encode any object without a defined type, and to let us catch these cases so we can follow up. Luckily for us, MessagePack does this. It’s not the kind of “killer feature” that’s advertised as such, but it’s enough for our needs.

Take this example, where factory is the factory we created previously:
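For example, a payload with a plain Ruby object buried inside it (a sketch):

payload = { "some" => ["nested", "data", Object.new] }
factory.packer.write(payload)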

If MessagePack were to happily encode this—without any Object type defined—we’d have a problem. But as mentioned earlier, MessagePack doesn’t know Ruby objects by default and has no way to encode them unless you give it one.

So what actually happens when you try this? You get an error like this:

NoMethodError: undefined method `to_msgpack' for #<Object:0x...>

Notice that MessagePack traversed the entire object, through the hash, into the array, until it hit the Object instance. At that point, it found something for which it had no type defined and basically blew up.

The way it blew up is perhaps not ideal, but it’s enough. We can rescue this exception, check the message, figure out it came from MessagePack, and respond appropriately. Critically, the exception contains a reference to the object that failed to encode. That’s information we can log and use to later decide if we need a new extension type, or if we are perhaps putting things into the cache that we shouldn’t be.

The Migration

Now that we’ve looked at Marshal and MessagePack, we’re ready to explain how we actually made the switch from one to the other.

Making the Switch

Our migration wasn’t instantaneous. We ran with the two side-by-side for a period of about six months while we figured out what was going into the cache and which extension types we needed. The path of the migration, however, was actually quite simple. Here’s the basic step-by-step process:

  1. First, we created a MessagePack factory with our extension types defined on it and used it to encode the mystery object passed to the cache (the puzzle piece in the diagram below).
  2. If MessagePack was able to encode it, great! We prefixed the payload with a version byte that we used to track which extension types were defined for it, and then we put the pair into the cache.
  3. If, on the other hand, the object failed to encode, we rescued the NoMethodError which, as mentioned earlier, MessagePack raises in this situation. We then fell back to Marshal and put the Marshal-encoded payload into the cache. Note that when decoding, we were able to tell which payloads were Marshal-encoded by their prefix: if it’s 0408 it’s a Marshal-encoded payload, otherwise it’s MessagePack.
Path of the migration
The three-step migration process

The step where we rescued the NoMethodError was quite important in this process since it was where we were able to log data on what was actually going into the cache. Here’s that rescue code (which of course no longer exists now since we’re fully migrated to MessagePack):
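The real code was Shopify-specific, but a sketch of its shape might look like this (the method, logger, and metric names are assumptions):

def serialize(entry)
  factory.packer.write(entry).to_s
rescue NoMethodError => error
  # Only handle MessagePack's "undefined method `to_msgpack'" failures.
  raise unless error.message.include?("to_msgpack")

  # Record the class that couldn't be encoded so we can decide whether it
  # needs an extension type, or shouldn't be going into the cache at all.
  Rails.logger.warn(
    message: "Cache entry could not be serialized with MessagePack",
    class_name: error.receiver.class.name
  )
  StatsD.increment("cache.messagepack_fallback")

  # Fall back to Marshal; its payloads are recognizable later by their 0408 prefix.
  Marshal.dump(entry)
end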

As you can see, we sent data (including the class of the object that failed to encode) to both logs and StatsD. These logs were crucial in flagging the need for new extension types, and also in signaling to us when there were things going into the cache that shouldn’t ever have been there in the first place.

We started the migration process with a small set of default extension types which Jean Boussier, who worked with me on the cache project, had registered in our core monolith earlier for other work using MessagePack. There were five:

  • Symbol (offered out of the box in the messagepack-ruby gem. It just has to be enabled)
  • Time
  • DateTime
  • Date (shown earlier)
  • BigDecimal

These were enough to get us started, but they were certainly not enough to cover all the variety of things that were going into the cache. In particular, being a Rails application, the core monolith serializes a lot of records, and we needed a way to serialize those records. We needed an extension type for ActiveRecord::Base.

Encoding Records

Records are defined by their attributes (roughly, the values in their table columns), so it might seem like you could just cache them by caching their attributes. And you can.

But there’s a problem: records have associations. Marshal encodes the full set of associations along with the cached record. This ensures that when the record is deserialized, the loaded associations (those that have already been fetched from the database) will be ready to go without any extra queries. An extension type that only caches attribute values, on the other hand, needs to make a new query to refetch those associations after coming out of the cache, making it much more inefficient.

So we needed to cache loaded associations along with the record’s attributes. We did this with a serializer called ActiveRecordCoder. Here’s how it works. Consider a simple post model that has many comments, where each comment belongs to a post with an inverse defined:
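The models would look something like this:

class Post < ApplicationRecord
  has_many :comments, inverse_of: :post
end

class Comment < ApplicationRecord
  belongs_to :post
end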

Note that the Comment model here has an inverse association back to its parent post via its post association, so a post and its loaded comments form a cycle. Recall that Marshal handles this kind of circularity automatically using the link type (@ symbol) we saw in part 1, but that MessagePack doesn’t handle circularity by default. We’ll have to implement something like a link type to make this encoder work.

Instance Tracker handles circularity
Instance Tracker handles circularity

The trick we use for handling circularity involves something called an Instance Tracker. It tracks records encountered while traversing the record’s network of associations. The encoding algorithm builds a tree where each association is represented by its name (for example :comments or :post), and each record is represented by its unique index in the tracker. If we encounter an untracked record, we recursively traverse its network of associations, and if we’ve seen the record before, we simply encode it using its index.

This algorithm generates a very compact representation of a record’s associations. Combined with the records in the tracker, each encoded by its set of attributes, it provides a very concise representation of a record and its loaded associations.

Here’s what this representation looks like for the post with two comments shown earlier:

Once ActiveRecordCoder has generated this array of arrays, we can simply pass the result to MessagePack to encode it to a bytestring payload. For the post with two comments, this generates a payload of around 300 bytes. Considering that the Marshal payload for the post with no associations we looked at in Part 1 was 1,600 bytes in length, that’s not bad.

But what happens if we try to encode this post with its two comments using Marshal? The result is shown below: a payload over 4,000 bytes long. So the combination of ActiveRecordCoder with MessagePack is 13 times more space efficient than Marshal for this payload. That’s a pretty massive improvement.

Visual representation of the difference between an ActiveRecordCoder + MessagePack payload vs a Marshal payload
ActiveRecordCoder + MessagePack vs Marshal

In fact, the space efficiency of the switch to MessagePack was so significant that we immediately saw the change in our data analytics. As you can see in the graph below, our Rails cache memcached fill percent dropped after the switch. Keep in mind that for many payloads, for example boolean- and integer-valued payloads, the change to MessagePack only made a small difference in terms of space efficiency. Nonetheless, the change for more complex objects like records was so significant that total cache usage dropped by over 25 percent.

Line graph showing Rails cache memcached fill percent versus time. The graph shows a decrease after the change to MessagePack
Rails cache memcached fill percent versus time

Handling Change

You might have noticed that ActiveRecordCoder, our encoder for ActiveRecord::Base objects, includes the name of record classes and association names in encoded payloads. Although our coder doesn’t encode all instance variables in the payload, the fact that it hardcodes class names at all should be a red flag. Isn’t this exactly what got us into the mess caching objects with Marshal in the first place?

And indeed, it is—but there are two key differences here.

First, since we control the encoding process, we can decide how and where to raise exceptions when class or association names change. So when decoding, if we find that a class or association name isn’t defined, we rescue the error and re-raise a more specific error. This is very different from what happens with Marshal.

Second, since this is a cache, and not, say, a persistent datastore like a database, we can afford to occasionally drop a cached payload if we know that it’s become stale. So this is precisely what we do. When we see one of the exceptions for missing class or association names, we rescue the exception and simply treat the cache fetch as a miss. Here’s what that code looks like:
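Roughly like this (a sketch; the method and error class names are assumptions, not the exact production code):

def read_entry(key, **options)
  serialized = @store.read(key, **options)
  return nil if serialized.nil?

  deserialize(serialized)
rescue ClassMissingError, AssociationMissingError
  # The payload references a class or association that no longer exists in
  # this revision of the code, so treat it as a cache miss and let the
  # caller regenerate the entry.
  nil
end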

The result of this strategy is effectively that during a deploy where class or association names change, cache payloads containing those names are invalidated, and the cache needs to replace them. This can effectively disable the cache for those keys during the period of the deploy, but once the new code has been fully released the cache again works as normal. This is a reasonable tradeoff, and a much more graceful way to handle code changes than what happens with Marshal.

Core Type Subclasses

With our migration plan and our encoder for ActiveRecord::Base, we were ready to embark on the first step of the migration to MessagePack. As we were preparing to ship the change, however, we noticed something was wrong on continuous integration (CI): some tests were failing on hash-valued cache payloads.

A closer inspection revealed a problem with HashWithIndifferentAccess, a subclass of Hash provided by ActiveSupport that makes symbols and strings work interchangeably as hash keys. Marshal handles subclasses of core types like this out of the box, so you can be sure that a HashWithIndifferentAccess that goes into a Marshal-backed cache will come back out as a HashWithIndifferentAccess and not a plain old Hash. The same cannot be said for MessagePack, unfortunately, as you can confirm yourself:
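For example, in a Rails console (msgpack is available there whenever Bootsnap is in the Gemfile):

require "msgpack"

hash = { name: "Tobi" }.with_indifferent_access
hash.class
# => ActiveSupport::HashWithIndifferentAccess

MessagePack.unpack(MessagePack.pack(hash)).class
# => Hash (the subclass is silently lost)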

MessagePack doesn’t blow up here on the missing type because HashWithIndifferentAccess is a subclass of another type that it does support, namely Hash. This is a case where MessagePack’s default handling of subclasses can and will bite you; it would be better for us if this did blow up, so we could fall back to Marshal. We were lucky that our tests caught the issue before this ever went out to production.

The problem was a tricky one to solve, though. You would think that defining an extension type for HashWithIndifferentAccess would resolve the issue, but it didn’t. In fact, MessagePack completely ignored the type and continued to serialize these payloads as hashes.

As it turns out, the issue was with msgpack-ruby itself. The code handling extension types didn’t trigger on subclasses of core types like Hash, so any extensions of those types had no effect. I made a pull request (PR) to fix the issue, and as of version 1.4.3, msgpack-ruby now supports extension types for Hash as well as Array, String, and Regex.

The Long Tail of Types

With the fix for HashWithIndifferentAccess, we were ready to ship the first step in our migration to MessagePack in the cache. When we did this, we were pleased to see that MessagePack was successfully serializing 95 percent of payloads right off the bat without any issues. This was validation that our migration strategy and extension types were working.

Of course, it’s the last 5 percent that’s always the hardest, and indeed we faced a long tail of failing cache writes to resolve. We added types for commonly cached classes like ActiveSupport::TimeWithZone and Set, and edged closer to 100 percent, but we couldn’t quite get all the way there. There were just too many different things still being cached with Marshal.

At this point, we had to adjust our strategy. It wasn’t feasible to just let any developer define new extension types for whatever they needed to cache. Shopify has thousands of developers, and we would quickly hit MessagePack’s limit of 128 extension types.

Instead, we adopted a different strategy that helped us scale indefinitely to any number of types. We defined a catchall type for Object, the parent class for the vast majority of objects in Ruby. The Object extension type looks for two methods on any object: an instance method named as_pack and a class method named from_pack. If both are present, it considers the object packable, and uses as_pack as its serializer and from_pack as its deserializer. Here’s an example of a Task class that our encoder treats as packable:
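The production classes aren’t shown here; a sketch of the shape the encoder expects might look like this (the attribute names are invented):

class Task
  attr_reader :id, :title

  def initialize(id:, title:)
    @id = id
    @title = title
  end

  # Called by the Object extension type's packer.
  def as_pack
    [id, title]
  end

  # Called by the Object extension type's unpacker.
  def self.from_pack(values)
    id, title = values
    new(id: id, title: title)
  end
end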

Note that, as with the ActiveRecord::Base extension type, this approach relies on encoding class names. As mentioned earlier, we can do this safely since we handle class name changes gracefully as cache misses. This wouldn’t be a viable approach for a persistent store.

The packable extension type worked great, but as we worked on migrating existing cache objects, we found many that followed a similar pattern, caching either Structs or T::Structs (Sorbet’s typed struct). Structs are simple objects defined by a set of attributes, so the packable methods were each very similar since they simply worked from a list of the object’s attributes. To make things easier, we extracted this logic into a module that, when included in a struct class, automatically makes the struct packable. Here’s the module for Struct:
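The real module is more involved; this sketch shows the idea (the digest scheme and the error class are assumptions):

require "zlib"

class StaleStructPayload < StandardError; end

module PackableStruct
  def self.included(struct_class)
    struct_class.extend(ClassMethods)
  end

  # Serialize a digest of the attribute names, followed by the attribute values.
  def as_pack
    [self.class.members_digest, *to_a]
  end

  module ClassMethods
    def from_pack(values)
      digest, *attributes = values
      # Signal stale data if the struct's attribute names have changed.
      raise StaleStructPayload unless digest == members_digest

      new(*attributes)
    end

    # A small digest of the struct's member names (the scheme is an assumption).
    def members_digest
      Zlib.crc32(members.join(",")) % 2**16
    end
  end
end

A struct class then opts in with something like TaskStruct = Struct.new(:id, :title) { include PackableStruct }.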

The serialized data for the struct instance includes an extra digest value (26450) that captures the names of the struct’s attributes. We use this digest to signal to the Object extension type deserialization code that attribute names have changed (for example in a code refactor). If the digest changes, the cache treats cached data as stale and regenerates it:
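On the read side, that can be handled much like the missing class and association names above (again a sketch):

def unpack_struct(struct_class, values)
  struct_class.from_pack(values)
rescue StaleStructPayload
  # The attribute names changed since this entry was written; treat it as a
  # cache miss so the entry is regenerated with the new attributes.
  nil
end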

Simply by including this module (or a similar one for T::Struct classes), developers can cache struct data in a way that’s robust to future changes. As with our handling of class name changes, this approach works because we can afford to throw away cache data that has become stale.

The struct modules accelerated the pace of our work, enabling us to quickly migrate the last objects in the long tail of cached types. Having confirmed from our logs that we were no longer serializing any payloads with Marshal, we took the final step of removing it entirely from the cache. We’re now caching exclusively with MessagePack.

Safe by Default

With MessagePack as our serialization format, the cache in our core monolith became safe by default. Not safe most of the time or safe under some special conditions, but safe, period. It’s hard to overstate the importance of a change like this to the stability and scalability of a platform as large and complex as Shopify’s.

For developers, having a safe cache brings a peace of mind that one less unexpected thing will happen when they ship their refactors. This makes such refactors—particularly large, challenging ones—more likely to happen, improving the overall quality and long-term maintainability of our codebase.

If this sounds like something that you’d like to try yourself, you’re in luck! Most of the work we put into this project has been extracted into a gem called Shopify/paquito. A migration process like this will never be easy, but Paquito incorporates the learnings of our own experience. We hope it will help you on your journey to a safer cache.

Chris Salzberg is a Staff Developer on the Ruby and Rails Infra team at Shopify. He is based in Hakodate in the north of Japan.


Wherever you are, your next journey starts here! If building systems from the ground up to solve real-world problems interests you, our Engineering blog has stories about other challenges we have encountered. Intrigued? Visit our Engineering career page to find out about our open positions and learn about Digital by Design.


Caching Without Marshal Part 1: Marshal from the Inside Out

Caching is critical to how Rails applications work. At every layer, whether it be in page rendering, database querying, or external data retrieval, the cache is what ensures that no single bottleneck brings down an entire application. 

But caching has a dirty secret, and that secret’s name is Marshal.

Marshal is Ruby’s ultimate sharp knife, able to transform almost any object into a binary blob and back. This makes it a natural match for the diverse needs of a cache, particularly the cache of a complex web framework like Rails. From actions, to pages, to partials, to queries—you name it, if Rails is touching it, Marshal is probably caching it. 

Marshal’s magic, however, comes with risks.

A couple of years ago, these risks became very real for us. It started innocently enough. A developer at Shopify, in an attempt to clean up some code in our core monolith, shipped a PR refactoring some key classes around beta flags. The refactor got the thumbs up in review and passed all tests and other checks.

As it went out to production, though, it became clear something was very wrong. A flood of exceptions triggered an incident, and the refactor was quickly rolled back and reverted. We were lucky to escape so easily.

The incident was a wake-up call for us. Nothing in our set of continuous integration (CI) checks had flagged the change. Indeed, even in retrospect, there was nothing wrong with the code change at all. The issue wasn’t the code, but the fact that the code had changed.

The problem, of course, was Marshal. Being so widely used, beta flags were being cached. Marshal serializes an object’s class along with its other data, so many of the classes that were part of the refactor were also hardcoded in entries of the cache. When the newly deployed code began inserting beta flag instances with the new classes into the cache, the old code—which was still running as the deploy was proceeding—began choking on class names and methods that it had never seen before.

As a member of Shopify’s Ruby and Rails Infrastructure team, I was involved in the follow-up for this incident. The incident was troubling to us because there were really only two ways to mitigate the risk of the same incident happening again, and neither was acceptable. The first is simply to put fewer things into the cache, or a smaller variety of things; this decreases the likelihood of cached objects conflicting with future code changes. But this defeats the purpose of having a cache in the first place.

The other way to mitigate the risk is to change code less, because it’s code changes that ultimately trigger cache collisions. But this was even less acceptable: our team is all about making code cleaner, and that requires changes. Asking developers to stop refactoring their code goes against everything we were trying to do at Shopify.

So we decided to take a deeper look and fix the root problem: Marshal. We reasoned that if we could use a different serialization format—one that wouldn’t cache any arbitrary object the way Marshal does, one that we could control and extend—then maybe we could make the cache safe by default.

The format that did this for us is MessagePack. MessagePack is a binary serialization format that’s much more compact than Marshal, with stricter typing and less magic. In this two-part series (based on a RailsConf talk by the same name), I’ll pry Marshal open to show how it works, delve into how we replaced it, and describe the specific challenges posed by Shopify’s scale.

But to start, let’s talk about caching and how Marshal fits into that.

You Can’t Always Cache What You Want

Caching in Rails is easy. Out of the box, Rails provides caching features that cover the common requirements of a typical web application. The Rails Guides provide details on how these features work, and how to use them to speed up your Rails application. So far, so good.

What you won’t find in the guides is information on what you can and can’t put into the cache. The low-level caching section of the caching guide simply states: “Rails’ caching mechanism works great for storing any kind of information.” (original emphasis) If that sounds too good to be true, that’s because it is.

Under the hood, all types of cache in Rails are backed by a common interface of two methods, read and write, on the cache instance returned by Rails.cache. While there are a variety of cache backends—in our core monolith we use Memcached, but you can also cache to file, memory, or Redis, for example—they all serialize and deserialize data the same way, by calling Marshal.load and Marshal.dump on the cached object.

A diagram showing the difference in cache encoding format between Rails 6 and Rails 7
Cache encoding format in Rails 6 and Rails 7

If you actually take a peek at what these cache backends put into the cache, you might find that things have changed in Rails 7 for the better. This is thanks to work by Jean Boussier, who’s also in the Ruby and Rails Infrastructure team at Shopify, and who I worked with on the cache project. Jean recently improved cache space allocation by more efficiently serializing a wrapper class named ActiveSupport::Cache::Entry. The result is a more space-efficient cache that stores cached objects and their metadata without any redundant wrapper.

Unfortunately, that work doesn’t help us when it comes to the dangers of Marshal as a serialization format: while the cache is slightly more space efficient, all those issues still exist in Rails 7. To fix the problems with Marshal, we need to replace it.

Let’s Talk About Marshal

But before we can replace Marshal, we need to understand it. And unfortunately, there aren’t a lot of good resources explaining what Marshal actually does.

To figure that out, let’s start with a simple Post record, which we will assume has a title column in the database:
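The model is as simple as it gets:

class Post < ApplicationRecord
end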

We can create an instance of this record and pass it to Marshal.dump:
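For example:

post = Post.create!(title: "Caching Without Marshal")
Marshal.dump(post)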

This is what we get back:

This is a string of around 1,600 bytes, and as you can see, a lot is going on in there. There are constants corresponding to various Rails classes like ActiveRecord, ActiveModel and ActiveSupport. There are also instance variables, which you can identify by the @ symbol before their names. And finally there are many values, including the name of the post, Caching Without Marshal, which appears three times in the payload.

The magic of Marshal, of course, is that if we take this mysterious bytestring and pass it to Marshal.load, we get back exactly the Post record we started with.

You can do this a day from now, a week from now, a year from now, whenever you want—you will get the exact same object back. This is what makes Marshal so powerful.

And this is all possible because Marshal encodes the universe. It recursively crawls objects and their references, extracts all the information it needs, and dumps the result to the payload.

But what is actually going on in that payload? To figure that out, we’ll need to dig deeper and go to the ultimate source of truth in Ruby: the C source code. Marshal’s code lives in a file called marshal.c. At the top of the file, you’ll find a bunch of constants that correspond to the types Marshal uses when encoding data.

Marshal types defined in marshal.c
Marshal types defined in marshal.c

At the top of that list are MARSHAL_MAJOR and MARSHAL_MINOR, the major and minor versions of Marshal, not to be confused with the version of Ruby. This is what comes first in any Marshal payload. The Marshal version hasn’t changed in years and can pretty much be treated as a constant.

Next in the file are several types I will refer to here as “atomic”, meaning types which can’t contain other objects inside themselves. These are the things you probably expect: nil, true, false, numbers, floats, symbols, and also classes and modules.

Next, there are types I will refer to as “composite” that can contain other objects inside them. Most of these are unsurprising: array, hash, struct, and object, for example. But this group also includes two you might not expect: string and regex. We’ll return to this later in this article.

Finally, there are several types toward the end of the list whose meaning is probably not very obvious at all. We will return to these later as well.

Objects

Let’s first start with the most basic type of thing that Marshal serializes: objects. Marshal encodes objects using a type called TYPE_OBJECT, represented by a small character o.

Marshal-encoded bytestring for the example Post
Marshal-encoded bytestring for the example post

Here’s the Marshal-encoded bytestring for the example Post we saw earlier, converted to hex to make it a bit easier to parse.

The first thing we can see in the payload is the Marshal version (0408), followed by an object, represented by an ‘o’ (6f). Then comes the name of the object’s class, represented as a symbol: a colon (3a) followed by the symbol’s length (09) and name as an ASCII string (Post). (Small numbers are stored by Marshal in an optimized format—09 translates to a length of 4.) Then there’s an integer representing the number of instance variables, followed by the instance variables themselves as pairs of names and values.

You can see that a payload like this, with each variable itself containing an object with further instance variables of its own, can get very big, very fast.

Instance Variables

As mentioned earlier, Marshal encodes instance variables in objects as part of its object type. But it also encodes instance variables in other things that, although seemingly object-like (subclassing the Object class), aren’t in fact implemented as such. There are four of these, which I will refer to as core types, in this article: String, Regex, Array, and Hash. Since Ruby implements these types in a special, optimized way, Marshal has to encode them in a special way as well.

Consider what happens if you assign an instance variable to a string, like this:
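For instance:

str = "foo"
str.instance_variable_set(:@bar, "baz")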

This may not be something you do every day, but it’s something you can do. And you may ask: does Marshal handle this correctly?

The answer is: yes it does.

It does this using a special type called TYPE_IVAR to encode instance variables on things that aren’t strictly implemented as objects, represented by a variable name and its value. TYPE_IVAR wraps the original type (String in this case), adding a list of instance variable names and values. It’s also used to encode instance variables in hashes, arrays, and regexes in the same way.

Circularity

Another interesting problem is circularity: what happens when an object contains references to itself. Records, for example, can have associations that have inverses pointing back to the original record. How does Marshal handle this?

Take a minimal example: an array which contains a single element, the array itself:
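In Ruby that takes only two lines:

arr = []
arr << arr
arr.first.equal?(arr) # => true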

What happens if we run this through Marshal? Does it segmentation fault on the self-reference? 

As it turns out, it doesn’t. You can confirm yourself by passing the array through Marshal.dump and Marshal.load:
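Continuing the example above:

copy = Marshal.load(Marshal.dump(arr))
copy.first.equal?(copy) # => true; the self-reference survives the round trip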

Marshal does this thanks to an interesting type called the link type, referred to in marshal.c as TYPE_LINK.

TYPE_LINK example

The way Marshal does this is quite efficient. Let’s look at the payload: 0408 5b06 4000. It starts with an open square bracket (5b) representing the array type, and the length of the array (as noted earlier, small numbers are stored in an optimized format, so 06 translates to a length of 1). The circularity is represented by an @ symbol (40) for the link type, followed by the index of the element in the encoded object the link is pointing to, in this case 00 for the first element (the array itself).

In short, Marshal handles circularity out of the box. That’s important to note because when we deal with this ourselves, we’re going to have to reimplement this process.

Core Type Subclasses

I mentioned earlier that there are a number of core types that Ruby implements in a special way, and that Marshal also needs to handle in a way that’s distinct from other objects. Specifically, these are: String, Regex, Array, and Hash.

One interesting edge case is what happens when you subclass one of these classes, like this:
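For example (the class name here is invented):

class MyHash < Hash
end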

If you create an instance of this class, you’ll see that while it looks like a hash, it is indeed an instance of the subclass:
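Continuing the sketch:

hash = MyHash.new
hash[:key] = "value"
hash       # => {:key=>"value"}
hash.class # => MyHash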

So what happens if you encode this with Marshal? If you do, you’ll find that it actually captures the correct class:
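Round-tripping the sketch above:

Marshal.load(Marshal.dump(hash)).class # => MyHash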

Marshal does this because it has a special type called TYPE_UCLASS. To the usual data for the type (hash data in this case), TYPE_UCLASS adds the name of the class, allowing it to correctly decode the object when loading it back. It uses the same type to encode subclasses of strings, arrays, and regexes (the other core types).

The Magic of Marshal

We’ve looked at how Marshal encodes several different types of objects in Ruby. You might be wondering at this point why all this information is relevant to you.

The answer is because—whether you realize it or not—if you’re running a Rails application, you most likely rely on it. And if you decide, like we did, to take Marshal’s magic out of your application, you’ll find that it’s exactly these things that break. So before doing that, it’s a good idea to figure out how to replace each one of them.

That’s what we did, with a little help from a format called MessagePack. In the next part of this series, we’ll take a look at the steps we took to migrate our cache to MessagePack. This includes re-implementing some of the key Marshal features, such as circularity and core type subclasses, explored in this article, as well as a deep dive into our algorithm for encoding records and their associations.

Chris Salzberg is a Staff Developer on the Ruby and Rails Infra team at Shopify. He is based in Hakodate in the north of Japan.




Finding Relationships Between Ruby’s Top 100 Packages and Their Dependencies

In June of this year, RubyGems, the main repository for Ruby packages (gems), announced that multi-factor authentication (MFA) was going to be gradually rolled out to users. This means that users eventually will need to login with a one-time password from their authenticator device, which will drastically reduce account takeovers.

The team I'm interning on, the Ruby Dependency Security team at Shopify, played a big part in rolling out MFA to RubyGems users. The team’s mission is to increase the security of the Ruby software supply chain, so increasing MFA usage is something we wanted to help implement.

A large Ruby with stick arms and leg pats a little Ruby with stick arms and legs
Illustration by Kevin Lin

One interesting decision the RubyGems team faced was determining who would be included in the first milestone. The team wanted to include at least the top 100 RubyGems packages, but also wanted to prevent packages (and people) from falling out of this cohort in the future.

To meet those criteria, the team set a threshold of 180 million downloads for the gems instead. Once a gem crosses 180 million downloads, its owners are required to use multi-factor authentication in the future.

Bar graph showing gem download numbers for Gem 1 and Gem 2
Gem downloads represented as bars. Gem 2 is over the 180M download threshold, so its owners would need MFA.

This design decision led me to a curiosity. As packages frequently depend on other packages, could some of these big (more than 180M downloads) packages depend on small (less than 180M downloads) packages? If this was the case, then there would be a small loophole: if a hacker wanted to maximize their reach in the Ruby ecosystem, they could target one of these small packages (which would get installed every time someone installed one of the big packages), circumventing the MFA protection of the big packages.

On the surface, it might not make sense that a dependency would ever have fewer downloads than its parent. After all, every time the parent gets downloaded, the dependency does too, so surely the dependency has at least as many downloads as the parent, right?

Screenshot of a Slack conversation between coworkers discussing one's scepticism about finding exceptions
My coworker Jacques, doubting that big gems will rely on small gems. He tells me he finds this hilarious in retrospect.

Well, I thought I should try to find exceptions anyway, and given that this blog post exists, it would seem that I found some. Here’s how I did it.

The Investigation

The first step in determining if big packages depended on small packages was to get a list of big packages. The rubygems.org stats page shows the top 100 gems in terms of downloads, but the last gem on page 10 has 199 million downloads, meaning that scraping these pages would yield an incomplete list, since the threshold I was interested in is 180 million downloads.

A screenshot of a page of Rubygems.org statistics
Page 10 of https://rubygems.org/stats, just a bit above the MFA download threshold

To get a complete list, I instead turned to using the data dumps that rubygems.org makes available. Basically, the site takes a daily snapshot of the rubygems.org database, removes any confidential information, and then publishes it. Their repo has a convenient script that allows you to load these data dumps into your own local rubygems.org database, and therefore run queries on the data using the Rails console. It took me many tries to make a query that got all the big packages, but I eventually found one that worked:

Rubygem.joins(:gem_download).where(gem_download: {count: 180_000_000..}).map(&:name)

I now had a list of 112 big gems, and I had to find their dependencies. The first method I tried was using the rubygems.org API. As described in the documentation, you can give the API the name of a gem and it’ll give you the name of all of its dependencies as part of the response payload. The same endpoint of this API also tells you how many downloads a gem has, so the path was clear: for each big gem, get a list of its dependencies and find out if any of them had fewer downloads than the threshold.

Here are the functions that get the dependencies and downloads:

Ruby function that gets a list of dependencies as reported by the rubygems.org API. Requires built-in uri, net/http, and json packages.
Ruby function that gets downloads from the same rubygems.org API endpoint. Also has a branch to check the download count for specific versions of gems, that I later used.
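The original functions were shown as screenshots; a sketch with the same behavior (the method names are my own) might look like this:

require "json"
require "net/http"
require "uri"

# Fetch a gem's metadata from the rubygems.org API.
def gem_info(gem_name)
  uri = URI("https://rubygems.org/api/v1/gems/#{gem_name}.json")
  JSON.parse(Net::HTTP.get(uri))
end

# Names of a gem's direct runtime dependencies.
def dependencies(gem_name)
  gem_info(gem_name)["dependencies"]["runtime"].map { |dep| dep["name"] }
end

# Total download count for a gem.
def downloads(gem_name)
  gem_info(gem_name)["downloads"]
end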

Putting all of this together, I found that 13 out of the 112 big gems had small gems as dependencies. Exceptions! So why did these small gems have fewer downloads than their parents? I learned that it was mainly due to two reasons:

  1. Some gems are newer than their parents, that is, a new gem came out and a big gem developer wanted to add it as a dependency.
  2. Some gems are shipped with Ruby by default, so they don’t need to be downloaded and thus have low(er) download counts (for example, racc and rexml).

With this, I now had proof of the existence of big gems that would be indirectly vulnerable to account takeover of a small gem. While an existence proof is nice, it was pointed out to me that the rubygems.org API only returns a gem’s direct dependencies, and that those dependencies might have sub-dependencies that I wasn’t checking. So how could I find out which packages get installed when one of these big gems gets installed?

With Bundler, of course!

Bundler is the Ruby dependency manager software that most Ruby users are probably familiar with. Bundler takes a list of gems to install (the Gemfile), installs dependencies that satisfy all version requirements, and, crucially for us, makes a list of all those dependencies and versions in a Gemfile.lock file. So, to find out which big gems relied in any way on small gems, I programmatically created a Gemfile with only the big gem in it, programmatically ran bundle lock, and programmatically read the Gemfile.lock that was created to get all the dependencies.

Here’s the function that did all the work with Bundler:

Ruby function that gets all dependencies that get installed when one gem is installed using Bundler
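The original function was shown as a screenshot; a sketch of the approach (the names are my own) could look like this:

require "bundler"
require "tmpdir"

# Resolve a single gem with Bundler and return every gem name that would be
# installed alongside it, according to the generated Gemfile.lock.
def installed_dependencies(gem_name)
  Dir.mktmpdir do |dir|
    File.write(File.join(dir, "Gemfile"), <<~GEMFILE)
      source "https://rubygems.org"
      gem "#{gem_name}"
    GEMFILE

    system("bundle lock", chdir: dir, exception: true)

    lockfile = Bundler::LockfileParser.new(File.read(File.join(dir, "Gemfile.lock")))
    lockfile.specs.map(&:name) - [gem_name]
  end
end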

With this new methodology, I found that 24 of the 112 big gems rely on small gems, which is a fairly significant proportion of them. After discovering this, I wanted to look into visualization. Up until this point, I was just printing out results to the command line to make text dumps like this:

Text dump of dependency results. Big gems are red, their dependencies that are small are indented in black
Text dump of dependency results. Big gems are red, their dependencies that are small are indented in black

This visualization isn’t very convenient to read, and it misses out on patterns. For example, as you can see above, many big gems rely on racc. It would be useful to know if they relied directly on it, or if most packages depended on it indirectly through some other package. The idea of making a graph had been in the back of my mind since the beginning of this project, and when I realized how helpful it might be, I committed to it. I used the graph gem, following some examples from this talk by Aja Hammerly. I used a breadth-first search, starting with a queue of all the big gems, adding direct dependencies to the queue as I went. I added edges from gems to their dependencies and highlighted small gems in red. Here was the first iteration:

The output of the graph gem that highlights gem dependencies
The first iteration

It turns out there are a lot of AWS gems, so I decided to remove them from the graph and got a much nicer result:

The output of the graph gem that highlights gem dependencies
Full size image link if you want to zoom and pan

The graph, while moderately cluttered, shows a lot of information succinctly. For instance, you can see a galaxy of gems in the middle-left, with rails being the gravitational attractor, a clear keystone in the Ruby world.

Output of the gem graph with Rails at the center
The Rails galaxy

The node with the most arrows pointing into it is activesupport, so it really is an active support.

A close up view of activesupport in the output of the gem graph. activesupport has many arrows pointing into it.
14 arrows pointing into activesupport

Racc, despite appearing in my printouts as a small gem for many big gems, is a dependency of only nokogiri.

A close up view of racc in the output of the gems graph
racc only has 1 edge attached to it

With this nice graph created, I followed up and made one final printout. This time, whenever I found a big gem that depended on a small gem, I printed out all the paths on the graph from the big gem to the small gem, that is, all the ways that the big gem relied on the small gem.

Here’s an example printout:

Big gem is in green (googleauth), small gems are in purple, and the black lines are all the paths from the big gem to the small gem.

I achieved this by making a directional graph data type and writing a depth-first search algorithm to find all the paths from one node to another. I chose to create my own data type because finding all paths on a graph isn’t already implemented in any Ruby gem from what I could tell. Here’s the algorithm, if you’re interested (`@graph` is a Hash of `String:Array` pairs, essentially an adjacency list):

Recursive depth-first search to find all paths from start to end
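Reconstructed as a sketch from that description (the original was shown as an image):

# Recursive depth-first search that returns every path from start to target.
# @graph is an adjacency list: { "gem name" => ["dependency name", ...] }.
def all_paths(start, target, path = [])
  path += [start]
  return [path] if start == target

  (@graph[start] || []).flat_map do |dependency|
    next [] if path.include?(dependency) # guard against cycles

    all_paths(dependency, target, path)
  end
end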

What’s Next

In summary, I found four ways to answer the question of whether or not big gems rely on small gems:

  1. direct dependency printout (using rubygems.org API)
  2. sub-dependency printout (using Bundler)
  3. graph (using graph gem)
  4. sub-dependency printout with paths (like 2, but using my own graph data type).

I’m happy with my work, and I’m glad I got to learn about file I/O and use graph theory. I’m still relatively new to Ruby, so offshoot projects like these are very didactic.

The question remains of what to do with the 24 technically insecure gems. One proposal is to do nothing, since everyone will eventually need to have MFA enabled, and account takeover is still an uncommon event despite being on the rise. 

Another option is to enforce MFA on these specific gems as a sort of blocklist, just to ensure the security of the top gems sooner. This would mean a small group of owners would have to enable MFA a few months earlier, so I could see this being a viable option. 

Either way, more discussion with my team is needed. Thanks for reading!

Kevin is an intern on the Ruby Dependency Security team at Shopify. He is in his 5th year of Engineering Physics at the University of British Columbia.




How to Write Code Without Having to Read It

Do we need to read code before editing it?

The idea isn’t as wild as it sounds. In order to safely fix a bug or update a feature, we may need to learn some things about the code. However, we’d prefer to learn only that information. Not only does extra reading waste time, it overcomplicates our mental model. As our model grows, we’re more likely to get confused and lose track of critical info.

But can we really get away with reading nothing? Spoiler: no. However, we can get closer by skipping over areas that we know the computer is checking, saving our focus for areas that are susceptible to human error. In doing so, we’ll learn how to identify and eliminate those danger areas, so the next person can get away with reading even less.

Let’s give it a try.

Find the Entrypoint

If we’re refactoring code, we already know where we need to edit. Otherwise, we’re changing a behavior that has side effects. In a backend context, these behaviors would usually be exposed APIs. On the frontend, this would usually be something that’s displayed on the screen. For the sake of example, we’ll imagine a mobile application using React Native and Typescript, but this process generalizes to other contexts (as long as they have some concept of build or test errors; more on this later).

If our goal was to read a lot of code, we might search for all hits on RelevantFeatureName. But we don’t want to do that. Even if we weren’t trying to minimize reading, we’ll run into problems if the code we need to modify is called AlternateFeatureName, SubfeatureName, or LegacyFeatureNameNoOneRemembersAnymore.

Instead, we’ll look for something external: the user-visible strings (including accessibility labels—we did remember to add those, right?) on the screen we’re interested in. We search various combinations of string fragments, quotation marks, and UI inspectors until we find the matching string, either in the application code or in a language localization file. If we’re in a localization file, the localization key leads us to the application code that we’re interested in.

Tip
If we’re dealing with a regression, there’s an easier option: git bisect. When git bisect works, we really don’t need to read the code. In fact, we can skip most of the following steps. Because this is such a dramatic shortcut, always keep track of which bugs are regressions from previously working code.

Make the First Edit

If we’ve come in to make a simple copy edit, we’re done. If not, we’re looking for a component that ultimately gets populated by the server, disk, or user. We can no longer use exact strings, but we do have several read-minimizing strategies for zeroing in on the component:

  1. Where is this component on the screen, relative to our known piece of text?
  2. What type of standard component is it using? Is it a button? Text input? Text?
  3. Does it have some unusual style parameter that’s easy to search? Color? Corner radius? Shadow?
  4. Which button launches this UI? Does the button have searchable user-facing text?

These strategies all work regardless of naming conventions and code structure. Previous developers would have a hard time making our life harder without making the code nonfunctional. However, they may be able to make our life easier with better structure. 

For example, if we’re using strategy #1, well-abstracted code helps us quickly rule out large areas of the screen. If we’re looking for some text near the bottom of the screen, it’s much easier to hit the right Text item if we can leverage a grouping like this:

<SomeHeader />
<SomeContent />
<SomeFooter />

rather than being stuck searching through something like this:

// Header
<StaticImage />
<Text />
<Text />
<Button />
<Text />
...
// Content
...
// Footer
...

where we’ll have to step over many irrelevant hits.

Abstraction helps even if the previous developer chose wacky names for header, content, or footer, because we only care about the broad order of elements on the screen. We’re not really reading the code. We’re looking for objective cues like positioning. If we’re still unsure, we can comment out chunks of the screen, starting with larger or highly-abstracted components first, until the specific item we care about disappears.

Once we’ve found the exact component that needs to behave differently, we can make the breaking change right now, as if we’ve already finished updating the code. For example, if we’re making a new component that displays data newText, we add that parameter to its parent’s input arguments, breaking the build.
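As a sketch in the React Native and TypeScript setting we’re imagining (the component and prop names are hypothetical), adding the new parameter to the parent’s props is enough to turn every not-yet-updated call site into a build error:

import React from 'react';
import {Text} from 'react-native';

// Hypothetical parent component: making `newText` a required prop means
// every existing call site that doesn't pass it now fails to build.
type ProductDetailsProps = {
  title: string;
  newText: string; // data for the new child component
};

export function ProductDetails({title, newText}: ProductDetailsProps) {
  return (
    <>
      <Text>{title}</Text>
      <NewInfoBanner text={newText} />
    </>
  );
}

// The new component that displays newText.
function NewInfoBanner({text}: {text: string}) {
  return <Text>{text}</Text>;
}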

If we’re fixing a bug, we can also start by adjusting an argument list. For example, the condition “we shouldn’t be displaying x if y is present” could be represented with the tagged union {mode: 'x'; x: XType} | {mode: 'y'; y: YType}, so it’s physically impossible to pass in x and y at the same time. This will also trigger some build errors.

Tagged Unions
Tagged unions go by a variety of different names and syntaxes depending on language. They’re most commonly referred to as discriminated unions, enums with associated values, or sum types.
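A minimal TypeScript sketch of the x/y condition above (the payload types are hypothetical):

type XType = {text: string};
type YType = {imageUrl: string};

// Either the 'x' variant with its payload, or the 'y' variant with its
// payload. Passing x and y together is now a build error.
type DisplayState =
  | {mode: 'x'; x: XType}
  | {mode: 'y'; y: YType};

const ok: DisplayState = {mode: 'x', x: {text: 'hello'}};
// const bad: DisplayState = {mode: 'x', x: {text: 'hi'}, y: {imageUrl: ''}}; // build error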

Climb Up the Callstack

We now go up the callstack until the build errors go away. At each stage, we edit the caller as if we’ll get the right input, triggering the next round of build errors. Notice that we’re still not reading the code here—we’re reading the build errors. Unless a previous developer has done something that breaks the chain of build errors (for example, accepting any instead of a strict type), their choices don’t have any effect on us.

Once we get to the top of the chain, we adjust the business logic to grab newText or modify the conditional that was incorrectly sending x. At this point, we might be done. But often, our change could or should affect the behavior of other features that we may not have thought about. We need to sweep back down through the callstack to apply any remaining adjustments. 

On the downswing, previous developers’ choices start to matter. In the worst case, we’ll need to comb through the code ourselves, hoping that we catch all the related areas. But if the existing code is well structured, we’ll have contextual recommendations guiding us along the way: “because you changed this code, you might also like…”

Update Recommended Code

As we begin the downswing, our first line of defense is the linter. If we’ve used a deprecated library, or a pattern that creates non-obvious edge cases, the linter may be able to flag it for us. If previous developers forgot to update the linter, we’ll have to figure this out manually. Are other areas in the codebase calling the same library? Is this pattern discouraged in documentation?

After the linter, we may get additional build errors. Maybe we changed a function to return a new type, and now some other consumer of that output raises a type error. We can then update that other consumer’s logic as needed. If we added more cases to an enum, perhaps we get errors from other exhaustive switches that use the enum, reminding us that we may need to add handling for the new case. All this depends on how much the previous developers leaned on the type system. If they didn’t, we’ll have to find these related sites manually. One trick is to temporarily change the types we’re emitting, so all consumers of our output will error out, and we can check if they need updates.

Exhaustiveness
An exhaustive switch statement handles every possible enum case. Most environments don’t enforce exhaustiveness out of the box. For example, in TypeScript, we need to have strictNullChecks turned on, and ensure that the switch statement has a defined return type. Once exhaustiveness is enforced, we can remove default cases, so we’ll get notified (with build errors) whenever the enum changes, reminding us that we need to reassess this switch statement.
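Here’s a minimal sketch of what that looks like (the enum values are made up): with a declared return type and no default case, a missing case produces a build error along the lines of “Function lacks ending return statement”.

type PaymentState = 'pending' | 'captured' | 'refunded';

// The declared `: string` return type plus the missing `default` case means
// the compiler errors if any PaymentState value is left unhandled.
function describePayment(state: PaymentState): string {
  switch (state) {
    case 'pending':
      return 'Waiting on the payment provider';
    case 'captured':
      return 'Payment received';
    case 'refunded':
      return 'Payment returned to the buyer';
  }
}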

Our final wave of recommendations comes from unit test failures. At this point, we may also run into UI and integration tests. These involve a lot more reading than we’d prefer; since these tests require heavy mocking, much of the text is just noise. Also, they often fail for unimportant reasons, like timing issues and incomplete mocks. On the other hand, unit tests sometimes get a bad rap for requiring code restructures, usually into more or smaller abstraction layers. At first glance, it can seem like they make the application code more complex. But we didn’t need to read the application code at all! For us, it’s best if previous developers optimized for simple, easy-to-interpret unit tests. If they didn’t, we’ll have to find these issues manually. One strategy is to check git blame on the lines we changed. Maybe the commit message, ticket, or pull request text will explain why the feature was previously written that way, and any regressions we might cause if we change it.

At no point in this process are comments useful to us. We may have passed some on the upswing, noting them down to address later. Any comments that are supposed to flag problems on the downswing are totally invisible—we aren’t guaranteed to find those areas unless they’re already flagged by an error or test failure. And whether we found comments on the upswing or through manual checking, they could be stale. We can’t know if they’re still valid without reading the code underneath them. If something is important enough to be protected with a comment, it should be protected with unit tests, build errors, or lint errors instead. That way it gets noticed regardless of how attentive future readers are, and it’s better protected against staleness. This approach also saves mental bandwidth when people are touching nearby code. Unlike standard comments, test assertions only pop when the code they’re explaining has changed. When they’re not needed, they stay out of the way.

Clean Up

Having mostly skipped the reading phase, we now have plenty of time to polish up our code. This is also an opportunity to revisit areas that gave us trouble on the downswing. If we had to read through any code manually, now’s the time to fix that for future (non)readers.

Update the Linter

If we need to enforce a standard practice, such as using a specific library or a shared pattern, codify it in the linter so future developers don’t have to find it themselves. This can trigger larger-scale refactors, so it may be worth spinning off into a separate changeset.

Lean on the Type System

Wherever practical, we turn primitive types (bools, numbers, and strings) into custom types, so future developers know which methods will give them valid outputs to feed into a given input. A primitive like timeInMilliseconds: number is more vulnerable to mistakes than time: MillisecondsType, which will raise a build error if it receives a value in SecondsType. When using enums, we enforce exhaustive switches, so a build error will appear any time a new case may need to be handled.
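One way to get there in TypeScript is with “branded” primitives; the type and helper names below are illustrative:

// Numbers the compiler treats as milliseconds or seconds and nothing else.
type Milliseconds = number & {readonly unit: 'ms'};
type Seconds = number & {readonly unit: 's'};

const milliseconds = (n: number) => n as Milliseconds;
const seconds = (n: number) => n as Seconds;

function scheduleRetry(delay: Milliseconds) {
  // ...
}

scheduleRetry(milliseconds(1500)); // OK
// scheduleRetry(seconds(2));      // build error: Seconds is not Milliseconds
// scheduleRetry(1500);            // build error: a bare number isn't enough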

We also check methods for any non-independent arguments:

  • Argument A must always be null if Argument B is non-null, and vice versa (for example, error/response).
  • If Argument A is passed in, Argument B must also be passed in (for example, eventId/eventTimestamp).
  • If Flag A is off, Flag B can’t possibly be on (for example, visible/highlighted).

If these arguments are kept separate, future developers will need to think about whether they’re passing in a valid combination of arguments. Instead, we combine them, so the type system will only allow valid combinations:

  • If one argument must be null when the other is non-null, combine them into a tagged union: {type: 'failure'; error: ErrorType} | {type: 'success'; response: ResponseType}.
  • If two arguments must be passed in together, nest them into a single object: event: {id: IDType; timestamp: TimestampType}.
  • If two flags don’t vary independently, combine them into a single enum: 'hidden'|'visible'|'highlighted'.
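A short TypeScript sketch of those three combinations (the type names are hypothetical):

type ErrorType = {message: string};
type ResponseType = {body: string};

// 1. Mutually exclusive error/response: a tagged union.
type FetchResult =
  | {type: 'failure'; error: ErrorType}
  | {type: 'success'; response: ResponseType};

// 2. Values that must be passed in together: one nested object.
type OrderEvent = {event: {id: string; timestamp: number}};

// 3. Flags that don't vary independently: a single union.
type Visibility = 'hidden' | 'visible' | 'highlighted';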

Optimize for Simple Unit Tests

When testing, avoid entanglement with UI, disk or database access, the network, async code, current date and time, or shared state. All of these factors produce or consume side effects, clogging up the tests with setup and teardown. Not only does this spike the rate of false positives, it forces future developers to learn lots of context in order to interpret a real failure.

Instead, we want to structure our code so that we can write simple tests. As we saw, people can often skip reading our application code. When test failures appear, they have to interact with them. If they can understand the failure quickly, they’re more likely to pay attention to it, rather than adjusting the failing assertion and moving on. If a test is starting to get complicated, go back to the application code and break it into smaller pieces. Move any what code (code that decides which side effects should happen) into pure functions, separate from the how code (code that actually performs the side effects). Once we’re done, the how code won’t contain any nontrivial logic, and the what code can be tested—and therefore documented—without complex mocks.

Trivial vs. Nontrivial Logic
Trivial logic would be something like if (shouldShow) show(). Something like if (newUser) show() is nontrivial (business) logic, because it’s specific to our application or feature. We can’t be sure it’s correct unless we already know the expected behavior.
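As a sketch (the function names are hypothetical), the “what” decision becomes a pure function that a one-line unit test can document without mocks, while the “how” wrapper performs the side effect and contains no nontrivial logic:

// "What": pure business logic, trivially testable with no mocks.
export function shouldShowWelcomeBanner(user: {createdAt: Date}, now: Date): boolean {
  const dayInMs = 24 * 60 * 60 * 1000;
  return now.getTime() - user.createdAt.getTime() < 7 * dayInMs;
}

// "How": performs the side effect; only trivial logic lives here.
export function renderWelcomeBanner(user: {createdAt: Date}, show: (message: string) => void) {
  if (shouldShowWelcomeBanner(user, new Date())) {
    show('Welcome to the app!');
  }
}

// The test reads like the comment we would otherwise have written:
// expect(shouldShowWelcomeBanner({createdAt: sixDaysAgo}, today)).toBe(true);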

Whenever we feel an urge to write a comment, that’s a signal to add more tests. Split the logic out into its own unit tested function so the “comment” will appear automatically, regardless of how carefully the next developer is reading our code.

We can also add UI and integration tests, if desired. However, be cautious of the impulse to replace unit tests with other kinds of tests. That usually means our code requires too much reading. If we can’t figure out a way to run our code without lengthy setup or mocks, humans will need to do a similar amount of mental setup to run our code in their heads. Rather than avoiding unit tests, we need to chunk our code into smaller pieces until the unit tests become easy.

Confirm

Once we’ve finished polishing our code, we manually test it for any issues. This may seem late, but we’ve converted many runtime bugs into lint, build, and test errors. Surprisingly often, we’ll find that we’ve already handled all the edge cases, even if we’re running the code for the first time.

If not, we can do a couple more passes to address the lingering issues… adjusting the code for better “unread”-ability as we go.

Tip
Sometimes, our end goal really is to read the code. For example, we might be reviewing someone else’s code, verifying the current behavior, or ruling out bugs. We can still pose our questions as writes:

  • Could a developer have done this accidentally, or does the linter block it when we try?
  • Is it possible to pass this bad combination of arguments, or would that be rejected at build time?
  • If we hardcode this value, which features (represented by unit tests) would stop working?

JM Neri is a senior mobile developer on the Shop Pay team, working out of Colorado. When not busy writing unit tests or adapting components for larger text sizes, JM is usually playing in or planning for a TTRPG campaign.



Continue reading

Managing React Form State Using the React-Form Library

Managing React Form State Using the React-Form Library

One of Shopify’s philosophies when it comes to adopting a new technology is not only to level up the proficiency of our developers so they can implement the technology at scale, but also to share their newfound knowledge and understanding of the tech with the developer community.

In Part 1 (Building a Form with Polaris) of this series, we were introduced to Shopify’s Polaris Design System, an open source library used to develop the UI within our Admin and here in Part 2 we’ll delve further into Shopify’s open source Quilt repo that contains 72 npm packages, one of which is the react-form library. Each package was created to facilitate the adoption and standardization of React and each has its own README and thorough documentation to help get you started.

The react-form Library

If we take a look at the react-form library repo we can see that it’s used to:

“Manage React forms tersely and safely-typed with no effort using React hooks. Build up your form logic by combining hooks yourself, or take advantage of the smart defaults provided by the powerful useForm hook.”

The useForm and useField Hooks

The documentation categorizes the API into three main sections: Hooks, Validation, and Utilities. There are eight hooks in total and for this tutorial we’ll focus our attention on just the ones most frequently used: useForm and useField.

useForm is a custom hook for managing the state of an entire form and makes use of many of the other hooks in the API. Once instantiated, it returns an object with all of the fields you need to manage a form. When combined with useField, it allows you to easily build forms with smart defaults for common use cases. useField is a custom hook for handling the state and validations of an input field.

The Starter CodeSandbox

As this tutorial is meant to be a step-by-step guide providing all relevant code snippets along the way, we highly encourage you to fork this Starter CodeSandbox so you can code along throughout the tutorial.

If you hit any roadblocks along the way, or just prefer to jump right into the solution code, here’s the Solution CodeSandbox.

First Things First—Clean Up Old Code

The useForm hook creates and manages the state of the entire form, which means we no longer need to import or write a single line of useState in our component, nor do we need any of the previous handler functions used to update the input values. We still need to manage the onSubmit event, as the form needs instructions as to where to send the captured input, but the handler itself is provided by the useForm hook.

With that in mind let’s remove all the following previous state and handler logic from our form.

React isn’t very happy at the moment and presents us with the following error regarding the handleTitleChange function not being defined.

ReferenceError
handleTitleChange is not defined

This occurs because both TextField Components are still referencing their corresponding handler functions that no longer exist. For the time being, we’ll remove both onChange events along with the value prop for both Components.

Although we’re removing them at the moment, they’re still required as per our form logic and will be replaced by the fields object provided by useForm.

React still isn’t happy and presents us with the following error regarding the Page component, whose onAction prop is still assigned to the handleSubmit function that’s been removed.

ReferenceError
handleSubmit is not defined

It just so happens that the useForm hook provides a submit function that does the exact same thing, which we’ll destructure in the next section. For the time being we’ll assign submit to onAction and place it in quotes so that it doesn’t throw an error.

One last and final act of cleanup is to remove the import for useState, at the top of the file, as we’ll no longer manage state directly.

Our codebase now looks like the following:

Importing and Using the useForm and useField Hooks

Now that our Form has been cleaned up, let’s go ahead and import both the useForm and useField hooks from react-form. Note, for this tutorial the shopify/react-form library has already been installed as a dependency.

import { useForm, useField } from "@shopify/react-form";

If we take a look at the first example of useForm in the documentation, we can see that useForm provides us quite a bit of functionality in a small package. This includes several properties and methods that can be instantiated, in addition to accepting a configuration object that’s used to define the form fields and an onSubmit function.

In order to keep ourselves focused on the basic functionality, we’ll start by capturing the inputs for our two fields, title and description, and then handle the submission of the form. We’ll pass in the configuration object, assigning useField() to each field, and lastly, an onSubmit function.
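Based on the library’s documented useForm API, the wiring looks roughly like this; the submitSuccess helper (also exported by @shopify/react-form) signals a successful submission, and the console.log is only there so we can inspect what onSubmit receives:

import { useForm, useField, submitSuccess } from "@shopify/react-form";

// Inside the PolarisForm component:
const { fields, submit, dirty, reset } = useForm({
  fields: {
    title: useField(""),
    description: useField(""),
  },
  async onSubmit(fieldValues) {
    // Send fieldValues to the backend here; for now, just log them.
    console.log(fieldValues);
    return submitSuccess();
  },
});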

Since we previously removed the value and onChange props from our TextField components, the inputs no longer capture nor display text. The two props worked in conjunction: onChange updated state, and value displayed the captured input once the component re-rendered. The same functionality is still required, but those props are now found in the fields object, which we can easily confirm by adding a console.log and viewing the output:

If we do a bit more investigation and expand the description key, we see all of its additional properties and methods, two of which are onChange and value.

With that in mind, let’s add the following to our TextField components:
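A sketch of that change, assuming the starter code’s two Polaris TextField components; spreading each field passes its value, onChange, and error props through in one go:

<TextField label="Title" {...fields.title} />
<TextField label="Description" {...fields.description} />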

It’s clear from the code we just added that we’re pulling each input’s props out of the fields object, using the key that corresponds to the input’s label. We should also be able to type into the inputs and see the text update. The field object also contains additional properties and methods such as reset and dirty that we’ll make use of later when we connect our submit function.

Submitting the Form

With our TextField components all set up, it’s time to enable the form to be submitted. As part of the previous clean up process, we updated the Page Components onAction prop and now it’s time to remove the quotes.

Now that we’ve enabled submission of the form, let’s confirm that the onSubmit function works and take a peek at the fields object by adding a console log.

Let’s add a title and description to our new product and click Save.

Adding a Product screen with Title and Description fields
Adding A Product

We see the following output:

More Than Just Submitting

When we reviewed the useForm documentation earlier we made note of all the additional functionality that it provides, two of which we will make use of now: reset and dirty.

Reset the Form After Submission

reset is a method and is used to clear the form, providing the user with a clean slate to add additional products once the previous one has been saved. reset should be called only after the fields have been passed to the backend and the data has been handled appropriately, but also before the return statement.
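In code, that means calling reset inside onSubmit, after the (eventual) backend call and just before returning (a sketch building on the earlier configuration):

async onSubmit(fieldValues) {
  // ...send fieldValues to the backend and handle the response...
  reset(); // clear the form only once the data has been handled
  return submitSuccess();
},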

If you input some text and click Save, our form should clear the input fields as expected.

Conditionally Enable The Save Button

dirty is used to disable the Save button until the user has typed some text into either of the input fields. The Page component manages the Save button and exposes a disabled property for it, which we assign the value !dirty: dirty starts out false for an untouched form, so the button starts disabled, and it flips to true as soon as either field changes.
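Assuming the Save button comes from the Page component’s primaryAction, as in the starter code, the change is a one-liner (the title and content values here are illustrative):

<Page
  title="Add Product"
  primaryAction={{ content: "Save", onAction: submit, disabled: !dirty }}
>
  {/* ...form markup... */}
</Page>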

You should now notice that the Save button is disabled until you type into either of the fields, at which point Save is enabled.

We can also validate that it’s now disabled by examining the Save button in developer tools.

Developer Tools screenshot showing the code disabling the save button.
Save Button Disabled

Form Validation

What we might have noticed when adding dirty is that if the user types into either field, the Save button is immediately enabled. One last aspect of our form is that we’ll require the Title field to contain some input before the product can be submitted. To do this we’ll import the notEmpty validator from react-form.

Assigning it also requires that we now pass useField a configuration object that includes the following keys: value and validates. The value key keeps track of the current input value and validates provides us a means of validating input based on some criteria.

In our case, we’ll prevent the form from being submitted if the title field is empty and provide the user an error message indicating that it’s a required field.
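A sketch of the updated configuration; the import line gains notEmpty, and the title field switches from a bare default value to a configuration object:

import { useForm, useField, notEmpty } from "@shopify/react-form";

// ...inside useForm's configuration object:
fields: {
  title: useField({
    value: "",
    validates: [notEmpty("Field is required")],
  }),
  description: useField(""),
},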

Let’s give it a test run and confirm it’s working as expected. Try adding only a description and then click Save.

Add Product screen with Title and Description fields. Title field is empty and showing error message “field is required.”
Field is required error message

As we can see our form implements all the previous functionality and then some, all of which was done via the useForm and useField hooks. There’s quite a bit more functionality that these specific hooks provide, so I encourage you to take a deeper dive and explore them further.

This tutorial was meant to introduce you to Shopify’s open source react-form library that’s available in Shopify’s public Quilt repo. The repo provides many more useful React hooks such as the following to name a few:

  • react-graphql: for creating type-safe and asynchronous GraphQL components for React
  • react-testing: for testing React components according to our conventions
  • react-i18n: i18n utilities for handling translations and formatting

Joe Keohan is a Front End Developer at Shopify, located in Staten Island, NY and working on the Self Serve Hub team enhancing the buyer’s experience. He’s been instructing software engineering bootcamps for the past 6 years and enjoys teaching others about software development. When he’s not working through an algorithm, you’ll find him jogging, surfing and spending quality time with his family.



Continue reading

How We Built Oxygen: Hydrogen’s Counterpart for Hosting Custom Storefronts

How We Built Oxygen: Hydrogen’s Counterpart for Hosting Custom Storefronts

In June, we shared the details of how we built Hydrogen, our React-based framework for building custom storefronts. We talked about some of the big bets we made on new technologies like React Server components, and the many internal and external collaborations that made Hydrogen a reality.  

This time we tell the story of Oxygen, Hydrogen’s counterpart that makes hosting Hydrogen custom storefronts easy and seamless. Oxygen guarantees fast and globally available storefronts that securely integrate with the Shopify ecosystem while eliminating additional costs of setting up third-party hosting tools. 

We’ll dive into the experiences we focused on, the technical choices we made to build those experiences, and how those choices paved the path for Shopify to get involved in standardizing serverless runtimes in partnership with leaders like Cloudflare and Deno.

Shopify-Integrated Merchant Experience

Let’s first briefly look at why we built Oxygen. There are existing products in the market that can host Hydrogen custom storefronts. Oxygen’s uniqueness is in the tight integration it provides with Shopify. Our technical choices so far have largely been grounded in ensuring this integration is frictionless for the user.

We started with GitHub for version control, GitHub actions for continuous deployment, and Cloudflare for worker runtimes and edge distribution. We combined these third-party services with first-party services such as Shopify CDN, Shopify Admin API, and Shopify Identity and Access Management. They’re glued together by Oxygen-scoped services that additionally provide developer tooling and observability. Oxygen today is the result of bundling together this collection of technologies.

A flow diagram highlighting the interaction between the developer, GitHub, Oxygen, Cloudflare, the buyer, and Shopify
Oxygen overview

We introduced the Hydrogen sales channel as the connector between Hydrogen, Oxygen, and the shop admin. The Hydrogen channel is the portal that provides controls to create and manage custom storefronts, link them to the rest of the shop’s administrative functions, and connect them to Oxygen for hosting. It is built on Shopify’s standard Rails and React stack, leveraging the Polaris design system for a consistent user experience across Shopify-built admin experiences.

Fast Buyer Experience

Oxygen exists to give merchants the confidence that Shopify will deliver an optimal buyer experience while merchants focus on their entrepreneurial objectives. Optimal buyer experience in Oxygen’s context is a combination of high availability guarantees, super fast site performance from anywhere in the world, and resilience to handle high-volume traffic.

To Build or To Reuse

This is where we had the largest opportunity to contemplate our technical direction. We could leverage over a decade’s experience at Shopify in building infrastructure solutions that keep the entire Shopify platform up and running to build an infrastructure layer, control plane, and proprietary V8 isolates. In fact, we did briefly and it was a successful venture! However, we ultimately decided to opt for Cloudflare’s battle-hardened worker infrastructure that guarantees buyer access to storefronts within milliseconds due to global edge distribution.

This foundational decision significantly simplified upfront infrastructural complexity, scale and security risk considerations, allowing us to get Oxygen to merchants faster and validate our bets. These choices also leave us enough room to go back and build our own proprietary version at scale or a simpler variation of it if it makes sense both for the business and the users.

A flow diagram highlighting the interactions between the Shopify store, Cloudflare, and the Oxygen workers.
Oxygen workers overview

We were able to provide merchants and buyers the promised fast performance while locking in access controls. When a buyer makes a request to a Hydrogen custom storefront hosted at myshop.com, that request is received by Oxygen’s Gateway Worker running in Cloudflare. This worker is responsible for validating that the accessor has the necessary authorization to the shop and the specific storefront version before routing them to the Storefront Worker that is running the Hydrogen-based storefront code. The worker chaining is made possible using Cloudflare’s new Dynamic Dispatch API from Workers for Platforms.
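The dispatch pattern looks roughly like this; this is an illustrative TypeScript sketch of a Workers for Platforms dispatch namespace binding, not Oxygen’s actual code, and the binding name, script name, and authorization check are all hypothetical:

// Minimal shape of a Workers for Platforms dispatch namespace binding.
interface DispatchNamespace {
  get(scriptName: string): { fetch(request: Request): Promise<Response> };
}

// Hypothetical stand-in for Oxygen's real authorization logic.
async function isAuthorized(request: Request): Promise<boolean> {
  return request.headers.has("x-access-token");
}

export default {
  async fetch(request: Request, env: { STOREFRONTS: DispatchNamespace }) {
    if (!(await isAuthorized(request))) {
      return new Response("Forbidden", { status: 403 });
    }
    // Dynamic Dispatch: look up the storefront's worker by name and
    // forward the original request to it.
    const storefront = env.STOREFRONTS.get("shop-123-storefront-v42");
    return storefront.fetch(request);
  },
};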

Partnerships and Open Source

Rather than reinventing the wheel, we took the opportunity to work with leaders in the JavaScript runtimes space to collectively evolve the ecosystem. We use and contribute to the Workers for Platforms solution through tight feedback loops with Cloudflare. We also jointly established WinterCG, a JavaScript runtimes community group in partnership with Cloudflare, Vercel, Deno, and others. We leaned in to collectively building with and for the community, just the way we like it at Shopify.

Familiar Developer Experience

Oxygen largely provides developer-oriented capabilities and we strive to provide a developer experience that cohesively integrates with existing developer workflows. 

Continuous Data-Informed Improvements

While the Oxygen platform takes care of the infrastructure and distribution management of custom storefronts, it surfaces critical information about how custom storefronts are performing in production, ensuring fast feedback loops throughout the development lifecycle. Specifically, runtime logs and metrics are surfaced through the observability dashboard within the Hydrogen channel for troubleshooting and performance trend monitoring. The developer-oriented user can extrapolate actions necessary to further improve the site quality.

A flow diagram highlighting the observability enablement side
Oxygen observability overview

We made very deliberate technical choices again on the observability enablement side. Unlike Shopify’s internal observability stack, Oxygen’s observability stack consists of Grafana for dashboards and alerting, Cortex for metrics, Loki for logging, and Tempo for tracing. Under the hood, Oxygen’s Trace Worker runs in Cloudflare and attaches itself to Storefront Workers to capture all of the logging and metrics information and forwards all of it to our Grafana stack. Logs are sent to Loki and metrics are sent to Cortex where the Oxygen Observability Service pulls both on-demand when the Hydrogen channel requests it.

The tech stack was chosen for two key purposes: Oxygen provided a good test bed to experiment with and evaluate these tools for a potential long-term fit for the rest of Shopify, and Oxygen’s use case is fundamentally different from Shopify’s internal ones. To support the latter, we needed a way to cleanly separate internal-facing from external-facing metrics while scaling to the data loads. We also needed the tooling to be flexible enough that we can offer merchants the option to integrate with any existing monitoring tools in their workflows.

What’s Next

Thanks to many flexible, eager, and collaborative merchants who maintained tight feedback loops every step of the way, Oxygen is used in production today by Allbirds, Shopify Supply, Shopify Hardware, and Denim Tears. It is generally available to our Plus merchants as of June.

We’re just getting started though! We have our eyes on unlocking composable, plug-and-play styled usage in addition to surfacing deeper development insights earlier in the development lifecycle to shorten feedback loops. We also know there is a lot of opportunity for us to enhance the developer experience by reducing the number of surfaces to interact with, providing more control from the command line, and generally streamlining the Oxygen developer tools with the overall Shopify developer toolbox.

We’re eager to take in all the merchant feedback as they demand the best of us. It helps us discover, learn, and push ourselves to revalidate assumptions, which will ultimately create new opportunities for the platform to evolve.

Curious to learn more? We encourage you to check out the docs!

Sneha is an engineering leader on the Oxygen team and has contributed to various teams building tools for a developer-oriented audience.



Continue reading

How We Enable Two-Day Delivery in the Shopify Fulfillment Network

How We Enable Two-Day Delivery in the Shopify Fulfillment Network

Merchants sign up for Shopify Fulfillment Network (SFN) to save time on storing, packing, and shipping their merchandise, which is distributed and balanced across our network of Shopify-certified warehouses throughout North America. We optimize the placement of inventory so it’s closer to buyers, saving on shipping time, costs, and carbon emissions.

Recently, SFN also made it easier and more affordable for merchants to offer two-day delivery across the United States. We consider many real-world factors to provide accurate delivery dates, optimizing for sustainable ground shipment methods and low cost.

A Platform Solution

As with most new features at Shopify, we couldn’t just build a custom solution for SFN. Shopify is an extensible platform with a rich ecosystem of apps that solve complex merchant problems. Shopify Fulfillment Network is just one of the many third-party logistics (3PL) solutions available to merchants. 

This was a multi-team initiative. One team built the delivery date platform in the core of Shopify, consisting of a new set of GraphQL APIs that any 3PL can use to upload their delivery dates to the Shopify platform. Another team integrated the delivery dates into the Shopify storefront, where they are shown to buyers on the product details page and at checkout. A third team built the system to calculate and upload SFN delivery dates to the core platform. SFN is an app that merchants install in their shops, in the same way other 3PLs interact with the Shopify platform. The SFN app calls the new delivery date APIs to upload its own delivery dates to Shopify. For accuracy, the SFN delivery dates are calculated using network, carrier, and product data. Let’s take a closer look at these inputs to the delivery date.

Four Factors That Determine Delivery Date

Many factors determine when a package will leave the warehouse and how long it will spend in transit to the destination address. Each 3PL has its own particular considerations, and the Shopify platform is flexible enough to support them all. Requiring them to conform to fine-grained platform primitives such as operating days and capacity or processing and transit times would only result in loss of fidelity of their particular network and processes.

With that in mind, we let 3PLs populate the platform with their pre-computed delivery dates that Shopify surfaces to buyers on the product details page and checkout. The 3PL has full control over all the factors that affect their delivery dates. Let’s take a look at some of these factors.

1. Proximity to the Destination

The time required for delivery is dependent on the distance the package must travel to arrive at its destination. Usually, the closer the inventory is to the destination, the faster the delivery. This means that SFN delivery dates depend on specific inventory availability throughout the network. 

2. Heavy, Bulky, or Dangerous Goods

Some carrier services aren’t applicable to merchandise that exceeds a specific weight or dimensions. Others can’t be used to transport hazardous materials. The delivery date we predict for such items must be based on a shipping carrier service that can meet the requirements.

3. Time Spent at Fulfillment Centers

Statutory holidays and warehouse closures affect when the package can be ready for shipment. Fulfillment centers have regular operating calendars and sometimes exceptional circumstances can force them to close, such as severe weather events.

On any given operating day, warehouse staff need time to pick and pack items into packages for shipment. There’s also a physical limit to how many packages can be processed in a day. Again, exceptional circumstances such as illness can reduce the staff available at the fulfillment center, reducing capacity and increasing processing times.

4. Time Spent in the Hands of Shipping Carriers

In SFN, shipping carriers such as UPS and USPS are used to transport packages from the warehouse to their destination. Just like the warehouses, shipping carriers have their own holidays and closures that affect when the package can be picked up, transported, and delivered. These are modeled as carrier transit days, when packages are moved between hubs in the carrier’s own network, and delivery days, when packages are taken from the last hub to the final delivery address.

Shipping carriers send trucks to pick up packages from the warehouse at scheduled times of the day. Orders that are made after the last pickup for the day have to wait until the next day to be shipped out. Some shipping carriers only pick up from the warehouse if there are enough packages to make it worth their while. Others impose a limit on the volume of packages they can take away from the warehouse, according to their truck dimensions. These capacity limits influence our choice of carrier for the package.

Shipping carriers also publish the expected number of days it takes for them to transport a package from the warehouse to its destination (called Time in Transit). The first transit day is the day after pickup, and the last transit day is the delivery day. Some carriers deliver on Saturdays even though they won’t transport the package within their network on a Saturday.

Putting It All Together

Together, all of these factors are considered when we decide which fulfillment center should process a shipment and which shipping carrier service to use to transport it to its final destination. At regular intervals, we select a fulfillment center and carrier service that optimizes for sustainable two-day delivery for every merchant SKU to every US zip code. From this, we upload pre-calculated schedules of delivery dates to the Shopify platform.

That’s a Lot of Data

It’s a lot of data, but much of it can be shared between merchant SKUs. Our strategy is to produce multiple schedules, each one reflecting the delivery dates for inventory available at the same set of fulfillment centers and with similar characteristics such as weight and dimensions. Each SKU is mapped to a schedule, and SKUs from different shops can share the same schedule.

Example mapping of SKUs to schedules of delivery dates
Example mapping of SKUs to schedules of delivery dates

In this example, SKU-1.2 from Shop 1 and SKU-3.1 from Shop 3 share the same schedule of delivery dates for heavy items stocked in both California and New York. If an order is placed today by 4pm EST for SKU-1.2 or SKU-3.1, shipping to zip code 10019 in New York, it will arrive on June 10. Likewise, if an order is placed today for SKU-1.2 or SKU-3.1 by 3pm PST, shipping to zip code 90002 in California, it will arrive on June 11.
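One way to picture the data (a hypothetical sketch of the shapes involved, not SFN’s actual schema): each schedule holds cutoffs and delivery dates keyed by destination zip code, and many SKUs, potentially from different shops, point at the same schedule.

type ZipCode = string;

type Schedule = {
  // Order cutoff and promised delivery date for each destination zip code.
  byZip: Record<ZipCode, { cutoffLocalTime: string; deliveryDate: string }>;
};

const schedules: Record<string, Schedule> = {
  "heavy-CA-NY": {
    byZip: {
      "10019": { cutoffLocalTime: "16:00 EST", deliveryDate: "2022-06-10" },
      "90002": { cutoffLocalTime: "15:00 PST", deliveryDate: "2022-06-11" },
    },
  },
};

// SKUs from different shops can share the same schedule.
const scheduleBySku: Record<string, string> = {
  "SKU-1.2": "heavy-CA-NY",
  "SKU-3.1": "heavy-CA-NY",
};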

Looking Up Delivery Dates in Real Time

When Shopify surfaces a delivery date on a product details page or at checkout, it’s a direct, fast lookup (under 100 ms) to find the pre-computed date for that item to the destination address. This is because the SFN app uploads pre-calculated schedules of delivery dates to the Shopify platform and maps each SKU to a schedule using the delivery date APIs.  

SFN System overview
System overview

The date the buyer sees at checkout is sent back to SFN during order fulfillment, where it’s used to ensure that the order is routed to a warehouse and shipping label that meets the delivery date.

There you have it, a highly simplified overview of how we built two-day delivery in the SFN.


Learn More About SFN

Spin Cycle: Shopify's SFN Team Overcomes a Cloud-Development Spiral


Continue reading

When is JIT Faster Than A Compiler?

When is JIT Faster Than A Compiler?

I had this conversation over and over before I really understood it. It goes:

“X language can be as fast as a compiled language because it has a JIT compiler!”
“Wow! It’s as fast as C?”
“Well, no, we’re working on it. But in theory, it could be even FASTER than C!”
“Really? How would that work?”
“A JIT can see what your code does at runtime and optimize for only that. So it can do things C can’t!”
“Huh. Uh, okay.”

It gets hand-wavy at the end, doesn’t it? I find that frustrating. These days I work on YJIT, a JIT for Ruby. So I can make this extremely NOT hand-wavy. Let’s talk specifics.

I like specifics.

Wait, What’s JIT Again?

An interpreter reads a human-written description of your app and executes it. You’d usually use interpreters for Ruby, Python, Node.js, SQL, and nearly all high-level dynamic languages. When you run an interpreted app, you download the human-written source code, and you have an interpreter on your computer that runs it. The interpreter effectively sees the app code for the first time when it runs it. So an interpreter doesn’t usually spend much time fixing or improving your code. It just runs the code how it’s written. An interpreter that significantly transforms your code or generates machine code tends to be called a compiler.

A compiler typically turns that human-written code into native machine code, like those big native apps you download. The most straightforward compilers are ahead-of-time compilers. They turn human-written source code into a native binary executable, which you can download and run. A good compiler can greatly speed up your code by putting a lot of effort into improving it ahead of time. This is beneficial for users because the app developer runs the compiler for them. The app developer pays the compile cost, and users get a fast app. Sometimes people call anything a compiler if it translates from one kind of code to another—not just source code to native machine code. But when I say “compiler” here, I mean the source-code-to-machine-code kind.

A JIT, aka a Just-In-Time compiler, is a partial compiler. A JIT waits until you run the program and then translates the most-used parts of your program into fast native machine code. This happens every time you run your program. It doesn’t write the code to a file—okay, except MJIT and a few others. But JIT compilation is primarily a way to speed up an interpreter—you keep the source code on your computer, and the interpreter has a JIT built into it. And then long-running programs go faster.

It sounds kind of inefficient, doesn’t it? Doing it all ahead of time sounds better to me than doing it every time you run your program.

But some languages are really hard to compile correctly ahead of time. Ruby is one of them. And even when you can compile ahead of time, often you get bad results. An ahead-of-time compiler has to create native code that will always be correct, no matter what your program does later, and sometimes that means it’s about as bad as an interpreter, which has that exact same requirement.

Ruby is Unreasonably Dynamic

Ruby is like my four-year-old daughter: the things I love most about it are what make it difficult.

In Ruby, I can redefine + on integers like 3, 7, or -45. Not just at the start—if I wanted, I could write a loop and redefine what + means every time through that loop. My new + could do anything I want. Always return an even number? Print a cheerful message? Write “I added two numbers” to a log file? Sure, no problem.

That’s thrilling and wonderful and awful in roughly equal proportions.

And it’s not just +. It’s every operator on every type. And equality. And iteration. And hashing. And so much more. Ruby lets you redefine it all.

The Ruby interpreter needs to stop and check every time you add two numbers if you have changed what + means in between. You can even redefine + in a background thread, and Ruby just rolls with it. It picks up the new + and keeps right on going. In a world where everything can be redefined, you can be forgiven for not knowing many things, but the interpreter handles it.

Ruby lets you do awful, awful things. It lets you do wonderful, wonderful things. Usually, it’s not obvious which is which. You have expressive power that most languages say is a very bad idea.

I love it.

Compilers do not love it.

When JITs Cheat, Users Win

Okay, we’ve talked about why it’s hard for ahead-of-time (AOT) compilers to deliver performance gains. But then, how do JIT compilers do it? Ruby lets you constantly change absolutely everything. That’s not magically easy at runtime. If you can’t compile + or == or any operator, why can you compile some parts of the program?

With a JIT, you have a compiler around as the program runs. That allows you to do a trick.

The trick: you can compile the method wrong and still get away with it.

Here’s what I mean.

YJIT asks, “Well, what if you didn’t change what + means every time?” You almost never do that. So it can compile a version of your method where + keeps its meaning from right now. And so does equals, iteration, hashing, and everything you can change in Ruby but you nearly never do.

But… that’s wrong. What if I do change those things? Sometimes apps do. I’m looking at you, ActiveRecord.

But your JIT has a compiler around at runtime. So when you change what + means, it will throw away all those methods it compiled with the old definition. Poof. Gone. If you call them again, you get the interpreted version again. For a while—until JIT compiles a new version with the new definition. This is called de-optimization. When the code starts being wrong, throw it away. When 3+4 stops being 7 (hey, this is Ruby!), get rid of the code that assumed it was. The devil is in the details—switching from one version of a method to another version midway through is not easy. But it’s possible, and JIT compilers basically do it successfully.

So your JIT can assume you don’t change + every time through the loop. Compilers and interpreters can’t get away with that.

An AOT compiler has to create fully correct code before your app even ships. It’s very limited if you change anything. And even if it had some kind of fallback (“Okay, I see three things 3+4 could be in this app”), it can only respond at runtime with something it figured out ahead of time. Usually, that means very conservative code that constantly checks if you changed anything.

An interpreter must be fully correct and respond immediately if you change anything. So it normally assumes that you could have redefined everything at any time. The normal Ruby interpreter spends a lot of time checking if you changed the definition of + over time. You can do clever things to speed up that check, and CRuby does. But if you make your interpreter extremely clever, pre-building optimized code and invalidating assumptions, eventually you realize that you’ve built a JIT.

Ruby and YJIT

I work on YJIT, which is part of CRuby. We do the stuff I mention here. It’s pretty fast.

There are a lot of fun specifics to figure out. What do we track? How do we make it faster? When it’s invalid, do we need to recompile or cancel it? Here’s an example I wrote recently.

You can try out our work by turning on --yjit on recent Ruby versions. You can use even more of our work if you build the latest head-of-master Ruby, perhaps with ruby-build 3.2.0-dev. You can also get all the details by reading the source, which is built right into CRuby.

By the way, YJIT has some known bugs in 3.1 that mean you should NOT use it for real production. We’re a lot closer now—it should be production-ready for some uses in 3.2, which comes out Christmas 2022.

What Was All That Again?

A JIT can add assumptions to your code, like the fact that you probably didn’t change what + means. Those assumptions make the compiled code faster. If you do change what + means, you can throw away the now-incorrect code.

An ahead-of-time compiler can’t do that. It has to assume you could change anything you want. And you can.

An interpreter can’t do that. It has to assume you could have changed anything at any time. So it re-checks constantly. A sufficiently smart interpreter that pre-builds machine code for current assumptions and invalidates if it changes could be as fast as JIT… Because it would be a JIT.

And if you like blog posts about compiler internals—who doesn’t?—you should hit “Yes, sign me up” up above and to the left.

Noah Gibbs wrote the ebook Rebuilding Rails and then a lot about how fast Ruby is at various tasks. Despite being a grumpy old programmer in Inverness, Scotland, Noah believes that some day, somehow, there will be a second game as good as Stuart Smith’s Adventure Construction Set for the Apple IIe. Follow Noah on Twitter and GitHub



Continue reading

Mastering React’s Stable Values

Mastering React’s Stable Values

The concept of a stable value is a distinctly React term, and especially relevant since the introduction of Functional Components. It refers to values (usually coming from a hook) that have the same value across multiple renders. And they’re immediately confusing. In this post, Colin Gray, Principal Developer at Shopify, walks through some cases where they really matter and how to make sense of them.

Continue reading

Shopify and Open Source: A Mutually Beneficial Relationship

Shopify and Open Source: A Mutually Beneficial Relationship

Shopify and Rails have grown up together. Both were in their infancy in 2004, and our CEO (Tobi) was one of the first contributors and a member of Rails Core. Shopify was built on top of Rails, and our engineering culture is rooted in the Rails Doctrine, from developer happiness to the omakase menu, sharp knives, and majestic monoliths. We embody the doctrine pillars. 

Shopify's success is due, in part, to Ruby and Rails. We feel obligated to pay that success back to the community as best we can. But our commitment and investment are about more than just paying off that debt; we have a more meaningful and mutually beneficial goal.

One Hundred Year Mission

At Shopify, we often talk about aspiring to be a 100-year company–to still be around in 2122! That feels like an ambitious dream, but we make decisions and build our code so that it scales as we grow with that goal in mind. If we pull that off, will Ruby and Rails still be our tech stack? It's hard to answer, but it's part of my job to think about that tech stack over the next 100 years.

Ruby and Rails as 100-year tools? What does that even mean?

To get to 100 years, Rails has to be more than an easy way to get started on a new project. It's about cost-effective performance in production, well-formed opinions on the application architecture, easy upgrades, great editors, avoiding antipatterns, and choosing when you want the benefits of typing. 

To get to 100 years, Ruby and Rails have to merit being the tool of choice, every day, for large teams and well-aged projects for a hundred years. They have to be the tool of choice for thousands of developers, across millions of lines of code, handling billions of web requests. That's the vision. That's Rails at scale.

And that scale is where Shopify is investing.

Why Companies Should Invest In Open Source

Open source is the heart and soul of Rails: I’d say that Rails would be nowhere near what it is today if not for the open source community.

Rafael França, Shopify Principal Engineer and Rails Core Team Member

We invest in open source to build the most stable, resilient, and performant version of Ruby and Rails on which to grow our applications. How much better could it be if more people were contributing? As a community, we can do more. Ruby and Rails can only continue to be a choice for companies if we're actively investing in their development, and to do that, we need more companies involved in contributing.

It Improves Engineering Skills

Practice makes progress! Building open source software with cross-functional teams helps build better communication skills and offers opportunities to navigate feedback and criticism constructively. It also enables you to flex your debugging muscles and develop deep expertise in how the framework functions, which helps you build better, more stable applications for your company.

It’s Essential to Application Health & Longevity

Contributing to open source helps ensure that Rails benefits your application and the company in the long term. We contribute because we care about the changes and how they affect our applications. Investing upfront in the foundation is proactive, whereas rewrites and monkey patches are reactive and lead to brittle code that's hard to maintain and upgrade.

At our scale, it's common to find issues with, or opportunities to enhance, the software we use. Why keep those improvements private? Because we build on open source software, it makes sense to contribute to those projects to ensure that they will be as great as possible for as long as possible. If we contribute to the community, it increases our influence on the software that our success is built on and helps improve our chances of becoming a 100-year company. This is why we make contributions to Ruby and Rails, and other open source projects. The commitment and investment are significant, but so are the benefits.

How We're Investing in Ruby and Rails

Shopify is built on a foundation of open source software, and we want to ensure that that foundation continues to thrive for years to come and that it continues to scale to meet our requirements. That foundation can’t succeed without investment and contribution from developers and companies. We don’t believe that open source development is “someone else’s problem”. We are committed to Ruby and Rails projects because the investment helps us future-proof our foundation and, therefore, Shopify. 

We contribute to strategic projects and invest in initiatives that impact developer experience, performance, and security—not just for Shopify but for the greater community. Here are some projects we’re investing in:

Improving Developer Tooling 

  • We’ve open-sourced projects like toxiproxy, bootsnap, packwerk, tapioca, paquito, and maintenance_tasks that are niche tools we found we needed. If we need them, other developers likely need them as well.
  • We helped add Rails support to Sorbet's gradual typing to make typing better for everyone.
  • We're working to make Ruby support in VS Code best-in-class with pre-configured extensions and powerful features like refactorings.
  • We're working on automating upgrades between Ruby and Rails versions to reduce friction for developers.

Increasing Performance

Enhancing Security

  • We're actively contributing to bundler and rubygems to make Ruby's supply chain best-in-class.
  • We're partnering with Ruby Central to ensure the long-term success and security of Rubygems.org through strategic investments in engineering, security-related projects, critical tools and libraries, and improving the cycle time for contributors.

Meet Shopify Contributors

The biggest investment you can make is to be directly involved in the future of the tools that your company relies on. We believe we are all responsible for the sustainability and quality of open source. Shopify engineers are encouraged to contribute to open source projects where possible. The commitment varies. Some engineers make occasional contributions, some are part-time maintainers of important open source libraries that we depend on, and some are full-time contributors to critical open source projects.

Meet some of the Shopify engineers contributing to open source. Some of those faces are probably familiar because we have some well-known experts on the team. But some you might not know…yet. We're growing the next generation of Ruby and Rails experts to build for the future.

Mike is a NYC-based engineering leader who's worked in a variety of domains, including energy management systems, bond pricing, high-performance computing, agile consulting, and cloud computing platforms. He is an active member of the Ruby open-source community, where as a maintainer of a few popular libraries he occasionally still gets to write software. Mike has spent the past decade growing inclusive engineering organizations and delivering amazing software products for Pivotal, VMware, and Shopify.



Continue reading

How We Built Shopify Party

How We Built Shopify Party

Shopify Party is a browser-based internal tool that we built to make our virtual hangouts more fun. With Shopify’s move to remote, we wanted to explore how to give people a break from video fatigue and create a new space designed for social interaction. Here's how we built it.

Continue reading

Building a Form with Polaris

Building a Form with Polaris

As companies grow in size and complexity, there comes a moment in time when teams realize the need to standardize portions of their codebase. This most often becomes the impetus for creating a common set of processes, guidelines, and systems, as well as a standard set of reusable UI elements. Shopify’s Polaris design system was born from such a need. It includes design and content guidelines along with a rich set of React components for the UI that Shopify developers use to build the Shopify admin, and that third-party app developers can use to create apps on the Shopify App Store.

Working with Polaris

Shopify puts a great deal of effort into Polaris and it’s used quite extensively within the Shopify ecosystem, by both internal and external developers, to build UIs with a consistent look and feel. Polaris includes a set of React components used to implement functionality such as lists, tables, cards, avatars, and more.

Form… Forms… Everywhere

Polaris includes 16 separate components that not only encompass the form element itself, but also the standard set of form inputs such as checkbox, color or date picker, radio button, and text field to name just a few. Here are a few examples of basic form design in Shopify admin related to creating a Product or Collection.

Add Product and Create Collection Forms
Add Product and Create Collection Forms

Additional forms you’ll encounter in the Shopify admin also include creating an Order or Gift Card.

Create Order and Issue Gift Card Forms
Create Order and Issue Gift Card Forms

From these form examples, we see a clear UI design implemented using reusable Polaris components.

What We’re Building

In this tutorial, we’ll build a basic form for adding a product that includes Title and Description fields, a Save button, and a top navigation that allows the user to return to the previous page.

Add Product Form
Add Product Form

Although our initial objective is to create the basic form design, this tutorial is also meant as an introduction to React, so we’ll also add the additional logic such as state, events, and event handlers to emulate a fully functional React form. If you’re new to React then this tutorial is a great introduction into many of the fundamental underlying concepts of this library.

Starter CodeSandbox

As this tutorial is a step-by-step guide, providing all the code snippets along the way, we highly encourage you to fork this Starter CodeSandbox and code along with us. “Knowledge” comes from understanding, but “knowing” only comes from application.

If you hit any roadblocks along the way, or just prefer to jump right into the solution code, here’s the Solution CodeSandbox.

Initial Code Examination

The codebase we’re working with is a basic create-react-app. We’ll use the PolarisForm component to build our basic form using Polaris components and then add standard React state logic. The starter code also imports the @shopify/polaris library which can be seen in the dependencies section.

Initial Setup in CodeSandbox
Initial Setup

One other thing to note about our component structure is that the component folders contain both an index.js file and a ComponentName.js file. This is a common pattern visible throughout the Polaris component library, such as the Avatar component in the example below.

Polaris Avatar Component Folder
Polaris Avatar Component Folder

Let’s first open the PolarisForm component. We can see that it’s a bare-bones component that outputs some initial text just so that we know everything is working as expected.

Choosing Components

Choosing our Polaris components may seem intuitive but, at times, it also requires a deeper understanding of the Polaris library, which this tutorial introduces along the way. Here’s a list of the components we’ll include in our design:

  • Form: The actual form
  • FormLayout: To apply a bit of styling between the fields
  • TextField: The text inputs for our form
  • Page: To provide the back arrow navigation and Save button
  • Card: To apply a bit of styling around the form


Reviewing the Polaris Documentation

When working with any new library, it’s always best to examine the documentation or as developers like to say, RTFM. With that in mind we’ll review the Polaris documentation along the way, but for now let’s start with the Form component.

The short description of the Form component describes it as “a wrapper component that handles the submissions of forms.” Also, in order to make it easier to start working with Polaris, each component provides best practices, related components, and several use case examples along with a corresponding CodeSandbox. The docs provide explanations for all the additional props that can be passed to the component.

Adding Our Form

It’s time to dive in and build out the form based on our design. The first thing we need to do is import the components we’ll be working with into the PolarisForm.js file.

import { Form, FormLayout, TextField, Page, Card } from "@shopify/polaris";

Now let’s render the Form and TextField components. I’ve also gone ahead and included the following TextField props: label, type, and multiline.

So it seems our very first introduction into our Polaris journey is the following error message:

MissingAppProviderError
No i18n was provided. Your application must be wrapped in an <AppProvider> component. See https://polaris.shopify.com/components/structure/app-provider for implementation instructions.

Although the message also provides a link to the AppProvider component, I’d suggest we take a few minutes to read the Get Started section of the documentation. We see there’s a working example of rendering a Button component that’s clearly wrapped in an AppProvider.

And if we take a look at the docs for the AppProvider component it states it's “a required component that enables sharing global settings throughout the hierarchy of your application.”

As we’ll see later, Polaris creates several layers of shared context which are used to manage distinct portions of the global state. One important feature of the Shopify admin is that it supports up to 20 languages. The AppProvider is responsible for sharing those translations to all child components across the app.

We can move past this error by importing the AppProvider and replacing the existing React Fragment (<>) in our App component.

The MissingAppProviderError should now be a thing of the past and the form renders as follows:

Rendering The Initial Form
Rendering The Initial Form

Examining HTML Elements

One freedom developers allow themselves when developing code is to be curious. In this case, my curiosity drives me towards viewing what the HTML elements actually look like in developer tools once rendered.

Some questions that come to mind are: “how are they being rendered” and “do they include any additional information that isn’t visible in the component.” With that in mind, let’s take a look at the HTML structure in developer tools and see for ourselves.

Form Elements In Developer Tools
Form Elements In Developer Tools

At first glance it’s clear that Polaris is prefixed to all class and ID names. There are a few more elements to the HTML hierarchy as some of the elements are collapsed, but please feel free to pursue your own curiosity and continue to dig a bit deeper into the code.

Working With React Developer Tools

Another interesting place to look as we satisfy our curiosity is the Components tab in Dev Tools. If you don’t see it, take a minute to install the React Developer Tools Chrome Extension. I’ve highlighted the Form component so we can focus our attention there first and see the component hierarchy of our form. Once again, we’ll see there’s more being rendered than what we imported and rendered in our component.

Form Elements In Developer Tools
Form Elements In Developer Tools

We also can see that at the very top of the hierarchy, just below App, is the AppProvider component. Being able to see the entire component hierarchy gives us some insight into how many nested levels of context are being rendered.

Context is a much deeper topic in React and is meant to allow child components to consume data directly instead of prop drilling. 

Form Layout Component

Now that we’ve allowed ourselves the freedom to be curious, let’s refocus ourselves on implementing the form. One thing we might have noticed in the initial layout of the elements is that there’s no space between the Title input field and the Description label. This can be easily fixed by wrapping both TextField components in a FormLayout component. 

If we take a look at the documentation, we see that the FormLayout component is “used to arrange fields within a form using standard spacing and that by default it stacks fields vertically but also supports horizontal groups of fields.”

Since spacing is what we needed in our design, let’s include the component.

Once the UI updates, we see that it now includes the additional space needed to meet our design specs.

Form With Spacing Added
Form With Spacing Added

With our input fields in place, let’s now move onto adding the back arrow navigation and the Save button to the top of our form design.

The Page Component

This is where Polaris steps outside the bounds of being intuitive and requires that we do a little digging into the documentation, or better yet the HTML. Since we know that we’re rebuilding the Add Product form, let’s take a moment to once again explore our curiosity and look at the actual form in the Shopify admin using the Chrome Dev Tools.

Polaris Page Component As Displayed In HTML
Polaris Page Component As Displayed In HTML

If we highlight the back arrow in HTML it highlights several class names prefixed with Polaris-Page. It looks like we’ve found a reference to the component we need, so now it’s off to the documentation to see what we can find.

Located under the Structure category in the documentation, there’s a component called Page. The short description for the Page component states that it’s “used to build the outer-wrapper of a page, including the page title and associated actions.” The assumption is that title is used for the Add Product text and the associated action includes the Save button.

Let’s give the rest of the documentation a closer look to see what props implement that functionality. Taking a look at the working example, we can correlate it with the following props:

  • breadcrumbs: Adds the back arrow navigation
  • title: Adds the title to the right of the navigation
  • primaryAction: Adds the button, which will include an event listener

With props in hand, let’s add the Page component along with its props and set their values accordingly.

Polaris Page Component As Displayed In HTML
Page Component With Props

The Card Component

We’re getting so close to completing the UI design, but based on a side-by-side comparison, it still needs a bit of white space around the form elements. Of course, we could opt to add our own custom CSS, but Polaris provides a component that achieves the same result.

Comparing Our Designs
Comparing Our Designs

If we take a look at the Shopify admin Add Products form in Dev Tools, we see a reference to the Card component, and it appears to be using padding (green outline in the image below) to create the space.

Card Component Padding
Card Component Padding

Let’s also take a look at the documentation to see what the Card component brings to the table. The short description for the Card component states that it is “used to group similar concepts and tasks together to make Shopify easier for merchants to scan, read and get things done.” Although it makes no mention of creating space either via padding or margin, if we look at the example provided, we see that it contains a prop called sectioned, and the docs state the prop is used to “auto wrap content in a section.” Feel free to toggle the True/False buttons to confirm this does indeed create the spacing we’re looking for.

Let’s add the Card component and include the sectioned prop.

It seems our form has finally taken shape and our UI design is complete.

Final Design
Final Design

Add the Logic

Although our form is nice to look at, it doesn’t do very much at this point. Typing into the fields doesn’t capture any input and clicking the Save button does nothing. The only feature that appears to function is clicking on the back button.

If you’re new to React, this section introduces some of the fundamental elements of React such as state, events, and event handlers.

In React, forms can be configured as controlled or uncontrolled. If you google either concept, you’ll find many articles that describe the differences and use cases for each approach. In our example, we’ll configure the form as a controlled form, which means we’ll capture every keystroke and re-render the input in its current state.

Adding State and Handler Functions

Since we will be capturing every keystroke in two separate input fields we’ll opt to instantiate two instances of state. The first thing we need to do is import the useState Hook.

import { useState } from "react";

Now we can instantiate two unique instances of state called title and description. Instantiating state requires that we create both a state value and a setState function. React is very particular about state and requires that any updates to the state value use the setState function.

With our state instantiated, let’s create the event handler functions that manage all updates to the state. Handler functions aren’t strictly required; however, they’re a React best practice: developers expect a handler function as part of the convention, and additional logic often needs to take place before state is updated.

Since we have two state values to update, we create two separate handler functions for each one, but being that we’re also implementing a form, we also need an event handler to manage when the form is submitted.

Adding Events 

We’re almost there. Now that state and our event handler functions are in place, it’s time to add the events and assign them the corresponding functions. The two events that we add are: onChange and onSubmit.

Let’s start with adding the onChange event to both of the TextField components. Not only do we need to add the event, but, because we’re implementing a controlled form, we also need to include the value prop and assign it to its corresponding state value.

Take a moment to confirm that the form is capturing input by typing into the fields. If it works, then we’re good to go.

The last event we’ll add is the onSubmit event. Our logic dictates that the form would only be submitted once the user clicks on the Save button, so that’s where we’ll add the event logic.

If we take a look at the documentation for the Page component, we see that it includes an onAction prop. Although the documentation doesn’t go any further than providing an example, we assume that’s the prop we use to trigger the onSubmit function.

Of course, let’s confirm that everything is now tied together by clicking on the Save button. If everything worked, we should see the following console log output:

SyntheticBaseEvent {_reactName: "onClick", _targetInst: null, type: "click", nativeEvent: PointerEvent, target: HTMLSpanElement...}

Clearing the Form

The very last step in our form submission process is to clear the form fields so that the merchant has a clean slate if they choose to add another product.

This tutorial was meant to introduce you to the Polaris React components available in Shopify’s Polaris Design System. The library provides a robust set of components that have been meticulously designed by our UX teams and implemented by our development teams. The Polaris GitHub repository is open source, so feel free to look around or set up the local development environment (which uses Storybook).

Joe Keohan is an RnD Technical Facilitator responsible for onboarding our new hire engineering developers. He has been teaching and educating for the past 10 years and is passionate about all things tech. Feel free to reach out on LinkedIn and extend your network or to discuss engineering opportunities at Shopify! When he isn’t leading sessions, you’ll find Joe jogging, surfing, coding and spending quality time with his family.


Wherever you are, your next journey starts here! If building systems from the ground up to solve real-world problems interests you, our Engineering blog has stories about other challenges we have encountered. Intrigued? Visit our Engineering career page to find out about our open positions and learn about Digital by Design.

Continue reading

To Thread or Not to Thread: An In-Depth Look at Ruby’s Execution Models

To Thread or Not to Thread: An In-Depth Look at Ruby’s Execution Models

Deploying Ruby applications using threaded servers has become widely considered standard practice in recent years. According to the 2022 Ruby on Rails community survey, in which over 2,600 members of the global Rails community responded to a series of questions regarding their experience using Rails, threaded web servers such as Puma are by far the most popular deployment target. Similarly, when it comes to job processors, the thread-based Sidekiq seems to represent the majority of deployments.

In this post, I’ll explore the mechanics and reasoning behind this practice and share knowledge and advice to help you make well-informed decisions on whether or not you should utilize threads in your applications (and to that point—how many). 

Why Are Threads the Popular Default?

While there are certainly many different factors for threaded servers' rise in popularity, their main selling point is that they increase an application’s throughput without increasing its memory usage too much. So to fully understand the trade-offs between threads and processes, it’s important to understand memory usage.

Memory Usage of a Web Application

Conceptually, the memory usage of a web application can be divided into two parts.

Two separate text boxes stacked on top of each other, the top one containing the words "Static memory" and the bottom containing the words "Processing memory".
Static memory and processing memory are the two key components of memory usage in a web application.

The static memory is all the data you need to run your application. It consists of the Ruby VM itself, all the VM bytecode that was generated while loading the application, and probably some static Ruby objects such as I18n data, etc. This part is like a fixed cost, meaning whether your server runs 1 or 10 threads, that part will stay stable and can be considered read-only.

The request processing memory is the amount of memory needed to process a request. There you'll find database query results, the output of rendered templates, etc. This memory is constantly being freed by the garbage collector and reused, and the amount needed is directly proportional to the number of threads your application runs.

Based on this simplified model, we express the memory usage of a web application as:

processes * (static_memory + (threads * processing_memory))

So if you have only 512MiB available, with an application using 200MiB of static memory and needing 150MiB of processing memory, using two single threaded processes requires 700MiB of memory, while using a single process with two threads will use only 500MiB and fit in a Heroku dyno.

Two columns of text boxes next to each other. On the left, a column representing a single process shows a box with the text "Static Memory" at the top, and two boxes with the text "Thread #1 Processing Memory" and "Thread #2 Processing Memory" beneath it. In the column on the right, which represents two single threaded processes, there are four boxes, which read: "Process #1 Static Memory", "Process #1 Processing Memory", "Process #2 Static Memory", and "Process #2 Processing Memory" in order from top to bottom.
A single process with two threads uses less memory than two single threaded processes.

However this model, like most models, is a simplified depiction of reality. Let’s bring it closer to reality by adding another layer of complexity: Copy on Write (CoW).

Enter Copy on Write

CoW is a common resource management technique involving sharing resources rather than duplicating them until one of the users needs to alter it, at which point the copy actually happens. If the alteration never happens, then neither does the copy.

In old UNIX systems of the ’70s or ’80s, forking a process involved copying its entire addressable memory over to the new process address space, effectively doubling the memory usage. But since the mid ’90s, that’s no longer true, as most, if not all, fork implementations are now sophisticated enough to trick the processes into thinking they have their own private memory regions, while in reality they’re sharing it with other processes.

When the child process is forked, its page tables are initialized to point to the parent’s memory pages. Later on, if either the parent or the child tries to write in one of these pages, the operating system is notified and will actually copy the page before it’s modified.

This means that if neither the child nor the parent write in these shared pages after the fork happens, forked processes are essentially free.

A flow chart with "Parent Process Static Memory" in a text box at the top. On the second row, there are two text boxes containing the text "Process 1 Processing Memory" and "Process 2 Processing Memory", connected to the top text box with a line to illustrate resource sharing by forking of the parent process.
Copy on Write allows for sharing resources by forking the parent process.

So in a perfect world, our memory usage formula would now be:

static_memory + (processes * threads * processing_memory)

Meaning that threads would have no advantage at all over processes.

But of course we're not in a perfect world. Some shared pages will likely be written into at some point; the question is how many. To answer this, we’ll need to know how to accurately measure the memory usage of an application.

Beware of Deceiving Memory Metrics

Because of CoW and other memory sharing techniques, there are now many different ways to measure the memory usage of an application or process. Depending on the context, some metrics can be more or less relevant.

Why RSS Isn’t the Metric You’re Looking For

The memory metric that’s most often shown by various administration tools, such as ps, is Resident Set Size (RSS). While RSS has its uses, it's really misleading when dealing with forking servers. If you fork a 100MiB process and never write in any memory region, RSS will report both processes as using 100MiB. This is inaccurate because 100MiB is being shared between the two processes—the same memory is being reported twice.

A slightly better metric is Proportional Set Size (PSS). In PSS, shared memory region sizes are divided by the number of processes sharing them. So our 100MiB process that was forked once should actually have a PSS of 50MiB. If you’re trying to figure out whether you’re nearing memory exhaustion, this is already a much more useful metric to look at because if you add up all the PSS numbers you get how much memory is actually being used—but we can go even deeper.

On Linux, you can get a detailed breakdown of a process's memory usage through cat /proc/$PID/smaps_rollup. Here’s what it looks like for a Unicorn worker on one of our apps in production:

And for the parent process:

Let’s unpack what each element here means. First, the Shared and Private fields. As its name suggests, Shared memory is the sum of memory regions that are in use by multiple processes, whereas Private memory is allocated for a specific process and isn’t shared by other processes. In this example, we see that out of the 771,912 kB of addressable memory, only 437,928 kB (56.7%) are really owned by the Unicorn worker; the rest is inherited from the parent process.

As for Clean and Dirty, Clean memory is memory that has been allocated but never written to (things like the Ruby binary and various native shared libraries). Dirty memory is memory that has been written into by at least one process. It can be shared as long as it was only written into by the parent process before it forked its children.

Measuring and Improving Copy on Write Efficiency

We’ve established that shared memory is a key to maximizing efficiency of processes, so the important question here is how much of the static memory is actually shared. To approximate this, we compare the worker shared memory with the parent process RSS, which is 508,544 kB in this app, so:

worker_shared_mem / master_rss
>>(18288 + 315648) / 508544.0 * 100
>>65.66

Here we see that about two-thirds of the static memory is shared:

A flow chart depicting worker shared memory, with Private and Parent Process Shared Static Memory in text boxes at the top, connecting to two separate columns, each containing Private Static Memory and Processing Memory.
By comparing the worker shared memory with the parent process RSS, we can see that two thirds of this app’s static memory is shared.

If we were looking at RSS, we’d think each extra worker would cost ~750MiB, but in reality it’s closer to ~427MiB, whereas an extra thread would cost ~257MiB. That’s still noticeably more, but far less than what the initial naive model would have predicted.

There are a number of ways an application owner can improve CoW efficiency, with the general idea being to load as many things as possible as part of the boot process, before the server forks. This topic is very broad and could be a whole post by itself, but here are a few quick pointers.

The first thing to do is configure the server to fully load the application. Unicorn, Puma, and Sidekiq Enterprise all have a preload_app option for that purpose. Once that’s done, a common pattern that degrades CoW performance is memoized class variables, for example:
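A sketch of the kind of code in question (the class name and file path here are purely illustrative):

require "yaml"

class ShopSettings
  # Lazily loaded and memoized: the data is allocated after fork, in each
  # worker process, so it can never be shared through CoW.
  def self.all
    @all ||= YAML.load_file("config/shop_settings.yml")
  end
end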

Such delayed evaluation both prevents that memory from being shared and causes a slowdown for the first request to call this method. The simple solution is to instead use a constant, but when it’s not possible, the next best thing is to leverage the Rails eager_load_namespaces feature, as shown here:
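A sketch of how that can look, reusing the illustrative ShopSettings class from above; the idea is to register an object that responds to eager_load! so the data is loaded during boot, before the server forks:

class ShopSettings
  # Rails calls eager_load! on everything registered in
  # config.eager_load_namespaces, so the memoized data is loaded
  # in the parent process and can be shared with the workers.
  def self.eager_load!
    all
  end
end

# config/initializers/eager_load_shop_settings.rb
Rails.application.config.eager_load_namespaces << ShopSettings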

Now, locating these lazy loaded constants is the tricky part. Ruby heap-profiler is a useful tool for this. You can use it to dump the entire heap right after fork, and then after processing a few requests, see how much the process has grown and where these extra objects were allocated.

The Case for Process-Based Servers

So, while there are increased memory costs involved in using process-based servers, using more accurate memory metrics and optimizations like CoW to share memory between processes can alleviate some of this. But why use process-based servers such as Unicorn or Resque at all, given the increased memory cost? There are actually advantages to process-based servers that shouldn’t be overlooked, so let’s go through those. 

Clean Timeout Mechanism

When running large applications, you may run into bugs that cause some requests to take much longer than desirable. There could be many reasons for that—they might be specifically crafted by a malicious actor to try to DoS your service, or they might be processing an unexpectedly large amount of data. When this occurs, being able to cleanly interrupt this request is paramount for resiliency. Process-based servers can kill the worker process and fork a fresh one to replace it, ensuring the request is cleanly interrupted.

Threads, however, can’t be interrupted cleanly. Since they directly share mutable resources with other threads, if you attempt to kill a single thread, you may leave some resources such as mutexes or database connections in an unrecoverable state, causing the other threads to run into various unrecoverable errors. 

The Black Box of Global VM Lock Latency

Improved latency is another major advantage of processes over threads in Ruby (and other languages with similar constraints, such as Python). A typical web application process will do two types of work: CPU and I/O. So two Ruby processes might look like this:

Two rows of text boxes, containing separate boxes with the text "IO", "CPU", and "GC", representing the work of processes in a Ruby web application.
CPU and IOs in two processes in a Ruby application.

But in a Ruby process, because of the infamous Global VM Lock (GVL), only one thread at a time can execute Ruby code, and when the garbage collector (GC) triggers, all threads are paused. So if we were to use two threads, the picture may instead look like this:

Two rows of text boxes, with the individual boxes containing the text "CPU", "IO", "GVL wait" and GC", representing the work of threads in a Ruby web application and the latency introduced by the GVL.
Global VM Lock (GVL) increases latency in Ruby threads.

So every time two threads need to execute Ruby code at the same time, the service latency increases. How much this happens varies considerably from one application to another and even from one request to another. If you think about it, to fully saturate a process with N threads, an application needs to spend more than 1/N of its time executing Ruby code, or in other words, less than 1 - 1/N of its time waiting on I/O. So 50 percent I/O is the saturation limit for two threads, 75 percent I/O for four threads, etc. And that’s only the saturation limit: given that a request’s use of I/O and CPU is very much unpredictable, an application doing 75 percent I/O with two threads will still frequently wait on the GVL.

The common wisdom in the Ruby community is that Ruby applications are relatively I/O heavy, but from my experience it’s not quite true, especially once you consider that GC pauses do acquire the GVL too, and Ruby applications tend to spend quite a lot of time in GC.

Web applications are often specifically crafted to avoid long I/O operations in the web request cycle. Any potentially slow or unreliable I/O operation like calling a third-party API or sending an email notification is generally deferred to a background job queue, so the remaining I/O in web requests are mostly reasonably fast database and cache queries. A corollary is that the job processing side of applications tends to be much more I/O intensive than the web side. So job processors like Sidekiq can more frequently benefit from a higher thread count. But even for web servers, using threads can be seen as a perfectly acceptable tradeoff between throughput per dollar and latency. 

The main problem is that as of today there isn’t really a good way to measure how much the service latency is impacted by the GVL, so service owners are left in the dark. Since Ruby doesn’t provide any way to instrument the GVL, all we’re left with are proxy metrics, like gradually increasing or decreasing the number of threads and measuring the impact on the latency metrics, but that’s far from enough.

That’s why I recently put together a feature request and a proof of concept implementation for Ruby 3.2 to provide a GVL instrumentation API. It's a really low-level and hard-to-use API, but if it’s accepted I plan to publish a gem to expose simple metrics to know exactly how much time is spent waiting for the GVL, and I hope application performance monitoring services include it.

Ractors and Fibers—Not a Silver Bullet Solution

In the last few years, the Ruby community has been experimenting heavily with other concurrency constructs that could potentially replace threads: Ractors and Fibers.

Ractors can execute Ruby code in parallel: rather than sharing one single GVL, each Ractor has its own lock, so they could theoretically be game-changing. However, Ractors can’t share any global mutable state, so even sharing a database connection pool or a logger between Ractors isn’t possible. That’s a major architectural challenge that would require most libraries to be heavily refactored, and the result would likely not be as usable. I hope to be proven wrong, but I don’t expect Ractors to be used as units of execution for sizable web applications any time soon.

As for Fibers, they’re essentially lighter threads that are cooperatively scheduled. So everything said in the previous sections about threads and the GVL applies to them as well. They’re very well suited for I/O intensive applications that mostly just move byte streams around and don’t spend much time executing code, but any application that doesn’t benefit from more than a handful of threads won’t benefit from using fibers.

YJIT May Change the Status Quo

While it’s not yet the case, the advent of YJIT may significantly increase the need to run threaded servers in the future. Since just-in-time (JIT) compilers speed up code execution at the expense of unshareable memory usage, JITing Ruby will decrease CoW performance, but will also make applications proportionally more I/O intensive.

Right now, YJIT only offers modest speed gains, but if in the future it manages to provide even a two times speedup, it would certainly allow application owners to ramp up their number of web threads by as much to compensate for the increased memory cost.

Tips to Remember

Ultimately choosing between process versus thread-based servers involves many trade-offs, so it’s unreasonable to recommend either without first looking at an application’s metrics.

But in the abstract, here are a few quick takeaways to keep in mind: 

  • Always enable application preloading to benefit from CoW as much as possible. 
  • Unless your application fits on the smallest offering of your hosting provider, use a smaller number of larger containers instead of a bigger number of smaller containers. For instance, a single box with 4 CPUs and 2GiB of RAM is more efficient than 4 boxes with 1 CPU and 512MiB of RAM each.
  • If latency is more important to you than keeping costs low, or if you have enough free memory for it, use Unicorn to benefit from the reliable request timeout. 
    • Note: Unicorn must be protected from slow client attacks by a reverse proxy that buffers requests. If that’s a problem, Puma can be configured to run with a single thread per worker.
  • If using threads, start with only two threads unless you’re confident your application is indeed spending more than half its time waiting on I/O operations. This doesn’t apply to job processors since they tend to be much more I/O intensive and are much less latency sensitive, so they can easily benefit from higher thread counts. 

Looking Ahead: Future Improvements to the Ruby Ecosystem

We’re exploring a number of avenues to improve the situation for both process and thread-based servers.

First, there’s the GVL instrumentation API mentioned previously that should hopefully allow application owners to make more informed trade-offs between throughput and latency. We could even try to use it to automatically apply backpressure by dynamically adjusting concurrency when GVL contention is over some threshold.

Additionally, threaded web servers could theoretically implement a reliable request timeout mechanism. When a request takes longer than expected, they could stop forwarding requests to the impacted worker and wait for all other requests to either complete or timeout before killing the worker and reforking it. That’s something Matthew Draper explored a few years ago and that seems doable.

Then, the CoW performance of Ruby itself could likely be improved further. Several patches have been merged for this purpose over the years, but we can probably do more. Notably, we suspect that Ruby’s inline caches cause most of the VM bytecode to be unshared once it’s executed. I think we could also take some inspiration from what the Instagram engineering team did to improve Python’s CoW performance. For instance, they introduced a gc.freeze() method that instructs the GC that all existing memory regions will become shared. Python uses this information to make more intelligent decisions around memory usage, like not using any free slots in these shared regions, since it’s more efficient to allocate a new page than to dirty an old one.


Jean Boussier is a Rails Core team member, Ruby committer, and Senior Staff Engineer on Shopify's Ruby and Rails infrastructure team. You can find him on GitHub as @byroot or on Twitter at @_byroot.


If building systems from the ground up to solve real-world problems interests you, our Engineering blog has stories about other challenges we have encountered. Visit our Engineering career page to find out about our open positions. Join our remote team and work (almost) anywhere. Learn about how we’re hiring to design the future together—a future that is digital by design.

Continue reading

Implementing Equality in Ruby

Implementing Equality in Ruby

Ruby is one of the few programming languages that get equality right. I often play around with other languages, but keep coming back to Ruby. This is largely because Ruby’s implementation of equality is so nice.

Nonetheless, equality in Ruby isn't straightforward. There is #==, #eql?, #equal?, #===, and more. Even if you’re familiar with how to use them, implementing them can be a whole other story.

Let's walk through all forms of equality in Ruby and how to implement them.

Why Properly Implementing Equality Matters

We check whether objects are equal all the time. Sometimes we do this explicitly, sometimes implicitly. Here are some examples:

  • Do these two Employees work in the same Team? Or, in code: denis.team == someone.team.
  • Is the given DiscountCode valid for this particular Product? Or, in code: product.discount_codes.include?(given_discount_code).
  • Who are the (distinct) managers for this given group of employees? Or, in code: employees.map(&:manager).uniq.

A good implementation of equality is predictable; it aligns with our understanding of equality.

An incorrect implementation of equality, on the other hand, conflicts with what we commonly assume to be true. Here is an example of what happens with such an incorrect implementation:
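For illustration, imagine a Book class that doesn't implement equality at all (the class and its title attribute are made up for this example):

class Book
  attr_reader :title

  def initialize(title)
    @title = title
  end
end

geb = Book.new("Gödel, Escher, Bach")
geb_also = Book.new("Gödel, Escher, Bach")

geb == geb_also # => false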

The geb and geb_also objects should definitely be equal. The fact that the code says they’re not is bound to cause bugs down the line. Luckily, we can implement equality ourselves and avoid this class of bugs.

No one-size-fits-all solution exists for an equality implementation. However, there are two kinds of objects where we do have a general pattern for implementing equality: entities and value objects. These two terms come from domain-driven design (DDD), but they’re relevant even if you’re not using DDD. Let’s take a closer look.

Entities

Entities are objects that have an explicit identity attribute. Often, entities are stored in some database and have a unique id attribute corresponding to a unique id table column. The following Employee example class is such an entity:
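A sketch of what such a class might look like (the exact attributes are illustrative):

class Employee
  attr_reader :id, :name

  def initialize(id:, name:)
    @id = id
    @name = name
  end
end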

Two entities are equal when their IDs are equal. All other attributes are ignored. After all, an employee’s name might change, but that does not change their identity. Imagine getting married, changing your name, and not getting paid anymore because HR has no clue who you are anymore!

ActiveRecord, the ORM that is part of Ruby on Rails, calls entities "models" instead, but they’re the same concept. These model objects automatically have an ID. In fact, ActiveRecord models already implement equality correctly out of the box!

Value Objects

Value objects are objects without an explicit identity. Instead, their value as a whole constitutes identity. Consider this Point class:
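A minimal version of such a class:

class Point
  attr_reader :x, :y

  def initialize(x, y)
    @x = x
    @y = y
  end
end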

Two Points will be equal if their x and y values are equal. The x and y values constitute the identity of the point.

In Ruby, the basic value object types are numbers (both integers and floating-point numbers), characters, booleans, and nil. For these basic types, equality works out of the box:

Arrays of value objects are in themselves also value objects. Equality for arrays of value objects works out of the box—for example, [17, true] == [17, true]. This might seem obvious, but this isn’t true in all programming languages.

Other examples of value objects are timestamps, date ranges, time intervals, colors, 3D coordinates, and money objects. These are built from other value objects; for example, a money object consists of a fixed-decimal number and a currency code string.

Basic Equality (Double Equals)

Ruby has the == and != operators for checking whether two objects are equal or not:

Ruby’s built-in types all have a sensible implementation of ==. Some frameworks and libraries provide custom types, which will have a sensible implementation of ==, too. Here is an example with ActiveRecord:

For custom classes, the == operator returns true if and only if the two objects are the same instance. Ruby does this by checking whether the internal object IDs are equal. These internal object IDs are accessible using #__id__. Effectively, gizmo == thing is the same as gizmo.__id__ == thing.__id__.

This behavior is often not a good default, however. To illustrate this, consider the Point class from earlier:

The == operator will return true only when calling it on itself:
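For instance:

point = Point.new(1, 2)

point == point           # => true
point == Point.new(1, 2) # => false, even though x and y match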

This default behavior is often undesirable in custom classes. After all, two points are equal if (and only if) their x and y values are equal. The default falls short for value objects (such as Point) as well as for entities (such as the Employee class mentioned earlier).

The desired behavior for value objects and entities is as follows:

Image showing the desired behavior for value objects and entities. The first pairing for value objects checks if x and y (all attributes) are equal. The second pair for entities, checks whether the id attributes are equal. The third pair shows the default ruby check, which is whether internal object ids are equal
  • For value objects (a), we’d like to check whether all attributes are equal.
  • For entities (b), we’d like to check whether the explicit ID attributes are equal.
  • By default (c), Ruby checks whether the internal object IDs are equal.

Instances of Point are value objects. With the above in mind, a good implementation of == for Point would look as follows:
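A sketch of such an implementation, reopening the Point class from above:

class Point
  def ==(other)
    self.class == other.class &&
      x == other.x &&
      y == other.y
  end
end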

This implementation checks all attributes and the class of both objects. By checking the class, comparing a Point instance against something of a different class returns false rather than raising an exception.

Checking equality on Point objects now works as intended:
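For example:

Point.new(1, 2) == Point.new(1, 2) # => true
Point.new(1, 2) == Point.new(3, 4) # => false
Point.new(1, 2) == "not a point"   # => false, rather than an exception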

The != operator works too: Ruby derives it automatically by negating #==, so Point.new(1, 2) != Point.new(3, 4) returns true.

A correct implementation of equality has three properties: reflexivity, symmetry, and transitivity.

Image with simple circles to describe the implementation of equality having three properties: reflexivity, symmetry, and transitivity, described below the image for more context
  • Reflexivity (a): An object is equal to itself: a == a
  • Symmetry (b): If a == b, then b == a
  • Transitivity (c): If a == b and b == c, then a == c

These properties embody a common understanding of what equality means. Ruby won’t check these properties for you, so you’ll have to be vigilant to ensure you don’t break these properties when implementing equality yourself.

IEEE 754 and violations of reflexivity

It seems natural that something would be equal to itself, but there is an exception. IEEE 754 defines NaN (Not a Number) as a value resulting from an undefined floating-point operation, such as dividing 0 by 0. NaN, by definition, is not equal to itself. You can see this for yourself:
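For instance, in an IRB session:

nan = 0.0 / 0.0          # => NaN
nan == nan               # => false
Float::NAN == Float::NAN # => false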

This means that == in Ruby is not universally reflexive. Luckily, exceptions to reflexivity are exceedingly rare; this is the only exception I am aware of.

Basic Equality for Value Objects

The Point class is an example of a value object. The identity of a value object, and thereby equality, is based on all its attributes. That is exactly what the earlier example does:

Basic Equality for Entities

Entities are objects with an explicit identity attribute, commonly @id. Unlike value objects, an entity is equal to another entity if and only if their explicit identities are equal.

Entities are uniquely identifiable objects. Typically, any database record with an id column corresponds to an entity. Consider the following Employee entity class:

Other forms of ID are possible too. For example, books have an ISBN, and recordings have an ISRC. But if you have a library with multiple copies of the same book, then ISBN won’t uniquely identify your books anymore.

For entities, the == operator is more involved to implement than for value objects:
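Here's a sketch that matches the description below:

class Employee
  def ==(other)
    super ||
      (self.class == other.class &&
        !id.nil? &&
        id == other.id)
  end
end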

This code does the following:

  • The super call invokes the default implementation of equality: Object#==. On Object, the #== method returns true if and only if the two objects are the same instance. This super call, therefore, ensures that the reflexivity property always holds.
  • As with Point, the implementation Employee#== checks class. This way, an Employee instance can be checked for equality against objects of other classes, and this will always return false.
  • If @id is nil, the entity is considered not equal to any other entity. This is useful for newly-created entities which have not been persisted yet.
  • Lastly, this implementation checks whether the ID is the same as the ID of the other entity. If so, the two entities are equal.

Checking equality on entities now works as intended:
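For example, with some illustrative data:

denis = Employee.new(id: 1, name: "Denis")
also_denis = Employee.new(id: 1, name: "Denis D.")
unsaved = Employee.new(id: nil, name: "Robin")

denis == also_denis    # => true, same ID despite different names
unsaved == unsaved.dup # => false, entities without an ID are never equal
unsaved == unsaved     # => true, thanks to the super call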



Blog post of Theseus

Implementing equality on entity objects isn’t always straightforward. An object might have an id attribute that doesn’t quite align with the object’s conceptual identity.

Take a BlogPost class, for example, with id, title, and body attributes. Imagine creating a BlogPost, then halfway through writing the body for it, scratching everything and starting over with a new title and a new body. The id of that BlogPost will still be the same, but is it still the same blog post?

If I follow a Twitter account that later gets hacked and turned into a cryptocurrency spambot, is it still the same Twitter account?

These questions don’t have a proper answer. That’s not surprising, as this is essentially the Ship of Theseus thought experiment. Luckily, in the world of computers, the generally accepted answer seems to be yes: if two entities have the same id, then the entities are equal as well.

Basic Equality with Type Coercion

Typically, an object is not equal to an object of a different class. However, this isn’t always the case. Consider integers and floating-point numbers:
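Ruby considers 2 and 2.0 to be equal:

integer_two = 2
float_two = 2.0

float_two == integer_two # => true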

Here, float_two is an instance of Float, and integer_two is an instance of Integer. They are equal: float_two == integer_two is true, despite different classes. Instances of Integer and Float are interchangeable when it comes to equality.

As a second example, consider this Path class:
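A sketch of what such a class might look like (the constructor and #join method are assumptions for illustration, but #to_str matters for what follows):

class Path
  def initialize(path)
    @path = path
  end

  def join(component)
    self.class.new(File.join(@path, component))
  end

  def to_str
    @path
  end
end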

This Path class provides an API for creating paths:
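For example:

path = Path.new("/usr/bin").join("ruby")
path.to_str # => "/usr/bin/ruby"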

The Path class is a value object, and implementing #== could be done just as with other value objects:

However, the Path class is special because it represents a value that could be considered a string. The == operator will return false when checking equality with anything that isn’t a Path:

It can be beneficial for path == "/usr/bin/ruby" to be true rather than false. To make this happen, the == operator needs to be implemented differently:
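A sketch of such an implementation:

class Path
  def ==(other)
    other.respond_to?(:to_str) &&
      to_str == other.to_str
  end
end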

This implementation of == coerces both objects to Strings, and then checks whether they are equal. Checking equality of a Path now works:
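For example:

path = Path.new("/usr/bin/ruby")

path == Path.new("/usr/bin/ruby") # => true
path == "/usr/bin/ruby"           # => true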

This class implements #to_str, rather than #to_s. These methods both return strings, but by convention, the to_str method is only implemented on types that are interchangeable with strings.

The Path class is such a type. By implementing Path#to_str, the implementation states that this class behaves like a String. For example, it’s now possible to pass a Path (rather than a String) to IO.open, and it will work because IO.open accepts anything that responds to #to_str.

String#== also uses the to_str method. Because of this, the == operator is symmetric: "/usr/bin/ruby" == path returns true, just like path == "/usr/bin/ruby" does.

Strict Equality

Ruby provides #equal? to check whether two objects are the same instance:
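For example:

a = "hello"
b = "hello"

a.equal?(a) # => true, same instance
a.equal?(b) # => false, distinct instances
a == b      # => true, same content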

Here, we end up with two String instances with the same content. Because they are distinct instances, #equal? returns false, and because their content is the same, #== returns true.

Do not implement #equal? in your own classes. It isn’t meant to be overridden. It’ll all end in tears.

Earlier in this post, I mentioned that #== has the property of reflexivity: an object is always equal to itself. Here is a related property for #equal?:

Property: Given objects a and b. If a.equal?(b), then a == b.

Ruby won’t automatically validate this property for your code. It’s up to you to ensure that this property holds when you implement the equality methods.

For example, recall the implementation of Employee#== from earlier in this article:

The call to super on the first line makes this implementation of #== reflexive. This super invokes the default implementation of #==, which delegates to #equal?. Therefore, I could have used #equal? rather than super:

I prefer using super, though this is likely a matter of taste.

Hash Equality

In Ruby, any object can be used as a key in a Hash. Strings, symbols, and numbers are commonly used as Hash keys, but instances of your own classes can function as Hash keys too—provided that you implement both #eql? and #hash.

The #eql? Method

The #eql? method behaves similarly to #==:

However, #eql?, unlike #==, does not perform type coercion:
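Integers and floats illustrate the difference:

1 == 1.0    # => true, with type coercion
1.eql?(1.0) # => false, no type coercion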

If #== doesn’t perform type coercion, the implementations of #eql? and #== will be identical. Rather than copy-pasting, however, we’ll put the implementation in #eql?, and let #== delegate to #eql?:
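For Point, that looks something like this:

class Point
  def eql?(other)
    self.class == other.class &&
      x == other.x &&
      y == other.y
  end

  def ==(other)
    eql?(other)
  end
end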

I made the deliberate decision to put the implementation in #eql? and let #== delegate to it, rather than the other way around. If we were to let #eql? delegate to #==, there’s an increased risk that someone will update #== and inadvertently break the properties of #eql? (mentioned below) in the process.

For the Path value object, whose #== method does perform type coercion, the implementation of #eql? will differ from the implementation of #==:
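A sketch:

class Path
  def eql?(other)
    self.class == other.class &&
      to_str == other.to_str
  end
end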

Here, #== does not delegate to #eql?, nor the other way around.

A correct implementation of #eql? has the following two properties:

  • Property: Given objects a and b. If a.eql?(b), then a == b.
  • Property: Given objects a and b. If a.equal?(b), then a.eql?(b).

These two properties are not explicitly called out in the Ruby documentation. However, to the best of my knowledge, all implementations of #eql? and #== respect these properties.

Ruby will not automatically validate that these properties hold in your code. It’s up to you to ensure that these properties aren’t violated.

The #hash Method

For an object to be usable as a key in a Hash, it needs to implement not only #eql?, but also #hash. This #hash method will return an integer, the hash code, that respects the following property:

Property: Given objects a and b. If a.eql?(b), then a.hash == b.hash.

Typically, the implementation of #hash creates an array of all attributes that constitute identity and returns the hash of that array. For example, here is Point#hash:
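Something along these lines:

class Point
  def hash
    [self.class, x, y].hash
  end
end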

For Path, the implementation of #hash will look similar:
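Roughly:

class Path
  def hash
    [self.class, to_str].hash
  end
end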

For the Employee class, which is an entity rather than a value object, the implementation of #hash will use the class and the @id:
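Along the lines of:

class Employee
  def hash
    [self.class, id].hash
  end
end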

If two objects are not equal, the hash code should ideally be different, too. This isn’t mandatory, however. It’s okay for two non-equal objects to have the same hash code. Ruby will use #eql? to tell objects with identical hash codes apart.

Avoid XOR for Calculating Hash Codes

A popular but problematic approach for implementing #hash uses XOR (the ^ operator). Such an implementation would calculate the hash codes of each individual attribute, and combine these hash codes with XOR. For example:
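A sketch of the pattern, using the Point class:

class Point
  def hash
    # Problematic: XOR is commutative, so Point.new(1, 2) and Point.new(2, 1)
    # produce the same hash code.
    x.hash ^ y.hash
  end
end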

With such an implementation, the chance of a hash code collision, which means that multiple objects have the same hash code, is higher than with an implementation that delegates to Array#hash. Hash code collisions will degrade performance and could potentially pose a denial-of-service security risk.

A better way, though still flawed, is to multiply the components of the hash code by unique prime numbers before combining them:
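Roughly like this (the primes are arbitrary):

class Point
  def hash
    (x.hash * 31) ^ (y.hash * 37)
  end
end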

Such an implementation has additional performance overhead due to the new multiplication. It also requires mental effort to ensure the implementation is and remains correct.

An even better way of implementing #hash is the one I’ve laid out before—making use of Array#hash:

An implementation that uses Array#hash is simple, performs quite well, and produces hash codes with the lowest chance of collisions. It’s the best approach to implementing #hash.

Putting it Together

With both #eql? and #hash in place, the Point, Path, and Employee objects can be used as hash keys:
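For example:

favorite_points = {
  Point.new(1, 2) => "home",
  Point.new(3, 4) => "work",
}

favorite_points[Point.new(1, 2)] # => "home"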

Here, we use a Hash instance to keep track of a collection of Points. We can also use a Set for this, which uses a Hash under the hood, but provides a nicer API:
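For instance:

require "set"

points = Set.new
points << Point.new(1, 2)

points.include?(Point.new(1, 2)) # => true
points.include?(Point.new(9, 9)) # => false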

Objects used in Sets need to have an implementation of both #eql? and #hash, just like objects used as hash keys.

Objects that perform type coercion, such as Path, can also be used as hash keys, and thus also in sets:

We now have an implementation of equality that works for all kinds of objects.

Mutability, Nemesis of Equality

So far, the examples for value objects have assumed that these value objects are immutable. This is with good reason because mutable value objects are far harder to deal with.

To illustrate this, consider a Point instance used as a hash key:
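Something like this, assuming a mutable Point (with attr_accessor rather than attr_reader) for the sake of the example:

point = Point.new(1, 2)
collection = { point => "a point I care about" }

collection.key?(point) # => true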

The problem arises when changing attributes of this point:
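For example:

point.x = 10           # the hash code depends on x, so it silently changes

collection.key?(point) # => false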

Because the hash code is based on the attributes, and an attribute has changed, the hash code is no longer the same. As a result, collection no longer seems to contain the point. Uh oh!

There are no good ways to solve this problem except for making value objects immutable.

This isn’t a problem with entities. This is because the #eql? and #hash methods of an entity are solely based on its explicit identity—not its attributes.

So far, we’ve covered #==, #eql?, and #hash. These three methods are sufficient for a correct implementation of equality. However, we can go further to improve that sweet Ruby developer experience and implement #===.

Case Equality (Triple Equals)

The #=== operator, also called the case equality operator, isn’t really an equality operator at all. Rather, it’s better to think of it as a membership testing operator. Consider the following:
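For instance:

(1..10) === 5  # => true
(1..10) === 50 # => false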

Here, Range#=== checks whether a range covers a certain element. It’s also common to use case expressions to achieve the same:
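For example:

case 5
when 1..10 then "covered"
else "not covered"
end
# => "covered"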

This is also where case equality gets its name. Triple-equals is called case equality, because case expressions use it.

You never need to use case. It’s possible to rewrite a case expression using if and ===. In general, case expressions tend to look cleaner. Compare:

The examples above all use Range#===, to check whether the range covers a certain number. Another commonly used implementation is Class#===, which checks whether an object is an instance of a class:
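For instance:

Integer === 5      # => true
Integer === "five" # => false
String === "five"  # => true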

I’m rather fond of the #grep method, which uses #=== to select matching elements from an array. It can be shorter and sweeter than using #select:
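For example, selecting the integers out of a mixed array:

mixed = [3, "three", 4, "four"]

mixed.grep(Integer)                # => [3, 4]
mixed.select { |e| Integer === e } # => [3, 4]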

Regular expressions also implement #===. You can use it to check whether a string matches a regular expression:
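For instance:

/[a-z]/ === "+491573abcde" # => true
/[a-z]/ === "+49157342"    # => false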

It helps to think of a regular expression as the (infinite) collection of all strings that can be produced by it. The set of all strings produced by /[a-z]/ includes the example string "+491573abcde". Similarly, you can think of a Class as the (infinite) collection of all its instances, and a Range as the collection of all elements in that range. This way of thinking clarifies that #=== really is a membership testing operator.

An example of a class that could implement #=== is a PathPattern class:
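A sketch of such a class:

class PathPattern
  def initialize(pattern)
    @pattern = pattern
  end

  def ===(path)
    File.fnmatch(@pattern, path)
  end
end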

An example instance is PathPattern.new("/bin/*"), which matches anything directly under the /bin directory, such as /bin/ruby, but not /var/log.

The implementation of PathPattern#=== uses Ruby’s built-in File.fnmatch to check whether the pattern string matches. Here is an example of it in use:
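For example:

pattern = PathPattern.new("/bin/*")

pattern === "/bin/ruby" # => true
pattern === "/var/log"  # => false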

Worth noting is that File.fnmatch calls #to_str on its arguments. This way, #=== automatically works on other string-like objects as well, such as Path instances:

The PathPattern class implements #===, and therefore PathPattern instances work with case/when, too:

Ordered Comparison

For some objects, it’s useful not only to check whether two objects are the same, but how they are ordered. Are they larger? Smaller? Consider this Score class, which models the scoring system of my university in Ghent, Belgium.

(I was a terrible student. I’m not sure if this was really how the scoring even worked — but as an example, it will do just fine.)
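
A hedged sketch of such a class; the exact attributes and passing threshold from the original listing aren’t reproduced here, so a 0 to 20 scale with a passing mark of 10 is an assumption:

    class Score
      attr_reader :value

      def initialize(value)
        @value = value
      end

      def passing?
        value >= 10
      end
    end

    scores = [Score.new(6), Score.new(12), Score.new(17)]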

In any case, we benefit from having such a Score class. We can encode relevant logic there, such as determining the grade and checking whether or not a score is passing. For example, it might be useful to get the lowest and highest score out of a list:

However, as it stands right now, the expressions scores.min and scores.max will result in an error: comparison of Score with Score failed (ArgumentError). We haven’t told Ruby how to compare two Score objects. We can do so by implementing Score#<=>:
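
Following the description below (bail out unless the other object is a Score, otherwise compare the underlying values):

    class Score
      def <=>(other)
        return nil unless other.is_a?(Score)

        value <=> other.value
      end
    end

    Score.new(6) <=> Score.new(12) # => -1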

An implementation of #<=> returns four possible values:

  • It returns 0 when the two objects are equal.
  • It returns -1 when self is less than other.
  • It returns 1 when self is greater than other.
  • It returns nil when the two objects cannot be compared.

The #<=> and #== operators are connected:

  • Property: Given objects a and b. If (a <=> b) == 0, then a == b.
  • Property: Given objects a and b. If (a <=> b) != 0, then a != b.

As before, it’s up to you to ensure that these properties hold when implementing #== and #<=>. Ruby won’t check this for you.

For simplicity, I’ve left out the implementation of Score#== in the Score example above. It’d certainly be good to have that, though.

In the case of Score#<=>, we bail out if other is not a score, and otherwise, we call #<=> on the two values. We can check that this works: the expression Score.new(6) <=> Score.new(12) evaluates to -1, which is correct because a score of 6 is lower than a score of 12. (Did you know that the Belgian high school system used to have a scoring system where 1 was the highest and 10 was the lowest? Imagine the confusion!)

With Score#<=> in place, scores.max now returns the maximum score. Other methods such as #min, #minmax, and #sort work as well.

However, we can’t yet use operators like <. The expression scores[0] < scores[1], for example, will raise an undefined method error: undefined method `<' for #<Score:0x00112233 @value=6>. We can solve that by including the Comparable mixin:
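
For example:

    class Score
      include Comparable

      # #<=> as defined earlier
    end

    scores[0] < scores[1] # => true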

By including Comparable, the Score class automatically gains the <, <=, >, and >= operators, which all call <=> internally. The expression scores[0] < scores[1] now evaluates to a boolean, as expected.

The Comparable mixin also provides other useful methods such as #between? and #clamp.

Wrapping Up

We talked about the following topics:

  • the #== operator, used for basic equality, with optional type coercion
  • #equal?, which checks whether two objects are the same instance
  • #eql? and #hash, which are used for testing whether an object is a key in a hash
  • #===, which isn’t quite an equality operator, but rather a “is kind of” or “is member of” operator
  • #<=> for ordered comparison, along with the Comparable module, which provides operators such as < and >=

You now know all you need to know about implementing equality in Ruby. For more information check out the following resources:

The Ruby documentation is a good place to find out more about equality:

I also found the following resources useful:

Denis is a Senior Software Engineer at Shopify. He has made it a habit of thanking ATMs when they give him money, thereby singlehandedly staving off the inevitable robot uprising.


If building systems from the ground up to solve real-world problems interests you, our Engineering blog has stories about other challenges we have encountered. Visit our Engineering career page to find out about our open positions. Join our remote team and work (almost) anywhere. Learn about how we’re hiring to design the future together—a future that is digital by default.

Continue reading

Shopify Invests in Research for Ruby at Scale

Shopify Invests in Research for Ruby at Scale

Shopify is continuing to invest in Ruby on Rails at scale. We’ve taken that further recently by funding high-profile academics to focus their work towards Ruby and the needs of the Ruby community. Over the past year we have given nearly half a million dollars in gifts to influential researchers whom we trust to make a significant impact on the Ruby community for the long term.

Shopify engineers and researchers at a recent meetup in London

We want developments in programming languages and their implementations to be explored in Ruby, so that support for Ruby's unique properties are built in from the start. For example, Ruby's prevalent metaprogramming motivated a whole new kind of inline caching to be developed and presented as a paper at one of the top programming language conferences, and Ruby's unusually loose C extension API motivated a new kind of C interpreter to run virtualized C. These innovations wouldn't have happened if academics weren't looking at Ruby.

We want programming language research to be evaluated against the workloads that matter to companies using Ruby. We want researchers to understand the scale of our code bases, how frequently they're deployed, and the code patterns we use in them. For example, a lot of VM research over the last couple of decades has traded off a long warmup optimization period for better peak performance, but this doesn't work for companies like Shopify where we're redeploying very frequently. Researchers aren't aware of these kinds of problems unless we partner with them and guide them.

We think that working with academics like this will be self-perpetuating. With key researchers thinking and talking about Ruby, more early career researchers will consider working with Ruby and solving problems that are important to the Ruby community.

Let’s meet Shopify’s new research collaborators.

Professor Laurence Tratt

Professor Laurence Tratt describes his vision for optimizing Ruby

Professor Laurence Tratt is the Shopify and Royal Academy of Engineering Research Chair in Language Engineering at King’s College London. Jointly funded by Shopify, the Royal Academy, and King’s College, Laurie is looking at the possibility of automatically generating a just-in-time compiler from the existing Ruby interpreter through hardware meta-tracing and basic-block stitching.

Laurie has an eclectic and influential research portfolio, and extensive writing on many aspects of improving dynamic languages and programming. He has context from the Python community and the groundbreaking work towards meta-tracing in the PyPy project. Laurie also works to build the programming language implementation community for the long term by co-organising a summer school series for early career researchers, bringing them together with experienced researchers from academia and industry.

Professor Steve Blackburn

Professor Steve Blackburn is building a new model for applied garbage collection

Professor Steve Blackburn is an academic at the Australian National University and Google Research. Shopify funded his group’s work on MMTk, the memory management toolkit, a general library for garbage collection that brings together proven garbage collection algorithms with a framework for research into new ideas for garbage collection. We’re putting MMTk into Ruby so that Ruby can get the best current collectors today and future garbage collectors can be tested against Ruby.

Steve is a world-leading expert in garbage collection, and Shopify’s funding is putting Ruby’s unique requirements for memory management into his focus.

Dr Stefan Marr

Dr Stefan Marr is an expert in benchmarking dynamic language implementations

Dr Stefan Marr is a Senior Lecturer at the University of Kent in the UK and a Royal Society Industrial Fellow. With the support of Shopify, he’s examining how we can make interpreters faster and improve interpreter startup and warmup time.

Stefan has a distinguished reputation for benchmarking techniques, differential analysis between languages and implementation techniques, and dynamic language implementation. He co-invented a new method for inline caching that has been instrumental for improving the performance of Ruby’s metaprogramming in TruffleRuby.

Shopify engineers and research collaborators discuss how to work together to improve Ruby

We’ve been bringing together the researchers that we’re funding with our senior Ruby community engineers to share their knowledge of what’s already possible and what could be possible, combining our understanding of how Ruby and Rails are used at scale today and what the community needs.

These external researchers are all in addition to our own internal teams doing publishable research-level work on Ruby, with YJIT and TruffleRuby, and more efforts.

Part of Shopify’s Ruby and Rails Infrastructure Team listening to research proposals

We’ll be looking forward to sharing more about our investments in Ruby research over the coming years in blog posts and academic papers.

Chris Seaton has a PhD in optimizing Ruby and works on TruffleRuby, a highly optimizing implementation of Ruby, and research projects at Shopify.


Wherever you are, your next journey starts here! If building systems from the ground up to solve real-world problems interests you, our Engineering blog has stories about other challenges we have encountered. Intrigued? Visit our Engineering career page to find out about our open positions and learn about Digital by Design.

Continue reading

Maestro: The Orchestration Language Powering Shopify Flow

Maestro: The Orchestration Language Powering Shopify Flow

Adagio misterioso

Shopify recently unveiled a new version of Shopify Flow. Merchants extensively use Flow’s workflow language and associated execution engine to customize Shopify, automate tedious, repetitive tasks, and focus on what matters. Flow comes with a comprehensive library of templates for common use cases, and detailed documentation to guide merchants in customizing their workflows.

For the past couple of years my team has been working on transitioning Flow from a successful Shopify Plus App into a platform designed to power the increasing automation and customization needs across Shopify. One of the main technical challenges we had to address was the excessive coupling between the Flow editor and engine. Since they shared the same data structures, the editor and engine couldn't evolve independently, and we had limited ability to tailor these data structures for their particular needs. This problem was significant because editor and engine have fundamentally very different requirements.

The Flow editor provides a merchant-facing visual workflow language. Its language must be declarative, capturing the merchant’s intent without dealing with how to execute that intent. The editor concerns itself mainly with usability, understandability, and interactive editing of workflows. The Flow engine, in turn, needs to efficiently execute workflows at scale in a fault-tolerant manner. Its language can be more imperative, but it must have good support for optimizations and have at-least-once execution semantics that ensures workflow executions recover from crashes. However, editor and engine also need to play together nicely. For example, they need to agree on the type system, which is used to find user errors and to support IDE-like features, such as code completion and inline error reporting within the visual experience.

We realized it was important to tackle this problem right away, and it was crucial to get it right while minimizing disruptions to merchants. We proceeded incrementally.

First, we designed and implemented a new domain-specific orchestration language that addressed the requirements of the Flow engine. We call this language Maestro. We then implemented a new, horizontally scalable engine to execute Maestro orchestrations. Next, we created a translation layer from original Flow workflow data structures into Maestro orchestrations. This allowed us to execute existing Flow workflows with the new engine. At last, we slowly migrated all Flow workflows to use the new engine, and by BFCM 2020 essentially all workflows were executing in the new engine.

We were then finally in a position to deal with the visual language. So we implemented a brand new visual experience, including a new language for the Flow editor. This language is more flexible and expressive than the original, so any of the existing workflows could be easily migrated. The language also can be translated into Maestro orchestrations, so it could be directly executed by the new engine. Finally, once we were satisfied with the new experience, we started migrating existing Flow workflows, and by early 2022, all Flow workflows had been migrated to use the new editor and new engine.

In the remainder of this post I want to focus on the new orchestration language, Maestro. I’ll give you an overview of its design and implementation, and then focus on how it neatly integrates with and addresses the requirements of the new version of Shopify Flow.

 

A Sample of Maestro

Allegro grazioso

Let’s take a quick tour to get a taste of what Maestro looks like and what exactly it does. Maestro isn’t a general purpose programming language, but rather an orchestration language focused solely on coordinating the sequence in which calls to functions on some host language are made, while capturing which data is passed between those function calls. For example, suppose you want to implement code that calls a remote service to fetch some relevant customers and then deletes those customers from the database. The Maestro language can’t implement the remote service call nor the database call themselves, but it can orchestrate those calls in a fault-tolerant fashion. The main benefit of using Maestro is that the state of the execution is precisely captured and can be made durable, so you can observe the progression and restart where you left in the presence of crashes.

The following Maestro code, slightly simplified for presentation, implements an orchestration similar to the example above. It first defines the shape of the data involved in the orchestration: an object type called Customer with a few attributes. It then defines three functions. Function fetch_customers takes no parameters and returns an array of Customers. Its implementation simply performs a GET HTTP request to the appropriate service. The delete_customer function, in this example, simulates the database deletion by calling the print function from the standard library. The orchestration function represents the main entry point. It uses the sequence expression to coordinate the function calls: first call fetch_customers, binding the result to the customers variable, then map over the customers calling delete_customer on each.
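
The Maestro listing itself isn’t reproduced in this excerpt. As a rough stand-in, here is the same orchestration written as plain Ruby rather than Maestro syntax (the endpoint and names are illustrative):

    require "net/http"
    require "json"

    def fetch_customers
      response = Net::HTTP.get(URI("https://example.com/customers_to_delete"))
      JSON.parse(response) # an array of customer hashes
    end

    def delete_customer(customer)
      puts "deleting customer #{customer["id"]}" # simulated database deletion
    end

    def orchestration
      customers = fetch_customers
      customers.map { |customer| delete_customer(customer) }
    end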

Maestro functions declare interfaces to encapsulate expressions: the bodies of fetch_customers and delete_customer are call expressions, and the body of orchestration is a sequence expression that composes other expressions. But at some point we must yield to the host language to implement the actual service request, database call, print, and so on. This is accomplished by a function whose body is a primitive expression, meaning it binds to the host language code registered under the declared key. For example, these are the declarations of the get and print functions from the standard library of our Ruby implementation:

We now can use the Maestro interpreter to execute the orchestration function. This is one possible simplified output from the command line:

The output contains the result of calling print twice, once for each of the customers returned by the fetch service. The interesting aspect here is that the -c flag instructed the interpreter to also dump checkpoints to the standard output.

Checkpoints are what Maestro uses to store execution state. They contain enough information to understand what has already happened in the orchestration and what wasn’t completed yet. For example, the first checkpoint contains the result of the service request that includes a JSON object with the information about customers to delete. In practice, checkpoints are sent to durable storage, such as Kafka, Redis, or MySQL. Then, if the interpreter stops for some reason, we can restart and point it to the existing checkpoints. The interpreter can recover by skipping expressions for which a checkpoint already exists. If we crash while deleting customers from the database, for example, we wouldn’t re-execute the fetch request because we already have its result.
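
A conceptual sketch, not Maestro’s actual implementation, of how checkpoints let an interpreter skip steps that already completed before a crash:

    def run_step(name, checkpoints)
      return checkpoints[name] if checkpoints.key?(name) # already done: reuse the stored result

      result = yield               # execute the primitive call
      checkpoints[name] = result   # in practice, persisted to Kafka, Redis, or MySQL
      result
    end

    # Stand-ins for the primitives from the example above:
    def fetch_customers
      [{ "id" => 1 }, { "id" => 2 }]
    end

    def delete_customer(customer)
      puts "deleting customer #{customer["id"]}"
    end

    checkpoints = {} # on restart, reloaded from durable storage instead

    customers = run_step("fetch_customers", checkpoints) { fetch_customers }
    customers.each_with_index do |customer, i|
      run_step("delete_customer/#{i}", checkpoints) { delete_customer(customer) }
    end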

The checkpoints mechanism allows Maestro to provide at-least-once semantics for primitive calls, exactly what’s expected of Shopify Flow workflows. In fact, the new Flow engine, at a high level, is essentially a horizontally scalable, distributed pool of workers that execute the Maestro interpreter on incoming events for orchestrations generated by Flow. Checkpoints are used for fault tolerance as well as to give merchants feedback on each execution step, current status, and so on.

Flow and Maestro Ensemble

Presto festoso

Now that we know what Maestro is capable of, let’s see how it plays together with Flow. The following workflow, for example, shows a typical Flow automation use case. It triggers when orders in a store are created and checks for certain conditions in that order, based on the presence of discount codes or the customer’s email. If the condition predicate matches successfully, it adds a tag to the order and subsequently sends an email to the store owner to alert of the discount.

Screenshot of the Flow app showing the visualization of creating a workflow based on conditions
A typical Flow automation use case

Consider a merchant using the Flow App to create and execute this workflow. There are four main activities involved:

  1. navigating the possible tasks and types to use in the workflow
  2. validating that the workflow is correct
  3. activating the workflow so it starts executing on events
  4. monitoring executions.

Catalog of Tasks and Types

The Flow Editor displays a catalog of tasks for merchants to pick from. Those are triggers, conditions, and actions provided both by Shopify and Shopify Apps via Shopify Flow Connectors. Furthermore, Flow allows merchants to navigate Shopify’s GraphQL Admin API objects in order to select relevant data for the workflow. For example, the Order created trigger in this workflow conceptually brings an Order resource that represents the order that was just created. So, when the merchant is defining a condition or passing arguments to actions, Flow assists in navigating the attributes reachable from that Order object. To do so, Flow must have a model of the GraphQL API types and understand the interface expected and provided by tasks. Flow achieves this by building on top of Maestro types and functions, respectively.

Flow models types as decorated Maestro types: the structure is defined by Maestro types, but Flow adds information, such as field and type descriptions. Most types involved in workflows come from APIs, such as the Shopify GraphQL Admin API. Hence, Flow has an automated process to consume APIs and generate the corresponding Maestro types. Additional types can be defined, for example, to model data included in the events that correspond to triggers, and model the expected interface of actions. For instance, the following types are simplified versions of the event data and Shopify objects involved in the example:

Flow then uses Maestro functions and calls to model the behavior of triggers, conditions, and actions. The following Maestro code shows function definitions for the trigger and actions involved in the workflow above.

Actions are mapped directly to Maestro functions that define the expected parameters and return types. An action used in a workflow is a call to the corresponding function. A trigger, however, is mapped to a data hydration function that takes event data, which often includes only references by IDs, and loads additional data required by the workflow. For example, the order_created function takes an OrderCreatedTrigger, which contains the order ID as an Integer, and performs API requests to load an Order object, which contains additional fields like name and discountCode. Finally, conditions are currently a special case in that they’re translated to a sequence of function calls based on the predicate defined for the condition (more on that in the next section).

Workflow Validation

Once a workflow is created, it needs validation. For that, Flow composes a Maestro function representing the whole workflow. The parameter of the workflow function is the trigger data since it’s the input for its execution. The body of the function corresponds to the transitions and configurations of tasks in the workflow. For example, the following function corresponds to the example:

The first call in the sequence corresponds to the trigger function that’s used to hydrate objects from the event data. The next three steps correspond to the logical expression configured for the condition. Each disjunction branch becomes a function call (to eq and ends_with, respectively), and the result is computed with or. A Maestro match expression is used to pattern match on the result. If it’s true, the control flow goes to the sequence expression that calls the functions corresponding to the workflow actions.

Flow now can rely on Maestro static analysis to validate the workflow function. Maestro will type check, verify that every referred variable is in scope, verify that object navigation is correct (for example, that order.customer.email is valid), and so on. Then, any error found through static analysis is mapped back to the corresponding workflow node and presented in context in the Editor. In addition to returning errors, static analysis results contain symbol tables for each expression indicating which variables are in scope and what their types are. This supports the Editor in providing code completion and other suggestions that are specific for each workflow step. The following screenshot, for example, shows how the Editor can guide users in navigating the fields present in objects available when selecting the Add order tags action.

Flow App Editor screenshot showing how the Editor can guide users in navigating the fields present in objects available when selecting the Add order tags action
Shopify Flow App editor

Note that transformation and validation run while a Flow workflow is being edited, either in the Flow Editor or via APIs. This operation is synchronous and, thus, must be very fast since merchants are waiting for the results. This architecture is similar to how modern IDEs send source code to a language service that parses the code into a lower level representation and returns potential errors and additional static analysis results.

Workflow Activation

Once a workflow is ready, it needs to be activated to start executing. The process is initially similar to validation in that Flow generates the corresponding Maestro function. However, there are a few additional steps. First, Maestro performs a static usage analysis: for each call to a primitive function it computes which attributes of the returned type are used by subsequent steps. For example, the call to shopify::admin::order_created returns a tuple (Shop, Order), but not all attributes of those types are used. In particular, order.customer.name isn’t used by this workflow. Not only would it be inefficient to hydrate that value; in the presence of recursive definitions (such as an Order having a Customer who has Orders), it would be impossible to determine where to stop digging into the type graph. The result of usage analysis is then passed at runtime to the host function implementation. The runtime can use it to tailor how it computes the values it returns, for instance, by optimizing the queries to the Admin GraphQL API.

Second, Maestro performs a compilation step. The idea is to apply optimizations, removing anything unnecessary for the runtime execution of the function, such as type definitions and auxiliary functions that aren’t invoked by the workflow function. The result is a simplified, small, and efficient Maestro function. The compiled function is then packaged together with the result of usage analysis and becomes an orchestration. Finally, the orchestration is serialized and deployed to the Flow Engine that observes events and runs the Maestro interpreter on the orchestration.

Monitoring Executions

As the Flow Engine executes orchestrations, the Maestro interpreter emits checkpoints. As we discussed before, checkpoints are used by the engine when restarting the interpreter to ensure at-least-once semantics for actions. Additionally, checkpoints are sent back to Flow to feed the Activity page, which lists workflow executions. Since checkpoints have detailed information about the output of every primitive function call, they can be used to map back to the originating workflow step and offer insight into the behavior of executions.

A screenshot of the Flow run log from the Activity Page. It displays the results of the workflow execution at each step
Shopify Flow Run Log from the Activity page

For instance, the image above shows a Run Log for a specific execution of the example, which can be accessed from the Activity page. Note that Flow highlights the branch of the workflow that executed and which branch of the condition disjunction actually evaluated to true at runtime. All this information comes directly from interpreting checkpoints and mapping back to the workflow.

Outro: Future Work

Largo maestoso

In this post I introduced Maestro, a domain-specific orchestration language we developed to power Shopify Flow. I gave a sample of what Maestro looks like and how it neatly integrates with Flow, supporting features of both the Flow Editor as well as the Flow Engine. Maestro has been powering Flow for a while, but we are planning more, such as:

  • Improving the expressiveness of the Flow workflow language, making better use of all the capabilities Maestro offers. For example, allowing the definition of variables to bind the result of actions for subsequent use, support for iteration, pattern matching, and so on.
  • Implementing additional optimizations on deployment, such as merging Flow workflows as a single orchestration to avoid redundant hydration calls for the same event.
  • Using the Maestro interpreter to support previewing and testing of Flow workflows, employing checkpoints to show results and verify assertions.

If you are interested in working with Flow and Maestro, or in building systems from the ground up to solve real-world problems, visit our Engineering career page to find out about our open positions. Join our remote team and work (almost) anywhere. Learn about how we’re hiring to design the future together—a future that is digital by design.

Continue reading

Building a Business System Integration and Automation Platform at Shopify

Building a Business System Integration and Automation Platform at Shopify

Companies organize and automate their internal processes with a multitude of business systems. Since companies function as a whole, these systems need to be able to talk to one another. At Shopify, we took advantage of Ruby, Rails, and our scale with these technologies to build a business system integration solution.

The Modularization of Business Systems

In step with software design’s progression from monolithic to modular architecture, business systems have proliferated over the past 20 years, becoming smaller and more focused. Software hasn’t only targeted the different business domains like sales, marketing, support, finance, legal, and human resources, but the niches within or across these domains, like tax, travel, training, documentation, procurement, and shipment tracking. Targeted applications can provide the best experience by enabling rapid development within a small, well defined space.

The Gap

The transition from monolithic to modular architecture doesn’t remove the need for interaction between modules. Maintaining well-defined, versioned interfaces and integrating with other modules is one of the biggest costs of modularization. In the business systems space, however, it doesn’t always make sense for vendors to take responsibility for integration, or do it in the same way.

Business systems are built on different tech stacks with different levels of competition and different customer requirements. This landscape leads to business systems with asymmetric interfaces (from SOAP to FTP to GraphQL) and integration capabilities (from complete integration platforms to nothing). Businesses are left with a gap between their systems and no clear, easy way to fill it.

Organic Integration

Connecting these systems on an as needed basis leads to a hacky hodgepodge of:

  • ad hoc code (often running on individuals’ laptops)
  • integration platforms like Zapier
  • users downloading and uploading CSVs
  • third party integration add ons from app stores
  • out of the box integrations
  • custom integrations built on capable business systems.

Frequently, data won’t go from the source system directly to the target system, but will have multiple layovers in whatever systems it could integrate with. The only determining factors are the skillsets and creativity of the people involved in building the integration.

When a company is small this can work, but as companies scale and the number of integrations grows, it becomes more difficult to manage. This is also true with monolithic architecture.

Integration Platform as a Service

One solution, as validated by the existence of numerous Integration Platform as a Service (IPaaS) solutions like Mulesoft, Dell Boomi, and Zapier, is yet another piece of software that’s responsible for integrating business systems. The consistency provided by using one application for all integration can help.

Mulesoft

At Shopify, we created a small team of business system integration developers and put them to work building on Mulesoft as a potential solution, though we identified that it had some limitations.

Isolation from Shopify Development

Shopify employs thousands of developers. We have infrastructure, training, and security teams. We maintain a multitude of packages and have tons of Slack channels for getting help, discussing ideas, and learning about best practices. Shopify is intentionally narrow in the technologies it uses (Ruby, React, and Go) to benefit from this scale.

Mulesoft is a proprietary platform leveraging XML configuration for the Java virtual machine. This isn’t part of Shopify’s tech stack, so we missed out on many of the advantages of developing at Shopify.

Integrating Internal Applications

Mulesoft’s cloud runtime takes care of infrastructure for its users, a huge advantage of using the platform. However, Shopify has a number of internal services, like shipment tracking, as well as infrastructure, like Kafka, that for security reasons are used within Shopify’s cloud. This meant that we would need to build infrastructure skills on our team to host Mulesoft on our own cloud.

Although using Mulesoft initially seemed to lower the costs of connecting business systems, due to our unique situation, it had more drawbacks than developing on Shopify’s tech stack.

Building on Shopify’s Stack

Unless performance is paramount, in which case we use Go, Ruby is Shopify’s choice for backend development. Generally Shopify uses the Rails framework, so if we’re going to start building business system integrations on Shopify’s tech stack, Ruby on Rails is our choice. The logic for choosing Ruby on Rails within the context of development at Shopify is straightforward, but how do we use it for business system integration?

The Design Priorities

When the platform is complete, we want to build reliable integrations quickly. To turn that idea into a design, we need to look at the technical aspects of business system integration that differentiate it from the standard application development Rails is designed around.

Minimal

Generally, applications are structured around a domain and get to determine their requirements: the data they will and won’t accept. An integration, however, isn’t the source of truth for anything. Any validation we introduce in an integration will be, at best, a duplication of logic in the target application. At worst, our logic will produce spurious errors.

I did this the other day with a Sorbet Struct. I was using it to organize data before posting it. Unfortunately a field was required in the struct that wasn’t required in the target system. This resulted in records failing in transit when the target system would have accepted them.
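
A hypothetical reconstruction of that mistake; the names are made up, but the behaviour shown is standard sorbet-runtime:

    require "sorbet-runtime"

    class VendorRecord < T::Struct
      const :name, String
      const :email, String # required here, but optional in the target system
    end

    # The target API would accept a record without an email, but the request
    # is never made: the struct raises first.
    VendorRecord.new(name: "Jane Doe")
    # raises ArgumentError (missing required prop :email)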

Transparent

Many business systems are highly configurable. Changes in their configuration can lead to changes in their APIs, affecting integrations.

Airtable, for example, uses the column names as the JSON keys in their API, so changing a column name in the user interface can break an integration. We need to provide visibility into exactly what integrations are doing to help system admins avoid creating errors and quickly resolve them when they arise.

Flexible

Business systems are diverse, created at different times by different developers using different technologies and design patterns. For integration work this⁠—most importantly⁠—leads to a wide variety of interfaces like FTP, REST, SOAP, JSON, XML, and GraphQL. If we want a centralized, standardized place to build integrations it needs to support whatever requirements are thrown at it.

Secure

Integrations may process sensitive information, such as personally identifiable information (PII) or compensation information. We need to make sure that integrations apply appropriate security.

Reusable

Small, point to point integrations are the most reliable and maintainable. This design has the potential to create a lot of duplicate code and infrastructure. If we want to build integrations quickly we need to reuse as much as possible.

Implementation

Those are some nice high-level design priorities. How did we implement them?

Documentation

From the beginning of the project, documentation has been a priority. We document

  • decisions that we’re making, so they’re understood and challenged in the future as needs change
  • the integrations living on our platform
  • the clients we’ve implemented for connecting to different systems and how to use them
  • how to build on the platform as a whole.

Initially we were using GitHub’s built-in wiki, but being able to version control our documentation and commit updates alongside the code made it easier to trace changes and ensure documentation was being kept up to date. Fortunately Shopify’s infrastructure makes it very easy to add a static site to a git repository.

Design priorities covered: transparency, reusability

Language Features

Ruby is a mature, feature-rich language. Beyond being Turing complete, over the years it’s added a plethora of features to make programming simpler and more concise. It also has an extensive package ecosystem thanks to Ruby’s wide usage, long life, and generous community. In addition to reusing our own code, we’re able to leverage other developers’ and organizations’ code. Many business systems have great, well-maintained gems, so integrating with them is as simple as adding the gem and credentials.

Design priorities covered: reusability

Rails Engines

We reused Shopify Core’s architecture, designing our application as a modular monolith made up of Rails Engines. Initially the application didn’t take advantage of Rails Engines and simply used namespaces within the app directory. It quickly became apparent that this model made tracking down an individual integration’s code difficult: you had to go through every one of the app subdirectories (controllers, helpers, and more) to see whether an integration’s namespace was present.

After a lot of research and a few conversations with my Shopify engineering mentor, I began to understand Rails Engines. Rails engines are a great fit for our platform because integrations have relatively obvious boundaries, so it’s easy and advantageous to modularize them.

This design enabled us to reuse the same infrastructure for all our integrations. It also enabled us to share code across integrations by creating a common Rails Engine, without the overhead of packaging it up into rubygems or duplicating it. This reduces both development and maintenance costs.

In addition, this architecture benefitted transparency by keeping all of the code in one place and modularizing it. It’s easy to know what integrations exist and what code belongs to them.

Design priorities covered: reusability, transparent

Eliminating Data Storage

To reduce complexity and promote transparency and security, our business system integration platform won’t be the source of truth for any business data. The business data comes from other business systems and passes through our application.

Design priorities covered: transparent, minimal, secure

Actions

Business system integration consists almost entirely of business logic. In Rails, there are multiple places this could live, but they generally involve abstractions designed around building standalone applications, not integrations. Using one of these abstractions would add complexity and obfuscate the logic.

Actions were floating around Shopify as a potential home for business logic. They have the same structure as Active Jobs: one public method, perform, and no references to other Actions. The Action concept provides consistency, making all integration logic easy to find. It also provides transparency by putting all business logic in one place, so it’s only necessary to look at one Action to understand a data flow.
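
A hypothetical Action following that convention (class and method names are illustrative, not Shopify internals):

    module VendorSync
      class SyncEmployeeRecords
        def perform
          fetch_employees.each { |employee| upsert_in_target(employee) }
        end

        private

        def fetch_employees
          # e.g. a GET request against the source system's API
          []
        end

        def upsert_in_target(employee)
          # e.g. a POST/PATCH against the target system's API
        end
      end
    end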

One of the side effects of Actions is code duplication. This was a trade-off we accepted. Given that integrations should be acting independently, we would prefer to duplicate some code than tightly couple integrations.

Design priorities covered: transparent, minimal

Embracing Hashes

Dataflows are the purpose of our application. In every integration we are dealing with at least two API abstractions of complex systems. Introducing our own abstractions on top of these abstractions can quickly compound complexity. If we want the application to be transparent, it needs to be obvious what data is flowing through it and how the data is being modified.

Most of the data we’re working with is JSON. In Ruby, JSON is represented as a hash, so working with hashes directly often provides the best transparency with the least room for introducing errors.

I know, I know. We all hate to see strings in our code, but hear me out. You receive a JSON payload. You need to transform it and send out another JSON payload with different keys. You could map the original payload to an object, map that object to another object, and map the final object back to JSON. If you want to track that transformation, though, you need to track it through three transformations. On the other hand, you could use a hash and a transform function and have the mapping clearly displayed.
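
A minimal sketch of the hash-and-transform approach, with made-up field names:

    def to_target_payload(source)
      {
        "employeeId"   => source["id"],
        "fullName"     => "#{source["first_name"]} #{source["last_name"]}",
        "departmentId" => source.dig("department", "id"),
      }
    end

    source = {
      "id"         => 42,
      "first_name" => "Ada",
      "last_name"  => "Lovelace",
      "department" => { "id" => 7 },
    }

    to_target_payload(source)
    # => {"employeeId"=>42, "fullName"=>"Ada Lovelace", "departmentId"=>7}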

Using hashes leads to more transparency than abstracting them away, but it also can lead to typos and therefore errors, so it’s important to be careful. If you’re using a string key multiple times, turn it into a constant.

Design priorities covered: transparent, minimal

Low-level Mocking

At Shopify, we generally use Mocha for mocking, but for our use case we default to WebMock. WebMock mocks at the request level, so you see the URL, including query parameters, headers, and request body explicitly in tests. This makes it easy to work directly with business systems API documentation because this is the level it’s documented at, and it allows us to understand exactly what our integrations are doing.
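
A minimal sketch of request-level stubbing with WebMock; the endpoint and payload are hypothetical:

    require "minitest/autorun"
    require "webmock/minitest"
    require "net/http"
    require "json"

    class EmployeeSyncTest < Minitest::Test
      def test_posts_the_employee_to_the_target_system
        stub = stub_request(:post, "https://target.example.com/api/employees")
                 .with(
                   headers: { "Content-Type" => "application/json" },
                   body: { "employeeId" => 42 }.to_json
                 )
                 .to_return(status: 201)

        # The code under test would normally live in an Action; inlined here.
        Net::HTTP.post(
          URI("https://target.example.com/api/employees"),
          { "employeeId" => 42 }.to_json,
          "Content-Type" => "application/json"
        )

        assert_requested(stub)
      end
    end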

There are some cases, though, where we use Mocha, for example with SOAP. Reading a giant XML text string doesn’t provide useful visibility into what data is being sent. WebMock tests also become complex when many requests are involved in the integration. We’re working on improving the testing experience for complex integrations with common factories and prebuilt WebMocks.

Design priorities covered: transparent

Shopify

Perhaps most importantly, we’ve been able to tap into development at Shopify by leveraging our:

  • infrastructure, so all we have to do to stand up an application or add a component is run dev runtime
  • training team to help onboard our developers
  • developer pipeline for hiring
  • observability through established logging, metrics and tracing setups
  • internal shipment tracking service
  • security team standards and best practices

The list could go on forever.

Design priorities covered: reusability, security

It’s been a year since work on our Rails integration platform began. Now, we have 18 integrations running, have migrated all our Mulesoft apps to the new platform, have doubled the number of developers from one to two, and have other teams building integrations on the platform. The current setup enables us to build simple integrations, the majority of our use case, quickly and securely with minimal maintenance. We’re continuing to work on ways to minimize and simplify the development process, while supporting increased complexity, without harming transparency. We’re currently focused on improving test mock management and the onboarding process and, of course, building new integrations.

Will is a Senior Developer on the Solutions Engineering Team. He likes building systems that free people to focus on creative, iterative, connective work by taking advantage of computers' scalability and consistency.


Wherever you are, your next journey starts here! If building systems from the ground up to solve real-world problems interests you, our Engineering blog has stories about other challenges we have encountered. Intrigued? Visit our Engineering career page to find out about our open positions and learn about Digital by Design.

Continue reading

Code Ranges: A Deeper Look at Ruby Strings

Code Ranges: A Deeper Look at Ruby Strings

Contributing to any of the Ruby implementations can be a daunting task. A lot of internal functionality has evolved over the years or been ported from one implementation to another, and much of it is undocumented. This post is an informal look at what makes encoding-aware strings in Ruby functional and performant. I hope it'll help you get started digging into Ruby on your own or provide some additional insight into all the wonderful things the Ruby VM does for you.

Ruby has an incredibly flexible, if not unusual, string representation. Ruby strings are generally mutable, although the core library has both immutable and mutable variants of many operations. There’s also a mechanism for freezing strings that makes String objects immutable on a per-object or per-file basis. If a string literal is frozen, the VM will use an interned version of the string. Additionally, strings in Ruby are encoding-aware, and Ruby ships with 100+ encodings that can be applied to any string, which is in sharp contrast to other languages that use one universal encoding for all their strings or prevent the construction of invalid strings.

Depending on the context, different encodings are applied when creating a string without an explicit encoding. By default, the three primary ones used are UTF-8, US-ASCII, and ASCII-8BIT (aliased as BINARY). The encoding associated with a string can be changed with or without validation. It is possible to create a string with an underlying byte sequence that is invalid in the associated encoding.
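
A quick illustration using the standard String API:

    s = "héllo"
    s.encoding                   # => #<Encoding:UTF-8> (the default for literals)

    s.force_encoding("US-ASCII") # re-label the bytes without validation
    s.valid_encoding?            # => false: the bytes for "é" aren't valid US-ASCII

    begin
      "héllo".encode("US-ASCII") # converting with validation raises instead
    rescue Encoding::UndefinedConversionError => e
      e.class # => Encoding::UndefinedConversionError
    end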

The Ruby approach to strings allows the language to adapt to many legacy applications and esoteric platforms. The cost of this flexibility is the runtime overhead necessary to consider encodings in nearly all string operations. When two strings are appended, their encodings must be checked to see if they're compatible. For some operations, it's critical to know whether the string has valid data for its attached encoding. For other operations, it's necessary to know where the character or grapheme boundaries are.

Depending on the encoding, some operations are more efficient than others. If a string contains only valid ASCII characters, each character is one byte wide. Knowing each character is only a byte allows operations like String#[], String#chr, and String#downcase to be very efficient. Some encodings are fixed width—each "character" is exactly N bytes wide. (The term "character" is vague when it comes to Unicode. Ruby strings (as of Ruby 3.1) have methods to iterate over bytes, characters, code points, and grapheme clusters. Rather than get bogged down in the minutiae of each, I'll focus on the output from String#each_char and use the term "character" throughout.) Many operations with fixed-width encodings can be efficiently implemented as character offsets are trivial to calculate. In UTF-8, the default internal string encoding in Ruby (and many other languages), characters are variable width, requiring 1 - 4 bytes each. That generally complicates operations because it's not possible to determine character offsets or even the total number of characters in the string without scanning all of the bytes in the string. However, UTF-8 is backwards-compatible with ASCII. If a UTF-8 string consists of only ASCII characters, each character will be one byte wide, and if the runtime knows that it can optimize operations on such strings the same as if the string had the simpler ASCII encoding.

Code Ranges

In general, the only way to tell if a string consists of valid characters for its associated encoding is to do a full scan of all the bytes. This is an O(n) process, and while not the least efficient operation in the world, it is something we want to avoid. Languages that don't allow invalid strings only need to do the validation once, at string creation time. Languages that ahead-of-time (AOT) compile can validate string literals during compilation. Languages that only have immutable strings can guarantee that once a string is validated, it can never become invalid. Ruby has none of those properties, so its solution to reducing unnecessary string scans is to cache information about each string in a field known as a code range.

There are four code range values:

  • ENC_CODERANGE_UNKNOWN
  • ENC_CODERANGE_7BIT
  • ENC_CODERANGE_VALID
  • ENC_CODERANGE_BROKEN

The code range occupies an odd place in the runtime. As a place for the runtime to record profile information, it's an implementation detail. There is no way to request the code range directly from a string. However, since the code range records information about validity, it also impacts how some operations perform. Consequently, a few String methods allow you to derive the string's code range, allowing you to adapt your application accordingly.

The mappings are:

Code range               Ruby code equivalent
ENC_CODERANGE_UNKNOWN    No representation*
ENC_CODERANGE_7BIT       str.ascii_only?
ENC_CODERANGE_VALID      str.valid_encoding? && !str.ascii_only?
ENC_CODERANGE_BROKEN     !str.valid_encoding?

Table 1:  Mapping of internal code range values to public Ruby methods.

* – Code ranges are lazily calculated in most cases. However, when requesting information about a property that a code range encompasses, the code range is calculated on demand. As such, you may pass strings around that have an ENC_CODERANGE_UNKNOWN code range, but asking information about its validity or other methods that require the code range, such as a string's character length, will calculate and cache the code range before returning a value to the caller.

Despite its odd standing (somewhat an implementation detail, somewhat not), every major Ruby implementation associates a code range with a string. If you ever work on a Ruby implementation's internals or a native extension involving String objects, you'll almost certainly run into working with and potentially managing the code range value.

Semantics

In MRI, the code range value is stored as an int value in the object header with bitmask flags representing the values. Each of the values is mutually exclusive of the others. This is important to note because, logically, every string that has an ASCII-compatible encoding and consists of only ASCII characters is a valid string. However, such a string will never have a code range value of ENC_CODERANGE_VALID. You should use the ENC_CODERANGE(obj) macro to extract the code range value and then compare it against one of the defined code range constants, treating the code range constants essentially the same as an enum (e.g., if (cr == ENC_CODERANGE_7BIT) { ... }).

If you try to use the code range values as bitmasks directly, you'll have very confusing and difficult to debug results. Due to the way the masks are defined, if a string is annotated as being both ENC_CODERANGE_7BIT and ENC_CODERANGE_VALID it will appear to be ENC_CODERANGE_BROKEN. Conversely, if you try to branch on a combined mask like if (cr & (ENC_CODERANGE_7BIT | ENC_CODERANGE_VALID)) { ... }, that will include ENC_CODERANGE_BROKEN strings. This is because the four valid values are only represented by two bits in the object header. The compact representation makes efficient use of the limited space in the object header but can be misleading to anyone used to working with bitmasks to match and set attributes.

To help illustrate the point a bit better, I've ported some of the relevant C code to Ruby (see Listing 1):
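
The original listing isn’t reproduced in this excerpt; the following sketch captures the same idea, with illustrative constant values rather than MRI’s actual flag bits:

    module CodeRange
      UNKNOWN   = 0b00
      SEVEN_BIT = 0b01
      VALID     = 0b10
      BROKEN    = SEVEN_BIT | VALID # 0b11 (only two bits represent four states)
      MASK      = 0b11

      def self.of(flags)
        flags & MASK # extract the code range, as the ENC_CODERANGE(obj) macro does
      end
    end

    # Naively OR-ing the masks together silently produces BROKEN:
    CodeRange.of(CodeRange::SEVEN_BIT | CodeRange::VALID) == CodeRange::BROKEN # => true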

Listing 1: MRI's original C code range representation recreated in Ruby.

JRuby has a very similar implementation to MRI, storing the code range value as an int compactly within the object header, occupying only two bits. In TruffleRuby, the code range is represented as an enum and stored as an int in the object's shape. The enum representation takes up additional space but prevents the class of bugs from misapplication of bitmasks.

String Operations and Code Range Changes

The object's code range is a function of both its sequence of bytes and the encoding associated with the object to interpret those bytes. Consequently, when either the bytes change or the encoding changes, the code range value has the potential to be invalidated. When such an operation occurs, the safest thing to do is to perform a complete code range scan of the resulting string. To the best of our ability, however, we want to avoid recalculating the code range when it is not necessary to do so.

MRI avoids unnecessary code range scans via two primary mechanisms. The first is to simply scan for the code range lazily by changing the string's code range value to ENC_CODERANGE_UNKNOWN. When an operation is performed that needs to know the real code range, MRI calculates it on demand and updates the cached code range with the new result. If the code range is never needed, it's never calculated. (MRI will calculate the code range eagerly when doing so is cheap. In particular, when lexing a source file, MRI already needs to examine every byte in a string and be aware of the string's encoding, so taking the extra step to discover and record the code range value is rather cheap.)

The second way MRI avoids code range scans is to reason about the code range values of any strings being operated on and how an operation might result in a new code range. For example, when working with strings with an ENC_CODERANGE_7BIT code range value, most operations can preserve the code range value since all ASCII characters stay within the 0x00 - 0x7f range. Whether you take a substring, change the casing of characters, or strip off whitespace, the resulting string is guaranteed to also have the ENC_CODERANGE_7BIT value, so performing a full code range scan would be wasteful. The code in Listing 2 demonstrates some operations on a string with an ENC_CODERANGE_7BIT code range and how the resulting string always has the same code range.
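
The original listing isn’t reproduced here; a sketch of the behaviour it describes, observed through the public API:

    str = "shopify"
    str.ascii_only?            # => true (ENC_CODERANGE_7BIT)

    str.upcase.ascii_only?     # => true
    str.capitalize.ascii_only? # => true
    str.swapcase.ascii_only?   # => true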

Listing 2: Changing the case of a string with an ENC_CODERANGE_7BIT code range will always result in a string that also has an ENC_CODERANGE_7BIT code range.

Sometimes the code range value on its own is insufficient for a particular optimization, in which case MRI will consider additional context. For example, MRI tracks whether a string is "single-byte optimizable." A string is single-byte optimizable if its code range is ENC_CODERANGE_7BIT or if the associated encoding uses characters that are only one-byte wide, such as is the case with the ASCII-8BIT/BINARY encoding used for I/O. If a string is single-byte optimizable, we know that String#reverse must retain the same code range because each byte corresponds to a single character, so reversing the bytes can't change their meaning.

Unfortunately, the code range is not always easily derivable, particularly when the string's code range is ENC_CODERANGE_VALID or ENC_CODERANGE_BROKEN, in which case a full code range scan may prove to be necessary. Operations performed on a string with an ENC_CODERANGE_VALID code range might result in an ENC_CODERANGE_7BIT string if the source string's encoding is ASCII-compatible; otherwise, it would result in a string with an ENC_CODERANGE_VALID code range. (We've deliberately set aside the case of String#setbyte which could cause a string to have an ENC_CODERANGE_BROKEN code range value. Generally, string operations in Ruby are well-defined and won't result in a broken string.) In Listing 3, you can see some examples of operations performed against a string with an ENC_CODERANGE_VALID code range resulting in strings with either an ENC_CODERANGE_7BIT code range or an ENC_CODERANGE_VALID code range.
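
Again, the original listing isn’t reproduced here; a sketch of the behaviour it describes, relying on Ruby’s full Unicode case mapping:

    str = "straße"
    str.valid_encoding? && !str.ascii_only? # => true (ENC_CODERANGE_VALID)

    str.downcase.ascii_only? # => false, still ENC_CODERANGE_VALID
    str.upcase               # => "STRASSE" (Unicode case mapping, Ruby 2.4+)
    str.upcase.ascii_only?   # => true, the result drops to ENC_CODERANGE_7BIT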

Listing 3: Changing the case of a string with an ENC_CODERANGE_VALID code range might result in a string with a different code range.

Since the source string may have an ENC_CODERANGE_UNKNOWN value and the operation may not need the resolved code range, such as String#reverse called on a string with the ASCII-8BIT/BINARY encoding, it's possible to generate a resulting string that also has an ENC_CODERANGE_UNKNOWN code range. That is to say, it's quite possible to have a string that is ASCII-only but which has an unknown code range that, when operated on, still results in a string that may need to have a full code range scan performed later on. Unfortunately, this is just the trade-off between lazily computing code ranges and deriving the code range without resorting to a full byte scan of the string. To the end user, there is no difference because the code range value will be computed and be accurate by the time it is needed. However, if you're working on a native extension, a Ruby runtime's internals, or are just profiling your Ruby application, you should be aware of how a code range can be set or deferred.

TruffleRuby and Code Range Derivations

As a slight digression, I'd like to take a minute to talk about code ranges and their derivations in TruffleRuby. Unlike other Ruby implementations, such as MRI and JRuby, TruffleRuby eagerly computes code range values so that strings never have an ENC_CODERANGE_UNKNOWN code range value. The trade-off that TruffleRuby makes is that it may calculate code range values that are never needed, but string operations are simplified by never needing to calculate a code range on-demand. Moreover, TruffleRuby can derive the code range of an operation's result string without needing to perform a full byte scan in more situations than MRI or JRuby can.

While eagerly calculating the code range may seem wasteful, it amortizes very well over the lifetime of a program due to TruffleRuby's extensive reuse of string data. TruffleRuby uses ropes as its string representation, a tree-based data structure where the leaves look like a traditional C-style string, while interior nodes represent string operations linking other ropes together. (If you go looking for references to "rope" in TruffleRuby, you might be surprised to see they're mostly gone. TruffleRuby still very much uses ropes, but the TruffleRuby implementation of ropes was promoted to a top-level library in the Truffle family of language implementations, which TruffleRuby has adopted. If you use any other language that ships with the GraalVM distribution, you're also using what used to be TruffleRuby's ropes.) A Ruby string points to a rope, and a rope holds the critical string data.

For instance, on a string concatenation operation, rather than allocate a new buffer and copy data into it, with ropes we create a "concat rope" with each of the strings being concatenated as its children (see Fig.1). The string is then updated to point at the new concat rope. While that concat rope does not contain any byte data (delegating that to its children), it does store a code range value, which is easy to derive because each child rope is guaranteed to have both a code range value and an associated encoding object.

Figure 1: A sample rope for the result of "Hello " + "François"

Moreover, rope metadata are immutable, so getting a rope's code range value will never incur more overhead than a field read. TruffleRuby takes advantage of that property to use ropes as guards in inline caches for its JIT compiler. Additionally, TruffleRuby can specialize string operations based on the code ranges for any argument strings. Since most Ruby programs never deal with ENC_CODERANGE_BROKEN strings, TruffleRuby's JIT will eliminate any code paths that deal with that code range. If a broken string does appear at runtime, the JIT will deoptimize and handle the operation on a slow path, preserving Ruby's full semantics. Likewise, while Ruby supports 100+ encodings out of the box, the TruffleRuby JIT will optimize a Ruby application for the small number of encodings it uses.

A String By Any Other Name

Discussions of string performance often center on web template rendering or text processing. Those are important use cases, but strings are also used extensively within the Ruby runtime itself. Every symbol or regular expression has an associated string, and those strings are consulted for various operations. The real fun comes with Ruby's metaprogramming facilities: strings can be used to access instance variables, look up methods, send messages to objects, evaluate code snippets, and more. Improvements (or degradations) in string performance can have large, cascading effects.

Backing up a step, I don't want to oversell the importance of code ranges for fast metaprogramming. They are an ingredient in a somewhat involved recipe. The code range can be used to quickly disqualify strings known not to match, such as those with the ENC_CODERANGE_BROKEN code range value. In the past, the code range was used to fail fast when particular identifiers were only allowed to be ASCII-only. While not currently implemented in MRI, such a check could be used to dismiss strings with the ENC_CODERANGE_VALID code range when all identifiers are known to be ENC_CODERANGE_7BIT, and vice versa. However, once a string passes the code range check, there's still the matter of seeing if it matches an identifier (instance variable, method, constant, etc.). With TruffleRuby, that check can be satisfied quickly because its immutable ropes are interned and can be compared by reference. In MRI and JRuby, the equality check may involve a linear pass over the string data as the string is interned. Even that process gets murky depending on whether you're working with a dynamically generated string or a frozen string literal. If you're interested in a deep dive on the difficulties and solutions to making metaprogramming fast in Ruby, Chris Seaton has published a paper about the topic and I've presented a talk about it at RubyKaigi.

Conclusion

More so than many other contemporary languages, Ruby exposes functionality that is difficult to optimize but which grants developers a great deal of expressivity. Code ranges are a way for the VM to avoid repeated work and optimize operations on a per-string basis, guiding away from slow paths when that functionality isn't needed. Historically, that benefit has been most keenly observed when running in the interpreter. When integrated with a JIT with deoptimization capabilities, such as TruffleRuby, code ranges can help eliminate generated code for the types of strings used by your application and the VM internally.

Knowing what code ranges are and what they're used for can help you debug issues, both for performance and correctness. At the end of the day, a code range is a cache, and like all caches, it may contain the wrong value. While such instances within the Ruby VM are rare, they're not unheard of. More commonly, a native extension manipulating strings may fail to update a string's code range properly. Hopefully, with a firm understanding of code ranges, you’ll find Ruby's handling of strings less daunting.

Kevin is a Staff Developer on the Ruby & Rails Infrastructure team where he works on TruffleRuby. When he’s not working on Ruby internals, he enjoys collecting browser tabs, playing drums, and hanging out with his family.



Continue reading

Leveraging Shopify’s API to Build the Latest Marketplace Kit

Leveraging Shopify’s API to Build the Latest Marketplace Kit

In February, we released the second version of Marketplace Kit: a collection of boilerplate app code and documentation allowing third-party developers to integrate with Shopify, build selected commerce features, and launch a world-class marketplace in any channel.

Previously, we used the node app generated by the Shopify command-line interface (CLI) as a foundation. However, this approach came with two drawbacks: any changes to the Shopify CLI would ripple into our code and documentation, and we had limited control over best practices because we were tied to the CLI node app's dependencies.

Since then, we've decoupled code from the Shopify CLI and separated the Marketplace Kit sample app into two separate apps: a full-stack admin app and a buyer-facing client app. For these, we chose dependencies that were widely used, such as Express and NextJS, to appeal to the largest number of partners possible. Open-sourced versions of the apps are publicly available for anyone to try out.

Shopify’s APIs mentioned or shown in this article:

Here are a few ways we leveraged Shopify’s APIs to create the merchant-facing Marketplace admin app for version 2.0.

Before We Get Started

Here’s a brief overview of app development at Shopify. The most popular server-side technology used with the Shopify CLI to ease app development is Node JS, a server-side JavaScript runtime, which is why we used it for the Marketplace Kit’s sample admin app. With Node JS, we use Express JS, a web framework chosen for its ease of use and worldwide popularity.

On the client side of the admin and buyer apps, we use the main JavaScript frontend library at Shopify: React JS. In the buyer-facing app, we chose Next JS, a framework for React JS that mainly provides structure to the application, as well as built-in features like server-side rendering and TypeScript support. When sharing data between frontend and backend apps, we use GraphQL along with its helper libraries, Apollo Client and Apollo Server, for ease of integration.

It’s also helpful to be familiar with some key web development concepts, such as JSON Web Token (JWT) authentication and vanilla JavaScript (plain JavaScript, best known as the scripting language of web pages, without any framework layered on top).

JWT Authentication with App Bridge

Let’s start with how we chose to handle authentication in our apps, as a working example of using a Shopify-built library to ease development. App Bridge is a standalone library offering React component wrappers for some app actions. It provides an out-of-the-box solution for embedding your app inside the Shopify admin, Shopify POS, and Shopify Mobile. Since we're using React for our embedded channel admin app, we leveraged additional App Bridge imports to handle authentication. Here is a client-side example from the channels admin app:
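A minimal sketch of that pattern follows rather than the exact Marketplace Kit source. It assumes the standard App Bridge packages (@shopify/app-bridge-react, @shopify/app-bridge-utils, and @shopify/app-bridge/actions); the userLoggedInFetch and useAuthenticatedFetch names and the reauthorization headers are illustrative:

```js
import { useAppBridge } from "@shopify/app-bridge-react";
import { authenticatedFetch } from "@shopify/app-bridge-utils";
import { Redirect } from "@shopify/app-bridge/actions";

// Wraps authenticatedFetch so an expired or missing session triggers a
// redirect back through the app's OAuth flow instead of failing silently.
export function userLoggedInFetch(app) {
  const fetchFunction = authenticatedFetch(app);

  return async (uri, options) => {
    const response = await fetchFunction(uri, options);

    if (response.headers.get("X-Shopify-API-Request-Failure-Reauthorize") === "1") {
      const authUrl =
        response.headers.get("X-Shopify-API-Request-Failure-Reauthorize-Url") || "/auth";
      Redirect.create(app).dispatch(Redirect.Action.APP, authUrl);
      return null;
    }

    return response;
  };
}

// Hypothetical hook for components: grab the App Bridge instance and
// return an authenticated fetch bound to it.
export function useAuthenticatedFetch() {
  const app = useAppBridge();
  return userLoggedInFetch(app);
}
```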

The app object, returned by the useAppBridge hook, is used to pass contextual information about the embedded app. We chose to wrap the authenticatedFetch call inside a function that adds custom auth redirecting. Notice that the authenticatedFetch import does a lot under the hood. Notably, it adds two HTTP headers: Authorization, with a JWT session token created on demand, and X-Requested-With, set to XMLHttpRequest (which narrows down the request type and improves security).

This is the server-side snippet that handles the session token. It resides in our main server file, where we define our spec-compliant GraphQL server and apply it as middleware to an Express app. Within the configuration of our ApolloServer's context property, you'll see how we handle the auth header:
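Here is a minimal sketch of that context function rather than the exact Marketplace Kit code. It assumes the Shopify.Utils helpers (decodeSessionToken and loadOfflineSession) from the @shopify/shopify-api Node library of that era, and the ./schema import is a hypothetical module holding the type definitions and resolvers:

```js
import { ApolloServer } from "apollo-server-express";
import Shopify from "@shopify/shopify-api";
import { typeDefs, resolvers } from "./schema"; // hypothetical module

const graphQLServer = new ApolloServer({
  typeDefs,
  resolvers,
  context: async ({ req }) => {
    // App Bridge sends "Authorization: Bearer <JWT session token>".
    const sessionToken = (req.headers.authorization || "").replace(/^Bearer /, "");

    // Decode the JWT to find out which shop the request belongs to...
    const payload = Shopify.Utils.decodeSessionToken(sessionToken);
    const shop = payload.dest.replace("https://", "");

    // ...then load the stored session to get that shop's access token.
    const session = await Shopify.Utils.loadOfflineSession(shop);

    return { shop, accessToken: session?.accessToken };
  },
});
```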

Notice how we leverage Shopify’s Node API to decode the session token and then to load the session data, providing us with the store’s access token. Fantastic!

Quick tip: To add more stores, you can switch out the store value in .env and run the Shopify CLI's shopify app serve command!

Serving REST & GraphQL With Express

In our server-side code, we use the apollo-server-express package instead of simply using apollo-server:
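A condensed sketch of that setup is shown below; the schema, route, and port are placeholders rather than the actual Marketplace Kit values:

```js
import express from "express";
import { ApolloServer, gql } from "apollo-server-express";

const typeDefs = gql`
  type Query {
    hello: String
  }
`;
const resolvers = { Query: { hello: () => "world" } };

async function startServer() {
  const app = express();
  const graphQLServer = new ApolloServer({ typeDefs, resolvers });

  // Apollo Server 3 requires start() before the middleware can be applied.
  await graphQLServer.start();

  // Mount GraphQL onto the existing Express app rather than running a
  // standalone Apollo server.
  graphQLServer.applyMiddleware({ app });

  // REST routes and webhooks are registered on the same Express app.
  app.get("/ping", (_req, res) => res.send("pong"));

  app.listen(8081, () => {
    console.log(`GraphQL ready at http://localhost:8081${graphQLServer.graphqlPath}`);
  });
}

startServer();
```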

The setup for a GraphQL server using the express-specific package is quite similar to how we would do it with the barebones default package. The difference is that we apply the Apollo Server instance as middleware to an Express HTTP instance with graphQLServer.applyMiddleware({ app }) (or whatever you named your instance).

If you look at the entire file, you'll see that the webhooks and routes for the Express application are added after starting the GraphQL server. The advantage of using the apollo-server-express package over apollo-server is being able to serve REST and GraphQL at the same time with Express. Serving GraphQL within Express also lets us use Node middleware for common concerns like rate limiting, security, and authentication. The trade-off is a little more boilerplate, but since apollo-server is itself a wrapper around the Express-specific package, there's no noticeable performance difference.

Check out the Apollo team’s blog post Using Express with GraphQL to read more.

Custom Client Wrappers

Here’s an example of custom API clients for data fetching from Shopify’s Node API, using both GraphQL and REST:
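The sketch below conveys the idea rather than reproducing the wrapper itself. The helper names, file layout, and version string are hypothetical; Shopify.Clients.Graphql, Shopify.Clients.Rest, and the product_listings/count REST endpoint are real parts of the Node API, though whether the library merges or replaces the default User-Agent header can depend on the library version:

```js
import Shopify from "@shopify/shopify-api";

// In the real app this value would come from package.json.
const USER_AGENT = "Marketplace Kit Sample App/2.0.0";

// Hypothetical factory helpers so every caller gets a consistently
// configured client.
export function graphqlClient({ shop, accessToken }) {
  return new Shopify.Clients.Graphql(shop, accessToken);
}

export function restClient({ shop, accessToken }) {
  return new Shopify.Clients.Rest(shop, accessToken);
}

// The one REST call described below: fetching the product listings count.
export async function getProductListingsCount(session) {
  const client = restClient(session);
  const response = await client.get({
    path: "product_listings/count",
    extraHeaders: { "User-Agent": USER_AGENT },
  });
  return response.body.count;
}
```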

This gives us easier control over our request configuration, such as adding a custom User-Agent header that uniquely identifies the project and includes its npm package version.

Although Shopify generally encourages using GraphQL, sometimes it makes sense to use REST. In the admin app, we used it for one call: getting the product listings count. There was no need to craft a specific query when a simple HTTP GET request returns all the information required, so using GraphQL would not offer any advantage. It also ensures that developers who use the admin app as a starting point see both ways of fetching data and can pick whichever is best for the situation.

Want to Challenge Yourself?

For full instructions on getting started with Marketplace Kit, check out our official documentation. To give you an idea, here are screenshots of the embedded admin app and the buyer app, in that order, upon completion of the tutorials in the docs:

More Stories on App Development and GraphQL:

For articles aimed at partners, check out the Shopify Partners blog, where we cover more content related to app development at Shopify.

Kenji Duggan is a Backend Developer Intern at Shopify, working on the Strategic Partners team under Marketplace Foundation. When he’s not learning something new as a web developer, he is probably working out or watching anime.



Continue reading

Best-in-Class Developer Experience with Vite and Hydrogen

Best-in-Class Developer Experience with Vite and Hydrogen

Hydrogen is a framework that combines React and Vite for creating custom storefronts on Shopify. It maximizes performance for end-users and provides a best-in-class developer experience for you and your team. Since it focuses on evergreen browsers, Hydrogen can leverage modern capabilities, best practices, and the latest tooling in web development to bring the future of ecommerce closer.

Creating a framework requires a lot of choices for frontend tooling. One major part of it is the bundler. Traditionally, developers had no native way to organize their code in JavaScript modules. Therefore, to minimize the amount of code and waterfall requests in the browser, new frontend tools like Webpack started to appear, powering projects such as Next.js and many more.

Bundling code became the de facto practice for the last decade, especially when using view libraries like React or Vue. Whereas these tools successfully solved the problem, they quickly became hard to understand and maintain due to the increasing complexity of the modern web. On top of that, the development process started to slow down because bundling and compiling are inherently slow: the more files in a project, the more work the tool needs to do. Repeat this process for every change made in a project during active development, and one can quickly notice how the developer experience (DX) tanks. 

Diagram showing bundle-based dev server. Modules are bundled and compiled to be server ready
Bundle-based dev server image from Vite.js docs

Thanks to the introduction of ES Modules (a native mechanism to author JavaScript modules) and its support in browsers, some new players like Snowpack and Parcel appeared and started shaping up the modern web development landscape.

Image showing use of native ES Modules to minimize the amount of bundling required during development
Native ESM-based dev server from Vite.js docs

This new generation of web tooling aims to improve the DX of building apps. Whereas Webpack needs a complex configuration, even for simple things, due to its high flexibility, these new tools provide sensible but configurable defaults. Furthermore, they leverage native ES Modules to minimize the amount of bundling required during development. In particular, they tend to bundle and cache only third-party dependencies to keep network connections low (the number of files downloaded by the browser); some dependencies may have dozens or hundreds of files, but they don't need to be updated often. User code, on the other hand, is served to the browser unbundled, which speeds up refresh times when making changes.

Enter Vite. We believe its evergreen, modern philosophy aligns perfectly with Hydrogen. Featuring a lightning-fast development server with hot module replacement, a rich plugin ecosystem, and clever default configurations that make it work out of the box for most apps, Vite was among the top options to power Hydrogen's development engine.

Why Vite?

Vite is French for "quick", and the Hydrogen team can confirm: it's really fast. From the installation and setup to its hot reloading, things that used to be a DX pain are (mostly) gone. It’s also highly configurable and simple to use.

Partially, this is thanks to the two magnificent tools that power it: ESBuild, a Golang-based, lightning-fast compiler for JavaScript, and Rollup, a flexible and intelligible bundler. However, Vite is much more than the sum of these parts.

Ease of Use

In Vite, the main entry point is a simple index.html file, making it a first-class citizen instead of an afterthought asset. Everything else flows from there via stylesheet and script tags. Vite crawls and analyzes all of the imported assets and transforms them accordingly.

Thanks to its default values, most flavors of CSS and JavaScript, including JSX, TypeScript (TS), and PostCSS, work out of the box.

Let me reiterate this: it just works™. No painful configuration is needed to get those new CSS prefixes or the latest TS type checking working. It even lets you import WebAssembly or SVG files from JavaScript just like that. Also, since Vite's main target is modern browsers, it’s prepared to optimize the code and styles by using the latest supported features by default.
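As a small illustration (the file names are made up), an entry module can pull in styles and assets directly and let Vite figure out the rest:

```js
// main.jsx - an illustrative entry module.
import "./styles.css";            // plain CSS; PostCSS is applied if a config is present
import logoUrl from "./logo.svg"; // resolves to the asset's URL
import React from "react";
import { render } from "react-dom";
import App from "./App";

render(<App logo={logoUrl} />, document.getElementById("root"));
```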

We value the simplicity Vite brings to Hydrogen, and we pass it on to our users. It all adds up to a lot of time saved configuring your tooling compared to other alternatives.

A Proven Plugin System

Rollup has been around for a much longer time than Vite. It does one thing and does it very well: bundling. The key here is that Vite can tell it what to bundle.

Furthermore, Rollup has a truly rich plugin ecosystem that is fully compatible with Vite. With this, Vite provides hooks during development and building phases that enable advanced use cases, such as transforming specific syntax like Vue files. There are many plugins out there that use these hooks for anything you can imagine: Markdown pages with JSX, SSR-ready icons, automatic image minification, and more.

In Hydrogen, we found these Vite hooks easier to understand and use than those in Webpack, which lets us write more maintainable code.
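For a feel of what these hooks look like, here's a deliberately tiny, illustrative plugin (not one Hydrogen ships) that combines a Rollup-compatible transform hook with a Vite-specific config hook:

```js
// vite-plugin-example.js - illustrative only.
export default function examplePlugin() {
  return {
    name: "example-plugin",

    // Vite-specific hook: merge extra values into the resolved config.
    config() {
      return { define: { __EXAMPLE_FLAG__: "true" } };
    },

    // Rollup-compatible hook: turn *.data files into JavaScript modules.
    transform(code, id) {
      if (!id.endsWith(".data")) return null;
      return {
        code: `export default ${JSON.stringify(code)};`,
        map: null,
      };
    },
  };
}
```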

Speed

A common task that tends to slow down web development is compiling JavaScript flavors and new features to older and widely supported code. Babel, a compiler written in JavaScript, has been the king in this area for a long time.

However, new tools like ESBuild started to appear recently with a very particular characteristic: they use a machine-compiled language to transform JavaScript instead of using JavaScript itself. In addition, and perhaps more importantly, they also apply sophisticated algorithms to avoid repeating AST parsing and parallelize work, thus establishing a new baseline for speed.

Apart from using ESBuild, Vite applies many other optimizations and tricks to speed up development. For instance, it pre-bundles some third-party dependencies and caches them in the filesystem to enable faster startups.

All in all, we can say Vite is one of the fastest alternatives out there when it comes to local development, and this is something we also want our users to benefit from in Hydrogen.

ESM and HMR

Along with Snowpack and Parcel, Vite is one of the first tools to embrace ECMAScript Modules (ESM) and inject JavaScript into the browser using script tags with type=module.

This, paired with hot-module replacement (HMR), means that changes to files on the local filesystem are updated instantly in the browser.
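Framework plugins wire this up for you, but Vite also exposes a manual HMR API on import.meta.hot. A toy example (the module and function names are illustrative) looks like this:

```js
// counter.js - toy module demonstrating Vite's manual HMR API.
export function mount() {
  document.querySelector("#app").textContent = "Hello from counter.js";
}

mount();

// import.meta.hot only exists during development, so guard the block.
if (import.meta.hot) {
  import.meta.hot.accept((newModule) => {
    // Vite swaps the module in place instead of reloading the whole page.
    newModule?.mount();
  });
}
```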

Vite is also building for the future of the web and the npm ecosystem. While most third-party libraries still ship CommonJS (CJS) style modules (native in Node.js), the new standard is ESM. Vite performs an exhaustive import analysis of dependencies and transforms CJS modules into ESM automatically, letting you always import code in a modern fashion. And this is not something to take lightly: CJS and ESM interoperability has been one of the biggest headaches web developers have faced in recent years.

As app developers ourselves, it's a relief to be able to focus on coding in Hydrogen without wasting time on this issue. Someday most packages will, hopefully, follow the ESM standard. Until that day, Vite has us covered.

Server-Side Rendering

Server-side rendering (SSR) is a critical piece of modern frameworks like Hydrogen and is another place where Vite shines. Vite extends Rollup hooks to provide SSR information, enabling many advanced use cases.

For example, it is possible to transform the same imported file in different ways depending on the running environment (browser or server). This is key to supporting some advanced features we need in Hydrogen, such as React Server Components, which to date had only been available with Webpack.

Vite can also load frontend code on the server by converting dependencies and modules into a Node-compatible (CJS) format. Think of simply importing a React application in Node. This greatly eases the way SSR works, and Hydrogen leverages it to remove extra dependencies and simplify code.
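As a sketch of what that SSR information enables (the plugin and file names are made up, and the exact shape of the ssr flag has varied across Vite versions), a plugin can branch on whether it's transforming code for the server or the browser:

```js
// vite-plugin-ssr-example.js - illustrative only.
export default function perEnvironmentPlugin() {
  return {
    name: "per-environment-example",
    transform(code, id, options) {
      if (!id.endsWith("analytics.js")) return null;

      // During SSR, Vite passes information about the target environment.
      if (options?.ssr) {
        // Replace browser-only analytics with a no-op on the server.
        return { code: "export const track = () => {};", map: null };
      }

      // Keep the original module for the browser build.
      return null;
    },
  };
}
```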

Community

Last but not least, Vite has a large and vibrant community around it.

Many projects in addition to Hydrogen are relying on and contributing to Vite, such as Vitest, SvelteKit, Astro, Storybook, and many more.

And it's not just about the projects, but also the people behind them, who are incredibly active and always willing to help in Vite's Discord channel: from Vite's creator, @youyuxi, to many other contributors and maintainers such as @patak_dev, @alecdotbiz, and @antfu7.

Hydrogen is also a proud sponsor of Vite. We want to support the project to ensure it stays up to date with the latest DX improvements that make web developers’ lives easier.

How Hydrogen uses Vite

Our goal when building Hydrogen on top of Vite was to keep things as “close to the metal” as possible and not reinvent the wheel. CLI tools can rely on Vite commands internally, and most of the required configuration is abstracted away.

Creating a Vite-powered Hydrogen storefront is as easy as adding the @shopify/hydrogen/plugin plugin to your vite.config.js:
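Roughly, the config looks like the following; a real project may also pass its Shopify configuration to the plugin:

```js
// vite.config.js
import { defineConfig } from "vite";
import hydrogen from "@shopify/hydrogen/plugin";

export default defineConfig({
  plugins: [hydrogen()],
});
```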

Behind the scenes, we are invoking 4 different plugins:

  • hydrogen-config: This is responsible for altering the default Vite config values for Hydrogen projects. It helps ensure bundling for both Node.js and Worker runtimes work flawlessly, and that third-party packages are processed properly.
  • react-server-dom-vite: It adds support for React Server Components (RSC). We extracted this plugin from Hydrogen core and made it available in the React repository.
  • hydrogen-middleware: This plugin is used to hook into Vite’s dev server configuration and inject custom behavior. It allows us to respond to SSR and RSC requests while leaving the asset requests to Vite’s default web server.
  • @vitejs/plugin-react: This is an official Vite plugin that adds some goodies for React development, such as fast refresh in the browser.

Just with this, Hydrogen is able to support server components, streaming requests, clever caching, and more. By combining this with all the features Shopify already provides, you can unlock unparalleled performance and best-in-class DX for your storefront.

Choosing the Right Tool

There are still many advanced use cases where Webpack is a good fit since it is very mature and flexible. Many projects and teams, such as React’s, rely heavily on it for their day-to-day development.

However, Vite makes building modern apps a delightful experience and empowers framework authors with many tools to make development easier. Storefront developers can enjoy a best-in-class DX while building new features at a faster pace. We chose Vite for Hydrogen and are happy with that decision so far.

Fran works as a Staff Software Engineer on the Hydrogen team at Shopify. Located in Tokyo, he's a web enthusiast and an active open source contributor who enjoys all things tech and all things coconut. Connect with Fran on Twitter and GitHub.



Continue reading
