Why Shopify Moved to The Production Engineering Model

6 minute read

The traditional model of running large-scale computer systems divides work into Development and Operations as distinct and separate teams. This split works reasonably well for computer systems that are changed or updated very rarely, and organizations sometimes require this if they’re deploying and operating software built by a different company or organization. However, this rigid divide fails for large-scale web applications that are undergoing frequent or even continuous change. DevOps is the term for a movement that’s gathered steam in the past decade to bring together these disciplines.


Continue reading

Automatic Deployment at Shopify

6 minute read

Hi, I'm Graeme Johnson, and I work on Shopify's Developer Acceleration team. Our mission is to provide tools that let developers ship fast and safely. Recently we began shipping Shopify automatically as developers hit the merge button in GitHub. This removes the final manual step in our deploy pipeline, which now looks like this:

Merge → Build container → Run CI → Ship to production

We have invested a lot of engineering effort to make this pipeline fast enough to run end-to-end in about 15 minutes (still too slow for our taste) and robust enough to allow cancellation at any stage in the process. Automating the actual deploy trigger was the next logical step.

Continue reading

How We're Thinking About Commerce and VR With Our First VR App, Thread Studio

3 minute read

Hey everyone! I’m Daniel and I lead our VR efforts at Shopify.

When I talk to people about VR and commerce, the first idea that usually pops into their heads is about all the possibilities of walking around a virtual shopping mall. While that could be an enjoyable experience for some, I find it’s a very limiting view of how virtual reality can actually improve retail.

If VR gave you the superpowers to do anything, create anything, and go anywhere you want, would you really want to go shopping in a regular mall?

More than a virtual mall

It’s easy to take a new medium and try to shoehorn in what already exists and is familiar. What’s hard is figuring out what content makes the medium truly shine and worthwhile to use. VR offers an amazing storytelling platform for brands. For the first time, brands can put people in the stories that their products tell.

If you’re selling scuba gear, why not show what it’d look like underwater with jellyfish passing by? Or a tent on a windy, chilly cliff, reflecting the light of a scrappy fire? It sure would beat being in a fluorescent-lit camping store. In VR, you could explore inside a tent before you buy it, or change the environment around you at a press of a button.

Continue reading

Shopify Merchants Will Soon Get AMP'd

1 minute read

Today we're excited to share our involvement with the AMP Project.

Life happens on mobile. (In fact, there are over seven billion small screens now!) We're not only comfortable with shopping online, but increasingly we're buying things using our mobile devices. Delays can mean the difference between a sale or no sale, so it's important to make things run as quickly as possible.

AMP, or Accelerated Mobile Pages, is an open source, Google-led initiative aimed at improving the mobile web experience and solving the issue of slow-loading content. Starting today, Google is expanding AMP’d content beyond its top stories carousel to include general web search results.

Continue reading

How Our UX Team's Approaching Accessibility

Last updated: September 9, 2016

2 minute read

At Shopify, our mission is to make commerce better for everyone. When we say better, we’re talking about caring deeply about making quality products. To us, a quality web product means a few things: certainly beautiful design, engaging copy, and a fantastic user experience, but just as important are inclusivity and the principles of universal design.

“Everyone” is a pretty big group. It includes our merchants, their customers, our developer partners, our employees, and the greater tech community at large, where we love to lead by example.

We take our mission to heart, so it’s important that Shopify products are usable and useful to all our users. This is something we’ve been thinking about and working on for a few years, but it’s an ongoing, difficult challenge. Luckily, we love tackling challenging problems and we’re constantly chipping away at this one. We’ve learned a lot from the community and think it’s important to contribute back, so — in celebration of Global Accessibility Awareness Day — we’re thrilled to announce a series of posts on accessibility.


Continue reading

How to Set Up Your Own Mobile CI System

1 minute read

Editor's note: an updated post on this topic is now up! Check out Sander Lijbrink's "Building a Dynamic Mobile CI System."

Over the past few years, the mobile development community has shifted dramatically toward continuous integration (CI) systems, mirroring changes already underway in other communities, particularly among web developers. This shift has been powerful for mobile developers, who can focus on their apps and code rather than spending their time on provisioning, code signing, deployment, and running tests.

I’m a software developer at Shopify, currently working on the Mobile team within Developer Acceleration. My job is to design, create, and manage an automated system that provides an accelerated development experience for our developers.

Throughout this series, we’ll draw on our experiences at Shopify to cover “hosted” vs. “BYOH” systems, how to provision Mac OS X and Ubuntu machines for iOS and Android builds, and the caveats we ran into. By the end, you should be ready to build your very own CI setup.


    Continue reading

    Adventures in Production Rails Debugging

    5 minute read

    At Shopify we frequently need to debug production Rails problems. Adding extra debugging code takes time to write and deploy, so we’ve learned how to use tools like gdb and rbtrace to quickly track down these issues. In this post, we’ll explain how to use gdb to retrieve a Ruby call stack, inspect environment variables, and debug a really odd warning message in production.

    We recently ran into an issue where we were seeing a large number of similar warning messages spamming our log files:

    /artifacts/ruby/2.1.0/gems/rack-1.6.4/lib/rack/utils.rb:92: warning: regexp match /.../n against to UTF-8 string

    This means we are trying to match an ASCII-encoded regular expression against a UTF-8 source string.
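The mismatch is easy to reproduce in isolation (a sketch, assuming MRI 2.x): build a binary regexp with the /n flag and match it against a UTF-8 string containing multibyte characters.

```ruby
# A binary (/n) regexp matched against a UTF-8 string that contains
# non-ASCII characters typically emits this warning on stderr; the
# match itself still proceeds normally.
utf8_string   = "caf\u00E9"   # "café": UTF-8 with a multibyte character
binary_regexp = /caf/n        # the /n flag forces ASCII-8BIT encoding

match = binary_regexp.match(utf8_string)
```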

    Continue reading

    Developer Onboarding at Shopify

    5 minute read

    Hi there! We’re Kat and Omosola and we’re software developers at Shopify. We both started working at Shopify back in May, and we felt both excited and a little nervous before we got here. You never know exactly what to expect when you start at a new company and no matter what your previous experience is, there are always a lot of new skills you need to learn. Thankfully, Shopify has an awesome onboarding experience for its new developers, which is what we want to talk about today.

    Continue reading

    Introducing Shipit

    3 minute read

    After a year of internal use, we’re excited to open-source our deployment tool, Shipit.

    With dozens of teams pushing code multiple times a day to a variety of different targets, fast and easy deploys are key to developer productivity (and happiness) at Shopify. Along with key improvements to our infrastructure, Shipit plays a central role in making this happen.

    Continue reading

    Secrets at Shopify - Introducing EJSON

    This is a continuation of our series describing our evolution of Shopify toward a Docker-powered, containerized data centre. Read the last post in the series here.

    One of the challenges along the road to containerization has been establishing a way to move application secrets like API keys, database passwords, and so on into the application in a secure way. This post explains our solution, and how you can use it with your own projects.

    Continue reading

    Announcing go-lua

    Today, we’re excited to release go-lua as an Open Source project. Go-lua is an implementation of the Lua programming language written purely in Go. We use go-lua as the core execution engine of our load generation tool. This post outlines its creation, provides examples, and describes some challenges encountered along the way.

    Continue reading

    There's More to Ruby Debugging Than puts()

    "Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it." - Brian W. Kernighan

    Debugging is always challenging, and as programmers we can easily spend a good chunk of every day just trying to figure out what is going on with our code. Where exactly has a method been overwritten or defined in the first place? What does the inheritance chain look like for this object? Which methods are available to call from this context?

    This article will take you through some under-utilized convenience methods in Ruby which will make answering these questions a little easier.

      Continue reading

      Building Year in Review 2014 with SVG and Rails

      As we have for the past 3 years, Shopify released a Year in Review to highlight some of the exciting growth and change we’ve observed over the past year. Designers James and Veronica had ambitious ideas for this year’s review, including strong, bold typographic treatments and interactive data visualizations. We’ve gotten some great feedback on the final product, as well as some curious developers wondering how we pulled it off, so we’re going to review the development process for Year in Review and talk about some of the technologies we leveraged to make it all happen.

      Continue reading

      Building and Testing Resilient Ruby on Rails Applications

      Black Friday and Cyber Monday are the biggest days of the year at Shopify with respect to every metric. As the Infrastructure team started preparing for the upcoming seasonal traffic in the late summer of 2014, we were confident that we could cope, and determined resiliency to be the top priority. A resilient system is one that functions with one or more components being unavailable or unacceptably slow. Applications quickly become intertwined with their external services if not carefully monitored, leading to minor dependencies becoming single points of failure.

      For example, the only part of Shopify that relies on the session store is user sign-in - if the session store is unavailable, customers can still purchase products as guests. Any other behaviour would be an unfortunate coupling of components. This post is an overview of the tools and techniques we used to make Shopify more resilient in preparation for the holiday season.

      Continue reading

      Tuning Ruby's Global Method Cache

      I was recently profiling a production Shopify application server using perf and noticed a fair amount of time being spent in a particular function, st_lookup, which is used by Ruby’s MRI implementation for hash table lookups:

      Hash tables are used all over MRI, and not just for the Hash object; global variables, instance variables, classes, and the garbage collector all use MRI’s internal hash table implementation, st_table. Unfortunately, what this profile did not show were the callers of st_lookup. Is this some application code that has gone wild? Is this an inefficiency in the VM?

      Continue reading

      Docker at Shopify: How We Built Containers that Power Over 100,000 Online Shops

      This is the second in a series of blog posts describing our evolution of Shopify toward a Docker-powered, containerized data center. This instalment will focus on the creation of the container used in our production environment when you visit a Shopify storefront.

      Read the first post in this series here.

      Why containerize?

      Before we dive into the mechanics of building containers, let's discuss motivation. Containers have the potential to do for the datacenter what consoles did for gaming. In the early days of PC gaming, each game typically required video or sound driver massaging before you got to play. Gaming consoles however, offered a different experience:

      • predictable: cartridges were self-contained fun: always ready to run, with no downloads or updates.
      • fast: cartridges used read-only memory for lightning-fast speeds.
      • easy: cartridges were robust and largely child-proof - they were quite literally plug-and-play.

      Predictable, fast, and easy are all good things at scale. Docker containers provide the building blocks to make our data centers easier to run and more adaptable by placing applications into self-contained, ready-to-run units much like cartridges did for console games.

      Continue reading

      Rebuilding the Shopify Admin: Deleting 28,000 lines of JavaScript to Improve Dev Productivity

      6 minute read

      This September, we quietly launched a new version of the Shopify admin. Unlike the launch of the previous major iteration of our admin, this version did not include a major overhaul of the visual design, and for the most part, would have gone largely unnoticed by the user.

      Why would we rebuild our admin without providing any noticeable differences to our users? At Shopify, we strongly believe that any decision should be able to be questioned at any time. In late 2012, we started to question whether our framework was still working for us. This post will discuss the problems in the previous version of our admin, and how we decided that it was time to switch frameworks.

      Continue reading

      Building an Internal Cloud with Docker and CoreOS

      This is the first in a series of posts about adding containers to our server farm to make it easier to scale, manage, and keep pace with our business.

      The key ingredients are:

      • Docker: container technology for making applications portable and predictable
      • CoreOS: provides a minimal operating system, systemd for orchestration, and Docker to run containers

      Shopify is a large Ruby on Rails application that has undergone massive scaling in recent years. Our production servers are able to scale to over 8,000 requests per second by spreading the load across 1700 cores and 6 TB RAM.

      Continue reading

      Kafka Producer Pipeline for Ruby on Rails

      In the early fall our infrastructure team was considering Kafka, a highly available message bus. We were looking to solve several infrastructure problems that had come up around that time:

      • We were looking for a reliable way to collect event data and send it to our data warehouse.

      • We were considering a more service-oriented architecture, and needed a standardized way of message passing between the components.

      • We were starting to evaluate containerization of Shopify, and were searching for a way to get logs out of containers.

      We were intrigued by Kafka due to its highly available design. However, Kafka runs on the JVM, and its primary user, LinkedIn, runs a full JVM stack. Shopify is mainly Ruby on Rails and Go, so we had to figure out how to integrate Kafka into our infrastructure.

      Continue reading

      Building a Rack Middleware

      I'm Chris Saunders, one of Shopify's developers. I like to keep journal entries about the problems I run into while working on the various codebases within the company.

      Recently we ran into an issue with authentication in one of our applications, and as a result I ended up learning a bit about Rack middleware. I felt the experience was worth sharing with the world at large, so here's a rough transcription of my entry. Enjoy!

      I'm looking at invalid form submissions for users who were trying to log in via their Shopify stores. The issue was actually at a middleware level, since we were passing invalid data off to OmniAuth which would then choke because it was dealing with invalid URIs.

      The bug in particular was that we were generating the shop URL based on the data the user submitted. Normally we'd expect something like mystore.myshopify.com or simply mystore, but of course forms can be confusing and people put in stuff like http://mystore.myshopify.com or, even worse, my store. We'd build up a URL, end up passing something like https://http::/mystore.myshopify.com.myshopify.com, and cause an exception to get raised.
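The failure mode is easy to see with a toy version of that URL builder (illustrative names, not our actual code):

```ruby
# Naively interpolating whatever the user typed into the URL template:
# a bare shop name works, but a full URL gets its scheme and domain doubled.
def shop_url(input)
  "https://#{input}.myshopify.com"
end

shop_url("mystore")                       # a clean input builds a valid URL
shop_url("http://mystore.myshopify.com")  # a pasted URL builds garbage
```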

      Another caveat is that we aren't even able to sanitize the input before passing it off to OmniAuth, unless we were to add more code to the lambda that we pass into the setup initializer.

      Adding more code to an initializer is definitely less than optimal, so we figured that we could implement this in a better way: adding a middleware to run before OmniAuth such that we could attempt to recover the bad form data, or simply kill the request before we get too deep.

      We took a bit of time to learn about how Rack middlewares work, and looked to the OmniAuth code for inspiration since it provides a lot of pluggability and is what I'd call a good example of how to build out easily extendable code.

      We decided that our middleware would be initialized with a series of routes to run a set of sanitization strategies on. Based on how OmniAuth works, I gleaned that the arguments after config.use MyMiddleWare would be passed into the middleware during the initialization phase - perfect! We whiteboarded a solution along those lines.

      Now that we had a goal we just had to implement it. We started off by building out the strategies since that was extremely easy to test. The interface we decided upon was the following:
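One plausible shape for that interface (the class name and method are assumptions, not our actual code): each strategy exposes a single class method that mutates the request data in place.

```ruby
# Hypothetical strategy interface: subclasses implement .perform, receiving
# the request-like object and mutating it directly.
class SanitizationStrategy
  def self.perform(request)
    raise NotImplementedError, "strategies must implement .perform"
  end
end
```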

      We decided that the actions would be destructive, so instead of creating a new Rack::Request at the end of our strategies call, we'd change values on the object directly. It simplifies things a little bit but we need to be aware that order of operations might set some of our keys to nil and we'd have to anticipate that.

      The simplest sanitizer we'd need is one that cleans up whitespace. Because we are building these for .myshopify.com domains, we know the convention they follow: dashes separate the words of the shop name if it was created with spaces. For example, if I signed up with my super awesome store when creating a shop, that would be converted into my-super-awesome-store. So if a user accidentally puts in my super awesome store, we can totally recover it!
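A minimal sketch of such a strategy (the class name and the params shape are assumptions, not our production code):

```ruby
# Hypothetical whitespace strategy: collapse runs of whitespace into the
# dashes that .myshopify.com subdomains use as word separators.
class WhitespaceSanitizer
  def self.perform(params)
    shop = params['shop']
    params['shop'] = shop.strip.gsub(/\s+/, '-') if shop
  end
end

params = { 'shop' => ' my super awesome store ' }
WhitespaceSanitizer.perform(params)
params['shop']  # => "my-super-awesome-store"
```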

      Now that we have a sanitization strategy written up, let's work on our actual middleware implementation.

      According to the Rack spec, all we really need to do is ensure that we return the expected result: an array consisting of three things: a response code, a hash of headers, and an iterable that represents the content body. An example of the most basic Rack response is:
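A sketch of that minimal response (this is just the Rack spec, nothing Shopify-specific):

```ruby
# Status, headers hash, and an iterable body: that's an entire Rack app.
app = lambda do |env|
  [200, { 'Content-Type' => 'text/plain' }, ['Hello, world!']]
end

status, headers, body = app.call({})
```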

      Per the Rack spec, middlewares are always initialized with a Rack app as the first argument, followed by whatever else. So let's get to the actual implementation:
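Our real implementation is on GitHub; a stripped-down sketch (illustrative names, with plain callables standing in for the strategy classes) captures the shape:

```ruby
# A Rack middleware: initialized with the downstream app plus our options,
# it runs the configured strategies for the request path, then calls through.
class ParameterSanitizer
  def initialize(app, routes = {})
    @app = app          # the Rack spec hands us the downstream app first
    @routes = routes    # e.g. { '/auth' => [WhitespaceSanitizer] }
  end

  def call(env)
    strategies = @routes[env['PATH_INFO']] || []
    strategies.each { |strategy| strategy.call(env) }
    @app.call(env)      # hand the (possibly cleaned) request downstream
  end
end
```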

      That's pretty much it! We've written a really simple middleware that takes care of cleaning up bad user input that isn't necessarily malicious: people make mistakes, and we should try as much as possible to react to this data in a way that isn't jarring to the users of our software.

      You can check out our implementation on GitHub and install it via RubyGems. Happy hacking!

      Continue reading

      IdentityCache: Improving Performance one Cached Model at a Time

      A month ago Shopify was at BigRubyConf, where we mentioned an internal library we use for caching ActiveRecord models called IdentityCache. We're pleased to say that the library has been extracted out of the Shopify code base and has been open sourced!

      At Shopify, our core application has been database performance bound for much of our platform’s history. That means that the most straightforward way of making Shopify more performant and resilient is to move work out of the database layer.

      For many applications, achieving a very high cache hit ratio is a matter of storing full cached response bodies, versioning them based on the associated records in the database, always serving the most current version, and relying on the cache’s LRU algorithm for expiration. That technique, called a “generational page cache”, is well proven and very reliable. However, part of Shopify’s value proposition is that store owners can heavily customize the look and feel of their shops; in fact, we offer a full-fledged templating language. As a side effect, full-page static caching is not as effective as it would be on most other web platforms, because we have no deterministic way of knowing which database rows we’ll need to fetch on every page render.

      The key metric driving the creation of IdentityCache was our master database’s queries per second, so the goal was to reduce read operations reaching the database as much as possible. IdentityCache does this by moving the workload to Memcached instead.

      The inability of a full page cache to take load away from the database becomes even more evident during write-heavy (and thus page-cache-expiring) events like Cyber Monday and flash sales. On top of that, the traffic on our web app servers typically doubles each year, and we invested heavily in building out IdentityCache to help absorb this growth. For instance, during the last pre-IdentityCache sales peak in 2012, we saw 130,000 requests per minute generating 21,000 queries per second; the latest flash sale, in April 2013, generated 203,000 requests per minute with only 14,500 queries per second.

      What Exactly is IdentityCache?

      IdentityCache is a read-through cache for ActiveRecord models. When reading records, IdentityCache first tries to fetch the requested object from Memcached. If the cache entry doesn't exist, IdentityCache loads the object from the database and stores it in Memcached; the cached copy is then available for subsequent reads, avoiding further trips to the database. This behaviour is key during events that expire the cache often.

      Expiration is explicit and does not rely on Memcached's LRU, but it is automatic: objects are expired from the cache by issuing a Memcached delete command as they change in the database, via after_commit hooks. This works because, given a row in the database, we can always calculate its cache key from the current table schema and the row’s id. There is no need for the user to ever call delete themselves; it was a conscious decision to take expiration away from day-to-day developer concerns.
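In outline, the read-through-plus-delete behaviour looks like this (a sketch, not IdentityCache's actual internals; the key format and toy stand-ins are illustrative):

```ruby
require 'digest'

# Toy stand-ins for Memcached and the database.
CACHE = {}
DB    = { 1 => { id: 1, title: 'Widget' } }

# The cache key is always derivable from the table schema and the row id.
def product_cache_key(id)
  "Product:#{Digest::MD5.hexdigest('schema-v1')}:#{id}"
end

def fetch_product(id)
  CACHE.fetch(product_cache_key(id)) do
    record = DB[id]                        # miss: one trip to the database
    CACHE[product_cache_key(id)] = record  # populate for subsequent reads
  end
end

def expire_product(id)
  # What an after_commit hook does on writes: delete, never update in place.
  CACHE.delete(product_cache_key(id))
end
```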
      This has been a huge help as the characteristics of our application and Rails have changed. One great example is how Ruby on Rails changed which actions fire after_commit hooks: in Rails 3.2, touch will not fire an after_commit. Instead of having to add expires and think through all the possible ramifications every time, we added the after_touch hook into IdentityCache itself.

      Aside from the default key, built from the schema and the row id, IdentityCache uses developer-defined indexes to access your models. Those indexes simply consist of keys that can be created deterministically from other row fields and the current schema. Declaring an index also adds a helper method to fetch your cached models using that index.

      IdentityCache is opt-in: developers need to explicitly specify what should be indexed and explicitly ask for data from the cache. It is important that developers never have to guess whether calling a method will return a cached entry or not. We think this is a good thing. Having caching hook in automatically is nice in its simplest form, but IdentityCache wasn't built for simple applications; it was built for large, complicated applications where you want, and need, to know what's going on.

      Down to the Numbers

      If that wasn’t good enough, here are some numbers from Shopify itself.

      This is an example of when we introduced IdentityCache to one of the objects that is heavily hit on shop storefronts. As you can see, we cut out thousands of calls to the database when accessing this model. This was huge, since the database is one of the most heavily contended components of Shopify.

      This example shows similar results once IdentityCache was introduced. We eliminated what was approaching 50K calls per minute (and growing steadily) to almost nothing, since the subscription was now embedded with the Shop object. Another huge win from IdentityCache.

      Specifying Indexes

      Once you include IdentityCache in your model, you automatically get a fetch method added to your model class. Fetch behaves like find, plus the read-through cache behaviour.

      You can also add other indexes to your models so that you can load them using a different key. Here are a few examples:
      class Product < ActiveRecord::Base
        include IdentityCache
      end

      class Product < ActiveRecord::Base
        include IdentityCache
        cache_index :handle
      end
      We’ve tried to make IdentityCache as simple as possible to add to your models. For each cache index you add, you end up with a fetch_* method on the model to fetch those objects from the cache.

      You can also specify cache indexes that look at multiple fields. The code to do this would be as follows:

      class Product < ActiveRecord::Base
        include IdentityCache
        cache_index :shop_id, :id
      end

      Product.fetch_by_shop_id_and_id(shop_id, id)

      Caching Associations

      One of the great things about IdentityCache is that you can cache has_one, has_many and belongs_to associations as well as single objects. This really sets IdentityCache apart from similar libraries.
      This is a simple example of caching associations with IdentityCache:
      class Product < ActiveRecord::Base
        include IdentityCache
        has_many :images
        cache_has_many :images
      end

      @product = Product.fetch(id)
      @images = @product.fetch_images

      What happens here is the product is fetched from either Memcached or, on a cache miss, the database. We then look for the images in the cache, falling back to the database on another miss. This also works for has_one and belongs_to associations via cache_has_one and cache_belongs_to, respectively.

      What if we always want to load the images, though? Do we need to make two requests to the cache every time?

      Embedding Associations

      With IdentityCache we can also embed associations in the parent object, so that when you load the parent the associations are also cached and loaded on a cache hit. This avoids making multiple Memcached calls to load all the cached data. To enable this, you simply need to add the :embed => true option. Here's a little example:
      class Product < ActiveRecord::Base
        include IdentityCache
        has_many :images
        cache_has_many :images, :embed => true
      end

      @product = Product.fetch(id)
      @images = @product.fetch_images

      The main difference from the previous example is that the @product.fetch_images call won't hit Memcached a second time; the data is already loaded when we fetch the product from Memcached.
      The tradeoffs of using embed are twofold: first, your entries in Memcached will be larger, as they have to store data for the model and its embedded associations; second, the whole cache entry expires on changes to any of the cached models.

      There are a number of other options and different ways you can use IdentityCache, highlighted on the GitHub page at https://github.com/Shopify/identity_cache. I highly encourage anyone interested to take a look at those examples for more details. Please check it out for yourself and let us know what you think!

      Continue reading

      What Does Your Webserver Do When a User Hits Refresh?

      Your web application is likely rendering requests when the requesting client has already disconnected. Eric Wong helped us devise a patch for the Unicorn webserver that will test the client connection before calling the application, effectively dropping disconnected requests before wasting app server rendering time.

      The Flash Sale

      A common traffic pattern we see at Shopify is the flash sale, where a product will be discounted heavily or only available for a very short period of time. Our customers' flash sales can cause traffic spikes an order of magnitude above our typical traffic rate.

      This blog post highlights one of the problems dealing with these traffic surges that we solved during our preparation for the holiday shopping season.

      In a flash sale scenario, with our app servers under high load, response times grow.  As response times increase, customers attempting to buy items hit refresh in frustration, causing a snowball effect that contributes to reduced availability.

      Connection Queues 

      Each of our application servers runs Nginx in front of many Unicorn workers running our Rails application.  When Nginx receives a request, it opens a queued connection on the shared socket used to communicate with Unicorn.  The Unicorn workers work off requests in the order they're placed on the socket’s connection backlog.

      The worker process looks something like:
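In outline (a sketch, not Unicorn's actual code; the listener and client objects here are stand-ins):

```ruby
# Each worker loops over the shared socket's connection backlog.
def worker_loop(listener, app)
  loop do                            # `loop` exits cleanly on StopIteration
    client = listener.accept         # 1. take the next queued connection
    response = app.call(client.env)  # 2. render the request (the slow part)
    client.write(response)           # 3. write the response back
    client.close
  end
end
```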

      The second step takes the bulk of the time spent processing a request.  Under load, the queue of pending requests sitting on the UNIX socket from Nginx grows until it reaches maximum capacity (SOMAXCONN).  When the queue reaches capacity, Nginx will immediately return a 502 to incoming requests, as it has nowhere to queue the connection.

      Pending Requests

      While the app worker is busy rendering a request, the pending requests in the socket backlog represent users waiting for a result.  If a user hits refresh, their browser closes the current connection and their new connection enters the end of the queue (or Nginx returns a 502 if the queue is full).  So what happens when the application server gets to the user's original request in the queue?

      Nginx and HTTP 499

      The HTTP 499 response code is not part of the HTTP standard.  Nginx logs this response code when a user disconnects before the application returned a result.  Check your logs - an abundance of 499s is a good indication that your application is too slow or over capacity, as people are disconnecting instead of waiting for a response.  Your Nginx logs will always have some 499s due to clients disconnecting before even a quick request finishes.

      HTTP 200 vs HTTP 499 Responses During a Flash Sale

      When Nginx logs an HTTP 499 it also closes the downstream connection to the application, but it is up to the application to detect the closed connection before wasting time rendering a page for a client who already disconnected.

      Detecting Closed Sockets

      With the asynchronous nature of sockets, detecting a closed connection isn't straightforward.  Your options are:

      • Call select() on the socket.  If a connection is closed, it will return as "data available" but a subsequent read() call will fail.
      • Attempt to write to the socket.
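Both options are easy to see with a plain-Ruby experiment on a socket pair whose client end has already disconnected (no Unicorn involved):

```ruby
require "socket"

server, client = UNIXSocket.pair
client.close                                 # simulate the user disconnecting

# Option 1: select() reports the dead socket as "data available"...
readable, = IO.select([server], nil, nil, 0.1)
puts "select says readable: #{!readable.nil?}"
# ...but the subsequent read just returns EOF.
eof_result = server.read(1)                  # => nil
puts "read returned: #{eof_result.inspect}"

# Option 2: attempt to write; the closed peer raises an error immediately.
write_failed = false
begin
  server.write("x")
rescue Errno::EPIPE
  write_failed = true                        # the client is gone
end
puts "write failed: #{write_failed}"
```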

      Unfortunately it is typical for web applications to find out the client socket is closed only after spending the time and resources rendering the page, when it attempts to write the response.  This is what our Rails application was doing.  The net effect was that for every time a user pressed refresh, we would render that page, even if the user had already disconnected.  This would cause a snowball effect until eventually our app workers were doing little but rendering pages and throwing them away and our service was effectively down.

      What we wanted to do was test the connection before calling the application, so we could filter out closed sockets and avoid wasting time.  The first detection option above is not great: select() requires a timeout, and generally select() with even the shortest timeout will take a fraction of a millisecond to complete.  So we went with the second solution:  Write something to the socket to test it, before calling the application.  This is typically the best way to deal with resources anyway: just attempt to use them and there will be an error if there’s something in the way.  Unicorn was already acting that way, just not until after wasting time rendering the page.

      Just write an 'H'

      Thankfully all HTTP responses start with "HTTP/1.1", so (rather cheekily) our patch to Unicorn writes this string to test the connection before calling the application.  If writing to the socket fails, Unicorn moves on to process the next request and only a trivial amount of time is spent dealing with the closed connection.
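A sketch of the idea (illustrative; this is not the actual Unicorn patch, and the helper name is invented): attempt the write first, and skip the request entirely if it fails.

```ruby
require "socket"

# Hypothetical helper, not Unicorn's real code:
def client_connected?(socket)
  socket.write("HTTP/1.1")   # every HTTP response begins with this anyway
  true
rescue Errno::EPIPE, Errno::ECONNRESET
  false                      # client is gone; don't bother rendering
end

live, _peer = UNIXSocket.pair
dead, gone = UNIXSocket.pair
gone.close                   # this client already disconnected

puts client_connected?(live)   # => true
puts client_connected?(dead)   # => false
```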

      Eric Wong merged this change into Unicorn master and soon after released Unicorn v4.5.0.  To use this feature you must add 'check_client_connection true' to your Unicorn configuration.
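In a Unicorn config file that looks like this (file path illustrative):

```ruby
# config/unicorn.rb — requires unicorn >= 4.5.0
check_client_connection true
```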


      Continue reading

      Introducing the Super Debugger: A Wireless, Real-Time Debugger for iOS Apps

      Introducing the Super Debugger: A Wireless, Real-Time Debugger for iOS Apps

      By Jason Brennan

      LLDB is the current state of the art for iOS debugging, but it’s clunky and cumbersome, and it doesn't feel very different from gdb. It's a solid tool, but it requires breakpoints, and although it can be used with Objective C apps, it's not really built for them: dealing with objects is awkward, and it's hard to see your changes.

      This is where Super Debugger comes in. It's a new tool for rapidly exploring your objects in an iOS app, whether they're running on an iPhone, iPad, or the iOS Simulator, and it's available today on GitHub. Check out the included readme to see what it can do in detail.

      Today we're going to run through a demonstration of an included app called Debug Me.

      1. Clone the superdb repository locally to your Mac and change into the directory.

        git clone https://github.com/Shopify/superdb.git
        cd superdb
      2. Open the included workspace file, SuperDebug.xcworkspace, select the Debug Me target and Build and Run it for your iOS device or the Simulator. Make sure the device is on the same wifi network as your Mac.

      3. Go back to Xcode and change to the Super Debug target. This is the Mac app that you'll use to talk to your iOS app. Build and Run this app.

      4. In Super Debug, you'll see a window with a list of running, debuggable apps. Find Debug Me in the list (hint: it's probably the only one!) and double click it. This will open up the shell view where you can send messages to the objects in your app, all without setting a single break point.

      5. Now let's follow the instructions shown to us by the Debug Me app.

      6. In the Mac app, issue the command .self (note the leading dot). This updates the self pointer, which will execute a Block in the App delegate that returns whatever we want to be pointed to by the variable self. In this case (and in most cases), we want self to point to the current view controller. For Debug Me, that means it points to our instance of DBMEViewController after we issue this command.

      7. Now that our pointer is set up, we can send a message to that pointer. Type self redView layer setMasksToBounds:YES. This sends a chain of messages in F-Script syntax. In Objective C, it would look like [[[self redView] layer] setMasksToBounds:YES]. Here we omit the brackets because of our syntax.

        We do use parentheses sometimes, when passing the result of a message send would be ambiguous, for example something like this in Objective C: [view setBackgroundColor:[UIColor purpleColor]] would be view setBackgroundColor:(UIColor purpleColor) in our syntax.

      8. The previous step has no visible result, so let's make a change. Type self redView layer setCornerRadius:15 and see the red view get nice rounded corners!

      9. Now for the impressive part. Move your mouse over the number 15 and see it highlight. Now click and drag left or right, and see the view's corner radius update in real time. Awesome, huh?

      That should be enough to give you a taste of this brand new debugger. Interact with your objects in real-time. Iterate instantly. No more build, compile, wait. It's now Run, Test, Change. Fork the project on GitHub and get started today.

      Continue reading

      RESTful thinking considered harmful - followup

      RESTful thinking considered harmful - followup

      My previous post RESTful thinking considered harmful caused quite a bit of discussion yesterday. Unfortunately, many people seem to have missed the point I was trying to make. This is likely my own fault for focusing too much on the implementation, instead of the thinking process of developers that I was actually trying to discuss. For this reason, I would like to clarify some points.

      • My post was not intended as an argument against REST. I don't claim to be a REST expert, and I don't really care about REST semantics.
      • I am also not claiming that it is impossible to get the design right using REST principles in Rails.

      So what was the point I was trying to make?

      • Rails actively encourages the REST = CRUD design pattern, and all the tutorials, screencasts, and documentation out there focus on designing RESTful applications this way.
      • However, REST requires developers to realize that stuff like "publishing a blog post" is a resource, which is far from intuitive. This causes many new Rails developers to abuse the update action.
      • Abusing update makes your application lose valuable data. This is irrevocable damage.
      • Getting REST wrong may make your API less intuitive to use, but this can always be fixed in v2.
      • Getting a working application that properly supports your process should be your end goal, having it adhere to REST principles is just a means to get there.
      • All the focus on RESTful design and discussion about REST semantics makes new developers think this is actually the most important thing, and keeps them from getting their priorities straight.

      In the end, having a properly working application that doesn't lose data is more important than getting a proper RESTful API. Preferably, you want to have both, but you should always start with the former.

      Improving the status quo

      In the end, what I want to achieve is educating developers, not changing the way Rails implements REST. Rails conventions, generators, screencasts, and tutorials are all part of how we educate new Rails developers.

      • Rails should ship with a state machine implementation, and a generator to create a model based on it. Thinking of "publishing a blog post" as a transaction in a state machine is a lot more intuitive.
      • Tutorials, screencasts, and documentation should focus on using it to design your application. This would lead to better designed applications with fewer bugs and security issues.
      • You can always wrap your state machine in a RESTful API if you wish. But this should always come as step 2.
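To illustrate why this is more intuitive, here is a minimal plain-Ruby sketch (no gem or generator involved; the class and method names are invented):

```ruby
# "Publishing a blog post" modelled as a state machine transition
# rather than a generic update:
class Post
  attr_reader :state, :published_at

  def initialize
    @state = :draft
  end

  def publish!
    raise "can't publish from #{@state}" unless @state == :draft
    @state = :published
    @published_at = Time.now   # the transition records its own data
  end
end

post = Post.new
post.publish!
puts post.state   # => published
```

A RESTful API (e.g. POST /posts/1/publish) can then be wrapped around this model as a second step.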

      Hopefully this clarifies a bit better what I was trying to bring across.

      Continue reading

      RESTful thinking considered harmful

      RESTful thinking considered harmful

      It has been interesting and at times amusing to watch the last couple of intense debates in the Rails community. Of particular interest to me are the two topics that relate to RESTful design that ended up on the Rails blog itself: using the PATCH HTTP method for updates and protecting attribute mass-assignment in the controller vs. in the model.

      REST and CRUD

      These discussions are interesting because they are both about the update part of the CRUD model. PATCH deals with updates directly, and most problems with mass-assignment occur with updates, not with creation of resources.

      In the Rails world, RESTful design and the CRUD interface are closely intertwined: the best illustration for this is that the resource generator generates a controller with all the CRUD actions in place (read is renamed to show, and delete is renamed to destroy). Also, there is the DHH RailsConf '06 keynote linking CRUD to RESTful design.

      Why do we link those two concepts? Certainly not because this link was included in the original Roy Fielding dissertation on the RESTful paradigm. It is probably related to the fact that the CRUD actions map so nicely onto the SQL statements in relational databases that most web applications are built on (SELECT, INSERT, UPDATE and DELETE) on the one hand, and onto the HTTP methods that are used to access the web application on the other hand. So CRUD seems a logical link between the two.

      But do the CRUD actions map nicely onto the HTTP methods? DELETE is obvious, and the link between GET and read is also straightforward. Linking POST and create already takes a bit more imagination, but the link between PUT and update is not that clear at all. This is why PATCH was added to the HTTP spec and where the whole PUT/PATCH debate came from.

      Updates are not created equal

      In the relational world of the database, UPDATE is just an operator that is part of set theory. In the world of publishing hypermedia resources that is HTTP, PUT is just a way to replace a resource on a given URL; PATCH was added later to patch up an existing resource in an application-specific way.

      But was it an update in the web application world? It turns out that it is not so clear cut. Most web applications are built to support processes: they are OLTP systems. A clear example of an OLTP system supporting a process is an ecommerce application. In an OLTP system, there are two kinds of data: master data of the objects that play a role within the context of your application (e.g. customer and product) and process-describing data, the raison d'être of your application (e.g., an order in the ecommerce example).

      For master data, the semantics of an update are clear: the customer has a new address, or a product's description gets rewritten [1]. For process-related data it is not so clear cut: the process isn't so much updated; rather, the state of the process changes due to an event: a transaction. An example would be the customer paying the order.

      In this case, a database UPDATE is used to make the data reflect the new reality due to this transaction. The usage of an UPDATE statement actually is an implementation detail, and you can easily do that otherwise. For instance, the event of paying for an order could just as well be stored as a new record INSERTed into the order_payments table. Even better would be to implement the process as a state machine, two concepts that are closely linked, and to store the transactions so you can later analyze the process.

      Transactional design in a RESTful world

      RESTful thinking for processes therefore causes more harm than it does good. The RESTful thinker may design both the payment of an order and the shipping of an order as updates, using the HTTP PATCH method:

          PATCH /orders/42 # with { order: { paid: true  } }
          PATCH /orders/42 # with { order: { shipped: true } }

      Isn't that a nice DRY design? Only one controller action is needed, just one code path to handle both cases!

      But should your application in the first place be true to RESTful design principles, or true to the principles of the process it supports? I think the latter, so giving the different transactions different URIs is better:

          POST /orders/42/pay
          POST /orders/42/ship

      This is not only clearer, it also allows you to authorize and validate those transactions separately. Both transactions affect the data differently, and potentially the person that is allowed to administer the payment of the order may not be the same as the person shipping it.

      Some notes on implementation and security

      When implementing a process, every possible transaction should have a corresponding method in the process model. This method can specify exactly what data is going to be updated, and can easily make sure that no other data will be updated unintentionally.

      In turn, the controller should call this method on the model. Using update_attributes from your controller directly should be avoided: it is too easy to forget appropriate protection for mass-assignment, especially if multiple transactions in the process update different fields of the model. This also sheds some light in the protecting from mass-assignment debate: protection is not so much part of the controller or the model, but should be part of the transaction.
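Here is a plain-Ruby sketch of such a process model (class, field, and method names are invented; in a Rails app these methods would wrap the actual database updates). Each method touches only the fields its transaction is allowed to change, so there is nothing left for mass-assignment to hit:

```ruby
class Order
  attr_reader :state, :paid_at, :shipped_at

  def initialize
    @state = :open
  end

  # Transaction: the customer pays the order.
  def pay!(amount)
    raise "not payable" unless @state == :open
    @paid_at = Time.now
    @amount_paid = amount
    @state = :paid
  end

  # Transaction: the merchant ships the order.
  def ship!(tracking_number)
    raise "must be paid first" unless @state == :paid
    @shipped_at = Time.now
    @tracking_number = tracking_number
    @state = :shipped
  end
end
```

The controller behind POST /orders/42/pay then simply calls a method like pay! on the model, instead of passing request parameters to update_attributes.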

      Again, using a state machine to model the process makes following these principles almost a given, making your code more secure and bug-free.

      Improving Rails

      Finally, can we improve Rails to reflect these ideas and make it more secure? Here are my proposals:

      • Do not generate an update action that relies on calling update_attributes when running the resource generator. This way it won't be there if it doesn't need to be, reducing the possibility of a security problem.
      • Ship with a state machine implementation by default, and a generator for a state machine-backed process model. Be opinionated!

      These changes would point Rails developers into the right direction when designing their application, resulting in better, more secure applications.

      [1] You may even want to model changes to master data as transactions, to make your system fully auditable and to make it easy to return to a previous value, e.g. to roll back a malicious update to the ssh_key field in the users table.

      A big thanks to Camilo Lopez, Jesse Storimer, John Duff and Aaron Olson for reading and commenting on drafts of this article.


      Update: apparently many people missed the point I was trying to make. Please read the followup post in which I try to clarify my point.

      Continue reading

      Webhook Best Practices

      Webhook Best Practices

      Webhooks are brilliant when you’re running an app that needs up-to-date information from a third party. They’re simple to set up and really easy to consume.

      Through working with our third-party developer community here at Shopify, we’ve identified some common problems and caveats that need to be considered when using webhooks. Best practices, if you will.

      When Should I Be Using Webhooks?

      Let’s start with the basics. The obvious case for webhooks is when you need to act on specific events. In Shopify, this includes actions like an order being placed, a product price changing, etc. If you would otherwise have to poll for data, you should be using webhooks.

      Another common use-case we’ve seen is when you’re dealing with data that isn’t easily searchable through the API you’re dealing with. Shopify offers several filters on our index requests, but there’s a fair amount of secondary or implied data that isn’t directly covered by these. Re-requesting the entire product catalog of a store whenever you want to search by SKU or grabbing the entire order history when you need to find all shipping addresses in a particular city is highly inefficient. Fortunately some forward planning and webhooks can help.

      Let’s use searching for product SKUs on Shopify as an example:

      The first thing you should do is grab a copy of the store’s product catalog using the standard REST interface. This may take several successive requests if there’s a large number of products. You then persist this using your favourite local storage solution.

      Then you can register a webhook on the product/updated event that captures changes and updates your local copy accordingly. Bam, now you have a fully searchable up-to-date product catalog that you can transform or filter any way you please.
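Here is an illustrative sketch of the handler side (the payload shape is simplified; real Shopify product payloads contain many more fields):

```ruby
require "json"

# Local product catalog, seeded by the initial REST import.
CATALOG = {}   # product_id => product hash

# Called with the raw body of each product/updated webhook:
def handle_product_update(raw_body)
  product = JSON.parse(raw_body)
  CATALOG[product["id"]] = product   # overwrite the stale local copy
end

handle_product_update('{"id": 42, "title": "T-Shirt", "variants": [{"sku": "TS-1"}]}')

# The catalog is now searchable locally any way you please, e.g. by SKU:
match = CATALOG.values.find { |p| p["variants"].any? { |v| v["sku"] == "TS-1" } }
puts match["title"]   # => T-Shirt
```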

      How Should I Handle Webhook Requests?

      There’s no official spec for webhooks, so the way they’re served and managed is up to the originating service. At Shopify we’ve identified two key issues:

      • Ensuring delivery/Detecting failure
      • Protecting our system

      To this end, we’ve implemented a 10-second timeout period and a retry period for subscriptions. We wait 10 seconds for a response to each request, and if there isn’t one or we get an error, we retry the connection several times over the next 48 hours.

      If you’re receiving a Shopify webhook, the most important thing to do is respond quickly. There have been several historical occurrences of apps doing lengthy processing when they receive a webhook, which triggered the timeout. This has led to situations where webhooks were removed from functioning apps. Oops!

      To make sure that apps don’t accidentally run over the timeout limit, we now recommend that apps defer processing until after the response has been sent. In Rails, Delayed Jobs are perfect for this.
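Here is a plain-Ruby sketch of the respond-first, process-later pattern using a thread and a queue (in a real Rails app you would push onto a proper job queue such as Delayed Job instead):

```ruby
require "json"

JOBS = Queue.new

# Background worker: does the lengthy processing long after the response went out.
worker = Thread.new do
  while (payload = JOBS.pop)
    order = JSON.parse(payload)
    puts "processed order #{order['id']}"
  end
end

# Webhook endpoint: enqueue and respond immediately, well inside the 10s timeout.
def handle_webhook(raw_body)
  JOBS.push(raw_body)
  200
end

status = handle_webhook('{"id": 1001}')
puts "responded with #{status}"
JOBS.push(nil)   # tell the worker to shut down
worker.join
```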

      What Do I Do if Everything Blows Up?

      This one is a key component of good software design in general, but I think it’s worth mentioning here as the scope is beyond the usual recommendations about data validation and handling failures gracefully.

      Imagine the worst case scenario: Your hosting centre exploded and your app has been offline for more than 48 hours. Ouch. It’s back on its feet now, but you’ve missed a pile of data that was sent to you in the meantime. Not only that, but Shopify has cancelled your webhooks because you weren’t responding for an extended period of time.

      How do you catch up? Let’s tackle the problems in order of importance.

      Getting your webhook subscriptions back should be straightforward, as your app already has the code that registered them in the first place. If you know for sure that they’re gone you can just re-run that and you’ll be good to go. One thing I’d suggest is adding a quick check that fetches all the existing webhooks and only registers the ones that you need.

      Importing the missing data is trickier. The best way to get it back is to build a harness that fetches data from the time period you were down for and feeds it into the webhook processing code one object at a time. The only caveat is that you’ll need the processing code to be sufficiently decoupled from the request handlers that you can call it separately.
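Here is a sketch of such a harness (the api and processor interfaces are invented for illustration), exercised against a stub API:

```ruby
# Fetch everything created while you were down and feed it through the same
# code path the live webhook handler uses, one object at a time.
def replay_missed_orders(downtime_start, api, processor)
  page = 1
  loop do
    orders = api.orders(:created_at_min => downtime_start, :page => page)
    break if orders.empty?
    orders.each { |order| processor.call(order) }
    page += 1
  end
end

# Stub API that serves two pages of missed orders, then runs dry:
fake_api = Object.new
def fake_api.orders(_opts)
  @pages ||= [[{ :id => 1 }, { :id => 2 }], [{ :id => 3 }], []]
  @pages.shift
end

seen = []
replay_missed_orders(Time.now - 3600, fake_api, ->(order) { seen << order[:id] })
puts seen.inspect   # => [1, 2, 3]
```

The only caveat, as noted above, is that the processing code must be decoupled enough from the request handlers to be callable like this.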

      Webhooks Sound Magic, Where Can I Learn More?

      We have a comprehensive wiki page on webhooks as well as technical documentation on how to manage webhooks in your app.

      There’s also a good chunk of helpful threads on our Developer Mailing List.

      Continue reading

      Defining Churn Rate (no really, this actually requires an entire blog post)

      Defining Churn Rate (no really, this actually requires an entire blog post)

      If you go to three different analysts looking for a definition of "churn rate," they will all agree that it's an important metric and that the definition is self-evident. Then they will go ahead and give you three different definitions. And as they share their definitions with each other they all have the same response: why is everyone else making this so complicated?

      Continue reading

      Application Proxies: The New Hotness

      Application Proxies: The New Hotness

      I’m pleased to announce a brand new feature that we recently added to the Shopify API: Application Proxies. These will allow you to develop all kinds of crazy things that weren’t possible before, and we’re really excited about it. Let me explain.

      What’s an App Proxy?

      An App Proxy is simply a page within a Shopify shop that loads its content from another location of your choosing. Applications can tell certain shop pages that they should fetch and display data from another location outside of Shopify.

      The really cool thing about the implementation we’ve put together is that if you return data with the application/liquid content-type we’ll run it through Shopify’s template rendering engine before pushing it out to the user. This allows you to create dynamic native pages without having to do anything crazy with iframes. I’ll explain this in more detail later.

      How Do I Set This Up?

      We have a great App Proxy tutorial over on our API docs that takes you through the steps, but I’ll summarize them here too.

      The first thing you need to do is set up the path that should be proxied and where it should be proxied to. This is done from your app’s configuration screen on the Partners dashboard.

      Once you’ve done that, you’ll need to work out what data you’re going to return when the specified URL is hit. You can return anything you want but for now we’re going to show some very simple stats to get the ball rolling.

      All my examples assume that you’re using the shopify_app gem as a starting point, but the topics I cover translate directly to all languages.

      Before we do anything else we need a controller to handle the calls. I generated a ProxyController class and mapped /proxy to hit its index method. I also created a template to render the response.

      Here’s the controller:

      class ProxyController < ApplicationController
        def index
        end
      end

      And here’s the template:

      <h1>Hello App Proxy World</h1>

      Really easy so far. Now we can start our rails app and visit the proxied page in a browser. It should look something like this:

      Not much to see here just yet. In fact, it looks nothing like our shop. Let’s do something about that.

      What we want is for Shopify to render the page as if it were any other native data, using the Liquid engine. We tell Shopify to do this by setting the content-type header on our response to application/liquid. At the same time we’re going to tell Rails not to use its own layouts when rendering the page.

      Add this line to the index method in ProxyController

      render :layout => false, :content_type => 'application/liquid'

      Now save and reload the page. Tada! Here’s what you’ll see:

      Good, eh?

      Next Steps

      Static text is all well and good, but it’s not very interesting. What we really want here are some stats. I’ve chosen to display the shop’s takings as well as a link to the most popular product for the last week.

      Now that we’re trying to access shop data we need to figure out which shop is sending us the request in the first place. Fortunately the URL of the shop is one of the GET parameters on the request, so we can grab that and use it to configure our environment to make API calls. Details on how to do this are documented here, so go set that up and then come back when you’re done. I’ll wait.

      Back? Excellent. Let’s put some info into our response. here’s what your ProxyController should look like now:

      class ProxyController < ApplicationController
        def index
          ShopifyAPI::Base.site = Shop.find_by_name(params[:shop]).api_url
          @orders = ShopifyAPI::Order.find(:all, :params => {:created_at_min => 1.week.ago})
          @total = 0
          @product_sale_counts = Hash.new
          @orders.each do |order|
            order.line_items.each do |line_item|
              if @product_sale_counts[line_item.product_id]
                @product_sale_counts[line_item.product_id] += line_item.quantity
              else
                @product_sale_counts[line_item.product_id] = line_item.quantity
              end
            end
            @total += order.total_price.to_i
          end
          top_seller_stats = @product_sale_counts.max_by { |k, v| v }
          @product = ShopifyAPI::Product.find(top_seller_stats.first)
          @top_seller_count = top_seller_stats.last
          render :layout => false, :content_type => 'application/liquid'
        end
      end

      And here’s the template:

      <h1>This Week's Earnings</h1>
      <p><%= number_to_currency(@total)%> from <%= @orders.count%> orders</p>
      <h1>Top Seller: <%= link_to(@product.title, url_for_product(@product)) %></h1>
      <p>This product sold <%= @top_seller_count %> units</p>

      Here's the finished product. The CSS could use some work, but all our info is there and matches the theme perfectly:

      A Word On Security

      So far so good, but right now there’s no security on our proxy. Anyone sending a request to that URL with a ‘shop’ parameter will get data back. Oops! Let’s fix that.

      Just like our webhooks, we sign all our proxy requests. There are details in the API docs on exactly how this is done, but for simplicity’s sake just add this private function to your ProxyController and add it as a before_filter:

      def verify_request_source
        url_parameters = {
          "shop" => params[:shop],
          "path_prefix" => params[:path_prefix],
          "timestamp" => params[:timestamp]
        }
        sorted_params = url_parameters.collect { |k, v| "#{k}=#{Array(v).join(',')}" }.sort.join
        calculated_signature = OpenSSL::HMAC.hexdigest(OpenSSL::Digest::Digest.new('sha256'),
          ShopifyAppProxyExample::Application.config.shopify.secret, sorted_params)
        raise 'Invalid signature' if params[:signature] != calculated_signature
      end

      Great! Now you can be sure that Shopify is the one sending you this data and not some dirty impostor.

      There you have it. Application Proxies are a great way to introduce dynamic third-party content into a native shop page. There's a lot more that you can do with them, far too much to cover in a single blog post.

      If you're interested I encourage you to set up a quick app and give them a try. You can also discuss potential ideas with other developers on our dev mailing list.

      Continue reading

      Three Months of CoffeeScript

      Three Months of CoffeeScript

      Guest Post by Kamil Tusznio!

      Kamil’s a developer at Shopify and has been working in our developer room just off the main “bullpen” that I like to refer to as “The Batcave”. That’s where the team working on the Batman.js framework have been working their magic. Kamil asked if he could post an article on the blog about his experiences with CoffeeScript and I was only too happy to oblige.


      Since joining the Shopify team in early August, I have been working on Batman.js, a single-page app micro-framework written purely in CoffeeScript. I won't go into too much detail about what CoffeeScript is, because I want to focus on what it allows me to do.

      Batman.js has received some flak for its use of CoffeeScript, and more than one tweet has asked why we didn't call the framework Batman.coffee. I feel the criticism is misguided, because CoffeeScript allows you to more quickly write correct code, while still adhering to the many best practices for writing JavaScript.

      An Example

      A simple example is iteration over an object. The JavaScript would go something like this:

      var obj = {
        a: 1,
        b: 2,
        c: 3
      };

      for (var key in obj) {
        if (obj.hasOwnProperty(key)) { // only look at direct properties
          var value = obj[key];
          // do stuff...
        }
      }

      Meanwhile, the CoffeeScript looks like this:

      obj =
        a: 1
        b: 2
        c: 3
      for own key, value of obj
        # do stuff...

      Notice the absence of var, hasOwnProperty, and needing to assign value. And best of all, no semi-colons! Some argue that this adds a layer of indirection to the code, which it does, but I'm writing less code, resulting in fewer opportunities to make mistakes. To me, that is a big win.


      Another criticism levelled against CoffeeScript is that debugging becomes harder. You're writing .coffee files that compile down to .js files. Most of the time, you won't bother to look at the .js files. You'll just ship them out, and you won't see them until a bug report comes in, at which point you'll be stumped by the compiled JavaScript running in the browser, because you've never looked at it.

      Wait, what? What happened to testing your code? CoffeeScript is no excuse for not testing, and to test, you run the .js files in your browser, which just about forces you to examine the compiled JavaScript.

      (Note that it's possible to embed text/coffeescript scripts in modern browsers, but this is not advisable for production environments since the browser is then responsible for compilation, which slows down your page. So ship the .js.)

      And how unreadable is that compiled JavaScript? Let's take a look. Here's the compiled version of the CoffeeScript example from above:

      var key, obj, value;
      var __hasProp = Object.prototype.hasOwnProperty;
      obj = {
        a: 1,
        b: 2,
        c: 3
      };
      for (key in obj) {
        if (!__hasProp.call(obj, key)) continue;
        value = obj[key];
        // do stuff...
      }

      Admittedly, this is a simple example. But, after having worked with some pretty complex CoffeeScript, I can honestly say that once you become familiar (which doesn't take long), there aren't any real surprises. Notice also the added optimizations you get for free: local variables are collected under one var statement, and hasOwnProperty is called via the prototype.

      For more complex examples of CoffeeScript, look no further than the Batman source.


      I'm always worried when I come across tools that add a level of indirection to my workflow, but CoffeeScript has not been bad in this respect. The only added step to getting code shipped out is running the coffee command to watch for changes in my .coffee files:

      coffee --watch --compile src/ --output lib/

      We keep both the .coffee and .js files under git, so nothing gets lost. And since you still have .js files kicking around, any setup you have to minify your JavaScript shouldn't need to change.


      After three months of writing CoffeeScript, I can hands-down say that it's a huge productivity booster. It helps you write more elegant and succinct code that is less susceptible to JavaScript gotchas.

      Further Reading

      [ This article also appears in Global Nerdy. ]

      Continue reading

      Most Memory Leaks are Good

      Most Memory Leaks are Good


      Catastrophe! Your app is leaking memory. When it runs in production it crashes and starts raising Errno::ENOMEM exceptions. So you babysit it and restart it constantly so that your app keeps responding.

      As hard as you try you don’t see any memory leaks. You use the available tools, but you can’t find the leak. Understanding your full stack, knowing your tools, and good ol’ debugging will help you find that memory leak.

      Memory leaks are good?

      Yes! Depending on your definition. A memory leak is any memory that is allocated, but never freed. This is the basis of anything global in your programs. 

      In a Ruby program global variables are allocated but will never be freed. Same goes with constants, any constant you define will be allocated and never freed. Without these things we couldn’t be very productive Ruby programmers.

      But there’s a bad kind

      The bad kind of memory leak involves some memory being allocated and never freed, over and over again. For example, if an array held in a constant is appended to each time a web request is made to a Rails app, that's a memory leak: the constant will never be freed, and its memory consumption will only grow and grow.
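The distinction between the two kinds is easy to see in a few lines of Ruby (a made-up sketch; the constant names and the request handler are illustrative, not from any real app):

```ruby
# The "good" kind: a constant is allocated once and intentionally
# lives for the whole life of the process.
DEFAULT_SETTINGS = { currency: "CAD", locale: "en" }

# The "bad" kind: the constant itself is never freed, and appending
# to it on every request makes its memory grow without bound.
REQUEST_LOG = []

def handle_request(path)
  REQUEST_LOG << path # never trimmed: one entry leaks per request
end
```

After a million requests, REQUEST_LOG holds a million strings the GC can never reclaim, which is exactly the growth pattern described above.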

      Separating the good and the bad

      Unfortunately, there’s no easy way to separate the good memory leaks from the bad ones. The computer can see that you’re allocating memory, but, as always, it doesn’t understand what you’re trying to do, so it doesn’t know which memory leaks are unintentional.

      To make matters more muddy, the computer can’t differentiate between a memory leak in Ruby-land and a memory leak in C-land. It’s all just memory.

      If you’re using a C extension that’s leaking memory there are tools specific to the C language that can help you find memory leaks (Valgrind). If you have Ruby code that is leaking memory there are tools specific to the Ruby language that can help you (memprof). Unfortunately, if you have a memory leak in your app and have no idea where it’s coming from, selecting a tool can be really tough.

      How bad can memory leaks get?

      This begins the story of a rampant memory leak we experienced at Shopify at the beginning of this year. Here’s a graph showing the memory usage of one of our app servers during that time.

      You can see that memory consumption continues to grow unhindered as time goes on! Those first two spikes which break the 16G mark show that memory consumption climbed above the limit of physical memory on the app server, so we had to rely on the swap. With that large spike the app actually crashed, raising Errno::ENOMEM errors for our users.

      After that you can see many smaller spikes. We wrote a script to periodically reboot the app, which releases all of the memory it was using. This was obviously not a sustainable solution. Case in point: the last spike on the graph shows that we had an increase in traffic which resulted in memory usage growing beyond the limits of physical memory again.
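A stopgap of that sort can be as simple as a watchdog that polls the app's resident set size and flags it for restart past a threshold. This is a hypothetical sketch (the article doesn't show the actual script, and the limit here is invented):

```ruby
# Hypothetical memory watchdog: read a process's RSS via `ps` and report
# when it crosses a limit, at which point a supervisor would restart it.
MEMORY_LIMIT_KB = 12 * 1024 * 1024 # 12 GB, an illustrative threshold

def rss_kb(pid)
  `ps -o rss= -p #{pid}`.to_i # resident set size in kilobytes
end

def needs_restart?(pid)
  rss_kb(pid) > MEMORY_LIMIT_KB
end
```

As the post says, this is no fix at all: the leak simply fills the allowance again.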

      So, while all this was going on we were searching high and low to find this memory leak.

      Where to begin?

      The golden rule is to make the leak reproducible. Like any bug, once you can reproduce it you can surely fix it. For us, that meant a couple of things:

      1. When testing, reproduce your production environment as closely as possible. Run your app in production mode on localhost, and set up the same stack that you have on production. Ensure that you are running the exact same versions of the software that runs on production.

      2. Be aware of any issues happening on production. Are there any known issues with the production environment? Losing connections to the database? Firewall routing traffic properly? Be aware of any weird stuff that’s happening and how it may be affecting your problem.


      Now that we’ve laid out the basics at a high level, we’ll dive into a tool that can help you find memory leaks.

      Memprof is a memory profiling tool built by ice799 and tmm1. Memprof does some crazy stuff like rewriting the current Ruby binary at runtime to hot patch features like object allocation tracking. Memprof can do stuff like tell you how many objects are currently alive in the Ruby VM, where they were allocated, what their internal state is, etc.

      VM Dump

      The first thing that we did when we knew there was a problem was to reach into the toolbox and try out memprof. This was my first experience with the tool; my only exposure had been a presentation by @tmm1 that detailed some heavy duty profiling by dumping every live object in the Ruby VM in JSON format and using MongoDB to perform analysis.

      Without any other leads we decided to try this method. After hitting our staging server with some fake traffic we used memprof to dump the VM to a JSON file. An important note is that we did not reproduce the memory leak on our staging server; we just took a look at the dump file anyway.

      Our dump of the VM came out at about 450MB of JSON. We loaded it into MongoDB and did some analysis. We were surprised by what we found. There were well over 2 million live objects in the VM, and it was very difficult to tell at a glance which should be there and which should not.

      As mentioned earlier there are some objects that you want to ‘leak’, especially true when it comes to Rails. For instance, Rails uses ActiveSupport::Callbacks in many key places, such as ActiveRecord callbacks or ActionController filters. We had tons of Proc objects created by ActiveSupport::Callbacks in our VM, but these were all things that needed to stick around in order for Shopify to function properly.

      This was too much information, with not enough context, for us to do anything meaningful with.

      Memprof stats

      More useful, in terms of context, is having a look at Memprof.stats and the middleware that ships with Memprof. Using these you can get an idea of what is being allocated during the course of a single web request, and ultimately how that changes over time. It’s all about noticing a pattern of live objects growing over time without stopping.


      The other useful tool we used was memprof.com. It allows you to upload a JSON VM dump (via the memprof gem) and analyse it using a slick web interface that picks up on patterns in the data and shows relevant reports. It has since been taken offline and open sourced by tmm1 at https://github.com/tmm1/memprof.com.

      Unable to reproduce our memory leak on development or staging we decided to run memprof on one of our production app servers. We were only able to put it in rotation for a few minutes because it increased response time by 1000% due to the modifications made by memprof. The memory leak that we were experiencing would typically take a few hours to show itself, so we weren’t sure if a few minutes of data would be enough to notice the pattern we were looking for.

      We uploaded the JSON dump to memprof.com and started using the web UI to look for our problem. Different people on the team got involved and, as I mentioned earlier, this data can be confusing. After seeing the huge number of Proc objects from ActiveSupport::Callbacks some claimed that “ActiveSupport::Callbacks is obviously leaking objects on every request”. Unfortunately it wasn’t that simple and we weren’t able to find any patterns using memprof.com.

      Good ol’ debugging: Hunches & Teamwork

      Unable to make progress using these approaches we were back to square one. I began testing locally again and, through spying on Activity Monitor, thought that I noticed a pattern emerging. So I double-checked that I had all the same software stack running that our production environment has, and then the pattern disappeared.

      It was odd, but I had a hunch that it had something to do with a bad connection to memcached. I shared my hunch with @wisqnet and he started doing some testing of his own. We left our chat window open as we were testing and shared all of our findings.

      This was immensely helpful so that we could both begin tracing patterns between each other’s results. Eventually we tracked down a pattern. If we consistently hit a URL we could see the memory usage climb and never stop. We eventually boiled it down to a single line of code:

      loop { Rails.cache.write(rand(10**10).to_s, rand(10**10).to_s) }

      If we ran that code in a console and then shut down the memcached instance it was using, memory usage immediately spiked.

      Now What?

      Now that it was reproducible we were able to experiment with fixing it. We tracked the issue down to our memcached client library. We immediately switched libraries and the problem disappeared in production. We let the library author know about the issue and he had it fixed in hours. We switched back to our original library and all was well!


      It turned out that the memory leak was happening in a C extension, so the Ruby tools would not have been able to find the problem.

      Three pieces of advice to anyone looking for a memory leak:

      1. Make it reproducible!
      2. Trust your hunches, even if they don’t make sense.
      3. Work with somebody else. Bouncing your theories off of someone else is the most helpful thing you can do.

      Continue reading

      How Batman can Help you Build Apps

      How Batman can Help you Build Apps

      Batman.js is Shopify’s new open source CoffeeScript framework, and I’m absolutely elated to introduce it to the world after spending so much time on it. Find Batman on GitHub here.

      Batman emerges into a world populated with extraordinary frameworks being used to great effect. With the incredible stuff being pushed out in projects like Sproutcore 2.0 and Backbone.js, how is a developer to know what to use when? There’s only so much time to play with cool new stuff, so I’d like to give a quick tour of what makes Batman different and why you might want to use it instead of the other amazing frameworks available today.

      Batman makes building apps easy

      Batman is a framework for building single page applications. It’s not a progressive enhancement or a single purpose DOM or AJAX library. It’s built from the ground up to make building awesome single page apps easy by implementing all the lame parts of development like cross browser compatibility, data transport, validation, custom events, and a whole lot more. We provide handy helpers for development to generate and serve code, a recommended app structure to help you organize code and call it when necessary, a full MVC stack, and a bunch of extras, all while remaining less than 18k when gzipped. Batman doesn’t provide only the basics, or the whole kitchen sink, but a fluid API that allows you to write the important code for your app and none of the boilerplate.

      A super duper runtime

      At the heart of Batman is a runtime layer used for manipulating data from objects and subscribing to events objects may emit. Batman’s runtime is used similarly to SproutCore’s or Backbone’s in that all property access and assignment on Batman objects must be done through someObject.get and someObject.set, instead of using standard dot notation like you might in vanilla JavaScript. Adhering to this property system allows you to:

      • transparently access “deep” properties which may be simple data or computed by a function,
      • inherit said computed properties from objects in the prototype chain,
      • subscribe to events like change or ready on other objects at “deep” keypaths,
      • and most importantly, dependencies can be tracked between said properties, so chained observers can be fired and computations can be cached while guaranteed to be up-to-date.

      All this comes free with every Batman object, and they still play nice with vanilla JavaScript objects. Let’s explore some of the things you can do with the runtime. Properties on objects can be observed using Batman.Object::observe:

      crimeReport = new Batman.Object
      crimeReport.observe 'address', (newValue) ->
        if DangerTracker.isDangerous(newValue)
          DangerTracker.warn newValue # illustrative body; DangerTracker is a fictional helper

      This kind of stuff is available in Backbone and SproutCore both, however we’ve tried to bring something we missed in those frameworks to Batman: “deep” keypaths. In Batman, any keypath you supply can traverse a chain of objects by separating the keys by a . (dot). For example:

      batWatch = Batman
        currentCrimeReport: Batman
          address: Batman
            number: "123"
            street: "Easy St"
            city: "Gotham"
      batWatch.get 'currentCrimeReport.address.number' #=> "123"
      batWatch.set 'currentCrimeReport.address.number', "461A"
      batWatch.get 'currentCrimeReport.address.number' #=> "461A"

      This works for observation too:

      batWatch.observe 'currentCrimeReport.address.street', (newStreet, oldStreet) ->
        if DistanceCalculator.travelTime(newStreet, oldStreet) > 100000
          console.log "time to relocate" # illustrative body; DistanceCalculator is a fictional helper

      The craziest part of the whole thing is that these observers will always fire with the value of whatever is at that keypath, even if intermediate parts of the keypath change.

      crimeReportA = Batman
        address: Batman
          number: "123"
          street: "Easy St"
          city: "Gotham"
      crimeReportB = Batman
        address: Batman
          number: "72"
          street: "Jolly Ln"
          city: "Gotham"
      batWatch = new Batman.Object({currentCrimeReport: crimeReportA})
      batWatch.get('currentCrimeReport.address.street') #=> "Easy St"
      batWatch.observe 'currentCrimeReport.address.street', (newStreet) ->
        MuggingWatcher.check newStreet # the "MuggingWatcher" callback (fictional helper)
      batWatch.set('currentCrimeReport', crimeReportB)
      # the "MuggingWatcher" callback above will have been called with "Jolly Ln"

      Notice what happened? Even though the middle segment of the keypath changed (a whole new crimeReport object was introduced), the observer fires with the new deep value. This works with arbitrary length keypaths as well as intermingled undefined values.

      The second neat part of the runtime is that because all access is done through get and set, we can track dependencies between object properties which need to be computed. Batman calls these functions accessors, and using the CoffeeScript executable class bodies they are really easy to define:

      class BatWatch extends Batman.Object
        # Define an accessor for the `currentDestination` key on instances of the BatWatch class.
        @accessor 'currentDestination', ->
          address = @get 'currentCrimeReport.address'
          return "#{address.get('number')} #{address.get('street')}, #{address.get('city')}"
      crimeReport = Batman
        address: Batman
          number: "123"
          street "Easy St"
          city: "Gotham"
      watch = new BatWatch(currentCrimeReport: crimeReport)
      watch.get('currentDestination') #=> "123 Easy St, Gotham"

      Importantly, the observers you may attach to these computed properties will fire as soon as you update their dependencies:

      watch.observe 'currentDestination', (newDestination) -> console.log newDestination
      crimeReport.set('address.number', "124")
      # "124 Easy St, Gotham" will have been logged to the console

      You can also define the default accessors which the runtime will fall back on if an object doesn’t already have an accessor defined for the key being getted or setted.

      jokerSimulator = new Batman.Object
      jokerSimulator.accessor (key) -> "#{key.toUpperCase()}, HA HA HA!"
      jokerSimulator.get("why so serious") #=> "WHY SO SERIOUS, HA HA HA!"

      This feature is useful when you want to present a standard interface to an object, but work with the data in nontrivial ways underneath. For example, Batman.Hash uses this to present an API similar to a standard JavaScript object, while emitting events and allowing objects to be used as keys.

      What’s it useful for?

      The core of Batman as explained above makes it possible to know when data changes as soon as it happens. This is ideal for something like client side views. They’re no longer static bundles of HTML that get cobbled together as a long string and sent to the client, they are long lived representations of data which need to change as the data does. Batman comes bundled with a view system which leverages the abilities of the property system.

      A simplified version of the view for Alfred, Batman’s todo manager example application, lies below:

      <ul id="items">
          <li data-foreach-todo="Todo.all" data-mixin="animation">
              <input type="checkbox" data-bind="todo.isDone" data-event-change="todo.save" />
              <label data-bind="todo.body" data-addclass-done="todo.isDone" data-mixin="editable"></label>
              <a data-event-click="todo.destroy">delete</a>
          <li><span data-bind="Todo.all.length"></span> <span data-bind="'item' | pluralize Todo.all.length"></span></li>
      <form data-formfor-todo="controllers.todos.emptyTodo" data-event-submit="controllers.todos.create">
        <input class="new-item" placeholder="add a todo item" data-bind="todo.body" />

      We sacrifice any sort of transpiler layer (no HAML), and any sort of template layer (no Eco, jade, or mustache). Our views are valid HTML5, rendered by the browser as soon as they have been downloaded. They aren’t JavaScript strings, they are valid DOM trees which Batman traverses and populates with data without any compilation or string manipulation involved. The best part is that Batman “binds” a node’s value by observing the value using the runtime as presented above. When the value changes in JavaScript land, the corresponding node attribute(s) bound to it update automatically, and the user sees the change. Vice versa remains true: when a user types into an input or checks a checkbox, the string or boolean is set on the bound object in JavaScript. The concept of bindings isn’t new, as you may have seen it in things like Cocoa, or in Knockout or Sproutcore in JS land.

      We chose to use bindings because we a) don’t want to have to manually check for changes to our data, and b) don’t want to have to re-render a whole template every time one piece of data changes. With mustache or jQuery.tmpl and company, I end up doing both those things surprisingly often. It seems wasteful to re-render every element in a loop and pay the penalty for appending all those nodes, when only one key on one element changes, and we could just update that one node. SproutCore’s ‘SC.TemplateView’ with Yehuda Katz' Handlebars.js do a good job of mitigating this, but we still didn’t want to do all the string ops in the browser, and so we opted for the surgical precision of binding all the data in the view to exactly the properties we want.

      What you end up with is a fast render with no initial loading screen, at the expense of the usual level of complex logic in your views. Batman’s view engine provides conditional branching, looping, context, and simple transforms, but that’s about it. It forces you to write any complex interaction code in a packaged and reusable Batman.View subclass, and leave the HTML rendering to the thing that does it best: the browser.


      Batman does more than this fancy deep keypath stuff and these weird HTML views-but-not-templates. We have a routing system for linking from quasi-page to quasi-page, complete with named segments and GET variables. We have a Batman.Model layer for retrieving and sending data to and from a server which works out of the box with storage backends like Rails and localStorage. We have other handy mixins for use in your own objects like Batman.StateMachine and Batman.EventEmitter. And, we have a lot more on the way. I strongly encourage you to check out the project website, the source on GitHub, or visit us in #batmanjs on freenode. Any questions, feedback, or patches will be super welcome, and we’re always open to suggestions on how we can make Batman better for you.

      Until next time….

      Continue reading

      Making Apps using Python, Django and App Engine

      Making Apps using Python, Django and App Engine

      We recently announced the release of our Python adaptor for the Shopify API. Now we would like to inform you that we have got it working well with the popular Django web framework and Google App Engine hosting service. But don't just take my word for it, you can see a live example on App Engine and check out the example's source code on GitHub. The example application isn't limited to Google App Engine, it can run as a regular Django application allowing you to explore other hosting options.

      The shopify_app directory in the example contains the reusable Django app code. This directory contains views for handling user login and authentication, and for saving the Shopify session upon finalization. Middleware is included which loads the session to automatically re-initialize the Python Shopify API for each request. There is also a @shop_login_required decorator for view functions that require login, which will redirect logged-out users to the login page. As a result, your view function can be as simple as the following to display basic information about the shop's products and orders.

      def index(request):
          products = shopify.Product.find(limit=3)
          orders = shopify.Order.find(limit=3, order="created_at DESC")
          return render_to_response('home/index.html', {
              'products': products,
              'orders': orders,
          }, context_instance=RequestContext(request))

      Getting Started for Regular Django App

      1. Install the dependencies with this command:
        easy_install Django ShopifyAPI PyYAML pyactiveresource
      2. Download and unzip the zip file for the example application

      Getting Started for Google App Engine

      1. Install the App Engine SDK
      2. Download and unzip the example application zip file for App Engine which includes all the dependencies.
      3. Create an application with Google App Engine, and modify the application line in app.yaml with the application ID registered with Google App Engine.


        1. Create a Shopify app through the Shopify Partner account with the Return URL set to http://localhost:8000/login/finalize, and modify shopify_settings.py with the API-Key and Shared Secret for the app.
        2. Start the server:
          python manage.py runserver
        3. Visit http://localhost:8000 to view the example.
        4. Modify the code in the home directory.


        1. Update the return URL in your Shopify partner account to point to your domain name (e.g. https://APPLICATION-ID.appspot.com/login/finalize)
        2. Upload the application to the server. For Google App Engine, simply run:
          appcfg.py update .

        Further Information

        Update: Extensive examples on using the Shopify Python API have been added to the wiki. 

        Continue reading

        Webhook Testing Made Easy

        Webhook Testing Made Easy

        Webhooks are fantastic. We use them here at Shopify to notify API users of all sorts of important events. Order creation, product modification, and even app uninstallation all cause webhooks to be fired. They're a really neat way to avoid the problem of polling, which is annoying for app developers and API providers alike.

        The trouble with Webhooks is that you need a publicly visible URL to handle them. Unlike client-side redirects, webhooks originate directly from the server. This means that you can't use localhost as an endpoint in your testing environment as the API server would effectively be calling itself. Bummer.

        Fortunately, there are a couple of tools that make working with webhooks during development much easier. Let me introduce you to PostCatcher and LocalTunnel.


        PostCatcher is a brand new webapp that was created as an entry for last week's Node Knockout. Shopify's own Steven Soroka and Nick Small are on the judging panel this year, and this app caught their eye.

        The app generates a unique URL that you can use as a webhook endpoint and displays any POST requests sent to it for you to examine. As you might expect from a JS contest, the UI is extremely slick and renders all the requests in real-time as they come in. This is really useful in the early stages of developing an app as you can see the general shape and structure of any webhooks you need without writing a single line of code. On the flip side, API developers can use it to test their own service in a real-world environment.


        The thing I really like about PostCatcher over similar apps like PostBin is that I can sign in using GitHub and keep track of all the catchers I've created. No more copy/pasting URLs to a text file to avoid losing them. Hooray!


        LocalTunnel is a Ruby gem + webapp sponsored by Twilio that allows you to expose a given port on your local machine to the world through a url on their site. Setup is really easy (provided you have ruby and rubygems installed) and once it's installed you just start it from the console with the port you want to forward and share the url it spits out.


        From then on that url will point to your local machine so you can register the address as a webhook endpoint and get any incoming requests piped right to your machine. My previous solution was endless deployments to Heroku every time I made a small change to my code, which was a real pain in the arse. Compared to that, LocalTunnel was a godsend.
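If you'd rather keep everything on your own machine, the same idea fits in a few lines of Ruby: a throwaway endpoint that prints whatever is POSTed to it. This is a minimal sketch using only the standard library (not PostCatcher or LocalTunnel themselves); pointing a tunnel at the port would make it publicly reachable:

```ruby
require 'socket'

# Accept a single HTTP POST on the given port, print it for inspection,
# reply 200 OK, and return the request body.
def catch_one_post(port)
  server = TCPServer.new('127.0.0.1', port)
  client = server.accept
  request_line = client.gets                     # e.g. "POST / HTTP/1.1"
  headers = {}
  while (line = client.gets) && line != "\r\n"   # read headers up to the blank line
    key, value = line.split(': ', 2)
    headers[key.downcase] = value.to_s.strip
  end
  body = client.read(headers['content-length'].to_i)
  client.write("HTTP/1.1 200 OK\r\nContent-Length: 2\r\n\r\nok")
  client.close
  server.close
  puts "#{request_line.strip} -> #{body}"        # show the webhook payload
  body
end
```

It only handles one request and does no real HTTP parsing, but for eyeballing a webhook's shape during development that's often all you need.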


        Whilst PostCatcher and LocalTunnel are currently my top choices for testing webhooks, they're by no means the only party in town. I've already mentioned PostBin, but LocalTunnel also has a contender in LocalNode (another Node KO entry). The latter boasts wider integration (you don't need ruby) as well as permanent url redirects but setup is more complicated as you have to add a static html file to your web server.

        If there are other services, apps, or tricks that you use to test webhooks when developing apps, call them out in the comments! I'd love to hear what I've missed in this space.

        Continue reading

        How we use git at Shopify

        How we use git at Shopify

        By John Duff
        A little while back, Rodrigo Flores posted to the plataformatec blog, A (successful) git branching model, where he talks about the git workflow they've been using on some projects. I thought this was a great post and decided to do something similar explaining the git workflow that we use at Shopify.


        Git is an incredibly powerful tool that can be used in many different ways. I don't believe there is a 'correct' workflow for using git, just many different options that work for particular situations and people. The workflow that I am going to describe won't work for everyone, and not everyone at Shopify uses git in the same way - you have to modify and massage it to shape your needs and the way you work. I don't consider myself an expert with git, but am comfortable enough with the tool to handle just about everything I might need to do. If there's anything I can't figure out, James MacAulay is our resident git expert in the office and always willing to help out.
        Okay, let's get down to business.


        When working on a project each developer and designer first forks the repository that they want to work on. Forking a repository is really simple; GitHub even has a guide if you need some help with it. A fork is basically your own copy of the repository that you can change without affecting anyone else. We use GitHub for all of our projects so it makes managing the forks really easy. All the work is done on your fork of the repository and only gets pulled into the main repository after it has been fully tested, code reviewed, etc. We also use the concept of feature branches to make it easy to switch between tasks and to share the work with other colleagues. A branch is kind of like a fork within your own repository; you can have many branches within your forked repository for each of the tasks you're working on. Your checkout of a project should be set up with a couple of remotes and branches to get started.


        • origin - This is a remote pointing to your clone of the project and added by default when you do 'git clone'.
        • mainline - This is a remote pointing to the main repository for the project. We use this remote to keep up to date and push to the main repository.


        • production - This is the production branch of the main repository (or mainline). This is the code that is ready to be deployed to production.
        • staging - Contains the code that is being run on the staging server (we have two that developers can use). Before a feature is considered 'finished' it must be tested on one of the staging servers, which mirrors the production environment.
        • master - Contains completed features that can be deployed.
        So how do we set all this up? These couple of git commands should take care of it:
        git clone git@github.com:jduff/project.git
        git remote add mainline git@github.com:Shopify/project.git
        Keeping a project up to date is also really easy, you just pull from mainline.
        git checkout master
        git pull --rebase mainline master
        I know what you're thinking, what the heck is that 'rebase' doing in there? Well, you don't really need it, but it helps to use it in case you've merged a new feature that you haven't pushed yet. This keeps the history all tidy with the changes you made on top of what is already in master, instead of creating an additional "merge" commit when the branches have diverged.

        Day To Day Usage

        So how does all of this work day to day? Here it is, step by step:
        git checkout master
        git checkout -b add_awesome # Feature branches, remember
        # Do some work, listen to a lightning talk, more work
        git commit -m "Creating an awesome feature"
        Mainline master moves pretty fast so we should keep our feature branch up to date
        git checkout master
        git pull --rebase mainline master
        git checkout add_awesome
        git rebase master
        Everything is finished, test it out on staging!
        git push -f mainline add_awesome:staging
        # This blows away what is currently being staged, make sure staging isn't already in use!
        Staging is cool...code review...high fives all around, ship it!
        It's always easier to release a feature if master is up to date and you've rebased your branch. See above for how to 'keep our feature branch up to date'. We also make sure to squash all the commits down as much as possible before merging the feature into master. You can do this with the rebase command:
        # Rebase the last 5 commits
        git rebase -i HEAD~5
        Now we can merge the feature into the master branch:
        git checkout master
        git merge add_awesome
        git push mainline master
        And if you want your code to go out to production right away you have a couple more steps:
        git checkout master
        git pull mainline master # Make sure you're up to date with everything
        git checkout production
        git merge master
        git push mainline production
        # Ask Ops for a deploy
        That's about it. It might seem like a lot to get the hang of at the start but it really works well and keeps the main repository clear of merge commits so it's easy to read and revert if required. I personally really like the idea of feature branches and rebasing as often as possible, it makes it super easy to switch tasks and keeps merge conflicts to a minimum. I almost never have conflicts because I rebase a couple of times a day.

        A Few More git Tips

        I've got a couple more tips that might help you out in your day to day git usage.
        # Undo the last commit, leaving its changes unstaged
        git reset HEAD~1
        git reset HEAD^ # Same as above
        # Remove the last commit from history (don't do this if the commit has been pushed to a remote)
        git reset --hard HEAD~1
        # Interactive rebase is awesome!
        git rebase -i HEAD~4
        git rebase -i HEAD^^^^ # Same as above
        # Change the last commit message, or add staged files to the last commit
        git commit --amend
        # Reverses the commit 1b9b50a if it introduced a bug
        git revert 1b9b50a
        # Track down a bug, HEAD is bad but 5 commits back it was good
        git bisect start HEAD HEAD~5


        So there you have it, that's how we use git at Shopify. I don't know about everyone else, but once I got going I found this workflow (particularly the feature branches) to work very well. That doesn't mean this is the only way to use git, like I said earlier it is an incredibly powerful tool and you have to find a way that works well for you and your team. I do hope that this might serve as a starting point for your own git workflow and maybe provide a little insight into how we work here at Shopify.
        Our tools and the way we use them are constantly evolving, so I would love to hear about how you use git to see if we might be able to improve our own workflow. Let us know in the comments or, better yet, write your own blog post and drop us the link!
        Photo by Paul Hart

        Continue reading

        Developing Shopify Apps, Part 4: Change is Good

        Developing Shopify Apps, Part 4: Change is Good

         So far, in the Developing Shopify Apps series, we've covered:

        • The setup: joining Shopify's Partner Program, creating a new test shop, launching it, adding a private app to it and playing with a couple of quick API calls.
        • Exploring the API: a quick explanation of the API and RESTafarianism, retrieving general information about a shop and dipping a toe into finding out about things like your shop's products, and so on.
        • Even more exploration: REST consoles, getting a complete list of all the products, articles, blogs, customers and so on, retrieving specific items given their ID and creating new items.

        Now these are modifications!

        In this article, we're going to look at another important type of operation: modifying existing items.

        Modifying Customers

        To modify an object, we're going to need an existing one first. I'm going to start with "Peter Griffin", a customer that I created in the previous article in this series. His ID is 51827492, so we can retrieve his record thusly:

        • GET api-key:password@shop-url/admin/customers/51827492.xml for the XML version
        • GET api-key:password@shop-url/admin/customers/51827492.json for the JSON version

        Here's the response in XML:

        <?xml version="1.0" encoding="UTF-8"?>
        <customer>
            <accepts-marketing type="boolean" nil="true" />
            <orders-count type="integer">0</orders-count>
            <id type="integer">51827492</id>
            <note nil="true" />
            <total-spent type="decimal">0.0</total-spent>
            <tags />
            <addresses type="array">
                <address>
                    <company nil="true" />
                    <address1>31 Spooner Street</address1>
                    <address2 nil="true" />
                    <country>United States</country>
                    <province>Rhode Island</province>
                    <name>Peter Griffin</name>
                </address>
            </addresses>
        </customer>

        Let's suppose that Peter has decided to move to California. We'll need to update his address, and to do it programmatically, we'll need the following:

        • His customer ID (we've got that).
        • The new information. For this example, it's
          • address1: 800 Schwarzenegger Lane
          • city: Los Angeles
          • state: California
          • zip: 90210
          • phone: 555-888-9898
        • And finally, the method for calling the Shopify API to modify existing items.

        First, there's the format of the URL for modifying Peter's entry. The URL will specify what operation we want to perform (modify) and on which item (a customer whose ID is 51827492).

        • PUT api-key:password@shop-url/admin/customers/51827492.xml for the XML version
        • PUT api-key:password@shop-url/admin/customers/51827492.json for the JSON version

        For this example, we'll use the XML version. If you're using Chrome's REST console, put the XML URL into the Request field (located in the Target section), as shown below:

        Then there's the message body, which will specify which fields we want to update. Here's the message body to update Peter's address to the new Los Angeles-based one shown above, in XML form:

        <?xml version="1.0" encoding="UTF-8"?>
        <customer>
            <id type="integer">51827492</id>
            <addresses type="array">
                <address>
                    <address1>800 Schwarzenegger Lane</address1>
                    <city>Los Angeles</city>
                    <province>California</province>
                    <zip>90210</zip>
                    <phone>555-888-9898</phone>
                </address>
            </addresses>
        </customer>

        If you're using Chrome's REST console, put the message body in the RAW Body field (located in the Body section) and make sure Content-Type is set to application/xml:

        Send the request. If you're using Chrome's REST Console, the simplest way to do this is to press the PUT button located at the bottom of the page. You should get a "200 OK" response and the following response body:

        <?xml version="1.0" encoding="UTF-8"?>
        <customer>
            <accepts-marketing type="boolean" nil="true" />
            <orders-count type="integer">0</orders-count>
            <id type="integer">51827492</id>
            <note nil="true" />
            <total-spent type="decimal">0.0</total-spent>
            <tags />
            <addresses type="array">
                <address>
                    <city>Los Angeles</city>
                    <company nil="true" />
                    <address1>800 Schwarzenegger Lane</address1>
                    <address2 nil="true" />
                    <country>United States</country>
                    <name>Peter Griffin</name>
                </address>
            </addresses>
        </customer>

        As you can see, Peter's address has been updated.

        Modifying Products

        Let's try modifying an existing product in our store. Once again, we'll modify an item we created in the previous article: the Stumpy Pepys Toy Drum.

        When we created it, we never specified any tags. We now want to add some tags to this product -- "Spinal Tap" and "rock" -- to make it easier to find. In order to do this, we need:

        • The product ID. It's 48339792.
        • The tags, "Spinal Tap" and "rock".
        • And finally, the method for calling the Shopify API to modify existing items.
        Here's the URL format:
          • PUT api-key:password@shop-url/admin/products/48339792.xml for the XML version
          • PUT api-key:password@shop-url/admin/products/48339792.json for the JSON version

        For this example, we'll use the JSON version. If you're using Chrome's REST console, put the JSON URL into the Request field (located in the Target section), as shown below:

        Then there's the message body, which will specify which fields we want to update. Here's the message body to add the tags to the drum's entry, in JSON form:

          "product": {
            "tags": "Spinal Tap, rock",
            "id": 48339792

        If you're using Chrome's REST console, put the message body in the RAW Body field (located in the Body section) and make sure Content-Type is set to application/json:

        Send the request. If you're using Chrome's REST Console, the simplest way to do this is to press the PUT button located at the bottom of the page. You should get a "200 OK" response and the following response body:

          "product": {
            "body_html": "This drum is so good...\u003Cstrong\u003Eyou can't beat it!!\u003C/strong\u003E",
            "created_at": "2011-08-03T18:20:17-04:00",
            "handle": "stumpy-pepys-toy-drum-sp-1",
            "product_type": "drum",
            "template_suffix": null,
            "title": "Stumpy Pepys Toy Drum SP-1",
            "updated_at": "2011-08-08T17:57:55-04:00",
            "id": 48339792,
            "tags": "rock, Spinal Tap",
            "images": [],
            "variants": [{
              "price": "0.00",
              "position": 1,
              "created_at": "2011-08-03T18:20:17-04:00",
              "title": "Default",
              "requires_shipping": true,
              "updated_at": "2011-08-03T18:20:17-04:00",
              "inventory_policy": "deny",
              "compare_at_price": null,
              "inventory_quantity": 1,
              "inventory_management": null,
              "taxable": true,
              "id": 113348882,
              "grams": 0,
              "sku": "",
              "option1": "Default",
              "option2": null,
              "fulfillment_service": "manual",
              "option3": null
            "published_at": "2011-08-03T18:20:17-04:00",
            "vendor": "Spinal Tap",
            "options": [{
                "name": "Title"

        Modifying Things with the Shopify API: The General Formula

        As you've seen, whether you prefer to talk to the API with XML or JSON, modifying things requires:

        1. An HTTP PUT request to the right URL, which includes the ID of the item you want to modify
        2. The information that you want to add or update, which you format and put into the request body

        ...and that's it!
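That two-step recipe can be sketched in Python. This is purely illustrative: the key, password and shop URL are placeholders, and `build_put_request` is a hypothetical helper (not part of any Shopify library) that only assembles the URL and body without sending anything.

```python
import json

def build_put_request(api_key, password, shop_url, resource, item_id, fields):
    """Assemble the URL and JSON body for a modify (PUT) call.

    Illustrative sketch only: it builds the two pieces described above
    (a URL containing the item's ID, and a body holding the fields to
    update) without performing any HTTP. The naive singularisation
    ("products" -> "product") covers the resources used in this article.
    """
    url = f"https://{api_key}:{password}@{shop_url}/admin/{resource}/{item_id}.json"
    body = json.dumps({resource.rstrip("s"): dict(fields, id=item_id)})
    return url, body

# The drum-tagging example from above:
url, body = build_put_request(
    "my-api-key", "my-password", "example.myshopify.com",
    "products", 48339792, {"tags": "Spinal Tap, rock"},
)
```

From here, any HTTP client that can issue a PUT with that URL and body will do.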


        We've seen getting, adding, and modifying, which leaves...deleting.

        Continue reading

        How a Potato Saved Shopify's Internet

        How a Potato Saved Shopify's Internet

        By Adrian Irving-Beer

        Picture this scenario:  You’ve watched your workplace grow from thirty people to over sixty in just nine months.  Business is booming, but your connection to the internet has reached a crisis point.  Your lines can’t handle the traffic and your router can’t cope with the load.  Developers are having to pull down software updates at home and bring them in to the office to share, and your support team can’t even access your internal website to assist customers.

        There’s no relief in sight, either.  You’re bonding some DSL lines together to create a single faster virtual link, but they don’t make faster DSL lines and your router can’t bond any more of them, never mind handle all that extra traffic.  You’ve received a bunch of different estimates for getting faster technologies installed, like cable and fibre, but each one has taken weeks to come back with the same response: A ton of money, a ripped up sidewalk, and possibly several months until completion – by which time you’ll be preparing to move into a different office anyway.

        There’s only really two options left at this point.  You could add more lines, but you’d have to start buying and configuring more routers and divvying people up across them.
        Or, you could build a better router — one that can handle as many lines as you can throw at it.

        Planting a Potato

        The Potato project at Shopify began as an experimental attempt to bond more lines together using a real PC running Debian Linux, rather than the small router appliance we’d been using.  The old router ran a firmware called “Tomato”, and so the new machine was obviously destined to be called “Potato” (a.k.a. “not a tomato”).

        We rushed the new machine into active service, and it immediately solved a lot of our problems.  Unfortunately, we soon realised we would have to give up on bonding altogether.  Linux’s bonding support was still unreliable under heavy load, and all our attempts to bond extra lines were creating more problems than they solved.  So while we had replaced the overloaded router and improved the overall situation, we were still facing a bandwidth crunch and we needed a new option.

        The next possible approach was load balancing, i.e. divvying up our traffic across all links rather than trying to combine them.  There’s some support for this built in to Linux, but it’s designed for internal networks where you control both sides of the links, and there was no way this was going to work across a bunch of typical DSL lines.

        Instead, we had to design our own load balancing using a complex combination of connection marking, mark-based routing, and IP masquerading.  For each new connection, we mark it with a number, then assign it to a DSL link using that number, and change the source address to the address of that link so the internet would know how to reply.  Any inbound connections would also need to be marked appropriately to ensure their traffic went back out on the same link.  We also had to deal with the case of a link going down, and prevent connections from switching links accidentally (which would invariably fail).
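To make the mark-based idea concrete, here's a toy sketch of the link-selection step. It's a simplification of what the connection marking achieves; the real system did this with firewall rules and routing tables, not application code, and `assign_link` is a hypothetical illustration.

```python
def assign_link(conn_mark, links_up):
    """Pick an outbound DSL link for a connection, given its mark.

    conn_mark -- a stable integer attached to the connection
    links_up  -- indices of the links currently healthy

    Because the mark is fixed for the life of the connection, every
    packet of that connection maps to the same link; switching links
    mid-connection would break the masqueraded source address. Note one
    simplification: here a link going down silently remaps existing
    connections, whereas the real setup had to handle that case
    explicitly.
    """
    if not links_up:
        raise RuntimeError("no links available")
    return links_up[conn_mark % len(links_up)]
```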

        We refined this unorthodox approach over the following weeks, and the results eventually turned out to be just about perfect.  We ordered more DSL lines and ended up with six links in total – half of them running on USB network sticks after we ran out of slots for network cards.

        Growing Your Potato

        During the entire Potato project, we spent a lot of time coping with link reliability issues.  Even once we gave up on bonding, the individual DSL links themselves were randomly going down on a regular basis.  To help us cope with and troubleshoot these issues, I spent a lot of time enhancing and expanding the Potato system.

        To manage the links, I tossed together a very spartan interface (using Sinatra and JavaScript) that communicated mainly via icons: An unmoving potato signified a downed link, a spinning potato was a link that was trying to connect, and a series of marching potatoes was a fully operational link.  Controlling Potato was now as easy as clicking a link to toggle it on or off.  The interface was an instant hit with everyone and eased the annoyance of having to restart the links so often.

        We quickly learned that if multiple people tried to restart links at once, the results were incredibly confusing and generally counter-productive, especially since you couldn’t actually see who was doing what.  So I added a sign-in feature and an event log that would show along the side of the screen, in a format heavily inspired by Team Fortress 2.

        With so many links, we could now afford to reserve a single “isolated” line for our high-priority low-volume traffic, with the remaining five lines balancing out the “bulk” traffic load.  I added packet loss measurements so we could see which links were healthy and which were having trouble.  I wrote an automated system to test all the links and select which ones to use at any given moment.  I even created a Google Talk bot that would notify us instantly if any links went down or one of the USB sticks got disconnected.

        There were times when we needed an entire internet link for something absolutely critical, such as an online video interview – something that could easily be interrupted by someone’s ill-timed download.  So I added the notion of “reserved” links, where any of our DSL links could be reserved for specific machines as needed.

        To reduce our traffic, I installed some Squid proxy servers.  All our web traffic goes through the proxies, and they attempt to cache as much as they can.  If someone posts a funny cat picture on our Campfire channel, all of our computers are going to try to download it at once – but the Squid proxy only has to download it once, and then it can internally distribute it to any computer that asks for it.  The same goes for all the major sites we use, meaning that everything becomes faster and uses less of our bandwidth.

        In the end, we finally discovered that our modems had been sent to us with incorrect configurations.  Once they were all properly reconfigured, all our mysterious intermittent problems disappeared overnight.

        Tending to Your Potato

        It’s been several months of smooth sailing for Potato now.  Our monitoring system is silent for weeks at a time.  I used to check our Potato status page a dozen or more times a day.  Now, weeks go by without anyone even thinking about Potato.

        About a month ago, our internet provider temporarily lost 15 of their 20 lines to their customers’ DSL links.  Thousands of their clients were without internet access.  We just lost four of our six lines, and we were able to scale back our usage and keep on going.

        These days, our preparations for the new office are well underway.  We’ll have a much more powerful fibre link at the new place (equivalent to a dozen of our current DSL links), and we won’t be moving in until it’s fully up and running.  Our internet troubles will soon be a distant memory.

        Potato will go with them, as we’ll be upgrading to enterprise-class gateway hardware.  It’s something of a bittersweet farewell, as so much of my time went into managing and upgrading it over these past few months that I’ve become rather fond (and proud) of it.  But its departure will mark the start of a time where I don’t have to spend all those hours managing our internet connection, and I’ll be free to concentrate on my regular job again.

        Getting Potato Sprouts

        To run your own Potato, you can grab the source code and configuration from GitHub.   The documentation is a bit sparse since this was mainly an internal (and temporary) project, but feel free to drop me a line on GitHub if you need more info – I’d love to see Potato growing again and solving someone else’s internet crisis, too.

        Continue reading

        Developing Shopify Apps, Part 3: More API Exploration

        Developing Shopify Apps, Part 3: More API Exploration

        Welcome back to another installment of Developing Shopify Apps!

        In case you missed the previous articles in this series, they are:

        • Part 1: The Setup. In this article, we:
          • Joined Shopify's Partner Program
          • Created a new test shop
          • Launched a new test shop
          • Added an app to the test shop
          • Played with a couple of quick API calls through the browser
        • Part 2: Exploring the API. This article covered:
          • Shopify's RESTful API, including a quick explanation of how to use it
          • Retrieving general information about a shop via the admin panel and the API
          • Retrieving information from a shop, such as products, via the API

        Exploring RESTful APIs with a REST Console

        So far, all we've done is retrieve information from a shop. We did this by using the GET verb and applying it to resources exposed by the Shopify API, such as products, blogs, articles and so on. Of all the HTTP verbs, GET is the simplest to use; you can simply request information by using your browser's address bar. Working with the other three HTTP verbs -- POST, PUT and DELETE -- usually takes a little more work.

        One very easy-to-use way to make calls to the Shopify API using all four verbs is a REST client. You have many options, including:

        • cURL: the web developer's Swiss Army knife. This command-line utility gets and sends data using URL syntax over a wide array of protocols, including HTTP and HTTPS, FTP and friends, LDAP(S), TELNET, and the mail protocols IMAP, POP3 and SMTP.
        • Desktop REST clients such as Fiddler for Windows or WizTools' RESTClient
        • Browser-based REST clients such as RESTClient for Firefox or REST Console for Chrome

        Lately, I've been using REST Console for Chrome. It's quite handy -- when installed, it's instantly available with one click on its icon, just to the left of Chrome's "Wrench" menu (which is to the right of the address bar):

        And here's what the REST Console looks like -- clean and simple:

        Let's try a simple GET operation: let's get the list of products in the shop. The format for the URL is:


        • GET api-key:password@shop-url/admin/products.xml (for the XML version)
        • GET api-key:password@shop-url/admin/products.json (for the JSON version)
        where api-key is your app's API key and password is your app's password.

        The URL goes into the Request URL field in the Target section. A tap of the GET button at the bottom of the page yields a response, which appears, quite unsurprisingly, in the Response section of the page:

        Of course, you could've done this with the address bar. But it's much nicer with the REST Console. Before we start exploring calls that require POST, PUT and DELETE, let's take a look at other things we can do with the GET verb.

        Get All the Items!

        If you've been following this series of articles, you've probably had a chance to try a couple of GET calls to various resources exposed by the API. Once again, here's the format for the URL that gets you a listing of all the products available in the shop:

        Get All the Products


        • GET api-key:password@shop-url/admin/products.xml (for the XML version)
        • GET api-key:password@shop-url/admin/products.json (for the JSON version)

        Get All the Articles

        If you go to the API documentation and look at the column on the right side of the page, you'll see a list of resources that the Shopify API makes available to you. One of these resources is Article, which gives you access to all the articles in the blogs belonging to the shop (each shop supports one or more blogs; they're a way for shopowners to write about what they're selling or related topics).

        Here's how you get all the articles:

        • GET api-key:password@shop-url/admin/articles.xml (for the XML version)
        • GET api-key:password@shop-url/admin/articles.json (for the JSON version)

        Get All the Blogs

        Just as you can get all the articles, you can get all the blogs that contain them. Here's how you do it:

        • GET api-key:password@shop-url/admin/blogs.xml (for the XML version)
        • GET api-key:password@shop-url/admin/blogs.json (for the JSON version)

        Get All the Customers

        How about a list of all the shop's registered customers? No problem:

        • GET api-key:password@shop-url/admin/customers.xml (for the XML version)
        • GET api-key:password@shop-url/admin/customers.json (for the JSON version)

        Get All the [WHATEVER]

        By now, you've probably seen the pattern. For any resource exposed by the Shopify API, the way to get a complete listing of all items in that resource is this:

        • GET api-key:password@shop-url/admin/plural-resource-name.xml (for the XML version)
        • GET api-key:password@shop-url/admin/plural-resource-name.json (for the JSON version)
        where:

        • api-key is the app's API key
        • password is the app's password
        • plural-resource-name is the plural version of the name of the resource whose items you want: articles, blogs, customers, products, and so on.

        Get a Specific Item, Given its ID

        There will come a time when you want to get the information about just one specific item and not all of them. If you know an item's ID, you can retrieve the info for just that item using this URL format:

        • GET api-key:password@shop-url/admin/plural-resource-name/id.xml (for the XML version)
        • GET api-key:password@shop-url/admin/plural-resource-name/id.json (for the JSON version)
        To get an article with the ID 3671982, we use this URL:
        • GET api-key:password@shop-url/admin/articles/3671982.xml (for the XML version)
        • GET api-key:password@shop-url/admin/articles/3671982.json (for the JSON version)
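Both URL shapes (all items, or one item by ID) can be captured in a small Python helper. This is an illustrative sketch; the key, password and shop URL are placeholders, and `admin_url` is not part of any Shopify library.

```python
def admin_url(api_key, password, shop_url, resource, item_id=None, fmt="xml"):
    """Build a Shopify admin URL following the two patterns above.

    Omitting item_id yields the "get all items" form; providing it
    yields the "get one item by ID" form. fmt selects .xml or .json.
    """
    path = f"/admin/{resource}"
    if item_id is not None:
        path += f"/{item_id}"
    return f"https://{api_key}:{password}@{shop_url}{path}.{fmt}"

# e.g. the article request from above:
print(admin_url("my-api-key", "my-password",
                "example.myshopify.com", "articles", 3671982))
```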

        If There is Such an Item

        If an article with that ID exists, you get a "200" response header ("OK"):

        Status Code: 200
        Date: Wed, 03 Aug 2011 15:49:44 GMT
        Content-Encoding: gzip
        Status: 304 Not Modified
        X-UA-Compatible: IE=Edge,chrome=1
        X-Runtime: 0.114750
        Server: nginx/0.8.53
        ETag: "fb7cdcc613b1a45698c6cfad05fc7f7e"
        Vary: Accept-Encoding
        Content-Type: application/xml; charset=utf-8
        Cache-Control: max-age=0, private, must-revalidate

        ...and a response body that should look something like this (if you requested the response in XML):

        <?xml version="1.0" encoding="UTF-8"?>
        <article>
          <body-html><p>This is your blog. You can use it to write about new product launches, experiences, tips or other news you want your customers to read about.</p> <p>We automatically create an <a href="http://en.wikipedia.org/wiki/Atom_feed">Atom Feed</a> for all your blog posts. <br /> This allows your customers to subscribe to new articles using one of many feed readers (e.g. Google Reader, News Gator, Bloglines).</p></body-html>
          <created-at type="datetime">2011-07-22T14:43:22-04:00</created-at>
          <title>First Post</title>
          <updated-at type="datetime">2011-07-22T14:43:25-04:00</updated-at>
          <blog-id type="integer">1127212</blog-id>
          <summary-html nil="true" />
          <id type="integer">3671982</id>
          <user-id type="integer" nil="true" />
          <published-at type="datetime">2011-07-22T14:43:22-04:00</published-at>
          <tags>ratione, repellat, vero</tags>
        </article>

        If No Such Item Exists

        If no article with that ID exists, you get a "404" response header ("Not Found"). Here's what happened when I tried to retrieve an article with the ID 42. I used this URL:

        • GET api-key:password@shop-url/admin/articles/42.xml (for the XML version)
        • GET api-key:password@shop-url/admin/articles/42.json (for the JSON version)

        I got this header back:

        Status Code: 404
        Date: Wed, 03 Aug 2011 16:00:25 GMT
        Content-Encoding: gzip
        Transfer-Encoding: chunked
        Status: 404 Not Found
        Connection: keep-alive
        X-UA-Compatible: IE=Edge,chrome=1
        X-Runtime: 0.039715
        Server: nginx/0.8.53
        Vary: Accept-Encoding
        Content-Type: application/xml; charset=utf-8
        Cache-Control: no-cache

        ...and since there was nothing to return, the response body was empty.

        Get [WHATEVER], Given its ID

        The same principle applies to any other Shopify API resource.

        Want the info on a customer whose ID is 50548602? The URL would look like this:

        • GET api-key:password@shop-url/admin/customers/50548602.xml (for the XML version)
        • GET api-key:password@shop-url/admin/customers/50548602.json (for the JSON version)

        ...and if such a customer exists, you'll get a response of a "200" header and the customer's information in the body, similar to what you see below (the following is the JSON response):

            "customer": {
                "accepts_marketing": true,
                "orders_count": 0,
                "addresses": [{
                    "company": null,
                    "city": "Wilkinsonshire",
                    "address1": "95692 O'Reilly Plains",
                    "name": "Roosevelt Colten",
                    "zip": "27131-3440",
                    "address2": null,
                    "country_code": "US",
                    "country": "United States",
                    "province_code": "NH",
                    "phone": "1-244-845-7291 x258",
                    "last_name": "Colten",
                    "province": "New Hampshire",
                    "first_name": "Roosevelt"
                "tags": "",
                "id": 50548602,
                "last_name": "Colten",
                "note": null,
                "email": "ivory@example.com",
                "first_name": "Roosevelt",
                "total_spent": "0.00"

        If no such customer existed, you'd get a "404" response header and an empty response body.

        How about info on a product whose ID is 48143272? Here's the URL you'd use:

        • GET api-key:password@shop-url/admin/products/48143272.xml (for the XML version)
        • GET api-key:password@shop-url/admin/products/48143272.json (for the JSON version)

        Once again: if such a product exists, you'll get a "200" response header and a response body that looks something like this (this is the XML version):

        <?xml version="1.0" encoding="UTF-8"?>
        <product>
            <created-at type="datetime">2011-08-02T12:06:42-04:00</created-at>
            <body-html><strong>Good snowboard!</strong></body-html>
            <title>Burton Custom Freestlye 151</title>
            <template-suffix nil="true" />
            <updated-at type="datetime">2011-08-02T12:06:42-04:00</updated-at>
            <id type="integer">48143272</id>
            <published-at type="datetime">2011-08-02T12:06:42-04:00</published-at>
            <tags />
            <variants type="array">
                <variant>
                    <price type="decimal">10.0</price>
                    <position type="integer">1</position>
                    <created-at type="datetime">2011-08-02T12:06:42-04:00</created-at>
                    <requires-shipping type="boolean">true</requires-shipping>
                    <updated-at type="datetime">2011-08-02T12:06:42-04:00</updated-at>
                    <compare-at-price type="decimal" nil="true" />
                    <inventory-management nil="true" />
                    <taxable type="boolean">true</taxable>
                    <id type="integer">112957692</id>
                    <grams type="integer">0</grams>
                    <sku />
                    <option2 nil="true" />
                    <option3 nil="true" />
                    <inventory-quantity type="integer">1</inventory-quantity>
                </variant>
                <variant>
                    <price type="decimal">20.0</price>
                    <position type="integer">2</position>
                    <created-at type="datetime">2011-08-02T12:06:42-04:00</created-at>
                    <requires-shipping type="boolean">true</requires-shipping>
                    <updated-at type="datetime">2011-08-02T12:06:42-04:00</updated-at>
                    <compare-at-price type="decimal" nil="true" />
                    <inventory-management nil="true" />
                    <taxable type="boolean">true</taxable>
                    <id type="integer">112957702</id>
                    <grams type="integer">0</grams>
                    <sku />
                    <option2 nil="true" />
                    <option3 nil="true" />
                    <inventory-quantity type="integer">1</inventory-quantity>
                </variant>
            </variants>
            <images type="array" />
            <options type="array">
                <option>
                    <name>Title</name>
                </option>
            </options>
        </product>

        You can apply this pattern for retrieving items with specific IDs to other resources in the API.

        Creating a New Item

        Let's quickly look over the HTTP verbs and how they're applied when working with Shopify's RESTful API:

        • GET: "Read". In the Shopify API, the GET verb is used to get information about shops and related things such as customers, orders, products, blogs and so on. GET operations are most often used to get a list of items ("Get me a list of all the products my store carries"), an individual item ("Get me the customer with this particular ID number") or to conduct a search ("Get me a list of the products in my store that come from a particular vendor").
        • POST: "Create". In the Shopify API, the POST verb is used to create new items: new customers, products and so on.
        • PUT: "Update". To modify an existing item using the Shopify API, use the PUT verb.
        • DELETE: "Delete". As you might expect, the DELETE verb is used to delete objects in the Shopify API.

        To create a new item with the Shopify API, use the POST verb and this pattern for the URL:

        • POST api-key:password@shop-url/admin/plural-resource-name.xml (for the XML version)
        • POST api-key:password@shop-url/admin/plural-resource-name.json (for the JSON version)

        Creating a new item also requires providing information about that item. The type of information varies with the item, but it's always in either XML or JSON format, and it's always provided in the request body.

        Let's create a new customer (or more accurately, a new customer record). Here's what we know about the customer:

        • First name: Peter
        • Last name: Griffin
        • Email: peter.lowenbrau.griffin@giggity.com
        • Street address: 31 Spooner Street, Quahog RI 02134
        • Phone: 555-555-1212

        This is enough information to create a new customer record (I'll cover the customer object, as well as all the others, in more detail in future articles). Here's that same information in JSON, in a format that the API expects:

        {
          "customer": {
            "first_name": "Peter",
            "last_name": "Griffin",
            "email": "peter.lowenbrau.griffin@giggity.com",
            "addresses": [{
                "address1": "31 Spooner Street",
                "city": "Quahog",
                "province": "RI",
                "zip": "02134",
                "country": "US",
                "phone": "555-555-1212"
            }]
          }
        }

        Since I've got the customer info in JSON format, I'll use the JSON URL for this API call:

        POST api-key:password@shop-url/admin/customers.json
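        If you'd rather script the call than use a browser extension, the same request can be made with Ruby's standard-library Net::HTTP. This is a rough sketch, not code from the article: the API key, password and shop URL are placeholders you'd replace with your own app's values from the admin panel.

```ruby
require "net/http"
require "json"
require "uri"

# Placeholder credentials -- substitute the values from the Shopify API
# page of your shop's admin panel.
API_KEY  = "your-api-key"
PASSWORD = "your-password"
SHOP_URL = "your-shop.myshopify.com"

# Build (but don't yet send) a POST request that creates a customer.
def build_customer_post(customer)
  uri = URI("https://#{SHOP_URL}/admin/customers.json")
  request = Net::HTTP::Post.new(uri)
  request.basic_auth(API_KEY, PASSWORD)         # api-key:password
  request["Content-Type"] = "application/json"  # we're sending JSON
  request.body = JSON.generate("customer" => customer)
  [uri, request]
end

uri, request = build_customer_post(
  "first_name" => "Peter",
  "last_name"  => "Griffin",
  "email"      => "peter.lowenbrau.griffin@giggity.com"
)

# To actually send it:
# response = Net::HTTP.start(uri.host, uri.port, use_ssl: true) { |http| http.request(request) }
```

        The response object's body would contain the same JSON customer record shown below.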

        Here's how we make the call using Chrome REST Console. The URL goes into the Request URL field of the Target section:

        ...while the details of our new customer go into the RAW Body field of the Body section. Make sure that the Content-Type field has the correct content-type selected; in this case, since we're sending (and receiving) JSON, the content-type should be application/json:

        A press of the POST button at the bottom of the page sends the information to the server, and the results are displayed in the Response section:

        Here's the response header:

        Status Code: 200
        Date: Wed, 03 Aug 2011 20:39:13 GMT
        Content-Encoding: gzip
        Transfer-Encoding: chunked
        Status: 200 OK
        Connection: keep-alive
        X-UA-Compatible: IE=Edge,chrome=1
        X-Runtime: 0.198933
        Server: nginx/0.8.53
        ETag: "0409671d7af84b695d5ded4e93c0917c"
        Vary: Accept-Encoding
        Content-Type: application/json; charset=utf-8
        Cache-Control: max-age=0, private, must-revalidate

        The "200" status code means that the operation was successful and we have a new customer in the records.

        Here's the body of the response, which is the complete record of the customer we just created, in JSON format:

        {
            "customer": {
                "accepts_marketing": null,
                "orders_count": 0,
                "addresses": [{
                    "company": null,
                    "city": "Quahog",
                    "address1": "31 Spooner Street",
                    "name": "Peter Griffin",
                    "zip": "02134",
                    "address2": null,
                    "country_code": "US",
                    "country": "United States",
                    "province_code": "RI",
                    "phone": "555-555-1212",
                    "last_name": "Griffin",
                    "province": "Rhode Island",
                    "first_name": "Peter"
                }],
                "tags": "",
                "id": 51827492,
                "last_name": "Griffin",
                "note": null,
                "email": "peter.lowenbrau.griffin@giggity.com",
                "first_name": "Peter",
                "total_spent": "0.00"
            }
        }

        Let's create another new item. This time, we'll make it a product and we'll do it in XML.

        Let's say this is the information we have about the product:


        • Title: Stumpy Pepys Toy Drum SP-1
        • Vendor: Spinal Tap
        • Product type: Drum
        • Description: This drum is so good...you can't beat it!

        Here's that same information in XML, in a format that the API expects:

        <?xml version="1.0" encoding="UTF-8"?>
        <product>
          <body-html>This drum is so good...<strong>you can't beat it!!</strong></body-html>
          <title>Stumpy Pepys Toy Drum SP-1</title>
          <vendor>Spinal Tap</vendor>
        </product>

        (As I wrote earlier, I'll cover the product object and all its fields in an upcoming article.)

        Since I've got the product info in XML format, I'll use the XML URL for this API call:

        POST api-key:password@shop-url/admin/products.xml

        Let's make the call using the Chrome REST Console again. The URL goes into the Request URL field of the Target section:

        ...while the details of our new product go into the RAW Body field of the Body section. Make sure that the Content-Type field has the correct content-type selected; in this case, since we're sending (and receiving) XML, the content-type should be application/xml:

        Once again, a press of the POST button at the bottom of the page sends the information to the server, and the results appear in the Response section:

        Here's the response header:

        Status Code: 201
        Date: Wed, 03 Aug 2011 22:20:17 GMT
        Transfer-Encoding: chunked
        Status: 201 Created
        Connection: keep-alive
        X-UA-Compatible: IE=Edge,chrome=1
        X-Runtime: 0.122462
        Server: nginx/0.8.53
        Content-Type: application/xml; charset=utf-8
        Location: https://nienow-kuhlman-and-gleason1524.myshopify.com/admin/products/48339792
        Cache-Control: no-cache

        Don't sweat that the code is 201 and not 200 -- all 2xx codes mean success. I'm going to go bug the core team and ask why successfully creating a new customer gives you a 200 (OK) code while successfully creating a new product gives you a 201 (Created).

        Here's the response body -- it's the complete record of the product we just created, in XML format:

        <?xml version="1.0" encoding="UTF-8"?>
        <product>
            <created-at type="datetime">2011-08-03T18:20:17-04:00</created-at>
            <body-html>This drum is so good...<strong>you can't beat it!!</strong></body-html>
            <title>Stumpy Pepys Toy Drum SP-1</title>
            <template-suffix nil="true" />
            <updated-at type="datetime">2011-08-03T18:20:17-04:00</updated-at>
            <id type="integer">48339792</id>
            <vendor>Spinal Tap</vendor>
            <published-at type="datetime">2011-08-03T18:20:17-04:00</published-at>
            <tags />
            <variants type="array">
                <variant>
                    <price type="decimal">0.0</price>
                    <position type="integer">1</position>
                    <created-at type="datetime">2011-08-03T18:20:17-04:00</created-at>
                    <requires-shipping type="boolean">true</requires-shipping>
                    <updated-at type="datetime">2011-08-03T18:20:17-04:00</updated-at>
                    <compare-at-price type="decimal" nil="true" />
                    <inventory-management nil="true" />
                    <taxable type="boolean">true</taxable>
                    <id type="integer">113348882</id>
                    <grams type="integer">0</grams>
                    <sku />
                    <option2 nil="true" />
                    <option3 nil="true" />
                    <inventory-quantity type="integer">1</inventory-quantity>
                </variant>
            </variants>
            <images type="array" />
            <options type="array">...</options>
        </product>

        Next Time...

        In the next installment, we'll look at modifying and deleting existing objects in your shop.

        Continue reading

        Developing Shopify Apps, Part 2: Exploring the API

        Developing Shopify Apps, Part 2: Exploring the API

        In the previous article in this series, we did the following:

        1. Joined Shopify's Partner Program
        2. Created a new test shop
        3. Launched a new test shop
        4. Added an app to the test shop
        5. Played around with a couple of quick API calls through the browser

        In this article, we'll take a look at some of the calls that you can make to Shopify's API and how they relate to the various parts of your shop. This will give you an idea of what Shopify shops are like, as well as show you how to control them programmatically.

        My Shop, via the Admin Page

        I've set up a test shop called Joey's World O' Stuff for this series of articles. Feel free to visit it at any time. It lives at this URL:


        If you followed along with the last article, you also have a test shop with a similar URL. Test shop URLs are randomly generated. The shops themselves are meant to be temporary; they're for experimenting with themes, apps and content. We'll work with real shops later in this series, and they'll have URLs that make sense.

        If you were to visit the URL for my test shop at the time of this writing, you'd see something like this:

        The admin panel for any shop can be accessed by adding /admin to the end of its base URL. If you're not logged into your shop, you'll be sent to the login page. If you're already logged in, you'll be sent to the admin panel's home page, which should look something like this:

        I've highlighted the upper right-hand corner of the admin panel home page, where the Preferences menu is. Click on Preferences, then in the menu that pops up, click on General Settings:

        You should now see the General Settings page, which should look like this:

        The fields on the screen capture of this page are a little small, so I'll list them below:

        • Shop name: Joey's World O' Stuff
        • Email: joey@shopify.com
        • Shop address:
          • Street: 31 Spooner Street
          • Zip: 02903
          • City: Quahog
          • Country: United States
          • State: Rhode Island
          • Phone: (555) 555-5555
        • Order ID formatting: #{{number}}
        • Timezone: (GMT-05:00) Eastern Time (US & Canada)
        • Unit system: Imperial system (Pound, inch)
        • Money formatting: ${{amount}}
        • Checkout language: English

        That's the information for my shop as seen through the admin panel on the General Settings page.

        Just as the admin panel lets you manually get and alter information about your shop, the Shopify API lets applications do the same thing, programmatically. What we just did via the admin panel, we'll now do using the API. But first, let's talk about the API.

        Detour: A RESTafarian API

        The Shopify API is RESTful, or, as I like to put it, RESTafarian. REST is short for REpresentational State Transfer, and it's an architectural style that also happens to be a simple way to make calls to web services. I don't want to get too bogged down in explaining REST, but I want to make sure that we're all on the same page.

        The Shopify API exposes a set of resources, each of which is some part of a shop. Here's a sample of some of the resources that the API lets you access:

        • Shop: The general settings of your shop, which include things like its name, its owner's name, address, contact info and so on.
        • Products: The set of products available through your shop.
        • Images: The set of images of your store's products.
        • Customers: The set of your shop's customers.
        • Orders: The orders placed by your customers.
        (If you'd like to see the full list of resources, go check out the API Documentation. They're all listed in a column on the right side of the page.)

        To do things with a shop, whether it's to get the name of the shop or the contact email of its owner, get a list of all the products available for sale, or find out which customers are the biggest spenders, you apply verbs to resources like the ones listed above. In the case of RESTful APIs like Shopify's, those verbs are the four verbs of HTTP:

        1. GET: Read the state of a resource without making any changes to it in the process. When you type a URL into your browser's address bar and press Enter, your browser responds by GETting that page.
        2. POST: Create a new resource (I'm simplifying here quite a bit; POST is the one HTTP verb with a lot of uses). When you fill out and submit a form on a web page, your browser typically uses the POST verb.
        3. PUT: Update an existing resource.
        4. DELETE: Delete an existing resource.

        Here's an example of putting resources and verbs together. Suppose you were writing an app that let a shopowner do bulk changes to the products in his or her store. Your app would need to access the Products resource and then apply the four HTTP verbs in these ways:

        • If you wanted to get information about one or more products in a shop, whether it's the list of all the products in the shop, information about a single product, or a count of all the products in the shop, you'd use the GET verb and apply it to the Products resource.
        • If you wanted to add a product to a shop, you'd use the POST verb and apply it to the Products resource.
        • If you wanted to modify an existing product in a shop, you'd use the PUT verb and apply it to the Products resource.
        • If you wanted to delete a product from a shop, you'd use the DELETE verb and apply it to the Products resource.
        Keep in mind that not all resources respond to all four verbs. Certain resources, like Shop, aren't programmatically editable and, as a result, don't respond to PUT.

        My Shop, via the API

        Let's get the same information that we got from the admin panel's General Settings page, but using the API this time. In order to do this, we need to know two things:

        1. Which resource to access. In this case, it's pretty obvious: the Shop resource.
        2. Which verb to use. Once again, it's quite clear: GET. (Actually, if you check the API docs, it's very clear; it's the only verb that the Shop resource responds to.) 

        The nice thing about GET calls to web APIs is that you can try them out very easily: just type them into your browser's address bar!

        You specify a resource with its URL (or more accurately, URI). That's what the "R" in URL and URI stands for: resource. To access a Shopify resource, you need to form its URI using this general format:

        api-key:password@shop-url/admin/resource-name.resource-type

        • api-key is the API key for your app (when you create an app, Shopify's back end generates a unique API key for it)
        • password is the password for your app (when you create an app, Shopify's back end generates a password for it)
        • shop-url is the URL for your shop
        • resource-name is the name of the resource
        • resource-type is the type of the resource; this is typically either xml, if you'd like the response to be given to your app in XML format, or json, if you'd like the response to be in JSON.
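        To make the pattern concrete, here's a tiny hypothetical helper that assembles such a URI. The credential values in the example call are made up; in practice you'd also prefix the result with https://.

```ruby
# Assemble the URI pattern described above from its parts.
# api_key, password and shop_url are placeholders, not real credentials.
def resource_uri(api_key, password, shop_url, resource_name, resource_type)
  "#{api_key}:#{password}@#{shop_url}/admin/#{resource_name}.#{resource_type}"
end

puts resource_uri("key", "pass", "example.myshopify.com", "shop", "xml")
# key:pass@example.myshopify.com/admin/shop.xml
```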

        You can find the API key and password for your app on the Shopify API page of your shop's admin panel. You can get there via this URL:


        where shop-url is your shop's URL. You can also get there by clicking on the Apps menu, which is located near the upper right-hand corner of every page in the admin panel and selecting Manage Apps:

        You'll see a list of sets of credentials, one set for each app. Each one looks like this:

        You can copy the API key and password for your app from this box. Better yet, you can copy the example URL, shown below, and then edit it to create the API call you need:

        The easiest way to get general information about your shop is to:


        1. Copy the example URL
        2. Paste it into your browser's address bar
        3. Edit the URL, changing orders.xml to shop.xml
        4. Press Enter

        You should see a result that looks something like this:

        <shop>
          <name>Joey's World O' Stuff</name>
          <address1>31 Spooner Street</address1>
          <created-at type="datetime">2011-07-22T14:43:21-04:00</created-at>
          <public type="boolean">false</public>
          <id type="integer">937792</id>
          <phone>(555) 555-5555</phone>
          <source nil="true"/>
          <province>Rhode Island</province>
          <timezone>(GMT-05:00) Eastern Time (US &amp; Canada)</timezone>
          <shop-owner>development shop</shop-owner>
          <money-with-currency-format>${{amount}} USD</money-with-currency-format>
          <taxes-included type="boolean">false</taxes-included>
          <tax-shipping nil="true"/>
        </shop>

        Note that what you get back is a little more information than what you see on the admin panel's General Settings page; you also get some information that you'd find on other admin panel pages, such as the currency your shop uses and how taxes are applied to your product and shipping prices.

        You can also get your shop information in JSON by simply changing the last part of the URL from shop.xml to shop.json. You'll see a result like this:

          {"address1":"31 Spooner Street",
           "name":"Joey's World O' Stuff",
           "shop_owner":"development shop",
           "money_with_currency_format":"${{amount}} USD",
           "timezone":"(GMT-05:00) Eastern Time (US \u0026 Canada)",
           "phone":"(555) 555-5555",
           "province":"Rhode Island"}

        (Okay, I formatted this one so it would be easy to read. It was originally one long line; easy for computers to read, but not as easy for humans.)

        Other Things in My Shop, via the API

        If you followed my steps from the previous article in this series, your shop should have a small number of predefined products. You can look at all the shop's products by taking the URL you just used and changing the last part to products.xml.

        Here's a shortened version of the output I got:

        <products type="array">
          <product>
            <created-at type="datetime">2011-07-22T14:43:24-04:00</created-at>
            <body-html>...really long description here...</body-html>
            <title>Multi-channelled executive knowledge user</title>
            <template-suffix nil="true"/>
            <updated-at type="datetime">2011-07-22T14:43:24-04:00</updated-at>
            <id type="integer">47015882</id>
            <published-at type="datetime">2011-07-22T14:43:24-04:00</published-at>
            <tags>Demo, T-Shirt</tags>
            <variants type="array">
              <variant>
                <price type="decimal">19.0</price>
                <position type="integer">1</position>
                <created-at type="datetime">2011-07-22T14:43:24-04:00</created-at>
                <requires-shipping type="boolean">true</requires-shipping>
                <updated-at type="datetime">2011-07-22T14:43:24-04:00</updated-at>
                <compare-at-price type="decimal" nil="true"/>
                <inventory-management nil="true"/>
                <taxable type="boolean">true</taxable>
                <id type="integer">110148372</id>
                <grams type="integer">0</grams>
                <option2 nil="true"/>
                <option3 nil="true"/>
                <inventory-quantity type="integer">5</inventory-quantity>
              </variant>
            </variants>
            <images type="array"/>
            <options type="array">...</options>
          </product>
          (more products here)
        </products>

        If you want this information in JSON format, all you need to do is change the URL so that it ends with .json instead of .xml.
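        Once you have the JSON response, it parses straight into ordinary Ruby hashes and arrays. Here's a sketch using a trimmed-down, illustrative response body (the field names mirror the XML sample above; the structure of a full response has many more fields):

```ruby
require "json"

# A trimmed-down, illustrative sample of what products.json might return.
response_body = <<~JSON
  {"products": [
    {"id": 47015882, "title": "Multi-channelled executive knowledge user",
     "tags": "Demo, T-Shirt",
     "variants": [{"id": 110148372, "price": "19.0", "inventory_quantity": 5}]}
  ]}
JSON

# Parse the JSON into plain Ruby hashes and arrays, then walk them.
products = JSON.parse(response_body)["products"]
products.each do |product|
  product["variants"].each do |variant|
    puts "#{product["title"]}: $#{variant["price"]} (#{variant["inventory_quantity"]} in stock)"
  end
end
```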

        Try Out the Other Resources

        There are a number of other Shopify API resources you can explore -- try some GET calls on these:

        • articles.xml or articles.json
        • assets.xml or assets.json
        • blogs.xml or blogs.json
        • comments.xml or comments.json
        • customers.xml or customers.json

        There are more resources that you can access through GET; the Shopify Wiki lists them all in the right-hand column. Try them out!

        Next: Graduating to a real store, and trying out the POST, PUT and DELETE verbs.

        [ This article also appears in Global Nerdy. ]

        Continue reading

        StatsD at Shopify

        StatsD at Shopify

        Here at Shopify, we like data. One of the many tools in our data toolbox is StatsD. We've been using StatsD in production at Shopify for many months now, consistently sending multiple events to our StatsD instance on every request.

        What is StatsD good for?

        In my experience, there are two things that StatsD really excels at. First, getting a high-level overview of some custom piece of data. We use NewRelic to tell us about the performance of our apps. NewRelic provides a great overview of our performance as a whole, even down to which of our controller actions are slowest, and though it has an API for custom instrumentation, I've never used it. For custom metrics we're using StatsD.

        We use lots of memcached, and one metric we track with StatsD is cache hits vs. cache misses on our frontend. On every request that hits a cacheable action we send an event to StatsD to record a hit or miss. 

        Caching Baseline (Green: cache hits, Blue: cache misses)


        Note: The graphs in this article were generated by Graphite, the real-time graphing system that StatsD runs on top of.

        As an example of how this is useful, we recently added some data to a cache key that wasn't properly converted to a string, so that piece of the key was appearing to be unique far more often than it was. The net result was more cache misses than usual. Looking at our NewRelic data we could see that performance was affected, but it was difficult to see exactly where. The response time from our memcached servers was still good, the response time from the app was still good, but our number of cache misses had doubled, our number of cache hits had halved, and overall user-facing performance was down.

        A problem


        It wasn't until we looked at our StatsD graphs that we fully understood the problem. Looking at our caching trends over time we could clearly see that on a specific date something was introduced that was affecting caching negatively. With a specific date we were able to track down the git commit and fix the issue. Keeping an eye on our StatsD graphs we immediately saw the behaviour return to the normal trend.

        Return to Baseline


        The second thing that StatsD excels at is proving assumptions. When we're writing code we're constantly making assumptions. Assumptions about how our web app may be used, assumptions about how often an interaction will be performed, assumptions about how fast a particular operation may be, assumptions about how successful a particular operation may be. Using StatsD it becomes trivial to get real data about this stuff.

        For instance, we push a lot of products to Google Product Search on behalf of our customers. There was a point where I was seeing an abnormally high number of failures returned from Google when we were posting these products via their API. My first assumption was that something was wrong at the protocol level and most of our API requests were failing. I could have done some digging around in the database to get an idea of how many failures we were getting, cross-referenced with how many products we were trying to publish and how frequently, etc. But using our StatsD client (see below) I was able to add a simple success/failure metric to give me a high-level overview of the issue. Looking at the graph from StatsD I could see that my assumption was wrong, so I was able to eliminate that line of thinking.


        We were excited about StatsD as soon as we read Etsy's announcement. We wrote our own client and began using it immediately. Today we're releasing that client. It's been in use in production since then and has been stalwartly collecting data for us. On an average request we're sending ~5 events to StatsD and we don't see a performance hit. We're actually using StatsD to record the raw number of requests we handle over time.

        statsd-instrument provides some basic helpers for sending data to StatsD, but we don't typically use those directly. We definitely didn't want to litter our application with instrumentation details so we wrote metaprogramming methods that allow us to inject that instrumentation where it's needed. Using those methods we have managed to keep all of our instrumentation contained to one file in our config/initializers folder. Check out the README for the full API or pull down the statsd-instrument rubygem to use it.

        A sample of our instrumentation shows how to use the library and the metaprogramming methods:

        # Liquid
        Liquid::Template.extend StatsD::Instrument
        Liquid::Template.statsd_measure :parse, 'Liquid.Template.parse'
        Liquid::Template.statsd_measure :render, 'Liquid.Template.render'
        # Google Base
        GoogleBase.extend StatsD::Instrument
        GoogleBase.statsd_count_success :update_products!, 'GoogleBase.update_products'
        # Webhooks
        WebhookJob.extend StatsD::Instrument
        WebhookJob.statsd_count_success :perform, 'Webhook.perform'
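        If you're curious how a helper like statsd_measure can be built with metaprogramming, here's a simplified sketch. This is not the gem's actual implementation (see the statsd-instrument README for that), and the hash used to record timings stands in for actually sending them to StatsD:

```ruby
# A simplified, illustrative sketch of a statsd_measure-style helper.
# Extending a class with this module adds a class method that wraps an
# instance method so every call's elapsed time is recorded under a key.
module MiniInstrument
  def statsd_measure(method_name, key)
    original = instance_method(method_name)
    define_method(method_name) do |*args, &block|
      start = Time.now
      result = original.bind(self).call(*args, &block)
      elapsed_ms = (Time.now - start) * 1000.0
      # A real client would send elapsed_ms to StatsD under `key`;
      # here we just keep it in memory for inspection.
      (self.class.measurements[key] ||= []) << elapsed_ms
      result
    end
  end

  def measurements
    @measurements ||= {}
  end
end

# Hypothetical class standing in for something like Liquid::Template.
class Template
  extend MiniInstrument

  def render
    "rendered"
  end
  statsd_measure :render, "Template.render"
end

Template.new.render
```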

        That being said, there are a few places where we do make use of the helpers directly (sans metaprogramming), still within the confines of our instrumentation initializer:

        ShopAreaController.after_filter do
          StatsD.increment 'Storefront.requests', 1, 0.1
          next unless request.env['cacheable.cache']
          if request.env['cacheable.miss']
            StatsD.increment 'Storefront.cache.miss'
          elsif request.env['cacheable.store'] == 'client'
            StatsD.increment 'Storefront.cache.hit_client'
          elsif request.env['cacheable.store'] == 'server'
            StatsD.increment 'Storefront.cache.hit_server'
          end
        end

        Today we're recording metrics on everything from the time it takes to parse and render Liquid templates, how often our Webhooks are succeeding, performance of our search server, average response times from the many payment gateways we support, success/failure of user logins, and more.

        As I mentioned, we have many tools in our data toolbox, and StatsD is a low-friction way to easily collect and inspect metrics. Check out statsd-instrument on github.

        Continue reading

        Prognostication For Fun And Profit: States And Events

        Prognostication For Fun And Profit: States And Events

        Measuring stuff is hard, even when it stands still. When things change over time the situation gets even worse. 

        I'm Ben Doyle, research scientist and data prophet at Shopify. Over the next little while I'm planning to go into some detail about how we're calculating some common ecommerce metrics (like customer count, churn rate, lifetime value, etc.) here at Shopify.  Most of these metrics come down to fancy ways of counting, but the devil is in the details. 
        For Instance...
        You have a table of users and a boolean flag that says whether each of them is a paying customer. Counting your customers is as simple as counting the rows where the flag is set. Or is it? What if you want to plot your growth over time? You can replace the flag with a date range and count how many of these ranges include a date in question. What if your customer definition changes, perhaps due to a change in your business model? You have to go back through your whole history, which might be a problem if your new definition relies on information you have just started collecting. If you are doing exploratory work you might not even know what data you need yet. 
        These problems have solutions but clearly there's work to do. We need to untangle our methods and sort out our definitions. I thought I'd kick things off by clarifying the elements most basic to counting: intervals and points.

        Intervals and Points



        In the diagram above, the dark blue lines represent intervals and the light blue circles represent points (each point should really be only a single pixel, but they're drawn large so we can see them). In general you need intervals to measure points and you need points to measure intervals. So in a) three of the four intervals overlap with the point and in b) three of the four points overlap with the interval.

        States And Events

        In more familiar terms, the space we're usually talking about is time, our points are events and our intervals are states. Events can be found in server logs or as rows in a transactional database. So every time a purchase is made or a signup form is completed, an event is created. At minimum the event type and a timestamp will be recorded, though the event will usually be associated with a user or other entity.

        A state history is often the result of a state machine as Willem discussed in a recent post. For example a subscriber can be subscribed to one or many of several subscription packages. Then keeping a history of those packages means having a start and end date for each package, for each subscriber. States must at minimum have a start and end timestamp and a type, though an entity is usually recorded as well.

        Example: Accounting

        Understanding states and events clarifies the distinction between different sorts of counting. For example, standard accounting practice is to report both a balance sheet and an income statement.

        The balance sheet is a snapshot. This means it is an event in time being used to count states. Consider an account that had $100 in the interval between May 1st and June 18th.  It had $100 on an event date of June 1st, which lies within the interval. The state of every account can be assessed on a given date, and a total reported.

        The income statement reports changes over a period of time. It uses an interval to count a collection of events. So if there was a withdrawal event on June 19th of $5 and deposits on the 20th and 21st of $10, the income statement for June would be $10 + $10 - $5 = $15. If these were the only events for June, the balance sheet for July 1st should be $115.

        These are two different and complementary ways to look at the problem of counting over time. In theory, if you add up all of your income statements before a date and time, the sum should be the same as your current balance at that date and time. So the results are equivalent. Unfortunately, if there are errors or omissions in your data, discrepancies will arise between the two methods.
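        The two counting styles can be sketched in a few lines of code. The account activity below matches the example above; the dates and amounts are just the illustrative figures from the text.

```ruby
require "date"

# Events as (date, amount) pairs. Deposits are positive, withdrawals negative.
events = [
  [Date.new(2011, 5, 1),  100],  # opening deposit
  [Date.new(2011, 6, 19),  -5],  # withdrawal
  [Date.new(2011, 6, 20),  10],  # deposit
  [Date.new(2011, 6, 21),  10],  # deposit
]

# Balance sheet: a point in time counting states -- sum everything up to a date.
def balance(events, as_of)
  events.select { |date, _| date <= as_of }.sum { |_, amount| amount }
end

# Income statement: an interval counting events -- sum within a date range.
def income(events, from, to)
  events.select { |date, _| date >= from && date <= to }.sum { |_, amount| amount }
end

june_income  = income(events, Date.new(2011, 6, 1), Date.new(2011, 6, 30))  # 15
july_balance = balance(events, Date.new(2011, 7, 1))                        # 115
```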



        Perhaps the most basic metric to track is "How many customers did we have on date X?". If you have a software system already set up that consistently tracks the states of your users (e.g. visitor, customer, churned), and the events that can influence them (e.g. signups, payments, cancellations), the answer can be easy. As in the finance example above, you can present a balance sheet for any point in time, counting customer states at that point. In status quo situations this will probably be the preferred method.

        It's nice to know that there is an alternative at hand for when things get complicated though. Counting the events that change customer states will give you greater flexibility to adapt to changes. So if you suddenly decide you want to count “happy” customers separately, looking at the events that indicate happiness -- as opposed to "customer-ness" -- is a good place to start.

        For an example of the above, you could count the number of events signifying happiness (e.g. logins or site interactions) within 30 days of a particular date. This could serve as a happiness metric for that date. Since you are counting the events directly, it's easy to add flourishes like having some events count more than others, having their weights decay over time, or even incorporating events from an entirely different source. This flexibility especially helps if you are trying to build up a metric to serve as a proxy for or to predict another metric. For example, you could use the weights on your event types as adjustable parameters in fitting a model. Continuing with the example, our happiness metric could be constructed to predict customer churn. I'll go into more detail about these sorts of analyses in future posts.
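        As an illustration, here's a sketch of such a happiness metric: events within a 30-day window, a weight per event type, and an exponential decay so recent events count more. The event types, weights and half-life are all made-up parameters of the kind you might later fit to predict churn.

```ruby
require "date"

# events: [date, type] pairs; weights: adjustable per-type parameters.
# window and half_life (in days) are illustrative tuning knobs.
def happiness(events, as_of, weights, window: 30, half_life: 15.0)
  events.sum do |date, type|
    age = (as_of - date).to_i
    next 0.0 unless age.between?(0, window)   # ignore events outside the window
    weights.fetch(type, 0.0) * 0.5**(age / half_life)  # exponential decay
  end
end

events = [
  [Date.new(2011, 8, 1),  :login],
  [Date.new(2011, 8, 10), :site_interaction],
  [Date.new(2011, 6, 1),  :login],  # outside the 30-day window, ignored
]
weights = { login: 1.0, site_interaction: 2.0 }

score = happiness(events, Date.new(2011, 8, 15), weights)
```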

        When In Doubt, Start With Events

        If you are running a store it's nice to know how many customers you have, but it's more important to know how many sales you've made. The definition of customer is abstract and can be arbitrary. Do they need to make a purchase? Several purchases? Do coupons or promotions count? When do they cease to be a customer? In contrast, sales events are much harder to argue about, so they make a great basis for metrics. I hope this article has left you with some insight into the seemingly simple act of counting, and I welcome questions or comments.

        Continue reading

        Developing Shopify Apps, Part 1: The Setup

        Developing Shopify Apps, Part 1: The Setup

        What is a Shopify App?

        Shopify is a pretty capable ecommerce platform on its own, and for a lot of shopowners, it's all they need for their shops. However, there are many cases where shopowners need features and capabilities that don't come "out of the box" with Shopify. That's what apps are for: to add those extra features and capabilities to Shopify.

Apps make use of the Shopify API, which lets you programmatically access a shop's data -- items for sale, orders and so on -- and take most of the actions available to you from a shop's control panel. An app can automate a tedious or complex task for a shopowner, make the customer's experience better, give shopowners better insight into their sales and other data, or integrate Shopify with other applications' data and APIs in useful ways.

        Here are some apps that you can find at the Shopify App Store. These should give you an idea of what's possible:

• Jilt: This is an app that makes shopowners' lives easier. It helps turn abandoned carts -- which arise when a customer shops on your store, puts items in the cart, and then for some reason never completes the purchase -- into orders. After a specified amount of time, it sends an email to customers who've filled carts but never got around to buying their contents. It's been shown to recover sales that would otherwise never have been made.
• Searchify: Here's an app that makes the customer experience more pleasant. It's an autocompleting search box that uses your shop's data to let customers see matching products as they type. The idea is that by making your shop easier to search, you'll get more sales.
        • Beetailer: A good example of taking the Shopify API and combining it with other APIs. It lets your customers comment on your shop's products and share opinions about them on social media sites like Facebook and Twitter. You can harness the power of word-of-mouth marketing to get people to come to your store!

        Shopify apps offer benefits not just for shopowners and their customers, but for developers as well. Developers can build custom private apps for individual shopowners, or reach the 16,000 or so Shopify shopowners by selling their apps through the App Store. The App Store is a great way to get access to some very serious app customers: after all, they're looking for and willing to spend money on apps that make their shops more profitable. Better still, since a healthy app ecosystem is good for us as well, we'll be more than happy to help showcase and promote your apps.

        If you've become convinced to write an app, read on, and follow this series of articles. I'll explore all sorts of aspects of Shopify app-writing, from getting started to selling and promoting your apps. Enjoy!

        Step 1: Become a Partner

        Before you can write apps, you have to become a Shopify Partner. Luckily, it's quick and free to do so. Just point your browser at the Shopify Partners login page (https://app.shopify.com/services/partners/auth/login):

Once you're there, click on the Become a partner button. That will take you to the Become a Shopify Partner form, a single page in which you provide some information, such as your business' name, your URL, and whether you're into Shopify consulting, app development or theme design, as well as some contact info:

        When you submit this form, you're in the club! You're now a Shopify partner and ready to take on the next step: creating a test shop.

        Step 2: Create a New Test Shop

        Test shops are a feature of Shopify that let you try out store themes and apps without exposing them to the general public. They're a great way to familiarize yourself with Shopify's features; they're also good "sandboxes" in which you can safely test app concepts.

        The previous step should have taken you to your Shopify partner account dashboard, which looks like this:

        It's time to create a test shop. Click on the Test Shops tab, located not too far from the top of the page:

        You'll be taken to the My Test Shops page, where you manage your test shops. It looks like this:

        As you've probably already figured out, you can create a new test shop by either:


        • Clicking on the Create a new Test Shop button near the upper left-hand corner of the page
        • Clicking on the big Create your first Test Shop button in the middle of the page. I'm going to click that one...

        You should see this message near the top of the page for a few moments:

        ...after which you should see the My Test Shops page now sporting a test shop in a list.

        Test shops are given a randomly-generated name. When you decide to create a real, non-test, customer-facing shop, you can name it whatever you want from the start.

        In this example, the test shop is Nienow, Kuhlman and Gleason (sounds like a law firm!). Click on its name in the list to open its admin panel.

        Step 3: Launch Your Test Shop

        Here's what the admin panel for a newly-created shop looks like:

If you're wondering what the URL for your shop is, it's at the upper left-hand corner of the page, just to the right of the Shopify wordmark. Make a note of this URL; you'll use it often.

        Just below that, you'll see your shop's password:

        (Don't bother trying to use this password to get to my test shop; I've changed it.)

        You're probably looking at that big text and thinking "7 steps? Oh Shopify, why you gotta be like that?"

        Worry not. Just below that grey bar showing the seven steps you need to get a store fully prepped is a link that reads Skip setting up your store and launch it anyway. Click it:

        This will set up your test store with default settings, a default theme and even default inventory. You'll be taken to the admin panel for your shop, which looks like this:

        This is the first thing shopowners see when they log into their shops' admin panels.

        Now, let's add an app!

        Step 4: Add an App

        Click on the Apps tab, located near the upper right-hand corner of the page. A menu will pop up; click on its Manage Apps menu item:

        You'll be taken to the Installed Applications page, shown below:

For the purposes of this exercise, a private app -- one that works only for this shop -- will do just fine. Click on the click here link that immediately follows the line Are you a developer interested in creating a private application for your shop?:

        You'll get taken to the Shopify API page, which manages the API keys and other credentials for your test shop's apps:

        For each app in a shop, there's a corresponding set of credentials. Let's generate some credentials now -- click the Generate new application button:

        The page will refresh and you'll see a big grey box containing all sorts of credentials:

        Here's a closer look at the credentials:

        You now have credentials that an app can use. Guess what: we're ready to make some API calls!

        A Quick Taste!

        Here's a quick taste of what we'll do in the next installment: play around with the Shopify API. Just make sure you've gone through the steps above first.

        The Shopify API is RESTful. One of the benefits of this is that you can explore parts of it with some simple HTTP GET calls, which you can easily make by typing into your browser's address bar. These calls use the following format:
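The general shape, sketched here with placeholder values (substitute the API key and password from your app's credentials and your own shop's domain), embeds the app's basic-auth credentials directly in the address:

```ruby
# Placeholder credentials -- copy the real values from the big grey
# credentials box on the Shopify API page.
api_key  = "YOUR_API_KEY"
password = "YOUR_PASSWORD"
shop     = "your-shop.myshopify.com"

# The resource you want (shop.xml, products.xml, orders.xml, ...) goes
# at the end of the path.
url = "https://#{api_key}:#{password}@#{shop}/admin/shop.xml"
```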

You could type in the URL yourself, but I find it's far easier to simply copy the Example URL from the list of credentials for your app and edit it as required:

        For example, if you want some basic information about your shop, copy the Example URL, paste it into your browser's address bar and change orders.xml to shop.xml. Press Enter; you should see results that look something like this:

(The response is an XML document describing the shop: its name -- Nienow, Kuhlman and Gleason -- address, phone number, email address, currency, timezone, money formats, and plan type, which is "development" for a test shop.)

        How about the products in your shop? There are some: since we skipped the full setup, your test shop comes pre-populated with some example products. Copy the Example URL, paste it into your browser's address bar and change orders.xml to products.xml. You should get a result that looks something like this:

(The response is an XML document listing the shop's pre-populated example products. The demo product, "Multi-channelled executive knowledge user", carries a description explaining the basics: every product has a price, a weight, a picture and a description, all editable from the Products Tab; products are grouped into Collections via the Collections Tab; and Collections are linked into the shop's navigation menu via the Navigations Tab.)

        Check out the API Reference for more API calls you can try. That's what we'll be covering in the next installment, in greater detail. Happy APIing!

        Continue reading

        Why developers should be force-fed state machines

        Why developers should be force-fed state machines

        This post is meant to create more awareness about state machines in the web application developer crowd. If you don’t know what state machines are, please read up on them first. Wikipedia is a good place to start, as always.

        State machines are awesome

The main reason for using state machines is to help the design process. It is much easier to figure out all the possible edge conditions by drawing out the state machine on paper. This will make sure that your application has fewer bugs and less undefined behavior. Also, it clearly defines which parts of the internal state of your object are exposed as external API.

        Moreover, state machines have decades of math and CS research behind them about analyzing them, simplifying them, and much more. Once you realize that in management state machines are called business processes, you'll find a wealth of information and tools at your disposal.

        Recognizing the state machine pattern

        Most web applications contain several examples of state machines, including accounts and subscriptions, invoices, orders, blog posts, and many more. The problem is that you might not necessarily think of them as state machines while designing your application. Therefore, it is good to have some indicators to recognize them early on. The easiest way is to look at your data model:

        • Adding a state or status field to your model is the most obvious sign of a state machine.
        • Boolean fields are usually also a good indication, like published, or paid. Also timestamps that can have a NULL value like published_at and paid_at are a usable sign.
        • Finally, having records that are only valid for a given period in time, like subscriptions with a start and end date.

        When you decide that a state machine is the way to go for your problem at hand, there are many tools available to help you implement it. For Ruby on Rails, we have the excellent gem state_machine which should cover virtually all of your state machine needs.
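To make the pattern concrete, here is a minimal hand-rolled sketch -- the state_machine gem gives you a much richer DSL with callbacks, guards, and more. The Post class, its states, and its events are invented for illustration:

```ruby
class Post
  # event => { from_state => to_state }
  TRANSITIONS = {
    publish:   { draft: :published },
    unpublish: { published: :draft },
  }

  attr_reader :state

  def initialize
    @state = :draft
  end

  # Fires an event, moving to the next state or raising if the
  # transition is not allowed from the current state.
  def fire(event)
    to = TRANSITIONS.fetch(event, {})[@state]
    raise ArgumentError, "cannot #{event} from #{@state}" unless to
    @state = to
  end
end
```

Writing out the transition table like this is exactly the exercise of drawing the machine on paper: every edge condition has to be enumerated.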

        Keeping the transition history

Now that you are using state machines for modelling, the next thing you will want to do is keep track of all the state transitions over time. When you are starting out, you may be only interested in the current state of an object, but at some point the transition history will be an invaluable source of information. It allows you to answer all kinds of questions, like: “How long on average does it take for an account to upgrade?”, “How long does it take to get a draft blog post published?”, or “Which invoices are waiting for an initial payment the longest?”. In short, it gives you great insight on your users' behavior.

When your state machine is acyclic (i.e. it is not possible to return to a previous state) the simplest way to keep track of the transitions is to add a timestamp field for every possible state (e.g. confirmed_at, published_at, paid_at). Simply set these fields to the current time whenever a transition to the given state occurs.
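A sketch of that approach (the Invoice model and its states are invented for illustration; in a real app the timestamps would be database columns):

```ruby
class Invoice
  attr_reader :state, :timestamps

  def initialize
    @state = :draft
    @timestamps = {}
  end

  # Entering a state stamps a correspondingly named field exactly
  # once, e.g. entering :paid sets :paid_at.
  def transition_to(new_state)
    @state = new_state
    @timestamps[:"#{new_state}_at"] ||= Time.now
  end
end
```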

        However, it is often possible to revisit the same state multiple times. In that case, simply adding fields to your model won’t do the trick because you will be overwriting them. Instead, add a log table in which all the state transitions will be logged. Fields that you probably want to include are the timestamp, the old state, the new state, and the event that caused the transition.
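A sketch of such a log -- in-memory here for illustration; in a real app each entry would be a row in a transitions table:

```ruby
# Each logged transition records when it happened, the states
# involved, and the event that caused the change.
StateTransition = Struct.new(:timestamp, :from_state, :to_state, :event)

class TransitionLog
  attr_reader :entries

  def initialize
    @entries = []
  end

  def record(from_state, to_state, event)
    @entries << StateTransition.new(Time.now, from_state, to_state, event)
  end
end
```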

        For Ruby and Rails, Jesse Storimer and I have developed the Ruby gem state_machine-audit_trail to track this history for you. It can be used in unison with the state_machine gem.

        Deleting records?

        In some cases, you may be tempted to delete state machine records from your database. However, you should never do this. For accountability and completeness of your history alone, it is a good practice to never delete records. Instead of removing it, add an error state for any reason you would have wanted to delete a record. A spam account? Don’t delete, set to the spam state. A fraudulent order? Don’t delete, set to the fraud state.

        This allows you to keep track of these problems over time, like: how many accounts are spam, or how long it takes on average to see that an order is fraudulent.

        In conclusion

        Hopefully, reading this text has made you more aware of state machines and you will be applying them more often when developing a web application. Disclaimer: like any technique, state machines can be overused. Developer discretion is advised.

        Continue reading

        Session Hijacking Protection

        Session Hijacking Protection

        There’s been a lot of talk in the past few weeks about “Firesheep”, a new program that lets users hijack other users’ accounts on many different websites. But there’s no need to worry about your Shopify account — we’ve taken steps to ensure your account can’t be hijacked and your data is safe.

        Firesheep is a Firefox plugin (a program that integrates right into the Firefox browser) that makes it easy to perform HTTP session cookie hijacks when using an insecure connection on an untrusted network. This kind of attack is nothing new, but Firesheep makes it dead simple and shows how prevalent it is.

        The attack consists of stealing cookie data over an untrusted network and using that data to log in to other people’s user accounts. Many websites that you use daily, including Shopify, are susceptible to this kind of attack.

        Naturally we reacted to this by taking measures to ensure that this can’t happen to our users. All of your Shopify admin data is now fully secure, encrypted, and protected from Firesheep attacks.

        Technical Details

The only way to ensure that cookie data, or any data sent over HTTP for that matter, is not being spied upon is end-to-end encryption. Currently the solution for this is SSL.

        Last week we made the switch to all SSL in the Shopify admin area. This has been applied to all URLs and all subscription plans. This means that any request made to Shopify will be forced to use SSL for secure encryption.

        But this is not quite enough to ensure that cookie data is not hijacked. By default HTTP cookies are sent over secured, as well as unsecured, connections. Without taking the extra step to secure the HTTP cookie as well, your session is still vulnerable.
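That extra step is the cookie's "secure" attribute, which tells the browser to send the cookie over encrypted connections only. A sketch of the resulting Set-Cookie header value (the cookie name and value here are placeholders):

```ruby
# Builds a Set-Cookie header value; the "secure" attribute restricts
# the cookie to HTTPS, and "HttpOnly" hides it from JavaScript.
def session_cookie(value, secure:)
  attrs = ["_session=#{value}", "path=/", "HttpOnly"]
  attrs << "secure" if secure
  attrs.join("; ")
end
```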

        The Problem

In Shopify’s case we weren’t able to use SSL for all traffic on the site. There are two main areas to Shopify: the shop frontend and the shop backend. The backend is where a shop’s employees manage product data, fulfill orders, etc. The frontend is where products are viewed, carts are filled, and checkout happens. All traffic in the backend happens under one domain, *.myshopify.com, with individual accounts having unique subdomains. One wildcard SSL cert allows us to protect the entire backend.

We can’t apply the same strategy to the shop frontends because we allow our merchants to use custom domains for their shops. So there are literally thousands of different domain names pointing at the Shopify servers, each of which would require an SSL cert. An unsecured frontend is not too worrisome since there is no sensitive data being passed around, just information about what’s stored in the cart.

        However, this meant that we would need two different session cookies, one for use in the backend to be sent on encrypted connections only, and one for use in the frontend to be sent unencrypted.

        Using two different session stores based on routes isn’t something that Ruby on Rails supports out of the box. You set one session store for your application that gets inserted into the middleware chain and handles sessions for your application.

        The Solution

So we came up with a session store that delegates to multiple underlying session stores based on the request path. Shopify still has only one session store handling all of its sessions, but if the request comes in under the admin path we’ll use the secure cookie, and if it comes in under another path we’ll use the unsecured cookie.

        Here is our implementation in its entirety: https://gist.github.com/704099
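The heart of the idea can be sketched like this (the store objects and the "/admin" prefix are assumptions for illustration; the real implementation in the gist wires this into Rack's middleware chain):

```ruby
# Picks one of two underlying session stores based on the request
# path: secure-cookie sessions for the backend, plain sessions for
# the storefront.
class PathDelegatingSessionStore
  def initialize(secure_store, insecure_store, secure_prefix: "/admin")
    @secure_store   = secure_store
    @insecure_store = insecure_store
    @secure_prefix  = secure_prefix
  end

  def store_for(path)
    path.start_with?(@secure_prefix) ? @secure_store : @insecure_store
  end
end
```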

        This last step, the secured cookie, ensures that session cookie data is never available for hijacking.

        Continue reading

        Shopify's path to Rails 3

        Shopify's path to Rails 3

        The TL;DR version

        Shopify recently upgraded to Rails 3!

        We saw minor improvements in overall response times but what we’re most happy with is the new API – it means we get to write cleaner code and get features out faster.

        However, this upgrade wasn’t trivial – as one of the largest and oldest Rails apps around, the adventure involved jumping through a few hoops. Here’s what we did and what you might consider if you’ve got an established Rails app that you’re thinking of upgrading.

        First, some numbers

        The first svn check-in to Shopify was on the release date of Rails 0.5. That was in July of 2004, six years ago, which according to @tobi is “roughly 65 years in internet time”.

        At that time Shopify had only two active developers. Today it has eleven full time devs working on it.

        The Shopify codebase has over 300 files in the app/models directory, over 130 controllers, and almost 100 gem dependencies.
        $ find app/models/ -type f | wc -l
        $ find app/controllers/ -type f | wc -l
        $ bundle show | wc  -l

        Over the past 6 years Shopify has been under constant development, amassing nearly 12000 commits. This makes Shopify one of the oldest, most active Rails projects in existence.

        Our process

        There are many Rails 3 upgrade guides out there, but we didn’t try to follow any of them. We focused on doing as much as we could ahead of time to prepare for Rails 3, and then giving one big final push when 3.0 final was released.

        When upgrading a large app to a major release like this we found there are some things you can do to prepare yourself, but at a certain point you’ve just got to bite the bullet and make the final push to get things working.


Bundler

Shopify had been using Bundler in production for 9 months before making the move to Rails 3. Like most, we weren’t convinced of its utility at first, but as the code got more stable we saw how much it helped with deployments and managing development environments. We think Bundler was absolutely the right choice for managing dependencies.

        It was pretty painless to use Bundler with Rails 2.3.x, the Bundler documentation has everything that is needed. We’d definitely recommend doing this step ahead of time as it removes one more obstacle in the Rails 3 migration.


XSS

This was a big one. Some more numbers: Shopify has about 100 helper modules and 130 views. The task of updating all of our views/helpers for the new ‘safe by default’ XSS behaviour was a separate migration all its own. This too, we completed a few months before the release of 3.0.

        There was no secret way to go about this, just the obvious back-breaking way. Here’s the basic process I followed:

        1. Run the functional tests. Fix any issues that show up there.
        2. Boot up Shopify in my development environment and click around, fixing any issues I see there.
        3. Manually scan through all of the modules in app/helpers, looking for anything suspicious.
        4. Deploy the code to our staging server. Have the team try it out and report any errors to a shared Google spreadsheet (great for collaborative editing).
        5. Code review.
        6. Deploy the code to production and hope that no issues slipped through.

        N.B. When new issues come in, do your best to use ack (or some other project search tool) to find any instances of that issue in other views/helpers and correct those as well.

        The rest

        After getting Bundler and XSS out of the way, the rest of the migration was done as one large chunk. Some of the work in upgrading to Rails 3 was actually going on in parallel to the XSS work.

        The first commit to our rails3 branch was made back in February when the first Rails 3 beta was released. At that point we didn’t know how much work it would be to get Shopify running on Rails 3. We were excited about the launch of the beta and the prospect of getting Shopify using it soon.

After a few days of work we ran into some major blockers that were keeping the app from functioning. Work was abandoned on the rails3 branch for 5 months while the 3.0 release became more stable. When the first release candidate came out in July, we resurrected the rails3 branch.

From then (mid-July) until mid-October the rails3 branch saw pretty constant action, never going more than a few days without a commit. There was a lull during the XSS migration, and when devs took on other projects. We remained mindful of the fact that 3.0 final wasn’t yet released and didn’t want to put our changes into production until we had the confidence of that final release.

        Since this whole process took several months there was a lot of activity going on in the master branch at the same time. The only advice to offer is merge early and merge often.

        When the final release came out we once again underestimated how much work would be involved in getting Shopify the rest of the way on to Rails 3. The day that it was released @tobi put something like the following into our Campfire room “Let’s get Shopify running on Rails 3! Any devs who want to help join the Meeting Room [campfire room].” It was another few weeks before all was finished.

        Major stumbling blocks


Routes

Shopify also has lots of routes.

        $ rake routes | wc -l

        At the beginning of the upgrade process we used the routes rake task that comes with the rails_upgrade plugin but we were still plagued with missing routes throughout the upgrade.

        Although our routes tripled in size, the increase was worth it because the new routing API is much nicer to work with.

The old
map.namespace :admin do |admin|
  admin.resources :products,
    :collection => { :inventory => :get, :count => :get },
    :member => { :duplicate => :post,
                 :sort => :post,
                 :reorganize => :any,
                 :update_published_status => :post } do |products|
    products.resources :variants, :controller => "product_variants",
      :collection => { :reorder => :post, :set => :post, :count => :get }
  end
end
The new
namespace :admin do
  resources :products do
    collection do
      get :count
      get :inventory
    end
    member do
      post :sort
      post :duplicate
      post :update_published_status
      match :reorganize
    end
    resources :variants, :controller => 'product_variants' do
      collection do
        get :count
        post :set
        post :reorder
      end
    end
  end
end


Libraries

Like everyone else we were tripped up by libraries in need of upgrades for Rails 3 compliance. There was a lot less of this than you’d expect because Shopify implements so much of what it needs internally. Lots of code in Rails core began in Shopify’s code base.

        There were updates required to the plugins that Shopify maintains. Otherwise, when we found issues with libraries we were happy to discover that other maintainers were diligent and had already pushed fixes for Rails 3 compatibility, it was just a matter of updating library versions we were tracking.

        helper :all

helper(:all) was a configuration option in Rails 2.x. You could add it to a controller and that controller would have access to all helper modules defined in your application. In 2.x this was part of the default Rails template, but it could be removed for users who didn’t want it.

        In Rails 3.0 this has been moved into ActionController::Base and it can no longer be turned off. This can create very weird behaviour like the following: https://gist.github.com/517669

This was causing issues for us since a lot of our helpers define methods with the same name. We ended up submitting a patch to Rails that lets applications opt out of this behaviour. The fix is to call the clear_helpers method in your ApplicationController:

class ApplicationController < ActionController::Base
  clear_helpers
end

        External services

        Shopify integrates with a myriad of external services. Payment gateways through ActiveMerchant, fulfillment services through ActiveFulfillment, shipping providers through ActiveShipping, product search engines, Google Analytics, Google Checkout, the list goes on.

        Ensuring that these integrations continued working was very important for us and we would have had issues had we not thoroughly tested them. Don’t overlook this step.

        Looking ahead

        Towards the end of the upgrade we (jokingly) asked ourselves if it was really worthwhile to upgrade to Rails 3. After all, we were doing just fine with Rails 2.x, and upgrading to 3.0 was not trivial.

        To give you an idea of how much code was changed, here’s the diffstat from Github:

        But we soon came to realize that there are a lot of exciting things coming in future releases in the 3.x series and this is the way forward. We’re really excited about getting to use stuff like Arel 2.0, Automatic Flushing, Identity Map, and lots of other goodies.

        The Rails project and its surrounding ecosystem are moving ahead quickly. By staying on top of it, we can provide the best tools for our developers and the best experience for our customers.

        Continue reading

        ActiveMerchant version 1.9 released

        ActiveMerchant version 1.9 released

        A little bit of background history

        As some of you may know, quite a while ago Shopify extracted all of its payment gateway related code into the open source project ActiveMerchant. Since then the project has evolved into one of the most successful Ruby libraries with over 400 “forks” (meaning that other developers customized the code to their needs and added functionality as required).

Whenever developers think their changes are a contribution to the official project (e.g. by adding support for a new payment gateway), they send out a so-called “pull request”. After we review the implementation, we usually merge their changes into ActiveMerchant for everyone to use, meaning all Shopify customers and every programmer using the ActiveMerchant library in their code base will benefit from the new updates.

        Exciting news

        We have been digging around a lot lately for interesting changes to the project and decided to pull some of the bigger ones into the official repository. This resulted in the release of version 1.8.0 last month adding two new gateways.

        Since then we found even more interesting contributions that we decided to merge into the official project and we also developed two offsite integrations internally. The result is that ActiveMerchant (and thus Shopify) now supports seven additional payment gateways for merchants from various countries around the world:

        The seventh new gateway is SagePay Form, an offsite alternative to our existing SagePay implementation, in order to give merchants in the United Kingdom and Ireland the option of using 3D Secure for transactions. 3D Secure is required for certain U.K. credit card brands.

        Open Source is awesome-sauce

That brings the number of supported gateways in ActiveMerchant to an impressive total of 63. This would not have been possible without the help of an international community, so huge thanks go out to all the contributors that helped ActiveMerchant spread around the world!

        If you are aware of any gateway implementations that should make it into the official ActiveMerchant gem let us know and we’ll be happy to review them.

        Continue reading
