Building an Internal Cloud with Docker and CoreOS

This is the first in a series of posts about adding containers to our server farm to make it easier to scale, manage, and keep pace with our business.

The key ingredients are:

  • Docker: container technology for making applications portable and predictable
  • CoreOS: provides a minimal operating system, systemd for orchestration, and Docker to run containers

Shopify is a large Ruby on Rails application that has undergone massive scaling in recent years. Our production servers are able to scale to over 8,000 requests per second by spreading the load across 1700 cores and 6 TB RAM.

Kafka Producer Pipeline for Ruby on Rails

In the early fall our infrastructure team was considering Kafka, a highly available message bus. We were looking to solve several infrastructure problems that had come up around that time.
  • We were looking for a reliable way to collect event data and send it to our data warehouse.

  • We were considering a more service-oriented architecture, and needed a standardized way of message passing between the components.

  • We were starting to evaluate containerization of Shopify, and were searching for a way to get logs out of containers.

We were intrigued by Kafka due to its highly available design. However, Kafka runs on the JVM, and its primary user, LinkedIn, runs a full JVM stack. Shopify is mainly Ruby on Rails and Go, so we had to figure out how to integrate Kafka into our infrastructure.

Building a Rack Middleware

I'm Chris Saunders, one of Shopify's developers. I like to keep journal entries about the problems I run into while working on the various codebases within the company.

Recently we ran into an issue with authentication in one of our applications, and as a result I ended up learning a bit about Rack middleware. I feel the experience was worth sharing with the world at large, so here is a rough transcription of my entry. Enjoy!


I'm looking at invalid form submissions for users who were trying to log in via their Shopify stores. The issue was actually at a middleware level, since we were passing invalid data off to OmniAuth which would then choke because it was dealing with invalid URIs.

The bug in particular was that we were generating the shop URL based on the data that the user was submitting. Normally we'd expect something like mystore.myshopify.com or simply mystore, but of course forms can be confusing, and people put in things like http://mystore.myshopify.com or, even worse, my store. We'd build up a URL, end up passing something like https://http::/mystore.myshopify.com.myshopify.com along, and cause an exception to be raised.

Another caveat is that we aren't even able to sanitize the input before passing it off to OmniAuth, unless we were to add more code to the lambda that we pass into the setup initializer.

Adding more code to an initializer is definitely less than optimal, so we figured that we could implement this in a better way: adding a middleware to run before OmniAuth such that we could attempt to recover the bad form data, or simply kill the request before we get too deep.

We took a bit of time to learn about how Rack middlewares work, and looked to the OmniAuth code for inspiration since it provides a lot of pluggability and is what I'd call a good example of how to build out easily extendable code.

We decided that our middleware would be initialized with a series of routes to run a bunch of sanitization strategies on. Based on how OmniAuth works, I gleaned that the arguments after config.use MyMiddleWare would be passed into the middleware during the initialization phase - perfect! We whiteboarded a solution that would work as follows:
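The wiring we whiteboarded looked roughly like the following; the middleware name, strategy names, and route are illustrative placeholders, not the gem's actual API:

```ruby
# config.ru (sketch): run our sanitizer ahead of OmniAuth so bad shop
# domains are repaired (or the request halted) before OmniAuth sees them.
use ShopDomainSanitizer, "/auth/shopify" => [ConvertWhitespace, StripProtocol]

use OmniAuth::Builder do
  provider :shopify, ENV["SHOPIFY_API_KEY"], ENV["SHOPIFY_SHARED_SECRET"]
end

run MyApp
```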

Now that we had a goal we just had to implement it. We started off by building out the strategies since that was extremely easy to test. The interface we decided upon was the following:

We decided that the actions would be destructive, so instead of creating a new Rack::Request at the end of our strategies call, we'd change values on the object directly. It simplifies things a little bit but we need to be aware that order of operations might set some of our keys to nil and we'd have to anticipate that.

The simplest of sanitizers we'd need is one that cleans up our whitespace. Because we are building these for .myshopify.com domains we know the convention they follow: dashes are used as separators between words if the shop was created with spaces. For example, if I signed up with my super awesome store when creating a shop, that would be converted into my-super-awesome-store. So if a user accidentally put in my super awesome store we can totally recover that!
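A hypothetical version of that whitespace sanitizer (the class name is ours, not the gem's):

```ruby
# Shops created with spaces in the name get dash-separated domains, so
# "my super awesome store" can be recovered as "my-super-awesome-store".
class ConvertWhitespace
  def self.call(request)
    shop = request.params["shop"].to_s.strip
    request.update_param("shop", shop.gsub(/\s+/, "-"))
  end
end
```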

Now that we have a sanitization strategy written up, let's work on our actual middleware implementation.

According to the Rack spec, all we really need to do is ensure that we return the expected result: an array consisting of three things: a response code, a hash of headers, and an iterable that represents the content body. An example of the most basic Rack response is:
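In plain Ruby, that's just an array literal:

```ruby
# The most basic valid Rack response: status, headers, each-able body.
response = [200, { "Content-Type" => "text/plain" }, ["Hello, world!"]]
```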

Per the Rack spec, middlewares are always initialized with a Rack app as the first argument, followed by whatever other arguments were given. So let's get to the actual implementation:
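Here's a simplified sketch of such a middleware; the class name and the route-to-strategies mapping are illustrative, not the actual gem's code:

```ruby
# Runs each configured strategy against matching paths, then hands the
# request off to the next app in the stack.
class SanitizationMiddleware
  def initialize(app, routes = {})
    @app = app
    @routes = routes  # e.g. { "/auth/shopify" => [ConvertWhitespace] }
  end

  def call(env)
    if (strategies = @routes[env["PATH_INFO"]])
      request = Rack::Request.new(env)
      strategies.each { |strategy| strategy.call(request) }
    end
    @app.call(env)
  end
end
```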

That's pretty much it! We've written up a really simple middleware that takes care of cleaning up some bad user input, which isn't necessarily a bad thing: people make mistakes, and we should try as much as possible to react to this data in a way that isn't jarring to the users of our software.

You can check out our implementation on GitHub and install it via RubyGems. Happy hacking!

IdentityCache: Improving Performance one Cached Model at a Time

A month ago Shopify was at BigRubyConf where we mentioned an internal library we use for caching ActiveRecord models called IdentityCache. We're pleased to say that the library has been extracted out of the Shopify code base and has been open sourced!
 
At Shopify, our core application has been database performance bound for much of our platform’s history. That means that the most straightforward way of making Shopify more performant and resilient is to move work out of the database layer. 
 
For many applications, achieving a very high cache hit ratio is a matter of storing full cached response bodies, versioning them based on the associated records in the database, always serving the most current version, and relying on the cache's LRU algorithm for expiration.
 
That technique, called a "generational page cache", is well proven and very reliable. However, part of Shopify's value proposition is that store owners can heavily customize the look and feel of their shops; we in fact offer a full-fledged templating language.
 
As a side effect, full page static caching is not as effective as it would be in most other web platforms, because we do not have a deterministic way of knowing what database rows we’ll need to fetch on every page render. 
 
The key metric driving the creation of IdentityCache was our master database's queries per second, and thus the goal was to reduce read operations reaching the database as much as possible. IdentityCache does this by moving the workload to Memcached instead.
 
The inability of a full page cache to take load away from the database becomes even more evident during write-heavy (and thus page-cache-expiring) events like Cyber Monday and flash sales. On top of that, the traffic on our web app servers typically doubles each year, and we invested heavily in building out IdentityCache to help absorb this growth. For instance, in 2012, during the last sales peak before IdentityCache, we saw 130,000 requests per minute generating 21,000 queries per second; by comparison, the latest flash sale, in April 2013, generated 203,000 requests per minute with only 14,500 queries per second.

What Exactly is IdentityCache?

IdentityCache is a read-through cache for ActiveRecord models. When reading records from the cache, IdentityCache will try to fetch the requested object from Memcached. If the cache entry doesn't exist, IdentityCache will load the object from the database and store it in Memcached; the cached copy is then available for subsequent reads, avoiding any more trips to the database. This behaviour is key during events that expire the cache often.
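The read-through pattern can be sketched in a few lines of plain Ruby (an illustration of the idea, not IdentityCache's actual internals; the cache key format here is made up):

```ruby
# Read-through caching: try the cache first, fall back to the database,
# and populate the cache on a miss so later reads skip the database.
def fetch_record(cache, db, id)
  key = "blob:product:#{id}"
  cached = cache[key]
  return cached if cached

  record = db[id]     # cache miss: hit the database
  cache[key] = record # store for subsequent reads
  record
end
```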
 
Expiration is explicit and does not rely on Memcached's LRU. It is automatic: objects are expired from the cache by issuing a Memcached delete command as they change in the database, via after_commit hooks. This is important because, given a row in the database, we can always calculate its cache key based on the current table schema and the row's id. There is no need for the user to ever call delete themselves. It was a conscious decision to take expiration away from day-to-day developer concerns.
 
This has been a huge help as the characteristics of our application and Rails have changed. One great example of this is how Ruby on Rails changed what actions would fire after_commit hooks. For instance, in Rails 3.2, touch will not fire an after_commit. Instead of having to add expires, and think about all the possible ramifications every time, we added the after_touch hook into IdentityCache itself.
 
Aside from the default key, built from the schema and the row id, IdentityCache uses developer defined indexes to access your models. Those indexes simply consist of keys that can be created deterministically from other row fields and the current schema. Declaring an index will also add a helper method to fetch your cached models using said index.
 
IdentityCache is opt-in, meaning developers need to explicitly specify what should be indexed and explicitly ask for data from the cache. It is important that developers don't have to guess whether calling a method will return a cached entry or not.
 
We think this is a good thing. Having caching hook in automatically is nice in its simplest form. However, IdentityCache wasn't built for simple applications; it has been built for large, complicated applications where you want, and need, to know what's going on.

Down to the Numbers

If that wasn’t good enough, here are some numbers from Shopify itself.
 
 
This is an example of when we introduced IdentityCache to one of the objects that is heavily hit on the shop storefronts. As you can see, we cut out thousands of calls to the database when accessing this model. This was huge, since the database is one of the most heavily contended components of Shopify.
 
 
This example shows similar results once IdentityCache was introduced. We reduced what was approaching 50K calls per minute (and growing steadily) to almost nothing, since the subscription is now embedded in the Shop object. Another huge win from IdentityCache.

Specifying Indexes

Once you include IdentityCache into your model, you automatically get a fetch method added to your model class. Fetch will behave like find plus the read-through cache behaviour.
 
You can also add other indexes to your models so that you can load them using a different key. Here are a few examples:
class Product < ActiveRecord::Base
  include IdentityCache
end

Product.fetch(id)

class Product < ActiveRecord::Base
  include IdentityCache
  cache_index :handle
end

Product.fetch_by_handle(handle)
We’ve tried to make IdentityCache as simple as possible to add to your models. For each cache index you add, you end up with a fetch_* method on the model to fetch those objects from the cache.
 
You can also specify cache indexes that look at multiple fields. The code to do this would be as follows:
class Product < ActiveRecord::Base
  include IdentityCache
  cache_index :shop_id, :id
end

Product.fetch_by_shop_id_and_id(shop_id, id)

Caching Associations

One of the great things about IdentityCache is that you can cache has_one, has_many and belongs_to associations as well as single objects. This really sets IdentityCache apart from similar libraries.
 
This is a simple example of caching associations with IdentityCache:
class Product < ActiveRecord::Base
  include IdentityCache
  has_many :images
  cache_has_many :images
end

@product = Product.fetch(id)
@images = @product.fetch_images
What happens here is that the product is fetched from either Memcached or, on a cache miss, the database. We then look for the images in the cache, or in the database if we get another miss. This also works for has_one and belongs_to associations, with the cache_has_one and cache_belongs_to IdentityCache methods, respectively.
 
What if we always want to load the images, though? Do we always need to make two requests to the cache?

Embedding Associations

With IdentityCache we can also embed the associations with the parent object, so that when you load the parent, the associations are also cached and loaded on a cache hit. This avoids the multiple Memcached calls otherwise needed to load all the cached data. To enable this you simply need to add the :embed => true option. Here's a little example:
class Product < ActiveRecord::Base
  include IdentityCache
  has_many :images
  cache_has_many :images, :embed => true
end

@product = Product.fetch(id)
@images = @product.fetch_images
The main difference with this example versus the previous is that the '@product.fetch_images' call won't hit Memcached a second time; the data is already loaded when we fetch the product from Memcached.
 
The tradeoffs of using embed are: first, your entries in Memcached will be larger, as they have to store data for the model and its embedded associations; second, the whole cache entry will expire on changes to any of the models cached.
 
There are a number of other options and different ways you can use IdentityCache, which are highlighted on the GitHub page: https://github.com/Shopify/identity_cache. I highly encourage anyone interested to take a look at those examples for more details. Please check it out for yourself and let us know what you think!

What Does Your Webserver Do When a User Hits Refresh?

Your web application is likely rendering requests when the requesting client has already disconnected. Eric Wong helped us devise a patch for the Unicorn webserver that will test the client connection before calling the application, effectively dropping disconnected requests before wasting app server rendering time.

The Flash Sale

A common traffic pattern we see at Shopify is the flash sale, where a product will be discounted heavily or only available for a very short period of time. Our customers' flash sales can cause traffic spikes an order of magnitude above our typical traffic rate.

This blog post highlights one of the problems dealing with these traffic surges that we solved during our preparation for the holiday shopping season.

In a flash sale scenario, with our app servers under high load, response time grows.  As our response time increases, customers attempting to buy items will hit refresh in frustration.  This was causing a snowball effect that would contribute to reduced availability.

Connection Queues 

Each of our application servers runs Nginx in front of many Unicorn workers running our Rails application. When Nginx receives a request, it opens a queued connection on the shared socket that is used to communicate with Unicorn. The Unicorn workers work off requests in the order they're placed on the socket's connection backlog.

The worker process looks something like:
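In rough terms (an illustrative Ruby sketch, not Unicorn's actual source; the listener and client methods are stand-ins):

```ruby
# Each worker loops forever over the shared socket's backlog.
def run_worker(listener, app)
  while (client = listener.accept_next)                   # 1. take the next queued connection
    status, headers, body = app.call(client.read_request) # 2. render the request (the slow part)
    client.write_response(status, headers, body)          # 3. write the response back
    client.close
  end
end
```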

The second step takes the vast majority of the time spent processing a request.  Under load, the queue of pending requests sitting on the UNIX socket from Nginx grows until it reaches maximum capacity (SOMAXCONN).  When the queue reaches capacity, Nginx will immediately return a 502 to incoming requests, as it has nowhere to queue the connection.

Pending Requests

While the app worker is busy rendering a request, the pending requests in the socket backlog represent users waiting for a result.  If a user hits refresh, their browser closes the current connection and their new connection enters the end of the queue (or Nginx returns a 502 if the queue is full).  So what happens when the application server gets to the user's original request in the queue?

Nginx and HTTP 499

The HTTP 499 response code is not part of the HTTP standard.  Nginx logs this response code when a user disconnects before the application returned a result.  Check your logs - an abundance of 499s is a good indication that your application is too slow or over capacity, as people are disconnecting instead of waiting for a response.  Your Nginx logs will always have some 499s due to clients disconnecting before even a quick request finishes.

HTTP 200 vs HTTP 499 Responses During a Flash Sale

When Nginx logs an HTTP 499 it also closes the downstream connection to the application, but it is up to the application to detect the closed connection before wasting time rendering a page for a client who already disconnected.

Detecting Closed Sockets

With the asynchronous nature of sockets, detecting a closed connection isn't straightforward.  Your options are:

  • Call select() on the socket.  If a connection is closed, it will return as "data available" but a subsequent read() call will fail.
  • Attempt to write to the socket.

Unfortunately, it is typical for web applications to find out that the client socket is closed only after spending the time and resources rendering the page, when they attempt to write the response.  This is what our Rails application was doing.  The net effect was that every time a user pressed refresh, we would render that page, even if the user had already disconnected.  This would cause a snowball effect until eventually our app workers were doing little but rendering pages and throwing them away, and our service was effectively down.

What we wanted to do was test the connection before calling the application, so we could filter out closed sockets and avoid wasting time.  The first detection option above is not great: select() requires a timeout, and even with the shortest timeout, select() will generally take a fraction of a millisecond to complete.  So we went with the second solution: write something to the socket to test it, before calling the application.  This is typically the best way to deal with resources anyway: just attempt to use them, and there will be an error if something is in the way.  Unicorn was already acting that way, just not until after wasting time rendering the page.

Just write an 'H'

Thankfully all HTTP responses start with "HTTP/1.1", so (rather cheekily) our patch to Unicorn writes this string to test the connection before calling the application.  If writing to the socket fails, Unicorn moves on to process the next request and only a trivial amount of time is spent dealing with the closed connection.
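The idea can be sketched in plain Ruby (a simplification for illustration, not the actual Unicorn patch):

```ruby
# Write the response prefix first as a cheap liveness probe. If the
# client already hung up, the write raises and we skip rendering.
def serve(client, app, env)
  client.write("HTTP/1.1 ") # probe; the real response continues from here
  app.call(env)             # only render for clients still connected
rescue Errno::EPIPE, Errno::ECONNRESET
  client.close rescue nil   # client gone: drop the request, move on
  nil
end
```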

Eric Wong merged this change into Unicorn master and soon after released Unicorn v4.5.0.  To use this feature you must add 'check_client_connection true' to your Unicorn configuration.

 

Introducing the Super Debugger: A Wireless, Real-Time Debugger for iOS Apps

By Jason Brennan

LLDB is the current state of the art for iOS debugging, but it's clunky and cumbersome, and it doesn't work well with objects. It really doesn't feel very different from gdb: it's a solid tool, but it requires breakpoints, and although it integrates with Objective-C apps, it's not really built for them. Dealing with objects is cumbersome, and it's hard to see your changes.

This is where Super Debugger comes in. It's a new tool for rapidly exploring the objects in your iOS app, whether it's running on an iPhone, iPad, or the iOS Simulator, and it's available today on GitHub. Check over the included readme to see what it can do in detail.

Today we're going to run through a demonstration of an included app called Debug Me.

  1. Clone the superdb repository locally to your Mac and change into the directory.

    git clone https://github.com/Shopify/superdb.git
    cd superdb

  2. Open the included workspace file, SuperDebug.xcworkspace, select the Debug Me target and Build and Run it for your iOS device or the Simulator. Make sure the device is on the same wifi network as your Mac.

  3. Go back to Xcode and change to the Super Debug target. This is the Mac app that you'll use to talk to your iOS app. Build and Run this app.

  4. In Super Debug, you'll see a window with a list of running, debuggable apps. Find Debug Me in the list (hint: it's probably the only one!) and double click it. This will open up the shell view where you can send messages to the objects in your app, all without setting a single break point.

  5. Now let's follow the instructions shown to us by the Debug Me app.

  6. In the Mac app, issue the command .self (note the leading dot). This updates the self pointer, which will execute a block in the app delegate that returns whatever we want the variable self to point to. In this case (and in most cases), we want self to point to the current view controller. For Debug Me, that means it points to our instance of DBMEViewController after we issue this command.

  7. Now that our pointer is set up, we can send a message to that pointer. Type self redView layer setMasksToBounds:YES. This sends a chain of messages in F-Script syntax. In Objective-C, it would look like [[[self redView] layer] setMasksToBounds:YES]. Here we omit the square brackets because of our syntax.

    We do use parentheses sometimes, when passing the result of a message send would be ambiguous. For example, [view setBackgroundColor:[UIColor purpleColor]] in Objective-C would be view setBackgroundColor:(UIColor purpleColor) in our syntax.

  8. The previous step has no visible result, so let's make a change. Type self redView layer setCornerRadius:15 and see the red view get nice rounded corners!

  9. Now for the impressive part. Move your mouse over the number 15 and see it highlight. Now click and drag left or right, and see the view's corner radius update in real time. Awesome, huh?

That should be enough to give you a taste of this brand new debugger. Interact with your objects in real time. Iterate instantly. No more build, compile, wait. It's now Run, Test, Change. Fork the project on GitHub and get started today.

RESTful thinking considered harmful - followup

My previous post RESTful thinking considered harmful caused quite a bit of discussion yesterday. Unfortunately, many people seem to have missed the point I was trying to make. This is likely my own fault for focusing too much on the implementation, instead of the thinking process of developers that I was actually trying to discuss. For this reason, I would like to clarify some points.

  • My post was not intended as an argument against REST. I don't claim to be a REST expert, and I don't really care about REST semantics.
  • I am also not claiming that it is impossible to get the design right using REST principles in Rails.

So what was the point I was trying to make?

  • Rails actively encourages the REST = CRUD design pattern, and all the tutorials, screencasts, and documentation out there focus on designing RESTful applications this way.
  • However, REST requires developers to realize that stuff like "publishing a blog post" is a resource, which is far from intuitive. This causes many new Rails developers to abuse the update action.
  • Abusing update makes your application lose valuable data. This is irrevocable damage.
  • Getting REST wrong may make your API less intuitive to use, but this can always be fixed in v2.
  • Getting a working application that properly supports your process should be your end goal, having it adhere to REST principles is just a means to get there.
  • All the focus on RESTful design and discussion about REST semantics makes new developers think this is what actually matters most, and messes with them getting their priorities straight.

In the end, having a properly working application that doesn't lose data is more important than getting a proper RESTful API. Preferably, you want to have both, but you should always start with the former.

Improving the status quo

In the end, what I want to achieve is educating developers, not changing the way Rails implements REST. Rails conventions, generators, screencasts, and tutorials are all part of how we educate new Rails developers.

  • Rails should ship with a state machine implementation, and a generator to create a model based on it. Thinking of "publishing a blog post" as a transaction in a state machine is a lot more intuitive.
  • Tutorials, screencasts, and documentation should focus on using it to design your application. This would lead to better-designed applications with fewer bugs and security issues.
  • You can always wrap your state machine in a RESTful API if you wish. But this should always come as step 2.

Hopefully this clarifies a bit better what I was trying to bring across.

RESTful thinking considered harmful

It has been interesting and at times amusing to watch the last couple of intense debates in the Rails community. Of particular interest to me are the two topics that relate to RESTful design that ended up on the Rails blog itself: using the PATCH HTTP method for updates and protecting attribute mass-assignment in the controller vs. in the model.

REST and CRUD

These discussions are interesting because they are both about the update part of the CRUD model. PATCH deals with updates directly, and most problems with mass-assignment occur with updates, not with creation of resources.

In the Rails world, RESTful design and the CRUD interface are closely intertwined: the best illustration for this is that the resource generator generates a controller with all the CRUD actions in place (read is renamed to show, and delete is renamed to destroy). Also, there is the DHH RailsConf '06 keynote linking CRUD to RESTful design.

Why do we link those two concepts? Certainly not because this link was included in the original Roy Fielding dissertation on the RESTful paradigm. It is probably related to the fact that the CRUD actions match so nicely on the SQL statements in relational databases that most web applications are built on (SELECT, INSERT, UPDATE and DELETE) on the one hand, and on the HTTP methods that are used to access the web application on the other hand. So CRUD seems a logical link between the two.

But do the CRUD actions match nicely on the HTTP methods? DELETE is obvious, and the link between GET and read is also straightforward. Linking POST and create already takes a bit more imagination, but the link between PUT and update is not that clear at all. This is why PATCH was added to the HTTP spec and where the whole PUT/PATCH debate came from.

Updates are not created equal

In the relational world of the database, UPDATE is just an operator that is part of set theory. In the world of publishing hypermedia resources that is HTTP, PUT is just a way to replace a resource on a given URL; PATCH was added later to patch up an existing resource in an application-specific way.

But what is an update in the web application world? It turns out that it is not so clear cut. Most web applications are built to support processes: they are OLTP systems. A clear example of an OLTP system supporting a process is an ecommerce application. In an OLTP system, there are two kinds of data: master data of the objects that play a role within the context of your application (e.g. customer and product), and process-describing data, the raison d'être of your application (e.g., an order in the ecommerce example).

For master data, the semantics of an update are clear: the customer has a new address, or a product's description gets rewritten [1]. For process-related data it is not so clear cut: the process isn't so much updated as its state is changed due to an event: a transaction. An example would be the customer paying the order.

In this case, a database UPDATE is used to make the data reflect the new reality due to this transaction. The usage of an UPDATE statement is actually an implementation detail, and you could easily do it otherwise. For instance, the event of paying for an order could just as well be stored as a new record INSERTed into an order_payments table. Even better would be to implement the process as a state machine (the two concepts are closely linked) and to store the transactions, so you can later analyze the process.
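To make that concrete, here is a minimal hand-rolled state machine for the order process (an illustrative sketch, not any particular gem; all names are made up). Transitions are explicit events, and each one is recorded so the process can be audited later:

```ruby
# Each transaction is an explicit event with a valid source state, and
# the event log plays the role of an order_payments-style audit table.
class Order
  TRANSITIONS = {
    pay:  { from: :pending, to: :paid },
    ship: { from: :paid,    to: :shipped },
  }.freeze

  attr_reader :state, :events

  def initialize
    @state  = :pending
    @events = [] # stored transactions, available for later analysis
  end

  TRANSITIONS.each do |event, edge|
    define_method(event) do
      raise "cannot #{event} from #{@state}" unless @state == edge[:from]
      @state = edge[:to]
      @events << event
    end
  end
end
```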

Transactional design in a RESTful world

RESTful thinking for processes therefore causes more harm than it does good. The RESTful thinker may design both the payment and the shipping of an order as updates, using the HTTP PATCH method:

    PATCH /orders/42 # with { order: { paid: true  } }
    PATCH /orders/42 # with { order: { shipped: true } }

Isn't that a nice DRY design? Only one controller action is needed, just one code path to handle both cases!

But should your application in the first place be true to RESTful design principles, or true to the principles of the process it supports? I think the latter, so giving the different transactions different URIs is better:

    POST /orders/42/pay
    POST /orders/42/ship

This is not only clearer, it also allows you to authorize and validate those transactions separately. Both transactions affect the data differently, and potentially the person that is allowed to administer the payment of the order may not be the same as the person shipping it.

Some notes on implementation and security

When implementing a process, every possible transaction should have a corresponding method in the process model. This method can specify exactly what data is going to be updated, and can easily make sure that nothing else is updated unintentionally.

In turn, the controller should call this method on the model. Using update_attributes from your controller directly should be avoided: it is too easy to forget appropriate protection against mass-assignment, especially if multiple transactions in the process update different fields of the model. This also sheds some light on the mass-assignment protection debate: protection is not so much part of the controller or the model, but should be part of the transaction.
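A framework-free sketch of the difference (hypothetical names; update_attributes here is a simplified stand-in for the Rails method, not its actual implementation):

```ruby
class Order
  attr_accessor :paid, :total

  # Dangerous pattern: any attribute in the params hash gets written,
  # so a crafted request could set, say, the total to zero.
  def update_attributes(params)
    params.each { |key, value| public_send("#{key}=", value) }
  end

  # Safer pattern: the "pay" transaction only ever touches `paid`;
  # the fields it may change are whitelisted in one place.
  def pay!
    self.paid = true
  end
end
```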

Again, using a state machine to model the process makes following these principles almost a given, making your code more secure and bug-free.

Improving Rails

Finally, can we improve Rails to reflect these ideas and make it more secure? Here are my proposals:

  • Do not generate an update action that relies on calling update_attributes when running the resource generator. This way it won't be there if it doesn't need to be, reducing the possibility of a security problem.
  • Ship with a state machine implementation by default, and a generator for a state machine-backed process model. Be opinionated!

These changes would point Rails developers in the right direction when designing their applications, resulting in better, more secure applications.


[1] You may even want to model changes to master data as transactions, to make your system fully auditable and to make it easy to return to a previous value, e.g. to roll back a malicious update to the ssh_key field in the users table.

A big thanks to Camilo Lopez, Jesse Storimer, John Duff and Aaron Olson for reading and commenting on drafts of this article.

 

Update: apparently many people missed the point I was trying to make. Please read the followup post, in which I try to clarify my point.

Continue reading

Webhook Best Practices

Webhook Best Practices

Webhooks are brilliant when you’re running an app that needs up-to-date information from a third party. They’re simple to set up and really easy to consume.

Through working with our third-party developer community here at Shopify, we’ve identified some common problems and caveats that need to be considered when using webhooks. Best practices, if you will.

When Should I Be Using Webhooks?

Let’s start with the basics. The obvious case for webhooks is when you need to act on specific events. In Shopify, this includes actions like an order being placed, a product price changing, etc. If you would otherwise have to poll for data, you should be using webhooks.

Another common use-case we’ve seen is when you’re dealing with data that isn’t easily searchable through the API you’re dealing with. Shopify offers several filters on our index requests, but there’s a fair amount of secondary or implied data that isn’t directly covered by these. Re-requesting the entire product catalog of a store whenever you want to search by SKU, or grabbing the entire order history when you need to find all shipping addresses in a particular city, is highly inefficient. Fortunately some forward planning and webhooks can help.

Let’s use searching for product SKUs on Shopify as an example:

The first thing you should do is grab a copy of the store’s product catalog using the standard REST interface. This may take several successive requests if there’s a large number of products. You then persist this using your favourite local storage solution.

Then you can register a webhook on the product/updated event that captures changes and updates your local copy accordingly. Bam, now you have a fully searchable up-to-date product catalog that you can transform or filter any way you please.
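A rough sketch of that sync loop might look like the following (the ProductCache class and the payload shape are hypothetical; a real app would persist to a database and receive the parsed JSON body of the product webhook):

```ruby
# Hypothetical local mirror of a store's product catalog. The initial
# import comes from paging through the REST API; after that, the
# webhook handler keeps the copy fresh.
class ProductCache
  def initialize
    @products = {}   # product id => product attributes
  end

  # Initial import: feed in the products fetched over REST.
  def import(products)
    products.each { |p| @products[p["id"]] = p }
  end

  # Webhook handler: overwrite the local copy with the updated product.
  def handle_product_update(payload)
    @products[payload["id"]] = payload
  end

  # SKU lookups are now local and cheap, no full catalog re-fetch.
  def find_by_sku(sku)
    @products.values.find { |p| p["sku"] == sku }
  end
end
```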

How Should I Handle Webhook Requests?

There’s no official spec for webhooks, so the way they’re served and managed is up to the originating service. At Shopify we’ve identified two key issues:

  • Ensuring delivery/Detecting failure
  • Protecting our system

To this end, we’ve implemented a 10-second timeout period and a retry period for subscriptions. We wait 10 seconds for a response to each request, and if there isn’t one or we get an error, we retry the connection several times over the next 48 hours.

If you’re receiving a Shopify webhook, the most important thing to do is respond quickly. There have been several historical occurrences of apps that do some lengthy processing when they receive a webhook that triggers the timeout. This has led to situations where webhooks were removed from functioning apps. Oops!

To make sure that apps don’t accidentally run over the timeout limit, we now recommend that apps defer processing until after the response has been sent. In Rails, Delayed Job is perfect for this.
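The pattern is simply "acknowledge first, work later". Here is a minimal sketch with a plain array standing in for Delayed Job's queue (the WebhookReceiver class and all names are hypothetical):

```ruby
# Hypothetical receiver: enqueue the payload and return 200 right away,
# so the webhook sender's 10-second timeout is never hit.
class WebhookReceiver
  attr_reader :queue

  def initialize
    @queue = []
  end

  # Called once per webhook request.
  def receive(payload)
    @queue << payload   # Delayed::Job.enqueue(...) in a real app
    [200, "OK"]         # respond before doing any heavy lifting
  end

  # Run later by a background worker process.
  def work_off
    @queue.shift until @queue.empty?
  end
end
```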

What Do I Do if Everything Blows Up?

This one is a key component of good software design in general, but I think it’s worth mentioning here as the scope is beyond the usual recommendations about data validation and handling failures gracefully.

Imagine the worst case scenario: Your hosting centre exploded and your app has been offline for more than 48 hours. Ouch. It’s back on its feet now, but you’ve missed a pile of data that was sent to you in the meantime. Not only that, but Shopify has cancelled your webhooks because you weren’t responding for an extended period of time.

How do you catch up? Let’s tackle the problems in order of importance.

Getting your webhook subscriptions back should be straightforward, as your app already contains the code that registered them in the first place. If you know for sure that they’re gone you can just re-run that and you’ll be good to go. One thing I’d suggest is adding a quick check that fetches all the existing webhooks and only registers the ones that you need.
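That check might be sketched like this, with a tiny in-memory stand-in for the real API client (the topics, addresses, and the FakeWebhookAPI class are all hypothetical illustrations, not the Shopify gem's API):

```ruby
# The webhooks this hypothetical app needs to function.
REQUIRED_WEBHOOKS = [
  { topic: "orders/create",   address: "https://example.com/hooks/orders"   },
  { topic: "products/update", address: "https://example.com/hooks/products" },
]

# Idempotent re-registration: fetch what exists, create only what's missing.
def ensure_webhooks(api)
  existing = api.list.map { |w| w[:topic] }
  REQUIRED_WEBHOOKS.reject { |w| existing.include?(w[:topic]) }
                   .each   { |w| api.create(w) }
end

# In-memory stand-in for the real API client, for illustration only.
class FakeWebhookAPI
  def initialize(existing)
    @hooks = existing
  end

  def list
    @hooks
  end

  def create(hook)
    @hooks << hook
  end
end
```

Running `ensure_webhooks` is then safe to call on every deploy: it does nothing when the subscriptions are intact and repairs them when they're gone.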

Importing the missing data is trickier. The best way to get it back is to build a harness that fetches data from the time period you were down for and feeds it into the webhook processing code one object at a time. The only caveat is that you’ll need the processing code to be sufficiently decoupled from the request handlers that you can call it separately.
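Sketching the harness (fetcher and processor are hypothetical stand-ins for your API-fetching code and your decoupled webhook-processing code):

```ruby
# Replay harness: fetch everything created during the downtime window
# and feed each object through the same code path the webhook handler
# uses. This only works if the processing code is decoupled from the
# HTTP request handlers.
def replay_missed(downtime_start, fetcher, processor)
  fetcher.call(downtime_start).each do |object|
    processor.call(object)   # same method your webhook handler calls
  end
end
```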

Webhooks Sound Magic, Where Can I Learn More?

We have a comprehensive wiki page on webhooks as well as technical documentation on how to manage webhooks in your app.

There’s also a good chunk of helpful threads on our Developer Mailing List.

Continue reading

Defining Churn Rate (no really, this actually requires an entire blog post)

Defining Churn Rate (no really, this actually requires an entire blog post)

If you go to three different analysts looking for a definition of "churn rate," they will all agree that it's an important metric and that the definition is self-evident. Then they will go ahead and give you three different definitions. And as they share their definitions with each other they all have the same response: why is everyone else making this so complicated?

Continue reading

Application Proxies: The New Hotness

Application Proxies: The New Hotness

I’m pleased to announce a brand new feature that we recently added to the Shopify API: Application Proxies. These will allow you to develop all kinds of crazy things that weren’t possible before, and we’re really excited about it. Let me explain.

What’s an App Proxy?

An App Proxy is simply a page within a Shopify shop that loads its content from another location of your choosing. Applications can tell certain shop pages that they should fetch and display data from another location outside of Shopify.

The really cool thing about the implementation we’ve put together is that if you return data with the application/liquid content-type we’ll run it through Shopify’s template rendering engine before pushing it out to the user. This allows you to create dynamic native pages without having to do anything crazy with iframes. I’ll explain this in more detail later.

How Do I Set This Up?

We have a great App Proxy tutorial over on our API docs that takes you through the steps, but I’ll summarize them here too.

The first thing you need to do is set up the path that should be proxied and where it should be proxied to. This is done from your app’s configuration screen on the Partners dashboard.

Once you’ve done that, you’ll need to work out what data you’re going to return when the specified URL is hit. You can return anything you want but for now we’re going to show some very simple stats to get the ball rolling.

All my examples assume that you’re using the shopify_app gem as a starting point, but the topics I cover translate directly to all languages.

Before we do anything else we need a controller to handle the calls. I generated a ProxyController class and mapped /proxy to hit its index method. I also created a template to render the response.
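One way the /proxy mapping might look in config/routes.rb (Rails 3 syntax of the era; treat this as an assumption about the app's setup, not part of the original tutorial):

```ruby
# Hypothetical route: send requests for /proxy to ProxyController#index.
match '/proxy' => 'proxy#index'
```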

Here’s the controller:

class ProxyController < ApplicationController
  def index
  end
end

And here’s the template:

<h1>Hello App Proxy World</h1>

Really easy so far. Now we can start our rails app and visit the proxied page in a browser. It should look something like this:

Not much to see here just yet. In fact, it looks nothing like our shop. Let’s do something about that.

What we want is for Shopify to render the page just as if it were any other native data, using the Liquid engine. We tell Shopify to do this by setting the content-type header on our response to application/liquid. At the same time we’re going to tell Rails not to use its own layouts when rendering the page.

Add this line to the index method in ProxyController

render :layout => false, :content_type => 'application/liquid'

Now save and reload the page. Tada! Here’s what you’ll see:

Good, eh?

Next Steps

Static text is all well and good, but it's not very interesting. What we really want here are some stats. I’ve chosen to display the shop’s takings as well as a link to the most popular product for the last week.

Now that we’re trying to access shop data we need to figure out which shop is sending us the request in the first place. Fortunately the URL of the shop is one of the GET parameters on the request, so we can grab that and use it to configure our environment to make API calls. Details on how to do this are documented here, so go set that up and then come back when you’re done. I’ll wait.

Back? Excellent. Let’s put some info into our response. Here’s what your ProxyController should look like now:

class ProxyController < ApplicationController

  def index
    ShopifyAPI::Base.site = Shop.find_by_name(params[:shop]).api_url

    @orders = ShopifyAPI::Order.find(:all, :params => {:created_at_min => 1.week.ago})
    @total = 0

    @product_sale_counts = Hash.new(0)

    @orders.each do |order|
      order.line_items.each do |line_item|
        @product_sale_counts[line_item.product_id] += line_item.quantity
      end
      @total += order.total_price.to_i
    end

    top_seller_stats = @product_sale_counts.max_by{|k,v| v}
    @product = ShopifyAPI::Product.find(top_seller_stats.first)

    @top_seller_count = top_seller_stats.last

    render :layout => false, :content_type => 'application/liquid'
  end
end

And here’s the template:

<h1>This Week's Earnings</h1>
<p><%= number_to_currency(@total)%> from <%= @orders.count%> orders</p>
<h1>Top Seller: <%= link_to(@product.title, url_for_product(@product)) %></h1>
<p>This product sold <%= @top_seller_count %> units</p>

Here's the finished product. The CSS could use some work, but all our info is there and matches the theme perfectly:

A Word On Security

So far so good, but right now there’s no security on our proxy. Anyone sending a request to that url with a ‘shop’ parameter will get data back. Oops! Let’s fix that.

Just like our webhooks, we sign all our proxy requests. There are details in the API docs on exactly how this is done, but for simplicity’s sake just add this private function to your ProxyController and add it as a before_filter:

def verify_request_source
  url_parameters = {
    "shop" => params[:shop],
    "path_prefix" => params[:path_prefix],
    "timestamp" => params[:timestamp]
  }

  sorted_params = url_parameters.collect{ |k, v| "#{k}=#{Array(v).join(',')}" }.sort.join

  calculated_signature = OpenSSL::HMAC.hexdigest(
    OpenSSL::Digest::Digest.new('sha256'),
    ShopifyAppProxyExample::Application.config.shopify.secret,
    sorted_params
  )

  raise 'Invalid signature' if params[:signature] != calculated_signature
end

Great! Now you can be sure that Shopify is the one sending you this data and not some dirty impostor.

There you have it. Application Proxies are a great way to introduce dynamic third-party content into a native shop page. There's a lot more that you can do with them, far too much to cover in a single blog post.

If you're interested I encourage you to set up a quick app and give them a try. You can also discuss potential ideas with other developers on our dev mailing list.

Continue reading

Three Months of CoffeeScript

Three Months of CoffeeScript

Guest Post by Kamil Tusznio!

Kamil’s a developer at Shopify and has been working in our developer room just off the main “bullpen” that I like to refer to as “The Batcave”. That’s where the team working on the Batman.js framework have been working their magic. Kamil asked if he could post an article on the blog about his experiences with CoffeeScript and I was only too happy to oblige.

CoffeeScript

Since joining the Shopify team in early August, I have been working on Batman.js, a single-page app micro-framework written purely in CoffeeScript. I won't go into too much detail about what CoffeeScript is, because I want to focus on what it allows me to do.

Batman.js has received some flak for its use of CoffeeScript, and more than one tweet has asked why we didn't call the framework Batman.coffee. I feel the criticism is misguided, because CoffeeScript allows you to more quickly write correct code, while still adhering to the many best practices for writing JavaScript.

An Example

A simple example is iteration over an object. The JavaScript would go something like this:

var obj = {
  a: 1, 
  b: 2, 
  c: 3
};

for (var key in obj) {
  if (obj.hasOwnProperty(key)) { // only look at direct properties
    var value = obj[key];
    // do stuff...
  }
}

Meanwhile, the CoffeeScript looks like this:

obj =
  a: 1
  b: 2
  c: 3

for own key, value of obj
  # do stuff...

Notice the absence of var, hasOwnProperty, and needing to assign value. And best of all, no semi-colons! Some argue that this adds a layer of indirection to the code, which it does, but I'm writing less code, resulting in fewer opportunities to make mistakes. To me, that is a big win.

Debugging

Another criticism levelled against CoffeeScript is that debugging becomes harder. You're writing .coffee files that compile down to .js files. Most of the time, you won't bother to look at the .js files. You'll just ship them out, and you won't see them until a bug report comes in, at which point you'll be stumped by the compiled JavaScript running in the browser, because you've never looked at it.

Wait, what? What happened to testing your code? CoffeeScript is no excuse for not testing, and to test, you run the .js files in your browser, which just about forces you to examine the compiled JavaScript.

(Note that it's possible to embed text/coffeescript scripts in modern browsers, but this is not advisable for production environments since the browser is then responsible for compilation, which slows down your page. So ship the .js.)

And how unreadable is that compiled JavaScript? Let's take a look. Here's the compiled version of the CoffeeScript example from above:

var key, obj, value;
var __hasProp = Object.prototype.hasOwnProperty;
obj = {
  a: 1,
  b: 2,
  c: 3
};
for (key in obj) {
  if (!__hasProp.call(obj, key)) continue;
  value = obj[key];
}

Admittedly, this is a simple example. But, after having worked with some pretty complex CoffeeScript, I can honestly say that once you become familiar (which doesn't take long), there aren't any real surprises. Notice also the added optimizations you get for free: local variables are collected under one var statement, and hasOwnProperty is called via the prototype.

For more complex examples of CoffeeScript, look no further than the Batman source.

Workflow

I'm always worried when I come across tools that add a level of indirection to my workflow, but CoffeeScript has not been bad in this respect. The only added step to getting code shipped out is running the coffee command to watch for changes in my .coffee files:

coffee --watch --compile src/ --output lib/

We keep both the .coffee and .js files under git, so nothing gets lost. And since you still have .js files kicking around, any setup you have to minify your JavaScript shouldn't need to change.

TL;DR

After three months of writing CoffeeScript, I can hands-down say that it's a huge productivity booster. It helps you write more elegant and succinct code that is less susceptible to JavaScript gotchas.

Further Reading

[ This article also appears in Global Nerdy. ]

Continue reading

Most Memory Leaks are Good

Most Memory Leaks are Good

TL;DR

Catastrophe! Your app is leaking memory. When it runs in production it crashes and starts raising Errno::ENOMEM exceptions. So you babysit it and restart it consistently so that your app keeps responding.

As hard as you try you don’t see any memory leaks. You use the available tools, but you can’t find the leak. Understanding your full stack, knowing your tools, and good ol’ debugging will help you find that memory leak.

Memory leaks are good?

Yes! Depending on your definition. A memory leak is any memory that is allocated, but never freed. This is the basis of anything global in your programs. 

In a Ruby program global variables are allocated but will never be freed. Same goes with constants, any constant you define will be allocated and never freed. Without these things we couldn’t be very productive Ruby programmers.

But there’s a bad kind

The bad kind of memory leak involves some memory being allocated and never freed, over and over again. For example, if a constant is appended to each time a web request is made to a Rails app, that's a memory leak: the constant will never be freed, and its memory consumption will only grow and grow.
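A contrived sketch of the bad kind (all names hypothetical): a constant grows on every request and its memory can never be reclaimed:

```ruby
# Leaky: this array is referenced by a constant, so nothing in it is
# ever garbage collected, and it gains an entry on every request.
SEEN_REQUESTS = []

def handle_request(path)
  SEEN_REQUESTS << path   # grows forever, one entry per request
  "200 OK"
end
```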

Separating the good and the bad

Unfortunately, there’s no easy way to separate the good memory leaks from the bad ones. The computer can see that you’re allocating memory, but, as always, it doesn’t understand what you’re trying to do, so it doesn’t know which memory leaks are unintentional.

To make matters more muddy, the computer can’t differentiate between a memory leak in Ruby-land and a memory leak in C-land. It’s all just memory.

If you’re using a C extension that’s leaking memory there are tools specific to the C language that can help you find memory leaks (Valgrind). If you have Ruby code that is leaking memory there are tools specific to the Ruby language that can help you (memprof). Unfortunately, if you have a memory leak in your app and have no idea where it’s coming from, selecting a tool can be really tough.

How bad can memory leaks get?

This begins the story of a rampant memory leak we experienced at Shopify at the beginning of this year. Here’s a graph showing the memory usage of one of our app servers during that time.

You can see that memory consumption continues to grow unhindered as time goes on! Those first two spikes which break the 16G mark show that memory consumption climbed above the limit of physical memory on the app server, so we had to rely on the swap. With that large spike the app actually crashed, raising Errno::ENOMEM errors for our users.

After that you can see many smaller spikes. We wrote a script to periodically reboot the app, which releases all of the memory it was using. This was obviously not a sustainable solution. Case in point: the last spike on the graph shows that we had an increase in traffic which resulted in memory usage growing beyond the limits of physical memory again.

So, while all this was going on we were searching high and low to find this memory leak.

Where to begin?

The golden rule is to make the leak reproducible. Like any bug, once you can reproduce it you can surely fix it. For us, that meant a couple of things:

  1. When testing, reproduce your production environment as closely as possible. Run your app in production mode on localhost, set up the same stack that you have on production. Ensure that you are running the same exact versions of the software that is running on production.

  2. Be aware of any issues happening on production. Are there any known issues with the production environment? Losing connections to the database? Firewall routing traffic properly? Be aware of any weird stuff that’s happening and how it may be affecting your problem.

Memprof

Now that we’ve laid out the basics at a high level, we’ll dive into a tool that can help you find memory leaks.

Memprof is a memory profiling tool built by ice799 and tmm1. Memprof does some crazy stuff like rewriting the current Ruby binary at runtime to hot patch features like object allocation tracking. Memprof can do stuff like tell you how many objects are currently alive in the Ruby VM, where they were allocated, what their internal state is, etc.

VM Dump

The first thing that we did when we knew there was a problem was to reach into the toolbox and try out memprof. This was my first experience with the tool. My only exposure to the tool had been a presentation by @tmm1 that detailed some heavy duty profiling by dumping every live object in the Ruby VM in JSON format and using MongoDB to perform analysis.

Without any other leads we decided to try this method. After hitting our staging server with some fake traffic we used memprof to dump the VM to a JSON file. An important note: we did not reproduce the memory leak on our staging server; we just took a look at the dump file anyway.

Our dump of the VM came out at about 450MB of JSON. We loaded it into MongoDB and did some analysis. We were surprised by what we found. There were well over 2 million live objects in the VM, and it was very difficult to tell at a glance which should be there and which should not.

As mentioned earlier there are some objects that you want to ‘leak’, especially true when it comes to Rails. For instance, Rails uses ActiveSupport::Callbacks in many key places, such as ActiveRecord callbacks or ActionController filters. We had tons of Proc objects created by ActiveSupport::Callbacks in our VM, but these were all things that needed to stick around in order for Shopify to function properly.

This was too much information, with not enough context, for us to do anything meaningful with.

Memprof stats

More useful, in terms of context, is having a look at Memprof.stats and the middleware that ships with Memprof. Using these you can get an idea of what is being allocated during the course of a single web request, and ultimately how that changes over time. It’s all about noticing a pattern of live objects growing over time without stopping.

memprof.com

The other useful tool we used was memprof.com. It allows you to upload a JSON VM dump (via the memprof gem) and analyse it using a slick web interface that picks up on patterns in the data and shows relevant reports. It has since been taken offline and open sourced by tmm1 at https://github.com/tmm1/memprof.com.

Unable to reproduce our memory leak on development or staging we decided to run memprof on one of our production app servers. We were only able to put it in rotation for a few minutes because it increased response time by 1000% due to the modifications made by memprof. The memory leak that we were experiencing would typically take a few hours to show itself, so we weren’t sure if a few minutes of data would be enough to notice the pattern we were looking for.

We uploaded the JSON dump to memprof.com and started using the web UI to look for our problem. Different people on the team got involved and, as I mentioned earlier, this data can be confusing. After seeing the huge number of Proc objects from ActiveSupport::Callbacks, some claimed that “ActiveSupport::Callbacks is obviously leaking objects on every request”. Unfortunately it wasn’t that simple, and we weren’t able to find any patterns using memprof.com.

Good ol’ debugging: Hunches & Teamwork

Unable to make progress using these approaches we were back to square one. I began testing locally again and, through spying on Activity Monitor, thought that I noticed a pattern emerging. So I double-checked that I had all the same software stack running that our production environment has, and then the pattern disappeared.

It was odd, but I had a hunch that it had something to do with a bad connection to memcached. I shared my hunch with @wisqnet and he started doing some testing of his own. We left our chat window open as we were testing and shared all of our findings.

This was immensely helpful because we could both trace patterns between each other's results. Eventually we found one: if we consistently hit a certain URL, memory usage climbed and never stopped. We eventually boiled it down to a single line of code:

loop { Rails.cache.write(rand(10**10).to_s, rand(10**10).to_s) }

If we ran that code in a console and then shut down the memcached instance it was using, memory usage immediately spiked.

Now What?

Now that it was reproducible we were able to experiment with fixing it. We tracked the issue down to our memcached client library. We immediately switched libraries and the problem disappeared in production. We let the library author know about the issue and he had it fixed in hours. We switched back to our original library and all was well!

Finally

It turned out that the memory leak was happening in a C extension, so the Ruby tools would not have been able to find the problem.

Three pieces of advice to anyone looking for a memory leak:

  1. Make it reproducible!
  2. Trust your hunches, even if they don’t make sense.
  3. Work with somebody else. Bouncing your theories off of someone else is the most helpful thing you can do.

Continue reading

How Batman can Help you Build Apps

How Batman can Help you Build Apps

Batman.js is Shopify’s new open source CoffeeScript framework, and I’m absolutely elated to introduce it to the world after spending so much time on it. Find Batman on GitHub here.

Batman emerges into a world populated with extraordinary frameworks being used to great effect. With the incredible stuff being pushed out in projects like Sproutcore 2.0 and Backbone.js, how is a developer to know what to use when? There’s only so much time to play with cool new stuff, so I’d like to give a quick tour of what makes Batman different and why you might want to use it instead of the other amazing frameworks available today.

Batman makes building apps easy

Batman is a framework for building single page applications. It’s not a progressive enhancement or a single purpose DOM or AJAX library. It’s built from the ground up to make building awesome single page apps easy by implementing all the lame parts of development like cross browser compatibility, data transport, validation, custom events, and a whole lot more. We provide handy helpers for development to generate and serve code, a recommended app structure for helping you organize code and call it when necessary, a full MVC stack, and a bunch of extras, all while remaining less than 18k when gzipped. Batman doesn’t provide only the basics, or the whole kitchen sink, but a fluid API that allows you to write the important code for your app and none of the boilerplate.

A super duper runtime

At the heart of Batman is a runtime layer used for manipulating data from objects and subscribing to events objects may emit. Batman’s runtime is used similarly to SproutCore’s or Backbone’s in that all property access and assignment on Batman objects must be done through someObject.get and someObject.set, instead of using standard dot notation like you might in vanilla JavaScript. Adhering to this property system allows you to:

  • transparently access “deep” properties which may be simple data or computed by a function,
  • inherit said computed properties from objects in the prototype chain,
  • subscribe to events like change or ready on other objects at “deep” keypaths,
  • and most importantly, dependencies can be tracked between said properties, so chained observers can be fired and computations can be cached while guaranteed to be up-to-date.

All this comes free with every Batman object, and they still play nice with vanilla JavaScript objects. Let’s explore some of the things you can do with the runtime. Properties on objects can be observed using Batman.Object::observe:

crimeReport = new Batman.Object
crimeReport.observe 'address', (newValue) ->
  if DangerTracker.isDangerous(newValue)
    crimeReport.get('currentTeam').warnOfDanger()

This kind of stuff is available in Backbone and SproutCore both, however we’ve tried to bring something we missed in those frameworks to Batman: “deep” keypaths. In Batman, any keypath you supply can traverse a chain of objects by separating the keys by a . (dot). For example:

batWatch = Batman
  currentCrimeReport: Batman
    address: Batman
      number: "123"
      street: "Easy St"
      city: "Gotham"

batWatch.get 'currentCrimeReport.address.number' #=> "123"
batWatch.set 'currentCrimeReport.address.number', "461A"
batWatch.get 'currentCrimeReport.address.number' #=> "461A"

This works for observation too:

batWatch.observe 'currentCrimeReport.address.street', (newStreet, oldStreet) ->
  if DistanceCalculator.travelTime(newStreet, oldStreet) > 100000
    BatMobile.bringTo(batWatch.get('currentLocation'))

The craziest part of the whole thing is that these observers will always fire with the value of whatever is at that keypath, even if intermediate parts of the keypath change.

crimeReportA = Batman
  address: Batman
    number: "123"
    street: "Easy St"
    city: "Gotham"

crimeReportB = Batman
  address: Batman
    number: "72"
    street: "Jolly Ln"
    city: "Gotham"

batWatch = new Batman.Object({currentCrimeReport: crimeReportA})

batWatch.get('currentCrimeReport.address.street') #=> "Easy St"
batWatch.observe 'currentCrimeReport.address.street', (newStreet) ->
  MuggingWatcher.checkStreet(newStreet)

batWatch.set('currentCrimeReport', crimeReportB)
# the "MuggingWatcher" callback above will have been called with "Jolly Ln"

Notice what happened? Even though the middle segment of the keypath changed (a whole new crimeReport object was introduced), the observer fires with the new deep value. This works with arbitrary length keypaths as well as intermingled undefined values.

The second neat part of the runtime is that because all access is done through get and set, we can track dependencies between object properties which need to be computed. Batman calls these functions accessors, and using the CoffeeScript executable class bodies they are really easy to define:

class BatWatch extends Batman.Object
  # Define an accessor for the `currentDestination` key on instances of the BatWatch class.
  @accessor 'currentDestination', ->
    address = @get 'currentCrimeReport.address'
    return "#{address.get('number')} #{address.get('street')}, #{address.get('city')}"

crimeReport = Batman
  address: Batman
    number: "123"
    street "Easy St"
    city: "Gotham"

watch = new BatWatch(currentCrimeReport: crimeReport)

watch.get('currentDestination') #=> "123 Easy St, Gotham"

Importantly, the observers you may attach to these computed properties will fire as soon as you update their dependencies:

watch.observe 'currentDestination', (newDestination) -> console.log newDestination
crimeReport.set('address.number', "124")
# "124 Easy St, Gotham" will have been logged to the console

You can also define the default accessors which the runtime will fall back on if an object doesn’t already have an accessor defined for the key being getted or setted.

jokerSimulator = new Batman.Object
jokerSimulator.accessor (key) -> "#{key.toUpperCase()}, HA HA HA!"

jokerSimulator.get("why so serious") #=> "WHY SO SERIOUS, HA HA HA!"

This feature is useful when you want to present a standard interface to an object, but work with the data in nontrivial ways underneath. For example, Batman.Hash uses this to present an API similar to a standard JavaScript object, while emitting events and allowing objects to be used as keys.

What’s it useful for?

The core of Batman as explained above makes it possible to know when data changes as soon as it happens. This is ideal for something like client side views. They’re no longer static bundles of HTML that get cobbled together as a long string and sent to the client, they are long lived representations of data which need to change as the data does. Batman comes bundled with a view system which leverages the abilities of the property system.

A simplified version of the view for Alfred, Batman’s todo manager example application, lies below:

<h1>Alfred</h1>

<ul id="items">
    <li data-foreach-todo="Todo.all" data-mixin="animation">
        <input type="checkbox" data-bind="todo.isDone" data-event-change="todo.save" />
        <label data-bind="todo.body" data-addclass-done="todo.isDone" data-mixin="editable"></label>
        <a data-event-click="todo.destroy">delete</a>
    </li>
    <li><span data-bind="Todo.all.length"></span> <span data-bind="'item' | pluralize Todo.all.length"></span></li>
</ul>
<form data-formfor-todo="controllers.todos.emptyTodo" data-event-submit="controllers.todos.create">
  <input class="new-item" placeholder="add a todo item" data-bind="todo.body" />
</form>

We sacrifice any sort of transpiler layer (no HAML), and any sort of template layer (no Eco, jade, or mustache). Our views are valid HTML5, rendered by the browser as soon as they have been downloaded. They aren’t JavaScript strings; they are valid DOM trees which Batman traverses and populates with data without any compilation or string manipulation involved. The best part is that Batman “binds” a node’s value by observing the value using the runtime as presented above. When the value changes in JavaScript land, the corresponding node attribute(s) bound to it update automatically, and the user sees the change. Vice versa remains true: when a user types into an input or checks a checkbox, the string or boolean is set on the bound object in JavaScript. The concept of bindings isn’t new, as you may have seen it in things like Cocoa, or in Knockout or SproutCore in JS land.

We chose to use bindings because we a) don’t want to have to manually check for changes to our data, and b) don’t want to have to re-render a whole template every time one piece of data changes. With mustache or jQuery.tmpl and company, I end up doing both those things surprisingly often. It seems wasteful to re-render every element in a loop and pay the penalty for appending all those nodes, when only one key on one element changes, and we could just update that one node. SproutCore’s ‘SC.TemplateView’ with Yehuda Katz' Handlebars.js do a good job of mitigating this, but we still didn’t want to do all the string ops in the browser, and so we opted for the surgical precision of binding all the data in the view to exactly the properties we want.

What you end up with is a fast render with no initial loading screen, at the expense of the usual level of complex logic in your views. Batman’s view engine provides conditional branching, looping, context, and simple transforms, but that’s about it. It forces you to write any complex interaction code in a packaged and reusable Batman.View subclass, and leave the HTML rendering to the thing that does it best: the browser.

More?

Batman does more than this fancy deep keypath stuff and these weird HTML views-but-not-templates. We have a routing system for linking from quasi-page to quasi-page, complete with named segments and GET variables. We have a Batman.Model layer for retrieving and sending data to and from a server which works out of the box with storage backends like Rails and localStorage. We have other handy mixins for use in your own objects like Batman.StateMachine and Batman.EventEmitter. And, we have a lot more on the way. I strongly encourage you to check out the project website, the source on GitHub, or visit us in #batmanjs on freenode. Any questions, feedback, or patches will be super welcome, and we’re always open to suggestions on how we can make Batman better for you.

Until next time….

Continue reading

Making Apps using Python, Django and App Engine

Making Apps using Python, Django and App Engine

We recently announced the release of our Python adaptor for the Shopify API. Now we would like to inform you that we have got it working well with the popular Django web framework and Google App Engine hosting service. But don't just take my word for it, you can see a live example on App Engine and check out the example's source code on GitHub. The example application isn't limited to Google App Engine, it can run as a regular Django application allowing you to explore other hosting options.

The shopify_app directory in the example contains the reusable Django app code. This directory contains views that handle user login and authentication, and that save the Shopify session upon finalization. Middleware is included which loads the session to automatically re-initialize the Python Shopify API for each request. There is also a @shop_login_required decorator for view functions that require login, which will redirect logged-out users to the login page. As a result, your view function can be as simple as the following to display basic information about the shop's products and orders.

@shop_login_required
def index(request):
    products = shopify.Product.find(limit=3)
    orders = shopify.Order.find(limit=3, order="created_at DESC")
    return render_to_response('home/index.html', {
        'products': products,
        'orders': orders,
    }, context_instance=RequestContext(request))
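To illustrate the idea behind that decorator (this is not the actual shopify_app code, which is Django-specific; the session key and "/login" path below are hypothetical, and a redirect is represented as a plain tuple to keep the sketch framework-agnostic):

```python
# Hypothetical sketch of a login-required decorator. The real
# shopify_app decorator issues a Django redirect; here a redirect is
# just a tuple so the sketch runs without Django installed.
from functools import wraps

def shop_login_required(view_func):
    @wraps(view_func)
    def wrapper(request, *args, **kwargs):
        if not request.session.get("shopify"):
            return ("redirect", "/login")  # no Shopify session: go log in
        return view_func(request, *args, **kwargs)
    return wrapper

class FakeRequest:
    def __init__(self, session):
        self.session = session

@shop_login_required
def index(request):
    return ("ok", "home")

index(FakeRequest({}))                           # ("redirect", "/login")
index(FakeRequest({"shopify": {"token": "t"}}))  # ("ok", "home")
```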

Getting Started for Regular Django App

  1. Install the dependencies with this command:
    easy_install Django ShopifyAPI PyYAML pyactiveresource
  2. Download and unzip the zip file for the example application

Getting Started for Google App Engine

  1. Install the App Engine SDK
  2. Download and unzip the example application zip file for App Engine which includes all the dependencies.
  3. Create an application with Google App Engine, and modify the application line in app.yaml with the application ID registered with Google App Engine.

Develop

    1. Create a Shopify app through the Shopify Partner account with the Return URL set to http://localhost:8000/login/finalize, and modify shopify_settings.py with the API-Key and Shared Secret for the app.
    2. Start the server:
      python manage.py runserver
    3. Visit http://localhost:8000 to view the example.
    4. Modify the code in the home directory.

    Deploy

    1. Update the return URL in your Shopify partner account to point to your domain name (e.g. https://APPLICATION-ID.appspot.com/login/finalize)
    2. Upload the application to the server. For Google App Engine, simply run:
      appcfg.py update .

    Further Information

    Update: Extensive examples on using the Shopify Python API have been added to the wiki. 

    Continue reading

    Webhook Testing Made Easy

    Webhook Testing Made Easy

    Webhooks are fantastic. We use them here at Shopify to notify API users of all sorts of important events. Order creation, product modification, and even app uninstallation all cause webhooks to be fired. They're a really neat way to avoid the problem of polling, which is annoying for app developers and API providers alike.

    The trouble with Webhooks is that you need a publicly visible URL to handle them. Unlike client-side redirects, webhooks originate directly from the server. This means that you can't use localhost as an endpoint in your testing environment as the API server would effectively be calling itself. Bummer.
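To be clear about what we're testing: a webhook endpoint is nothing more than an HTTP handler that accepts POST bodies. A minimal local receiver can be sketched with the Python standard library (illustrative only; the host and port are local placeholders, and a real webhook sender can't reach 127.0.0.1, which is exactly the problem the tools below solve):

```python
# Minimal webhook receiver sketch (Python 3 stdlib). It records any
# JSON POST body it receives and replies 200.
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

received = []  # payloads collected from incoming webhooks

class WebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        received.append(json.loads(self.rfile.read(length)))
        self.send_response(200)
        self.end_headers()

    def log_message(self, *args):
        pass  # keep the console quiet

def make_server():
    # Port 0 asks the OS for any free port
    return HTTPServer(("127.0.0.1", 0), WebhookHandler)
```

Calling server.handle_request() processes one incoming POST, after which whatever the sender delivered sits in received for inspection.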

    Fortunately, there are a couple of tools that make working with webhooks during development much easier. Let me introduce you to PostCatcher and LocalTunnel.

    PostCatcher

    PostCatcher is a brand new webapp that was created as an entry for last week's Node Knockout. Shopify's own Steven Soroka and Nick Small are on the judging panel this year, and this app caught their eye.

    The app generates a unique URL that you can use as a webhook endpoint and displays any POST requests sent to it for you to examine. As you might expect from a JS contest, the UI is extremely slick and renders all the requests in real-time as they come in. This is really useful in the early stages of developing an app as you can see the general shape and structure of any webhooks you need without writing a single line of code. On the flip side, API developers can use it to test their own service in a real-world environment.

     

    The thing I really like about PostCatcher over similar apps like PostBin is that I can sign in using GitHub and keep track of all the catchers I've created. No more copy/pasting URLs to a text file to avoid losing them. Hooray!

    LocalTunnel

    LocalTunnel is a Ruby gem + webapp sponsored by Twilio that allows you to expose a given port on your local machine to the world through a url on their site. Setup is really easy (provided you have ruby and rubygems installed) and once it's installed you just start it from the console with the port you want to forward and share the url it spits out.

     

    From then on that url will point to your local machine so you can register the address as a webhook endpoint and get any incoming requests piped right to your machine. My previous solution was endless deployments to Heroku every time I made a small change to my code, which was a real pain in the arse. Compared to that, LocalTunnel was a godsend.

    Alternatives

    Whilst PostCatcher and LocalTunnel are currently my top choices for testing webhooks, they're by no means the only party in town. I've already mentioned PostBin, but LocalTunnel also has a contender in LocalNode (another Node KO entry). The latter boasts wider integration (you don't need ruby) as well as permanent url redirects but setup is more complicated as you have to add a static html file to your web server.

    If there are other services, apps, or tricks that you use to test webhooks when developing apps, call them out in the comments! I'd love to hear what I've missed in this space.

    Continue reading

    How we use git at Shopify

    How we use git at Shopify

    By John Duff
    A little while back, Rodrigo Flores posted to the plataformatec blog, A (successful) git branching model, where he talks about the git workflow they've been using on some projects. I thought this was a great post and decided to do something similar explaining the git workflow that we use at Shopify.

    Preface

    Git is an incredibly powerful tool that can be used in many different ways. I don't believe there is a 'correct' workflow for using git, just many different options that work for particular situations and people. The workflow that I am going to describe won't work for everyone, and not everyone at Shopify uses git in the same way - you have to modify and massage it to shape your needs and the way you work. I don't consider myself an expert with git, but am comfortable enough with the tool to handle just about everything I might need to do. If there's anything I can't figure out, James MacAulay is our resident git expert in the office and always willing to help out.
    Okay, let's get down to business.

    Setup

    When working on a project each developer and designer first forks the repository that they want to work on. Forking a repository is really simple; GitHub even has a guide if you need some help with it. A fork is basically your own copy of the repository that you can change without affecting anyone else. We use GitHub for all of our projects so it makes managing the forks really easy. All the work is done on your fork of the repository and only gets pulled into the main repository after it has been fully tested, code reviewed, etc. We also use the concept of feature branches to make it easy to switch between tasks and to share the work with other colleagues. A branch is kind of like a fork within your own repository; you can have many branches within your forked repository for each of the tasks you're working on. Your checkout of a project should be set up with a couple of remotes and branches to get started.

    Remotes:

    • origin - This is a remote pointing to your clone of the project and added by default when you do 'git clone'.
    • mainline - This is a remote pointing to the main repository for the project. We use this remote to keep up to date and push to the main repository.

    Branches:

    • production - This is the production branch of the main repository (or mainline). This is the code that is ready to be deployed to production.
    • staging - Contains the code that is being run on the staging server (we have two that developers can use). Before a feature is considered 'finished' it must be tested on one of the staging servers, which mirrors the production environment.
    • master - Contains completed features that can be deployed.
    So how do we set all this up? These couple of git commands should take care of it:
    git clone git@github.com:jduff/project.git
    git remote add mainline git@github.com:Shopify/project.git
    
    Keeping a project up to date is also really easy, you just pull from mainline.
    git checkout master
    git pull --rebase mainline master
    
    I know what you're thinking, what the heck is that 'rebase' doing in there? Well, you don't really need it, but it helps to use it in case you've merged a new feature that you haven't pushed yet. This keeps the history all tidy with the changes you made on top of what is already in master instead of creating an additional "merge" commit when there's a conflict.

    Day To Day Usage

    So how does all of this work day to day? Here it is, step by step:
    git checkout master
    git checkout -b add_awesome # Feature branches, remember
    # Do some work, listen to a lightning talk, more work
    git commit -m "Creating an awesome feature"
    
    Mainline master moves pretty fast so we should keep our feature branch up to date
    git checkout master
    git pull --rebase mainline master
    
    git checkout add_awesome
    git rebase master
    
    Everything is finished, test it out on staging!
    git push -f mainline add_awesome:staging
    # This blows away what is currently being staged, make sure staging isn't already in use!
    
    Staging is cool...code review...high fives all around, ship it!
    It's always easier to release a feature if master is up to date and you've rebased your branch. See above for how to 'keep our feature branch up to date'. We also make sure to squash all the commits down as much as possible before merging the feature into master. You can do this with the rebase command:
    # Rebase the last 5 commits
    git rebase -i HEAD~5
    
    Now we can merge the feature into the master branch:
    git checkout master
    git merge add_awesome
    git push mainline master
    
    And if you want your code to go out to production right away you have a couple more steps:
    git checkout master
    git pull mainline master # Make sure you're up to date with everything
    
    git checkout production
    git merge master
    git push mainline production
    # Ask Ops for a deploy
    
    That's about it. It might seem like a lot to get the hang of at the start but it really works well and keeps the main repository clear of merge commits so it's easy to read and revert if required. I personally really like the idea of feature branches and rebasing as often as possible; it makes it super easy to switch tasks and keeps merge conflicts to a minimum. I almost never have conflicts because I rebase a couple of times a day.

    A Few More git Tips

    I've got a couple more tips that might help you out in your day to day git usage.
    # Undo the last commit, keeping its changes in the working tree
    git reset HEAD~1
    git reset HEAD^ # Same as above
    
    # Remove the last commit from history (don't do this if the commit has been pushed to a remote)
    git reset --hard HEAD~1
    
    # Interactive rebase is awesome!
    git rebase -i HEAD~4
    git rebase -i HEAD^^^^ # Same as above
    
    # Change the last commit message, or add staged files to the last commit
    git commit --amend
    
    # Reverses the commit 1b9b50a if it introduced a bug
    git revert 1b9b50a
    
    # Track down a bug, HEAD is bad but 5 commits back it was good
    git bisect start HEAD HEAD~5
    

    Conclusion

    So there you have it, that's how we use git at Shopify. I don't know about everyone else, but once I got going I found this workflow (particularly the feature branches) to work very well. That doesn't mean this is the only way to use git, like I said earlier it is an incredibly powerful tool and you have to find a way that works well for you and your team. I do hope that this might serve as a starting point for your own git workflow and maybe provide a little insight into how we work here at Shopify.
    Our tools and the way we use them are constantly evolving so I would love to hear about how you use git to see if we might be able to improve our own workflow. Let us know in the comments or better yet, write your own blog post and drop us the link!
    Photo by Paul Hart

    Continue reading

    Developing Shopify Apps, Part 4: Change is Good

    Developing Shopify Apps, Part 4: Change is Good

     So far, in the Developing Shopify Apps series, we've covered:

    • The setup: joining Shopify's Partner Program, creating a new test shop, launching it, adding a private app to it and playing with a couple of quick API calls.
    • Exploring the API: a quick explanation of the API and RESTafarianism, retrieving general information about a shop and dipping a toe into finding out about things like your shop's products, and so on.
    • Even more exploration: REST consoles, getting a complete list of all the products, articles, blogs, customers and so on, retrieving specific items given their ID and creating new items.


    Now these are modifications!

    In this article, we're going to look at another important type of operation: modifying existing items.

    Modifying Customers

    To modify an object, we're going to need an existing one first. I'm going to start with "Peter Griffin", a customer that I created in the previous article in this series. His ID is 51827492, so we can retrieve his record thusly:

    • GET api-key:password@shop-url/admin/customers/51827492.xml for the XML version
    • GET api-key:password@shop-url/admin/customers/51827492.json for the JSON version

    Here's the response in XML:

    <?xml version="1.0" encoding="UTF-8"?>
    <customer>
        <accepts-marketing type="boolean" nil="true" />
        <orders-count type="integer">0</orders-count>
        <id type="integer">51827492</id>
        <note nil="true" />
        <last-name>Griffin</last-name>
        <total-spent type="decimal">0.0</total-spent>
        <first-name>Peter</first-name>
        <email>peter.lowenbrau.griffin@giggity.com</email>
        <tags />
        <addresses type="array">
            <address>
                <city>Quahog</city>
                <company nil="true" />
                <address1>31 Spooner Street</address1>
                <zip>02134</zip>
                <address2 nil="true" />
                <country>United States</country>
                <phone>555-555-1212</phone>
                <last-name>Griffin</last-name>
                <province>Rhode Island</province>
                <first-name>Peter</first-name>
                <name>Peter Griffin</name>
                <province-code>RI</province-code>
                <country-code>US</country-code>
            </address>
        </addresses>
    </customer>

    Let's suppose that Peter has decided to move to California. We'll need to update his address, and to do it programmatically, we'll need the following:

    • His customer ID (we've got that).
    • The new information. For this example, it's
      • address1: 800 Schwarzenegger Lane
      • city: Los Angeles
      • state: California
      • zip: 90210
      • phone: 555-888-9898
    • And finally, the method for calling the Shopify API to modify existing items.

    First, there's the format of the URL for modifying Peter's entry. The URL will specify what operation we want to perform (modify) and on which item (a customer whose ID is 51827492).

    • PUT api-key:password@shop-url/admin/customers/51827492.xml for the XML version
    • PUT api-key:password@shop-url/admin/customers/51827492.json for the JSON version

    For this example, we'll use the XML version. If you're using Chrome's REST console, put the XML URL into the Request field (located in the Target section), as shown below:

    Then there's the message body, which will specify which fields we want to update. Here's the message body to update Peter's address to the new Los Angeles-based one shown above, in XML form:

    <?xml version="1.0" encoding="UTF-8"?>
    <customer>
      <addresses type="array">
        <address>
          <address1>800 Schwarzenegger Lane</address1>
          <city>Los Angeles</city>
          <province>CA</province>
          <country>US</country>
          <zip>90210</zip>
          <phone>555-888-9898</phone>
        </address>
      </addresses>
    </customer>

    If you're using Chrome's REST console, put the message body in the RAW Body field (located in the Body section) and make sure Content-Type is set to application/xml:


    Send the request. If you're using Chrome's REST Console, the simplest way to do this is to press the PUT button located at the bottom of the page. You should get a "200 OK" response and the following response body:

    <?xml version="1.0" encoding="UTF-8"?>
    <customer>
      <accepts-marketing type="boolean" nil="true" />
      <orders-count type="integer">0</orders-count>
      <id type="integer">51827492</id>
      <note nil="true" />
      <last-name>Griffin</last-name>
      <total-spent type="decimal">0.0</total-spent>
      <first-name>Peter</first-name>
      <email>peter.lowenbrau.griffin@giggity.com</email>
      <tags />
      <addresses type="array">
        <address>
          <city>Los Angeles</city>
          <company nil="true" />
          <address1>800 Schwarzenegger Lane</address1>
          <zip>90210</zip>
          <address2 nil="true" />
          <country>United States</country>
          <phone>555-888-9898</phone>
          <last-name>Griffin</last-name>
          <province>California</province>
          <first-name>Peter</first-name>
          <name>Peter Griffin</name>
          <province-code>CA</province-code>
          <country-code>US</country-code>
        </address>
      </addresses>
    </customer>

    As you can see, Peter's address has been updated.

    Modifying Products

    Let's try modifying an existing product in our store. Once again, we'll modify an item we created in the previous article: the Stumpy Pepys Toy Drum.

    When we created it, we never specified any tags. We now want to add some tags to this product -- "Spinal Tap" and "rock" -- to make it easier to find. In order to do this, we need:

    • The product ID. It's 48339792.
    • The tags, "Spinal Tap" and "rock".
    • And finally, the method for calling the Shopify API to modify existing items.
    Here's the URL format:
      • PUT api-key:password@shop-url/admin/products/48339792.xml for the XML version
      • PUT api-key:password@shop-url/admin/products/48339792.json for the JSON version

    For this example, we'll use the JSON version. If you're using Chrome's REST console, put the JSON URL into the Request field (located in the Target section), as shown below:

    Then there's the message body, which will specify which fields we want to update. Here's the message body to add the tags to the drum's entry, in JSON form:

    {
      "product": {
        "tags": "Spinal Tap, rock",
        "id": 48339792
      }
    }

    If you're using Chrome's REST console, put the message body in the RAW Body field (located in the Body section) and make sure Content-Type is set to application/json:

    Send the request. If you're using Chrome's REST Console, the simplest way to do this is to press the PUT button located at the bottom of the page. You should get a "200 OK" response and the following response body:

    {
      "product": {
        "body_html": "This drum is so good...\u003Cstrong\u003Eyou can't beat it!!\u003C/strong\u003E",
        "created_at": "2011-08-03T18:20:17-04:00",
        "handle": "stumpy-pepys-toy-drum-sp-1",
        "product_type": "drum",
        "template_suffix": null,
        "title": "Stumpy Pepys Toy Drum SP-1",
        "updated_at": "2011-08-08T17:57:55-04:00",
        "id": 48339792,
        "tags": "rock, Spinal Tap",
        "images": [],
        "variants": [{
          "price": "0.00",
          "position": 1,
          "created_at": "2011-08-03T18:20:17-04:00",
          "title": "Default",
          "requires_shipping": true,
          "updated_at": "2011-08-03T18:20:17-04:00",
          "inventory_policy": "deny",
          "compare_at_price": null,
          "inventory_quantity": 1,
          "inventory_management": null,
          "taxable": true,
          "id": 113348882,
          "grams": 0,
          "sku": "",
          "option1": "Default",
          "option2": null,
          "fulfillment_service": "manual",
          "option3": null
        }],
        "published_at": "2011-08-03T18:20:17-04:00",
        "vendor": "Spinal Tap",
        "options": [{
            "name": "Title"
        }]
      }
    }

    Modifying Things with the Shopify API: The General Formula

    As you've seen, whether you prefer to talk to the API with XML or JSON, modifying things requires:

    1. An HTTP PUT request to the right URL, which includes the ID of the item you want to modify
    2. The information that you want to add or update, which you format and put into the request body

    ...and that's it!
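In Python, for instance, the formula could be sketched with the standard library's urllib. Nothing is sent here; the shop URL and credentials are placeholders, an https:// scheme is assumed, and the naive singularization of the resource name is for illustration only:

```python
# Sketch of the modify-an-item formula: a PUT request to the item's URL
# with the fields to change in the body. The request is only built,
# never sent; all credentials below are placeholders.
import base64
import json
import urllib.request

def build_modify_request(shop_url, resource, item_id, fields,
                         api_key="api-key", password="password"):
    url = f"https://{shop_url}/admin/{resource}/{item_id}.json"
    # Naive singularization: "products" -> "product" (illustration only)
    body = json.dumps({resource[:-1]: dict(fields, id=item_id)})
    req = urllib.request.Request(url, data=body.encode(), method="PUT")
    req.add_header("Content-Type", "application/json")
    # Private apps of this era authenticated with HTTP Basic auth
    token = base64.b64encode(f"{api_key}:{password}".encode()).decode()
    req.add_header("Authorization", f"Basic {token}")
    return req

req = build_modify_request("shop-url", "products", 48339792,
                           {"tags": "Spinal Tap, rock"})
# req.get_method() -> "PUT"
```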

    Next...

    We've seen getting, adding, and modifying, which leaves...deleting.

    Continue reading

    Developing Shopify Apps, Part 3: More API Exploration

    Developing Shopify Apps, Part 3: More API Exploration

    Welcome back to another installment of Developing Shopify Apps!

    In case you missed the previous articles in this series, they are:

    • Part 1: The Setup. In this article, we:
      • Joined Shopify's Partner Program
      • Created a new test shop
      • Launched a new test shop
      • Added an app to the test shop
      • Played with a couple of quick API calls through the browser
    • Part 2: Exploring the API. This article covered:
      • Shopify's RESTful API, including a quick explanation of how to use it
      • Retrieving general information about a shop via the admin panel and the API
      • Retrieving information from a shop, such as products, via the API

    Exploring RESTful APIs with a REST Console

    So far, all we've done is retrieve information from a shop. We did this by using the GET verb and applying it to resources exposed by the Shopify API, such as products, blogs, articles and so on. Of all the HTTP verbs, GET is the simplest to use; you can simply request information by using your browser's address bar. Working with the other three HTTP verbs -- POST, PUT and DELETE -- usually takes a little more work.

    One very easy-to-use way to make calls to the Shopify API using all four verbs is a REST client. You have many options, including:

    • cURL: the web developer's Swiss Army knife. This command line utility gets and sends files using URL syntax using a wide array of protocols including HTTP and friends, FTP and similar, LDAPS, TELNET and mail formats including IMAP, POP3 and SMTP. 
    • Desktop REST clients such as Fiddler for Windows or WizTools' RESTClient
    • Browser-based REST clients such as RESTClient for Firefox or REST Console for Chrome

    Lately, I've been using REST Console for Chrome. It's quite handy -- when installed, it's instantly available with one click on its icon, just to the left of Chrome's "Wrench" menu (which is to the right of the address bar):

    And here's what the REST Console looks like -- clean and simple:

    Let's try a simple GET operation: let's get the list of products in the shop. The format for the URL is:

     

    • GET api-key:password@shop-url/admin/products.xml (for the XML version)
    • GET api-key:password@shop-url/admin/products.json (for the JSON version)
    where api-key is your app's API key and password is your app's password.

    The URL goes into the Request URL field in the Target section. A tap of the GET button at the bottom of the page yields a response, which appears, quite unsurprisingly, in the Response section of the page:

    Of course, you could've done this with the address bar. But it's much nicer with the REST Console. Before we start exploring calls that require POST, PUT and DELETE, let's take a look at other things we can do with the GET verb.

    Get All the Items!

    If you've been following this series of articles, you've probably had a chance to try a couple of GET calls to various resources exposed by the API. Once again, here's the format for the URL that gets you a listing of all the products available in the shop:

    Get All the Products

     

    • GET api-key:password@shop-url/admin/products.xml (for the XML version)
    • GET api-key:password@shop-url/admin/products.json (for the JSON version)

    Get All the Articles

    If you go to the API documentation and look at the column on the right side of the page, you'll see a list of resources that the Shopify API makes available to you. One of these resources is Article, which gives you access to all the articles in the blogs belonging to the shop (each shop supports one or more blogs; they're a way for shopowners to write about what they're selling or related topics).

    Here's how you get all the articles:

    • GET api-key:password@shop-url/admin/articles.xml (for the XML version)
    • GET api-key:password@shop-url/admin/articles.json (for the JSON version)

    Get All the Blogs

    Just as you can get all the articles, you can get all the blogs that contain them. Here's how you do it:

    • GET api-key:password@shop-url/admin/blogs.xml (for the XML version)
    • GET api-key:password@shop-url/admin/blogs.json (for the JSON version)

    Get All the Customers

    How about a list of all the shop's registered customers? No problem:

    • GET api-key:password@shop-url/admin/customers.xml (for the XML version)
    • GET api-key:password@shop-url/admin/customers.json (for the JSON version)

    Get All the [WHATEVER]

    By now, you've probably seen the pattern. For any resource exposed by the Shopify API, the way to get a complete listing of all items in that resource is this:

    • GET api-key:password@shop-url/admin/plural-resource-name.xml (for the XML version)
    • GET api-key:password@shop-url/admin/plural-resource-name.json (for the JSON version)
    where:
    • api-key is the app's API key
    • password is the app's password
    • plural-resource-name is the plural version of the name of the resource whose items you want: articles, blogs, customers, products, and so on.

    Get a Specific Item, Given its ID

    There will come a time when you want to get the information about just one specific item and not all of them. If you know an item's ID, you can retrieve the info for just that item using this format URL:

    • GET api-key:password@shop-url/admin/plural-resource-name/id.xml (for the XML version)
    • GET api-key:password@shop-url/admin/plural-resource-name/id.json (for the JSON version)
    To get an article with the ID 3671982, we use this URL:
    • GET api-key:password@shop-url/admin/articles/3671982.xml (for the XML version)
    • GET api-key:password@shop-url/admin/articles/3671982.json (for the JSON version)
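The pattern is mechanical enough to capture in a tiny helper. This sketch bakes the placeholder credentials into the URL, as in the examples above, and assumes an https:// scheme:

```python
# URL builder for the patterns above: the whole collection when no ID
# is given, a single item otherwise. "api-key"/"password"/"shop-url"
# are placeholders for your app's actual values.
def resource_url(shop_url, resource, item_id=None, fmt="json",
                 api_key="api-key", password="password"):
    path = (f"/admin/{resource}.{fmt}" if item_id is None
            else f"/admin/{resource}/{item_id}.{fmt}")
    return f"https://{api_key}:{password}@{shop_url}{path}"

resource_url("shop-url", "products")
# -> "https://api-key:password@shop-url/admin/products.json"
resource_url("shop-url", "articles", 3671982, fmt="xml")
# -> "https://api-key:password@shop-url/admin/articles/3671982.xml"
```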

    If There is Such an Item


    If an article with that ID exists, you get a "200" response header ("OK"):

    Status Code: 200
    Date: Wed, 03 Aug 2011 15:49:44 GMT
    Content-Encoding: gzip
    P3P: CP="NOI DSP COR NID ADMa OPTa OUR NOR"
    Status: 304 Not Modified
    X-UA-Compatible: IE=Edge,chrome=1
    X-Runtime: 0.114750
    Server: nginx/0.8.53
    ETag: "fb7cdcc613b1a45698c6cfad05fc7f7e"
    Vary: Accept-Encoding
    Content-Type: application/xml; charset=utf-8
    Cache-Control: max-age=0, private, must-revalidate

    ...and a response body that should look something like this (if you requested the response in XML):

    <?xml version="1.0" encoding="UTF-8"?>
    <article>
      <body-html><p>This is your blog. You can use it to write about new product launches, experiences, tips or other news you want your customers to read about.</p> <p>We automatically create an <a href="http://en.wikipedia.org/wiki/Atom_feed">Atom Feed</a> for all your blog posts. <br /> This allows your customers to subscribe to new articles using one of many feed readers (e.g. Google Reader, News Gator, Bloglines).</p></body-html>
      <created-at type="datetime">2011-07-22T14:43:22-04:00</created-at>
      <author>Shopify</author>
      <title>First Post</title>
      <updated-at type="datetime">2011-07-22T14:43:25-04:00</updated-at>
      <blog-id type="integer">1127212</blog-id>
      <summary-html nil="true" />
      <id type="integer">3671982</id>
      <user-id type="integer" nil="true" />
      <published-at type="datetime">2011-07-22T14:43:22-04:00</published-at>
      <tags>ratione, repellat, vero</tags>
    </article>
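    If you're working in Ruby, the standard library's REXML can pull individual fields out of a response like that. Here's a minimal sketch using an abbreviated copy of the article XML above:

```ruby
require 'rexml/document'

# Abbreviated copy of the article response shown above.
xml = <<~XML
  <?xml version="1.0" encoding="UTF-8"?>
  <article>
    <author>Shopify</author>
    <title>First Post</title>
    <id type="integer">3671982</id>
  </article>
XML

doc   = REXML::Document.new(xml)
title = doc.elements['article/title'].text
id    = doc.elements['article/id'].text.to_i

puts "Article #{id}: #{title}"  # => Article 3671982: First Post
```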

    If No Such Item Exists


    If no article with that ID exists, you get a "404" response header ("Not Found"). Here's what happened when I tried to retrieve an article with the ID 42. I used this URL:

    • GET api-key:password@shop-url/admin/articles/42.xml (for the XML version)
    • GET api-key:password@shop-url/admin/articles/42.json (for the JSON version)

    I got this header back:

    Status Code: 404
    Date: Wed, 03 Aug 2011 16:00:25 GMT
    Content-Encoding: gzip
    Transfer-Encoding: chunked
    Status: 404 Not Found
    Connection: keep-alive
    X-UA-Compatible: IE=Edge,chrome=1
    X-Runtime: 0.039715
    Server: nginx/0.8.53
    Vary: Accept-Encoding
    Content-Type: application/xml; charset=utf-8
    Cache-Control: no-cache

    ...and since there was nothing to return, the response body was empty.
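    In application code, that means you should branch on the status code rather than assume every request succeeds. A tiny illustrative helper (not part of any Shopify library) shows the idea:

```ruby
# Illustrative dispatcher: branch on the HTTP status code instead of
# assuming every request succeeds.
def classify_response(code)
  case code
  when 200 then :found      # the item exists; the body holds its record
  when 404 then :not_found  # no item with that ID; the body is empty
  else          :error      # anything else deserves a closer look
  end
end

classify_response(200)  # => :found
classify_response(404)  # => :not_found
```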

    Get [WHATEVER], Given its ID

    The same principle applies to any other Shopify API resource.

    Want the info on a customer whose ID is 50548602? The URL would look like this:

    • GET api-key:password@shop-url/admin/customers/50548602.xml (for the XML version)
    • GET api-key:password@shop-url/admin/customers/50548602.json (for the JSON version)

    ...and if such a customer exists, you'll get a response of a "200" header and the customer's information in the body, similar to what you see below (the following is the JSON response):

    {
        "customer": {
            "accepts_marketing": true,
            "orders_count": 0,
            "addresses": [{
                "company": null,
                "city": "Wilkinsonshire",
                "address1": "95692 O'Reilly Plains",
                "name": "Roosevelt Colten",
                "zip": "27131-3440",
                "address2": null,
                "country_code": "US",
                "country": "United States",
                "province_code": "NH",
                "phone": "1-244-845-7291 x258",
                "last_name": "Colten",
                "province": "New Hampshire",
                "first_name": "Roosevelt"
            }],
            "tags": "",
            "id": 50548602,
            "last_name": "Colten",
            "note": null,
            "email": "ivory@example.com",
            "first_name": "Roosevelt",
            "total_spent": "0.00"
        }
    }

    If no such customer existed, you'd get a "404" response header and an empty response body.
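    From Ruby, the standard json library turns that response body into a hash you can pick apart. A sketch using an abbreviated copy of the customer response above:

```ruby
require 'json'

# Abbreviated copy of the customer response shown above.
body = <<~JSON
  {
    "customer": {
      "id": 50548602,
      "first_name": "Roosevelt",
      "last_name": "Colten",
      "orders_count": 0,
      "total_spent": "0.00"
    }
  }
JSON

customer  = JSON.parse(body)["customer"]
full_name = "#{customer['first_name']} #{customer['last_name']}"

puts full_name       # => Roosevelt Colten
puts customer["id"]  # => 50548602
```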

    How about info on a product whose ID is 48143272? Here's the URL you'd use:

    • GET api-key:password@shop-url/admin/products/48143272.xml (for the XML version)
    • GET api-key:password@shop-url/admin/products/48143272.json (for the JSON version)

    Once again: if such a product exists, you'll get a "200" response header and a response body that looks something like this (this is the XML version):

    <?xml version="1.0" encoding="UTF-8"?>
    <product>
        <product-type>Snowboard</product-type>
        <handle>burton-custom-freestlye-151</handle>
        <created-at type="datetime">2011-08-02T12:06:42-04:00</created-at>
        <body-html><strong>Good snowboard!</strong></body-html>
        <title>Burton Custom Freestlye 151</title>
        <template-suffix nil="true" />
        <updated-at type="datetime">2011-08-02T12:06:42-04:00</updated-at>
        <id type="integer">48143272</id>
        <vendor>Burton</vendor>
        <published-at type="datetime">2011-08-02T12:06:42-04:00</published-at>
        <tags />
        <variants type="array">
            <variant>
                <price type="decimal">10.0</price>
                <position type="integer">1</position>
                <created-at type="datetime">2011-08-02T12:06:42-04:00</created-at>
                <title>First</title>
                <requires-shipping type="boolean">true</requires-shipping>
                <updated-at type="datetime">2011-08-02T12:06:42-04:00</updated-at>
                <inventory-policy>deny</inventory-policy>
                <compare-at-price type="decimal" nil="true" />
                <inventory-management nil="true" />
                <taxable type="boolean">true</taxable>
                <id type="integer">112957692</id>
                <grams type="integer">0</grams>
                <sku />
                <option1>First</option1>
                <option2 nil="true" />
                <fulfillment-service>manual</fulfillment-service>
                <option3 nil="true" />
                <inventory-quantity type="integer">1</inventory-quantity>
            </variant>
            <variant>
                <price type="decimal">20.0</price>
                <position type="integer">2</position>
                <created-at type="datetime">2011-08-02T12:06:42-04:00</created-at>
                <title>Second</title>
                <requires-shipping type="boolean">true</requires-shipping>
                <updated-at type="datetime">2011-08-02T12:06:42-04:00</updated-at>
                <inventory-policy>deny</inventory-policy>
                <compare-at-price type="decimal" nil="true" />
                <inventory-management nil="true" />
                <taxable type="boolean">true</taxable>
                <id type="integer">112957702</id>
                <grams type="integer">0</grams>
                <sku />
                <option1>Second</option1>
                <option2 nil="true" />
                <fulfillment-service>manual</fulfillment-service>
                <option3 nil="true" />
                <inventory-quantity type="integer">1</inventory-quantity>
            </variant>
        </variants>
        <images type="array" />
        <options type="array">
            <option>
                <name>Title</name>
            </option>
        </options>
    </product>

    You can apply this pattern for retrieving items with specific IDs to other resources in the API.

    Creating a New Item

    Let's quickly look over the HTTP verbs and how they're applied when working with Shopify's RESTful API:

    • GET: "Read". In the Shopify API, the GET verb is used to get information about shops and related things such as customers, orders, products, blogs and so on. GET operations are most often used to get a list of items ("Get me a list of all the products my store carries"), an individual item ("Get me the customer with this particular ID number") or to conduct a search ("Get me a list of the products in my store that come from a particular vendor").
    • POST: "Create". In the Shopify API, the POST verb is used to create new items: new customers, products and so on.
    • PUT: "Update". To modify an existing item using the Shopify API, use the PUT verb.
    • DELETE: "Delete". As you might expect, the DELETE verb is used to delete objects in the Shopify API.

    To create a new item with the Shopify API, use the POST verb and this pattern for the URL:

    • POST api-key:password@shop-url/admin/plural-resource-name.xml (for the XML version)
    • POST api-key:password@shop-url/admin/plural-resource-name.json (for the JSON version)

    Creating a new item also requires providing information about that item. The type of information varies with the item, but it's always in either XML or JSON format, and it's always provided in the request body.

    Let's create a new customer (or more accurately, a new customer record). Here's what we know about the customer:

    • First name: Peter
    • Last name: Griffin
    • Email: peter.lowenbrau.griffin@giggity.com
    • Street address: 31 Spooner Street, Quahog RI 02134
    • Phone: 555-555-1212

    This is enough information to create a new customer record (I'll cover the customer object, as well as all the others, in more detail in future articles). Here's that same information in JSON, in a format that the API expects:

    {
      "customer": {
        "first_name": "Peter",
        "last_name": "Griffin",
        "email": "peter.lowenbrau.griffin@giggity.com",
        "addresses": [{
            "address1": "31 Spooner Street",
            "city": "Quahog",
            "province": "RI",
            "zip": "02134",
            "country": "US",
            "phone": "555-555-1212"
        }]
      }
    }

    Since I've got the customer info in JSON format, I'll use the JSON URL for this API call:

    POST api-key:password@shop-url/admin/customers.json
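    If you'd rather make the call from Ruby than from a browser tool, the standard library's Net::HTTP can do it. Here's a sketch that builds (but doesn't send) the request; the credentials and shop URL are placeholders:

```ruby
require 'net/http'
require 'json'
require 'uri'

# Placeholder credentials and shop URL -- substitute your own.
uri = URI("https://api-key:password@shop-url.myshopify.com/admin/customers.json")

payload = {
  "customer" => {
    "first_name" => "Peter",
    "last_name"  => "Griffin",
    "email"      => "peter.lowenbrau.griffin@giggity.com"
  }
}

request = Net::HTTP::Post.new(uri.request_uri)
request.basic_auth(uri.user, uri.password)
request["Content-Type"] = "application/json"   # we're sending JSON
request.body = JSON.generate(payload)

# To actually send it:
# response = Net::HTTP.start(uri.host, uri.port, use_ssl: true) do |http|
#   http.request(request)
# end
```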

    Here's how we make the call using Chrome REST Console. The URL goes into the Request URL field of the Target section:

    ...while the details of our new customer go into the RAW Body field of the Body section. Make sure that the Content-Type field has the correct content-type selected; in this case, since we're sending (and receiving) JSON, the content-type should be application/json:

    A press of the POST button at the bottom of the page sends the information to the server, and the results are displayed in the Response section:

    Here's the response header:

    Status Code: 200
    Date: Wed, 03 Aug 2011 20:39:13 GMT
    Content-Encoding: gzip
    Transfer-Encoding: chunked
    P3P: CP="NOI DSP COR NID ADMa OPTa OUR NOR"
    Status: 200 OK
    HTTP_X_SHOPIFY_API_CALL_LIMIT: 1/3000
    Connection: keep-alive
    X-UA-Compatible: IE=Edge,chrome=1
    X-Runtime: 0.198933
    Server: nginx/0.8.53
    ETag: "0409671d7af84b695d5ded4e93c0917c"
    Vary: Accept-Encoding
    Content-Type: application/json; charset=utf-8
    Cache-Control: max-age=0, private, must-revalidate
    HTTP_X_SHOPIFY_SHOP_API_CALL_LIMIT: 1/300

    The "200" status code means that the operation was successful and we have a new customer in the records.

    Here's the body of the response, which is the complete record of the customer we just created, in JSON format:

    {
        "customer": {
            "accepts_marketing": null,
            "orders_count": 0,
            "addresses": [{
                "company": null,
                "city": "Quahog",
                "address1": "31 Spooner Street",
                "name": "Peter Griffin",
                "zip": "02134",
                "address2": null,
                "country_code": "US",
                "country": "United States",
                "province_code": "RI",
                "phone": "555-555-1212",
                "last_name": "Griffin",
                "province": "Rhode Island",
                "first_name": "Peter"
            }],
            "tags": "",
            "id": 51827492,
            "last_name": "Griffin",
            "note": null,
            "email": "peter.lowenbrau.griffin@giggity.com",
            "first_name": "Peter",
            "total_spent": "0.00"
        }
    }

    Let's create another new item. This time, we'll make it a product and we'll do it in XML.

    Let's say this is the information we have about the product:


    • Title: Stumpy Pepys Toy Drum SP-1
    • Vendor: Spinal Tap
    • Product type: Drum
    • Description: This drum is so good...you can't beat it!

    Here's that same information in XML, in a format that the API expects:

    <?xml version="1.0" encoding="UTF-8"?>
    <product>  
      <body-html>This drum is so good...<strong>you can't beat it!!</strong></body-html>  
      <product-type>drum</product-type>  
      <title>Stumpy Pepys Toy Drum SP-1</title>  
      <vendor>Spinal Tap</vendor>
    </product>

    (As I wrote earlier, I'll cover the product object and all its fields in an upcoming article.)

    Since I've got the product info in XML format, I'll use the XML URL for this API call:

    POST api-key:password@shop-url/admin/products.xml

    Let's make the call using the Chrome REST Console again. The URL goes into the Request URL field of the Target section:

    ...while the details of our new product go into the RAW Body field of the Body section. Make sure that the Content-Type field has the correct content-type selected; in this case, since we're sending (and receiving) XML, the content-type should be application/xml:

    Once again, a press of the POST button at the bottom of the page sends the information to the server, and the results appear in the Response section:

    Here's the response header:

    Status Code: 201
    Date: Wed, 03 Aug 2011 22:20:17 GMT
    Transfer-Encoding: chunked
    Status: 201 Created
    HTTP_X_SHOPIFY_API_CALL_LIMIT: 1/3000
    Connection: keep-alive
    X-UA-Compatible: IE=Edge,chrome=1
    X-Runtime: 0.122462
    Server: nginx/0.8.53
    Content-Type: application/xml; charset=utf-8
    Location: https://nienow-kuhlman-and-gleason1524.myshopify.com/admin/products/48339792
    Cache-Control: no-cache
    HTTP_X_SHOPIFY_SHOP_API_CALL_LIMIT: 1/300

    Don't sweat that the code is 201 and not 200 -- all 2xx codes mean success. I'm going to go bug the core team and ask why successfully creating a new customer gives you a 200 ("OK") code while successfully creating a new product gives you a 201 ("Created").
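    The practical upshot: check the whole 2xx range rather than comparing against exactly 200. A minimal sketch:

```ruby
# Treat any 2xx status as success: creating a customer returned 200,
# creating a product returned 201, and both are successes.
def success?(code)
  (200..299).cover?(code)
end

success?(200)  # => true
success?(201)  # => true
success?(404)  # => false
```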

    Here's the response body -- it's the complete record of the product we just created, in XML format:

    <?xml version="1.0" encoding="UTF-8"?>
    <product>
        <product-type>drum</product-type>
        <handle>stumpy-pepys-toy-drum-sp-1</handle>
        <created-at type="datetime">2011-08-03T18:20:17-04:00</created-at>
        <body-html>This drum is so good...<strong>you can't beat it!!</strong></body-html>
        <title>Stumpy Pepys Toy Drum SP-1</title>
        <template-suffix nil="true" />
        <updated-at type="datetime">2011-08-03T18:20:17-04:00</updated-at>
        <id type="integer">48339792</id>
        <vendor>Spinal Tap</vendor>
        <published-at type="datetime">2011-08-03T18:20:17-04:00</published-at>
        <tags />
        <variants type="array">
            <variant>
                <price type="decimal">0.0</price>
                <position type="integer">1</position>
                <created-at type="datetime">2011-08-03T18:20:17-04:00</created-at>
                <title>Default</title>
                <requires-shipping type="boolean">true</requires-shipping>
                <updated-at type="datetime">2011-08-03T18:20:17-04:00</updated-at>
                <inventory-policy>deny</inventory-policy>
                <compare-at-price type="decimal" nil="true" />
                <inventory-management nil="true" />
                <taxable type="boolean">true</taxable>
                <id type="integer">113348882</id>
                <grams type="integer">0</grams>
                <sku />
                <option1>Default</option1>
                <option2 nil="true" />
                <fulfillment-service>manual</fulfillment-service>
                <option3 nil="true" />
                <inventory-quantity type="integer">1</inventory-quantity>
            </variant>
        </variants>
        <images type="array" />
        <options type="array">
            <option>
                <name>Title</name>
            </option>
        </options>
    </product>

    Next Time...

    In the next installment, we'll look at modifying and deleting existing objects in your shop.

    Continue reading

    Developing Shopify Apps, Part 2: Exploring the API

    Developing Shopify Apps, Part 2: Exploring the API

    In the previous article in this series, we did the following:

    1. Joined Shopify's Partner Program
    2. Created a new test shop
    3. Launched a new test shop
    4. Added an app to the test shop
    5. Played around with a couple of quick API calls through the browser

    In this article, we'll take a look at some of the calls that you can make to Shopify's API and how they relate to the various parts of your shop. This will give you an idea of what Shopify shops are like, as well as show you how to control them programmatically.

    My Shop, via the Admin Page

    I've set up a test shop called Joey's World O' Stuff for this series of articles. Feel free to visit it at any time. It lives at this URL:

    https://nienow-kuhlman-and-gleason1524.myshopify.com/

    If you followed along with the last article, you also have a test shop with a similar URL. Test shop URLs are randomly generated. The shops themselves are meant to be temporary; they're for experimenting with themes, apps and content. We'll work with real shops later in this series, and they'll have URLs that make sense.

    If you were to visit the URL for my test shop at the time of this writing, you'd see something like this:

    The admin panel for any shop can be accessed by adding /admin to the end of its base URL. If you're not logged into your shop, you'll be sent to the login page. If you're already logged in, you'll be sent to the admin panel's home page, which should look something like this:

    I've highlighted the upper right-hand corner of the admin panel home page, where the Preferences menu is. Click on Preferences, then in the menu that pops up, click on General Settings:

    You should now see the General Settings page, which should look like this:

    The fields on the screen capture of this page are a little small, so I'll list them below:

    • Shop name: Joey's World O' Stuff
    • Email: joey@shopify.com
    • Shop address:
      • Street: 31 Spooner Street
      • Zip: 02903
      • City: Quahog
      • Country: United States
      • State: Rhode Island
      • Phone: (555) 555-5555
    • Order ID formatting: #{{number}}
    • Timezone: (GMT-05:00) Eastern Time (US & Canada)
    • Unit system: Imperial system (Pound, inch)
    • Money formatting: ${{amount}}
    • Checkout language: English

    That's the information for my shop as seen through admin panel on the General Settings page. 

    Just as the admin panel lets you manually get and alter information about your shop, the Shopify API lets applications do the same thing, programmatically. What we just did via the admin panel, we'll now do using the API. But first, let's talk about the API.

    Detour: A RESTafarian API

    The Shopify API is RESTful, or, as I like to put it, RESTafarian. REST is short for REpresentational State Transfer, and it's an architectural style that also happens to be a simple way to make calls to web services. I don't want to get too bogged down in explaining REST, but I want to make sure that we're all on the same page.

    The Shopify API exposes a set of resources, each of which is some part of a shop. Here's a sample of some of the resources that the API lets you access:

    • Shop: The general settings of your shop, which include things like its name, its owner's name, address, contact info and so on.
    • Products: The set of products available through your shop.
    • Images: The set of images of your store's products.
    • Customers: The set of your shop's customers.
    • Orders: The orders placed by your customers.
    (If you'd like to see the full list of resources, go check out the API Documentation. They're all listed in a column on the right side of the page.)

    To do things with a shop, whether it's to get the name of the shop or the contact email of its owner, get a list of all the products available for sale, or find out which customers are the biggest spenders, you apply verbs to resources like the ones listed above. In the case of RESTful APIs like Shopify's, those verbs are the four verbs of HTTP:

    1. GET: Read the state of a resource without making any changes to it in the process. When you type a URL into your browser's address bar and press Enter, your browser responds by GETting that page.
    2. POST: Create a new resource (I'm simplifying here quite a bit; POST is the one HTTP verb with a lot of uses). When you fill out and submit a form on a web page, your browser typically uses the POST verb.
    3. PUT: Update an existing resource.
    4. DELETE: Delete an existing resource.

    Here's an example of putting resources and verbs together. Suppose you were writing an app that let a shopowner do bulk changes to the products in his or her store. Your app would need to access the Products resource and then apply the four HTTP verbs in these ways:

    • If you wanted to get information about one or more products in a shop, whether it's the list of all the products in the shop, information about a single product, or a count of all the products in the shop, you'd use the GET verb and apply it to the Products resource.
    • If you wanted to add a product to a shop, you'd use the POST verb and apply it to the Products resource.
    • If you wanted to modify an existing product in a shop, you'd use the PUT verb and apply it to the Products resource.
    • If you wanted to delete a product from a shop, you'd use the DELETE verb and apply it to the Products resource.
    Keep in mind that not all resources respond to all four verbs. Certain resources, such as Shop, aren't programmatically editable, and as a result they don't respond to PUT.

    My Shop, via the API

    Let's get the same information that we got from the admin panel's General Settings page, but using the API this time. In order to do this, we need to know two things:

    1. Which resource to access. In this case, it's pretty obvious: the Shop resource.
    2. Which verb to use. Once again, it's quite clear: GET. (Actually, if you check the API docs, it's very clear; it's the only verb that the Shop resource responds to.) 

    The nice thing about GET calls to web APIs is that you can try them out very easily: just type them into your browser's address bar!

    You specify a resource with its URL (or more accurately, URI). That's what the "R" in URL and URI stands for: resource. To access a Shopify resource, you need to form its URI using this general format:

    api-key:password@shop-url/admin/resource-name.resource-type

    Where:

    • api-key is the API key for your app (when you create an app, Shopify's back end generates a unique API key for it)
    • password is the password for your app (when you create an app, Shopify's back end generates a password for it)
    • shop-url is the URL for your shop
    • resource-name is the name of the resource
    • resource-type is the type of the resource; this is typically either xml if you'd like the response to be given to your app in XML format, or json if you'd like the response to be in JSON.
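    Putting those pieces together in Ruby, a tiny helper can assemble the URI. Every value below is a placeholder; substitute your app's credentials and your shop's URL:

```ruby
# Assemble a Shopify resource URI from its parts. All arguments are
# placeholders standing in for your app's real credentials and shop URL.
def resource_uri(api_key, password, shop_url, resource, type = "json")
  "#{api_key}:#{password}@#{shop_url}/admin/#{resource}.#{type}"
end

resource_uri("api-key", "password", "shop-url", "shop", "xml")
# => "api-key:password@shop-url/admin/shop.xml"
```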

    You can find the API key and password for your app on the Shopify API page of your shop's admin panel. You can get there via this URL:

    shop-url/admin/api

    where shop-url is your shop's URL. You can also get there by clicking on the Apps menu, which is located near the upper right-hand corner of every page in the admin panel and selecting Manage Apps:

    You'll see a list of sets of credentials, one set for each app. Each one looks like this:

    You can copy the API key and password for your app from this box. Better yet, you can copy the example URL, shown below, and then edit it to create the API call you need:

    The easiest way to get general information about your shop is to:


    1. Copy the example URL
    2. Paste it into your browser's address bar
    3. Edit the URL, changing orders.xml to shop.xml
    4. Press Enter

    You should see a result that looks something like this:

    <shop>
      <name>Joey's World O' Stuff</name>
      <city>Quahog</city>
      <address1>31 Spooner Street</address1>
      <zip>02903</zip>
      <created-at type="datetime">2011-07-22T14:43:21-04:00</created-at>
      <public type="boolean">false</public>
      <country>US</country>
      <domain>nienow-kuhlman-and-gleason1524.myshopify.com</domain>
      <id type="integer">937792</id>
      <phone>(555) 555-5555</phone>
      <source nil="true"/>
      <province>Rhode Island</province>
      <email>joey@shopify.com</email>
      <currency>USD</currency>
      <timezone>(GMT-05:00) Eastern Time (US & Canada)</timezone>
      <shop-owner>development shop</shop-owner>
      <money-format>${{amount}}</money-format>
      <money-with-currency-format>${{amount}} USD</money-with-currency-format>
      <taxes-included type="boolean">false</taxes-included>
      <tax-shipping nil="true"/>
      <plan-name>development</plan-name>
    </shop>

    Note that what you get back is a little more information than what you see on the admin panel's General Settings page; you also get some information that you'd find on other admin panel pages, such as the currency your shop uses and how taxes are applied to your product and shipping prices.

    You can also get your shop information in JSON by simply changing the last part of the URL from shop.xml to shop.json. You'll see a result like this:

    {"shop":
      {"address1":"31 Spooner Street",
       "city":"Quahog",
       "name":"Joey's World O' Stuff",
       "plan_name":"development",
       "shop_owner":"development shop",
       "created_at":"2011-07-22T14:43:21-04:00",
       "zip":"02903",
       "money_with_currency_format":"${{amount}} USD",
       "money_format":"${{amount}}",
       "country":"US",
       "public":false,
       "taxes_included":false,
       "domain":"nienow-kuhlman-and-gleason1524.myshopify.com",
       "id":937792,
       "timezone":"(GMT-05:00) Eastern Time (US \u0026 Canada)",
       "tax_shipping":null,
       "phone":"(555) 555-5555",
       "currency":"USD",
       "province":"Rhode Island",
       "source":null,
       "email":"joey@shopify.com"}
    }

    (Okay, I formatted this one so it would be easy to read. It was originally one long line; easy for computers to read, but not as easy for humans.)

    Other Things in My Shop, via the API

    If you followed my steps from the previous article in this series, your shop should have a small number of predefined products. You can look at them all by taking the URL you just used and changing the last part to products.xml.

    Here's a shortened version of the output I got:

    <products type="array">
      <product>
        <product-type>Shirts</product-type>
        <handle>multi-channelled-executive-knowledge-user</handle>
        <created-at type="datetime">2011-07-22T14:43:24-04:00</created-at>
        <body-html>
          ...really long description here...
        </body-html>
        <title>Multi-channelled executive knowledge user</title>
        <template-suffix nil="true"/>
        <updated-at type="datetime">2011-07-22T14:43:24-04:00</updated-at>
        <id type="integer">47015882</id>
        <vendor>Shopify</vendor>
        <published-at type="datetime">2011-07-22T14:43:24-04:00</published-at>
        <tags>Demo, T-Shirt</tags>
        <variants type="array">
          <variant>
            <price type="decimal">19.0</price>
            <position type="integer">1</position>
            <created-at type="datetime">2011-07-22T14:43:24-04:00</created-at>
            <title>Medium</title>
            <requires-shipping type="boolean">true</requires-shipping>
            <updated-at type="datetime">2011-07-22T14:43:24-04:00</updated-at>
            <inventory-policy>deny</inventory-policy>
            <compare-at-price type="decimal" nil="true"/>
            <inventory-management nil="true"/>
            <taxable type="boolean">true</taxable>
            <id type="integer">110148372</id>
            <grams type="integer">0</grams>
            <sku/>
            <option1>Medium</option1>
            <option2 nil="true"/>
            <fulfillment-service>manual</fulfillment-service>
            <option3 nil="true"/>
            <inventory-quantity type="integer">5</inventory-quantity>
          </variant>
        </variants>
        <images type="array"/>
        <options type="array">
        <option>
        <name>Title</name>
        </option>
        </options>
      </product>
      ...
      (more products here)
      ...
    </products>

    If you want this information in JSON format, all you need to do is change the URL so that it ends with .json instead of .xml.

    Try Out the Other Resources

    There are a number of other Shopify API resources that you can access -- try some GET calls on these:

    • articles.xml or articles.json
    • assets.xml or assets.json
    • blogs.xml or blogs.json
    • comments.xml or comments.json
    • customers.xml or customers.json

    There are more resources that you can access through GET; the Shopify Wiki lists them all in the right-hand column. Try them out!

    Next: Graduating to a real store, and trying out the POST, PUT and DELETE verbs.

    [ This article also appears in Global Nerdy. ]

    Continue reading

    StatsD at Shopify

    StatsD at Shopify

    Here at Shopify, we like data. One of the many tools in our data toolbox is StatsD. We've been using StatsD in production at Shopify for many months now, consistently sending multiple events to our StatsD instance on every request.

    What is StatsD good for?

    In my experience, there are two things that StatsD really excels at. The first is getting a high-level overview of some custom piece of data. We use NewRelic to tell us about the performance of our apps. NewRelic provides a great overview of our performance as a whole, even down to which of our controller actions are slowest, and though it has an API for custom instrumentation, I've never used it. For custom metrics we're using StatsD.

    We use lots of memcached, and one metric we track with StatsD is cache hits vs. cache misses on our frontend. On every request that hits a cacheable action we send an event to StatsD to record a hit or miss. 
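    The instrumentation itself is a one-liner per event. Here's a sketch of the shape of that hit/miss tracking, using an in-memory stand-in for the StatsD client so the idea is self-contained; the metric names are illustrative:

```ruby
# In-memory stand-in for the StatsD client, just to show the shape of
# hit/miss instrumentation. The metric names are illustrative.
module FakeStatsD
  COUNTERS = Hash.new(0)

  def self.increment(metric, count = 1)
    COUNTERS[metric] += count
  end
end

# On each cacheable request, record whether the cache hit or missed.
def record_cache_result(hit)
  metric = hit ? "Storefront.cache.hit" : "Storefront.cache.miss"
  FakeStatsD.increment(metric)
end

# Simulate four requests: three hits, one miss.
[true, true, false, true].each { |hit| record_cache_result(hit) }

FakeStatsD::COUNTERS["Storefront.cache.hit"]   # => 3
FakeStatsD::COUNTERS["Storefront.cache.miss"]  # => 1
```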

    Caching Baseline (Green: cache hits, Blue: cache misses)

     

    Note: The graphs in this article were generated by Graphite, the real-time graphing system that StatsD runs on top of.

    As an example of how this is useful, we recently added some data to a cache key that wasn't properly converted to a string, so that piece of the key was appearing to be unique far more often than it was. The net result was more cache misses than usual. Looking at our NewRelic data we could see that performance was affected, but it was difficult to see exactly where. The response time from our memcached servers was still good, the response time from the app was still good, but our number of cache misses had doubled, our number of cache hits had halved, and overall user-facing performance was down.

    A problem


     

    It wasn't until we looked at our StatsD graphs that we fully understood the problem. Looking at our caching trends over time we could clearly see that on a specific date something was introduced that was affecting caching negatively. With a specific date we were able to track down the git commit and fix the issue. Keeping an eye on our StatsD graphs we immediately saw the behaviour return to the normal trend.

    Return to Baseline

     

    The second thing that StatsD excels at is proving assumptions. When we're writing code we're constantly making assumptions. Assumptions about how our web app may be used, assumptions about how often an interaction will be performed, assumptions about how fast a particular operation may be, assumptions about how successful a particular operation may be. Using StatsD it becomes trivial to get real data about this stuff.

    For instance, we push a lot of products to Google Product Search on behalf of our customers. There was a point where I was seeing an abnormally high number of failures returned from Google when we were posting these products via their API. My first assumption was that something was wrong at the protocol level and most of our API requests were failing. I could have done some digging around in the database to get an idea of how many failures we were getting, cross-referenced with how many products we were trying to publish and how frequently, etc. But using our StatsD client (see below) I was able to add a simple success/failure metric to give me a high-level overview of the issue. Looking at the graph from StatsD I could see that my assumption was wrong, so I was able to eliminate that line of thinking.

    statsd-instrument

    We were excited about StatsD as soon as we read Etsy's announcement. We wrote our own client and began using it immediately. Today we're releasing that client. It's been in use in production since then and has been stalwartly collecting data for us. On an average request we're sending ~5 events to StatsD and we don't see a performance hit. We're actually using StatsD to record the raw number of requests we handle over time.

    statsd-instrument provides some basic helpers for sending data to StatsD, but we don't typically use those directly. We definitely didn't want to litter our application with instrumentation details so we wrote metaprogramming methods that allow us to inject that instrumentation where it's needed. Using those methods we have managed to keep all of our instrumentation contained to one file in our config/initializers folder. Check out the README for the full API or pull down the statsd-instrument rubygem to use it.

    A sample of our instrumentation shows how to use the library and the metaprogramming methods:

    # Liquid
    Liquid::Template.extend StatsD::Instrument
    Liquid::Template.statsd_measure :parse, 'Liquid.Template.parse'
    Liquid::Template.statsd_measure :render, 'Liquid.Template.render'
    
    # Google Base
    GoogleBase.extend StatsD::Instrument
    GoogleBase.statsd_count_success :update_products!, 'GoogleBase.update_products'
    
    # Webhooks
    WebhookJob.extend StatsD::Instrument
    WebhookJob.statsd_count_success :perform, 'Webhook.perform'
    

    That being said, there are a few places where we do make use of the helpers directly (sans metaprogramming), still within the confines of our instrumentation initializer:

    ShopAreaController.after_filter do
      StatsD.increment 'Storefront.requests', 1, 0.1
    
      return unless request.env['cacheable.cache']
    
      if request.env['cacheable.miss']
        StatsD.increment 'Storefront.cache.miss'
      elsif request.env['cacheable.store'] == 'client'
        StatsD.increment 'Storefront.cache.hit_client'
      elsif request.env['cacheable.store'] == 'server'
        StatsD.increment 'Storefront.cache.hit_server'
      end
    end
    

    Today we're recording metrics on everything from the time it takes to parse and render Liquid templates to how often our Webhooks succeed, the performance of our search server, average response times from the many payment gateways we support, the success/failure of user logins, and more.

    As I mentioned, we have many tools in our data toolbox, and StatsD is a low-friction way to easily collect and inspect metrics. Check out statsd-instrument on github.

    Prognostication For Fun And Profit: States And Events

    Measuring stuff is hard, even when it stands still. When things change over time the situation gets even worse. 

    I'm Ben Doyle, research scientist and data prophet at Shopify. Over the next little while I'm planning to go into some detail about how we're calculating some common ecommerce metrics (like customer count, churn rate, lifetime value, etc.) here at Shopify.  Most of these metrics come down to fancy ways of counting, but the devil is in the details. 
    For Instance...
    You have a table of users and a boolean flag that says whether each of them is a paying customer. Counting your customers is as simple as counting the rows where the flag is set. Or is it? What if you want to plot your growth over time? You can replace the flag with a date range and count how many of these ranges include a date in question. What if your customer definition changes, perhaps due to a change in your business model? You have to go back through your whole history, which might be a problem if your new definition relies on information you have just started collecting. If you are doing exploratory work you might not even know what data you need yet. 
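The date-range approach can be sketched in a few lines of Ruby; the customer records, field names, and dates below are made up for illustration, not our actual schema:

```ruby
require 'date'

# Hypothetical customer records: each holds the date range during which
# the user counted as a paying customer; a nil end date means they are
# still a customer today.
Customer = Struct.new(:name, :started_on, :ended_on)

CUSTOMERS = [
  Customer.new('alice', Date.new(2011, 1, 10), Date.new(2011, 3, 1)),
  Customer.new('bob',   Date.new(2011, 2, 1),  nil),
  Customer.new('carol', Date.new(2011, 4, 15), nil),
]

# Count how many customer intervals include a given point in time.
def customers_on(customers, date)
  customers.count do |c|
    c.started_on <= date && (c.ended_on.nil? || date < c.ended_on)
  end
end

customers_on(CUSTOMERS, Date.new(2011, 2, 15))  # => 2 (alice and bob)
```

Plotting growth over time is then just calling customers_on for each date on the axis.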
    These problems have solutions but clearly there's work to do. We need to untangle our methods and sort out our definitions. I thought I'd kick things off by clarifying the elements most basic to counting: intervals and points.

    Intervals and Points

    In the diagram above the dark blue lines represent intervals and the light blue circles represent points (the points should really only be a single pixel but are large so we can see them). In general you need intervals to measure points and you need points to measure intervals. So in a) three of the four intervals overlap with the point and in b) three of the four points overlap with the interval.

    States And Events

    In more familiar terms the space we're usually talking about is time, our points are events and our intervals are states. Events can be found in server logs or rows in a transactional database.  So every time a purchase is made or a signup form is completed, an event is created. At minimum the event type and a timestamp will be recorded, though the event will usually be associated with a user or other entity.

    A state history is often the result of a state machine as Willem discussed in a recent post. For example a subscriber can be subscribed to one or many of several subscription packages. Then keeping a history of those packages means having a start and end date for each package, for each subscriber. States must at minimum have a start and end timestamp and a type, though an entity is usually recorded as well.

    Example: Accounting

    Understanding states and events clarifies the distinction between different sorts of counting. For example, standard accounting practice is to report both a balance sheet and an income statement.

    The balance sheet is a snapshot. This means it is an event in time being used to count states. Consider an account that had $100 in the interval between May 1st and June 18th.  It had $100 on an event date of June 1st, which lies within the interval. The state of every account can be assessed on a given date, and a total reported.

    The income statement reports changes over a period of time. It uses an interval to count a collection of events. So if there was a withdrawal event on June 19th of $5 and deposits on the 20th and 21st of $10 each, the income statement for June would be $10 + $10 - $5 = $15. If these were the only events for June, the balance sheet for July 1st should be $115.
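The two views can be sketched in a few lines of Ruby, using the numbers from the example above (the data layout here is purely illustrative):

```ruby
require 'date'

# The account from the example: $100 as of June 1st, then a withdrawal
# and two deposits.
OPENING_BALANCE = 100
EVENTS = [
  { date: Date.new(2011, 6, 19), amount: -5 },  # withdrawal
  { date: Date.new(2011, 6, 20), amount: 10 },  # deposit
  { date: Date.new(2011, 6, 21), amount: 10 },  # deposit
]

# Income statement: an interval counting events.
june = Date.new(2011, 6, 1)...Date.new(2011, 7, 1)
income = EVENTS.select { |e| june.cover?(e[:date]) }.sum { |e| e[:amount] }

# Balance sheet: a snapshot (a point in time) of the account's state,
# reconstructed as the opening state plus all prior events.
balance_july_1 = OPENING_BALANCE +
  EVENTS.select { |e| e[:date] < Date.new(2011, 7, 1) }.sum { |e| e[:amount] }

income          # => 15
balance_july_1  # => 115
```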

    These are two different and complementary ways to look at the problem of counting over time. In theory, if you add up all of your income statements before a given date and time, the sum should be the same as your current balance at that date and time. So the results are equivalent. Unfortunately, if there are errors or omissions in your data, discrepancies will arise between the two methods.

    Metrics

    Perhaps the most basic metric to track is "How many customers did we have on date X?". If you have a software system already set up that consistently tracks the states of your users (e.g. visitor, customer, churned), and the events that can influence them (e.g. signups, payments, cancellations) the answer can be easy. Like in the finance example above you can present a balance sheet for any point in time, counting customer states at a point in time. In status quo situations this will probably be the preferred method.

    It's nice to know that there is an alternative at hand for when things get complicated, though. Counting the events that change customer states will give you greater flexibility to adapt to changes. So if you suddenly decide you want to count “happy” customers separately, looking at the events that indicate happiness - as opposed to "customer-ness" - is a good place to start.

    For an example of the above, you could count the number of events signifying happiness (e.g. logins or site interactions) within 30 days of a particular date. This could serve as a happiness metric for that date. Since you are counting the events directly, it's easy to add flourishes like having some events count more than others, having their weights decay over time, or even incorporating events from an entirely different source. This flexibility especially helps if you are trying to build up a metric to serve as a proxy for or to predict another metric. For example, you could use the weights on your event types as adjustable parameters in fitting a model. Continuing with the example, our happiness metric could be constructed to predict customer churn. I'll go into more detail about these sorts of analyses in future posts.
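Such a metric might be sketched like this; the event types, weights, and the linear decay over the 30-day window are all invented for illustration:

```ruby
require 'date'

# Hypothetical per-type weights for events that signify happiness.
WEIGHTS = { login: 1.0, purchase: 5.0 }

# Weighted count of events within the window, with each event's weight
# decaying linearly as it ages.
def happiness(events, as_of, window: 30)
  events.sum do |type, date|
    age = (as_of - date).to_i
    next 0.0 unless age.between?(0, window)
    WEIGHTS.fetch(type, 0.0) * (1.0 - age.to_f / window)
  end
end

events = [
  [:login,    Date.new(2011, 6, 28)],  # recent, counts almost fully
  [:purchase, Date.new(2011, 6, 15)],  # heavier weight, half decayed
  [:login,    Date.new(2011, 4, 1)],   # outside the window, ignored
]

happiness(events, Date.new(2011, 6, 30))
```

Tuning WEIGHTS (or the decay function) is exactly the kind of adjustable parameter you could fit against, say, churn data.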

    When In Doubt, Start With Events

    If you are running a store it's nice to know how many customers you have, but it's more important to know how many sales you've made. The definition of customer is abstract and can be arbitrary. Do they need to make a purchase? Several purchases? Do coupons or promotions count? When do they cease to be a customer? In contrast, sales events are much harder to argue about, so they make a great basis for metrics. I hope this article has left you with some insight into the seemingly simple act of counting and welcome questions or comments.

    Developing Shopify Apps, Part 1: The Setup

    What is a Shopify App?

    Shopify is a pretty capable ecommerce platform on its own, and for a lot of shopowners, it's all they need for their shops. However, there are many cases where shopowners need features and capabilities that don't come "out of the box" with Shopify. That's what apps are for: to add those extra features and capabilities to Shopify.

    Apps make use of the Shopify API, which lets you programmatically access a shop's data -- items for sale, orders and so on -- and take most of the actions available to you from a shop's control panel. An app can automate a tedious or complex task for a shopowner, make the customer's experience better, give shopowners better insight into their sales and other data, or integrate Shopify with other applications' data and APIs in useful ways.

    Here are some apps that you can find at the Shopify App Store. These should give you an idea of what's possible:

    • Jilt: This is an app that makes shopowners' lives easier. It helps turn abandoned carts -- they arise when a customer shops on your store, puts items in the cart and then for some reason never completes the purchase -- into orders. After a specified amount of time, it sends an email to customers who've filled carts but never got around to buying their contents. It's been shown to recover sales that would otherwise never have been made.
    • Searchify: Here's an app that makes the customer experience more pleasant. It's an autocompleting search box that uses your shop's data to let customers see matching products as they type. The idea is that by making your shop easier to search, you'll get more sales.
    • Beetailer: A good example of taking the Shopify API and combining it with other APIs. It lets your customers comment on your shop's products and share opinions about them on social media sites like Facebook and Twitter. You can harness the power of word-of-mouth marketing to get people to come to your store!

    Shopify apps offer benefits not just for shopowners and their customers, but for developers as well. Developers can build custom private apps for individual shopowners, or reach the 16,000 or so Shopify shopowners by selling their apps through the App Store. The App Store is a great way to get access to some very serious app customers: after all, they're looking for and willing to spend money on apps that make their shops more profitable. Better still, since a healthy app ecosystem is good for us as well, we'll be more than happy to help showcase and promote your apps.

    If you've become convinced to write an app, read on, and follow this series of articles. I'll explore all sorts of aspects of Shopify app-writing, from getting started to selling and promoting your apps. Enjoy!

    Step 1: Become a Partner

    Before you can write apps, you have to become a Shopify Partner. Luckily, it's quick and free to do so. Just point your browser at the Shopify Partners login page (https://app.shopify.com/services/partners/auth/login):

    Once you're there, click on the Become a partner button. That will take you to the Become a Shopify Partner form, a single page in which you provide some information, such as your business' name, your URL and if you're into Shopify consulting, app development or theme design as well as some contact info:

    When you submit this form, you're in the club! You're now a Shopify partner and ready to take on the next step: creating a test shop.

    Step 2: Create a New Test Shop

    Test shops are a feature of Shopify that let you try out store themes and apps without exposing them to the general public. They're a great way to familiarize yourself with Shopify's features; they're also good "sandboxes" in which you can safely test app concepts.

    The previous step should have taken you to your Shopify partner account dashboard, which looks like this:

    It's time to create a test shop. Click on the Test Shops tab, located not too far from the top of the page:

    You'll be taken to the My Test Shops page, where you manage your test shops. It looks like this:

    As you've probably already figured out, you can create a new test shop by either:

    • Clicking on the Create a new Test Shop button near the upper left-hand corner of the page
    • Clicking on the big Create your first Test Shop button in the middle of the page. I'm going to click that one...

    You should see this message near the top of the page for a few moments:

    ...after which you should see the My Test Shops page now sporting a test shop in a list.

    Test shops are given a randomly-generated name. When you decide to create a real, non-test, customer-facing shop, you can name it whatever you want from the start.

    In this example, the test shop is Nienow, Kuhlman and Gleason (sounds like a law firm!). Click on its name in the list to open its admin panel.

    Step 3: Launch Your Test Shop

    Here's what the admin panel for a newly-created shop looks like:

    If you're wondering what the URL for your shop is, it's at the upper left-hand corner of the page, just to the right of the Shopify wordmark. Make a note of this URL; you'll use it often.

    Just below that, you'll see your shop's password:

    (Don't bother trying to use this password to get to my test shop; I've changed it.)

    You're probably looking at that big text and thinking "7 steps? Oh Shopify, why you gotta be like that?"

    Worry not. Just below that grey bar showing the seven steps you need to get a store fully prepped is a link that reads Skip setting up your store and launch it anyway. Click it:

    This will set up your test store with default settings, a default theme and even default inventory. You'll be taken to the admin panel for your shop, which looks like this:

    This is the first thing shopowners see when they log into their shops' admin panels.

    Now, let's add an app!

    Step 4: Add an App

    Click on the Apps tab, located near the upper right-hand corner of the page. A menu will pop up; click on its Manage Apps menu item:

    You'll be taken to the Installed Applications page, shown below:

    For the purposes of this exercise, a private app -- one that works only for this shop -- will do just fine. Click on the click here link that immediately follows the line Are you a developer interested in creating a private application for your shop?:

    You'll get taken to the Shopify API page, which manages the API keys and other credentials for your test shop's apps:

    For each app in a shop, there's a corresponding set of credentials. Let's generate some credentials now -- click the Generate new application button:

    The page will refresh and you'll see a big grey box containing all sorts of credentials:

    Here's a closer look at the credentials:

    You now have credentials that an app can use. Guess what: we're ready to make some API calls!

    A Quick Taste!

    Here's a quick taste of what we'll do in the next installment: play around with the Shopify API. Just make sure you've gone through the steps above first.

    The Shopify API is RESTful. One of the benefits of this is that you can explore parts of it with some simple HTTP GET calls, which you can easily make by typing into your browser's address bar. These calls use the following format:

    api-key:password@your-test-shop-URL/admin/resource.xml
    You could type in the URL yourself, but I find it's far easier to simply copy the Example URL from the list of credentials for your app and edit it as required:

    For example, if you want some basic information about your shop, copy the Example URL, paste it into your browser's address bar and change orders.xml to shop.xml. Press Enter; you should see results that look something like this:

     Nienow, Kuhlman and Gleason Boston 185 Rideau Street K1N 5X8 2011-07-22T14:43:21-04:00 false US nienow-kuhlman-and-gleason1524.myshopify.com 937792 555 555 5555  Massachusetts joey@joeydevilla.com USD (GMT-05:00) Eastern Time (US & Canada) development shop ${{amount}} ${{amount}} USD false  development 

    How about the products in your shop? There are some: since we skipped the full setup, your test shop comes pre-populated with some example products. Copy the Example URL, paste it into your browser's address bar and change orders.xml to products.xml. You should get a result that looks something like this:

      Shirts multi-channelled-executive-knowledge-user 2011-07-22T14:43:24-04:00 
    

    So this is a product.

    The text you see here is a Product Description. Every product has a price, a weight, a picture and a description. To edit the description of this product or to create a new product you can go to the Products Tab of the administration menu.

    Once you have mastered the creation and editing of products you will want your products to show up on your Shopify site. There is a two step process to do this.

    First you need to add your products to a Collection. A Collection is an easy way to group products together. If you go to the Collections Tab of the administration menu you can begin creating collections and adding products to them.

    Second you’ll need to create a link from your shop’s navigation menu to your Collections. You can do this by going to the Navigations Tab of the administration menu and clicking on “Add a link”.

    Good luck with your shop!

    Multi-channelled executive knowledge user 2011-07-22T14:43:24-04:00 47015882 Shopify 2011-07-22T14:43:24-04:00 Demo, T-Shirt 19.0 1 2011-07-22T14:43:24-04:00 Medium true 2011-07-22T14:43:24-04:00 deny true 110148372 0 Medium manual 5
    ...

    Check out the API Reference for more API calls you can try. That's what we'll be covering in the next installment, in greater detail. Happy APIing!
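If you'd rather script these calls than type URLs into the address bar, here's a small Ruby sketch. The credentials are placeholders, and the https:// scheme and request code are assumptions for illustration; the actual network call is commented out so nothing is hit when you run it:

```ruby
require 'net/http'
require 'uri'

# Build the kind of URL shown above from an app's (placeholder) credentials.
def resource_url(api_key, password, shop_domain, resource)
  "https://#{api_key}:#{password}@#{shop_domain}/admin/#{resource}.xml"
end

url = resource_url('my-api-key', 'my-password',
                   'nienow-kuhlman-and-gleason1524.myshopify.com', 'shop')
uri = URI.parse(url)

# The browser does the equivalent of an authenticated GET; uncomment to
# actually make the request:
# http = Net::HTTP.new(uri.host, uri.port)
# http.use_ssl = true
# request = Net::HTTP::Get.new(uri.request_uri)
# request.basic_auth(uri.user, uri.password)
# puts http.request(request).body
```

Swapping 'shop' for 'products' or 'orders' fetches those resources instead, just like editing the URL by hand.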

    Why developers should be force-fed state machines

    This post is meant to create more awareness about state machines in the web application developer crowd. If you don’t know what state machines are, please read up on them first. Wikipedia is a good place to start, as always.

    State machines are awesome

    The main reason for using state machines is to help the design process. It is much easier to figure out all the possible edge conditions by drawing out the state machine on paper. This will make sure that your application will have fewer bugs and less undefined behavior. Also, it clearly defines which parts of the internal state of your object are exposed as external API.

    Moreover, state machines have decades of math and CS research behind them covering analysis, simplification, and much more. Once you realize that in management state machines are called business processes, you'll find a wealth of information and tools at your disposal.

    Recognizing the state machine pattern

    Most web applications contain several examples of state machines, including accounts and subscriptions, invoices, orders, blog posts, and many more. The problem is that you might not necessarily think of them as state machines while designing your application. Therefore, it is good to have some indicators to recognize them early on. The easiest way is to look at your data model:

    • Adding a state or status field to your model is the most obvious sign of a state machine.
    • Boolean fields are usually also a good indication, like published or paid. Timestamps that can have a NULL value, like published_at and paid_at, are also a useful sign.
    • Finally, having records that are only valid for a given period in time, like subscriptions with a start and end date.

    When you decide that a state machine is the way to go for your problem at hand, there are many tools available to help you implement it. For Ruby on Rails, we have the excellent gem state_machine which should cover virtually all of your state machine needs.
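To make the pattern concrete, here's a minimal hand-rolled sketch of a state machine; the Invoice class and its events are invented for illustration, and the state_machine gem provides a far richer DSL (guards, callbacks, scopes, and so on):

```ruby
# A state machine reduced to its essentials: a table of events mapping
# allowed source states to target states.
class Invoice
  TRANSITIONS = {
    send_out: { draft: :sent },
    pay:      { draft: :paid, sent: :paid },
  }

  attr_reader :state

  def initialize
    @state = :draft
  end

  # Fire an event; raise if the transition isn't allowed from the
  # current state, so undefined behavior can't sneak in.
  def fire(event)
    to = TRANSITIONS.fetch(event, {})[@state]
    raise "invalid transition: #{event} from #{@state}" unless to
    @state = to
  end
end

invoice = Invoice.new
invoice.fire(:send_out)
invoice.state  # => :sent
invoice.fire(:pay)
invoice.state  # => :paid
```

Writing the transition table out like this is essentially the paper drawing from the design step, turned into code.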

    Keeping the transition history

    Now that you are using state machines for modelling, the next thing you will want to do is keeping track of all the state transitions over time. When you are starting out, you may be only interested in the current state of an object, but at some point the transition history will be an invaluable source of information. It allows you to answer all kinds of questions, like: “How long on average does it take for an account to upgrade?”, “How long does it take to get a draft blog post published?”, or “Which invoices are waiting for an initial payment the longest?”. In short, it gives you great insight on your users' behavior.

    When your state machine is acyclic (i.e. it is not possible to return to a previous state) the simplest way to keep track of the transitions is to add a timestamp field for every possible state (e.g. confirmed_at, published_at, paid_at). Simply set these fields to the current time whenever a transition to the given state occurs.

    However, it is often possible to revisit the same state multiple times. In that case, simply adding fields to your model won’t do the trick because you will be overwriting them. Instead, add a log table in which all the state transitions will be logged. Fields that you probably want to include are the timestamp, the old state, the new state, and the event that caused the transition.
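A sketch of what such a log amounts to, in-memory here for illustration; in a real app each entry would be a row in a transitions table, which is what state_machine-audit_trail (below) persists for you:

```ruby
# One entry per transition: timestamp, old state, new state, and the
# event that caused it.
class TransitionLog
  Entry = Struct.new(:at, :from, :to, :event)

  def initialize
    @entries = []
  end

  def record(from, to, event, at: Time.now)
    @entries << Entry.new(at, from, to, event)
  end

  def history
    @entries
  end
end

log = TransitionLog.new
log.record(:draft, :published, :publish)
log.record(:published, :draft, :unpublish)
log.record(:draft, :published, :publish)  # revisiting a state is fine
log.history.map(&:to)  # => [:published, :draft, :published]
```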

    For Ruby and Rails, Jesse Storimer and I have developed the Ruby gem state_machine-audit_trail to track this history for you. It can be used in conjunction with the state_machine gem.

    Deleting records?

    In some cases, you may be tempted to delete state machine records from your database. However, you should never do this. For accountability and completeness of your history alone, it is a good practice to never delete records. Instead of removing it, add an error state for any reason you would have wanted to delete a record. A spam account? Don’t delete, set to the spam state. A fraudulent order? Don’t delete, set to the fraud state.

    This allows you to keep track of these problems over time, like: how many accounts are spam, or how long it takes on average to see that an order is fraudulent.

    In conclusion

    Hopefully, reading this text has made you more aware of state machines and you will be applying them more often when developing a web application. Disclaimer: like any technique, state machines can be overused. Developer discretion is advised.

    Session Hijacking Protection

    There’s been a lot of talk in the past few weeks about “Firesheep”, a new program that lets users hijack other users’ accounts on many different websites. But there’s no need to worry about your Shopify account — we’ve taken steps to ensure your account can’t be hijacked and your data is safe.

    Firesheep is a Firefox plugin (a program that integrates right into the Firefox browser) that makes it easy to perform HTTP session cookie hijacks when using an insecure connection on an untrusted network. This kind of attack is nothing new, but Firesheep makes it dead simple and shows just how widespread the vulnerability is.

    The attack consists of stealing cookie data over an untrusted network and using that data to log in to other people’s user accounts. Many websites that you use daily, including Shopify, are susceptible to this kind of attack.

    Naturally we reacted to this by taking measures to ensure that this can’t happen to our users. All of your Shopify admin data is now fully secure, encrypted, and protected from Firesheep attacks.

    Technical Details

    The only way to ensure that cookie data, or any data sent over HTTP for that matter, is not being spied upon is end-to-end encryption. Currently the solution for this is SSL.

    Last week we made the switch to all SSL in the Shopify admin area. This has been applied to all URLs and all subscription plans. This means that any request made to Shopify will be forced to use SSL for secure encryption.

    But this is not quite enough to ensure that cookie data is not hijacked. By default HTTP cookies are sent over secured, as well as unsecured, connections. Without taking the extra step to secure the HTTP cookie as well, your session is still vulnerable.
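In a Rails app that extra step amounts to flagging the session cookie as Secure. A sketch of the relevant configuration, with made-up application and cookie names:

```ruby
# config/initializers/session_store.rb -- illustrative only. The
# :secure flag tells the browser to send this cookie over HTTPS alone,
# so it never crosses the wire in the clear.
MyApp::Application.config.session_store :cookie_store,
  :key    => '_myapp_session',
  :secure => true
```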

    The Problem

    In Shopify’s case we weren’t able to use SSL for all traffic on the site. There are two main areas to Shopify, the shop frontend and the shop backend. In the backend is where a shop’s employees manage product data, fulfill orders, etc. In the frontend is where products are viewed, carts are filled, and checkout happens. All traffic in the backend happens under one domain, *.myshopify.com, with individual accounts having unique subdomains. One wildcard SSL cert allows us to protect the entire backend.

    We can’t apply the same strategy to the shop frontends because we allow our merchants to use custom domains for their shops. So there are literally thousands of different domain names pointing at the Shopify servers, each of which would require an SSL cert. An insecure frontend is not too worrisome since there is no sensitive data being passed around, just information about what’s stored in the cart.

    However, this meant that we would need two different session cookies, one for use in the backend to be sent on encrypted connections only, and one for use in the frontend to be sent unencrypted.

    Using two different session stores based on routes isn’t something that Ruby on Rails supports out of the box. You set one session store for your application that gets inserted into the middleware chain and handles sessions for your application.

    The Solution

    So we came up with a MultiSessionStore that delegates to multiple session stores based on the PATH_INFO. Shopify still has only one session store handling all of its sessions, but if the request comes in under the /admin path we’ll use the secure cookie, and if it comes in under another path we’ll use the unsecured cookie.

    Here is our implementation in its entirety: https://gist.github.com/704099
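As a stripped-down sketch of the delegation idea (the gist linked above has the real implementation; the lambdas below stand in for actual session middlewares already wrapping the app):

```ruby
# Pick a session store based on the request path and delegate to it.
class MultiSessionStore
  def initialize(stores, default)
    @stores  = stores   # e.g. { %r{\A/admin} => secure_store }
    @default = default
  end

  def call(env)
    path = env['PATH_INFO'].to_s
    _, store = @stores.find { |pattern, _| path =~ pattern }
    (store || @default).call(env)
  end
end

secure   = ->(env) { [200, {}, ['secure cookie']] }
insecure = ->(env) { [200, {}, ['plain cookie']] }

app = MultiSessionStore.new({ %r{\A/admin} => secure }, insecure)
app.call('PATH_INFO' => '/admin/orders')  # handled by the secure store
```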

    This last step, the secured cookie, ensures that session cookie data is never available for hijacking.

    Shopify's path to Rails 3

    The TL;DR version

    Shopify recently upgraded to Rails 3!

    We saw minor improvements in overall response times but what we’re most happy with is the new API – it means we get to write cleaner code and get features out faster.

    However, this upgrade wasn’t trivial – as one of the largest and oldest Rails apps around, the adventure involved jumping through a few hoops. Here’s what we did and what you might consider if you’ve got an established Rails app that you’re thinking of upgrading.

    First, some numbers

    The first svn check-in to Shopify was on the release date of Rails 0.5. That was in July of 2004, six years ago, which according to @tobi is “roughly 65 years in internet time”.

    At that time Shopify had only two active developers. Today it has eleven full time devs working on it.

    The Shopify codebase has over 300 files in the app/models directory, over 130 controllers, and almost 100 gem dependencies.
    $ find app/models/ -type f | wc -l
         327
    $ find app/controllers/ -type f | wc -l
         131
    $ bundle show | wc  -l
          95
    

    Over the past 6 years Shopify has been under constant development, amassing nearly 12000 commits. This makes Shopify one of the oldest, most active Rails projects in existence.

    Our process

    There are many Rails 3 upgrade guides out there, but we didn’t try to follow any of them. We focused on doing as much as we could ahead of time to prepare for Rails 3, and then giving one big final push when 3.0 final was released.

    When upgrading a large app to a major release like this we found there are some things you can do to prepare yourself, but at a certain point you’ve just got to bite the bullet and make the final push to get things working.

    Bundler

    Shopify had been using Bundler in production for 9 months before making the move to Rails 3. Like most, we weren’t convinced of its utility at first, but as the code got more stable we saw how much it helped with deployments and managing development environments. We think Bundler was absolutely the right choice for managing dependencies.

    It was pretty painless to use Bundler with Rails 2.3.x, the Bundler documentation has everything that is needed. We’d definitely recommend doing this step ahead of time as it removes one more obstacle in the Rails 3 migration.

    XSS

    This was a big one. Some more numbers: Shopify has about 100 helper modules and 130 views. The task of updating all of our views/helpers for the new ‘safe by default’ XSS behaviour was a separate migration all its own. This too, we completed a few months before the release of 3.0.

    There was no secret way to go about this, just the obvious back-breaking way. Here’s the basic process I followed:

    1. Run the functional tests. Fix any issues that show up there.
    2. Boot up Shopify in my development environment and click around, fixing any issues I see there.
    3. Manually scan through all of the modules in app/helpers, looking for anything suspicious.
    4. Deploy the code to our staging server. Have the team try it out and report any errors to a shared Google spreadsheet (great for collaborative editing).
    5. Code review.
    6. Deploy the code to production and hope that no issues slipped through.

    N.B. When new issues come in, do your best to use ack (or some other project search tool) to find any instances of that issue in other views/helpers and correct those as well.

    The rest

    After getting Bundler and XSS out of the way, the rest of the migration was done as one large chunk. Some of the work in upgrading to Rails 3 was actually going on in parallel to the XSS work.

    The first commit to our rails3 branch was made back in February when the first Rails 3 beta was released. At that point we didn’t know how much work it would be to get Shopify running on Rails 3. We were excited about the launch of the beta and the prospect of getting Shopify using it soon.

    After a few days of work we ran into some major blockers that were keeping the app from functioning. Work on the rails3 branch was abandoned for 5 months while the 3.0 release became more stable. When the first release candidate came out in July, we resurrected the rails3 branch.

    From then (mid-July) until mid-October the rails3 branch saw pretty constant action, never going more than a few days without a commit. There was a lull during the XSS migration, and another as devs took on other projects alongside it. We remained mindful of the fact that 3.0 final wasn’t yet released and didn’t want to put our changes into production until we had the confidence of that final release.

    Since this whole process took several months there was a lot of activity going on in the master branch at the same time. The only advice to offer is merge early and merge often.

    When the final release came out we once again underestimated how much work would be involved in getting Shopify the rest of the way onto Rails 3. The day it was released, @tobi put something like the following into our Campfire room: “Let’s get Shopify running on Rails 3! Any devs who want to help join the Meeting Room [campfire room].” It was another few weeks before everything was finished.

    Major stumbling blocks

    Routes

    Shopify also has lots of routes.

    $ rake routes | wc -l
         846
    

    At the beginning of the upgrade process we used the routes rake task that comes with the rails_upgrade plugin, but we were still plagued by missing routes throughout the upgrade.

    Although our routes file tripled in size, the increase was worth it because the new routing API is much nicer to work with.

    The old
    map.namespace :admin do |admin|
      admin.resources :products,
        :collection => { :inventory => :get, :count => :get },
        :member => { :duplicate => :post,
                     :sort => :post,
                     :reorganize => :any,
                     :update_published_status => :post } do |products|
        products.resources :variants, :controller => "product_variants",
          :collection => { :reorder => :post, :set => :post, :count => :get }
      end
    end
    
    The new
    namespace :admin do
      resources :products do
        collection do
          get :count
          get :inventory
        end 
    
        member do
          post :sort
          post :duplicate
          post :update_published_status
          match :reorganize
        end 
    
        resources :variants, :controller => 'product_variants' do
          collection do
            get :count
            post :set
            post :reorder
          end 
        end 
      end
    end
    

    Libraries

    Like everyone else we were tripped up by libraries in need of upgrades for Rails 3 compliance. There was a lot less of this than you’d expect because Shopify implements so much of what it needs internally. Lots of code in Rails core began in Shopify’s code base.

    There were updates required to the plugins that Shopify maintains. Otherwise, when we found issues with libraries we were happy to discover that other maintainers had been diligent and already pushed fixes for Rails 3 compatibility; it was just a matter of updating the library versions we were tracking.

    helper :all

    helper(:all) was a configuration option in Rails 2.x. You could add it to a controller and that controller would have access to all of the helper modules defined in your application. In 2.x this was part of the default Rails template, but it could be removed by users who didn’t want it.

    In Rails 3.0 this has been moved into ActionController::Base and it can no longer be turned off. This can create very weird behaviour like the following: https://gist.github.com/517669
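    The collision is easy to reproduce in plain Ruby, since Rails helpers are just modules mixed into the view. In this hypothetical pair of helpers (the names and methods are invented for illustration), whichever module is included last silently wins:

```ruby
# Two hypothetical helpers that both define `title`. With helper(:all),
# every view gets both modules, and Ruby's include order decides which
# definition wins -- silently.
module ProductsHelper
  def title
    "Products"
  end
end

module OrdersHelper
  def title
    "Orders"
  end
end

class ProductsView
  include ProductsHelper
  include OrdersHelper  # included later, so its #title shadows ProductsHelper#title
end
```

    A view that intended to call ProductsHelper#title now gets “Orders” back, which is exactly the kind of surprise the gist above demonstrates.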

    This was causing issues for us since a lot of our helpers define methods with the same name. We ended up submitting a patch to Rails that let us continue using helpers with the default naming scheme. The fix is to call the clear_helpers method in your ApplicationController:

    class ApplicationController < ActionController::Base
      clear_helpers
      ...
    end

    External services

    Shopify integrates with a myriad of external services. Payment gateways through ActiveMerchant, fulfillment services through ActiveFulfillment, shipping providers through ActiveShipping, product search engines, Google Analytics, Google Checkout, the list goes on.

    Ensuring that these integrations continued working was very important for us and we would have had issues had we not thoroughly tested them. Don’t overlook this step.
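    As a sketch of that testing step, here is a tiny smoke-check harness in Ruby; the idea is to run one cheap call against each integration’s sandbox before cutover. The harness and the check names are invented for illustration, not Shopify’s actual test suite:

```ruby
# Run each integration check and collect failures instead of stopping at
# the first one, so a single staging pass surfaces every broken service.
def run_smoke_checks(checks)
  failures = {}
  checks.each do |name, check|
    begin
      failures[name] = "returned falsy" unless check.call
    rescue => e
      failures[name] = e.message
    end
  end
  failures
end
```

    Each check would be a lambda that hits a sandbox endpoint (a test purchase, a rate quote, a tracking lookup) and returns truthy on success.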

    Looking ahead

    Towards the end of the upgrade we (jokingly) asked ourselves if it was really worthwhile to upgrade to Rails 3. After all, we were doing just fine with Rails 2.x, and upgrading to 3.0 was not trivial.

    To give you an idea of how much code was changed, here’s the diffstat from GitHub:

    But we soon came to realize that there are a lot of exciting things coming in future releases in the 3.x series and this is the way forward. We’re really excited about getting to use stuff like Arel 2.0, Automatic Flushing, Identity Map, and lots of other goodies.

    The Rails project and its surrounding ecosystem are moving ahead quickly. By staying on top of it, we can provide the best tools for our developers and the best experience for our customers.

    Continue reading

    ActiveMerchant version 1.9 released

    A little bit of background history

    As some of you may know, quite a while ago Shopify extracted all of its payment gateway code into the open source project ActiveMerchant. Since then the project has evolved into one of the most successful Ruby libraries, with over 400 “forks” (meaning that other developers customized the code to their needs and added functionality as required).

    Whenever developers think their changes would be a contribution to the official project (e.g. by adding support for a new payment gateway), they send out a so-called “pull request”. After we review the implementation, we usually merge their changes into ActiveMerchant for everyone to use, meaning all Shopify customers and every programmer using the ActiveMerchant library will benefit from the new updates.
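    To show what such a contribution looks like in shape, here is a much-simplified, self-contained sketch of the gateway contract. A real gateway subclasses ActiveMerchant::Billing::Gateway, posts to the provider’s API, and returns ActiveMerchant::Billing::Response objects; all class names and behaviour below are illustrative only:

```ruby
# Simplified sketch of the ActiveMerchant gateway contract; not the real
# implementation. Amounts are in cents, per ActiveMerchant convention.
class Response
  attr_reader :message, :authorization

  def initialize(success, message, authorization: nil)
    @success, @message, @authorization = success, message, authorization
  end

  def success?
    @success
  end
end

class ExampleGateway
  def purchase(money, card_number)
    return Response.new(false, "invalid card") unless luhn_valid?(card_number)
    Response.new(true, "approved", authorization: "auth-#{money}")
  end

  private

  # Standard Luhn checksum, the same basic validation ActiveMerchant applies
  # to card numbers before talking to a provider.
  def luhn_valid?(number)
    digits = number.to_s.chars.map(&:to_i).reverse
    sum = digits.each_with_index.sum do |d, i|
      i.odd? ? (d * 2).digits.sum : d
    end
    (sum % 10).zero?
  end
end
```

    A new gateway contribution mostly consists of implementing purchase (and friends like authorize, capture, refund) against the provider’s wire format while keeping this common interface intact.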

    Exciting news

    We have been digging around a lot lately for interesting changes to the project and decided to pull some of the bigger ones into the official repository. This resulted in the release of version 1.8.0 last month, which added two new gateways.

    Since then we found even more interesting contributions that we decided to merge into the official project and we also developed two offsite integrations internally. The result is that ActiveMerchant (and thus Shopify) now supports seven additional payment gateways for merchants from various countries around the world:

    The seventh new gateway is SagePay Form, an offsite alternative to our existing SagePay implementation, in order to give merchants in the United Kingdom and Ireland the option of using 3D Secure for transactions. 3D Secure is required for certain U.K. credit card brands.

    Open Source is awesome-sauce

    That brings the number of supported gateways in ActiveMerchant to an impressive total of 63. This would not have been possible without the help of an international community, so huge thanks go out to all the contributors who helped ActiveMerchant spread around the world!

    If you are aware of any gateway implementations that should make it into the official ActiveMerchant gem let us know and we’ll be happy to review them.

    Continue reading
