Shopify’s Unique Data Science Hierarchy Of Needs
You’ve probably seen the “Data Science Hierarchy of Needs” (pictured below) before. Inspired by Maslow, it shows the tooling you would use at different levels in data science—from logging and user-generated content at the bottom, to AI and deep learning at the very top.
While hierarchies like this one can serve as helpful guides, at Shopify, we don’t think they always capture the whole picture. For one, this hierarchy emphasizes particular tools over finding the best solution to a given problem. It also tends to prioritize more “advanced” solutions when a simple one would do.
That’s why we’ve chosen to take a different approach. We’ve created our own Data Science Hierarchy of Needs to reflect the various ways we as a data team create impact, not only for Shopify, but also for our merchants and their customers. In our version, each level of the hierarchy represents a different way we deliver value—not better or worse, just different.
Our philosophy is much more tool-agnostic, and it emphasizes trying simple solutions before jumping to more advanced ones. This enables us to make an impact faster, then iterate with more complex solutions if necessary. We see the pinnacle of data science not as machine learning or AI, but as the impact we’re able to have, whatever technology we use. Above all, we focus on helping Shopify and our merchants make great decisions, no matter how we get there.
Below, we’ll walk you through our Data Science Hierarchy of Needs and show you how our tool-agnostic philosophy was the key to navigating the unprecedented COVID-19 pandemic for Shopify and our merchants.
Tackling The Pandemic With Data
During the pandemic, we depended on our data to give us a clear lens into what was happening, how our merchants were coping, and what we could do to support them. Our COVID-19 impact analysis—a project we launched to understand the impact of the pandemic and support our merchants—is a great example of how our Data Science Hierarchy of Needs works.
For context—at Shopify, data scientists are embedded in different business units and product teams. When the pandemic hit, we were able to quickly launch a task force with data science representatives from each area of the business. The role of these data scientists was to surface important insights about the effects of the pandemic on our merchants and help us make timely, data-informed decisions to support them.
At every step of the way, we relied on our Data Science Hierarchy of Needs to support our efforts. With the foundations we had built, we were able to quickly ship insights to all of Shopify that were used to inform decisions on how we could best help our merchants navigate these challenging times. Let’s break it down.
1. Collecting And Modeling Data To Create A Strong Foundation
The base of our hierarchy is all about building a strong foundation that we can use to support our efforts as we move up the pyramid. After all, we can’t build advanced machine learning models or provide insightful and impactful analysis if we don’t have the data accessible in a clean and conformed manner.
Activities At The Collect & Model Level
- Data generation
- Data platform
- Acquisition
- Pipeline build
- Data modeling
- Data cleansing
At Shopify, we follow the Dimensional Modeling methodology developed by Ralph Kimball (a specific set of rules around organizing data in our data warehouse) to ensure all of our data is structured consistently. Since our team is familiar with how things are structured in the foundation, it’s easy for them to interact with the data and start using the tools at higher levels in the pyramid to analyze it.
It’s important to note that even though these foundational practices, by necessity, precede activities at the higher levels, they’re not “less than”: they are critical to everything we do as data scientists. Having this groundwork in place was absolutely critical to the success of our COVID-19 impact analysis. We weren’t scrambling to find data; it was already clean, structured, and ready to go. Knowing that we had put in the effort to collect data the right way also gave us the security that we could trust the insights that came out during our analysis.
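To make the dimensional modeling idea concrete, here is a minimal sketch of a star schema: one fact table of measurable events joined to one dimension table of descriptive attributes. The table names, columns, and rows are hypothetical and invented for illustration; they are not Shopify's actual warehouse schema. An in-memory SQLite database stands in for the warehouse.

```python
import sqlite3


def build_demo_warehouse() -> sqlite3.Connection:
    """Create a tiny in-memory star schema with hypothetical data."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        -- Dimension: descriptive attributes, one row per merchant.
        CREATE TABLE dim_merchant (
            merchant_key INTEGER PRIMARY KEY,
            country      TEXT NOT NULL
        );
        -- Fact: one row per order, pointing at its dimension row.
        CREATE TABLE fact_order (
            order_key    INTEGER PRIMARY KEY,
            merchant_key INTEGER NOT NULL REFERENCES dim_merchant (merchant_key),
            order_date   TEXT NOT NULL,
            amount_usd   REAL NOT NULL
        );
        INSERT INTO dim_merchant VALUES (1, 'CA'), (2, 'IT'), (3, 'IT');
        INSERT INTO fact_order VALUES
            (1, 1, '2020-03-01', 40.0),
            (2, 2, '2020-03-01', 25.0),
            (3, 3, '2020-03-02', 35.0),
            (4, 1, '2020-03-02', 10.0);
        """
    )
    return conn


def sales_by_country(conn: sqlite3.Connection) -> dict:
    """Aggregate the fact table by an attribute taken from the dimension."""
    rows = conn.execute(
        """
        SELECT d.country, SUM(f.amount_usd)
        FROM fact_order AS f
        JOIN dim_merchant AS d USING (merchant_key)
        GROUP BY d.country
        ORDER BY d.country
        """
    ).fetchall()
    return dict(rows)


if __name__ == "__main__":
    print(sales_by_country(build_demo_warehouse()))
```

Because every fact table joins to its dimensions the same way, a question such as "sales by country" takes the same shape regardless of which business process the facts describe.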
2. Describing The Data To Gain A Baseline Understanding Of The Business
This next level of the hierarchy is about leveraging the data we’ve collected to describe what we observe happening within Shopify. With a strong foundation in place, we’re able to report metrics and answer questions about our business. For instance, for every product we release, we create associated dashboards to help understand how well the product is meeting merchants’ needs.
At this phase, we’re able to start asking key questions about our data. These might be things like: What was the adoption of product X over the last three months? How many products do merchants add in their first week on the platform? How many buyers viewed our merchants’ storefronts? The answers to these questions offer us insight into particular business processes, which can help illuminate the steps we should take next—or, they might establish the building blocks for more complex analysis (as outlined in steps three and four). For instance, if we see that the adoption of product X was a success, we might ask, Why? What can we learn from it? What elements of the product launch can we repeat for next time?
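As an illustration of a describe-level metric, the sketch below answers one of the questions above: how many products do merchants add in their first week? The data layout, merchant identifiers, and dates are assumptions made up for this example, and "first week" is taken to mean the signup day plus the following six days.

```python
from datetime import date, timedelta
from statistics import mean


def products_in_first_week(signup_dates, product_events):
    """Count products each merchant added within 7 days of signing up."""
    counts = {merchant_id: 0 for merchant_id in signup_dates}
    for merchant_id, added_on in product_events:
        signup = signup_dates.get(merchant_id)
        if signup is None:
            continue  # event for a merchant with no signup record
        # First week = the signup day plus the following six days.
        if signup <= added_on < signup + timedelta(days=7):
            counts[merchant_id] += 1
    return counts


# Hypothetical sample data, invented for this example.
SIGNUPS = {"merchant_a": date(2020, 3, 1), "merchant_b": date(2020, 3, 5)}
EVENTS = [
    ("merchant_a", date(2020, 3, 1)),
    ("merchant_a", date(2020, 3, 7)),
    ("merchant_a", date(2020, 3, 8)),   # eighth day, outside the window
    ("merchant_b", date(2020, 3, 6)),
    ("merchant_b", date(2020, 3, 20)),  # well outside the window
]

if __name__ == "__main__":
    counts = products_in_first_week(SIGNUPS, EVENTS)
    print(counts)
    print(mean(counts.values()))
```

A simple count like this is often enough to populate a dashboard, and it can later feed the more complex analysis described in steps three and four.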
Activities At The Describe Level
- Reporting/dashboards
- Metrics/KPIs
- Segmentation
- Product analytics
- Exploratory data analysis
During our COVID-19 impact analysis, we were interested in discovering how the pandemic was affecting Shopify and our merchants’ businesses: What does COVID-19 mean for our merchants’ sales? Are they being affected in a positive or negative way, and why? This allowed us to establish a baseline understanding of the situation. While for some projects it might have been possible to stop the analysis here, we needed to go deeper—to be able to predict what might happen next and take the right actions to support our merchants.
3. Predicting And Inferring The Answers To Deeper Questions With More Advanced Analytical Techniques
At this level, the problems start to become more complex. With a strong foundation and clear ability to describe our business, we can start to look forward and offer predictions or inferences as to what we think may happen in the future. We also have the opportunity to start applying more specialized skills to seek out the answers to our questions.
Activities At The Predict / Infer Level
- Statistics
- Causal inference
- Deep dives
- Machine learning—including predictive analytics, anomaly detection, and classification
These questions might include things like: What do we think sales will be like in the future? What do we think caused the adoption of a particular product? Once we have the answers, we can start to explain why certain things are happening—giving us a much clearer picture of our business. We’re also able to start making predictions about what is likely to happen next.
Circling back to our COVID-19 impact analysis, we investigated what was happening globally and conducted statistical analysis to predict how the different regions we serve might be affected. For example, we asked: Based on what we see happening to our merchants in Italy as they enter lockdown, what can we predict will happen in the U.S. if it were to do the same? Once we had a good idea of what we thought might happen, we were able to move on to the next level of the pyramid and decide what we wanted to do about it.
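The source does not describe the statistical methods used, so the following is only a deliberately naive sketch of the general idea of carrying one region's post-lockdown pattern over to another. It indexes a region's weekly sales against its own pre-lockdown average, then applies that index to a second region's baseline. All figures and function names are hypothetical, and a real analysis would need to account for seasonality, differences between regions, and uncertainty.

```python
from statistics import mean


def lockdown_index(weekly_sales, lockdown_week):
    """Post-lockdown sales as a ratio of the pre-lockdown weekly average.

    weekly_sales: list of weekly totals in chronological order.
    lockdown_week: position in the list of the first week under lockdown.
    """
    if lockdown_week < 1:
        raise ValueError("need at least one pre-lockdown week for a baseline")
    baseline = mean(weekly_sales[:lockdown_week])
    return [week / baseline for week in weekly_sales[lockdown_week:]]


def project(baseline, index):
    """Apply another region's index to this region's baseline."""
    return [baseline * ratio for ratio in index]


# Hypothetical weekly sales for a reference region (arbitrary units).
REFERENCE_SALES = [100, 100, 80, 90, 120]
REFERENCE_LOCKDOWN_WEEK = 2  # the third entry is the first lockdown week

if __name__ == "__main__":
    index = lockdown_index(REFERENCE_SALES, REFERENCE_LOCKDOWN_WEEK)
    # 500 is a made-up pre-lockdown weekly baseline for the target region.
    print(project(500, index))
```

Even a rough projection like this can frame the conversation about what to do next, which is the subject of the following level.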
4. Using Insights To Prescribe Action
At this level, we’re able to take everything from the underlying levels of the hierarchy to start forming opinions about what we should do as a business based on the information we’ve gathered. Within Shopify, this means offering concrete recommendations internally, as well as providing guidance to our merchants.
Activities At The Prescribe Level
- Analytics/analysis
- Machine learning
- A/B testing and experimentation
- Deep dives
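For the A/B testing activity listed above, one common way to compare two variants is a two-proportion z-test on conversion counts. The sketch below is a generic textbook version using only the standard library; it is not presented as Shopify's experimentation method, and the function name and sample counts are hypothetical.

```python
from math import erfc, sqrt


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the difference in conversion rates.

    conv_a, conv_b: number of conversions in groups A and B.
    n_a, n_b: number of users exposed in groups A and B.
    """
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both groups convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if std_err == 0:
        raise ValueError("no variation in outcomes; the test is undefined")
    z = (rate_b - rate_a) / std_err
    # Two-sided tail probability of a standard normal: 2 * (1 - Phi(|z|)).
    p_value = erfc(abs(z) / sqrt(2))
    return z, p_value


if __name__ == "__main__":
    # Hypothetical experiment: 100 of 1000 convert in A, 150 of 1000 in B.
    z, p = two_proportion_z_test(100, 1000, 150, 1000)
    print(f"z = {z:.2f}, p = {p:.4f}")
```

A small p-value suggests the observed difference is unlikely to be due to chance alone, which can support a concrete recommendation. The decision itself still depends on effect size and business context.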
When it came to our COVID-19 impact analysis, our research at the lower levels helped provide the insights to pivot our product roadmap and ship products that we knew could support our merchants. For example:
- We observed an increase in businesses coming online due to lockdowns, so we offered an extended 90-day free trial to all new merchants
- Knowing the impact lockdowns would have on businesses financially, we expanded Shopify Capital (our funding program for merchants), then only available in the U.S., to Canada and the UK
- With the increase in online shopping and delays in delivery, we expanded our shipping options, adding local delivery and the option to buy online, pick up in-store
- Observing the trend of consumers looking to support local businesses, we made gift cards available for all Shopify plans and added a new feature to our shopping app, Shop, that made it easier to discover and buy from local merchants
By understanding what was happening in the world and the commerce industry, and how that was impacting our merchants and our business, we were able to take action and create a positive impact—which is what we’ll delve into in our next and final section.
5. Influencing The Direction Of Your Business
This level of the hierarchy is the culmination of the work below and represents all we should strive to achieve in our data science practice. With a strong foundation and a deep understanding of our challenges, we’ve been able to put forward recommendations—and now, as the organization puts our ideas into practice, we start to make an impact.
Activities At The Influence Level
- Analytics
- Machine learning
- Artificial intelligence
- Deep dives
- Whatever it takes!
It’s critical to remember that the most valuable insights don’t necessarily have to come from using the most advanced tools. Any insight can be impactful if it helps us inform a decision, changes the way we view something, or (in our case) helps our merchants.
Our COVID-19 impact analysis didn’t actually involve any artificial intelligence or machine learning, but it nevertheless had wide-reaching positive effects. It helped us support our merchants through a challenging time and ensured that Shopify also continued to thrive. In fact, in 2020, our merchants made a total of $119.6 billion, an increase of 96% over 2019. Our work at all the prior levels ensured that we could make an impact when it mattered most.
Delivering Value At Every Level
In practice, positive influence can occur as a result of output at any level of the hierarchy, not just the very top. The highest level represents something that we should keep in mind as we deliver anything, whether it be a model, tool, data product, report, analysis, or something else entirely. The lower levels of the hierarchy enable deeper levels of inquiry, but this doesn’t make them any less valuable on their own.
Using our Data Science Hierarchy of Needs as a guide, we were able to successfully complete our COVID-19 impact analysis. We used the insights we observed and put them into action to support our merchants at the moment they needed them most, and guided Shopify’s overarching business and product strategies through an unprecedented time.
No matter what level in the hierarchy we’re working at, we ensure we’re always asking ourselves about the impact of our work and how it is enabling positive change for Shopify and our merchants. Our Data Science Hierarchy of Needs isn’t a rigid progression—it’s a mindset.
Phillip Rossi is the Head of Expansion Intelligence at Shopify. He leads the teams responsible for using data to inform decision making for Shopify, our merchants, and our partners at scale.