About a year ago, my friend and I wanted to transition from backend engineering to machine learning engineering. We tried to solve some popular Kaggle problems, like credit card fraud detection, and found ourselves writing a lot more glue code than machine learning code. Most of our time was spent figuring out how to process data into feature columns, writing lots of TensorFlow boilerplate code, and configuring infrastructure to deploy our models as JSON APIs.

We expected there to be a framework like Rails that would help us build our first machine learning application, but couldn’t find anything that came close, so we decided to build Cortex. The parallels are more philosophical than technical, since Cortex is not an MVC framework, but a lot of the same principles apply. Our goal is to create a framework that makes building a machine learning application as simple as Rails makes building a CRUD application. After a year of development, I thought it would be interesting to cross-reference Cortex with the Rails Doctrine:

Optimize for programmer happiness

Rails has been designed with a similar principle to the Principle of Least Surprise (to Matz). The Principle of The Bigger Smile (of DHH), which is just what it says on the tin: APIs designed with great attention paid to whatever would make me smile more and broader. (rubyonrails.org/doctrine)

When we started building Cortex, our core metric was Mean Time to Predictions (MTTP). In other words, we wanted to go from dataset to prediction API as quickly as possible. We started by optimizing MTTP for beginner datasets (i.e. clean and structured) and continued adding functionality to minimize MTTP for more advanced datasets (i.e. large and imbalanced).

Our main effort with Cortex is to hide the complexity of machine learning pipelines behind a simple API while still giving developers as much freedom as possible to build any application. We also wanted the API to be self-explanatory in order to minimize round trips to the documentation. Below is the Cortex configuration that ingests a column from the dataset and validates that all values are integers between 0 and 10 with no missing values:

- kind: raw_column
  name: my_column
  type: INT_COLUMN
  min: 0
  max: 10
  required: true

And this is all it takes to deploy 3 replicas of a prediction API:

- kind: api
  name: classifier
  model_name: dnn
  compute:
    replicas: 3

If this reminds you of Kubernetes, that’s not an accident. We drew a lot of inspiration from their design, but that’s a topic for another post.

Convention over Configuration

Part of Rails’ mission is to swing its machete at the thick, and ever growing, jungle of recurring decisions that face developers creating information systems for the web. There are thousands of such decisions that just need to be made once, and if someone else can do it for you, all the better. (rubyonrails.org/doctrine)

Building end-to-end machine learning pipelines is hard. There are many moving parts, and conventions can significantly streamline development. For example, one of the first challenges we encountered was transforming prediction request payloads in the same way the training data was transformed. Cortex handles this automatically, even with transformations that depend on preprocessing a full column to compute an aggregate value (e.g. the mean of a numeric column).
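
To make the problem concrete, here is a minimal sketch in plain Python of what staying consistent requires (this illustrates the idea, not Cortex’s internals; the column name and values are hypothetical): an aggregate computed over the full training column has to be stored and reapplied to every prediction payload.

training_values = [2.0, 4.0, 6.0]  # hypothetical values for my_column

# training time: preprocess the full column to compute the aggregate
column_mean = sum(training_values) / len(training_values)
training_features = [v - column_mean for v in training_values]

# serving time: the exact same aggregate must be applied to each payload
def transform_payload(payload):
    return payload["my_column"] - column_mean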

Developers should be able to use the framework without spending hours reading documentation. It should be simple to deploy an application without understanding all of the features, and most configuration should be optional, with sensible defaults. For example, model training should default to an 80/20 training/evaluation split, which is reasonable for most applications and can be adjusted if necessary.
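
As an illustration, here’s what that convention might look like in plain Python (hypothetical code, not Cortex’s API):

def split_dataset(samples, training_fraction=0.8):
    # the 80/20 convention by default, adjustable when needed
    cutoff = int(len(samples) * training_fraction)
    return samples[:cutoff], samples[cutoff:]

train_set, eval_set = split_dataset(list(range(100)))       # 80/20 by default
train_set, eval_set = split_dataset(list(range(100)), 0.9)  # adjusted to 90/10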

The menu is omakase

For programming, the benefits of this practice, letting others assemble your stack, is similar to those we derive from Convention over Configuration, but at a higher level. Where CoC is occupied with how we best use individual frameworks, omakase is concerned with which frameworks, and how they fit together. (rubyonrails.org/doctrine)

There are many open source tools and libraries that can be used to build machine learning pipelines. One of the goals of Cortex is to assemble a stack that fits together in an intuitive way while being flexible enough for a wide variety of applications. The primary components of our stack are Python, TensorFlow, TensorFlow Serving, Spark, Docker, Kubernetes, and Argo. We believe that most developers are better off not having to make all of these decisions by themselves.

Our choices were driven by a focus on designing an API that is simple and familiar, while enabling scale through distributed workloads. Some decisions were obvious, while others required research. Not supporting PyTorch or Apache Beam is not an oversight, but a reaffirmation of our focus on providing a better development experience by tightly integrating a narrow stack.

The most difficult decisions tend to revolve not around what to include, but around what to leave out. Notebooks are a notable absence from our stack because we found them to add unnecessary overhead.

No one paradigm

[Rails] isn’t a single, perfect cut of cloth. It’s a quilt. A composite of many different ideas and even paradigms. Many that would usually be seen in conflict, if contrasted alone and one by one. But that’s not what we’re trying to do. It isn’t a single championship of superior ideas where a sole winner must be declared. (rubyonrails.org/doctrine)

Cortex aims to make it simple to use popular machine learning tools and libraries effectively, without overriding their APIs. Like Rails, Cortex embraces multiple paradigms, namely TensorFlow and PySpark.

Both TensorFlow and PySpark can be used to transform data in preparation for training. Spark transformations are generally more efficient on large-scale datasets, but TensorFlow transformations may be a better fit for certain applications.
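
As a rough illustration, here is the same normalization expressed in both paradigms (a sketch, not Cortex’s API; the dataset file and column names are hypothetical):

from pyspark.sql import SparkSession, functions as F
import tensorflow as tf

spark = SparkSession.builder.getOrCreate()
df = spark.read.csv("iris.csv", header=True, inferSchema=True)  # hypothetical dataset

# PySpark: aggregate and transform the full column as a distributed job
stats = df.agg(F.mean("sepal_length").alias("mean"),
               F.stddev("sepal_length").alias("std")).first()
df = df.withColumn("sepal_length_normalized",
                   (F.col("sepal_length") - stats["mean"]) / stats["std"])

# TensorFlow: the same transformation attached to the model's input,
# so it also runs inside the graph at serving time
column = tf.feature_column.numeric_column(
    "sepal_length",
    normalizer_fn=lambda x: (x - stats["mean"]) / stats["std"])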

We encourage beginners to gain experience with these tools in an iterative fashion by making it as easy as possible to get a Hello World application up and running quickly.

Exalt beautiful code

We write code not just to be understood by the computer or other programmers, but to bask in the warm glow of beauty. Aesthetically pleasing code is a value unto itself and should be pursued with vigor. That doesn’t mean that beautiful code always trumps other concerns, but it should have a full seat at the table of priorities. (rubyonrails.org/doctrine)

TensorFlow Estimators are beautiful. While the code may not be as poetic as some Ruby expressions, its power is immense. This is all the TensorFlow you need to build a neural network that classifies iris flowers based on 4 attributes:

import tensorflow as tf


def create_estimator(run_config, model_config):
    # the four normalized iris measurements produced upstream in the pipeline
    feature_columns = [
        tf.feature_column.numeric_column("sepal_length_normalized"),
        tf.feature_column.numeric_column("sepal_width_normalized"),
        tf.feature_column.numeric_column("petal_length_normalized"),
        tf.feature_column.numeric_column("petal_width_normalized"),
    ]

    # a small feed-forward network that classifies the three iris species
    return tf.estimator.DNNClassifier(
        feature_columns=feature_columns,
        hidden_units=[8, 4],
        n_classes=3,
        config=run_config,
    )

Estimators help with model training, evaluation, and exporting. We built Cortex around Estimators because they enable us to provide a lot of automation around the execution of training workloads without requiring developers to modify their code. Moreover, Estimators are simple to share, which further reduces the barrier to entry for new developers.
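
For instance, here is roughly how an Estimator plugs into TensorFlow’s standard training machinery (a sketch using random stand-in data; a real pipeline would supply the input functions and configs):

import numpy as np
import tensorflow as tf

# hypothetical in-memory data standing in for the real input pipeline
x = {name: np.random.rand(120).astype(np.float32) for name in [
    "sepal_length_normalized", "sepal_width_normalized",
    "petal_length_normalized", "petal_width_normalized"]}
y = np.random.randint(0, 3, 120)

train_input_fn = tf.estimator.inputs.numpy_input_fn(x, y, shuffle=True, num_epochs=None)
eval_input_fn = tf.estimator.inputs.numpy_input_fn(x, y, shuffle=False)

estimator = create_estimator(tf.estimator.RunConfig(), model_config=None)

train_spec = tf.estimator.TrainSpec(input_fn=train_input_fn, max_steps=1000)
eval_spec = tf.estimator.EvalSpec(input_fn=eval_input_fn)

# one call handles training, periodic evaluation, and checkpointing
tf.estimator.train_and_evaluate(estimator, train_spec, eval_spec)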

Provide sharp knives

Ruby on Rails is an environment for chefs and those who wish to become chefs. You might start out doing the dishes, but you can work your way up to running the kitchen. Don’t let anyone tell you that you can’t be trusted with the best tool in the trade as part of that journey. (rubyonrails.org/doctrine)

Deep learning is complicated under the hood, but that doesn’t mean that developers shouldn’t be trusted with it. It is certainly possible to build useful applications without understanding softmax activation functions or the mathematics of backpropagation.

Once a machine learning application is running end-to-end, it becomes a lot less intimidating to dive deeper into writing custom TensorFlow Estimators or building more sophisticated pipelines. Beginners can use Cortex without understanding anything but the basics of machine learning, and advanced practitioners can customize almost every piece of the pipeline.
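
For example, the canned DNNClassifier above can be swapped for a custom model_fn once you want full control. This is a minimal sketch along the lines of TensorFlow’s custom Estimator guide, assuming the same normalized feature columns as before:

import tensorflow as tf


def model_fn(features, labels, mode, params):
    # a custom two-layer network over the same normalized features
    net = tf.feature_column.input_layer(features, params["feature_columns"])
    net = tf.layers.dense(net, 8, activation=tf.nn.relu)
    logits = tf.layers.dense(net, params["n_classes"])

    if mode == tf.estimator.ModeKeys.PREDICT:
        predictions = {"class": tf.argmax(logits, axis=1)}
        return tf.estimator.EstimatorSpec(mode, predictions=predictions)

    loss = tf.losses.sparse_softmax_cross_entropy(labels=labels, logits=logits)

    if mode == tf.estimator.ModeKeys.EVAL:
        return tf.estimator.EstimatorSpec(mode, loss=loss)

    # Adagrad is also DNNClassifier's default optimizer
    optimizer = tf.train.AdagradOptimizer(learning_rate=0.1)
    train_op = optimizer.minimize(loss, global_step=tf.train.get_global_step())
    return tf.estimator.EstimatorSpec(mode, loss=loss, train_op=train_op)


feature_columns = [
    tf.feature_column.numeric_column(name) for name in [
        "sepal_length_normalized", "sepal_width_normalized",
        "petal_length_normalized", "petal_width_normalized"]]

estimator = tf.estimator.Estimator(
    model_fn=model_fn,
    params={"feature_columns": feature_columns, "n_classes": 3})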

Value integrated systems

Rails specifically seeks to equip generalist individuals to make these full systems. Its purpose is not to segregate specialists into small niches and then require whole teams of such in order to build anything of enduring value. (rubyonrails.org/doctrine)

It may be tempting to focus on one part of the machine learning stack (e.g. infrastructure for serving natural language models that optimizes inference latency, or infrastructure for highly distributed model training). While these capabilities are critical for some applications, most applications don’t require peak computing performance. It’s more important to enable an efficient and reliable end-to-end workflow, because handoffs across siloed systems can have a devastating impact on productivity. Data preparation, model training, and prediction serving should be abstracted in such a way that a developer has a unified programming model across the entire pipeline.

An integrated system empowers developers to build applications faster. If the framework focused exclusively on prediction serving, developers would still have to cobble together tooling to prepare data and train models. Alternatively, a framework focused on data preparation would speed up model development, but increase the friction of deploying the model as an API. Applications that run end-to-end provide value sooner and can be improved iteratively.

Progress over stability

Likewise, it’s why it’s so important for us to continue to welcome and encourage new members of the community. We need fresh blood and fresh ideas to make better progress. (rubyonrails.org/doctrine)

We believe machine learning is here to stay, but beyond that, not much is certain. Deep learning has recently been gaining traction in industry, and reinforcement learning is already on the horizon. Moreover, increasingly powerful data streaming engines are beginning to change how we build data infrastructure.

The machine learning engineering field is in its early stages and, like web development, will continue to evolve rapidly. Cortex is designed to adapt, and it depends on community involvement to ensure that it continues to meet the needs of developers over time.

Push up a big tent

You never know when the next person who starts just fixing a misspelling in the documentation ends up implementing the next great feature. But you stand a chance to find out if you smile and say thank you for whatever small contribution that gets the motivation flowing. (rubyonrails.org/doctrine)

Cortex is open source, and we love feedback and contributions. We are investing in lowering the barrier to entry for building machine learning pipelines so that it becomes accessible to any software developer. That means we spend a lot of time on our documentation, examples, and APIs. We are also investing in abstractions that will make it easier for more advanced users to share custom PySpark code for data transformations and custom TensorFlow Estimators for model training.

We’ve already benefited from a diversity of perspectives. Our current team is composed of engineers with different backgrounds, including DevOps, data engineering, machine learning research, and web development. We believe machine learning engineering is a movement that all developers can be part of.