Internal architecture

The current architecture is split into three main parts: the Ryax micro-services, the external services, and the infrastructure.

Overview

Architecture diagram

We chose the following architecture so that our code can evolve quickly while maintaining good quality.

Micro-services are designed to manage one functional scope only. They do not share common libraries and avoid coupling to other micro-services as much as possible. They follow Domain-Driven Design principles, and more specifically the guidelines from the book Architecture Patterns with Python.

They communicate with each other through the broker, using a specific message for each interaction (data structures are not reused, to avoid coupling). Messages are defined in the Protobuf format, and each definition is owned by the service that produces the message. Because services do not share code, whenever an inter-service message definition changes it has to be copied from the owner’s code base to the consumers.

Services share no data with one another. Thus, if a data schema evolves, the owning service manages its data migration (or backward compatibility) itself.

Interfaces of micro-services

HTTP API

Most of these APIs are accessible by users and documented with OpenAPI. We try to limit the use of these APIs between microservices as it creates a dependency.

gRPC

HTTP suffers from multiple limitations (no heartbeats, uni-directional communication, limit in size, limit in response time). Thus, we use gRPC between runners and the actions.

RabbitMQ+protobuf

It is used for asynchronous events.

To manage these protocols more easily we use this set of rules:

  • Emitters are responsible for the definition of the Protobuf.

  • Emitters define up to the exchange (included).

  • Consumers define from the exchange (included) to their queue.

Thus, we need a way to easily define and share RMQ exchanges (and their parameters) from one microservice to another. For now, the microservice that publishes the message is responsible for the .proto definition. These files are copied directly into the code base of the receiving microservice.
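The ownership rules above can be illustrated with a small in-memory model. This is only a sketch: ToyBroker is a stand-in written for this example and is not a RabbitMQ client, and the exchange and queue names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ToyBroker:
    """In-memory stand-in for a message broker (not a RabbitMQ client)."""

    exchanges: dict[str, str] = field(default_factory=dict)  # name -> type
    bindings: dict[str, list[str]] = field(default_factory=dict)  # exchange -> queues
    queues: dict[str, list[bytes]] = field(default_factory=dict)  # queue -> messages

    def declare_exchange(self, name: str, kind: str) -> None:
        # Declaring twice with the same parameters is harmless; a mismatch is an error.
        if self.exchanges.setdefault(name, kind) != kind:
            raise ValueError(f"exchange {name!r} already declared with another type")
        self.bindings.setdefault(name, [])

    def declare_queue(self, name: str, exchange: str) -> None:
        self.queues.setdefault(name, [])
        if name not in self.bindings[exchange]:
            self.bindings[exchange].append(name)

    def publish(self, exchange: str, payload: bytes) -> None:
        # Deliver a copy of the message to every queue bound to the exchange.
        for queue in self.bindings[exchange]:
            self.queues[queue].append(payload)


# Hypothetical exchange parameters, shared by both sides like the .proto file.
EXCHANGE = ("studio.workflow_deployed", "fanout")


def emitter_setup(broker: ToyBroker) -> None:
    # The emitter defines everything up to the exchange (included).
    broker.declare_exchange(*EXCHANGE)


def consumer_setup(broker: ToyBroker, queue: str) -> None:
    # The consumer defines from the exchange (included) to its own queue.
    broker.declare_exchange(*EXCHANGE)
    broker.declare_queue(queue, EXCHANGE[0])


broker = ToyBroker()
emitter_setup(broker)
consumer_setup(broker, "runner.workflow_deployed")
broker.publish(EXCHANGE[0], b"serialized protobuf bytes")
```

Both sides declare the exchange with identical parameters, which is why those parameters need to be shared between the emitter and its consumers.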

Ryax code: our microservices

A common architecture

These micro-services are internally divided into layers. The Domain layer defines business objects, i.e. objects that also exist for users of the application (“PostgreSQL” is not such an object; however, a “database” might be, as might a “candy” in an app about candies). The Application layer holds the brain of the App. As for animals, the brain can think using abstractions (objects of our Domain layer) but cannot concretely do things by itself. Finally, the Infrastructure layer is the “arms and legs” of our App. It is the part that interacts with the external world (the HTTP API, the database, the filesystem, the servomotors, …).

Domain

The Domain layer represents the model of the services. This is the core of the services and must be able to evolve faster than the other layers. This layer depends on no other layer (following the dependency inversion principle) and imports no external libraries: barring justified exceptions, it is only plain Python code.

A Domain class is a dataclass that defines a business object. Most of the methods of these dataclasses are helpers to manipulate the dataclass state.

Some of these classes are abstract classes that will be implemented by classes that are in the infrastructure layer.

Methods of these classes can return Domain objects, states (“something went wrong”, “no problem here”, “only steps 1 and 3 worked”…), or nothing.

The general rule is to put as much logic as possible there.
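As an illustration of these rules, here is a minimal sketch of a Domain module. The Candy class, its business rule, and the ICandyRepository interface are all hypothetical names chosen for this example; only the standard library is used.

```python
import abc
from dataclasses import dataclass, field
from enum import Enum


class UpdateStatus(Enum):
    OK = "no problem here"
    REJECTED = "something went wrong"


@dataclass
class Candy:
    """Hypothetical business object: plain Python, no external library."""

    name: str
    flavors: list[str] = field(default_factory=list)

    def add_flavor(self, flavor: str) -> UpdateStatus:
        # The business rule lives in the Domain object itself.
        if not flavor or flavor in self.flavors:
            return UpdateStatus.REJECTED
        self.flavors.append(flavor)
        return UpdateStatus.OK


class ICandyRepository(abc.ABC):
    """Abstract class declared in the Domain, implemented in the Infrastructure layer."""

    @abc.abstractmethod
    def save(self, candy: Candy) -> None: ...


candy = Candy(name="caramel")
first = candy.add_flavor("salted")
second = candy.add_flavor("salted")  # duplicate, rejected by the business rule
```

The method returns a state rather than raising, which is one of the return styles listed above; raising a Domain exception would be an equally valid choice.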

Application

The Application layer contains all the services provided by the application using the Domain structures and the Infrastructure as a backend.

These Application services “orchestrate” the Domain structures and the Infrastructure services so that they work together.

Application data should not be modified here; that is the job of the methods of the Domain layer classes. Thus, you might find this kind of code in an Application service:

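from myapp.domain.a_domain_object import ADomainObject

...

class SuperService:
    ...

    def update_me(self, object_id, new_data):
        # Fetch the Domain object through the unit of work.
        obj: ADomainObject = self.unit_of_work.get_object(object_id)
        try:
            # The Domain object applies the business rules and changes its own state.
            obj.update_this(new_data)
        except ...:  # the exception types to handle are elided here
            ...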

As mentioned before, no data is directly modified here. However, we catch exceptions and use object methods to apply the right business rules.

Infrastructure

The Infrastructure layer manages all the interactions with external systems such as databases, file systems, networks, and APIs.

These services act as “wrappers” around external dependencies so that they can be used within the Domain layer. These wrappers should not return data (as it would mean that some data will be managed in the domain, which is forbidden). Most of the time they return nothing, or they return how well the command performed. For example, “something went wrong”, “no problem here”, “only steps 1 and 3 worked”…
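The sketch below shows one way such a wrapper could look: an abstract class on the Domain side, a filesystem implementation on the Infrastructure side that only reports a status, and an Application service that depends on the abstraction. All class and method names are hypothetical.

```python
import abc
import tempfile
from enum import Enum
from pathlib import Path


class WriteStatus(Enum):
    OK = "no problem here"
    FAILED = "something went wrong"


class IReportStore(abc.ABC):
    """Domain side: abstract class, no knowledge of the filesystem."""

    @abc.abstractmethod
    def write(self, name: str, content: str) -> WriteStatus: ...


class FileReportStore(IReportStore):
    """Infrastructure side: wraps the filesystem and reports only a status."""

    def __init__(self, root: Path) -> None:
        self.root = root

    def write(self, name: str, content: str) -> WriteStatus:
        try:
            (self.root / name).write_text(content, encoding="utf-8")
        except OSError:
            return WriteStatus.FAILED
        return WriteStatus.OK


class ReportService:
    """Application side: orchestrates, depending only on the abstraction."""

    def __init__(self, store: IReportStore) -> None:
        self.store = store

    def publish(self, name: str, content: str) -> bool:
        return self.store.write(name, content) is WriteStatus.OK


with tempfile.TemporaryDirectory() as tmp:
    service = ReportService(FileReportStore(Path(tmp)))
    published = service.publish("report.txt", "42 executions")
    # Writing into a directory that does not exist fails inside the wrapper.
    broken = ReportService(FileReportStore(Path(tmp) / "does-not-exist"))
    failed = broken.publish("report.txt", "42 executions")
```

Because the Application service only knows IReportStore, the filesystem implementation can be swapped for another backend (or a test double) without touching the other layers.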

The micro-services

This section contains the list of all Ryax services.

If a module is accessible by users, its exposed endpoints must be protected using the authorization microservice.

Authorization

This service manages the users, the projects, and the authorization associated with them. It is a critical part of the infrastructure because it is called by other services for each user HTTP request to verify the authorization of the current user for the current project.

It is also used for login and for user and project management, so it exposes an HTTP API to users.

It uses the Postgres datastore to persist its state using an ORM.

Description: Manage user authorization and projects in Ryax. It is called by all other user-facing services to get the current project and user authorization in this project.

Responsibilities:

  • authentication

  • user management (through projects)

Accessible by users: Yes

Runner

The Runner service is the core of the Ryax tool because it is the execution engine for the user’s workflows. It gets workflow deployment and undeployment orders from the Studio. It also provides feedback and history on the workflow runs through an HTTP API.

It uses the Postgres datastore to persist its state using an ORM.

Description: Provides the interface between Ryax and the computing resources used to launch modules and run workflows. Thus, it deploys, runs, and schedules workflows. It also manages the execution metadata and data.

https://gitlab.com/ryax-tech/dev/backend/runner

This micro-service speaks gRPC (with the modules), RMQ with Studio, and HTTP for standalone usage.

Responsibilities:

  • deployment (manage the infrastructure on which the modules will be executed)

    • Get computing resources

    • Deploy/undeploy modules

  • manage the workflow execution:

    • Get new execution triggers.

    • Fetch/push execution data from/to the filestore (Minio)

    • Push data to modules so that they create executions

  • scheduling

    • scale the infrastructure

    • scale modules

    • Trigger executions (if necessary)

    • Store waiting executions and decide, through a scheduling algorithm, which execution to start first.

  • archiving

    • Holds and manages execution metadata

    • keep track of all executions and expose an API for querying execution metadata

    • keep track of the workflow deployment states

Accessible by users: Yes
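The scheduling responsibility of storing waiting executions and choosing which one starts first can be sketched with a priority queue. The policy shown here (lowest priority value first, then arrival order) is an assumption made for illustration; the actual algorithm used by the Runner is not described in this document, and all names are hypothetical.

```python
import heapq
import itertools
from dataclasses import dataclass, field
from typing import Optional


@dataclass(order=True)
class WaitingExecution:
    priority: int  # lower value starts first (assumed policy)
    arrival: int  # tie-breaker: first come, first served
    execution_id: str = field(compare=False)


class WaitingQueue:
    """Stores waiting executions and picks the next one to start."""

    def __init__(self) -> None:
        self._heap: list[WaitingExecution] = []
        self._counter = itertools.count()

    def add(self, execution_id: str, priority: int = 10) -> None:
        entry = WaitingExecution(priority, next(self._counter), execution_id)
        heapq.heappush(self._heap, entry)

    def next_to_start(self) -> Optional[str]:
        # Returns None when nothing is waiting.
        if not self._heap:
            return None
        return heapq.heappop(self._heap).execution_id


queue = WaitingQueue()
queue.add("exec-a")
queue.add("exec-b", priority=1)
queue.add("exec-c")
order = [queue.next_to_start() for _ in range(4)]
```

Here "exec-b" starts first because of its priority, and the two remaining executions keep their arrival order.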

Repository

Description: The Repository service allows Ryax users to scan Git repositories and import Ryax modules. It also lets them trigger action builds through the Action Builder. Once the build is finished, the modules are sent to the Studio to be placed in the Module Store.

It exposes an HTTP REST API that is used by the WebUI and the CLI.

It uses the Postgres datastore to persist its state using an ORM.

Responsibilities:

  • Manage repositories

  • Command the module builder to build new modules

  • detect some module errors

Accessible by users: Yes

Studio

The Studio service permits Ryax users to create, update and deploy workflows.

It exposes an HTTP REST API that is used by the WebUI and the CLI.

It uses the Postgres datastore to persist its state using an ORM.

Description: The Studio service permits Ryax users to create and deploy workflows.

Responsibilities:

  • Create and edit workflows

Accessible by users: Yes

Action Builder

Description: This service is stateless. It only receives module build orders from the Repository service and then does the build sequentially and synchronously (one at a time).

It depends on the module wrapper to build actions.

Responsibilities:

  • build modules and push them to the registry.

Accessible by users: No

WebUI

Description: An NGINX server that serves our frontend written with Angular.

The Angular application is based on the CQRS (Command Query Responsibility Segregation) pattern with reactive programming. The application is split into DDD bounded contexts (BC). Each BC contains Domains that include business logic, Components for UI, Features, and Shells. Shells are the entry points to the BC: they map features to HTTP endpoints.

The external services

They are used as infrastructure for message passing and storage.

Datastore (PostgreSQL)

Description: It is a PostgreSQL database that stores the state of all stateful services. Each service has a different access credential and a separate database.

Responsibilities:

Accessible by users: No

Filestore (Minio)

Description: It is a Minio file storage service that exposes an S3-compatible API. It stores execution I/O files and directories.

Responsibilities:

  • store and serve execution IO

Accessible by users: No

Broker (RabbitMQ)

Description: This is a RabbitMQ message broker. It enables internal communication between all services using messages serialized in Protobuf.

Responsibilities:

Accessible by users: No

Registry (docker registry)

Description: A container registry where the users’ modules are stored. It is populated by the module builder, and the images are pulled by Kubernetes at deployment time.

Responsibilities:

  • store and serve containers of modules

Accessible by users: No

Kubernetes

Description:

Responsibilities:

  • run all Ryax services (except external Runners and modules)

  • abstract underlying infrastructure

Accessible by users: No

Tools and other

ADM

Description: Our administration tool for Ryax.

Responsibilities:

  • deploy a Ryax instance

  • backup/restore a Ryax instance

  • update a Ryax instance

Accessible by users: No, only tech admins.

SDK and CLI

An SDK is automatically built from the Swagger definition. This SDK is then used to implement a command-line interface (CLI).

Action wrapper

Description: A small bit of code between our system and the user code to be able to run it.

This is not an internal Ryax service but a wrapper that is put around user code to communicate with Ryax. It creates a gRPC server with a simple interface that initializes the module and then runs executions. This wrapper works for both processors and source modules with the same protocol: source modules stream execution responses, while processors send a single response and then close the connection.

Responsibilities:

  • Run user code when asked to

  • prepare data to be able to run user code

  • get the resulting data and push it back

  • clean the system so that it can start a new computation without old data being around.

Accessible by users: Not through an API, but the user code is tightly coupled to this.
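The idea of one protocol serving both kinds of modules can be sketched with Python generators standing in for gRPC response streams. This is not the wrapper's real interface: the function names, the dictionary-based payloads, and the two user handlers are hypothetical.

```python
from collections.abc import Callable, Iterator
from typing import Any

Response = dict[str, Any]


def run_processor(
    handler: Callable[[Response], Response], inputs: Response
) -> Iterator[Response]:
    # A processor yields exactly one response, then the stream ends.
    yield handler(inputs)


def run_source(
    handler: Callable[[Response], Iterator[Response]], inputs: Response
) -> Iterator[Response]:
    # A source keeps yielding one response per event it produces.
    yield from handler(inputs)


# Hypothetical user code for a processor.
def double(inputs: Response) -> Response:
    return {"value": inputs["value"] * 2}


# Hypothetical user code for a source.
def countdown(inputs: Response) -> Iterator[Response]:
    for n in range(inputs["start"], 0, -1):
        yield {"tick": n}


processor_responses = list(run_processor(double, {"value": 21}))
source_responses = list(run_source(countdown, {"start": 3}))
```

From the caller's side both cases are consumed the same way, as a stream of responses; a processor is simply a stream of length one.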

Infrastructure

Our infrastructure is based on Kubernetes and every service is deployed with it.

Our monitoring stack is based on Prometheus for metrics, Loki for log aggregation, and Grafana for querying and visualizing data.

The Ingress Network is managed by Traefik, which routes requests depending on the path prefix, e.g. /runner goes to the Runner service.
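The path-prefix rule can be pictured with the small function below. It is not Traefik configuration, only a sketch of the routing idea: the /runner prefix comes from the text above, the other prefixes are assumptions, and matching on a path-segment boundary is a choice made for this example.

```python
from typing import Optional

# Only "/runner" comes from the text above; the other prefixes are assumptions.
ROUTES = {
    "/runner": "runner",
    "/studio": "studio",
    "/authorization": "authorization",
}


def route(path: str) -> Optional[str]:
    """Return the service whose prefix matches the path, longest prefix first."""
    for prefix in sorted(ROUTES, key=len, reverse=True):
        # Match the prefix itself or anything below it, on a segment boundary.
        if path == prefix or path.startswith(prefix + "/"):
            return ROUTES[prefix]
    return None


runner_target = route("/runner/workflows")
unknown_target = route("/unknown")
```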