Workflows, modules and executions

Ryax is more than a tool: it is a complete framework to develop, deploy, and maintain data-driven applications. To do so, Ryax proposes its own paradigm. To fully understand the framework, we must go through the concepts of workflows, modules, stream operators, and executions. All of these are explained in detail in this section.

Workflow overview

In a workflow diagram, the boxes are called modules: units of computation that produce outputs from inputs. The arrows are the links between modules. They constitute a data stream through which data flows from one module to the next. Modules linked together are called a workflow.

Workflows:

  • a set of modules all linked together

  • without any loops: they are directed acyclic graphs (DAGs)

  • links are data streams. A module uses some data from the input stream and adds its output data to the stream. This way, the data that a module outputs is accessible to every downstream module. In other words, any module has access to the data of upstream modules.
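The accumulating data stream described above can be illustrated with a small sketch. This is not the Ryax API: the module functions, the `MODULES` and `LINKS` tables, and `run_workflow` are hypothetical names chosen for illustration, and modules are assumed to be plain Python functions that receive the upstream data as a dict and return their own outputs as a dict.

```python
from graphlib import TopologicalSorter

# Hypothetical modules: each one reads the incoming stream (a dict)
# and returns the new entries it adds to the stream.
def read_temperature(stream):
    return {"celsius": 20.0}

def to_fahrenheit(stream):
    return {"fahrenheit": stream["celsius"] * 9 / 5 + 32}

def format_report(stream):
    # Reads data produced two steps upstream, not only by the direct parent.
    return {"report": f"{stream['celsius']} C = {stream['fahrenheit']} F"}

MODULES = {
    "read_temperature": read_temperature,
    "to_fahrenheit": to_fahrenheit,
    "format_report": format_report,
}

# Links: each module name maps to the set of modules directly upstream of it.
LINKS = {
    "read_temperature": set(),
    "to_fahrenheit": {"read_temperature"},
    "format_report": {"to_fahrenheit"},
}

def run_workflow(modules, links):
    """Run modules in dependency order; return the stream seen after each module."""
    streams = {}
    # TopologicalSorter raises graphlib.CycleError if the links contain a loop,
    # which enforces the "directed acyclic graph" rule.
    for name in TopologicalSorter(links).static_order():
        incoming = {}
        for upstream in sorted(links[name]):
            incoming.update(streams[upstream])
        # The module's outputs are added on top of what it received.
        streams[name] = {**incoming, **modules[name](incoming)}
    return streams

streams = run_workflow(MODULES, LINKS)
print(streams["format_report"]["report"])  # 20.0 C = 68.0 F
```

Each module only sees the data of modules upstream of it, so the stream after `read_temperature` contains no `fahrenheit` entry, while `format_report` can use both values.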

There are four types of modules:

  • Sources. They are the modules that ingest data from the outside world. They are long running processes triggering new workflow executions. For example, a source can be triggered every day at 6pm or every time that an email arrives in a given mailbox.

  • Processors. These are stateless processes that perform computations. They ingest data from upstream through their inputs and produce outputs that are added to the stream for downstream modules.

  • Stream Operators. Manipulating data streams can be complex. Ryax comes with multiple stream operators to simplify this process. For example, you may merge multiple data streams together or buffer data at any given point.

  • Publishers. Ryax is neither a database nor an archiving tool, so you need to push the resulting data to wherever you need it. Publishers do exactly that: they are stateless processes that push data to external services, such as a database or an online service.
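To make the four roles concrete, here is a minimal sketch in plain Python. None of these names come from Ryax: `timer_source`, `add_message`, `buffer`, `list_publisher`, and `run` are hypothetical, the "outside world" is simulated by a list of ticks, and the external service is simulated by a Python list.

```python
from typing import Callable, Iterable, Iterator

def timer_source(ticks: Iterable[str]) -> Iterator[dict]:
    """Source: every event from the outside world triggers one execution."""
    for tick in ticks:
        yield {"triggered_at": tick}

def add_message(data: dict) -> dict:
    """Processor: stateless, reads upstream data and returns new outputs."""
    return {"message": f"Report for {data['triggered_at']}"}

def buffer(items: Iterable[dict], size: int) -> Iterator[list]:
    """Stream operator: group stream items into batches of at most `size`."""
    batch = []
    for item in items:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch  # flush the last, possibly smaller, batch

def list_publisher(sink: list) -> Callable[[list], None]:
    """Publisher: push results to an external service (here, a plain list)."""
    def publish(batch: list) -> None:
        sink.extend(item["message"] for item in batch)
    return publish

def run(ticks: Iterable[str], sink: list) -> None:
    publish = list_publisher(sink)
    # Processor outputs are added to the data received from the source.
    enriched = ({**event, **add_message(event)} for event in timer_source(ticks))
    for batch in buffer(enriched, size=2):
        publish(batch)

sink = []
run(["mon", "tue", "wed"], sink)
print(sink)  # ['Report for mon', 'Report for tue', 'Report for wed']
```

The processor and the publisher keep no state between calls, which matches the description above; only the source and the stream operator hold something across events.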

A module computing some data is called an execution. In Ryax, an execution holds everything related to that computation: the input data, the output data, the logs…
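As an illustration of what "holding everything" could look like, the sketch below records one module run in a single object. The `Execution` class and the `execute` helper are hypothetical and are not taken from Ryax; they only assume that a module can be modelled as a function from an input dict to an output dict.

```python
from dataclasses import dataclass, field

@dataclass
class Execution:
    """Hypothetical record of one module run: inputs, outputs and logs together."""
    module: str
    inputs: dict
    outputs: dict = field(default_factory=dict)
    logs: list = field(default_factory=list)

def execute(module_name, func, inputs):
    """Run `func` on `inputs` and keep everything about the run in one record."""
    record = Execution(module=module_name, inputs=dict(inputs))
    record.logs.append(f"starting {module_name}")
    record.outputs = func(inputs)
    record.logs.append(f"finished {module_name}")
    return record

record = execute("double", lambda data: {"result": data["value"] * 2}, {"value": 21})
print(record.outputs)  # {'result': 42}
print(record.logs)     # ['starting double', 'finished double']
```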

Designing a Ryax workflow

To design a Ryax workflow, answer these questions:

  1. What triggers the workflow? Many events in the outside world may trigger a Ryax workflow. It can be an email, a new file in a file share, a form, a new contact in a CRM, or some IoT data, for example.

  2. Who needs the results of the workflows? What are the tools they can access? For instance, a salesperson may need this data to be attached to the company profiles in a CRM, or predictive maintenance results may need to be accessed by managers to plan actions and by workers to perform the recommended actions.

  3. What data do you need to run the required computations? Where do you gather it? In what formats does it come?

  4. Are your computations done in several steps? Are there steps that you can reuse in other workflows? Can you salvage modules from prior workflows?

The first question helps define sources. The second one is for publishers. Finally, questions 3 and 4 help define reusable processors to prepare and analyse data.
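One possible way to keep track of the answers is a small planning structure such as the one below. It is only a note-taking sketch: `WorkflowPlan` and its fields are hypothetical and do not correspond to any Ryax object, and the example entries reuse the email and CRM cases mentioned in the questions.

```python
from dataclasses import dataclass, field

@dataclass
class WorkflowPlan:
    """Hypothetical checklist collecting the answers to the four questions."""
    sources: list = field(default_factory=list)      # question 1: what triggers it?
    publishers: list = field(default_factory=list)   # question 2: who needs the results?
    processors: list = field(default_factory=list)   # questions 3 and 4: data and steps

    def missing(self):
        """Return the module types for which no answer has been written yet."""
        return [
            name
            for name in ("sources", "processors", "publishers")
            if not getattr(self, name)
        ]

plan = WorkflowPlan(
    sources=["new email in a given mailbox"],
    publishers=["attach results to company profiles in the CRM"],
)
print(plan.missing())  # ['processors']
```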