Python modules

Python is one of the most widely used programming languages for data analysis. Ryax supports modules written in Python 3.

This page describes what a Ryax Python module is, what it consists of, and the technical features available to it.

General

A module requires at least two files: the Python code in a .py file and the Ryax metadata file, ryax_metadata.yaml. If the module has external dependencies (e.g. pandas, tensorflow), it also needs a standard requirements.txt file listing them, one per line, in plain text. You can also add a logo so the module is easier to spot in the web UI. The logo must be at most 250x250 px, in either JPG or PNG format.

Dependencies

A typical Python 3 program may need some dependencies installed before it can run. The common approach is to list all required packages in a requirements.txt file. Ryax reads this file from the root of the module to find out which external Python libraries your code requires. If your code has no external dependencies, you can omit this file.

Windows users should be careful: Windows sometimes appends a .txt extension automatically, which results in a file named requirements.txt.txt.
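The file layout described so far can be checked locally before uploading a module. The following is only a minimal sketch: check_module_dir is a hypothetical helper, not part of Ryax, and it only looks for the files named above.

```python
from pathlib import Path


def check_module_dir(module_dir):
    """Return a list of problems found in a module directory (hypothetical helper)."""
    root = Path(module_dir)
    problems = []
    # A module needs at least one .py file and the metadata file.
    if not any(root.glob("*.py")):
        problems.append("no .py file found")
    if not (root / "ryax_metadata.yaml").is_file():
        problems.append("missing ryax_metadata.yaml")
    # Catch the double extension that Windows sometimes produces.
    if (root / "requirements.txt.txt").is_file():
        problems.append("requirements.txt.txt found: rename it to requirements.txt")
    return problems
```

An empty list means none of these problems were found. It does not mean that Ryax will accept the module.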

Metadata file

To describe your module, you need a ryax_metadata.yaml file containing a high-level description of the module and its inputs and outputs. The file follows the YAML standard, which is similar to JSON in many respects.

Here is an example:

apiVersion: "ryax.tech/v1"
kind: Functions
spec:
  id: tfdetection
  human_name: Tag objects
  detail: "Tag detected objects on images using Tensorflow"
  type: python3
  version: "1.0"
  logo: mylogo.png
  inputs:
  - help: Model name
    human_name: Model name
    name: model
    type: enum
    enum_values:
     - ssdlite_mobilenet_v2_coco_2018_05_09
     - mask_rcnn_inception_v2_coco_2018_01_28
  - help: An image to be tagged; in any format accepted by OpenCV
    human_name: Image
    name: image
    type: file
  outputs:
  - help: Path of the tagged image
    human_name: Tagged image
    name: tagged_image
    type: file

Some explanation of the fields:

  • kind the kind of the module: a source, a processor, or a publisher

  • id unique name of the module

  • version unique version of the module

  • human_name a short name, readable by a human

  • detail a description of the module

  • type the programming language

  • logo optional, a relative path to a logo file.

inputs and outputs list all the inputs and outputs of the module. For each entry, name is the name used in your Python program, while human_name is the name displayed in the web UI.
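As a sketch of how these names line up with the code, here is what a processor for the metadata above might look like. The handle function convention is described under "Processors and publishers". Treating file inputs and outputs as paths is an assumption based on the help text ("Path of the tagged image"), and the copy step is only a placeholder for real detection code.

```python
import shutil
from pathlib import Path


def handle(request):
    # Keys of `request` are the `name` fields declared under inputs.
    model = request["model"]  # one of the enum_values; would select the network
    image_path = Path(request["image"])  # assumption: file inputs arrive as a path

    # Placeholder for the real tagging step: copy the image unchanged.
    tagged_path = image_path.with_name("tagged_" + image_path.name)
    shutil.copyfile(image_path, tagged_path)

    # Keys of the returned dict are the `name` fields declared under outputs.
    return {"tagged_image": str(tagged_path)}
```

The human_name values ("Model name", "Image", "Tagged image") never appear in the code. Only the name values do.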

Module’s code

Ryax calls your code by importing a Python file and running a specific function in it.

Depending on the kind of the module, we use 2 different filenames:

  • handler.py for processors and publishers

  • run.py for sources

When creating a module, Ryax copies all the files in the module directory. Thus, if you split your code across multiple files or use resource files, they will be copied into the module.

Processors and publishers

The handler.py file should contain a handle function that takes a dict as its first parameter and returns a dict, a list of dicts, or nothing (None).

def handle(request):
    # Inputs arrive in a dict keyed by input name.
    a = request["an_integer_input"]
    a = a + 42
    # Return one entry per output of the module.
    return {
        "output1": "value",
        "output2": a,
    }

From this code, we can infer that the module has at least one input called an_integer_input (an integer) and two outputs: output1 and output2. Ryax learns about them from the ryax_metadata.yaml file, so they must be described there.

If the function returns a dict, it should contain an entry with a valid value for every output of the module.

If the function returns a list of dicts, each dict should contain an entry with a valid value for every output of the module.

If the list is empty, the workflow ends here. You can use this feature to create filters that stop the execution of the workflow under specific conditions.

If the list has more than one element, the following modules in the workflow will process each element independently. For example, this feature can be used to analyze an image faster by splitting the input image into several smaller images.
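The rule that every returned dict needs an entry for every output can be checked locally before a module is uploaded. The helper below is a hypothetical sketch, not part of Ryax. It assumes you pass in the output names yourself, and it treats a None return as having nothing to check.

```python
def missing_outputs(result, output_names):
    """For each returned dict, list the declared outputs it lacks (hypothetical helper)."""
    if result is None:
        dicts = []  # assumption: nothing to check when handle() returns None
    elif isinstance(result, dict):
        dicts = [result]  # a single dict is checked like a one-element list
    else:
        dicts = list(result)
    return [sorted(set(output_names) - set(d)) for d in dicts]


# Example with the outputs of the handle function above:
print(missing_outputs({"output1": "value"}, ["output1", "output2"]))
```

Each inner list names the outputs missing from the corresponding dict, so a result made only of empty lists means all outputs are present.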

Filtering and Scaling

Filtering and scaling are two useful workflow abstractions in Ryax that give users execution control and save them lines of code.

Filtering

“Filtering” a workflow occurs at the end of a module, and means that none of the downstream modules will be executed. If other modules are executing and are not downstream of this module, they will still run normally (for example if the workflow has a fork).

Filtering only affects the current run. If the workflow is set up to run periodically or upon some event, it will still start and execute modules normally the next time it is triggered. When a module filters, the current run stops there and the workflow waits until its next designated start.

To filter a workflow from a Python module, the handle() function should execute: return []

Returning an empty Python list tells Ryax that a filtering scenario has occurred: no modules downstream of the module that returned the empty list will be executed.
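A minimal sketch of a filtering processor follows. The input name temperature, the output name alert, and the threshold are made up for illustration and would have to match your own ryax_metadata.yaml.

```python
THRESHOLD = 30.0  # hypothetical limit, for illustration only


def handle(request):
    temperature = request["temperature"]  # hypothetical input name
    if temperature <= THRESHOLD:
        # Nothing worth reporting: filter, so downstream modules do not run.
        return []
    # Otherwise return the outputs as usual and let the workflow continue.
    return {"alert": f"temperature is {temperature}"}
```

With this code, downstream modules only run for readings above the threshold.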

Scaling

“Scaling” is an abstraction that allows users to execute a given module more than once, without explicitly having to do so in their code.

To trigger a scaling scenario, the handle() function of your Python module should execute: return [{...}, {...}, {...}]

In the above, instead of returning a single dictionary containing the outputs designated in the ryax_metadata.yaml file, the module returns three dictionaries. Each of these dictionaries should have the keys specified in the outputs of the metadata file.

When Ryax receives a list of valid output dictionaries from a module, the modules downstream that use these outputs will execute N times, where N is the length of the list (3 in the above example), and use each of the different outputs.
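A minimal sketch of a scaling processor follows. The input names text and chunk_size and the output name chunk are hypothetical. The idea is the same as splitting an image into smaller images, shown here on lines of text to keep the example dependency-free.

```python
def handle(request):
    lines = request["text"].splitlines()  # hypothetical input name
    size = request["chunk_size"]          # hypothetical input name
    # One output dict per chunk: downstream modules run once for each of them.
    return [
        {"chunk": "\n".join(lines[i:i + size])}
        for i in range(0, len(lines), size)
    ]
```

Note that an empty text produces an empty list, which Ryax treats as a filter.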

Sources

run.py is the file that holds the Python code of a source. The source code must implement a class that inherits from RyaxGateway. The class is passed to main, as shown below, and should define a handler(self) method. Note that handler is asynchronous and takes no parameter other than the instance itself (self).

import asyncio
from ryax_gateway.gateway import RyaxGateway, main

class TestGateway(RyaxGateway):
    async def handler(self):
        while True:
            text = self.inputs_values["input1"]
            await self.send_execution({"output1": "hello "+text})
            await asyncio.sleep(1)

if __name__ == "__main__":
    main(TestGateway)

In this particular case, the source creates an execution every second. We can infer that the module has one input called input1 and one output called output1. Ryax learns about them from the ryax_metadata.yaml file, so they must be described there.