What is ZenML?
ZenML is a cloud- and tool-agnostic open-source MLOps framework for building portable, production-ready MLOps pipelines. It consists of the following core components, which you need to know to get started.
- Steps
- Pipelines
- Stack
- Stack Components
What is a ZenML Step?
A Step is an atomic component of a ZenML Pipeline. Each Step is defined to take some input, apply some logic to it, and return an output. A simple step could look like this:
from zenml.steps import step, Output

@step
def step_one() -> Output(output_a=int, output_b=int):
    """This step returns predefined values for a and b."""
    return 5, 12
Let’s define another step that takes two values as input and returns a sum as output.
from zenml.steps import step, Output

@step
def step_two(input_a: int, input_b: int) -> Output(output_sum=int):
    """Step that adds the inputs and returns their sum."""
    return input_a + input_b
Note:
You can run a step function by itself by calling its .entrypoint() method with the same input parameters. For example: step_two.entrypoint(input_a=6, input_b=10)
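To make the note above concrete, here is a toy stand-in written in plain Python. It only illustrates the idea that the undecorated function stays reachable under an `entrypoint` attribute, so its logic can be checked without running a pipeline. `ToyStep`, `toy_step`, and `toy_step_two` are hypothetical names invented for this sketch; this is not ZenML's implementation.

```python
from typing import Callable


class ToyStep:
    """Hypothetical wrapper that keeps the original function as `entrypoint`."""

    def __init__(self, func: Callable[..., int]) -> None:
        self.entrypoint = func


def toy_step(func: Callable[..., int]) -> ToyStep:
    """Hypothetical decorator (not ZenML's `step`) that wraps a function."""
    return ToyStep(func)


@toy_step
def toy_step_two(input_a: int, input_b: int) -> int:
    """Add the two inputs and return their sum."""
    return input_a + input_b


# Call the wrapped logic directly, using the same arguments as in the note.
result = toy_step_two.entrypoint(input_a=6, input_b=10)
print(result)  # 16
```

Calling the logic directly like this is handy for quick checks and unit tests, since no stack or pipeline run is involved.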
What is a ZenML Pipeline?
A Pipeline consists of a series of Steps, arranged in whatever order your use case requires. Its job is simply to route outputs from one step into the next. For example:
from zenml.pipelines import pipeline

@pipeline
def pipeline_one(step_1, step_2):
    # Call the step instances passed in as arguments, not the step functions themselves.
    output_a, output_b = step_1()
    output_sum = step_2(output_a, output_b)
After defining your pipeline, you can instantiate and run it by calling:
pipeline_one(step_1=step_one(), step_2=step_two()).run()
You should see output similar to this in your command line:
Creating run for pipeline: `pipeline_one`
Cache disabled for pipeline `pipeline_one`
Using stack `default` to run pipeline `pipeline_one`
Step `step_one` has started.
Step `step_one` has finished in 0.010s.
Step `step_two` has started.
Step `step_two` has finished in 0.012s.
Pipeline run `pipeline_one-20_Feb_23-13_11_20_456832` has finished in 0.152s.
You can learn more about pipelines in the ZenML documentation.
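To see which values flow through the pipeline above, the same routing can be sketched in plain Python without any ZenML decorators. The names `plain_step_one`, `plain_step_two`, and `run_plain_pipeline` are hypothetical and exist only for this illustration; the sketch mirrors the data flow, not ZenML's caching, tracking, or orchestration.

```python
from typing import Tuple


def plain_step_one() -> Tuple[int, int]:
    """Return the predefined values for a and b."""
    return 5, 12


def plain_step_two(input_a: int, input_b: int) -> int:
    """Add the two inputs and return their sum."""
    return input_a + input_b


def run_plain_pipeline() -> int:
    """Route the outputs of the first step into the second step."""
    output_a, output_b = plain_step_one()
    return plain_step_two(output_a, output_b)


if __name__ == "__main__":
    print(run_plain_pipeline())  # 17
```

The real pipeline adds value on top of this wiring: each step's outputs are stored as artifacts, and the run is executed through the active stack.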
What is a ZenML Stack?
A stack is the set of infrastructure configurations that determines how your pipeline runs, for example whether it runs locally or in the cloud. If you do not define one, ZenML uses a default stack that runs your pipeline and stores the artifacts locally.
What are the Components of a Stack?
A Stack Component is responsible for one specific task of an ML workflow. Stack components fall mainly into two groups:
- Orchestrator, responsible for the execution of the steps within the pipeline.
- Artifact Store, responsible for storing the artifacts generated by the pipeline.
Remember that to use any stack component, you first need to register it under its component group, then register a stack that includes it, and finally set that stack as active so it is used for the current run. For example, to use an S3 bucket as your artifact store, you first register the S3 bucket as an artifact store, include it in a named stack, and then set that stack as active. You can learn more about how to do this in the ZenML documentation.
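The register-then-activate order described above can be pictured with a small conceptual model in plain Python. This is a sketch for intuition only: `ToyStack`, `ToyRegistry`, the component names, and the path `s3://example-bucket` are all hypothetical, and none of this is ZenML's API or CLI.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass(frozen=True)
class ToyStack:
    """A named pairing of one orchestrator and one artifact store."""
    name: str
    orchestrator: str
    artifact_store: str


@dataclass
class ToyRegistry:
    """Tracks registered artifact stores, registered stacks, and the active stack."""
    artifact_stores: Dict[str, str] = field(
        default_factory=lambda: {"default": "local"}
    )
    stacks: Dict[str, ToyStack] = field(
        default_factory=lambda: {"default": ToyStack("default", "default", "default")}
    )
    active: str = "default"  # a local default stack is active until changed

    def register_artifact_store(self, name: str, path: str) -> None:
        self.artifact_stores[name] = path

    def register_stack(self, name: str, orchestrator: str, artifact_store: str) -> None:
        # A stack may only reference an artifact store that is already registered.
        if artifact_store not in self.artifact_stores:
            raise KeyError(f"artifact store {artifact_store!r} is not registered")
        self.stacks[name] = ToyStack(name, orchestrator, artifact_store)

    def set_active(self, name: str) -> None:
        # Only a registered stack can become the active one.
        if name not in self.stacks:
            raise KeyError(f"stack {name!r} is not registered")
        self.active = name


registry = ToyRegistry()
registry.register_artifact_store("s3_store", "s3://example-bucket")  # hypothetical bucket
registry.register_stack("s3_stack", orchestrator="default", artifact_store="s3_store")
registry.set_active("s3_stack")
print(registry.stacks[registry.active])
```

The ordering checks in `register_stack` and `set_active` capture the point of the paragraph: a component must exist before a stack can use it, and a stack must exist before it can be made active.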