Skip to content

Good practices and tips

Here below are some good practices to make sure that your Daisies perform at best:

Resource/Usage Limits

There are limits to the maximum amount of usage of a Daisi. These currently are:

  • CPU Usage: 1800 seconds of CPU time per execution
  • Memory Usage: 8GB of memory consumed per execution
  • Disk Usage: 80,000 open files per execution

These limits are intnetionally set high enough to enable nearly all practical use-cases for Daisies, but if very large deep learning models are used or significant data is consumed / produced, you may need to adjust your Daisi code to satisfy these constraints.

Testing and debugging

The general rule is that your code should first work as expected in your workspace, before being pushed to the Daisi platform. Even if it calls / remixes other Daisies, you should be able to test it locally since you can invoke these Daisies from anywhere.

If it has a Streamlit front end, you should be able to run it in your workspace with streamlit run <your Python script> before pushing it in the platform.

If it connects to APIs, you should first test these connections in your workspace. If it doesn't work locally, there is no reason that it will perform better in the Daisi platform.

Imports and global variables

When a Daisi is started as a service for execution, the Daisi platform will load imports and global variables in memory for the lifespan of the service (10 minutes for now). So any loading or import wich can be time consuming (data import, a large ML model, etc...) should be done globally, so it will not be loaded at each execution.

For instance, instead of doing this:

from tensorflow import keras

def predict(input_data):
    # The model is loaded inside a function, so it will be reloaded
    # at each execution of the predict() endpoint.
    my_ml_model = keras.models.load_model("path to the model")
    result = my_ml_model.predict(input_data)
    return result

Do this:

from tensorflow import keras

# The model will be loaded when the service starts and will be kept in memory
# for subsequent executions during the service lifespan (10 minutes).
my_ml_model = keras.models.load_model("path to the model")

def predict(input_data):
    result = my_ml_model.predict(input_data)
    return result

Streamlit apps

Similarly, if the Streamlit components are called in the main script and not in a function, these calls will be made each time the service load. They will be probably unconsequential but at the same time they are unecessary and can slow down the service start time.

So it is a good practice to wrap all the Streamlit calls inside a function, and call this function in a if __name__ == "__main__" block.

For instance, instead of doing this:

import streamlit as st

#There is a my_func() endpoint for this Daisi
def my_func(name = "World):
    return f"Hello {name}"

# These statments will be called when loading the service
st.title("My app")
name = st.text_input("Enter your name")
st.write(my_func(name))

Do this:

import streamlit as st

def my_func(name = "World):
    return f"Hello {name}"

#Streamlit calls inside a function
def st_ui():
    st.title("My app")
    st.text_input("Enter your name")
    st.write(my_func(param))

# This block is ignored when loading the service
# but will be processed when launching the Streamlit app
if __name__ == "__main__":
    st_ui()