Build a dual-mode Serverless worker
Create a flexible Serverless worker that supports a Pod-first development workflow.
Developing machine learning and AI applications often requires powerful GPUs, making local development of API endpoints challenging. A typical development workflow for Serverless would be to write your handler code, deploy it directly to a Serverless endpoint, send endpoint requests to test, debug using worker logs, and repeat.
This workflow has significant drawbacks:
- Slow iteration: Each deployment requires a new build and test cycle, which can be time-consuming.
- Limited visibility: Logs and errors are not always easy to debug, especially when running in a remote environment.
- Resource constraints: Your local machine may not have the necessary resources to test your application.
This tutorial shows how to build a “Pod-first” development environment: creating a flexible, dual-mode Docker image that can be deployed as either a Pod or a Serverless worker.
Using this method, you’ll leverage a Pod—a GPU instance ideal for interactive development, with tools like Jupyter Notebooks and direct IDE integration—as your cloud-based development machine. The Pod will be deployed with a flexible Docker base, allowing the same container image to be seamlessly deployed to a Serverless endpoint.
This workflow lets you develop and thoroughly test your application using a containerized Pod environment, ensuring it works correctly. Then, when you’re ready to deploy to production, you can deploy it instantly to Serverless.
Follow the steps below to create a worker image that leverages this flexibility, allowing for faster iteration and more robust deployments.
To get a basic dual-mode worker up and running immediately, you can clone this repository and use it as a base.
What you’ll learn
In this tutorial you’ll learn how to:
- Set up a project for a dual-mode Serverless worker.
- Create a handler file (handler.py) that adapts its behavior based on a user-specified environment variable.
- Write a startup script (start.sh) to manage different operational modes.
- Build a Docker image designed for flexibility.
- Understand and utilize the “Pod-first” development workflow.
- Deploy and test your worker in both Pod and Serverless environments.
Requirements
- You’ve created a RunPod account.
- You’ve installed Python 3.x and Docker on your local machine and configured them for your command line.
- Basic understanding of Docker concepts and shell scripting.
Step 1: Set up your project structure
First, create a directory for your project and the necessary files.
Open your terminal and run the following commands:
This creates:
- handler.py: Your Python script with the RunPod handler logic.
- start.sh: A shell script that will be the entrypoint for your Docker container.
- Dockerfile: Instructions to build your Docker image.
- requirements.txt: A file to list Python dependencies.
Step 2: Create the handler.py file
This Python script will contain your core logic. It will check for a user-specified environment variable MODE_TO_RUN to determine whether to run in Pod or Serverless mode.
Add the following code to handler.py:
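A minimal sketch of a handler.py that matches the features described in this step might look like the following. The echo logic, the lazy import of the runpod package, and the error for an unknown mode are illustrative assumptions, so treat this as a starting point rather than a definitive implementation.

```python
import asyncio
import os

# Operating mode; "pod" is the default, as described in this step.
MODE_TO_RUN = os.getenv("MODE_TO_RUN", "pod")


async def handler(event):
    """Placeholder logic (an assumption): echo the prompt back to the caller."""
    job_input = event.get("input", {})
    prompt = job_input.get("prompt", "")
    return {"echo": prompt}


if __name__ == "__main__":
    if MODE_TO_RUN == "serverless":
        # Imported lazily (an assumption) so Pod-mode runs do not need the SDK import.
        import runpod

        runpod.serverless.start({"handler": handler})
    elif MODE_TO_RUN == "pod":
        # Test harness: call the handler once with a sample event and print the result.
        sample_event = {"input": {"prompt": "Hello World"}}
        print(asyncio.run(handler(sample_event)))
    else:
        raise SystemExit(f"Unknown MODE_TO_RUN: {MODE_TO_RUN!r}")
```

Keeping the mode check inside the main guard means the handler function can also be imported and called from a notebook or a test without starting the worker.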
Key features:
- MODE_TO_RUN = os.getenv("MODE_TO_RUN", "pod"): Reads the mode from an environment variable, defaulting to pod.
- async def handler(event): Your core logic. It is defined as an async function, which runpod.serverless.start accepts.
- if __name__ == '__main__': This block controls what happens when the script is executed directly.
  - In serverless mode, it starts the RunPod Serverless worker.
  - In pod mode, it runs a sample test call to your handler function, allowing for quick iteration.
Step 3: Create the start.sh script
This script will be the entrypoint for your Docker container. It reads the MODE_TO_RUN environment variable and configures the container accordingly.
Add the following code to start.sh:
Key features:
- case $MODE_TO_RUN in ... esac: This structure directs the startup based on the mode.
- serverless mode: Executes handler.py, which then starts the RunPod Serverless worker. exec replaces the shell process with the Python process.
- pod mode: Prints messages indicating it’s ready for development. It then runs sleep infinity to keep the container alive so you can connect to it (e.g., via SSH or docker exec). You would then manually run python /app/handler.py inside the Pod to test your handler logic.
- set -e: Ensures the script exits if any command fails.
Step 4: Create the Dockerfile
This file defines how to build your Docker image.
Add the following content to Dockerfile:
Key features:
- FROM python:3.10-slim: Starts with a lightweight Python image.
- WORKDIR /app: Sets the current directory inside the container.
- COPY requirements.txt . and RUN pip install ...: Installs Python dependencies.
- COPY handler.py . and COPY start.sh .: Copies your application files.
- ENV MODE_TO_RUN="pod": Sets the default operational mode to “Pod”. This can be overridden at runtime.
- RUN chmod +x /app/start.sh: Makes your startup script executable.
- CMD ["/app/start.sh"]: Specifies start.sh as the command to run when the container starts.
Step 5: Build and push your Docker image
Now, build your Docker image and push it to a container registry like Docker Hub.
Alternatively, instead of building and pushing your image via Docker Hub, you can deploy your worker from a GitHub repository.
Build your Docker image
Build your Docker image, replacing [YOUR_USERNAME] with your Docker Hub username and choosing a suitable image name:
The --platform linux/amd64 flag is important for compatibility with RunPod’s infrastructure.
Push the image to your container registry
You might need to run docker login first.
Step 6: Testing in Pod mode
Now that you’ve finished building your Docker image, let’s explore how you would use the Pod-first development workflow in practice.
You can run your container locally with Docker:
Or, deploy the image to a Pod:
- Go to the Pods page in the RunPod console and click Create Pod.
- Choose a GPU.
- For “Docker Image Name”, enter [YOUR_USERNAME]/dual-mode-worker:latest.
- Under Pod Template, select Edit Template.
- Under Public Environment Variables, select Add environment variable. Set the variable key to MODE_TO_RUN and the value to pod.
- Select Set Overrides, then deploy your Pod.
After connecting to the Pod, navigate to /app and run your handler directly:
This will execute the Pod-specific test harness in your handler.py, giving you immediate feedback. You can edit handler.py within the Pod and re-run it for rapid iteration.
Step 7: Deploy to a Serverless endpoint
Once you’re confident with your handler.py logic tested in Pod mode, you’re ready to deploy your dual-mode worker to a Serverless endpoint.
- Go to the Serverless section of the RunPod console.
- Click New Endpoint.
- Under Custom Source, select Docker Image, then select Next.
- In the Container Image field, enter your Docker image URL: docker.io/[YOUR_USERNAME]/dual-mode-worker:latest.
- Under Advanced Settings > Environment Variables, set MODE_TO_RUN to serverless.
. - Configure GPU, workers, and other settings as needed.
- Select Create Endpoint.
The same image is used, but start.sh will now direct it to run in Serverless mode, starting the runpod.serverless.start worker.
Step 8: Test your endpoint
After deploying your endpoint in Serverless mode, you can test it with the following steps:
- Navigate to your endpoint’s detail page in the RunPod console.
- Click the Requests tab.
- Use the following JSON as test input:
- Click Run.
After a few moments for initialization and processing, you should see output similar to this:
Explore the Pod-first development workflow
Congratulations! You’ve successfully built, deployed, and tested a dual-mode Serverless worker. Now, let’s explore the recommended iteration process for a Pod-first development workflow:
Develop using Pod mode
- Deploy your initial Docker image to a RunPod Pod, ensuring MODE_TO_RUN is set to pod (or rely on the Dockerfile default).
- Connect to your Pod (via SSH or web terminal).
- Navigate to the /app directory.
- As you develop, install any necessary Python packages (pip install [PACKAGE_NAME]) or system dependencies (apt-get install [PACKAGE_NAME]).
- Iterate on your handler.py script. Test your changes frequently by running python handler.py directly in the Pod’s terminal. This will execute the test harness you defined in the elif MODE_TO_RUN == "pod": block, giving you immediate feedback.
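As the handler grows, a single sample call may not cover enough cases. One possible extension of the Pod-mode test harness is sketched below: it runs several sample events through the handler and reports how long each call took. The run_samples helper and the stand-in echo handler are hypothetical names introduced for illustration only.

```python
import asyncio
import time


async def handler(event):
    """Stand-in for the handler under development (hypothetical echo logic)."""
    return {"echo": event.get("input", {}).get("prompt", "")}


async def run_samples(handler_fn, events):
    """Call handler_fn on each event and collect (result, seconds) pairs."""
    results = []
    for event in events:
        started = time.perf_counter()
        result = await handler_fn(event)
        results.append((result, time.perf_counter() - started))
    return results


if __name__ == "__main__":
    samples = [
        {"input": {"prompt": "Hello World"}},
        {"input": {}},  # Missing prompt: exercises the default path.
    ]
    for result, seconds in asyncio.run(run_samples(handler, samples)):
        print(f"{seconds:.4f}s -> {result}")
```

Because the helper takes the handler as an argument, the same loop can be pointed at the real handler in handler.py once the sample events reflect your actual input schema.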
Update your Docker image
Once you’re satisfied with a set of changes and have new dependencies:
- Add new Python packages to your requirements.txt file.
- Add system installation commands (e.g., RUN apt-get update && apt-get install -y [PACKAGE_NAME]) to your Dockerfile.
- Ensure your updated handler.py is saved.
- Rebuild the image and push it to your container registry, as in Step 5.
Deploy and test in Serverless mode
- Deploy your worker image to a Serverless endpoint using Docker Hub or GitHub.
- During deployment, ensure that the MODE_TO_RUN environment variable for the endpoint is set to serverless.
For instructions on how to set environment variables during deployment, see Manage endpoints.
- After your endpoint is deployed, you can test it by sending API requests.
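Sending an API request to the deployed endpoint can be sketched with the Python standard library as follows. The runsync URL pattern, the RUNPOD_ENDPOINT_ID and RUNPOD_API_KEY variable names, and the sample payload are assumptions made for this example, so confirm the exact endpoint URL and authentication details in the RunPod API reference before relying on it.

```python
import json
import os
import urllib.request

# Assumed URL pattern for a synchronous run; verify it in the RunPod API reference.
API_URL = "https://api.runpod.ai/v2/{endpoint_id}/runsync"


def build_request(endpoint_id, api_key, payload):
    """Build a POST request that wraps `payload` under the "input" key."""
    body = json.dumps({"input": payload}).encode("utf-8")
    return urllib.request.Request(
        API_URL.format(endpoint_id=endpoint_id),
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def run_sync(endpoint_id, api_key, payload, timeout=120):
    """Send the request and return the decoded JSON response."""
    request = build_request(endpoint_id, api_key, payload)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Both variable names are hypothetical; export them before running.
    endpoint_id = os.getenv("RUNPOD_ENDPOINT_ID")
    api_key = os.getenv("RUNPOD_API_KEY")
    if endpoint_id and api_key:
        print(run_sync(endpoint_id, api_key, {"prompt": "Hello World"}))
    else:
        print("Set RUNPOD_ENDPOINT_ID and RUNPOD_API_KEY to send a request.")
```

Separating build_request from run_sync lets you inspect the URL, headers, and body without making a network call.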
This iterative loop (write your handler, test in Pod mode, update the Docker image, then deploy to Serverless) allows for rapid development and debugging of your Serverless workers.
Next steps
Now that you’ve mastered the dual-mode development workflow, you can: