Build a Simple Dataset Web App on EMBL Cloud Infrastructure

Learn to create and deploy a web application using Python, Streamlit, Docker/Podman, and Kubernetes on EMBL cloud computing infrastructure

Stable: This tutorial is considered complete and reliable.

Overview

Time estimation: 2H

Version: 1.0

Last update: 2026-02-03

Questions:

How do I create a simple web application for data visualization?

How do I containerize an application using Docker or Podman?

How do I deploy a web application on Kubernetes cloud infrastructure?

What are the basic components needed for cloud deployment?

Objectives:

Create a simple Python web app using Streamlit

Build a container image for your application

Deploy the application on EMBL Kubernetes infrastructure

Access your web app through a public URL


Introduction

In this tutorial we will create a simple web app in Python for a toy dataset stored in a CSV file. We will then run this web app on EMBL infrastructure so that it is available online.

The goal of this tutorial is to give you the basics of building a simple web app that can run in the cloud. It is not an in-depth tutorial: it gives you just enough to get an intuition for what is possible and, hopefully, the inspiration to learn how to build a real academic-grade web app.

It uses only 6 files and does the bare minimum, but these steps can be the backbone of any cloud application.

In this tutorial, we will cover:

Introduction

Prerequisites

Create Your Git Project

Set Up Your Local Environment

Create the Application Files

Test the App Locally

Containerize Your Application

Create the Containerfile

Build and Test the Container

Upload the Image to Git Registry

Deploy to Kubernetes

Set Up Kubernetes Access

Create the Deployment Configuration

Deploy and Test

Conclusion

Prerequisites

Before starting this tutorial, you will need:

A computer with bash, git, and Python installed
Access to EMBL infrastructure (VPN if working remotely)
Basic familiarity with the command line
An EMBL account

Installing required software
If bash, git, or Python are not installed on your system, try this installation guide.

Create Your Git Project

Version control is essential for tracking changes and sharing your code. We’ll start by setting up a Git repository.

Set up Git repository

Go to https://git.embl.de/

Click New project

Select Create blank project

Configure your project:

Project name: my-data-dashboard

Visibility level: Public

Why public visibility?
The repository needs to be public so that Kubernetes can pull your container image from the project's registry without credentials later in the tutorial.

Click Create project

Set Up Your Local Environment

Now we’ll clone the repository to your computer and set up the project structure.

Clone and configure the project locally
Create your projects folder if it doesn't exist, then navigate to it:
Bash
mkdir -p ~/projects
cd ~/projects
Clone your repository:
Bash
git clone https://git.embl.de/<username>/my-data-dashboard.git
cd my-data-dashboard
Replace <username> with your EMBL username.

Create the Application Files

We’ll create three core files for our web application: the Python app, a data file, and a requirements file.

Create application files
Create app.py with the following content:
Python (app.py)
import streamlit as st
import pandas as pd

st.write("# Data Dashboard")

df = pd.read_csv("data.csv")
st.dataframe(df)
This simple app uses Streamlit to quickly build a web interface that displays your data.
Create data.csv with sample data:
CSV (data.csv)
a,b,c
1,2,3
Create requirements.txt to specify dependencies:
Text (requirements.txt)
streamlit
pandas

Test the App Locally

Before deploying to the cloud, it’s important to test your application locally.

Run the webapp on your computer
Create a conda environment (this assumes conda is installed; another Python virtual environment tool would work as well):
Bash
conda create -n my-data-dashboard-env python -y
conda activate my-data-dashboard-env
python -m pip install -r requirements.txt
Test Streamlit installation:
Bash
streamlit hello
If everything went well, this will open a colorful webpage about Streamlit in your browser.
Stop the test server:

Press Ctrl+C in the terminal
Run your application:
Bash
streamlit run app.py
If successful, a simple webpage will open showing “Data Dashboard” with your data table.

What happens when you run streamlit run app.py?

How would you modify the app to display different data?

Streamlit starts a local web server and opens your default browser to display the app, typically at http://localhost:8501

You can modify data.csv with your own data, as long as it’s in CSV format that pandas can read
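
Before swapping in your own data, it can help to check that the file is a regular table. The following is a minimal sketch using only the Python standard library; the function name check_csv is an illustrative assumption, and the check (every row has as many fields as the header) is a simple sanity test rather than a description of how pandas parses files.

```python
import csv
import io


def check_csv(text):
    """Return (header, row_count); raise ValueError if a row has the wrong width."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader, None)
    if not header:
        raise ValueError("CSV is empty: no header row found")
    row_count = 0
    for row in reader:
        if not row:
            continue  # skip completely blank lines
        if len(row) != len(header):
            raise ValueError(
                f"line {reader.line_num}: expected {len(header)} fields, got {len(row)}"
            )
        row_count += 1
    return header, row_count


# The toy dataset from this tutorial: one header row and one data row.
sample = "a,b,c\n1,2,3\n"
header, rows = check_csv(sample)
print(header, rows)
```

To check the real file, you could pass its contents instead of the sample string, for example by reading data.csv with pathlib.Path("data.csv").read_text().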

Congratulations!
If you made it this far, you’ve successfully run a webapp on your computer! This is a great achievement!

Now you can experiment on your computer before deploying to the cloud. Feel free to modify the dataset or try different Streamlit features. When you’re ready, commit your changes to git and continue with the tutorial.

Containerize Your Application

Now we’ll create a container image of your application. An image is like a snapshot of your code and its environment, which ensures the application runs consistently anywhere.

Understanding Containers and Images
An image is like a snapshot or template that contains your code and all its dependencies. A container is a running instance of that image, providing an isolated environment for your application.

We’ll use Podman, which is free and open source. Podman's command-line interface is largely compatible with Docker's, so for the commands in this tutorial you can replace podman with docker if you prefer.

Install Podman
Install Podman following these instructions

Create and start your first Podman machine
Test your installation:
Bash
podman info

Create the Containerfile

The containerfile tells Podman how to build your image.

Create the containerfile

Create a file named containerfile (no extension) with the following content:

Containerfile

FROM --platform=linux/amd64 python:3.13-slim

WORKDIR /app
# Expose port you want your app on
EXPOSE 8501

# Upgrade pip and install requirements
COPY requirements.txt requirements.txt
RUN pip install -U pip
RUN pip install -r requirements.txt

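# Copy app code into the working directory
COPY . .

# Clear any entrypoint inherited from the base image
ENTRYPOINT []
# Start Streamlit on port 8501, listening on all network interfaces
CMD ["streamlit", "run", "app.py", "--server.port=8501", "--server.address=0.0.0.0"]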

Understanding the Containerfile
Let’s break down what each part does:

FROM --platform=linux/amd64 python:3.13-slim: Start from a Python image, specifying the platform for compatibility

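WORKDIR /app: Set the working directory inside the image, so that files are copied to /app rather than to the root folder

EXPOSE 8501: Document that the app listens on port 8501. This does not publish the port by itself; that happens when you run the container with -p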

COPY requirements.txt and RUN pip install: Install Python dependencies

COPY . .: Copy all project files into the container

CMD [...]: Command to run when the container starts, including server configuration

Build and Test the Container

Build and run the container locally
Build the image:
Bash
podman build -t my-data-dashboard:0.0.1 -f containerfile .
First build takes time
The first time you run this, it might take a while because it’s downloading base images and dependencies. Subsequent builds will be much faster.
Run the container:
Bash
podman run -p 8501:8501 localhost/my-data-dashboard:0.0.1
Accessing the containerized app
Unlike running locally, this command doesn’t automatically open a browser. Open your browser manually and navigate to localhost:8501 to see your webapp.

What is the difference between building an image and running a container?

Why do we need to specify port mapping with -p 8501:8501?

Building creates the image (template), running creates a container (active instance) from that image

Port mapping connects the container’s internal port 8501 to your computer’s port 8501, allowing you to access the app through your browser
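
If the page does not load, a quick way to tell whether anything is listening on the mapped port is to try a TCP connection from your computer. This is a minimal sketch using Python's standard socket module; the helper name is_port_open is an illustrative assumption, and a successful connection only shows that the port is reachable, not that Streamlit is serving the page correctly.

```python
import socket


def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


if __name__ == "__main__":
    # 8501 is the host port published with -p 8501:8501 in this tutorial.
    print("app reachable:", is_port_open("localhost", 8501))
```

The same check can be reused later in the tutorial while kubectl port-forward is running, since that also makes the app available on local port 8501.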

Upload the Image to Git Registry

Once your image works locally, upload it to the Git container registry so Kubernetes can access it.

Push image to registry
Navigate to your project at https://git.embl.de/<username>/my-data-dashboard

Go to Deploy → Container Registry
Follow the instructions shown, adapting them for Podman:
Bash
podman login registry.git.embl.de
podman build -t registry.git.embl.de/<username>/my-data-dashboard:0.1.0 -f containerfile .
podman push registry.git.embl.de/<username>/my-data-dashboard:0.1.0
Replace <username> with your EMBL username.
Verify the upload at https://git.embl.de/<username>/my-data-dashboard/container_registry

Deploy to Kubernetes

The final step is deploying your application to the cloud using Kubernetes.

What is Kubernetes?
Kubernetes (K8s) is a system for running applications on clusters of computers, often in the cloud. You use kubectl (a command-line tool) to communicate with the Kubernetes cluster and ask it to run your application.

Think of Kubernetes as a manager that ensures your application runs smoothly on cloud infrastructure.

Set Up Kubernetes Access

Configure kubectl
Log in to https://kubeportal.embl.de/

Remote access requirements
If accessing remotely, you’ll need:

EMBL VPN connection

Two-factor authentication set up on your phone

Create a tenant named my-data-dashboard with default cores and RAM

What is a tenant?
A tenant tells Kubernetes how many computing resources (CPU, RAM) your project needs. You can have multiple tenants for different projects, each with different resource requirements.
Install kubectl:
Download this setup script
Make it executable and run it:
Bash
chmod +x setup-prod.sh
./setup-prod.sh
Verify installation:
Bash
kubectl version --client
Create a namespace for your tenant:
Bash
kubectl create namespace my-data-dashboard-ns1
Understanding namespaces
Namespaces help organize resources within a tenant. They’re useful for separating different environments (development, testing, production). The namespace name should start with your tenant name to help administrators track resources.
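
Kubernetes namespace names must be valid DNS labels: lowercase letters, digits, and hyphens, starting and ending with a letter or digit, and at most 63 characters long. The sketch below checks a candidate name against those rules and against the tenant-prefix convention described above. The function name check_namespace is an illustrative assumption, and the prefix check reflects this tutorial's convention rather than a Kubernetes requirement.

```python
import re

# RFC 1123 DNS label: lowercase alphanumerics and '-',
# starting and ending with an alphanumeric character.
DNS_LABEL = re.compile(r"[a-z0-9]([-a-z0-9]*[a-z0-9])?")


def check_namespace(name, tenant):
    """Return a list of problems with the namespace name (empty if none)."""
    problems = []
    if len(name) > 63:
        problems.append("longer than 63 characters")
    if not DNS_LABEL.fullmatch(name):
        problems.append("not a valid DNS label")
    if not name.startswith(tenant):
        problems.append(f"does not start with tenant name '{tenant}'")
    return problems


# The namespace used in this tutorial, then a deliberately bad name.
print(check_namespace("my-data-dashboard-ns1", "my-data-dashboard"))
print(check_namespace("My_Dashboard", "my-data-dashboard"))
```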
Verify your namespace at https://kubeportal.embl.de/tenants/tenants

Create the Deployment Configuration

Now we’ll create a YAML file that tells Kubernetes how to deploy your application.

Create deploy-image.yaml

Create a file named deploy-image.yaml

Add a comment at the top (optional but helpful):

YAML comment

# Replace the following strings:
# <username> - your EMBL username (every occurrence)
# <username2> - your supervisor's EMBL username (every occurrence)
# <appname> - my-data-dashboard (every occurrence)
# After testing, create a ticket at https://itsupport.embl.de
# to enable www.my-data-dashboard.embl.de

Add the deployment section:

YAML (deploy-image.yaml - Part 1)

---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    owner-username: <username>  # your EMBL username
    fallback-username: <username2>  # supervisor's username
  name: <appname>-<username>
  namespace: <appname>-ns1
spec:
  selector:
    matchLabels:
      app: <appname>
  replicas: 1
  template:
    metadata:
      labels:
        app: <appname>
    spec:
      containers:
        - name: <appname>
          image: registry.git.embl.de/<username>/<appname>:0.1.0
          imagePullPolicy: "Always"
          ports:
            - name: http
              containerPort: 8501
              protocol: TCP
          resources:
            limits:
              cpu: 1
              memory: 512Mi
            requests:
              cpu: 300m
              memory: 128Mi
---
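
The resources block above uses Kubernetes quantity notation: 300m means 300 millicores (0.3 of a CPU core) and 128Mi means 128 mebibytes. The following Python sketch is not part of deploy-image.yaml; it only illustrates how those strings translate into plain numbers. The helper names cpu_cores and memory_bytes are assumptions for illustration, and the sketch handles only the suffixes listed in it, not the full Kubernetes quantity syntax.

```python
# Binary suffixes handled by this sketch (Kubernetes supports more).
BINARY_SUFFIXES = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30}


def cpu_cores(quantity):
    """Convert a CPU quantity such as '300m' or 1 to a number of cores."""
    text = str(quantity)
    if text.endswith("m"):
        return int(text[:-1]) / 1000  # millicores to cores
    return float(text)


def memory_bytes(quantity):
    """Convert a memory quantity such as '128Mi' to bytes."""
    text = str(quantity)
    for suffix, factor in BINARY_SUFFIXES.items():
        if text.endswith(suffix):
            return int(text[: -len(suffix)]) * factor
    return int(text)  # a plain number is already in bytes


# Values copied from the deployment section of this tutorial.
requests = {"cpu": "300m", "memory": "128Mi"}
limits = {"cpu": 1, "memory": "512Mi"}

print("CPU request/limit (cores):", cpu_cores(requests["cpu"]), cpu_cores(limits["cpu"]))
print("Memory request/limit (bytes):", memory_bytes(requests["memory"]), memory_bytes(limits["memory"]))
```

In this configuration the app is guaranteed roughly a third of a core and 128 MiB of memory, and may use up to one core and 512 MiB.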

Add the ingress section (network access configuration):

YAML (deploy-image.yaml - Part 2)

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  annotations:
    traefik.ingress.kubernetes.io/router.tls.certresolver: sectigo
  name: <appname>-<username>
  namespace: <appname>-ns1
spec:
  ingressClassName: internal-users
  rules:
  - host: <appname>.embl.de
    http:
      paths:
      - backend:
          service:
            name: <appname>
            port:
              name: http
        path: /
        pathType: Prefix
  tls:
  - hosts:
    - <appname>.embl.de
---

Enabling the URL
To enable the URL www.<appname>.embl.de, you need to create a ticket at https://itsupport.embl.de/ requesting web support. The name might already be taken, so be prepared to choose an alternative.

Add the service section:

YAML (deploy-image.yaml - Part 3)

apiVersion: v1
kind: Service
metadata:
  name: <appname>
  namespace: <appname>-ns1
spec:
  ports:
  - name: http
    port: 8501
    protocol: TCP
    targetPort: 8501
  selector:
    app: <appname>

Replace all instances of <username>, <username2>, and <appname> with your actual values
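
Replacing placeholders by hand is error-prone, so it can be useful to check that none are left before applying the file. The sketch below shows one way to do the substitution and the check in Python. The template is a shortened stand-in for deploy-image.yaml, the usernames jdoe and asmith are hypothetical, and the helper names are assumptions for illustration.

```python
import re

# A shortened stand-in for deploy-image.yaml; the real file has more fields.
TEMPLATE = """\
metadata:
  labels:
    owner-username: <username>
    fallback-username: <username2>
  name: <appname>-<username>
  namespace: <appname>-ns1
"""


def fill_placeholders(text, values):
    """Replace each <key> in text with values[key]."""
    for key, value in values.items():
        text = text.replace(f"<{key}>", value)
    return text


def leftover_placeholders(text):
    """Return the sorted, de-duplicated <placeholder> strings still present."""
    return sorted(set(re.findall(r"<[a-z0-9]+>", text)))


values = {
    "username": "jdoe",  # hypothetical EMBL username
    "username2": "asmith",  # hypothetical supervisor username
    "appname": "my-data-dashboard",
}
filled = fill_placeholders(TEMPLATE, values)
print(filled)
print("left over:", leftover_placeholders(filled))
```

Running leftover_placeholders on the contents of your edited deploy-image.yaml should return an empty list if every placeholder was replaced.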

Deploy and Test

Run your app in the cloud
Apply the deployment configuration:
Bash
kubectl apply -f deploy-image.yaml
Check if your container is running:
Bash
kubectl get pods -n my-data-dashboard-ns1
Note the NAME of your pod (something like my-data-dashboard-<username>-<random-id>). You’ll need this for the next steps.
Test the service using port-forwarding:
Bash
kubectl port-forward pod/<podname> 8501:8501 -n my-data-dashboard-ns1
Replace <podname> with the pod name shown by kubectl get pods in the previous step.
Open your browser and navigate to localhost:8501

Success!
If you can see your Data Dashboard, congratulations! Your app is running in the cloud!
To check the pod status and details:
Bash
kubectl describe pod <podname> -n my-data-dashboard-ns1

What is the purpose of port-forwarding in Kubernetes?

How would you update your app with new changes?

Port-forwarding creates a tunnel from your local computer to the pod in Kubernetes, allowing you to test the app before making it publicly accessible

To update: modify your code, rebuild the container image with a new version number, push it to the registry, update the version in deploy-image.yaml, and run kubectl apply -f deploy-image.yaml again
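
The update workflow depends on giving each rebuilt image a new version tag and using the same tag in deploy-image.yaml. As a small illustration, the sketch below computes the next patch version and the matching image reference. The helper names and the username jdoe are hypothetical, and the three-part version format is an assumption based on the tags used in this tutorial.

```python
def bump_patch(version):
    """Increase the last number of a MAJOR.MINOR.PATCH version string."""
    major, minor, patch = (int(part) for part in version.split("."))
    return f"{major}.{minor}.{patch + 1}"


def image_reference(username, appname, version):
    """Build the registry path in the form used by this tutorial's push command."""
    return f"registry.git.embl.de/{username}/{appname}:{version}"


new_version = bump_patch("0.1.0")
# "jdoe" is a hypothetical username used only for illustration.
print(image_reference("jdoe", "my-data-dashboard", new_version))
```

The printed reference is what you would pass to podman build -t and podman push, and what the image field in deploy-image.yaml would need to contain.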

Conclusion

Congratulations on completing this tutorial! You’ve successfully:

Created a simple web application using Python and Streamlit
Containerized the application with Podman
Deployed it to EMBL’s Kubernetes cloud infrastructure
Made your application accessible (with port-forwarding, and potentially via a public URL)

This tutorial covered the fundamental workflow for cloud deployment. While we used a minimal example, these same principles apply to more complex applications.

Next Steps
Now that you understand the basics, consider:

Expanding your dataset and adding more visualizations

Exploring Streamlit’s advanced features

Learning more about Kubernetes resource management

Experimenting with different container configurations

Building more sophisticated data dashboards for your research

Take some time to review each step and understand how they connect. In the future, you might want to deploy different types of applications using this same workflow!

Key Points

A simple web app requires only 6 files and minimal code

Containers ensure your app runs consistently across different environments

Kubernetes runs your app in the cloud, and an Ingress can make it reachable through a URL

Git is essential for version control, and the container registry at git.embl.de lets you share your container images

Testing locally before deploying to the cloud saves time and catches errors early


Contributions

Author(s): author 1, author 2