# Running an Interactive Service

## What is an Interactive Service?
Transformer Lab supports two distinct types of workloads, designed to map to the different ways researchers work: Tasks and Interactive Services.
### Tasks vs. Interactive Services
- Tasks (Batch Workloads): These are jobs that are queued and scheduled to run automatically when the necessary resources become available. They execute a set of instructions and terminate upon completion. A common use case is training a model.
- Interactive Services (On-Demand): These workloads function like a reservation. When you launch an Interactive Service, a specific computer or set of resources is held exclusively for you. It remains active and available until you explicitly release it.
## Common Use Cases for Interactive Services
Interactive Services are best suited for workflows that require persistent access or real-time interaction.
### 1. Model Inference
Interactive services are ideal for hosting models that need to stay online to serve requests.
- Examples: Running inference servers like Ollama or vLLM to host a model in the cloud and query it via an API, as sketched below.
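
For instance, once a vLLM Server service is up, it exposes an OpenAI-compatible HTTP API you can call from anywhere. Here is a minimal sketch in Python, assuming a placeholder endpoint and model name; substitute the URL and `model_name` from your own service:

```python
import requests

# Both values are placeholders: use the endpoint URL shown for your service
# and the model_name you configured when launching the vLLM Server.
BASE_URL = "http://localhost:8000"
MODEL_NAME = "meta-llama/Llama-3.1-8B-Instruct"

# vLLM serves an OpenAI-compatible API, so a standard chat completion works.
resp = requests.post(
    f"{BASE_URL}/v1/chat/completions",
    json={
        "model": MODEL_NAME,
        "messages": [{"role": "user", "content": "Say hello in one sentence."}],
        "max_tokens": 64,
    },
    timeout=60,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```

The same pattern works with the official `openai` client by pointing its `base_url` at the service endpoint.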
### 2. Exploratory Research & Development
In the early stages of research, you often need an environment to experiment, debug, and iterate quickly without waiting for a queue.
- Tools: Gain direct access to the compute resources via VSCode, Jupyter Notebooks, or SSH.
- Workflow: Use an Interactive Service to prototype your code. Once your script is finalized and stable, you can convert it into a Task to run large-scale training jobs efficiently, as sketched below.
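
As an illustration of that hand-off, the sketch below shows the kind of self-contained entry point a notebook prototype might be distilled into before being submitted as a Task. Everything here (the file name, arguments, and training-loop placeholder) is hypothetical; Transformer Lab does not prescribe this structure:

```python
# train.py -- hypothetical example of a prototype distilled into a batch-ready script.
import argparse


def train(model_name: str, epochs: int) -> None:
    # Placeholder for the training loop you debugged interactively
    # (e.g., in a Jupyter Notebook running on the Interactive Service).
    for epoch in range(epochs):
        print(f"[{model_name}] epoch {epoch + 1}/{epochs} ...")


if __name__ == "__main__":
    # A Task runs unattended, so all inputs arrive as arguments, not notebook cells.
    parser = argparse.ArgumentParser(description="Batch training entry point")
    parser.add_argument("--model-name", required=True)
    parser.add_argument("--epochs", type=int, default=3)
    args = parser.parse_args()
    train(args.model_name, args.epochs)
```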
## Prerequisites
Before running an interactive service, ensure you have a Compute Provider set up and active.
1. Navigate to Team Settings and set up a Compute Provider.
2. Make sure the provider is active by clicking on the health button.
## Steps to Run an Interactive Service
1. Go to the Interact page in Transformer Lab.
2. Click the "New" button to create a new Interactive Service.
3. Select the type of Interactive Service you want to launch: VSCode, Jupyter Notebook, SSH, vLLM Server, or Ollama Server.
4. Configure the service:
   - Enter a name for the service.
   - Select the Compute Provider to use.
   - Specify the resources: CPU, memory, and GPUs.
   - For certain services, provide additional inputs, such as `model_name` for the vLLM Server or an ngrok auth token for services that launch a tunnel.
5. Click "Launch" to start the service.
6. Once launched, a card will appear for the service. Click the "Interactive Setup" button on the card.
7. Follow the provided URL or steps to access the service. For server-type services, you can verify access with a quick API call, as sketched below.
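To confirm a launched server is reachable, make one quick request against its API. Below is a minimal sketch for an Ollama Server, assuming a hypothetical tunnel URL (take the real one from the service card) and that the `llama3.2` model is available on the server:

```python
import requests

# Placeholder: use the URL from the service's "Interactive Setup" card
# (for tunneled services this will be an ngrok address).
OLLAMA_URL = "https://example.ngrok.app"

# Ollama's native generate endpoint; stream=False returns a single JSON object.
resp = requests.post(
    f"{OLLAMA_URL}/api/generate",
    json={"model": "llama3.2", "prompt": "Reply with OK.", "stream": False},
    timeout=120,
)
resp.raise_for_status()
print(resp.json()["response"])
```

A successful response confirms the service is up; from here you can point any application at the same URL.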