open source · cloud-agnostic · bioinformatics-first

The right machine for every workload

cloudfit scores and ranks cloud instances across AWS, GCP, and Azure against your workload profile — and stays current as providers deprecate and release new machine types.

terminal
$ pip install cloudfit-core
Successfully installed cloudfit-core-0.1.0
$ python
>>> from cloudfit import rank, WorkloadProfile, MachineType
>>> profile = WorkloadProfile(vcpu=60, ram_gb=224, archetype="io", optimize_for="balanced")
>>> rank(profile, candidates)[0].instance.id
'c2-standard-60'

01

Try it now

The cloudfit API is running live on Hugging Face Spaces with a bundled snapshot of 875 GCP machine types across 5 regions (us-central1, us-east1, us-west1, europe-west4, asia-southeast1). Availability is realistically asymmetric: not every machine type is offered in every region. No credentials needed. There are five endpoints in total: /recommend, /instances, /providers, /diff, and /health.

Open Swagger UI ↗   /health ↗

The Space sleeps when idle, so the first request may take 30 to 60 seconds to wake the container. Subsequent requests respond without that wake-up delay.

POST /recommend  ·  rank machine types
$ curl -sX POST https://chaitanyakasaraneni-cloudfit-api.hf.space/recommend \
  -H 'content-type: application/json' \
  -d '{
    "workload": {"vcpu": 32, "ram_gb": 128, "optimize_for": "balanced"},
    "region": "us-central1",
    "top_k": 3
  }'
→ ranked list of 3 instances, filtered to instances available in us-central1
GET /instances  ·  browse the catalog
$ # filter by region, vCPU, GPU, status
$ curl 'https://chaitanyakasaraneni-cloudfit-api.hf.space/instances?region=europe-west4&min_vcpu=64&limit=5'
$ # asia-southeast1 has fewer families in the bundled snapshot
$ curl 'https://chaitanyakasaraneni-cloudfit-api.hf.space/instances?region=asia-southeast1&limit=5'
→ matching instances with full specs and pricing
GET /providers  ·  snapshot summary
$ # useful for "what's in the snapshot right now?"
$ curl 'https://chaitanyakasaraneni-cloudfit-api.hf.space/providers'
→ per-provider instance count, regions present, and status breakdown (active / deprecated / tombstoned)
POST /diff  ·  compare two workloads
$ curl -sX POST https://chaitanyakasaraneni-cloudfit-api.hf.space/diff \
  -H 'content-type: application/json' \
  -d '{
    "a": {"workload": {"vcpu": 16, "ram_gb": 64}},
    "b": {"workload": {"vcpu": 64, "ram_gb": 256}}
  }'
→ top pick for each + price/hr, monthly cost, vCPU, RAM deltas
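The same /recommend call can be made from Python. The following is a minimal sketch using only the standard library; the helper names `build_recommend_request` and `send` are hypothetical, and the payload simply mirrors the curl example above. The response schema is not documented here, so the sketch treats it as opaque JSON.

```python
import json
import urllib.request

BASE_URL = "https://chaitanyakasaraneni-cloudfit-api.hf.space"


def build_recommend_request(vcpu, ram_gb, region, top_k=3, optimize_for="balanced"):
    """Build (but do not send) a POST /recommend request mirroring the curl call."""
    payload = {
        "workload": {"vcpu": vcpu, "ram_gb": ram_gb, "optimize_for": optimize_for},
        "region": region,
        "top_k": top_k,
    }
    return urllib.request.Request(
        f"{BASE_URL}/recommend",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=90):
    """Send the request and decode the JSON body.

    The generous timeout is an assumption to cover the 30 to 60 second
    wake-up of an idle Space.
    """
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.load(resp)


req = build_recommend_request(vcpu=32, ram_gb=128, region="us-central1")
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
# To actually call the live API (requires network access):
# print(json.dumps(send(req), indent=2))
```

Building the request separately from sending it keeps the payload easy to inspect before anything goes over the network.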

02

How it works

1
Describe your workload
Declare vCPU, RAM, archetype, disk requirements, GPU needs, and spot tolerance in a WorkloadProfile or a YAML file. cloudfit understands five resource archetypes — I/O, CPU, memory, GPU, and burst-parallel.
2
Score every candidate
Hard floor filters eliminate instances that can't meet your minimum requirements. The scoring engine then weights cost, performance, and availability according to your optimize_for mode. No cloud credentials needed.
3
Get a ranked recommendation
Receive a ranked list with a composite score and sub-scores for each candidate. Feed the result into Terraform, Nextflow, or any IaC pipeline. As provider plugins come online, the registry keeps pace with instance families being released and deprecated.
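The hard floor step in the list above can be pictured with a short sketch. This is not cloudfit's implementation: `Candidate`, `floor_violations`, and `apply_hard_floors` are hypothetical names, and only vCPU and RAM minimums are checked, whereas the real filters may cover more fields.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """Hypothetical stand-in for a machine type; not cloudfit's MachineType."""
    id: str
    vcpu: int
    ram_gb: float


def floor_violations(c, min_vcpu, min_ram_gb):
    """Return the reasons a candidate misses a minimum (empty list if none)."""
    reasons = []
    if c.vcpu < min_vcpu:
        reasons.append(f"{c.vcpu} vCPU < {min_vcpu} required")
    if c.ram_gb < min_ram_gb:
        reasons.append(f"{c.ram_gb:g} GB RAM < {min_ram_gb:g} required")
    return reasons


def apply_hard_floors(candidates, min_vcpu, min_ram_gb):
    """Split candidates into those that pass and a map of id -> rejection reasons."""
    kept, rejected = [], {}
    for c in candidates:
        reasons = floor_violations(c, min_vcpu, min_ram_gb)
        if reasons:
            rejected[c.id] = reasons
        else:
            kept.append(c)
    return kept, rejected


candidates = [
    Candidate("c2-standard-60", 60, 240),
    Candidate("c3d-standard-60-lssd", 60, 240),
    Candidate("c7i.24xlarge", 96, 192),
]
kept, rejected = apply_hard_floors(candidates, min_vcpu=60, min_ram_gb=224)
print([c.id for c in kept])  # ['c2-standard-60', 'c3d-standard-60-lssd']
print(rejected)              # {'c7i.24xlarge': ['192 GB RAM < 224 required']}
```

Only the candidates in `kept` would go on to weighted scoring; a rejected candidate keeps its reasons so the final report can explain the disqualification.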

03

Bioinformatics pipelines

I/O  ·  I/O bound  ·  disk-saturating
CPU  ·  CPU bound  ·  thread-parallel
MEM  ·  Memory bound  ·  large index
GPU  ·  GPU / ML  ·  inference
BURST  ·  Burst parallel  ·  scatter-gather

04

Scoring model

score = w_cost × cost_score  +  w_perf × perf_score  +  w_avail × avail_score
Mode          w_cost  w_perf  w_avail  Best for
cost          0.70    0.20    0.10     Batch jobs, dev environments
balanced      0.33    0.34    0.33     Default: production workloads
performance   0.10    0.80    0.10     Latency-sensitive, GPU inference
availability  0.10    0.20    0.70     Long-running, deprecation-sensitive
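To make the formula concrete, here is a small sketch that applies the four weight sets to one candidate. The weights come from the table above; the function name `composite_score`, the assumption that each sub-score lies in [0, 1], and the sample sub-scores are illustrative and not taken from cloudfit.

```python
# (w_cost, w_perf, w_avail) per optimize_for mode, copied from the table above.
WEIGHTS = {
    "cost":         (0.70, 0.20, 0.10),
    "balanced":     (0.33, 0.34, 0.33),
    "performance":  (0.10, 0.80, 0.10),
    "availability": (0.10, 0.20, 0.70),
}


def composite_score(cost_score, perf_score, avail_score, mode="balanced"):
    """Weighted sum of the three sub-scores for the given mode.

    Assumes each sub-score is already normalised to [0, 1].
    """
    w_cost, w_perf, w_avail = WEIGHTS[mode]
    return w_cost * cost_score + w_perf * perf_score + w_avail * avail_score


# Hypothetical sub-scores: cheap, middling performance, good availability.
sub_scores = dict(cost_score=0.9, perf_score=0.5, avail_score=0.8)
for mode in WEIGHTS:
    print(f"{mode:<13} {composite_score(**sub_scores, mode=mode):.2f}")
# cost          0.81
# balanced      0.73
# performance   0.57
# availability  0.75
```

Because every weight set sums to 1, the composite stays in the same [0, 1] range as the sub-scores, so scores remain comparable across modes.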
python example.py
from cloudfit import rank, WorkloadProfile, MachineType

profile = WorkloadProfile(
    vcpu=60,
    ram_gb=224,
    workload="io-intensive",
    archetype="io",
    optimize_for="balanced",
)

# supply candidates yourself, or fetch them live with cloudfit-provider-gcp
candidates = [
    MachineType(id="c2-standard-60", provider="gcp", vcpu=60, ram_gb=240, price_hr=3.13),
    MachineType(id="c3d-standard-60-lssd", provider="gcp", vcpu=60, ram_gb=240, price_hr=3.39),
    MachineType(id="c7i.24xlarge", provider="aws", vcpu=96, ram_gb=192, price_hr=4.28),
]

for r in rank(profile, candidates):
    print(f"{r.instance.id:<30} score={r.score:.2f} ${r.instance.price_hr:.2f}/hr")
#1  c2-standard-60         score=0.81  $3.13/hr
#2  c3d-standard-60-lssd   score=0.80  $3.39/hr
#3  c7i.24xlarge           score=0.00  $4.28/hr  (disqualified: 192 GB RAM < 224 required)
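One way to hand the top pick to Terraform is to write it into a `terraform.tfvars.json` file, which Terraform loads automatically from the working directory. The sketch below assumes your Terraform configuration declares variables named `machine_type` and `region`; those names, the helper `write_tfvars`, and the region value are hypothetical.

```python
import json
import tempfile
from pathlib import Path


def write_tfvars(machine_type, region, directory):
    """Write the chosen machine type to terraform.tfvars.json in `directory`.

    The variable names are assumptions and must match the `variable`
    blocks in your own Terraform configuration.
    """
    path = Path(directory) / "terraform.tfvars.json"
    variables = {"machine_type": machine_type, "region": region}
    path.write_text(json.dumps(variables, indent=2) + "\n", encoding="utf-8")
    return path


# In practice the id would come from rank(profile, candidates)[0].instance.id;
# it is hard-coded here so the sketch runs without cloudfit installed.
with tempfile.TemporaryDirectory() as workdir:
    tfvars = write_tfvars("c2-standard-60", "us-central1", workdir)
    print(tfvars.read_text(encoding="utf-8"))
```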

05

Ecosystem


Author
Chaitanya Krishna Kasaraneni

Software engineer focused on cloud-based data infrastructure for batch and bioinformatics workloads. Published researcher across AI/ML, medical imaging, and computational drug discovery.

Publications