Built for developers. Designed for scale.

APIs and infrastructure that get out of your way.

Platform Features

Everything you need to build intelligent data workflows

1. Catalyzed Data

Query across your private datasets and public marketplace data with standard SQL. Vector search and joins in a single query.

500+ connectors | Automatic indexing | Petabyte scale | Billions of vectors
import requests

API_URL = "YOUR_API_URL"
API_KEY = "YOUR_API_KEY"

# Find patents similar to a concept and join with claims
response = requests.post(
    f"{API_URL}/queries",
    headers={
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json",
    },
    json={
        "tables": {
            "patents": "PATENTS_TABLE_ID",
            "claims": "CLAIMS_TABLE_ID",
        },
        "sql": """
            SELECT p.title, p.assignee, p._distance, c.claim_text
            FROM knn_search('patents', 'abstract_embedding',
                text_to_embedding('machine learning image classification'), 5) p
            JOIN claims c ON p.patent_id = c.patent_id
            WHERE c.claim_type = 'independent'
            ORDER BY p._distance
        """,
    },
)

result = response.json()
# {
#   "rows": [
#     {"title": "Deep Learning System for Radiological Image Analysis", "_distance": 0.127, ...},
#     {"title": "Automated Diagnostic Imaging Pipeline", "_distance": 0.183, ...},
#     {"title": "Neural Network for CT Scan Interpretation", "_distance": 0.241, ...}
#   ],
#   "rowCount": 3
# }

2. Catalyzed Orchestration

Define workflows once, run at scale. Trigger pipelines via API and stream real-time progress events.

API/SDK-defined pipelines | Real-time streaming | LLM integration | Human-in-the-loop
import requests
import json

API_URL = "YOUR_API_URL"
API_KEY = "YOUR_API_KEY"
PIPELINE_ID = "YOUR_PIPELINE_ID"

# Trigger pipeline with streaming enabled
response = requests.post(
    f"{API_URL}/pipelines/{PIPELINE_ID}/trigger",
    headers={
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json",
        "Accept": "text/event-stream",
    },
    json={
        "input": {
            "files": {"documents": ["file_abc123", "file_def456"]},
            "dataInputs": {"extractionPrompt": "Extract key contract terms"},
        }
    },
    stream=True,
)

# Stream execution events as they arrive
for line in response.iter_lines():
    if line:
        event = json.loads(line.decode().removeprefix("data: "))
        print(f"{event['type']}: {event.get('message', event.get('output', ''))}")

# Events:
# {"type": "started", "executionId": "exec_abc123"}
# {"type": "progress", "pct": 25, "message": "Processing document 1 of 2..."}
# {"type": "progress", "pct": 75, "message": "Extracting terms..."}
# {"type": "completed", "output": {"extractedTerms": [...], "confidence": 0.94}}

3. Catalyzed Control

Capture expert feedback, measure quality with evaluations, and continuously improve pipelines with AI-generated synthesis.

Feedback capture | Evaluation | Active learning | Continuous improvement
import requests

API_URL = "https://api.catalyzed.ai"
API_KEY = "YOUR_API_KEY"
PIPELINE_ID = "YOUR_PIPELINE_ID"

# 1. Capture expert feedback on a pipeline execution
requests.post(
    f"{API_URL}/signals",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={
        "teamId": "team_abc123",
        "data": {"type": "comment", "text": "Missing key financial metrics"},
        "executionIds": ["exec_xyz789"],
    },
)

# 2. Run evaluation against ground truth examples
eval_response = requests.post(
    f"{API_URL}/pipelines/{PIPELINE_ID}/evaluate",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={"exampleSetId": "examples_abc", "evaluatorType": "llm_judge"},
)
evaluation = eval_response.json()
print(f"Evaluation started: {evaluation['evaluationId']}")

# 3. Generate improvements from accumulated feedback
synth = requests.post(
    f"{API_URL}/pipelines/{PIPELINE_ID}/synthesize",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={"handlerType": "apg_v1"},
).json()

# 4. Apply proposed changes
if synth["status"] == "generated":
    requests.post(
        f"{API_URL}/synthesis-runs/{synth['synthesisRunId']}/apply",
        headers={"Authorization": f"Bearer {API_KEY}"},
    )

Beta: These APIs are available now. View documentation →

Signals | Evaluations | Synthesis | Full Guide

Architecture

Built on open standards

[Architecture diagram: Connectors (500+) → Storage (Parquet/Arrow) → Query (SQL + Vector) → Orchestration → Outputs, with a feedback loop closing back into the pipeline]

Open formats

Parquet, Arrow, and standard SQL. Your data stays portable.

No vendor lock-in

Export anytime in standard formats. No proprietary schemas.
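
As a sketch of what that portability looks like in practice: a table exported as Parquet can be read by any Arrow-compatible tool. The export route below is an assumption for illustration, not the documented endpoint.

import requests
import pyarrow.parquet as pq

API_URL = "https://api.catalyzed.ai"
API_KEY = "YOUR_API_KEY"
TABLE_ID = "YOUR_TABLE_ID"

# NOTE: hypothetical export route, shown for illustration only --
# check the API docs for the actual path and parameters.
resp = requests.get(
    f"{API_URL}/tables/{TABLE_ID}/export",
    headers={"Authorization": f"Bearer {API_KEY}"},
    params={"format": "parquet"},
    stream=True,
)
with open("export.parquet", "wb") as f:
    for chunk in resp.iter_content(chunk_size=1 << 20):
        f.write(chunk)

# Standard Parquet: pyarrow, pandas, DuckDB, Spark, etc. can all read it
table = pq.read_table("export.parquet")
print(table.schema)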

Your cloud or ours

SaaS, hybrid, or fully on-prem for enterprise customers.

Security & Compliance

Built for regulated industries

SOC 2 Aligned

Controls in place, formal audit planned. Documentation available on request.

HIPAA Ready

BAA available for healthcare customers. PHI handling protocols in place.

GDPR Compliant

DPA available. Data residency options for EU customers.

Data Residency

Enterprise customers can specify where data lives and where queries execute.
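
For illustration only, region pinning could surface as a per-request field like the one below. The "region" parameter here is hypothetical; actual residency controls are configured with our team per deployment.

import requests

API_URL = "https://api.catalyzed.ai"
API_KEY = "YOUR_API_KEY"

# NOTE: "region" is a hypothetical field used for illustration;
# real residency configuration is set up per enterprise deployment.
response = requests.post(
    f"{API_URL}/queries",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={
        "tables": {"records": "RECORDS_TABLE_ID"},
        "sql": "SELECT count(*) FROM records",
        "region": "eu-west-1",  # hypothetical: pin query execution to the EU
    },
)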

Need to review our security posture? Request our security documentation →

Ready to build?

Explore the docs or talk to our engineering team about your use case.