Infrastructure · Senior · Full-time
Backend Engineer — LLM Infrastructure
Work on LLM routing, cost tracking, and high-throughput proxy infrastructure. Python, asyncio, Postgres, and a lot of provider-specific edge cases.
- Location: Remote (US / EU)
- Employment: Full-time
- Base salary: $150k – $210k /yr
Python · FastAPI · asyncio · PostgreSQL · Redis · LLM Routing
About the role
You will work on the always-in-path FastAPI proxy that sits between every customer request and every provider. That means guardrails, prompt management, A/B testing, cost attribution, rate limits, and streaming. You will own the parts of the request lifecycle that have to stay correct under real production load.
What you'll do
- Extend the Nemo Backend FastAPI proxy with new guardrails and features
- Own streaming, retries, and provider failover correctness
- Build and maintain cost attribution from x-nemo-request-cost
- Profile and tune hot paths — every millisecond is in the user-facing latency budget
- Harden multi-tenancy isolation at the request layer
Required experience
- 5+ years of backend Python in production
- Deep experience with asyncio and high-concurrency services
- Comfortable with Postgres, connection pooling, and query optimization
- Production experience with streaming APIs or proxies
Nice to have
- Prior work on LLM APIs, model gateways, or SSE streaming
- Experience with LLM routing engines
- Performance profiling and flame graph literacy
Compensation & location
Base salary range
$150k – $210k /yr
Offer depends on experience and location. Equity offered on top of base.
Remote policy
Remote — US / EU
We hire through employer-of-record services in 11+ countries.
Ready to apply?
We reply to every application within 5 business days.