Portfolio · MMXXVI · Islamabad / Remote · Open · 2026

Engineering High-Performance Systems for Production AI.

M. AWAIS KHAN · ’25 – present

Muhammad Awais Khan — Systems-focused Cloud Platform & MLOps Engineer specializing in maximizing hardware efficiency, accelerating container orchestration, and eliminating serverless execution bottlenecks.


§ 01 · Signals

Defining outcomes.

Measured in production

01 · RUNPOD · INFERENCE

80s → <1s

Serverless GenAI Cold Start

Eliminated cold starts on RunPod by migrating model weights from ephemeral Docker layers to persistent volume mounts.

02 · TEMPLIX · GPU

2× Throughput

Image-to-Video Pipeline

Doubled throughput of the production image-to-video pipeline serving 50,000+ users of the Templix consumer app, through custom kernel and memory tuning.

03 · DOCKER · CLOUD RUN

60% ↓

Container Footprint

Cut container footprint by 60% using multi-stage Docker builds for an enterprise RAG engine, shipped to GCP Cloud Run with rapid provisioning.

§ 02 · Production Track

AI Engineer, Funsol Technologies

Jul 2025 — Present

Owning the cloud + GPU substrate behind production GenAI products, including Templix — powering 50,000+ users with accelerated image-to-video pipelines.

RunPod Serverless Engineering

Migrated large model weights out of ephemeral Docker layers into persistent volume mounts, cutting serverless inference initialization from roughly 80 seconds to under one second.
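A minimal sketch of the weight lookup this kind of migration implies, assuming the worker prefers a mounted volume over a path baked into the image. The paths `/runpod-volume/models` and `/app/models`, the `MODEL_DIR` environment variable, and the helper name `resolve_weights_dir` are illustrative assumptions, not details of the actual deployment.

```python
import os
from pathlib import Path
from typing import Iterable, Optional

# Hypothetical locations; the real deployment's paths are not stated here.
DEFAULT_CANDIDATES = ("/runpod-volume/models", "/app/models")


def resolve_weights_dir(
    candidates: Iterable[str] = DEFAULT_CANDIDATES,
    env_var: str = "MODEL_DIR",
) -> Optional[Path]:
    """Return the first existing, non-empty weights directory, or None.

    An explicit environment override is checked first, then each candidate
    in order, so a mounted volume takes priority over an image-baked path.
    """
    override = os.environ.get(env_var)
    ordered = ([override] if override else []) + list(candidates)
    for raw in ordered:
        path = Path(raw)
        # Require at least one entry so an empty mount point is skipped.
        if path.is_dir() and any(path.iterdir()):
            return path
    return None
```

Because the weights live outside the image, the container stays small and a fresh worker only has to open files that are already on the volume.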

GPU Memory Optimization

Integrated custom SageAttention kernels and vmtouch file-system caching to maximize GPU memory bandwidth utilization.

Asynchronous Audio Systems

Built high-volume asynchronous audio-to-text pipelines on OpenAI Whisper.
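A minimal sketch of the bounded-concurrency pattern such a pipeline typically relies on, assuming each transcription is an awaitable job. The names `transcribe_all` and `fake_transcribe` are hypothetical, and `fake_transcribe` is a stand-in rather than a real Whisper call.

```python
import asyncio
from typing import Awaitable, Callable, List, Sequence


async def transcribe_all(
    audio_paths: Sequence[str],
    transcribe: Callable[[str], Awaitable[str]],
    max_concurrency: int = 4,
) -> List[str]:
    """Transcribe every path with at most max_concurrency jobs in flight.

    Results come back in the same order as audio_paths.
    """
    semaphore = asyncio.Semaphore(max_concurrency)

    async def worker(path: str) -> str:
        # The semaphore caps how many transcriptions run at once.
        async with semaphore:
            return await transcribe(path)

    return list(await asyncio.gather(*(worker(p) for p in audio_paths)))


async def fake_transcribe(path: str) -> str:
    # Stand-in for a real speech-to-text call; yields control once.
    await asyncio.sleep(0)
    return f"transcript of {path}"
```

Calling `asyncio.run(transcribe_all(["a.wav", "b.wav"], fake_transcribe))` runs both jobs and returns their transcripts in input order. The cap keeps a burst of uploads from oversubscribing the model backend.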

§ 03 · Systems Dossier

Repositories & deployed systems.

github.com/AwaisKhan-01 ↗

01 · GitHub ↗

Custom CUDA Neural Network Acceleration

Implemented CUDA kernels from scratch — shared memory tiling and coalesced global memory access patterns to optimize matrix multiply-accumulate workloads.

C++ · CUDA · Tensor Cores · Parallel Computing
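The tiling idea can be illustrated without a GPU. Below is a minimal sketch in plain Python, assuming row-major nested lists and a hypothetical `tile` size. Each output tile accumulates products from matching tiles of the two inputs, which is the loop structure a CUDA kernel follows when it stages tiles in shared memory. This is a CPU illustration of the blocking pattern only, not the repository's kernel.

```python
from typing import List

Matrix = List[List[float]]


def tiled_matmul(a: Matrix, b: Matrix, tile: int = 2) -> Matrix:
    """Multiply a (n x k) by b (k x m) one tile at a time."""
    n, k, m = len(a), len(b), len(b[0])
    if any(len(row) != k for row in a):
        raise ValueError("inner dimensions of a and b do not match")
    c = [[0.0] * m for _ in range(n)]
    # Outer loops pick a tile of the output and a slice of the shared
    # dimension; on a GPU each (i0, j0) pair would map to a thread block.
    for i0 in range(0, n, tile):
        for j0 in range(0, m, tile):
            for k0 in range(0, k, tile):
                # Inner loops reuse the same small tiles of a and b,
                # clamped at the matrix edges for ragged sizes.
                for i in range(i0, min(i0 + tile, n)):
                    for j in range(j0, min(j0 + tile, m)):
                        acc = c[i][j]
                        for kk in range(k0, min(k0 + tile, k)):
                            acc += a[i][kk] * b[kk][j]
                        c[i][j] = acc
    return c
```

The payoff of blocking is data reuse: each loaded tile contributes to many multiply-accumulate operations before it is replaced.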

02 · GitHub ↗