I'm a software engineer by training (NUST, 2025) and an AI engineer by trade. My work sits at the intersection of shipped systems and published research: voice AI pipelines, agentic LLMs, and reinforcement-learning policies for wireless networks.
I've spent time benchmarking PPO in cart-pole and pendulum environments, integrating human advisory signals into RL training loops at McMaster, and training multi-agent DRL frameworks for CR-NOMA IoT, work that turned into two IEEE publications.
Day to day, I build voice agents on VAPI and Deepgram, RAG retrieval layers with pgvector, and LLM orchestration services in FastAPI.
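As a sketch of what such a retrieval layer does under the hood: pgvector's `<=>` operator computes cosine distance, so a top-k query amounts to sorting document embeddings by cosine distance to the query embedding. The snippet below reproduces that ranking step in plain Python; the toy 3-d embeddings and document texts are illustrative stand-ins, not real model output.

```python
import math

def cosine_distance(a, b):
    # What pgvector's <=> operator computes: 1 - cosine similarity.
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm

def top_k(query_emb, docs, k=2):
    # docs: list of (text, embedding) pairs. Mirrors the SQL
    #   SELECT text FROM docs ORDER BY embedding <=> %s LIMIT k
    return sorted(docs, key=lambda d: cosine_distance(query_emb, d[1]))[:k]

# Toy 3-d embeddings; real ones come from an embedding model.
docs = [
    ("wire transfer limits", [0.9, 0.1, 0.0]),
    ("card activation steps", [0.0, 1.0, 0.2]),
    ("branch opening hours", [0.1, 0.0, 1.0]),
]
query = [1.0, 0.2, 0.0]
print([text for text, _ in top_k(query, docs)])
```

In production the sort happens inside Postgres, backed by an approximate index, but the ranking semantics are exactly this.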
Selected work
Projects, shipped and studied.
01
RAG-Based Banking Assistant
Domain-specific LLM assistant for banking, built on LLaMA-3.2-3B with LoRA fine-tuning, FAISS retrieval, and real-time document ingestion.
02
Privacy-preserving ranking system trained with federated averaging on the ANTIQUE QA dataset, matching centralized baselines without pooling user data.
Multiagent Reinforcement Learning for Joint Spectrum and Energy Optimization in CR-NOMA Enabled Internet of Unmanned Agents
Saleha Ahmed, Muhammad Uzair, Syed Asad Ullah, et al.
IEEE Internet of Things Journal, 2025
A cooperative multi-agent DRL framework for CR-NOMA IoT, where distributed agents jointly learn spectrum access and power-control policies under partial observability.
Energy Efficient Uplink Communications for Wireless Powered Networks with EH Diversity: A DRL-driven Strategy
Saleha Ahmed, Muhammad Uzair, Syed Asad Ullah, et al.
IEEE International Conference on Communications (ICC), 2025
DRL-driven transmit-power control for energy-harvesting uplink nodes, evaluated against MRC, SC, and EGC diversity-combining schemes under Rayleigh fading.