Uri Maayan

Resume

The long-form version

This is the hyperlinked, war-story version with the bugs and the numbers. For the ATS-friendly one-pager, grab the PDF.

How to read me

I’m a chemical-engineering M.Sc. (BGU, 2024, weighted 93.75 / 100; thesis 94 / 100) who argued the faculty committee into a three-year B.Sc. and finished it, then spent the M.Sc. years (2021–2024) writing numerical software for the thesis, and the ~2 years since (2024–present) building AI/ML systems on my own clock. One continuous computational trajectory.

Since finishing the M.Sc. in 2024 I haven’t held a salaried engineering role, partly by choice, partly by circumstance. The work has been independent: building the five systems below on my own time, with hands-on event-tech work on the side (physical setup for LED walls, projector rigs, and broadcast wiring at events) to keep the lights on. The next goal is to bring this practice into a team where the work compounds.

The short list of what that looks like in practice: a 16M-trajectory-point Langevin simulator and a slip-phase damping-coefficient method with no direct precedent I could find · a live-transcription RAG quoting SaaS across 16 REST modules · a nine-camera detection grid running during active alerts · an MCP orchestrator that lets Claude Code delegate to six free-tier LLMs in a DAG · a sub-8-second siren-to-air Hebrew broadcast system with two AI co-hosts · a pure-PyTorch Mamba2 backbone ported to AMD ROCm on consumer silicon.

What I’m looking for

ML-engineering and applied-research roles where the physics or math of the problem shapes the architecture of the system. Production-AI infrastructure work where someone has to own the numerics is the natural fit — RAG over noisy real-world inputs, real-time CV pipelines, multi-LLM orchestration, custom GPU kernels, model porting onto non-CUDA hardware.

Available for full-time conversations now. Bilingual Hebrew / English, based in Tel Aviv — comfortable with remote, hybrid, or on-site in Israel. Easiest way to reach me: uri@maayan.dev, or book a 15-min call.

Independent work, 2024–2026

Five independent systems — each in production, running live during events, or shipped to a single user (me) and still useful. Short version below; full case studies linked.

EventQuote AI →

Live-transcription RAG quoting platform · Python · FastAPI · Vue 3 · ChromaDB · Whisper

Listens to live client meetings in Hebrew or English, extracts technical requirements with an LLM, matches them to an equipment inventory via RAG, and emits a ready-to-send .docx / .pdf quote in seconds. 16 REST modules, 650+ pytest tests covering 1,360+ assertions, 235-key i18n with full Hebrew RTL, multi-tenant JWT auth, WhatsApp Business bot, QuickBooks / Hashavshevet / Priority ERP integrations. The interesting part: streaming partial-transcript chunks into the RAG pipeline without re-embedding on every token, and keeping live suggestions under perceptible latency while the speaker is still talking.

SkyWatch →

Multi-camera detection grid · Python · OpenCV · ffmpeg HW decode · MOG2 · Hungarian tracker

Nine public camera feeds in parallel with a five-stage detector (ROI crop → MOG2 background subtraction → streak / flash filters → Hungarian multi-frame tracking → false-positive rejection) plus cross-camera temporal correlation. ~10 FPS per stream on one machine with hardware-decoded ffmpeg (AMF / D3D11VA / VideoToolbox). Built and run during active alerts. Operational detail is under NDA; the part I can say is that the real constraints were decoder backpressure and the false-positive budget, not detector accuracy in isolation.

Ozer-AI →

Multi-LLM delegation middleware · Python 3.12 · MCP · asyncio · 6 provider APIs

An MCP server that lets Claude Code delegate structured subtasks to a pool of free-tier LLM providers. DAG-based plan execution with dynamic task claiming under asyncio.Semaphore(4), context injection across dependencies, specialist routing by task type, self-healing fallbacks on failure, and a final senior-model integration-review pass. Routing gets better with use: per-provider, per-task-type outcomes feed a scoring formula that biases future dispatches. The point isn’t cost — it’s keeping the senior model’s tokens for design and review instead of boilerplate.
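
For readers who want the shape of the scheduling idea, here is a minimal sketch of dependency-ordered execution under a concurrency cap. It is an illustration, not Ozer-AI's actual code: `run_dag`, `fake_worker`, and the `plan` dictionary are hypothetical names, the worker stands in for a real provider call, and routing, scoring, and fallbacks are left out. It also assumes the plan has no cycles.

```python
import asyncio

async def run_dag(tasks, deps, worker, max_parallel=4):
    """Run each task once all of its dependencies have finished.

    tasks:  iterable of task ids
    deps:   dict mapping task id -> list of task ids it depends on
    worker: async callable (task_id, context) -> result, where context
            holds the results of the task's direct dependencies
    """
    tasks = list(tasks)
    sem = asyncio.Semaphore(max_parallel)  # cap on concurrent worker calls
    results = {}
    done = {t: asyncio.Event() for t in tasks}

    async def run_one(task_id):
        # Wait for upstream tasks before competing for a slot.
        for d in deps.get(task_id, []):
            await done[d].wait()
        # Inject upstream results as context for this task.
        context = {d: results[d] for d in deps.get(task_id, [])}
        async with sem:
            results[task_id] = await worker(task_id, context)
        done[task_id].set()

    await asyncio.gather(*(run_one(t) for t in tasks))
    return results

async def fake_worker(task_id, context):
    # Hypothetical stand-in for a call to an LLM provider.
    await asyncio.sleep(0.01)
    return f"{task_id}({','.join(sorted(context))})"

plan = {"spec": [], "api": ["spec"], "tests": ["spec"], "review": ["api", "tests"]}
out = asyncio.run(run_dag(plan.keys(), plan, fake_worker))
print(out["review"])  # review(api,tests)
```

In this sketch "api" and "tests" run concurrently once "spec" finishes, and "review" sees both of their results in its context.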

Channel Shesh →

Autonomous Hebrew broadcast system · FastAPI · OBS WebSocket · Gemini · Edge TTS · Redis

Pikud HaOref alert → social-clip fetch → keyframe + audio analysis (Gemini + Whisper in parallel) → dual-host Hebrew script → Edge TTS synthesis → OBS scene switching. Sub-8-second siren-to-air. Two personality-distinct AI co-hosts (Dana / Omer) with a 20-segment Redis short-term memory and per-host ChromaDB long-term memory. Circuit breakers on every external call; a producer dashboard where “kill this segment” is one click. The design axiom: “the stream must not die” is the hardest requirement, and every external call has to be written to survive that.
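
The circuit-breaker idea can be sketched in a few lines. This is a generic illustration under stated assumptions, not the Channel Shesh implementation: the class, its thresholds, `flaky_tts`, and the "filler segment" fallback are all hypothetical.

```python
import time

class CircuitBreaker:
    """After repeated failures, skip the external call for a cool-down period."""

    def __init__(self, max_failures=3, reset_after=30.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.reset_after = reset_after
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the breaker is closed

    def call(self, fn, fallback):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                return fallback()      # open: do not touch the provider
            self.opened_at = None      # cool-down over: allow one trial call
        try:
            result = fn()
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()
            return fallback()          # the caller always gets something usable
        self.failures = 0
        return result

calls = []

def flaky_tts():
    # Hypothetical external call that is currently failing.
    calls.append(1)
    raise TimeoutError("provider down")

breaker = CircuitBreaker(max_failures=2, reset_after=60.0)
outputs = [breaker.call(flaky_tts, lambda: "filler segment") for _ in range(5)]
print(len(calls), outputs[-1])  # 2 filler segment
```

The point of the pattern is that a dead provider costs two failed attempts, not five, and every call still returns content that can go to air.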

Zonos Hebrew TTS →

Pure-PyTorch Mamba2 + Attention backbone · ROCm 7.12 · RX 6800 XT (gfx1030)

Ported a state-space Hebrew TTS model to AMD GPUs by rewriting the hybrid Mamba2 + Attention backbone in pure PyTorch — removing CUDA-only dependencies (mamba_ssm, causal_conv1d, flash_attn). Implemented SSD chunked scan, RMSNormGated, and single-step decode from scratch with weight layouts matching the pretrained checkpoint exactly. War-story catalogue: a bf16 SSM state-drift bug that collapsed audio past ~2.8 s (fixed by forcing the recurrence to fp32 while keeping I/O in bf16); silent corruption in ROCm’s cuDNN SDPA path, bisected down to a math-SDPA fallback; matching flash_attn’s half-split rotary convention exactly; ROCm SDPA memory-access faults under enable_gqa=True. Full write-up →
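
To make the rotary-convention point concrete, here is a small pure-Python sketch contrasting two pairing layouts. It assumes "half-split" means pairing element i with element i + d/2 (as opposed to pairing adjacent elements), and it uses the common base-10000 frequency schedule. Both function names are hypothetical and neither is taken from the Zonos port or from flash_attn.

```python
import math

def rotary_half_split(x, position, base=10000.0):
    """Rotate pairs (x[i], x[i + d/2]) by position-dependent angles."""
    d = len(x)
    half = d // 2
    out = [0.0] * d
    for i in range(half):
        theta = position * base ** (-2.0 * i / d)
        c, s = math.cos(theta), math.sin(theta)
        a, b = x[i], x[i + half]
        out[i] = a * c - b * s
        out[i + half] = a * s + b * c
    return out

def rotary_interleaved(x, position, base=10000.0):
    """Rotate adjacent pairs (x[2i], x[2i + 1]) by the same angles."""
    d = len(x)
    out = [0.0] * d
    for i in range(d // 2):
        theta = position * base ** (-2.0 * i / d)
        c, s = math.cos(theta), math.sin(theta)
        a, b = x[2 * i], x[2 * i + 1]
        out[2 * i] = a * c - b * s
        out[2 * i + 1] = a * s + b * c
    return out

v = [1.0, 2.0, 3.0, 4.0]
hs = rotary_half_split(v, position=1)
il = rotary_interleaved(v, position=1)
# Both are norm-preserving rotations, but they pair different elements,
# so weights trained with one layout produce wrong attention with the other.
print(hs != il)
```

Because both layouts preserve the vector norm, a mismatch does not crash or produce NaNs, which is why this kind of bug tends to show up only as degraded output.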

Research

M.Sc. thesis (2024, 94 / 100), Berkovich Lab, BGU Chemical Engineering. Extracting damping coefficients from nanoscale friction slip dynamics using linear approximations.

Extracts the damping coefficient γ from AFM slip-phase dynamics by fitting the post-slip ringdown to a damped-harmonic-oscillator solution, a direct approach for which I could find no existing method. Validated against a Python Langevin PT simulator (4th-order stochastic Runge-Kutta, Numba-JIT, 16M+ trajectory points). Currently extending the pipeline to molecular-dynamics data. Full technical write-up at /research; simulator source at github.com/Zuzutus/sde-solver.
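
As a toy illustration of pulling γ out of a ringdown, the sketch below fits a straight line to the log of the oscillation peaks. It is a stand-in technique, not the thesis method: it assumes the convention x'' + γx' + ω₀²x = 0, so the envelope decays as exp(-γt/2), and it runs on noise-free synthetic data. `estimate_gamma` and all parameter values are hypothetical.

```python
import math

def estimate_gamma(times, x):
    """Estimate gamma from x(t) ~ A * exp(-gamma * t / 2) * cos(w * t + phi)."""
    # Positive local maxima trace the exponential envelope.
    peaks = [(times[i], x[i]) for i in range(1, len(x) - 1)
             if x[i] > 0 and x[i - 1] < x[i] >= x[i + 1]]
    if len(peaks) < 2:
        raise ValueError("need at least two peaks to fit a decay rate")
    t = [p[0] for p in peaks]
    y = [math.log(p[1]) for p in peaks]
    t_mean = sum(t) / len(t)
    y_mean = sum(y) / len(y)
    # Least-squares slope of log(peak) against time equals -gamma / 2.
    slope = (sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, y))
             / sum((ti - t_mean) ** 2 for ti in t))
    return -2.0 * slope

# Synthetic ringdown with known damping (illustrative values only).
gamma_true, omega, dt = 0.4, 10.0, 1e-3
times = [i * dt for i in range(5000)]
signal = [math.exp(-0.5 * gamma_true * ti) * math.cos(omega * ti) for ti in times]
gamma_est = estimate_gamma(times, signal)
print(round(gamma_est, 2))
```

On real AFM traces, thermal noise and the finite length of the slip phase make peak picking unreliable, which is part of why validating any such estimator against a simulator with known γ matters.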

Education & prior roles

2021–2024
M.Sc. Chemical Engineering — Ben-Gurion University, thesis track. Weighted 93.75 / 100; thesis 94 / 100. Relevant coursework: Deep Learning for Physical Systems & Inverse Problems / PINNs (91, in English), Analytical Math Methods in Chem. Eng. (92), Advanced Thermodynamics (93), Nanomaterials (95).
2021–2023
Teaching Assistant & Tutor — BGU. Instructed undergraduate Chemical Engineering labs, graded Control Systems homework, and tutored Calculus II / General Chemistry individually for the Dean’s Office Students-With-Disabilities Program.
2018–2021
B.Sc. Chemical Engineering — BGU, accelerated three-year track.
2016–2018
Pre-Academic Mathematics & Programming — Open University of Israel. Dean’s List. Calculus I / II, Linear Algebra (mathematicians’ track), intro Java.
2.5 yrs
Non-Commissioned Officer — IDF. Led a small team responsible for base security in central Israel. NCO course completed.

What I’m good at

Primary
What I reach for daily and would happily get asked about in an interview. Python · PyTorch with custom Mamba2 / SSD / RMSNormGated / rotary / SDPA backends, ROCm / HIP · FastAPI · asyncio · LLM orchestration across OpenAI, Gemini, DeepSeek, Groq, Ollama, OpenRouter · ChromaDB and RAG pipelines · Whisper / Hebrew TTS (Zonos, Edge TTS) · Numba JIT · SciPy + NumPy · stochastic differential equations and Runge-Kutta integrators · pytest · MCP servers.
Secondary
Used and competent, not load-bearing. TypeScript / JavaScript · Vue 3 + Vite + Tailwind + Pinia · vue-i18n with Hebrew RTL · SQL / PostgreSQL / SQLAlchemy + Alembic · MATLAB · OpenCV · ffmpeg hardware decode (AMF / D3D11VA / VideoToolbox) · ONNX Runtime / DirectML · OBS WebSocket v5 automation · WebSockets · JWT · httpx · C / C++ for HIP kernel work · Docker · uv · bun · Git · Linux / WSL2.

Publications & signals

  • E. Chetrit et al., “Nonexponential kinetics captured in sequential unfolding of polyproteins over a range of loads,” Current Research in Structural Biology 4 (2022), pp. 106–117. (co-author)
  • Poster, NANO.IL national nanotechnology conference.
  • Selected technical writing: Porting Mamba2 to ROCm — what hipify handles, what has to be rewritten by hand, and where the consumer-RDNA2 path diverges from CUDA.