Eduardo Heise

About Me

Eduardo Heise

Staff AI Engineer | LLMs, GenAI & Agentic Workflows

Staff AI Engineer who turns ambiguous business problems into production ML and LLM systems. I own end-to-end delivery — from problem framing and architecture through data integration, evaluation, observability, and deployment. My work spans agentic workflows, retrieval systems, ranking, and structured-data decision support across hiring, geospatial, legal, fintech, and IoT domains. I am strongest in roles that combine technical leadership with hands-on engineering across backend, data, and ML systems.

Porto Alegre, Brazil
Staff AI Engineer @ goLance Inc.
PhD Candidate, Computer Science

Resume

Education

PhD Candidate

Pontifical Catholic University of Paraná, Curitiba, Brazil

2021 — 2026

Computer Science

Developing Shapley-based methods to explain how tree ensembles measure pairwise similarity at the feature level — bridging prediction explainability (Tree SHAP) with similarity explainability in forests.

Thesis: Explaining Target-Aware Pairwise Similarity in Tree-Based Models with Tree Similarity SHAP

Visiting Researcher

Université de Rouen Normandie, Rouen, France

Mar. 2025 — Dec. 2025

Doctoral research at LITIS

Research stay at LITIS, a leading laboratory in tree-based methods — home to Prof. Laurent Heutte and Prof. Simon Bernard, whose work on random forest induction, strength-correlation analysis, and ensemble dynamics has shaped the field. Collaborated on tree-ensemble similarity and explainability research.

Master of Science

Pontifical Catholic University of Paraná, Curitiba, Brazil

2018 — 2020

Computer Science

Studied how recommender systems degrade when user behavior shifts over time, and developed an adaptive learning technique for long-history stream-based recommendations.

Bachelor's Degree

Pontifical Catholic University of Paraná, Curitiba, Brazil

2016 — 2020

Computer Science

Awarded the merit prize for best academic performance.

Publications

ADADRIFT: An Adaptive Learning Technique for Long-history Stream-based Recommender Systems

IEEE International Conference on Systems, Man, and Cybernetics (SMC)

2020

Proposes an adaptive method that detects and responds to concept drift in long-history data streams, keeping recommender systems accurate as user behavior shifts over time.

Uses Bayesian networks to predict leprosy reactions from clinical variables, enabling earlier intervention in treatment protocols.

Skills


LLM & GenAI

LangGraph, LangChain, DSPy, Gemini, Vertex AI, OpenAI SDK, Anthropic SDK, RAG, PEFT / LoRA, Whisper, FAISS, LiteLLM, Prompt Engineering, LangSmith

Machine Learning

XGBoost, LightGBM, PyTorch, scikit-learn, TensorFlow, SHAP, Optuna, MLflow, spaCy, NLP / Transformers, Recommender Systems, Computer Vision, Bayesian Networks, Kedro, Tree Ensembles, Explainability / XAI, Shapley Values

Backend & Data

Python, FastAPI, PostgreSQL, Snowflake, BigQuery, DuckDB, PySpark, SQLAlchemy, Pydantic, Redis / Valkey, Google Cloud Pub/Sub, Alembic, httpx, SQL

Infra & DevOps

Docker, Kubernetes / Helm, Google Cloud Platform, AWS, Apache Airflow, GitHub Actions, Prometheus, OpenTelemetry, DataDog, GitOps

Frontend & Product

React, TypeScript, Streamlit, Vite, Tailwind CSS, Electron

Research & Methodology

Experimental Design, Statistical Analysis, Literature Review, Scientific Writing, Reproducible Research, Data Analysis, Academic Collaboration

Selected Projects

Matrix Factorization Recommender System using PyTorch

Heise Mind

Building a collaborative filtering recommender from scratch using matrix factorization in PyTorch — covering embeddings, training loops, and evaluation.
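The core idea is scoring a user-item pair as the dot product of two learned embeddings. A minimal NumPy sketch of the gradient updates (the ratings matrix, learning rate, and latent dimension here are illustrative; the PyTorch version adds autograd and minibatching, but the math is the same):

```python
import numpy as np

# Toy ratings matrix: rows are users, columns are items, 0 marks unobserved.
R = np.array([
    [5, 3, 0, 1],
    [4, 0, 0, 1],
    [1, 1, 0, 5],
    [0, 0, 5, 4],
], dtype=float)

rng = np.random.default_rng(0)
k = 2                                            # latent dimension
U = rng.normal(scale=0.1, size=(R.shape[0], k))  # user embeddings
V = rng.normal(scale=0.1, size=(R.shape[1], k))  # item embeddings
mask = R > 0                                     # fit observed entries only
lr, reg = 0.01, 0.02

for _ in range(2000):
    E = (R - U @ V.T) * mask       # error on observed ratings
    U += lr * (E @ V - reg * U)    # gradient step on user factors
    V += lr * (E.T @ U - reg * V)  # gradient step on item factors

pred = U @ V.T                     # dense matrix of predicted ratings
```

Unobserved cells of `pred` become the recommendations: they are filled in purely from the latent structure shared across users.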

LangChain for Retrieval Augmentation

Heise Mind

End-to-end RAG pipeline showing how to connect document retrieval to LLM generation for grounded, reliable answers.
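Stripped of any specific framework, the retrieve-then-generate shape is simple: embed the query, rank documents by similarity, and ground the prompt in the top hits. A toy sketch with a bag-of-words stand-in for real embeddings (function names are illustrative, not LangChain's API):

```python
import math

def embed(text: str) -> dict:
    # Toy bag-of-words "embedding"; a real pipeline would use a model.
    counts: dict = {}
    for tok in text.lower().split():
        counts[tok] = counts.get(tok, 0) + 1
    return counts

def cosine(a: dict, b: dict) -> float:
    dot = sum(a[t] * b.get(t, 0) for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, docs: list, k: int = 2) -> list:
    # Rank documents by similarity to the query embedding.
    q = embed(query)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

def build_prompt(query: str, docs: list) -> str:
    # Ground the LLM call in the retrieved context.
    context = "\n".join(f"- {d}" for d in retrieve(query, docs))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"
```

The prompt that `build_prompt` returns is what gets sent to the LLM; the grounding step is what makes the answers attributable to sources.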

Fine-Tuning Llama 2 with QLoRA

Heise Mind

Parameter-efficient fine-tuning of Llama 2 using quantized LoRA adapters — making LLM adaptation feasible on consumer GPUs.
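The low-rank idea itself fits in a few lines: freeze the base weight and learn only a small update B·A, scaled by alpha / r. A NumPy sketch with illustrative dimensions (real QLoRA additionally quantizes the frozen weights to 4-bit):

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out, r, alpha = 8, 8, 2, 16

W = rng.normal(size=(d_out, d_in))          # frozen base weight (quantized in QLoRA)
A = rng.normal(scale=0.01, size=(r, d_in))  # trainable down-projection
B = np.zeros((d_out, r))                    # trainable up-projection, zero-initialized

def lora_forward(x):
    # Base path plus low-rank adapter, scaled by alpha / r.
    return W @ x + (alpha / r) * (B @ (A @ x))

x = rng.normal(size=d_in)
```

With B zero-initialized the adapter starts as a no-op, and training touches only r * (d_in + d_out) parameters instead of d_in * d_out, which is why adaptation fits on consumer GPUs.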

Report Generation with LaTeX and LLM

Heise Mind

Using language models to generate structured LaTeX documents — automated report writing with consistent formatting.

Reimplementing Context Rot by Chroma

Heise Mind

Investigating how retrieval quality degrades as context grows — reimplementing Chroma's context rot analysis from scratch.

Experience

goLance Inc.

Dover, DE

Aug. 2025 — Present

Staff AI Engineer

Clients and operations teams had no self-serve way to query workforce analytics — answers about contracts, productivity, payments, and anomalies required manual SQL or support requests. Built a permission-scoped LLM assistant that translates natural language into semantic queries, returning actionable answers with reasoning visibility and inline platform links.

  • No ML infrastructure existed — designed and built the full AI backend from scratch using Python, FastAPI, and Docker, with Prometheus monitoring and structured error handling.
  • Business teams couldn't get answers from fragmented data across Snowflake, DuckDB, and PostgreSQL — built a unified decision workflow that let the LLM application answer product and operational questions with live, context-rich data.
  • Built a ReAct-based LLM agent using LangGraph, DSPy, and Gemini 2.5 Pro/Flash that plans queries through a typed semantic layer, enforces permission scoping at query level, and returns markdown answers with inline links to the platform.
  • Shipped streaming responses with reasoning visibility, audit trails, and session management — all permission-enforced so users only see contracts they have access to.

Andela Inc.

New York, NY

Oct. 2023 — Aug. 2025

Senior AI Engineer

The talent marketplace needed AI across the full hiring funnel — job descriptions lacked skills, matching was shallow, and recruiters couldn't search interview transcripts at scale. Built AI systems spanning deep search, job generation, LLM fine-tuning, and match fitness scoring that product and operations teams could trust.

  • Recruiters couldn't search through thousands of interview transcripts — built a RAG system using pgvector, Vertex AI embeddings, and Gemini so they could query interviews in natural language with source attribution.
  • Job descriptions lacked standardization and missed required skills — generated customized job posts by learning templates from past positions, then fitting new roles into the template, extracting skills and metadata with Gemini structured output.
  • Traditional matching algorithms couldn't understand nuanced fit — fine-tuned Gemini on interview transcripts (Whisper speech-to-text → Vertex AI) to score job-talent fit, outperforming traditional matching on statistical validation.
  • Trained XGBoost and LightGBM match fitness models to score job-talent pairs, using Kedro pipelines for training, evaluation, and hyperparameter tuning.
  • Partnered with product and operations teams to translate ambiguous hiring problems into reliable internal tools with clear feedback loops.

Grupo Index

Remote

Oct. 2022 — Oct. 2023

Deep Learning Engineer

Manual geospatial inspection couldn't scale — reviewers took four days per image batch and missed defects in high-resolution aerial imagery. Built a PyTorch semantic segmentation pipeline for 5B+ pixel imagery that cut review time to 30 minutes and raised detection accuracy from 85% to 95%.

  • Inspectors couldn't keep up with image volume — built PyTorch-based semantic segmentation workflows that automatically processed images exceeding 5 billion pixels each.
  • Initial model quality wasn't production-ready at 85% — improved to 95% through model refinement, post-processing heuristics, and better validation workflows.
  • High-resolution imagery exceeded single-machine capacity — deployed on scalable on-premises infrastructure and shipped a real-time geospatial interface for consuming model output.

Sumersoft Tecnologia

Remote

Sep. 2021 — Oct. 2022

Senior Data Scientist

The company needed an intelligent IoT product but the target microcontroller had only 520 KB of RAM — most ML approaches wouldn't fit on-device. Led the team through prototype development, ensuring the final model and feature-extraction pipeline actually ran on constrained edge devices.

  • Standard ML models were too large for the target microcontroller — compared model and feature-extraction tradeoffs early to find approaches that fit within 520 KB RAM.
  • Led a team of developers and data scientists while actively coding on the first product prototype with embedded TensorFlow and TinyML.
  • Off-the-shelf feature extraction was too expensive — ran HOG-based experiments and lightweight computer vision alternatives for low-power deployment.

Melo Advogados Associados

Remote

Sep. 2020 — Sep. 2021

Senior Data Scientist

Legal teams were manually classifying and reviewing thousands of documents to find leads and draft juridical responses — the process was slow, inconsistent, and couldn't scale with case volume. Built NLP and document-automation workflows that made high-volume legal analysis systematic and repeatable.

  • Manual document classification couldn't keep up with case volume — built transformer-based text classification using BERT semantic vectors that automatically categorized legal documents at scale.
  • Lawyers spent hours drafting tailored juridical documents — used LLMs for automated document generation aligned to specific case requirements, reducing drafting time significantly.
  • Lead identification across thousands of documents was inconsistent and error-prone — automated the pipeline to surface relevant leads systematically, improving client acquisition workflows.

4KST

Curitiba, Brazil

Jul. 2019 — Sep. 2020

Data Scientist

Enterprise clients in telecom and banking needed credit risk and fraud models but lacked clarity on which approach would work best for their data. Compared statistical, online-learning, and gradient-boosted methods across large financial datasets and translated results into client-facing PoCs.

  • No consensus on modeling approach — compared statistical, online-learning, and XGBoost methods on 3- and 6-month payment targets to find the best fit for each client.
  • Client datasets were large and messy — used PySpark across roughly 4 million rows to build reproducible research workflows and production-ready pipelines.
  • Enterprise stakeholders needed to see model value before committing — built interactive PoC dashboards and APIs to communicate model behavior and business impact.

Pontifical Catholic University of Paraná

Curitiba, Brazil

Jun. 2016 — Jul. 2019

Junior Researcher / Data Scientist

R&D role at the intersection of industry and academia, working on applied NLP, recommender systems, and large-scale classification problems where research prototypes needed to become production systems.

  • NLP tasks required fast prototyping across techniques — built pipelines with bag-of-words, TF-IDF, and Word2Vec for text processing and classification.
  • Sentiment analysis datasets were heavily imbalanced — applied over- and undersampling strategies to improve model performance on minority classes.
  • Recommender systems degraded as user behavior shifted over time — contributed to online recommender systems that adapted to concept drift.
  • Publishing-domain classification had extreme label cardinality — supported a classification task with more than 1,000 classes using PySpark.

Portfolio

A small set of technical videos covering personal projects, recommender systems, and hands-on LLM and NLP implementations.
