Hello, my name is
Mahmoud Mayaleh
And I'm an AI Engineer
Reach out

About me

I'm Mahmoud and I'm an AI engineer

AI Engineer with a Master’s in Artificial Intelligence, skilled in deep learning, NLP, and large-scale model training. Background in distributed AI systems and applied ML research.


My Journey

Research Engineer

CNAM - Paris

  • Researched RDMA + LLM traffic to design scheduling strategies, model topologies, and compute overlaps using ns-3 & SimAI.
  • Simulated GPT workloads (up to 175B on 1024 GPUs) with SimAI to optimize NCCL/RDMA, and prototyped MSCCL collectives for scalability.

Master's in Artificial Intelligence for Connected Industries

CNAM, Paris, France

Back End Developer

Ostim Technical University - IT department

  • Built 5+ Odoo ERP modules, improving workflows and boosting efficiency by 20%.
  • Automated backend tasks with Python, cutting manual data entry by 40% and reducing errors.
  • Optimized database queries, speeding up ERP load times by 25%.

Machine Learning Intern

ArkSigner

  • Built an object detection pipeline that reached 90% accuracy.
  • Integrated AI solutions, improving system performance by 30%.
  • Validated on custom datasets for robust deployment.

Bachelor of Engineering: Computer Engineering

Ostim Technical University, Ankara, Türkiye

Skills

Core stack

Applied AI/ML engineer focused on building end-to-end systems—data pipelines, model training/serving, and MLOps. Strong in Python + PyTorch, with hands-on work in NLP and Computer Vision.

Python
PyTorch
TensorFlow / Keras
scikit-learn
NLP (Transformers)
Computer Vision (OpenCV)

Publications

Bar chart from sentiment classification study

Enhancing Sentiment Classification on Small Datasets

This study compares EDA, back-translation, and contextual (nlpaug-style) augmentation for small-scale sentiment classification. Contextual augmentation worked best for BERT, while EDA and back-translation were more effective for traditional models.
Abstract:

Sentiment classification on small datasets often suffers from limited training data, which restricts how well models generalize. This study evaluates and compares the effectiveness of three data augmentation strategies:

  • EDA (Easy Data Augmentation)
  • Back-translation
  • Contextual token substitution (nlpaug-style)
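
To make the first strategy concrete, here is a minimal sketch of two of the four EDA operations (random swap and random deletion), which need no thesaurus or external model. It is an illustration, not the code used in the study: the function names, the whitespace tokenizer, and the default parameters are all assumptions.

```python
import random


def random_swap(tokens, n_swaps, rng):
    """Return a copy of tokens with n_swaps random position pairs exchanged."""
    out = list(tokens)
    if len(out) < 2:
        return out
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(out)), 2)  # two distinct positions
        out[i], out[j] = out[j], out[i]
    return out


def random_deletion(tokens, p, rng):
    """Drop each token with probability p, always keeping at least one."""
    if len(tokens) <= 1:
        return list(tokens)
    kept = [tok for tok in tokens if rng.random() >= p]
    # If every token was dropped, fall back to one randomly chosen token.
    return kept if kept else [rng.choice(tokens)]


def augment(sentence, n_copies=2, p_delete=0.1, seed=0):
    """Produce n_copies noisy variants of a whitespace-tokenized sentence."""
    rng = random.Random(seed)  # seeded so the augmented set is reproducible
    tokens = sentence.split()
    variants = []
    for _ in range(n_copies):
        swapped = random_swap(tokens, n_swaps=1, rng=rng)
        variants.append(" ".join(random_deletion(swapped, p_delete, rng)))
    return variants
```

Each variant keeps the label of its source sentence, so a call such as `augment("the movie was surprisingly good", n_copies=3)` adds three extra training examples for that label.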

Methods: Tested on both traditional ML classifiers (SVM, Random Forest) and transformer-based models (BERT) using low-resource sentiment datasets.

Results:

  • All augmentation methods improved performance
  • Contextual augmentation gave the most consistent gains for BERT
  • EDA and back-translation were more effective for traditional classifiers
Published in: TechRxiv, 2025
Read Publication
LipLingo CNN lip-reading model schematic

LipLingo: CNN Model for Lip Reading

LipLingo is a deep learning lip-reading system built on LipNet with enhanced preprocessing and spatio-temporal modeling. It achieves 95% character-level and 87% sentence-level accuracy, showing strong potential for assistive tech and biometrics.
Abstract:

Lip reading plays a crucial role in applications such as speech recognition, assistive technologies for the hearing-impaired, and biometric authentication. However, performance often degrades when the speaker varies, the environment changes, or the speech sounds are visually similar. To overcome these challenges, we propose LipLingo:

  • Based on the LipNet architecture
  • Enhanced with standardized preprocessing for consistent mouth-region representation
  • Uses rotational validation for better generalization
  • Combines 3D convolutional layers (spatio-temporal features) and bidirectional recurrent layers (sequence modeling)
  • Trained with the Adam optimizer and CTC loss
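
A model trained with CTC loss emits one class distribution per video frame, including a special blank class, so the frame-level output has to be collapsed into text. The sketch below shows greedy CTC decoding, one common way to do that. It is not taken from the LipLingo code: the function name, the list-based input format, and the choice of index 0 for the blank are assumptions.

```python
def ctc_greedy_decode(frame_scores, alphabet, blank=0):
    """Collapse per-frame scores into text: argmax, merge repeats, drop blanks.

    frame_scores: list of per-frame score lists, one score per class.
    alphabet: maps class index to character; index `blank` is the CTC blank
              (assumed to be 0 here; some frameworks use the last index).
    """
    best_path = [
        max(range(len(scores)), key=scores.__getitem__) for scores in frame_scores
    ]
    chars = []
    previous = None
    for idx in best_path:
        # Emit a character only when the class changes and is not the blank.
        if idx != previous and idx != blank:
            chars.append(alphabet[idx])
        previous = idx
    return "".join(chars)
```

The blank is what lets the decoder keep genuine double letters: a best path of `a a _ a b b _` decodes to `aab`, because the blank separates the two runs of `a`.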

Results:

  • 95% character-level accuracy
  • 87% sentence-level accuracy
  • Outperforms baseline LipNet
Published in: 2025
Read Publication
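
The abstract does not say how the two LipLingo accuracy figures were computed. One common convention, sketched below as an assumption, takes character-level accuracy as one minus the character error rate (edit distance over reference length) and sentence-level accuracy as the share of exact matches. All function names are illustrative.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two strings (insert/delete/substitute = 1)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def character_accuracy(refs, hyps):
    """1 - (total character edits / total reference characters), floored at 0."""
    edits = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    total = sum(len(r) for r in refs)
    return max(0.0, 1.0 - edits / total) if total else 0.0


def sentence_accuracy(refs, hyps):
    """Fraction of predictions that match their reference exactly."""
    if not refs:
        return 0.0
    return sum(r == h for r, h in zip(refs, hyps)) / len(refs)
```

Under this convention a single wrong character costs little at the character level but fails the whole sentence, which is one reason sentence-level scores usually come out lower.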

Projects

LipLingo: Deep Learning Model for Lip Reading

LipLingo logo and model schematic
LipLingo is a lip-reading tool that turns silent video of a speaker’s mouth into text. It learns typical lip-movement patterns, which makes it useful when audio is missing or noisy. View on GitHub

Highlights

  • Visual lip-reading system that converts silent mouth video into text
  • Data & prep: trained and evaluated on the GRID corpus with face/mouth detection, frame normalization, and basic augmentations.

Tech Stack

Python TensorFlow/Keras NumPy OpenCV

Sentiment Analysis on Social Media Comments

Sentiment analysis
Text-classification pipeline extracting emotions from comments using classic ML and lightweight NLP. View on GitHub

Highlights

  • Computed per-comment scores from Twitter_Data.csv and wrote labeled outputs.
  • Generated a sentiment-score histogram and calculated the dataset’s average compound sentiment.
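
The aggregation step can be sketched as follows, assuming each comment already has a compound score in the range [-1, 1], such as the one produced by a lexicon-based scorer like NLTK's VADER. The scorer, the ±0.05 labeling threshold, the bin count, and the function names are assumptions, not details taken from the project.

```python
from collections import Counter
from statistics import mean


def label_from_compound(score, threshold=0.05):
    """Map a compound score to a label using a symmetric threshold."""
    if score >= threshold:
        return "positive"
    if score <= -threshold:
        return "negative"
    return "neutral"


def summarize(compound_scores, n_bins=4):
    """Average score, label counts, and histogram counts over [-1, 1].

    Expects a non-empty list of scores that lie within [-1, 1].
    """
    labels = Counter(label_from_compound(s) for s in compound_scores)
    width = 2.0 / n_bins
    bins = [0] * n_bins
    for s in compound_scores:
        # Clamp so that a score of exactly 1.0 falls in the last bin.
        index = min(int((s + 1.0) / width), n_bins - 1)
        bins[index] += 1
    return {
        "average": mean(compound_scores),
        "labels": dict(labels),
        "histogram": bins,
    }
```

The `histogram` list holds the bar heights a plotting library would draw, and `average` corresponds to the dataset-level compound sentiment.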

Tech Stack

Python scikit-learn NLTK pandas Matplotlib

Data-Driven Sales Trend Predictor

Line and bar chart showing forecasted sales
Sales forecasting with linear regression and clear error metrics for decision-ready insights. View on GitHub

Highlights

  • Sales forecasting tool that predicts future demand per store/item from historical transactions.
  • Pipeline: exploratory data analysis → cleaning & time-indexing → feature engineering (lags, rolling means, seasonality/holiday flags) → model training → backtesting → error analysis.
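
The feature-engineering step can be illustrated with a small sketch that builds lag and rolling-mean features from a daily sales series and scores a forecast with mean absolute error. It uses plain Python lists for clarity. The function names, lag choices, and window size are assumptions rather than the project's actual settings.

```python
def make_features(sales, lags=(1, 7), window=3):
    """Build one row per day with lagged values and a trailing rolling mean.

    Rows start at the first index where every lag and the full window exist,
    so no feature ever looks at the target day or beyond it.
    """
    start = max(max(lags), window)
    rows = []
    for t in range(start, len(sales)):
        row = {f"lag_{k}": sales[t - k] for k in lags}
        past = sales[t - window:t]  # values strictly before day t
        row[f"roll_mean_{window}"] = sum(past) / window
        row["target"] = sales[t]
        rows.append(row)
    return rows


def mean_absolute_error(actual, predicted):
    """Average absolute gap between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)
```

A useful sanity check during backtesting is the naive baseline that predicts each day's sales from `lag_1`. A regression model built on these features should achieve a lower error than that baseline.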

Tech Stack

Python NumPy pandas Matplotlib

Contact me

Name
Mahmoud Mayaleh
Address
Paris, France
Email
mahmoudmayaleh@gmail.com
Message me