Sakayo Toadoum Sari


About

I hold a Master's degree in Machine Learning from the African Master of Machine Intelligence (AMMI) and a Cooperative Master's in Industrial Mathematics with a focus on Data Science. I am a lifelong learner and keep studying AI and ML every day. I am also a physicist specialized in Electronics, Electrical Engineering and Automation (research). My main interest is currently Natural Language Processing (Machine Translation and Conversational AI). I am looking for an internship or PhD position in NLP and Machine Learning to solve real-world problems.

NLP/Machine Learning Researcher & Software Engineering Enthusiast

As an NLP researcher, I work on many projects related to African languages, and I am also a Machine Learning Freelancer at STEM-Away. I am a Google and Meta Scholar at the African Master of Machine Intelligence (AMMI) at the African Institute for Mathematical Sciences (AIMS) Senegal. I am also an applied physics researcher.

  • Birthday: February 16th
  • Phone: +235 66 04 90 94 (WhatsApp)
  • City: N'Djamena, Chad
  • Degree: Master
  • Email: tsakayo@aimsammi.org
  • Freelance: Available

Skills

As an NLP researcher, a software engineering student and a machine intelligence student, I have developed the following skills:

HTML 80%
CSS 80%
R 50%
Python 80%
C 50%
PyTorch 70%
Django 60%
JAX 60%

Resume

Sakayo is a young, dynamic, goal-driven and results-oriented professional with 1.5+ years of experience in the field of NLP.

Summary

Sakayo Toadoum Sari

Sakayo is open to new learning opportunities to grow his career and unlock new horizons and challenges.

  • N'Djaména, Chad
  • (+235) 66-04-90-94 (WhatsApp)
  • tsakayo@aimsammi.org

Education

Master of Machine Intelligence

2022 - 2023

African Institute for Mathematical Sciences, M'Bour, Senegal

Activities and societies: Courses taught include, but are not limited to, Foundations of Machine Learning, Mathematics for Machine Learning, Kernel Methods for Machine Learning, Natural Language Processing, Reinforcement Learning, and Computer Vision.

AMMI is a novel one-year intensive graduate program, fully funded by Google and Facebook, that provides brilliant young Africans with state-of-the-art training in machine learning and its applications. The AMMI program prepares well-rounded machine intelligence researchers who respond to both present and future needs of Africa and the world.

Cooperative Master's in Industrial Mathematics

2018 - 2020

African Institute for Mathematical Sciences, Limbe, Cameroon

Activities and societies: Give-back activities, football.

The Co-operative Master’s program includes two work placements of three and four months alongside the regular coursework. Successful completion of all coursework and both work placements is required for graduation. The program is designed specifically to prepare students for work in the areas of big data and IT security, and applicants are asked to choose their preferred area of focus. Courses include: Data Science, PDEs of Mathematical Physics, Mathematical Modelling of Dynamics Systems, Advanced Data Analysis and Visualization Techniques, Differential Calculus, Python Programming, Measure Theory and Probability, Linear Algebra, Statistical Inference, Geostatistical for Public Health, Mathematical and Numerical Aspects for Vibration Control, Mathematical Modelling of Infectious Diseases, Operations Research, Stochastic Analysis, Financial Mathematics, and Professional Development.

Master's Degree in Electronics, Electrical Engineering and Automation

2017 - 2019

The University of Ngaoundéré, Ngaoundéré, Cameroon

The University of Ngaoundéré (French: Université de Ngaoundéré) is a public university located in Ngaoundéré, Adamawa Region in Cameroon. It was established on 19 January 1993 by Presidential decree.

Professional Experience

NLP Manager

October 2023 - Now

Batazia, Rotterdam, Holland

  • Lead a team of NLP researchers and data scientists in developing cutting-edge NLP models, algorithms, and tools tailored for African languages;
  • Identify vulnerabilities, and provide a comprehensive information strategy;
  • Work closely with our engineering and product teams to ensure the successful integration of NLP solutions into the product;
  • Oversee the acquisition and curation of language data relevant to African languages in collaboration with data acquisition teams;
  • Ensure the ethical and responsible use of data in compliance with industry standards and data privacy regulations;
  • Implement and maintain rigorous quality assurance processes for NLP models and systems, including testing, evaluation, and continuous improvement;
  • Guarantee that our NLP solutions meet high accuracy, reliability, and scalability standards.

Summer Research Internship in NLP

June 2023 - September 2023

École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland

    Worked on modeling and predicting online polarization using Natural Language Processing techniques.
  • Designed a pipeline to generate a dataset with a safe similarity threshold;
  • Fine-tuned a language model using parameter-efficient approaches (LoRA and QLoRA);
  • Developed an application for labeling online news articles.

Machine Learning Freelancer

2022 - Now

STEM-Away, Santa Clara County, California, United States · Remote

    I mostly work on NLP projects for STEM-Away:
  • Named Entity Recognition
  • Text Summarization
  • Sentiment Analysis
  • Salary Prediction
  • Text Classification

Lead Trainer in Data

2022 - Now

Tech4Tchad, N'Djamena, Chad

    I work on the professional Bachelor's program in Data Development at Université de N'Djaména. Besides managing other trainers, I teach:
  • Introduction to Python programming;
  • Introduction to Machine Learning;
  • Git & GitHub.

Data Science & Business Analytics intern

May 2022 - August 2022

The Sparks Foundation, Singapore · Remote

  • Data analysis (retail, terrorism, sport, Covid-19);
  • Data visualization;
  • Supervised Learning & Unsupervised Learning.

Machine Learning Intern, Project Manager Lead

June 2022 - September 2022

STEM-Away, Santa Clara County, California, United States · Remote

    Recommender systems and classification using Natural Language Processing (Sentence BERT)

Web Developer and Machine Learning Researcher Intern

July 2019 - January 2020

Limbe City Council, Limbe, Cameroon

  • Data collection and Data entry;
  • Development of a database management system with website as an interface for civil status documents;
  • Data Science research project: "Hybrid Architecture for Long-Short Term Memory and Autoregressive: Application on Time Series data for Birth Prediction."

Projects

I have contributed to or worked on the following projects as part of my studies and personal research.

Ngambay-French Neural Machine Translation

The aim of this project is to build a Ngambay-French bitext dataset for Ngambay, a low-resource language, and to fine-tune pretrained models on this data.

Masakhane Decolonise

A collaborative research project from Masakhane. An ongoing research project about text summarization and Machine Translation using scientific

AfricaNEWS: News Topic Classification for African languages

A collaborative research project from Masakhane. We developed a new dataset for news topic classification in 13 African languages, along with baseline and few-shot learning experiments.

Hybrid Deep Recurrent Neural Network (LSTM)

The aim is to combine the best statistical model and an LSTM in a single architecture that takes advantage of both models: the statistical model captures linear phenomena and the LSTM captures nonlinear ones.

Linear Regression with Regularization and Cross-Validation From Scratch

In this project we have created a linear regression model with regularization and cross-validation from scratch.
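As an illustration only, here is a minimal sketch of what ridge-regularized linear regression with k-fold cross-validation can look like without any ML library. It is not the project's actual code: the function names (`solve`, `ridge_fit`, `cross_validate`), the synthetic data, and the candidate penalty values are all assumptions made for this example.

```python
import random

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def ridge_fit(X, y, lam):
    """Closed-form ridge: solve (X^T X + lam * I) w = X^T y.
    Column 0 of X is assumed to be the constant 1 (bias) and is not penalised."""
    d = len(X[0])
    A = [[sum(row[i] * row[j] for row in X) for j in range(d)] for i in range(d)]
    for i in range(1, d):
        A[i][i] += lam
    b = [sum(row[i] * t for row, t in zip(X, y)) for i in range(d)]
    return solve(A, b)

def mse(X, y, w):
    """Mean squared error of the linear model w on (X, y)."""
    preds = [sum(wi * xi for wi, xi in zip(w, row)) for row in X]
    return sum((p - t) ** 2 for p, t in zip(preds, y)) / len(y)

def cross_validate(X, y, lam, k=5):
    """Average held-out MSE over k interleaved folds."""
    n = len(y)
    errors = []
    for fold in range(k):
        held = set(range(fold, n, k))
        train = [i for i in range(n) if i not in held]
        w = ridge_fit([X[i] for i in train], [y[i] for i in train], lam)
        errors.append(mse([X[i] for i in held], [y[i] for i in held], w))
    return sum(errors) / k

# Synthetic data (assumed for the example): y = 2 + 3x + small noise.
rng = random.Random(0)
xs = [rng.uniform(-1, 1) for _ in range(40)]
X = [[1.0, x] for x in xs]
y = [2.0 + 3.0 * x + rng.gauss(0, 0.1) for x in xs]

# Pick the penalty with the lowest cross-validated error, then refit on all data.
lambdas = [0.0, 0.1, 1.0, 10.0]
best_lam = min(lambdas, key=lambda lam: cross_validate(X, y, lam))
w = ridge_fit(X, y, best_lam)
```

The penalty is selected on held-out folds rather than on training error, since training error alone would always favour no regularization.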

Sentiment Analysis Using Bernoulli Naive Bayes

Here, we used Bernoulli Naive Bayes to classify sentiment in the movie reviews dataset provided by Kaggle.
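For readers unfamiliar with the model, the following is a minimal sketch of a Bernoulli Naive Bayes text classifier with Laplace smoothing. It is not the project's code: the class name `BernoulliNB`, the whitespace tokenization, and the four toy reviews are assumptions made for illustration, and the Kaggle dataset is not used here.

```python
import math

class BernoulliNB:
    """Bernoulli Naive Bayes over binary word-presence features."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # Laplace smoothing strength

    def fit(self, docs, labels):
        self.vocab = sorted({w for d in docs for w in d.split()})
        self.classes = sorted(set(labels))
        self.log_prior = {}
        self.p = {}  # p[c][w] = smoothed P(word w present | class c)
        for c in self.classes:
            class_docs = [set(d.split()) for d, l in zip(docs, labels) if l == c]
            n_c = len(class_docs)
            self.log_prior[c] = math.log(n_c / len(docs))
            self.p[c] = {
                w: (sum(w in d for d in class_docs) + self.alpha)
                   / (n_c + 2 * self.alpha)
                for w in self.vocab
            }
        return self

    def predict(self, doc):
        present = set(doc.split())

        def score(c):
            # Unlike multinomial NB, absent vocabulary words also contribute.
            s = self.log_prior[c]
            for w in self.vocab:
                p = self.p[c][w]
                s += math.log(p) if w in present else math.log(1.0 - p)
            return s

        return max(self.classes, key=score)

# Toy reviews, made up for this example.
docs = [
    "great movie loved it",
    "wonderful acting great plot",
    "boring movie hated it",
    "terrible plot awful acting",
]
labels = ["pos", "pos", "neg", "neg"]

model = BernoulliNB().fit(docs, labels)
print(model.predict("great acting"))
print(model.predict("awful boring plot"))
```

Words that do not appear in the training vocabulary are simply ignored at prediction time, which is one common convention for this model.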

Publications

Here is the list of my publications as author/co-author

MasakhaNEWS: News Topic Classification for African languages

Abstract: Recently, prompt-based tuning of large language models (LLMs) is growing in popularity, and have shown to be very effective for few-shot learning with as little as 5 or 10 labelled examples. However, most extensive evaluations are mostly done in English and other high-resource languages, and it is unclear how effective this prompting approach perform for low-resource languages. In this paper, we extend this evaluation to low-resource African languages on the news topic classification dataset. Despite the simplicity of the task, it is still often part of the evaluation of prompt-based LLMs...

Error Correction Based Deep Neural Networks for Modeling and Predicting South African Wildlife–Vehicle Collision Data

Abstract: The seasonal autoregressive integrated moving average with exogenous factors (SARIMAX) has shown promising results in modeling small and sparse observed time-series data by capturing linear features using independent and dependent variables. Long short-term memory (LSTM) is a promising neural network for learning nonlinear dependence features from data. With the increase in wildlife roadkill patterns, the SARIMAX-only and LSTM-only models would likely fail to learn the precise endogenous and/or exogenous variables driven by this wildlife roadkill data. In this paper, we design and implement an error correction mathematical framework based on LSTM-only...

Ngambay-French Neural Machine Translation (sba-Fr)

Abstract: In Africa, and the world at large, there is an increasing focus on developing Neural Machine Translation (NMT) systems to overcome language barriers. NMT for Low-resource language is particularly compelling as it involves learning with limited labelled data. However, obtaining a well-aligned parallel corpus for low-resource languages can be challenging. The disparity between the technological advancement of a few global languages and the lack of research on NMT for local languages in Chad is striking. End-to-end NMT trials on low-resource Chad languages have not been attempted. Additionally, there is a dearth of online and well-structured data gathering for research in Natural Language Processing, unlike some African languages....

Contact

You can reach out to me through the following channels

Location:

N'Djamena, Chad

Call:

+235 66 04 90 94
