Sebastián Castaño, Developer in Berlin, Germany

Sebastián Castaño

Verified Expert in Engineering

Data Scientist and ML Developer

Berlin, Germany
Toptal Member Since
September 13, 2021

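Sebastián holds a PhD in machine learning and data science and has a decade of experience in interdisciplinary projects in medicine, banking, marketing, and consumer products, among others. His expertise includes designing data collection systems, analyzing and modeling complex data, and developing and deploying ML pipelines. As a seasoned researcher and educator, Sebastián consistently delivers compelling data-driven insights and intuitive tools to technical and non-technical colleagues.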


Stealth Startup
GPT, Generative Pre-trained Transformers (GPT)...
Global Food and Beverage Corporation
Python, Machine Learning, SQL, TensorFlow, Docker, Amazon Web Services (AWS)...
Machine Learning, SQL, Python, R, Java, Neural Networks, Statistical Modeling...




Preferred Environment

Windows, Linux, Spyder, PyCharm, Jupyter Notebook, Scikit-learn, Visual Studio Code (VS Code), Git, Docker

The most amazing...

...project I've developed is an ML-based, closed-loop system that optimizes brain stimulation therapy for patients with Parkinson's disease and essential tremor.

Work Experience


2022 - PRESENT
Stealth Startup
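  • Co-founded a company that develops knowledge management frameworks for IT teams.
  • Developed and implemented a multi-domain corpus analysis system using transformer-based natural language processing models.
  • Designed and deployed AWS infrastructure serving a neural search engine.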
Technologies: Natural Language Processing (NLP), GPT, Generative Pre-trained Transformers (GPT), Amazon Web Services (AWS), SQL, Docker, Generative Systems, OpenAI, Text Processing

Machine Learning Engineer

2021 - 2022
Global Food and Beverage Corporation
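  • Set up and rolled out existing media mix models for new geographic markets and product lines as a member of the MLOps team.
  • Designed, implemented, and deployed a PoC for a media mix model based on Bayesian statistical modeling as a member of the ML R&D team.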
  • Led a team of five ML engineers and data scientists. The team developed, benchmarked, productized, and deployed a next-generation media mix model for a marketing team.
Technologies: Python, Machine Learning, SQL, TensorFlow, Docker, Amazon Web Services (AWS), Bayesian Statistics


2021 - 2021
  • Validated credit risk and accounting models in a large German bank.
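  • Performed predictive and prescriptive statistical analyses of injury prediction and talent development data for soccer players at a Bundesliga team.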
  • Deployed a data management system for a large European bank.
  • Developed MLOps pipelines, including architecture optimization for neural networks, for an in-house project.
Technologies: Machine Learning, SQL, Python, R, Java, Neural Networks, Statistical Modeling, Predictive Analytics, Bayesian Statistics, Data Analysis, Data Analytics, Data Science, Machine Learning Operations (MLOps), Scikit-learn, Probability Theory, PyTorch, Statistics, Statistical Methods, Seaborn, Matplotlib, Database Management Systems (DBMS), Data Warehousing, CI/CD Pipelines, Jupyter, Statistical Analysis, Artificial Intelligence (AI), NumPy, Consulting

Doctoral Research Assistant

2014 - 2020
University of Freiburg
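  • Developed the first machine learning-based, closed-loop, deep brain stimulation system implemented in freely moving patients. The project behind this achievement was carried out in close collaboration with clinicians and industry partners.
  • Established data-driven adaptive deep brain stimulation as a new research area at the university.
  • Published seven research papers in peer-reviewed scientific journals and presented more than 10 papers at scientific workshops and conferences in machine learning, data science, and neuroscience.
  • Supported the machine learning lecture of the computer science master's program for five years by designing exercises, exams, and tutorials. The average attendance of the lecture was around 100 students per semester.
  • Supervised a team of 2-5 (paid) master's research assistants carrying out auxiliary tasks in the lab.
  • Supervised more than 15 students completing their master's and bachelor's theses in the lab.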
Technologies: Machine Learning, Digital Signal Processing, Data Science, Research, Neuroscience, Statistics, Data Analysis, Data Analytics, Writing & Editing, Matplotlib, Pandas, Scikit-learn, Time Series Analysis, Control Theory, Probability Theory, Linear Algebra, Deep Learning, Reinforcement Learning, Python, MATLAB, Technical Writing, PyTorch, Statistical Methods, Data Engineering, Seaborn, Neural Networks, Predictive Analytics, Statistical Modeling, Jupyter, AutoML, Artificial Intelligence (AI), NumPy, Consulting, Classification Algorithms

Research Assistant

2012 - 2014
National University of Colombia
  • Taught a course on analog electronics as the sole lecturer. In addition to preparing lectures, exercise sheets, and exams, I supervised the execution of students' projects.
  • Supervised a (paid) undergraduate teaching assistant for the analog electronics course.
  • Developed new methods for source localization of neural signals, resulting in a scientific publication in a peer-reviewed journal.
Technologies: Bayesian Statistics, Linear Algebra, Research, Neuroscience, MATLAB, Python, Probability Theory, Statistical Analysis, Artificial Intelligence (AI), NumPy

Research Discovery Engine Based on NLP Methods

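Conception and development of a knowledge management and discovery framework for machine learning teams. The product is framed as a discovery engine for machine learning research, addressing the problem of information overload, such as the more than 100 machine learning papers uploaded to arXiv every day.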

Key Activities
• Designed and deployed cloud infrastructure serving the discovery engine.
• Created a business model and go-to-market strategy.

Media Mix Model for Consumer Products

Development, implementation, and deployment of a marketing mix model based on Bayesian statistical modeling for the company's global operations.

Key Activities
• Implemented a learning model based on state-of-the-art research papers.
• Customized the model based on specific properties of the data available.
• Deployed the model to the cloud to be used by the MLOps team.
• Improved the model iteratively using feedback from the business unit.

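The first machine learning-based adaptive deep brain stimulation system implemented in freely moving patients with essential tremor. This project was carried out in collaboration with colleagues at the University of Washington.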

Key Activities
• Conceived and developed the underlying machine learning, control, and digital signal processing methods.
• Executed the data collection experiments.
• Performed offline analysis of the collected data.
• Wrote and edited the final manuscript for a peer-reviewed publication.

Data Analysis of Injury Data in a Soccer Team

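Descriptive and prescriptive analysis of injury data from a Bundesliga soccer team. After I analyzed the data and presented the results to all the stakeholders, including non-technical personnel, the project became part of the team's research department, which studies injury prevention and talent development.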

Key Activities
• Prepared and cleaned the data from several databases.
• Conducted descriptive and prescriptive analyses of the data.

Deployment of a Data Management System


Key Activities
• Configured and deployed UAT and production environments.
• Implemented a CI/CD pipeline for the back and front end.

Decoding Parkinson's Disease Symptoms from Brain Signals
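A novel supervised machine learning method for decoding the intensity of Parkinson's disease symptoms from brain signals. We collected data from seven patients undergoing deep brain stimulation therapy and showed that our novel ML method improved the decoding performance of the symptoms.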

Key Activities
• Designed and executed the data collection experiments.
• Preprocessed data and performed exploratory data analysis.
• Conceived and implemented the novel ML method.
• Validated the novel ML method with the collected data and benchmarked it against state-of-the-art models, including deep convolutional neural networks.
• Applied AutoML for hyperparameter optimization of all considered models.

Data Augmentation Framework for ML in Neuroscience

We tackled the following challenges:
• Unreliability of the available labels
• High level of noise in the raw signals

Key Activities
• Conceived the idea.
• Executed the data analysis.
• Wrote the scientific manuscript published in a peer-reviewed journal.
2014 - 2020

PhD in Computer Science

University of Freiburg - Freiburg, Germany

2012 - 2014

Master's Degree in Engineering

National University of Colombia - Manizales, Colombia

2007 - 2012

Engineer's Degree in Electronics Engineering

National University of Colombia - Manizales, Colombia


Scikit-learn, Matplotlib, Pandas, NumPy, PyTorch, React, TensorFlow


Spyder, MATLAB, Git, Seaborn, Jupyter, AutoML, PyCharm, Excel 2010


Python, SQL, R, Java, C#


Data Science, User Acceptance Testing (UAT)


Windows, Linux, Jupyter Notebook, Docker, JBoss EAP, Amazon Web Services (AWS), Visual Studio Code (VS Code)


Database Management Systems (DBMS)


Digital Signal Processing, Programming, Time Series Analysis, Linear Algebra, Machine Learning, Neuroscience, Deep Learning, Research, Technical Writing, Statistics, Statistical Methods, Data Analytics, Data Analysis, Data Engineering, Neural Networks, Statistical Modeling, Predictive Analytics, Writing & Editing, Statistical Analysis, Artificial Intelligence (AI), Consulting, Classification Algorithms, Generative Pre-trained Transformers (GPT), Probability Theory, Reinforcement Learning, Bayesian Statistics, Algorithms, Electronics, Natural Language Processing (NLP), OpenAI, Text Processing, GPT, Circuit Design, Control Theory, Calculus, Machine Learning Operations (MLOps), Data Warehousing, CI/CD Pipelines, Generative Systems, Large Language Models (LLMs), Business Planning, IT Project Management, Natural Language Understanding (NLU), Customer Research
