Sebastián Castaño, Developer in Berlin, Germany

Sebastián Castaño

Verified Expert in Engineering

Data Scientist and ML Developer

Location
Berlin, Germany
Toptal Member Since
September 13, 2021

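Sebastián holds a PhD in machine learning and data science and has a decade of experience in interdisciplinary projects in medicine, banking, marketing, and consumer products, among others. His expertise includes designing data collection systems, analyzing and modeling complex data, and developing and deploying ML pipelines. As a seasoned researcher and educator, Sebastián consistently delivers compelling data-driven insights and intuitive tools to technical and non-technical colleagues.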

Portfolio

Stealth Startup
GPT, Generative Pre-trained Transformers (GPT)...
Global Food and Beverage Corporation
Python, Machine Learning, SQL, TensorFlow, Docker, Amazon Web Services (AWS)...
D-fine
Machine Learning, SQL, Python, R, Java, Neural Networks, Statistical Modeling...

Experience

Availability

Part-time

Preferred Environment

Windows, Linux, Spyder, PyCharm, Jupyter Notebook, Scikit-learn, Visual Studio Code (VS Code), Git, Docker

The most amazing...

...project I've developed is an ML-based, closed-loop system that optimizes brain stimulation therapy for patients with Parkinson's disease and essential tremor.

Work Experience

Co-founder

2022 - PRESENT
Stealth Startup
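  • Co-founded a company that develops a knowledge management framework for IT teams.
  • Developed and implemented a multi-domain corpus analysis system using transformer-based natural language processing models.
  • Designed and deployed AWS infrastructure to serve a neural search engine.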
Technologies: Natural Language Processing (NLP), GPT, Generative Pre-trained Transformers (GPT), Amazon Web Services (AWS), SQL, Docker, Generative Systems, OpenAI, Text Processing

Machine Learning Engineer

2021 - 2022
Global Food and Beverage Corporation
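  • Built and rolled out existing media mix models for new geographic markets and product lines as part of the MLOps team.
  • Designed, implemented, and deployed a PoC for a media mix model based on Bayesian statistical modeling as a member of the ML R&D team.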
  • Led a team of five ML engineers and data scientists. The team developed, benchmarked, productized, and deployed a next-generation media mix model for a marketing team.
Technologies: Python, Machine Learning, SQL, TensorFlow, Docker, Amazon Web Services (AWS), Bayesian Statistics

Consultant

2021 - 2021
D-fine
  • Validated credit risk and accounting models in a large German bank.
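  • Performed predictive and prescriptive statistical analyses of injury prediction and talent development data for soccer players at a Bundesliga team.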
  • Deployed a data management system for a large European bank.
  • Developed MLOps pipelines, including architecture optimization for neural networks, for an in-house project.
Technologies: Machine Learning, SQL, Python, R, Java, Neural Networks, Statistical Modeling, Predictive Analytics, Bayesian Statistics, Data Analysis, Data Analytics, Data Science, Machine Learning Operations (MLOps), Scikit-learn, Probability Theory, PyTorch, Statistics, Statistical Methods, Seaborn, Matplotlib, Database Management Systems (DBMS), Data Warehousing, CI/CD Pipelines, Jupyter, Statistical Analysis, Artificial Intelligence (AI), NumPy, Consulting

Doctoral Research Assistant

2014 - 2020
University of Freiburg
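  • Developed the first machine learning-based, closed-loop, deep brain stimulation system implemented in freely moving patients. The project behind this achievement was carried out in close collaboration with clinicians and industry partners.
  • Established data-driven adaptive deep brain stimulation as a new research field at the university.
  • Published seven research papers in peer-reviewed scientific journals and presented more than 10 papers at scientific workshops and conferences in machine learning, data science, and neuroscience.
  • Supported the machine learning lecture in the computer science master's program for five years by designing exercises, exams, and tutorials. The average attendance of the lecture was around 100 students per semester.
  • Supervised a team of two to five (paid) master's research assistants carrying out auxiliary tasks in the lab.
  • Supervised more than 15 students completing their master's and bachelor's theses in the lab.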
Technologies: Machine Learning, Digital Signal Processing, Data Science, Research, Neuroscience, Statistics, Data Analysis, Data Analytics, Writing & Editing, Matplotlib, Pandas, Scikit-learn, Time Series Analysis, Control Theory, Probability Theory, Linear Algebra, Deep Learning, Reinforcement Learning, Python, MATLAB, Technical Writing, PyTorch, Statistical Methods, Data Engineering, Seaborn, Neural Networks, Predictive Analytics, Statistical Modeling, Jupyter, AutoML, Artificial Intelligence (AI), NumPy, Consulting, Classification Algorithms

Research Assistant

2012 - 2014
National University of Colombia
  • Taught a course on analog electronics as the sole lecturer. In addition to preparing lectures, exercise sheets, and exams, I supervised the execution of students' projects.
  • Supervised a (paid) undergraduate teaching assistant for the analog electronics course.
  • Developed new methods for source localization of neural signals, resulting in a scientific publication in a peer-reviewed journal.
Technologies: Bayesian Statistics, Linear Algebra, Research, Neuroscience, MATLAB, Python, Probability Theory, Statistical Analysis, Artificial Intelligence (AI), NumPy

Research Discovery Engine Based on NLP Methods

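Conception and development of a knowledge management and discovery framework for machine learning teams. The product is framed as a discovery engine for machine learning research, addressing the problem of information overload, such as the 100+ machine learning papers uploaded to arXiv every day.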

Key Activities
• Implemented a PoC system consisting of a discovery engine for machine learning research that uses large language models.
• Designed and deployed cloud infrastructure serving the discovery engine.
• Created a business model and go-to-market strategy.
• Conducted user discovery and development interviews with more than 50 respondents.

Media Mix Model for Consumer Products

Development, implementation, and deployment of a marketing mix model based on Bayesian statistical models for the company's global operations.

Key Activities
• Implemented a learning model based on state-of-the-art research papers.
• Customized the model based on specific properties of the data available.
• Deployed the model to the cloud to be used by the MLOps team.
• Improved the model iteratively using feedback from the business unit.
• Presented the results to several non-technical stakeholders in the business unit.

ML-based Adaptive Deep Brain Stimulation System for Essential Tremor Patients

http://www.frontiersin.org/articles/10.3389/fnhum.2020.541625/full
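The first machine learning-based adaptive deep brain stimulation system implemented in freely moving patients with essential tremor. This project was carried out in collaboration with colleagues at the University of Washington.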

Key Activities
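• Established the collaboration between our research lab at the University of Freiburg (Brain State Decoding Lab) and the University of Washington (BioRobotics Lab).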
• Conceived and developed the underlying machine learning, control, and digital signal processing methods.
• Deployed the algorithms on the host PC and on the embedded system of the patient's neurostimulator.
• Executed the data collection experiments.
• Performed offline analysis of the collected data.
• Wrote and edited the final manuscript for a peer-reviewed publication.

Data Analysis of Injury Data in a Soccer Team

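Descriptive and prescriptive analysis of injury data from a Bundesliga soccer team. After I analyzed the data and presented the results to all the stakeholders, including non-technical personnel, the project became part of the team's research department, which studies injury prevention and talent development.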

Key Activities
• Prepared and cleaned the data from several databases.
• Conducted descriptive and prescriptive analyses of the data.
• Presented the results to all stakeholders, including non-technical personnel.

Deployment of a Data Management System

Handover of a data management system used for credit risk data warehousing and analytics at a large European bank.

Key Activities
• Configured and deployed UAT and production environments.
• Implemented a CI/CD pipeline for the back and front end.

Decoding Parkinson's Disease Symptoms from Brain Signals

http://www.sciencedirect.com/science/article/pii/S2213158220302138
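A novel supervised machine learning method for decoding the intensity of Parkinson's disease symptoms from brain signals. We collected data from seven patients undergoing deep brain stimulation therapy and showed that our novel ML method improves the decoding performance of the symptoms.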

Key Activities
• Designed and executed the data collection experiments.
• Preprocessed data and performed exploratory data analysis.
• Conceived and implemented the novel ML method.
• Validated the novel ML method with the collected data and benchmarked it against state-of-the-art models, including deep convolutional neural networks.
• Applied AutoML for hyperparameter optimization of all considered models.
• Wrote and edited the final manuscript published in a peer-reviewed journal.

Data Augmentation Framework for ML in Neuroscience

http://www.frontiersin.org/articles/10.3389/fninf.2019.00055/full
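The novel framework developed in this project allows objective evaluation and benchmarking of new ML algorithms for the analysis of neurological data.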

We tackled the following challenges:
• Lack of available data when using data-driven methods to analyze brain signals.
• Unreliability of the available labels.
• High level of noise in the raw signals.

Key Activities
• Conceived the idea.
• Executed the data analysis.
• Wrote the scientific manuscript published in a peer-reviewed journal.

Education

2014 - 2020

PhD in Computer Science

University of Freiburg - Freiburg, Germany

2012 - 2014

Master's Degree in Engineering

National University of Colombia - Manizales, Colombia

2007 - 2012

Engineer's Degree in Electronics Engineering

National University of Colombia - Manizales, Colombia

Libraries/APIs

Scikit-learn, Matplotlib, Pandas, NumPy, PyTorch, React, TensorFlow

Tools

Spyder, MATLAB, Git, Seaborn, Jupyter, AutoML, PyCharm, Excel 2010

Languages

Python, SQL, R, Java, C#

Paradigms

Data Science, User Acceptance Testing (UAT)

Platforms

Windows, Linux, Jupyter Notebook, Docker, JBoss EAP, Amazon Web Services (AWS), Visual Studio Code (VS Code)

Storage

Database Management Systems (DBMS)

Other

Digital Signal Processing, Programming, Time Series Analysis, Linear Algebra, Machine Learning, Neuroscience, Deep Learning, Research, Technical Writing, Statistics, Statistical Methods, Data Analytics, Data Analysis, Data Engineering, Neural Networks, Statistical Modeling, Predictive Analytics, Writing & Editing, Statistical Analysis, Artificial Intelligence (AI), Consulting, Classification Algorithms, Generative Pre-trained Transformers (GPT), Probability Theory, Reinforcement Learning, Bayesian Statistics, Algorithms, Electronics, Natural Language Processing (NLP), OpenAI, Text Processing, GPT, Circuit Design, Control Theory, Calculus, Machine Learning Operations (MLOps), Data Warehousing, CI/CD Pipelines, Generative Systems, Large Language Models (LLMs), Business Planning, IT Project Management, Natural Language Understanding (NLU), Customer Research
