William Steimel

Full Stack Data Scientist | Microsoft Certified: Azure Data Scientist Associate | Signate Master | JLPT N2

Tokyo, Japan

About

As a Data Science technical lead, I have a passion for developing end-to-end AI solutions. My journey has been shaped by hands-on experience across diverse industries, from financial services to pharma, where I’ve harnessed data-driven strategies, embraced agile methodologies, and provided mentorship to drive meaningful change. Whether it’s wrangling data, designing models, or deploying solutions, I thrive on the entire stack. My track record? Delivering immediate bottom-line impact through production-grade AI systems aligned to business strategy.

Work Experience

GSK
Technical Lead

2022 - 2025

Senior Data Scientist → Principal Data Scientist (Global Associate Director)

Led the technical development and oversight of Commercial Pharma AI systems in Japan and globally, directing a team of 8 engineers and data scientists.

Architected and launched a next-gen ML and data engineering platform on Databricks, optimizing automation and ML model deployment. Reduced pipeline runtimes by 80%, enabling daily automation and saving ~$100K annually in compute and labor costs.

Engineered an agentic generative AI web app using React, Next.js, TypeScript, Python, FastAPI, and AI content generation APIs, leading end-to-end development and enhancing global content generation capabilities.

Deployed and scaled an MVP Omnichannel Decisioning AI engine, expanding usage from 0 to ~2000 users across multiple brands and markets. Delivered multiple ML system deployments, driving significant impact across priority brands while reducing technical resource costs and saving £1.5M in outsourcing.

Served as the key point of contact for data requests, interfacing with business stakeholders in Japanese, prioritizing tasks, and facilitating agile processes as Scrum Master to ensure timely and high-quality delivery.

Reduced vendor lock-in by upskilling 4 internal data scientists and engineers for smoother, higher-quality delivery.

Fostered a knowledge-sharing culture by maintaining comprehensive documentation and best practices, equipping new members with essential Japan Pharma and technical expertise.

Led the development of multiple successful PoCs, generating interest in generative AI and securing significant funding.

Data Scientist

Cut manual development work from 15-25 hours per document layout to a single UI click by building scalable Azure Machine Learning pipelines for self-service training of layout-based NER models that extract key-value pairs from financial documents.

Developed reusable ingestion scripts for over 20 Japanese open datasets using the Scrapy web scraping and crawling framework, feeding KPMG’s global Signals Repository platform.

Maintained code quality and documentation of pipelines and software libraries for easier onboarding of new members.

Designed and developed a configuration-based Excel extraction and spreadsheet analysis library using the open-source libraries pandas, openpyxl, and xlrd to ingest real estate property management reports and related documents.

Designed, planned, and implemented six generic, reusable Python libraries used across KPMG’s document ingestion platforms.

Implemented generic feature extraction functions for polygon shape data using vector and raster data. These functions were contributed to the geoscaler Python library for scaling geospatial data between geographical levels.

Developed a scenario-based simulation system with Streamlit and Docker to forecast profit and loss and cash flow of buildings in a REIT portfolio, applying accounting and real estate domain knowledge.

Lombard Inc.

2018 - 2019

Machine Learning Specialist

Analyzed data from an internal affiliate marketing application to gain insight into partner behavior and develop a recommender system.

ADP

2015 - 2017

Intern → Infrastructure & Operations Analyst → IT Service Manager

Managed the overall quality of IT service areas, including Data Center Site Management, IT Hosting Operations, Critical Incident Response Team, Enterprise Change Control, and Mainframe.

Implemented reporting automation and statistical analysis to support process optimization, capacity and demand forecasting, service level management, and audit compliance.

Education

Sophia University

2017 - 2019
Master of Science in Green Science and Engineering, Geospatial Data Analysis/Machine Learning: MEXT Scholarship

United Nations University

2017 - 2019
UNU-IAS Joint Diploma in Sustainability Science

Felician University

2010 - 2015
Bachelor of Science in Business Administration, Minor: Global Peace and Justice Studies

Kansai Gaidai University

2012 - 2013
Asian Studies Program, Study Abroad for two semesters

Skills

Python
Applied Machine Learning
MLOps
Omni-Channel Marketing
Scrum/Agile
Data Visualization
Pandas
NumPy
PySpark
NLP
Agentic AI
Databricks
Azure
JavaScript
TypeScript
React/Next.js
Node.js
Git