Participants will advance their existing machine learning skills and be able to:
- Understand industry applications of the end-to-end ML lifecycle
- Master the foundations of ML through analytical methods and statistical deep dives
- Preprocess data to fit the needs of modern ML algorithms
- Understand the entire ML lifecycle and its applications
- Build robust models using supervised and unsupervised algorithms
- Know which algorithm to select for various real-world scenarios
- Approach deep learning with applied knowledge of neural networks
- Leverage deep learning methods using modern tools like PyTorch
- Understand the compute requirements of deep learning
- Apply foundational large language models (LLMs) to current use cases
The course covers the entire machine learning lifecycle and the toolkit needed to create robust machine learning pipelines:
- Data preprocessing techniques
- Statistical foundations of ML
- Exploratory Data Analysis
- Python best practices
- Modern ML libraries
- Model selection and hyperparameter tuning
- Supervised ML algorithms
- Unsupervised techniques
- ML pipelines
- Data bias and how to avoid it in modelling
- Dimensionality reduction toolkit
- Deep learning with neural networks
- Applying deep learning with PyTorch
- The bias-variance trade-off
- Natural Language Processing (NLP)
- NLP and deep learning applied to generative AI (LLMs)
- Deep learning architectures such as FFNNs, RNNs, and LSTMs
- The modern ML lifecycle
- MLOps and ML deployment to production
This program is designed for experienced professionals with a background in engineering, science, or related fields such as aerospace, chemistry, biology, electronics, finance, communications, or technology. It is ideal for those who want to integrate data science and machine learning into their work. Learners are expected to have a solid understanding of calculus, linear algebra, probability, statistics, and basic programming skills, including Python. The course provides a balanced combination of theoretical concepts and practical applications.
What business outcomes will my teams deliver?
A deploy-ready, end-to-end ML artifact per cohort (code + brief report) tied to your use cases—e.g., predictive maintenance, quality, demand, or risk. Teams leave with reusable notebooks, a reference pipeline, and a short adoption plan for the next 90 days.
How is the curriculum tailored to our stack and use cases?
We co-scope your goals (NDA available), then tune modules (Python refresh, ML, PyTorch DL, LLMs/GenAI, MLOps/governance) and examples to your domain and cloud/tooling (AWS, Azure, GCP; Git/CI/CD; feature stores).
What’s the skills baseline? Can you level-set?
Ideal learners use Python and basic stats. We can add prework (asynchronous refreshers or a ½-day primer) and a short diagnostic to ensure a consistent starting point.
How do you verify learning and performance?
Participation + in-lab checkpoints + a capstone reviewed against rubric criteria (data prep, model quality, documentation, reproducibility). We can issue completion reports to managers and optional digital badges.
Can we measure ROI and track adoption?
We provide a simple impact framework (use-case funnel, quick-win metrics, “from notebook to pilot” milestones) and can align to your PMO/OKRs. Optional post-program coaching accelerates pilot delivery.
What are delivery formats and schedules?
Private cohorts on campus, your locations, live online, or hybrid. Standard is 5 sessions (e.g., five consecutive days or spaced Fridays). We can run multiple time zones and staggered schedules for shifts/global teams.
How large is a cohort? Can we scale?
15–25 is optimal for hands-on support. We can run parallel cohorts, regional waves, or a train-the-trainer to scale across business units.
How do you handle data security and compliance?
We support sanitized sample data by default. With NDA, we can use your secure environment (e.g., VPC/JupyterHub/Azure ML/SageMaker/Vertex AI) with no external data egress. We follow your IT policies for access, recording, and retention.
What software/hardware do participants need?
Live-online: laptop with modern browser, webcam/headset. Labs run in your environment or ours (managed notebooks). No local GPU required unless requested.
Which libraries and tools are used?
Python, NumPy/Pandas/Scikit-learn, PyTorch for DL, plus LLM/GenAI workflows and MLOps concepts (packaging, evaluation, basic CI/CD). We align versions with your standards.
Can teams use proprietary datasets?
Yes—if permitted under your policies and delivered within your secure environment. Otherwise, we provide domain-relevant datasets that mirror your use cases.
What if people miss a session?
Sessions are highly interactive, so live attendance is strongly encouraged. We can schedule make-up labs, office hours, or a condensed catch-up block. Recording policies follow your security rules.
What credentials are issued?
Caltech CTME Certificate for each participant; CEUs are available. We can provide completion data for HR/LMS systems.
What about accessibility and global participation?
We support accommodations and can design schedules for cross-region teams. Materials are accessible and provided ahead of sessions where needed.
How do procurement and billing work?
We support PO/invoicing and group pricing. We can work under an existing MSA or set up a new agreement with your vendor team.
Can Caltech faculty or JPL researchers teach or appear as guest speakers?
Yes—subject to scope, availability, and institutional policies. We can request campus faculty or JPL researchers for lectures, panels, or keynotes (virtual or on-site); additional approvals, scheduling lead time, and fees may apply. Note: JPL participation follows NASA/JPL guidelines.
How do we bring this to our company?
Email execed@caltech.edu. We’ll set up a brief scoping call, outline a tailored plan (scope, schedule, security, pricing), and deliver a proposal within a few business days.