Data and Workflow Management for Bioinformatics | BINF.X410
This course explains where large data sets come from and how they are stored and managed. It also examines data sizes, accessibility approaches, and how data are transformed and used for AI consumption. You will examine the challenges and considerations when choosing data for training sets.
By the end of the course, you will understand the types of data used in bioinformatics; how the data are collected, stored, managed, and searched; and how the data are transformed for further processing and analysis. You will also learn to aggregate and normalize data for use in machine learning and/or AI training sets.
Learning Outcomes
At the conclusion of the course, you should be able to:
- Identify the different types of data used in bioinformatics, their sources, and how they are collected, stored, searched, and managed.
- Explain how bioinformatics data are processed, transformed, and prepared for further analysis, including machine learning and AI applications.
- Demonstrate skills to aggregate, clean, and normalize bioinformatics data to ensure quality and consistency for AI training sets.
- Analyze the sizes, formats, and accessibility of bioinformatics datasets and understand key storage and management considerations.
- Evaluate the challenges and key considerations in selecting bioinformatics data for AI model training, including data quality, bias, and ethical implications.
Topics Include
- Pipeline Design
- Workflow management systems and workflow analysis with open-source tools
- Documentation skills / proof of concept with foresight
- Using SQL for bioinformatics data
- Data lakes (e.g., Databricks, Redshift, and/or Snowflake)
- Large data sets
- Databases: how to store and move data, and how to decide which AI models to use
ENROLL EARLY!
- Save Your Seat
  Help us confirm course scheduling. Enroll at least seven days before your course starts.
- Accessing Canvas
  Learn more about gaining access to your course on Canvas in our FAQ section.
- Accessibility and Accommodation
  For accessibility questions or to request an accommodation, please visit Access for Students with Disabilities or email the Extension registrar.
- Finance Your Education
  Here are ways to pay for your education.
Sections Open for Enrollment:
Schedule
Date | Start Time | End Time | Meeting Type | Location |
---|---|---|---|---|
Mon, 01-12-2026 | 6:30 p.m. | 9:30 p.m. | Flexible | SANTA CLARA / REMOTE |
Mon, 01-26-2026 | 6:30 p.m. | 9:30 p.m. | Flexible | SANTA CLARA / REMOTE |
Mon, 02-02-2026 | 6:30 p.m. | 9:30 p.m. | Flexible | SANTA CLARA / REMOTE |
Mon, 02-09-2026 | 6:30 p.m. | 9:30 p.m. | Flexible | SANTA CLARA / REMOTE |
Mon, 02-23-2026 | 6:30 p.m. | 9:30 p.m. | Flexible | SANTA CLARA / REMOTE |
Mon, 03-02-2026 | 6:30 p.m. | 9:30 p.m. | Flexible | SANTA CLARA / REMOTE |
Mon, 03-09-2026 | 6:30 p.m. | 9:30 p.m. | Flexible | SANTA CLARA / REMOTE |
Mon, 03-16-2026 | 6:30 p.m. | 9:30 p.m. | Flexible | SANTA CLARA / REMOTE |
Mon, 03-23-2026 | 6:30 p.m. | 9:30 p.m. | Flexible | SANTA CLARA / REMOTE |
Mon, 03-30-2026 | 6:30 p.m. | 9:30 p.m. | Flexible | SANTA CLARA / REMOTE |