Welcome to AI for Genomic Science
Author: Joon-Yong An, Korea University
Last Update: 2025/11/2 (Under construction; probably updated weekly until 2026, if I am not too lazy.)

This textbook introduces how artificial intelligence is revolutionizing biological research — from analyzing genetic variants to modeling entire cells. It is designed specifically for 3rd-year (junior/senior) biology undergraduates who want to understand the computational approaches that are transforming genomics, without requiring prior experience in AI or advanced programming.
The field of genomics has been rapidly transformed by machine learning, deep learning, and large-scale computational methods. These advances now allow us to analyze massive genomic datasets, predict functional impacts of variants, and model complex biological systems with unprecedented accuracy. This textbook takes a biology-first approach to cover the essential AI concepts and methods that undergraduate students need to master, integrating computational techniques with genomic applications.
This book is written for biology majors who have a solid foundation in molecular biology and genetics, are curious about computational approaches, want to understand the why and how behind AI methods in genomics, and are comfortable with basic mathematics. Rather than assuming extensive programming background, we start from the basics and build up gradually, emphasizing conceptual understanding alongside practical applications.
All 17 chapters are now available, organized into 5 parts. Each chapter includes an interactive companion page with hands-on simulations and visualizations to reinforce key concepts.
Each chapter starts with a real biological challenge: an experimental limitation that motivates a computational solution. You’ll never wonder “why do I need to learn this?”
Every chapter includes Google Colab-based coding exercises. No installation needed—just click and start learning! All code is heavily commented and designed for beginners.
Math concepts are explained in Math Boxes with biological examples. We won’t shy away from equations, but we’ll make sure you understand what they mean.
Learn from actual research papers and real datasets. See how these methods are being used to make biological discoveries right now.
By the end of this book, you will be able to:
✅ Understand the fundamental concepts of machine learning and deep learning
✅ Explain how AI methods predict the effects of genetic variants
✅ Use pre-trained models to analyze genomic sequences
✅ Interpret results from tools like CADD, DeepSEA, Enformer, and DNABERT
✅ Understand how language models are applied to DNA and RNA sequences
✅ Analyze single-cell omics data using foundation models
✅ Critically evaluate AI-based studies in genomics literature
✅ Write basic Python code for bioinformatics analyses
This textbook is organized into five parts:
Part 1: Foundations of AI for Biology. Learn the essential AI concepts every biologist should know, from Bayesian intuition to neural network architectures.
Part 2: Genomics Foundations and Traditional Methods. Understand how traditional and machine learning methods help us characterize genetic variation and interpret variant effects.
Part 3: Deep Learning for Genomics. Explore how CNNs and transformers predict regulatory elements and variant effects directly from DNA sequences.
Part 4: Language Models and Foundation Models. Discover how NLP techniques power DNA language models and genomic foundation models, from BERT-style architectures to next-generation long-context models.
Part 5: Single-Cell Genomics and Whole-Cell Modeling. See how AI is resolving cell-type heterogeneity at single-cell resolution and driving progress toward whole-cell computational models.
All coding exercises use Google Colab, which runs in your web browser, so no software installation is required. We’ll walk you through everything in Chapter 1.
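To give a feel for the level of "basic Python code for bioinformatics analyses", here is a minimal sketch of the kind of snippet you could paste into a Colab cell. It is an illustration only and is not taken from the book's notebooks: the function names `gc_content` and `reverse_complement` and the example sequence are made up for this page.

```python
# Illustrative beginner-style snippet (hypothetical, not from the book's exercises).
# It uses only built-in Python, so it runs in any fresh Colab notebook.

def gc_content(seq: str) -> float:
    """Return the fraction of bases in a DNA sequence that are G or C."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence is empty")
    gc = seq.count("G") + seq.count("C")
    return gc / len(seq)


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A<->T, G<->C)."""
    complement = {"A": "T", "T": "A", "G": "C", "C": "G", "N": "N"}
    # Walk the sequence backwards and swap each base for its partner.
    return "".join(complement[base] for base in reversed(seq.upper()))


example = "ATGCGCTA"  # made-up sequence for demonstration
print("GC content:", gc_content(example))
print("Reverse complement:", reverse_complement(example))
```

Running the cell prints the GC fraction and the reverse-complemented strand of the example sequence. Try changing `example` to a sequence of your own to see how the outputs change.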
Part 1: Foundations of AI for Biology
Part 2: Genomics Foundations and Traditional Methods
Part 3: Deep Learning for Genomics
Part 4: Language Models and Foundation Models
Part 5: Single-Cell Genomics and Whole-Cell Modeling
Happy Learning! 🧬🤖
License information to be added