Welcome to AI for Genomic Science
Author: Joon-Yong An, Korea University
Last Update: 2025/11/2 (Under construction; probably updated weekly until 2026, if I am not too lazy.)

This textbook introduces how artificial intelligence is revolutionizing biological research — from analyzing genetic variants to modeling entire cells. It is designed specifically for undergraduate biology students (senior level) who want to understand the computational approaches that are transforming genomics, without requiring prior experience in AI or advanced programming.
The field of genomics has been rapidly transformed by machine learning, deep learning, and large-scale computational methods. These advances now allow us to analyze massive genomic datasets, predict functional impacts of variants, and model complex biological systems with unprecedented accuracy. This textbook takes a biology-first approach to cover the essential AI concepts and methods that undergraduate students need to master, integrating computational techniques with genomic applications.
This book is written for biology majors who have a solid foundation in molecular biology and genetics, are curious about computational approaches, want to understand the why and how behind AI methods in genomics, and are comfortable with basic mathematics. Rather than assuming an extensive programming background, we start from the basics and build up gradually, emphasizing conceptual understanding alongside practical applications.
Please note that chapters are currently being written and improved. The complete version is expected to be finished by 2026!
Each chapter starts with a real biological challenge—experimental limitations that motivate computational solutions. You’ll never wonder “why do I need to learn this?”
Every chapter includes Google Colab-based coding exercises. No installation needed—just click and start learning! All code is heavily commented and designed for beginners.
Math concepts are explained in Math Boxes with biological examples. We won’t shy away from equations, but we’ll make sure you understand what they mean.
Learn from actual research papers and real datasets. See how these methods are being used to make biological discoveries right now.
By the end of this book, you will be able to:
✅ Understand the fundamental concepts of machine learning and deep learning
✅ Explain how AI methods predict the effects of genetic variants
✅ Use pre-trained models to analyze genomic sequences
✅ Interpret results from tools like CADD, DeepSEA, Enformer, and DNABERT
✅ Understand how language models are applied to DNA and RNA sequences
✅ Analyze single-cell omics data using foundation models
✅ Critically evaluate AI-based studies in genomics literature
✅ Write basic Python code for bioinformatics analyses
This textbook is organized into five parts:
Part 1: Foundations of AI. Learn the essential AI concepts every biologist should know, from neural networks to different architectures.
Part 2: Genetic Variants and Early AI. Understand how traditional and machine learning methods help us prioritize and interpret genetic variants.
Part 3: Deep Learning for Genomic Sequences. Explore how convolutional neural networks and transformers predict regulatory elements and variant effects from DNA sequences.
Part 4: Language Models Meet DNA. Discover how natural language processing techniques are revolutionizing genomics through DNA language models and foundation models.
Part 5: Single-Cell Omics and Foundation Models. See how AI is helping us understand individual cells and move toward whole-cell computational models.
All coding exercises use Google Colab, which runs in your web browser. You’ll need:
No software installation required. We’ll walk you through everything in Chapter 1.
Part 1: Foundations of AI
Part 2: Genetic Variants and Early AI
Part 3: Deep Learning for Genomic Sequences
Part 4: Language Models Meet DNA
Part 5: Single-Cell Omics and Foundation Models
Happy Learning! 🧬🤖
License information to be added