Statistical Bioinformatics: For Biomedical And Life Science Researchers
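Product Information
Series: Methods of Biochemical Analysis
ISBN-13: 9780471692720
Publisher: John Wiley & Sons Inc
Author: Lee
Publication date: 2010/01/26
Binding/Pages: Paperback / 368 pages
Dimensions: 22.9 cm × 15.2 cm × 1.9 cm (height/width/thickness)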
List price: NT$ 5470
Discounted price: NT$ 4923 (10% off)
To order this book, please call customer service at 02-25006600 (ext. 130 or 131).
Product Description
This book provides an essential understanding of statistical concepts necessary for the analysis of genomic and proteomic data using computational techniques. The author presents both basic and advanced topics, focusing on those that are relevant to the computational analysis of large data sets in biology. Chapters begin with a description of a statistical concept and a current example from biomedical research, followed by more detailed presentation, discussion of limitations, and problems. The book starts with an introduction to probability and statistics for genome-wide data, and moves into topics such as clustering, classification, multi-dimensional visualization, experimental design, statistical resampling, and statistical network analysis.
* Clearly explains the use of bioinformatics tools in life sciences research without requiring an advanced background in math/statistics
* Enables biomedical and life sciences researchers to successfully evaluate the validity of their results and make inferences
* Enables statistical and quantitative researchers to rapidly learn novel statistical concepts and techniques appropriate for large biological data analysis
* Carefully revisits frequently used statistical approaches and highlights their limitations in large biological data analysis
* Offers programming examples and datasets
* Includes chapter problem sets, a glossary, a list of statistical notations, and appendices with references to background mathematical and technical material
* Features supplementary materials, including datasets, links, and a statistical package available online
Statistical Bioinformatics is an ideal textbook for students in medicine, life sciences, and bioengineering, and for researchers who use computational tools to analyze genomic, proteomic, and many other emerging high-throughput molecular data. It may also serve as a rapid introduction to bioinformatics for statistical and computational students and for other readers who have not encountered such analysis tasks before.
* Written for biologists rather than statisticians or computer scientists.
About the Author
Jae K. Lee, Ph.D., is a professor of biostatistics and epidemiology in the Department of Health Evaluation Sciences at the University of Virginia School of Medicine, where he designed and teaches a course on Statistical Bioinformatics in Medicine. He earned his doctorate in statistical genetics from the University of Wisconsin, Madison. He was previously a research scientist in the Laboratory of Molecular Pharmacology, National Cancer Institute. Among his current research interests is the integration of statistical and genomic information for the analysis of microarray data.
Table of Contents
PREFACE xi
CONTRIBUTORS xiii
1 ROAD TO STATISTICAL BIOINFORMATICS 1
Challenge 1: Multiple-Comparisons Issue 1
Challenge 2: High-Dimensional Biological Data 2
Challenge 3: Small-n and Large-p Problem 3
Challenge 4: Noisy High-Throughput Biological Data 3
Challenge 5: Integration of Multiple, Heterogeneous Biological Data Information 3
References 5
2 PROBABILITY CONCEPTS AND DISTRIBUTIONS FOR ANALYZING LARGE BIOLOGICAL DATA 7
2.1 Introduction 7
2.2 Basic Concepts 8
2.3 Conditional Probability and Independence 10
2.4 Random Variables 13
2.5 Expected Value and Variance 15
2.6 Distributions of Random Variables 19
2.7 Joint and Marginal Distribution 39
2.8 Multivariate Distribution 42
2.9 Sampling Distribution 46
2.10 Summary 54
3 QUALITY CONTROL OF HIGH-THROUGHPUT BIOLOGICAL DATA 57
3.1 Sources of Error in High-Throughput Biological Experiments 57
3.2 Statistical Techniques for Quality Control 59
3.3 Issues Specific to Microarray Gene Expression Experiments 66
3.4 Conclusion 69
References 69
4 STATISTICAL TESTING AND SIGNIFICANCE FOR LARGE BIOLOGICAL DATA ANALYSIS 71
4.1 Introduction 71
4.2 Statistical Testing 72
4.3 Error Controlling 78
4.4 Real Data Analysis 81
4.5 Concluding Remarks 87
Acknowledgments 87
References 88
5 CLUSTERING: UNSUPERVISED LEARNING IN LARGE BIOLOGICAL DATA 89
5.1 Measures of Similarity 90
5.2 Clustering 99
5.3 Assessment of Cluster Quality 115
5.4 Conclusion 123
References 123
6 CLASSIFICATION: SUPERVISED LEARNING WITH HIGH-DIMENSIONAL BIOLOGICAL DATA 129
6.1 Introduction 129
6.2 Classification and Prediction Methods 132
6.3 Feature Selection and Ranking 140
6.4 Cross-Validation 144
6.5 Enhancement of Class Prediction by Ensemble Voting Methods 145
6.6 Comparison of Classification Methods Using High-Dimensional Data 147
6.7 Software Examples for Classification Methods 150
References 154
7 MULTIDIMENSIONAL ANALYSIS AND VISUALIZATION ON LARGE BIOMEDICAL DATA 157
7.1 Introduction 157
7.2 Classical Multidimensional Visualization Techniques 158
7.3 Two-Dimensional Projections 161
7.4 Issues and Challenges 165
7.5 Systematic Exploration of Low-Dimensional Projections 166
7.6 One-Dimensional Histogram Ordering 170
7.7 Two-Dimensional Scatterplot Ordering 174
7.8 Conclusion 181
References 182
8 STATISTICAL MODELS, INFERENCE, AND ALGORITHMS FOR LARGE BIOLOGICAL DATA ANALYSIS 185
8.1 Introduction 185
8.2 Statistical/Probabilistic Models 187
8.3 Estimation Methods 189
8.4 Numerical Algorithms 191
8.5 Examples 192
8.6 Conclusion 198
References 199
9 EXPERIMENTAL DESIGNS ON HIGH-THROUGHPUT BIOLOGICAL EXPERIMENTS 201
9.1 Randomization 201
9.2 Replication 202
9.3 Pooling 209
9.4 Blocking 210
9.5 Design for Classifications 214
9.6 Design for Time Course Experiments 215
9.7 Design for eQTL Studies 215
References 216
10 STATISTICAL RESAMPLING TECHNIQUES FOR LARGE BIOLOGICAL DATA ANALYSIS 219
10.1 Introduction 219
10.2 Resampling Methods for Prediction Error Assessment and Model Selection 221
10.3 Feature Selection 225
10.4 Resampling-Based Classification Algorithms 226
10.5 Practical Example: Lymphoma 226
10.6 Resampling Methods 227
10.7 Bootstrap Methods 232
10.8 Sample Size Issues 233
10.9 Loss Functions 235
10.10 Bootstrap Resampling for Quantifying Uncertainty 236
10.11 Markov Chain Monte Carlo Methods 238
10.12 Conclusions 240
References 247
11 STATISTICAL NETWORK ANALYSIS FOR BIOLOGICAL SYSTEMS AND PATHWAYS 249
11.1 Introduction 249
11.2 Boolean Network Modeling 250
11.3 Bayesian Belief Network 259
11.4 Modeling of Metabolic Networks 273
References 279
12 TRENDS AND STATISTICAL CHALLENGES IN GENOMEWIDE ASSOCIATION STUDIES 283
12.1 Introduction 283
12.2 Alleles, Linkage Disequilibrium, and Haplotype 283
12.3 International HapMap Project 285
12.4 Genotyping Platforms 286
12.5 Overview of Current GWAS Results 287
12.6 Statistical Issues in GWAS 290
12.7 Haplotype Analysis 296
12.8 Homozygosity and Admixture Mapping 298
12.9 Gene-Gene and Gene-Environment Interactions 298
12.10 Gene and Pathway-Based Analysis 299
12.11 Disease Risk Estimates 301
12.12 Meta-Analysis 301
12.13 Rare Variants and Sequence-Based Analysis 302
12.14 Conclusions 302
Acknowledgments 303
References 303
13 R AND BIOCONDUCTOR PACKAGES IN BIOINFORMATICS: TOWARDS SYSTEMS BIOLOGY 309
13.1 Introduction 309
13.2 Brief overview of the Bioconductor Project 310
13.3 Experimental Data 311
13.4 Annotation 318
13.5 Models of Biological Systems 328
13.6 Conclusion 335
13.7 Acknowledgments 336
References 336
INDEX 339