Saturday, February 8, 2014
Mining EST data in the genus Citrus for the development of SSR molecular markers
ABSTRACT
LUU TRAN CONG HUY, NONG LAM UNIVERSITY. DATA MINING
FOR DEVELOPING SIMPLE SEQUENCE REPEAT (SSR) MARKERS FROM
EXPRESSED SEQUENCE TAGS (ESTs) IN CITRUS
Supervisor:
The research was carried out at the Department of Biotechnology at Nong
Lam University.
Recent advances in genomic technologies have generated a vast amount of
publicly available expressed sequence tags (ESTs) for Citrus. These data can be
mined to identify simple sequence repeats (SSRs), also called microsatellites.
SSRs are useful for a broad range of applications, such as genome mapping and
characterization, phenotype mapping, marker-assisted selection in plant breeding,
and map-based cloning of important genes. Moreover, developing SSR markers
from ESTs is inexpensive compared with traditional methods.
Methodology:
1) We used a Perl script to retrieve EST sequences from the NCBI database.
2) We identified and extracted the SSRs contained in the EST database.
3) We studied the relational database model, used it to store the Citrus
nucleotide and SSR sequence data, and built a database containing them.
4) We searched for SSRs that are homologous to the tristeza virus resistance gene.
5) We designed a website with database management software to share the
information with users.
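Step 2 can be illustrated with a minimal sketch. It is written in Python rather than the Perl used in the thesis, and it is not the author's ssrfinder script. The minimum repeat counts in `MIN_REPEATS`, the function names, and the example sequence are assumptions chosen only for illustration; the abstract does not state the thresholds that were actually applied.

```python
import re

# Minimum number of motif copies per motif length. These thresholds are an
# assumption for illustration; the abstract does not state the values used.
MIN_REPEATS = {2: 6, 3: 4, 4: 3, 5: 3, 6: 3}


def is_periodic(motif):
    """True if the motif is itself a repeat of a shorter unit (e.g. ATAT, AAA)."""
    return motif in (motif + motif)[1:-1]


def find_ssrs(sequence, min_repeats=MIN_REPEATS):
    """Return sorted (start, end, motif, copies) tuples, 0-based, end-exclusive."""
    seq = sequence.upper()
    hits = []
    for motif_len, min_copies in min_repeats.items():
        # One captured motif followed by at least (min_copies - 1) more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_copies - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if is_periodic(motif):
                continue  # skip motifs that are repeats of a shorter unit
            copies = (match.end() - match.start()) // motif_len
            hits.append((match.start(), match.end(), motif, copies))
    return sorted(hits)


# Hypothetical sequence: an (AT)7 and an (AAG)5 repeat with short flanks.
example = "GGCC" + "AT" * 7 + "CCGT" + "AAG" * 5 + "TT"
for start, end, motif, copies in find_ssrs(example):
    print(start, end, motif, copies)
```

The coordinates returned for each repeat are what a primer design step would need in order to cut out the flanking regions on either side of the SSR.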
Results:
28,241 SSR-containing ESTs (EST-SSRs) were identified by analyzing
191,110 EST sequences belonging to Citrus in the dbEST division of GenBank.
19,755 primers, filtered by repetition checking and BLAST checking,
were designed in the flanking regions of the SSRs. These data were stored in a
relational database, and an SSR finder tool was integrated into the BUILDING SSRs
DATABASE of Citrus website. After cleaning and masking repeat, vector, and
organelle sequences, the EST-SSR sequences and the related EST sequences without
SSRs were assembled into contigs and singletons to reduce redundancy, to extend
the EST-SSRs for primer design, and to develop consensus sequences. As a result,
1,071 more primers were designed for these extended EST-SSRs. Using a stringent
BLAST search with an e-value threshold of 10^-10 against a database of typical
pathogen resistance genes in Citrus, we identified 33 EST-SSRs that are
homologous to the tristeza virus resistance gene.
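The e-value filtering described in the results can be sketched as follows, in Python rather than the Perl used in the thesis. The sketch assumes BLAST was run with tabular output (`-outfmt 6`, whose standard twelve columns place the e-value in column 11). The sequence IDs and scores in `SAMPLE` are invented, and treating the threshold as "less than or equal to" is also an assumption.

```python
import csv
import io

E_VALUE_CUTOFF = 1e-10  # threshold quoted in the abstract


def significant_hits(tabular_text, cutoff=E_VALUE_CUTOFF):
    """Map each query ID to its best (subject, e-value) at or below the cutoff."""
    best = {}
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        # Standard -outfmt 6 columns: qseqid, sseqid, ..., evalue (11th), bitscore.
        query, subject, evalue = row[0], row[1], float(row[10])
        if evalue <= cutoff and (query not in best or evalue < best[query][1]):
            best[query] = (subject, evalue)
    return best


# Hypothetical BLAST output: IDs and numbers are made up for illustration.
SAMPLE = "\n".join([
    "est_001\tR_gene_A\t92.5\t200\t15\t0\t1\t200\t50\t249\t2e-35\t150",
    "est_001\tR_gene_B\t80.0\t120\t24\t0\t5\t124\t10\t129\t1e-12\t75.1",
    "est_002\tR_gene_A\t70.0\t60\t18\t0\t1\t60\t1\t60\t0.003\t30.2",
])

for query, (subject, evalue) in significant_hits(SAMPLE).items():
    print(query, subject, evalue)
```

Keeping only the best subject per query is a design choice of this sketch; retaining every hit under the cutoff would be equally reasonable if all homologous resistance genes are of interest.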
Table of Contents
ABSTRACT
Chapter 1
1.1 Problem statement
1.2 Objectives of the thesis
Chapter 2
2.1.3
2.2 EST
2.3.5
2.7.1 NCBI
Chapter 3
3.1.2.1 Perl program ssrfinder_1
BLAST
3.1.2.4 Egassembler
3.1.3 Apache web server
Chapter 4
4.1
EGassembler
4.2.2
4.3 Assembling
4.4.1 BLASTn
4.5
4.6 tBLASTx
SRs (SSRs PAGE)
Chapter 5
LIST OF ABBREVIATIONS
BLAST Basic Local Alignment Search Tool
CGI Common Gateway Interface
CSDL Cơ sở dữ liệu (database)
DBD Database Driver
DBI Database Interface
DNA Deoxyribonucleic Acid
EST Expressed Sequence Tag
HTML Hypertext Markup Language
HTTP Hypertext Transfer Protocol
NCBI National Center for Biotechnology Information
NIG National Institute of Genetics
NIH National Institutes of Health
NLM National Library of Medicine
Perl Practical Extraction and Report Language
PHP PHP: Hypertext Preprocessor
RDBMS Relational Database Management System
SNP Single Nucleotide Polymorphism
SSCP Single-Strand Conformation Polymorphism
SSR Simple Sequence Repeat
STS Sequence Tagged Site
LIST OF FIGURES
www.NCBI.nlm.nih.gov/genomes/plant/plantlist.html#est)
Egassembler
-india.org/ssr/ssr.htm)
Chapter 1
INTRODUCTION
1.1 Problem statement
. SSR
T
marker
.
Microsatellite.
1.2 Objectives of the thesis
MINING EST (EXPRESSED SEQUENCE TAG) DATA IN THE GENUS CITRUS
FOR THE DEVELOPMENT OF SSR (SIMPLE SEQUENCE REPEAT)
MOLECULAR MARKERS
u:
1.
2.
3.
4. K-
Egassembler)
5.
6.
7.