control.Rmd
Imagine you have a data set of primer sequences. You want to quickly find out information about this set of primers. Let’s start by loading the list.
primers <- read.csv("primers.csv", header = FALSE, stringsAsFactors = FALSE)
str(primers)
## 'data.frame':    560 obs. of  1 variable:
##  $ V1: chr  "GAACAAAGAAGTACAAAGGAGTAAATACAATTTTATTATCCGGATCCCCGGGTTAATTAA " "TACACCGAGTATACAACATGACTATACCATGGAAGGTTGGCGGATCCCCGGGTTAATTAA " "TGACCACCTGGCTTGCAGCTAATAGTGAAAAAACACAAATGAATTCGAGCTCGTTTAAAC " "AAGAATGGACGACTTCTTATCACGTATAGGAGTGATATACCGGATCCCCGGGTTAATTAA" ...
There are 560 primers in the list, so if you wanted to find the length of all of them, you probably wouldn’t want to count the length of each one by hand. Similarly, if you wanted to find the molecular weight of all of them, doing so by hand would be extremely time-consuming. Control statements can help us make these tasks quicker.
Each programming language has its own syntax for the different control statements. In R, you create a for-loop according to the following basic syntax:
for (object in list) {
  expression with object
}
When a for-loop is executed, R iterates through the given list and assigns each element of the list to the given variable, then executes the given expression for that variable. In the following example, the for-loop says iterate over 1:5 and use the print function to display each number.
for (a in 1:5) {
  print(a)
}
## [1] 1
## [1] 2
## [1] 3
## [1] 4
## [1] 5
If you run ls() after running the for-loop, you will see that R created a variable a with the value 5.
ls()
## [1] "a"       "primers"
a
## [1] 5
The value of a makes sense because the last element in the given list, 1:5, is 5.
You can use a for-loop to do something to every element in a list. Create a vector of years called yrs, then print a statement saying “The year is ____” for each one. You can check the length of a vector or list with the function length.
yrs <- c("2000", "2005", "2010")
for (i in 1:length(yrs)) {
  print(paste("The year is", yrs[i]))
}
## [1] "The year is 2000"
## [1] "The year is 2005"
## [1] "The year is 2010"
Traditionally, for-loops iterate over the sequence 1...N, like above. However, R allows for-loops to iterate over any type of list. Rather than iterate over 1:length(yrs), R can just iterate over yrs.
for (j in yrs) {
  print(paste("The year is", j))
}
## [1] "The year is 2000"
## [1] "The year is 2005"
## [1] "The year is 2010"
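One caveat about the index-based form: when a vector is empty, 1:length(x) evaluates to 1:0, which counts down from 1 to 0, so the loop body runs twice instead of not at all. The sketch below, which assumes a hypothetical empty vector named empty_yrs and two hypothetical counters, shows how seq_along avoids this:

```r
# 'empty_yrs' is a hypothetical empty vector used only for illustration
empty_yrs <- character(0)

# 1:length(empty_yrs) is 1:0, i.e. c(1, 0), so this loop runs twice
n_bad <- 0
for (i in 1:length(empty_yrs)) {
  n_bad <- n_bad + 1
}

# seq_along(empty_yrs) is integer(0), so this loop never runs
n_good <- 0
for (i in seq_along(empty_yrs)) {
  n_good <- n_good + 1
}

n_bad   # 2
n_good  # 0
```

For non-empty vectors the two forms behave the same, so this only matters when the input might have length zero.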
For-loops can also be nested. For example, you can iterate over every element in a matrix.
mat <- matrix(1:6, nrow = 3)
mat
##      [,1] [,2]
## [1,]    1    4
## [2,]    2    5
## [3,]    3    6
for (i in 1:nrow(mat)) {
  for (j in 1:ncol(mat)) {
    print(paste("mat[", i, ",", j, "] = ", mat[i, j], sep = ""))
  }
}
## [1] "mat[1,1] = 1"
## [1] "mat[1,2] = 4"
## [1] "mat[2,1] = 2"
## [1] "mat[2,2] = 5"
## [1] "mat[3,1] = 3"
## [1] "mat[3,2] = 6"
The example above iterates first by row, then by column. In other words, start with row 1 and go through every column, then move to the next row and iterate through every column, and so on.
Let’s look at an example with our list of primers. We are interested in knowing the length of each of our primers.
for (i in primers[[1]]) { # could also use primers$V1 or other subsetting here print(nchar(i)) } ## [1] 61 ## [1] 61 ## [1] 62 ## [1] 60 ## [1] 60 ## [1] 60 ## [1] 60 ## [1] 60 ## [1] 60 ## [1] 61 ## [1] 61 ## [1] 61 ## [1] 60 ## [1] 60 ## [1] 60 ## [1] 24 ## [1] 24 ## [1] 24 ## [1] 24 ## [1] 24 ## [1] 24 ## [1] 24 ## [1] 24 ## [1] 30 ## [1] 22 ## [1] 24 ## [1] 24 ## [1] 26 ## [1] 24 ## [1] 24 ## [1] 24 ## [1] 27 ## [1] 27 ## [1] 24 ## [1] 60 ## [1] 60 ## [1] 60 ## [1] 60 ## [1] 76 ## [1] 60 ## [1] 60 ## [1] 60 ## [1] 60 ## [1] 60 ## [1] 60 ## [1] 60 ## [1] 21 ## [1] 21 ## [1] 21 ## [1] 23 ## [1] 24 ## [1] 24 ## [1] 26 ## [1] 25 ## [1] 22 ## [1] 24 ## [1] 30 ## [1] 30 ## [1] 54 ## [1] 54 ## [1] 30 ## [1] 40 ## [1] 47 ## [1] 37 ## [1] 20 ## [1] 18 ## [1] 25 ## [1] 60 ## [1] 60 ## [1] 60 ## [1] 60 ## [1] 60 ## [1] 60 ## [1] 60 ## [1] 60 ## [1] 60 ## [1] 60 ## [1] 60 ## [1] 60 ## [1] 22 ## [1] 60 ## [1] 60 ## [1] 60 ## [1] 27 ## [1] 23 ## [1] 35 ## [1] 22 ## [1] 22 ## [1] 23 ## [1] 20 ## [1] 27 ## [1] 60 ## [1] 60 ## [1] 60 ## [1] 60 ## [1] 60 ## [1] 60 ## [1] 60 ## [1] 60 ## [1] 60 ## [1] 20 ## [1] 19 ## [1] 20 ## [1] 22 ## [1] 23 ## [1] 22 ## [1] 22 ## [1] 60 ## [1] 29 ## [1] 27 ## [1] 21 ## [1] 24 ## [1] 48 ## [1] 48 ## [1] 22 ## [1] 17 ## [1] 19 ## [1] 60 ## [1] 60 ## [1] 60 ## [1] 60 ## [1] 60 ## [1] 60 ## [1] 60 ## [1] 61 ## [1] 60 ## [1] 35 ## [1] 49 ## [1] 43 ## [1] 37 ## [1] 22 ## [1] 18 ## [1] 22 ## [1] 22 ## [1] 46 ## [1] 44 ## [1] 41 ## [1] 34 ## [1] 21 ## [1] 18 ## [1] 21 ## [1] 61 ## [1] 60 ## [1] 61 ## [1] 60 ## [1] 60 ## [1] 60 ## [1] 60 ## [1] 60 ## [1] 60 ## [1] 21 ## [1] 22 ## [1] 25 ## [1] 23 ## [1] 22 ## [1] 24 ## [1] 21 ## [1] 23 ## [1] 23 ## [1] 23 ## [1] 51 ## [1] 30 ## [1] 25 ## [1] 24 ## [1] 25 ## [1] 23 ## [1] 22 ## [1] 24 ## [1] 23 ## [1] 20 ## [1] 24 ## [1] 37 ## [1] 60 ## [1] 60 ## [1] 59 ## [1] 22 ## [1] 23 ## [1] 20 ## [1] 20 ## [1] 60 ## [1] 60 ## [1] 60 ## [1] 24 ## [1] 54 ## [1] 61 ## [1] 61 ## [1] 18 ## [1] 22 ## [1] 60 ## [1] 60 ## 
[1] 60 ## [1] 24 ## [1] 60 ## [1] 60 ## [1] 60 ## [1] 22 ## [1] 60 ## [1] 60 ## [1] 60 ## [1] 60 ## [1] 60 ## [1] 60 ## [1] 60 ## [1] 60 ## [1] 60 ## [1] 20 ## [1] 22 ## [1] 25 ## [1] 60 ## [1] 60 ## [1] 60 ## [1] 24 ## [1] 60 ## [1] 60 ## [1] 60 ## [1] 17 ## [1] 23 ## [1] 18 ## [1] 20 ## [1] 22 ## [1] 22 ## [1] 23 ## [1] 60 ## [1] 60 ## [1] 81 ## [1] 80 ## [1] 81 ## [1] 80 ## [1] 22 ## [1] 23 ## [1] 20 ## [1] 20 ## [1] 60 ## [1] 60 ## [1] 81 ## [1] 80 ## [1] 81 ## [1] 80 ## [1] 24 ## [1] 20 ## [1] 43 ## [1] 40 ## [1] 40 ## [1] 19 ## [1] 40 ## [1] 40 ## [1] 40 ## [1] 22 ## [1] 22 ## [1] 60 ## [1] 61 ## [1] 20 ## [1] 20 ## [1] 17 ## [1] 17 ## [1] 18 ## [1] 38 ## [1] 42 ## [1] 60 ## [1] 29 ## [1] 26 ## [1] 42 ## [1] 50 ## [1] 18 ## [1] 18 ## [1] 50 ## [1] 23 ## [1] 22 ## [1] 21 ## [1] 20 ## [1] 21 ## [1] 22 ## [1] 20 ## [1] 22 ## [1] 21 ## [1] 53 ## [1] 60 ## [1] 20 ## [1] 71 ## [1] 33 ## [1] 33 ## [1] 62 ## [1] 18 ## [1] 20 ## [1] 51 ## [1] 37 ## [1] 45 ## [1] 60 ## [1] 35 ## [1] 39 ## [1] 55 ## [1] 56 ## [1] 35 ## [1] 34 ## [1] 35 ## [1] 39 ## [1] 49 ## [1] 48 ## [1] 47 ## [1] 41 ## [1] 60 ## [1] 60 ## [1] 60 ## [1] 22 ## [1] 18 ## [1] 57 ## [1] 20 ## [1] 18 ## [1] 20 ## [1] 22 ## [1] 20 ## [1] 20 ## [1] 23 ## [1] 23 ## [1] 20 ## [1] 20 ## [1] 20 ## [1] 20 ## [1] 20 ## [1] 20 ## [1] 20 ## [1] 20 ## [1] 22 ## [1] 23 ## [1] 22 ## [1] 20 ## [1] 19 ## [1] 22 ## [1] 22 ## [1] 60 ## [1] 60 ## [1] 22 ## [1] 59 ## [1] 60 ## [1] 35 ## [1] 34 ## [1] 35 ## [1] 35 ## [1] 55 ## [1] 52 ## [1] 35 ## [1] 30 ## [1] 24 ## [1] 22 ## [1] 20 ## [1] 20 ## [1] 21 ## [1] 20 ## [1] 23 ## [1] 22 ## [1] 23 ## [1] 22 ## [1] 20 ## [1] 22 ## [1] 22 ## [1] 23 ## [1] 22 ## [1] 22 ## [1] 23 ## [1] 19 ## [1] 21 ## [1] 20 ## [1] 20 ## [1] 18 ## [1] 20 ## [1] 21 ## [1] 20 ## [1] 22 ## [1] 21 ## [1] 21 ## [1] 21 ## [1] 22 ## [1] 19 ## [1] 18 ## [1] 19 ## [1] 22 ## [1] 21 ## [1] 19 ## [1] 21 ## [1] 21 ## [1] 60 ## [1] 60 ## [1] 21 ## [1] 20 ## [1] 60 ## [1] 60 ## [1] 22 ## [1] 21 ## [1] 60 ## [1] 60 ## 
[1] 22 ## [1] 22 ## [1] 60 ## [1] 60 ## [1] 23 ## [1] 21 ## [1] 60 ## [1] 60 ## [1] 23 ## [1] 22 ## [1] 60 ## [1] 60 ## [1] 22 ## [1] 22 ## [1] 60 ## [1] 60 ## [1] 20 ## [1] 23 ## [1] 60 ## [1] 60 ## [1] 21 ## [1] 21 ## [1] 60 ## [1] 60 ## [1] 60 ## [1] 22 ## [1] 23 ## [1] 60 ## [1] 60 ## [1] 23 ## [1] 22 ## [1] 60 ## [1] 60 ## [1] 22 ## [1] 21 ## [1] 60 ## [1] 60 ## [1] 22 ## [1] 20 ## [1] 60 ## [1] 60 ## [1] 21 ## [1] 22 ## [1] 60 ## [1] 60 ## [1] 23 ## [1] 23 ## [1] 60 ## [1] 60 ## [1] 22 ## [1] 21 ## [1] 60 ## [1] 60 ## [1] 22 ## [1] 19 ## [1] 60 ## [1] 60 ## [1] 21 ## [1] 21 ## [1] 60 ## [1] 60 ## [1] 21 ## [1] 18 ## [1] 60 ## [1] 60 ## [1] 22 ## [1] 21 ## [1] 60 ## [1] 60 ## [1] 23 ## [1] 22 ## [1] 60 ## [1] 60 ## [1] 22 ## [1] 22 ## [1] 60 ## [1] 60 ## [1] 22 ## [1] 22 ## [1] 60 ## [1] 60 ## [1] 60 ## [1] 22 ## [1] 21 ## [1] 60 ## [1] 60 ## [1] 22 ## [1] 20 ## [1] 60 ## [1] 60 ## [1] 60 ## [1] 60 ## [1] 22 ## [1] 23 ## [1] 60 ## [1] 60 ## [1] 22 ## [1] 22 ## [1] 60 ## [1] 60 ## [1] 60 ## [1] 60 ## [1] 60 ## [1] 60 ## [1] 60 ## [1] 60 ## [1] 60 ## [1] 60 ## [1] 60 ## [1] 60 ## [1] 60 ## [1] 60 ## [1] 60 ## [1] 60 ## [1] 59 ## [1] 59 ## [1] 59 ## [1] 20 ## [1] 22 ## [1] 20 ## [1] 23 ## [1] 60 ## [1] 60 ## [1] 40 ## [1] 40 ## [1] 60 ## [1] 60 ## [1] 60 ## [1] 60 ## [1] 60 ## [1] 60 ## [1] 60 ## [1] 60 ## [1] 23 ## [1] 22 ## [1] 20 ## [1] 20 ## [1] 23 ## [1] 24 ## [1] 23 ## [1] 18 ## [1] 58 ## [1] 58 ## [1] 21 ## [1] 22 ## [1] 60 ## [1] 60 ## [1] 61 ## [1] 58 ## [1] 59 ## [1] 23 ## [1] 23 ## [1] 19 ## [1] 58 ## [1] 59 ## [1] 58 ## [1] 59 ## [1] 58 ## [1] 59 ## [1] 40 ## [1] 40 ## [1] 40 ## [1] 40 ## [1] 40 ## [1] 40 ## [1] 40 ## [1] 40 ## [1] 40 ## [1] 40 ## [1] 27
The function nchar returns the number of characters in a string. Therefore, this code prints the number of characters in each entry in the first column of the primers data set.
This is helpful, but we might want to use this data later. To do so, we will want to assign the primer lengths to a variable rather than just printing a list of numbers.
len <- numeric(length = length(primers$V1))
for (i in 1:length(primers$V1)) {
  len[i] <- nchar(primers[i, 1])
}
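Because nchar is vectorized, the same lengths can also be computed without an explicit loop. The sketch below uses a small hypothetical vector, toy_primers, in place of the primers data so that it runs on its own. The names len_loop and len_vec are also assumptions made for illustration:

```r
# 'toy_primers' is a hypothetical stand-in for primers$V1
toy_primers <- c("GATTACA", "ACGT", "TTAGGCCAAT")

# loop version, following the pattern above
len_loop <- numeric(length = length(toy_primers))
for (i in seq_along(toy_primers)) {
  len_loop[i] <- nchar(toy_primers[i])
}

# vectorized version: nchar() accepts the whole character vector at once
len_vec <- nchar(toy_primers)

len_loop  # 7 4 10
len_vec   # 7 4 10
```

Both approaches give the same numbers. The loop is shown here because it illustrates the control statement, not because it is required for this task.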
In R, you create an if/else statement with the following syntax:
if (logical statement) {
  do something
} else {
  do something different
}
For example:
if (TRUE) {
  print("TRUE")
} else {
  print("FALSE")
}
## [1] "TRUE"
if (FALSE) {
  print("TRUE")
} else {
  print("FALSE")
}
## [1] "FALSE"
if (1 > 2) {
  print("TRUE")
} else {
  print("FALSE")
}
## [1] "FALSE"
Not all if/else statements have to have an else clause. By default, if no else clause is given and the condition is FALSE, R does nothing.
x <- 1000
if (x > 100) print("'x' is greater than 100")
## [1] "'x' is greater than 100"
If/else statements can also be nested:
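A minimal sketch of nesting, assuming a variable y with the value 25 and a hypothetical result variable size. The inner if/else is only reached when the outer condition is TRUE:

```r
y <- 25  # assumed value for illustration

if (y > 10) {
  # only evaluated when y is greater than 10
  if (y > 20) {
    size <- "greater than 20"
  } else {
    size <- "between 10 and 20"
  }
} else {
  size <- "10 or less"
}

print(size)
## [1] "greater than 20"
```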
Another way to combine if/else statements is to combine the conditional statements. There are four operators for combining logical statements: &, &&, |, and ||. The “and” operators (&/&&) return TRUE if both statements are TRUE. The “or” operators (|/||) return TRUE if either statement is TRUE. The single-character operators (& and |) return a vector.
1 > 0 & TRUE
## [1] TRUE
FALSE | 10 ## Remember from class 1 how 10 is coerced to TRUE
## [1] TRUE
1 > 0 & c(TRUE, FALSE, FALSE, TRUE)
## [1]  TRUE FALSE FALSE  TRUE
0 > 1 & c(TRUE, FALSE, FALSE, TRUE)
## [1] FALSE FALSE FALSE FALSE
0 > 1 | c(TRUE, FALSE, FALSE, TRUE)
## [1]  TRUE FALSE FALSE  TRUE
The double-character operators (&& and ||) will only return a single TRUE or FALSE value.
1 > 0 && 10 < 100
## [1] TRUE
1 > 0 && FALSE
## [1] FALSE
1 > 0 || FALSE
## [1] TRUE
1 > 0 && c(TRUE, FALSE)
## [1] TRUE
1 > 0 && c(FALSE, TRUE)
## [1] FALSE
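Notice that in the last two statements above, the order of the vector matters. The double-character operators are lazy: they only check as much as they need to. In the version of R used to produce the output above, a double-character operator given a vector only checks the first element. Since R 4.3.0, passing a vector of length greater than one to && or || is an error instead.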
Furthermore, if the first expression is TRUE, the || operator will not even evaluate the second expression. Similarly, if the first expression is FALSE, the && operator will not evaluate the second expression.
TRUE || r
## [1] TRUE
TRUE && r
## Error in eval(expr, envir, enclos): object 'r' not found
FALSE || r
## Error in eval(expr, envir, enclos): object 'r' not found
FALSE && r
## [1] FALSE
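Short-circuiting is useful for guarding a test that would fail on some inputs. The sketch below uses a hypothetical helper, safe_positive, and assumes its argument is either NULL or a single number:

```r
# 'safe_positive' is a hypothetical helper for illustration
safe_positive <- function(v) {
  # the right-hand side is only evaluated when v is not NULL
  !is.null(v) && v > 0
}

safe_positive(NULL)  # FALSE
safe_positive(5)     # TRUE
safe_positive(-2)    # FALSE
```

With NULL as input, the comparison v > 0 is never evaluated, so the function returns a single FALSE rather than a zero-length result that an if statement could not use.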
You can combine logical expressions within an if/else statement (recall we defined y as 25 in an earlier expression).
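A minimal sketch of such a statement, assuming y is 25 as stated and using a hypothetical variable msg to hold the result:

```r
y <- 25  # value assumed from the text above

# both comparisons must be TRUE for the first branch to run
if (y > 20 && y < 30) {
  msg <- "'y' is between 20 and 30"
} else {
  msg <- "'y' is not between 20 and 30"
}

print(msg)
## [1] "'y' is between 20 and 30"
```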
The next control statement we will discuss is the while statement. Although while statements are rarely used in R, they are often used in other languages and are good to understand. The while syntax in R is:
while (condition) {
  do something
  update condition
}
Try the simple example below.
val <- 3
while (val < 10) {
  print(val)
  val <- val + 1
}
## [1] 3
## [1] 4
## [1] 5
## [1] 6
## [1] 7
## [1] 8
## [1] 9
If you omitted the second line of the loop body, which updates val, R would keep printing 3 until you manually killed the process.
Repeat-statements are very similar to while-statements, but do not include the termination criteria in the statement call. This amounts to a difference in when the criteria are checked: while always checks the criteria at the beginning of each iteration, but repeat allows the user to define when to check the criteria. break provides the necessary ability to terminate the repeat.
The repeat syntax in R is:
repeat {
  do something
  if (condition) break
  update condition
}
Suppose we want to find the first primer starting with C and return its index.
i <- 1
repeat {
  startBase <- substr(x = primers$V1[i], start = 1, stop = 1)
  if (startBase == "C") break
  i <- i + 1
}
i
## [1] 12
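The loop above assumes at least one primer starts with C. Otherwise i runs past the end of the vector and the comparison fails on a missing value. The sketch below adds a second break that guards against this. The vector toy_primers and the result variable found are hypothetical names used so that the example runs on its own:

```r
# 'toy_primers' is a hypothetical stand-in for primers$V1
toy_primers <- c("GATTACA", "ACGT", "CTTAGG", "CCAAT")

i <- 1
found <- NA  # stays NA if no primer starts with "C"
repeat {
  # first break point: stop once every element has been checked
  if (i > length(toy_primers)) break
  startBase <- substr(toy_primers[i], start = 1, stop = 1)
  # second break point: stop at the first match and record its index
  if (startBase == "C") {
    found <- i
    break
  }
  i <- i + 1
}
found  # 3
```

Keeping the result in a separate variable makes it possible to tell "no match" (NA) apart from a match at a given index.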
Sometimes using a repeat-statement greatly simplifies the syntax over a while-statement. See exercise 5. Repeat-statements also allow for much more complicated termination criteria, e.g., there could be multiple break points, each with its own multivariable criteria.
Sometimes it is useful to skip iterations in loops without exiting the loop; next provides this functionality. The following shows how to use both next and break in a for-loop.
set.seed(1234)
letterJumble <- sample(letters)
for (i in letterJumble) {
  if (i == "r") break
  if (i > "r") next
  print(i)
}
## [1] "p"
## [1] "e"
## [1] "l"
## [1] "o"
## [1] "i"
## [1] "f"
## [1] "d"
## [1] "b"
## [1] "g"
## [1] "j"
## [1] "n"
## [1] "q"
## [1] "h"
## [1] "k"
## [1] "m"
## [1] "c"
## [1] "a"
First, we randomized the order of the letters using sample (setting the seed using set.seed just ensures the “random” sample is the same every time). Then we loop through the letters, printing those that come before “r”, until we find “r”.
Let’s start with a simple example of combining for-loops and if/else statements. We want to print just those primers that are longer than 45 base pairs.
# loop through all primers for (i in 1:length(primers$V1)) { # print primer if it is longer than .. if (nchar(primers[i, 1]) > 45){ print(primers[i, 1]) } } ## [1] "GAACAAAGAAGTACAAAGGAGTAAATACAATTTTATTATCCGGATCCCCGGGTTAATTAA " ## [1] "TACACCGAGTATACAACATGACTATACCATGGAAGGTTGGCGGATCCCCGGGTTAATTAA " ## [1] "TGACCACCTGGCTTGCAGCTAATAGTGAAAAAACACAAATGAATTCGAGCTCGTTTAAAC " ## [1] "AAGAATGGACGACTTCTTATCACGTATAGGAGTGATATACCGGATCCCCGGGTTAATTAA" ## [1] "GGAAACGGTAGGTATTCACAGTCCTATTGATGAAAAATGCCGGATCCCCGGGTTAATTAA" ## [1] "AAACAAGTATATATGCTTATGAACTAGTGAATTCCTTACAGAATTCGAGCTCGTTTAAAC" ## [1] "AAACAAGTATATATGCTTATGAACTAGTGAATTCCTTACAGAATTCGAGCTCGTTTAAAC" ## [1] "TGTTCCTATAAAAGGCGGCGATAAAGCTTCCTCTGAGCTCCGGATCCCCGGGTTAATTAA" ## [1] "GATATTTGAATGACACTTTTAAATGCGTATATAACAGCTCGAATTCGAGCTCGTTTAAAC" ## [1] "TAAATCGTACAACTATTAAATTGATATATGAAAGACGGATCGGATCCCCGGGTTAATTAA " ## [1] "AGGTCGTGACGTGACTGACTGGTGGGAATCCAGGGAAATTCGGATCCCCGGGTTAATTAA " ## [1] "CTTATTTTTATTCTATGTCTATCAAATGCTAAACGTTACAGAATTCGAGCTCGTTTAAAC " ## [1] "CTATCAAAGTACACAAACGTAGAATCAGTACATCGGAACTCGGATCCCCGGGTTAATTAA" ## [1] "ACACGAGTCGAGACTATTTATAAGTGATGGGGCCGTGTTTCGGATCCCCGGGTTAATTAA" ## [1] "ATTAAAGGAAAAAATAATAAAATAACCTCCCTGTCACAAGGAATTCGAGCTCGTTTAAAC" ## [1] "TATCAAAGTACACAAACGTAGAATCAGTACATCGGAACTCGCGGCCGCCAGGGGATAACT" ## [1] "ACCCTTCGGTATCTCAGCATCTAGGTAGCTTTCATTGAGATTGCGGCCGCATAGGCCACT" ## [1] "CCATATAGAAACCCCTTCTGTATCAATTCAAATTAAGTGCGCGGCCGCCAGGGGATAACT" ## [1] "ACCTAGGATGATTACTTTCAAAATATTTTTTTTTCTAGAAGAGCGGCCGCATAGGCCACT" ## [1] "AAATCGTACAACTATTAAATTGATATATGAAAGACGGATCGCGGCCGCCAGGGGATAACT " ## [1] "GGTAGATCTTGATGGATCAAAAACCGTAATATAGTGTAGTCTGCGGCCGCATAGGCCACT" ## [1] "TGCTAATGATGGGACCAGCGCAAACAGTGCTTGCAGTTGTCGGATCCCCGGGTTAATTAA" ## [1] "CCAGACTTTTTTTTTATATATATTTATTTTCCCCTCTTGCGAATTCGAGCTCGTTTAAAC" ## [1] "TCTAAGCTATAAAAAAATATCCCTTTTATCACACAAAAAACGGATCCCCGGGTTAATTAA" ## [1] "TTGTTTAATTTGCCAGACGGAATCTAACCCAAAAATAGTACGGATCCCCGGGTTAATTAA" ## [1] "ACTTTTATGTAACCAAAGTTGTATTAAATATTTAGAAATGGAATTCGAGCTCGTTTAAAC" ## [1] 
"AAATATATATCTGCCGAGACCATTACTCATTACACCTAGACGGATCCCCGGGTTAATTAA" ## [1] "TCGAATAACTTCGTATAGCATACATTATACGAAGTTATTTAATTAAGGCGCGCC" ## [1] "CTAGGGCGCGCCTTAATTAAATAACTTCGTATAATGTATGCTATACGAAGTTAT" ## [1] "TCTCGGATCCTCTTCTAGAAAAAAAAATATTTTGAAAGTAATCATCC" ## [1] "CATCTCCGTGAAGCATTGAGGGAAGGGTTTAACTCCAACACGGATCCCCGGGTTAATTAA" ## [1] "GTGACGAAATTTAAATTTTGAAGCACCAATTATCAACCAAGAATTCGAGCTCGTTTAAAC" ## [1] "GATTGATCATTTACATAATCTGGCACAATACTGGCGGACCCGGATCCCCGGGTTAATTAA" ## [1] "GCGAATCCTATTGCATGCAGAGAAGGGTAAAAGATACATACGGATCCCCGGGTTAATTAA" ## [1] "TTTGCAGCTAAATGAAAGAAAAAAAAAGAAATGGCACATAGAATTCGAGCTCGTTTAAAC" ## [1] "AAATATCTACAAGTGTGAAGACGAGAGAATTGATTATTTTCGGATCCCCGGGTTAATTAA" ## [1] "TGGCAGTCTTGTCATCCATATAAAAAAGGAGTTCTGATTGCGGATCCCCGGGTTAATTAA" ## [1] "CACTTCGGTCCTTCAAGCTTGGTAGTTGATAAATTGTAAAGAATTCGAGCTCGTTTAAAC" ## [1] "ACACCACAAGGGAGCTGTATGTGTCGCATTAAACGATGTACGGATCCCCGGGTTAATTAA" ## [1] "AAAATATTGTTAAAGAAAGTAACAGGAAGAGAAATCGGATCGGATCCCCGGGTTAATTAA" ## [1] "CTCCATTTCTATATAAAAAGCATACATAGAGTTACAAATTGAATTCGAGCTCGTTTAAAC" ## [1] "CAGCGATACGGACGGCATGCAAGACCAGTCAAGTAATATACGGATCCCCGGGTTAATTAA" ## [1] "TAAGCAATAGTTTGCTCATAACATATTCTCTACATTAGATCGGATCCCCGGGTTAATTAA" ## [1] "ACTCAGTCAAGTTTTACCTGAAAGTGAAAGAAGGTTGGATCGGATCCCCGGGTTAATTAA" ## [1] "TACGTAAGCAAAGTTTTATGTAACAAAAAAAAAAAAAGAAGAATTCGAGCTCGTTTAAAC" ## [1] "ATTGGGATATATCATATATCCTTACTGAGTAACTATAATTCGGATCCCCGGGTTAATTAA" ## [1] "GGAAAATGTGGGCTACATGCATACACAGCCACAACAAAGGCGGATCCCCGGGTTAATTAA" ## [1] "TAGCTACTCTTTTAAACTGTGTGCTTCGTGTTGGTTGTTTGAATTCGAGCTCGTTTAAAC" ## [1] "AAGAAAGCAGAACATACAGCCCGGTTGAATAGCATGAGTCCGGATCCCCGGGTTAATTAA" ## [1] "AGTATTTTTCATGAGTGATATGAACCTTTCTAAAGAAGGTCGGATCCCCGGGTTAATTAA" ## [1] "TTTCTTTAGTTGTGCCCTTTAAAATAAAACTTTACCATTTGAATTCGAGCTCGTTTAAAC" ## [1] "GTTGTTCTATAAGGTAACAAAATAAAGTGAAGAAGTAAATCGGATCCCCGGGTTAATTAA" ## [1] "AAACGCTAATAAAAATGATGCCATATCAAAAAAAGTTGAACGGATCCCCGGGTTAATTAA" ## [1] "AATTTACATTATCCTCATGGCTACTCTAGCTATTTGCATTGAATTCGAGCTCGTTTAAAC" ## [1] "TTCAAATCTCTTTTACAACACCAGACGAGAAATTAAGAAACGGATCCCCGGGTTAATTAA" 
## [1] "GGTGCTTGTTCATGTATGAAGTTGACATAGTTGCAAATGTATATATAG" ## [1] "CTATATATACATTTGCAACTATGTCAACTTCATACATGAACAAGCACC" ## [1] "GTTGTTCGGAAAGTACTTCTTTTATTTTCTTTTATACATCCGGATCCCCGGGTTAATTAA" ## [1] "GTTTCAGGTCACTGCGGTTGTGGTTTCATACCAGGGAGTTCGGATCCCCGGGTTAATTAA" ## [1] "AATTAAAATCTTGTCATTTGTGACAAACGTTTAGCACTGTGAATTCGAGCTCGTTTAAAC" ## [1] "GCAAATCAATGATAAGTACAAGTCCAATCGGACTGATTCGCGGATCCCCGGGTTAATTAA" ## [1] "ATATGAATTGAATATATATCAAAAATGTCTGCAAAAATTTGAATTCGAGCTCGTTTAAAC" ## [1] "CAAATTAAGAGGAACCCTTTTTTTTTTTGATTTCGATACACGGATCCCCGGGTTAATTAA" ## [1] "AGGCAGAAGTACCGCCCAAAGAGGCGGTTATAGCGCCGTTCGGATCCCCGGGTTAATTAA" ## [1] "GCTGTTGCAAAAATATCGAATTGTAAGCCAGTAAACTTATGAATTCGAGCTCGTTTAAAC " ## [1] "TCAATCGATGCGATAGATAAAGGTAAGGAAAGCTTTCACGCGGATCCCCGGGTTAATTAA" ## [1] "ACCCGCGGCCGCTTTATGTGATGATTGATTGATTGATTGTACAGTTTG " ## [1] "CAAACAGCGGCCGCATGAGTAAAGGAGAAGAACTTTTCACTGGAG " ## [1] "CAATATTCGCCTAGATGGAGAAAATAATTCTTGTAGCTGTCGGATCCCCGGGTTAATTAA " ## [1] "AAAGGATTACATAATAGAAGATACAATTAAGTAGTACAGCGAATTCGAGCTCGTTTAAAC" ## [1] "TCCATATAGAAACCCCTTCTGTATCAATTCAAATTAAGTGCGGATCCCCGGGTTAATTAA " ## [1] "CTAAAAAGAGGGAAAAGAAATAGTATACCATTCCGCAAAGCGGATCCCCGGGTTAATTAA" ## [1] "AAATTTGTCAATTTTTGGATTTGATATGTTTAATAGAAAGCGGATCCCCGGGTTAATTAA" ## [1] "ATACTAACTCACTTATTTGTCTTCTTTGCCGTTAACCAGAGAATTCGAGCTCGTTTAAAC" ## [1] "ACTGGGGCGAAGAATATCTAGTTATCCACTCCTTCATAGACGGATCCCCGGGTTAATTAA" ## [1] "CCATATGACTTTAGACTCATGGCAGCCATTGACCCCAAAACGGATCCCCGGGTTAATTAA" ## [1] "ACAAACGGAACAACAACCACACTTCAAAGATAACATATTCGAATTCGAGCTCGTTTAAAC" ## [1] "CAACGGATCCTAGTAAGCCGATCCCATTACCGACATTTGGGCGCTATACG " ## [1] "ATACTCGTATAAGCAAGAAATAAAGATACGAATATACAATCGGATCCCCGGGTTAATTAA" ## [1] "ACACATGGATGACGAATTTGTCAGTGGAAGATTCGAAATACGGATCCCCGGGTTAATTAA" ## [1] "AGGAAAAGAAGAGGAAGGGCAAGAGGAGCGATTGAGAAAGAATTCGAGCTCGTTTAAAC" ## [1] "AGCATTTTAACGAAGAGTATATACCTACTATTAGACATTACGGATCCCCGGGTTAATTAA" ## [1] "AAGCGATATTGAAATTAATGACGACTTAAATGGTGTTTTACGGATCCCCGGGTTAATTAA" ## [1] "AAGTGTACACTTGCCTTGTGTATTAAATGATGATTCGATAGAATTCGAGCTCGTTTAAAC" ## [1] 
"CAAAGTTCTACAAGAGTCATTCATACATCCCCTGCGGATCCCCGGGTTAATTAA" ## [1] "CATAAGACATGAAGGGCAAAGGAGCAAAAATTCCGTTAAACGGATCCCCGGGTTAATTAA " ## [1] "CTCCGGTATTCAATATGTAAAGTTCCGTTTCTATTTACCAGAATTCGAGCTCGTTTAAAC " ## [1] "GGAGTCAAAGAGTTGCGCACCCAGAACCATTGTAATTAGCCGGATCCCCGGGTTAATTAA" ## [1] "GAAAGTGAAGCAGATGAGGAAGAAAAAAGACAAGGGTGACCGGATCCCCGGGTTAATTAA" ## [1] "ACATATTTGTTTATGCAAAAACAAAAACAGGAAGCAAAAGGAATTCGAGCTCGTTTAAAC" ## [1] "AAAGGGAAGTAAAAGTTAAAAACTAGAATCCTAGTATGACCGGATCCCCGGGTTAATTAA" ## [1] "AACAGTCACAGGACCTCATGACCGATGGTACGTGGTAGGCCGGATCCCCGGGTTAATTAA" ## [1] "ATGCAACTTTATACACACGGCAGGAAAAAAAGTGCGCACTGAATTCGAGCTCGTTTAAAC" ## [1] "TTATCGAATACGATTAAACACTACGCCAGATTTCCACAATCGGATCCCCGGGTTAATTAA" ## [1] "TATATTAAGATTGAAGTACAATGACGCTAACACTAAGTTACGGATCCCCGGGTTAATTAA" ## [1] "TTTGTGCGTAACCCACGCTTACGATATTGGAATTACAATTGAATTCGAGCTCGTTTAAAC" ## [1] "GCGAACAGCAGAATTTGTCCTTGGTTTTCAGAGTTTGAAACGGATCCCCGGGTTAATTAA" ## [1] "AAAACCAATTCAAGATTTTATCTTCAATTTGGTTGGGGAACGGATCCCCGGGTTAATTAA" ## [1] "ACTTGTGTAATATATGTGTATATAAAAAATATACATGTTCGAATTCGAGCTCGTTTAAAC" ## [1] "AAATCGGCCAATAAAAGAGCATAACAAGGCAGGAACAGCTCGGATCCCCGGGTTAATTAA" ## [1] "AATTCTCCAAGTTTATTCATGGTGTGGCTACCTTACTAAACGGATCCCCGGGTTAATTAA" ## [1] "ATCCATTTGAGGTAACCAAAATGGGATTGAATCCACTTTAGAATTCGAGCTCGTTTAAAC" ## [1] "AGGGTGTTCTTTCTTCTGTACTATATATACATTTGCAACTCGGATCCCCGGGTTAATTAA" ## [1] "ATTGTATAATATTACTCAACAGATTTTACAATTTTTACATCGGATCCCCGGGTTAATTAA" ## [1] "TGATAAAAATTATAATGCCTAGTCCCGCTTTTGAAGAAAAGAATTCGAGCTCGTTTAAAC" ## [1] "AGTACAACATTATGACGAATACTACCATGAGTTGACCAACCGGATCCCCGGGTTAATTAA" ## [1] "CTACGATGCTTTGGACATTGCTAACAGAATCGGTTACATTCGGATCCCCGGGTTAATTAA" ## [1] "TTTAAAAATAATATTAAATTTATTAATTAAACCAATTAGAGAATTCGAGCTCGTTTAAAC" ## [1] "ATTTTCCATATAATAATAGTAATTACCGTCCATTAGTACAGAGCTCGTTTTCGACACTGG" ## [1] "TCCTTACCATTAAGTTGATCTGGCATAATATTCGTAAGGTTTAGGTTAGGTGGTGAGTTC" ## [1] "AGTTATGGCAGCGAACCCTGATTTTCCATATAATAATAGTAATTACCGTCCATTAGTACAGAACGCACCACCTAACCTAA " ## [1] "TCGACTCTTCAGCAAACTCTACATCAAAGAAAATGTTTCTTGGCATAATATTCGTAAGGTTTAGGTTAGGTGGTGCGTTC" ## [1] 
"AGTTATGGCAGCGAACCCTGATTTTCCATATAATAATAGTAATTACCGTCCATTAGTACAGAACGAACCACCTAACCTAA " ## [1] "TCGACTCTTCAGCAAACTCTACATCAAAGAAAATGTTTCTTGGCATAATATTCGTAAGGTTTAGGTTAGGTGGTTCGTTC" ## [1] "GTTATGTTTAGCCTAAATGAGCTTTCTTCTTCTAAAGCAGGAGCTCGTTTTCGACACTGG" ## [1] "TCCTTACCATTAAGTTGATCGTAGGAATAAAAACGTCACTGAATCTTGAAGGACTTATTA" ## [1] "ATGAAGCAACAAGTTCCATTGTTATGTTTAGCCTAAATGAGCTTTCTTCTTCTAAAGCAGTAATAGCTCCTTCAAGATTC " ## [1] "TTTTTCTCAATAACATTGTGAGATTTGCCGTAACTTGTGTAGGAATAAAAACGTCACTGAATCTTGAAGGAGCTATTACT" ## [1] "ATGAAGCAACAAGTTCCATTGTTATGTTTAGCCTAAATGAGCTTTCTTCTTCTAAAGCAGTAATAGAACCTTCAAGATTC " ## [1] "TTTTTCTCAATAACATTGTGAGATTTGCCGTAACTTGTGTAGGAATAAAAACGTCACTGAATCTTGAAGGTTCTATTACT" ## [1] "AAAGAACGACTACACCTCAACATAACGACACTTTTTTGACCGGATCCCCGGGTTAATTAA" ## [1] "AACATAAAAACACATGGTCTCAGTAGATAGAGTACATATTGAATTCGAGCTCGTTTAAAC_" ## [1] "ACGGATCCGAGCTCGAATTCATGTCAGCGATATTATCAACAACTAGCAAAAGTTTCTTAT" ## [1] "ACGGATCCGAGCTCGAATTCATGGGATCCATGGAGCTTACCATCTTTATC" ## [1] "AAACGACGGCCAGTGTTATTTGTATAGTTCATCCATGCCATGTGTAATCC" ## [1] "ACCAAGCATACAATCAACTATCCCGGGTAGTCGACATGGCCTCCTCCGAGGAC" ## [1] "TCACGACGTTGTAAAACGACGGCCAGTGAATTCTTATTTGTATAGTTCATCCATGCCATG" ## [1] "ACCAAGCATACAATCAACTATCCCGGGTAGATGTCAGCGATATTATCAACAACTAGCAAAAGTTTCTTATC" ## [1] "AGTCACGACGTTGTAAAACGACGGCCAGTGAATTCTTATTTGTATAGTTCATCCATGCCATG" ## [1] "ACCAAGCATACAATCAACTATCCCGGGTAGATGGAGCTTACCATCTTTATC" ## [1] "AGTCACGACGTTGTAAAACGACGGCCAGTGAATTCTTATTTGTATAGTTCATCCATGCCA" ## [1] "TAAGCACCCGGGTAGATGGCTCCATCTGGTATGTGAACTGCAATATTAATAGCAC" ## [1] "TGCTTACCGCGGTTCTAATGTAACCGATTCTGTTAGCAATGTCCAAAGCATCGTAG" ## [1] "GAAGAACCGCGGCCGCTCCACCGAGAGCCTCCTCCGAGGACGTCATCAA" ## [1] "CTTGATGAATTCTTATTTGTATAGTTCATCCATGCCATGTGTAATCCC" ## [1] "TTAGCTCCCGGGTAGATGTCAGCGATATTATCAACAACTAGCAAAAG" ## [1] "GAAGAACCCGGGTAAGAACCGCGGCCGCTCCACCGAGAGCCTCCTCCGAGGACGTCATCA" ## [1] "ATCATACCAAAATAAAAAGAGTGTCTAGAAGGGTCATATACGGATCCCCGGGTTAATTAA" ## [1] "TGATATTTATATGCTATAAAGAAATTGTACTCCAGATTTCGAATTCGAGCTCGTTTAAAC" ## [1] 
"GAAGAACCCGGGTAAGAACCGCTCCACCGAGAATGGCCTCCTCCGAGGACGTCATCA" ## [1] "TCATACCAAAATAAAAAGAGTGTCTAGAAGGGTCATATAATGCGTACGCTGCAGGTCGAC" ## [1] "ATATTTATATGCTATAAAGAAATTGTACTCCAGATTTCTTAATCGATGAATTCGAGCTCG" ## [1] "TTCTAGAAGAACGGAGATAGGAAACCTATGATGTAAGTATGCGTACGCTGCAGGTCGAC" ## [1] "TATTTGAATGACACTTTTAAATGCGTATATAACAGCTCTTAATCGATGAATTCGAGCTCG" ## [1] "TAAGCACTCGAGTAGATGGCTCCATCTGGTATGTGAACTGCAATATTAATAGCAC" ## [1] "TGCTTAGGATTCAATGTAACCGATTCTGTTAGCAATGTCCAAAGCATCGTAG" ## [1] "ATTTCTGCCAAAAAGAAAGTTCAAGAGCGTCCCATTCATCCGGATCCCCGGGTTAATTAA" ## [1] "GCGGTGCACAGGTGTTTTTATAGGGGGGTGATGGATTACAGAATTCGAGCTCGTTTAAAC" ## [1] "CCATAAGTGCGCGTGTTTGTGCCTTCTGATATGATATCGTCGGATCCCCGGGTTAATTAA" ## [1] "TTTTTTTTTAGATTGTTCGGTACTTAGTCAAGTTTTATTTGAATTCGAGCTCGTTTAAAC" ## [1] "AAAGGGAAGTAAAAGTTAAAAACTAGAATCCTAGTATGACCGGATCCCCGGGTTAATTAA" ## [1] "ATGCAACTTTATACACACGGCAGGAAAAAAAGTGCGCACTGAATTCGAGCTCGTTTAAAC" ## [1] "GTCACTGTTTTCGCAAAGACTCCCAGACACGGGCATTAAACGGATCCCCGGGTTAATTAA" ## [1] "CTTTATCACATTTATGAAAAAATGCATTTATATGAACTACGAATTCGAGCTCGTTTAAAC" ## [1] "TGGTTTTACCTATTAGGGATAGTAATCATAATTTAAAAATCGGATCCCCGGGTTAATTAA" ## [1] "AGATTAAATGGCAGTCCAAAAGAGATTTTTGATTTTCAGTGAATTCGAGCTCGTTTAAAC" ## [1] "AAACGTACGACAAGAACAAGAAGATCATCACATTGTAATTCGGATCCCCGGGTTAATTAA" ## [1] "TTATATGTCGTATGTATCTATTTATGGTATTCAGGGGCTTGAATTCGAGCTCGTTTAAAC" ## [1] "ACGTGAAGCTGTCGATATTGGGGAACTGTGGTGGTTGGCACGGATCCCCGGGTTAATTAA" ## [1] "GGGCAATCCGACTATATCTGAGACGAACAAAGACACTGTTGAATTCGAGCTCGTTTAAAC" ## [1] "CAGCAACTTTGGCGTCTCCAGCTCGTTTTCCTTCAAGCCTCGGATCCCCGGGTTAATTAA" ## [1] "TATATATATTCTGGTGTGAGTGTCAGTACTTATTCAGAGAGAATTCGAGCTCGTTTAAAC" ## [1] "ATTGTTTATTTATTTATTTAAGTACCCCTATTGTACGTACGAATTCGAGCTCGTTTAAAC" ## [1] "TTCAAATCTCTTTTACAACACCAGACGAGAAATTAAGAAACGGATCCCCGGGTTAATTAA" ## [1] "GGTCATTTGTACTTAATAAGAAAACCATATTATGCATCACGAATTCGAGCTCGTTTAAAC" ## [1] "ACAATTTTTCGGTATTAGTTACTAAAAGGCTCACATATACCGGATCCCCGGGTTAATTAA" ## [1] "TGTGTCAGGATGCTACTTTTGGAAACCTCCTTAAATATGAGAATTCGAGCTCGTTTAAAC" ## [1] 
"GTCGTATAAATTTATTACCAAGAACAAAAAATACACCCCGCGGATCCCCGGGTTAATTAA" ## [1] "TTTCCGAAGGATACTGCATTATCATCAGTGATTTATTAATGAATTCGAGCTCGTTTAAAC" ## [1] "CCAAAGGATATACTCTCAATTATAAATGGAAAAGCACATCCGGATCCCCGGGTTAATTAA" ## [1] "ACGGAATCCAAAATGCAAAATCGAAATGACACCTAAAAATGAATTCGAGCTCGTTTAAAC" ## [1] "GTCTATACACAGTGTTTACAACTCAGCTTATATTCATATCCGGATCCCCGGGTTAATTAA" ## [1] "AATATATATTATTAAATATATATATTTGAAGGGGAGTTGAGAATTCGAGCTCGTTTAAAC" ## [1] "CAGCTCTAACACATAATCATATCAACCACAGTACTCAGTACGGATCCCCGGGTTAATTAA" ## [1] "AAAGAAAACACTGACTCTTATAAAACAAACAATGAACATTGAATTCGAGCTCGTTTAAAC" ## [1] "CGACGACACATGTGTTCCATAAGCTAAACTCAAGGAGCAACGGATCCCCGGGTTAATTAA" ## [1] "AATAATAAATAAAGTAAAACAACTGCAAAAAACATCGATGGAATTCGAGCTCGTTTAAAC" ## [1] "TTTTGTTCCAGAATTAGTTATAGTTCCTTCAACCACATAGCGGATCCCCGGGTTAATTAA" ## [1] "GAAAACTCTTCCAAGCAAAGTCGGTTTGAGGCGTTTTCTGGAATTCGAGCTCGTTTAAAC" ## [1] "AAGTTGAACCGCATTTTCAAACGTTCAAACCAACCGAATCCGGATCCCCGGGTTAATTAA" ## [1] "TGCATAAATATGCTATATAAAGTCCACTACAAAAAGTCATGAATTCGAGCTCGTTTAAAC" ## [1] "ATTAGAAACAATAGGAAACAACAGAGTCGGAAGAAGCCAACGGATCCCCGGGTTAATTAA" ## [1] "CGAGAGCGGAGGAAGGGTAAGGAAGTAAAGAGAAAAATATGAATTCGAGCTCGTTTAAAC" ## [1] "CAGTTGTGCTCACATGATATAACAAAAGTAATACTGACAACGGATCCCCGGGTTAATTAA" ## [1] "AACAAATGTACATAAGAAATGTTTATCACAACTATCAAGTGAATTCGAGCTCGTTTAAAC" ## [1] "ATAAAAATAACCCTACTGCCAATTTGAAAGGGCCCGAAAACGGATCCCCGGGTTAATTAA" ## [1] "TATCATTATGTACGTGTTATAACCGCCATGTCTCACAGTAGAATTCGAGCTCGTTTAAAC" ## [1] "TGCCTATTGGCGCAAAGAAGACAGAGTGTGCAAACAAGAGCGGATCCCCGGGTTAATTAA" ## [1] "ATATACACATATATTATAGACTAATTGATAAATTTTTTTTGAATTCGAGCTCGTTTAAAC" ## [1] "CTGATTCAGGTACTAGTGGTGGAGAGAGCGGCATATTAAACGGATCCCCGGGTTAATTAA" ## [1] "AGTTCATATAAGGCGGCTCAATGCAGAACCGAGGATAGCGGAATTCGAGCTCGTTTAAAC" ## [1] "ATTGCAATTTAAACAGATTCTGGAGAATTACCGTCCAACACGGATCCCCGGGTTAATTAA" ## [1] "AAACAGTTTTGCGGTTTCCTTTATACTAAGAAGGTCTATACGGATCCCCGGGTTAATTAA" ## [1] "TAAAGGTGTAGTAGGACAGTAAGTATTCAATGAAATACAAGAATTCGAGCTCGTTTAAAC" ## [1] "AAACCTTTATCAGGTGGCCGACACTAGGGAATAAGACAGCCGGATCCCCGGGTTAATTAA" ## [1] 
"ATAAAATATTTAACATATGCTCTTCCAAATGTACATACTTGAATTCGAGCTCGTTTAAAC" ## [1] "TTTTGGTATTGTTTCTCAAAGAAGAAAATAGAAAGTGAGACGGATCCCCGGGTTAATTAA" ## [1] "TATGCGAGTGTATGTAAGTTTATAAGTGCCTGTGTGGCTAGAATTCGAGCTCGTTTAAAC" ## [1] "AGTAGAAATAATTGAAGGGCGTGTATAACAATTCTGGGAGCGGATCCCCGGGTTAATTAA" ## [1] "GGTGATTCTATACTTCCCCGGTTACTTATAGTTTTTTGTCGAATTCGAGCTCGTTTAAAC" ## [1] "GCGGAAGCGATAGTAATACATTTGGTAGCGATAAGTGCACCGGATCCCCGGGTTAATTAA" ## [1] "ATCTGATAGTCATAAAGATTCTTTTGGGAGATGCTGTCCCGAATTCGAGCTCGTTTAAAC" ## [1] "ATTGTTTATTTATTTATTTAAGTACCCCTATTGTACGTACGAATTCGAGCTCGTTTAAAC" ## [1] "ACGTGAAGCTGTCGATATTGGGGAACTGTGGTGGTTGGCAGTAGAATTTTCTCATTGGTC" ## [1] "ATTGTTTATTTATTTATTTAAGTACCCCTATTGTACGTACTTAAGCAAGGATTTTCTTAA" ## [1] "TGATAAGGGTGATGGTAAATTCTGGAGCTCGAAAAAGGACCGGATCCCCGGGTTAATTAA" ## [1] "ATGTTATGCGGGAACCAACCCTTTACAATTAGCTATCTAAGAATTCGAGCTCGTTTAAAC" ## [1] "TGACTTCACGAATGAAATCATCAACAAATTATCTACCATGCGGATCCCCGGGTTAATTAA" ## [1] "CACTTAAGTTGCAGAACAAAAAAAAGGGGAATTGTTTTCAGAATTCGAGCTCGTTTAAAC" ## [1] "CACATCTATCCTTGGTATGCAAGACATGTGGAAAGCTACACGGATCCCCGGGTTAATTAA" ## [1] "GCGGAGAATAGCCAAATAAAAAAAAAAGATGAAAAGAAAGGAATTCGAGCTCGTTTAAAC" ## [1] "CCCTGACTGGCAAAATGGACAGGTCGAAGACTCCATCCCACGGATCCCCGGGTTAATTAA" ## [1] "CAATTTCCGTTAAAAAACTAATTACTTACATAGAATTGCGGAATTCGAGCTCGTTTAAAC" ## [1] "AGTGAACACTGAACAAGCATACTCTCAACCATTTAGATACCGGATCCCCGGGTTAATTAA" ## [1] "TAAGTGACGATGATAACCGAGATGACGGAAATATAGTACAGAATTCGAGCTCGTTTAAAC" ## [1] "GTATCATACCAGAAGAGCATTCTTGATTCCATTTGTATTTCGGATCCCCGGGTTAATTAA" ## [1] "TCTCTTTACCTTGCATTTGGGCATGTTGCAAACAGGAGGAGAATTCGAGCTCGTTTAAAC" ## [1] "AAATTTTAGAAGTATGGCAGAAATGTTCTTGTGAATACTAATCGATGAATTCGAGCTCG " ## [1] "CAAAGACATGCTTTACGATGAACTAATGAAGACCATGGAACGTACGCTGCAGGTCGAC " ## [1] "GCATTTAGAATAAAGAAGCTGAATAAAATAAAAAAAATTAATCGATGAATTCGAGCTCG" ## [1] "GGTTGCTTGTGATAACTCAGGGCTTATTGGAATCTTCCAACGTACGCTGCAGGTCGAC " ## [1] "AAAAGCTGTAAGGTTATCAAAAAGGAAGGCATACAGTATACGGATCCCCGGGTTAATTAA" ## [1] "TACGCATTTAGAATAAAGAAGCTGAATAAAATAAAAAAAAGAATTCGAGCTCGTTTAAAC" ## [1] 
"GCTGATGAACAGCGCGATGAACGTTCAACCCACAGGATTTCGGATCCCCGGGTTAATTAA" ## [1] "ACACGATGGGGTACAGCAAACGAATTATTTTATCCACGTCGAATTCGAGCTCGTTTAAAC" ## [1] "TAAGATCGGTGCACCTGCATCCGGATTAAGAATACGTGTACGGATCCCCGGGTTAATTAA" ## [1] "CAGAATACAATCCCGAGAAAATCAATAGGCAAAATAGCATGAATTCGAGCTCGTTTAAAC" ## [1] "TCTTGAGGGTAAATGGGAACCCGCTGGTGAAGTTCATCAGCGGATCCCCGGGTTAATTAA" ## [1] "TTTTCTTTTGAGATGTTTCATTTTAAATTCTTGATACTCTGAATTCGAGCTCGTTTAAAC" ## [1] "AATTATTACTTTTATCGCTTCGTTAATGACTTTGAACAAACGGATCCCCGGGTTAATTAA" ## [1] "AAGCTGAGTAGAAAACAGGTTACGAAAGTTGTTTGTTGGCGAATTCGAGCTCGTTTAAAC" ## [1] "CAGTTAACTCTGTATCCTTTTCTTCTTCGGCCTGACAATGCGTACGCTGCAGGTCGAC" ## [1] "GTGACGTACGGAAGGCAGCGCGAGACACTTCCGTGATCAATCGATGAATTCGAGCTCG" ## [1] "TTCCAGTTAACTCTGTATCCTTTTCTTCTTCGGCCTGACACGGATCCCCGGGTTAATTAA" ## [1] "CGTTGTGACGTACGGAAGGCAGCGCGAGACACTTCCGTGAGAATTCGAGCTCGTTTAAAC" ## [1] "AAAAGCTGTAAGGTTATCAAAAAGGAAGGCATACAGTATAATGCGTACGCTGCAGGTCGAC" ## [1] "CTCCGTGAAGCATTGAGGGAAGGGTTTAACTCCAACAATGCGTACGCTGCAGGTCGAC" ## [1] "ACGAAATTTAAATTTTGAAGCACCAATTATCAACCAATCAATCGATGAATTCGAGCTCG" ## [1] "GTATAAATTTATTACCAAGAACAAAAAATACACCCCGATGCGTACGCTGCAGGTCGAC" ## [1] "CCGAAGGATACTGCATTATCATCAGTGATTTATTAATCTAATCGATGAATTCGAGCTCG" ## [1] "CTCTAACACATAATCATATCAACCACAGTACTCAGTAATGCGTACGCTGCAGGTCGAC" ## [1] "GAAAACACTGACTCTTATAAAACAAACAATGAACATTTTAATCGATGAATTCGAGCTCG" ## [1] "CTATTGGCGCAAAGAAGACAGAGTGTGCAAACAAGAGATGCGTACGCTGCAGGTCGAC" ## [1] "TACACATATATTATAGACTAATTGATAAATTTTTTTTTCAATCGATGAATTCGAGCTCG"
The following example is more complicated and combines for-loops and if/else statements. We now want to calculate the molecular weight of each of our primers. We know that the molecular weights of dCTP, dATP, dGTP, and dTTP are 467.2, 491.2, 507.2, and 482.2 g/mol, respectively.
# make empty numeric vector with the same length as the primers vector
masses <- numeric(length = length(primers$V1))
for (i in 1:length(primers$V1)) {
  plen <- nchar(primers[i, 1]) # get length of ith primer
  pmass <- 0 # set mass of primer equal to zero
  for (j in 1:plen) {
    l <- substr(primers[i, 1], j, j) # get jth nucleic acid of ith primer
    if (l == "C") {
      m <- 467.2
    } else if (l == "A") {
      m <- 491.2
    } else if (l == "G") {
      m <- 507.2
    } else if (l == "T") {
      m <- 482.2
    } # get mass of jth nucleic acid of ith primer
    pmass <- pmass + m # add mass of jth nucleic acid to the mass of the primer
  }
  masses[i] <- pmass # store the mass of the ith primer in the vector "masses"
}
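As an aside, the if/else chain above can also be sketched as a lookup in a named vector. This is an alternative illustration, not a replacement for the loop. The names base_mass, toy_primer, bases, and toy_mass are hypothetical, and the masses are the values given in the text:

```r
# named vector of masses (g/mol), using the values given in the text
base_mass <- c(C = 467.2, A = 491.2, G = 507.2, T = 482.2)

# 'toy_primer' is a hypothetical sequence for illustration
toy_primer <- "GATTACA"

# split the string into single characters, then look each one up by name
bases <- strsplit(toy_primer, split = "")[[1]]
toy_mass <- sum(base_mass[bases])
toy_mass  # 3412.4
```

One difference in behavior is worth noting: a character other than C, A, G, or T (for example a trailing space) yields NA in this sketch, whereas in the loop above m simply keeps the value from the previous base.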
You can now look at the results:
masses ## [1] 29788.2 29749.2 30144.4 29288.0 29313.0 29205.0 29205.0 29192.0 29212.0 ## [10] 29770.2 29893.2 29571.2 29195.0 29327.0 29210.0 11851.8 11646.8 11815.8 ## [19] 11619.8 11643.8 11686.8 11713.8 11713.8 14712.0 10543.4 11655.8 11650.8 ## [28] 12685.2 11655.8 11788.8 11597.8 13136.4 13310.4 11734.8 29246.0 29135.0 ## [37] 29154.0 29180.0 37030.2 29305.0 29297.0 29004.0 29112.0 29207.0 29250.0 ## [46] 29112.0 10255.2 10169.2 10136.2 11097.6 11569.8 11704.8 12586.2 12119.0 ## [55] 10559.4 11548.8 14862.0 14487.0 26275.8 26306.8 14698.0 19492.0 22815.4 ## [64] 17866.4 9690.0 8829.6 12266.0 29250.0 29199.0 29200.0 29331.0 29339.0 ## [73] 29326.0 29270.0 29220.0 29274.0 29411.0 29149.0 29277.0 10632.4 29148.0 ## [82] 29328.0 29322.0 13146.4 11087.6 17036.0 10647.4 10661.4 11229.6 9777.0 ## [91] 13198.4 29203.0 29286.0 29206.0 29301.0 29284.0 29095.0 29368.0 29275.0 ## [100] 29081.0 9746.0 9179.8 9813.0 10681.4 11243.6 10856.4 10657.4 29169.0 ## [109] 14091.8 13178.4 10306.2 11720.8 23511.6 23227.6 10704.4 8316.4 9265.8 ## [118] 29134.0 29301.0 29203.0 29242.0 29252.0 29170.0 29301.0 29713.2 29322.0 ## [127] 16960.0 23938.8 20967.6 18100.4 10721.4 8820.6 10641.4 10721.4 22540.2 ## [136] 21338.8 20198.2 16542.8 10239.2 8807.6 10203.2 29745.2 29321.0 29609.2 ## [145] 29317.0 29313.0 29066.0 29200.0 29114.0 29116.0 10172.2 10636.4 12208.0 ## [154] 11252.6 10863.4 11743.8 10150.2 11209.6 11237.6 11096.6 24784.2 14576.0 ## [163] 12223.0 11624.8 12138.0 11317.6 10752.4 11686.8 11170.6 9780.0 11766.8 ## [172] 17891.4 29266.0 29337.0 29084.8 10803.4 11052.6 9714.0 9733.0 29206.0 ## [181] 29317.0 29259.0 11574.8 26182.8 29824.2 29573.2 8820.6 10601.4 29259.0 ## [190] 29512.0 29290.0 11580.8 29347.0 29276.0 29235.0 10625.4 29112.0 29255.0 ## [199] 29180.0 29286.0 29253.0 29250.0 29303.0 29156.0 29222.0 9666.0 10853.4 ## [208] 12125.0 29153.0 29137.0 29230.0 11721.8 29186.0 29206.0 29210.0 8354.4 ## [217] 11101.6 8700.6 9837.0 10670.4 10672.4 11121.6 29139.0 29313.0 
39294.2 ## [226] 38989.0 39318.2 38964.0 10690.4 11185.6 9745.0 9806.0 29194.0 29221.0 ## [235] 39319.2 38996.0 39352.2 38962.0 11590.8 9708.0 21031.6 19477.0 19443.0 ## [244] 9269.8 19336.0 19584.0 19449.0 10792.4 10641.4 29106.0 29715.2 9610.0 ## [253] 9686.0 8240.4 8339.4 8741.6 18450.6 20441.4 29167.0 14240.8 12592.2 ## [262] 20441.4 24298.0 8788.6 8720.6 24312.0 11043.6 10768.4 10377.2 9797.0 ## [271] 10377.2 10768.4 9766.0 10768.4 10377.2 25696.6 29181.0 9699.0 34465.2 ## [280] 16164.6 15957.6 30179.4 8887.6 9828.0 24726.2 18131.4 21762.0 29190.0 ## [289] 17177.0 18991.8 26810.0 27234.2 17080.0 16423.8 17000.0 19062.8 23775.8 ## [298] 23269.6 22907.4 19758.2 29162.0 29298.0 29202.0 10745.4 8796.6 27719.4 ## [307] 9705.0 8801.6 9773.0 10703.4 9665.0 9832.0 11206.6 11248.6 9723.0 ## [316] 9776.0 9893.0 9643.0 9875.0 9723.0 9682.0 9742.0 10674.4 11194.6 ## [325] 10706.4 9804.0 9194.8 10754.4 10737.4 29315.0 29202.0 10663.4 28895.8 ## [334] 29203.0 16999.0 16656.8 17176.0 17077.0 26809.0 25334.4 17079.0 14509.0 ## [343] 11670.8 10611.4 9828.0 9692.0 10230.2 9799.0 11296.6 10850.4 11296.6 ## [352] 10817.4 9810.0 10712.4 10706.4 11163.6 10537.4 10712.4 11203.6 9351.8 ## [361] 10203.2 9757.0 9752.0 8687.6 9748.0 10177.2 9659.0 10688.4 10225.2 ## [370] 10261.2 10183.2 10549.4 9180.8 8709.6 9394.8 10755.4 10285.2 9167.8 ## [379] 10249.2 10252.2 29153.0 29438.0 10214.2 9819.0 29220.0 29194.0 10716.4 ## [388] 10209.2 29347.0 29235.0 10760.4 10645.4 29170.0 29155.0 11268.6 10310.2 ## [397] 29267.0 29292.0 11236.6 10761.4 29243.0 29263.0 10685.4 10620.4 29424.0 ## [406] 29242.0 9841.0 11179.6 29046.0 29290.0 10234.2 10163.2 29126.0 29169.0 ## [415] 29181.0 10601.4 11214.6 29173.0 29220.0 11319.6 10814.4 29154.0 29194.0 ## [424] 10746.4 10143.2 29194.0 29220.0 10654.4 9668.0 29100.0 29354.0 10181.2 ## [433] 10590.4 29066.0 29194.0 11178.6 11172.6 29212.0 29267.0 10519.4 10221.2 ## [442] 29131.0 29246.0 10676.4 9238.8 29122.0 29183.0 10119.2 10166.2 29351.0 ## [451] 29542.0 
10203.2 8762.6 29225.0 29207.0 10816.4 10270.2 29204.0 29140.0 ## [460] 11268.6 10779.4 29357.0 29175.0 10699.4 10645.4 29385.0 29323.0 10723.4 ## [469] 10721.4 29192.0 29220.0 29368.0 10700.4 10310.2 29268.0 29115.0 10709.4 ## [478] 9890.0 29366.0 29347.0 29417.0 29135.0 10841.4 11277.6 29354.0 29236.0 ## [487] 10758.4 10771.4 29126.0 29420.0 29140.0 29418.0 29158.0 29136.0 29304.0 ## [496] 29209.0 29431.0 29165.0 29165.0 29146.0 29354.0 29171.0 29237.0 29357.0 ## [505] 28769.8 28870.8 28747.8 9766.0 10850.4 9712.0 11219.6 29363.0 29322.0 ## [514] 19587.0 19296.0 29235.0 29208.0 29265.0 29236.0 29337.0 29123.0 29154.0 ## [523] 29383.0 11366.6 10665.4 9726.0 9721.0 11101.6 11652.8 11259.6 8851.6 ## [532] 28102.6 28342.6 10254.2 10732.4 29026.0 29307.0 29871.2 28317.6 28676.8 ## [541] 11141.6 11286.6 9233.8 28205.6 28721.8 28108.6 28709.8 28408.6 28684.8 ## [550] 19640.0 19390.0 19420.0 19440.0 19627.0 19481.0 19471.0 19502.0 19460.0 ## [559] 19431.0 13198.4
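The nested loop above looks up one mass per character with an if/else chain. As a rough sketch of the same idea, the four masses can instead be stored in a named vector and looked up by name. The vector `example_primers` and the names `base_mass` and `example_masses` are made up for illustration and are not part of the primer data set. This sketch also assumes that any character other than A, C, G, or T (such as a trailing space) should be ignored, so its totals may differ from the loop above for primers that contain such characters.

```r
# Hypothetical example data; the real primers are in primers$V1
example_primers <- c("ACGT", "GGCC ", "TTTAAA")

# Masses from the text above (g/mol), stored as a named vector
base_mass <- c(C = 467.2, A = 491.2, G = 507.2, T = 482.2)

example_masses <- numeric(length(example_primers))
for (i in seq_along(example_primers)) {
  # split the ith sequence into single characters
  bases <- strsplit(example_primers[i], "")[[1]]
  # assumption of this sketch: drop anything that is not A, C, G, or T
  bases <- bases[bases %in% names(base_mass)]
  # look up each base by name and add the masses together
  example_masses[i] <- sum(base_mass[bases])
}
example_masses
## [1] 1947.8 1948.8 2920.2
```

The inner for-loop and the if/else chain are replaced by a single lookup, but the outer loop still visits each primer in turn, so the control flow is the same as before.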
These exercises are to help you solidify and expand on the information given above and in the supplemental material.
Write a for loop that prints the first 3 nucleotides of each primer. Hint: you might want to use the substr() function.
Write a for loop that prints “primer is longer than 24 bases” if the primer is longer than 24 bases, “primer is exactly 24 bases” if the primer is 24 bases, and “primer is shorter than 24 bases” if the primer is shorter than 24 bases.
Write a for loop that prints “primer is between 20 and 24 bases” if the primer has 20 or more bases and 24 or fewer bases.
Write a script that calculates the melting temperature (\(T_m\)) of each primer. For primers that are 13 bases or fewer, the formula for \(T_m\) in Celsius is: \(T_m = (A + T)* 2 + (G + C)*4\), where A, T, G, and C represent the number of each base. For primers that are 14 bases or more, the formula for \(T_m\) in Celsius is: \(T_m = 64.9 + 41*(G + C - 16.4)/(A + T + G + C)\).
Write a while-loop that finds the first primer that starts with “C”. Ensure it returns the same index as the repeat statement above.
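As a possible starting point for the exercises, here is a small sketch of the building blocks on a single made-up sequence. The sequence `s` and the variable names are hypothetical. The loops over `primers` and the formula for longer primers are left to you.

```r
# Hypothetical sequence used only for illustration
s <- "GATTACAGATTA"

first3 <- substr(s, 1, 3)  # characters 1 through 3
n_chars <- nchar(s)        # number of characters in the sequence
first3
## [1] "GAT"
n_chars
## [1] 12

# count bases by splitting the sequence into single characters
bases <- strsplit(s, "")[[1]]
n_at <- sum(bases %in% c("A", "T"))  # number of A and T bases
n_gc <- sum(bases %in% c("G", "C"))  # number of G and C bases

# 13 bases or fewer: Tm = (A + T) * 2 + (G + C) * 4
tm_short <- n_at * 2 + n_gc * 4
tm_short
## [1] 30
```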