![]() |
|||||
![]() |
|||||
![]() |
|||||
Strona startowa Frederik Pohl - Dorastanie w Mieście na Skraju.txt, FANTASTYKA NAUKOWA - SUBIEKTYWNY WYBÓR ANDRZEJA KRZEMIŃSKIEGO Fizyka sukcesu. Naukowe metody osiągania osobistego i finansowego szczęścia, OnePress Factors affecting, Neurologia i Neurochirurgia Polska od 2012 Flaszyńska - Po wiedze i prace do UE (ZDZ), Praca Farmacja i ja 2011.05, Farmacja i ja Fotografia dla ciekawych - Przemysław Oziemblewski, Fotografia Fiat 126p - Maly Wielki Samochod(1), Kupno samochodu, Sam naprawaw samochód Fourier Transforms In Spectroscopy - Kauppinen J, Angielskie techniczne Formaldehyd - IARC Monographs on the Evaluation, PDFy Final Destination 4 - Oszukać przeznaczenie 4, aaa |
Factoring Integers, Prace naukowe[ Pobierz całość w formacie PDF ]IEEE TRANSACTIONS ON NANOBIOSCIENCE, VOL. 4, NO. 2, JUNE 2005 149 Fast Parallel Molecular Algorithms for DNA-Based Computation: Factoring Integers Weng-Long Chang, Minyi Guo* , Member, IEEE , and Michael Shan-Hui Ho Through advances in molecular biology [1], it is now pos- sible to produce roughly 10 DNA strands that t in a test tube. Those 10 DNA strands can also be applied to represent 10 bits of information. In the future (perhaps after many years) if bi- ological operations can be applied to deal with a tube with 10 DNA strands and they are run without errors, then 10 bits of information can simultaneously be correctly processed. Hence, in the future, it is possible that biological computing can provide a huge amount of parallelism for dealing with many computa- tionally intensive problems in the real world. The fastest super computers currently available can execute approximately 10 integer operations per second. This implies that 128 10 bits of information can be simultaneously processed in a second. The fastest super computers can process 128 10 bits of information in 1000 seconds. The extract operation is one of basic biological operations of the longest execution time. An extract operation could be approximately done in 1000 s [12]. In the future, if an extract operation can be used to deal with a tube with 10 DNA strands and it is run without errors, then 10 bits of information can simultaneously be correctly processed in 1000 s. If it becomes true in the future, then basic biological operations will perhaps be faster than the fastest super computer in the future. In [12], it was pointed out that storing information in molecules of DNA allows for an in- formation density of approximately 1 bit/nm . Videotape is a kind of traditional storage media and its information density is approximately 1 bit/10 nm . This implies that an information density in molecules of DNA is better than that of traditional storage media. In this paper, we rst construct solution spaces of DNA strands for encoding every integer of bits. By using basic biological operations, we then develop DNA-based algorithms for a parallel subtractor, a parallel comparator, and a parallel divider, respectively, to factor the product of two large prime numbers of bits. We also show that cryptosystems based on the dramatic difference between the ease of nding large prime numbers of bits and the difculty of factoring the product of two large prime numbers of bits can be broken. Furthermore, this work presents clear evidence of molecular computing ability to nish parallel mathematical operations. The rest of this paper is organized as follows. Section II rst introduces DNA models of computation proposed by Adleman et al. and compares them with other models. Section III intro- duces the DNA program to factor the product of two large prime numbers of bits for solution spaces of DNA strands. Discus- sion and conclusion are drawn in Section IV and Section V, respectively. Abstract— The RSA public-key cryptosystem is an algorithm that converts input data to an unrecognizable encryption and converts the unrecognizable data back into its original decryption form. The security of the RSA public-key cryptosystem is based on the difculty of factoring the product of two large prime numbers. This paper demonstrates to factor the product of two large prime numbers, and is a breakthrough in basic biological operations using a molecular computer. In order to achieve this, we propose three DNA-based algorithms for parallel subtractor, parallel comparator, and parallel modular arithmetic that formally verify our designed molecular solutions for factoring the product of two large prime numbers. Furthermore, this work indicates that the cryptosystems using public-key are perhaps insecure and also presents clear evidence of the ability of molecular computing to perform complicated mathematical operations. Index Terms— Biological parallel computing, DNA-based algo- rithms, DNA-based computing, factoring integers, RSA public-key cryptosystem. I. I NTRODUCTION T HE RSA public-key cryptosystem [34] is an algorithm that converts input data to an unrecognizable encryption, and converts the unrecognizable data back into its original decryp- tion form. The construction of the RSA public-key cryptosystem is based on the ease of nding large prime numbers. The security for the cryptosystem using public-key is based on the difculty of factoring the product of two large prime numbers. The RSA public-key cryptosystem is the most popular cryptosystem. No method in a reasonable amount of time can be applied to break the RSA public-key cryptosystem. Feynman proposed molecular computation in 1961, but his idea was not implemented by experiment for a few decades [37]. In 1994 Adleman [2] succeeded in solving an instance of the Hamiltonian path problem in a test tube, just by handling DNA strands. Lipton [3] demonstrated that the Adleman tech- niques could be used to solve the satisability problem (the rst NP-complete problem). Adleman et al. [14] proposed sticker for enhancing the error rate of hybridization. Manuscript received April 7, 2004; revised November 25, 2004. This work was supported in part by the Republic of China National Science Foundation under Grant NSC-92-2213-E-218-026 and Japan JSPS Grant-in-Aid for Sci- entic Research under Grant (C) 14580386. Asterisk indicates corresponding author. W.-L. Chang and M. S.-H. Ho are with the Department of Information Man- agement, Southern Taiwan University of Technology, Tainan, Taiwan, R.O.C. (E-mail: changwl@mail.stut.edu.tw; MHoInCerritos@yahoo.com). *M. Guo is with the School of Computer Science and Engineering, University of Aizu, Aizu-Wakamatsu City 965-8580, Japan (E-mail: minyi@u-aizu.ac.jp). Digital Object Identier 10.1109/TNB.2005.850474 1536-1241/$20.00 © 2005 IEEE 150 IEEE TRANSACTIONS ON NANOBIOSCIENCE, VOL. 4, NO. 2, JUNE 2005 also 3 end matches 5 end. The length of a single-stranded DNA is the number of nucleotides composing the single strand. Thus, if a single stranded DNA includes 20 nucleotides, then we say that it is a 20 mer (i.e., it is a polymer containing 20 monomers). The length of a double-stranded DNA (where each nucleotide is base paired) is counted in the number of base pairs. Thus, if we make a double-stranded DNA from a single stranded 20 mer, then the length of the double stranded DNA is 20 base pairs, also written 20 bp. Hybridization is a special technology term for the pairing of two single DNA strands to make a double helix and also takes advantages of the specicity of DNA base pairing for the detection of specic DNA strands. (For more discussions of the relevant biological background, refer to [1] and [16]). Fig. 1. A schematic representation of a nucleotide. II. B ACKGROUND In this section we review the basic structure of the DNA mole- cule and then discuss available techniques for dealing with DNA that will be used to solve the problem of factoring integers. Si- multaneously, several well-known DNA models are compared. A. The Structure of DNA From [1], [16], DNA ( DeoxyriboNucleic Acid ) is the mole- cule that plays the main role in DNA-based computing. In the biochemical world of large and small molecules, polymers, and monomers , DNA is a polymer, which is strung together from monomers called deoxyriboNucleotides . The monomers used for the construction of DNA are deoxyribonucleotides. Each de- oxyribonucleotide contains three components: a sugar ,a phos- phate group, and a nitrogenous base. The sugar has ve carbon atoms—for the sake of reference there is a xed numbering of them. Because the base also has carbons, to avoid confusion the carbons of the sugar are numbered from 1 to 5 (rather than from one to ve). The phosphate group is attached to the 5 carbon, and the base is attached to the 1 carbon. Within the sugar struc- ture there is a hydroxyl group attached to the 3 carbon. Distinct nucleotides are detected only with their bases, which come in two sorts: purines and pyrimidines . Purines include adenine and guanine , abbreviated and . Pyrimidines contain cytosine and thymine , abbreviated and . Because nucleotides are distinguished solely from their bases, they are simply represented as , , ,or nucleotides, depending upon the kinds of base that they have. The structure of a nucleotide, cited from [16], is illustrated (in a very simplied way) in Fig. 1. In Fig. 1, B is one of the four possible bases ( , , ,or ), P is the phosphate group, and the rest (the “stick”) is the sugar base (with its carbons enumerated 1 through 5 ). Nucleotides can be linked together in two different ways [1], [16]. The rst method is that the 5 -phosphate group of one nu- cleotide is joined with 3 -hydroxyl group of the other forming a phosphodiester bond. The resulting molecule has the 5 -phos- phate group of one nucleotide, denoted as 5 end, and the 3 -OH group of the other nucleotide available, denoted as 3 end, for bonding. This gives the molecule the directionality , and we can talk about the direction of 5 end to 3 end or 3 end to 5 end. The second way is that the base of one nucleotide interacts with the base of the other to form a hydrogen bond. This bonding is the subject of the following restriction on the base pairing: and can pair together, and and can pair together—no other pairings are possible. This pairing principle is called the Watson–Crick complementarity (named after J. D. Watson and F. H. C. Crick, who deduced the famous double helix structure of DNA in 1953 and won the Nobel Prize for the discovery). A DNA strand is essentially a sequence (polymer) of four types of nucleotides detected by one of four bases they contain. Two strands of DNA can form (under appropriate conditions) a double strand, if the respective bases are the Watson–Crick complements of each other— B. Adleman’s Experiment for Solving the Hamiltonian Path Problem Assume a directed graph , where and are the set of vertices and the set of edges respectively. In general, the Hamiltonian path problem consists of deciding whether has a Hamiltonian path or not. with designed vertices and is said to have a Hamiltonian path if and only if there exists a sequence of compatible “one way” edges (that is, a “path”), which begins at , ends at , and enters every other vertex exactly once [2]. Adleman’s experiment is used to solve the Hamil- tonian path problem for a directed , where and [2]. The rst step of Adleman’s experiment is to generate random paths through the directed graph . To generate random paths, each vertex in for was associated with a random 20-mer sequence of DNA denoted . For each edge in ,an oligonucleotide was created which was the 3 10 mer of (unless , in which case it was all of ) followed by the 5 mer of (unless , in which case it was all of ). The 20-mer sequence Watson–Crick complementary to was denoted . For each vertex in (except and ) and for each edge in , large quantities of oligonucleotides and were mixed together in a single ligation reaction. Here the oligonucleotides served as splints to bring oligonucleotides associated with compatible edges together for ligation. Consequently, the ligation reaction resulted in the formation of DNA molecules that can be viewed as encoding random paths through the directed graph . From the random paths generated, basic biological operations are applied to remove illegal paths and select a Hamiltonian path [2]. C. The Sticker-Based Model The sticker-based model employs two basic groups of single-stranded DNA molecules in its representation of a bit string [14]. Consider a memory strand bases in length subdivided into nonoverlapping regions each bases long (thus, ). Each region is identied with exactly one bit position (or equivalently one Boolean variable) during the course of the computation. Adleman et al. [14] also designed different sticker strands or simply stickers . Each sticker is bases long and is complementary to one and only one of matches and matches ; CHANG et al. : FAST PARALLEL MOLECULAR ALGORITHMS FOR DNA-BASED COMPUTATION: FACTORING INTEGERS 151 . Thus synthesis began by assembling the two 15 base oligonucleotides with sequences and . This process was repeated until all 6 variables had been treated. The probes used for separating the library strands have se- quences complementary to the value sequences. Errors in the separation of the library strands are errors in the computation. Sequences must be designed to ensure that library strands have little secondary structure that might inhibit intended probe-li- brary hybridization. The design must also exclude sequences that might encourage unintended probe-library hybridization. To help achieve these goals, sequences were computer-gener- ated to satisfy the proposed seven constraints [22]. The similar method also is applied to solve a 20-variable of 3-SAT [46]. Fig. 2. An example of a sticker memory. E. DNA Manipulations Fig. 3. Examples of memory complexes. In the past decade, there have been revolutionary advances in the eld of biomedical engineering, particularly in recombinant DNA and RNA manipulating. Due to the industrialization of the biotechnology eld, laboratory techniques for recombinant DNA and RNA manipulation are becoming highly standard- ized. Basic principles about recombinant DNA can be found in [47]–[50]. In this subsection we describe eight biological oper- ations that are useful for solving the problem of factoring inte- gers. The method of constructing DNA solution space for the problem of factoring integers is based on the proposed method in [22], [46]. A (test) tube is a set of molecules of DNA (a multiset of nite strings over the alphabet the memory regions. If a sticker is annealed to its matching region on a given memory strand, then the bit corresponding that particular region is on for that strand. If no sticker is annealed to a region, then that region’s bit is off. Each memory strand along with its annealed stickers (if any) represents one bit string. Such partial duplexes are called memory complexes . A large set of bit strings is represented by a large number of identical memory strands each of which has stickers annealed only at the required bit positions. Such a collection of memory complexes is called as a tube. In this model, a unique association of memory strands and stickers represents each possible bit string. In the illustration given in Fig. 2, we consider a memory strand of length , divided into regions, each of length . Thus, in this case the necessary complexes are interpreted as containing four bits of information. In particular, consider the memory com- plexes depicted in Fig. 3. In the rst memory complex, all re- gions are off, whereas in the last complex the last two regions are on. The binary numbers represented by these four memory complexes are 0000, 0100, 1001, and 0011, respectively. ). Given a tube, one can perform the following operations. 1. Extract . Given a tube and a short single strand of DNA, , the operation produces two tubes and , where is all of the molecules of DNA in which contain as a substrand and is all of the molecules of DNA in which do not contain . 2. Merge . Given tubes and , yield , where . This operation is to pour two tubes into one, without any change in the individual strands. D. Adleman’s Experiment for Solution of a Satisfability Problem Adleman et al. [22], [46] performed experiments that were applied to, respectively, solve a six-variable 11-clause for- mula and a 20-variable 24-clause three-conjunctive normal form (3-CNF) formula. A Lipton encoding [3] was used to represent all possible variable assignments for the chosen six-variable or 20-variable SAT problem. For each of the six variables , two distinct 15 base value sequences were designed. One represents true , , and another represents false , for . Each of the truth assignments was represented by a library sequence of 90 bases consisting of the concatenation of one value sequence for each variable. DNA molecules with library sequences are termed library strands and a combinatorial pool containing library strands is termed a library . The six-variable library strands were synthesized by employing a mix-and-split combinatorial synthesis technique [24]. The library strands were assigned library sequences with 3. Detect . Given a tube ,if includes at least one DNA molecule, we have “yes,” and if contains no DNA mol- ecule, we have “no.” 4. Discard . Given a tube , the operation will discard . 5. Amplify . Given a tube , the operation Amplify , will produce two new tubes and so that and are totally a copy of ( and are now identical) and becomes an empty tube. 6. Append . Given a tube containing a short strand of DNA , the operation will append onto the end of every strand in . 7. Append-head . Given a tube containing a short strand of DNA, , the operation will append onto the head of every strand in . 8. Read . Given a tube , the operation is used to describe a single molecule, which is contained in tube .Even if contains many different molecules each encoding a different set of bases, the operation can give an explicit description of exactly one of them. at the 5 -end and at the 3 -end 152 IEEE TRANSACTIONS ON NANOBIOSCIENCE, VOL. 4, NO. 2, JUNE 2005 F. Comparisons of Various Famous DNA Models eral H systems, and by Hubert and Schuler [45]. Barua et al. [31] proposed a recursive DNA algorithm for adding two binary numbers, which require biosteps using only dif- ferent type of DNA strands, where is the size of the binary string representing the larger of the two numbers. Adleman et al. [14] proposed a sticker-based model to enhance the error rate of hybridization in the Adleman–Lipton model. Their model can be used for determining solutions of an instance in the set cover problem. Simultaneously, Adleman et al. [52] also pointed out that the data encryption standard could be easily broken from solution space of stickers in the sticker-based model. Perez-Jimenez et al. [15] employed the sticker-based model [14] to resolve knapsack problems. In our previous work, Chang et al. [25], [32], [36], [53] also employed the sticker-based model and the Adleman–Lipton model for dealing with Cook’s theorem [9], [10], the set-split- ting problem, the subset-sum problem, and the dominating-set problem for decreasing the error rate of hybridization. Based on solution space of splint in the Adleman–Lipton model, their methods [7], [17]–[20], [35] could be applied toward solving the traveling salesman problem, the dominating-set problem, the vertex cover problem, the clique problem, the inde- pendent-set problem, the three-dimensional matching problem, the set-packing problem, the set-cover problem, and the problem of exact cover by three-sets. Lipton et al. [51] indicated that DNA-based computing had been shown to easily be capable of breaking the data encryption standard from solution space of splint . The methods used for solving problems have exponentially increased volumes of DNA and linearly increased the time. Bach et al. [33] proposed a volume, time molecular algorithm for the three-coloring problem and a volume, time molecular algorithm for the independent set problem, where and are, subsequently, the number of ver- ticesandthenumberofedgesintheproblemsresolved.Fu[21]pre- sented a polynomial-time algorithm with a volume for the three-SAT problem, a polynomial-time algorithm with a volume for the three-coloring problem, and a polynomial-time al- gorithm with a volume for the independent set. Though the size of those volumes [21], [33] is lower, constructing those vol- umes is more difcult and the time complexity is higher. Quyang et al. [4] showed that enzymes could be used to solve the NP-complete clique problem. Because the maximum number of vertices that they can process is limited to 27, the maximum number of DNA strands for solving this problem is 2 [4]. Shin et al. [8] presented an encoding scheme for decreasing the error rate of hybridization. This method [8] can be employed toward ascertaining the traveling salesman problem for representing integers and real values with xed-length codes. Arita et al. [5] and Morimoto et al. [6] proposed a new molecular experimental technique and a solid-phase method to nd a Hamiltonian path. Amos [13] proposed a parallel ltering model for resolving the Hamiltonian path problem, the subgraph isomorphism problem, the three-vertex-colorability problem, the clique problem, and the independent-set problem. The methods in [5], [6], and [13] have lowered the error rate in real molecular experiments. In [26], [27], and [30], the methods for DNA-based computing by self-assembly require the use of DNA nanostructures, called tiles, to own expressive computational power and convenient input and output (I/O) mechanisms. That is, DNA tiles have lower error rate in self-assembly. One of the earliest attempts to perform arithmetic operations (addition of two positive binary numbers) using DNA is by Guarneiri et al. [38], utilizing the idea of encoding differently bit values zero and one as single-stranded DNAs, based upon their positions and the operands in which they appear. Gupta et al. [39] performed logic and arithmetic operations using the xed bit encoding of the full corresponding truth tables. Qiu and Lu [40] applied substitution operation to insert results (by en- coding all possible outputs of bit by bit operation along with second operand) in the operand strands. Ogihara and Ray [41], as well as Amos and Dunne [42] proposed methods to realize any Boolean circuit (with bounded fan in) using DNA strands in a constructive fashion. Other new suggestions to perform all basic arithmetic operations are by Atanasiu [43] using P sys- tems and by Frisco [44] using splicing operation under gen- III. F ACTORING THE P RODUCT OF T WO L ARGE P RIME N UMBERS OF B ITS A. RSA Public-Key Cryptosystem In the RSA cryptosystem [34], a participant creates his public and secret keys with the following steps. The rst step is to select at random two large prime numbers and , assuming that the length of and are both bits. The second step is to compute by the equation . The third step is to select a small odd integer that is relatively prime to , which is equal to . The fourth step is to compute as the multiplicative inverse of , module . The fth step is to publish the pair as his RSA public key. The sixth step is to keep secret the pair as his secret key. A method to factor as in a reasonable amount of time has not been found. B. Solution Space of DNA Strands for Every Unsigned Integer of Bits Suppose that an unsigned integer of bits is represented as a -bit binary number, , where the value of each bit is either one or zero for . The bits and represent, respectively, the most signicant bit and the least signicant bit for . The range of the value to an unsigned integer of bits is from 0 to . From [22], [46], for every bit ,two distinct 15 base value sequences are designed. One represents the value zero for and the other represents the value one for . For convenience, we assume that denotes the value of to be one and denes the value of to be zero. The following algorithm is used to construct the solution space of DNA strands for different unsigned integer values. Procedure InitialSolution (T ) (1) For j = k down to 1 (1a) Amplify( T , T , T ). (1b) Append( T , m ). (1c) Append( T , m ). (1d) T = [(T ;T ) . EndFor EndProcedure CHANG et al. : FAST PARALLEL MOLECULAR ALGORITHMS FOR DNA-BASED COMPUTATION: FACTORING INTEGERS 153 TABLE I R ESULT FOR T UBE T I S G ENERATED BY THE A LGORITHM I NITIAL S OLUTION (T ) TABLE II R ESULT FOR T UBE T I S G ENERATED BY THE A LGORITHM I NITIAL P RODUCT (T ) Consider that the number of bits for is 3 bits. Eight values for are, respectively, 000, 001, 010, 011,100, 101 110, and 111. Tube is an empty tube and is regarded as an input tube for the algorithm InitialSolution . Because the value for is three, Steps (1a) through (1d) will be run three times. After the rst execution of Step (1a) is nished, tube , tube , and tube . Next, after the rst execution for Step (1b) and Step (1c) is performed, tube and tube . After the rst execution of Step (1d) is run, tube , tube , and tube . Then, after the second execution of Step (1a) is nished, tube , tube , and tube . After the rest of operations are performed, tube , tube , and the result for tube is shown in Table I. Lemma 1 is applied to demonstrate correction of the algorithm InitialSolution unsigned integer of bits denoted in Section III-B, is one of two large prime numbers if the remainder is equal to zero. Assume that in a divider the length of a dividend is bits and the length of a divisor is bits, where .Itis very obvious that the division instruction is nished through successive compare, shift, and subtract operations of at most times. Therefore, suppose that is represented as a -bit binary number, , where the value of each bit is either one or zero for and . The bits and , respectively, rep- resent the most signicant bit and the least signicant bit for . One binary number and another binary number are, respectively, applied to represent the minuend and the difference for the successive compare, shift, and subtract operations of the th time. This is to say that the binary number is the minuend for the suc- cessive compare, shift, and subtract operations of the . Lemma 1: The algorithm InitialSolution is used to con- struct the solution space of DNA strands for different un- signed integer values. Proof: The algorithm InitialSolution is implemented by means of the amplify, append , and merge operations. Each execution of Step (1a) is used to amplify tube and to generate two new tubes, and , which are copies of . Tube then becomes empty. Then, Step (1b) is applied to append a DNA sequence, representing the value one for , onto the end of every strand in tube . This is to say that those integers containing the value one to the th bit appear in tube . Step (1c) is also employed to append a DNA sequence, representing the value zero for , onto the end of every strand in tube . That implies that these integers containing the value zero to the th bit appear in tube . Next, Step (1d) is used to pour tubes and into tube . This indicates that DNA strands in tube include DNA sequences of and . At the end of Step (1), tube consists of DNA sequences representing different unsigned integer values. From InitialSolution , it takes amplify operations, append operations, merge operations, and three test tubes to construct the solution space of DNA strands. A value sequence for every bit contains 15 bases. Therefore, the length of a DNA strand, encoding an unsigned integer value of bits, is bases consisting of the concatenation of one value sequence for each bit. th time. For every bit ,two distinct 15 base value sequences were designed. One represents the value zero for and the other represents the value one for . For convenience, we assume that denotes the value of to be one and denes the value of to be zero. The following algorithm is used to construct a DNA strand for the value of . Procedure InitialProduct (T ) (1) For q =1 to 2 k (1a) Append-head (T ;n ;q) . EndFor EndProcedure Consider that the number of bits for is 6 bits and the value for is 001 111. Tube with the result shown in Table I is regarded as an input tube for the algorithm, InitialProduct . Because the value for is six, Step (1a) will be executed six times. After each operation for Step (1a) is performed, the result is shown in Table II. Lemma 2 is used to prove correction of the algorithm InitialProduct . C. The Construction to the Product of Two Large Prime Numbers of Bits Assume that the length for , the product of two large prime numbers of bits, denoted in Section III-A, is bits. Also suppose that the product is used to represent the minuend (dividend) and the difference for successive compare, shift, and subtract operations in a divider. When Lemma 2: A DNA strand for the value of can be con- structed from InitialProduct . Proof: Refer to Lemma 1. From InitialProduct , it takes append-head oper- ations and one test tube to construct a DNA strand. The length of the DNA strand, encoding the value of ,is bases con- sisting of the concatenation of one value sequence for each bit. is divided by ,an [ Pobierz całość w formacie PDF ] |
||||
![]() |
|||||
Wszelkie Prawa Zastrzeżone! Jedyną nadzieją jest... nadzieja. Design by SZABLONY.maniak.pl. |
![]() |
||||
![]() |
![]() |
![]() |
![]() |
![]() |
![]() |