Recursos de colección
Caltech Authors (165.044 recursos)
Repository of works by Caltech published authors.
Group = Parallel and Distributed Systems Group
Repository of works by Caltech published authors.
Group = Parallel and Distributed Systems Group
Zuck, Aviad; Li, Yue; Bruck, Jehoshua; Porter, Donald E.; Tsafrir, Dan
Encryption is a useful tool to protect data confidentiality. Yet it is still challenging to hide the very presence of encrypted, secret data from a powerful adversary. This paper presents a new technique to hide data in flash by manipulating the voltage level of pseudo-randomlyselected flash cells to encode two bits (rather than one) in the cell. In this model, we have one “public” bit interpreted using an SLC-style encoding, and extract a private bit using an MLC-style encoding. The locations of cells that encode hidden data is based on a secret key known only to the hiding user.
Intuitively, this...
Jain, Siddharth; Raviv, Netanel; Bruck, Jehoshua
Erwin Chargaff in 1950 made an experimental observation that the count of A is equal to the count of T and the count of C is equal to the count of G in DNA. This observation played a crucial rule in the discovery of the double stranded helix structure by Watson and Crick. However, this symmetry was also observed in single stranded DNA. This phenomenon was termed as 2nd Chargaff Rule. This symmetry has been verified experimentally in genomes of several different species not only for mononucleotides but also for reverse complement pairs of larger lengths up to a small...
Huang, Wentao; Bruck, Jehoshua
This paper studies the problem of repairing secret sharing schemes, i.e., schemes that encode a message into n shares, assigned to n nodes, so that any n − r nodes can decode the message but any colluding z nodes cannot infer any information about the message. In the event of node failures so that shares held by the failed nodes are lost, the system needs to be repaired by
reconstructing and reassigning the lost shares to the failed (or replacement) nodes. This can be achieved trivially by a trustworthy third-party that receives the shares of the available nodes, recompute and reassign...
Jain, Siddharth; Farnoud (Hassanzadeh), Farzad; Schwartz, Moshe; Bruck, Jehoshua
Duplication mutations play a critical role in the
generation of biological sequences. Simultaneously, they
have a deleterious effect on data stored using in-vivo DNA data storage. While duplications have been studied both as a sequence-generation mechanism and in the context of error correction, for simplicity these studies have not taken into account the presence of other types of mutations. In this work, we consider the capacity of duplication mutations in the presence of point-mutation
noise, and so quantify the generation power of these mutations. We show that if the number of point mutations is vanishingly small compared to the number of duplication mutations...
Jain, Siddharth; Farnoud (Hassanzadeh), Farzad; Schwartz, Moshe; Bruck, Jehoshua
Duplication mutations play a critical role in the
generation of biological sequences. Simultaneously, they
have a deleterious effect on data stored using in-vivo DNA data storage. While duplications have been studied both as a sequence-generation mechanism and in the context of error correction, for simplicity these studies have not taken into account the presence of other types of mutations. In this work, we consider the capacity of duplication mutations in the presence of point-mutation
noise, and so quantify the generation power of these mutations. We show that if the number of point mutations is vanishingly small compared to the number of duplication mutations...
Alon, Noga; Bruck, Jehoshua; Farnoud, Farzad; Jain, Siddharth
We study the tandem duplication distance between binary sequences and their roots. In other words, the quantity of interest is the number of tandem duplication operations of the form x = abc → y = abbc, where
x and y are sequences and a, b, and c are their substrings, needed to generate a binary sequence of length
n starting from a square-free sequence from the set
{0, 1, 01, 10, 010, 101}. This problem is a restricted case of finding the duplication/deduplication
distance between two sequences, defined as the minimum number of duplication and deduplication
operations required to transform one sequence to the other....
Jain, Siddharth; Farnoud, Farzad; Schwartz, Moshe; Bruck, Jehoshua
The ability to store data in the DNA of a living
organism has applications in a variety of areas including synthetic biology and watermarking of patented genetically-modified organisms. Data stored in this medium is subject to errors arising from various mutations, such as point mutations, indels, and tandem duplication, which need to be corrected to maintain data integrity. In this paper, we provide error-correcting codes for errors caused by tandem duplications, which create a copy of a block of the sequence and insert it in a tandem manner, i.e., next
to the original. In particular, we present two families of codes for...
Jain, Siddharth; Farnoud, Farzad; Bruck, Jehoshua
The majority of the human genome consists of repeated sequences. An important type of repeats common in the human genome are tandem repeats, where identical copies appear next to each other. For example, in the sequence AGTCTGTGC, TGTG is a tandem repeat, namely, it was generated from AGTCTGC by tandem duplication of length 2. In this work, we investigate the possibility of generating a large number of sequences from a small initial string (called
the seed) by tandem duplication of length bounded by a constant. Our results include exact capacity values for certain tandem duplication string systems with alphabet sizes 2;...