Jaya Prakash, *
Biomedical optical imaging is capable of providing functional information about soft biological tissues, with applications that include imaging large tissues, such as the breast and brain, in vivo. Biomedical optical imaging uses near-infrared light (600-900 nm) as the probing medium, giving the added advantage of being a non-ionizing imaging modality. The tomographic technologies for imaging large tissues encompass diffuse optical tomography and photoacoustic tomography.
Traditional image reconstruction methods in diffuse optical tomography employ ℓ2-norm based regularization, which is known to remove high-frequency noise in the reconstructed images and make them appear smooth. Hence a sparsity-based image reconstruction has been deployed for diffuse optical tomography; these sparse recovery methods utilize ℓp-norm based regularization in the estimation problem with 0≤...
Padhy, Venkat Prasad
Estimation of the Radar Cross Section of large inhomogeneous scattering objects such as composite aircraft, ships and biological bodies at high frequencies poses a large computational challenge. The detection of scattering from wake vortices, leading to detection and possible identification of low-observable aircraft, also demands the development of computationally efficient and rigorous numerical techniques. Amongst the various methods deployed in Computational Electromagnetics, the Method of Moments predicts the electromagnetic characteristics accurately. Method of Moments is a rigorous method, combined with an array of modeling techniques such as triangular patch, cubical cell and tetrahedral modeling. Method of Moments has become...
Shaw, Calvin B
Diffuse optical tomography uses near infrared (NIR) light as the probing medium to recover the distributions of tissue optical properties, with the ability to provide functional information about the tissue under investigation. As NIR light propagation in tissue is dominated by scattering, the image reconstruction problem (inverse problem) is non-linear and ill-posed, requiring advanced computational methods to compensate for this.
The diffuse optical image reconstruction problem is always rank-deficient, so finding the independent measurements among the available measurements becomes a challenging problem. Knowing these independent measurements will help in designing better data acquisition set-ups and lowering the costs associated with...
Our intention is to find similarity among the time-series expressions of genes in microarray experiments. It is hypothesized that at a given time point the concentration of one gene's mRNA is directly affected by the concentration of other genes' mRNA, and that this may have biological significance. We define the dissimilarity between two time-series data sets as the variance of the Euclidean distances at each time point. The large number of gene expressions makes the calculation of the variance of the distance at each point computationally expensive in terms of execution time. For this reason we use an autoregressive model which...
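As an illustration only, the dissimilarity measure defined above can be sketched as follows. This sketch assumes each gene's expression is a single scalar per time point, so the Euclidean distance at a time point reduces to an absolute difference, and it assumes the population variance is meant. The function name `dissimilarity` is hypothetical, and the autoregressive speed-up mentioned in the abstract is not shown.

```python
from statistics import pvariance


def dissimilarity(series_a, series_b):
    """Variance of the per-time-point distances between two expression series.

    Assumption: both series are sampled at the same time points and hold
    one scalar expression value per time point.
    """
    if len(series_a) != len(series_b):
        raise ValueError("series must cover the same time points")
    # For scalar samples the Euclidean distance is the absolute difference.
    distances = [abs(a - b) for a, b in zip(series_a, series_b)]
    # Population variance of the distances (an assumption; the abstract
    # does not say whether sample or population variance is used).
    return pvariance(distances)
```

Under this reading, two series that differ only by a constant offset have zero dissimilarity, because the distance between them is the same at every time point.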
High performance grid computing is a key enabler of large scale collaborative computational science. With the promise of exascale computing, high performance grid systems are expected to incur electricity bills that grow super-linearly over time. In order to achieve cost effectiveness in these systems, it is essential for the scheduling algorithms to exploit electricity price variations, both in space and time, that are prevalent in the dynamic electricity price markets. Typically, a job submission in the batch queues used in these systems incurs a variable queue waiting time before the resources necessary for its execution become available. In variably-priced electricity...
Kandel, Durga Datta
Deciphering the activity of chemical molecules against a pathogenic organism is an essential task in the drug discovery process. Virtual screening, in which a few plausible molecules are selected from a large set for further processing using computational methods, has become an integral part of this process and complements the expensive and time-consuming in vivo and in vitro experiments. To this end, it is essential to extract certain features from molecules which on the one hand are relevant to the biological activity under consideration, and on the other are suitable for designing fast and robust algorithms. The features/representations are derived either from physicochemical properties or...
Miniaturisation of electronic chips poses challenges at the design stage. The progressively decreasing circuit dimensions result in complex electrical behaviour that necessitates complex models.
Simulation of complex circuit models involves extraordinarily large computational complexity. Such complexity is better managed through Model Order Reduction. Model order reduction has been successful in achieving large reductions in system order for most types of circuits, at high levels of accuracy. However, multiport circuits with a large number of inputs/outputs pose an additional computational challenge. A strategy based on flexible clustering of interconnects results in more efficient reduction of multiport circuits. Clustering methods traditionally use Krylov-subspace methods such as PRIMA for the actual model...
Infrastructure-as-a-Service (IaaS), one of the service models of cloud computing, provides resources in the form of Virtual Machines (VMs). Many applications hosted on the IaaS cloud have time-varying workloads. These kinds of applications benefit from the on-demand provisioning characteristic of cloud platforms. Applications with time-varying workloads demand time-varying resources in IaaS, which requires elastic resource provisioning so that their performance remains intact. In current IaaS cloud systems, VMs are static in nature as their configurations do not change once they are instantiated. Therefore, fluctuation in resource demand is handled in two ways: allocating more VMs to...
Modern supercomputers now use accelerators to achieve their performance with the most widely used accelerator being the Graphics Processing Unit (GPU). However, achieving the performance potential of systems that combine a GPU and CPU is an arduous task which could be made easier with the assistance of the compiler or runtime. In particular, exploiting two features of GPU architectures -- distributed memory and concurrent kernel execution -- is critical to achieve good performance, but in current GPU programming systems, programmers must exploit them manually. This can lead to poor performance. In this thesis, we propose automatic techniques that: i) perform...
Raghavan, Hari K
Adaptive Mesh Refinement (AMR) is a method which dynamically varies the spatio-temporal resolution of localized mesh regions in numerical simulations, based on the strength of the solution features. Because only localized regions of interest are discretized at high resolution, into rectangular mesh units called patches, AMR provides a low cost of computation and a high degree of accuracy. General purpose graphics processing units (GPGPUs), with their support for fine-grained parallelism, offer an attractive option for obtaining high performance for AMR applications. The data parallel computations of the finite difference schemes of AMR can be efficiently performed on GPGPUs. This research deals with challenges...
Rajath Kumar, *
Production parallel systems are space-shared and employ batch queues in which the jobs submitted to the systems are made to wait before execution. Thus, jobs submitted to parallel batch systems incur queue waiting times in addition to the execution times. Prediction of these queue waiting times is important to provide overall estimates to the users and can also help meta-schedulers make scheduling decisions.
In the first part of our research, we have developed an integrated framework, PQStar, for identification and prediction of jobs with short queue waiting times. Analyses of the job traces of supercomputers reveal that about 56 to...
Coarse-Grained Reconfigurable Architectures (CGRAs) can be employed for accelerating computational workloads that demand both flexibility and performance. CGRAs comprise a set of computation elements interconnected using a network, and this interconnection of computation elements is referred to as a reconfigurable fabric. The size of the application that can be accommodated on the reconfigurable fabric is limited by the size of the instruction buffers associated with each computation element. When an application cannot be accommodated entirely, the application is partitioned such that each of these partitions can be executed on the reconfigurable fabric. These partitions are scheduled by an orchestrator. The orchestrator employs dynamic dataflow...
The rampant growth of the Internet has been coupled with an equivalent growth in cyber crime over the Internet. With our increased reliance on the Internet for commerce, social networking, information acquisition, and information exchange, intruders have found financial, political, and military motives for their actions. Network Intrusion Detection Systems (NIDSs) intercept the traffic at an organization’s periphery and try to detect intrusion attempts. Signature-based NIDSs compare the packet to a signature database consisting of known attacks and malicious packet fingerprints. The signatures use regular expressions to model these intrusion activities.
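To make the signature-matching step concrete, here is a minimal sketch of a signature-based check that tests a packet payload against regular expressions one at a time. The two signatures and the names `SIGNATURES` and `match_signatures` are hypothetical illustrations, not entries from a real signature database, and this naive loop is not the memory-efficient matching system the thesis describes.

```python
import re

# Hypothetical signature database: name -> compiled regular expression.
# Patterns operate on raw bytes because packet payloads are byte strings.
SIGNATURES = {
    "sql-injection-union": re.compile(rb"union\s+select", re.IGNORECASE),
    "path-traversal": re.compile(rb"\.\./\.\./"),
}


def match_signatures(payload):
    """Return the names of all signatures found anywhere in the payload."""
    return [
        name
        for name, pattern in SIGNATURES.items()
        if pattern.search(payload)
    ]
```

Scanning the payload once per signature is easy to follow, but its cost grows with the size of the signature database, which is why practical systems tend to compile many signatures into a shared automaton.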
This thesis presents a memory efficient pattern matching system...
MATLAB is an array language, initially popular for rapid prototyping, but is now being increasingly used to develop production code for numerical and scientific applications. Typical MATLAB programs have abundant data parallelism. These programs also have control flow dominated scalar regions that have an impact on the program’s execution time. Today’s computer systems have tremendous computing power in the form of traditional CPU cores and also throughput-oriented accelerators such as graphics processing units (GPUs). Thus, an approach that maps the control flow dominated regions of a MATLAB program to the CPU and the data parallel regions to the GPU can...
A Coarse-Grained Reconfigurable Architecture (CGRA) is a processing platform which consists of an interconnection of coarse-grained computation units (viz. Function Units (FUs), Arithmetic Logic Units (ALUs)). These units communicate directly, via send-receive like primitives, as opposed to the shared memory based communication used in multi-core processors. CGRAs are a well-researched topic and the design space of a CGRA is quite large. The design space can be represented as a 7-tuple (C, N, T, P, O, M, H) where each of the terms has the following meaning: C - choice of computation unit, N - choice of interconnection network, T - choice of number of...
Sundari, Sivagama M
Climate science or climatology is the scientific study of the earth’s climate, where climate is the term representing weather conditions averaged over a period of time. Climate models are mathematical models used to quantitatively describe, simulate and study the interactions among the components of the climate system: atmosphere, ocean, land and sea-ice. CCSM (Community Climate System Model) is a state-of-the-art climate model, and a long-running coupled multicomponent parallel application involving component models for simulating the components of the climate system. Each of the component models is a large-scale parallel application, and the parallel components exchange climate data through a specialized...
Relentless CMOS scaling coupled with lower design tolerances is making ICs increasingly susceptible to transient faults, wear-out related permanent faults and process variations. Decreasing CMOS reliability implies that high-availability systems, which were previously restricted to the domain of mainframe computers or specially designed fault-tolerant systems, may become important for the commodity market as well. In this thesis we tackle the problem of enabling efficient, low-cost and configurable fault tolerance using Chip Multiprocessors (CMPs).
Our work studies architectural fault detection methods based on redundant execution, specifically focusing on “leader-follower” architectures. In such architectures redundant execution is performed on two cores/threads of...
Governments, militaries, corporations, financial institutions and others exchange a great deal of confidential information over the Internet these days. Protecting such confidential information and ensuring its integrity and origin authenticity are of paramount importance. There exist protocols and solutions at different layers of the TCP/IP protocol stack to address these security requirements. Application level encryption, viz. PGP for secure mail transfer, TLS based secure TCP communication, and IPSec for providing IP layer security are among these security solutions. Due to scalability, wide acceptance of the IP protocol, and its application independent character, the IPSec protocol has become a standard for providing Internet...
Bhavsar, Rajul D
Over the last decade, biological sequence repositories have been growing at an exponential rate. Sophisticated indexing techniques are required to facilitate efficient searching through these humongous genetic repositories. A particularly attractive index structure for such sequence processing is the classical suffix-tree, a vertically compressed trie structure built over the set of all suffixes of a sequence. Its attractiveness stems from its linearity properties -- suffix-tree construction times are linear in the size of the indexed sequences, while search times are linear in the size of the query strings.
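The search property described above can be illustrated with a deliberately simplified structure. The sketch below builds an uncompressed suffix trie using nested dictionaries, which takes quadratic time and space; it is not the linear-time, vertically compressed suffix-tree of the abstract. It does show why lookup cost depends only on the length of the query string. The function names `build_suffix_trie` and `contains` are hypothetical.

```python
def build_suffix_trie(text):
    """Insert every suffix of text into a trie of nested dicts.

    Naive O(n^2) construction, for illustration only; a real suffix-tree
    compresses single-child chains and can be built in linear time.
    """
    root = {}
    for start in range(len(text)):
        node = root
        for ch in text[start:]:
            # Follow the edge for ch, creating the child node if needed.
            node = node.setdefault(ch, {})
    return root


def contains(trie, query):
    """Check whether query occurs in the indexed text.

    Walks one edge per query character, so the cost is linear in
    len(query) regardless of how long the indexed text is.
    """
    node = trie
    for ch in query:
        if ch not in node:
            return False
        node = node[ch]
    return True
```

For example, after `trie = build_suffix_trie("GATTACA")`, the call `contains(trie, "TTAC")` walks four edges and succeeds, while `contains(trie, "TAG")` fails at its third character.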
In practice, however, the promise of suffix-trees is not realized for extremely long...