Obtaining an accurate whole program path (WPP) that captures a program’s runtime behaviour in terms of a control-flow trace has a number of well-known benefits, including opportunities for code optimization, bug detection, program analysis refinement, etc. Existing techniques to compute WPPs perform sub-optimal instrumentation resulting in significant space and time overheads. Our goal in this thesis is to minimize these overheads without losing precision.
To do so, we design a novel and scalable whole program analysis to determine instrumentation points used to obtain WPPs. Our approach is divided into three components: (a) an efficient summarization technique for inter-procedural path reconstruction,...
Vasista, Vinay V
Geometric Multigrid (GMG) methods are widely used in numerical analysis to accelerate the convergence of partial differential equation (PDE) solvers using a hierarchy of grid discretizations. These solvers find many applications across engineering and scientific domains, where solving PDEs is of fundamental importance. Using multigrid methods, the pace at which the solvers arrive at the solution can be improved at an algorithmic level. With advances in modern computer architecture, solving problems of higher complexity and size is feasible; this is also the case with multigrid methods. However, since hardware support alone cannot achieve high performance...
Software components expose Application Programming Interfaces (APIs) as a means to access their functionality, and facilitate reuse. Developers use APIs supplied by programming languages to access the core data structures and algorithms that are part of the language framework. They use the APIs of third-party libraries for specialized tasks. Thus, APIs play a central role in mediating a developer's interaction with software, and the interaction between different software components. However, APIs are often large, complex and hard to navigate. They may have hundreds of classes and methods, with incomplete or obsolete documentation. They may encapsulate concurrency behaviour that the developer...
In recent years, deep neural network models have been shown to outperform many state-of-the-art algorithms. The reason is that unsupervised pretraining with multi-layered deep neural networks has been shown to learn better features, which in turn improve many supervised tasks. These models not only automate the feature extraction process but also provide robust features for various machine learning tasks. However, unsupervised pretraining and feature extraction using multi-layered networks are restricted to the input features and are not applied to the output. The performance of many supervised learning algorithms (or models) depends on how well the output dependencies are...
Joseph, Ajin George
Optimization is a very important field with diverse applications in the physical, social and biological sciences and in various areas of engineering. It appears widely in machine learning, information retrieval, regression, estimation, operations research and a wide variety of computing domains. The subject is being deeply studied both theoretically and experimentally, and several algorithms are available in the literature. These algorithms, which can be executed (sequentially or concurrently) on a computing machine, explore the space of input parameters to seek high-quality solutions to the optimization problem, with the search mostly guided by certain structural properties of the objective function. In...
We study the problem of proving lower bounds for depth four arithmetic circuits. Depth four circuits have been receiving much attention in recent circuit lower bound results, following a series of results culminating in the fact that strong enough lower bounds for depth four circuits will imply super-polynomial lower bounds for general arithmetic circuits, and hence solve one of the most central open problems in algebraic complexity, i.e., a separation between the VP and VNP classes. However, despite several efforts, even for general arithmetic circuits, the best known lower bound is Omega(N log N)...
The multi-armed bandit (MAB) problem provides a convenient abstraction for many online decision problems arising in modern applications including Internet display advertising, crowdsourcing, online procurement, smart grids, etc. Several variants of the MAB problem have been proposed to extend the basic model to a variety of practical and general settings. The sleeping multi-armed bandit (SMAB) problem is one such variant where the set of available arms varies with time. This study is focused on analyzing the efficacy of the Thompson Sampling algorithm for solving the SMAB problem.
Any algorithm for the classical MAB problem is expected to choose one of...
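As an illustration of the setting described above, the following is a minimal sketch of Thompson Sampling restricted to the arms that are awake in each round. It assumes Bernoulli rewards with Beta(1, 1) priors; the function name `thompson_sleeping`, its parameters, and the reward model are hypothetical choices for this example and are not taken from the thesis.

```python
import random


def thompson_sleeping(true_means, availability, rng):
    """Run Thompson Sampling when only some arms are available per round.

    true_means   -- assumed Bernoulli success probability of each arm
    availability -- one non-empty list of available arm indices per round
    rng          -- a random.Random instance (passed in for reproducibility)
    """
    n_arms = len(true_means)
    successes = [0] * n_arms
    failures = [0] * n_arms
    pulls = []
    for available in availability:
        # Draw one posterior sample per available arm only.
        samples = {
            arm: rng.betavariate(successes[arm] + 1, failures[arm] + 1)
            for arm in available
        }
        # Play the available arm with the largest sampled mean.
        arm = max(samples, key=samples.get)
        reward = 1 if rng.random() < true_means[arm] else 0
        successes[arm] += reward
        failures[arm] += 1 - reward
        pulls.append(arm)
    return pulls, successes, failures
```

The only change from the classical algorithm in this sketch is that posterior samples are drawn for the currently available arms rather than for all arms, so an unavailable arm can never be selected.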
Datta Krupa, R
Interval graphs are well-studied structures. Intervals can represent resources such as jobs to be scheduled. Finding a maximum independent set in an interval graph corresponds to scheduling the maximum number of non-conflicting jobs on a computer. Most optimization problems on interval graphs, such as independent set, vertex cover, dominating set, maximum clique, etc., can be solved efficiently using combinatorial algorithms in polynomial time. Hitting, Covering and Packing problems have been extensively studied in the last few decades and have applications in diverse areas. While they are NP-hard in most settings, they are polynomially solvable for intervals. In this thesis, we consider the generalizations...
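The correspondence between a maximum independent set in an interval graph and non-conflicting job scheduling can be sketched with the classical earliest-finish-time greedy algorithm. This is a standard textbook technique shown for illustration, not a method from the thesis; the function name and the choice to treat intervals as closed (so jobs sharing an endpoint conflict) are assumptions of this example.

```python
def max_non_conflicting(jobs):
    """Select a maximum set of pairwise disjoint closed intervals.

    jobs -- iterable of (start, end) pairs with start <= end
    Greedy rule: repeatedly take the job that finishes earliest among
    those that start strictly after the last selected job ends.
    """
    chosen = []
    last_end = float("-inf")
    for start, end in sorted(jobs, key=lambda job: job[1]):
        if start > last_end:
            chosen.append((start, end))
            last_end = end
    return chosen
```

Sorting dominates the running time, so the sketch runs in O(n log n) for n jobs.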
Graph algorithms are ubiquitously used across domains. They exhibit parallelism, which can be exploited on parallel architectures, such as multi-core processors and accelerators. However, real world graphs are massive in size and cannot fit into the memory of a single machine. Such large graphs are partitioned and processed in a distributed cluster environment which consists of multiple GPUs and CPUs.
Existing frameworks that facilitate large scale graph processing in the distributed cluster have their own style of programming and require extensive involvement by the user in communication and synchronization aspects. Adaptation of these frameworks appears to be an overhead for...
Oblivious Transfer (OT) is one of the most fundamental cryptographic primitives, with widespread application in general secure multi-party computation (MPC) as well as in a number of tailored and special-purpose problems of interest such as private set intersection (PSI), private information retrieval (PIR), and contract signing, to name a few. Often the instantiations of OT require prohibitive communication and computation complexity. OT extension protocols are introduced to compute a very large number of OTs, referred to as extended OTs, at the cost of a small number of OTs, referred to as seed OTs.
We present a fast OT extension protocol for small secrets...
Tekumalla, Lavanya Sita
The focus of this thesis is models for non-parametric clustering of multivariate count data. While there has been significant work in Bayesian non-parametric modelling in the last decade, in the context of mixture models for real-valued data and some forms of discrete data such as multinomial mixtures, there has been much less work on non-parametric clustering of multivariate count data. The main challenges in clustering multivariate counts include choosing a suitable multivariate distribution that adequately captures the properties of the data, for instance handling over-dispersed data or sparse multivariate data, while at the same time leveraging the inherent dependency structure between dimensions...
Satyanath Bhat, K
In this thesis, we address several generic problems concerned with procurement of tasks from a crowd that consists of strategic workers with uncertainty in their qualities. These problems assume importance as the quality of services in a service marketplace is known to degrade when there is (unchecked) information asymmetry pertaining to quality. Moreover, crowdsourcing is increasingly being used for a wide variety of tasks these days since it offers high levels of flexibility to workers as well as employers. We seek to address the issue of quality uncertainty in crowdsourcing through mechanism design and machine learning. As the interactions in...
Praphul Chandra, *
With the ever-increasing trend in the number of social interactions getting intermediated by technology (the world wide web) as the backdrop, this thesis focuses on the design of mechanisms for online communities (crowds) which strive to come together, albeit in ad-hoc fashion, to achieve a social objective. Two examples of such web-based social communities which are of central concern in this thesis are crowdsourcing markets and crowdfunding platforms. For these settings which involve strategic human agents, we design mechanisms that incentivize contributions (effort, funds, or information) from the crowd and aggregate these contributions to achieve the specified objective. Our work...
Ranganath, B N
Hierarchical Bayesian Models and Matrix factorization methods provide an unsupervised way to learn latent components of data from grouped or sequence data. For example, in document data, a latent component corresponds to a topic, with each topic being a distribution over a finite vocabulary of words. For many applications, there exist sparse relationships between the domain entities and the latent components of the data. Traditional approaches for topic modelling do not take into account these sparsity considerations. Modelling these sparse relationships helps in extracting relevant information, leading to improvements in topic accuracy and scalable solutions. In our thesis, we explore these...
Reddy, Danda Sai Koti
Optimization problems involving uncertainties are common in a variety of engineering disciplines such as transportation systems, manufacturing, communication networks, healthcare and finance. The large number of input variables and the lack of a system model prohibit a precise analytical solution and a viable alternative is to employ simulation-based optimization. The idea here is to simulate a few times the stochastic system under consideration while updating the system parameters until a good enough solution is obtained.
Formally, given only noise-corrupted measurements of an objective function, we wish to find a parameter which minimises the objective function. Iterative algorithms using statistical methods...
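To make the problem statement above concrete, here is a minimal sketch of one well-known iterative scheme for this setting, simultaneous perturbation stochastic approximation (SPSA). The abstract as shown does not say which algorithms the thesis studies, so the choice of SPSA, the function name `spsa_minimize`, and the gain sequences `a / k` and `c / k ** (1 / 6)` are all assumptions made for illustration.

```python
import random


def spsa_minimize(noisy_f, theta, iterations, rng, a=0.5, c=0.1):
    """Minimise a function observed only through noisy measurements.

    noisy_f    -- callable taking a list of floats, returning a noisy value
    theta      -- initial parameter (sequence of floats)
    iterations -- number of update steps (two measurements per step)
    rng        -- a random.Random instance (passed in for reproducibility)
    a, c       -- assumed constants for the step-size and perturbation gains
    """
    theta = list(theta)
    for k in range(1, iterations + 1):
        a_k = a / k                # decaying step size
        c_k = c / k ** (1 / 6)     # decaying perturbation size
        # Perturb every coordinate at once with random +/-1 signs.
        delta = [rng.choice((-1.0, 1.0)) for _ in theta]
        plus = [t + c_k * d for t, d in zip(theta, delta)]
        minus = [t - c_k * d for t, d in zip(theta, delta)]
        diff = noisy_f(plus) - noisy_f(minus)
        # Two measurements yield a gradient estimate for all coordinates.
        theta = [t - a_k * diff / (2 * c_k * d) for t, d in zip(theta, delta)]
    return theta
```

Each iteration simulates the system only twice regardless of the parameter dimension, which is what makes this kind of scheme attractive when simulations are expensive.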
In this doctoral thesis, we address several representative problems that arise in the context of learning from multiple heterogeneous agents. These problems are relevant to many modern applications such as crowdsourcing and internet advertising. In scenarios such as crowdsourcing, there is a planner who is interested in learning a task and a set of noisy agents provide the training data for this learning task. Any learning algorithm making use of the data provided by these noisy agents must account for their noise levels. The noise levels of the agents are unknown to the planner, leading to a non-trivial difficulty....
A wireless sensor network (WSN) is a collection of sensor nodes distributed over a geographical region to obtain environmental data. It can have different types of applications, ranging from low data rate event-driven and monitoring applications to high data rate real-time industrial and military applications. Energy efficiency and reliability are the two major design issues which should be handled efficiently at all the layers of the communication protocol stack, due to resource-constrained sensor nodes and the error-prone nature of the wireless channel, respectively. Media access control (MAC) is the protocol which deals with the problem of packet collision due...
Watwe, Siddharth P
Clock synchronization in a wireless sensor network (WSN) is essential as it provides a consistent and coherent time frame for all the nodes across the network. Typically, clock synchronization is achieved by message passing using carrier sense multiple access (CSMA) for media access. The nodes try to synchronize with each other by sending synchronization request messages. If many nodes try to send messages simultaneously, contention-based schemes cannot efficiently avoid collisions, which results in message losses and affects the synchronization accuracy. Since the nodes in a WSN have limited energy, it is required that the energy consumed by the clock synchronization protocols is as...
One of the fundamental problems in commutative algebra and algebraic geometry is to understand the nature of the solution space of a system of multivariate polynomial equations over a field k, such as the real or complex numbers. An important algorithmic tool in this study is the notion of Groebner bases (Buchberger (1965)). Given a system of polynomial equations, f1 = 0,..., fm = 0, a Groebner basis is a “canonical” generating set of the ideal generated by f1,..., fm, that can answer, constructively, many questions in computational ideal theory. It generalizes several concepts of univariate polynomials like resultants to the multivariate case,...
In recent times there has been an explosion of online user-generated video content. This has generated significant research interest in video analytics. Human users understand videos based on high-level semantic concepts. However, most of the current research in video analytics is driven by low-level features and descriptors, which often lack semantic interpretation. Existing attempts in semantic video analytics are specialized and require additional resources like movie scripts, which are not available for most user-generated videos. There are no general-purpose approaches to understanding videos through semantic concepts.
In this thesis we attempt to bridge this gap. We view videos as collections...