A repository of works by Caltech authors.
Group = Center for Advanced Computing Research
Showing items 1 - 20 of 80
JClarens: A Java Framework for Developing and Deploying Web Services for Grid Computing - Thomas, Michael; Steenberg, Conrad; van Lingen, Frank; Newman, Harvey; Bunn, Julian; Ali, Arshad; McClatchey, Richard; Anjum, Ashiq; Azim, Tahir; ur Rehman, Waqas; Khan, Faisal; In, Jang Uk
High Energy Physics (HEP) and other scientific communities have adopted Service Oriented Architectures (SOA) as part of a larger Grid computing effort. This effort involves the integration of many legacy applications and programming libraries into a SOA framework. The Grid Analysis Environment (GAE) is such a service oriented architecture based on the Clarens Grid Services Framework and is being developed as part of the Compact Muon Solenoid (CMS) experiment at the Large Hadron Collider (LHC) at the European Laboratory for Particle Physics (CERN). Clarens provides a set of authorization, access control, and discovery services, as well as XMLRPC and SOAP...
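The XMLRPC interface mentioned above follows the standard XML-RPC request/response pattern. As a minimal, self-contained sketch of that call style (the local server and the `system.echo` method are hypothetical stand-ins, not the actual Clarens API):

```python
# Minimal local XML-RPC round trip in the style of a Clarens service call.
# The server, port, and "system.echo" method are hypothetical placeholders,
# not part of the real Clarens framework.
import threading
from xmlrpc.server import SimpleXMLRPCServer
from xmlrpc.client import ServerProxy

def echo(msg):
    """Toy method standing in for a Clarens service endpoint."""
    return "echo: " + msg

# Bind to port 0 so the OS picks a free port.
server = SimpleXMLRPCServer(("localhost", 0), logRequests=False)
server.register_function(echo, "system.echo")
port = server.server_address[1]
threading.Thread(target=server.serve_forever, daemon=True).start()

# A client invokes the remote method through a proxy object.
client = ServerProxy("http://localhost:%d" % port)
result = client.system.echo("hello grid")
print(result)  # -> echo: hello grid
server.shutdown()
```

A real Clarens deployment would add the authentication and access-control layers described in the abstract on top of this basic request/response mechanism.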
SOAP Services with Clarens: Guide for Developers and Administrators - Steenberg, Conrad; Jacob, Joseph C.; Williams, Roy
The Clarens application server enables secure, asynchronous SOAP services to run on a Grid cluster such as one of those of the TeraGrid. There is a Client, who wants to use the service and understands the application domain enough to form a reasonable service request; a Developer, who is a power-user of the TeraGrid, who understands both Clarens and the application domain, and creates and deploys a service on a TeraGrid head node; and there is a Root system administrator, who controls the Clarens installation and the cluster on which it runs. The purpose of this document is to provide...
Grist: Grid-based Data Mining for Astronomy - Jacob, Joseph C.; Katz, Daniel S.; Miller, Craig D.; Walia, Harshpreet; Williams, Roy; Djorgovski, S. George; Graham, Matthew; Mahabal, Ashish; Babu, Jogesh; Vanden Berk, Daniel E.; Nichol, Robert
The Grist project is developing a grid-technology based system as a research environment for astronomy with massive and complex datasets. This knowledge extraction system will consist of a library of distributed grid services controlled by a workflow system, compliant with standards emerging from the grid computing, web services, and virtual observatory communities. This new technology is being used to find high redshift quasars, study peculiar variable objects, search for transients in real time, and fit SDSS QSO spectra to measure black hole masses. Grist services are also a component of the "hyperatlas" project to serve high-resolution multi-wavelength imagery over...
Time Domain Explorations With Digital Sky Surveys - Mahabal, Ashish A.; Djorgovski, S. G.; Graham, M. J.; Kollipara, Priya; Granett, Benjamin; Krause, Elisabeth; Williams, Roy; Bogosavljevic, M.; Baltay, C.; Rabinowitz, D.; Bauer, A.; Andrews, P.; Ellman, N.; Duffau, S.; Jerke, J.; Rengstorf, A.; Brunner, R.; Musser, J.; Mufson, S.; Gebhard, M.
One of the new frontiers of astronomical research is the exploration of time variability on the sky at different wavelengths and flux levels. We have carried out a pilot project using DPOSS data to study strong variables and transients, and are now extending it to the new Palomar-QUEST synoptic sky survey. We report on our early findings and outline the methodology to be implemented in preparation for a real-time transient detection pipeline. In addition to large numbers of known types of highly variable sources (e.g., SNe, CVs, OVV QSOs, etc.), we expect to find numerous transients whose nature may be...
Virtual Astronomy, Information Technology, and the New Scientific Methodology - Djorgovski, S. G.
All sciences, including astronomy, are now entering the era of information abundance. The exponentially increasing volume and complexity of modern data sets promise to transform the scientific practice, but also pose a number of common technological challenges. The Virtual Observatory concept is the astronomical community's response to these challenges: it aims to harness the progress in information technology in the service of astronomy, and at the same time provide a valuable testbed for information technology and applied computer science. Challenges broadly fall into two categories: data handling (or "data farming"), including issues such as archives, intelligent storage, databases, interoperability, fast...
Virtual Observatory: From Concept to Implementation - Djorgovski, S. G.; Williams, R.
We review the origins of the Virtual Observatory (VO) concept, and the current status of the efforts in this field. VO is the response of the astronomical community to the challenges posed by the modern massive and complex data sets. It is a framework in which information technology is harnessed to organize, maintain, and explore the rich information content of the exponentially growing data sets, and to enable a qualitatively new science to be done with them. VO will become a complete, open, distributed, web-based framework for astronomy of the early 21st century. A number of significant efforts worldwide are...
Heterogeneous Relational Databases for a Grid-enabled Analysis Environment - Ali, Arshad; Anjum, Ashiq; Azim, Tahir; Bunn, Julian; Iqbal, Saima; McClatchey, Richard; Newman, Harvey; Shah, S. Yousaf; Solomonides, Tony; Steenberg, Conrad; Thomas, Michael; van Lingen, Frank; Willers, Ian
Grid based systems require a database access mechanism that can provide seamless homogeneous access to the requested data through a virtual data access system, i.e. a system which can take care of tracking the data that is stored in geographically distributed heterogeneous databases. This system should provide an integrated view of the data that is stored in the different repositories by using a virtual data access mechanism, i.e. a mechanism which can hide the heterogeneity of the backend databases from the client applications. This paper focuses on accessing data stored in disparate relational databases through a web service interface, and...
Resource Management Services for a Grid Analysis Environment - Ali, Arshad; Anjum, Ashiq; Azim, Tahir; Bunn, Julian; Mehmood, Atif; McClatchey, Richard; Newman, Harvey; ur Rehman, Waqas; Steenberg, Conrad; Thomas, Michael; van Lingen, Frank; Willers, Ian; Zafar, Muhammad Adeel
Selecting optimal resources for submitting jobs on a computational grid or accessing data from a data grid is one of the most important tasks of any grid middleware. Most modern grid software today fulfills this responsibility on a best-effort basis. Almost all decisions regarding scheduling and data access are made by the software automatically, giving users little or no control over the entire process. To solve this problem, a more interactive set of services and middleware is desired that provides users more information about grid weather, and gives them more control over the decision making...
The Clarens Web Service Framework for Distributed Scientific Analysis in Grid Projects - van Lingen, Frank; Steenberg, Conrad; Thomas, Michael; Anjum, Ashiq; Azim, Tahir; Khan, Faisal; Newman, Harvey; Ali, Arshad; Bunn, Julian; Legrand, Iosif
Large scientific collaborations are moving towards service oriented architectures for implementation and deployment of globally distributed systems. Clarens is a high performance, easy to deploy Web Service framework that supports the construction of such globally distributed systems. This paper discusses some of the core functionality of Clarens that the authors believe is important for building distributed systems based on Web Services that support scientific analysis.
The UltraLight Project: The Network as an Integrated and Managed Resource in Grid Systems for High Energy Physics and Data Intensive Science - Newman, Harvey; Bunn, Julian; Cavanaugh, Richard; Legrand, Iosif; Low, Steven; McKee, Shawn; Nae, Dan; Ravot, Sylvan; Steenberg, Conrad; Su, Xun; Thomas, Michael; van Lingen, Frank; Xia, Yang
We describe the NSF-funded UltraLight project. The project's goal is to meet the data-intensive computing challenges of the next generation of particle physics experiments with a comprehensive, network-focused agenda. In particular, we argue that instead of treating the network traditionally, as a static, unchanging and unmanaged set of inter-computer links, we will use it as a dynamic, configurable, and closely monitored resource, managed end-to-end, to construct a next-generation global system able to meet the data processing, distribution, access and analysis needs of the high energy physics (HEP) community. While the initial UltraLight implementation and services architecture is being developed...
TetSplat: Real-time Rendering and Volume Clipping of Large Unstructured Tetrahedral Meshes - Museth, Ken; Lombeyda, Santiago
We present a novel approach to interactive visualization and exploration of large unstructured tetrahedral meshes. These massive 3D meshes are used in mission-critical CFD and structural mechanics simulations, and typically sample multiple field values on several millions of unstructured grid points. Our method relies on the pre-processing of the tetrahedral mesh to partition it into non-convex boundaries and internal fragments that are subsequently encoded into compressed multi-resolution data representations. These compact hierarchical data structures are then adaptively rendered and probed in real-time on a commodity PC. Our point-based rendering algorithm, which is inspired by QSplat, employs a simple but highly...
An Architecture for Scaling NVO Services to TeraGrid - Williams, Roy; Hanisch, Bob; Jacob, Joe; Plante, Ray; Szalay, Alex
The term "cyberinfrastructure" has been adopted by the US National Science Foundation to mean "advanced computing engines, data archives and digital libraries, observation and sensor systems, and other research and education instrumentation [linked] into a common framework". One of the largest awards in this program is the TeraGrid, a linkage of large supercomputer centers based on the Globus software. Another cyberinfrastructure program is the National Virtual Observatory, a linkage of astronomical data publishers into a service-oriented framework.
There are different philosophies behind the TeraGrid and the NVO architecture. This note explains a proposed service-oriented architecture for TeraGrid nodes that is...
NVO-TeraGrid First Year Results: TeraGrid Utilization Annual Report for the National Virtual Observatory Multi-year, Large Research Collaboration - Williams, Roy; Connolly, Andrew; Gardner, Jeffrey
The NSF National Virtual Observatory (NVO) is a multiyear effort to build tools, services, registries, protocols, and standards that can extract the full knowledge content of massive, multi-frequency data sets. Here we detail our progress from the first year, and plans for the second year of our effort to combine the computational resources of the TeraGrid with the NVO. The work includes: creation of derived image products from multi-terabyte datasets; multiwavelength and multitemporal image federation; analysis of 3-point correlation in galaxies; fitting models to thousands of galaxy spectra; Monte-Carlo modeling of early-Universe models; processing of the ongoing 13 Tbyte Palomar-Quest...
HotGrid: Graduated Access to Grid-based Science Gateways - Williams, Roy; Steenberg, Conrad; Bunn, Julian
We describe the idea of a Science Gateway, an application-specific task wrapped as a web service, and some examples of these that are being implemented on the US TeraGrid cyberinfrastructure. We also describe HotGrid, a means of providing simple, immediate access to the Grid through one of these gateways, which we hope will broaden the use of the Grid, drawing in a wide community of users. The secondary purpose of HotGrid is to acclimate a science community to the concepts of certificate use. Our system provides these weakly authenticated users with immediate power to use the Grid resources for science,...
Vector Field Analysis and Visualization through Variational Clustering - McKenzie, Alexander; Lombeyda, Santiago; Desbrun, Mathieu
Scientific computing is an increasingly crucial component of research in various disciplines. Despite its potential, exploration of the results is an often laborious task, owing to excessively large and verbose datasets output by typical simulation runs. Several approaches have been proposed to analyze, classify, and simplify such data to facilitate an informative visualization and deeper understanding of the underlying system. However, traditional methods leave much room for improvement. In this article we investigate the visualization of large vector fields, departing from accustomed processing algorithms by casting vector field simplification as a variational partitioning problem. Adopting an iterative strategy, we introduce...
A Visual Stack Based Paradigm for Visualization Environments - Gilbert, Matt; Lombeyda, Santiago
We present a new visual paradigm for visualization systems, inspired by stack-based programming. Most current implementations of visualization systems are based on directional graphs. However, directional graphs as a visual representation of execution, though initially quite intuitive, quickly grow cumbersome and difficult to follow in complex examples. Our system presents the user with a simple and compact methodology of visually stacking actions directly on top of data objects as a way of creating filter scripts. We explore and address extensions to the basic paradigm to allow for: multiple data input or data output objects to and from execution action modules,...
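The stacking idea above can be illustrated with a toy pipeline (the push/apply functions are an invented analogy for exposition, not the paper's system): data objects are pushed, and each stacked action consumes the object beneath it and replaces it with the result.

```python
# Toy stack-based filter pipeline: actions are stacked directly on top
# of data objects, as in the visual paradigm described above.
# (This API is an invented illustration, not the paper's system.)
stack = []

def push(obj):
    """Place a data object on the stack."""
    stack.append(obj)

def apply_action(fn):
    """Stack an action on the topmost object, replacing it with the result."""
    stack.append(fn(stack.pop()))

push([3, 1, 2])
apply_action(sorted)             # first stacked action: sort the list
apply_action(lambda d: d[-1])    # second stacked action: take the last element
print(stack[-1])  # -> 3
```

Read top to bottom, the script mirrors the visual stacking: each action sits directly on the data (or intermediate result) it transforms, with no graph edges to trace.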
A Compiler and Runtime Infrastructure for Automatic Program Distribution - Diaconescu, Roxana E.; Wang, Lei; Mouri, Zachary; Chu, Matt
This paper presents the design and the implementation of a compiler and runtime infrastructure for automatic program distribution. We are building a research infrastructure that enables experimentation with various program partitioning and mapping strategies and the study of automatic distribution's effect on resource consumption (e.g., CPU, memory, communication). Since many optimization techniques are faced with conflicting optimization targets (e.g., memory and communication), we believe that it is important to be able to study their interaction.
We present a set of techniques that enable flexible resource modeling and program distribution. These are: dependence analysis, weighted graph partitioning, code and communication generation, and...
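The weighted-graph-partitioning step named above can be sketched with a greedy toy heuristic (not the paper's algorithm): program units are nodes with CPU-cost weights, edges carry communication costs, and each node is placed on the side that keeps more of its communication local, penalized by that side's current load.

```python
# Toy weighted-graph bipartition illustrating the kind of partitioning
# step used in automatic program distribution. This greedy heuristic is
# an illustration, not the algorithm from the paper.
def bipartition(node_weights, edge_weights):
    # node_weights: {node: cpu_cost}; edge_weights: {(u, v): comm_cost}
    parts = {0: set(), 1: set()}
    load = [0, 0]
    # Place the heaviest program units first.
    for node in sorted(node_weights, key=node_weights.get, reverse=True):
        def affinity(side):
            # Communication weight between `node` and units already on `side`.
            return sum(w for (u, v), w in edge_weights.items()
                       if (u == node and v in parts[side])
                       or (v == node and u in parts[side]))
        # Score: communication kept local minus a load-balance penalty.
        scores = [affinity(s) - load[s] for s in (0, 1)]
        side = 0 if scores[0] >= scores[1] else 1
        parts[side].add(node)
        load[side] += node_weights[node]
    # Total communication crossing the partition boundary.
    cut = sum(w for (u, v), w in edge_weights.items()
              if (u in parts[0]) != (v in parts[0]))
    return parts, cut

# Hypothetical program units and communication edges:
nodes = {"parse": 3, "analyze": 5, "emit": 2, "io": 1}
edges = {("parse", "analyze"): 4, ("analyze", "emit"): 3, ("io", "parse"): 1}
parts, cut = bipartition(nodes, edges)
```

Production systems would use a multilevel partitioner rather than a greedy pass, but the objective is the same: minimize the cut (communication) subject to balanced node weights (CPU load).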
Reusable and Extensible High Level Data Distributions - Diaconescu, Roxana E.; Chamberlain, Bradford; Zima, Hans P.
This paper presents a reusable design of a data distribution framework for data parallel high performance applications. Distributions are a means to express locality in systems composed of large numbers of processor and memory components connected by a network. Since distributions have a great effect on the performance of applications, it is important that the distribution strategy is flexible, so its behavior can change depending on the needs of the application. At the same time, high productivity concerns require that the user is shielded from error-prone, tedious details such as communication and synchronization.
We propose an approach to distributions that enables...
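Two classic distributions that such a framework would typically let an application swap between can be sketched as index-to-processor maps (the function names here are illustrative, not the paper's API):

```python
# Sketch of two classic data distributions as index-to-owner maps.
# Function names are illustrative, not the API from the paper.
def block_owner(i, n, p):
    """Owner of index i when n elements are block-distributed over p procs."""
    block = -(-n // p)  # ceil(n / p): size of each contiguous block
    return i // block

def cyclic_owner(i, n, p):
    """Owner of index i under a cyclic (round-robin) distribution."""
    return i % p

print(block_owner(5, 8, 2))                        # -> 1
print([cyclic_owner(i, 8, 2) for i in range(4)])   # -> [0, 1, 0, 1]
```

Because only the owner map changes between the two, a framework that abstracts the distribution behind such an interface can switch strategies without touching the application's communication or synchronization code, which is the productivity point the abstract makes.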
The "MIND" Scalable PIM Architecture - Sterling, Thomas; Brodowicz, Maciej
MIND (Memory, Intelligence, and Network Device) is an advanced parallel computer architecture for high performance computing and scalable embedded processing. It is a Processor-in-Memory (PIM) architecture integrating both DRAM bit cells and CMOS logic devices on the same silicon die. MIND is multicore with multiple memory/processor nodes on each chip and supports global shared memory across systems of MIND components. MIND is distinguished from other PIM architectures in that it incorporates mechanisms for efficient support of a global parallel execution model based on the semantics of message-driven multithreaded split-transaction processing. MIND is designed to operate either in conjunction with other conventional microprocessors...