Maestro: Managing Conductor Networks of Automated Processing Pipelines

Wednesday, December 10, 2008
Bradford Castalia (Univ of AZ)

This presentation will provide an overview of the Maestro software for managing networks of Conductor pipelines, with examples from its use at the HiRISE Operations Center.

The Mars Reconnaissance Orbiter (MRO; http://mars.jpl.nasa.gov/mro/) High Resolution Imaging Science Experiment (HiRISE; http://hirise.lpl.arizona.edu/) has generated a very large number of observation data products (currently over 30 terabytes in over 867,000 product files) and continues to generate products at a high rate. This has been accomplished by using a network of automated Conductor (http://pirlwww.lpl.arizona.edu/software/Conductor.shtml) pipelines distributed over a cluster of 27 multi-processor compute nodes at the HiRISE Operations Center (HiROC). A watchdog process is always running on one of these nodes. When it detects that new HiRISE observation data is available from NASA's Jet Propulsion Laboratory (JPL; http://www.jpl.nasa.gov/), which receives the data from the spacecraft via the Deep Space Network, the watchdog makes an entry in the sources database table of the first Conductor pipeline segment. This entry initiates the data download to HiROC, which begins the sequence of linked Conductor pipelines. Each pipeline segment defines a particular set of data processing operations to be applied as data flows through the network of Conductor pipelines, which ultimately produces the data products that are delivered to the science community and the public by the Planetary Data System (PDS; http://pds.jpl.nasa.gov/) and the HiRISE web site.
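The hand-off described above, in which a watchdog adds a row to a sources table and a pipeline later picks it up, can be illustrated with a minimal sketch. This is not the Conductor implementation: the table schema, the column names, the status values and the function names are all assumptions made for illustration, and an in-memory SQLite database stands in for whatever database server is actually used.

```python
import sqlite3

def open_sources_table(conn):
    # Hypothetical schema: one row per data source plus a status column.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sources ("
        " id INTEGER PRIMARY KEY,"
        " source TEXT UNIQUE NOT NULL,"
        " status TEXT NOT NULL DEFAULT 'unprocessed')"
    )

def register_new_sources(conn, available):
    """Watchdog step: add each source not seen before; return those added."""
    added = []
    for name in available:
        seen = conn.execute(
            "SELECT 1 FROM sources WHERE source = ?", (name,)
        ).fetchone()
        if seen is None:
            conn.execute("INSERT INTO sources (source) VALUES (?)", (name,))
            added.append(name)
    conn.commit()
    return added

def next_unprocessed(conn):
    """Pipeline step: fetch the oldest source that has not been processed."""
    return conn.execute(
        "SELECT id, source FROM sources"
        " WHERE status = 'unprocessed' ORDER BY id LIMIT 1"
    ).fetchone()

if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    open_sources_table(conn)
    # Placeholder source names, not real observation identifiers.
    print(register_new_sources(conn, ["obs_A", "obs_B"]))
    print(next_unprocessed(conn))
```

Calling register_new_sources repeatedly with overlapping lists adds each source only once, which is the property a polling watchdog needs so that a download is not triggered twice for the same data.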

The Conductor pipelines, though they operate autonomously, do require management. For example, to keep up with the flow of incoming data, multiple Conductors are allocated on multiple compute nodes to handle long-running procedures such as geometric processing; processing multiple data sources in parallel avoids bottlenecks. Bad data (for example, Deep Space Network transmission gaps in critical sections) or problems in the underlying systems can cause processing failures. If the configured failure limit is reached, the Conductors stop processing the affected pipelines and call on operators to investigate the problem before restarting them. Changes to processing parameters can require reprocessing of some or all data products, which calls for a different network of Conductor pipelines than is used for routine processing. Thus the data processing operators use various Conductor networks depending on the needs of the situation. The data processing procedures themselves undergo constant tuning and enhancement, which requires suspending some or all of the Conductor pipelines while new processing software and/or configuration files are installed. These conditions call for a tool that can manage both individual Conductors and the pipeline networks as a whole.
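The failure-limit behavior mentioned above can be sketched as a simple processing loop. The sketch assumes the limit counts consecutive failures and that a success resets the count; the abstract does not say how Conductor counts failures, so this is an illustrative choice, and the function and parameter names are hypothetical.

```python
def run_pipeline(sources, procedure, failure_limit):
    """Process sources in order, halting after too many failures in a row.

    Assumption: the limit applies to consecutive failures.
    Returns a tuple (completed, failed, halted).
    """
    completed, failed = [], []
    consecutive = 0
    for source in sources:
        try:
            procedure(source)
        except Exception:
            failed.append(source)
            consecutive += 1
            if consecutive >= failure_limit:
                # Stop here so an operator can investigate before a restart.
                return completed, failed, True
        else:
            completed.append(source)
            consecutive = 0  # a success resets the failure count
    return completed, failed, False

if __name__ == "__main__":
    def procedure(source):
        # Stand-in for a real processing step; "bad" sources always fail.
        if source.startswith("bad"):
            raise ValueError(source)

    print(run_pipeline(["ok1", "bad1", "bad2", "ok2"], procedure, 2))
```

An isolated bad source is recorded and skipped, while a run of failures, which more likely points to a systemic problem, halts the loop and leaves the remaining sources untouched.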

The Maestro package is a new addition to the Conductor software package. It provides remote monitoring and management of Conductor networks. The Maestro software is based on an asynchronous, event-driven Messenger service in which each Conductor reports its processing activities, as they occur, to a Stage_Manager for its Theater location. Each computer system may host multiple Theaters as needed. A Kapellmeister client is included in the Maestro package. It can connect to the Stage_Manager of any Theater location and request a list identifying all Conductors operating at that Theater location, along with notification of changes to the list. The Kapellmeister establishes Messenger connections to the individual Conductors through the Stage_Manager to receive notifications of all Conductor processing activities in real time. The Kapellmeister provides Conductor network managers with a graphical user interface that controls all Theater connections, lists all the Conductors on all the Theaters with their processing state, enables managers to send the Conductors messages to change their processing state, and shows a matrix of all Conductor pipelines by their Theater location with a display that summarizes all the Conductor states. Operators can start new Conductors on any Theater as needed, and can cause existing Conductors to safely stop processing and, if desired, quit. The Kapellmeister can write a Profile file that defines the current Conductor network, and can read a Profile file to establish the defined Conductor network. A new Conductor Manager interface is also available to operators; it provides detailed monitoring and management of all Conductor operations. The Manager may be used remotely via a Kapellmeister or locally when running an individual Conductor.
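The reporting pattern described above, in which Conductors report to a per-Theater Stage_Manager and a client subscribes for the current list and later changes, can be sketched as a small publish/subscribe registry. This is a simplified, synchronous, in-process illustration only: the real Messenger service is asynchronous and networked, and every class name, method name, event label and the profile layout below are assumptions rather than the Maestro API or Profile file format.

```python
class StageManager:
    """Hypothetical stand-in for the Stage_Manager of one Theater."""

    def __init__(self, theater):
        self.theater = theater
        self._conductors = {}  # conductor name -> processing state
        self._listeners = []   # callbacks registered by clients

    def subscribe(self, callback):
        """Register a client; it immediately receives the current list."""
        self._listeners.append(callback)
        callback("list", self.theater, dict(self._conductors))

    def report(self, conductor, state):
        """Called by a conductor whenever its processing state changes."""
        self._conductors[conductor] = state
        for callback in self._listeners:
            callback("state", self.theater, {conductor: state})


class Client:
    """Hypothetical stand-in for a monitoring client's data model."""

    def __init__(self):
        self.view = {}  # (theater, conductor) -> latest reported state

    def on_event(self, kind, theater, payload):
        # Both the initial list and later state changes update the view.
        for conductor, state in payload.items():
            self.view[(theater, conductor)] = state

    def profile(self):
        """Snapshot of the network: theater -> sorted conductor names."""
        network = {}
        for theater, conductor in self.view:
            network.setdefault(theater, []).append(conductor)
        return {theater: sorted(names) for theater, names in network.items()}


if __name__ == "__main__":
    stage = StageManager("node1")       # placeholder Theater name
    stage.report("geometry", "running")
    client = Client()
    stage.subscribe(client.on_event)    # receives the existing list
    stage.report("download", "waiting")
    print(client.view)
    print(client.profile())
```

Because the client receives the full list on subscription and incremental events afterwards, it can connect at any time and still end up with a complete view, which is the property a tool that attaches to an already running network needs.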

The Maestro package provides a high-level view of a Conductor pipeline network, combined with detailed monitoring and control capabilities, that makes managing these networks significantly more effective and efficient.
