
Data Central

Portfolio

Research Data Discovery

Data Central brings together an array of heterogeneous astronomical data in one accessible, feature-rich data portal, maximising the scientific return on Australian optical datasets of national significance.

The Data Central Science Platform has revolutionised how astronomical research teams manage projects across the full data lifecycle. Enabling researchers to make key discoveries with their data across multiple domains, Data Central provides instant access to both cutting-edge and legacy data across a breadth of national and international facilities and instruments. We provide user-focused exploration alongside complementary cross-matched data to facilitate new scientific discoveries. Users can run analyses and data reduction routines, and publish data to the world, all on a single platform: a crucial step forward in the era of big data, where large data volumes are no longer manageable on desktop machines.

Client: Astronomical Research Community

Product: Scientific Platform

Funding: Incorporating the Optical Data Centre, an AAL-funded initiative

URL: datacentral.org.au

Many research datasets suffer from three big problems: 1. they stand in isolation from other similar or complementary data; 2. they do not have appropriate long-term storage; and 3. they do not comply with the FAIR principles (stating that data should be Findable, Accessible, Interoperable, and Reusable; https://go-fair.org). In combination, these difficulties limit the academic potential and legacy value of some of our world-leading scientific endeavours.

In 2015 we set out to address these problems for astronomical data by developing a bespoke software system that re-evaluated how astronomical big data could be managed. This became the Data Central Science Platform in 2017. Data Central provides data access by implementing International Virtual Observatory Alliance (IVOA or "Virtual Observatory") standards agreed across the global astronomical community. This helps to make astronomical data FAIR; however, there are no turnkey solutions available (see O'Toole & Tocknell 2023). We have therefore developed our own implementations of the IVOA standards and built them into the Data Central Science Platform to enhance the interoperability of the data.

Data Central's goal has been, from its inception, to support data across the full research life cycle: from concept through to publication. A large part of this is managing heterogeneous datasets, which is crucial for the next generation of data archives, given the different observational and theoretical data formats produced by telescopes, instruments, and simulations. The platform initially incorporated data from the Anglo-Australian Telescope, but has evolved and now hosts datasets from a large range of optical, infrared and radio telescopes, both ground- and space-based. Theoretical and simulation data are now also being incorporated into the platform.

One of the big challenges in research is the equity of access to facilities and resources. Data Central strives to address inequitable access to compute and data resources by providing software services for astronomers around the world.

The project to build Data Central began in 2015 using a design-thinking approach. After a thorough elaboration process, we settled on a Python / Django / React stack for the front end, a custom-built Python middleware layer for data ingestion and management, and Hadoop for data storage. The choice of Hadoop in particular was a departure from traditional astronomy practice: it provides a distributed data storage model, which allows the system to scale. We build the entire platform into Docker containers to ensure portability and to enable rapid development and testing.

Accessibility and usability have been a key focus of the platform. We engage with the community through forums, surveys and usability testing to continually improve the user experience, ensuring a responsive, mobile-friendly, intuitive interface to otherwise complex and diverse datasets. We are striving towards a seamless experience for research teams publishing their data through the platform, and are constantly developing new services to support this goal.

We have taken a continuous-improvement approach to development over the lifetime of the software and have moved with modern technology trends, meaning that much of the codebase from the original version of Data Central has been improved, refactored, or replaced. As an example, we are progressively adding more React (JavaScript) components to the user interface, which improves both usability and maintainability. Maintainability is also critical for scalability and the addition of new features. We have also adopted asynchronous programming techniques for improved usability and performance. This design and maintenance approach means that the Data Central software is straightforward and stable to use, with minimal downtime.

The Data Central Science Platform is a team endeavour. Conceived by the Team Leads and championed by the Head of Astronomy, the software was designed, developed and deployed by the AAO RDS team.

The initial release of the software platform to the Australian and international astronomical community was on 19 July 2017 (version 1.0). Subsequent regular releases have increased the scope and functionality of the platform to its current release (version 1.13). This includes the following functionality supporting the full data life cycle and diversity of research use cases across astronomy and other fields of research.

Data Aggregation Service – This revolutionises the way that astronomers approach their research by transforming the painstaking process of finding existing relevant datasets into a fast, seamless aggregation process. For an individual celestial object, it gathers data across wavelength silos from a large and growing number of remote and local data centres, and presents the results through an easy-to-use interface.

Advanced SQL query tool – Data Central archives contain the most complete dataset from the AAT, as well as datasets from several major studies and surveys using other major telescopes around the world. Although safely stored in the archive, these data are difficult for scientists outside the original observing team to find and to combine with other complementary datasets to answer new science questions. The Advanced SQL Query Tool helps scientists find relevant data. It is a FAIR-compliant query tool that implements the Virtual Observatory standard Astronomical Data Query Language (ADQL), an extension of standard SQL that understands astronomical quantities such as celestial coordinate systems. The tool queries the locally hosted datasets and returns all data matching the request to the researcher.

Astronomical Cone Search, Image Access cutout, Spectrum Access services – As well as helping researchers obtain the original raw telescope datasets, Data Central provides access to science-ready products such as catalogues, images, and spectra of objects across the entire sky, stored remotely as well as locally. Our FAIR-compliant Virtual Observatory software services allow researchers to search for and retrieve these objects by celestial position, wavelength and other quantities.

Archives and data processing pipelines – Data Central offers a redeployable system that allows users to easily search for raw telescope datasets, filter the results and download the data. Data files can be taken offline for bespoke processing, or processed with the PAWS (Pipelines As a Web Service) system, which returns science-ready data products.

Team-curated documentation – This service allows research teams to curate the documentation that describes their data, ensuring that the data descriptions used by the community are written by the experts in those data.

Access control management – Users can manage their account details, and the access to their data and other services in the Data Central Science Platform. This ability is now being used beyond astronomy, including in archaeology and early childhood education research.

Supporting others to build on Data Central – To enable other tools to be written to access survey data hosted in Data Central, the schema browser service displays the metadata for all hosted datasets, and the REST API service grants programmatic access to the data.
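As a rough illustration of how an external tool might prepare requests for the query services described above, the Python sketch below builds an ADQL cone query and an IVOA Simple Cone Search URL, and includes the angular-separation test that a cone search amounts to. It is only a sketch under stated assumptions: the table name, column names and base URL are hypothetical placeholders, not Data Central's schema or endpoints, and nothing is sent over the network. RA, DEC and SR are the positional parameters of the Simple Cone Search standard, and CONTAINS, POINT and CIRCLE are ADQL geometry functions.

```python
import math
from urllib.parse import urlencode


def _check_cone(ra_deg, dec_deg, radius_deg):
    """Validate a search cone given in decimal degrees."""
    if not 0.0 <= ra_deg < 360.0:
        raise ValueError("right ascension must be in [0, 360) degrees")
    if not -90.0 <= dec_deg <= 90.0:
        raise ValueError("declination must be in [-90, 90] degrees")
    if not 0.0 < radius_deg <= 180.0:
        raise ValueError("radius must be in (0, 180] degrees")


def build_adql_cone_query(table, ra_col, dec_col, ra_deg, dec_deg,
                          radius_deg, limit=100):
    """Return an ADQL query for rows within radius_deg of a sky position.

    table, ra_col and dec_col are interpolated as-is, so they are assumed
    to be trusted identifiers (hypothetical names in this sketch).
    """
    _check_cone(ra_deg, dec_deg, radius_deg)
    return (
        f"SELECT TOP {int(limit)} * FROM {table} "
        f"WHERE 1 = CONTAINS(POINT('ICRS', {ra_col}, {dec_col}), "
        f"CIRCLE('ICRS', {ra_deg}, {dec_deg}, {radius_deg}))"
    )


def build_cone_search_url(base_url, ra_deg, dec_deg, radius_deg):
    """Return a Simple Cone Search URL using the RA, DEC and SR parameters."""
    _check_cone(ra_deg, dec_deg, radius_deg)
    params = urlencode({"RA": ra_deg, "DEC": dec_deg, "SR": radius_deg})
    separator = "&" if "?" in base_url else "?"
    return f"{base_url}{separator}{params}"


def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees (haversine formula)."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    sin_ddec = math.sin((dec2 - dec1) / 2.0)
    sin_dra = math.sin((ra2 - ra1) / 2.0)
    a = sin_ddec ** 2 + math.cos(dec1) * math.cos(dec2) * sin_dra ** 2
    return math.degrees(2.0 * math.asin(min(1.0, math.sqrt(a))))


if __name__ == "__main__":
    # Hypothetical table, columns and service URL, for illustration only.
    print(build_adql_cone_query("example_survey.sources", "ra", "dec",
                                150.1, 2.2, 0.05, limit=10))
    print(build_cone_search_url("https://example.org/scs", 150.1, 2.2, 0.05))
    print(angular_separation_deg(150.1, 2.2, 150.12, 2.21) <= 0.05)
```

The query string would be submitted through whichever ADQL interface a service exposes, and the URL through its cone search endpoint; both are left out here because they depend on the service being queried.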

We also manage a single sign-on (SSO) service based on the Apereo Central Authentication Service software, which provides seamless interoperability between the above services. Our SSO service is also used by the Anglo-Australian Telescope, the CSIRO Data Access Portal, the Murchison Widefield Array data archive, the Theoretical Astrophysical Observatory, the FAIMS mobile and web application, and the ORICL tool.

The software maximises the scientific return and impact of Australia’s astronomy investment over the past 40 years, providing instant access to both cutting-edge and legacy data across a breadth of national and international facilities and instruments.

The Macquarie University Research Centre for Astronomy, Astrophysics and Astrophotonics was able to become a node of the ARC ASTRO3D Centre of Excellence in 2021, largely based on the Data Central Science Platform and how it enables Data Intensive Astronomy across the centre.

 

Data Central has begun to reduce barriers to entry for a wide range of users, providing a centralised, accessible platform for students, researchers and amateurs alike to access data and make key scientific discoveries. Less time is spent managing infrastructure, codebases and websites, and more time on innovative research. The platform facilitates collaboration for teams distributed across Australia and the world. The PAWS model we have developed also provides more equitable access to complex data reduction packages that require large computing resources, which may be out of reach of researchers in developing countries (e.g. Miszalski et al. 2023).
