Data Access Strategies

Remote data access: The large data sets will initially be stored at the PNI centres, and external users who cannot transfer all of their data to their home institutes must be able to access it remotely. A web-based portal shall act as the common interface for this access. A survey of established open-source software (e.g. Fedora Commons) and of solutions planned or implemented at other institutions (e.g. DIAMOND, EuroFEL) will be performed, and a portal prototype will be implemented at DESY. In the envisaged solution, users will be able to search for data by keyword, to browse and visualize the contents of large data files, and to transfer selected parts of a file without having to transfer the complete file. Where users decide to leave their data with the facilities, the data will be managed by a suitable caching and tape robot system for a defined period, the length of which needs to be discussed directly with the user community. One possible option for this is the dCache system developed by DESY and Fermilab. A long-term goal, which is not part of the first phase of HDRI, is to give users access to experimental data within the framework of a Data-GRID.
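
As an illustration of transferring only part of a large data file, the following minimal Python sketch requests a byte range over HTTP. It assumes that the future portal would serve files over HTTP and honour standard Range requests; this is an assumption and not a decided feature of the prototype. The function names and any URL passed to them are hypothetical.

```python
import urllib.request


def range_header(offset: int, length: int) -> dict:
    """Build an HTTP Range header for `length` bytes starting at `offset`."""
    if offset < 0 or length <= 0:
        raise ValueError("offset must be >= 0 and length must be > 0")
    # Byte ranges are inclusive on both ends, hence the "- 1".
    return {"Range": f"bytes={offset}-{offset + length - 1}"}


def fetch_part(url: str, offset: int, length: int, timeout: float = 30.0) -> bytes:
    """Download only the requested slice of a remote file.

    Assumes the server answers Range requests with status 206
    (Partial Content); any other status is treated as an error so that
    a complete multi-gigabyte file is never downloaded by accident.
    """
    request = urllib.request.Request(url, headers=range_header(offset, length))
    with urllib.request.urlopen(request, timeout=timeout) as response:
        if response.status != 206:
            raise RuntimeError(f"server ignored the Range request (status {response.status})")
        return response.read()
```

A call such as fetch_part(url, 0, 1024) would retrieve the first kilobyte of a file, for example a header block that describes its contents, while the remainder of the file stays at the facility.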

Remote computing: For external users whose home institutes lack the infrastructure to handle such large amounts of data, a platform for remote data treatment will be provided. For this purpose, computing resources and a PNI-wide repository of programs and routines will be established. The repository will cover the basic needs of data treatment and will also include software needed in specific scientific fields (e.g. tomographic reconstruction). Work on this shall start within the framework of the proposed activity but will have to continue beyond it.

Authentication and authorisation: Remote access to data and computing resources requires schemes for authentication and access authorisation. It is envisaged to establish a common mechanism for use by all centres participating in HDRI. Based on a detailed evaluation, one of the available authentication solutions (e.g. OpenID, Shibboleth) will be chosen and implemented at the partner sites. The aim of this effort is that users have to provide their basic contact information only once within PNI.