IPE 12-19 Internship or Master Thesis: Scalable storage technologies for large archives of multi-dimensional imaging data

Karlsruhe Institute of Technology (KIT) - KIT - Helmholtz Association

Karlsruhe, Germany

Work group:

Institute for Data Processing and Electronics (IPE)

Area of research:

Work placement

Job description:

An ever-increasing number of large tomographic volumes is recorded at synchrotron facilities worldwide. Because data sizes have grown drastically, facilities have recently begun to provide data analysis services as well. Although high-speed clustered storage is used to store the data sets, it remains a challenge to deliver the imaging data fast enough to keep applications interactive.

The master thesis will be carried out within an international project that aims to build a cloud-based infrastructure for synchrotron light sources, enabling remote data analysis and visualization. The objective of the thesis is to develop a data management system that scales with both the number of user requests and the data volume. The student is expected to evaluate existing big-data technologies and to propose a distributed storage engine, a data reduction framework, and data compression and caching strategies. Additionally, it may be desirable to develop a spatially aware data layout optimized for reading sub-volumes from very large volumes stored on magnetic storage, where read performance is largely limited by high seek latencies.
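To illustrate why the data layout matters for sub-volume reads on seek-limited storage, the following minimal sketch compares a plain row-major file with a chunked layout. It is an illustration under stated assumptions, not part of the project: the function names, the (z, y, x) axis order, the 64-voxel chunk edge, and the simplification that every touched chunk costs one seek are all hypothetical.

```python
from math import prod


def row_major_seeks(vol_shape, size):
    """Contiguous runs needed to read a sub-volume from a row-major (z, y, x) file.

    Assumption: one seek per contiguous run of voxels.
    """
    _, vol_y, vol_x = vol_shape
    dz, dy, dx = size
    if dx < vol_x:
        return dz * dy      # every row of the sub-volume is a separate run
    if dy < vol_y:
        return dz           # full rows merge into one run per slice
    return 1                # full slices merge into a single run


def chunked_seeks(chunk_shape, offset, size):
    """Number of chunks touched by the sub-volume.

    Assumption: each chunk is stored contiguously and costs one seek.
    Chunks that happen to be adjacent on disk could be merged, so this
    is an upper bound.
    """
    return prod(
        (o + d - 1) // c - o // c + 1
        for o, d, c in zip(offset, size, chunk_shape)
    )


def read_amplification(chunk_shape, offset, size):
    """Voxels read (whole chunks) divided by voxels actually requested."""
    touched = chunked_seeks(chunk_shape, offset, size)
    return touched * prod(chunk_shape) / prod(size)


if __name__ == "__main__":
    volume = (2048, 2048, 2048)   # hypothetical tomographic volume
    chunk = (64, 64, 64)          # hypothetical chunk shape
    offset = (100, 200, 300)      # sub-volume origin, not chunk-aligned
    size = (256, 256, 256)        # sub-volume extent

    print("row-major seeks:", row_major_seeks(volume, size))
    print("chunked seeks:  ", chunked_seeks(chunk, offset, size))
    print("amplification:  ", read_amplification(chunk, offset, size))
```

Under these assumptions the chunked layout trades a modest amount of extra data read (whole chunks instead of exact voxels) for far fewer seeks. Choosing the chunk shape, and ordering chunks on disk so that spatial neighbours stay close together, are the kind of design questions the thesis would address.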

The student is expected to be familiar with Linux network administration and to have basic knowledge of clustering. The student should understand the relational and MapReduce models and be aware of current trends in database technology. Prior experience with TileDB and Apache Spark is a plus.

Contract Duration

limited, according to the study regulations

Contact person in line-management

Suren Chilingaryan, IPE, Phone: +49 721 / 608 26579 (suren.chilingaryan@kit.edu)

Andreas Kopmann, IPE, Phone: +49 721 / 608 24910 (andreas.kopmann@kit.edu)

