IPE 09-19 Internship or Master Thesis: Novel database technologies for the large archives with time series data

Karlsruhe Institute of Technology (KIT)

Karlsruhe, Germany

Work group:

Institute for Data Processing and Electronics (IPE)

Area of research:

Work placement

Job description:

Modern scientific experiments require complex, distributed detector and control systems. The instrumentation integrates custom and commercial components from various sources and generates ever-increasing amounts of data. A variety of formats, underlying storage engines, and data workflows are in use. Proper manual data interpretation and quality assurance are often difficult or even impossible because of the tremendous increase in both the number and the size of datasets. This raises the need for novel automatic or semi-automatic data analysis methods and tools. Information on operation and scientific meaning needs to be extracted from the data stream and provided to users in a visual and easy-to-interpret form.

The work is embedded in a project that aims to develop a novel platform for handling the data management tasks of mid-range scientific experiments. We plan to build tools that integrate the data recorded by different subsystems and make it available to users in a uniform, comprehensible, and easy-to-use fashion. The thesis focuses on the data storage subsystem. The student is expected to review novel database technologies and provide a detailed evaluation of several possible solutions that are optimized for storing high volumes of time series data. The selected engine should be integrated with the existing data management system operating at the Aragats Space Environmental Center.

The ideal database will:

Reliably store high-bandwidth data streams;
Scale well in a cluster environment;
Include intelligent caching mechanisms to speed up queries;
Extract standard statistical information and provide a programming interface to compute custom properties;
Support geo-distributed operation modes;
Integrate with data analysis tools such as Apache Spark.

Good background in systems engineering and cloud technologies. Good programming skills, preferably with prior experience in Python. Very good understanding of relational and NoSQL database technologies.

Contract Duration

limited, according to the study regulations

Contact person

Suren Chilingaryan, IPE, Phone: +49 721 / 608 26579 (suren.chilingaryan@kit.edu)

Andreas Kopmann, IPE, Phone: +49 721 / 608 24910 (andreas.kopmann@kit.edu)


Apply with CV and Cover Letter

Must be a .doc, .docx, or .pdf file and no larger than 1 MB
