Earlier this year, Christof Flück, one of our developers, reached a huge personal and professional milestone: he earned a grade B for his thesis, Implementing a Streamed Distributed Data Processing Pipeline.
In this blog post, Christof shares why he chose this area of research, the results of the project, and how it has benefited Comfone.
Thesis: Implementing a Streamed Distributed Data Processing Pipeline
What made you choose this area of research?
I did my Bachelor’s degree in Computer Science with a specialization in “Data Engineering”. In this specialization we were taught about data processing, machine learning, deep learning and much more. The data processing part was the most interesting one for me, so I wanted to do a thesis in that area. I find it fascinating how we can crunch huge amounts of data into small, insightful packages that we can then use to improve our own and/or our customers’ businesses.
We added the keyword “distributed” to the topic because we wanted to delve further into that subject. Roaming traffic keeps growing every year. We’re reaching the point where single machines can’t handle it anymore or, if they can, they sit idle for most of the year until peak times. With a distributable pipeline, we can scale the hardware to accommodate the different levels of traffic with little overhead.
Insights into the project
The project was realized using Apache Flink as the processing engine. We capture traffic and create statistics from it that we can use in our applications such as Hawkeye, IMSI Tracing etc. Because we aggregate the data for Hawkeye, combining different data in a distributed system added a level of complexity. The pipeline I created also works in real time and delivers data faster than the previous approach, in addition to offering the new scaling possibilities.
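To give a rough idea of what "creating statistics out of captured traffic" can look like, here is a minimal sketch in plain Python. It is not the thesis code and does not use the Flink API; it only illustrates the general idea of grouping events by a key and by fixed-size (tumbling) time windows. The `Event` fields, the key values and the function name are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, NamedTuple, Tuple


class Event(NamedTuple):
    # All fields are hypothetical and only serve the illustration.
    timestamp: int  # event time in seconds
    key: str        # grouping key, e.g. an identifier for a roaming partner
    count: int      # measured quantity, e.g. number of messages seen


def tumbling_window_totals(
    events: Iterable[Event], window_seconds: int
) -> Dict[Tuple[str, int], int]:
    """Sum `count` per (key, window start) for fixed-size, non-overlapping windows."""
    totals: Dict[Tuple[str, int], int] = defaultdict(int)
    for event in events:
        # Align the timestamp to the start of the window it falls into.
        window_start = event.timestamp - (event.timestamp % window_seconds)
        totals[(event.key, window_start)] += event.count
    return dict(totals)


sample_events = [
    Event(timestamp=0, key="a", count=1),
    Event(timestamp=59, key="a", count=2),
    Event(timestamp=60, key="a", count=5),
    Event(timestamp=10, key="b", count=7),
]
window_totals = tumbling_window_totals(sample_events, window_seconds=60)
```

A stream processing engine performs this kind of grouping continuously and across machines, and also has to deal with late or out-of-order events, which this sketch ignores.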
We were able to achieve faster speeds by distributing the workload across multiple servers. This showed us that, once we switch over to this new system, we will be able to handle almost any future increase in roaming traffic.
I was awarded a grade B for this thesis, with some deductions in the testing part of the project. So for me personally the thesis was a great success!
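The idea of spreading work across servers can be sketched in a few lines of plain Python. This is an illustrative assumption of how key-based partitioning works in general, not the actual pipeline: records are assigned to a worker by hashing their key, each worker aggregates only its own share, and the partial results are merged at the end. The function names and the use of `zlib.crc32` as the hash are hypothetical choices for this example.

```python
import zlib
from collections import Counter
from typing import Dict, Iterable, List, Tuple

Record = Tuple[str, int]  # (key, value); both are hypothetical example fields


def partition_by_key(records: Iterable[Record], num_workers: int) -> List[List[Record]]:
    """Assign each record to a worker based on a stable hash of its key."""
    partitions: List[List[Record]] = [[] for _ in range(num_workers)]
    for key, value in records:
        # crc32 is stable across runs, so the same key always goes to the same worker.
        index = zlib.crc32(key.encode("utf-8")) % num_workers
        partitions[index].append((key, value))
    return partitions


def aggregate(partition: Iterable[Record]) -> Counter:
    """Work done by a single worker: sum the values per key."""
    totals: Counter = Counter()
    for key, value in partition:
        totals[key] += value
    return totals


def distributed_totals(records: Iterable[Record], num_workers: int) -> Dict[str, int]:
    """Simulate the workers one after another and merge their partial results."""
    merged: Counter = Counter()
    for partition in partition_by_key(records, num_workers):
        merged.update(aggregate(partition))  # Counter.update adds the counts
    return dict(merged)


sample_records = [("a", 1), ("b", 2), ("a", 3), ("c", 4), ("b", 5)]
single_machine = distributed_totals(sample_records, num_workers=1)
four_workers = distributed_totals(sample_records, num_workers=4)
```

Because all records with the same key land on the same worker, the merged result is the same regardless of how many workers are used. That property is what makes it possible to add or remove hardware as traffic changes.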
What impact has this project had on Comfone?
My thesis led to the creation of a new development team, which now focuses on bringing what I started into production. Comfone sees that we need to be able to handle more traffic in the coming years and is working on achieving that.