DataEngConf Reflection

Sign board next to a white building saying "This way to DataEngConf"

DataEngConf is a unique opportunity for data engineers and data scientists to mingle and share experiences and insights. Put together by Hakka Labs, whose mission is to organize, foster and educate data communities, this year’s conference was held in NYC.


The highlight of the conference was the presentation by Hilary Mason, titled “Data Science: Past, Present, and Future”. Ms. Mason is a nationally known figure in the data science community. Her talk acknowledged how far we’ve come in the fields of data engineering and data science and laid out where we are headed. She is also a leading advocate for building up and connecting the data community in NYC.

At most conferences the opportunity to interact with presenters is limited, but DataEngConf remedied this problem by asking each speaker to host “office hours” immediately after their presentation. While this does create time conflicts with other sessions, it also allows for deeper questioning and dialogue with the presenters, which was fantastic.


During her office hours, Ms. Mason gave me advice on fostering the data professionals community here in Madison. I have coordinated the BigDataMadison meetup for the past few years, and she advised me to make a particular effort to be inclusive of different types of individuals and to cater to multiple needs. She also suggested scheduling “data drinks”, which are invite-only, unstructured meetings held in a social atmosphere to promote sharing and fun around data.

Presentations Summary

Group of people eating lunch in a building with steel beams

Within the data engineering track there were three types of talks:
1. Technology specific talks: Parquet/Arrow, Kudu, Kafka Streams
2. Use case/implementation talks: Spotify, Buzzfeed, Basho
3. Advice/prognostication: future of Python in data wrangling, a career panel, computational social science, using data science for social good

My top three talks were:
Data Science: Past, Present, and Future by Hilary Mason
The Evolution of Data Processing At Spotify by Erin Miller
The Future of Column-Oriented Data Processing with Arrow and Parquet by Julien Le Dem

All in all, a very enjoyable experience. Thanks to Hakka Labs for organizing the conference and my employer, Earthling Interactive, for subsidizing the costs of travel and attendance.


Man in his 40s riding a Citibike on a sidewalk at night