Apache Druid On Kubernetes - Part 1
Introduction
Let me start off with a quick personal introduction and then move to the topic at hand. I have
Introduction to The World of Data - (OLTP, OLAP, Data Warehouses, Data Lakes and more)
[This article was originally published here]
In this article, I hope to paint a picture of the modern data world
Hands-On Introduction to Apache Iceberg - Data Lakehouse Engineering
[This article was originally published here]
As a Developer Advocate for Dremio, I spend a lot of time doing research
Streaming data on object storage: Thoughts
Object stores are the gold standard for cloud-native data persistence. So it is natural to want to store streaming
SQL Query on MinIO
Full-fledged analytical applications, AI and ML workloads, and dashboards all need a high-performance query engine that understands standard SQL.
SQL Query on Parquet Files with DataFusion
The Rust big data ecosystem is all set for the big time, with Arrow and its surrounding
ecosystem (DataFusion, Ballista) leading the pack.
Big Data ecosystem turning to Rust: an overview
Java is synonymous with the last generation of Big Data tools and technologies. But
a lot has changed since the 2000s. Latest
The Curious Case of Small Files
Background
Most files, by virtue of their average size and usage patterns, are
clearly cut out for
Streaming Data Tools & Techniques
Introduction
Streaming data is exactly what it sounds like: a continuously flowing stream of
data generated by one or multiple
Deploy Spark on Kubernetes
Introduction
YARN has been the default orchestration platform for tools from the Hadoop
ecosystem. This has started to change in recent times.