Apache Hudi

Data Lake Platform

Streaming data lake platform for incremental data processing.

Open Source Project

Founded 2016

Product Overview

Apache Hudi is an open-source data lakehouse platform that brings database-like capabilities to data lakes. It provides incremental data processing, ACID transactions, and automated table services for managing large-scale analytical datasets.

Unique Value Proposition

Hudi is an incremental processing framework that claims up to 10x efficiency gains through record-level updates and automated table optimization. It combines streaming and batch processing, with support for change data capture (CDC), deduplication, and time travel queries.
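
To make record-level updates and deduplication concrete, the following is a toy model in plain Python, not Hudi code. It assumes each record has a unique key and an ordering value, and that for each key the record with the highest ordering value wins; the names `upsert`, `id`, and `ts` are hypothetical and chosen only for illustration.

```python
from typing import Any, Dict, Iterable

Record = Dict[str, Any]


def upsert(table: Dict[Any, Record], batch: Iterable[Record],
           key_field: str = "id", ordering_field: str = "ts") -> Dict[Any, Record]:
    """Toy record-level upsert with deduplication (illustrative only).

    For each incoming record, insert it if the key is new, or replace the
    stored record if the incoming ordering value is at least as large.
    Older duplicates inside the same batch are therefore dropped.
    """
    result = dict(table)  # shallow copy so the input table is left untouched
    for rec in batch:
        key = rec[key_field]
        current = result.get(key)
        if current is None or rec[ordering_field] >= current[ordering_field]:
            result[key] = rec
    return result


# Hypothetical sample data: one existing row, then a batch with a duplicate key.
table = {"a": {"id": "a", "ts": 1, "fare": 10.0}}
batch = [
    {"id": "a", "ts": 3, "fare": 12.5},  # newest version of key "a"
    {"id": "a", "ts": 2, "fare": 11.0},  # older duplicate, ignored
    {"id": "b", "ts": 1, "fare": 7.0},   # new key, inserted
]
merged = upsert(table, batch)
```

Only the keys present in the batch change; every other record is carried over as-is.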

Target Market

Industries

Technology

Financial Services

E-commerce

Transportation

Media & Entertainment

Telecommunications

Company Size

1,000 to 50,000 employees

Key Features

Incremental Processing

Record-Level Updates

ACID Transactions

Time Travel

Automatic Compaction

Clustering

Z-ordering

Multi-Modal Indexing

CDC Support

Data Deduplication

Merge on Read

Copy on Write

Automatic Table Services

Schema Evolution

Savepoint & Rollback
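
Several of the features above (Copy on Write, Merge on Read, Incremental Processing, Time Travel) are selected through options on Hudi's Spark datasource. The sketch below builds those option dictionaries in plain Python. The option keys are the ones used in the Hudi Spark quick start for 0.x releases and may differ in other versions, so check them against the documentation for the release in use; the helper function names and the example values (`trips`, `uuid`, `ts`, `city`) are hypothetical.

```python
from typing import Dict

TABLE_TYPES = {"COPY_ON_WRITE", "MERGE_ON_READ"}


def hudi_write_options(table_name: str, record_key: str, ordering_field: str,
                       partition_field: str,
                       table_type: str = "COPY_ON_WRITE") -> Dict[str, str]:
    """Build write options for an upsert (hypothetical helper)."""
    if table_type not in TABLE_TYPES:
        raise ValueError(f"unknown table type: {table_type}")
    return {
        "hoodie.table.name": table_name,
        "hoodie.datasource.write.recordkey.field": record_key,
        "hoodie.datasource.write.precombine.field": ordering_field,
        "hoodie.datasource.write.partitionpath.field": partition_field,
        "hoodie.datasource.write.operation": "upsert",
        "hoodie.datasource.write.table.type": table_type,
    }


def hudi_incremental_read_options(begin_instant: str) -> Dict[str, str]:
    """Read only records committed after begin_instant (hypothetical helper)."""
    return {
        "hoodie.datasource.query.type": "incremental",
        "hoodie.datasource.read.begin.instanttime": begin_instant,
    }


def hudi_time_travel_options(as_of_instant: str) -> Dict[str, str]:
    """Read the table as of a past instant (hypothetical helper)."""
    return {"as.of.instant": as_of_instant}


write_opts = hudi_write_options("trips", "uuid", "ts", "city",
                                table_type="MERGE_ON_READ")
incr_opts = hudi_incremental_read_options("20240101000000")
tt_opts = hudi_time_travel_options("20240101000000")

# Assuming a Spark session with the Hudi bundle on its classpath, these
# dictionaries would typically be passed as:
#   df.write.format("hudi").options(**write_opts).mode("append").save(base_path)
#   spark.read.format("hudi").options(**incr_opts).load(base_path)
#   spark.read.format("hudi").options(**tt_opts).load(base_path)
```

Keeping the options in small builder functions makes the choice between Copy on Write and Merge on Read an explicit, validated parameter rather than a string scattered across jobs.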

Integrations

Apache Spark

Apache Flink

Presto

Trino

Apache Hive

Amazon Athena

Amazon EMR

Amazon Redshift

Debezium

Apache Kafka

Apache Pulsar

dbt

Apache Airflow

Databricks

Snowflake

API Available

Security Features

Encryption Support

Access Control Integration

Audit Trail

Column Masking

Implementation & Support

Implementation Time

Approximately 45 days (6 to 7 weeks)

Deployment Options

Cloud

On-Premise

Hybrid

Support Hours

Community Support