Apache Hudi

Data Lake Platform

Streaming data lake platform for incremental data processing.

Open Source Project
Founded 2016

Product Overview

Apache Hudi

Apache Hudi is an open-source data lakehouse platform that brings database-like capabilities to data lakes. It provides incremental data processing, ACID transactions, and automated table services for managing large-scale analytical datasets.

Unique Value Proposition

Hudi is an incremental processing framework. The project claims efficiency gains of up to 10x from record-level updates and automated table optimization. It combines streaming and batch processing in one platform and supports change data capture (CDC), deduplication, and time travel queries.
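To make "record-level updates" and "deduplication" concrete, here is a toy model in plain Python. It is a sketch of the general idea only, not Hudi's implementation: each record carries a key and an ordering value, and an upsert keeps the most recent version per key. The names `upsert`, `key_field`, and `ordering_field` are hypothetical and chosen for illustration.

```python
from typing import Dict, Iterable

Record = Dict[str, object]


def upsert(
    table: Dict[object, Record],
    batch: Iterable[Record],
    key_field: str = "id",
    ordering_field: str = "ts",
) -> Dict[object, Record]:
    """Toy record-level upsert: per key, the record with the highest
    ordering value wins. Illustrative only, not Hudi's actual algorithm."""
    merged = dict(table)  # copy so the caller's table is not mutated
    for rec in batch:
        key = rec[key_field]
        current = merged.get(key)
        # Accept the incoming record if the key is new, or if the record
        # is at least as recent as the version already stored.
        if current is None or rec[ordering_field] >= current[ordering_field]:
            merged[key] = rec
    return merged


table: Dict[object, Record] = {}
table = upsert(table, [
    {"id": "a", "ts": 1, "fare": 10.0},
    {"id": "b", "ts": 1, "fare": 20.0},
    {"id": "a", "ts": 2, "fare": 12.5},  # same key, newer: replaces ts=1
])
# A late-arriving, older version of "b" does not overwrite the stored one.
table = upsert(table, [{"id": "b", "ts": 0, "fare": 99.0}])
```

Only the affected keys change between batches, which is the property that lets a downstream job process just the changed records instead of rescanning the whole table.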

Categories

Data Lakehouse
Table Format
Incremental Processing
Storage Management
Stream Processing

Target Market

Industries

Technology
Financial Services
E-commerce
Transportation
Media & Entertainment
Telecommunications

Company Size

1000 - 50000 employees

Pricing information not available. Contact vendor for details.

Key Features

Incremental Processing
Record-Level Updates
ACID Transactions
Time Travel
Automatic Compaction
Clustering
Z-ordering
Multi-Modal Indexing
CDC Support
Data Deduplication
Merge on Read
Copy on Write
Automatic Table Services
Schema Evolution
Savepoint & Rollback
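
Several of the features above (Copy on Write, Merge on Read, record-level upserts, incremental reads) are typically selected through configuration options on the Spark datasource. The sketch below builds those option dictionaries in plain Python. The option keys are the ones commonly documented for Hudi's Spark datasource, but they can differ between releases, so treat them as assumptions to check against your Hudi version. The helper names `hudi_write_options` and `hudi_incremental_read_options` are hypothetical.

```python
from typing import Dict

# Table types named in the feature list above.
TABLE_TYPES = {"COPY_ON_WRITE", "MERGE_ON_READ"}


def hudi_write_options(
    table_name: str,
    record_key: str,
    precombine_field: str,
    partition_field: str,
    table_type: str = "COPY_ON_WRITE",
    operation: str = "upsert",
) -> Dict[str, str]:
    """Build write options for a Hudi table (keys assumed from Hudi docs)."""
    if table_type not in TABLE_TYPES:
        raise ValueError(f"unknown table type: {table_type}")
    return {
        "hoodie.table.name": table_name,
        "hoodie.datasource.write.recordkey.field": record_key,
        "hoodie.datasource.write.precombine.field": precombine_field,
        "hoodie.datasource.write.partitionpath.field": partition_field,
        "hoodie.datasource.write.table.type": table_type,
        "hoodie.datasource.write.operation": operation,
    }


def hudi_incremental_read_options(begin_instant: str) -> Dict[str, str]:
    """Build read options that request only changes after a commit instant."""
    return {
        "hoodie.datasource.query.type": "incremental",
        "hoodie.datasource.read.begin.instanttime": begin_instant,
    }


write_opts = hudi_write_options(
    table_name="trips",            # hypothetical table and column names
    record_key="trip_id",
    precombine_field="updated_at",
    partition_field="city",
    table_type="MERGE_ON_READ",
)
read_opts = hudi_incremental_read_options("20240101000000")

# With a Spark session and the Hudi bundle on the classpath, these would be
# passed along the lines of:
#   df.write.format("hudi").options(**write_opts).mode("append").save(path)
#   spark.read.format("hudi").options(**read_opts).load(path)
```

Keeping the options in small builder functions makes the choice between Copy on Write and Merge on Read a single argument, which is convenient when comparing the two table types on the same dataset.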

Integrations

Apache Spark
Apache Flink
Presto
Trino
Apache Hive
AWS Athena
AWS EMR
Amazon Redshift
Debezium
Apache Kafka
Apache Pulsar
dbt
Apache Airflow
Databricks
Snowflake
API Available

Security Features

Encryption Support
Access Control Integration
Audit Trail
Column Masking

Implementation & Support

Implementation Time

Roughly 6 to 7 weeks (45 days)

Deployment Options

Cloud
On-Premise
Hybrid

Support Hours

Community Support