Apache Doris is an open-source real-time data warehouse that graduated from the Apache Incubator last year and has a user base of over 2,500 enterprises. It collects data from various sources, including relational databases and IoT devices, and supports report generation, ad hoc analysis, and federated queries. Doris is known for its high performance, as shown in benchmarking results against Presto, Greenplum, and ClickHouse, and its performance has increased by over 10 times in the past two years. This performance is attributed to its cost-based query optimizer, fully vectorized execution engine, and MPP architecture.

The speaker further discusses Doris's architecture and features, including its data-driven query execution model, rich collection of indexes, materialized views, and caching mechanism. Doris supports both merge-on-read and merge-on-write for data updates and offers optimizations for schema-free data. It can achieve a data latency of minutes and optimizes resource usage through workload groups. It is also compatible with popular tools and supports quick schema changes.

Compared to other data lakehouse solutions like Trino, Doris is reportedly three to five times faster due to its efficient query engine and use of stateless compute nodes. Doris also allows users to write computation results of external tables into Doris as views and supports tiered storage, potentially reducing storage costs by around 70%. It also offers snapshot backup and restoration and cross-cluster replication, and supports various data ingestion methods.
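The summary names merge-on-read and merge-on-write without saying how they differ. The following is a minimal Python sketch of the general trade-off, not Doris's actual implementation: the class names `MergeOnReadTable` and `MergeOnWriteTable`, the in-memory lists, and the version counter are all hypothetical and chosen only for illustration. In the first class, writes are cheap appends and each read deduplicates by key; in the second, each write marks the superseded row as deleted so that reads can skip it without merging.

```python
from typing import Dict, List, Optional, Set, Tuple


class MergeOnReadTable:
    """Hypothetical toy table: writes append versions, reads merge them."""

    def __init__(self) -> None:
        self.rows: List[Tuple[str, int, str]] = []  # (key, version, value)
        self._version = 0

    def upsert(self, key: str, value: str) -> None:
        # Cheap write: append a new version, never touch older rows.
        self._version += 1
        self.rows.append((key, self._version, value))

    def scan(self) -> Dict[str, str]:
        # Expensive read: keep only the highest version seen for each key.
        latest: Dict[str, Tuple[int, str]] = {}
        for key, version, value in self.rows:
            if key not in latest or version > latest[key][0]:
                latest[key] = (version, value)
        return {key: value for key, (_, value) in latest.items()}


class MergeOnWriteTable:
    """Hypothetical toy table: writes mark old rows deleted, reads skip them."""

    def __init__(self) -> None:
        self.rows: List[Tuple[str, str]] = []  # (key, value)
        self.deleted: Set[int] = set()         # positions of superseded rows
        self.index: Dict[str, int] = {}        # key -> position of the live row

    def upsert(self, key: str, value: str) -> None:
        # Costlier write: look up the live row for this key and mark it deleted.
        old: Optional[int] = self.index.get(key)
        if old is not None:
            self.deleted.add(old)
        self.index[key] = len(self.rows)
        self.rows.append((key, value))

    def scan(self) -> Dict[str, str]:
        # Cheap read: no merging, only a filter on the delete marks.
        return {
            key: value
            for pos, (key, value) in enumerate(self.rows)
            if pos not in self.deleted
        }


if __name__ == "__main__":
    for table in (MergeOnReadTable(), MergeOnWriteTable()):
        table.upsert("user_1", "bronze")
        table.upsert("user_2", "silver")
        table.upsert("user_1", "gold")  # update an existing key
        print(type(table).__name__, table.scan())
```

Both tables return the same result for the same sequence of upserts; they differ only in where the merge cost is paid. As a general rule, merge-on-read suits write-heavy workloads, while merge-on-write suits workloads where queries dominate.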