Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)
Updated Jun 6, 2024 - Java
Scalable, redundant, and distributed object store for Apache Hadoop
CDP Public Cloud is an integrated analytics and data management platform deployed on cloud services. It offers broad data analytics and artificial intelligence functionality along with secure user access and data governance features.
H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.
Common Crawl fork of Apache Nutch
[Practice Hadoop Programming Projects] This repository collects 2 programming projects for Hadoop.
[Hadoop Programming Courses] This repository collects 3 programming courses for Hadoop.
Apache Kyuubi is a distributed and multi-tenant gateway to provide serverless SQL on data warehouses and lakehouses.
A large-scale entity and relation database supporting aggregation of properties
AI on Hadoop
[Practice 82 Hadoop Free Tutorials] This repository collects 82 free tutorials for Hadoop. It offers comprehensive tutorials and hands-on labs for learners of all levels, from students to professionals and enthusiasts.