spark-sql

Here are 752 public repositories matching this topic...

apache / incubator-gluten

Gluten is a middle layer responsible for offloading JVM-based SQL engines' execution to native engines.

arrow clickhouse simd vectorization spark-sql velox

Updated May 29, 2024
Scala

mohankrishna02 / interview-scenerios-spark-sql

This repository collects scenario-based questions that I have encountered in interviews. The questions simulate real-world scenarios and test your problem-solving and technical skills. Working through them gives insight into common interview topics and helps you prepare for similar challenges.

sql spark pyspark spark-sql

Updated May 29, 2024
Scala

apache / kyuubi

Apache Kyuubi is a distributed and multi-tenant gateway to provide serverless SQL on data warehouses and lakehouses.

kubernetes sql spark hive hadoop jdbc thrift data-lake hacktoberfest spark-sql

Updated May 29, 2024
Scala

antoniosilv-l / Spark-dataStack

A repository for setting up a Spark environment for developing data pipelines.

python dockerfile spark pyspark spark-sql

Updated May 29, 2024
Dockerfile

almond-sh / almond

A Scala kernel for Jupyter

scala spark jupyter repl jupyter-notebook jupyter-kernels spark-sql

Updated May 28, 2024
Scala

getredash / redash

Make Your Company Data Driven. Connect to any data source, easily visualize, dashboard and share your data.

visualization javascript mysql python bigquery bi spark dashboard athena analytics postgresql business-intelligence redash redshift databricks hacktoberfest spark-sql

Updated May 28, 2024
Python

muhammad-ahsan / spark-toolbox

Spark-based applications for big data analytics

python data big-data spark yarn hadoop mllib data-analysis spark-sql aws-emr-serverless

Updated May 28, 2024
Python

asuiu / SparkORM

ORM for Apache Spark and DataFrames schema manager

python sqlalchemy orm spark python3 pyspark spark-orm spark-sql pyspark-python sqlalchemy-orm sparkql

Updated May 28, 2024
Python

Qbeast-io / qbeast-spark

Qbeast-spark: DataSource enabling multi-dimensional indexing and efficient data sampling. Big Data, free from the unnecessary!

scala big-data spark sampling datasource spark-sql data-lakehouse

Updated May 28, 2024
Scala

AlexRogalskiy / spark-patterns

🏆 Spark4You Design patterns

patterns spark ebook spark-streaming spark-sql spark-structured-streaming patterns-design

Updated May 28, 2024
Shell

ploomber / jupysql

Better SQL in Jupyter. 📊

mysql python bigquery postgres data-science sql presto hive jupyter clickhouse sqlite snowflake data-engineering redshift tsql spark-sql trino duckdb polars

Updated May 27, 2024
Python

mtumilowicz / big-data-scala-spark-batch-workshop

Introduction to Spark Batch processing.

big-data workshop spark workshop-materials batch-processing spark-sql big-data-processing

Updated May 27, 2024
Scala

groda / big_data

Tutorials on Big Data essentials: Hadoop, MapReduce, Spark.

docker big-data spark hadoop bigdata jupyter-notebook pyspark hadoop-cluster mapreduce gutenberg-ebooks spark-sql mrjob hadoop-hdfs testdfsio mapreduce-bash apache-sedona

Updated May 26, 2024
Jupyter Notebook

japila-books / spark-sql-internals

The Internals of Spark SQL

spark apache-spark book internals spark-sql mkdocs-material

Updated May 26, 2024

sjrusso8 / spark-connect-rs

Apache Spark Connect Client for Rust

spark spark-sql grpc-client spark-connect

Updated May 29, 2024
Rust

felix11736 / felix_spark-home-work

spark-sql google-colab-notebook

Updated May 25, 2024
Jupyter Notebook

fabiogouw / spark-aws-messaging

A custom sink provider for Apache Spark that sends the content of a DataFrame to an AWS SQS queue

spark aws-sqs spark-sql

Updated May 23, 2024
Java

RafaelQSantos-RQS / big_data_complex_problem

The goal of this project is to explore the capabilities of distributed database architectures for handling complex datasets, in particular the "Monthly Account Balance Report", which lists every monthly account balance for customers between Jan 2020 and Dec 2020.

big-data spark databricks spark-sql big-data-analytics

Updated May 23, 2024
Jupyter Notebook

Shankar-Anumula / data-engineer

java scala spark spark-streaming spark-sql

Updated May 21, 2024
Scala

raghul3 / IPL_Data_Analysis

Analysis of a large IPL dataset (data through 2017) using PySpark.

s3-bucket pyspark spark-sql databricks-notebooks

Updated May 21, 2024
Jupyter Notebook
