The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.
Unified and privacy-centric event data collection for digital analytics
The open source high performance ELT framework powered by Apache Arrow
ODK Collect is an Android app for filling out forms. It's been used to collect billions of data points in challenging environments around the world. Contribute and make the world a better place! ✨📋✨
This project guides you through processing data from CSV to JSON format using Python. You'll learn to cleanse, validate, and transform data with pandas, numpy, csv, and json libraries, ensuring it's ready for POS system integration. This will help improve data integrity and streamline integration.
ODK Web Forms enables form filling and submission editing of ODK forms in a web browser. It's coming soon! ✨
This repository contains the project materials for optimising e-commerce conversion rates through comprehensive data analysis. Leveraging SQL, MySQL, Power BI, and other tools, explore key factors influencing website performance. From data collection to actionable insights and recommendations,
This repository automates the scraping of every film shown on Spanish public TV, using rvest and GitHub Actions.
Fast and differentiable particle accelerator optics simulation for reinforcement learning and optimisation applications.
Georegistry + Data Collection + Microplanning
An AI-driven crop prediction system that applies machine learning to weather, soil, and crop data to predict crop health and yield. Its predictions are meant to help farmers make data-driven decisions and improve their farming practices.
70+ CLI tools to build, browse, and blend your media library. An index for your archive.
Jitsu is an open-source Segment alternative. Fully-scriptable data ingestion engine for modern data teams. Set up a real-time data pipeline in minutes, not days.
Experiments in Open Source Repository Metrics
An electronic data capture platform designed for administering remote and in-person clinical instruments, including both interactive tasks and forms
Collect POST requests
Python library and web service for Open Source Software Health and Sustainability metrics & data collection. You can find our documentation and new contributor information easily here: https://oss-augur.readthedocs.io/en/main/ and learn more about Augur at our website https://augurlabs.io
Implements word embeddings to explore correlations among words, and designs a sequence model for generating text.
This project implements both exact and approximate inference techniques for Bayesian Networks using enumeration and rejection sampling, respectively. It processes Bayesian Network structures in XMLBIF format, accepting command-line inputs to compute the posterior distribution of a query variable given observed evidence.
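As a rough illustration of the two techniques named in the last description, the sketch below compares inference by enumeration with rejection sampling on a tiny Boolean network. It is not taken from that repository: the network (`Rain`, `Sprinkler`, `WetGrass`), its probabilities, and the function names are all hypothetical, and XMLBIF parsing and command-line handling are skipped in favour of a hard-coded structure.

```python
import random
from itertools import product

# Hypothetical toy network (an assumption for illustration only).
# Each entry: (variable, parents, CPT mapping parent values -> P(variable=True)).
# Variables are listed in topological order, parents before children.
NETWORK = [
    ("Rain", [], {(): 0.2}),
    ("Sprinkler", [], {(): 0.3}),
    ("WetGrass", ["Rain", "Sprinkler"], {
        (True, True): 0.99, (True, False): 0.9,
        (False, True): 0.8, (False, False): 0.05,
    }),
]


def joint_probability(assignment):
    """Probability of one full assignment: product of the CPT entries."""
    p = 1.0
    for name, parents, cpt in NETWORK:
        p_true = cpt[tuple(assignment[q] for q in parents)]
        p *= p_true if assignment[name] else 1.0 - p_true
    return p


def enumerate_posterior(query, evidence):
    """Exact inference: sum the joint over all hidden variables, then normalise."""
    names = [name for name, _, _ in NETWORK]
    hidden = [n for n in names if n != query and n not in evidence]
    weights = {}
    for query_value in (True, False):
        total = 0.0
        for values in product((True, False), repeat=len(hidden)):
            assignment = dict(evidence)
            assignment[query] = query_value
            assignment.update(zip(hidden, values))
            total += joint_probability(assignment)
        weights[query_value] = total
    norm = sum(weights.values())
    return {value: w / norm for value, w in weights.items()}


def rejection_sample_posterior(query, evidence, n_samples, rng):
    """Approximate inference: sample from the prior, discard samples that
    contradict the evidence, and count query values among the rest."""
    counts = {True: 0, False: 0}
    for _ in range(n_samples):
        sample = {}
        for name, parents, cpt in NETWORK:
            p_true = cpt[tuple(sample[q] for q in parents)]
            sample[name] = rng.random() < p_true
        if all(sample[var] == val for var, val in evidence.items()):
            counts[sample[query]] += 1
    accepted = sum(counts.values())
    if accepted == 0:
        raise ValueError("no samples were consistent with the evidence")
    return {value: c / accepted for value, c in counts.items()}


if __name__ == "__main__":
    evidence = {"WetGrass": True}
    print("exact :", enumerate_posterior("Rain", evidence))
    print("approx:", rejection_sample_posterior("Rain", evidence, 20000, random.Random(0)))
```

Rejection sampling wastes every sample that disagrees with the evidence, so its estimate gets noisier as the evidence becomes less likely; enumeration stays exact but its cost grows exponentially with the number of hidden variables.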