Data Hut™ — Open Source Project Directory
Curated insights on the most popular data science and data engineering projects. By combining machine learning techniques with expert knowledge, we help you to understand the open source landscape and to pick the best software for your needs.
Data Hut News (December 05, 2022): December updates, with three new projects: DiCE (explainability), Daft (DataFrame), and Polars (DataFrame). Happy Holidays!
For site and project updates, follow us on Twitter: @datahutai
Projects by Category
| Category | Description | Projects | 
|---|---|---|
| | Tools for transforming and analyzing the largest data sets. | 23 |
| | Tools for statistical analysis and machine learning. | 85 |
| | Data repositories. | 39 |
| | Processing data as networks of interconnected nodes. | 25 |
To jump directly to a project or find one by keyword, use the search page or the search box above.
Popular Communities and Project Backers
| Community | Website | Description | 
|---|---|---|
| Apache | | Apache is the world’s largest open source foundation, with over 300 top-level projects. |
| Facebook | | As the world’s largest social network, Facebook has created and sponsored a wide range of open source projects. |
| Google | | As a multinational technology company, Google has created and sponsored over 2,000 open source projects in a wide range of areas, from programming languages to UI frameworks to machine learning. |
| NumFOCUS | | NumFOCUS is a 501(c)(3) public charity founded in 2012 to provide a fiscal umbrella for many open source software projects that have become essential for science and research. NumFOCUS-sponsored projects benefit from a range of services, including fiscal, legal, and operational support. |
Latest News
| Date | Topic | Description | 
|---|---|---|
| 2022-12-02 | | v2.0.0 Major new release with over 350 enhancements and bug fixes. |
| 2022-12-02 | | RayDP-0.6.0 Highlights: Support Ray 1.9.0 - 2.1.0; Support Spark 3.1 - 3.3; Spark master node affinity; Updated … |
| 2022-12-02 | | Cortex 1.14.0 This release contains 115 contributions from 28 contributors. Thank you! Some notable changes release … |
| 2022-12-02 | | Apache IoTDB 1.0.0 New Features: New architecture that supports standalone and cluster mode with two types of nodes … |
| 2022-11-18 | | 1.1.0 Native support for TensorFlow Decision Forests in TensorFlow Serving; Add support for zipped Yggdrasil … |
| 2022-11-17 | | v0.6.0 We’re happy to announce the AutoGluon 0.6 release. 0.6 contains major enhancements to Tabular, Multimodal, … |
| 2022-11-16 | | v0.4.0 Address existing deprecations; deprecate async submodule; add new examples & example cleanup; add failure … |
| 2022-11-14 | | TensorFlow 2.11.0 New features in tf.lite, tf.keras, tf.Variable, and tf.SavedModel. Introduced … |
| 2022-11-14 | | Keras Release 2.11.0 See the TensorFlow release notes for details on the 2.11 release. |
| 2022-11-08 | | Ray-2.1.0 Ray AI Runtime (AIR): Better support for Image-based workloads; Ability to read TFRecord input; Ray Serve: … |
| 2022-11-07 | | v0.14.0 Highlights: serialization and deserialization of all sktime objects via save method & base.load; documented … |
| 2022-11-01 | | v2.5.0 Features: Updated user interface (UI); Allow for incremental changes to fields.idx. Also security patches, … |
