Open Source Data Management Systems - Page 9

Data Management Systems

  • 1
    Apache Doris

    MPP-based interactive SQL data warehousing for reporting and analysis

    Apache Doris is a modern MPP analytical database. It provides sub-second queries and efficient real-time data analysis. With its distributed architecture, it supports datasets up to the 10 PB level and remains easy to operate. Apache Doris meets a range of data analysis demands, including historical data reports, real-time data analysis, interactive data analysis, and exploratory data analysis. It supports standard SQL and is compatible with the MySQL protocol. The main advantages of Doris are its simplicity (of developing, deploying, and using) and its ability to meet many data serving requirements in a single system. Doris mainly integrates the technology of Google Mesa and Apache Impala; it is based on a column-oriented storage engine and can be accessed from a MySQL client.
    Downloads: 2 This Week
  • 2
    Apache RocketMQ

    Distributed messaging and streaming platform with low latency

    Apache RocketMQ is a distributed messaging and streaming platform with low latency, high performance and reliability, trillion-level capacity, and flexible scalability. Messaging patterns include publish/subscribe, request/reply, and streaming. Financial-grade transactional messages. Built-in fault tolerance and high-availability configuration options based on DLedger. A variety of cross-language clients, such as Java, C/C++, Python, and Go. Pluggable transport protocols, such as TCP, SSL, and AIO. Built-in message tracing capability, which also supports OpenTracing. Versatile big-data and streaming ecosystem integration. Message retroactivity by time or offset. Reliable FIFO and strictly ordered messaging within the same queue. Efficient pull and push consumption models. Million-level message accumulation capacity in a single queue. Multiple messaging protocols, such as JMS and OpenMessaging. Flexible distributed scale-out deployment architecture. Lightning-fast batch message exchange system.
    Downloads: 2 This Week
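The idea of "message retroactivity by time or offset" on a FIFO queue can be pictured with a small toy model. This is only a sketch under stated assumptions: `ReplayableQueue` and its methods are hypothetical names invented for illustration, not the RocketMQ client API, and timestamps are assumed to arrive in non-decreasing order.

```python
import bisect
from dataclasses import dataclass, field
from typing import List


@dataclass
class ReplayableQueue:
    """Toy append-only queue (hypothetical; NOT the RocketMQ API)."""

    _timestamps: List[float] = field(default_factory=list)
    _bodies: List[str] = field(default_factory=list)

    def append(self, timestamp: float, body: str) -> int:
        # Assumption: timestamps are appended in non-decreasing order.
        self._timestamps.append(timestamp)
        self._bodies.append(body)
        return len(self._bodies) - 1  # offset assigned to the new message

    def read_from_offset(self, offset: int) -> List[str]:
        # FIFO: messages come back in the order they were appended.
        return self._bodies[offset:]

    def read_from_time(self, timestamp: float) -> List[str]:
        # Binary search for the first message at or after `timestamp`,
        # then replay from that offset.
        offset = bisect.bisect_left(self._timestamps, timestamp)
        return self.read_from_offset(offset)


queue = ReplayableQueue()
for ts, body in [(10.0, "a"), (20.0, "b"), (30.0, "c")]:
    queue.append(ts, body)

replay_by_offset = queue.read_from_offset(1)
replay_by_time = queue.read_from_time(15.0)
```

Because messages are never removed when read, a consumer can rewind to an earlier offset or timestamp and receive the same messages again in the same order.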
  • 3
    Arize Phoenix

    Uncover insights, surface problems, monitor, and fine tune your LLM

    Phoenix provides ML insights at lightning speed with zero-config observability for model drift, performance, and data quality. Phoenix is an open source ML observability library designed for the notebook. The toolset is designed to ingest model inference data for LLMs, CV, NLP, and tabular datasets. It allows data scientists to quickly visualize their model data, monitor performance, track down issues and insights, and easily export data to improve their models. Deep learning models (CV, LLM, and generative) are an amazing technology that will power many future ML use cases. A large set of these technologies is being deployed into businesses (the real world) in what we consider a production setting.
    Downloads: 2 This Week
  • 4
    AutoAni

    Compilation-free adaptive swf visualization template

    AutoAni is a compilation-free adaptive visualization template designed for creating animated bar chart races and similar time-based visualizations using Flash. It focuses on ease of use for beginners by removing the need to install Flash or compile code—users simply prepare their data and images, place them in the same directory, and open the provided SWF file with any compatible player. The template comes with built-in Source Han Sans fonts, ensuring consistent text rendering without requiring users to install additional fonts. It supports smooth and adaptive animation with variable and uniform speed algorithms, making transitions between data states visually clear and accurate. Configuration is handled through simple CSV files, allowing users to adjust layout, speed, positioning, and formatting without editing the source code directly. AutoAni includes modules such as champion bars and champion portraits, with more features planned.
    Downloads: 2 This Week
  • 5
    BlockArrays.jl

    BlockArrays for Julia

    A block array is a partition of an array into blocks or subarrays; see Wikipedia for a more extensive description. This package has two purposes. First, it defines an AbstractBlockArray interface for block arrays that can be shared among types representing different kinds of block arrays. The advantage is a consistent API for block arrays. Second, it implements two different types of block arrays that follow the AbstractBlockArray interface. The type BlockArray stores each block contiguously, while the type PseudoBlockArray stores the full matrix contiguously. This means that BlockArray supports fast non-copying extraction and insertion of blocks, while PseudoBlockArray supports fast access to the full matrix for use in, for example, a linear solver.
    Downloads: 2 This Week
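The storage trade-off described above can be illustrated with a minimal sketch. It is written in Python with nested lists rather than Julia, and the class names `BlockStorage` and `PseudoBlockStorage` are hypothetical stand-ins for illustration only; they are not the BlockArrays.jl API.

```python
from typing import List

Matrix = List[List[float]]


def _starts(sizes: List[int]) -> List[int]:
    """Turn block sizes [1, 2] into boundary indices [0, 1, 3]."""
    starts = [0]
    for size in sizes:
        starts.append(starts[-1] + size)
    return starts


class BlockStorage:
    """Each block is kept as its own matrix (BlockArray-style layout)."""

    def __init__(self, blocks: List[List[Matrix]]):
        self.blocks = blocks  # blocks[i][j] is block (i, j)

    def get_block(self, i: int, j: int) -> Matrix:
        # No copy: the block already exists as a standalone object.
        return self.blocks[i][j]


class PseudoBlockStorage:
    """One full matrix plus block boundaries (PseudoBlockArray-style layout)."""

    def __init__(self, full: Matrix, row_sizes: List[int], col_sizes: List[int]):
        self.full = full  # the whole matrix is directly available
        self.row_starts = _starts(row_sizes)
        self.col_starts = _starts(col_sizes)

    def get_block(self, i: int, j: int) -> Matrix:
        # In this list-based sketch, slicing copies the block's entries.
        r0, r1 = self.row_starts[i], self.row_starts[i + 1]
        c0, c1 = self.col_starts[j], self.col_starts[j + 1]
        return [row[c0:c1] for row in self.full[r0:r1]]


full = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
pseudo = PseudoBlockStorage(full, row_sizes=[1, 2], col_sizes=[2, 1])
blocked = BlockStorage([
    [[[1.0, 2.0]], [[3.0]]],
    [[[4.0, 5.0], [7.0, 8.0]], [[6.0], [9.0]]],
])
```

Both layouts return the same values for a given block. The difference is which operation is cheap: handing out a block (`BlockStorage`) or handing out the full matrix (`PseudoBlockStorage.full`).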
  • 6
    ChaosTools.jl

    Tools for the exploration of chaos and nonlinear dynamics

    A Julia module that offers various tools for analyzing nonlinear dynamics and chaotic behavior. It can be used as a standalone package, or as part of DynamicalSystems.jl. All further information is provided in the documentation, which you can either find online or build locally by running the docs/make.jl file. ChaosTools.jl is the jack-of-all-trades package of the DynamicalSystems.jl library: methods that are not extensive enough to be a standalone package are added here. You should see the full DynamicalSystems.jl library for other packages that may contain functionality you are looking for but did not find in ChaosTools.jl.
    Downloads: 2 This Week
  • 7
    CleanVision

    Automatically find issues in image datasets

    CleanVision automatically detects potential issues in image datasets like images that are: blurry, under/over-exposed, (near) duplicates, etc. This data-centric AI package is a quick first step for any computer vision project to find problems in the dataset, which you want to address before applying machine learning. CleanVision is super simple -- run the same couple lines of Python code to audit any image dataset! The quality of machine learning models hinges on the quality of the data used to train them, but it is hard to manually identify all of the low-quality data in a big dataset. CleanVision helps you automatically identify common types of data issues lurking in image datasets. This package currently detects issues in the raw images themselves, making it a useful tool for any computer vision task such as: classification, segmentation, object detection, pose estimation, keypoint detection, generative modeling, etc.
    Downloads: 2 This Week
  • 8
    Dask

    Parallel computing with task scheduling

    Dask is a Python library for parallel and distributed computing, designed to scale analytics workloads from single machines to large clusters. It integrates with familiar tools like NumPy, Pandas, and scikit-learn while enabling execution across cores or nodes with minimal code changes. Dask excels at handling large datasets that don’t fit into memory and is widely used in data science, machine learning, and big data pipelines.
    Downloads: 2 This Week
  • 9
    DataChain

    AI-data warehouse to enrich, transform and analyze unstructured data

    Datachain enables multimodal API calls and local AI inferences to run in parallel over many samples as chained operations. The resulting datasets can be saved, versioned, and sent directly to PyTorch and TensorFlow for training. Datachain can persist features of Python objects returned by AI models, and enables vectorized analytical operations over them. The typical use cases are data curation, LLM analytics and validation, image segmentation, pose detection, and GenAI alignment. Datachain is especially helpful if batch operations can be optimized – for instance, when synchronous API calls can be parallelized or where an LLM API offers batch processing.
    Downloads: 2 This Week
  • 10
    DataEase

    Data visualization analysis tool

    DataEase is an open-source data visualization and analysis tool, available to everyone, that helps users quickly analyze data and gain insight into business trends in order to improve and optimize their business. It connects to a wide range of data sources, lets users create charts and dashboards quickly by dragging and dropping, and makes it easy to share results with others. It supports many chart types (Apache ECharts / AntV) and offers both a direct connection mode and a local mode (based on Apache Doris / Kettle). Supported data sources include data warehouses and data lakes, OLAP databases, OLTP databases, Excel data files, and APIs. Open source and open: zero threshold, quick online access and installation; user feedback is picked up quickly, and new versions are released monthly. It supports multiple data-sharing methods to ensure data security.
    Downloads: 2 This Week
  • 11
    DataFramesMeta.jl

    Metaprogramming tools for DataFrames

    Metaprogramming tools for DataFrames.jl objects to provide more convenient syntax. DataFrames.jl has the functions select, transform, and combine, as well as the in-place select! and transform! for manipulating data frames. DataFramesMeta.jl provides the macros @select, @transform, @combine, @select!, and @transform! to mirror these functions with more convenient syntax. Inspired by dplyr in R and LINQ in C#.
    Downloads: 2 This Week
  • 12
    Explorer

    Series (one-dimensional) and dataframes (two-dimensional)

    Explorer brings series (one-dimensional) and data frames (two-dimensional) to Elixir for fast data exploration.
    Downloads: 2 This Week
  • 13
    Foxglove Studio

    Robotics visualization and debugging

    Foxglove Studio is an open-source visualization and debugging tool for robotics. Use customizable layouts to arrange interactive visualizations and quickly understand what your robot is doing. Use Foxglove Studio's rich interactive visualizations to analyze live connections and pre-recorded data. Experience the world as your robot does. Visualize images and point clouds, overlay bounding boxes, add classification labels and planned movements, and drill down into your data with plots or raw message views. Upload recordings to your private data lake for easy storage, searching, and analysis. Stream recorded data directly into Foxglove Studio to get insights into your robots' behavior. We're long-time fans and beneficiaries of open source software. Join our community on GitHub and Slack to contribute bug reports, feature requests, or pull requests.
    Downloads: 2 This Week
  • 14
    GeoNode

    GeoNode is an open source platform for geospatial data

    GeoNode is a geospatial content management system, a platform for the management and publication of geospatial data. It brings together mature and stable open-source software projects under a consistent and easy-to-use interface allowing non-specialized users to share data and create interactive maps. Data management tools built into GeoNode allow for integrated creation of data, metadata, and map visualization. Each dataset in the system can be shared publicly or restricted to allow access to only specific users. Social features like user profiles and commenting and rating systems allow for the development of communities around each platform to facilitate the use, management, and quality control of the data the GeoNode instance contains. It is also designed to be a flexible platform that software developers can extend, modify or integrate against to meet requirements in their own applications.
    Downloads: 2 This Week
  • 15
    Jupyter Notebook Viewer

    A Jupyter notebook viewer for macOS

    A native macOS application to view Jupyter/IPython notebooks.
    Downloads: 2 This Week
  • 16
    Kapacitor

    Open source framework for processing, monitoring, and alerting

    Open source framework for processing, monitoring, and alerting on time series data. Kapacitor is a real-time data processing engine for monitoring and alerting, specifically designed to work with time-series data from InfluxDB.
    Downloads: 2 This Week
  • 17
    Modin

    Scale your Pandas workflows by changing a single line of code

    Scale your pandas workflow by changing a single line of code. Modin uses Ray, Dask or Unidist to provide an effortless way to speed up your pandas notebooks, scripts, and libraries. Unlike other distributed DataFrame libraries, Modin provides seamless integration and compatibility with existing pandas code. Even using the DataFrame constructor is identical. It is not necessary to know in advance the available hardware resources in order to use Modin. Additionally, it is not necessary to specify how to distribute or place data. Modin acts as a drop-in replacement for pandas, which means that you can continue using your previous pandas notebooks, unchanged, while experiencing a considerable speedup thanks to Modin, even on a single machine. Once you’ve changed your import statement, you’re ready to use Modin just like you would pandas.
    Downloads: 2 This Week
  • 18
    Molly.jl

    Molecular simulation in Julia

    Much of science can be explained by the movement and interaction of molecules. Molecular dynamics (MD) is a computational technique used to explore these phenomena, from noble gases to biological macromolecules. Molly.jl is a pure Julia package for MD, and for the simulation of physical systems more broadly. The package is described in a talk at Enzyme Conference 2023 and an earlier talk at the JuliaMolSim minisymposium at JuliaCon 2022. Slides are also available for a tutorial in September 2023.
    Downloads: 2 This Week
  • 19
    Nonlinear Dynamics

    A concise introduction interlaced with code

    This repository holds material related to the textbook Nonlinear Dynamics: A Concise Introduction Interlaced with Code, co-authored by George Datseris and Ulrich Parlitz. The textbook will be published by Springer-Nature, in the series Undergraduate Lecture Notes in Physics.
    Downloads: 2 This Week
  • 20
    POCO

    Cross-platform C++ libraries for building network applications

    The POCO C++ Libraries are powerful cross-platform C++ libraries for building network- and internet-based applications that run on desktop, server, mobile, IoT, and embedded systems. Whether for building automation systems, industrial automation, IoT platforms, air traffic management systems, enterprise IT application and infrastructure management, security and network analytics, automotive infotainment and telematics, finance, or healthcare, C++ developers have trusted the POCO C++ Libraries for 15+ years and deployed them on millions of devices. Create software for connected embedded devices running Linux, Windows Embedded, or QNX. Create cross-platform backends in C++ for iOS and Android applications and combine them with a native or HTML5-based user interface. Create software for IoT devices that talk to cloud backends over HTTP REST APIs. See macchina.io for an IoT platform built with POCO.
    Downloads: 2 This Week
  • 21
    PySyft

    Data science on data without acquiring a copy

    Most software libraries let you compute over the information you own and see inside of machines you control. However, this means that you cannot compute on information without first obtaining (at least partial) ownership of that information. It also means that you cannot compute using machines without first obtaining control over those machines. This is very limiting to human collaboration and systematically drives the centralization of data, because you cannot work with a bunch of data without first putting it all in one (central) place. The Syft ecosystem seeks to change this system, allowing you to write software which can compute over information you do not own on machines you do not have (total) control over. This not only includes servers in the cloud, but also personal desktops, laptops, mobile phones, websites, and edge devices. Wherever your data wants to live in your ownership, the Syft ecosystem exists to help keep it there while allowing it to be used privately.
    Downloads: 2 This Week
  • 22
    RuntimeGeneratedFunctions.jl

    Functions generated at runtime without world-age issues or overhead

    RuntimeGeneratedFunctions are functions generated at runtime without world-age issues and with the full performance of a standard Julia anonymous function. This builds functions in a way that avoids eval. For technical reasons, RuntimeGeneratedFunctions needs to cache the function expression in a global variable within some module. This is normally transparent to the user, but if the RuntimeGeneratedFunction is evaluated during module precompilation, the cache module must be explicitly set to the module currently being precompiled. This is relevant for helper functions in some modules that construct a RuntimeGeneratedFunction on behalf of the user.
    Downloads: 2 This Week
  • 23
    SandDance

    Visually explore, understand, and present your data

    By using easy-to-understand views, SandDance helps you find insights about your data, which in turn help you tell stories supported by data, build cases based on evidence, test hypotheses, dig deeper into surface explanations, support decisions for purchases, or relate data into a wider, real world context. SandDance uses unit visualizations, which apply a one-to-one mapping between rows in your database and marks on the screen. Smooth animated transitions between views help you to maintain context as you interact with your data. This new version of SandDance has been rebuilt from scratch with the goal of being modular, extensible, and embeddable into your custom applications. We are open and driven by the community through contributions, feature requests, and discussion. SandDance was created by the Microsoft Research VIDA Group which explores novel technologies for visualization and immersive data analytics.
    Downloads: 2 This Week
  • 24
    Synapse Machine Learning

    Simple and distributed Machine Learning

    SynapseML (previously MMLSpark) is an open source library to simplify the creation of scalable machine learning pipelines. SynapseML builds on Apache Spark and SparkML to enable new kinds of machine learning, analytics, and model deployment workflows. SynapseML adds many deep learning and data science tools to the Spark ecosystem, including seamless integration of Spark Machine Learning pipelines with the Open Neural Network Exchange (ONNX), LightGBM, The Cognitive Services, Vowpal Wabbit, and OpenCV. These tools enable powerful and highly-scalable predictive and analytical models for a variety of data sources. SynapseML also brings new networking capabilities to the Spark Ecosystem. With the HTTP on Spark project, users can embed any web service into their SparkML models. For production-grade deployment, the Spark Serving project enables high throughput, sub-millisecond latency web services, backed by your Spark cluster.
    Downloads: 2 This Week
  • 25
    Vespa

    The open big data serving engine

    Make AI-driven decisions using your data, in real time. At any scale, with unbeatable performance. Vespa is a full-featured text search engine and supports both regular text search and fast approximate vector search (ANN). This makes it easy to create high-performing search applications at any scale, whether you want to use traditional techniques or a modern vector-based approach. You can even combine both approaches efficiently in the same query, something no other engine can do. Recommendation, personalization, and targeting involve evaluating recommender models over content items to select the best ones. Vespa lets you build applications that do this online, typically combining fast vector search and filtering with evaluation of machine-learned models over the items. This makes it possible to make recommendations specifically for each user or situation, using completely up-to-date information.
    Downloads: 2 This Week
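Combining a text filter with vector ranking in a single query can be pictured with a small sketch. Everything here is an assumption for illustration: `hybrid_search` is a hypothetical helper, not Vespa's query API, and it uses exact brute-force cosine similarity in place of the approximate nearest neighbor (ANN) search the description mentions.

```python
import math
from typing import Dict, List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def hybrid_search(docs: List[Dict], keyword: str,
                  query_vec: Sequence[float], top_k: int = 2) -> List[str]:
    # Step 1 (text search): keep documents whose text contains the keyword.
    candidates = [d for d in docs if keyword.lower() in d["text"].lower().split()]
    # Step 2 (vector search): rank the survivors by similarity to the query.
    # Exact scan here; a real engine would use an ANN index instead.
    ranked = sorted(candidates,
                    key=lambda d: cosine(d["vec"], query_vec),
                    reverse=True)
    return [d["id"] for d in ranked[:top_k]]


docs = [
    {"id": "a", "text": "red running shoes", "vec": [1.0, 0.0]},
    {"id": "b", "text": "blue running shoes", "vec": [0.0, 1.0]},
    {"id": "c", "text": "red dress", "vec": [1.0, 0.1]},
]
results = hybrid_search(docs, keyword="shoes", query_vec=[1.0, 0.0])
```

Document "c" is close to the query vector but fails the text filter, so it never reaches the ranking step. That is the effect of evaluating both conditions in one query.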