Data Science Tools for Linux

View 14 business solutions

Browse free open source Data Science tools and projects for Linux below. Use the toggles on the left to filter open source Data Science tools by OS, license, language, programming language, and project status.

  • MongoDB Atlas runs apps anywhere Icon
    MongoDB Atlas runs apps anywhere

    Deploy in 115+ regions with the modern database for every enterprise.

    MongoDB Atlas gives you the freedom to build and run modern applications anywhere—across AWS, Azure, and Google Cloud. With global availability in over 115 regions, Atlas lets you deploy close to your users, meet compliance needs, and scale with confidence across any geography.
    Start Free
  • Simple, Secure Domain Registration Icon
    Simple, Secure Domain Registration

    Get your domain at wholesale price. Cloudflare offers simple, secure registration with no markups, plus free DNS, CDN, and SSL integration.

    Register or renew your domain and pay only what we pay. No markups, hidden fees, or surprise add-ons. Choose from over 400 TLDs (.com, .ai, .dev). Every domain is integrated with Cloudflare's industry-leading DNS, CDN, and free SSL to make your site faster and more secure. Simple, secure, at-cost domain registration.
    Sign up for free
  • 1
    DAT Linux

    DAT Linux

    The data science OS

    DAT Linux is a Linux distribution for data science. It brings together all your favourite open-source data science tools and apps into a ready-to-run desktop environment. https://datlinux.com It's based on Lubuntu, so it’s easy to install and use. The custom DAT Linux Control Panel provides a centralised one-stop-shop for running and managing dozens of data science programs. DAT Linux is perfect for students, professionals, academics, or anyone interested in data science who doesn’t want to spend endless hours downloading, installing, configuring, and maintaining applications from a range of sources, each with different technical requirements and set-up challenges.
    Leader badge
    Downloads: 61 This Week
    Last Update:
    See Project
  • 2
    xsv

    xsv

    A fast CSV command line toolkit written in Rust

    xsv is a command line program for indexing, slicing, analyzing, splitting and joining CSV files. Commands should be simple, fast and composable. Simple tasks should be easy. Performance trade offs should be exposed in the CLI interface. Composition should not come at the expense of performance. Let's say you're playing with some of the data from the Data Science Toolkit, which contains several CSV files. Maybe you're interested in the population counts of each city in the world. So grab the data and start examining it. The next thing you might want to do is get an overview of the kind of data that appears in each column. The stats command will do this for you. The xsv table command takes any CSV data and formats it into aligned columns using elastic tabstops. These commands are instantaneous because they run in time and memory proportional to the size of the slice (which means they will scale to arbitrarily large CSV data).
    Downloads: 1 This Week
    Last Update:
    See Project
  • 3
    cuDF

    cuDF

    GPU DataFrame Library

    Built based on the Apache Arrow columnar memory format, cuDF is a GPU DataFrame library for loading, joining, aggregating, filtering, and otherwise manipulating data. cuDF provides a pandas-like API that will be familiar to data engineers & data scientists, so they can use it to easily accelerate their workflows without going into the details of CUDA programming. For additional examples, browse our complete API documentation, or check out our more detailed notebooks. cuDF can be installed with conda (miniconda, or the full Anaconda distribution) from the rapidsai channel. cuDF is supported only on Linux, and with Python versions 3.7 and later. The RAPIDS suite of open-source software libraries aims to enable the execution of end-to-end data science and analytics pipelines entirely on GPUs. It relies on NVIDIA® CUDA® primitives for low-level compute optimization but exposing that GPU parallelism and high-bandwidth memory speed through user-friendly Python interfaces.
    Downloads: 0 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • Next