Showing 36 open source projects for "hadoop project"

  • 1
    SeaweedFS

    Distributed storage system for blobs, objects, files, and data lake

    SeaweedFS is a distributed storage system for blobs, objects, files, and data lake, to store and serve billions of files fast! Blob store has O(1) disk seek, local tiering, cloud tiering. Filer supports cross-cluster active-active replication, Kubernetes, POSIX, S3 API, encryption, Erasure Coding for warm storage, FUSE mount, Hadoop, WebDAV. SeaweedFS is an independent Apache-licensed open source project with its ongoing development made possible because of the community. SeaweedFS is a simple...
    Downloads: 8 This Week
  • 2
    Apache Bigtop

    Bigtop is an Apache Foundation project for Infrastructure Engineers

    Apache Bigtop is a project focused on building and packaging the Hadoop ecosystem and related big data components. It provides a consistent framework for testing, packaging, and deploying Hadoop distributions, including tools like HDFS, YARN, Spark, Hive, HBase, and more. By maintaining cross-platform builds (RPMs, DEBs, Docker images, and Kubernetes support), Bigtop makes it easier for organizations to deploy big data stacks in different environments. It also includes a set of integration...
    Downloads: 0 This Week
  • 3
    SZT-bigdata

    SZT‑bigdata is an open source project

    SZT‑bigdata is an open-source project that analyzes real Shenzhen metro (subway) card usage data using big‑data frameworks such as Spark, Hadoop, Hive, Kafka, Flink, ClickHouse, HBase, and Elasticsearch. It aims to explore transit passenger flow patterns and system optimization using a variety of Scala-based technologies.
    Downloads: 0 This Week
  • 4
    Open Source Data Quality and Profiling

    World's first open source data quality & data preparation project

    ..., Meta Data Discovery, Anomaly Discovery, Data Cleansing, Reporting and Analytics. It also had Hadoop (big data) support to move files to/from a Hadoop grid and to create, load, and profile Hive tables. This project is also known as "Aggregate Profiler". A RESTful API for this project is being built (beta version) at https://sourceforge.net/projects/restful-api-for-osdq/ and Apache Spark based data quality is being built at https://sourceforge.net/projects/apache-spark-osdq/
    Downloads: 0 This Week
  • 5

    MarDRe

    MapReduce-based tool to remove duplicate DNA reads

    .... Instead, MarDRe takes advantage of the MapReduce programming model to significantly improve ParDRe performance on distributed systems, especially on cloud-based infrastructures. Written in pure Java to maximize cross-platform compatibility, MarDRe is built upon the open-source Apache Hadoop project, the most popular distributed computing framework for Big Data processing.
    Downloads: 0 This Week
  • 6
    apache spark data pipeline osDQ

    osDQ, dedicated to creating Apache Spark based data pipelines using JSON

    This is an offshoot of the open source data quality (osDQ) project https://sourceforge.net/projects/dataquality/ This sub-project will create an Apache Spark based data pipeline in which JSON-based metadata (file) will be used to run data processing, data pipeline, data quality, data preparation, and data modeling features for big data. It uses the Java API of Apache Spark and can also run in local mode. Get a JSON example at https://github.com/arrahtech/osdq-spark How to run Unzip...
    Downloads: 0 This Week
  • 7
    X-RIME is an open source project devoted to providing a Hadoop-based solution for large-scale social network analysis.
    Downloads: 0 This Week
  • 8
    HareDB HBase Client

    GUI Tools for HBase (including PIG and high speed Hive Query)

    Most people are not familiar with command mode. However, the world of Hadoop and HBase offers only command mode. For this reason, we are focusing on developing a set of tools, “HBase Client”, that is easier to use and has a friendlier interface.
    Downloads: 0 This Week
  • 9

    RSS Atom Feed Analytics With MapReduce

    This is a data analytics project for RSS feeds using Hadoop MapReduce

    This project accepts the output of the jatomrss project as its input and applies MapReduce logic to it to perform the analytics.
    Downloads: 0 This Week
  • 10
    ..., available online at: http://dx.doi.org/10.1093/bioinformatics/bts054 Note that the library part of Hadoop-BAM is mainly for developers with experience in using Hadoop. The command line tools of Hadoop-BAM should be understandable to all users, but they are limited in scope. See the SeqPig project for a higher-level interface to the file formats supported by Hadoop-BAM: http://seqpig.sourceforge.net See Seal for Hadoop-based read alignment tools, Seal: http://biodoop-seal.sourceforge.net
    Downloads: 1 This Week
  • 11
    Hadoop

    Use Hadoop in Scientific Workflows

    This project provides an application that integrates Hadoop with WS-PGRADE workflows. It uses an OpenStack cloud to create user-specified Hadoop clusters and execute jobs.
    Downloads: 0 This Week
  • 12
    ankus

    Data Mining and Machine Learning Algorithms based on MapReduce

    [The features of ankus]
    * ankus is a 'web-based big data mining project and tool'.
      - MapReduce-based data mining/machine learning algorithms library
      - Hadoop-based distributed bigdata system
      - offering a web-based GUI for easy use
    [The ankus project & License]
    * The ankus project consists of three as an open source.
    * ankus is dual-licensed under the community and commercial licenses.
    * The community license follows GPLv3
      - Some algorithms in Core Project do...
    Downloads: 0 This Week
  • 13
    hmrjp-maven-plugin

    Hadoop MapReduce Maven plugin

    hmrjp-maven-plugin is a Maven plugin that helps create, run, and verify Hadoop MapReduce jobs remotely, just like any other Java project built with Maven.
    Downloads: 0 This Week
  • 14
    Flamingo Project

    Workflow Designer, Hive Editor, Pig Editor, File System Browser

    Flamingo is an open-source Big Data Platform that combines an Ajax Rich Web Interface + Workflow Engine + Workflow Designer + MapReduce + Hive Editor + Pig Editor. 1. Easy tool for big data 2. Comfortable to use with Hadoop ecosystem projects 3. Based on the GPL v3 license. Supports Pig IDE, Hive IDE, HDFS Browser, Scheduler, Hadoop Job Monitoring, Workflow Engine, Workflow Designer, MapReduce.
    Downloads: 1 This Week
  • 15

    Volt

    Volt is pure Java NGS mapping software that runs on a Hadoop 2.0 environment

    The project has moved to VoltMR http://voltmr.sourceforge.net
    Downloads: 0 This Week
  • 16
    Cascalog

    Data processing on Hadoop without the hassle

    Cascalog is a powerful Clojure (and Java) data processing and querying library built atop Hadoop (via Cascading), providing a high-level, Datalog-inspired abstraction for both big data processing and local computation. Cascalog is hosted at Clojars, and some of its dependencies are hosted at Conjars. Both Clojars and Conjars are Maven repos that are easy to use with Maven or Leiningen. The Cascalog website contains more information and links to various articles and tutorials. The best way to get started...
    Downloads: 0 This Week
  • 17
    Aspose for Hadoop

    This project holds the source code for the Aspose for Hadoop project.

    The Aspose for Hadoop project enables Apache Hadoop / MapReduce developers to work with various binary file formats. Developers can create and convert binary sequence files into text sequence files.
    Downloads: 0 This Week
  • 18
    UI To the Hadoop HBase Project
    Downloads: 0 This Week
  • 19

    HDFSFileTransfer

    File transfer from local FS to HDFS

    The HDFSFileTransfer project was created to help Hadoop users quickly copy varied files (flat, structured, unstructured, big and small) from Linux to the Hadoop Distributed File System (HDFS). It allows users to transfer files: - within the same physical machine - from the local file system (Linux) into HDFS - between two physical machines - copy files from the local file system (Linux) with an HDFS cluster installed to another HDFS cluster. Sample - one can have two single clustered Hadoop...
    Downloads: 0 This Week
  • 20
    This project aims to reduce data read redundancy in Apache Hadoop.
    Downloads: 0 This Week
  • 21
    BIRT Report Designer

    Open Source Reporting & Data Visualization Platform

    .... With a flexible Open Data Access framework, developers can write custom data drivers to access data from any source, including Big Data sources like Apache Hadoop, Cassandra, and MongoDB, along with all traditional relational databases, Flat Files, XML data streams, and data stored in proprietary systems. Built for embedding, BIRT includes APIs for data access, chart generation, output formats, content execution, and integration within larger applications.
    Downloads: 2 This Week
  • 22
    Crowbar

    A complete operations platform to deploy, maintain and scale clusters.

    The Crowbar Project is an effort to build a complete, easy to use operational platform for everyone. It allows for any number of physical nodes to be moved from bare-metal to production cluster within hours. Specific applications include (but are not limited to) Hadoop and OpenStack.
    Downloads: 0 This Week
  • 23
    Jxtadoop

    This project aims to provide P2P capabilities with Hadoop DFS.

    Hadoop is designed to work in large datacenters with thousands of servers connected to each other in the Hadoop cloud. This project focuses on the Distributed File System part of Hadoop (HDFS). The goal of this project is to provide an alternative to the direct IP connectivity required for Hadoop. Instead, the DFS layer has been modified to use a peer-to-peer framework which allows direct connectivity in datacenters as well as indirect connectivity to bypass firewall constraints. The typical...
    Downloads: 0 This Week
  • 24

    testHadoop

    Project to work with binary files on Hadoop

    Aspose for Hadoop will enable Hadoop developers to work with binary file formats on Hadoop by converting binary sequence files into text sequence files.
    Downloads: 0 This Week
  • 25
    JUMMP

    JUMMP: Job Uninterrupted Maneuverable MapReduce Platform

    JUMMP is an automated scheduling platform that provides a customized Hadoop environment within a batch-scheduled cluster environment. JUMMP enables an interactive pseudo-persistent MapReduce platform within the existing administrative structure of an academic high performance computing center by “jumping” between nodes with minimal administrative effort. Jumping is implemented by the synchronization of stopping and starting daemon processes on different nodes in the cluster. Use...
    Downloads: 0 This Week