Large Tech Knowledge Base from 20 years in DevOps, Linux, Cloud, Big Data, AWS, GCP and international Consulting including extensive Travel Tips around the world
Updated Nov 12, 2025 - Shell
Content related to open-source development.
H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.
🚀 Build your skills with hands-on programming tutorials across various languages, guiding you to create applications from scratch.
1000+ DevOps Bash Scripts - AWS, GCP, Kubernetes, Docker, CI/CD, APIs, SQL, PostgreSQL, MySQL, Hive, Impala, Kafka, Hadoop, Jenkins, GitHub, GitLab, BitBucket, Azure DevOps, TeamCity, Spotify, MP3, LDAP, Code/Build Linting, pkg mgmt for Linux, Mac, Python, Perl, Ruby, NodeJS, Golang, Advanced dotfiles: .bashrc, .vimrc, .gitconfig, .screenrc, tmux..
🎯 Streamline talent management with this intuitive platform for tracking, recruiting, and onboarding top candidates efficiently.
📊 Streamline retail store data processing and enhance reporting with this efficient ETL pipeline.
📚 Master PySpark in 18 days with structured lessons, hands-on tasks, and an end-to-end project, covering essential concepts and ML model training.
📊 Enhance data management with hbase-68i, a powerful tool for efficient handling and processing of large datasets on HBase.
🚀 Enhance HBase performance with advanced data handling and management tools, streamlining operations for better efficiency and reliability.
📊 Explore simulated financial transactions and AI logs for the Sr. Auditor Analytics challenge, enhancing continuous auditing through data analysis and risk indicators.
Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)
📊 Showcase data projects in engineering, machine learning, and business intelligence, emphasizing technical processes and business impacts.
🗂️ Access essential AI and ML concepts with quick-reference cheatsheets for effective learning and project implementation.
Calc is a simple calculator application that performs basic arithmetic operations. It features a user-friendly interface, allowing users to quickly add, subtract, multiply, and divide numbers.
A collection of ready-to-use Docker development environments for multiple Linux distributions (Ubuntu, Debian, Alpine, Arch, Kali). Includes shared configurations, utility scripts, and comprehensive documentation for reproducible development setups across teams and CI/CD pipelines.
PySpark pipeline for particle classification in high-energy physics (HEPMASS dataset). Includes distributed preprocessing, model training (logistic regression, decision trees), evaluation, and key visualizations. Optimized for Hadoop/Spark.
Scalable, reliable, distributed storage system optimized for data analytics and object store workloads.