📊 Build a Logistic Regression model to predict customer churn in telecom, utilizing Python and scikit-learn for data analysis and insights.
Updated Dec 13, 2025 - Jupyter Notebook
📊 Showcase data projects that highlight analytics, machine learning, and MLOps with reproducible code and clear business insights.
🚀 Build your skills with hands-on programming tutorials across various languages, guiding you to create applications from scratch.
🎯 Streamline talent management with this intuitive platform for tracking, recruiting, and onboarding top candidates efficiently.
📊 Streamline retail store data processing and enhance reporting with this efficient ETL pipeline.
📚 Master PySpark in 18 days with structured lessons, hands-on tasks, and an end-to-end project, covering essential concepts and ML model training.
📊 Enhance data management with hbase-68i, a powerful tool for efficient handling and processing of large datasets on HBase.
🚀 Enhance HBase performance with advanced data handling and management tools, streamlining operations for better efficiency and reliability.
📊 Explore simulated financial transactions and AI logs for the Sr. Auditor Analytics challenge, enhancing continuous auditing through data analysis and risk indicators.
📊 Showcase data projects in engineering, machine learning, and business intelligence, emphasizing technical processes and business impacts.
🗂️ Access essential AI and ML concepts with quick-reference cheatsheets for effective learning and project implementation.
Calc is a simple calculator application that performs basic arithmetic operations. It features a user-friendly interface, allowing users to quickly add, subtract, multiply, and divide numbers.
A collection of ready-to-use Docker development environments for multiple Linux distributions (Ubuntu, Debian, Alpine, Arch, Kali). Includes shared configurations, utility scripts, and comprehensive documentation for reproducible development setups across teams and CI/CD pipelines.
PySpark pipeline for particle classification in high-energy physics (HEPMASS dataset). Includes distributed preprocessing, model training (logistic regression, decision trees), evaluation, and key visualizations. Optimized for Hadoop/Spark.
🚀 Migrate legacy mainframe data to a modern Hadoop ecosystem, automating ingestion, transformation, and validation for scalable storage and analytics.
Scalable, reliable, distributed storage system optimized for data analytics and object store workloads.
Large tech knowledge base from 20 years in DevOps, Linux, Cloud, Big Data, AWS, GCP, and international consulting, including extensive travel tips from around the world.