This repository contains scripts for managing VASP calculations on a SLURM-based cluster system.
A script for submitting and monitoring sequential VASP calculations (Relax → SC → ELF → Band → DOS).
nohup bash job_monitor.sh <calc_type> &
where <calc_type> can be:
- Relax: Structure relaxation
- SC: Self-consistent calculation
- ELF: Electron Localization Function calculation
- Band: Band structure calculation
- DOS: Density of states calculation
structure_directory/
├── Relax/
├── SC/
├── ELF/
├── DOS/
└── Band/
- INCAR_<calc_type>: INCAR file for each calculation type
- sbp_<calc_type>.sh: SLURM submission script for each calculation type
- POTCAR: VASP pseudopotential file
- ../aflow_sym/uniq_poscar_list: List of structures to process
- diverge_structs: (optional) List of structures to skip
Scripts for managing optical calculations (SC → DIAG → GW0 → BSE) with automatic error checking and resubmission.
For normal execution with error checking:
nohup ./redo_optics.sh > redo_optics.log 2>&1 &
For forced restart of all calculations:
nohup ./restart_optics.sh > restart_optics.log 2>&1 &
- Direct_dir: File containing list of directories to process
- INCAR files:
  - INCAR_SC
  - INCAR_DIAG
  - INCAR_GW0
  - INCAR_BSE
- SLURM submission scripts:
  - sbp_SC.sh
  - sbp_DIAG.sh
  - sbp_GW0.sh
  - sbp_BSE.sh
- POTCAR_GW: VASP GW pseudopotential file

- SC: DFT groundstate calculation
- DIAG: DFT "virtual" orbitals (empty states)
- GW0: RPA quasiparticles with single-shot GW
- BSE: BSE calculation
structure_directory/
├── Optics/
│   ├── SC/
│   ├── DIAG/
│   ├── GW0/
│   └── BSE/
For redo_optics.sh:
- optical_jobs.log: Detailed job submission information
- job_<calc_type>.log: Job counting logs for each calculation type
For restart_optics.sh:
- restart_optical_jobs.log: Detailed job submission information for restarts
- job_<calc_type>_restart.log: Job counting logs for restarted calculations
redo_optics.sh:
- Automatic error detection and job resubmission
- Sequential dependency handling
- Detailed logging of job submissions
- Limits concurrent jobs to a total of 60 compute nodes
restart_optics.sh:
- Forces restart of all calculations regardless of previous status
- Maintains same workflow and dependencies
- Uses separate log files to avoid confusion with original runs
- Limits concurrent jobs to a total of 60 compute nodes
- Manages sequential job submissions
- Limits concurrent jobs to 60
- Handles failed calculations
- Supports structure skipping via diverge_structs
- Automatic directory creation and management
- Sequential dependency handling (SC → DIAG → GW0 → BSE)
- Automatic error detection and job resubmission
- Detailed logging of job submissions
- Limits concurrent jobs to 60
- Missing required files - ensure all INCAR and submission scripts are present
- Directory permissions - ensure write access in all directories
- SLURM queue limits - script will wait if queue is full
- Failed calculations - check individual VASP output files for errors
- Missing vasprun.xml - script will detect and resubmit affected calculations
- Failed phonon calculations - use get_err_phon.sh to generate resubmission script
- Both scripts assume SLURM job scheduler
- Maximum concurrent jobs is set to 60
- Scripts will create necessary directories if they don't exist
- Error handling includes automatic resubmission of failed jobs
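
The job limit noted above can be illustrated with a small polling loop. This is a minimal Python sketch of the throttling idea only, not the code used by the shell scripts in this repository: the function names, the 60-second poll interval, and the use of squeue -h -u to count jobs are assumptions made for illustration.

```python
import subprocess
import time

MAX_JOBS = 60  # concurrent-job limit stated in this README


def count_queued_jobs(user):
    """Count a user's SLURM jobs (assumed approach: one squeue line per job)."""
    out = subprocess.run(
        ["squeue", "-h", "-u", user],  # -h: no header, -u: filter by user
        capture_output=True, text=True, check=True,
    ).stdout
    return sum(1 for line in out.splitlines() if line.strip())


def wait_for_slot(count_jobs, max_jobs=MAX_JOBS, poll_seconds=60, sleep=time.sleep):
    """Block until count_jobs() reports fewer than max_jobs, then return that count.

    count_jobs is any zero-argument callable, which keeps the loop testable
    without a live scheduler.
    """
    while True:
        n = count_jobs()
        if n < max_jobs:
            return n
        sleep(poll_seconds)  # queue is full: wait before checking again
```

On a cluster, wait_for_slot could be called with a callable that wraps count_queued_jobs for the current user before each submission.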
A script for managing phonon calculations with automatic supercell generation and job monitoring.
chmod +x submit_phonon.sh
nohup ./submit_phonon.sh > nohup.out 2>&1 &
- phonon_list: File containing list of directories to process
- INCAR_PHON: INCAR file for phonon calculations
- sbp_PHON.sh: SLURM submission script for phonon calculations
- Supporting scripts:
  - convert_kpath.sh
  - generate_supercell.sh
  - extract_band_conf.sh
  - preprocess_high_symmetry_points.sh
structure_directory/
├── Relax/
│   └── CONTCAR
└── PHON/
    ├── POSCAR-*
    ├── INCAR
    ├── POTCAR
    └── sbp.sh
- job_PHON.log: Detailed job submission tracking
  - Records which phonon calculations have been submitted for each structure
- Automatic supercell generation using VASPKIT
- Batch submission (10 jobs at a time)
- Limits concurrent jobs to 50
- Resumes from last submitted job if interrupted
- Maintains submission history in log file
- Reads structures from phonon_list
- For each structure:
  - Creates PHON directory
  - Copies CONTCAR from Relax directory
  - Generates primitive cell using VASPKIT
  - Generates supercells
  - Submits jobs in batches
- Monitors job queue and maintains submission limits
- Tracks progress in log file
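
The batching and resume behaviour listed above can be sketched in a few lines of Python. This is an illustrative model under stated assumptions, not the logic of submit_phonon.sh itself: next_batch is a hypothetical helper, and it assumes the submitted jobs are known (for example, read back from job_PHON.log) and that the number of running jobs is supplied by the caller.

```python
def next_batch(all_jobs, submitted, running, batch_size=10, max_running=50):
    """Return the next jobs to submit, honouring the batch and queue limits.

    all_jobs    -- ordered job names, e.g. the POSCAR-* displacement files
    submitted   -- jobs already recorded as submitted (used to resume)
    running     -- number of jobs currently in the queue
    """
    done = set(submitted)
    pending = [job for job in all_jobs if job not in done]
    free_slots = max(0, max_running - running)
    # Never exceed the batch size or the remaining room in the queue.
    return pending[:min(batch_size, free_slots)]
```

Because already-submitted jobs are filtered out first, calling the helper again after an interruption continues from the first job that has not yet been submitted.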
- Check supercell size in generate_supercell.sh
- Monitor convergence in individual phonon calculations
- Use job_PHON.log to track submission progress
- Check VASPKIT output for primitive cell generation
A script for post-processing phonon calculations with automatic error detection and data generation.
sbatch phonon-pp-job.sh
- Automatic error detection in SLURM output files
- Generates FORCE_SETS using phonopy
- Creates phonon band plots and raw data files
- Handles LaTeX formatting for band labels
- Detailed logging with configurable verbosity
- phonon_list: List of structures to process
- Supporting scripts:
  - convert_kpath.sh
  - extract_band_conf.sh
  - preprocess_high_symmetry_points.sh
A script for analyzing and categorizing band structures based on their electronic properties.
./band_gap-pp.sh
- Automatically categorizes structures as Direct, Indirect, or Metallic
- Uses VASPKIT for band structure analysis
- Error detection in SLURM output files
- Generates categorized lists of structures
- Direct_dir: List of structures with direct band gaps
- Indirect_dir: List of structures with indirect band gaps
- Metallic_dir: List of metallic/semimetallic structures
A utility script for handling failed phonon calculations.
./get_err_phon.sh
- Analyzes phonon post-processing logs for errors
- Generates resubmission script for failed calculations
- Handles missing or corrupted vasprun.xml files
- Automatic cleanup and job resubmission
A Python script for analyzing Electron Localization Function (ELF) calculations to identify potential electride structures using Bader topological analysis. Electrides are materials where electrons occupy interstitial regions rather than being associated with atoms.
Uses Bader topological analysis from the Henkelman group to identify critical points in the ELF field, avoiding false positives from covalent bond regions.
Single structure analysis:
cd /path/to/structure/ELFCAR
python3 /path/to/aflow_sym/analyze_electride.py ELFCAR
Batch analysis of all structures:
cd /path/to/parent/directory
python3 /path/to/aflow_sym/analyze_electride.py --batch . -o electride_results.csv
With custom parameters:
python3 analyze_electride.py ELFCAR --threshold 0.7 --min-distance 2.0 --volume-threshold 0.5
With custom bader executable:
python3 analyze_electride.py --bader-exe /path/to/bader /path/to/structure/ELFCAR
Force regenerate BCF.dat:
rm /path/to/ELF/BCF.dat
python3 analyze_electride.py /path/to/structure/ELFCAR
- ELFCAR: Output from VASP ELF calculation (generated with LELF=.TRUE. in INCAR)
- bader executable: Download from Henkelman Group
  - Auto-detection: Script automatically looks for bader in the same directory as ELFCAR
  - Alternative: Add bader to system PATH or use --bader-exe option
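
The executable lookup can be pictured as a three-step search, following the priority order this README gives in its notes (explicit option, then a bader file beside ELFCAR, then PATH). The sketch below is an illustration of that order only; find_bader is a hypothetical name and the actual implementation in analyze_electride.py may differ.

```python
import os
import shutil
from pathlib import Path


def find_bader(elfcar_path, bader_exe=None):
    """Return a path to a bader executable, or None if none is found."""
    # 1. An explicit path (what --bader-exe would supply) always wins.
    if bader_exe:
        return str(bader_exe)
    # 2. An executable file named "bader" in the same directory as ELFCAR.
    local = Path(elfcar_path).resolve().parent / "bader"
    if local.is_file() and os.access(local, os.X_OK):
        return str(local)
    # 3. Fall back to the system PATH (shutil.which returns None if absent).
    return shutil.which("bader")
```

Returning None rather than raising leaves it to the caller to decide how to report a missing executable.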
- Bader topological analysis: Rigorous critical point detection in ELF field
- Automatic electride detection based on interstitial ELF maxima
- Zero false positives: Correctly distinguishes covalent bonds from interstitial electrons
- Distance-based filtering to exclude atomic regions
- Volume estimation of electron-rich interstitial regions
- Batch processing for analyzing multiple structures
- CSV export for systematic analysis
- BCF.dat caching: Reuses existing Bader analysis results
The script provides:
- Potential electride classification (yes/no)
- Maximum ELF value in interstitial regions
- Number of interstitial electron sites
- Volume and volume fraction of interstitial regions
- Distance of interstitial sites from nearest atoms
| Parameter | Default | Description |
|---|---|---|
| --threshold | 0.6 | Minimum ELF value for electride detection |
| --min-distance | 1.5 Å | Minimum distance from atoms to consider interstitial |
| --volume-threshold | 0.5 ų | Minimum volume for significant interstitial region |
| --output | electride_analysis.csv | Output file for batch analysis |
- INCAR settings: Ensure LELF=.TRUE. is set in INCAR_ELF
- Grid density: Use fine FFT grids for accurate ELF calculations
- Convergence: ELF calculations should be performed on well-converged charge densities
- ELFCAR vs CHGCAR:
  - We analyze ELFCAR (ELF field) directly with bader ELFCAR
  - The -ref CHGCAR_sum option is for charge density analysis, not needed for ELF
  - VASP outputs complete ELF field in ELFCAR, no core correction needed
- BCF.dat caching:
  - The script reuses existing BCF.dat if present (faster)
  - Delete BCF.dat to force regeneration after changes
- Bader executable detection (in order of priority):
  - User-specified path via --bader-exe
  - bader file in same directory as ELFCAR (convenient for per-structure executables)
  - bader in system PATH
- Threshold tuning: Adjust --threshold based on your material system:
  - Strong electrides: ELF > 0.7
  - Moderate electrides: ELF 0.5-0.7
  - Weak localization: ELF < 0.5
- Distance parameter:
  - --min-distance 1.5 (default) works for most cases
  - Increase to 2.0 Å for systems with large atoms
  - Decrease to 1.2 Å for compact structures
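
As a quick reference, the ELF ranges from the threshold-tuning note can be written as a small classifier. This is a sketch for interpreting results by hand; classify_elf is a hypothetical helper, not part of analyze_electride.py, and the handling of the boundary values (0.5 and 0.7 both counted as moderate) is an assumption, since the ranges above do not specify it.

```python
def classify_elf(max_interstitial_elf):
    """Map a maximum interstitial ELF value onto the ranges listed above."""
    if max_interstitial_elf > 0.7:
        return "strong"
    if max_interstitial_elf >= 0.5:  # assumed: both boundaries are "moderate"
        return "moderate"
    return "weak"


# Example: label a few illustrative ELF maxima (made-up values).
for value in (0.85, 0.62, 0.31):
    print(value, classify_elf(value))
```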
# 1. Submit ELF calculations
nohup bash job_monitor.sh ELF &
# 2. Wait for calculations to complete
# 3. Analyze single structure
python3 ../aflow_sym/analyze_electride.py 1/ELF/ELFCAR
# 4. Or batch analyze all structures
python3 ../aflow_sym/analyze_electride.py --batch . -o electride_results.csv
# 5. Check results
cat electride_results.csv
grep "True" electride_results.csv # List potential electrides
Depending on which doping script you use, you'll need different Python packages:
For general substitution WITHOUT symmetry bias:
Example: aflow_sym/rnd_SiGe_doping.py or aflow_sym/NaSiGe_doping.py
For using Fingerprint energy as symmetry bias:
Example: aflow_sym/Doping.py
For explicitly using group-subgroup splitting:
Example: aflow_sym/subgroup_doping.py
For entropy-guided MCMC with duplicate avoidance (RECOMMENDED):
Example: aflow_sym/fp_doping.py
- ASE
- libfp
- ReformPy (for fingerprint entropy)
- matplotlib (for visualization)
- scipy
- numba
- kimpy (optional, for KIM energy filtering)
For ELF electride analysis:
Example: aflow_sym/analyze_electride.py
Ensure these environment variables are set:
- $AFLOW_HOME: Path to AFLOW executable
- $VASPKIT_HOME: Path to VASPKIT executable
- $PHONOPY_HOME: Path to Phonopy executable
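
A quick pre-flight check can confirm that these variables are set before launching a long workflow. The following is a minimal sketch; missing_env_vars is a hypothetical helper and is not shipped with this repository.

```python
import os

REQUIRED_VARS = ("AFLOW_HOME", "VASPKIT_HOME", "PHONOPY_HOME")


def missing_env_vars(environ=None, required=REQUIRED_VARS):
    """Return the required variables that are unset or empty."""
    env = os.environ if environ is None else environ
    return [name for name in required if not env.get(name)]


if __name__ == "__main__":
    missing = missing_env_vars()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
```

Passing a dictionary as environ makes the check easy to test without changing the real environment.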
- Structure Generation (fp_doping.py)
- Structure Relaxation (job_monitor.sh Relax)
- Electronic Structure (job_monitor.sh SC/Band/DOS)
- ELF Analysis (Optional, job_monitor.sh ELF + analyze_electride.py)
- Optical Properties (redo_optics.sh)
- Phonon Calculations (submit_phonon.sh)
A robust method for generating diverse atom-substituted structures. It uses entropy-guided Markov Chain Monte Carlo (MCMC) sampling that maximizes fingerprint entropy, so diversity is optimized automatically.
- 65% overall uniqueness (80-100% for 3+ substitutions, validated by AFLOW)
- Entropy-guided MCMC directly maximizes atomic environment diversity
- Always succeeds - no clustering failures for high substitution levels
- JIT-compiled fingerprint entropy calculations (fast performance)
- Optional KIM energy filtering removes unstable structures
- Entropy distribution plots for interpretability
- Theoretically grounded - uses ReformPy's fingerprint entropy metric
Command Line:
cd aflow_sym/
python3 fp_doping.py
You will be prompted for:
- Element to substitute (e.g., Si)
- New element (e.g., Ge)
- Maximum number of atoms to substitute
- Maximum structures per substitution level
- MCMC temperature (default: 1.0, higher = more exploration)
- MCMC iterations per level (default: 10000)
- Whether to use KIM energy filtering (y/n)
- Whether to generate entropy distribution plots (y/n)
Python API:
import ase.io
from fp_doping import POSCAR_GEN_CLUSTER

# Load structure
atoms = ase.io.read('POSCAR')

# Generate diverse structures
structures = POSCAR_GEN_CLUSTER(
    atoms_origin=atoms,
    elem_from='Si',
    elem_to='Ge',
    max_subs=5,
    max_structures=10,
    max_iter=10000,
    mcmc_temperature=1.0,
    visualize=True,
    kim_model="Tersoff_LAMMPS_Tersoff_1989_SiGe__MO_350526375143_004"
)

- MCMC Initialization: Start with random substitution pattern for each level
- Metropolis-Hastings Sampling:
- Propose new substitution pattern (swap one substituted/non-substituted atom)
- Calculate fingerprint entropy: S = (1/N) Σᵢ log(N × δq_min,i)
- Accept if entropy increases, or with probability exp(ΔS/T) if it decreases
- Thinning & Burn-in: Discard initial samples, keep every 10th sample
- Diversity Selection: Choose top entropy structures (most diverse atomic environments)
- Energy Filtering (optional): Exclude high-energy structures using KIM calculator
Key Insight: Maximizing fingerprint entropy ensures atoms have maximally diverse local environments, avoiding symmetry-equivalent structures.
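
To make the entropy expression concrete, the sketch below evaluates S = (1/N) Σᵢ log(N × δq_min,i) for a list of per-atom fingerprint vectors. It assumes δq_min,i is the Euclidean distance from atom i's fingerprint to its nearest neighbour in fingerprint space; that reading, the function name, and the eps floor for identical fingerprints are assumptions for illustration, and the ReformPy implementation used by fp_doping.py may differ in detail.

```python
import math


def fingerprint_entropy(fingerprints, eps=1e-12):
    """Evaluate S = (1/N) * sum_i log(N * dq_min_i) for per-atom fingerprints.

    dq_min_i is taken to be the smallest Euclidean distance between atom i's
    fingerprint and any other atom's fingerprint (an assumption, see above).
    """
    n = len(fingerprints)
    if n < 2:
        raise ValueError("need at least two atoms")
    total = 0.0
    for i, fi in enumerate(fingerprints):
        dq_min = min(
            math.dist(fi, fj) for j, fj in enumerate(fingerprints) if j != i
        )
        # eps keeps log() finite when two fingerprints coincide exactly.
        total += math.log(n * max(dq_min, eps))
    return total / n


# Toy 2D "fingerprints": well-separated environments score higher
# than a set in which two environments nearly coincide.
spread = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
clustered = [(0.0, 0.0), (0.1, 0.0), (0.0, 1.0)]
print(fingerprint_entropy(spread) > fingerprint_entropy(clustered))
```

Near-duplicate environments give a small δq_min and a strongly negative log term, which is why maximizing S pushes the sampler away from symmetry-equivalent sites.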
POSCAR Files:
- POSCAR_N_M where N = substitution level, M = structure index
- Example: POSCAR_3_5 = 5th structure with 3 substitutions
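
When post-processing many generated files, the naming scheme can be decoded with a short helper. This is a convenience sketch; parse_poscar_name is hypothetical and not part of fp_doping.py.

```python
import re

_POSCAR_NAME = re.compile(r"^POSCAR_(\d+)_(\d+)$")


def parse_poscar_name(name):
    """Split 'POSCAR_N_M' into (substitution_level, structure_index)."""
    match = _POSCAR_NAME.match(name)
    if match is None:
        raise ValueError("not a POSCAR_N_M file name: %r" % name)
    return int(match.group(1)), int(match.group(2))


level, index = parse_poscar_name("POSCAR_3_5")
print(level, index)
```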
Visualization:
- entropy_distribution_N_substitutions.png: Shows entropy histogram and ranked values
- Helps verify MCMC convergence and diversity of generated structures
Validated with AFLOW --compare_materials on Si₃₄ test structure:
| Substitutions | Generated | Unique (AFLOW) | Uniqueness | Status |
|---|---|---|---|---|
| 1 atom | 10 | 1 | 10% | Expected* |
| 2 atoms | 8 | 1 | 12.5% | Expected* |
| 3 atoms | 10 | 8 | 80% | Excellent |
| 4 atoms | 8 | 8 | 100% | Perfect |
| 5 atoms | 8 | 8 | 100% | Perfect |
| 6 atoms | 8 | 8 | 100% | Perfect |
| Overall | 52 | 34 | 65.4% | Good |
* Low uniqueness for 1-2 substitutions is expected: high-symmetry structures have many equivalent sites. MCMC correctly converges to globally optimal configurations.
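
The percentages in the table follow directly from the generated and unique counts. The snippet below recomputes them from the table's own numbers as a consistency check; it introduces no data beyond what the table reports.

```python
# (generated, unique) per substitution level, copied from the table above.
results = {1: (10, 1), 2: (8, 1), 3: (10, 8), 4: (8, 8), 5: (8, 8), 6: (8, 8)}


def uniqueness(generated, unique):
    """Percentage of generated structures that are unique."""
    return 100.0 * unique / generated


per_level = {n: uniqueness(g, u) for n, (g, u) in results.items()}
total_generated = sum(g for g, _ in results.values())
total_unique = sum(u for _, u in results.values())
overall = uniqueness(total_generated, total_unique)

for n, pct in per_level.items():
    print("%d substitutions: %.1f%%" % (n, pct))
print("overall: %d/%d = %.1f%%" % (total_unique, total_generated, overall))
```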
max_iter: MCMC iterations per substitution level
- Default: 10000 (good for most cases)
- Higher values: Better sampling, longer runtime
- Suggested range: 5000-20000
mcmc_temperature: Exploration vs exploitation trade-off
- Default: 1.0 (balanced)
- Higher (2.0-5.0): More exploration, higher diversity (use if getting duplicates)
- Lower (0.5): More exploitation, faster convergence
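
The effect of mcmc_temperature follows from the acceptance rule exp(ΔS/T) described earlier: the same entropy decrease is accepted more often at higher T. The sketch below shows that rule in isolation; the function names are illustrative and are not taken from fp_doping.py.

```python
import math
import random


def acceptance_probability(delta_s, temperature):
    """Metropolis rule for entropy maximization: always accept an increase,
    accept a decrease with probability exp(delta_s / temperature)."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    if delta_s >= 0:
        return 1.0
    return math.exp(delta_s / temperature)


def metropolis_accept(delta_s, temperature, rng=random):
    """Draw one accept/reject decision for a proposed swap."""
    return rng.random() < acceptance_probability(delta_s, temperature)


# The same entropy decrease (delta_s = -1.0) at three temperatures.
for t in (0.5, 1.0, 5.0):
    print(t, acceptance_probability(-1.0, t))
```

A larger T flattens the penalty for entropy-lowering moves, which is the "more exploration" end of the trade-off; a smaller T makes the chain behave closer to greedy ascent.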
KIM model (optional energy filtering):
- Si-Ge systems: "Tersoff_LAMMPS_Tersoff_1989_SiGe__MO_350526375143_004"
- Excludes top 20% highest energy structures (default threshold)
- Requires kimpy installation
- Use None to disable
| Feature | Entropy-MCMC (New) | PCA+Clustering (Old) |
|---|---|---|
| Robustness | Always succeeds | Failed for 6+ substitutions |
| Scalability | Any substitution level | Limited by clustering |
| Theoretical basis | Entropy maximization | Ad-hoc PCA distance |
| Speed | Fast (JIT-compiled) | Moderate |
| Uniqueness (3-6 subs) | 80-100% | N/A (failed) |
- For high-symmetry structures: Expect low uniqueness for 1-2 substitutions (this is CORRECT behavior - MCMC finds globally optimal configurations)
- For more diversity: Increase temperature (2.0-5.0) or iterations (20000+)
- Check convergence: Use visualization plots to verify entropy distribution
- AFLOW filtering: Always use reduce_sim_struct.sh for final uniqueness verification
To verify uniqueness of generated structures:
bash reduce_sim_struct.sh
cat uniq_poscar_list
This uses AFLOW to identify symmetrically equivalent structures. The entropy-MCMC method achieves 65% overall uniqueness and 80-100% for 3+ substitutions, which is excellent for DFT workflows.