Abstract
In modern PHP development, a hidden BOM (Byte Order Mark) in UTF-8 files can cause fatal errors when using the namespace and declare(strict_types=1) constructs, which are required to be the first statements in a script. The appearance of BOM often goes unnoticed, as it is silently inserted by various editors or through automated conversions, leading to hard-to-diagnose failures and increased debugging costs. Solutions include manual BOM removal via editors (Notepad++, VSCode), PHP scripts, command-line tools (sed, awk, iconv), as well as the comprehensive tool clean-bom-senior.sh, which provides recursive cleanup, atomic operations, backup, and integration into CI/CD and pre-commit hooks. The recommended practices of IDE configuration, pre-commit checks, and encoding standardization ensure the stability and predictability of PHP applications.
- 1 · Introduction
- 2 · Understanding BOM and Character Encodings
- 3 · BOM Detection and Diagnosis
- 4 · BOM Cleaning Methods
- 5 · DevOps Integration Strategies
- 6 · BOM Prevention Best Practices
- 7 · Practical Recipes and Workflows
- 8 · Troubleshooting Common Issues
- 9 · Enterprise Deployment Considerations
- 10 · Glossary
- Sources
1 · Introduction
1.1 Why This Guide Matters
Byte Order Mark (BOM) markers — invisible three-byte sequences EF BB BF — frequently cause fatal PHP errors when used with modern language constructs like namespace declarations and declare(strict_types=1) statements[^bom]. This comprehensive guide provides enterprise-grade solutions for:
- Detecting and removing BOM from codebases of any scale;
- Preventing BOM reintroduction through proper tooling and workflows;
- Automating cleanup processes in IDE environments, CI/CD pipelines, and Git workflows;
- Maintaining code integrity during cleanup operations.
1.2 What’s New in the 2025 Edition
This updated edition introduces significant enhancements based on real-world enterprise deployments:
- clean-bom-senior.sh v2.06.4, which adds:
- Complete file attribute preservation: owner, group, permissions, timestamps
- Atomic operations with automatic rollback on failure
- Fixed statistics reporting using process substitution instead of pipes
- Global command access via the bom alias
- Enterprise error handling with comprehensive logging
- Extended DevOps coverage: Docker containers, Kubernetes deployments, CI/CD templates
- Security considerations: permission handling, backup strategies, audit trails
- Performance optimization: parallel processing, large file handling, memory management
1.3 Target Audience
- PHP Developers working with modern namespace-based applications
- DevOps Engineers implementing automated code quality checks
- System Administrators managing multi-developer environments
- Technical Team Leads establishing coding standards and workflows
2 · Understanding BOM and Character Encodings
2.1 What is Byte Order Mark (BOM)
The Byte Order Mark (BOM) is a special Unicode character (U+FEFF) that appears at the beginning of a text file to indicate:
- Byte order (endianness) for UTF-16 and UTF-32 encodings
- Encoding type identification for text processors
- Unicode presence in the file stream
In UTF-8 files, BOM appears as three bytes: EF BB BF (hexadecimal).
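The three-byte signature can be checked directly from a shell. The following is a minimal sketch, assuming bash plus the standard head, od, and tr utilities; the function name has_utf8_bom is hypothetical and not part of any tool described in this guide.

```shell
#!/usr/bin/env bash
# has_utf8_bom FILE: exit status 0 when FILE begins with the bytes EF BB BF.
# (Hypothetical helper name, used only for illustration.)
has_utf8_bom() {
    local first3
    # Read only the first three bytes and render them as lowercase hex.
    first3=$(head -c 3 "$1" | od -An -tx1 | tr -d ' \n')
    [ "$first3" = "efbbbf" ]
}
```

Usage: `has_utf8_bom src/index.php && echo "BOM present"`. Only three bytes are read, so the check costs the same for small and large files.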
2.2 Why BOM Breaks Modern PHP
PHP requires declare(strict_types=1) to be the very first statement in a script, and a namespace declaration may be preceded only by declare statements. In practice, the first statement after the opening <?php tag is therefore one of:
- A namespace declaration
- A declare() statement (e.g., declare(strict_types=1))
A BOM sits in front of the opening <?php tag, so PHP treats the three invisible bytes as inline content that precedes the namespace/declare statement, resulting in:
[EF BB BF]<?php // ❌ Invisible BOM bytes precede the opening tag
namespace App\Controllers; // Fatal error: Namespace declaration statement has to be the very first statement
Error example:
Fatal error: Namespace declaration statement has to be the very first statement or after any declare call in the script
2.3 Cross-Platform Line Ending Issues
CRLF vs LF conflicts arise when:
- Windows editors save files with \r\n (CRLF) line endings
- Unix/Linux systems expect \n (LF) only
- Mixed environments create inconsistent file states
Consequences include:
- Git diff pollution with «whitespace changes»
- Failed CI/CD builds due to checksum mismatches
- Inconsistent behavior across development environments
- Code review complications with phantom changes
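Before normalizing line endings, it can help to measure how many lines in a file actually end in CRLF. The sketch below assumes bash (for the $'\r' quoting) and grep; count_crlf_lines is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# count_crlf_lines FILE: print the number of lines in FILE that end in CRLF.
# (Hypothetical helper name, used only for illustration.)
count_crlf_lines() {
    # $'\r$' is a literal carriage return followed by the end-of-line anchor;
    # grep -c prints the count of matching lines (and exits 1 when it is 0).
    grep -c $'\r$' "$1"
}
```

A result of 0 means the file already uses LF only. A result lower than the file's total line count indicates mixed line endings.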
2.4 File Encoding Detection
Understanding how to identify encoding issues is crucial for effective cleanup:
| Encoding Type | BOM Signature (hex) | Decimal Bytes |
|---|---|---|
| UTF-8 with BOM | EF BB BF | 239 187 191 |
| UTF-16 LE | FF FE | 255 254 |
| UTF-16 BE | FE FF | 254 255 |
| UTF-32 LE | FF FE 00 00 | 255 254 0 0 |
| UTF-32 BE | 00 00 FE FF | 0 0 254 255 |
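The signatures in the table can be turned into a small classifier. This is a sketch under stated assumptions: bash with head, od, and tr available, and a hypothetical function name detect_bom. The UTF-32 LE pattern is tested before UTF-16 LE because both start with FF FE; a UTF-16 LE file whose first character is U+0000 would be misreported as UTF-32 LE.

```shell
#!/usr/bin/env bash
# detect_bom FILE: print which BOM signature (if any) FILE starts with.
# (Hypothetical helper name, used only for illustration.)
detect_bom() {
    local hex
    # Render the first four bytes as one lowercase hex string, e.g. "efbbbf3c".
    hex=$(head -c 4 "$1" | od -An -tx1 | tr -d ' \n')
    case "$hex" in
        efbbbf*)  echo "UTF-8" ;;
        fffe0000) echo "UTF-32LE" ;;  # checked before UTF-16LE: shared FF FE prefix
        0000feff) echo "UTF-32BE" ;;
        fffe*)    echo "UTF-16LE" ;;
        feff*)    echo "UTF-16BE" ;;
        *)        echo "none" ;;
    esac
}
```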
3 · BOM Detection and Diagnosis
3.1 Command-Line Detection Methods
Essential diagnostic commands for BOM identification:
| Tool | Command | Purpose | Output Example |
|---|---|---|---|
| hexdump | hexdump -C file.php | head -1 | Visual hex inspection | 00000000 ef bb bf 3c 3f 70 68 70 |
| od | od -An -tx1 -N3 file.php | Byte-level analysis | ef bb bf |
| file | file -bi file.php | MIME type detection | text/x-php; charset=utf-8 |
| xxd | xxd -l 16 file.php | Hex dump with ASCII | 00000000: efbb bf3c 3f70 6870 |
3.2 Advanced Detection Techniques
Bulk scanning for BOM across entire projects:
# Find all files containing BOM
find . -type f \( -name "*.php" -o -name "*.js" -o -name "*.css" \) \
-exec grep -l $'\xEF\xBB\xBF' {} \;
# Statistical analysis of encoding issues
find . -name "*.php" -exec file -bi {} \; | sort | uniq -c
Automated detection script:
#!/bin/bash
# detect-bom.sh - Enterprise BOM detection
# NUL-delimited read keeps file names with spaces or newlines intact
find . -name "*.php" -type f -print0 | while IFS= read -r -d '' file; do
if [[ $(hexdump -n 3 -e '3/1 "%02x"' "$file") == "efbbbf" ]]; then
echo "BOM detected: $file"
fi
done
3.3 IDE-Based Detection
Visual Studio Code:
- Status bar shows encoding (look for «UTF-8 with BOM»)
- Extensions: «Fix UTF-8 BOM», «BOM detector»
PhpStorm/IntelliJ:
- File → File Properties → File Encoding
- Settings → Editor → File Encodings → «Transparent native-to-ascii conversion»
Sublime Text:
- View → Show Console → view.encoding()
- Packages: «EncodingHelper», «BOM detector»
4 · BOM Cleaning Methods
4.1 Manual Methods
4.1.1 Text Editor Solutions
Notepad++ (Windows):
- Open file → Encoding menu
- Select «Convert to UTF-8 without BOM»
- Save file (Ctrl+S)
Visual Studio Code:
- Click encoding indicator in status bar
- Select «Save with Encoding»
- Choose «UTF-8» (not «UTF-8 with BOM»)
Sublime Text:
- File → Save with Encoding
- Select «UTF-8» option
- Verify in View → Show Console:
view.encoding()
4.1.2 When Manual Methods Are Appropriate
- Small codebases (< 50 files)
- One-time cleanup operations
- Learning and understanding BOM issues
- Precision editing of specific problematic files
4.2 PHP-based Solutions
4.2.1 Simple BOM Removal Script
<?php
/**
* Basic BOM removal for single files
* Usage: php remove-bom.php filename.php
*/
function removeBOM($filepath) {
$bom = "\xEF\xBB\xBF";
$content = file_get_contents($filepath);
if (strncmp($content, $bom, 3) === 0) {
$cleaned = substr($content, 3);
file_put_contents($filepath, $cleaned);
echo "BOM removed from: $filepath\n";
return true;
}
echo "No BOM found in: $filepath\n";
return false;
}
// Command line usage
if ($argc > 1) {
removeBOM($argv[1]);
}
?>
4.2.2 Enterprise-Grade PHP Solution
<?php
/**
* Enterprise BOM Cleaner v2.0
* Features: Backup, logging, batch processing, error handling
*/
class BOMCleaner {
private $logFile;
private $backupDir;
private $processedCount = 0;
private $cleanedCount = 0;
public function __construct($logFile = 'bom-cleaner.log', $backupDir = 'backups') {
$this->logFile = $logFile;
$this->backupDir = $backupDir;
if (!is_dir($this->backupDir)) {
mkdir($this->backupDir, 0755, true);
}
}
public function processDirectory($directory, $extensions = ['php', 'js', 'css', 'html']) {
$iterator = new RecursiveIteratorIterator(
new RecursiveDirectoryIterator($directory)
);
foreach ($iterator as $file) {
if ($file->isFile()) {
$extension = strtolower($file->getExtension());
if (in_array($extension, $extensions)) {
$this->processFile($file->getPathname());
}
}
}
$this->logMessage("Processing complete. Files processed: {$this->processedCount}, BOM removed: {$this->cleanedCount}");
}
private function processFile($filepath) {
$this->processedCount++;
if (!is_readable($filepath) || !is_writable($filepath)) {
$this->logMessage("Permission denied: $filepath", 'ERROR');
return false;
}
$content = file_get_contents($filepath);
$bom = "\xEF\xBB\xBF";
if (strncmp($content, $bom, 3) === 0) {
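// Create backup; leave the file untouched if the backup cannot be written
$backupPath = $this->backupDir . '/' . basename($filepath) . '.' . time() . '.bak';
if (!copy($filepath, $backupPath)) {
$this->logMessage("Backup failed, skipping: $filepath", 'ERROR');
return false;
}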
// Remove BOM
$cleaned = substr($content, 3);
if (file_put_contents($filepath, $cleaned) !== false) {
$this->cleanedCount++;
$this->logMessage("BOM removed: $filepath (backup: $backupPath)");
} else {
$this->logMessage("Failed to write: $filepath", 'ERROR');
}
}
}
private function logMessage($message, $level = 'INFO') {
$timestamp = date('Y-m-d H:i:s');
$logEntry = "[$timestamp] [$level] $message\n";
file_put_contents($this->logFile, $logEntry, FILE_APPEND | LOCK_EX);
echo $logEntry;
}
}
// Usage example
$cleaner = new BOMCleaner();
$cleaner->processDirectory('./src', ['php', 'js', 'css']);
?>
4.3 Command Line Utilities
4.3.1 sed-based Solutions
Basic BOM removal:
# Remove BOM from single file (GNU sed; \x escapes and a bare -i are not portable to BSD/macOS sed)
sed -i '1s/^\xEF\xBB\xBF//' file.php
# Process multiple files
find . -name "*.php" -exec sed -i '1s/^\xEF\xBB\xBF//' {} \;
# Create backup copies
find . -name "*.php" -exec sed -i.bak '1s/^\xEF\xBB\xBF//' {} \;
Advanced sed with validation:
#!/bin/bash
# sed-bom-cleaner.sh - Production-ready sed solution
# NUL-delimited read keeps file names with spaces or newlines intact
find . -name "*.php" -type f -print0 | while IFS= read -r -d '' file; do
if hexdump -n 3 -e '3/1 "%02x"' "$file" | grep -q "efbbbf"; then
echo "Cleaning BOM from: $file"
sed -i.bom-backup '1s/^\xEF\xBB\xBF//' "$file"
echo "Backup created: ${file}.bom-backup"
fi
done
4.3.2 awk-based Solutions
# Remove BOM and CRLF in single pass
awk 'BEGIN{RS="\r?\n"; ORS="\n"} NR==1{gsub(/^\xEF\xBB\xBF/, "")} {print}' file.php > file.php.clean
mv file.php.clean file.php
# Batch processing with awk (gawk syntax); the redirect and mv run once per file
find . -name "*.php" -type f -print0 | while IFS= read -r -d '' f; do
awk 'BEGIN{RS="\r?\n"; ORS="\n"} NR==1{gsub(/^\xEF\xBB\xBF/, "")} {print}' "$f" > "$f.clean" && mv "$f.clean" "$f"
done
4.3.3 iconv and dos2unix Solutions
# Validate UTF-8 and drop invalid byte sequences (-c); a UTF-8 to UTF-8 conversion keeps a leading BOM
iconv -f utf-8 -t utf-8 -c input.php -o output.php
# Fix line endings, then strip the BOM
dos2unix file.php # Converts CRLF to LF
sed '1s/^\xEF\xBB\xBF//' file.php > file.php.clean # Removes BOM (GNU sed)
4.4 clean-bom-senior.sh Enterprise Solution
4.4.1 Installation and Setup
Quick installation (global access):
# Download and install globally
sudo curl -Lo /usr/local/bin/bom https://github.com/paulmann/Clean_BOM_Senior/raw/main/clean-bom-senior.sh
sudo chmod +x /usr/local/bin/bom
# Verify installation
bom --version
Alternative installation methods:
# User-specific installation
mkdir -p ~/.local/bin
curl -Lo ~/.local/bin/bom https://github.com/paulmann/Clean_BOM_Senior/raw/main/clean-bom-senior.sh
chmod +x ~/.local/bin/bom
# Project-specific installation
mkdir -p tools
cd tools
wget https://github.com/paulmann/Clean_BOM_Senior/raw/main/clean-bom-senior.sh
chmod +x clean-bom-senior.sh
Creating system-wide alias:
# Option 1: Symbolic link (recommended)
sudo ln -s /path/to/clean-bom-senior.sh /usr/local/bin/bom
# Option 2: Shell alias (user-specific)
echo 'alias bom="/path/to/clean-bom-senior.sh"' >> ~/.bashrc
source ~/.bashrc
# Option 3: Function wrapper (advanced)
echo 'bom() { /path/to/clean-bom-senior.sh "$@"; }' >> ~/.bashrc
4.4.2 Basic Usage Patterns
| Command | Purpose | Use Case |
|---|---|---|
| bom | Recursive cleanup | Production deployment prep |
| bom --dry-run | Preview changes | Pre-commit validation |
| bom --verbose | Detailed logging | Development/debugging |
| bom file1.php file2.js | Specific files | Targeted cleanup |
4.4.3 Advanced Features in v2.06.4
Complete File Attribute Preservation:
- Owner/Group (UID/GID): Original ownership maintained
- Permissions: File modes (755, 644, etc.) preserved
- Timestamps: Modification times unchanged for clean files
- Extended attributes: SELinux labels, ACLs preserved
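To illustrate why attribute preservation depends on how the file is rewritten, here is a minimal sketch. It is not the implementation used by clean-bom-senior.sh; the function name strip_bom_in_place is hypothetical, and bash with head, od, tr, tail, mktemp, and cat is assumed. Writing the cleaned bytes back into the existing file keeps the inode, owner, group, and mode, whereas moving a fresh temporary file over the original would replace them. Files without a BOM are never opened for writing, so their modification time is unchanged. Unlike the atomic replacement described next, this in-place write is not atomic.

```shell
#!/usr/bin/env bash
# strip_bom_in_place FILE: remove a leading UTF-8 BOM, keeping owner and mode.
# (Hypothetical helper name; a sketch, not the clean-bom-senior.sh code.)
strip_bom_in_place() {
    local file=$1 tmp
    # Leave clean files completely untouched (mtime stays as it was).
    [ "$(head -c 3 "$file" | od -An -tx1 | tr -d ' \n')" = "efbbbf" ] || return 0
    tmp=$(mktemp) || return 1
    # Copy everything from byte 4 onward, then write it back into the same inode.
    if tail -c +4 "$file" > "$tmp" && cat "$tmp" > "$file"; then
        rm -f "$tmp"
        return 0
    fi
    rm -f "$tmp"
    return 1
}
```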
Atomic Operations with Rollback:
# Automatic backup creation: filename.bak.PID
# Atomic file replacement or complete rollback
# Zero data loss guarantee
Enhanced Statistics and Reporting:
=== PROCESSING SUMMARY ===
Execution time: 3 seconds
Files processed: 247
Files skipped (clean): 203
Errors encountered: 0
--- Issues Fixed ---
BOM signatures removed: 23
CRLF line endings fixed: 21
--- File Type Distribution ---
.php files: 156
.js files: 67
.css files: 24
Process Substitution Fix (Critical v2.06.4 improvement):
- Previous versions: Statistics lost due to pipe subshells
- v2.06.4: Correct statistics using while ... < <(find)
- Impact: Accurate reporting for CI/CD integration
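The difference between the two loop forms can be reproduced in a few lines. This sketch assumes bash with default options (lastpipe off); both function names are hypothetical and only illustrate the mechanism, not the script's actual code. In the piped form the while loop runs in a subshell, so increments to the counter are discarded when the pipeline ends. With process substitution the loop runs in the current shell and the counter survives.

```shell
#!/usr/bin/env bash
# Piped form: the loop body runs in a subshell, so the parent's counter stays 0.
count_with_pipe() {
    local count=0
    find "$1" -type f -name '*.php' | while IFS= read -r line; do
        count=$((count + 1))
    done
    echo "$count"
}

# Process substitution: the loop runs in the current shell and the count is kept.
count_with_process_substitution() {
    local count=0
    while IFS= read -r line; do
        count=$((count + 1))
    done < <(find "$1" -type f -name '*.php')
    echo "$count"
}
```

For a directory containing three .php files, the first function prints 0 and the second prints 3.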
4.4.4 Configuration and Customization
Environment variables:
# Temporary directory override
export TMPDIR="/custom/temp/path"
bom --verbose
# Maximum file size limit (default: 100MB)
# Modify MAX_FILE_SIZE in script for larger files
Supported file extensions (default):
- .php: PHP scripts
- .css: Stylesheets
- .js: JavaScript files
- .txt: Text documents
- .xml: XML files
- .htm/.html: Web pages
5 · DevOps Integration Strategies
5.1 CI/CD Pipeline Integration
5.1.1 GitHub Actions Implementation
Basic workflow (.github/workflows/bom-check.yml):
name: BOM Validation
on:
  push:
    branches: [main, develop]
  pull_request:
    branches: [main]
jobs:
  bom-check:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Install BOM Cleaner
        run: |
          sudo curl -Lo /usr/local/bin/bom https://github.com/paulmann/Clean_BOM_Senior/raw/main/clean-bom-senior.sh
          sudo chmod +x /usr/local/bin/bom
      - name: Check for BOM issues
        run: |
          if ! bom --dry-run --verbose; then
            echo "❌ BOM or CRLF issues detected"
            echo "Run 'bom --verbose' locally to fix"
            exit 1
          fi
          echo "✅ No encoding issues found"
Advanced workflow with auto-fix:
name: BOM Auto-Fix
on:
  push:
    branches: [feature/*]
# The default GITHUB_TOKEN needs write access to push the fix commit
permissions:
  contents: write
jobs:
  auto-fix-bom:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          token: ${{ secrets.GITHUB_TOKEN }}
      - name: Clean BOM issues
        run: |
          sudo curl -Lo /usr/local/bin/bom https://github.com/paulmann/Clean_BOM_Senior/raw/main/clean-bom-senior.sh
          sudo chmod +x /usr/local/bin/bom
          bom --verbose
      - name: Commit fixes
        run: |
          git config --local user.email "action@github.com"
          git config --local user.name "GitHub Action"
          git add -A
          git diff --staged --quiet || git commit -m "🧹 Auto-fix: Remove BOM and normalize line endings"
          git push
5.1.2 GitLab CI Configuration
.gitlab-ci.yml example:
stages:
  - validate
  - cleanup
variables:
  BOM_VERSION: "v2.06.4"
bom_check:
  stage: validate
  image: alpine:latest
  before_script: &install_bom
    - apk add --no-cache curl bash findutils coreutils
    - curl -Lo /usr/local/bin/bom https://github.com/paulmann/Clean_BOM_Senior/raw/main/clean-bom-senior.sh
    - chmod +x /usr/local/bin/bom
  script:
    - bom --dry-run --verbose
  rules:
    - if: '$CI_PIPELINE_SOURCE == "merge_request_event"'
    - if: '$CI_COMMIT_BRANCH == "main"'
bom_cleanup:
  stage: cleanup
  image: alpine:latest
  # Each job starts in a fresh container, so the tool is installed again here
  before_script: *install_bom
  script:
    - bom --verbose
  artifacts:
    reports:
      junit: bom-report.xml
    paths:
      - "*.bak.*"
    expire_in: 1 week
  only:
    - develop
    - /^release\/.*$/
5.1.3 Jenkins Pipeline Integration
pipeline {
agent any
stages {
stage('Setup') {
steps {
sh '''
curl -Lo /tmp/bom https://github.com/paulmann/Clean_BOM_Senior/raw/main/clean-bom-senior.sh
chmod +x /tmp/bom
'''
}
}
stage('BOM Validation') {
steps {
script {
def bomCheck = sh(
script: '/tmp/bom --dry-run --verbose',
returnStatus: true
)
if (bomCheck != 0) {
error "BOM or encoding issues detected. Please run BOM cleaner locally."
}
}
}
}
stage('Deploy') {
when { branch 'main' }
steps {
sh '/tmp/bom --verbose' // Clean before deployment
// Deploy steps here
}
}
}
post {
always {
archiveArtifacts artifacts: '*.bak.*', allowEmptyArchive: true
cleanWs()
}
}
}
5.2 Git Hooks Implementation
5.2.1 Pre-commit Hook
.git/hooks/pre-commit (executable):
#!/bin/bash
# Pre-commit hook: Prevent commits with BOM/CRLF issues
echo "🔍 Checking for BOM and encoding issues..."
# Check if bom command exists
if ! command -v bom &> /dev/null; then
echo "⚠️ BOM cleaner not installed. Installing..."
curl -Lo /tmp/bom https://github.com/paulmann/Clean_BOM_Senior/raw/main/clean-bom-senior.sh
chmod +x /tmp/bom
BOM_CMD="/tmp/bom"
else
BOM_CMD="bom"
fi
# Get staged files
STAGED_FILES=$(git diff --cached --name-only --diff-filter=ACM | grep -E '\.(php|js|css|html|xml)$')
if [ -z "$STAGED_FILES" ]; then
echo "✅ No relevant files to check"
exit 0
fi
# Check staged files only
ISSUES_FOUND=false
for file in $STAGED_FILES; do
if [ -f "$file" ]; then
if ! $BOM_CMD "$file" --dry-run &> /dev/null; then
echo "❌ Issues found in: $file"
ISSUES_FOUND=true
fi
fi
done
if [ "$ISSUES_FOUND" = true ]; then
echo ""
echo "🚨 BOM or CRLF issues detected in staged files!"
echo "💡 Run the following command to fix:"
echo " $BOM_CMD --verbose"
echo ""
echo "Then stage and commit your changes again."
exit 1
fi
echo "✅ No encoding issues detected"
exit 0
5.2.2 Pre-push Hook
.git/hooks/pre-push (executable):
#!/bin/bash
# Pre-push hook: Auto-clean before pushing
protected_branch='main'
current_branch=$(git symbolic-ref --short HEAD)
if [ "$protected_branch" = "$current_branch" ]; then
echo "🧹 Cleaning BOM issues before pushing to main..."
if command -v bom &> /dev/null; then
bom --verbose
# Check if any files were modified
if [ -n "$(git diff --name-only)" ]; then
echo "📝 Files were cleaned. Please review and commit:"
git diff --name-only
echo ""
echo "Run: git add . && git commit -m 'Clean BOM and line endings'"
exit 1
fi
echo "✅ Repository is clean"
else
echo "⚠️ BOM cleaner not available. Please install clean-bom-senior.sh"
fi
fi
exit 0
5.2.3 Husky Integration (Node.js projects)
package.json:
{
"husky": {
"hooks": {
"pre-commit": "bom --dry-run && lint-staged",
"pre-push": "bom --verbose"
}
},
"lint-staged": {
"*.{php,js,css}": [
"bom --verbose",
"git add"
]
}
}
5.3 Docker Integration
5.3.1 Multi-stage Dockerfile
# Multi-stage build with BOM cleaning
FROM alpine:latest AS cleaner
RUN apk add --no-cache curl bash findutils coreutils
RUN curl -Lo /usr/local/bin/bom https://github.com/paulmann/Clean_BOM_Senior/raw/main/clean-bom-senior.sh
RUN chmod +x /usr/local/bin/bom
FROM php:8.2-fpm-alpine
# Copy cleaner tool
COPY --from=cleaner /usr/local/bin/bom /usr/local/bin/bom
# Copy application code
COPY . /var/www/html
WORKDIR /var/www/html
# Clean BOM issues during build
RUN bom --verbose || true
# Remove cleaner tool from final image (optional)
RUN rm -f /usr/local/bin/bom
# Set proper ownership and permissions
RUN chown -R www-data:www-data /var/www/html
RUN find /var/www/html -type d -exec chmod 755 {} \;
RUN find /var/www/html -type f -exec chmod 644 {} \;
EXPOSE 9000
CMD ["php-fpm"]
5.3.2 Docker Compose Integration
docker-compose.yml:
version: '3.8'
services:
  app:
    build:
      context: .
      dockerfile: Dockerfile
    volumes:
      - ./src:/var/www/html/src
    depends_on:
      - bom-cleaner
  bom-cleaner:
    image: alpine:latest
    volumes:
      - ./src:/workspace
    working_dir: /workspace
    command: sh -c "
      apk add --no-cache curl bash findutils coreutils &&
      curl -Lo /usr/local/bin/bom https://github.com/paulmann/Clean_BOM_Senior/raw/main/clean-bom-senior.sh &&
      chmod +x /usr/local/bin/bom &&
      bom --verbose
      "
5.3.3 Kubernetes Job
apiVersion: batch/v1
kind: Job
metadata:
  name: bom-cleaner
spec:
  template:
    spec:
      containers:
        - name: cleaner
          image: alpine:latest
          command: ["/bin/sh"]
          args:
            - -c
            - |
              apk add --no-cache curl bash findutils coreutils
              curl -Lo /usr/local/bin/bom https://github.com/paulmann/Clean_BOM_Senior/raw/main/clean-bom-senior.sh
              chmod +x /usr/local/bin/bom
              bom --verbose /workspace
          volumeMounts:
            - name: source-code
              mountPath: /workspace
      volumes:
        - name: source-code
          persistentVolumeClaim:
            claimName: source-code-pvc
      restartPolicy: Never
  backoffLimit: 3
6 · BOM Prevention Best Practices
6.1 IDE Configuration Standards
6.1.1 Visual Studio Code
Settings configuration (.vscode/settings.json):
{
"files.encoding": "utf8",
"files.autoGuessEncoding": false,
"files.insertFinalNewline": true,
"files.trimFinalNewlines": true,
"files.trimTrailingWhitespace": true,
"files.eol": "\n",
"editor.renderWhitespace": "boundary"
}
Recommended extensions:
- EncodingHelper: Visual encoding indicators
- BOM Detector: Automatic BOM detection and removal
- EditorConfig: Consistent encoding across team members
6.1.2 PhpStorm/IntelliJ Configuration
File encoding settings:
- Settings → Editor → File Encodings
- Set «Global Encoding» to UTF-8
- Set «Project Encoding» to UTF-8
- Disable «Transparent native-to-ascii conversion»
- Disable «BOM for UTF-8 files»
Code style configuration (.idea/encodings.xml):
<?xml version="1.0" encoding="UTF-8"?>
<project version="4">
  <component name="Encoding" defaultCharsetForPropertiesFiles="UTF-8">
    <file url="PROJECT" charset="UTF-8" />
  </component>
</project>
6.1.3 Sublime Text Configuration
User settings (Preferences → Settings):
{
"default_encoding": "UTF-8",
"fallback_encoding": "UTF-8",
"show_encoding": true,
"default_line_ending": "unix"
}
6.2 EditorConfig Implementation
.editorconfig file (project root):
# EditorConfig: https://editorconfig.org/
root = true
[*]
indent_style = space
indent_size = 4
end_of_line = lf
charset = utf-8
trim_trailing_whitespace = true
insert_final_newline = true
[*.{js,css,json,yml,yaml}]
indent_size = 2
[*.md]
trim_trailing_whitespace = false
[*.php]
indent_size = 4
max_line_length = 120
6.3 Git Configuration
6.3.1 Repository-wide Settings
.gitattributes file:
# Handle line endings automatically for files detected as text
* text=auto
# Explicitly set text files
*.php text eol=lf
*.js text eol=lf
*.css text eol=lf
*.html text eol=lf
*.xml text eol=lf
*.json text eol=lf
*.md text eol=lf
*.yml text eol=lf
*.yaml text eol=lf
# Explicitly set binary files
*.png binary
*.jpg binary
*.gif binary
*.ico binary
*.pdf binary
6.3.2 Global Git Configuration
# Set line ending handling globally (choose ONE core.autocrlf value per machine)
git config --global core.autocrlf false # Unix/Linux/macOS
# Windows alternative (if needed); note that core.eol is ignored when autocrlf is true:
# git config --global core.autocrlf true
git config --global core.eol lf
# Enable whitespace detection
git config --global core.whitespace trailing-space,space-before-tab
6.4 Team Development Standards
6.4.1 CONTRIBUTING.md Template
# Contribution Guidelines
## Code Standards
### File Encoding
- All source files MUST use UTF-8 encoding **without BOM**
- Line endings MUST be Unix-style (LF, not CRLF)
- Files MUST end with a single newline character
### Pre-commit Checklist
1. Run `bom --dry-run` to check for encoding issues
2. Ensure your IDE is configured for UTF-8 without BOM
3. Verify line endings are consistent (LF only)
### Automated Checks
Our CI/CD pipeline automatically checks for:
- BOM markers in source files
- Inconsistent line endings
- Trailing whitespace
### Quick Fix
If you encounter BOM issues:

Install cleaner tool

    sudo curl -Lo /usr/local/bin/bom https://github.com/paulmann/Clean_BOM_Senior/raw/main/clean-bom-senior.sh
    sudo chmod +x /usr/local/bin/bom

Clean your code

    bom --verbose

### IDE Setup
Please configure your IDE according to our [IDE Configuration Guide](docs/IDE-SETUP.md).
6.4.2 Onboarding Checklist
# Developer Onboarding - Encoding Setup
## ✅ Required Setup Steps
1. **Install BOM cleaner tool**

       sudo curl -Lo /usr/local/bin/bom https://github.com/paulmann/Clean_BOM_Senior/raw/main/clean-bom-senior.sh
       sudo chmod +x /usr/local/bin/bom

2. **Configure Git settings**

       git config --global core.autocrlf false
       git config --global core.eol lf

3. **IDE Configuration** (choose your primary IDE)
   - [ ] VS Code: Install EditorConfig extension + apply team settings
   - [ ] PhpStorm: Configure File Encodings (UTF-8, no BOM)
   - [ ] Sublime Text: Set default encoding to UTF-8
4. **Verify setup**

       # This should show no issues
       bom --dry-run

## 🔧 Troubleshooting
**Problem**: Git shows file changes but no actual content changes
- **Solution**: Line ending mismatch - run `bom --verbose` to normalize
**Problem**: PHP namespace errors in new files
- **Solution**: BOM in file - your IDE is set to UTF-8 with BOM
**Problem**: Build failures in CI/CD
- **Solution**: Encoding issues - run `bom --dry-run` locally first
7 · Practical Recipes and Workflows
7.1 Language-Specific Workflows
7.1.1 PHP Project Cleanup
Complete PHP project sanitization:
#!/bin/bash
# php-project-clean.sh - Comprehensive PHP cleanup
echo "🧹 Starting PHP project cleanup..."
# 1. Clean BOM and line endings
echo "Step 1: Cleaning BOM and line endings..."
bom --verbose
# 2. Fix PHP-specific issues
echo "Step 2: PHP-specific fixes..."
find . -name "*.php" -type f | while read -r file; do
# Remove trailing PHP closing tags (PSR-2 compliance)
if tail -1 "$file" | grep -q "?>"; then
sed -i '$ { /^?>$/ d }' "$file"
echo "Removed closing tag: $file"
fi
# Ensure single newline at EOF
if [ -s "$file" ]; then
if [ "$(tail -c1 "$file")" != "" ]; then
echo "" >> "$file"
echo "Added EOF newline: $file"
fi
fi
done
# 3. Composer file cleanup
if [ -f "composer.json" ]; then
echo "Step 3: Cleaning composer.json..."
# Remove BOM from composer files
bom composer.json composer.lock 2>/dev/null || true
fi
echo "✅ PHP project cleanup complete!"
7.1.2 JavaScript/Node.js Cleanup
#!/bin/bash
# js-project-clean.sh - JavaScript project cleanup
echo "🧹 JavaScript project cleanup..."
# Clean source files
bom --verbose
# Clean package.json files
find . -name "package*.json" -exec bom {} \;
# Clean configuration files
for config in .eslintrc .prettierrc tsconfig.json webpack.config.js; do
[ -f "$config" ] && bom "$config"
done
# Node.js specific: clean JavaScript and TypeScript files
# -prune skips node_modules entirely instead of walking it and filtering afterwards
find . -name node_modules -prune -o -type f \( -name "*.js" -o -name "*.ts" -o -name "*.jsx" -o -name "*.tsx" \) -print | while IFS= read -r file; do
echo "Processing: $file"
bom "$file"
done
echo "✅ JavaScript cleanup complete!"
7.2 Automated Maintenance Scripts
7.2.1 Weekly Maintenance Cron Job
/etc/cron.d/bom-cleanup:
# Weekly BOM cleanup for all project directories
# Runs every Sunday at 3 AM
SHELL=/bin/bash
PATH=/usr/local/bin:/usr/bin:/bin
MAILTO=devops@company.com
# Cleanup main projects
0 3 * * 0 root /usr/local/bin/bom /var/www/project1 --verbose >> /var/log/bom-cleanup.log 2>&1
5 3 * * 0 root /usr/local/bin/bom /var/www/project2 --verbose >> /var/log/bom-cleanup.log 2>&1
# Cleanup user directories (if needed)
10 3 * * 0 root find /home -name "*.php" -path "*/public_html/*" -exec /usr/local/bin/bom {} \; >> /var/log/bom-cleanup.log 2>&1
7.2.2 Monitoring and Alerting Script
#!/bin/bash
# bom-monitor.sh - Monitor for BOM issues and alert
LOG_FILE="/var/log/bom-monitor.log"
ALERT_EMAIL="devops@company.com"
PROJECTS_DIR="/var/www"
# Function to send alerts
send_alert() {
local subject="$1"
local message="$2"
echo "$(date): $message" >> "$LOG_FILE"
# Send email alert (requires mail command)
if command -v mail &> /dev/null; then
echo "$message" | mail -s "$subject" "$ALERT_EMAIL"
fi
# Send Slack notification (if webhook configured)
if [ -n "$SLACK_WEBHOOK" ]; then
curl -X POST -H 'Content-type: application/json' \
--data "{\"text\":\"🚨 $subject\n$message\"}" \
"$SLACK_WEBHOOK"
fi
}
# Scan for BOM issues
echo "$(date): Starting BOM monitoring scan..." >> "$LOG_FILE"
ISSUES_FOUND=false
for project in "$PROJECTS_DIR"/*; do
if [ -d "$project" ]; then
project_name=$(basename "$project")
# Check for BOM issues
if ! bom --dry-run "$project" &> /dev/null; then
ISSUES_FOUND=true
send_alert "BOM Issues Detected in $project_name" \
"BOM or encoding issues found in project: $project_name\nPath: $project\nPlease run 'bom --verbose $project' to fix."
fi
fi
done
if [ "$ISSUES_FOUND" = false ]; then
echo "$(date): No BOM issues detected across all projects" >> "$LOG_FILE"
fi
7.3 Integration with Build Tools
7.3.1 Makefile Integration
# Makefile with BOM cleaning targets
# Recipe lines must start with a tab character. Recipes run under /bin/sh,
# so the bash-only "&>" redirect is written as ">/dev/null 2>&1".
.PHONY: clean-bom check-bom install-bom test-clean pre-commit build deploy help
# Default target
help:
	@echo "Available targets:"
	@echo " install-bom - Install BOM cleaner tool"
	@echo " check-bom - Check for BOM/encoding issues"
	@echo " clean-bom - Clean BOM/encoding issues"
	@echo " test-clean - Run tests after cleaning"
# Install BOM cleaner
install-bom:
	@if ! command -v bom >/dev/null 2>&1; then \
		echo "Installing BOM cleaner..."; \
		sudo curl -Lo /usr/local/bin/bom https://github.com/paulmann/Clean_BOM_Senior/raw/main/clean-bom-senior.sh; \
		sudo chmod +x /usr/local/bin/bom; \
		echo "✅ BOM cleaner installed"; \
	else \
		echo "✅ BOM cleaner already installed"; \
	fi
# Check for issues without fixing
check-bom: install-bom
	@echo "🔍 Checking for BOM/encoding issues..."
	@bom --dry-run --verbose
# Clean BOM issues
clean-bom: install-bom
	@echo "🧹 Cleaning BOM/encoding issues..."
	@bom --verbose
# Run tests after cleaning
test-clean: clean-bom
	@echo "🧪 Running tests after cleanup..."
	@composer test
# Pre-commit hook simulation
pre-commit: check-bom
	@if ! bom --dry-run >/dev/null 2>&1; then \
		echo "❌ BOM issues detected. Run 'make clean-bom' first."; \
		exit 1; \
	fi
	@echo "✅ No BOM issues detected"
# Build target with cleaning
build: clean-bom
	@echo "🏗️ Building project..."
	@composer install --no-dev --optimize-autoloader
	@npm run build
# Deploy target
deploy: build test-clean
	@echo "🚀 Deploying..."
# Add deployment commands here
7.3.2 npm Scripts Integration
package.json:
{
"scripts": {
"prebuild": "bom --verbose",
"build": "webpack --mode production",
"pretest": "bom --dry-run",
"test": "jest",
"clean:bom": "bom --verbose",
"check:bom": "bom --dry-run",
"lint": "eslint . && bom --dry-run",
"prepare": "husky install",
"postinstall": "bom --dry-run || echo 'Warning: BOM issues detected'"
},
"husky": {
"hooks": {
"pre-commit": "npm run check:bom && lint-staged"
}
}
}
7.3.3 Composer Scripts (PHP)
composer.json:
{
"scripts": {
"clean-bom": "bom --verbose",
"check-bom": "bom --dry-run",
"pre-autoload-dump": "@check-bom",
"post-install-cmd": [
"@check-bom"
],
"post-update-cmd": [
"@clean-bom"
],
"test": [
"@check-bom",
"phpunit"
]
},
"scripts-descriptions": {
"clean-bom": "Clean BOM and line ending issues",
"check-bom": "Check for BOM and encoding issues without fixing"
}
}
8 · Troubleshooting Common Issues
8.1 Permission and Access Problems
8.1.1 Permission Denied Errors
Problem: Cannot write to file: protected.php
Diagnosis:
# Check file permissions
ls -la protected.php
# Check directory permissions
ls -la $(dirname protected.php)
# Check file ownership
stat protected.php
Solutions:
# Solution 1: Fix file permissions
chmod 644 protected.php
# Solution 2: Fix ownership (if you're the owner)
chown "$USER":"$(id -gn)" protected.php # $GROUP is not set by default in most shells
# Solution 3: Run with sudo (use carefully)
sudo bom --verbose
# Solution 4: Fix directory permissions
chmod 755 "$(dirname protected.php)"
8.1.2 SELinux and ACL Issues
Problem: Permission denied despite correct ownership/permissions
Diagnosis:
# Check SELinux context
ls -Z file.php
# Check ACLs
getfacl file.php
# Check SELinux denials
ausearch -m AVC -ts recent
Solutions:
# Fix SELinux context
restorecon -v file.php
# Or set appropriate context
chcon -t httpd_sys_content_t file.php # content type for web-served source files; httpd_exec_t is for the server binary
# Fix ACLs if needed
setfacl -m u:$USER:rw file.php
8.2 Large File Handling
8.2.1 File Size Limitations
Problem: Files larger than 100MB are skipped
Diagnosis:
# Find large files
find . -name "*.php" -size +100M -exec ls -lh {} \;
# Check memory usage during processing
top -p $(pgrep clean-bom-senior)
Solutions:
# Solution 1: Process large files individually
bom large-file.php
# Solution 2: Modify MAX_FILE_SIZE in script
# Edit clean-bom-senior.sh:
# MAX_FILE_SIZE=$((500 * 1024 * 1024)) # 500MB
# Solution 3: Split large files if possible
split -l 10000 huge-file.php huge-file-part-
# Solution 4: Use streaming approach for very large files
sed '1s/^\xEF\xBB\xBF//' huge-file.php > huge-file-clean.php
8.2.2 Memory Management
Problem: Out of memory errors on large files
Solution with streaming:
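#!/bin/bash
# stream-bom-cleaner.sh - Memory-efficient BOM removal
process_large_file() {
    local file="$1" temp
    # Check if file starts with BOM
    if od -An -tx1 -N3 "$file" | tr -d ' ' | grep -q '^efbbbf$'; then
        echo "Processing large file: $file"
        # Create the temp file next to the target so the final mv is a rename on the same filesystem
        temp=$(mktemp "${file}.XXXXXX") || return 1
        # Stream everything after the first 3 bytes; dd with bs=1 copies one byte per call and is very slow
        # chmod --reference (GNU) carries the original mode over to the replacement
        if tail -c +4 "$file" > "$temp" && chmod --reference="$file" "$temp"; then
            # Replace original file
            mv "$temp" "$file"
            echo "BOM removed from: $file"
        else
            rm -f "$temp"
            return 1
        fi
    else
        echo "No BOM in: $file"
    fi
}
# Usage
process_large_file "$1"
8.3 Character Encoding Issues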
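The diagnostics in this section all come down to reading the first few bytes of a file. The following is a minimal sketch, assuming a hypothetical helper named detect_bom (it is not part of clean-bom-senior.sh), that maps those bytes to the standard Unicode BOM signatures:

```shell
#!/usr/bin/env bash
# detect_bom FILE: print the BOM type found at the start of FILE.
# Hypothetical helper for illustration only.
detect_bom() {
    local file="$1" hex
    # Read at most 4 bytes and render them as one lowercase hex string.
    hex=$(head -c 4 "$file" | od -An -tx1 | tr -d ' \n')
    case "$hex" in
        efbbbf*)  echo "UTF-8" ;;
        fffe0000) echo "UTF-32LE" ;;  # tested before UTF-16LE, which shares the FF FE prefix
        0000feff) echo "UTF-32BE" ;;
        fffe*)    echo "UTF-16LE" ;;
        feff*)    echo "UTF-16BE" ;;
        *)        echo "none" ;;
    esac
}
```

A call such as `detect_bom file.php` prints one of the labels above. The bytes FF FE 00 00 are inherently ambiguous (UTF-32LE, or UTF-16LE followed by a NUL character), so the sketch simply prefers the longer signature.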
8.3.1 Mixed Encoding Detection
Problem: Files with mixed encodings causing corruption
Diagnosis:
# Detect file encoding
file -bi file.php
# Check for non-UTF-8 characters
iconv -f utf-8 -t utf-8 file.php > /dev/null || echo "Encoding issues detected"
# Visual inspection with cat -v
cat -v file.php | head -20
Solutions:
# Solution 1: Convert to UTF-8
iconv -f iso-8859-1 -t utf-8 file.php -o file.php.utf8
mv file.php.utf8 file.php
# Solution 2: Auto-detect and convert
chardetect file.php # Command installed by: pip install chardet
# Then convert based on detected encoding
# Solution 3: Force UTF-8 conversion (lossy: -c silently drops invalid byte sequences)
iconv -f utf-8 -t utf-8 -c file.php -o file.php.clean
8.3.2 Invisible Character Problems
Problem: Files appear clean but still cause issues
Diagnosis:
# Show all bytes including invisible ones
hexdump -C file.php | head -5
# Check for various Unicode BOMs
od -An -tx1 -N10 file.php
# Look for other invisible characters
cat -A file.php | head -10
Solutions:
# Remove all invisible characters (aggressive: also deletes every non-ASCII byte, including valid UTF-8 text)
tr -cd '\11\12\15\40-\176' < file.php > file.php.clean
# Remove specific problematic characters (GNU sed)
sed 's/[\x00-\x08\x0B\x0C\x0E-\x1F\x7F]//g' file.php > file.php.clean
# Use bom tool with verbose output to see what's found
bom --verbose file.php
8.4 Git Integration Issues
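A hook that reports exactly which staged file failed is easier to debug than one that fails silently. The sketch below is one possible shape for such a check, not the hook shipped with clean-bom-senior.sh; the function names stdin_has_bom and check_staged_php are hypothetical. It inspects the staged copy of each file rather than the working-tree copy, since the staged copy is what actually gets committed.

```shell
#!/usr/bin/env bash
# Sketch of a pre-commit BOM check (illustrative names, bash required).

# Succeeds if the data on stdin starts with the UTF-8 BOM (EF BB BF).
stdin_has_bom() {
    [ "$(head -c 3 | od -An -tx1 | tr -d ' \n')" = "efbbbf" ]
}

# Inspect the staged (index) version of every added, copied, or modified PHP file.
check_staged_php() {
    local file status=0
    while IFS= read -r -d '' file; do
        if git show ":$file" | stdin_has_bom; then
            echo "BOM found in staged file: $file" >&2
            status=1
        fi
    done < <(git diff --cached --name-only -z --diff-filter=ACM -- '*.php')
    return "$status"
}
```

Inside .git/hooks/pre-commit the functions would be followed by a line such as `check_staged_php || exit 1`, so that any reported file blocks the commit.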
8.4.1 Git Hook Failures
Problem: Pre-commit hooks failing silently
Diagnosis:
# Test hook manually
.git/hooks/pre-commit
# Check hook permissions
ls -la .git/hooks/pre-commit
# Debug with set -x
sed -i '1a set -x' .git/hooks/pre-commit
Solutions:
# Fix permissions
chmod +x .git/hooks/pre-commit
# Add error handling at the top of the hook file
#!/bin/bash
set -euo pipefail # Exit on errors
# Add logging (also inside the hook file)
exec > >(tee -a /tmp/pre-commit.log) 2>&1
8.4.2 Line Ending Confusion
Problem: Git reports changes but files look identical
Diagnosis:
# Check line endings
file file.php
# Show line endings visually
cat -e file.php
# Check Git's line ending handling
git config --get core.autocrlf
git config --get core.eol
Solutions:
# Fix Git configuration
git config core.autocrlf false
git config core.eol lf
# Update .gitattributes (append so existing rules are kept; text=auto leaves binary files alone)
echo "* text=auto eol=lf" >> .gitattributes
# Normalize line endings once the configuration is in place
git add --renormalize .
8.5 Performance Optimization
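Most files in a typical tree carry no BOM, so the cheapest speed-up is to hand the cleaner only the files that need it. The sketch below assumes a hypothetical helper named list_bom_files; it prints matching paths NUL-delimited so they can be piped into xargs safely.

```shell
#!/usr/bin/env bash
# list_bom_files DIR: print, NUL-delimited, every *.php file under DIR
# whose first three bytes are EF BB BF. Illustrative helper, bash required.
list_bom_files() {
    local dir="${1:-.}" file
    while IFS= read -r -d '' file; do
        # Only the first 3 bytes are read, so file size does not matter here.
        if [ "$(head -c 3 "$file" | od -An -tx1 | tr -d ' \n')" = "efbbbf" ]; then
            printf '%s\0' "$file"
        fi
    done < <(find "$dir" -type f -name '*.php' -print0)
}
```

One way to combine it with parallel batches is `list_bom_files . | xargs -0 -r -n 50 -P 4 bom`. This assumes, as the batch example in 8.5.2 does, that bom accepts several paths per invocation; `-r` (skip the run when there is no input) is a GNU xargs option.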
8.5.1 Slow Processing on Large Codebases
Problem: BOM cleaning takes too long on large projects
Optimization strategies:
# Strategy 1: Parallel processing
find . -name "*.php" -type f -print0 | xargs -0 -P 4 -I {} bom {}
# Strategy 2: Skip clean files faster
#!/bin/bash
# fast-bom-check.sh - Skip obviously clean files
# Read NUL-delimited names so paths containing spaces or newlines survive
while IFS= read -r -d '' file; do
    # Quick check - only process if BOM detected
    if od -An -tx1 -N3 "$file" 2>/dev/null | tr -d ' ' | grep -q '^efbbbf$'; then
        bom "$file"
    fi
done < <(find . -name "*.php" -type f -print0)
# Strategy 3: Process only recently modified files
find . -name "*.php" -type f -mtime -7 -exec bom {} \;
# Strategy 4: Use exclude patterns
bom --verbose --exclude="vendor/*" --exclude="node_modules/*"
8.5.2 I/O Optimization
Problem: Disk I/O bottlenecks
Solutions:
# Use RAM disk for temporary files (the mount point must exist first)
sudo mkdir -p /tmp/bom-work
sudo mount -t tmpfs -o size=1G tmpfs /tmp/bom-work
export TMPDIR=/tmp/bom-work
# Process files in batches
find . -name "*.php" -type f -print0 | xargs -0 -n 50 bom
# Use SSD for temporary operations
export TMPDIR=/fast/ssd/temp
9 · Enterprise Deployment Considerations
9.1 Security and Compliance
9.1.1 Security Best Practices
Principle of Least Privilege:
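# Create dedicated user for BOM operations
sudo useradd -r -s /bin/false bom-cleaner
# Set up sudo rules for limited access (tee runs as root; a plain >> redirect would run as the calling user)
echo "devops ALL=(bom-cleaner) NOPASSWD: /usr/local/bin/bom" | sudo tee /etc/sudoers.d/bom-cleaner > /dev/null
sudo chmod 440 /etc/sudoers.d/bom-cleaner
# Validate the syntax before relying on the rule
sudo visudo -cf /etc/sudoers.d/bom-cleaner
# Use in scripts
sudo -u bom-cleaner bom --verbose /var/www/project
Audit Trail: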
# Enable detailed logging
export BOM_LOG_LEVEL=DEBUG
export BOM_LOG_FILE="/var/log/bom-operations.log"
# Log all operations with user context
logger -t bom-cleaner "User $USER executed BOM cleanup on $(pwd)"
# Integrate with centralized logging (syslog)
bom --verbose 2>&1 | logger -t bom-cleaner
9.1.2 Compliance Requirements
File Integrity Monitoring:
#!/bin/bash
# fim-bom-integration.sh - AIDE/Tripwire integration
# Create checksums before cleaning (sorted by path so the two lists line up in diff)
find /var/www -name "*.php" -type f -exec sha256sum {} + | sort -k2 > /tmp/checksums-before
# Perform BOM cleaning
bom --verbose /var/www
# Create checksums after cleaning
find /var/www -name "*.php" -type f -exec sha256sum {} + | sort -k2 > /tmp/checksums-after
# Report changes
diff /tmp/checksums-before /tmp/checksums-after > /var/log/bom-changes.log
# Update FIM database
aide --update
SOX/PCI Compliance:
# Segregation of duties - require approval for production
if [[ "$ENVIRONMENT" == "production" ]]; then
    echo "Production BOM cleaning requires approval ticket"
    read -r -p "Enter approval ticket number: " ticket
    # Refuse to continue without a ticket reference
    if [[ -z "$ticket" ]]; then
        echo "No ticket supplied; aborting" >&2
        exit 1
    fi
    logger -t bom-cleaner "Production cleanup authorized by ticket: $ticket"
fi
# Change management integration
curl -X POST "$CHANGE_MGMT_API/bom-cleanup" \
    -H "Authorization: Bearer $TOKEN" \
    -H "Content-Type: application/json" \
    -d "{\"environment\": \"$ENVIRONMENT\", \"files_processed\": $COUNT}"
9.2 Monitoring and Alerting
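Metrics files are read by another process on its own schedule, so a file that is still being written can be scraped half-finished. A minimal sketch of an atomic write follows, assuming a hypothetical helper named write_bom_metrics and the metric names used in this section:

```shell
#!/usr/bin/env bash
# write_bom_metrics FILE WITH_BOM TOTAL: write the metrics to FILE atomically.
# Illustrative helper; the counts are supplied by the caller.
write_bom_metrics() {
    local target="$1" with_bom="$2" total="$3" tmp
    # The temp file lives in the same directory, so mv is a rename, not a copy.
    tmp=$(mktemp "${target}.XXXXXX") || return 1
    {
        echo "# HELP bom_files_with_issues Number of files with BOM issues"
        echo "# TYPE bom_files_with_issues gauge"
        echo "bom_files_with_issues ${with_bom}"
        echo "# HELP bom_total_files_checked Total number of files checked"
        echo "# TYPE bom_total_files_checked gauge"
        echo "bom_total_files_checked ${total}"
    } > "$tmp"
    chmod 644 "$tmp"
    mv -f "$tmp" "$target"
}
```

Readers of the target path see either the previous complete file or the new complete file. The temporary name ends in a random suffix rather than .prom, so a collector that only picks up *.prom files does not read it mid-write.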
9.2.1 Health Checks and Monitoring
Nagios/Icinga Plugin:
#!/bin/bash
# check_bom_issues.sh - Monitoring plugin
STATE_OK=0
STATE_WARNING=1
STATE_CRITICAL=2
STATE_UNKNOWN=3
PROJECT_DIR="/var/www/html"
TEMP_LOG=$(mktemp) || { echo "UNKNOWN - cannot create temp file"; exit $STATE_UNKNOWN; }
# Remove the temp log on every exit path
trap 'rm -f "$TEMP_LOG"' EXIT
# Run BOM check
if bom --dry-run "$PROJECT_DIR" > "$TEMP_LOG" 2>&1; then
    echo "OK - No BOM issues detected"
    exit $STATE_OK
else
    # grep -c already prints 0 when nothing matches; "|| echo 0" would add a second line and break the numeric test
    ISSUE_COUNT=$(grep -c "Would process:" "$TEMP_LOG" 2>/dev/null || true)
    if [ "${ISSUE_COUNT:-0}" -gt 0 ]; then
        echo "WARNING - $ISSUE_COUNT files with BOM issues detected"
        exit $STATE_WARNING
    else
        echo "CRITICAL - BOM checker failed to run"
        cat "$TEMP_LOG"
        exit $STATE_CRITICAL
    fi
fi
Prometheus Metrics:
#!/bin/bash
# bom-metrics-exporter.sh - Export metrics for Prometheus
METRICS_FILE="/var/lib/prometheus/node-exporter/bom-status.prom"
# Run BOM check and capture metrics
BOM_OUTPUT=$(bom --dry-run --verbose 2>&1)
# grep -c prints 0 itself when nothing matches; "|| true" only masks its exit status
FILES_WITH_BOM=$(echo "$BOM_OUTPUT" | grep -c "Would process:" || true)
TOTAL_FILES=$(echo "$BOM_OUTPUT" | grep -o "Files processed: [0-9]*" | awk '{print $3}')
# Default to 0 when the summary line is missing (the pipeline's exit status is awk's, so "|| echo 0" never fires)
TOTAL_FILES=${TOTAL_FILES:-0}
# Export metrics
cat > "$METRICS_FILE" << EOF
# HELP bom_files_with_issues Number of files with BOM issues
# TYPE bom_files_with_issues gauge
bom_files_with_issues $FILES_WITH_BOM
# HELP bom_total_files_checked Total number of files checked
# TYPE bom_total_files_checked gauge
bom_total_files_checked $TOTAL_FILES
# HELP bom_last_check_timestamp Unix timestamp of last check
# TYPE bom_last_check_timestamp gauge
bom_last_check_timestamp $(date +%s)
EOF
9.2.2 Integration with Monitoring Systems
Grafana Dashboard Configuration:
{
"dashboard": {
"title": "BOM Cleanup Monitoring",
"panels": [
{
"title": "Files with BOM Issues",
"type": "stat",
"targets": [
{
"expr": "bom_files_with_issues",
"legendFormat": "Files with Issues"
}
]
},
{
"title": "BOM Issues Over Time",
"type": "graph",
"targets": [
{
"expr": "bom_files_with_issues",
"legendFormat": "BOM Issues"
}
]
}
]
}
}
ELK Stack Integration:
# Filebeat configuration for BOM logs
# /etc/filebeat/conf.d/bom.yml
filebeat.inputs:
  - type: log
    enabled: true
    paths:
      - /var/log/bom-operations.log
    fields:
      logtype: bom-cleaner
    multiline.pattern: '^\[\d{4}-\d{2}-\d{2}'
    multiline.negate: true
    multiline.match: after
output.elasticsearch:
  hosts: ["elasticsearch:9200"]
  index: "bom-cleaner-%{+yyyy.MM.dd}"
# Logstash filter (belongs in a separate Logstash pipeline file, not in the Filebeat YAML)
filter {
  if [fields][logtype] == "bom-cleaner" {
    grok {
      match => { "message" => "\[%{TIMESTAMP_ISO8601:timestamp}\] \[%{WORD:level}\] %{GREEDYDATA:message_text}" }
    }
  }
}
9.3 Disaster Recovery and Backup
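A backup is only useful if the restore step unpacks files to the place they came from. tar strips the leading "/" from absolute paths, so archiving an absolute project path and later extracting into the parent directory produces a nested copy instead of a restore. The sketch below, using the hypothetical helper names backup_tree and restore_tree, keeps both sides consistent by archiving relative to the parent directory:

```shell
#!/usr/bin/env bash
# Illustrative backup/restore pair that agree on how paths are stored.

# backup_tree PROJECT_DIR ARCHIVE: archive the directory under its base name.
backup_tree() {
    local project="${1%/}" archive="$2"
    tar -czf "$archive" -C "$(dirname "$project")" "$(basename "$project")"
}

# restore_tree PROJECT_DIR ARCHIVE: unpack into the project's parent directory.
restore_tree() {
    local project="${1%/}" archive="$2"
    tar -tzf "$archive" > /dev/null || return 1   # refuse unreadable archives
    tar -xzf "$archive" -C "$(dirname "$project")"
}
```

For example, `backup_tree /var/www/project /backups/pre.tar.gz` followed later by `restore_tree /var/www/project /backups/pre.tar.gz` puts the files back under /var/www/project. Extraction overwrites existing files but does not delete files created after the backup was taken.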
9.3.1 Backup Strategy
Pre-processing Backup:
#!/bin/bash
# enterprise-bom-cleanup.sh - With enterprise backup
set -o pipefail # Let the "bom | tee" pipeline report bom's exit status, not tee's
PROJECT_DIR="${1:?Usage: $0 <project_directory>}"
PROJECT_DIR="${PROJECT_DIR%/}"
PARENT_DIR="$(dirname "$PROJECT_DIR")"
PROJECT_NAME="$(basename "$PROJECT_DIR")"
BACKUP_DIR="/backups/bom-cleanup/$(date +%Y%m%d-%H%M%S)"
LOG_FILE="/var/log/bom-enterprise.log"
# Create backup directory
mkdir -p "$BACKUP_DIR"
# Function to log with timestamp
log_message() {
    echo "$(date '+%Y-%m-%d %H:%M:%S') $*" | tee -a "$LOG_FILE"
}
# Create full backup before processing (paths are stored relative to the parent directory)
log_message "Creating backup before BOM cleanup..."
tar -czf "$BACKUP_DIR/pre-bom-cleanup.tar.gz" -C "$PARENT_DIR" "$PROJECT_NAME"
# Verify backup
if tar -tzf "$BACKUP_DIR/pre-bom-cleanup.tar.gz" > /dev/null 2>&1; then
    log_message "Backup verified successfully"
else
    log_message "ERROR: Backup verification failed"
    exit 1
fi
# Perform BOM cleanup with detailed logging
log_message "Starting BOM cleanup process..."
if bom --verbose "$PROJECT_DIR" 2>&1 | tee -a "$LOG_FILE"; then
    log_message "BOM cleanup completed successfully"
    # Create post-cleanup snapshot
    tar -czf "$BACKUP_DIR/post-bom-cleanup.tar.gz" -C "$PARENT_DIR" "$PROJECT_NAME"
    # Calculate differences against an extracted copy of the pre-cleanup backup
    log_message "Calculating cleanup differences..."
    COMPARE_DIR=$(mktemp -d)
    tar -xzf "$BACKUP_DIR/pre-bom-cleanup.tar.gz" -C "$COMPARE_DIR"
    diff -r "$COMPARE_DIR/$PROJECT_NAME" "$PROJECT_DIR" > "$BACKUP_DIR/changes.diff"
    rm -rf "$COMPARE_DIR"
else
    log_message "ERROR: BOM cleanup failed, restoring from backup..."
    tar -xzf "$BACKUP_DIR/pre-bom-cleanup.tar.gz" -C "$PARENT_DIR"
    exit 1
fi
# Cleanup old backups (keep last 30 days); only top-level backup directories are considered
find /backups/bom-cleanup -mindepth 1 -maxdepth 1 -type d -mtime +30 -exec rm -rf {} +
log_message "Enterprise BOM cleanup process completed"
9.3.2 Recovery Procedures
Automated Recovery Script:
#!/bin/bash
# bom-recovery.sh - Disaster recovery for BOM cleanup
BACKUP_DIR="/backups/bom-cleanup"
PROJECT_DIR="${1%/}"
RECOVERY_DATE="$2" # Format: YYYYMMDD-HHMMSS
usage() {
    echo "Usage: $0 <project_directory> [recovery_date]"
    echo "If recovery_date not specified, uses latest backup"
    exit 1
}
[ -z "$PROJECT_DIR" ] && usage
# Find backup to restore
if [ -n "$RECOVERY_DATE" ]; then
    BACKUP_FILE="$BACKUP_DIR/$RECOVERY_DATE/pre-bom-cleanup.tar.gz"
else
    BACKUP_FILE=$(find "$BACKUP_DIR" -name "pre-bom-cleanup.tar.gz" | sort | tail -1)
fi
if [ ! -f "$BACKUP_FILE" ]; then
    echo "ERROR: Backup file not found: $BACKUP_FILE"
    exit 1
fi
echo "Recovering from backup: $BACKUP_FILE"
# Verify backup integrity
if ! tar -tzf "$BACKUP_FILE" > /dev/null 2>&1; then
    echo "ERROR: Backup file is corrupted"
    exit 1
fi
# Create current state backup before recovery (stored relative to the parent directory)
EMERGENCY_BACKUP="$BACKUP_DIR/emergency-$(date +%Y%m%d-%H%M%S).tar.gz"
echo "Creating emergency backup: $EMERGENCY_BACKUP"
tar -czf "$EMERGENCY_BACKUP" -C "$(dirname "$PROJECT_DIR")" "$(basename "$PROJECT_DIR")"
# Perform recovery (this works only if the archive stores paths relative to the project's parent directory)
echo "Restoring from backup..."
tar -xzf "$BACKUP_FILE" -C "$(dirname "$PROJECT_DIR")"
echo "Recovery completed. Emergency backup saved to: $EMERGENCY_BACKUP"
9.4 Multi-Environment Management
9.4.1 Environment-Specific Configurations
Configuration Management:
# /etc/bom-cleaner/config.env
# Environment-specific BOM cleaner configuration
case "$ENVIRONMENT" in
    "production")
        BOM_REQUIRE_APPROVAL=true
        BOM_BACKUP_RETENTION=90 # days
        BOM_LOG_LEVEL=INFO
        BOM_NOTIFICATION_WEBHOOK="$PROD_SLACK_WEBHOOK"
        ;;
    "staging")
        BOM_REQUIRE_APPROVAL=false
        BOM_BACKUP_RETENTION=30
        BOM_LOG_LEVEL=DEBUG
        BOM_NOTIFICATION_WEBHOOK="$STAGING_SLACK_WEBHOOK"
        ;;
    "development")
        BOM_REQUIRE_APPROVAL=false
        BOM_BACKUP_RETENTION=7
        BOM_LOG_LEVEL=DEBUG
        BOM_NOTIFICATION_WEBHOOK=""
        ;;
esac
export BOM_REQUIRE_APPROVAL BOM_BACKUP_RETENTION BOM_LOG_LEVEL BOM_NOTIFICATION_WEBHOOK
9.4.2 Deployment Pipeline Integration
Ansible Playbook:
---
- name: Deploy BOM Cleaner Enterprise
  hosts: web_servers
  become: yes
  vars:
    bom_version: "v2.06.4"
    bom_install_path: "/usr/local/bin/bom"
  tasks:
    - name: Install BOM cleaner
      get_url:
        url: "https://github.com/paulmann/Clean_BOM_Senior/raw/main/clean-bom-senior.sh"
        dest: "{{ bom_install_path }}"
        mode: '0755'
        owner: root
        group: root
    - name: Create BOM cleaner config directory
      file:
        path: /etc/bom-cleaner
        state: directory
        mode: '0755'
    - name: Deploy environment-specific configuration
      template:
        src: config.env.j2
        dest: /etc/bom-cleaner/config.env
        mode: '0644'
    - name: Install cron job for regular cleanup
      cron:
        name: "Weekly BOM cleanup"
        minute: "0"
        hour: "3"
        weekday: "0"
        # cron runs jobs with /bin/sh, where "." is the portable spelling of "source"
        job: ". /etc/bom-cleaner/config.env && {{ bom_install_path }} --verbose /var/www/html"
        user: root
    - name: Create log rotation config
      template:
        src: bom-cleaner.logrotate.j2
        dest: /etc/logrotate.d/bom-cleaner
        mode: '0644'
Terraform Infrastructure:
# BOM cleaner infrastructure as code
resource "aws_instance" "bom_cleaner" {
  ami           = data.aws_ami.ubuntu.id
  instance_type = "t3.micro"
  user_data     = <<-EOF
    #!/bin/bash
    apt-get update
    curl -Lo /usr/local/bin/bom https://github.com/paulmann/Clean_BOM_Senior/raw/main/clean-bom-senior.sh
    chmod +x /usr/local/bin/bom
    # Install monitoring agent (the AMI is Ubuntu, so use the .deb package rather than the Amazon Linux .rpm)
    wget https://s3.amazonaws.com/amazoncloudwatch-agent/ubuntu/amd64/latest/amazon-cloudwatch-agent.deb
    dpkg -i -E ./amazon-cloudwatch-agent.deb
  EOF
  tags = {
    Name        = "bom-cleaner-${var.environment}"
    Environment = var.environment
  }
}
# CloudWatch monitoring for BOM operations
resource "aws_cloudwatch_log_group" "bom_cleaner" {
  name              = "/aws/ec2/bom-cleaner"
  retention_in_days = 30
}
10 · Glossary
Atomic Operation[^glossary-atomic]
: A file operation that either completes entirely or fails completely, ensuring data integrity. In the context of BOM cleaning, this means files are either successfully cleaned or left unchanged, with no partial modifications.
BOM (Byte Order Mark)[^glossary-bom]
: The Unicode character U+FEFF placed at the beginning of a text file to indicate byte order and encoding. In UTF-8 files it is encoded as three bytes: EF BB BF (hexadecimal). The BOM is optional in UTF-8, and because PHP treats anything before the opening tag as content, it can cause issues in PHP and other languages.
Character Encoding[^glossary-encoding]
: The method used to represent characters as bytes in computer files. Common encodings include UTF-8, UTF-16, ASCII, and ISO-8859-1. UTF-8 is the standard for web development and modern applications.
CI/CD (Continuous Integration/Continuous Deployment)[^glossary-cicd]
: Development practices that involve frequent code integration and automated deployment pipelines. BOM cleaning is often integrated into these pipelines to ensure code quality.
CRLF (Carriage Return + Line Feed)[^glossary-crlf]
: The Windows-style line ending sequence (\r\n), contrasted with Unix-style LF-only endings (\n). Mixed line endings can cause version control and deployment issues.
DevOps[^glossary-devops]
: A set of practices that combines software development and IT operations, emphasizing automation, collaboration, and continuous improvement. BOM cleaning automation is a common DevOps practice.
Dry Run Mode[^glossary-dryrun]
: An execution mode where operations are simulated without making actual changes. Useful for previewing what would be modified before committing to changes.
File Attributes[^glossary-attributes]
: Metadata associated with files including ownership (UID/GID), permissions (mode), timestamps, and extended attributes like SELinux labels.
Git Hooks[^glossary-githooks]
: Scripts that run automatically at certain points in the Git workflow (e.g., pre-commit, pre-push). Often used to enforce coding standards including BOM checking.
Hexadecimal[^glossary-hex]
: A base-16 number system using digits 0-9 and letters A-F. Commonly used to represent byte values in files. BOM appears as "EF BB BF" in hexadecimal.
IDE (Integrated Development Environment)[^glossary-ide]
: Software applications that provide comprehensive facilities for software development, including code editors, debuggers, and build tools. Examples include VS Code, PhpStorm, and Sublime Text.
Process Substitution[^glossary-process-sub]
: A bash feature that allows the output of a command to be treated as a file. The syntax is <(command); a loop reads from it with done < <(command). Critical for preserving variable changes in loops, unlike pipes, which run the loop in a subshell.
Rollback[^glossary-rollback]
: The process of reverting changes when an operation fails. BOM cleaning tools often create backups and can restore original files if errors occur.
Subshell[^glossary-subshell]
: A separate instance of the shell created when using pipes or certain other constructs. Variable changes in subshells don’t affect the parent shell, which was a key issue fixed in clean-bom-senior.sh v2.06.4.
UTF-8[^glossary-utf8]
: A variable-width character encoding for Unicode. The standard encoding for web content and modern applications. Can optionally include a BOM, though this is not recommended for most use cases.
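The Process Substitution and Subshell entries can be seen side by side in a short bash sketch. The function names are illustrative; both loops count the same three lines, and only the place where the loop runs differs.

```shell
#!/usr/bin/env bash
# Count three input lines two ways to show where the counter lives.

count_with_pipe() {
    local n=0
    printf 'a.php\nb.php\nc.php\n' | while IFS= read -r _; do
        n=$((n + 1))   # runs in a subshell; the parent's n is never updated
    done
    echo "$n"
}

count_with_process_substitution() {
    local n=0
    while IFS= read -r _; do
        n=$((n + 1))   # runs in the current shell
    done < <(printf 'a.php\nb.php\nc.php\n')
    echo "$n"
}

count_with_pipe                    # prints 0
count_with_process_substitution    # prints 3
```

This is why a "files processed" counter incremented inside a piped loop reports zero at the end of a script, while the process-substitution form reports the real total.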
This comprehensive guide provides enterprise-grade solutions for BOM cleaning in PHP and other web development contexts. For the latest updates and community contributions, visit the Clean BOM Senior project on GitHub.
Document Version: 2.0
Last Updated: September 28, 2025
Status: Production Ready
Sources
- https://www.w3.org/International/questions/qa-byte-order-mark
- https://alastaira.wordpress.com/2011/06/07/php-and-utf-8-bom-or-why-do-my-webpages-start-with/
- https://www.php.net/manual/en/function.mb-detect-encoding.php
- https://www.honeybadger.io/blog/php-character-encoding-unicode-utf8-ascii/
- https://www.php.net/manual/en/language.namespaces.definition.php
- https://bugs.php.net/74339
- https://www.php-fig.org/psr/psr-1/
- https://stackoverflow.com/questions/53376444/getting-fatal-error-while-i-am-trying-to-use-the-declarestrict-types-1-on-my
- https://php-errors.readthedocs.io/en/latest/messages/strict_types-declaration-must-be-the-very-first-statement-in-the-script.html
- https://laracasts.com/discuss/channels/laravel/namespace-declaration-statement-has-to-be-the-very-first-statement-or-after-any-declare-call-in-the-script-2
- https://stackoverflow.com/questions/5601904/encoding-a-string-as-utf-8-with-bom-in-php
- https://stackoverflow.com/questions/2558172/utf-8-bom-signature-in-php-files
- https://blog.somewhatabstract.com/2014/10/06/drop-the-bom-a-case-study-of-json-corruption-in-wordpress/
- https://anupamsaha.wordpress.com/2011/08/02/detecting-utf-byte-order-mark-using-php/
- https://community.dynamics.com/forums/thread/details/?threadid=28cd1f8f-2696-4f34-b098-e7d1d773cc2d
- https://www.ghisler.ch/board/viewtopic.php?t=73747
- https://stackoverflow.com/questions/8432584/how-can-i-make-notepad-to-save-text-in-utf-8-without-the-bom
- https://www.hesk.com/knowledgebase/?article=87
- https://forum.farmanager.com/viewtopic.php?t=12647
- https://stackoverflow.com/questions/10290849/how-to-remove-multiple-utf-8-bom-sequences
- https://csv.thephpleague.com/9.0/connections/bom/
- https://stackoverflow.com/questions/31983528/php-remove-bom-when-requiring-a-php-file
- https://github.com/emrahgunduz/bom-cleaner
- https://topic.alibabacloud.com/a/php-realizes-automatic-detection-and-font-classtopic-s-color00c1deremovalfont-of-bom-of-utf-8-file_1_34_33114522.html
- https://www.php-fig.org/psr/psr-12/
- https://dcblog.dev/php-strict-types-vs-weak-types-when-and-how-to-use-declarestrict-types1
- https://stackoverflow.com/questions/21433086/fatal-error-namespace-declaration-statement-has-to-be-the-very-first-statement
- https://www.php.net/manual/en/control-structures.declare.php
- https://github.com/kaitai-io/kaitai_struct/issues/637
- https://inspector.dev/why-use-declarestrict_types1-in-php-fast-tips/
- https://bugs.php.net/bug.php?id=48127
- https://bugs.php.net/bug.php?id=78043&edit=1
- https://cs.symfony.com/doc/rules/index.html
- https://github.com/PHP-CS-Fixer/PHP-CS-Fixer/issues/6872
- https://forum.codeigniter.com/showthread.php?tid=81750
- https://flowframework.readthedocs.io/en/8.3/TheDefinitiveGuide/PartV/CodingGuideLines/PHP.html
- https://neos.readthedocs.io/en/8.3/References/CodingGuideLines/PHP.html
- https://webmonkeyuk.wordpress.com/2011/04/23/how-to-avoid-character-encoding-problems-in-php/
- https://www.sitepoint.com/community/t/text-file-encoding-problem/39413
- https://www.sphinx-solution.com/blog/php-development-tools/
- https://www.sitepoint.com/community/t/detect-and-remove-bom/3183
- https://stackoverflow.com/questions/825939/php-character-encoding-problems
- https://www.admin-magazine.com/Articles/Automation-Scripting-with-PHP
- https://github.com/wp-cli/wp-cli/issues/5578
- https://www.reddit.com/r/PHPhelp/comments/t2ia34/using_php_to_fix_some_character_encoding_issues/
- https://engineering.teknasyon.com/the-what-why-how-guide-of-php-code-quality-tools-6eaa6406859
- https://www.php.net/manual/en/ref.mbstring.php
- https://groups.google.com/g/php-fig/c/fGDuNFKy390
- https://ashallendesign.co.uk/blog/using-declare-strict_types-1-for-more-robust-php-code
- https://inspector.dev/declarestricttypes1-in-laravel/
- https://dev.to/inspector/why-use-declarestricttypes1-in-php-fast-tips-3c1
- https://bdthemes.com/5-most-common-wordpress-errors-and-how-to-fix-them/