Reading TODO
📚 Priority Reading Queue
1. Weight Noise Injection-Based MLPs With Group Lasso Penalty: Asymptotic Convergence and Application to Node Pruning
- Authors: Wang J, Chang Q, Chang Q, Liu Y, Pal NR
- Journal: IEEE Transactions on Cybernetics
- Year: 2019
- Volume/Issue: Vol. 49, No. 12, pp. 4346-4364
- DOI: 10.1109/TCYB.2018.2864142
- Key Focus:
- Shows why L2 weight decay is inadequate for producing sparse solutions
- Proposes group lasso as an alternative regularizer
- Node-pruning applications for fault-tolerant MLPs
- Status: ⏳ To Read
- Notes: Key paper showing why traditional weight decay fails for sparsity (see the penalty sketch below)
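Before reading, a minimal sketch of the core contrast (assuming PyTorch; the row-per-node grouping here is an illustrative choice, not necessarily the paper's exact formulation):

```python
import torch
import torch.nn as nn

layer = nn.Linear(in_features=10, out_features=5)   # hidden layer with 5 nodes
x = torch.randn(8, 10)                               # dummy mini-batch
task_loss = layer(x).pow(2).mean()                   # placeholder task loss
lam = 1e-3                                            # regularization strength

# L2 weight decay: penalizes every weight individually; it shrinks weights
# toward zero but rarely drives an entire node's weight group exactly to zero.
l2_penalty = lam * layer.weight.pow(2).sum()

# Group lasso: one group per hidden node (one row of the weight matrix).
# Each row's L2 norm enters un-squared, so whole rows can be driven to zero,
# which corresponds to pruning that node.
group_lasso_penalty = lam * layer.weight.norm(p=2, dim=1).sum()

loss_l2 = task_loss + l2_penalty
loss_gl = task_loss + group_lasso_penalty
```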
2. A Novel Pruning Algorithm for Smoothing Feedforward Neural Networks Based on Group Lasso Method
- Authors: Wang J, Xu C, Yang X, Zurada JM
- Journal: IEEE Transactions on Neural Networks and Learning Systems
- Year: 2018
- Volume/Issue: Vol. 29, No. 5, pp. 2012-2024
- DOI: 10.1109/TNNLS.2017.2748585
- Key Focus:
- Four new backpropagation variants using Group Lasso
- Smoothing functions to handle non-differentiability
- Direct comparison with Weight Decay and Weight Elimination
- Status: ⏳ To Read
- Notes: Comprehensive comparison with traditional weight decay methods (see the smoothing sketch below)
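A minimal sketch of the smoothing idea the paper addresses: the group L2 norm is non-differentiable exactly where pruned groups end up (the origin), so a smooth surrogate is used during training. The sqrt(||w||² + ε²) surrogate below is one standard choice; the paper's specific smoothing function may differ.

```python
import torch

def smoothed_group_norm(w_group: torch.Tensor, eps: float = 1e-3) -> torch.Tensor:
    # Smooth surrogate for ||w_g||_2: sqrt(||w_g||^2 + eps^2).
    # Differentiable everywhere, including at w_g = 0; recovers the exact
    # group lasso penalty as eps -> 0.
    return torch.sqrt(w_group.pow(2).sum() + eps ** 2)

w = torch.zeros(10, requires_grad=True)   # a pruned group sits at the origin
penalty = smoothed_group_norm(w)
penalty.backward()                         # gradient is well-defined (zeros here)
print(penalty.item(), w.grad)
```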
3. Group Sparse Regularization for Deep Neural Networks
- Authors: Scardapane S, Comminiello D, Hussain A, Uncini A
- Conference/Journal: arXiv preprint
- Year: 2016
- arXiv ID: 1607.00485
- Key Focus:
- Joint optimization of weights, neuron count, and feature selection
- Group Lasso penalty for network connections
- Extensive comparison with classical weight decay
- Status: ⏳ To Read
- Notes: Foundational paper on group sparse regularization vs. weight decay (see the layer-wise penalty sketch below)
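A minimal sketch of layer-wise group sparsity on a small network (assuming PyTorch). Grouping each unit's incoming weights, as below, is a simplification: the paper groups a neuron's outgoing weights and bias and adds input-feature groups for feature selection, but the mechanics of the penalty are the same.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

model = nn.Sequential(nn.Linear(20, 16), nn.ReLU(), nn.Linear(16, 3))
lam = 1e-3

def group_sparse_penalty(model: nn.Module) -> torch.Tensor:
    # One group per row of each weight matrix (the incoming weights of one
    # unit), scaled by sqrt(group size); rows driven to zero mark prunable units.
    penalty = torch.zeros(())
    for module in model.modules():
        if isinstance(module, nn.Linear):
            group_size = module.weight.shape[1]
            penalty = penalty + (group_size ** 0.5) * module.weight.norm(p=2, dim=1).sum()
    return penalty

x = torch.randn(8, 20)
y = torch.randint(0, 3, (8,))
loss = F.cross_entropy(model(x), y) + lam * group_sparse_penalty(model)
loss.backward()
```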
📝 Reading Notes Template
For each paper, capture:
- Main Contribution: How does it extend/replace weight decay?
- Methodology: What specific regularization technique is proposed?
- Experimental Setup: What baselines are compared?
- Key Results: Performance relative to traditional weight decay baselines
- Theoretical Insights: Why does the proposed method work better?
- Implementation Details: Any code or algorithmic specifics
- Future Directions: What questions does this raise?
🔍 Key Questions to Address
- Fundamental Question: Why does traditional L2 weight decay fail for structured sparsity?
- Methodological: How do group-based penalties differ from element-wise penalties? (see the proximal-operator sketch after this list)
- Practical: What are the computational trade-offs between methods?
- Theoretical: What convergence guarantees exist for these approaches?
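One concrete way to approach the methodological question above is through the proximal operators of the two penalties; this illustrative sketch (not taken from any of the listed papers) uses one group per row:

```python
import torch

def prox_l1(w: torch.Tensor, t: float) -> torch.Tensor:
    # Element-wise soft-thresholding: every weight is shrunk independently,
    # so zeros appear scattered anywhere in the tensor.
    return torch.sign(w) * torch.clamp(w.abs() - t, min=0.0)

def prox_group_lasso(w: torch.Tensor, t: float) -> torch.Tensor:
    # Block soft-thresholding with one group per row: each row is rescaled by
    # the same factor, so either the whole row survives or the whole row is
    # set exactly to zero.
    row_norms = w.norm(p=2, dim=1, keepdim=True).clamp(min=1e-12)
    return w * torch.clamp(1.0 - t / row_norms, min=0.0)

w = torch.randn(4, 6)
print(prox_l1(w, 0.5))           # scattered zeros
print(prox_group_lasso(w, 0.5))  # whole rows zeroed or kept
```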
✅ Completion Checklist
- Paper 1: Weight Noise Injection-Based MLPs
- Paper 2: Novel Pruning Algorithm for Smoothing
- Paper 3: Group Sparse Regularization for DNNs
- Synthesis: Write summary comparing all three approaches
- Implementation: Try reproducing key results from one paper
Last Updated: Add date when you start reading
Priority: High - Core understanding of weight decay limitations in structured sparsity