3 min read · from Machine Learning

PyTorch reproduction of TensorFlow paper underperforms by 4 pp on DermaMNIST: what cross-framework issues should I check? [R]

I'm reproducing a published paper's hybrid Gabor + CNN architecture in PyTorch. The original implementation is in TensorFlow. My reproduction consistently lands ~4 pp below the paper's reported test accuracy on DermaMNIST (73-74% vs paper's 77.01%). I'd like to know which cross-framework differences are most likely to cause this gap.

Ahmed et al., "A Lightweight Hybrid Gabor Deep Learning Approach", IJCV 2026 (DOI: 10.1007/s11263-025-02658-2). The architecture is a fixed Gabor filter bank front-end followed by a small CNN with one SE block, one residual block, and three FC layers, ~340k parameters total. I've already tried different sigma_factor values (1.0 vs 1.2), multiple random seeds (42, 0, 123), and different sigma values for the LPF and HPF channels, but none of it closed the gap.
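One cross-framework difference I haven't ruled out yet is the BatchNorm momentum convention: Keras's `momentum` (default 0.99) is the fraction of the old running statistic that is kept, while PyTorch's `momentum` (default 0.1) is the fraction of the new batch statistic blended in, and the epsilon defaults differ too. A minimal sketch of what aligning them would look like (the channel width 64 is just a placeholder, not from the paper):

```python
import torch.nn as nn

# Keras BatchNormalization(momentum=0.99, epsilon=1e-3) keeps 99% of the old
# running stat each step. PyTorch's momentum is the opposite convention: the
# fraction of the NEW batch stat mixed in, so the equivalent is 1 - 0.99 = 0.01.
tf_style_bn = nn.BatchNorm2d(64, momentum=0.01, eps=1e-3)

# PyTorch defaults (momentum=0.1, eps=1e-5) update running stats ~10x faster.
default_bn = nn.BatchNorm2d(64)
```

With a PyTorch-default BatchNorm the running statistics adapt much more aggressively than in the TF original, which can shift val/test accuracy by itself.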

Any ideas on how to reach at least 76% and match the paper? I want to add improvements on top of the reproduction and measure the difference, so I'd really appreciate advice on how to close this gap.

Also, here is an example from one epoch. I've noticed that the test accuracy is lower than the validation accuracy: am I doing something wrong?

[ 47/100] Train: 75.70% Val: 76.07% Best: 76.97% Loss: 0.6827 [paper] test acc = 0.7382 
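A gap between validation and test accuracy can come from the fixed official split itself rather than a bug, but it's worth confirming the evaluation path is clean: eval mode, no gradients, the official test split, and the best checkpoint (not the last one) loaded before the final test pass. A minimal sketch of the check I'd run (`model` and the loader are placeholders for your own objects):

```python
import torch

@torch.no_grad()
def evaluate(model, loader, device="cpu"):
    # model.eval() matters: BatchNorm switches to running stats, Dropout is off.
    model.eval()
    correct = total = 0
    for x, y in loader:
        x, y = x.to(device), y.to(device).view(-1)  # MedMNIST labels arrive as (N, 1)
        pred = model(x).argmax(dim=1)
        correct += (pred == y).sum().item()
        total += y.numel()
    return correct / total
```

Running the exact same function on both the validation and test loaders rules out any asymmetry between the two numbers you're comparing.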

Code example:

```python
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

class FixedGaborFrontEnd(nn.Module):
    def __init__(self, scales=(0.10, 0.20, 0.40), orientations=(4, 4, 4),
                 sigma_factor=1.0, input_size=224, output_size=56):
        super().__init__()
        # Build Gabor parameters (fixed buffers, not learnable)
        sigmas, thetas, freqs, kernel_sizes = [], [], [], []
        for f, o in zip(scales, orientations):
            sigma = sigma_factor / (math.pi * f)
            N = 2 * int(math.floor(3 * sigma)) + 1
            for k in range(o):
                sigmas.append(sigma)
                thetas.append(math.pi * k / o)
                freqs.append(f)
                kernel_sizes.append(N)
        # ... build real/imag kernels with zero-mean + L2 normalization ...

    def forward(self, x):
        # Convert RGB to grayscale (ITU-R BT.601 weights)
        if x.shape[1] != 1:
            x = 0.299 * x[:, 0:1] + 0.587 * x[:, 1:2] + 0.114 * x[:, 2:3]
        real = F.conv2d(x, self.real_kernels, padding=self.max_kernel_size // 2)
        imag = F.conv2d(x, self.imag_kernels, padding=self.max_kernel_size // 2)
        magnitude = torch.sqrt(real ** 2 + imag ** 2 + 1e-8)
        lpf = F.conv2d(x, self.lpf_kernel, padding=self.lpf_pad)
        hpf = F.conv2d(x, self.hpf_kernel, padding=self.hpf_pad)
        feats = torch.cat([magnitude, lpf, hpf], dim=1)
        feats = F.avg_pool2d(feats, 4, 4)  # 224 → 56
        return feats

# Standard backbone follows: SE → Conv-BN-ReLU → MaxPool → ResBlock → Dropout → GAP → FC × 3
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer, mode='min', factor=0.5)
```
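Beyond the architecture, the optimizer defaults differ slightly between frameworks: `tf.keras.optimizers.Adam` uses `epsilon=1e-7` by default, while `torch.optim.Adam` uses `eps=1e-8`. This rarely accounts for multiple points on its own, but it's cheap to align (the `model` below is a stand-in, not the paper's network):

```python
import torch

model = torch.nn.Linear(8, 3)  # placeholder for the actual network

# tf.keras Adam defaults: learning_rate=1e-3, beta_1=0.9, beta_2=0.999, epsilon=1e-7
# torch Adam defaults:    lr=1e-3, betas=(0.9, 0.999), eps=1e-8
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, eps=1e-7)
```

The same audit applies to the LR scheduler: check that the quantity monitored (val loss vs val accuracy), the patience counting, and the min-delta threshold match what the TF training script used.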
submitted by /u/Plane_Stick8394


Tagged with

#PyTorch
#TensorFlow
#CNN
#cross-framework
#test accuracy
#DermaMNIST
#hybrid Gabor
#architecture