
ReSounder

A lightweight spectrogram inverter by NQR

🖼️ Spectrogram Input


Tips & Troubleshooting

    Best tip for future troubleshooting: experiment until you gain an intuition about common issues.
  1. Speech sounds distorted or "robotic"
     Symptom: Vowels are smeared, consonants feel metallic or synthetic.
     Likely cause: The frequency axis scale does not match the original spectrogram.
     Fix: Try switching Frequency Scale (e.g., from Linear to Log). Ensure Min/Max Frequency align with the original signal's content.
  2. Audio is the wrong pitch (too high or too low)
     Symptom: Everything sounds like chipmunks or deep giants.
     Likely cause: Frequency bounds do not match the image's real spectral range. The height of the spectrogram determines the musical pitch mapping.
     Fix: Increase Max Frequency if the audio sounds too deep. Decrease Min Frequency if the audio sounds too high.
  3. Audio timing seems stretched or compressed
     Symptom: Speech rate or tempo sounds wrong.
     Likely cause: Horizontal scale mismatch. Time axis scaling changes the actual duration mapping.
     Fix: Adjust Assumed Duration (sec) to match the original export. Re-render the preview with the new value before reconstruction.
  4. Audio is just loud impulsive noise
     Symptom: Energy bursts appear instead of smooth tonal content.
     Likely cause: The FFT size is too small for the image's time resolution. If the time window is too short, harmonic structure collapses into broadband transients.
     Fix: Increase FFT Size to improve frequency resolution. Reduce Assumed Duration (sec) if needed to balance performance. Ideal FFT sizes for speech sampled at 44.1-48 kHz are around 2048 to 4096.
  5. Audio sounds "sing-songy"
     Symptom: Sustained tones rise and fall unnaturally, creating a melodic lilt that was not present in the original audio.
     Likely cause: The FFT size is too large, causing excessive smoothing over time and smearing rapid changes in pitch and articulation.
     Fix: Reduce FFT Size to improve time resolution, preserving more natural speech and transient detail. Ideal FFT sizes for speech sampled at 44.1-48 kHz are around 2048 to 4096.
  6. Audio feels muffled or missing detail
     Symptom: High-frequency or low-frequency content is dull or absent, or the spectrogram lacks detail.
     Likely cause: The spectrogram's maximum frequency is set too low, the minimum frequency is set too high, the spectrogram height is too small, the FFT size is too small, or Noise Floor (dB) is too low. Nonlinear scales also compress or stretch content near the top and bottom of the spectrogram.
     Fix if you didn't create the spectrogram: Check whether the frequency scale was exported as log but decoded as linear. Increase Noise Floor (dB).
     Fix if you created the spectrogram: Raise the maximum frequency, lower the minimum frequency, increase FFT size, or increase image height. Use Linear scale instead of Log, Mel or Bark to maximize detail in the spectrogram.
  7. Audio sounds like white noise
     Symptom: Audio sounds like white or broadband noise even though the waveform appears correct.
     Likely cause: The colormap is being interpreted in the wrong orientation.
     Fix: Toggle Invert Colors to correct the intensity mapping.
  8. Audio is too quiet or fades into silence
     Symptom: Everything is faint even though the waveform appears correct.
     Likely cause: Dynamic Range is too wide, or intensity normalization is incorrect. Very low pixel values correspond to extreme attenuation in dB space.
     Fix: Reduce Dynamic Range (dB). Increase Noise Floor (dB). Ensure color inversion (if applied) matches the original export. Increase Pre-gain or Post-gain.
  9. Audio has a high-pitched hissing floor
     Symptom: Constant noise underlying all playback.
     Likely cause: Noise pixels are mapped to non-zero magnitude during dB-to-linear conversion. Small values near the noise floor get amplified in reconstruction.
     Fix: Decrease Noise Floor (dB). Increase Dynamic Range (dB) slightly.
  10. Audio has "underwater" / "phasing" artifacts
     Symptom: Warbly, chorus-like sound.
     Likely cause: Phase propagation is struggling with abrupt changes.
     Fix: Ensure FFT Size and Assumed Duration (sec) are close to the original values. Unfortunately, this is the characteristic artifact of phase reconstruction and can't be entirely removed.
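
Tips 4 and 5 both come down to the usual trade-off between time and frequency resolution. The sketch below is not ReSounder's code; it only computes the standard figures (bin spacing and window length) for a given FFT size, and the function name `stft_resolution` is made up for illustration.

```python
def stft_resolution(sample_rate_hz: float, fft_size: int) -> tuple:
    """Return (frequency bin spacing in Hz, analysis window length in ms)."""
    if sample_rate_hz <= 0 or fft_size <= 0:
        raise ValueError("sample rate and FFT size must be positive")
    # Each FFT bin covers sample_rate / fft_size hertz.
    bin_spacing_hz = sample_rate_hz / fft_size
    # One analysis window spans fft_size samples.
    window_ms = 1000.0 * fft_size / sample_rate_hz
    return bin_spacing_hz, window_ms


if __name__ == "__main__":
    for n in (256, 2048, 4096, 16384):
        hz, ms = stft_resolution(44100, n)
        print(f"FFT {n:>5}: {hz:8.2f} Hz per bin, {ms:8.2f} ms window")
```

A small FFT gives short windows but coarse frequency bins, which is where harmonics blur into noise; a large FFT gives fine bins but long windows, which is where fast speech changes get smeared.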
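
Tips 1, 2 and 6 depend on how image rows map to frequencies. As a rough illustration, assuming a plain linear or logarithmic axis between Min and Max Frequency (ReSounder's actual mapping, and its Mel and Bark scales, may differ), the hypothetical helper `row_to_frequency` shows how far apart the two scales place the same row.

```python
def row_to_frequency(height_fraction: float, min_hz: float, max_hz: float,
                     scale: str = "linear") -> float:
    """Map a vertical position (0.0 = bottom row, 1.0 = top row) to hertz."""
    if not 0.0 <= height_fraction <= 1.0:
        raise ValueError("height_fraction must be within [0, 1]")
    if scale == "linear":
        # Equal pixel steps correspond to equal steps in hertz.
        return min_hz + height_fraction * (max_hz - min_hz)
    if scale == "log":
        if min_hz <= 0:
            raise ValueError("a log scale needs min_hz > 0")
        # Equal pixel steps correspond to equal frequency ratios.
        return min_hz * (max_hz / min_hz) ** height_fraction
    raise ValueError(f"unsupported scale: {scale!r}")


if __name__ == "__main__":
    for name in ("linear", "log"):
        mid = row_to_frequency(0.5, 20.0, 20000.0, name)
        print(f"{name:>6}: middle row = {mid:.1f} Hz")
```

With bounds of 20 Hz and 20 kHz, the middle row lands near 10 kHz on a linear axis but near 632 Hz on a log axis, so decoding with the wrong scale or wrong bounds shifts and warps every pitch in the image.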
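
Tips 7, 8 and 9 all concern the step from pixel brightness to linear magnitude. The sketch below assumes one common convention, in which the brightest pixel is 0 dB and the darkest sits one full dynamic range below it. This is an assumption for illustration, not ReSounder's actual mapping, and `pixel_to_magnitude` is a hypothetical name.

```python
def pixel_to_magnitude(pixel: float, dynamic_range_db: float = 80.0,
                       invert: bool = False) -> float:
    """Convert a normalized pixel value (0.0 dark .. 1.0 bright) to linear magnitude."""
    if not 0.0 <= pixel <= 1.0:
        raise ValueError("pixel must be within [0, 1]")
    if invert:
        # Inverted colormap: dark pixels are the loud ones.
        pixel = 1.0 - pixel
    # Assumed convention: brightest pixel = 0 dB, darkest = -dynamic_range_db.
    level_db = (pixel - 1.0) * dynamic_range_db
    # Standard dB-to-amplitude conversion.
    return 10.0 ** (level_db / 20.0)


if __name__ == "__main__":
    for dr in (40.0, 80.0, 120.0):
        print(f"range {dr:5.0f} dB: darkest pixel -> {pixel_to_magnitude(0.0, dr):.1e}")
    print("mid-gray, wrong inversion vs right:",
          pixel_to_magnitude(0.25, 80.0, invert=True),
          pixel_to_magnitude(0.25, 80.0))
```

Under this assumption, a wider dynamic range pushes dark and mid-gray pixels toward very small magnitudes (quiet output), the darkest pixels never reach exactly zero (a residual hiss), and a wrong inversion turns the quiet background into the loudest part of the signal.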

⚙️ Basic Settings

🎨 Image Interpretation

🔄 Phase Reconstruction

✨ Post-Processing (Not recommended)

🎵 Inverted Audio

WARNING: Lower your volume before processing. Inversion may produce unexpectedly loud output.

MIT License

Copyright (c) 2025 NQR

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.