% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/modulationSpectrum.R
\name{modulationSpectrum}
\alias{modulationSpectrum}
\title{Modulation spectrum}
\usage{
modulationSpectrum(
  x,
  samplingRate = NULL,
  maxDur = 5,
  logSpec = FALSE,
  windowLength = 25,
  step = NULL,
  overlap = 80,
  wn = "hanning",
  zp = 0,
  power = 1,
  roughRange = c(30, 150),
  returnComplex = FALSE,
  aggregComplex = TRUE,
  plot = TRUE,
  savePath = NA,
  logWarp = NA,
  quantiles = c(0.5, 0.8, 0.9),
  kernelSize = 5,
  kernelSD = 0.5,
  colorTheme = c("bw", "seewave", "heat.colors", "...")[1],
  xlab = "Hz",
  ylab = "1/KHz",
  main = NULL,
  width = 900,
  height = 500,
  units = "px",
  res = NA,
  ...
)
}
\arguments{
\item{x}{path to a folder with audio files, path to a wav/mp3 file, a numeric
vector representing a waveform, or a list of numeric vectors}

\item{samplingRate}{sampling rate of x (only needed if x is a numeric vector,
rather than an audio file). For a list of sounds, give either one
samplingRate (the same for all) or as many values as there are input files}

\item{maxDur}{maximum allowed duration of a single sound, s (longer sounds
are split)}

\item{logSpec}{if TRUE, the spectrogram is log-transformed prior to taking 2D
FFT}

\item{windowLength}{length of FFT window, ms}

\item{step}{you can override \code{overlap} by specifying FFT step, ms}

\item{overlap}{overlap between successive FFT frames, \%}

\item{wn}{window type: gaussian, hanning, hamming, bartlett, rectangular,
blackman, flattop}

\item{zp}{window length after zero padding, points}

\item{power}{raise modulation spectrum to this power (e.g., \code{power = 2}
squares it, giving a "power spectrum")}

\item{roughRange}{the range of temporal modulation frequencies that
constitute the "roughness" zone, Hz}

\item{returnComplex}{if TRUE, returns a complex modulation spectrum (without
normalization and warping)}

\item{aggregComplex}{if TRUE, aggregates complex MS from multiple inputs,
otherwise returns the complex MS of the first input (recommended when
filtering and inverting the MS of a single sound, e.g. with
\code{\link{filterSoundByMS}})}

\item{plot}{if TRUE, plots the modulation spectrum}

\item{savePath}{if a valid path is specified, a plot is saved in this folder
(defaults to NA)}

\item{logWarp}{the base of log for warping the modulation spectrum (i.e., log2
if \code{logWarp = 2}); set to NULL or NA if you don't want to log-warp}

\item{quantiles}{labeled contour values, specified as proportions and labeled
in \% (e.g., 0.5 is labeled "50" and marks regions that contain 50\% of the
sum total of the entire modulation spectrum)}

\item{kernelSize}{the size of Gaussian kernel used for smoothing (1 = no
smoothing)}

\item{kernelSD}{the SD of Gaussian kernel used for smoothing, relative to its
size}

\item{colorTheme}{black and white ('bw'), as in seewave package ('seewave'),
or any palette from \code{\link[grDevices]{palette}} such as 'heat.colors',
'cm.colors', etc}

\item{xlab, ylab, main}{graphical parameters}

\item{width, height, units, res}{parameters passed to
\code{\link[grDevices]{png}} if the plot is saved}

\item{...}{other graphical parameters passed on to \code{filled.contour.mod}
and \code{\link[graphics]{contour}} (see \code{\link{spectrogram}})}
}
\value{
Returns a list with four components:
\itemize{
\item \code{$original} modulation spectrum prior to blurring and log-warping,
but after raising to \code{power}, a matrix of nonnegative values.
Rownames are spectral modulation frequencies (cycles/KHz), and colnames are
temporal modulation frequencies (Hz).
\item \code{$processed} modulation spectrum after blurring and log-warping
\item \code{$roughness} proportion of energy / amplitude of the modulation
spectrum within \code{roughRange} of temporal modulation frequencies, \%
\item \code{$complex} untransformed complex modulation spectrum (returned
only if returnComplex = TRUE)
}
}
\description{
Produces a modulation spectrum of waveform(s) or audio file(s), with temporal
modulation along the X axis (Hz) and spectral modulation (1/KHz) along the Y
axis. A good visual analogy is decomposing the spectrogram into a sum of
ripples of various frequencies and directions. Algorithm: prepare a
\code{\link{spectrogram}}, take its logarithm (if \code{logSpec = TRUE}),
center, perform a 2D Fourier transform (see also spec.fft() in the "spectral"
package), take the upper half of the resulting symmetric matrix, and raise it
to \code{power}. The result is returned as \code{$original}. Roughness is
calculated as the proportion of energy / amplitude of the modulation spectrum
within \code{roughRange} of temporal modulation frequencies. By default, the
modulation matrix is then smoothed with Gaussian blur (see
\code{\link{gaussianSmooth2D}}) and log-warped (if \code{logWarp} is a
positive number) prior to plotting. This processed modulation spectrum is
returned as \code{$processed}. For multiple inputs, such as a list of
waveforms or path to a folder with audio files, the ensemble of modulation
spectra is interpolated to the same spectral and temporal resolution and
averaged. This is different from the behavior of
\code{\link{modulationSpectrumFolder}}, which produces a separate modulation
spectrum per file, without averaging.
}
\examples{
# white noise
ms = modulationSpectrum(runif(16000), samplingRate = 16000,
  logSpec = FALSE, power = 1)

# harmonic sound
s = soundgen()
ms = modulationSpectrum(s, samplingRate = 16000,
  logSpec = FALSE, power = 1)

# embellish
ms = modulationSpectrum(s, samplingRate = 16000,
  xlab = 'Temporal modulation, Hz', ylab = 'Spectral modulation, 1/KHz',
  colorTheme = 'heat.colors', main = 'Modulation spectrum', lty = 3)
\dontrun{
# Input can also be a list of waveforms (numeric vectors)
ss = vector('list', 10)
for (i in seq_along(ss)) {
  ss[[i]] = soundgen(sylLen = runif(1, 100, 1000), temperature = .4,
    pitch = runif(3, 400, 600))
}
# lapply(ss, playme)
ms1 = modulationSpectrum(ss[[1]], samplingRate = 16000)  # the first sound
dim(ms1$original)
ms2 = modulationSpectrum(ss, samplingRate = 16000)  # all 10 sounds
dim(ms2$original)

# Careful with complex MS of multiple inputs:
ms3 = modulationSpectrum(ss, samplingRate = 16000,
  returnComplex = TRUE, aggregComplex = FALSE)
dim(ms3$complex)  # complex MS of the first input only
ms4 = modulationSpectrum(ss, samplingRate = 16000,
  returnComplex = TRUE, aggregComplex = TRUE)
dim(ms4$complex)  # aggregated over inputs

# As with spectrograms, there is a tradeoff in time-frequency resolution
s = soundgen(pitch = 500, amFreq = 50, amDep = 100, samplingRate = 44100)
# playme(s, samplingRate = 44100)
ms = modulationSpectrum(s, samplingRate = 44100,
  windowLength = 50, overlap = 0)  # poor temporal resolution
ms = modulationSpectrum(s, samplingRate = 44100,
  windowLength = 5, overlap = 80)  # poor frequency resolution
ms = modulationSpectrum(s, samplingRate = 44100,
  windowLength = 15, overlap = 80)  # a reasonable compromise
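
# Why the first setting above has poor temporal resolution: a rough sketch,
# ASSUMING step = windowLength * (1 - overlap / 100) and that the highest
# resolvable temporal modulation is half the frame rate. toyStep() and
# toyMaxTM() are hypothetical helpers, not soundgen functions.
toyStep = function(windowLength, overlap) windowLength * (1 - overlap / 100)
toyMaxTM = function(stepMs) 1000 / stepMs / 2  # frames per s / 2, in Hz
settings = data.frame(windowLength = c(50, 5, 15), overlap = c(0, 80, 80))
settings$step = toyStep(settings$windowLength, settings$overlap)  # ms
settings$maxTM = toyMaxTM(settings$step)                          # Hz
settings
# Under these assumptions, windowLength = 50 with overlap = 0 gives a 50 ms
# step, or 20 frames per second, so temporal modulation above 10 Hz
# (including amFreq = 50) cannot be represented.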

# customize the plot
ms = modulationSpectrum(s, samplingRate = 44100,
  kernelSize = 17,  # more smoothing
  xlim = c(-20, 20), ylim = c(0, 4),  # zoom in on the central region
  quantiles = c(.25, .5, .8),  # customize contour lines
  colorTheme = 'heat.colors',  # alternative palette
  power = 2)                   # ^2
# NB: xlim/ylim currently won't work properly with logWarp on

# Input can be a wav/mp3 file
ms = modulationSpectrum('~/Downloads/temp/200_ut_fear-bungee_11.wav')

# Input can be path to folder with audio files (average modulation spectrum)
ms = modulationSpectrum('~/Downloads/temp', kernelSize = 11)
# NB: longer files will be split into fragments no longer than maxDur (s)

# A sound with ~3 syllables per second and only downsweeps in F0 contour
s = soundgen(nSyl = 8, sylLen = 200, pauseLen = 100, pitch = c(300, 200))
# playme(s)
ms = modulationSpectrum(s, samplingRate = 16000, maxDur = .5,
  xlim = c(-25, 25), colorTheme = 'seewave',
  power = 2)
# note the asymmetry because of downsweeps

# "power = 2" returns squared modulation spectrum - note that this affects
# the roughness measure!
ms$roughness
# compare:
modulationSpectrum(s, samplingRate = 16000, maxDur = .5,
  xlim = c(-25, 25), colorTheme = 'seewave', logWarp = NULL,
  power = 1)$roughness  # much higher roughness

# Plotting with or without log-warping the modulation spectrum:
ms = modulationSpectrum(soundgen(), samplingRate = 16000,
  logWarp = NA, plot = TRUE)
ms = modulationSpectrum(soundgen(), samplingRate = 16000,
  logWarp = 2, plot = TRUE)

# logWarp and kernelSize have no effect on roughness
# because it is calculated before these transforms:
modulationSpectrum(s, samplingRate = 16000, logWarp = 5)$roughness
modulationSpectrum(s, samplingRate = 16000, logWarp = NA)$roughness
modulationSpectrum(s, samplingRate = 16000, kernelSize = 17)$roughness
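
# A roughness-like measure can be sketched in base R from a matrix shaped
# like $original (colnames = temporal modulation frequencies, Hz). This is
# an illustration, not the package internals: flatMS is a made-up flat
# spectrum, and treating negative and positive temporal modulation
# frequencies alike (via abs) is an ASSUMPTION.
tmHz = seq(-200, 200, by = 10)   # temporal modulation, Hz
smCyc = seq(0, 4, by = 0.5)      # spectral modulation, cycles/KHz
flatMS = matrix(1, nrow = length(smCyc), ncol = length(tmHz),
  dimnames = list(smCyc, tmHz))
toyRange = c(30, 150)            # same as the default roughRange
tmAbs = abs(as.numeric(colnames(flatMS)))
inRange = tmAbs >= toyRange[1] & tmAbs <= toyRange[2]
# share of the total within the range, in percent
toyRoughness = sum(flatMS[, inRange]) / sum(flatMS) * 100
toyRoughness  # 26 of 41 columns fall within the range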

# Log-transform the spectrogram prior to 2D FFT (affects roughness):
ms = modulationSpectrum(soundgen(), samplingRate = 16000, logSpec = FALSE)
ms = modulationSpectrum(soundgen(), samplingRate = 16000, logSpec = TRUE)

# Complex modulation spectrum with phase preserved
ms = modulationSpectrum(soundgen(), samplingRate = 16000,
                        returnComplex = TRUE)
image(t(log(abs(ms$complex))))
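
# The algorithm outlined in the description can be sketched in base R on a
# toy "spectrogram". This is NOT the package code: all names (toySpec,
# toyMS, etc.) and the grid sizes are made up for illustration.
nFreq = 64     # frequency bins
nTime = 128    # time frames
durS = 1       # total duration, s
bwKHz = 8      # total bandwidth, KHz
timeS = (0:(nTime - 1)) / nTime * durS
freqKHz = (0:(nFreq - 1)) / nFreq * bwKHz
# a single ripple: 0.5 cycles/KHz spectral and 4 Hz temporal modulation
toySpec = 1 + outer(freqKHz, timeS,
  function(f, tm) cos(2 * pi * (0.5 * f + 4 * tm)))
centered = toySpec - mean(toySpec)         # center
ft = fft(centered) / length(centered)      # fft() of a matrix is a 2D FFT
# modulation frequencies corresponding to the FFT bins
tmHz = c(0:(nTime / 2 - 1), (-nTime / 2):(-1)) / durS
smCyc = c(0:(nFreq / 2 - 1), (-nFreq / 2):(-1)) / bwKHz
keepRows = which(smCyc >= 0)   # one half of the symmetric matrix
colOrder = order(tmHz)         # negative to positive temporal modulation
toyPower = 1
toyMS = abs(ft[keepRows, colOrder]) ^ toyPower
dimnames(toyMS) = list(smCyc[keepRows], tmHz[colOrder])
# the ripple shows up as a single peak at its modulation frequencies
peak = arrayInd(which.max(toyMS), dim(toyMS))
rownames(toyMS)[peak[1, 1]]  # spectral modulation of the peak, cycles/KHz
colnames(toyMS)[peak[1, 2]]  # temporal modulation of the peak, Hz
image(x = sort(tmHz), y = smCyc[keepRows], z = t(toyMS),
  xlab = 'Hz', ylab = '1/KHz')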
}
}
\references{
\itemize{
  \item Singh, N. C., & Theunissen, F. E. (2003). Modulation spectra of
  natural sounds and ethological theories of auditory processing. The Journal
  of the Acoustical Society of America, 114(6), 3394-3411.
}
}
\seealso{
\code{\link{modulationSpectrumFolder}} \code{\link{spectrogram}}
}
