J. P. Gilliver
2023-08-04 01:24:23 UTC
OK, not strictly broadcast, but YouTube is almost a broadcast channel.
I use yt-dlp a lot, usually on default settings (which AIUI usually gets
the best available), and then an extractor set to "extract original
audio" (I use Pazera, as it's easy to be sure it's extracting original
without any further transcoding; however, it's just ffmpeg-based, and I
presume any similar tool would yield the same result). [Yes, I looked
into using the audio-only settings for yt-dlp, but they didn't easily
lend themselves to batching; besides, I sometimes _do_ want the video
too - say, a clip of an artist performing, where I also want an
audio-only copy for use in the car. My muscle memory of the keystrokes to
extract the audio
means I can do it in seconds anyway.] I usually look at the resultant
audio - sometimes with the intention of reducing the filesize, sometimes
just out of curiosity. (I use GoldWave, but I presume almost any other
similar utility - such as Audacity - would yield similar results.)
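For anyone without Pazera, the "extract original audio" step can be sketched as a plain ffmpeg stream copy driven from Python. This is only a rough sketch, not what Pazera actually runs internally: the helper names are made up for illustration, it assumes ffmpeg is on the PATH, and the ".m4a" suffix in the batch loop assumes the downloaded file carries AAC audio (an Opus track would want something like ".opus" instead).

```python
import subprocess
import sys
from pathlib import Path


def build_copy_command(src, dst):
    """Return an ffmpeg command that copies the audio stream untouched.

    -vn drops the video stream; -c:a copy copies the audio packets as they
    are, so no decode/re-encode step takes place.
    """
    return ["ffmpeg", "-i", str(src), "-vn", "-c:a", "copy", str(dst)]


def extract_audio(src, dst):
    """Run the stream copy; raises CalledProcessError if ffmpeg fails."""
    subprocess.run(build_copy_command(src, dst), check=True)


if __name__ == "__main__":
    # Batch use: python extract.py clip1.mp4 clip2.mp4 ...
    # ASSUMPTION: the inputs carry AAC audio, hence the .m4a output suffix.
    for name in sys.argv[1:]:
        source = Path(name)
        extract_audio(source, source.with_suffix(".m4a"))
```

The output container has to suit whatever codec is in the file; ffmpeg will refuse the copy rather than transcode if it does not.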
Several observations:
1. The _vast_ majority are coded at 44100 Hz, stereo. I suppose that -
"CD quality" - is the default setting for many capture/encoding devices,
but it does seem overkill for mono material, especially of considerable
age (such as from 78s). Still, I'm not surprised. (I very occasionally
find one that _has_ been encoded mono. Though I don't think I've seen
any encoded at less than 44100 - certainly if I have, it's been
extremely rare.)
2. The _level_ is often extremely low - especially for some old (say
1960-1999) video clips. (Not all, by any means - but often enough to be
noticeable.) By low, I mean I have to boost them by ×4, or even ×8 or
occasionally ×16, to get the peaks above 50% full scale. (I only use
powers of 2 to avoid distortion.) Is this something YouTube are
imposing? Is audio level adjustment difficult on some common piece of
video capture hardware/software? I even came across one recently where
the uploader _said_ something like "this is quiet, you may have to
adjust" in the notes, so s/he knew about it. This does seem odd.
3. (This is the one that finally prompted me to post.) Far more often
than not - I'd say over 90% of tracks - there's a visible (I can no
longer hear that high) tone around 15½ kHz. I presume in the majority of
cases, it's the line timebase - 15625 Hz for "PAL" (yes I know, but
YKWIM), 15750 Hz for NTSC (about 15734 Hz for colour); even where it's
not from an actual video source, I presume it has been picked up
somewhere in the processing, e.g. from a computer
monitor/graphics card. This is _not_ what's puzzling me. What is, is
that the spectrum is very often brickwalled at that line: even where the
actual valid material is all below 15, 12, 11, 10, 8, or 6 kHz (you'd be
surprised how much _does_ have nothing valid above those!), and the
remainder is just uniform noise - it cuts off at the line. Can anyone
think why? It's nowhere near the Nyquist limit of 22050; I could
understand a rolloff _towards_ that to avoid aliasing, and that rolloff
being gentle to avoid other adverse effects, but no, it's brickwalled,
and at the line (which is _well_ below).
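The arithmetic behind observations 2 and 3 can be sketched in a few lines of Python. This is only an illustration with made-up function names, not anything GoldWave does internally. The first part picks the smallest power-of-2 boost that lifts a given peak to at least 50% of full scale; a power of 2 multiplies exactly (a bit shift on integer samples), so nothing is rounded provided nothing clips. The second part works out the line frequencies from lines per frame and frame rate and compares them with the Nyquist limit for 44100 Hz sampling.

```python
import math


def power_of_two_gain(peak, target=0.5):
    """Smallest power-of-2 gain that lifts `peak` to at least `target`.

    Both values are fractions of full scale. With target <= 0.5 the boosted
    peak always stays below 1.0, so the result cannot clip.
    """
    if not 0.0 < peak <= 1.0:
        raise ValueError("peak must be in (0, 1]")
    gain = 1
    while peak * gain < target:
        gain *= 2
    return gain


def gain_in_db(gain):
    """Express a linear amplitude gain in decibels."""
    return 20.0 * math.log10(gain)


def line_frequency(lines_per_frame, frames_per_second):
    """Horizontal scan (line) frequency in Hz."""
    return lines_per_frame * frames_per_second


SAMPLE_RATE = 44100
NYQUIST = SAMPLE_RATE / 2                           # 22050 Hz

PAL_LINE = line_frequency(625, 25)                  # 625-line, 25 frames/s
NTSC_MONO_LINE = line_frequency(525, 30)            # 525-line, 30 frames/s
NTSC_COLOUR_LINE = line_frequency(525, 30 / 1.001)  # colour frame rate

if __name__ == "__main__":
    for peak in (0.2, 0.1, 0.04):
        g = power_of_two_gain(peak)
        print(f"peak {peak:.2f}: x{g} ({gain_in_db(g):.1f} dB) -> {peak * g:.2f}")
    for name, f in (("PAL", PAL_LINE), ("NTSC mono", NTSC_MONO_LINE),
                    ("NTSC colour", NTSC_COLOUR_LINE)):
        print(f"{name}: {f:.1f} Hz = {f / NYQUIST:.0%} of Nyquist")
```

Each doubling is about 6 dB, so a clip needing ×16 was sitting roughly 24 dB down. All three line frequencies land at a little over 70% of Nyquist, which is the gap the brickwall question is about.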
--
J. P. Gilliver. UMRA: 1960/<1985 MB++G()AL-IS-Ch++(p)***@T+H+Sh0!:`)DNAf
I admire him for the constancy of his curiosity, his effortless sense of
authority and his ability to deliver good science without gimmicks.
- Michael Palin on Sir David Attenborough, RT 2016/5/7-13