EMUSIC-L Digest Volume 51, Issue 12
This issue's topics:
Sound Morphing (14 messages)
Your EMUSIC-L Digest moderator is Joe McMahon.
You may subscribe to EMUSIC-L by sending mail to listserv@american.edu with
the line "SUB EMUSIC-L your name" as the text.
The EMUSIC-L archive is a service of SunSite (sunsite.unc.edu) at the
University of North Carolina.
------------------------------------------------------------------------
Date: Fri, 16 Apr 1993 12:26:24 CDT
From: John Eichenseer
Subject: Re: transmogrification
>how is 'transmogrification' distinct from stepwise interpolation, or
>even just simple linear mixing with shifting coefficients?
>as for all the myriad ways of modulation, keep in mind that most are
>subclasses of a few very basic types.
Yes... using various combinations and algorithms of these basic types could lead
to an infinite number of paths from one sound to another. Which is the only
problem with:
>This is much easier to accomplish with synthesizers, of course;
Modulating parameters from one sound to the next will only lead you through one
set of transformative paths, if you will. You are right, though; it is an
extremely effective way to do it, for obvious reasons.
>Hmmm, now I wonder. If you took a graph of a waveform (or its Fourier
>transform, I'm not picky) and DID a graphical morph to a completely
>different waveform, what would it sound like? Probably a MicroWave.
Maybe just noise. It is hard to tell what algorithms would be meaningful...
>This idea has been implemented by a former grad student in
>Dartmouth's Electroacoustic program named Martin McKinney.
Wow - I think I will drop him a line... we have a NeXT '030 lying around...
>If sonic morphing were ever realized, it would
>represent the aural equivalent of selective breeding. This is a truly
>astounding prospect.
Yes - buy it and it will unlock your creativity. :-)
I thought my friend's neural net ideas were pretty neat; his theory was that he
could teach a net to transmogrify without ever having to know how to do it
himself... ah, yasss....
regards,
jhno
PS - anybody wanna buy an Outbound 2030E? (shaddup, metlay)
------------------------------
Date: Fri, 16 Apr 1993 12:02:34 PDT
From: metlay
Subject: Re: transmogrification
Jhno sez:
>I thought my friend's neural net ideas were pretty neat; his theory was that he
> could teach a net to transmogrify without ever having to know how to do it
> himself... ah, yasss....
HmmmmMMMMmmmmm.
>PS - anybody wanna buy an Outbound 2030E? (shaddup, metlay)
What! Did I say ANYTHING?! It's a great little computer and we both know
it; damn good for MIDI and cheap to upgrade, too. What? WHAT?!
--
mike metlay * atomic city * box 81175 pgh pa 15217-0675 * metlay@netcom.com
---------------------------------------------------------------------------
The Rhodes Chroma: The only synthesizer whose front panel buttons hit back.
------------------------------
Date: Sat, 17 Apr 1993 13:22:20 -0400
From: "James M. Macknik"
Subject: Sound Morphing/Transmogrification
Has anybody looked at algorithms already in use for graphic morphing
of computer images (like those used in the movies...T2 comes to mind)? The
relations between the two media might be promising for developing some kind
of working system for sound.
I'm in the beginning stages of research studying Synesthesia
(cross-modal relations between the visual and aural (musical) fields), and
this subject intrigues me. Most studies in this area, I have found, have
proved somewhat fruitless, for there are no direct cross-modal
relations common to every synesthete (for example: A# (in the aural field) at
amplitude [a] and timbral parameters [b],[c],[d] is equivalent to green
(in the visual field) at frequency [x] and intensity [y]). For each subject
claiming to be synesthetic, there are different cross-modal references.
Perhaps studying an algorithm for morphing in the visual field might
yield a few solid relations to morphing in the aural field. Most likely, a
custom driver and oscillator would have to be built to accomplish such a task.
Perhaps using a workstation to control both visual and aural output
*simultaneously* would be the method of experiment.
Just food for thought. Please write back to me or respond on the
list to tell me if this relation to visual context holds any validity, or if
I'm just talking out of my sphincter. As well, if anyone out there is
working on a related subject, please write back.
Aloha Nui Loa-
Jim
--
James M. Macknik Assistant to MicroComputer Specialists
Junior - Connecticut College Department of Academic Computing
New London, Connecticut Connecticut College
Home (203)-439-3560 Work (203)-439-2354
BITNet: jmmac@conncoll.bitnet InterNet: jmmac@mvax.cc.conncoll.edu
------------------------------
Date: Sat, 17 Apr 1993 14:01:42 -0400
From: "Casimir J. Palowitch"
Subject: Re: Sound Morphing/Transmogrification
On Sat, 17 Apr 1993, James M. Macknik wrote:
> I'm in the beginning stages of research studying Synesthesia
> (cross-modal relations between the visual and aural (musical) fields), and
> this subject intrigues me. Most studies in this area, I have found, have
> proved somewhat fruitless, for there are no direct cross-modal
> relations common to every synesthete (for example: A# (in the aural field) at
> amplitude [a] and timbral parameters [b],[c],[d] is equivalent to green
> (in the visual field) at frequency [x] and intensity [y]). For each subject
> claiming to be synesthetic, there are different cross-modal references.
Jim, you might be interested in the work done by Christopher Penrose of
Princeton, who wrote a program for NeXTStep called HyperUpic.
(Christopher, if you're on this list, Hi) Here is the introductory file:
The program is a lot of fun and does some excellent sound creation.
--------------------begin hyperupic introduction ------------------------
Welcome to Hyperupic!
Hyperupic is a color image to sound transducer. In other words, Hyperupic
transforms a 24-bit rgb TIFF image into a 16-bit sound file using a
variety of dimensional mapping schemes. Unfortunately, Hyperupic
transduction is a computationally expensive process. You'll need to hold
your horses.
Fortunately, this process seems to bear gifts; it has unveiled unique
and coherent sounds from many trial images, and it shows promise of being a
subtle and potent sound exploration tool.
Hyperupic employs oscillator bank resynthesis to synthesize a sound from
a user specified frequency distribution and amplitude information derived
from the input TIFF image. Unfortunately, to answer a predictable
question, this oscillator bank is not implemented using the NeXT's
resident 56001 DSP. Volunteers?
Hyperupic can even be used as a (relatively) poor-man's Upic: the sound
representation system conceived by Greek supercomposer Iannis Xenakis.
Just launch icon, grab a funky brush pattern and draw a wacky image. Save
it as a 24-bit alpha-free image, and load it into Hyperupic. Taste-tee!
What separates Hyperupic from its $30,000 cousin is that Hyperupic can
transduce images of trees, Bosnia, or even Elvis. Hyperupic uses color.
By the way, Hyperupic is free of copy restrictions; I hate them, but I
may have to resort to selling software in the future. For the time being
you can trade this software like you would a virtual baseball card. You
can even pretend you wrote this software yourself! But if you do this,
you might catch a rare strain of leprosy from a pipe-smoking stranger.
Forward all questions and lucrative compositional commissions to:
Christopher Penrose
Department of Music
Princeton University
Princeton, NJ 08544
penrose@silvertone.princeton.edu
--------------------end HyperUpic introduction ---------------------
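For the curious: the core transduction -- image rows as oscillators,
columns as time, pixel brightness as amplitude -- can be sketched in a
few lines of Python. This is my own toy version, not Christopher's code
(his real mapping uses color as well), and every constant below is a guess:

    # toy image-to-sound transducer in the Hyperupic spirit:
    # each image row drives one oscillator in the bank; columns step
    # through time; pixel brightness sets that oscillator's amplitude.
    import numpy as np

    def transduce(image, sr=44100, dur=10.0, f_lo=80.0, f_hi=8000.0):
        rows, cols = image.shape            # 2-D array of brightness values
        t = np.arange(int(dur * sr)) / sr
        # spread oscillator frequencies exponentially, with the low
        # pitches at the bottom row of the image
        freqs = f_lo * (f_hi / f_lo) ** (np.arange(rows)[::-1] / (rows - 1))
        col = np.minimum((t / dur * cols).astype(int), cols - 1)
        out = np.zeros_like(t)
        for r in range(rows):
            amp = image[r, col]             # brightness -> amplitude track
            out += amp * np.sin(2 * np.pi * freqs[r] * t)
        return out / rows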
** Casimir J. (Casey) Palowitch - In 1996, there will be two kinds **
** Slavic Cataloger - of computer professional : those **
** U. of Pgh. Library Systems - who know NeXTStep, and those **
** cjp+@pitt.edu - without Jobs. **
------------------------------
Date: Sat, 17 Apr 1993 22:02:07 GMT
From: "Luis E. Scheker"
Subject: Re: Sound Morphing/Transmogrification
In article <9304171722.AA10463@mvax.cc.conncoll.edu>
"James M. Macknik" writes:
> Has anybody looked at algorithms already in use for graphic morphing
> of computer images (like those used in the movies...T2 comes to mind)? The
> relations between the two media might be promising for developing some kind
> of working system for sound.
For a while, I had wondered why no one had taken the principles of
graphic morphing and applied them to sound. After further
consideration of the idea, however, I realized that the nature of sound
presents a significant problem which must first be addressed before
morphing similar to the graphical model could ever be achieved.
Unlike animation, sound only makes sense over time. Animation can
be broken up into static frames that, in and of themselves, are
discrete and functional units. These individual units can be digitized
and then manipulated with entirely predictable results (e.g. changing a
1 to a 0 in a bit-mapped picture might turn off a pixel [most picture
formats are not this simplistic, but you get the idea]). An individual
sample unit, however, is meaningless. Many sample units must be heard
in succession for a person to be able to identify an instrument and its
pitch. Furthermore, as samples are raw data that change over time,
rather than static, mapped information, there doesn't seem to be a way
to predictably manipulate digitized sound on the binary level. What
does a 1 or 0 mean in the sea of numbers representing a guitar strum or
drum hit?
I suppose we'll have to wait for an alternative to sampling that
can digitally mimic the individual aspects of a natural sound over
time. Would the much-ballyhooed (by Keyboard magazine, at least)
real-time additive synthesis be able to achieve the type of control
over sound necessary for morphing?
------------------------------
Date: Sun, 18 Apr 1993 02:52:33 -0400
From: "James M. Macknik"
Subject: Sound Morphing/Continued
-=Here are some comments on Luis Scheker's reply to my letter=-
> Unlike animation, sound only makes sense over time. Animation can
> be broken up into static frames that, in and of themselves, are
> discrete and functional units. These individual units can be digitized
> and then manipulated with entirely predictable results.
I think there is one particular flaw in the above statement, and I would
like to see if you can discover it before I explain what I mean. Perhaps
this will help to clarify:
> An individual sample unit, however, is meaningless. Many sample units
> must be heard in succession for a person to be able to identify an
> instrument and its pitch.
Here, I think, is where the flaw in your argument lies. The above
argument may be true enough, but only in very few contexts. If, for example,
you have a 5-sec. animated graphic and isolate one frame, you are not, as
you imply above, viewing that frame on the same time scale as it is
shown in the context of the animation. You are isolating one moment in time
(relative to the animated production) and displaying it for many times
its original duration. To clarify further: if the frame is 1/30th of
a second, you do not isolate that frame and view it for that 1/30th of a
second, but for a much longer period of time, say 10 seconds. This
extension, or stretching, of the contextual time for that frame allows you to
view the frame as a moment within its own context.
Relating this to a musical animation, oh, say a violin sample, I
think you will find the results similar. If you were to take a 5-sec.
sample of a violin, isolate 1/30th of a second of that sound, and play
it for that length (which is its length in the context of the musical
*animation*), you would indeed find it difficult to perceive much
timbrally or to identify a pitch. If, however, you were able to extend this
moment as long as you wish, by removing it from the context of its own
animatory sequence, an aural equivalent of a visual freeze-frame would be
produced, through which you might actually find that you could perceive such
aspects of the sample as pitch, timbre, and amplitude, much in the same way
that you are able to perceive color shades, implied 3-dimensional relations
on a 2-dimensional surface, and brightness when viewing a visual freeze-frame.
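(As a crude experiment, you could fake this aural freeze-frame by
looping a single windowed slice of the sample. A sketch in Python; every
number here is a placeholder:

    # aural "freeze-frame": loop one short windowed slice of a sample so
    # that a 1/30th-sec. moment can be heard for as long as you like
    import numpy as np

    def freeze(sample, sr, at_sec, slice_sec=1/30, hold_sec=10.0):
        start = int(at_sec * sr)
        length = int(slice_sec * sr)
        # window the slice to avoid clicks at the loop points
        grain = sample[start:start + length] * np.hanning(length)
        out = np.zeros(int(hold_sec * sr))
        hop = length // 2               # 50% overlap-add smooths the seam
        for i in range(len(out) // hop):
            pos = i * hop
            if pos + length > len(out):
                break
            out[pos:pos + length] += grain
        return out

The result is buzzy rather than truly static -- the loop itself imposes
a period -- but it makes the point that the moment can be inspected.)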
> Furthermore, as samples are raw data that change over time,
> rather than static, mapped information, there doesn't seem to be a way
> to predictably manipulate digitized sound on the binary level. What
> does a 1 or 0 mean in the sea of numbers representing a guitar strum or
> drum hit?
This is true enough, though it is a technological wall rather than a
conceptual impossibility, I think. What kind of meaning can be derived from
a 2-dimensional representation of a 3-dimensional image? One dimension is
missing, yet we are able to interpolate depth without much problem. So
who's to say we can't develop a decent system for mapping the dimensions of
sound?
> I suppose we'll have to wait for an alternative to sampling that
> can digitally mimic the individual aspects of a natural sound over
> time.
You're right, and it will probably be a long time. The thing that *I* think
people have to remember is not to think of sound as more limited than
picture, with less dimension, depth, or complexity. If anything (for the
human ear at least), it's more complex, for we haven't developed a way to
identify and describe everything we hear. Perhaps some day we'll have a way
to really describe the multi-faceted dimensions of sound, but you're right,
it'll be a while.
Sorry this is so long-winded, but there was a lot here to which I
felt compelled to respond. If anyone has any refutations to anything I've
said, I'd love to hear 'em. Respond here or drop me a line.
Aloha,
Jim
--
James M. Macknik Assistant to MicroComputer Specialists
Junior - Connecticut College Department of Academic Computing
New London, Connecticut Connecticut College
Home (203)-439-3560 Work (203)-439-2354
BITNet: jmmac@conncoll.bitnet InterNet: jmmac@mvax.cc.conncoll.edu
------------------------------
Date: Mon, 19 Apr 1993 13:58:35 +0000
From: Nick Rothwell
Subject: Morphing
>The more they overthink the plumbing, the easier it is to stop up the drain.
To be fair, many of today's instruments are quite discrete and so morphing
doesn't make much sense except at a quite coarse level, and due to the
complexity of the architecture a lot of the intermediate points are going
to be contradictory in some way, enough to wedge the machine. On a
Wavestation, going from a 1-voice patch to a 2-voice patch is discrete:
what happens in the middle? How do you morph between two wave sequences
(one of which is looped, say)? I'm sure there are schemes which would work,
but they need to be designed, and they'll be subjective.
You can morph clay, but you can't morph Lego.
Nick Rothwell | cassiel@cassiel.demon.co.uk
CASSIEL Contemporary Music/Dance | cassiel@cix.compulink.co.uk
------------------------------
Date: Mon, 19 Apr 1993 15:31:49 EDT
From: ronin
Subject: transmogrificationissimo
i guess i remain unconvinced. i don't really see what the issue is.
first, neither digital sound nor image is continuous. we don't see
pixels, we see images, which take finite time to render, and which
recur quickly enough that the illusion of continuity is maintained.
we don't hear bits, we hear waves, and in fact we don't even hear single
cycles, we hear multiple cycles, whose period of recurrence we perceive
as pitch. change in any medium, in any dimension, if it occurs within
given thresholds, is for all intents and purposes continuous (at least
for our intents and purposes... i would prefer not to enter a debate
concerning complex systems modeling.)
second, i don't see why 'graphical morphing' of two waveshapes should
be any different from time-domain convolution. by description, one
seems to be the technical term for the other.
if one is interested in producing something other than a conventional
complex filtering algorithm, why not try this:
specify the spectra of a start and an end tone, either as the analyses
of generated tones, or arbitrarily. let each spectrum consist of, say,
32 components. rather than shifting from one spectrum to the other
by filtering, which will simply adjust the amplitudes of components
in place, produce a one-to-one component map that determines the pitch
and amplitude of each component. one could go so far as to map component
1 of sound 1 to component 32 of sound 2, and so on, so that the inbetweens
are made of pitch-shifting elements that 'resolve' to the final spectrum.
whereas, in something like the waldorf, fixed components appear and
disappear by a continuous 'weighting' process, this would produce
continuously moving components. i think this would be more like what is
meant by 'transmogrification'.
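to make that concrete, a rough python sketch (32 components, a linear
glide, and the reversed map are all arbitrary choices; the component map
is the interesting knob):

    # one-to-one spectral component morph: each component glides in
    # pitch and amplitude from its slot in spectrum a to its (possibly
    # permuted) slot in spectrum b.
    import numpy as np

    SR = 44100

    def component_morph(freqs_a, amps_a, freqs_b, amps_b, mapping,
                        dur=4.0, sr=SR):
        # mapping[i] = index of the component in b that component i becomes
        t = np.arange(int(dur * sr)) / sr
        x = t / dur                                 # morph position, 0 -> 1
        out = np.zeros_like(t)
        for i in range(len(freqs_a)):
            j = mapping[i]
            f = (1 - x) * freqs_a[i] + x * freqs_b[j]   # gliding pitch
            a = (1 - x) * amps_a[i] + x * amps_b[j]     # gliding amplitude
            phase = 2 * np.pi * np.cumsum(f) / sr       # integrate freq
            out += a * np.sin(phase)
        return out / len(freqs_a)

    # e.g. map component 1 of sound 1 to component 32 of sound 2:
    # mapping = np.arange(32)[::-1]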
-----------< Cognitive Dissonance is a 20th Century Art Form >-----------
Eric Harnden (Ronin)
or
The American University Physics Dept.
4400 Mass. Ave. NW, Washington, DC, 20016-8058
(202) 885-2748 (with Voice Mail)
---------------------< Join the Cognitive Dissidents >-------------------
------------------------------
Date: Tue, 20 Apr 1993 14:49:11 EDT
From: wbf@ALUXPO.ATT.COM
Subject: Re: Transmogrificationissimo
A lot of posts culminating in Eric Harnden saying:
> i guess i remain unconvinced. i don't really see what the issue is.
> ...etc...
How about twiddling the knobs on any old analog synth? Is that
transmogrificationism enough for the desired effect, or is something more
radical-sounding desired? What are people truly envisioning here?
"Fascinating!"
--
Bill Fox * Fox's Den Recording Studio * Nazareth, PA * wbf@alux1.att.com
------------------------------------------------------------------------
You don't have to worry about some bozotic little CPU chip going er er er
if you ask it to do something odd..... (Mike Metlay)
------------------------------
Date: Tue, 20 Apr 1993 15:26:44 -0400
From: Joe McMahon
Subject:
>How about twiddling the knobs on any old analog synth? Is that
>transmogrificationism enough for the desired effect or is something more
>radical sounding desired? What are people truely envisioning here?
>"Fascinating!"
Literally, a transformation of a sound from the standpoint of something
like the T2 transformation from a tile floor through flowing metal into a
human form. I want a sound to start as a flute, shift into filtered pink
noise, and then transform into a bass drum (as an example, dunno what I'd
be able to use that particular sound for).
Several people have mentioned a cross-fade. This isn't what I'm shooting
for; that's sort of (visually speaking) fading out one image and fading in
another. The transformation I envision (notice that there's no word for
describing what you hear in your mind's ear?) is one where the components
retain their volume, but are transformed "in place".
Bill's close; a patch based on a sine wave could be distorted into
sawtooth, for instance. However, the hardware has to support that
operation. Think of being able to automate turning dials and rewiring so
that a transition from sound to sound flowed gradually from one to the
other. That's what I'm after.
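In the abstract, I mean something like this (a sketch only; the
parameter names are made up for illustration):

    # "automated dial turning": every continuous parameter of patch A
    # glides to its counterpart in patch B as x runs from 0 to 1
    def morph_patch(patch_a, patch_b, x):
        return {name: (1 - x) * patch_a[name] + x * patch_b[name]
                for name in patch_a}

    flute = {"cutoff": 4000.0, "resonance": 0.1, "attack": 0.08}
    bass_drum = {"cutoff": 300.0, "resonance": 0.4, "attack": 0.001}

    for step in range(11):    # send each in-between patch to the synth
        print(morph_patch(flute, bass_drum, step / 10))

The hard part, of course, is the discrete stuff -- the rewiring -- which
is exactly where the hardware has to cooperate.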
I was thinking about Eric's comments about the Fourier spectra. One could
fade peaks down and others up, or one could shift and fade peaks and
valleys. Intuition tells me that the two operations would sound a lot
different; operation 1 would transform sound-to-sound with a sawtooth-like
peak in the middle (because of the approximately equal prominence of the
harmonics in mid-transformation). Operation 2 I can't imagine. I'd love to
hear it, though.
--- Joe M.
"Primal scream therapy you can dance to?" (A. Schabtach)
------------------------------
Date: Tue, 20 Apr 1993 16:43:00 EDT
From: rg8
Subject:
>Literally, a transformation of a sound from the standpoint of something
>like the T2 transformation from a tile floor through flowing metal into a
>human form. I want a sound to start as a flute, shift into filtered pink
>noise, and then transform into a bass drum (as an example, dunno what I'd
>be able to use that particular sound for).
Jean-Claude Risset has done this in Passages (for flute and tape;
Wergo 2013-50). There are "passages" where the flute is transformed
gradually into a vocal sound using Music V. This was 10 years ago, BTW.
As has been mentioned, I don't see how you can do this with MIDI
instruments, since the general architecture has dead ends -- or at least
limits -- for transforming one patch into another. If the transformation is
created by an algorithm which uses models of the source and destination
sounds, then the change takes place at the single-cycle level of the
waveform. Ergo, no Lego (or at least the bricks are so small, and there are
so many of them, that you don't hear the bumps).
Bob Gibson
rg8@umail.umd.edu
------------------------------
Date: Tue, 20 Apr 1993 14:57:42 -0600
From: JONATHAN SMITH
Subject: Re: - no subject (01GX8KQL6L088WWCSY) -
>>I was thinking about Eric's comments about the Fourier spectra. One could
>>fade peaks down and others up, or one could shift and fade peaks and
>>valleys. Intuition tells me that the two operations would sound a lot
>>different;
>
> Remember that a Fourier transform is linear. So fading peaks down and
>others up to transform one spectral set to another is mathematically and
>sonically identical to the cross fade.
Well, if you wanted to keep basically the same type of method: you could
duplicate the information and perform Fourier transforms on each copy,
deciding how much of each to include in the final, then patch those two
(or more) back together following some preset rules.
Jonathan Smith
------------------------------
Date: Tue, 20 Apr 1993 16:50:27 -0500
From: Brian Adamson
Subject:
>I was thinking about Eric's comments about the Fourier spectra. One could
>fade peaks down and others up, or one could shift and fade peaks and
>valleys. Intuition tells me that the two operations would sound a lot
>different;
Remember that a Fourier transform is linear. So fading peaks down and
others up to transform one spectral set to another is mathematically and
sonically identical to the cross fade.
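A quick numerical check of that claim (my sketch; any two signals will do):

    # linearity of the Fourier transform: interpolating two spectra and
    # transforming back is exactly the time-domain cross-fade
    import numpy as np

    x = np.random.randn(1024)     # "sound A"
    y = np.random.randn(1024)     # "sound B"
    w = 0.3                       # cross-fade position

    spectral = np.fft.ifft((1 - w) * np.fft.fft(x) + w * np.fft.fft(y))
    temporal = (1 - w) * x + w * y

    print(np.allclose(spectral.real, temporal))   # -> True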
>operation 1 would transform sound-to-sound with a sawtooth-like
>peak in the middle (because of the approximately equal prominence of the
>harmonics in mid-transformation). Operation 2 I can't imagine. I'd love to
>hear it, though.
However, I think (if I remember correctly) Eric's comments discussed
horizontal motion on the frequency axis (if you visualize an amplitude vs.
frequency plot) in addition to vertical motion (fading spectral peaks).
(This is what you call Operation 2, I think?) This might be interesting if
you shift from a sound with, say, predominantly odd spectral content to one
with even content. Again, the rate and motion of the shift would likely
play a bigger role in the perceived character of the timbre than the actual
start- and end-point sounds themselves. Something like FM modulation on
individual spectral components, sliding the peak of one harmonic to
another? So imagine a sound engine with FM envelopes for each of 32 or so
harmonic components. A very slow morph rate would start with a nice,
harmonically clean square wave, then slide through some dissonant-sounding
stuff to a harmonically clean sawtooth spectrum. A higher morph rate
(analogous to a higher-frequency FM modulator) might tend to sound less
dissonant on average. The morphing part (i.e., FM spectral modulation, if
you will) keeps the sound interesting, not necessarily the starting or
ending sounds themselves.

But you wouldn't want the morphing process to be simply cyclic either, the
same thing repeated over and over. (While watching a picture of a sphere
morph into a cube and back again might be interesting a couple of times,
you would probably get bored watching, or listening to, this process
repeatedly.) I have played with sound synthesis on real-time DSP where the
sound results from its harmonic content (amplitude and phase of each
harmonic) being controlled by time-varying random processes. This keeps the
harmonic content interesting without sliding through dissonant (or
non-harmonic) spectra. (I'm not saying non-harmonic spectra aren't
interesting :) ) This is analogous to Joe's Operation 1, with continually
random start and end points. It could be modified with weights on the
random processes, which would create essentially deterministic start and
end points with controlled random harmonic motion between the two sounds
(i.e., weight the probabilities to start out with a square-wave spectrum,
then shift those weights through a non-deterministic spectral region to
another deterministic, e.g. sawtooth, spectrum): hence a cross-fade with a
random envelope :)

Operation 2: shift the frequency of peak harmonics in the start sound to
the frequency of peaks in the end-point sound (and the same for valleys)?
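A minimal sketch of that weighted-random idea (harmonic count, control
rate, and walk step are all arbitrary):

    # additive synthesis with harmonic amplitudes driven by a random
    # walk, weighted so the spectrum starts as a square wave and ends as
    # a sawtooth: a cross-fade with a random envelope
    import numpy as np

    SR, DUR, NH, F0 = 44100, 4.0, 16, 110.0
    n = np.arange(1, NH + 1)
    square = np.where(n % 2 == 1, 1.0 / n, 0.0)   # odd harmonics at 1/n
    saw = 1.0 / n                                 # all harmonics at 1/n

    frames = int(DUR * 100)                       # 100 control frames/sec
    amps = np.zeros((frames, NH))
    walk = np.zeros(NH)
    for k in range(frames):
        x = k / (frames - 1)                      # morph position 0 -> 1
        walk = 0.99 * walk + 0.02 * np.random.randn(NH)
        # random motion is weighted to vanish at the two endpoints
        amps[k] = (1 - x) * square + x * saw + (1 - abs(2 * x - 1)) * walk

    t = np.arange(int(DUR * SR)) / SR
    ctrl_t = np.linspace(0, DUR, frames)
    out = sum(np.interp(t, ctrl_t, amps[:, h]) *
              np.sin(2 * np.pi * n[h] * F0 * t) for h in range(NH))
    out /= np.max(np.abs(out))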
- Brian Adamson
------------------------------
Date: Tue, 20 Apr 1993 18:20:08 -0400
From: Joe McMahon
Subject:
>As has been mentioned, I don't see how you can do this with MIDI
>instruments, since the general architecture has dead ends -- or at least
>limits -- for transforming one patch to another.
Well, existing instruments themselves I'm sure can't do it. However, a
decent sampler with a SCSI connection ought to be able to hand me a file or
files I can mess with and load back again. This is just speculation,
really, but if it stimulates someone to experiment, it was worthwhile. It's
even more worthwhile if I get to hear it sometime.
--- Joe M.
"Primal scream therapy you can dance to?" (A. Schabtach)
------------------------------
End of the EMUSIC-L Digest
******************************