[Dixielandjazz] The Ultimate Synthesizer

Mon Nov 24 13:07:25 PST 2003

Long, not specifically OKOM, but could be. Imagine, with the below
technology, there would be no more need for tribute bands. The subject,
Elvis, Holiday, Water, ODJB, could be completely "covered" by a
computer. Woe is me . . . Is it Art? Can you imagine Louis Armstrong
singing rap? Read on.

Kazh, the technology was developed in Spain. Did you have any input? ;-)

Cheers,
Steve Barbone

November 23, 2003 - NY Times

Could I Get That Song in Elvis, Please?

By BILL WERDE

Imagine having a singer with a world-class voice at your disposal,
any hour of any day. She's just standing at the ready, game to perform
whatever silly song you might make up for her: a ballad about her love
for you, a tribute to your best friend's golf game, a stirring rendition
of the evening's dinner menu.

Close friends of Madonna or Mariah may already have had that pleasure,
but for everyone else a new technology called Vocaloid may offer the
next best thing. Developed at Pompeu Fabra University in Spain and
financed by the Yamaha Corporation, the software, which is due to be
released to consumers in January, allows users to cast their own (or
anyone else's) songs in a disembodied but exceedingly life-like
concert-quality voice. Just as a synthesizer might be programmed to play
a series of notes like a violin one time and then like a tuba the next,
a computer equipped with Vocaloid will be able to "sing" whatever
combination of notes and words a user feeds it. The first generation of
the software will be available for $200. But its arrival raises the
prospect of a time when anyone with a laptop will be able to repurpose
any singer's voice or even bring long-gone virtuosos back to life. In an
era when our most popular singers are marketed in every conceivable way
— dolls, T-shirts, notebooks, make-up lines — the voice may become one
more extension of a pop-star brand.

The human voice has proven the most difficult of all sounds to
synthesize. Digital technology can produce something clear enough to
convey meaning, but only in a clipped monotone that sounds more like a
robot than a real live person. A convincing human voice, spoken or sung,
with all its complex, flowing articulations and quivering uncertainties,
has been unattainable. Yamaha has not yet made Vocaloid available for
scrutiny, but judging by some early samples and demonstrations, the
company seems to have made that quantum leap.

You can think of the software as a kind of audio font: musical notation
and lyrics can be translated into the chosen voice, then saved for
replay, just as a word processor might translate a text into Helvetica
or Times New Roman and print it out as many times as you like.

These fonts are made up of a database of phonemes, the basic sounds that
make up any language. To create the database, technicians record a
singer performing as many as 60 pages of scripted articulations (like
"epp, pep, lep"). Assorted pitches and techniques like glissandos and
legatos are also thrown in the mix; with all the combinations, the
process takes a week of five-hour singing days. The resultant font is
"reminiscent" of the singer's voice, says Ed Stratton, the managing
director of Zero-G Limited, a London-based company that has licensed the
Vocaloid technology.

Zero-G is using Vocaloid to create the first of these fonts: Leon,
described as a "Virtual Soul Vocalist," and Lola, his female
counterpart. The digitized duo will make their debut in January at the
International Music Products Association conference in Anaheim, Calif.

The technology first attracted attention in March at Musikmesse, an
annual music technology conference in Germany. Paul White, the editor of
the British audio gear magazine Sound on Sound, was there for the
demonstration. "A few simple tools were used to adjust inflection, tone,
vibrato and so on," wrote Mr. White. "Within minutes, the computer was
singing like a professional!" A Vocaloid version of the song "Amazing
Grace" — recorded with prototype technology, yet still more human
sounding than any previous vocal synthesis — was released on Yamaha's
Web site shortly after the conference. Quickly, that sample drew links
from sites in the Netherlands, Germany, France, Japan, Russia and the
United States, setting Internet message boards and chat rooms buzzing.

In the case of Leon and Lola, session singers were hired to record what
Mr. Stratton calls "generic soul-singing voices." The decision to start
with soul was purely a marketing calculation: Mr. Stratton figured that
the most common use of Vocaloid, at least in its early stages, would be
to serve as background singers. With a soulful sound, the company could
target a commercial market that ranges from Justin Timberlake to Jay-Z.

But Mr. Stratton has many more plans. Soon, he said: "You'll buy new
fonts and then any song you write, you can hear it sung a number of
ways. You might hear what it sounds like sung by a soul singer, and then
an operatic voice or a choir boy."

Hit music producers like Dan (The Automator) Nakamura (a creator of the
Gorillaz, a band that appeared only in an animated form, but sold
several million albums anyway) and the Matrix (the trio of Scott Spock,
Graham Edwards and his wife, Lauren Christy, that produced the three No.
1 hits from Avril Lavigne's last album) say they are likely at least to
try recording with Vocaloid instead of backup singers. "As producers,
you run into some artists and oh god, it's so hard to get the right
vocal," Mr. Spock said. "It's intriguing, this idea of `O.K., just give
me all your vowels and all your consonants and I'll see you later.' "

Mr. Nakamura says he would want to use the software to create sounds
that human voices could not. "The first producers to work with this are
probably going to have a hit just based on the novelty factor," he said.
But, he warns, "it's the imperfections in a voice, the happy accidents,
the human-ness that are often what's best in a song."

The market for synthesized voices extends well beyond recorded music.
For example, cell phone ring tones — a rapidly expanding field — already
use synthesized voices to personalize incoming calls. The DA Group, a
Scottish company, uses patented technologies to animate several popular
virtual stars, including Ananova, the British newscaster who exists
solely online as a lifelike, digital countenance, and Maddy, the bank
teller avatar who is being tested on ATM's in several markets around the
United States. After listening to some Vocaloid samples online, Mike
Antliff, the company's chief executive, said, "I'm going to have my
research team look into this as soon as I get off the phone."

Vocaloid's next application will be Miriam, a third font that Zero-G
expects to release later in 2004. (A Japanese company, Crypton, expects
to release its own font — "Japanese Pops," a bubbly female voice — in
March.) Miriam is based on recordings of Miriam Stockley, a singer for
the new age group Adiemus, which has worldwide album sales in excess of
several million. "At first I was quite horrified by the idea," Ms.
Stockley said. "People tend to pay a lot of money to get my sound, and
here I am putting it on a font."

She changed her mind, she said, because "you can't fight progress, no
matter how strange it sounds." She also negotiated an undisclosed
percentage for each copy of Miriam that sells. But once Miriam the vocal
font is out there in the public, Ms. Stockley the actual singer has
little control of how it will be used. Anyone who legally purchases the
font is entitled to use it to write songs for commercial purposes,
though they're not allowed to market them as Ms. Stockley's own
recordings.

Mr. Stratton reiterated the point: "When vocal fonts are used, the
performer is the user and Vocaloid is an instrument."

In the long term, Mr. Stratton is aware that the true killer application
will be recognizable celebrity fonts — the Elton, say, or the Aretha.
But so far, none of the world's most famous voices have volunteered.

Michael Stipe of R.E.M. heard a Vocaloid version of "Amazing Grace"
online, and he said he was impressed. (The Yamaha Corporation includes
samples with a recent press release at
http://www.global.yamaha.com/news/20030304b.html.) But he wasn't
prepared to rush out and have a font created. "I would hate to think
that 250 years from now Altria would use the Michael Stipe voice to sell
organic soy to a Mars landing," he said. "It's intriguing in 2003. I'm
not sure about 2303."

If Napster and other online file-trading programs have taught the world
anything, it's that once a technological cat is out of the bag, it can
be difficult to control. What's to stop dilettantes from creating their
own fonts? Could it be long before falsified but entirely convincing
clips of Britney Spears begging for Justin's forgiveness circulate on the
Web — to say nothing of George Bush conspiring with Tony Blair about
weapons of mass destruction?

"It is a matter of time before Yamaha makes this technology available
for consumers to make their own fonts," Mr. Stratton said. But at
present, the process, which requires a deep knowledge of phonetics and
audio engineering, is too complex for ordinary consumers. Even if an
ingenious audiophile were to untangle the process, however, he would
still need a database of thousands of articulations — more than someone
would be likely to cobble together from available recordings. As for
famous voices now lost to time, if they left behind a substantial enough
catalog, it might be possible to produce at least a portion of the
required phoneme database. The rest of the required vocals could come
from a sound-alike singer.

Elvis seems like an obvious candidate for vocal reanimation. Recently
(and for the first time), his estate licensed a couple of his songs for
dance-floor remixes; one of them became a No. 1 single in England.
Licensing Elvis for Vocaloid would be a different matter, though, says
Gary Hovey, vice-president of entertainment for Elvis Presley
Enterprises. "If someone came to us and said, `We want Elvis to sing
this new song,' we'd have a lot to contemplate," he said. "We tried to
retain the integrity of his original song with the remixes. Now you're
talking about a whole new vocal performance of a song he never sang or
knew? How do we know he'd want to sing it?"

"Believe me, that would go all the way to Lisa," he added, referring to
Elvis's daughter, Lisa Marie Presley, who owns Elvis's estate.

Still, there is the potential for enormous money to be made, even by
Elvis standards. How much would an advertiser pay to have Elvis sing a
new jingle? How easily would a new "Elvis" song climb the pop charts —
if only for the novelty value? Mr. Stratton is optimistic about the
prospect. "No font comes out of the box with a singer's timing and
expressions," he said. "It's just the tone of his voice and his
pronunciations. The finer bits of expression — timing, pitch bend, the
sorts of things that add real character — would have to be added by the
user working with the font. It would take a great deal of effort to make
it sound just like Elvis. But you could do it."

Once a full palette of vocal fonts is available (or once Yamaha allows
users to create their own), the possibilities become mind-boggling: a
chorus of Billie Holiday, Louis Armstrong and Frank Sinatra; Marilyn
Manson singing show tunes and Barbra Streisand covering Iron Maiden. And
how long before a band takes the stage with no human at the mike, but
boasting an amazing voice, regardless?

In fact, in today's world of computer-produced music, who needs humans
at all? Vocaloid could be used as part of an integrated music-generating
machine. Start with any number of existing programs that randomly
generate music. Run those files through Hit Song Science, the software
that has analyzed 3.5 million songs to determine mathematical patterns in
hit music. (Major labels are already taking suggestions from it —
"Slower tempo, please, and a little more melody at the bridge.") Throw
in a lyric-generating program, several of which can be found free
online, and then route the notes and lyrics through Vocaloid to give the
song a voice. It might not be a hit, but the process could provide
inspiration for a lot of lonely songwriters.

At this early stage of its development, the future life of this
technology is as much fun to think about as the almost-human voices
could be to play with. At the very least, Vocaloid promises to bring a
whole new copyright-infringing definition to the phrase "losing one's
voice." We may soon know if an unmanned computer could produce hit
singles or the voice of tomorrow's virtual pop hero. Lisa Marie,
anything to say about that? And really, can we even be certain it's you?