Fascinating topic and something new that deserves consideration.
As I explained in more detail in another topic,
everything involved with computers is AI in one way or another: audio hardware like Fender and MOTU external digital audio and MIDI processors, Digital Audio Workstation (DAW) applications, VSTi virtual instruments and their sampled-sound libraries, and VST effect plug-ins.
The operating systems (Linux, macOS, and Windows) are AI as well; and from a practical perspective, software engineering itself is an AI activity, since the way applications behave is based on the artificial intelligence that Software Engineers (SE) build into them, which I suppose makes it SE/AI.
There are different levels of what AI does and is capable of doing, of course; but how much of it one decides to use is a subjective matter, depending at least on how absolutely and restrictively you define "AI Sound Source", since as noted everything in the digital music production universe is AI to one degree or another.
One of the things I have been doing for decades is old-time science fiction radio plays; for the first few decades I read the scripts for all the characters, but over a year ago I discovered ElevenLabs AI and now use it for female voices and certain male voices, which I enhance with various VST effect plug-ins, audio clip editing in Fender Studio Pro 8, and Melodyne to make the voices more personalized, tailored, and in some instances alien in outer-space ways.
Years ago, I started using Realivox Blue (RealiTone) as a virtual soprano in Kontakt; she is my favorite virtual soprano, and she sings melodies composed in music notation using words provided in custom phonetic-language scripts created via the Realivox Blue GUI running in Kontakt.
For a long time, I did
not use highly automated musical ensembles; but starting several years ago, I decided to use a few of what one can call "AI ensembles". I select them and provide the music notation; perhaps the best example is "Hollywood Pop Brass" (EW ComposerCloud+), which has a virtual festival of different horn-section riffs but also can be used to play music notation for individual brass instruments.
Reason (Reason Studios) is MIDI-based; and it has advanced "players" that can be programmed to do all sorts of useful musical activities, including chord patterns, drumkit rhythms, and so forth, all of which can be used in Fender Studio Pro 8 via the Reason Rack Plug-in (VST and VSTi).
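For readers who have not poked at the Reason players, the chord-pattern idea is easy to picture in plain code. The sketch below is not Reason's scripting interface (I am not aware of a public one for the players); it is just a rough Python illustration of the kind of note generation such a player automates, and the chord spellings and quarter-note grid are my own assumptions.

```python
# Illustrative only: the kind of chord-pattern expansion a Reason "player"
# automates. The chord spellings and the beat grid are assumptions of mine,
# not anything exported by Reason or the Reason Rack Plug-in.

CHORDS = {                      # semitone offsets above the root note
    "maj": (0, 4, 7),
    "min": (0, 3, 7),
    "dom7": (0, 4, 7, 10),
}

def chord_pattern(progression, beats_per_chord=4):
    """Expand (root_midi_note, quality) pairs into (start_beat, midi_note) events."""
    events = []
    beat = 0
    for root, quality in progression:
        for offset in CHORDS[quality]:
            events.append((beat, root + offset))
        beat += beats_per_chord
    return events

# C major -> A minor -> G7, one bar each
print(chord_pattern([(60, "maj"), (57, "min"), (55, "dom7")]))
```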
My perspective is that AI generally maps to Glenn Miller and Tommy Dorsey using hired musicians for their songs and Elvis Presley using the Jordanaires for backup singing, which also is like the Beatles using symphonic orchestras on some of their songs, where the orchestral parts usually were arranged and composed by George Martin, along with other arrangers.
The practical aspect of my perspective is that
the way things sound is what matters ultimately.
If it sounds good, then it works here in the sound isolation studio; and considering I have an artificial heart valve and would be dead without it, that is the rule in the world according to me.
THOUGHTS
The OP topic ". . . using AI as a sound source in place of traditional VST instruments" presumes that VSTi virtual instruments are not AI, which I suggest is incorrect: while the notes generally are played by trained musicians using professional-quality instruments, the notes nevertheless (a) are recorded, digitized, and enhanced using AI equipment and software and (b) are no less artificial than the supposedly "AI sound sources", which themselves are likely to be based on digitized sampled sounds.
Yet it appears that one of the distinguishing factors might be the way AI sound sources automagically handle and play what, with more traditional VSTi virtual instruments, is handled and played either (a) by requiring elaborately articulated, stylized, and dynamic music notation symbols and language, although only when the respective engine supports them, which is vastly and annoyingly visually cluttering yet cannot be avoided when the focus is on providing printed sheet music for human musicians, or (b) by using specific sampled-sound library subsets where the instruments are played with specific articulations, styles, and dynamics, which can be enhanced via VST effect plug-ins. For example, in what I call a "diatonically sampled" library only every other note is sampled; the non-sampled notes have to be computed using logarithmic interpolation, which by its nature changes the rate and depth of motion effects like tremolo and vibrato, hence the only way to ensure consistent rates and depths is to use VST effect plug-ins for motion.
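To make the "diatonically sampled" point concrete: assuming the missing notes are filled in by simple resampling of a neighboring sample (the usual approach in older-style libraries), the playback rate changes by a factor of 2^(n/12) for a shift of n semitones, and any tremolo or vibrato recorded into the sample speeds up or slows down by that same factor. A rough sketch of the arithmetic, with the 5 Hz vibrato figure being just an assumed example value:

```python
import math

def playback_rate_factor(semitones):
    """Resampling factor needed to shift a sample by the given interval."""
    return 2.0 ** (semitones / 12.0)

def shifted_vibrato_rate(recorded_rate_hz, semitones):
    """Vibrato/tremolo baked into the sample speeds up by the same factor."""
    return recorded_rate_hz * playback_rate_factor(semitones)

# A note synthesized one semitone above its nearest sampled neighbor:
print(playback_rate_factor(1))        # ~1.0595, about 6% faster
print(shifted_vibrato_rate(5.0, 1))   # an assumed 5 Hz vibrato becomes ~5.3 Hz
```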
Based on the example video, it appears that the AI aspect perhaps maps to a better interpretation of some articulations, styles, and dynamics, although I think the only practical way to determine this is to do a side-by-side comparison with the way specific sampled-sound subsets respond when played by simple music notation, where "simple" specifically means providing no articulations, styles, or dynamics in the music notation, which for reference is the way I use music notation.
In this regard, the comparison should exclude using a MIDI keyboard and things like mod wheels, expression pedals (una corda, sostenuto, and damper), and other onboard controlling devices, because a MIDI keyboard can control things which in music notation are articulations, playing styles, dynamics, and so forth.
Explained another way, it appears that this new style of AI sound sourcing automagically does what otherwise is done either (a) in music notation, by specifying articulations, playing styles, dynamics, and other expression symbols and language, but again only if the respective engine supports it, or (b) by engine-supported sampled-sound libraries that respond directly to advanced controls on MIDI keyboards.
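One way to keep such a side-by-side comparison honest is to feed both the AI sound source and the sampled-sound subset the same bare MIDI: note on/off at a fixed velocity, with no mod wheel (CC1), expression (CC11), or damper (CC64) messages at all. Here is a minimal sketch using the mido Python library, where the tempo, the C major scale, and the file name are placeholder choices of mine:

```python
import mido

mid = mido.MidiFile()               # default 480 ticks per quarter note
track = mido.MidiTrack()
mid.tracks.append(track)

track.append(mido.MetaMessage('set_tempo', tempo=500000, time=0))  # 120 BPM

# A plain C major scale: fixed velocity, quarter notes, and deliberately
# no CC1 (mod wheel), CC11 (expression), or CC64 (damper) messages.
for note in (60, 62, 64, 65, 67, 69, 71, 72):
    track.append(mido.Message('note_on', note=note, velocity=80, time=0))
    track.append(mido.Message('note_off', note=note, velocity=0, time=480))

mid.save('simple_notation_test.mid')
```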
One relevant example is the Gypsy Jazzy (UVI) collection, which responds to some music notation symbols and language as well as certain MIDI keyboard controls, as do the Orange Tree Samples SLIDE Lap Steel sample library and engine for Kontakt (Native Instruments) and World Instruments (UVI) for the UVI Workstation VSTi player. The advanced features of these require either (a) elaborate music notation or (b) an advanced MIDI keyboard, the latter being required for doing string bends, slides, and other articulations in controlled and very realistic ways; although they also can be used with simple music notation that provides no specific articulation, playing-style, dynamic, or other information. For reference, I prefer to control dynamics with compressor-limiters rather than with the dynamic marks in music notation used for pianissimo and fortissimo.
[
NOTE: It appears that UVI has stopped providing Gypsy Jazzy in its subscription and as a single product, but it appears to be included in the Sonic Bundle. Gypsy Jazzy continues to work nicely in the UVI Workstation player, which is all I know about its status at present, other than that I like it and actually use it for Gypsy violin. I have a full license for Gypsy Jazzy, so it works in UVI Workstation. ]
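Since I control dynamics with compressor-limiters rather than with pianissimo and fortissimo marks, the essential behavior is just a static gain curve: anything over the threshold gets pulled down by the ratio. The sketch below is a generic downward-compression curve with assumed threshold and ratio values, not any particular plug-in's algorithm:

```python
def compressed_level_db(input_db, threshold_db=-18.0, ratio=4.0):
    """Static downward-compression curve: output level for a given input level."""
    if input_db <= threshold_db:
        return input_db                       # below threshold: unchanged
    return threshold_db + (input_db - threshold_db) / ratio

# A fortissimo-ish peak at -3 dB is tamed to about -14 dB;
# a pianissimo-ish passage at -30 dB passes through untouched.
print(compressed_level_db(-3.0))    # -14.25
print(compressed_level_db(-30.0))   # -30.0
```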
If I want it loud, then I crank it up; but if I want it not so loud, then I either (a) crank it down or (b) remove it, based on the idea that if you cannot hear it, then it's noise and probably is
not necessary.
Can the apparently new "AI sound source" technology do this?
It appears that it can, although doing so probably requires some user intervention, at least to provide guidance on what the AI should be doing.
And if you connect a few relevant dots, I think Tonalic (Celemony) already does at least some of this, which would make it one of the new "AI sound source" technologies for
doing more while
requiring less elaborate music notation and fewer MIDI keyboard activities.
SUMMARY
This looks to be intriguing; and while I do
not distinguish it so much conceptually from strategies that already exist, including automated, automagical ensembles like World Instruments (UVI), the new "AI Sound Source" technology certainly appears to do more of the desired work
automagically.
Lots of FUN
[
NOTE: I wrote "Chapter 1: On The Asteroid" in the late 1970s and did the first version about two decades ago, with music using a 1999 Fender American Deluxe Stratocaster and an Alesis ION Analog Modeling Synthesizer (for outer space sounds). Last year, I started updating the older chapters using more elaborate music, AI ensembles, and ElevenLabs AI voices for some of the characters. Whether any of the AI sounds are controlled solely by AI depends on the way one defines "solely by AI", since I used VST effect plug-ins to control volume levels and a few of the motion effects. If one allows audio engineering, then the AI sounds are not done "solely by AI", but I think one can be too "purist" in that regard. If the AI sounds are done primarily by AI, then this is consistent with my ongoing series of old-time science fiction radio plays, "Extreme Gravity", where the strategy is to compose a script for the AI voice; upload it to the ElevenLabs AI website, where it is converted to audio based on the AI Voice you select; and then download the resulting audio clip and modify it using VST effect plug-ins, FSP8 audio clip editing, and Melodyne (Celemony).
]
ElevenLabs AI Voice Library
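For anyone who would rather script the "upload the script, pick a voice, download the clip" step than use the website, ElevenLabs also provides an HTTP API. The sketch below is only a rough outline of that workflow; the endpoint path, request fields, and the YOUR_API_KEY and YOUR_VOICE_ID placeholders are assumptions that should be checked against the current ElevenLabs API documentation.

```python
import requests

API_KEY = "YOUR_API_KEY"        # placeholder: your ElevenLabs key
VOICE_ID = "YOUR_VOICE_ID"      # placeholder: a voice chosen from the Voice Library
SCRIPT_TEXT = "Script text for the character goes here."

# Assumed endpoint shape; verify the path and parameters in the current docs.
url = f"https://api.elevenlabs.io/v1/text-to-speech/{VOICE_ID}"
response = requests.post(
    url,
    headers={"xi-api-key": API_KEY, "accept": "audio/mpeg"},
    json={"text": SCRIPT_TEXT},
)
response.raise_for_status()

# Save the generated narration for later editing in the DAW and Melodyne.
with open("generated_voice_clip.mp3", "wb") as f:
    f.write(response.content)
```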