
Current VAN Format

VAN stands for Vector Animation Format. At this stage (prototype) it has the following file structure:
[string] Header
[string] Version
[string] Animation name
[int] Compressed frames flag (1=Yes, 0=No)
[int] Embedded music flag (1=Yes, 0=No)
[int] Native animation width
[int] Native animation height
[int] Native playback frame rate
[int] Frames count

--- optional music data ---
[string] Music file name
[int] Music data size
[---] Music data block

--- animation frames ---
----- for each frame -----
[int] Frame size
[---] Frame data block
...
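
As a rough illustration of how a reader might walk the fixed part of this header, here is a minimal sketch in Python (not Hollywood, purely so it can be run anywhere). It assumes, and nothing in the post confirms this, that each [string] is a newline-terminated line and each [int] is a 32-bit big-endian signed integer. The names VanHeader, read_line, read_int and read_header are hypothetical, and the version string "0.1" is only a placeholder.

```python
import io
import struct
from dataclasses import dataclass


@dataclass
class VanHeader:
    header: str
    version: str
    name: str
    compressed: int
    embedded_music: int
    width: int
    height: int
    fps: int
    frame_count: int


def read_line(stream):
    """Read one newline-terminated string (assumed [string] encoding)."""
    raw = stream.readline()
    if not raw:
        raise EOFError("unexpected end of file while reading a string")
    return raw.rstrip(b"\r\n").decode("utf-8")


def read_int(stream):
    """Read one 32-bit big-endian signed integer (assumed [int] encoding)."""
    raw = stream.read(4)
    if len(raw) != 4:
        raise EOFError("unexpected end of file while reading an int")
    return struct.unpack(">i", raw)[0]


def read_header(stream):
    """Read the three strings and six ints in the order listed above."""
    strings = [read_line(stream) for _ in range(3)]
    ints = [read_int(stream) for _ in range(6)]
    return VanHeader(*strings, *ints)


# Build a tiny header in memory and read it back.
sample = b"VectorAnimation\n0.1\ndemo\n" + struct.pack(">6i", 0, 0, 640, 480, 25, 0)
parsed = read_header(io.BytesIO(sample))
print(parsed.name, parsed.width, parsed.height, parsed.fps)
```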

Comments

  • edited December 2015
    Header is always the string VectorAnimation
  • Hmmm... It looks like it needs some work. Can the audio be streamed with the video? Perhaps the embedded integers for frame compression and music flag should be enumerations of modes in case the next version of VAN supports different compression modes and audio compression modes as well. Zero=NONE for either case makes sense though.

    Something I was thinking of with the coordinate system I emailed you is that the Y size would always be normalized to 32768 as a 15-bit unsigned integer and X size would be the only variable needed for size in the header but stored in a 16-bit unsigned integer, making the widest aspect ratio 2:1.
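
A small Python sketch of that coordinate idea, under my own reading of it: the height is always 32768 units and the width is whatever the aspect ratio gives, stored in a 16-bit unsigned value. The function names are hypothetical. One detail the sketch makes visible is that an exact 2:1 picture would need 65536 units, so the widest ratio that fits is just under 2:1.

```python
Y_UNITS = 32768   # normalized height, as proposed above
X_MAX = 0xFFFF    # largest value a 16-bit unsigned X size can hold


def normalized_width(native_w, native_h):
    """Return the X size in normalized units for a native resolution."""
    if native_w <= 0 or native_h <= 0:
        raise ValueError("resolution must be positive")
    x_units = round(native_w * Y_UNITS / native_h)
    if x_units > X_MAX:
        raise ValueError("aspect ratio is too wide for a 16-bit X size")
    return x_units


def to_normalized(px, py, native_w, native_h):
    """Map a pixel coordinate to normalized units (same scale on both axes)."""
    scale = Y_UNITS / native_h
    return round(px * scale), round(py * scale)


print(normalized_width(640, 480))
print(to_normalized(320, 240, 640, 480))
```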
  • Yes, I know, this is only a proof of concept and it still needs a lot of work :smile:

    Hmmm... It looks like it needs some work. Can the audio be streamed with the video?

    Yes, I'm using this code to allow the music streaming:
    
        If newObj.EmbeddedMusic = 1
          newObj.MusicName = ReadLine(fid)
          newObj.MusicSize = ReadInt(fid)
          newObj.Offset    = FilePos(fid)
          newObj.MusicID   = Nil
          newObj.MusicFile = DefineVirtualFile(file, newObj.Offset, newObj.MusicSize, newObj.MusicName)
          Seek(fid, FilePos(fid)+newObj.MusicSize)
        EndIf
    newObj is the animation object created when the file is opened.
    The Seek() command skips the music data and moves on to the animation frames.
    The same concept is applied to each animation frame: I define virtual files inside the animation file so I can load and work with them directly, without preloading. Here is the initialization routine:
    
        newObj.Frames = {}
        
        For Local f = 1 To newObj.Count
          newObj.Frames[f-1] = {}
          newObj.Frames[f-1].Size        = ReadInt(fid)
          newObj.Frames[f-1].Offset      = FilePos(fid)
          newObj.Frames[f-1].VirtualFile = DefineVirtualFile(file, newObj.Frames[f-1].Offset, newObj.Frames[f-1].Size, "f"..f..".svg")
          Seek(fid, FilePos(fid)+newObj.Frames[f-1].Size)
        Next
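
The indexing loop above only records where each frame lives and then seeks past it. The same idea can be sketched in Python, using (offset, size) pairs in place of Hollywood virtual files. This assumes 32-bit big-endian [int] fields, which the thread does not state, and index_frames and load_frame are hypothetical helper names.

```python
import io
import struct


def read_int(stream):
    """Read one 32-bit big-endian signed integer (assumed [int] encoding)."""
    raw = stream.read(4)
    if len(raw) != 4:
        raise EOFError("unexpected end of file while reading an int")
    return struct.unpack(">i", raw)[0]


def index_frames(stream, frame_count):
    """Return a list of (offset, size) pairs, skipping every data block."""
    index = []
    for _ in range(frame_count):
        size = read_int(stream)
        offset = stream.tell()
        index.append((offset, size))
        stream.seek(offset + size)   # jump over the frame data
    return index


def load_frame(stream, entry):
    """Read a single frame on demand from its recorded position."""
    offset, size = entry
    stream.seek(offset)
    return stream.read(size)


# Two fake frames, each stored as [int] size followed by the data block.
frames = [b"<svg>one</svg>", b"<svg>two!</svg>"]
blob = b"".join(struct.pack(">i", len(f)) + f for f in frames)
stream = io.BytesIO(blob)
index = index_frames(stream, len(frames))
print(index)
print(load_frame(stream, index[1]))
```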


    Perhaps the embedded integers for frame compression and music flag should be enumerations of modes in case the next version of VAN supports different compression modes and audio compression modes as well. Zero=NONE for either case makes sense though.

    Sure, let's start defining them:
    --- Audio ---
    #AUDIO_NONE = 0
    #AUDIO_STREAM = 1
    #AUDIO_MODULE = 2
    From the Hollywood manual it seems that modules can be streamed too, judging by the description of the OpenMusic() command.
    I'd also like to add multiple audio channels, for example to handle soundtracks in multiple languages.
    We could replace the [int] Embedded music flag (1=Yes, 0=No) with [int] Default audio track (0=None) and change the audio section to:
    
    --- audio chunk ---
    [int] Audio tracks (How many tracks this chunk holds)
    
    --- track 1 ---
    [string] Track name
    [string] Track language
    [int] Music data size
    [---] Music data block
    
    --- track 2 ---
    ...
    
    Compression flags:
    
    --- Compression Flags---
    #COMPRESSION_NONE = 0
    #COMPRESSION_INBUILT = 1 (Hollywood internal compression routines)
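
To show how a player might treat these constants as open-ended enumerations rather than yes/no flags, here is a hedged Python sketch. The values mirror the constants listed above; decode_mode is a hypothetical helper that returns None for a value this player does not know (for example one written by a later VAN version), so the caller can decide whether to refuse the file or skip the track.

```python
from enum import IntEnum


class AudioMode(IntEnum):
    NONE = 0
    STREAM = 1
    MODULE = 2


class CompressionMode(IntEnum):
    NONE = 0
    INBUILT = 1   # Hollywood internal compression routines


def decode_mode(enum_cls, raw):
    """Return the matching enum member, or None when the value is unknown."""
    try:
        return enum_cls(raw)
    except ValueError:
        return None


print(decode_mode(AudioMode, 2))
print(decode_mode(CompressionMode, 7))
```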
    


    Something I was thinking of with the coordinate system I emailed you is that the Y size would always be normalized to 32768 as a 15-bit unsigned integer and X size would be the only variable needed for size in the header but stored in a 16-bit unsigned integer, making the widest aspect ratio 2:1.

    From what I've understood, that is something we have to handle while encoding frames. I've included the native resolution in the header because when I play the animation I need a default playback size, which is the size at which the animation was encoded.

    For completeness here is the full initialization routine of the SVGA.Anim:Open(file, name) method:
    /******************************************************************************
    AnimObj = SVGA.Anim:Open(file, name)
    
    Open an animation stream for playback.
    ---------------------------------------------------------------------
    INPUT
      file      Animation file
      name      Optional animation name to override the embedded one.
    OUTPUT
      AnimObj   Animation object or Nil
    ******************************************************************************/
    Function SVGA.Anim:Open(file, name)
      DBG.Console.Out(".Anim:Open()", DBG.OpenFunc, SVGA.DebugCh)
      
      Local newObj = CopyTable(self)
      
      If Exists(file)
        Local fid      = OpenFile(Nil, file, #MODE_READ)
        Local fHeader  = ReadLine(fid)
        Local fVersion = ReadLine(fid)
        Local fName    = LowerStr(ReadLine(fid))
        If Not(IsNil(name)) Then fName = name
        
        ; Check version
        If fVersion > SVGA.Version
          DBG.Console.Out("SVGA version: " .. SVGA.Version, DBG.Warning, SVGA.DebugCh)
          DBG.Console.Out("Anim version: " .. fVersion, DBG.Warning, SVGA.DebugCh)
          DBG.Console.Out("Cannot proceed.", DBG.Error, SVGA.DebugCh)
          DBG.Console.Out(Nil, DBG.CloseFunc, SVGA.DebugCh)
          CloseFile(fid) ; release the file handle before bailing out
          Return(Nil)
        EndIf
        
        ; Check header
        If fHeader <> SVGA.Header
          DBG.Console.Out("SVGA header: " .. SVGA.Header, DBG.Warning, SVGA.DebugCh)
          DBG.Console.Out("Anim header: " .. fHeader, DBG.Warning, SVGA.DebugCh)
          DBG.Console.Out("Cannot proceed.", DBG.Error, SVGA.DebugCh)
          DBG.Console.Out(Nil, DBG.CloseFunc, SVGA.DebugCh)
          CloseFile(fid) ; release the file handle before bailing out
          Return(Nil)
        EndIf
    
        ; Initialize animation structure
        fName = LowerStr(fName)
        newObj.Name = fName
        newObj.Current = 1
        newObj.CurrentTimeMS = 0
        newObj.X = 0
        newObj.Y = 0
        newObj.PlayerID = Nil
        newObj.ClearMode = #SVGA_CLRMODE_SOLID
        newObj.SolidColor = $0000FF
        newObj.GradColors = { $FF0000, 0.5, $0000FF, 1.0 }
        newObj.GradAngle = 0
        newObj.GradType = #LINEAR
        newObj.TextBrush = Nil
        newObj.CacheBG = False
        newObj.CacheBGID = Nil
        newObj.TextX = 0
        newObj.TextY = 0
        newObj.Events = {}
        newObj.Events.ChangeFrame = Function() EndFunction
        newObj.Events.LastFrame   = Function() EndFunction 
        newObj.Events.BeforeFrame = Function() EndFunction
        
        newObj.Compressed    = ReadInt(fid)
        newObj.EmbeddedMusic = ReadInt(fid)
        newObj.Width         = ReadInt(fid)
        newObj.Height        = ReadInt(fid)
        newObj.FPS           = ReadInt(fid)
        newObj.FrameMS       = 1000/newObj.FPS
        newObj.Count         = ReadInt(fid) 
        
        newObj.DrawOptions   = {}
        newObj.Doublebuffer  = False
        newObj.Hardware      = False
        newObj.Playing     = False
        newObj.Paused      = False
        
        newObj.Viewport   = {}
        newObj.Viewport.X = newObj.X
        newObj.Viewport.Y = newObj.Y
        newObj.Viewport.W = newObj.Width
        newObj.Viewport.H = newObj.Height
        newObj.Viewport.ClipID = Nil
        newObj.Viewport.Display = Nil
        
        newObj.FPST  = 0
        newObj.TimerOffset = 0 ;  
        
        If newObj.EmbeddedMusic = 1
          newObj.MusicName = ReadLine(fid)
          newObj.MusicSize = ReadInt(fid)
          newObj.Offset    = FilePos(fid)
          newObj.MusicID   = Nil
          newObj.MusicFile = DefineVirtualFile(file, newObj.Offset, newObj.MusicSize, newObj.MusicName)
          Seek(fid, FilePos(fid)+newObj.MusicSize)
        EndIf
        
        newObj.Frames = {}
        
        For Local f = 1 To newObj.Count
          newObj.Frames[f-1] = {}
          newObj.Frames[f-1].Size        = ReadInt(fid)
          newObj.Frames[f-1].Offset      = FilePos(fid)
          newObj.Frames[f-1].VirtualFile = DefineVirtualFile(file, newObj.Frames[f-1].Offset, newObj.Frames[f-1].Size, "f"..f..".svg")
          Seek(fid, FilePos(fid)+newObj.Frames[f-1].Size)
        Next
        
        newObj.Linked = {}
        
        CloseFile(fid)
        
        DBG.Console.Out(Nil, DBG.CloseFunc, SVGA.DebugCh)
        Return(newObj)
        
      Else
        DBG.Console.Out("File does not exist.", DBG.Error, SVGA.DebugCh)
        DBG.Console.Out(Nil, DBG.CloseFunc, SVGA.DebugCh)
        Return(Nil) ; the documented result on failure is Nil
      EndIf
      
    EndFunction
  • We need another chunk for preloaded data and prerendered graphics. This would let us prerender commonly used sections of graphics; curves in particular will have to be prerendered if we want to render quickly on a Classic Amiga. In this way, art that is slow to draw and common sections can be put in brushes or sprites. To be able to run a Tracker Module subset, for example, we'd have to preload the instrument samples. I think some module players just internally prerender the song into one sample before playback. This works for systems with a lot of memory but will prove more difficult on a Classic Amiga model.

    Also, the enumeration for the audio track formats may need to be a little more explicit. It may need to indicate what the compression technique used on it is. Is it a Vorbis file or an uncompressed AIFF? Is it 16-bit or 8-bit? Does it reuse instrument data like a Tracker Mod or does it stream like a continuous sample?

    Another thing: Would it be of any use to support scene selection like the DVD formats support or would it be better if each scene were a separate file? I think being able to index forward and backward based on the scene titles is useful. Maybe we could expunge some prerendered data between scenes and retain other global data.

    This is getting complicated but I think it pays to make a format future-proof.
  • Actually, now that I think about it, the scene selection is more of a format for a play-list instead of an animation file. We can keep things simpler that way.
  • We need another chunk for preloaded data and prerendered graphics. This would let us prerender commonly used sections of graphics; curves in particular will have to be prerendered if we want to render quickly on a Classic Amiga. In this way, art that is slow to draw and common sections can be put in brushes or sprites.

    Hmm... you want to use prerendered graphics together with vector graphics? For example, for a background or for a portion of graphics common to multiple frames?
    Something that could be addressed using a primitive code?
    Just to be clear: we have a section for prerendered gfx addressed by an id, and later in the file we have the frames composed of primitives:
      line ...
      poly ...
      gfx ...  <--- our prerendered image
      dot ...
      poly ...
    and so on.


    To be able to run a Tracker Module subset, for example, we'd have to preload the instrument samples. I think some module players just internally prerender the song into one sample before playback. This works for systems with a lot of memory but will prove more difficult on a Classic Amiga model.

    Also, the enumeration for the audio track formats may need to be a little more explicit. It may need to indicate what the compression technique used on it is. Is it a Vorbis file or an uncompressed AIFF? Is it 16-bit or 8-bit? Does it reuse instrument data like a Tracker Mod or does it stream like a continuous sample?

    Ok, I will expand it with more explicit data.
    More ideas about audio handling, though I'm not sure they can be implemented easily:
    - the music soundtrack could be kept separate from the "talk" soundtrack. This could save space with multi-language audio tracks, but it adds complexity and requires mixing at least two audio tracks.
    - sample chunks played at a given frame: we have a command to place a static prerendered gfx block, so we could also have a command to play one or more static samples. This also adds complexity...


    Another thing: Would it be of any use to support scene selection like the DVD formats support or would it be better if each scene were a separate file? I think being able to index forward and backward based on the scene titles is useful. Maybe we could expunge some prerendered data between scenes and retain other global data.
    ...
    Actually, now that I think about it, the scene selection is more of a format for a play-list instead of an animation file. We can keep things simpler that way.

    Ok.


    This is getting complicated but I think it pays to make a format future-proof.

    I agree, it could be an interesting experiment with interesting results :smile:
  • After giving it some thought, we may have to do some more thinking through of the sound. On modern game systems the music is streamed and the sound effects are mixed with the stream in realtime. On the Amiga, getting more than 4 voices requires mixing or an external streaming device like the MasPlayer for MP3 files. Trying to interleave the streams and disk accesses will be hard enough as it is so maybe we should just skip the Paula playback of sound effects and speech and let the decompressor use all 4 voices to mix to 14-bit stereo on systems without an external playback device.

    One alternative to 14-bit sound that offers a 50% compression ratio is the Audio8 format. It uses the volume control as a prescaler, so that when the sound is loud it plays at full volume and when it's quiet it plays at lower volume, which allows all 8 bits to be used even then. Audio8 will play back well on Paula with very little CPU usage. It will require additional decoding on 16-bit sound cards but still won't take much CPU power. We could even vary the sampling period lengths to give it a variable bit rate for even better compression. Since the Amiga can implement stereo panning by playing the same sample through a left and right voice, that would still allow 2 stereo 8-bit samples at once.

    Let me know what you think.
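
To make the prescaler idea above concrete, here is a hedged Python sketch. It is not the actual Audio8 bitstream, whose layout is not given in this thread. It only illustrates the principle of choosing one volume per frame so that quiet passages still use the whole signed 8-bit range. The 0..64 volume range matches Paula's volume register; encode_frame and decode_frame are hypothetical names, and the input is assumed to be signed 16-bit samples.

```python
import math

MAX_VOLUME = 64          # Paula volume register range is 0..64
FULL_SCALE_16 = 32767    # largest positive signed 16-bit sample


def encode_frame(samples16):
    """Pick the lowest volume that still covers the frame's peak, then
    rescale the samples so they span the signed 8-bit range."""
    peak = max((abs(s) for s in samples16), default=0)
    volume = min(MAX_VOLUME, max(1, math.ceil(peak * MAX_VOLUME / FULL_SCALE_16)))
    amplitude = FULL_SCALE_16 * volume / MAX_VOLUME
    samples8 = [max(-128, min(127, round(s * 127 / amplitude))) for s in samples16]
    return volume, samples8


def decode_frame(volume, samples8):
    """Approximate the original 16-bit samples from one encoded frame."""
    amplitude = FULL_SCALE_16 * volume / MAX_VOLUME
    return [round(s * amplitude / 127) for s in samples8]


quiet = [0, 100, -250, 400]
volume, packed = encode_frame(quiet)
print(volume, packed)
print(decode_frame(volume, packed))
```

With plain 8-bit truncation the quiet frame above would collapse to only a couple of distinct levels; with the per-frame volume it keeps most of its detail, at the cost of one extra value per frame.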
  • edited December 2015
    Well... I think we should not compromise on quality by lowering the results we could get on high-end systems, or add too much complexity to the decoding/encoding routines, so my proposal is:
    how about adding a dedicated audio track encoded for weak CPUs? For example, considering the sfx, a separate music soundtrack and a speech track, we could have:
    
    --- animation header ---
    [int] Standard music format
    [int] Light music format
    [int] Standard Sfx format
    [int] Light Sfx format
    [ :: later we have to define here additional properties like sample rates, etc... :: ]
    
    
    --- Sfx chunk ---
    [int] Sfx count (how many sfx this animation has)
    
    --- sfx 1 ---
    [int] Id (unique id used to address the sfx during the playback)
    [int] Standard sfx data size
    [---] Standard sfx data block
    [int] Light sfx data size
    [---] Light sfx data block
    
    ...
    
    --- Soundtrack chunk ---
    [int] Audio tracks (how many audio tracks this animation holds)
    [bool] Common music track? (True -> the first audio track is common music track, in this case the track language is ignored)
    
    --- track 1 ---
    [string] Track name
    [string] Track language
    [int] Standard music data size
    [---] Standard Music data block
    [int] Light music data size
    [---] Light music data block
    
    --- track 2 ---
    ...
    
    This allows us to have lightweight audio suitable for classic systems or systems with little CPU power. For example, we could embed an MP3 as the standard audio track and a MOD as the light audio track, improving flexibility.
    If either the light or the standard format does not exist, the player should fall back to the one that does, provided it is capable of playing it back.
    On a classic system the player should be smart enough to choose which tracks to play back (or let the user select them) or, if capable (because it has enough power or additional hardware), play them all, mixing them on the fly.
    The downside is an increase in the animation size, but the trade-off is always the same:
    - Small files -> More cpu time
    - Large files -> Less cpu time

    Using this approach we could have from 0 to 2 stereo soundtracks (music & speech) and an unlimited number of sound effects that could also overlap each other during playback.
    But the alternative light audio can be anything: for example a stereo music soundtrack, a mono speech soundtrack and the sfx, which could be mapped to the fourth channel on classics.

    Realistically, converting a film to our format will produce only one stereo soundtrack (for each language) where music & speech are already mixed.
    This implementation allows what you suggested in your previous post (Audio8 and all 4 channels).

    What do you think?
    Do you have any links where I can have a look at the Audio8 format?
  • Let's not do the light format thing. Audio8 uses a fixed frame with a separate volume control for each frame. If it were fixed to use different frame width with altered sampling rates, it could get much better compression. AHI does most of the work. If we encode something from a digitized it can use a8v (v for variable bit rate), just like any other codec format.

    MP3 has license issues like MPEG & MPEG2. Vorbis is better but still CPU intensive like MP3 and both formats are lossy so their quality suffers in exchange for file size.

    MOD files are tight sized like MIDI but with instrument samples. They load statically and play multiple tracks. They need mixing but no streaming.

    Let's support multiple formats.
  • Ok, so let's take one step back:
    
    --- MUSIC SECTION HEADER ---
    ------ soundtrack : this track is used when speech tracks for multiple languages are available.
    ------ this requires the speech track and the music track to be mixed during playback.
    [int] SoundtrackFormat (format constant, where 0 means no soundtrack)
    [int] SoundtrackHeaderSize (Size of the next block)
    [---] SoundtrackHeaderData (This block holds additional information needed for playback)
    
    ------ speech : this track is used for the speech to be mixed with the soundtrack, or for an
    ------ audio track already mixed with the soundtrack.
    [int] SpeachFormat (where 0 means no speech soundtrack)
    [int] SpeachHeaderSize (Size of the next block)
    [---] SpeachHeaderData
    
    ------ sounds : sounds are used to play effects independently of the other audio tracks
    [int] SoundsFormat (where 0 means no sounds)
    [int] SoundsHeaderSize (Size of the next block)
    [---] SoundsHeaderData
    
    
    --- SOUNDS SECTION ---
    --- ::: IMPORTANT: If SoundsFormat = 0 this section does not exist :::
    --- This block holds all the sound effects needed by the animation. They are stored
    --- sequentially and the player needs to make a first scan to map the offset of each sound,
    --- so that it can load/stream sounds only when needed.
    [int] Sounds count
    
    --- Sound 1 ---
    [int] Id (unique id used to address the sfx during the playback)
    [int] Sound Data Size
    [---] Sound Data Block
    --- Sound 2 ---
    ...
    --- Sound n ---
    
    
    --- SOUNDTRACK SECTION ---
    --- ::: IMPORTANT : If SoundtrackFormat = 0 this section does not exist :::
    [string] Track Title
    [int] SoundtrackDataSize (size of the next block)
    [---] SoundtrackDataBlock
    
    
    --- SPEECH SECTION ---
    --- ::: IMPORTANT : If SpeachFormat = 0 this section does not exist :::
    --- As stated before, this section can hold either the speech track or the soundtrack+speech
    --- already mixed, like in common video formats.
    [int] SpeachTracks (how many audio tracks this animation holds)
    
    --- Track 1 ---
    [string] Track Title
    [string] Track Language (enumerators?)
    [int] TrackDataSize
    [---] TrackDataBlock
    
    --- Track 2 ---
    ...
    About sounds:
    In your opinion, what would be the best way to manage the sound effects?
    - Have a timeline with the moments at which sounds are played back.
    - Put a command to play certain sounds directly into the animation frame.
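
The "first scan" described for the SOUNDS SECTION above could look roughly like the following Python sketch. It builds a table from sound id to (offset, size) without reading any sample data, so a sound can be loaded or streamed later only when a frame asks for it. It assumes 32-bit big-endian [int] fields, which the draft does not specify, and scan_sounds and pack_sound are hypothetical names.

```python
import io
import struct


def read_int(stream):
    """Read one 32-bit big-endian signed integer (assumed [int] encoding)."""
    raw = stream.read(4)
    if len(raw) != 4:
        raise EOFError("unexpected end of file while reading an int")
    return struct.unpack(">i", raw)[0]


def scan_sounds(stream):
    """Map each sound id to (offset, size) without reading the sample data."""
    table = {}
    for _ in range(read_int(stream)):      # [int] Sounds count
        sound_id = read_int(stream)        # [int] Id
        size = read_int(stream)            # [int] Sound Data Size
        if sound_id in table:
            raise ValueError(f"duplicate sound id {sound_id}")
        table[sound_id] = (stream.tell(), size)
        stream.seek(size, io.SEEK_CUR)     # skip the Sound Data Block
    return table


def pack_sound(sound_id, data):
    """Serialize one sound entry the way the section lists it."""
    return struct.pack(">ii", sound_id, len(data)) + data


section = struct.pack(">i", 2) + pack_sound(10, b"boom") + pack_sound(20, b"click!")
table = scan_sounds(io.BytesIO(section))
print(table)
```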

  • Let's have the "music" track be shared audio for all languages. The audio header can be format specific. That way somebody shouting in the background can be language specific.
  • allanon said:


    About sounds:
    In your opinion, what would be the best way to manage the sound effects?
    - Have a timeline with the moments at which sounds are played back.
    - Put a command to play certain sounds directly into the animation frame.

    I'd do the latter and allow frequency changes, volume changes, etc., so that the sound command hunk could abstract the functionality of a MOD file without having to implement the whole MOD playback code.
  • Ok, I will try to rework the format with our latest decisions :smile: