\documentclass{article}
\usepackage[usenames]{xcolor}
\usepackage{listings}
\title{Decoder structures}
At the time of writing we have a get-stuff-at-this-time API which
hides a decode-some-and-see-what-comes-out approach.
\section{Easy and hard extraction of particular pieces of content}
With most decoders it is quick, easy and reliable to get a particular
piece of content from a particular timecode. This applies to the DCP,
DCP subtitle, Image and Video MXF decoders. With FFmpeg, however,
the only reliable approach is to seek to somewhere near the requested
time and then decode until the required content emerges.
This suggests that it would make more sense to keep the
decode-and-see-what-comes-out code within the FFmpeg decoder and not
spread it into the other decoders.
However, resampling screws this up, as it means all audio requires
decode-and-see. I don't think you can resample in neat blocks, as
there are fractional samples and other complications. You can't postpone
resampling to the end of the player, since different audio may be
coming in at different rates.
This suggests that decode-and-see is a better match, even if it feels
a bit ridiculous when most of the decoders end up with slightly clunky
seek mechanics on top of what is really easy random access.
Having said that: the only other decoder which produces audio is now
the DCP one, and maybe that never needs to be resampled.
\section{Multiple streams}
Another thing unique to FFmpeg is multiple audio streams, possibly at
different sample rates.
There seem to be two approaches to handling this:

\begin{enumerate}
\item Every audio decoder has one or more `streams'. The player loops
  over content, and over streams within content, and the audio decoder
  resamples each stream individually.
\item Every audio decoder just returns audio data, and the FFmpeg
  decoder returns all its streams' data in one block.
\end{enumerate}
The second approach has the disadvantage that the FFmpeg decoder must
resample and merge its audio streams into one block. This is in
addition to the resampling that must be done for the other decoders,
and the merging of all audio content inside the player.
These disadvantages suggest that the first approach is better.
One might think that the logical conclusion is to take streams all the
way back to the player and resample them there, but the resampling
must occur on the other side of the get-stuff-at-time API.
Thinking about this again in October 2016, it feels like the
get-stuff-at-this-time API is causing problems. It especially seems
to be a bad fit for previewing audio. The API is nice for callers,
but there is a lot of dancing around behind it to make it work, and it
seems that it is more `flexible' than necessary; all callers ever do
is consume the content in order.

Hence there is a temptation to go back to see-what-comes-out.
There are two operations: make DCP and preview. Make DCP seems to be:
\lstset{basicstyle=\footnotesize\ttfamily,
        keywordstyle=\color{blue}\ttfamily,
        stringstyle=\color{red}\ttfamily,
        commentstyle=\color{olive}\ttfamily}
\begin{lstlisting}
while (!done) {
    done = player->pass();
    // pass() causes things to appear which are
    // sent to encoders / disk
}
\end{lstlisting}
And preview seems to be:
\begin{lstlisting}
while (!done) {
    done = player->pass();
    // pass() causes things to appear which are buffered
    sleep_until_buffers_empty();
}
\end{lstlisting}

with the preview consumer periodically calling

\begin{lstlisting}
get_video_and_audio_from_buffers();
\end{lstlisting}
\texttt{Player::pass} must call \texttt{pass()} on its decoders. They
will emit stuff which \texttt{Player} must adjust (mixing sound etc.).
\texttt{Player} then emits the `final cut', which must have properties
like no gaps in the video or audio.
Maybe you could have a parent class for simpler get-stuff-at-this-time
decoders to give them \texttt{pass()} / \texttt{seek()}.
One problem I remember is which decoder to \texttt{pass()} at any given
time: presumably it must be the one whose last output is earliest.
Resampling also looks fiddly in the v1 code.
\section{Having a go}
Two possible shapes for the decoder class suggest themselves. The
first is monolithic:

\begin{lstlisting}
class Decoder {
public:
    virtual void pass() = 0;
    virtual void seek(ContentTime time, bool accurate) = 0;

    signal<void (ContentVideo)> Video;
    signal<void (ContentAudio, AudioStreamPtr)> Audio;
    signal<void (ContentTextSubtitle)> TextSubtitle;
};
\end{lstlisting}

The second splits the decoder into parts:

\begin{lstlisting}
class Decoder {
public:
    virtual void pass() = 0;
    virtual void seek(ContentTime time, bool accurate) = 0;

    shared_ptr<VideoDecoder> video;
    shared_ptr<AudioDecoder> audio;
    shared_ptr<SubtitleDecoder> subtitle;
};
\end{lstlisting}

with each part emitting its data via something like

\begin{lstlisting}
signals2<void (ContentVideo)> Data;
\end{lstlisting}
Open questions:

\begin{itemize}
\item Video / audio frame or \texttt{ContentTime}?
\item Can all the subtitle period notation code go?
\end{itemize}
A rough plan:

\begin{itemize}
\item Add signals to \texttt{Player}:
  \begin{itemize}
  \item \texttt{signal<void (shared\_ptr<PlayerVideo>, DCPTime)> Video;}
  \item \texttt{signal<void (shared\_ptr<AudioBuffers>, DCPTime)> Audio;}
  \item \texttt{signal<void (PlayerSubtitles, DCPTimePeriod)> Subtitle;}
  \end{itemize}
\item Remove \texttt{get()}-based loops and replace them with \texttt{pass()} and signal connections.
\item Remove \texttt{get()} and \texttt{seek()} from decoder parts; add emission signals.
\item Put \texttt{AudioMerger} back.
\item Remove the \texttt{during} stuff from \texttt{SubtitleDecoder} and the decoder classes that use it.
\item Rename the \texttt{give} methods to \texttt{emit}.
\item Remove the \texttt{get} methods from \texttt{Player}; replace them with \texttt{pass()} and \texttt{seek()}.
\end{itemize}
\section{Summary of work done in \texttt{back-to-pass}}
The diff between \texttt{back-to-pass} and \texttt{master} as at 21/2/2017 can be summarised as:
\begin{itemize}
\item Remove \texttt{AudioDecoderStream}; there is no more need to buffer, and resampling is done in \texttt{Player}.
\item \texttt{AudioDecoder} is now simple; it basically just counts frames.
\item All the subtitles-during stuff is gone; there is no need to know what happens in a particular period, as we just wait and see.
\item The pass-reason stuff is gone; I am not sure what it was for, but it seems to have been a contortion related to trying to find specific stuff.
\item \texttt{Player::pass} is back, obviously.
\item \texttt{Player::get\_video}, \texttt{get\_audio} and
  \texttt{get\_subtitle} more-or-less become \texttt{Player}'s
  handlers for emissions from the decoders; lots of buffering crap is gone.
\item Add \texttt{Decoder::position} stuff so that we know what to \texttt{pass()} in \texttt{Player}.
\item Add \texttt{AudioMerger}; this is necessary as audio arrives at the
  \texttt{Player} from different streams at different times. The
  \texttt{AudioMerger} just accepts data, mixes it and spits it out again.
\item The \texttt{AudioMerger} is made aware of periods with no content, to
  allow referenced reels; this adds a fair amount of complexity. Without
  it the referenced-reel gaps are silence-padded, which confuses
  things later on, as our VF DCP gets audio data that it does not need.
\item Obvious consumer changes: what was a loop over the playlist
  length with calls to \texttt{get()} is now calls to \texttt{pass()}.
\item The maybe-seek stuff is gone.
\item Some small \texttt{const}-correctness bits.
\end{itemize}
Obvious things to do:
\begin{itemize}
\item Ensure \texttt{AudioMerger} is being tested.
\item Ensure the hardest cases in video / audio are being tested.
\item Look at the symmetry of the video / audio paths and APIs.
\end{itemize}