As you know, Flash sends audio data over a TCP stream.
TCP is a reliable protocol: lost segments are retransmitted and data is delivered in order, so the arrival time of an audio frame is not bounded. The gap between frames may be 5, 10, 20, 30, 50, 100 or more milliseconds, depending on network conditions.
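To put a number on this variation, one rough approach is to log the arrival time of each audio frame at the receiver and compute a running jitter estimate. The sketch below uses the smoothing formula from RFC 3550 and assumes the sender produces one frame every 20 ms; the function name and the `frame_interval_ms` parameter are made up for illustration and are not part of any Flash or Flashphoner API.

```python
def interarrival_jitter(arrival_ms, frame_interval_ms=20.0):
    """Running jitter estimate in the style of RFC 3550: J += (|D| - J) / 16.

    arrival_ms: receive timestamps of consecutive frames, in milliseconds.
    frame_interval_ms: assumed constant spacing of frames at the sender.
    """
    jitter = 0.0
    for prev, cur in zip(arrival_ms, arrival_ms[1:]):
        # How far the observed gap is from the nominal frame spacing.
        deviation = abs((cur - prev) - frame_interval_ms)
        # Exponential smoothing with gain 1/16, as in RFC 3550.
        jitter += (deviation - jitter) / 16.0
    return jitter
```

Frames that arrive exactly 20 ms apart give an estimate of 0; the more the gaps vary, the larger the estimate and the more buffering the receiver needs.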
So the only way is to use an adaptive jitter buffer or a simple playback buffer on the receiver side. Such a buffer should prevent audio breaks and play the stream back smoothly.
An adaptive jitter buffer can reduce latency by dropping packets that are received too late. A simple playback buffer may increase latency because it does not drop late packets.
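The difference between the two buffers can be illustrated with a small simulation. This is only a sketch under stated assumptions: frames are nominally 20 ms apart, the function names and the `prebuffer_ms` parameter are hypothetical, and the second function shows only the "drop late frames" half of an adaptive jitter buffer (a real one would also adjust its target delay to the measured jitter).

```python
FRAME_MS = 20.0  # assumed nominal spacing between audio frames


def simple_playback(arrivals_ms, prebuffer_ms):
    """Play every frame in order; a late frame stalls playback.

    Returns the extra latency (ms) accumulated by the end of the stream.
    """
    start = arrivals_ms[0] + prebuffer_ms
    play_at = start
    for seq, arrival in enumerate(arrivals_ms):
        if seq > 0:
            play_at += FRAME_MS
        # If the frame has not arrived yet, playback waits for it,
        # and every later frame is pushed back by the same amount.
        play_at = max(play_at, arrival)
    ideal_last = start + (len(arrivals_ms) - 1) * FRAME_MS
    return play_at - ideal_last


def drop_late_playback(arrivals_ms, prebuffer_ms):
    """Keep a fixed playout schedule; drop frames that miss their deadline.

    Returns the indices of the dropped frames.
    """
    start = arrivals_ms[0] + prebuffer_ms
    dropped = []
    for seq, arrival in enumerate(arrivals_ms):
        deadline = start + seq * FRAME_MS
        if arrival > deadline:
            dropped.append(seq)
    return dropped
```

For example, with arrivals at 0, 20, 40, 150, 155 and 160 ms and a 40 ms prebuffer, the simple buffer plays everything but ends up 50 ms behind its original schedule, while the dropping buffer keeps the schedule and discards frames 3, 4 and 5.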
Another option is to use WebRTC or RTMFP solutions (Flashphoner Web Call Server).
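Here the jitter is lower because WebRTC carries media as Secure RTP (SRTP), normally over UDP, and RTMFP is also based on UDP, a connectionless, unreliable protocol, so a lost packet does not hold up the packets behind it. So here we have a good chance of getting low jitter.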
But again, the interval between frames will not be exactly 20 ms, though it will be closer to that value.