"For 
    authoring a karaoke presentation, 
    the author has to describe the content structure of an audio medium and a text medium, 
    and then manually set the synchronization between each text fragment 
    and audio segment, as well as the description of the text’s content and the audio’s content. 
    Our work submitted to ACM MM2002 provides such an authoring environment. 
    However, in that work, the content structuring of each medium is performed independently. 
    Thus the synchronizations between them have to be specified manually afterwards. 
    Although the authoring environment provides powerful visual editing tools, 
    it is still very hard work if the authors have to manually synchronize, for 
    instance, a three-hour video with a long textual document annotating 
    it. A more semantic description for media segments or media content plus the 
    interoperability between semantic description models could give us automatic 
    solutions for composing multimedia presentations. We are experimenting with this 
    solution to assess its feasibility and effectiveness.