Porno video recognition is important for Internet content monitoring. In this paper, a novel porno video recognition method that fuses audio and video cues is proposed. First, global color and texture features and local scale-invariant feature transform (SIFT) features are extracted to train multiple support vector machine (SVM) classifiers for different erotic categories of image frames. Next, two continuous density hidden Markov models (CHMMs) are built to recognize porno sounds. Finally, a fusion method based on Bayes' rule is employed to combine the classification results obtained from the video and audio cues. Experimental results show that the proposed method outperforms six state-of-the-art methods.
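As an illustration of the final fusion step, the following is a minimal sketch of how two classifier outputs can be combined with Bayes' rule. It is not the formulation used in this paper: it assumes a binary decision, that the video and audio cues are conditionally independent given the class, and that each classifier outputs a calibrated posterior probability. The function name `fuse_posteriors` and the default prior of 0.5 are hypothetical choices made for the example.

```python
def fuse_posteriors(p_video: float, p_audio: float, prior: float = 0.5) -> float:
    """Fuse P(positive | video) and P(positive | audio) into one posterior.

    Assumption (not from the paper): the two cues are conditionally
    independent given the class, so
        P(c | v, a) is proportional to P(c | v) * P(c | a) / P(c).
    """
    for name, p in (("p_video", p_video), ("p_audio", p_audio)):
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"{name} must lie in [0, 1]")
    if not 0.0 < prior < 1.0:
        raise ValueError("prior must lie strictly between 0 and 1")

    # Unnormalized scores for the positive and negative classes.
    pos = p_video * p_audio / prior
    neg = (1.0 - p_video) * (1.0 - p_audio) / (1.0 - prior)

    total = pos + neg
    if total == 0.0:
        # The two classifiers contradict each other with full certainty.
        raise ValueError("posteriors are mutually exclusive; cannot normalize")
    return pos / total


if __name__ == "__main__":
    # Both cues lean positive, so the fused posterior exceeds either input.
    print(round(fuse_posteriors(0.9, 0.8), 4))
```

Under these assumptions, an uninformative cue (a posterior equal to the prior) leaves the other cue's posterior unchanged, while two agreeing cues reinforce each other.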