This project provides age and gender prediction plugins for ‘AI Led Mirror’, an embedded system that interacts with its user through speech and combines face detection, age and gender prediction, and other functions in a single device.

Age prediction, an ordinal regression problem, is addressed with a deep convolutional neural network. The training pipeline is shown in the diagram below.

Pipeline for the multi-output convolutional neural network
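To make the ordinal regression framing concrete, here is a minimal sketch of one common way to encode ordered age labels as several binary targets, which a multi-output network can then predict. This particular encoding is an assumption for illustration and is not necessarily what the project uses; the function names `encode_ordinal` and `decode_ordinal` are hypothetical.

```python
# Illustrative sketch only: the binary-threshold encoding below is an assumed,
# commonly used formulation of ordinal regression, not the project's confirmed method.

def encode_ordinal(rank, num_ranks):
    """Encode an integer rank in [0, num_ranks - 1] as num_ranks - 1 binary targets.

    Target k answers the question "is the rank greater than k?".
    """
    if not 0 <= rank < num_ranks:
        raise ValueError("rank must lie in [0, num_ranks - 1]")
    return [1 if rank > k else 0 for k in range(num_ranks - 1)]


def decode_ordinal(probabilities, threshold=0.5):
    """Recover a rank by counting the binary outputs that exceed the threshold."""
    return sum(1 for p in probabilities if p > threshold)


# Example with 6 hypothetical age groups (ranks 0..5).
targets = encode_ordinal(3, 6)
predicted_rank = decode_ordinal([0.9, 0.8, 0.6, 0.3, 0.1])
```

Unlike plain classification, this encoding penalizes a prediction more the further it falls from the true age group, because more of the binary targets disagree.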

The code, built on TensorFlow, is available here on my GitHub page. It is forked and adapted from another open-source smart speaker project called dingdang-robot, to which I am a contributor!

An end-to-end learning architecture for automatic speech recognition is proposed, built on a modified Transformer implementation. The character error rate (CER) for Mandarin ASR on THCHS-30 is substantially lower than that of the HMM-based baseline. I am writing a paper to summarize this work.
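For readers unfamiliar with the metric, CER is conventionally computed as the character-level edit distance between the reference transcript and the recognizer output, divided by the reference length. The sketch below shows that standard definition in plain Python; the function names and the example sentences are made up for illustration and are not taken from the project or from THCHS-30.

```python
# Illustrative sketch of the standard CER definition; not the project's evaluation code.

def edit_distance(reference, hypothesis):
    """Levenshtein distance between two sequences (insertions, deletions, substitutions)."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            substitution_cost = 0 if ref_char == hyp_char else 1
            current.append(min(
                previous[j] + 1,                      # deletion
                current[j - 1] + 1,                   # insertion
                previous[j - 1] + substitution_cost,  # match or substitution
            ))
        previous = current
    return previous[-1]


def character_error_rate(reference, hypothesis):
    """CER = edit distance / number of characters in the reference."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


# Hypothetical example: one substituted character out of six.
example_cer = character_error_rate("今天天气很好", "今天天汽很好")
```

Because Mandarin text has no word delimiters, errors are usually counted per character rather than per word, which is why CER is reported here instead of word error rate.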