About Me
Currently, I am a researcher in the Machine Learning Group at Microsoft Research Asia. My research interests include AI in Financial Technology, Reinforcement Learning, Natural Language Processing, and Computer Vision.
Before that, I received my Ph.D. degree from the Joint Ph.D. Program of Microsoft Research Asia and Nankai University in June 2019, supervised by Prof. Tie-Yan Liu (at Microsoft Research Asia) and Prof. Gang Wang (at Nankai University). Prior to that, I obtained my bachelor's degree from Nankai University in 2014. Here is my CV.
The Machine Learning Group at Microsoft Research Asia (MSRA) is recruiting research interns. The internship involves exploratory research on AI in finance, such as time series analysis, with the goal of publishing the results as papers.
Requirements: 1. A solid foundation in machine learning, familiarity with common deep learning models, and proficiency in at least one mainstream deep learning framework such as TensorFlow or PyTorch. 2. Research experience and publication records are a plus. 3. A full-time internship of at least three months, based in Beijing; candidates who can start soon are preferred.
Interested students, please send your resume to chanx@microsoft.com with the subject line [Name]+[University]+[Year]+[Major].
Experiences
- AI in Financial Technology: high-frequency factor mining for stock trend prediction; event-driven stock trend prediction
- Facial micro-expression recognition
- Reinforcement learning for learning rate control
- Novel model design for learning dependency in natural language
- Unsupervised machine translation
- Health status prediction for hard drives with recurrent neural networks
- Knowledge-based word representation learning
- Ad click prediction by modeling users' sequential behaviors
Projects
Unsupervised neural machine translation has great potential for low-resource or even zero-resource machine translation. We have proposed a general framework called Polygon-Net, which leverages multiple auxiliary languages to jointly boost unsupervised neural machine translation models. Experiments on benchmark datasets, including the UN Corpus and WMT, show that our approach significantly improves over two-language-based methods and achieves better performance as more languages are introduced to the framework.
Modeling the rich semantic structure of a sentence is useful for understanding natural languages. We have designed a novel RNN model called Multi-channel RNN that leverages the structural information of text inputs by modeling diverse dependency patterns within natural language. It achieves significant improvements on many NLP tasks, including machine translation, text summarization, and language modeling.
Models trained by SGD are sensitive to learning rates, and good learning rates are problem-specific. We have proposed a deep reinforcement learning based learning rate controller for neural network training that uses long-term rewards to guide the selection of the learning rate, achieving better performance than traditional hand-designed optimizers.
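The core idea can be sketched in miniature (a hypothetical illustration, not the published controller): treat each candidate learning rate as an action, and reward the controller by how much the loss improves. Here a simple epsilon-greedy bandit picks the learning rate for SGD on a toy quadratic; the real system uses deep RL with long-term rewards.

```python
import random

# Hypothetical sketch: a bandit-style controller chooses the learning rate
# for each SGD step on f(w) = (w - 3)^2; reward = one-step loss decrease.
random.seed(0)

ACTIONS = [0.001, 0.01, 0.1, 0.5]           # candidate learning rates
q = {a: 0.0 for a in ACTIONS}               # running value of each action
counts = {a: 0 for a in ACTIONS}

def loss(w):
    return (w - 3.0) ** 2

def grad(w):
    return 2.0 * (w - 3.0)

w = 0.0
for step in range(200):
    # epsilon-greedy selection of the learning rate
    if random.random() < 0.1:
        lr = random.choice(ACTIONS)
    else:
        lr = max(ACTIONS, key=lambda a: q[a])
    before = loss(w)
    w -= lr * grad(w)                        # one SGD step with the chosen lr
    reward = before - loss(w)                # long-term reward approximated
    counts[lr] += 1                          # here by one-step improvement
    q[lr] += (reward - q[lr]) / counts[lr]   # incremental mean update

print(round(w, 3))  # w ends up near the optimum 3.0
```

The controller quickly learns to favor the larger learning rates early on, mimicking (in a crude, one-step form) how a learned controller adapts the rate to the problem at hand.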
Differentiating posed expressions from spontaneous ones is a more challenging task than conventional facial expression recognition. We have implemented a framework based on deep Convolutional Neural Networks and Long Short-Term Memory for posed and spontaneous expression recognition, and designed a new comparison layer for the CNN that represents the difference between the onset and apex facial expressions at middle and high abstraction levels.
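The comparison layer's role can be sketched as follows (a minimal hypothetical stand-in, not the actual layer): given feature maps for the onset (neutral) and apex (peak) frames, it passes on only their difference, so later layers focus on what changed in the expression.

```python
import numpy as np

# Hypothetical sketch of a "comparison layer": emit the element-wise
# difference between apex-frame and onset-frame feature maps.
def comparison_layer(onset_features, apex_features):
    assert onset_features.shape == apex_features.shape
    return apex_features - onset_features

# Toy 4x4 single-channel "feature maps" standing in for CNN activations.
onset = np.zeros((4, 4))
apex = np.zeros((4, 4))
apex[1, 2] = 1.0  # a localized activation change, e.g. around the mouth

diff = comparison_layer(onset, apex)
print(diff.sum())  # only the changed unit survives: 1.0
```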
In a petabyte-level file system, hard drives fail almost every day. To address hard drive failure, researchers have investigated both reactive fault tolerance and proactive failure prediction. We have proposed a novel method based on Recurrent Neural Networks that assesses the health status of hard drives by leveraging their gradually changing sequential SMART attributes. Experiments on real-world datasets covering disks of different brands and scales demonstrate that the proposed method not only achieves reasonably accurate health status assessment, but also better failure prediction performance than previous works.
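A minimal sketch of the inference step (untrained random weights for illustration only; in practice the weights are learned from labeled failure data): a vanilla RNN reads a drive's sequence of SMART attribute vectors and emits a health score in (0, 1).

```python
import numpy as np

# Hypothetical sketch: a vanilla RNN consumes one SMART attribute vector per
# time step (e.g. per day) and outputs a sigmoid health score. Weights here
# are random; a real model would learn them from labeled drive histories.
rng = np.random.default_rng(0)

N_ATTRS, HIDDEN = 5, 8                       # 5 SMART attributes per sample
W_xh = rng.normal(scale=0.1, size=(HIDDEN, N_ATTRS))
W_hh = rng.normal(scale=0.1, size=(HIDDEN, HIDDEN))
w_out = rng.normal(scale=0.1, size=HIDDEN)

def health_score(smart_sequence):
    """Run the RNN over a (T, N_ATTRS) sequence; return a score in (0, 1)."""
    h = np.zeros(HIDDEN)
    for x in smart_sequence:                 # one step per SMART sample
        h = np.tanh(W_xh @ x + W_hh @ h)
    logit = w_out @ h
    return 1.0 / (1.0 + np.exp(-logit))

# 30 days of (normalized) SMART readings for one drive.
sequence = rng.normal(size=(30, N_ATTRS))
score = health_score(sequence)
print(0.0 < score < 1.0)  # True
```

Because the hidden state accumulates the whole history, the score reflects the gradual drift of SMART attributes rather than any single snapshot.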
Learning high-quality word embeddings is valuable for many text mining and NLP tasks. We have introduced a novel framework called RC-NET that leverages both the relational and categorical knowledge in knowledge graphs to produce word representations of higher quality. Experiments on popular text mining and natural language processing tasks demonstrate that the proposed model significantly improves the quality of word representations.
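One way a relational-knowledge term can shape embeddings (a hypothetical translation-style sketch, not the exact RC-NET objective): for a knowledge-graph triple (head, relation, tail), add a regularizer that pulls the embeddings toward head + relation ≈ tail, alongside the usual corpus-based objective.

```python
import numpy as np

# Hypothetical sketch of a relational regularizer on word embeddings:
# for the triple (paris, capital_of, france), one gradient step reduces
# ||head + relation - tail||^2.
rng = np.random.default_rng(1)
dim = 4

emb = {"paris": rng.normal(size=dim), "france": rng.normal(size=dim)}
rel = {"capital_of": rng.normal(size=dim)}

def relational_loss(h, r, t):
    d = emb[h] + rel[r] - emb[t]
    return float(d @ d)  # squared distance ||head + relation - tail||^2

before = relational_loss("paris", "capital_of", "france")

# One gradient-descent step on the regularizer (lr = 0.1).
d = emb["paris"] + rel["capital_of"] - emb["france"]
emb["paris"] -= 0.1 * 2 * d
emb["france"] += 0.1 * 2 * d

after = relational_loss("paris", "capital_of", "france")
print(after < before)  # True
```

In the full framework this term is optimized jointly with the text objective, so related words end up geometrically arranged according to the knowledge graph.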
Selected Honors
- Silver, Chengdu
- Bronze, Tianjin