About Me
Currently, I am a researcher in the Machine Learning Group at Microsoft Research Asia. My research interests include AI in Financial Technology, Reinforcement Learning, Natural Language Processing, and Computer Vision.
Before that, I received my Ph.D. degree from the Joint Ph.D. Program of Microsoft Research Asia and Nankai University in June 2019, supervised by Prof. Tie-Yan Liu (at Microsoft Research Asia) and Prof. Gang Wang (at Nankai University). Prior to that, I obtained my bachelor's degree from Nankai University in 2014. Here is my CV.
The Machine Learning Group at Microsoft Research Asia (MSRA) is recruiting research interns. Internship topics include time-series analysis, generative models, large language models, and related research, with the results to be published as papers.
Requirements: 1. A solid foundation in machine learning and deep learning, and familiarity with popular models and algorithms in related areas. 2. Strong programming skills, with the ability to independently carry out the whole process from algorithm implementation and model tuning to experimental analysis. 3. A full-time internship of at least 3 months, based in Beijing.
Interested students, please send your resume to chanx@microsoft.com with the subject line [Name]+[University]+[Year]+[Major].
Experiences
- AI in Financial Technology: high-frequency factor mining for stock trend prediction; event-driven stock trend prediction
- Facial micro-expression recognition
- Reinforcement learning for learning rate control
- Novel model design for learning dependencies in natural language
- Unsupervised machine translation
- Health status prediction for hard drives with recurrent neural networks
- Knowledge-based word representation learning
- Ad click prediction by modeling users' sequential behaviors
Projects
Unsupervised neural machine translation has great potential for low-resource or even zero-resource machine translation. We proposed a general framework called Polygon-Net, which leverages multiple auxiliary languages to jointly boost unsupervised neural machine translation models. Experiments on benchmark datasets including the UN Corpus and WMT show that our approach significantly improves over two-language-based methods and achieves better performance as more languages are introduced into the framework.
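The snippet below is only a minimal sketch of the joint multi-lingual training idea, assuming a shared encoder, one decoder per language, and round-trip back-translation losses summed over several language pairs; the module names, dimensions, and dummy batches are illustrative and not the actual Polygon-Net implementation.

```python
import torch
import torch.nn as nn

# Hypothetical sketch: a shared encoder with one decoder per language, trained
# with round-trip back-translation losses over several language pairs at once.
# Vocabulary size, dimensions, and the language set are made-up placeholders.
VOCAB, DIM, LANGS = 1000, 256, ["en", "fr", "de", "es"]

class Seq2Seq(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, DIM)
        self.encoder = nn.GRU(DIM, DIM, batch_first=True)  # shared across languages
        self.decoders = nn.ModuleDict(
            {l: nn.GRU(DIM, DIM, batch_first=True) for l in LANGS})
        self.out = nn.Linear(DIM, VOCAB)

    def translate_logits(self, src_ids, tgt_lang):
        h, _ = self.encoder(self.embed(src_ids))
        dec, _ = self.decoders[tgt_lang](h)  # crude stand-in for real decoding
        return self.out(dec)

model = Seq2Seq()
loss_fn = nn.CrossEntropyLoss()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

# One joint training step on monolingual batches (random ids as placeholders).
batches = {l: torch.randint(0, VOCAB, (8, 20)) for l in LANGS}
total = 0.0
for src_lang, src in batches.items():
    for aux_lang in LANGS:
        if aux_lang == src_lang:
            continue
        # src -> aux -> src round trip; reconstruction should recover src.
        pivot = model.translate_logits(src, aux_lang).argmax(-1).detach()
        recon = model.translate_logits(pivot, src_lang)
        total = total + loss_fn(recon.reshape(-1, VOCAB), src.reshape(-1))
opt.zero_grad(); total.backward(); opt.step()
```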
Modeling the rich semantic structure of a sentence is useful for understanding natural language. We designed a novel RNN model called Multi-channel RNN that leverages the structural information of text inputs by modeling diverse dependence patterns within natural language. Significant improvements were achieved on many NLP tasks, including machine translation, text summarization, and language modeling.
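As an illustration of the general idea of modeling several dependence patterns in one recurrent model, here is a hedged sketch with two recurrent channels (adjacent-word and longer-range) merged by a learned gate; the cell types, skip distance, and gating are assumptions rather than the published Multi-channel RNN design.

```python
import torch
import torch.nn as nn

class MultiChannelRNN(nn.Module):
    """Illustrative sketch: two recurrent channels with different dependence
    spans (one step back vs. k steps back), merged by a learned gate. This is
    an assumed simplification, not the exact published architecture."""
    def __init__(self, dim, skip=3):
        super().__init__()
        self.skip = skip
        self.cell_near = nn.GRUCell(dim, dim)  # channel 1: adjacent-word dependence
        self.cell_far = nn.GRUCell(dim, dim)   # channel 2: longer-range dependence
        self.gate = nn.Linear(2 * dim, dim)

    def forward(self, x):                       # x: (batch, time, dim)
        B, T, D = x.shape
        states = [x.new_zeros(B, D)]            # h_0
        for t in range(T):
            far = states[max(0, len(states) - self.skip)]
            h1 = self.cell_near(x[:, t], states[-1])
            h2 = self.cell_far(x[:, t], far)
            g = torch.sigmoid(self.gate(torch.cat([h1, h2], dim=-1)))
            states.append(g * h1 + (1 - g) * h2)
        return torch.stack(states[1:], dim=1)   # (batch, time, dim)

h = MultiChannelRNN(64)(torch.randn(2, 10, 64))
print(h.shape)  # torch.Size([2, 10, 64])
```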
Models trained by SGD are sensitive to learning rates, and good learning rates are problem specific. We proposed a deep reinforcement learning based learning rate controller for neural network training, which uses long-term rewards to guide the selection of the learning rate and achieves better performance than traditional hand-designed optimizers.
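To make the idea concrete, here is a minimal sketch of a learning rate controller trained with a policy-gradient method on a toy regression task; the state features, discrete multiplier actions, and reward definition are assumptions for illustration, not the controller from the paper.

```python
import torch
import torch.nn as nn

# Illustrative sketch (not the published controller): a tiny policy network
# observes simple training statistics and multiplies the learning rate by one
# of a few discrete factors; REINFORCE uses the decrease in loss as reward.
policy = nn.Sequential(nn.Linear(3, 32), nn.Tanh(), nn.Linear(32, 3))  # state -> action logits
actions = [0.5, 1.0, 2.0]                                              # assumed LR multipliers
policy_opt = torch.optim.Adam(policy.parameters(), lr=1e-3)

model = nn.Linear(10, 1)                  # toy task: linear regression
loss_fn = nn.MSELoss()
lr, prev_loss = 0.1, None
log_probs, rewards = [], []

for step in range(200):
    x, y = torch.randn(32, 10), torch.randn(32, 1)
    loss = loss_fn(model(x), y)
    model.zero_grad(); loss.backward()
    with torch.no_grad():
        for p in model.parameters():
            p -= lr * p.grad                              # plain SGD with the controlled lr

    state = torch.tensor([loss.item(), lr, float(step) / 200])
    dist = torch.distributions.Categorical(logits=policy(state))
    a = dist.sample()
    log_probs.append(dist.log_prob(a))
    lr = max(1e-4, min(1.0, lr * actions[a.item()]))      # apply the chosen action

    if prev_loss is not None:
        rewards.append(prev_loss - loss.item())           # crude long-term reward proxy
    prev_loss = loss.item()

    if (step + 1) % 20 == 0 and rewards:                  # periodic REINFORCE update
        ret = torch.tensor(sum(rewards))
        pg_loss = -ret * torch.stack(log_probs[:len(rewards)]).sum()
        policy_opt.zero_grad(); pg_loss.backward(); policy_opt.step()
        log_probs, rewards = [], []
```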
Differentiating posed expressions from spontaneous ones is a more challenging task than conventional facial expression recognition. We implemented a framework based on deep Convolutional Neural Networks and Long Short-Term Memory for posed and spontaneous expression recognition, and designed a new layer, called the comparison layer, for the CNN to represent the difference between onset and apex facial expressions at middle and high abstraction levels.
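A minimal sketch of the comparison-layer idea follows, assuming it can be approximated by an element-wise difference of mid-level CNN feature maps of the onset and apex frames (the LSTM over the frame sequence is omitted for brevity); the architecture and sizes are illustrative only.

```python
import torch
import torch.nn as nn

class ComparisonNet(nn.Module):
    """Assumed sketch of the comparison-layer idea: a shared CNN encodes the
    onset and apex frames, an element-wise difference of the mid-level feature
    maps represents the change, and a classifier predicts posed vs. spontaneous."""
    def __init__(self):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.head = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                                  nn.Linear(32, 2))

    def forward(self, onset, apex):
        diff = self.cnn(apex) - self.cnn(onset)  # "comparison layer": feature-level difference
        return self.head(diff)

logits = ComparisonNet()(torch.randn(4, 3, 64, 64), torch.randn(4, 3, 64, 64))
print(logits.shape)  # torch.Size([4, 2])
```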
In a petabyte-level file system, hard drives fail almost every day. In response to the problem of hard drive failure, researchers have investigated both reactive fault tolerance and proactive failure prediction. We proposed a novel method based on Recurrent Neural Networks to assess the health statuses of hard drives by leveraging their gradually changing sequential SMART attributes. Experiments on real-world datasets covering disks of different brands and scales demonstrate that the proposed method not only achieves reasonably accurate health status assessment but also achieves better failure prediction performance than previous works.
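A minimal sketch of the sequence-based assessment idea is shown below: an LSTM reads a drive's daily SMART attribute vectors and outputs logits over discretized health levels. The number of attributes, the sequence length, and the six health levels are placeholders, not values from the work.

```python
import torch
import torch.nn as nn

class DriveHealthRNN(nn.Module):
    """Illustrative sketch: classify a drive's health level from its sequence
    of daily SMART attribute vectors. The 12 attributes and 6 health levels
    below are assumed placeholders."""
    def __init__(self, n_attrs=12, hidden=64, n_levels=6):
        super().__init__()
        self.lstm = nn.LSTM(n_attrs, hidden, batch_first=True)
        self.fc = nn.Linear(hidden, n_levels)

    def forward(self, smart_seq):                 # (batch, days, n_attrs)
        out, _ = self.lstm(smart_seq)
        return self.fc(out[:, -1])                # health-level logits at the last day

model = DriveHealthRNN()
smart = torch.randn(16, 30, 12)                   # 16 drives, 30 days of SMART readings
labels = torch.randint(0, 6, (16,))               # 0 = failing soon ... 5 = healthy (assumed)
loss = nn.CrossEntropyLoss()(model(smart), labels)
loss.backward()
```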
Learning high-quality word embeddings is valuable for many text mining and NLP tasks. We introduced a novel framework called RC-NET that leverages both the relational and the categorical knowledge in knowledge graphs to produce word representations of higher quality. Experiments on popular text mining and natural language processing tasks demonstrate that the proposed model significantly improves the quality of word representations.
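As a hedged sketch of knowledge-regularized embedding learning, the snippet below combines a simplified skip-gram loss with a relational term (head + relation close to tail) and a categorical term (same-category words stay close); the loss weights and toy data are assumptions, not the exact RC-NET objective.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Illustrative sketch of knowledge-regularized embeddings: a skip-gram-style
# loss plus (1) a relational term that keeps w_head + r close to w_tail and
# (2) a categorical term that pulls words in the same category together.
# The weights 0.1 / 0.1 and the random toy data are assumptions.
V, R, D = 5000, 20, 100
word_emb = nn.Embedding(V, D)
rel_emb = nn.Embedding(R, D)

center = torch.randint(0, V, (64,))                    # toy skip-gram pairs
context = torch.randint(0, V, (64,))
heads, rels, tails = (torch.randint(0, V, (32,)),      # (head, relation, tail) triples
                      torch.randint(0, R, (32,)),
                      torch.randint(0, V, (32,)))
cat_a, cat_b = torch.randint(0, V, (32,)), torch.randint(0, V, (32,))  # same-category pairs

sg_logits = word_emb(center) @ word_emb.weight.t()     # simplified full-softmax skip-gram
sg_loss = F.cross_entropy(sg_logits, context)
rel_loss = ((word_emb(heads) + rel_emb(rels) - word_emb(tails)) ** 2).sum(-1).mean()
cat_loss = ((word_emb(cat_a) - word_emb(cat_b)) ** 2).sum(-1).mean()

loss = sg_loss + 0.1 * rel_loss + 0.1 * cat_loss
loss.backward()
```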
Selected Honors
- Silver, Chengdu
- Bronze, Tianjin