About Me: I am a research engineer at Apple, currently working on Apple Intelligence (e.g., Writing Tools and Message Smart Reply) with large and small generative models. Previously, I was a research engineer at Salesforce AI Research, where I worked on both research and applications of Sequential Recommender Systems and AutoML. I earned my M.S. in Computer Science from Washington State University, advised by Prof. Shuiwang Ji, and my B.S. in Mathematics and Statistics from Huazhong University of Science and Technology (HUST) in China.
Intent Contrastive Learning for Sequential Recommendation Yongjun Chen, Zhiwei Liu, Jia Li, Julian McAuley, Caiming Xiong The Web Conference (WWW), 2022
[Abstract]
[Paper]
[Code]
Users’ interactions with items are driven by various intents (e.g., preparing for holiday gifts, shopping for fishing equipment, etc.). However, users’ underlying intents are often unobserved/latent, making it challenging to leverage such a latent intent factor for sequential recommendation (SR). To investigate the benefits of latent intents and leverage them effectively for recommendation, we propose Intent Contrastive Learning (ICL), a general learning paradigm that leverages a latent intent variable in SR. The core idea is to learn users’ intent distribution functions from unlabeled user behavior sequences and optimize SR models with contrastive self-supervised learning (SSL) by considering the learnt intents to improve recommendation. Specifically, we introduce a latent variable to represent users’ intents and learn the distribution function of the latent variable via clustering. We propose to leverage the learnt intents in SR models via contrastive SSL, which maximizes the agreement between a view of a sequence and its corresponding intent. Training alternates between intent representation learning and SR model optimization steps within a generalized expectation-maximization (EM) framework. Fusing user intent information into SR also improves model robustness. Experiments conducted on four real-world datasets demonstrate the superiority of the proposed learning paradigm, which improves performance as well as robustness against data sparsity and noisy interactions. Case studies on the Sports and Yelp datasets further verify the effectiveness of ICL.
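To make the EM-style alternation concrete, here is a minimal sketch of the two steps, assuming an off-the-shelf KMeans for the clustering E-step and an InfoNCE-style loss for the M-step; the encoder, dimensions, and hyperparameters are illustrative, not the paper's exact configuration:

```python
# Hedged sketch of the ICL alternation: cluster sequence embeddings into
# intent prototypes (E-step), then pull each sequence toward its assigned
# prototype with a contrastive loss (M-step).
import torch
import torch.nn.functional as F
from sklearn.cluster import KMeans

def intent_contrastive_loss(seq_emb, num_intents=8, temperature=0.1):
    """seq_emb: (batch, dim) sequence representations from any SR encoder."""
    # E-step: estimate the intent distribution by clustering (detached).
    km = KMeans(n_clusters=num_intents, n_init=10).fit(seq_emb.detach().cpu().numpy())
    prototypes = torch.tensor(km.cluster_centers_, dtype=seq_emb.dtype, device=seq_emb.device)
    assign = torch.tensor(km.labels_, dtype=torch.long, device=seq_emb.device)
    # M-step: InfoNCE between each sequence and its intent prototype,
    # contrasted against all other prototypes.
    logits = F.normalize(seq_emb, dim=-1) @ F.normalize(prototypes, dim=-1).T / temperature
    return F.cross_entropy(logits, assign)

# Usage: add this loss to the next-item prediction loss, re-running the
# clustering step periodically as in generalized EM.
seq_emb = torch.randn(256, 64, requires_grad=True)
loss = intent_contrastive_loss(seq_emb)
loss.backward()
```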
ELECRec: Training Sequential Recommenders as Discriminators Yongjun Chen, Jia Li, Caiming Xiong The 45th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR), 2022
[Abstract]
[Paper]
[Code]
Sequential recommendation is often framed as a generative task, i.e., training a sequential encoder to generate the next item of a user’s interest based on her historically interacted items. Despite their prevalence, these methods usually require training with more meaningful samples to be effective; otherwise, they yield a poorly trained model. In this work, we propose to train sequential recommenders as discriminators rather than generators. Instead of predicting the next item, our method trains a discriminator to distinguish whether a sampled item is a ‘real’ target item or not. A generator, as an auxiliary model, is trained jointly with the discriminator to sample plausible alternative next items and is discarded after training. The trained discriminator is considered the final SR model and is denoted ELECRec. Experiments conducted on four datasets demonstrate the effectiveness and efficiency of the proposed approach.
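A rough sketch of the joint training, assuming a single representation per sequence and simple linear heads (all module shapes and names here are stand-ins, not the paper's exact architecture):

```python
# Illustrative discriminator-style training: an auxiliary generator proposes
# plausible next items; the discriminator learns to tell sampled items from
# the true targets and is kept as the final model.
import torch
import torch.nn as nn
import torch.nn.functional as F

num_items, dim, batch = 1000, 64, 32
item_emb = nn.Embedding(num_items, dim)
generator_head = nn.Linear(dim, num_items)   # auxiliary, discarded after training
discriminator = nn.Linear(dim, 1)            # kept as the final SR scoring head

seq_repr = torch.randn(batch, dim)            # from any sequential encoder
targets = torch.randint(num_items, (batch,))  # true next items

# Generator: sample plausible alternatives from its next-item distribution.
gen_logits = generator_head(seq_repr)
sampled = torch.multinomial(F.softmax(gen_logits, dim=-1), 1).squeeze(-1)

# Discriminator: score (sequence, item) pairs; real targets vs. sampled items.
def disc_score(items):
    return discriminator(seq_repr * item_emb(items)).squeeze(-1)

labels_real = torch.ones(batch)
labels_fake = (sampled == targets).float()    # a lucky sample still counts as real
d_loss = F.binary_cross_entropy_with_logits(disc_score(targets), labels_real) \
       + F.binary_cross_entropy_with_logits(disc_score(sampled), labels_fake)
g_loss = F.cross_entropy(gen_logits, targets) # generator trained jointly
(d_loss + g_loss).backward()
```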
Voxel Deconvolutional Networks for 3D Brain Image Labeling Yongjun Chen, Hongyang Gao, Lei Cai, Min Shi, Dinggang Shen and Shuiwang Ji The 24th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), 2018
[Abstract]
Deep learning methods have shown great success in pixel-wise prediction tasks. One of the most popular methods employs an encoder-decoder network in which deconvolutional layers are used for up-sampling feature maps. However, a key limitation of the deconvolutional layer is that it suffers from the checkerboard artifact problem, which harms prediction accuracy. This is caused by the independence among adjacent pixels on the output feature maps. Previous work only solved the checkerboard artifact issue of deconvolutional layers in 2D space. Since the number of intermediate feature maps needed to generate a deconvolutional layer grows exponentially with dimensionality, it is more challenging to solve this issue in higher dimensions. In this work, we propose the voxel deconvolutional layer (VoxelDCL) to solve the checkerboard artifact problem of deconvolutional layers in 3D space. We also provide an efficient approach to implement VoxelDCL. To demonstrate the effectiveness of VoxelDCL, we build four variations of voxel deconvolutional networks (VoxelDCN) based on the U-Net architecture with VoxelDCL. We apply our networks to volumetric brain image labeling tasks using the ADNI and LONI LPBA40 datasets. The experimental results show that the proposed iVoxelDCNa achieves improved performance in all experiments. It reaches 83.34% in terms of dice ratio on the ADNI dataset and 79.12% on the LONI LPBA40 dataset, increases of 1.39% and 2.21%, respectively, over the baseline. In addition, all the variations of VoxelDCN we propose outperform the baseline methods on the above datasets, which demonstrates the effectiveness of our methods.
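As a rough illustration of the interleaving idea, here is a 3D "voxel shuffle" upsampling sketch: all eight sub-volumes come from the same input and share a receptive field (VoxelDCL additionally chains the intermediate feature maps so they depend on each other), and they are interleaved into a 2x-upsampled volume rather than produced by independent deconvolution strides:

```python
# Minimal 3D sub-voxel interleaving sketch (not the paper's exact VoxelDCL).
import torch
import torch.nn as nn

class VoxelShuffleUp(nn.Module):
    def __init__(self, in_ch, out_ch):
        super().__init__()
        # One conv produces 8 sub-volumes that share the same receptive field.
        self.conv = nn.Conv3d(in_ch, out_ch * 8, kernel_size=3, padding=1)
        self.out_ch = out_ch

    def forward(self, x):
        b, _, d, h, w = x.shape
        y = self.conv(x).view(b, self.out_ch, 2, 2, 2, d, h, w)
        # Interleave the 8 sub-volumes along depth/height/width.
        y = y.permute(0, 1, 5, 2, 6, 3, 7, 4).reshape(b, self.out_ch, d * 2, h * 2, w * 2)
        return y

up = VoxelShuffleUp(16, 8)
print(up(torch.randn(1, 16, 4, 4, 4)).shape)  # torch.Size([1, 8, 8, 8, 8])
```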
Modeling Dynamic Attributes for Next Basket Recommendation Yongjun Chen, Jia Li, Chenghao Liu, Chenxi Li, Markus Anderle, Julian McAuley, Caiming Xiong Context-Aware Recommender Systems Workshop at ACM Conference on Recommender Systems (CARS@RecSys), 2021
[Abstract]
[Paper]
[Code]
Traditional approaches to next-item and next-basket recommendation typically extract users’ interests based on their past interactions and associated static contextual information (e.g., a user id or item category). However, extracted interests can be inaccurate and become obsolete. Dynamic attributes, such as user income and item prices, change over time. Such dynamics can intrinsically reflect the evolution of users’ interests. We argue that modeling such dynamic attributes can boost recommendation performance. However, properly integrating them into user interest models is challenging, since attribute dynamics can be diverse (e.g., time-interval-aware or periodic patterns) and they represent users’ behaviors from different perspectives, which can happen asynchronously with interactions. Besides dynamic attributes, items in each basket contain complex interdependencies which might be beneficial but are nontrivial to capture effectively. To address these challenges, we propose a novel Attentive network to model Dynamic attributes (named AnDa). AnDa separately encodes dynamic attributes and basket item sequences. We design a periodic-aware encoder to allow the model to capture various temporal patterns from dynamic attributes. To effectively learn useful item relationships, an intra-basket attention module is proposed. Experimental results on three real-world datasets demonstrate that our method consistently outperforms the state-of-the-art.
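A hedged sketch of an intra-basket attention step, where items within one basket attend to each other so the basket representation captures their interdependencies; the dimensions and the mean-pooling choice are illustrative, not the paper's exact design:

```python
# Items in a basket attend to each other; the result is pooled into one vector.
import torch
import torch.nn as nn

dim, basket_size = 64, 5
attn = nn.MultiheadAttention(embed_dim=dim, num_heads=4, batch_first=True)
basket_items = torch.randn(1, basket_size, dim)  # embeddings of items in one basket
ctx, _ = attn(basket_items, basket_items, basket_items)
basket_repr = ctx.mean(dim=1)                    # pool into a basket vector
print(basket_repr.shape)                         # torch.Size([1, 64])
```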
Learning Graph Pooling and Hybrid Convolutional Operations for Text Representations Hongyang Gao, Yongjun Chen, and Shuiwang Ji The Web Conference (WWW), 2019
[Abstract]
[Paper]
With the development of graph convolutional networks (GCNs), deep learning methods have started to be used on graph data. In addition to convolutional layers, pooling layers are another important component of deep learning. However, no effective pooling methods have yet been developed for graphs. In this work, we propose the graph pooling (gPool) layer, which employs a trainable projection vector to measure the importance of nodes in graphs. By selecting the k most important nodes to form the new graph, gPool achieves the same objective as regular max pooling layers operating on images. Another limitation of GCNs when used on graph-based text representation tasks is that they do not consider the order information of nodes in a graph. To address this limitation, we propose the hybrid convolutional (hConv) layer, which combines GCN and regular convolutional operations. The hConv layer is capable of increasing receptive fields quickly and computing features automatically. Based on the proposed gPool and hConv layers, we develop new deep networks for text categorization tasks. Our results show that networks based on gPool and hConv layers achieve new state-of-the-art performance compared to baseline methods.
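The gPool scoring-and-selection step is compact enough to sketch; a minimal version, assuming a dense adjacency matrix and a single projection vector p, might look like this:

```python
# gPool sketch: score nodes with a trainable projection vector, keep the
# top-k, gate the kept features, and induce the subgraph on kept nodes.
import torch

def gpool(x, adj, p, k):
    """x: (N, C) node features; adj: (N, N) adjacency; p: (C,) projection."""
    scores = (x @ p) / p.norm()               # per-node importance
    idx = scores.topk(k).indices              # k most important nodes
    x_new = x[idx] * torch.sigmoid(scores[idx]).unsqueeze(-1)  # gated features
    adj_new = adj[idx][:, idx]                # induced subgraph
    return x_new, adj_new

x, adj = torch.randn(10, 16), (torch.rand(10, 10) > 0.5).float()
p = torch.randn(16, requires_grad=True)       # learned jointly with the network
x2, a2 = gpool(x, adj, p, k=4)
print(x2.shape, a2.shape)                     # torch.Size([4, 16]) torch.Size([4, 4])
```

The sigmoid gate keeps the projection vector on the gradient path, which is what makes the node scoring trainable end to end.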
Dense Transformer Networks Jun Li, Yongjun Chen, Lei Cai, Ian Davidson, and Shuiwang Ji The 28th International Joint Conference on Artificial Intelligence (IJCAI), 2019
[Abstract]
The key idea of current deep learning methods for dense prediction is to apply a model on a regular patch centered on each pixel to make pixel-wise predictions. These methods are limited in the sense that the patches are determined by the network architecture instead of learned from data. In this work, we propose the dense transformer networks, which can learn the shapes and sizes of patches from data. The dense transformer networks employ an encoder-decoder architecture, and a pair of dense transformer modules are inserted into each of the encoder and decoder paths. The novelty of this work is that we provide technical solutions for learning the shapes and sizes of patches from data and efficiently restoring the spatial correspondence required for dense prediction. The proposed dense transformer modules are differentiable, thus the entire network can be trained. We apply the proposed networks on natural and biological image segmentation tasks and show that superior performance is achieved in comparison to baseline methods.
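Dense transformer modules build on differentiable spatial sampling; as a rough illustration (not the paper's exact module, which learns the transformation from data), here is how a warped sampling grid reshapes the effective patch each output pixel sees, using PyTorch's grid-sampling primitives:

```python
# A learned transformation warps the sampling grid, so the effective "patch"
# is data-dependent rather than fixed by the architecture.
import torch
import torch.nn.functional as F

feat = torch.randn(1, 8, 32, 32)               # an encoder feature map
theta = torch.tensor([[[1.0, 0.2, 0.0],        # an affine transform; in a real
                       [0.0, 1.0, 0.1]]])      # network this would be predicted
grid = F.affine_grid(theta, feat.shape, align_corners=False)
warped = F.grid_sample(feat, grid, align_corners=False)
print(warped.shape)                            # torch.Size([1, 8, 32, 32])
```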
Interpreting Deep Models for Text Analysis via Optimization and Regularization Methods Hao Yuan, Yongjun Chen, Xia Hu and Shuiwang Ji The 33rd AAAI Conference on Artificial Intelligence (AAAI), 2019
[Abstract]
[Paper]
Interpreting deep neural networks is of great importance to understand and verify deep models for natural language processing (NLP) tasks. However, most existing approaches only focus on improving the performance of models but ignore their interpretability. In this work, we propose an approach to investigate the meaning of hidden neurons of convolutional neural network (CNN) models. We first employ saliency map and optimization techniques to approximate the detected information of hidden neurons from input sentences. Then we develop regularization terms and explore words in the vocabulary to interpret such detected information. Experimental results demonstrate that our approach can identify meaningful and reasonable interpretations for hidden spatial locations. Additionally, we show that our approach can describe the decision procedure of deep NLP models.
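A minimal saliency sketch in the spirit of this approach: measure how strongly each input word embedding drives a chosen hidden neuron via gradients. The tiny CNN and all names below are stand-ins, not the paper's exact model:

```python
# Gradient-based saliency of one hidden neuron w.r.t. input word embeddings.
import torch
import torch.nn as nn

embed = nn.Embedding(100, 16)
conv = nn.Conv1d(16, 8, kernel_size=3, padding=1)

tokens = torch.randint(100, (1, 10))          # one 10-word sentence
x = embed(tokens).transpose(1, 2)             # (1, 16, 10)
x.retain_grad()                               # keep gradients on this non-leaf
neuron = conv(x)[0, 3].sum()                  # activation of hidden channel #3
neuron.backward()
saliency = x.grad.abs().sum(dim=1).squeeze()  # per-word importance scores
print(saliency.argsort(descending=True))      # words the neuron responds to most
```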
Contrastive Self-supervised Sequential Recommendation with Robust Augmentation Zhiwei Liu*, Yongjun Chen*, Jia Li, Philip S. Yu, Julian McAuley, Caiming Xiong Preprint, 2021
[Abstract]
[Paper]
[Code]
Sequential recommendation describes a set of techniques to model dynamic user behavior in order to predict future interactions in sequential user data. At their core, such approaches model transition probabilities between items in a sequence, whether through Markov chains, recurrent networks, or, more recently, Transformers. However, both old and new issues remain, including data sparsity and noisy data; such issues can impair performance, especially in complex, parameter-hungry models. In this paper, we investigate the application of contrastive Self-Supervised Learning (SSL) to sequential recommendation as a way to alleviate some of these issues. Contrastive SSL constructs augmentations from unlabelled instances, where agreements among positive pairs are maximized. It is challenging to devise a contrastive SSL framework for sequential recommendation due to its discrete nature, correlations among items, and skewness of length distributions. To this end, we propose a novel framework, Contrastive Self-supervised Learning for Sequential Recommendation (CoSeRec). We introduce two informative augmentation operators that leverage item correlations to create high-quality views for contrastive learning. Experimental results on three real-world datasets demonstrate the effectiveness of the proposed method in improving model performance and its robustness against sparse and noisy data. Our implementation is available: https://github.com/YChen1993/CoSeRec.
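A hedged sketch of a correlation-informed "substitute" augmentation, in the spirit of the operators above: replace a few items in a user sequence with highly correlated items, producing a positive view that is harder to break. The similarity matrix below is random filler standing in for learned item correlations:

```python
# Substitute a fraction of items with their most correlated counterparts.
import numpy as np

rng = np.random.default_rng(0)
num_items = 50
item_sim = rng.random((num_items, num_items))   # stand-in for item correlations

def substitute(seq, sim, ratio=0.2):
    seq = list(seq)
    n_sub = max(1, int(len(seq) * ratio))
    for pos in rng.choice(len(seq), size=n_sub, replace=False):
        seq[pos] = int(sim[seq[pos]].argmax())   # most correlated item
    return seq

view = substitute([3, 17, 42, 8, 25], item_sim)
print(view)  # an augmented positive view of the original sequence
```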
M.S., Computer Science, Washington State University, Aug. 2016 - Dec. 2018
B.S., Mathematics and Statistics, Huazhong University of Science and Technology, Sept. 2012 - June 2016
2025: COLING; 2024: TKDE, ACL, EMNLP; 2023: ACL; 2022: UAI, EMNLP, AAAI, ACML, TOIS; 2021: NAACL
2020 and before: KDD, CIKM, NeurIPS, ICDM, TKDE
Pennyworth@Salesforce ranked #4 in the CIKM 2021 AnalytiCup on Automated Hyperparameter Optimization (team lead), Oct. 2021.
Excellent Graduate, HUST, June 2016.
Honorable Mention Prize, The International Interdisciplinary Contest in Modeling (ICM), Feb. 2015.
Undergraduate Science and Technology Innovation Scholarship, HUST, Feb. 2014, 2015.