Papers

M. Zhong*, A. Zhang*, X. Wang, R. Hou, W. Xiong, C. Zhu, Z. Chen, L. Tan, C. Bi, M. Lewis, S. Popuri, S. Narang, M. Kambadur, D. Mahajan, S. Edunov, J. Han, and L. van der Maaten (*equal contribution)
Law of the Weakest Link: Cross Capabilities of Large Language Models
In Proceedings of the International Conference on Learning Representations (ICLR), 2025
llm-cross-capabilities.org
J. Ji, A. Zhang, C. Zhu, S. Wang, M. Kambadur, S. Chang, and W. Xiong
Pruning Computations in Transformer Prefilling for Large Language Models
“Speed up Transformer prefilling for generation via a learnable router.” In arXiv, 2025
J. Kim, A. Goyal, A. Zhang, B. Xiong, R. Hou, M. Kambadur, D. Mahajan, H. Hajishirzi, and L. Tan
A Systematic Examination of Preference Learning through the Lens of Instruction-Following
In Annual Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics (NAACL), 2025
Y. Yu, Z. Chen, A. Zhang, L. Tan, C. Zhu, R. Y. Pang, Y. Qian, X. Wang, S. Gururangan, C. Zhang, M. Kambadur, D. Mahajan, and R. Hou
Self-Generated Critiques Boost Reward Modeling for Language Models
In Annual Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics (NAACL), 2025
Z. Zhang, Y. Yao, A. Zhang, X. Tang, X. Ma, Z. He, Y. Wang, M. Gerstein, R. Wang, G. Liu, and H. Zhao
Igniting Language Intelligence: The Hitchhiker’s Guide From Chain-of-Thought Reasoning to Language Agents
In ACM Computing Surveys (CSUR), 2024
Llama Team, AI@Meta (Core Contributor)
The Llama 3 Herd of Models
2024
Z. Zhang and A. Zhang
You Only Look at Screens: Multimodal Chain-of-Action Agents
In Findings of the Association for Computational Linguistics (ACL), 2024
Z. Zhang, A. Zhang, M. Li, H. Zhao, G. Karypis, and A. J. Smola
Multimodal Chain-of-Thought Reasoning in Language Models
In Transactions on Machine Learning Research (TMLR), 2024
[Idea Inspiration by Homeschooling]
S. Ren, A. Zhang, Y. Zhu, S. Zhang, S. Zheng, M. Li, A. J. Smola, and X. Sun
Prompt Pre-Training with Twenty-Thousand Classes for Open-Vocabulary Visual Recognition
In Proceedings of the Conference on Neural Information Processing Systems (NeurIPS), 2023
Z. Zeng, C. Hawkins, M. Hong, A. Zhang, N. Pappas, V. Singh, and S. Zheng
Scaling Transformers to 128K Tokens or More by Prioritizing Important Tokens
In Proceedings of the Conference on Neural Information Processing Systems (NeurIPS), 2023
J. Chen, A. Zhang, X. Shi, M. Li, A. J. Smola, and D. Yang
Parameter-Efficient Fine-Tuning Design Spaces
In Proceedings of the International Conference on Learning Representations (ICLR), 2023
Z. Zhang, A. Zhang, M. Li, and A. J. Smola
Automatic Chain of Thought Prompting in Large Language Models
In Proceedings of the International Conference on Learning Representations (ICLR), 2023
Z. Liu, Z. Tang, X. Shi, A. Zhang, M. Li, A. Shrivastava, and A. Wilson
Learning Multimodal Data Augmentation in Feature Space
In Proceedings of the International Conference on Learning Representations (ICLR), 2023
T. Yang, Y. Zhu, Y. Xie, A. Zhang, C. Chen, and M. Li
AIM: Adapting Image Models for Efficient Video Understanding
In Proceedings of the International Conference on Learning Representations (ICLR), 2023
C. Qin, A. Zhang, Z. Zhang, J. Chen, M. Yasunaga, and D. Yang
Is ChatGPT a General-Purpose Natural Language Processing Task Solver?
In Empirical Methods in Natural Language Processing (EMNLP), 2023
J. Chen, A. Zhang, D. Yang, M. Li, and A. J. Smola
A Cheaper and Better Diffusion Language Model with Soft-Masked Noise
In Empirical Methods in Natural Language Processing (EMNLP), 2023
R. Aly, X. Shi, K. Lin, A. Zhang, and A. G. Wilson
Automated Few-Shot Learning with Instruction-Finetuned Language Models
In Findings of Empirical Methods in Natural Language Processing (EMNLP), 2023
H. Wang, A. Zhang, Y. Zhu, S. Zheng, M. Li, A. J. Smola, and Z. Wang
Partial and Asymmetric Contrastive Learning for Out-of-Distribution Detection in Long-Tailed Recognition
In Proceedings of the International Conference on Machine Learning (ICML, Long Presentation), 2022
H. Wang, A. Zhang, S. Zheng, X. Shi, M. Li, and Z. Wang
Removing Batch Normalization Boosts Adversarial Training
In Proceedings of the International Conference on Machine Learning (ICML), 2022
M. S. Bari, A. Zhang, S. Zheng, X. Shi, Y. Zhu, S. Joty, and M. Li
SPT: Semi-Parametric Prompt Tuning for Multitask Prompted Learning
“If your prompt tuning can’t converge easily, make it semi-parametric.” In arXiv, 2022
H. Wang, J. Hong, A. Zhang, J. Zhou, and Z. Wang
Trap and Replace: Defending Backdoor Attacks by Trapping Them into an Easy-to-Replace Subnetwork
In Proceedings of the Conference on Neural Information Processing Systems (NeurIPS), 2022
Z. Hu, R. K.-W. Lee, C. C. Aggarwal, and A. Zhang
Text Style Transfer: A Review and Experimental Evaluation
In ACM SIGKDD Explorations, 2022
X. Hao, Y. Zhu, S. Appalaraju, A. Zhang, W. Zhang, B. Li, and M. Li
MixGen: A New Multi-Modal Data Augmentation
“Interpolate images, concatenate text.” In arXiv, 2022
C. He, S. Zheng, A. Zhang, G. Karypis, T. Chilimbi, M. Soltanolkotabi, and S. Avestimehr
SMILE: Scaling Mixture-of-Experts with Efficient Bi-level Routing
“2.5x speedup over Switch Transformers.” In arXiv, 2022
E. Grassucci, A. Zhang, and D. Comminiello
PHNNs: Lightweight Neural Networks via Parameterized Hypercomplex Convolutions
In IEEE Transactions on Neural Networks and Learning Systems (TNNLS), 2022
A. Zhang, Y. Tay, S. Zhang, A. Chan, A. T. Luu, S. C. Hui, and J. Fu
Beyond Fully-Connected Layers with Quaternions: Parameterization of Hypercomplex Multiplications with 1/n Parameters
In Proceedings of the International Conference on Learning Representations (ICLR, Outstanding Paper Award), 2021
A. Chan, Y. S. Ong, B. Pung, A. Zhang, and J. Fu
CoCon: A Self-Supervised Approach for Controlled Text Generation
In Proceedings of the International Conference on Learning Representations (ICLR), 2021
A. Zhang, Y. Tay, Y. Shen, A. Chan, and S. Zhang
Self-Instantiated Recurrent Units with Dynamic Soft Recursion
In Proceedings of the Conference on Neural Information Processing Systems (NeurIPS), 2021
Y. Long, B. Wang, Z. Yang, B. Kailkhura, A. Zhang, C. A. Gunter, and B. Li
G-PATE: Scalable Differentially Private Data Generator via Private Aggregation of Teacher Discriminators
In Proceedings of the Conference on Neural Information Processing Systems (NeurIPS), 2021
T. Chen, Y. Sui, X. Chen, A. Zhang, and Z. Wang
A Unified Lottery Ticket Hypothesis for Graph Neural Networks
In Proceedings of the International Conference on Machine Learning (ICML), 2021
H. Shao, Z. Xiao, S. Yao, D. Sun, A. Zhang, S. Liu, T. Wang, J. Li, and T. Abdelzaher
ControlVAE: Tuning, Analytical Properties, and Performance Analysis
In Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2021
A. Zhang, A. Chan, Y. Tay, J. Fu, S. Wang, S. Zhang, H. Shao, S. Yao, and R. Lee
On Orthogonality Constraints for Transformers
In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics (ACL, Oral), 2021
H. Shao, J. Wang, H. Lin, X. Zhang, A. Zhang, H. Ji, and T. Abdelzaher
Controllable and Diverse Text Generation in E-commerce
In Proceedings of the Web Conference (WWW), 2021
S. Zhang, H. Liu, A. Zhang, Y. Hu, C. Zhang, Y. Li, T. Zhu, S. He, and W. Ou
Learning User Representations with Hypercuboids for Recommender Systems
In Proceedings of the 14th International Conference on Web Search and Data Mining (WSDM), 2021
H. Shao, S. Yao, D. Sun, A. Zhang, S. Liu, D. Liu, J. Wang, and T. Abdelzaher
ControlVAE: Controllable Variational Autoencoder
In Proceedings of the International Conference on Machine Learning (ICML), 2020
J. Guo, H. He, T. He, L. Lausen, M. Li, H. Lin, X. Shi, C. Wang, J. Xie, S. Zha, A. Zhang, H. Zhang, Z. Zhang, Z. Zhang, S. Zheng, and Y. Zhu
GluonCV and GluonNLP: Deep Learning in Computer Vision and Natural Language Processing
In Journal of Machine Learning Research (JMLR), Vol. 21, No. 23, 2020
A. Chan, Y. Tay, Y. S. Ong, and A. Zhang
Poison Attacks against Text Datasets with Conditional Adversarially Regularized Autoencoder
In Findings of Empirical Methods in Natural Language Processing (EMNLP), 2020
H. Shao, D. Sun, J. Wu, Z. Zhang, A. Zhang, S. Yao, S. Liu, T. Wang, C. Zhang, and T. Abdelzaher
GitHub Repository Recommendation for Academic Papers
In Proceedings of the Web Conference (WWW), 2020
Y. Tay, A. T. Luu, A. Zhang, S. Wang, and S. C. Hui
Compositional De-Attention Networks
In Proceedings of the Conference on Neural Information Processing Systems (NeurIPS), 2019
S. Zhang, L. Yao, L. V. Tran, A. Zhang, and Y. Tay
Quaternion Collaborative Filtering for Recommendation
In Proceedings of the 28th International Joint Conference on Artificial Intelligence (IJCAI), 2019
Y. Tay, A. Zhang, A. T. Luu, J. Rao, S. Zhang, S. Wang, J. Fu, and S. C. Hui
Lightweight and Efficient Neural Natural Language Processing with Quaternion Networks
In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics (ACL), 2019
Y. Tay, S. Wang, A. T. Luu, J. Fu, M. C. Phan, X. Yuan, J. Rao, S. C. Hui, and A. Zhang
Simple and Effective Curriculum Pointer-Generator Networks for Reading Comprehension over Long Narratives
In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics (ACL), 2019
A. El-Kishky, F. Xu, A. Zhang, and J. Han
Parsimonious Morpheme Segmentation with an Application to Enriching Word Embeddings
In Proceedings of the IEEE International Conference on Big Data (IEEE BigData), 2019
S. Yao, Y. Zhao, A. Zhang, S. Hu, H. Shao, C. Zhang, L. Su, and T. Abdelzaher
Deep Learning for the Internet of Things
In Computer, 2018 (selected as May 2018 cover feature of the flagship magazine of the IEEE Computer Society)
A. El-Kishky, F. Xu, A. Zhang, S. Macke, and J. Han
Entropy-Based Subword Mining with an Application to Word Embeddings
In Proceedings of the 2nd Workshop on Subword and Character Level Models in NLP at the 16th Annual Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL SCLeM), 2018
S. Yao, Y. Zhao, H. Shao, C. Zhang, A. Zhang, S. Hu, D. Liu, S. Liu, L. Su, and T. F. Abdelzaher
SenseGAN: Enabling Deep Learning for Internet of Things with a Semi-Supervised Framework
In Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies (ACM Ubicomp, Distinguished Paper Award), 2018
A. Zhang, X. Lu, C. A. Gunter, S. Yao, F. Tao, R. Zhu, H. Gui, D. Fabbri, D. Liebovitz, and B. Malin
De Facto Diagnosis Specialties: Recognition and Discovery
In Learning Health Systems (LHS), 2018
S. Yao, Y. Zhao, H. Shao, C. Zhang, A. Zhang, D. Liu, S. Liu, L. Su, and T. Abdelzaher
ApDeepSense: Deep Learning Uncertainty Estimation Without the Pain for IoT
In Proceedings of the 38th IEEE International Conference on Distributed Computing Systems (ICDCS), 2018
A. Zhang, L. Garcia-Pueyo, J. B. Wendt, M. Najork, and A. Broder
Email Category Prediction
In Proceedings of the 26th International World Wide Web Conference (WWW), 2017
S. Yao, S. Hu, Y. Zhao, A. Zhang, and T. Abdelzaher
DeepSense: A Unified Deep Learning Framework for Time-Series Mobile Sensing Data Processing
In Proceedings of the 26th International World Wide Web Conference (WWW), 2017
S. Yao, Y. Zhao, A. Zhang, S. Lu, and T. Abdelzaher
DeepIoT: Compressing Deep Neural Network Structures for Sensing Systems without Loss of Performance
In Proceedings of the 15th ACM Conference on Embedded Networked Sensor Systems (ACM SenSys, Best Paper Award Nomination), 2017
S. Yao, Y. Zhao, H. Shao, A. Zhang, C. Zhang, S. Li, and T. Abdelzaher
RDeepSense: Reliable Deep Mobile Computing Models with Uncertainty Estimations
In Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies (ACM Ubicomp), 2017
A. Zhang, A. Goyal, R. Baeza-Yates, Y. Chang, J. Han, C. A. Gunter, and H. Deng
Towards Mobile Query Auto-Completion: An Efficient Mobile Application-Aware Approach
In Proceedings of the 25th International World Wide Web Conference (WWW), 2016
A. Zhang and Q. Gu
Accelerated Stochastic Block Coordinate Descent with Optimal Sampling [Proof of Lemmas]
In Proceedings of the 22nd ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), 2016
S. Demetriou, W. Merrill, W. Yang, A. Zhang, and C. A. Gunter
Free for All! Assessing User Data Exposure to Advertising Libraries on Android
In Proceedings of the Network and Distributed System Security Symposium (NDSS), 2016
R. Zhu, A. Zhang, J. Peng, and C. Zhai
Exploiting Temporal Divergence of Topic Distributions for Event Detection
In Proceedings of the IEEE International Conference on Big Data (IEEE BigData), 2016
A. Zhang, A. Goyal, W. Kong, H. Deng, A. Dong, Y. Chang, C. A. Gunter, and J. Han
adaQAC: Adaptive Query Auto-Completion via Implicit Negative Feedback
In Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR), 2015
W. Kong, R. Li, J. Luo, A. Zhang, Y. Chang, and J. Allan
Predicting Search Intent Based on Pre-Search Context
In Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR), 2015
H. Fu, A. Zhang, and X. Xie
Effective Social Graph De-anonymization based on Graph Structure and Descriptive Information
In ACM Transactions on Intelligent Systems and Technology (ACM TIST), Vol. 6, No. 4, 2015
X. Lu, A. Zhang, C. A. Gunter, D. Fabbri, D. Liebovitz, and B. Malin
Discovering De Facto Diagnosis Specialties
In Proceedings of the 6th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics (ACM BCB), 2015
A. Zhang, X. Xie, K. C.-C. Chang, C. A. Gunter, J. Han, and X. Wang
Privacy Risk in Anonymized Heterogeneous Information Networks
In Proceedings of the 17th International Conference on Extending Database Technology (EDBT), 2014
H. Fu, A. Zhang, and X. Xie
De-anonymizing Social Graphs via Node Similarity
In Proceedings of the 23rd International World Wide Web Conference (WWW), 2014