Laboratory  for MAchine Perception and LEarning (MAPLE)



    Guo-Jun Qi, Ph.D.




guojunq at gmail dot com


Machine Perception and Learning (MAPLE) [url]

Github homepage: [github]

Follow me on social media

Zhihu, Facebook, Twitter

Research Interests

  • Multimedia Systems for Information Analysis  
  • Multimodal Internet of Things (IoT)
  • Machine Learning and Pattern Recognition 
  • Image Processing and Computer Vision 

Recent Research (2017 - present)

  • [Deep Learning for IoT and Multimodal Analysis]  We developed 1) State-Frequency Memory RNNs [pdf] for multiple-frequency analysis of signals, 2) Spatial-Temporal Transformers [pdf] to integrate self-attention over spatial topology and temporal dynamics for traffic forecasting, and 3) First-Take-All Hashing [pdf] to efficiently index and retrieve multimodal sensor signals at scale.
    • 1) State-Frequency Memory (SFM) RNNs for Multi-Source/Sensor Signal Analysis. SFM RNNs explore multiple frequencies of dynamic memory for time-series analysis. The multi-frequency memory enables more accurate signal predictions than the LSTM across a range of dynamic contexts. For example, in financial analysis [pdf], long-term investors use low-frequency information to forecast asset prices, while high-frequency traders rely more on high-frequency pricing signals to make investment decisions.
    • 2) Spatial-Temporal Transformer for Traffic Forecasting. The spatial-temporal transformer [pdf] is among the first works to apply self-attention to dynamic graph neural networks, exploring both the network topology and temporal dynamics to forecast traffic flows from city-scale IoT data.
    • 3) First-Take-All Hashing and Device-Enabled Healthcare.  First-Take-All (FTA) hashing was developed to efficiently index dynamic activities captured by multimodal sensors (cameras and depth sensors) [pdf] for eldercare, and for image [pdf] and cross-modal retrieval [pdf].  It has also been applied to classify signals of brain neural activities for early diagnosis of ADHD [pdf], where it is one order of magnitude faster than the SOTA methods on the multi-facility dataset in a Kaggle Challenge.
    • 4) Aligning Multi-Source/Device Signals. We propose Dynamically Programmable Layers to automatically align signals from multiple sources/devices. We successfully demonstrate their application to predicting the brain connectivity between neurons [pdf].
    • 5) E-Optimal Sensor Deployment and Selection. We develop an optimal online sensor selection approach with the restricted isometry property based on E-optimality [link].  It was successfully applied to collaborative spectrum sensing in cognitive radio networks (CRNs) and to selecting the most informative features from a large amount of data/signals. The paper will be featured in IEEE Computer's "Spotlight on Transactions" Column.

  • [AutoEncoding Transformations (AET) and Adversarial Contrast (AdCo)] Self-supervised methods and their applications to unsupervised training of CNNs, GCNs, and GANs. We also reveal their intrinsic connection and generalization to Transformation-Equivariant Representation. A long version of the paper on learning Generalized Transformation-Equivariant Representations (GTER) via AutoEncoding Transformations (AET) is available at [pdf].  See more information about this series of works:
    • 1) Unsupervised training of CNNs: AETv1 [link][pdf][github] and AETv2 [link],
    • 2) Variational AET and the connection to transformation-equivariant representation learning [pdf][github],
    • 3) (Semi-)Supervised AET and the ensemble training with both spatial and non-spatial transformations [link][pdf][github],
    • 4) Unsupervised training of Graph Convolutional Networks (GCNs) [pdf][github],
    • 5) Transformation GAN (TrGAN), which uses the AET loss to train its discriminator for better generalizability in creating new images [pdf],
    • 6) Adversarial Contrast (AdCo) [pdf][github]: An adversarial approach that trains negative examples directly end-to-end with the representation backbone for unsupervised pre-training, efficiently pre-training ResNet-50 with 20% less computing time than the SOTA methods (e.g., BYOL) and in far fewer epochs.

  • [Regularized GANs] We present a regularized Loss-Sensitive GAN (LS-GAN) and extend it to a generalized version (GLS-GAN) with many variants of regularized GANs as its special cases. We proved both the distributional consistency and generalizability of the LS-GAN with polynomial sample complexity to generate new contents. See more details about
    • 1) LS-GAN and GLS-GAN [pdf][github],
    • 2) A landscape of regularized GANs in a big picture [url],
    • 3) An extension by obtaining an encoder of input samples directly with manifold margins through the loss-sensitive GAN [github: torch, blocks],
    • 4) The LS-GAN has been adopted by Microsoft CNTK (Cognitive Toolkit) as a reference regularized GAN model [link].
    • 5) Localized GAN was used to model the manifold of images along their tangent vector spaces.  It was used to capture and/or generate the local variants of input images so that their attributes can be edited by manipulating the input noises.  The local variants of images along the tangents can also be used to approximate the Laplace-Beltrami operator for semi-supervised representation learning [pdf].
  • [Small Data Challenges] Take a look at our survey of "Small Data Challenges in Big Data Era: A Survey of Recent Progress on Unsupervised and Semi-Supervised Methods" [pdf], and our tutorial presented at IJCAI 2019 [link] with the presentation slides [pdf].  Also see our recent works on
    • 1) Unsupervised Learning. AutoEncoding Transformations (AET)  [pdf], Autoencoding Variational Transformations (AVT)  [pdf], GraphTER (Graph Transformation Equivariant Representations) [pdf], TrGANs (Transformation GANs) [pdf],
    • 2) Semi-Supervised Learning. Localized GANs (see how to compute the Laplace-Beltrami operator directly for semi-supervised learning) [pdf], Ensemble AET [pdf],
    • 3) Few-Shot Learning. FLAT (Few-Shot Learning via AET) [pdf], knowledge transfer for few-shot learning [pdf], task-agnostic meta-learning [pdf]

  • [MAPLE Github] We are releasing the source code of our research projects at our MAPLE github homepage [url]. We invite everyone interested in our work to try it. Feedback and pull requests are warmly welcome.



I am hiring FTEs and interns for several projects on perception, cognitive and actionable AI systems, which include but are not limited to machine learning and pattern recognition, multimedia and multi-sensor signal processing and analysis, computer vision, natural language processing, decision making, reinforcement learning, adaptive control, motion control, path planning, and system optimization.  Interested candidates can reach me directly by email (see above).

Recent News:  

  • [January 2021] ICME 2021 will take place 5-9 July.  Due to the impact of COVID-19, the conference will be hybrid, with an in-person Summit (10-11 July 2021) on "Disruptive Multimedia Technologies."  We look forward to meeting you in July.
  • [January 2020] Dr. Qi received the Best Associate Editor Award for the IEEE Transactions on Circuits and Systems for Video Technology (2019).
  • [December 2019]  A paper "PC-DARTS: Partial Channel Connections for Memory-Efficient Architecture Search" was accepted by ICLR 2020 [pdf]. A fast neural architecture search (NAS) algorithm was developed, orders of magnitude faster than DARTS. 
  • [November 2019] A paper "POST: POlicy-based Switch Tracking" was accepted by AAAI 2020.

Archived News:

  • Dr. Qi will serve as a General Chair for ICME 2021 in Shenzhen.
  • Four papers were accepted by CVPR 2020. Among them are papers applying AET (Auto-Encoding Transformations) to unsupervised/self training of Graph Convolutional Networks [pdf] and Generative Adversarial Networks [pdf].
  • Dr. Qi will chair "Deep Learning for Multimedia Processing and Analysis I" [link] and "Image/Video Representation" [link] at ICASSP 2020.
  • Our paper "Hierarchical Long Short-Term Concurrent Memory for Human Interaction Recognition" has been accepted by IEEE T-PAMI.
  • Dr. Qi gave a tutorial on "Small Data Challenges in Big Data Era" at IJCAI 2019. See the slides [pdf] at the tutorial homepage [link].  A keynote on "Learning Generalized Transformation-Equivariant Representations" was also presented at IJCAI Tusion 2019.
  • Our paper "AVT: Unsupervised Learning of Transformation Equivariant Representations by Autoencoding Variational Transformations" has been accepted by ICCV 2019 [pdf].  In this work, we study the AutoEncoding Transformations (AET) from an information-theoretic perspective, where we present a novel point of view to generalize the Transformation-Equivariant Representation.
  • "Few-Shot Image Recognition with Knowledge Transfer" was accepted by ICCV 2019.
  • Dr. Qi was elected to the IEEE IVMSP and MMSP technical committees.
  • Dr. Qi is appointed as an Associate Editor for IEEE Transactions on Multimedia.
  • Dr. Qi is appointed as an Associate Editor for IEEE Transactions on Image Processing.
  • Our paper "Large-scale Bisample Learning on ID Versus Spot Face Recognition" was accepted by IJCV, see preprint [pdf].
  • Our paper "AET vs. AED: Unsupervised Representation Learning by Auto-Encoding Transformations rather than Data" was accepted by CVPR 2019, see preprint [pdf].  A novel unsupervised learning approach was presented to train Transformation Equivariant Representation (TER). It achieves state-of-the-art performance on ImageNet with an unsupervised AlexNet (53.2% Top-1 accuracy), compared with 59.7% Top-1 accuracy for the fully supervised AlexNet.
  • [February 2019] Our paper "Task-Agnostic Meta-Learning for Few-shot Learning" has been accepted by CVPR 2019. See our preprint at arXiv [pdf]. It presents a meta-learning regularization approach that encourages unbiased meta-training over training tasks so that the meta-model can generalize better to unseen tasks.
  • Our paper "CapProNet: Deep Feature Learning via Orthogonal Projections onto Capsule Subspaces" was accepted by NIPS 2018. We present a novel capsule projection architecture, setting a new state of the art for capsule nets in the literature on CIFAR, SVHN and ImageNet. The source code was released at our github homepage.
  • Our paper "Learning Compact Features for Human Activity Recognition via Probabilistic First-Take-All" has been accepted by IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI). The accepted paper and source code will be released soon.
  • We propose a Loss-Sensitive GAN (LS-GAN), and extend it to a generalized LS-GAN (GLS-GAN) in which Wasserstein GAN is a special case. We have proved both distributional consistency and generalizability of the LS-GAN model with a polynomial sample complexity in terms of the model size and its Lipschitz constants. See more details in our paper [pdf], and an incomplete map of GANs in our view [url].
  • Our paper on "Generalized Loss-Sensitive Adversarial Learning with Manifold Margins [pdf]" was accepted by ECCV 2018, where we propose training the Loss-Sensitive GAN by learning a pull-back mapping from a sample x to its projection z onto the manifold generated by the GAN. We show its applications to generating interpolated edits between images as well as semi-supervised learning with state-of-the-art performance. Source code is available at [github: torch, blocks].
  • Our paper on "A Principled Approach to Hard Triplet Generation via Adversarial Nets" was accepted by ECCV 2018, where we develop a principled way to generate harder yet more informative triplets to train query and re-identification models. State-of-the-art performance was demonstrated on the re-id and fine-grained classification problems.
  • Microsoft CNTK is officially supporting LS-GAN. You can make a side-by-side comparison with the other GAN models at
  • Dr. Qi is serving as a senior TPC member for AAAI 2019.
  • Our paper on "High sensitivity with tiny candidates for Pulmonary Nodule Detection" was accepted by MICCAI 2018.
  • The paper "Global versus Localized Generative Adversarial Nets" will appear in CVPR 2018. We present a new construction of the Laplace-Beltrami operator to enable semi-supervised learning on manifolds without resorting to Laplacian graphs as an approximation. We also demonstrate state-of-the-art performance on image classification tasks. The source code is released and available at [code 1: generation, code 2: semi-supervised learning].
  • Our paper "Interleaved Structured Sparse Convolutional Neural Networks" will appear in CVPR 2018 to present a new compact CNN model.
  • Dr. Qi is invited as an area chair for ICPR 2018.
  • Dr. Qi will serve as a Technical Program Co-chair for ACM Multimedia 2020 in Seattle.
  • We have a paper "Interleaved Group Convolutions for Deep Neural Networks" accepted by ICCV 2017, where a super compact and fast deep convolutional model was developed that can be deployed on mobile devices. Two types of group convolutions, a primal group sparse convolution and a dual point-wise permutation convolution, are developed to make the model more efficient. [pdf]
  • We released the source code for our ICML 2017 and KDD 2017 papers on State-Frequency LSTM [github][pdf] and stock price prediction [github][pdf].
  • Congratulations to Hao and Liheng on their ICML2017 and KDD2017 papers being accepted.  
  • Congratulations to Mr. Joey Velez-Ginorio, an undergraduate researcher of our group, on being selected as a Barry Goldwater scholar. This is the most prestigious undergraduate scholarship across the country established by the United States Congress to support highly qualified college students to pursue careers in STEM.
  • Dr. Qi will serve as an Area Chair for ICCV 2017.
  • Dr. Qi is serving as an Area Chair for ICME 2017.
  • A paper on learning compact features that encode dynamics of video and sensor data has been accepted by ACM TOMM.  
  • A paper on jointly learning label classification and tag recommendation has been accepted by AAAI 2017.  
  • One paper developing an efficient ranking-based hashing algorithm has been accepted for publication in IEEE Transactions on Pattern Analysis and Machine Intelligence. [pdf] [code]
  • One paper "Tri-Clustered Tensor Completion for Social-Aware Image Tag Refinement" has been accepted for publication in IEEE Transactions on Pattern Analysis and Machine Intelligence.
  • One paper has been accepted by IEEE Transactions on Pattern Analysis and Machine Intelligence for classifying images of rarely seen or unseen classes with the help of text labels. [pdf][code]  
  • Two research papers, including an oral presentation "Hierarchically Gated Deep Networks for Semantic Segmentation", have been accepted for presentation at CVPR 2016, Las Vegas, Nevada. [pdf]
  • One paper has been accepted for plenary presentation at SIGKDD 2016. A fast detection method for brain disorder based on fMRI was presented. It is one order of magnitude faster than state-of-the-art methods with even better accuracy.   
  • Dr. Qi is serving as an Area Chair for ACM Multimedia 2016.   
  • Dr. Qi will serve as a Senior Program Committee Member for KDD 2016.  
  • The International Conference on MultiMedia Modeling will be held in Miami, FL, 4-6 January 2016 [link].   Dr. Qi will serve as program co-chair.
  • CFP: Special Issue on "Big Media Data: Understanding, Search, and Mining", in IEEE Transactions on Big Data [pdf] (deadline: July 1, 2015). 
  • CFP: "Deep Learning for Multimedia Computing", in IEEE Transactions on Multimedia [pdf] (The new deadline is April 20, 2015).
  • Our full research paper "Weakly-Shared Deep Transfer Networks for Heterogeneous-Domain Knowledge Propagation" has been selected as one of the four best paper candidates to be presented at ACM MM 2015.
  • One paper is accepted by ICCV 2015.  We developed a novel deep LSTM  model for analyzing human actions, where we explore the differential structure over memory states to study the dynamic saliency.
  • One full research paper "Weakly-Shared Deep Transfer Networks for Heterogeneous-Domain Knowledge Propagation" is accepted by ACM MM 2015.  We developed a novel cross-modal label transfer deep network, showing competitive performance on predicting image labels derived from the alignment with text documents.  
  • Dr. Qi is serving as an Area Chair for ACM Multimedia 2015
  • Three papers are accepted by KDD 2015. Congratulations to Vivek, Rohit, Shiyu and Wei!  In these papers, (1) we developed deep networks to reveal the brain neural connectivity by aligning time-series activations from neuron firings that are marked by calcium influx; (2) we invented a new paradigm of dynamic model to select and predict sensors and their readings over time, as compared with the conventional static strategy; and (3) we developed heterogeneous networks to predict the cross-modal relevance between multimodal data.
  • One paper "Temporal-Order Preserving Dynamic Quantization for Human Action Recognition from Multimodal Sensor Streams" was accepted by ICMR 2015.  On the UTKinect-Action dataset, our best approach has achieved 100% accuracy. Congratulations to Jun and Kai!
  • One paper "Sparse Composite Quantization" has been accepted by CVPR 2015. Congratulations to Ting!

