guojunq at gmail dot com
- Multimedia Systems for Information Analysis
- Multimodal Internet of Things (IoT)
- Learning and Pattern Recognition
- Image Processing and Computer
- [Deep Learning for IoT and Multimodal Analysis] We developed 1) State-Frequency Memory RNNs [pdf] for multiple-frequency analysis of signals, 2) Spatial-Temporal Transformers [pdf]
to integrate self-attention over spatial topology and temporal
dynamics for traffic forecasting, and 3) First-Take-All
Hashing [pdf] to efficiently index and retrieve multimodal sensor signals at scale.
- 1) State-Frequency Memory (SFM) RNNs for Multi-Source/Sensor Signal Analysis. It explores multiple frequencies of
dynamic memory for time-series analysis through SFM RNNs.
The multi-frequency memory enables more accurate signal predictions than
the LSTM in various ranges of dynamic contexts. For example, in financial analysis [pdf],
long-term investors use low-frequency information to
forecast asset prices, while
high-frequency traders rely more on high-frequency pricing signals to make
- 2) Spatial-Temporal Transformer for Traffic Forecasting. The spatial-temporal transformer [pdf]
is among the first works to apply self-attention to dynamic
graph neural networks by exploring both the network topology and temporal
dynamics to forecast traffic flows from city-scale IoT data.
- 3) First-Take-All Hashing and Device-Enabled Healthcare.
The First-Take-All (FTA) hashing was developed to
efficiently index dynamic activities captured by multimodal sensors (cameras and depth sensors) [pdf] for eldercare, as well as for image [pdf] and cross-modal retrieval [pdf]. It is also applied to
classify signals of brain neural activities for early diagnosis of
which is one order of magnitude faster than
the SOTA methods on the
multi-facility dataset in a Kaggle Challenge.
- 4) Aligning Multi-Source/Device Signals.
We propose Dynamically Programmable Layers to
automatically align signals from multiple sources/devices. We successfully demonstrated
its application to predicting brain connectivity between neurons [pdf].
- 5) E-Optimal Sensor Deployment and Selection.
We develop an optimal online sensor selection approach with the
restricted isometry property based on E-optimality [link]. It was
successfully applied to collaborative spectrum sensing in cognitive
radio networks (CRNs), and to selecting the most informative features from
a large amount of data/signals. The paper will be featured in
IEEE Computer's "Spotlight on Transactions" Column.
- [AutoEncoding Transformations (AET) and Adversarial Contrast (AdCo)] Self-supervised methods and their applications to
unsupervised training of CNNs, GCNs, and GANs.
We also reveal their intrinsic connection and generalization to
Transformation-Equivariant Representation. A long version
paper on learning Generalized
Transformation-Equivariant Representations (GTER) via AutoEncoding
Transformations (AET) is available at [pdf].
See more information about this series of works:
- 1) Unsupervised training of
and AETv2 [link],
- 2) Variational
AET and the connection to transformation-equivariant
- 3) (Semi-)Supervised
and the ensemble training
with both spatial and non-spatial transformations [link][pdf][github],
- 4) Unsupervised
training of Graph Convolutional Networks (GCNs) [pdf][github],
- 5) Transformation
GAN (TrGAN) that uses the AET loss to train
its discriminator for better generalizability to create new images [pdf].
- 6) Adversarial Contrast (AdCo) [pdf][github]: An adversarial approach to train
negative examples directly end-to-end with the representation backbone
for unsupervised pre-training, efficiently pre-training ResNet-50 with
20% less computing time than the SOTA methods (e.g., BYOL) in
far fewer epochs.
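As a rough sketch of the idea shared by the AET series (the notation below is chosen here for illustration and is not quoted from the papers): a transformation t is sampled and applied to an input x, an encoder E represents both the original and the transformed input, and a decoder D tries to recover t from the two representations. Training then minimizes a loss between the sampled and the decoded transformation:

```latex
\min_{E,\,D} \; \mathbb{E}_{t \sim \mathcal{T},\; x \sim \mathcal{X}} \;
\ell\big(t, \hat{t}\,\big),
\qquad
\hat{t} = D\big[\,E(x),\; E(t(x))\,\big]
```

Here \mathcal{T} is an assumed distribution over transformations, \mathcal{X} the data distribution, and \ell any suitable discrepancy between transformations (for example, a squared error between their parameters).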
- [Regularized GANs] We
present a regularized Loss-Sensitive GAN (LS-GAN),
and extend it to a generalized version (GLS-GAN) with
many variants of regularized GANs as its special cases.
We proved both the distributional consistency and generalizability of
the LS-GAN with polynomial
sample complexity to generate new content. See more
- 1) LS-GAN and GLS-GAN [pdf][github],
- 2) A landscape of
regularized GANs in a big picture [url],
- 3) An
extension by obtaining an encoder of
input samples directly with
margins through the loss-sensitive GAN [github: torch,
- 4) The LS-GAN has been
adopted by Microsoft CNTK (Cognitive Toolkit) as a reference
regularized GAN model [link].
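The margin-based idea behind the loss-sensitive objective can be illustrated with a small sketch. It assumes the commonly stated form of the LS-GAN critic objective: the mean loss on real samples plus a hinge penalty whenever a real sample's loss is not smaller than a generated sample's loss by at least a margin. The function name, the toy inputs, and the weight `lam` are hypothetical and not taken from the released code.

```python
def ls_gan_critic_objective(loss_real, loss_fake, margins, lam=1.0):
    """Toy LS-GAN-style critic objective on precomputed scalars.

    loss_real[i] : loss function value L(x_i) on a real sample
    loss_fake[i] : loss function value L(G(z_i)) on a generated sample
    margins[i]   : margin Delta(x_i, G(z_i)), e.g. a distance between the pair
    lam          : hypothetical weight on the margin penalty
    """
    if not (len(loss_real) == len(loss_fake) == len(margins)):
        raise ValueError("all inputs must have the same length")
    n = len(loss_real)
    mean_real = sum(loss_real) / n
    # Hinge term: penalize pairs where L(x) + Delta exceeds L(G(z)).
    hinge = sum(
        max(0.0, m + lr - lf)
        for lr, lf, m in zip(loss_real, loss_fake, margins)
    ) / n
    return mean_real + lam * hinge


# Toy usage: the first pair already satisfies its margin, the second does not.
value = ls_gan_critic_objective([0.1, 0.2], [1.0, 0.3], [0.5, 0.5], lam=1.0)
```

In this toy call only the second pair contributes to the penalty, which is the behaviour the margin is meant to produce: generated samples far from real ones are pushed to larger losses, while pairs that already respect the margin are left alone.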
Localized GAN was used to model the manifold of images along their
tangent vector spaces. It was used to capture and/or generate
local variants of input images so that their attributes can be edited
by manipulating the input noise. The local variants of images
along the tangents can also be used to approximate
the Laplace-Beltrami operator for semi-supervised representation
- [Small Data Challenges]
Take a look at our survey of "Small
Data Challenges in Big Data Era: A
Survey of Recent Progress on Unsupervised and Semi-Supervised Methods"
and our tutorial presented at IJCAI 2019 [link]
with the presentation slides [pdf].
Also see our recent works on
- 1) Unsupervised Learning.
AutoEncoding Transformations (AET) [pdf],
Autoencoding Variational Transformations (AVT) [pdf],
GraphTER (Graph Transformation Equivariant Representations) [pdf], TrGANs
(Transformation GANs) [pdf],
- 2) Semi-Supervised Learning.
Localized GANs (see how to compute Laplace-Beltrami operator directly
for semi-supervised learning) [pdf],
Ensemble AET [pdf],
- 3) Few-Shot Learning.
FLAT (Few-Shot Learning via AET) [pdf],
knowledge transfer for few-shot learning [pdf],
task-agnostic meta-learning [pdf]
- [MAPLE Github] We
are releasing the source code of our research projects
at our MAPLE github homepage [url].
We are inviting everyone interested in our works to
Feedback and pull requests are warmly welcome.
I am hiring FTEs and interns for several projects on
cognitive and actionable AI systems, which include but are not limited to machine learning,
pattern recognition, multimedia and multi-sensor signal processing and
analysis, computer vision, natural language
processing, decision making, reinforcement learning, adaptive
motion control, path planning, and system optimization. Interested candidates can reach me directly
by email (see above).
2021] ICME 2021 will take place 5-9 July. Due to the impact of
COVID-19, the conference will be hybrid with an in-person Summit (10-11 July 2021) on
"Disruptive Multimedia Technologies." We are looking forward to
meeting you in July.
- [January 2020] Dr. Qi received the Best Associate Editor Award for the IEEE Transactions on Circuits and Systems for Video Technology (2019).
2019] A paper "PC-DARTS: Partial Channel Connections for
Memory-Efficient Architecture Search" was accepted by ICLR 2020 [pdf].
A fast neural architecture search (NAS) algorithm was developed, orders
of magnitude faster than DARTS.
- [November 2019] A
paper "POST: POlicy-based Switch Tracking" was accepted by AAAI 2020.
- Dr. Qi
will serve as a General Chair for ICME
2021 in Shenzhen.
- Four papers were accepted by
CVPR 2020. Among them are papers applying
AET (Auto-Encoding Transformations) to unsupervised/self-training of
Graph Convolutional Networks
and Generative Adversarial Networks [pdf].
- Dr. Qi will chair "Deep Learning for Multimedia
Processing and Analysis I" [link]
and "Image/Video Representation" [link]
at ICASSP 2020.
- Our paper "Hierarchical Long
Short-Term Concurrent Memory for
Human Interaction Recognition" has been accepted by IEEE T-PAMI.
Qi gave a tutorial on "Small Data Challenges in Big Data Era" at IJCAI 2019. See the
at the tutorial homepage [link].
A keynote on "Learning Generalized
Representations" was also presented at IJCAI Tusion 2019.
paper "AVT: Unsupervised Learning of Transformation Equivariant
Representations by Autoencoding Variational Transformations" has been
accepted by ICCV 2019.
In this work, we study AutoEncoding Transformations (AET)
from an information-theoretic perspective, where we present a novel
point of view to generalize the Transformation-Equivariant
- "Few-Shot Image Recognition
with Knowledge Transfer" was accepted by
- Dr. Qi was elected into IEEE IVMSP
- Dr. Qi
is appointed as an Associate Editor for IEEE Transactions on Multimedia.
- Dr. Qi
is appointed as an Associate Editor for IEEE Transactions on Image
- Our paper
"Large-scale Bisample Learning on ID Versus Spot Face Recognition" was
accepted by IJCV,
see preprint [pdf].
- Our paper "AET vs. AED: Unsupervised
Representation Learning by Auto-Encoding Transformations rather than
Data" was accepted by CVPR
2019, see preprint [pdf].
A novel unsupervised learning approach was presented to train a Transformation
Equivariant Representation (TER) that achieves
state-of-the-art performance on ImageNet with an unsupervised AlexNet (53.2% Top-1 accuracy, vs.
59.7% Top-1 accuracy for the fully supervised AlexNet).
- [February 2019] Our paper "Task-Agnostic Meta-Learning for
Few-shot Learning" has been accepted by CVPR 2019. See our
preprint on arXiv [pdf].
It presents a meta-learning regularization approach that encourages
unbiased meta-training over training tasks so that the meta-model
generalizes better to unseen tasks.
- Our paper "CapProNet: Deep Feature Learning
via Orthogonal Projections onto Capsule Subspaces" was
accepted by NIPS 2018.
We present a novel capsule projection architecture, setting a new
state of the art for capsule nets in the literature on CIFAR, SVHN and
ImageNet. The source code was released at our github homepage.
paper "Learning Compact
Features for Human Activity Recognition via
Probabilistic First-Take-All" has been accepted by IEEE Transactions on
Pattern Analysis and Machine Intelligence (PAMI). The accepted
paper and source code will be released soon.
- We propose a Loss-Sensitive GAN (LS-GAN)
and extend it to a generalized LS-GAN (GLS-GAN) in which
Wasserstein GAN is a special case.
We have proved both distributional consistency and generalizability of
the model with polynomial
sample complexity in terms of the model size and its Lipschitz
constants. See more
details in our paper [pdf], and
an incomplete map of GANs in our view [url].
- Our paper on "Generalized Loss-Sensitive
Adversarial Learning with Manifold Margins" [pdf]
was accepted by ECCV 2018,
where we propose to train the Loss-Sensitive GAN by learning a
pull-back mapping from a sample
x to its projection z onto the manifold
generated by the GAN. We show its applications to
generating interpolated edits between images as well as
semi-supervised learning with state-of-the-art performance. Source
code is available at [github: torch,
- Our paper on "A Principled Approach to Hard
Triplet Generation via Adversarial Nets" was accepted by ECCV 2018,
where we develop a principled way to generate harder yet more
informative triplets to train query and
State-of-the-art performances were demonstrated on the re-id and
fine-grained classification problems.
- Microsoft CNTK is
officially supporting LS-GAN.
You can make a side-by-side comparison with the other GAN models at https://www.cntk.ai/pythondocs/CNTK_206C_WGAN_LSGAN.html.
- Dr. Qi is serving as a senior
TPC member for AAAI 2019.
- Our paper on "High sensitivity with tiny
candidates for Pulmonary Nodule Detection" was accepted by
- The paper "Global versus Localized
Generative Adversarial Nets" will
appear in CVPR 2018.
We present a new construction of the Laplace-Beltrami operator
to enable semi-supervised learning on manifolds without resorting to
as an approximation. We also demonstrate
state-of-the-art performance on image classification tasks. The
source code is released and available at [code 1:
generation, code 2:
paper "Interleaved Structured Sparse Convolutional Neural Networks"
will appear in CVPR 2018
to present a new compact CNN model.
- Dr. Qi is invited as an area
chair for ICPR 2018.
- Dr. Qi will serve as a Technical Program Co-chair for ACM
Multimedia 2020 in Seattle.
- We have a paper "Interleaved Group Convolutions for Deep Neural Networks"
accepted by ICCV 2017,
where a super compact and fast deep
convolutional model was developed that can be deployed on mobile devices. Two
types of group convolutions, a primal group sparse convolution and
a dual point-wise permutation convolution, are developed to make the
model more efficient. [pdf]
- We released the source code for our ICML
2017 and KDD
2017 papers on
State-Frequency LSTM [github, pdf]
and stock price prediction [github, pdf].
- Congratulations to Hao and
Liheng on their ICML2017
papers being accepted.
- Congratulations to Mr. Joey Velez-Ginorio,
an undergraduate researcher of our group, on being selected as
a Barry Goldwater
scholar. This is the most prestigious undergraduate
scholarship across the country established by the United States Congress
to support highly qualified college students to pursue
careers in STEM.
- Dr. Qi will serve as an Area Chair for ICCV 2017.
- Dr. Qi is serving as an Area Chair for ICME 2017.
- A paper on learning compact
features that encode dynamics of video and sensor data has been
accepted by ACM TOMM.
- A paper on jointly learning
label classification and tag recommendation has been accepted by AAAI 2017.
- One paper developing a
ranking-based hashing algorithm has been accepted for publication
in IEEE Transactions on
Pattern Analysis and Machine Intelligence. [pdf]
- One paper
"Tri-Clustered Tensor Completion for Social-Aware Image Tag Refinement"
has been accepted for publication in IEEE Transactions on Pattern
Analysis and Machine Intelligence.
paper has been accepted by
IEEE Transactions on Pattern Analysis and
Machine Intelligence for classifying images of rarely seen
classes with the help of text labels. [pdf][code]
- Two research papers,
including an oral presentation "Hierarchically
Gated Deep Networks for Semantic Segmentation", have been
accepted for presentation at CVPR 2016, Las Vegas, Nevada. [pdf]
paper has been accepted for plenary presentation at SIGKDD 2016. A fast
detection method for brain disorder based on fMRI was presented. It is
one order of magnitude faster than state-of-the-art methods with even
- Dr. Qi is serving as an Area Chair for ACM Multimedia 2016.
- Dr. Qi will serve as a Senior Program Committee Member
for KDD 2016.
- International Conference on
MultiMedia Modeling will go to Miami FL, 4-6 January 2016 [link].
Dr. Qi will serve as program co-chair.
- CFP: Special Issue on "Big
Media Data: Understanding, Search, and Mining", in IEEE Transactions on
Big Data [pdf]
(deadline: July 1, 2015).
- CFP: "Deep Learning for
Multimedia Computing", in IEEE Transactions on Multimedia [pdf]
(The new deadline is April 20, 2015).
- Our full research paper
"Weakly-Shared Deep Transfer Networks for
Heterogeneous-Domain Knowledge Propagation" has been selected as one
of the four best paper
candidates to be presented at ACM MM 2015.
paper is accepted by ICCV
2015. We developed a novel deep LSTM
model for analyzing human actions, where we explore the differential
structure over memory states to study the dynamic saliency.
full research paper "Weakly-Shared Deep Transfer Networks for
Heterogeneous-Domain Knowledge Propagation" is accepted by ACM MM 2015.
We developed a
novel cross-modal label transfer deep network, showing competitive
performance on predicting image labels derived from the
alignment with text documents.
- Dr. Qi is serving as an Area Chair for ACM Multimedia 2015.
papers are accepted by KDD
2015. Congratulations to Vivek, Rohit, Shiyu and Wei!
In these papers, (1) we developed deep networks to reveal the
brain neural connectivity by aligning time-series
neuron fires that are marked by calcium influx; (2) we
invented a new paradigm
of dynamic model to select and predict sensors and their readings over
time, as compared with the conventional static strategy; and (3) we
heterogeneous networks to predict the cross-modal relevance between
paper "Temporal-Order Preserving Dynamic Quantization for Human Action
Recognition from Multimodal Sensor Streams" accepted by ICMR 2015.
UTKinect-Action dataset, our best approach has achieved 100% accuracy.
to Jun and Kai!
- One paper "Sparse Composite
Quantization" has been accepted by CVPR
2015. Congratulations to Ting!