Machine Learning and Pattern Recognition
Unsupervised and Semi-Supervised Deep Learning
Multimodal Deep Learning
Generative Adversarial Network Theory and Applications
Vision and Multimedia Computing
Visual Scene Understanding
Video Analysis and Understanding
Multimodal Analysis and Applications
Internet-of-Things (IoT) Information Processing and Analytics
Discovery of trustworthy sources/sensors
Multi-source information fusion
Heterogeneous sensor data analysis
- [Deep Network Self-Supervised Pretraining] Using self-supervised methods for
unsupervised, semi-supervised, and/or supervised (pre-)training of CNNs, GCNs, and GANs.
We developed two novel paradigms of self-supervised learning: a)
Auto-Encoding Transformations (AET),
which learns transformation-equivariant representations; and b)
Adversarial Contrast (AdCo), which directly self-trains negative pairs in a contrastive learning approach.
- 1) Unsupervised training of
AET and AETv2 [link],
- 2) Variational
AET and the connection to transformation-equivariant
representation learning [link][pdf][github],
- 3) (Semi-)Supervised
AET training with an ensemble of spatial and non-spatial transformations,
- 4) Unsupervised
training of Graph Convolutional Networks (GCNs) [pdf][github],
- 5) Transformation GANs (TrGANs):
using the AET loss to train the discriminator for better
generalization in generating new images [pdf].
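The AET idea above can be sketched minimally: the sampled transformation itself serves as a free supervisory signal, so no human annotation is needed. This toy uses 90-degree rotations as the transformation family (the AET papers use much richer parameterized and projective transformations); all function names are illustrative, not from the released code.

```python
import random

def rotate90(img, k):
    """Rotate a 2D list k times by 90 degrees counter-clockwise."""
    for _ in range(k % 4):
        img = [list(row) for row in zip(*img)][::-1]
    return img

def aet_batch(images, num_rots=4, seed=0):
    """Build an AET training batch of (x, t(x), t) triples: the
    transformation label k comes for free. An encoder-decoder pair
    would then be trained to regress k from the representations of
    x and t(x), encouraging transformation-equivariant features."""
    rng = random.Random(seed)
    batch = []
    for x in images:
        k = rng.randrange(num_rots)  # sampled transformation parameter
        batch.append((x, rotate90(x, k), k))
    return batch
```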
- 6) Adversarial Contrast (AdCo) [pdf][github]:
An adversarial contrastive learning method that directly trains
negative samples end-to-end. It reaches high performance
on ImageNet with
20% fewer epochs than the SOTA methods (e.g., MoCo v2 and BYOL),
while achieving even better top-1 accuracy. The model is easy to implement
and can be used as a plug-in algorithm in combination with many existing contrastive learning methods.
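A minimal sketch of the adversarial negative update in AdCo, assuming unit-normalized embeddings, a single query, and plain gradient ascent on the contrastive (InfoNCE) loss; the actual method updates a whole bank of negatives jointly alongside momentum-encoded queries. Function names and hyperparameters here are mine, not the paper's API.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def norm(v):
    n = math.sqrt(dot(v, v))
    return [x / n for x in v]

def info_nce(q, pos, negs, tau=0.1):
    """Contrastive loss for one query against a positive and a negative bank."""
    logits = [dot(q, pos) / tau] + [dot(q, n) / tau for n in negs]
    m = max(logits)
    z = sum(math.exp(l - m) for l in logits)
    return -(logits[0] - m - math.log(z))

def adco_update_negatives(q, pos, negs, tau=0.1, lr=0.5):
    """AdCo-style step: move each negative by gradient ASCENT on the
    contrastive loss (dL/dn_k = softmax_k / tau * q), then re-normalize,
    so the negatives become harder for the encoder."""
    logits = [dot(q, pos) / tau] + [dot(q, n) / tau for n in negs]
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    z = sum(exps)
    new_negs = []
    for k, n in enumerate(negs):
        p_k = exps[k + 1] / z
        new_negs.append(norm([ni + lr * (p_k / tau) * qi
                              for ni, qi in zip(n, q)]))
    return new_negs
```

After one update the negatives sit closer to the query, so the same query/positive pair incurs a strictly larger contrastive loss, which is exactly the adversarial objective.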
- [Regularized GANs] We
presented a regularized Loss-Sensitive GAN (LS-GAN)
and extended it to a generalized version (GLS-GAN) that contains
many variants of regularized GANs as its special cases.
We proved both the distributional consistency and the generalizability of
the LS-GAN, with a polynomial
sample complexity for generating new contents. See more details below:
- 1) LS-GAN and GLS-GAN [pdf][github],
- 2) A big-picture landscape of
regularized GANs [url],
- 3) An
extension that obtains an encoder of the
input samples directly, with
margins, through the loss-sensitive GAN [github: torch],
- 4) The LS-GAN has been
adopted by Microsoft CNTK (Cognitive Toolkit) as a reference
regularized GAN model [link].
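The loss-sensitive objective described above can be sketched numerically, under the convention that the critic L assigns lower loss to more realistic samples and Delta(x, G(z)) is a data-dependent margin (e.g., an L1 distance between the paired real and generated samples). This is a sketch under those assumptions; the function names are illustrative, not from the released code.

```python
def ls_gan_critic_loss(L_real, L_fake, margins, lam=1.0):
    """Loss-sensitive critic objective over a paired minibatch:
    each real sample should score lower (more real) than its paired
    fake by at least the margin Delta(x, G(z)); violations of the
    margin are penalized through a hinge."""
    assert len(L_real) == len(L_fake) == len(margins)
    n = len(L_real)
    base = sum(L_real) / n
    slack = sum(max(0.0, d + lr - lf)
                for d, lr, lf in zip(margins, L_real, L_fake)) / n
    return base + lam * slack

def ls_gan_generator_loss(L_fake):
    """The generator simply minimizes the critic's loss on its samples."""
    return sum(L_fake) / len(L_fake)
```

Because the hinge only fires when a fake scores within the margin of its paired real sample, the critic's capacity is focused on samples that are still poorly generated, which is the source of the Lipschitz-style regularization.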
- 5) Localized GAN models the manifold of images along their
tangent vector spaces. It captures and/or generates
local variants of input images so that their attributes can be edited
by manipulating the input noises. The local variants
along the tangents can also be used to approximate
the Laplace-Beltrami operator for semi-supervised learning.
- [Deep Learning for IoT and Multimodal Analysis] We
developed 1) State-Frequency Memory RNNs [pdf]
for multiple-frequency analysis of signals, 2) Spatial-Temporal
Transformers to integrate self-attentions over spatial topology and temporal
dynamics for traffic forecasting, and 3) First-Take-All
hashing to efficiently index and retrieve multimodal sensor signals at scale.
- 1) State-Frequency Memory (SFM)
RNNs for Multi-Source/Sensor Signal Analysis. The SFM RNN explores
multiple frequencies of
dynamic memory for time-series analysis.
The multi-frequency memory enables more accurate signal prediction than
the LSTM over various ranges of dynamic contexts. For example, in
financial analysis [pdf],
long-term investors use low-frequency information to
forecast asset prices, while
high-frequency traders rely more on high-frequency pricing signals to make trading decisions.
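The multi-frequency memory can be illustrated with a heavily simplified recurrence: one leaky complex accumulator per frequency band, i.e. a forgetful sliding DFT. The actual SFM cell gates the state and frequency dimensions jointly with learned gates; this sketch, with names of my own choosing, only shows why separate frequency bands let the memory track slow and fast dynamics at once.

```python
import cmath
import math

def sfm_memory(signal, freqs, forget=0.9):
    """Simplified state-frequency memory: for each frequency w, keep a
    leaky accumulator of the input modulated by exp(-i*w*t). A band
    whose frequency matches the signal accumulates coherently; the
    others average out and stay small."""
    mem = [0j] * len(freqs)
    for t, x in enumerate(signal):
        mem = [forget * m + x * cmath.exp(-1j * w * t)
               for m, w in zip(mem, freqs)]
    return [abs(m) for m in mem]
```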
- 2) Spatial-Temporal Transformers for Traffic
Forecasting. The spatial-temporal transformer [pdf]
is among the first works to apply self-attention to
graph neural networks, exploring both the network topology and the temporal
dynamics to forecast traffic flows from city-scale IoT data.
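The spatial half of the idea can be sketched as self-attention masked by the road-network graph: each node attends only over its neighbors and itself. This is a toy with identity projections (the real model also uses learned query/key/value projections, multiple heads, and a temporal attention component); names are illustrative.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def graph_self_attention(X, A, tau=1.0):
    """Self-attention restricted by the graph topology: node i mixes
    features only from nodes j with A[i][j] == 1, plus itself, with
    softmax weights from dot-product similarity."""
    out = []
    for i in range(len(X)):
        nbrs = [j for j in range(len(X)) if A[i][j] or j == i]
        scores = [dot(X[i], X[j]) / tau for j in nbrs]
        m = max(scores)
        ws = [math.exp(s - m) for s in scores]
        z = sum(ws)
        out.append([sum(w / z * X[j][d] for j, w in zip(nbrs, ws))
                    for d in range(len(X[0]))])
    return out
```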
- 3) First-Take-All Hashing and Retrieval.
First-Take-All (FTA) hashing was developed to
efficiently index dynamic activities captured by multimodal sensors
(cameras and depth sensors) [pdf]
for eldercare, as well as for image [pdf]
and cross-modal retrieval [pdf].
It has also been applied to
classify signals of brain neural activities for early prediction,
running one order of magnitude faster than
the SOTA methods on the
multi-facility dataset in a Kaggle Challenge.
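The FTA code can be sketched as a temporal order statistic: for each group of channels, the hash symbol records which channel reaches its peak first in time, which is invariant to per-channel scaling and offsets. In the actual method the channels are random projections of the multimodal features; the groups below are hypothetical and the names are mine.

```python
def fta_hash(sequence, channel_groups):
    """First-Take-All: for each group of channels, emit the channel
    whose maximum value occurs earliest in the sequence. Comparing
    only the temporal order of peaks makes the code robust to
    monotone rescaling of each channel."""
    code = []
    for group in channel_groups:
        best = None  # (time_of_peak, channel)
        for c in group:
            vals = [frame[c] for frame in sequence]
            t_peak = vals.index(max(vals))
            if best is None or t_peak < best[0]:
                best = (t_peak, c)
        code.append(best[1])
    return code
```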
- 4) Aligning Multi-Source/Device Signals.
We proposed Dynamically Programmable Layers to
automatically align signals from multiple sources/devices, and demonstrated
their application to predicting the brain connectivities between neurons [pdf].
- 5) E-Optimal Sensor Deployment and Selection.
We developed an optimal online sensor selection approach with the
restricted isometry property based on E-optimality [link]. It has been
successfully applied to collaborative spectrum sensing in cognitive
radio networks (CRNs) and to selecting the most informative features from
a large amount of data/signals. The paper will be featured in
IEEE Computer's "Spotlight
on Transactions" column.
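The E-optimality criterion itself can be illustrated with a toy offline greedy selection: pick sensor observation rows that maximize the smallest eigenvalue of the accumulated information matrix, so the worst-estimated direction is as well covered as possible. The paper's method is an online selection with RIP guarantees; this 2-D sketch, with names of my own, only shows the criterion.

```python
import math

def min_eig_2x2(M):
    """Smallest eigenvalue of a symmetric 2x2 matrix, in closed form."""
    a, b, d = M[0][0], M[0][1], M[1][1]
    tr, det = a + d, a * d - b * M[1][0]
    disc = math.sqrt(max(tr * tr / 4.0 - det, 0.0))
    return tr / 2.0 - disc

def e_optimal_greedy(rows, k):
    """Greedily choose k sensor rows a_i maximizing the smallest
    eigenvalue of the information matrix sum_i a_i a_i^T
    (the E-optimality criterion), in a toy 2-D setting."""
    chosen, M = [], [[0.0, 0.0], [0.0, 0.0]]
    for _ in range(k):
        best = None
        for i, a in enumerate(rows):
            if i in chosen:
                continue
            M2 = [[M[r][c] + a[r] * a[c] for c in range(2)]
                  for r in range(2)]
            score = min_eig_2x2(M2)
            if best is None or score > best[0]:
                best = (score, i, M2)
        chosen.append(best[1])
        M = best[2]
    return chosen
```

Note how the greedy step refuses a redundant sensor: after measuring one direction, a second copy of the same row adds nothing to the minimum eigenvalue, while an orthogonal row does.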
- [Small Data Challenges]
Take a look at our survey, "Small
Data Challenges in Big Data Era: A
Survey of Recent Progress on Unsupervised and Semi-Supervised Methods"
and our tutorial presented at IJCAI 2019 [link]
with the presentation slides [pdf].
Also see our recent works on:
- 1) Unsupervised Learning.
AutoEncoding Transformations (AET) [pdf],
Autoencoding Variational Transformations (AVT) [pdf],
GraphTER (Graph Transformation Equivariant Representations) [pdf], TrGANs
(Transformation GANs) [pdf],
- 2) Semi-Supervised Learning.
Localized GANs (see how to compute Laplace-Beltrami operator directly
for semi-supervised learning) [pdf],
Ensemble AET [pdf],
- 3) Few-Shot Learning.
FLAT (Few-Shot Learning via AET) [pdf],
knowledge transfer for few-shot learning [pdf],
task-agnostic meta-learning [pdf]
- [MAPLE Github] We
are releasing the source code of our research projects
at our MAPLE GitHub homepage [url].
We invite everyone interested in our works to try them out.
Feedback and pull requests are warmly welcome.