iccv22

ICCV 2021 paper list

IEEE/CVF International Conference on Computer Vision Workshops, ICCVW 2021, Montreal, BC, Canada, October 11-17, 2021.

Multi-Dimensional Scaling on Groups.
A Manifold Learning based Video Prediction approach for Deep Motion Transfer.
Sheaves as a Framework for Understanding and Interpreting Model Fit.
Grassmannian Dimensionality Reduction for Optimized Universal Manifold Embedding Representation of 3D Point Clouds.
Dual Transformation and Manifold Distances Voting for Outlier Rejection in Point Cloud Registration.
A unified framework for non-negative matrix and tensor factorisations with a smoothed Wasserstein loss.
The Flag Manifold as a Tool for Analyzing and Comparing Sets of Data Sets.
Two-parameter Persistence for Images via Distance Transform.
FedAffect: Few-shot federated learning for facial expression recognition.
SVEA: A Small-scale Benchmark for Validating the Usability of Post-hoc Explainable AI Solutions in Image and Signal Recognition.
Transformer Meets Part Model: Adaptive Part Division for Person Re-Identification.
On the Importance of Encrypting Deep Features.
Attention Aware Debiasing for Unbiased Model Prediction.
Rethinking Common Assumptions to Mitigate Racial Bias in Face Recognition Datasets.
Multi-Perspective Features Learning for Face Anti-Spoofing.
End-to-end Model-based Gait Recognition using Synchronized Multi-view Pose Constraint.
Formula-driven Supervised Learning with Recursive Tiling Patterns.
Sparse Feature Representation Learning for Deep Face Gender Transfer.
Student-Teacher Oneness: A Storage-efficient approach that improves facial expression recognition.
CryoPoseNet: End-to-End Simultaneous Learning of Single-particle Orientation and 3D Map Reconstruction from Cryo-electron Microscopy Data.
Thermal Image Processing via Physics-Inspired Deep Networks.
SS-JIRCS: Self-Supervised Joint Image Reconstruction and Coil Sensitivity Calibration in Parallel MRI without Ground Truth.
Compressed Classification from Learned Measurements.
Joint Reconstruction and Calibration Using Regularization by Denoising with Application to Computed Tomography.
What Does Your Computational Imaging Algorithm Not Know?: A Plug-and-Play Model Quantifying Model Uncertainty.
K-space refinement in deep learning MR reconstruction via regularizing scan specific SPIRiT-based self consistency.
How to cheat with metrics in single-image HDR reconstruction.
Fast Unsupervised MRI Reconstruction Without Fully-Sampled Ground Truth Data Using Generative Adversarial Networks.
Photon-Limited Object Detection using Non-local Feature Matching and Knowledge Distillation.
BoMuDANet: Unsupervised Adaptation for Visual Scene Understanding in Unstructured Driving Environments.
Cross-modal Relational Reasoning Network for Visual Question Answering.
Aerial Cross-platform Path Planning Dataset.
Learning-Based Shadow Detection in Aerial Imagery Using Automatic Training Supervision from 3D Point Clouds.
Point Cloud Object Segmentation Using Multi Elevation-Layer 2D Bounding-Boxes.
An Algorithmic Approach to Quantifying GPS Trajectory Error.
JanusNet: Detection of Moving Objects from UAV Platforms.
Simulated Photorealistic Deep Learning Framework and Workflows to Accelerate Computer Vision and Unmanned Aerial Vehicle Research.
Appearance and Motion Based Persistent Multiple Object Tracking in Wide Area Motion Imagery.
Robust Multi-Object Tracking Using Re-Identification Features and Graph Convolutional Networks.
From VIS To OVIS: A Technical Report To Promote The Development Of The Field.
A Single-Stage, Bottom-up Approach for Occluded VIS using Spatio-temporal Embeddings.
Limited Sampling Reference Frame for MaskTrack R-CNN.
Occluded Video Instance Segmentation with Set Prediction Approach.
Characterizing Scattered Occlusions for Effective Dense-Mode Crowd Counting.
Pedestrian Occlusion Level Classification using Keypoint Detection and 2D Body Surface Area Estimation.
The Aircraft Context Dataset: Understanding and Optimizing Data Variability in Aerial Domains.
Leveraging Temporal Information for 3D Trajectory Estimation of Space Objects.
Bridging the gap between debiasing and privacy for deep learning.
Toward Affective XAI: Facial Affect Analysis for Understanding Explainable Human-AI Interactions.
Unravelling the Effect of Image Distortions for Biased Prediction of Pre-trained Face Recognition Models.
Towards Solving the DeepFake Problem: An Analysis on Improving DeepFake Detection using Dynamic Face Augmentation.
XAI Handbook: Towards a Unified Framework for Explainable AI.
The Watchlist Imbalance Effect in Biometric Face Identification: Comparing Theoretical Estimates and Empiric Measurements.
Simple baselines can fool 360° saliency metrics.
The Marine Debris Dataset for Forward-Looking Sonar Semantic Segmentation.
In-Situ Joint Light and Medium Estimation for Underwater Color Restoration.
The VAROS Synthetic Underwater Data Set: Towards realistic multi-sensor underwater data with ground truth.
Underwater marker-based pose-estimation with associated uncertainty.
Hyperspectral 3D Mapping of Underwater Environments.
A New Deep Learning Engine for CoralNet.
Super-resolution for in situ Plankton Images.
Improving Rare-Class Recognition of Marine Plankton with Hard Negative Mining.
Anomaly Detection for In situ Marine Plankton Images.
Analysing Affective Behavior in the second ABAW2 Competition.
An audiovisual and contextual approach for categorical and continuous emotion recognition in-the-wild.
Multitask Multi-database Emotion Recognition.
Student Engagement Dataset.
Public Life in Public Space (PLPS): A multi-task, multi-group video dataset for public life research.
Emotion Recognition Based on Body and Context Fusion in the Wild.
A Multi-task Mean Teacher for Semi-supervised Facial Affective Behavior Analysis.
MTMSN: Multi-Task and Multi-Modal Sequence Network for Facial Action Unit and Expression Recognition.
Emotion Recognition With Sequential Multi-task Learning Technique.
Noisy Annotations Robust Consensual Collaborative Affect Expression Recognition.
Evaluating the Performance of Ensemble Methods and Voting Strategies for Dense 2D Pedestrian Detection in the Wild.
Continuous Emotion Recognition with Audio-visual Leader-follower Attentive Fusion.
Iterative Distillation for Better Uncertainty Estimates in Multitask Emotion Recognition.
Causal affect prediction model using a past facial image sequence.
Prior Aided Streaming Network for Multi-task Affective Analysis.
FSER: Deep Convolutional Neural Networks for Speech Emotion Recognition.
Single-stage Face Detection under Extremely Low-light Conditions.
Multiple GAN Inversion for Exemplar-based Image-to-Image Translation.
Blocks World Revisited: The Effect of Self-Occlusion on Classification by Convolutional Neural Networks.
LLVIP: A Visible-infrared Paired Dataset for Low-light Vision.
UAC: An Uncertainty-Aware Face Clustering Algorithm.
Temporal Kernel Consistency for Blind Video Super-Resolution.
1000 Pupil Segmentations in a Second using Haar Like Features and Statistical Learning.
Egocentric Indoor Localization from Room Layouts and Image Outer Corners.
Seeing the Unseen: Predicting the First-Person Camera Wearer's Location and Pose in Third-Person Scenes.
SlowFast Rolling-Unrolling LSTMs for Action Anticipation in Egocentric Videos.
MAAD: A Model and Dataset for "Attended Awareness" in Driving.
Studying the Effects of Self-Attention for Medical Image Analysis.
MedSkip: Medical Report Generation Using Skip Connections and Integrated Attention.
Segmentation for Classification of Screening Pancreatic Neuroendocrine Tumors.
Style Transfer based Coronary Artery Segmentation in X-ray Angiogram.
Generalizing Few-Shot Classification of Whole-Genome Doubling Across Cancer Types.
Unsupervised 3D Shape Coverage Estimation with Applications to Colonoscopy.
DMNet: Dual-Stream Marker Guided Deep Network for Dense Cell Segmentation and Lineage Tracking.
EfficientARL: improving skin cancer diagnoses by combining lightweight attention on EfficientNet.
Medical Image Classification Using Generalized Zero Shot Learning.
BERTHop: An Effective Vision-and-Language Model for Chest X-ray Disease Diagnosis.
SOoD: Self-Supervised Out-of-Distribution Detection Under Domain Shift for Multi-Class Colorectal Cancer Tissue Types.
Learning to Automatically Diagnose Multiple Diseases in Pediatric Chest Radiographs Using Deep Convolutional Neural Networks.
Graph Cuts Loss to Boost Model Accuracy and Generalizability for Medical Image Segmentation.
End-to-End Learning of Fused Image and Non-Image Features for Improved Breast Cancer Classification from MRI.
Multi-scanner Harmonization of Paired Neuroimaging Data via Structure Preserving Embedding Learning.
Deep Frequency Re-calibration U-Net for Medical Image Segmentation.
Improving Tuberculosis (TB) Prediction using Synthetically Generated Computed Tomography (CT) Images.
Uncertainty-aware GAN with Adaptive Loss for Robust MRI Image Enhancement.
A Dual Adversarial Calibration Framework for Automatic Fetal Brain Biometry.
VTGAN: Semi-supervised Retinal Image Synthesis and Disease Prediction using Vision Transformers.
VLG-Net: Video-Language Graph Matching Network for Video Grounding.
Learning Where to Cut from Edited Videos.
Plots to Previews: Towards Automatic Movie Preview Retrieval using Publicly Available Meta-data.
Video Contrastive Learning with Global Context.
Face, Body, Voice: Video Person-Clustering with Multiple Modalities.
TSP: Temporally-Sensitive Pretraining of Video Encoders for Localization Tasks.
Video Transformer Network.
Language-guided Multi-Modal Fusion for Video Action Recognition.
Visual Question Answering with Textual Representations for Images.
What You Say Is Not What You Do: Studying Visio-Linguistic Models for TV Series Summarization.
Latent Variable Models for Visual Question Answering.
Semi-Autoregressive Transformer for Image Captioning.
CIGLI: Conditional Image Generation from Language & Image.
Egocentric Biochemical Video-and-Language Dataset.
Multi-Stage Fusion for Multi-Class 3D Lidar Detection.
Cross-modal Matching CNN for Autonomous Driving Sensor Data Monitoring.
Visual Reasoning using Graph Convolutional Networks for Predicting Pedestrian Crossing Intention.
Autonomous Vehicle Vision 2021: ICCV Workshop Summary.
Efficient Uncertainty Estimation in Semantic Segmentation via Distillation.
Synthetic Data Generation using Imitation Training.
Few-Shot Batch Incremental Road Object Detection via Detector Fusion.
Graph Convolutional Networks for 3D Object Detection on Radar Data.
SS-SFDA: Self-Supervised Source-Free Domain Adaptation for Road Segmentation in Hazardous Environments.
Semantics-aware Multi-modal Domain Translation: From LiDAR Point Clouds to Panoramic Color Images.
SA-Det3D: Self-Attention Based Context-Aware 3D Object Detection.
DVTracker: Real-Time Multi-Sensor Association and Tracking for Self-Driving Vehicles.
SCARF: A Semantic Constrained Attention Refinement Network for Semantic Segmentation.
It's All Around You: Range-Guided Cylindrical Network for 3D Object Detection.
CenterPoly: real-time instance segmentation using bounding polygons.
Causal BERT: Improving object detection by searching for challenging groups.
CDAda: A Curriculum Domain Adaptation for Nighttime Semantic Segmentation.
RaidaR: A Rich Annotated Image Dataset of Rainy Street Scenes.
A Computer Vision-Based Attention Generator using DQN.
Occupancy Grid Mapping with Cognitive Plausibility for Autonomous Driving Applications.
Frustum-PointPillars: A Multi-Stage Approach for 3D Object Detection using RGB Camera and LiDAR.
YOLinO: Generic Single Shot Polyline Detection in Real Time.
Multi-weather city: Adverse weather stacking for autonomous driving.
Speak2Label: Using Domain Knowledge for Creating a Large Scale Driver Gaze Zone Estimation Dataset.
Weakly Supervised Approach for Joint Object and Lane Marking Detection.
On the Road to Large-Scale 3D Monocular Scene Reconstruction using Deep Implicit Functions.
DriPE: A Dataset for Human Pose Estimation in Real-World Driving Settings.
Monocular 3D Localization of Vehicles in Road Scenes.
VisDrone-DET2021: The Vision Meets Drone Object detection Challenge Results.
VisDrone-MOT2021: The Vision Meets Drone Multiple Object Tracking Challenge Results.
VisDrone-CC2021: The Vision Meets Drone Crowd Counting Challenge Results.
VistrongerDet: Stronger Visual Information for Object Detection in VisDrone Images.
GIAOTracker: A comprehensive framework for MCMOT with global information and optimizing strategies in VisDrone 2021.
ViT-YOLO: Transformer-Based YOLO for Object Detection.
Coarse-grained Density Map Guided Object Detection in Aerial Images.
TPH-YOLOv5: Improved YOLOv5 Based on Transformer Prediction Head for Object Detection on Drone-captured Scenarios.
Tackling the Background Bias in Sparse Object Detection via Cropped Windows.
The First Vision For Vitals (V4V) Challenge for Non-Contact Video-Based Physiological Estimation.
Automatic region-based heart rate measurement using remote photoplethysmography.
LCOMS Lab's approach to the Vision For Vitals (V4V) Challenge.
Estimating Heart Rate from Unlabelled Video.
Beat-to-Beat Cardiac Pulse Rate Measurement From Video.
The Ninth Visual Object Tracking VOT2021 Challenge Results.

Matej Kristan Jirí Matas Ales Leonardis Michael Felsberg Roman P. Pflugfelder Joni-Kristian Kämäräinen Hyung Jin Chang Martin Danelljan Luka Cehovin Zajc Alan Lukezic Ondrej Drbohlav Jani Käpylä Gustav Häger Song Yan Jinyu Yang Zhongqun Zhang Gustavo Fernández Mohamed H. Abdelpakey Goutam Bhat Llukman Cerkezi Hakan Cevikalp Shengyong Chen Xin Chen Miao Cheng Ziyi Cheng Yu-Chen Chiu Ozgun Cirakman Yutao Cui Kenan Dai Mohana Murali Dasari Qili Deng Xingping Dong Daniel K. Du Matteo Dunnhofer Zhen-Hua Feng Zhiyong Feng Zhihong Fu Shiming Ge Rama Krishna Gorthi Yuzhang Gu Bilge Gunsel Qing Guo Filiz Gurkan Wencheng Han Yanyan Huang Felix Järemo Lawin Shang-Jhih Jhang Rongrong Ji Cheng Jiang Yingjie Jiang Felix Juefei-Xu J. Yin Xiao Ke Fahad Shahbaz Khan Byeong Hak Kim Josef Kittler Xiangyuan Lan Jun Ha Lee Bastian Leibe Hui Li Jianhua Li Xianxian Li Yuezhou Li Bo Liu Chang Liu Jingen Liu Li Liu Qingjie Liu Huchuan Lu Wei Lu Jonathon Luiten Jie Ma Ziang Ma Niki Martinel Christoph Mayer Alireza Memarmoghadam Christian Micheloni Yuzhen Niu Danda Pani Paudel Houwen Peng Shoumeng Qiu Aravindh Rajiv Muhammad Rana Andreas Robinson Hasan Saribas Ling Shao Mohamed Shehata Furao Shen Jianbing Shen Kristian Simonato Xiaoning Song Zhangyong Tang Radu Timofte Philip H. S. Torr Chi-Yi Tsai Bedirhan Uzun Luc Van Gool Paul Voigtlaender Dong Wang Guangting Wang Liangliang Wang Lijun Wang Limin Wang Linyuan Wang Yong Wang Yunhong Wang Chenyan Wu Gangshan Wu Xiaojun Wu Fei Xie Tianyang Xu Xiang Xu Wanli Xue Bin Yan Wankou Yang Xiaoyun Yang Yu Ye Jun Yin Chengwei Zhang Chunhui Zhang Haitao Zhang Kaihua Zhang Kangkai Zhang Xiaohan Zhang Xiaolin Zhang Xinyu Zhang Zhibin Zhang Shao-Chuan Zhao Ming Zhen Bineng Zhong Jiawen Zhu Xuefeng Zhu

Is First Person Vision Challenging for Object Tracking?
Learning Tracking Representations via Dual-Branch Fully Transformer Networks.
Learning Spatio-Appearance Memory Network for High-Performance Visual Tracking.
A Unified Efficient Pyramid Transformer for Semantic Segmentation.
LiteEdge: Lightweight Semantic Edge Detection Network.
Semantic Segmentation With Multi Scale Spatial Attention For Self Driving Cars.
SignPose: Sign Language Animation Through 3D Pose Lifting.
Localize, Group, and Select: Boosting Text-VQA by Scene Text Modeling.
InstancePose: Fast 6DoF Pose Estimation for Multiple Objects from a Single RGB Image.
Cloth mechanical parameter estimation and simulation for optimized robotic manipulation.
3D Semantic Label Transfer in Human-Robot Collaboration.
An Anomaly Detection System via Moving Surveillance Robots with Human Collaboration.
Multi-modal Variational Faster R-CNN for Improved Visual Object Detection in Manufacturing.
Markerless Visual Tracking of a Container Crane Spreader.
Absolute and Relative Pose Estimation in Refractive Multi View.
DC-VINS: Dynamic Camera Visual Inertial Navigation System with Online Calibration.
A closed form solution for viewing graph construction in uncalibrated vision.
Towards realistic symmetry-based completion of previously unseen point clouds.
Adapting Deep Neural Networks for Pedestrian-Detection to Low-Light Conditions without Re-training.
CAFT: Class Aware Frequency Transform for Reducing Domain Gap.
Building 3D Morphable Models from a Single Scan.
Effect of Parameter Optimization on Classical and Learning-based Image Matching Methods.
Object Detection in Cluttered Environments with Sparse Keypoint Selection.
Robust Face Frontalization For Visual Speech Recognition.
Finite Aperture Stereo: 3D Reconstruction of Macro-Scale Scenes.
A Technical Survey and Evaluation of Traditional Point Cloud Clustering Methods for LiDAR Panoptic Segmentation.
A Robust End-to-end Method for Parametric Curve Tracing via Soft Cosine-similarity-based Objective Function.
SketchBird: Learning to Generate Bird Sketches from Text.
Supporting Reference Imagery for Digital Drawing.
Scene Designer: a Unified Model for Scene Search and Synthesis from Sketch.
SketchyDepth: from Scene Sketches to RGB-D Images.
The 2nd Challenge on Remote Physiological Signal Sensing (RePSS).
Time Lab's approach to the Challenge on Computer Vision for Remote Physiological Measurement.
Weakly Supervised rPPG Estimation for Respiratory Rate Estimation.
MANet: a Motion-Driven Attention Network for Detecting the Pulse from a Facial Video with Drastic Motions.
An End-to-end Efficient Framework for Remote Physiological Signal Sensing.
Addressing Target Shift in Zero-shot Learning using Grouped Adversarial Learning.
Analyzing and Mitigating JPEG Compression Defects in Deep Learning.
MGGAN: Solving Mode Collapse Using Manifold-Guided Training.
Online Continual Learning For Visual Food Classification.
Fine-Grain Prediction of Strawberry Freshness using Subsurface Scattering.
Instance Search via Fusing Hierarchical Multi-level Retrieval and Human-object Interaction Detection.
What Matters for Ad-hoc Video Search? A Large-scale Evaluation on TRECVID.
Hard-Negatives or Non-Negatives? A Hard-Negative Selection Strategy for Cross-Modal Retrieval Using the Improved Marginal Ranking Loss.
Multi-Input Fusion for Practical Pedestrian Intention Prediction.
Learning Decoupled Representations for Human Pose Forecasting.
STIRNet: A Spatial-temporal Interaction-aware Recursive Network for Human Trajectory Prediction.
Pose Transformers (POTR): Human Motion Prediction with Non-Autoregressive Transformers.
SCAT: Stride Consistency with Auto-regressive regressor and Transformer for hand pose estimation.
Simple Baseline for Single Human Motion Forecasting.
Audio-Visual Transformer Based Crowd Counting.
UniNet: A Unified Scene Understanding Network and Exploring Multi-Task Relationships through the Lens of Adversarial Attacks.
ConvNets vs. Transformers: Whose Visual Representations are More Transferable?
MILA: Multi-Task Learning from Videos via Efficient Inter-Frame Attention.
In Defense of the Learning Without Forgetting for Task Incremental Learning.
Multi-Modal RGB-D Scene Recognition Across Domains.
Concurrent Discrimination and Alignment for Self-Supervised Feature Learning.
Dyadformer: A Multi-modal Transformer for Long-Range Modeling of Dyadic Interactions.
Emotional Features of Interactions with Empathic Agents.
Multiple Instance Triplet Loss for Weakly Supervised Multi-Label Action Localisation of Interacting Persons.
Temporal Cues from Socially Unacceptable Trajectories for Anomaly Detection.
SkeletonNetV2: A Dense Channel Attention Blocks for Skeleton Extraction.
Distance and Edge Transform for Skeleton Extraction.
DISCO - U-Net based Autoencoder Architecture with Dual Input Streams for Skeleton Image Drawing.
PatchAugment: Local Neighborhood Augmentation in Point Cloud Classification.
3D Shapes Local Geometry Codes Learning with SDF.
U-Net based skeletonization and bag of tricks.
Towards Efficient Point Cloud Graph Neural Networks Through Architectural Simplification.
Evaluation of Latent Space Learning with Procedurally-Generated Datasets of Shapes.
Investigating transformers in the decomposition of polygonal shapes as point collections.
Learning Laplacians in Chebyshev Graph Convolutional Networks.
SPACE: A Simulator for Physical Interactions and Causal Learning in 3D Environments.
ABD-Net: Attention Based Decomposition Network for 3D Point Cloud Decomposition.
MRGAN: Multi-Rooted 3D Shape Representation Learning with Unsupervised Part Disentanglement.
3D Scene Angles using UL Decomposition of Planar Homography.
A System for Fusing Color and Near-Infrared Images in Radiance Domain.
Underwater Image Color Correction Using Ensemble Colorization Network.
Graph2Pix: A Graph-Based Image to Image Translation Framework.
Sparse to Dense Motion Transfer for Face Image Animation.
Simple and Efficient Unpaired Real-world Super-Resolution using Image Statistics.
DeepFake MNIST+: A DeepFake Facial Animation Dataset.
Improving Key Human Features for Pose Transfer.
Saliency-Guided Transformer Network combined with Local Embedding for No-Reference Image Quality Assessment.
Efficient Wavelet Boost Learning-Based Multi-stage Progressive Refinement Network for Underwater Image Enhancement.
Contrastive Feature Loss for Image Prediction.
SMILE: Semantically-guided Multi-attribute Image and Layout Editing.
Manipulating Image Style Transformation via Latent-Space SVM.
Real-ESRGAN: Training Real-World Blind Super-Resolution with Pure Synthetic Data.
SDWNet: A Straight Dilated Network with Wavelet Transformation for image Deblurring.
Distilling Reflection Dynamics for Single-Image Reflection Removal.
Reducing Noise Pixels and Metric Bias in Semantic Inpainting on Segmentation Map.
Stochastic Image Denoising by Sampling from the Posterior Distribution.
Generalized Real-World Super-Resolution through Adversarial Robustness.
Test-Time Adaptation for Super-Resolution: You Only Need to Overfit on a Few More Images.
SwinIR: Image Restoration Using Swin Transformer.
Rethinking Content and Style: Exploring Bias for Unsupervised Disentanglement.
Unsupervised Generative Adversarial Networks with Cross-model Weight Transfer Mechanism for Image-to-image Translation.
High Perceptual Quality Image Denoising with a Posterior Sampling CGAN.
ORB-SLAM with Near-infrared images and Optical Flow data.
ToFNest: Efficient normal estimation for time-of-flight depth cameras.
HIDA: Towards Holistic Indoor Understanding for the Visually Impaired via Semantic Instance Segmentation with a Wearable Solid-State LiDAR Sensor.
Deep Embeddings-based Place Recognition Robust to Motion Blur.
Trans4Trans: Efficient Transformer for Transparent Object Segmentation to Help Visually Impaired People Navigate in the Real World.
FrankMocap: A Monocular 3D Whole-Body Pose Estimation System via Regression and Integration.
Optical Braille Recognition Using Object Detection Neural Network.
Exploiting Egocentric Vision on Shopping Cart for Out-Of-Stock Detection in Retail Environments.
Efficient Search in a Panoramic Image Database for Long-term Visual Localization.
Audi-Exchange: AI-Guided Hand-based Actions to Assist Human-Human Interactions for the Blind and the Visually Impaired.
Virtual Touch: Computer Vision Augmented Touch-Free Scene Exploration for the Blind or Visually Impaired.
InAugment: Improving Classifiers via Internal Augmentation.
All you need are a few pixels: semantic segmentation with PixelPick.
Weakly-Supervised Semantic Segmentation by Learning Label Uncertainty.
Bounding Box Dataset Augmentation for Long-range Object Distance Estimation.
Object-Based Augmentation for Building Semantic Segmentation: Ventura and Santa Rosa Case Study.
Interactive Labeling for Human Pose Estimation in Surveillance Videos.
Self-improving classification performance through GAN distillation.
Reducing Label Effort: Self-Supervised meets Active Learning.
Class-Agnostic Segmentation Loss and Its Application to Salient Object Detection and Segmentation.
Learning to Localise and Count with Incomplete Dot-annotations.
Localizing Human Keypoints beyond the Bounding Box.
Meta Self-Learning for Multi-Source Domain Adaptation: A Benchmark.
Using Synthetic Data Generation to Probe Multi-View Stereo Networks.
Multi-Domain Conditional Image Translation: Translating Driving Datasets from Clear-Weather to Adverse Conditions.
Data Augmentation for Scene Text Recognition.
EdgeFlow: Achieving Practical Interactive Segmentation with Edge-Guided Flow.
Nuisance-Label Supervision: Robustness Improvement by Free Labels.
Boosting Fairness for Masked Face Recognition.
Revisting Quantization Error in Face Alignment.
An Efficient Network Design for Face Video Super-resolution.
Explainable Face Recognition based on Accurate Facial Compositions.
Balanced Masked and Standard Face Recognition.
Towards Mask-robust Face Recognition.
Masked Face Recognition Datasets and Validation.
Rectifying the Data Bias in Knowledge Distillation.
ResSaNet: A Hybrid Backbone of Residual Block and Self-Attention Module for Masked Face Recognition.
Improving Representation Consistency with Pairwise Loss for Masked Face Recognition.
Mask Aware Network for Masked Face Recognition in the Wild.
MaskOut: A Data Augmentation Method for Masked Face Recognition.
Partial FC: Training 10 Million Identities on a Single Machine.
Masked Face Recognition Challenge: The InsightFace Track Report.
SSR: Semi-supervised Soft Rasterizer for single-view 2D to 3D Reconstruction.
DeepDraper: Fast and Accurate 3D Garment Draping over a 3D Human Body.
What Does TERRA-REF's High Resolution, Multi Sensor Plant Sensing Public Domain Data Offer the Computer Vision Community?
Multi-Domain Few-Shot Learning and Dataset for Agricultural Applications.
Field-Based Plot Extraction Using UAV RGB Images.
Leaf Area Estimation by Semantic Segmentation of Point Cloud of Tomato Plants.
Predicting Protein Content in Grain Using Hyperspectral Deep Learning.
Visualizing Feature Maps for Model Selection in Convolutional Neural Networks.
Classification and Visualization of Genotype × Phenotype Interactions in Biomass Sorghum.
A Semi-self-supervised Learning Approach for Wheat Head Detection using Extremely Small Number of Labeled Samples.
WheatNet-Lite: A Novel Light Weight Network for Wheat Head Detection.
Identification and Measurement of Individual Roots in Minirhizotron Images of Dense Root Systems.
From RGB to NIR: Predicting of near infrared reflectance from visible spectrum aerial images of crops.
Machine learning meets distinctness in variety testing.
Analysis of Arabidopsis Root Images - Studies on CNNs and Skeleton-Based Root Topology.
Semi-supervised dry herbage mass estimation using automatic data and synthetic images.
Dynamic Color Transform for Wheat Head Detection.
Enlisting 3D Crop Models and GANs for More Data Efficient and Generalizable Fruit Detection.
Tip-burn stress detection of lettuce canopy grown in Plant Factories.
LeafMask: Towards Greater Accuracy on Leaf Segmentation.
A Real-time Anti-distractor Infrared UAV Tracker with Channel Feature Refinement Module.
Semi-Automatic Annotation For Visual Object Tracking.
Unmanned Aerial Vehicle Visual Detection and Tracking using Deep Neural Networks: A Performance Benchmark.
A Unified Approach for Tracking UAVs in Infrared.
SiamSTA: Spatio-Temporal Attention based Siamese Tracker for Tracking UAVs.
Generative Models for Multi-Illumination Color Constancy.
HyperMixNet: Hyperspectral Image Reconstruction with Deep Mixed Network from a Snapshot Measurement.
Deep Single Fisheye Image Camera Calibration for Over 180-degree Projection of Field of View.
Efficient light transport acquisition by coded illumination and robust photometric stereo by dual photography using deep neural network.
DeLiEve-Net: Deblurring Low-light Images with Light Streaks and Local Events.
Enforcing Temporal Consistency in Video Depth Estimation.
Precise Forecasting of Sky Images Using Spatial Warping.
Multi-Level Adaptive Separable Convolution for Large-Motion Video Frame Interpolation.
Weakly-supervised Semantic Segmentation in Cityscape via Hyperspectral Image.
Deep Manifold Prior.
ScatSimCLR: self-supervised contrastive learning with pretext task regularization for small-scale datasets.
How to Transform Kernels for Scale-Convolutions.
Predictive Coding with Topographic Variational Autoencoders.
Relational Prior for Multi-Object Tracking.
Tune It or Don't Use It: Benchmarking Data-Efficient Image Classification.
Few-shot Learning with Online Self-Distillation.
Self-supervised Visual Attribute Learning for Fashion Compatibility.
Multimodal Continuous Visual Attention Mechanisms.
LSD-C: Linearly Separable Deep Clusters.
MEAL: Manifold Embedding-based Active Learning.
Description of Corner Cases in Automated Driving: Goals and Challenges.
Deployment of Deep Neural Networks for Object Detection on Edge AI Devices with Runtime Optimization.
Semantic Concept Testing in Autonomous Driving by Extraction of Object-Level Annotations from CARLA.
MultiTask-CenterNet (MCN): Efficient and Diverse Multitask Learning using an Anchor Free Approach.
Instance Segmentation in CARLA: Methodology and Analysis for Pedestrian-oriented Synthetic Data Generation in Crowded Scenes.
About the Ambiguity of Data Augmentation for 3D Object Detection in Autonomous Driving.
ProAI: An Efficient Embedded AI Hardware for Automotive Applications - a Benchmark Study.
perf4sight: A toolflow to model CNN training performance on Edge GPUs.
Visual Domain Adaptation for Monocular Depth Estimation on Resource-Constrained Hardware.
Boosting Instance Segmentation with Synthetic Data: A study to overcome the limits of real world data sets.
Bridging the Reality Gap for Pose Estimation Networks using Sensor-Based Domain Randomization.
MonoCInIS: Camera Independent Monocular 3D Object Detection using Instance Segmentation.
FCOS3D: Fully Convolutional One-Stage Monocular 3D Object Detection.
Parameterized Pseudo-Differential Operators for Graph Convolutional Neural Networks.
Unsupervised Learning of Geometric Sampling Invariant Representations for 3D Point Clouds.
Zero-Shot Learning via Contrastive Learning on Dual Knowledge Graphs.
Moving Object Detection for Event-based Vision using Graph Spectral Clustering.
Border-SegGCN: Improving Semantic Segmentation by Refining the Border Outline using Graph Convolutional Network.
Skeleton Graph Scattering Networks for 3D Skeleton-based Human Motion Prediction.
3D mask presentation attack detection via high resolution face parts.
Single Patch Based 3D High-Fidelity Mask Face Anti-Spoofing.
A Dual-stream Framework for 3D Mask Face Presentation Attack Detection.
On Improving Temporal Consistency for Online Face Liveness Detection System.
3D High-Fidelity Mask Face Presentation Attack Detection Challenge.
The Multi-Modal Video Reasoning and Analyzing Competition.
Post-training deep neural network pruning via layer-wise calibration.
FOX-NAS: Fast, On-device and Explainable Neural Architecture Search.
Exploring the power of lightweight YOLOv4.
Knowledge Distillation for Low-Power Object Detection: A Simple Technique and Its Extensions for Training Compact Models Using Unlabeled Data.
LUAI Challenge 2021 on Learning to Understand Aerial Images.
Progressive Unsupervised Deep Transfer Learning for Forest Mapping in Satellite Image.
Get better 1 pixel PCK: ladder scales correspondence flow networks for remote sensing image matching in higher resolution.
Self-Supervised Pretraining and Controlled Augmentation Improve Rare Wildlife Recognition in UAV Images.
Double Head Predictor based Few-Shot Object Detection for Aerial Imagery.
Convolutional Neural Networks Based Remote Sensing Scene Classification under Clear and Cloudy Environments.
A Framework for Semi-automatic Collection of Temporal Satellite Imagery for Analysis of Dynamic Regions.
Real-Time Cell Counting in Unlabeled Microscopy Images.
Lizard: A Large-Scale Dataset for Colonic Nuclear Instance Segmentation and Classification.
Robust Interactive Semantic Segmentation of Pathology Images with Minimal User Input.
ALBRT: Cellular Composition Prediction in Routine Histology Images.
Deep Ordinal Focus Assessment for Whole Slide Images.
A QuadTree Image Representation for Computational Pathology.
Self-Supervised Representation Learning using Visual Field Expansion on Digital Pathology.
A Pathology Deep Learning System Capable of Triage of Melanoma Specimens Utilizing Dermatopathologist Consensus as Ground Truth.
Multi-Prototype Few-shot Learning in Histopathology.
An investigation of attention mechanisms in histopathology whole-slide-image analysis for regression objectives.
H&E-adversarial network: a convolutional neural network to learn stain-invariant features through Hematoxylin & Eosin regression.
Joint Semi-supervised and Active Learning for Segmentation of Gigapixel Pathology Images with Cost-Effective Labeling.
Iterative Cross-Scanner Registration for Whole Slide Images.
Probeable DARTS with Application to Computational Pathology.
Improving Self-supervised Learning with Hardness-aware Dynamic Curriculum Learning: An Application to Digital Pathology.
Simultaneous Nuclear Instance and Layer Segmentation in Oral Epithelial Dysplasia.
Guided Representation Learning for the Classification of Hematopoietic Cells.
MIA-COV19D: COVID-19 Detection through 3-D Chest CT Image Analysis.
Evaluating volumetric and slice-based approaches for COVID-19 detection in chest CTs.
A Hierarchical Classification System for the Detection of Covid-19 from Chest X-Ray Images.
A transformer-based framework for automatic COVID19 diagnosis in chest CTs.
A hybrid and fast deep learning framework for Covid-19 detection via 3D Chest CT Images.
COVID19 Diagnosis using AutoML from 3D CT scans.
TeliNet: Classifying CT scan images for COVID-19 diagnosis.
Brain midline shift detection and quantification by a cascaded deep network pipeline on non-contrast computed tomography scans.
Visual interpretability analysis of Deep CNNs using an Adaptive Threshold method on Diabetic Retinopathy images.
Adaptive Distribution Learning with Statistical Hypothesis Testing for COVID-19 CT Scan Classification.
Residual Dilated U-net For The Segmentation Of COVID-19 Infection From CT Images.
CMC-COV19D: Contrastive Mixup Classification for COVID-19 Diagnosis.
Intelligent Radiomic Analysis of Q-SPECT/CT images to optimize pulmonary embolism diagnosis in COVID-19 patients.
A 3D CNN Network with BERT For Automatic COVID-19 Diagnosis From CT-Scan Images.
The Value of Visual Attention for COVID-19 Classification in CT Scans.
Advanced 3D Deep Non-Local Embedded System for Self-Augmented X-Ray-based COVID-19 Assessment.
Leveraging Batch Normalization for Vision Transformers.
Contextual Convolutional Neural Networks.
Graph-based Neural Architecture Search with Operation Embeddings.
Convolutional Filter Approximation Using Fractional Calculus.
Single-DARTS: Towards Stable Architecture Search.
PP-NAS: Searching for Plug-and-Play Blocks on Convolutional Neural Network.
DDUNet: Dense Dense U-Net with Applications in Image Denoising.
Tiled Squeeze-and-Excite: Channel Attention With Local Spatial Context.
Russian Doll Network: Learning Nested Networks for Sample-Adaptive Dynamic Inference.
CONet: Channel Optimization for Convolutional Neural Networks.
SCARLET-NAS: Bridging the Gap between Stability and Scalability in Weight-sharing Neural Architecture Search.
Infrared dataset generation for people detection through superimposition of different camera sensors.
Where Did I See It? Object Instance Re-Identification with Attention.
Domain-based semi-supervised learning: exploiting label invariance in unlabeled data from distributed cameras.
Resolution based Feature Distillation for Cross Resolution Person Re-Identification.
Self-Attention Agreement Among Capsules.
An Embedded Deep Learning-based Package for Traffic Law Enforcement.
Pedestrian Tracking through Coordinated Mining of Multiple Moving Cameras.
Deep Quaternion Pose Proposals for 6D Object Pose Tracking.
PanopTOP: a framework for generating viewpoint-invariant human pose estimation datasets.
Graph CNN for Moving Object Detection in Complex Environments from Unseen Videos.
TransBlast: Self-Supervised Learning Using Augmented Subspace with Transformer for Background/Foreground Separation.
Synthetic Temporal Anomaly Guided End-to-End Video Anomaly Detection.
Convolutional Auto-Encoder with Tensor-Train Factorization.
Fast Robust Tensor Principal Component Analysis via Fiber CUR Decomposition.
Background/Foreground Separation: Guided Attention based Adversarial Modeling (GAAM) versus Robust Subspace Learning Methods.
Double-Weighted Low-Rank Matrix Recovery Based on Rank Estimation.
Relaxations for Non-Separable Cardinality/Rank Penalties.
On Adversarial Robustness: A Neural Architecture Search perspective.
AdvFoolGen: Creating Persistent Troubles for Deep Classifiers.
Towards Category and Domain Alignment: Category-Invariant Feature Enhancement for Adversarial Domain Adaptation.
Can Optical Trojans Assist Adversarial Perturbations?
Patch Attack Invariance: How Sensitive are Patch Attacks to 3D Pose?
Countering Adversarial Examples: Combining Input Transformation and Noisy Training.
Optical Adversarial Attack.
Enhancing Adversarial Robustness via Test-time Transformation Ensembling.
Detecting and Segmenting Adversarial Graphics Patterns from Images.
A Hierarchical Assessment of Adversarial Severity.
Encouraging Intra-Class Diversity Through a Reverse Contrastive Loss for Single-Source Domain Generalization.
Can Targeted Adversarial Examples Transfer When the Source and Target Models Have No Label Space Overlap?
Evasion Attack STeganography: Turning Vulnerability Of Machine Learning To Adversarial Attacks Into A Real-world Application.
Impact of Colour on Robustness of Deep Neural Networks.
Trojan Signatures in DNN Weights.
On the Effect of Pruning on Adversarial Robustness.