This column collects papers in computer vision. Date: June 1, 2021. Source: Paper Digest.

1, TITLE: STRIDE Along Spectrahedral Vertices for Solving Large-Scale Rank-One Semidefinite Relaxations
AUTHORS: Heng Yang ; Ling Liang ; Kim-Chuan Toh ; Luca Carlone
CATEGORY: math.OC [math.OC, cs.CV, cs.LG]
HIGHLIGHT: We propose a new algorithmic framework, called SpecTrahedral pRoximal gradIent Descent along vErtices (STRIDE), that blends fast local search on the nonconvex POP with global descent on the convex SDP.

2, TITLE: RPG: Learning Recursive Point Cloud Generation
AUTHORS: WEI-JAN KO et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper we propose a novel point cloud generator that is able to reconstruct and generate 3D point clouds composed of semantic parts.

3, TITLE: Driver Intention Anticipation Based on In-Cabin and Driving Scene Monitoring Using Deep Learning
AUTHORS: Mahdi Bonyani ; Mina Rahmanian ; Simindokht Jahangard
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this study, we propose a new framework in which four inputs are employed to anticipate driver maneuvers using the Brain4Cars dataset; maneuver prediction is performed 5, 4, 3, 2, and 1 seconds before the actual action occurs.

4, TITLE: LPF: A Language-Prior Feedback Objective Function for De-biased Visual Question Answering
AUTHORS: Zujie Liang ; Haifeng Hu ; Jiaying Zhu
CATEGORY: cs.CV [cs.CV, cs.CL]
HIGHLIGHT: To address this issue, we propose a novel Language-Prior Feedback (LPF) objective function, to re-balance the proportion of each answer's loss value in the total VQA loss.

5, TITLE: SN-Graph: A Minimalist 3D Object Representation for Classification
AUTHORS: SIYU ZHANG et al.
CATEGORY: cs.CV [cs.CV, cs.LG]
HIGHLIGHT: In this paper, we propose a Sphere Node Graph (SN-Graph) to represent 3D objects.

6, TITLE: Augmenting Anchors By The Detector Itself
AUTHORS: Xiaopei Wan ; Shengjie Chen ; Yujiu Yang ; Zhenhua Guo ; Fangbo Tao
CATEGORY: cs.CV [cs.CV, cs.AI]
HIGHLIGHT: In this paper, we propose a gradient-free anchor augmentation method named AADI, which means Augmenting Anchors by the Detector Itself.

7, TITLE: Beyond The Spectrum: Detecting Deepfakes Via Re-Synthesis
AUTHORS: Yang He ; Ning Yu ; Margret Keuper ; Mario Fritz
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: To overcome this issue, we propose a novel fake detection method designed to re-synthesize testing images and extract visual cues for detection.

8, TITLE: BAAI-VANJEE Roadside Dataset: Towards The Connected Automated Vehicle Highway Technologies in Challenging Environments of China
AUTHORS: DENG YONGQIANG et al.
CATEGORY: cs.CV [cs.CV, cs.LG]
HIGHLIGHT: In this paper, we first introduce the challenging BAAI-VANJEE roadside dataset, which consists of LiDAR data and RGB images collected by a VANJEE smart base station placed on the roadside about 4.5 m high.

9, TITLE: On The Bias Against Inductive Biases
AUTHORS: George Cazenavette ; Simon Lucey
CATEGORY: cs.CV [cs.CV, cs.LG]
HIGHLIGHT: In this work, we analyze the effect of these and more inductive biases on small to moderately-sized isotropic networks used for unsupervised visual feature learning and show that their removal is not always ideal.

10, TITLE: SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers
AUTHORS: ENZE XIE et al.
CATEGORY: cs.CV [cs.CV, cs.LG]
HIGHLIGHT: We present SegFormer, a simple, efficient yet powerful semantic segmentation framework which unifies Transformers with lightweight multilayer perceptron (MLP) decoders.

11, TITLE: VidFace: A Full-Transformer Solver for Video Face Hallucination with Unaligned Tiny Snapshots
AUTHORS: Yuan Gan ; Yawei Luo ; Xin Yu ; Bang Zhang ; Yi Yang
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we investigate the task of hallucinating an authentic high-resolution (HR) human face from multiple low-resolution (LR) video snapshots.

12, TITLE: TransCamP: Graph Transformer for 6-DoF Camera Pose Estimation
AUTHORS: Xinyi Li ; Haibin Ling
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper we propose a neural network approach with a graph transformer backbone, namely TransCamP, to address the camera relocalization problem.

13, TITLE: The Effectiveness of Feature Attribution Methods and Its Correlation with Automatic Evaluation Scores
AUTHORS: Giang Nguyen ; Daeyoung Kim ; Anh Nguyen
CATEGORY: cs.CV [cs.CV, cs.AI, cs.HC]
HIGHLIGHT: In this paper, we conduct the first large-scale user study on 320 lay and 11 expert users to shed light on the effectiveness of state-of-the-art attribution methods in assisting humans in ImageNet classification, Stanford Dogs fine-grained classification, and the same two tasks when the input image contains adversarial perturbations.

14, TITLE: Scene-aware Generative Network for Human Motion Synthesis
AUTHORS: Jingbo Wang ; Sijie Yan ; Bo Dai ; Dahua Lin
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we propose a new framework, with the interaction between the scene and the human motion taken into account.

15, TITLE: Three-dimensional Multimodal Medical Imaging System Based on Free-hand Ultrasound and Structured Light
AUTHORS: Jhacson Meza ; Sonia H. Contreras-Ortiz ; Lenny A. Romero ; Andres G. Marrugo
CATEGORY: cs.CV [cs.CV, eess.IV]
HIGHLIGHT: We propose a three-dimensional (3D) multimodal medical imaging system that combines freehand ultrasound and structured light 3D reconstruction in a single coordinate system without requiring registration.

16, TITLE: 3D U-NetR: Low Dose Computed Tomography Reconstruction Via Deep Learning and 3 Dimensional Convolutions
AUTHORS: Doga Gunduzalp ; Batuhan Cengiz ; Mehmet Ozan Unal ; Isa Yildirim
CATEGORY: cs.CV [cs.CV, eess.IV]
HIGHLIGHT: In this paper, we introduced a novel deep learning based reconstruction technique using the correlations of all 3 dimensions with each other by taking into account the correlation between 2-dimensional low-dose CT images.

17, TITLE: Demographic Fairness in Biometric Systems: What Do The Experts Say?
AUTHORS: Christian Rathgeb ; Pawel Drozdowski ; Naser Damer ; Dinusha C. Frings ; Christoph Busch
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: This work summarises opinions of experts and findings of said events on the topic of demographic fairness in biometric systems including several important aspects such as the developments of evaluation metrics and standards as well as related issues, e.g. the need for transparency and explainability in biometric systems or legal and ethical issues.

18, TITLE: Data-driven 6D Pose Tracking By Calibrating Image Residuals in Synthetic Domains
AUTHORS: Bowen Wen ; Chaitanya Mitash ; Kostas Bekris
CATEGORY: cs.CV [cs.CV, cs.RO]
HIGHLIGHT: This work presents se(3)-TrackNet, a data-driven optimization approach for long term, 6D pose tracking.

19, TITLE: About Explicit Variance Minimization: Training Neural Networks for Medical Imaging With Limited Data Annotations
AUTHORS: Dmitrii Shubin ; Danny Eytan ; Sebastian D. Goodfellow
CATEGORY: cs.CV [cs.CV, 68T07 (Primary) 68T45 (Secondary)]
HIGHLIGHT: We propose the Variance Aware Training (VAT) method that exploits this property by introducing the variance error into the model loss function, i.e., enabling minimizing the variance explicitly.

20, TITLE: Knowledge Transfer for Few-shot Segmentation of Novel White Matter Tracts
AUTHORS: Qi Lu ; Chuyang Ye
CATEGORY: cs.CV [cs.CV, eess.IV]
HIGHLIGHT: In this paper, we explore the transfer of such knowledge to the segmentation of novel WM tracts in the few-shot setting.

21, TITLE: MAOMaps: A Photo-Realistic Benchmark For VSLAM and Map Merging Quality Assessment
AUTHORS: Andrey Bokovoy ; Kirill Muravyev ; Konstantin Yakovlev
CATEGORY: cs.CV [cs.CV, cs.RO]
HIGHLIGHT: In this paper we introduce a novel benchmark that is aimed at quantitatively evaluating the quality of vision-based simultaneous localization and mapping (vSLAM) and map merging algorithms.

22, TITLE: More Is Better: An Analysis of Instance Quantity/Quality Trade-off in Rehearsal-based Continual Learning
AUTHORS: Francesco Pelosin ; Andrea Torsello
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In our study, we propose an analysis of the memory quantity/quality trade-off adopting various data reduction approaches to increase the number of instances storable in memory.

23, TITLE: Unsupervised Joint Learning of Depth, Optical Flow, Ego-motion from Video
AUTHORS: Jianfeng Li ; Junqiao Zhao ; Shuangfu Song ; Tiantian Feng
CATEGORY: cs.CV [cs.CV, 65Dxx]
HIGHLIGHT: In this paper, we improve the joint self-supervision method from three aspects: network structure, dynamic object segmentation, and geometric constraints.

24, TITLE: MixerGAN: An MLP-Based Architecture for Unpaired Image-to-Image Translation
AUTHORS: George Cazenavette ; Manuel Ladron De Guevara
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: Leveraging this efficient alternative to self-attention, we propose a new unpaired image-to-image translation model called MixerGAN: a simpler MLP-based architecture that considers long-distance relationships between pixels without the need for expensive attention mechanisms.

25, TITLE: StyTr^2: Unbiased Image Style Transfer with Transformers
AUTHORS: YINGYING DENG et al.
CATEGORY: cs.CV [cs.CV, eess.IV]
HIGHLIGHT: To address this critical issue, we take long-range dependencies of input images into account for unbiased style transfer by proposing a transformer-based approach, namely StyTr^2.

26, TITLE: Urban Traffic Surveillance (UTS): A Fully Probabilistic 3D Tracking Approach Based on 2D Detections
AUTHORS: Henry Bradler ; Adrian Kretz ; Rudolf Mester
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: UTS then tracks the vehicles using a 3D bounding box representation and a physically reasonable 3D motion model relying on an Unscented Kalman filter based approach.

27, TITLE: Non-Convex Tensor Low-Rank Approximation for Infrared Small Target Detection
AUTHORS: TING LIU et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: Considering that different singular values have different importance and should be treated discriminatively, in this paper, we propose a non-convex tensor low-rank approximation (NTLA) method for infrared small target detection.

28, TITLE: Polygonal Point Set Tracking
AUTHORS: Gunhee Nam ; Miran Heo ; Seoung Wug Oh ; Joon-Young Lee ; Seon Joo Kim
CATEGORY: cs.CV [cs.CV, cs.AI, cs.LG]
HIGHLIGHT: In this paper, we propose a novel learning-based polygonal point set tracking method.

29, TITLE: Detecting Backdoor in Deep Neural Networks Via Intentional Adversarial Perturbations
AUTHORS: MINGFU XUE et al.
CATEGORY: cs.CV [cs.CV, cs.AI]
HIGHLIGHT: In this paper, a novel backdoor detection method based on adversarial examples is proposed.

30, TITLE: FoveaTer: Foveated Transformer for Image Classification
AUTHORS: Aditya Jonnalagadda ; William Wang ; Miguel P. Eckstein
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: We propose the Foveated Transformer (FoveaTer) model, which uses pooling regions and saccadic movements to perform object classification tasks using a vision Transformer architecture.

31, TITLE: Unsupervised Action Segmentation with Self-supervised Feature Learning and Co-occurrence Parsing
AUTHORS: ZHE WANG et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: Thus in this work we explore a self-supervised method that operates on a corpus of unlabeled videos and predicts a likely set of temporal segments across the videos.

32, TITLE: Enhancing Environmental Enforcement with Near Real-Time Monitoring: Likelihood-Based Detection of Structural Expansion of Intensive Livestock Farms
AUTHORS: Ben Chugg ; Brandon Anderson ; Seiji Eicher ; Sandy Lee ; Daniel E. Ho
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: Using a new hand-labeled dataset of 175,736 images of 1,513 CAFOs, we combine state-of-the-art building segmentation with a likelihood-based change-point detection model to provide a robust signal of building expansion (AUC = 0.80).

33, TITLE: MSG-Transformer: Exchanging Local Spatial Information By Manipulating Messenger Tokens
AUTHORS: JIEMIN FANG et al.
CATEGORY: cs.CV [cs.CV, cs.LG]
HIGHLIGHT: This paper aims to alleviate the conflict between efficiency and flexibility, for which we propose a specialized token for each region that serves as a messenger (MSG).

34, TITLE: Adaptive Feature Alignment for Adversarial Training
AUTHORS: TAO WANG et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we observe an interesting phenomenon: feature statistics change monotonically and smoothly w.r.t. increasing attack strength.

35, TITLE: Pho(SC)Net: An Approach Towards Zero-shot Word Image Recognition in Historical Documents
AUTHORS: Anuj Rai ; Narayanan C. Krishnan ; Sukalpa Chanda
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: Based on previous state-of-the-art methods for word spotting and recognition, we propose a hybrid representation that considers the character's shape appearance to differentiate between two different words and has been shown to be more effective in recognizing unseen words.

36, TITLE: Analogous to Evolutionary Algorithm: Designing A Unified Sequence Model
AUTHORS: JIANGNING ZHANG et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: Analogous to the dynamic local population in EA, we improve the existing transformer structure and propose a more efficient EAT model, and design task-related heads to deal with different tasks more flexibly.

37, TITLE: Large-Scale Spatio-Temporal Person Re-identification: Algorithm and Benchmark
AUTHORS: XIUJUN SHU et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this work, we contribute a novel Large-scale Spatio-Temporal (LaST) person re-ID dataset, including 10,860 identities with more than 224k images.

38, TITLE: SDNet: Mutil-branch for Single Image Deraining Using Swin
AUTHORS: FUXIANG TAN et al.
CATEGORY: cs.CV [cs.CV, cs.LG, eess.IV]
HIGHLIGHT: In this paper, we introduce Swin-transformer into the field of image deraining for the first time to study its performance and potential.

39, TITLE: Bounded Logit Attention: Learning to Explain Image Classifiers
AUTHORS: Thomas Baumhauer ; Djordje Slijepcevic ; Matthias Zeppelzauer
CATEGORY: cs.CV [cs.CV, cs.AI, cs.LG]
HIGHLIGHT: We present a trainable explanation module for convolutional image classifiers we call bounded logit attention (BLA).

40, TITLE: ACNet: Mask-Aware Attention with Dynamic Context Enhancement for Robust Acne Detection
AUTHORS: Kyungseo Min ; Gun-Hee Lee ; Seong-Whan Lee
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: To address these problems, we propose an acne detection network which consists of three components, specifically: Composite Feature Refinement, Dynamic Context Enhancement, and Mask-Aware Multi-Attention.

41, TITLE: EPSANet: An Efficient Pyramid Split Attention Block on Convolutional Neural Network
AUTHORS: Hu Zhang ; Keke Zu ; Jian Lu ; Yuru Zou ; Deyu Meng
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this work, a novel lightweight and effective attention method named Pyramid Split Attention (PSA) module is proposed.

42, TITLE: Gaze Estimation Using Transformer
AUTHORS: Yihua Cheng ; Feng Lu
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we employ transformers and assess their effectiveness for gaze estimation.

43, TITLE: Learning Free-Form Deformation for 3D Face Reconstruction from In-The-Wild Images
AUTHORS: Harim Jung ; Myeong-Seok Oh ; Seong-Whan Lee
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: To address the limitations of 3DMM, we propose a straightforward learning-based method that reconstructs a 3D face mesh through Free-Form Deformation (FFD) for the first time.

44, TITLE: Transformer-Based Deep Image Matching for Generalizable Person Re-identification
AUTHORS: Shengcai Liao ; Ling Shao
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: Accordingly, we propose a new simplified decoder, which drops the full attention implementation with the softmax weighting, keeping only the query-key similarity computation.

45, TITLE: Rethinking The Constraints of Multimodal Fusion: Case Study in Weakly-Supervised Audio-Visual Video Parsing
AUTHORS: Jianning Wu ; Zhuqing Jiang ; Shiping Wen ; Aidong Men ; Haiying Wang
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: This problem is modeled as an optimization problem in this paper.

46, TITLE: Connecting Language and Vision for Natural Language-Based Vehicle Retrieval
AUTHORS: SHUAI BAI et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we apply a new modality, i.e., the language description, to search for the vehicle of interest and explore the potential of this task in real-world scenarios.

47, TITLE: Implementing A Foveal-pit Inspired Filter in A Spiking Convolutional Neural Network: A Preliminary Study
AUTHORS: Shriya T. P. Gupta ; Basabdatta Sen Bhattacharya
CATEGORY: cs.CV [cs.CV, cs.AI, I.2.10; I.4.5; I.4.10]
HIGHLIGHT: We have presented a Spiking Convolutional Neural Network (SCNN) that incorporates retinal foveal-pit inspired Difference of Gaussian filters and rank-order encoding.

48, TITLE: A Spectral-Spatial-Dependent Global Learning Framework for Insufficient and Imbalanced Hyperspectral Image Classification
AUTHORS: QIQI ZHU et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, a spectral-spatial dependent global learning (SSDGL) framework based on global convolutional long short-term memory (GCL) and global joint attention mechanism (GJAM) is proposed for insufficient and imbalanced HSI classification.

49, TITLE: OpenMatch: Open-set Consistency Regularization for Semi-supervised Learning with Outliers
AUTHORS: Kuniaki Saito ; Donghyun Kim ; Kate Saenko
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: To address this problem, we propose a novel Open-set Semi-Supervised Learning (OSSL) approach called OpenMatch.

50, TITLE: Foveal-pit Inspired Filtering of DVS Spike Response
AUTHORS: Shriya T. P. Gupta ; Pablo Linares-Serrano ; Basabdatta Sen Bhattacharya ; Teresa Serrano-Gotarredona
CATEGORY: cs.CV [cs.CV, cs.AI, cs.AR, I.2.10; I.4.5; I.4.10]
HIGHLIGHT: In this paper, we present results of processing Dynamic Vision Sensor (DVS) recordings of visual patterns with a retinal model based on foveal-pit inspired Difference of Gaussian (DoG) filters.

51, TITLE: Automatic CT Segmentation from Bounding Box Annotations Using Convolutional Neural Networks
AUTHORS: Yuanpeng Liu ; Qinglei Hui ; Zhiyi Peng ; Shaolin Gong ; Dexing Kong
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: To address this problem, we proposed an automatic CT segmentation method based on weakly supervised learning, by which one could train an accurate segmentation model only with weak annotations in the form of bounding boxes.

52, TITLE: VersatileGait: A Large-Scale Synthetic Gait Dataset Towards In-the-Wild Simulation
AUTHORS: PENGYI ZHANG et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: To remedy this problem, we propose to construct a large-scale gait dataset with the help of controllable computer simulation. As a result, we obtain an in-the-wild gait dataset, called VersatileGait, which has more than one million silhouette sequences of 10,000 subjects with diverse scenarios.

53, TITLE: RED : Looking for Redundancies for Data-Free Structured Compression of Deep Neural Networks
AUTHORS: Edouard Yvinec ; Arnaud Dapogny ; Matthieu Cord ; Kevin Bailly
CATEGORY: cs.CV [cs.CV, eess.IV]
HIGHLIGHT: In this paper, we present RED, a data-free structured, unified approach to tackle structured pruning.

54, TITLE: Transformer-Based Source-Free Domain Adaptation
AUTHORS: GUANGLEI YANG et al.
CATEGORY: cs.CV [cs.CV, cs.AI]
HIGHLIGHT: In this paper, we study the task of source-free domain adaptation (SFDA), where the source data are not available during target adaptation.

55, TITLE: Towards Diverse Paragraph Captioning for Untrimmed Videos
AUTHORS: Yuqing Song ; Shizhe Chen ; Qin Jin
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we propose a paragraph captioning model which eschews the problematic event detection stage and directly generates paragraphs for untrimmed videos.

56, TITLE: Learning Personal Style from Few Examples
AUTHORS: David Chuan-En Lin ; Nikolas Martelaro
CATEGORY: cs.CV [cs.CV, cs.HC, H.5.0; I.5.4; J.5]
HIGHLIGHT: In this paper, we leverage the pattern recognition capability of computational models to aid in this task.

57, TITLE: Not All Images Are Worth 16x16 Words: Dynamic Vision Transformers with Adaptive Sequence Length
AUTHORS: Yulin Wang ; Rui Huang ; Shiji Song ; Zeyi Huang ; Gao Huang
CATEGORY: cs.CV [cs.CV, cs.AI, cs.LG]
HIGHLIGHT: In this paper, we argue that every image has its own characteristics, and ideally the token number should be conditioned on each individual input.

58, TITLE: Can Attention Enable MLPs To Catch Up With CNNs?
AUTHORS: MENG-HAO GUO et al.
CATEGORY: cs.CV [cs.CV, cs.LG]
HIGHLIGHT: In this perspective, we give a brief history of learning architectures, including multilayer perceptrons (MLPs), convolutional neural networks (CNNs) and transformers.

59, TITLE: Identity and Attribute Preserving Thumbnail Upscaling
AUTHORS: Noam Gat ; Sagie Benaim ; Lior Wolf
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: We consider the task of upscaling a low resolution thumbnail image of a person, to a higher resolution image, which preserves the person's identity and other attributes.

60, TITLE: Scorpion Detection and Classification Systems Based on Computer Vision and Deep Learning for Health Security Purposes
AUTHORS: Francisco Luis Giambelluca ; Marcelo A. Cappelletti ; Jorge Osio ; Luis A. Giambelluca
CATEGORY: cs.CV [cs.CV, cs.AI, eess.IV]
HIGHLIGHT: In this paper, two novel automatic and real-time systems for the detection and classification of two genera of scorpions found in La Plata city (Argentina) were developed using computer vision and deep learning techniques.

61, TITLE: ArtGraph: Towards An Artistic Knowledge Graph
AUTHORS: Giovanna Castellano ; Giovanni Sansaro ; Gennaro Vessio
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: This paper presents our ongoing work towards ArtGraph: an artistic knowledge graph based on WikiArt and DBpedia.

62, TITLE: FCPose: Fully Convolutional Multi-Person Pose Estimation with Dynamic Instance-Aware Convolutions
AUTHORS: Weian Mao ; Zhi Tian ; Xinlong Wang ; Chunhua Shen
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: We propose a fully convolutional multi-person pose estimation framework using dynamic instance-aware convolutions, termed FCPose.

63, TITLE: E2ETag: An End-to-End Trainable Method for Generating and Detecting Fiducial Markers
AUTHORS: J. Brennan Peace ; Eric Psota ; Yanfeng Liu ; Lance C. Pérez
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: E2ETag introduces an end-to-end trainable method for designing fiducial markers and a complementary detector.

64, TITLE: DAAIN: Detection of Anomalous and Adversarial Input Using Normalizing Flows
AUTHORS: Samuel von Baußnern ; Johannes Otterbach ; Adrian Loy ; Mathieu Salzmann ; Thomas Wollmann
CATEGORY: cs.CV [cs.CV, cs.CR, cs.LG]
HIGHLIGHT: In this work, we introduce a novel technique, DAAIN, to detect OOD inputs and AA for image segmentation in a unified setting.

65, TITLE: Non-local Patch-based Low-rank Tensor Ring Completion for Visual Data
AUTHORS: Yicong He ; George K. Atia
CATEGORY: cs.CV [cs.CV, cs.LG]
HIGHLIGHT: In this paper, we develop a novel non-local patch-based tensor ring completion algorithm.

66, TITLE: Image-to-Video Generation Via 3D Facial Dynamics
AUTHORS: XIAOGUANG TU et al.
CATEGORY: cs.CV [cs.CV, eess.IV]
HIGHLIGHT: We present a versatile model, FaceAnime, for various video generation tasks from still images.

67, TITLE: Long-term Person Re-identification: A Benchmark
AUTHORS: Peng Xu ; Xiatian Zhu
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this work, we contribute a timely, large, realistic long-term person re-identification benchmark.

68, TITLE: Know Your Surroundings: Panoramic Multi-Object Tracking By Multimodality Collaboration
AUTHORS: YUHANG HE et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we focus on the multi-object tracking (MOT) problem of automatic driving and robot navigation.

69, TITLE: Transferable Sparse Adversarial Attack
AUTHORS: Ziwen He ; Wei Wang ; Jing Dong ; Tieniu Tan
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we focus on sparse adversarial attack based on the $\ell_0$ norm constraint, which can succeed by only modifying a few pixels of an image.

70, TITLE: Multiscale IoU: A Metric for Evaluation of Salient Object Detection with Fine Structures
AUTHORS: Azim Ahmadzadeh ; Dustin J. Kempton ; Yang Chen ; Rafal A. Angryk
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this work, we present a new metric that is a marriage of a popular evaluation metric, namely Intersection over Union (IoU), and a geometrical concept, called fractal dimension.
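
For readers unfamiliar with the base metric named in this highlight, the following is a minimal sketch of plain Intersection over Union on binary masks. It is not the paper's multiscale metric. The function name `iou`, the nested-list mask format, and the convention of returning 1.0 when both masks are empty are assumptions made for illustration.

```python
def iou(mask_a, mask_b):
    """Intersection over Union of two equally sized binary masks.

    Masks are nested lists of 0/1 values (an assumed format for this sketch).
    """
    intersection = 0
    union = 0
    for row_a, row_b in zip(mask_a, mask_b):
        for a, b in zip(row_a, row_b):
            intersection += 1 if (a and b) else 0
            union += 1 if (a or b) else 0
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if union == 0 else intersection / union


prediction = [[1, 1], [0, 0]]
ground_truth = [[1, 0], [0, 0]]
score = iou(prediction, ground_truth)  # 1 shared pixel out of 2 in the union
```

Because plain IoU counts pixels uniformly, thin structures contribute little to the score, which is the kind of limitation a fine-structure metric would aim to address.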

71, TITLE: 1$\times$N Block Pattern for Network Sparsity
AUTHORS: MINGBAO LIN et al.
CATEGORY: cs.CV [cs.CV, cs.AI]
HIGHLIGHT: In this paper, we propose a novel concept, the $1\times N$ block sparsity pattern (block pruning), to break this limitation.

72, TITLE: Cherry-Picking Gradients: Learning Low-Rank Embeddings of Visual Data Via Differentiable Cross-Approximation
AUTHORS: MIKHAIL USVYATSOV et al.
CATEGORY: cs.CV [cs.CV, cs.LG]
HIGHLIGHT: We propose an end-to-end trainable framework that processes large-scale visual data tensors by looking at a fraction of their entries only.

73, TITLE: Compressed Sensing for Photoacoustic Computed Tomography Using An Untrained Neural Network
AUTHORS: Hengrong Lan ; Juze Zhang ; Changchun Yang ; Fei Gao
CATEGORY: cs.CV [cs.CV, cs.AI]
HIGHLIGHT: In this paper, a novel compressed sensing method for PACT using an untrained neural network is proposed, which halves the number of measured channels while recovering sufficient detail.

74, TITLE: Learning Inductive Attention Guidance for Partially Supervised Pancreatic Ductal Adenocarcinoma Prediction
AUTHORS: YAN WANG et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we consider a partially supervised setting, where cheap image-level annotations are provided for all the training data, and the costly per-voxel annotations are only available for a subset of them.

75, TITLE: Longer Version for "Deep Context-Encoding Network for Retinal Image Captioning"
AUTHORS: Jia-Hong Huang ; Ting-Wei Wu ; Chao-Han Huck Yang ; Marcel Worring
CATEGORY: cs.CV [cs.CV, cs.AI, cs.CL, cs.MM]
HIGHLIGHT: In this work, we propose a new context-driven encoding network to automatically generate medical reports for retinal images.

76, TITLE: Analysis and Applications of Class-wise Robustness in Adversarial Training
AUTHORS: Qi Tian ; Kun Kuang ; Kelu Jiang ; Fei Wu ; Yisen Wang
CATEGORY: cs.CV [cs.CV, cs.AI]
HIGHLIGHT: In this paper, we propose to analyze the class-wise robustness in adversarial training.

77, TITLE: Attention Based Semantic Segmentation on UAV Dataset for Natural Disaster Damage Assessment
AUTHORS: Tashnim Chowdhury ; Maryam Rahnemoonfar
CATEGORY: cs.CV [cs.CV, 68T45, I.4.6]
HIGHLIGHT: We implement a novel self-attention based semantic segmentation model on a high resolution UAV dataset and attain a mean IoU score of around 88% on the test set.

78, TITLE: Bio-inspired Visual Attention for Silicon Retinas Based on Spiking Neural Networks Applied to Pattern Classification
AUTHORS: Amélie Gruel ; Jean Martinet
CATEGORY: cs.CV [cs.CV, cs.LG, cs.NE]
HIGHLIGHT: In this paper, we review the biological background behind the attentional mechanism, and introduce a case study of event video classification with SNNs, using a biology-grounded low-level computational attention mechanism, with interesting preliminary results.

79, TITLE: A Protection Method of Trained CNN Model with Secret Key from Unauthorized Access
AUTHORS: AprilPyone MaungMaung ; Hitoshi Kiya
CATEGORY: cs.CV [cs.CV, cs.CR]
HIGHLIGHT: In this paper, we propose a novel method for protecting convolutional neural network (CNN) models with a secret key set so that unauthorized users without the correct key set cannot access trained models.

80, TITLE: Transforming The Latent Space of StyleGAN for Real Face Editing
AUTHORS: Heyi Li ; Jinlong Liu ; Yunzhi Bai ; Huayan Wang ; Klaus Mueller
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: To solve this problem, we propose to expand the latent space by replacing fully-connected layers in the StyleGAN's mapping network with attention-based transformers.

81, TITLE: Less Is More: Pay Less Attention in Vision Transformers
AUTHORS: Zizheng Pan ; Bohan Zhuang ; Haoyu He ; Jing Liu ; Jianfei Cai
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: To this end, we present a novel Less attention vIsion Transformer (LIT), building upon the fact that convolutions, fully-connected (FC) layers, and self-attentions have almost equivalent mathematical expressions for processing image patch sequences.

82, TITLE: Semi-orthogonal Embedding for Efficient Unsupervised Anomaly Segmentation
AUTHORS: Jin-Hwa Kim ; Do-Hyeong Kim ; Saehoon Yi ; Taehoon Lee
CATEGORY: cs.CV [cs.CV, cs.AI, cs.LG]
HIGHLIGHT: We present the efficiency of semi-orthogonal embedding for unsupervised anomaly segmentation.

83, TITLE: Dual-stream Network for Visual Recognition
AUTHORS: MINGYUAN MAO et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we present a generic Dual-stream Network (DS-Net) to fully explore the representation capacity of local and global pattern features for image classification.

84, TITLE: Controllable Person Image Synthesis with Spatially-Adaptive Warped Normalization
AUTHORS: JICHAO ZHANG et al.
CATEGORY: cs.CV [cs.CV, cs.GR]
HIGHLIGHT: To solve this problem, we introduce a novel Spatially-Adaptive Warped Normalization (SAWN), which integrates a learned flow-field to warp modulation parameters.

85, TITLE: A Study On The Effects of Pre-processing On Spatio-temporal Action Recognition Using Spiking Neural Networks Trained with STDP
AUTHORS: El-Assal Mireille ; Tirilly Pierre ; Bilasco Ioan Marius
CATEGORY: cs.CV [cs.CV, I.4.7; I.4.8; I.5.0]
HIGHLIGHT: In this paper, we rely on the network architecture of a convolutional spiking neural network trained with STDP, and we test the performance of this network when challenged with action recognition tasks.

86, TITLE: Deep Learning on Monocular Object Pose Detection and Tracking: A Comprehensive Overview
AUTHORS: ZHAOXIN FAN et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: Therefore, this paper presents a comprehensive review of recent deep learning based progress in object pose detection and tracking.

87, TITLE: Learning Convolutions with Only Additions
AUTHORS: HANTING CHEN et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we present adder networks (AdderNets) to trade these massive multiplications in deep neural networks, especially convolutional neural networks (CNNs), for much cheaper additions to reduce computation costs.

88, TITLE: UFC-BERT: Unifying Multi-Modal Controls for Conditional Image Synthesis
AUTHORS: ZHU ZHANG et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, instead of investigating these control signals separately, we propose a new two-stage architecture, UFC-BERT, to unify any number of multi-modal controls.

89, TITLE: Training Domain-invariant Object Detector Faster with Feature Replay and Slow Learner
AUTHORS: Chaehyeon Lee ; Junghoon Seo ; Heechul Jung
CATEGORY: cs.CV [cs.CV, cs.LG]
HIGHLIGHT: In this paper, we introduce our proposed method, A-NDFT, which is an improvement to NDFT.

90, TITLE: Z2P: Instant Rendering of Point Clouds
AUTHORS: Gal Metzer ; Rana Hanocka ; Raja Giryes ; Niloy J. Mitra ; Daniel Cohen-Or
CATEGORY: cs.GR [cs.GR, cs.CV, cs.LG]
HIGHLIGHT: We present a technique for rendering point clouds using a neural network.

91, TITLE: ICDAR 2021 Competition on Scientific Table Image Recognition to LaTeX
AUTHORS: Pratik Kayal ; Mrinal Anand ; Harsh Desai ; Mayank Singh
CATEGORY: cs.IR [cs.IR, cs.AI, cs.CV]
HIGHLIGHT: This report describes the datasets and ground truth specification, details the performance evaluation metrics used, presents the final results, and summarizes the participating methods.

92, TITLE: Adversarial Training with Rectified Rejection
AUTHORS: TIANYU PANG et al.
CATEGORY: cs.LG [cs.LG, cs.CR, cs.CV]
HIGHLIGHT: To this end, we propose to use true confidence (T-Con) (i.e., predicted probability of the true class) as a certainty oracle, and learn to predict T-Con by rectifying confidence.

93, TITLE: Gotta Go Fast When Generating Data with Score-Based Models
AUTHORS: Alexia Jolicoeur-Martineau ; Ke Li ; Rémi Piché-Taillefer ; Tal Kachman ; Ioannis Mitliagkas
CATEGORY: cs.LG [cs.LG, cs.CV, math.OC, stat.ML]
HIGHLIGHT: In this work, we aim to accelerate this process by devising a more efficient SDE solver.

94, TITLE: Improving Entropic Out-of-Distribution Detection Using Isometric Distances and The Minimum Distance Score
AUTHORS: David Macêdo ; Teresa Ludermir
CATEGORY: cs.LG [cs.LG, cs.AI, cs.CV, cs.NE]
HIGHLIGHT: In this paper, we propose to perform what we call an isometrization of the distances used in the IsoMax loss.

95, TITLE: An Attention Free Transformer
AUTHORS: SHUANGFEI ZHAI et. al.
CATEGORY: cs.LG [cs.LG, cs.CL, cs.CV]
HIGHLIGHT: We introduce Attention Free Transformer (AFT), an efficient variant of Transformers that eliminates the need for dot product self attention.

96, TITLE: Toward Understanding The Feature Learning Process of Self-supervised Contrastive Learning
AUTHORS: Zixin Wen ; Yuanzhi Li
CATEGORY: cs.LG [cs.LG, cs.CV, stat.ML]
HIGHLIGHT: In this work, we formally study how contrastive learning learns the feature representations for neural networks by analyzing its feature learning process.

97, TITLE: Applications of Epileptic Seizures Detection in Neuroimaging Modalities Using Deep Learning Techniques: Methods, Challenges, and Future Works
AUTHORS: AFSHIN SHOEIBI et. al.
CATEGORY: cs.LG [cs.LG, cs.CV, eess.SP]
HIGHLIGHT: This paper presents a comprehensive overview of the types of DL methods used to diagnose epileptic seizures from various neuroimaging modalities.

98, TITLE: Greedy Bayesian Posterior Approximation with Deep Ensembles
AUTHORS: Aleksei Tiulpin ; Matthew B. Blaschko
CATEGORY: cs.LG [cs.LG, cs.CV]
HIGHLIGHT: This paper proposes a novel and principled method to tackle this limitation, minimizing an $f$-divergence between the true posterior and a kernel density estimator in a function space.

99, TITLE: Representation Learning in Continuous-Time Score-Based Generative Models
AUTHORS: Korbinian Abstreiter ; Stefan Bauer ; Arash Mehrjou
CATEGORY: cs.LG [cs.LG, cs.CV]
HIGHLIGHT: Score-based methods represented as stochastic differential equations on a continuous time domain have recently proven successful as a non-adversarial generative model.

100, TITLE: EDDA: Explanation-driven Data Augmentation to Improve Model and Explanation Alignment
AUTHORS: RUIWEN LI et. al.
CATEGORY: cs.LG [cs.LG, cs.AI, cs.CV]
HIGHLIGHT: We verify in all cases that our explanation-driven data augmentation method improves alignment of the model and explanation in comparison to no data augmentation and non-explanation driven data augmentation methods.

101, TITLE: Consistency Regularization for Variational Auto-Encoders
AUTHORS: Samarth Sinha ; Adji B. Dieng
CATEGORY: cs.LG [cs.LG, cs.CV]
HIGHLIGHT: In this paper, we propose a regularization method to enforce consistency in VAEs.

102, TITLE: An Improved LogNNet Classifier for IoT Application
AUTHORS: Hanif Heidari ; Andrei Velichko
CATEGORY: cs.LG [cs.LG, cs.CV, cs.NE, eess.IV, nlin.CD]
HIGHLIGHT: This paper proposes a feed-forward LogNNet neural network which uses a semi-linear Henon-type discrete chaotic map to classify the MNIST-10 dataset.

103, TITLE: Dominant Patterns: Critical Features Hidden in Deep Neural Networks
AUTHORS: Zhixing Ye ; Shaofei Qin ; Sizhe Chen ; Xiaolin Huang
CATEGORY: cs.LG [cs.LG, cs.CR, cs.CV]
HIGHLIGHT: In this paper, we find the existence of critical features hidden in Deep Neural Networks (DNNs), which are imperceptible but can actually dominate the output of DNNs.

104, TITLE: Q-attention: Enabling Efficient Learning for Vision-based Robotic Manipulation
AUTHORS: Stephen James ; Andrew J. Davison
CATEGORY: cs.RO [cs.RO, cs.AI, cs.CV, cs.LG]
HIGHLIGHT: With this in mind, we present our Attention-driven Robotic Manipulation (ARM) algorithm, which is a general manipulation algorithm that can be applied to a range of sparse-rewarded tasks, given only a small number of demonstrations.

105, TITLE: Orienting Novel 3D Objects Using Self-Supervised Learning of Rotation Transforms
AUTHORS: Shivin Devgon ; Jeffrey Ichnowski ; Ashwin Balakrishna ; Harry Zhang ; Ken Goldberg
CATEGORY: cs.RO [cs.RO, cs.CV]
HIGHLIGHT: We present an algorithm to orient novel objects given a depth image of the object in its current and desired orientation.

106, TITLE: A Remark on A Paper of Krotov and Hopfield [arXiv:2008.06996]
AUTHORS: Fei Tang ; Michael Kopp
CATEGORY: q-bio.NC [q-bio.NC, cs.AI, cs.CV, cs.LG]
HIGHLIGHT: In their recent paper titled "Large Associative Memory Problem in Neurobiology and Machine Learning" [arXiv:2008.06996] the authors gave a biologically plausible microscopic theory from which one can recover many dense associative memory models discussed in the literature.

107, TITLE: Self-Supervised Nonlinear Transform-Based Tensor Nuclear Norm for Multi-Dimensional Image Recovery
AUTHORS: YI-SI LUO et. al.
CATEGORY: eess.IV [eess.IV, cs.CV]
HIGHLIGHT: In this paper, we study multi-dimensional image recovery.

108, TITLE: Boosting The Performance of Video Compression Artifact Reduction with Reference Frame Proposals and Frequency Domain Information
AUTHORS: YI XU et. al.
CATEGORY: eess.IV [eess.IV, cs.CV]
HIGHLIGHT: In this paper, we propose an effective reference frame proposal strategy to boost the performance of the existing multi-frame approaches.

109, TITLE: Classification of Brain Tumours in MR Images Using Deep Spatiospatial Models
AUTHORS: Soumick Chatterjee ; Faraz Ahmed Nizamani ; Andreas Nürnberger ; Oliver Speck
CATEGORY: eess.IV [eess.IV, cs.CV, cs.LG]
HIGHLIGHT: This paper uses two spatiotemporal models, ResNet (2+1)D and ResNet Mixed Convolution, to classify different types of brain tumours.

110, TITLE: SNIPS: Solving Noisy Inverse Problems Stochastically
AUTHORS: Bahjat Kawar ; Gregory Vaksman ; Michael Elad
CATEGORY: eess.IV [eess.IV, cs.CV]
HIGHLIGHT: In this work we introduce a novel stochastic algorithm dubbed SNIPS, which draws samples from the posterior distribution of any linear inverse problem, where the observation is assumed to be contaminated by additive white Gaussian noise.

111, TITLE: Refined Deep Neural Network and U-Net for Polyps Segmentation
AUTHORS: Quoc-Huy Trinh ; Minh-Van Nguyen ; Thiet-Gia Huynh ; Minh-Triet Tran
CATEGORY: eess.IV [eess.IV, cs.CV, cs.LG]
HIGHLIGHT: In this task, we propose methods combining Residual module, Inception module, Adaptive Convolutional neural network with U-Net model, and PraNet for semantic segmentation of various types of polyps in endoscopic images.

112, TITLE: Conditional Deep Convolutional Neural Networks for Improving The Automated Screening of Histopathological Images
AUTHORS: Gianluca Gerard ; Marco Piastra
CATEGORY: eess.IV [eess.IV, cs.CV]
HIGHLIGHT: Our goal is to address this challenge by using a conditional Fully Convolutional Network (co-FCN) whose output can be conditioned at run time, and which can improve its performance when a properly selected set of reference slides are used to condition the output.

113, TITLE: Feasibility Assessment of Multitasking in MRI Neuroimaging Analysis: Tissue Segmentation, Cross-Modality Conversion and Bias Correction
AUTHORS: Mohammad Eslami ; Solale Tabarestani ; Malek Adjouadi
CATEGORY: eess.IV [eess.IV, cs.CV, cs.LG, physics.med-ph]
HIGHLIGHT: This study examines the feasibility of using multitasking in three different applications, including tissue segmentation, cross-modality conversion, and bias-field correction.

114, TITLE: Covid-19 Diagnosis from X-ray Using Neural Networks
AUTHORS: Dinesh J ; Mohammed Rhithick A
CATEGORY: eess.IV [eess.IV, cs.CV, cs.LG]
HIGHLIGHT: Covid-19 Diagnosis from X-ray Using Neural Networks

115, TITLE: CTSpine1K: A Large-Scale Dataset for Spinal Vertebrae Segmentation in Computed Tomography
AUTHORS: YANG DENG et. al.
CATEGORY: eess.IV [eess.IV, cs.CV]
HIGHLIGHT: In this paper, we introduce a large-scale spine CT dataset, called CTSpine1K, curated from multiple sources for vertebra segmentation, which contains 1,005 CT volumes with over 11,100 labeled vertebrae belonging to different spinal conditions.

116, TITLE: Human-level COVID-19 Diagnosis from Low-dose CT Scans Using A Two-stage Time-distributed Capsule Network
AUTHORS: PARNIAN AFSHAR et. al.
CATEGORY: eess.IV [eess.IV, cs.CV, cs.LG]
HIGHLIGHT: In this study, we consider low-dose and ultra-low-dose (LDCT and ULDCT) scan protocols that reduce the radiation exposure close to that of a single X-Ray, while maintaining an acceptable resolution for diagnosis purposes.

117, TITLE: BaMBNet: A Blur-aware Multi-branch Network for Defocus Deblurring
AUTHORS: Pengwei Liang ; Junjun Jiang ; Xianming Liu ; Jiayi Ma
CATEGORY: eess.IV [eess.IV, cs.CV]
HIGHLIGHT: To this end, this study designs a novel blur-aware multi-branch network (BaMBNet), in which different regions (with different blur amounts) should be treated differentially.

118, TITLE: Low-Dose CT Denoising Using A Structure-Preserving Kernel Prediction Network
AUTHORS: LU XU et. al.
CATEGORY: eess.IV [eess.IV, cs.CV]
HIGHLIGHT: To address this issue, we propose a Structure-preserving Kernel Prediction Network (StructKPN) that combines the kernel prediction network with a structure-aware loss function that utilizes the pixel gradient statistics and guides the model towards spatially-variant filters that enhance noise removal, prevent over-smoothing and preserve detailed structures for different regions in CT imaging.

119, TITLE: Hierarchical Deep Network with Uncertainty-aware Semi-supervised Learning for Vessel Segmentation
AUTHORS: CHENXIN LI et. al.
CATEGORY: eess.IV [eess.IV, cs.CV, cs.LG]
HIGHLIGHT: In this paper, to address the above issues, we propose a hierarchical deep network where an attention mechanism localizes the low-contrast capillary regions guided by the whole vessels, and enhances the spatial activation in those areas for the sub-type vessels.

Article tags: vision, computer, papers