
What is: PointQuad-Transformer?

Source: PQ-Transformer: Jointly Parsing 3D Objects and Layouts from Point Clouds
Year: 2021
Data Source: CC BY-SA - https://paperswithcode.com

PQ-Transformer, or PointQuad-Transformer, is a Transformer-based architecture that predicts 3D objects and room layouts simultaneously from point cloud inputs. Unlike existing methods, which estimate either layout keypoints or layout edges, PQ-Transformer parameterizes the room layout directly as a set of quads. Alongside the quad representation, it uses a physical constraint loss function that discourages interference between objects and the layout, such as an object box passing through a wall.
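To make the idea of penalizing object-layout interference concrete, here is a minimal sketch of one way such a penalty could be written. It is not the loss from the paper. It assumes that each wall quad is reduced to its supporting plane (a unit normal pointing into the room plus an offset), that each object is an axis-aligned box, and that the penalty is the total depth by which box corners fall behind a wall. All function names are hypothetical.

```python
from itertools import product


def box_corners(center, size):
    """Eight corners of an axis-aligned box given its center and full side lengths."""
    return [
        tuple(c + sign * s / 2.0 for c, sign, s in zip(center, signs, size))
        for signs in product((-1.0, 1.0), repeat=3)
    ]


def signed_distance(point, normal, offset):
    """Distance from a point to the plane normal . x = offset.

    Positive on the room side. The normal is assumed to have unit length.
    """
    return sum(n * p for n, p in zip(normal, point)) - offset


def physical_constraint_penalty(boxes, walls):
    """Total depth by which box corners lie behind wall planes (0 if none do).

    Assumption: walls are treated as infinite planes, which ignores the
    finite extent of a real quad.
    """
    penalty = 0.0
    for center, size in boxes:
        for corner in box_corners(center, size):
            for normal, offset in walls:
                penalty += max(0.0, -signed_distance(corner, normal, offset))
    return penalty


# A wall on the plane x = 0 with the room on the x > 0 side.
wall = ((1.0, 0.0, 0.0), 0.0)
inside = ((1.0, 0.0, 0.0), (1.0, 1.0, 1.0))   # spans x in [0.5, 1.5]
poking = ((0.25, 0.0, 0.0), (1.0, 1.0, 1.0))  # spans x in [-0.25, 0.75]

penalty_inside = physical_constraint_penalty([inside], [wall])
penalty_poking = physical_constraint_penalty([poking], [wall])
```

The hinge `max(0.0, ...)` means a box that stays entirely on the room side contributes nothing, so only violations are penalized. A trainable version would use differentiable tensor operations in place of these Python loops.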

Given an input 3D point cloud of $N$ points, the point cloud feature learning backbone extracts $M$ context-aware point features of $(3+C)$ dimensions through sampling and grouping. A voting module and a farthest point sampling (FPS) module then generate $K_1$ object proposals and $K_2$ quad proposals, respectively. A transformer decoder refines the proposal features, and several feedforward layers followed by non-maximum suppression (NMS) turn the proposals into the final object bounding boxes and layout quads.
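Farthest point sampling, the step used above to seed the quad proposals, can be sketched in a few lines. This is the generic greedy algorithm in plain Python, not code from the PQ-Transformer implementation. The function name, the `start` parameter, and the toy points are illustrative assumptions.

```python
import math


def farthest_point_sampling(points, k, start=0):
    """Return indices of k points chosen greedily to be far from each other."""
    if not 0 < k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    selected = [start]
    # min_dist[i] is the distance from point i to its nearest selected point.
    min_dist = [math.dist(p, points[start]) for p in points]
    while len(selected) < k:
        # Pick the point that is farthest from everything selected so far.
        nxt = max(range(len(points)), key=min_dist.__getitem__)
        selected.append(nxt)
        # Update nearest-selected distances with the newly added point.
        for i, p in enumerate(points):
            min_dist[i] = min(min_dist[i], math.dist(p, points[nxt]))
    return selected


points = [
    (0.0, 0.0, 0.0),
    (0.1, 0.0, 0.0),
    (5.0, 0.0, 0.0),
    (0.0, 3.0, 0.0),
    (5.0, 4.0, 0.0),
]
seeds = farthest_point_sampling(points, k=3)
```

With these toy points, the sampler skips the near-duplicate at index 1 and spreads the seeds across the extent of the cloud, which is the reason FPS is a common choice for picking well-distributed proposal seeds. Each iteration costs a pass over all points, so the sketch runs in O(N·k) time.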