What is: Semantic Reasoning Network?

Source: Towards Accurate Scene Text Recognition with Semantic Reasoning Networks
Year: 2020
Data Source: CC BY-SA - https://paperswithcode.com

Semantic Reasoning Network (SRN) is an end-to-end trainable framework for scene text recognition that consists of four parts: a backbone network, a parallel visual attention module (PVAM), a global semantic reasoning module (GSRM), and a visual-semantic fusion decoder (VSFD). Given an input image, the backbone network first extracts 2-D features V. The PVAM then generates N aligned 1-D features G, where each feature corresponds to one character in the text and captures its aligned visual information. These N 1-D features G are fed into the GSRM to capture the semantic information S. Finally, the VSFD fuses the aligned visual features G and the semantic information S to predict N characters. Text strings shorter than N characters are padded with 'EOS' tokens.
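
The fixed-length output convention described above (always N characters, with 'EOS' filling the unused positions) can be illustrated with a minimal sketch. This is not code from the SRN authors: the token spelling `<EOS>`, the helper names `pad_to_length` and `strip_padding`, and the value N = 8 are all assumptions made for this example.

```python
EOS = "<EOS>"  # hypothetical token spelling; the source only calls it 'EOS'


def pad_to_length(text, n, eos=EOS):
    """Return exactly n symbols: the characters of text followed by EOS padding."""
    if len(text) > n:
        raise ValueError(f"text has {len(text)} characters, more than N={n}")
    return list(text) + [eos] * (n - len(text))


def strip_padding(symbols, eos=EOS):
    """Recover the string from N predicted symbols by cutting at the first EOS."""
    chars = []
    for symbol in symbols:
        if symbol == eos:
            break
        chars.append(symbol)
    return "".join(chars)


N = 8  # assumed maximum text length, chosen only for this example
target = pad_to_length("cafe", N)
print(target)                 # 4 characters followed by 4 EOS symbols
print(strip_padding(target))  # the original string, padding removed
```

Padding every label to the same length N is what lets the model emit all N positions in one pass; at inference time, everything from the first 'EOS' onward is simply discarded.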