
What is: BezierAlign?

Source: ABCNet: Real-time Scene Text Spotting with Adaptive Bezier-Curve Network
Year: 2020
Data Source: CC BY-SA - https://paperswithcode.com

BezierAlign is a feature sampling method for arbitrarily-shaped scene text recognition that exploits the parameterized nature of a compact Bezier curve bounding box. Unlike RoIAlign, the sampling grid of BezierAlign is not rectangular. Instead, each column of the arbitrarily-shaped grid is orthogonal to the Bezier curve boundary of the text. The sampling points are equidistantly spaced in width and in height, and their values are bilinearly interpolated from the feature map with respect to their coordinates.

Formally, given an input feature map and Bezier curve control points, we concurrently process all the output pixels of the rectangular output feature map with size $h_{\text{out}} \times w_{\text{out}}$. Taking pixel $g_i$ with position $(g_{iw}, g_{ih})$ (from the output feature map) as an example, we calculate $t$ by:

$$t = \frac{g_{iw}}{w_{\text{out}}}$$

We then evaluate both boundary curves at parameter $t$ to obtain the point $tp$ on the upper Bezier curve boundary and the point $bp$ on the lower Bezier curve boundary. Using $tp$ and $bp$, we can linearly index the sampling point $op$ by:

$$op = bp \cdot \frac{g_{ih}}{h_{\text{out}}} + tp \cdot \left(1 - \frac{g_{ih}}{h_{\text{out}}}\right)$$

With the position of $op$, we can easily apply bilinear interpolation on the input feature map to calculate the value of the output pixel. Comparisons among previous sampling methods and BezierAlign are shown in the figure.
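The sampling procedure described above can be sketched in a few lines of pure Python. This is a minimal illustration, not the ABCNet implementation. It assumes cubic Bezier curves (four control points per boundary), a single-channel feature map stored as a nested list, both boundaries listed in the same left-to-right order, and clamping at the feature map border. The function names `cubic_bezier`, `bilinear`, and `bezier_align` are hypothetical and chosen only for this example.

```python
from math import floor


def cubic_bezier(ctrl, t):
    """Evaluate a cubic Bezier curve (4 (x, y) control points) at t in [0, 1]."""
    (x0, y0), (x1, y1), (x2, y2), (x3, y3) = ctrl
    s = 1.0 - t
    # Cubic Bernstein basis weights
    b0, b1, b2, b3 = s ** 3, 3 * s * s * t, 3 * s * t * t, t ** 3
    return (b0 * x0 + b1 * x1 + b2 * x2 + b3 * x3,
            b0 * y0 + b1 * y1 + b2 * y2 + b3 * y3)


def bilinear(feat, x, y):
    """Bilinearly interpolate feat[row][col] at the continuous position (x, y)."""
    h, w = len(feat), len(feat[0])
    # Assumption: clamp to the border; the text does not specify border handling.
    x = min(max(x, 0.0), w - 1.0)
    y = min(max(y, 0.0), h - 1.0)
    x0, y0 = int(floor(x)), int(floor(y))
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    ax, ay = x - x0, y - y0
    top = feat[y0][x0] * (1 - ax) + feat[y0][x1] * ax
    bottom = feat[y1][x0] * (1 - ax) + feat[y1][x1] * ax
    return top * (1 - ay) + bottom * ay


def bezier_align(feat, top_ctrl, bottom_ctrl, h_out, w_out):
    """Sample an h_out x w_out grid between two Bezier boundaries."""
    out = [[0.0] * w_out for _ in range(h_out)]
    for g_w in range(w_out):
        t = g_w / w_out                      # t = g_iw / w_out
        tp = cubic_bezier(top_ctrl, t)       # point on the upper boundary
        bp = cubic_bezier(bottom_ctrl, t)    # point on the lower boundary
        for g_h in range(h_out):
            a = g_h / h_out                  # g_ih / h_out
            # op = bp * a + tp * (1 - a), applied to x and y separately
            op_x = bp[0] * a + tp[0] * (1 - a)
            op_y = bp[1] * a + tp[1] * (1 - a)
            out[g_h][g_w] = bilinear(feat, op_x, op_y)
    return out


# Usage: straight boundaries on a small synthetic feature map.
feature_map = [[10.0 * r + c for c in range(6)] for r in range(4)]
upper = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
lower = [(0.0, 2.0), (1.0, 2.0), (2.0, 2.0), (3.0, 2.0)]
aligned = bezier_align(feature_map, upper, lower, h_out=2, w_out=3)
```

With straight, evenly spaced control points the grid degenerates to an ordinary rectangular sampling, which makes the sketch easy to check by hand. Curved control points bend the sampling columns to follow the text boundary.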