Image Sizes in Tensorflow Object Detection using pretrained models
I see that the Tensorflow Object Detection API allows one to customise the image sizes that are fed in. My question is how this works with the pretrained weights, which are trained on 224*224 images, or on 300*300 images.
In other frameworks I have used, such as Caffe R-FCN, YOLO and Keras SSD, the images are downscaled to fit the standard size that comes with the pretrained weights.
Are the pretrained weights used by TF of the 300*300 input size? If so, how can we use these weights to classify customised image sizes? Does TF downsize to the respective weights size?
To my understanding, the input size affects the input layer of the network. Please correct me if I am wrong, I'm still quite new to the whole deep learning paradigm.
I have used three models from the Tensorflow Object Detection API: Faster R-CNN and R-FCN, both with a ResNet-101 feature extractor, and an SSD model with Inception V2. The SSD model reshapes images to a fixed M x M size. This is mentioned in the paper "Speed/accuracy trade-offs for modern convolutional object detectors" by Huang et al., whereas in Faster R-CNN and R-FCN the models are trained on images scaled to M pixels on the shorter edge. This resizing is located in the preprocessing stage of the model.
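To make the difference between the two resizing strategies concrete, here is a minimal sketch in plain Python. It only computes the output dimensions and is not the API's own code; the function names fixed_shape_size and shorter_edge_size are hypothetical, and the rounding rule is an assumption.

```python
def fixed_shape_size(height, width, m):
    """SSD-style resize: every image becomes m x m (aspect ratio may change)."""
    return (m, m)


def shorter_edge_size(height, width, m):
    """Faster R-CNN / R-FCN-style resize: scale so the shorter edge is m pixels.

    The aspect ratio is preserved, so the longer edge varies per image.
    Rounding to the nearest pixel is an assumption of this sketch.
    """
    scale = m / min(height, width)
    return (round(height * scale), round(width * scale))


if __name__ == "__main__":
    for h, w in [(480, 640), (1080, 1920)]:
        print((h, w), "->", fixed_shape_size(h, w, 300), shorter_edge_size(h, w, 600))
```

With the first strategy every input ends up with the same shape. With the second, only the shorter edge is fixed, so images of different aspect ratios still reach the network at different sizes.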
Another method is to keep the aspect ratio and crop a fixed size out of the image. One can crop at different positions (center, top-left, top-right, bottom-left, bottom-right, etc.) to make the model robust. More sophisticated ways include resizing the image to several scales and cropping, and using different aspect ratios in the convolutional layers with an adaptive pooling size later to produce the same feature dimension, as in SPP (see "Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition" by He et al. for more detail). Resizing that keeps the aspect ratio is what the keep_aspect_ratio_resizer in the config proto does.
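The cropping idea can be sketched as follows. This is an illustrative plain-Python example, not something taken from the Object Detection API: crop_boxes and crop are hypothetical helpers, and the image is assumed to be a simple list of pixel rows.

```python
def crop_boxes(height, width, crop_h, crop_w):
    """Return (top, left, bottom, right) boxes for five fixed-size crops."""
    if crop_h > height or crop_w > width:
        raise ValueError("crop is larger than the image")
    max_top, max_left = height - crop_h, width - crop_w
    origins = {
        "top-left": (0, 0),
        "top-right": (0, max_left),
        "bottom-left": (max_top, 0),
        "bottom-right": (max_top, max_left),
        "center": (max_top // 2, max_left // 2),
    }
    return {name: (t, l, t + crop_h, l + crop_w) for name, (t, l) in origins.items()}


def crop(image, box):
    """Cut a box out of an image stored as a list of rows."""
    top, left, bottom, right = box
    return [row[left:right] for row in image[top:bottom]]


if __name__ == "__main__":
    # A dummy 256 x 340 single-channel image, cropped to 224 x 224.
    image = [[0] * 340 for _ in range(256)]
    for name, box in crop_boxes(256, 340, 224, 224).items():
        patch = crop(image, box)
        print(name, box, (len(patch), len(patch[0])))
```

Every crop has the same fixed size regardless of the original image dimensions, which is what lets a network with a fixed input size see images of arbitrary shape.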
To my understanding, this makes the architectures resilient to different image sizes. The internal weights of the hidden layers are not affected by the input size of the image.
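A small sketch of why convolutional weights do not depend on the input size: the same 3 x 3 kernel slides over images of any shape, and only the size of the resulting feature map changes. This is a toy plain-Python illustration with hypothetical helper names (conv2d_valid, global_average_pool), not code from any of the detectors mentioned.

```python
def conv2d_valid(image, kernel):
    """Slide one kernel over a 2-D image (no padding, stride 1)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b] for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def global_average_pool(feature_map):
    """Collapse a feature map of any size into a single value."""
    values = [v for row in feature_map for v in row]
    return sum(values) / len(values)


if __name__ == "__main__":
    kernel = [[1, 0, -1], [1, 0, -1], [1, 0, -1]]  # the same 9 weights for every input
    small = [[1] * 8 for _ in range(8)]            # 8 x 8 image
    large = [[1] * 12 for _ in range(20)]          # 20 x 12 image
    for image in (small, large):
        fmap = conv2d_valid(image, kernel)
        print((len(fmap), len(fmap[0])), global_average_pool(fmap))
```

The kernel has nine weights whichever image it is applied to. A layer that expects a fixed-length vector, such as a fully connected layer, is the part that constrains the input size, and pooling to a fixed dimension (as in SPP) is one way around that.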