
The EAST Model

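EAST (An Efficient and Accurate Scene Text Detector) takes its name from the initials of its paper title. The model comes from Megvii (曠視科技). Compared with several other scene text detection models, it performs very well, including on the ICDAR 2015 dataset, as shown in the figure below: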

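The red marker shows that EAST surpasses earlier models in both speed and accuracy. EAST is a fully convolutional network (FCN) that predicts, for each pixel, whether it belongs to a text or word region. Compared with earlier convolutional approaches, it drops steps such as region proposal and text region formation, which keeps the pipeline simple: post-processing only needs to filter by a score threshold and apply non-maximum suppression (NMS) to obtain the final text regions. The EAST model structure is as follows: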

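The stem network is a convolutional neural network (CNN) pretrained on ImageNet, such as VGG-16. The remaining branches progressively reduce the feature-map scale through convolution and then merge features from different layers through deconvolution (upsampling), somewhat like the UNet image segmentation network. Finally, the output layer uses 1x1 convolutions to produce score, RBOX, and QUAD. The output parameters are explained as follows: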

Using OpenCV DNN

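The deep neural network (DNN) module in OpenCV 4.0 is considerably more capable. Besides the common image classification, object detection, and image segmentation networks, it supports custom layers and generic network models, and it also provides an API for non-maximum suppression, which makes it convenient to use. A TensorFlow implementation of the EAST model is available at: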

https://github.com/argman/EAST

Download the pretrained model and generate a .pb file. The OpenCV DNN API for importing a TensorFlow model is:

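Net cv::dnn::readNet(
    const String& model,
    const String& config = "",
    const String& framework = ""
)
model is the path to the model file.
config is the configuration file; it is empty by default.
framework is the framework name; it is empty by default, in which case it is deduced from the imported model file.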
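To make the "deduced from the model file" behavior concrete, here is a minimal sketch of extension-based dispatch. The table follows the extensions listed in the OpenCV documentation for readNet, but the exact set varies by OpenCV version, and the names `guess_framework` and `_FRAMEWORK_BY_EXT` are hypothetical helpers for illustration, not part of OpenCV.

```python
import os

# Illustrative extension table (assumption: mirrors the readNet documentation;
# the real set depends on the OpenCV version and build options).
_FRAMEWORK_BY_EXT = {
    ".caffemodel": "caffe",
    ".pb": "tensorflow",
    ".t7": "torch",
    ".net": "torch",
    ".weights": "darknet",
    ".bin": "dldt",
    ".onnx": "onnx",
}


def guess_framework(model_path, framework=""):
    """Return the explicit framework if given, else guess it from the extension."""
    if framework:
        return framework.lower()
    ext = os.path.splitext(model_path)[1].lower()
    # An empty string means the extension is not in the table.
    return _FRAMEWORK_BY_EXT.get(ext, "")


print(guess_framework("frozen_east_text_detection.pb"))
```

Under this scheme the EAST .pb file maps to the TensorFlow importer, which is why the article's code can call readNet with only the model path.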

OpenCV DNN already implements non-maximum suppression. The supported API call is:

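void cv::dnn::NMSBoxes(
    const std::vector<Rect>& bboxes,
    const std::vector<float>& scores,
    const float score_threshold,
    const float nms_threshold,
    std::vector<int>& indices,
    const float eta = 1.f,
    const int top_k = 0
)
bboxes is the list of input boxes (an overload also accepts std::vector<RotatedRect>, which the code below uses).
scores is the score of each box.
score_threshold is the threshold applied to the scores.
nms_threshold is the non-maximum suppression threshold.
indices is the output: an array holding the index of each box that is kept.
eta is the coefficient for the adaptive NMS threshold.
top_k keeps at most that many boxes; 0 means the limit is ignored.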
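The parameters above map onto a greedy suppression loop. The following is a minimal pure-Python sketch for axis-aligned (x, y, w, h) boxes, not OpenCV's implementation: the names `iou` and `nms_boxes` and the sample boxes are hypothetical, and the adaptive `eta` behavior is deliberately left out.

```python
def iou(a, b):
    """Intersection over union of two (x, y, w, h) boxes."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    ix = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    iy = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = ix * iy
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def nms_boxes(bboxes, scores, score_threshold, nms_threshold, top_k=0):
    """Greedy NMS; returns the indices of kept boxes, highest score first."""
    # 1. Drop boxes whose score does not exceed score_threshold.
    order = [i for i, s in enumerate(scores) if s > score_threshold]
    # 2. Visit the survivors from highest to lowest score.
    order.sort(key=lambda i: scores[i], reverse=True)
    if top_k > 0:
        order = order[:top_k]
    keep = []
    for i in order:
        # 3. Keep a box only if it overlaps no kept box by more than nms_threshold.
        if all(iou(bboxes[i], bboxes[k]) <= nms_threshold for k in keep):
            keep.append(i)
    return keep


# Hypothetical detections: two near-duplicates, one separate box, one weak box.
boxes = [(10, 10, 100, 40), (12, 12, 100, 40), (200, 50, 80, 30), (205, 52, 80, 30)]
scores = [0.90, 0.75, 0.60, 0.30]
kept = nms_boxes(boxes, scores, score_threshold=0.5, nms_threshold=0.4)
print(kept)
```

Here box 1 is suppressed because it overlaps the higher-scoring box 0, and box 3 never enters the loop because its score is below the score threshold.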

Code Implementation

First load the model, then open the camera and run detection in real time. The C++ code is as follows:

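#include <opencv2/opencv.hpp>
#include <opencv2/dnn.hpp>
#include <cmath>
#include <string>
#include <vector>

using namespace cv;
using namespace cv::dnn;

// Converts the raw score and geometry maps into rotated boxes and confidences.
void decode(const Mat& scores, const Mat& geometry, float scoreThresh,
            std::vector<RotatedRect>& detections, std::vector<float>& confidences);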

int main(int argc, char** argv)
{
    float confThreshold = 0.5;
    float nmsThreshold = 0.4;
    int inpWidth = 320;
    int inpHeight = 320;
    String model = "D:/python/cv_demo/ocr_demo/frozen_east_text_detection.pb";

    // Load network.
    Net net = readNet(model);

    // Open a camera stream.
    VideoCapture cap(0);

    static const std::string kWinName = "EAST: An Efficient and Accurate Scene Text Detector";
    namedWindow(kWinName, WINDOW_AUTOSIZE);

    // Output layers: text score map and box geometry.
    std::vector<Mat> outs;
    std::vector<String> outNames(2);
    outNames[0] = "feature_fusion/Conv_7/Sigmoid";
    outNames[1] = "feature_fusion/concat_3";

    Mat frame, blob;
    while (waitKey(1) < 0)
    {
        cap >> frame;
        if (frame.empty())
        {
            waitKey();
            break;
        }

        blobFromImage(frame, blob, 1.0, Size(inpWidth, inpHeight), Scalar(123.68, 116.78, 103.94), true, false);
        net.setInput(blob);
        net.forward(outs, outNames);

        Mat scores = outs[0];
        Mat geometry = outs[1];

        // Decode predicted bounding boxes.
        std::vector<RotatedRect> boxes;
        std::vector<float> confidences;
        decode(scores, geometry, confThreshold, boxes, confidences);

        // Apply non-maximum suppression procedure.
        std::vector<int> indices;
        NMSBoxes(boxes, confidences, confThreshold, nmsThreshold, indices);

        // Render detections, scaling from network input size back to frame size.
        Point2f ratio((float)frame.cols / inpWidth, (float)frame.rows / inpHeight);
        for (size_t i = 0; i < indices.size(); ++i)
        {
            RotatedRect& box = boxes[indices[i]];

            Point2f vertices[4];
            box.points(vertices);
            for (int j = 0; j < 4; ++j)
            {
                vertices[j].x *= ratio.x;
                vertices[j].y *= ratio.y;
            }
            for (int j = 0; j < 4; ++j)
                line(frame, vertices[j], vertices[(j + 1) % 4], Scalar(0, 255, 0), 1);
        }

        // Put efficiency information.
        std::vector<double> layersTimes;
        double freq = getTickFrequency() / 1000;
        double t = net.getPerfProfile(layersTimes) / freq;
        std::string label = format("Inference time: %.2f ms", t);
        putText(frame, label, Point(0, 15), FONT_HERSHEY_SIMPLEX, 0.5, Scalar(0, 255, 0));

        imshow(kWinName, frame);
    }
    return 0;
}

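void decode(const Mat& scores, const Mat& geometry, float scoreThresh,
            std::vector<RotatedRect>& detections, std::vector<float>& confidences)
{
    detections.clear();
    CV_Assert(scores.dims == 4); CV_Assert(geometry.dims == 4); CV_Assert(scores.size[0] == 1);
    CV_Assert(geometry.size[0] == 1); CV_Assert(scores.size[1] == 1); CV_Assert(geometry.size[1] == 5);
    CV_Assert(scores.size[2] == geometry.size[2]); CV_Assert(scores.size[3] == geometry.size[3]);

    const int height = scores.size[2];
    const int width = scores.size[3];
    for (int y = 0; y < height; ++y)
    {
        const float* scoresData = scores.ptr<float>(0, 0, y);
        const float* x0_data = geometry.ptr<float>(0, 0, y);
        const float* x1_data = geometry.ptr<float>(0, 1, y);
        const float* x2_data = geometry.ptr<float>(0, 2, y);
        const float* x3_data = geometry.ptr<float>(0, 3, y);
        const float* anglesData = geometry.ptr<float>(0, 4, y);
        for (int x = 0; x < width; ++x)
        {
            float score = scoresData[x];
            if (score < scoreThresh)
                continue;

            // Decode a prediction.
            // Multiply by 4 because the feature maps are 4 times smaller than the input image.
            float offsetX = x * 4.0f, offsetY = y * 4.0f;
            float angle = anglesData[x];
            float cosA = std::cos(angle);
            float sinA = std::sin(angle);
            float h = x0_data[x] + x2_data[x];
            float w = x1_data[x] + x3_data[x];

            Point2f offset(offsetX + cosA * x1_data[x] + sinA * x2_data[x],
                           offsetY - sinA * x1_data[x] + cosA * x2_data[x]);
            Point2f p1 = Point2f(-sinA * h, -cosA * h) + offset;
            Point2f p3 = Point2f(-cosA * w, sinA * w) + offset;
            RotatedRect r(0.5f * (p1 + p3), Size2f(w, h), -angle * 180.0f / (float)CV_PI);
            detections.push_back(r);
            confidences.push_back(score);
        }
    }
}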
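The decode function turns each score-map cell into a rotated box. The sketch below works through that step for a single cell in pure Python. It assumes the RBOX convention from the EAST paper, where the first four geometry channels are the distances from the pixel to the top, right, bottom, and left edges of the box and the fifth is the rotation angle in radians. The stride of 4 reflects the score map being a quarter of the input resolution. The function name `decode_rbox` and the sample values are hypothetical.

```python
import math


def decode_rbox(x, y, d_top, d_right, d_bottom, d_left, angle, stride=4.0):
    """Decode one feature-map cell into (center, (width, height), angle_degrees)."""
    # Position of the cell in network-input coordinates.
    offset_x, offset_y = x * stride, y * stride
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    h = d_top + d_bottom
    w = d_right + d_left
    # Bottom-right corner of the rotated box.
    ox = offset_x + cos_a * d_right + sin_a * d_bottom
    oy = offset_y - sin_a * d_right + cos_a * d_bottom
    # Top-right and bottom-left corners; their midpoint is the box center.
    p1 = (ox - sin_a * h, oy - cos_a * h)
    p3 = (ox - cos_a * w, oy + sin_a * w)
    center = (0.5 * (p1[0] + p3[0]), 0.5 * (p1[1] + p3[1]))
    return center, (w, h), -math.degrees(angle)


# Hypothetical cell (10, 5) with an unrotated box around it.
center, size, angle_deg = decode_rbox(10, 5, 3.0, 6.0, 5.0, 4.0, 0.0)
print(center, size, angle_deg)
```

With a zero angle the result is easy to check by hand: the cell sits at (40, 20) in input coordinates, the box spans x from 36 to 46 and y from 17 to 25, so its center is (41, 21) and its size is 10 by 8.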

	

The Python implementation is as follows:

	
import time

import cv2 as cv

# TextAreaDetector is the article's detector class; its definition is not shown in this listing.
if __name__ == "__main__":
    text_detector = TextAreaDetector("D:/python/cv_demo/ocr_demo/frozen_east_text_detection.pb")
    frame = cv.imread("D:/txt.png")
    start = time.time()
    text_detector.detect(frame)
    end = time.time()
    print("[INFO] text detection took {:.4f} seconds".format(end - start))
    # show the output image
    cv.imshow("Text Detection", frame)
    cv.waitKey(0)
    # run the same detector on the camera stream until ESC is pressed
    cap = cv.VideoCapture(0)
    while True:
        ret, frame = cap.read()
        if ret is not True:
            break
        text_detector.detect(frame)
        cv.imshow("east text detect demo", frame)
        c = cv.waitKey(5)
        if c == 27:
            break
    cap.release()
    cv.destroyAllWindows()
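The C++ listing earlier in this section calls blobFromImage with a scale factor of 1.0, the mean (123.68, 116.78, 103.94), and swapRB set to true. As a rough illustration of what that does to one pixel, the sketch below swaps a BGR pixel to RGB order and subtracts the per-channel mean. It is a simplified model of the documented behavior and ignores resizing. The function name `preprocess_pixel` is hypothetical.

```python
MEAN_RGB = (123.68, 116.78, 103.94)


def preprocess_pixel(bgr, mean=MEAN_RGB, scale=1.0, swap_rb=True):
    """Apply channel swap, mean subtraction, and scaling to one BGR pixel."""
    b, g, r = bgr
    # With swap_rb the output is in RGB order and the mean is read as (R, G, B).
    channels = (r, g, b) if swap_rb else (b, g, r)
    return tuple(scale * (c - m) for c, m in zip(channels, mean))


# A pure red pixel in OpenCV's default BGR order.
print(preprocess_pixel((0, 0, 255)))
```

A pixel whose color equals the mean maps to roughly zero in every channel, which is the point of the subtraction: the network sees inputs centered the same way as during training.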


	

Results

Book cover: image detection

Text detection in a video scene

Handwritten text detection

	

	

Reviewing editor: Peng Jing