C# Open Vocabulary Object Detection: Deploying Open-Domain Object Detection


Contents

Introduction

Results

Model information

owlvit-image.onnx

owlvit-post.onnx

owlvit-text.onnx

Project

Code

Form1.cs

OWLVIT.cs

Download



Introduction

Training source code: https://github.com/google-research/scenic/tree/main/scenic/projects/owl_vit

Results

Model information

owlvit-image.onnx

Inputs
-------------------------
name:pixel_values
tensor:Float[1, 3, 768, 768]
---------------------------------------------------------------
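The image encoder expects a 1x3x768x768 float tensor. In OWLVIT.cs below, each channel is normalized in a single ConvertTo call with scale 1/(255*std) and offset -mean/std. The following is a minimal sketch, using a hypothetical helper class `PixelNorm` and plain arrays instead of OpenCvSharp, that checks this fused form against the usual (x/255 - mean)/std formula. The mean and std values are the ones from the listing in this article.

```csharp
using System;

// Hypothetical helper for illustration; not part of the original project.
public static class PixelNorm
{
    // Per-channel statistics in RGB order, copied from OWLVIT.cs in this article.
    public static readonly float[] Mean = { 0.48145466f, 0.4578275f, 0.40821073f };
    public static readonly float[] Std = { 0.26862954f, 0.26130258f, 0.27577711f };

    // Textbook form: scale to [0, 1], subtract the mean, divide by the std.
    public static float Direct(byte value, int channel)
    {
        return (value / 255.0f - Mean[channel]) / Std[channel];
    }

    // Fused form, matching ConvertTo(alpha, beta): value * alpha + beta.
    public static float Fused(byte value, int channel)
    {
        float alpha = 1.0f / (255.0f * Std[channel]);
        float beta = -Mean[channel] / Std[channel];
        return value * alpha + beta;
    }
}
```

Both forms agree up to floating-point rounding, so the single-call version only saves a pass over the image.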

Outputs
-------------------------
name:image_embeds
tensor:Float[1, 24, 24, 768]
name:pred_boxes
tensor:Float[1, 576, 4]
---------------------------------------------------------------
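The 576 predicted boxes correspond to the 24x24 patch grid. The code in OWLVIT.cs treats each box as normalized (center x, center y, width, height), scales it to 768x768, and later rescales it to the source image. The sketch below, with a hypothetical `BoxDecoder` class, folds those two steps into one multiplication by the source image size, which is mathematically equivalent up to rounding. The box layout is an assumption taken from how the listing reads the tensor.

```csharp
using System;

// Hypothetical helper for illustration; not part of the original project.
public static class BoxDecoder
{
    // Converts one normalized (cx, cy, w, h) box into integer pixel corners
    // (xmin, ymin, xmax, ymax), clamped to the bounds of the target image.
    public static (int XMin, int YMin, int XMax, int YMax) ToPixelCorners(
        float cx, float cy, float w, float h, int imageWidth, int imageHeight)
    {
        int xmin = (int)((cx - w * 0.5f) * imageWidth);
        int ymin = (int)((cy - h * 0.5f) * imageHeight);
        int xmax = (int)((cx + w * 0.5f) * imageWidth);
        int ymax = (int)((cy + h * 0.5f) * imageHeight);

        return (Clamp(xmin, 0, imageWidth - 1),
                Clamp(ymin, 0, imageHeight - 1),
                Clamp(xmax, 0, imageWidth - 1),
                Clamp(ymax, 0, imageHeight - 1));
    }

    static int Clamp(int value, int lo, int hi)
    {
        return Math.Max(lo, Math.Min(value, hi));
    }
}
```

For example, a box centered in a 640x480 image with normalized size 0.5 x 0.25 maps to the corners (160, 180) and (480, 300).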

owlvit-post.onnx

Inputs
-------------------------
name:image_embeds
tensor:Float[1, 24, 24, 768]
name:/owlvit/Div_output_0
tensor:Float[1, 512]
name:input_ids
tensor:Int64[1, 16]
---------------------------------------------------------------

Outputs
-------------------------
name:logits
tensor:Float[-1, 576, 1]
---------------------------------------------------------------
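The post-processing model returns one logit per candidate box for a given text query. In OWLVIT.cs these logits are passed through a sigmoid and kept when the score is at least 0.02 (a logit of roughly -3.89). A minimal sketch of that filtering step follows; the `LogitFilter` class and its method names are hypothetical.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration; not part of the original project.
public static class LogitFilter
{
    public static float Sigmoid(float x)
    {
        return (float)(1.0 / (1.0 + Math.Exp(-x)));
    }

    // Returns (box index, score) for every logit whose sigmoid score
    // reaches the threshold. The default matches bbox_threshold in OWLVIT.cs.
    public static List<(int Index, float Score)> Keep(
        IReadOnlyList<float> logits, float threshold = 0.02f)
    {
        var kept = new List<(int Index, float Score)>();
        for (int i = 0; i < logits.Count; i++)
        {
            float score = Sigmoid(logits[i]);
            if (score >= threshold)
            {
                kept.Add((i, score));
            }
        }
        return kept;
    }
}
```

The returned index is the position in the 576-box list, so it can be used directly to look up the matching entry of pred_boxes.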

owlvit-text.onnx

Inputs
-------------------------
name:input_ids
tensor:Int64[1, 16]
name:attention_mask
tensor:Int64[1, 16]
---------------------------------------------------------------

Outputs
-------------------------
name:text_embeds
tensor:Float[1, 1, 512]
---------------------------------------------------------------
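The text encoder takes fixed-length inputs of 16 token ids plus an attention mask of the same length. OWLVIT.cs pads short prompts with zeros, marks real tokens with 1 in the mask, and silently drops tokens beyond position 16. The sketch below reproduces that behavior with a hypothetical `TokenPadding` class; the token ids in the usage note are made-up values, not real vocabulary entries.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration; not part of the original project.
public static class TokenPadding
{
    // Pads (or truncates) a token list to a fixed length and builds the
    // matching attention mask: 1 for real tokens, 0 for padding.
    public static (long[] InputIds, long[] AttentionMask) Pad(
        IReadOnlyList<int> tokens, int length = 16, long padId = 0)
    {
        var ids = new long[length];
        var mask = new long[length];
        for (int j = 0; j < length; j++)
        {
            if (j < tokens.Count)
            {
                ids[j] = tokens[j];
                mask[j] = 1;
            }
            else
            {
                ids[j] = padId;
                mask[j] = 0;
            }
        }
        return (ids, mask);
    }
}
```

Calling `TokenPadding.Pad(new List<int> { 11, 22, 33 })` yields ids starting with 11, 22, 33 followed by thirteen zeros, and a mask with three ones followed by thirteen zeros.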

Project

Code

Form1.cs

using OpenCvSharp;
using System;
using System.Collections.Generic;
using System.Drawing;
using System.Linq;
using System.Text;
using System.Windows.Forms;

namespace Onnx_Demo
{
    public partial class Form1 : Form
    {
        public Form1()
        {
            InitializeComponent();
        }

        // Image encoder, text encoder, post-processing model and vocabulary file.
        OWLVIT owlvit = new OWLVIT("model/owlvit-image.onnx", "model/owlvit-text.onnx", "model/owlvit-post.onnx", "model/vocab.txt");

        string image_path = "";
        string fileFilter = "*.*|*.bmp;*.jpg;*.jpeg;*.tif;*.tiff;*.png";
        StringBuilder sb = new StringBuilder();

        Mat image;
        Mat result_image;

        // Select an image.
        private void button2_Click(object sender, EventArgs e)
        {
            OpenFileDialog ofd = new OpenFileDialog();
            ofd.Filter = fileFilter;
            if (ofd.ShowDialog() != DialogResult.OK) return;

            pictureBox1.Image = null;
            pictureBox2.Image = null;
            txtInfo.Text = "";

            image_path = ofd.FileName;
            pictureBox2.Image = new Bitmap(image_path);
            image = new Mat(image_path);

            // detect() relies on the features cached by encode_image(),
            // so they must be recomputed for every newly selected image.
            owlvit.encode_image(image);
        }

        // Run detection for the prompts in the text box.
        private void button3_Click(object sender, EventArgs e)
        {
            if (image_path == "")
            {
                return;
            }

            if (String.IsNullOrEmpty(txt_input_text.Text))
            {
                return;
            }

            txtInfo.Text = "Detecting, please wait...";
            button3.Enabled = false;

            // Release the previous result before clearing the picture box.
            if (pictureBox1.Image != null)
            {
                pictureBox1.Image.Dispose();
                pictureBox1.Image = null;
            }
            Application.DoEvents();

            // Prompts are separated by ';'.
            List<string> texts = txt_input_text.Text.Split(new char[] { ';' }, StringSplitOptions.RemoveEmptyEntries).ToList();

            owlvit.encode_texts(texts);
            List<BoxInfo> objects = owlvit.detect(image, texts);

            result_image = image.Clone();
            sb.Clear();
            for (int i = 0; i < objects.Count; i++)
            {
                Cv2.Rectangle(result_image, objects[i].box, new Scalar(0, 0, 255), 2);
                Cv2.PutText(result_image, objects[i].text + " " + objects[i].prob.ToString("F2"), new OpenCvSharp.Point(objects[i].box.X, objects[i].box.Y), HersheyFonts.HersheySimplex, 1, new Scalar(0, 0, 255), 2);
                sb.AppendLine(objects[i].text + " " + objects[i].prob.ToString("F2"));
            }

            pictureBox1.Image = new Bitmap(result_image.ToMemoryStream());
            button3.Enabled = true;
            txtInfo.Text = sb.ToString();
        }

        // Load a default test image and encode it once at startup.
        private void Form1_Load(object sender, EventArgs e)
        {
            image_path = "test_img/2.jpg";
            pictureBox2.Image = new Bitmap(image_path);
            image = new Mat(image_path);

            owlvit.encode_image(image);
        }
    }
}
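Form1.cs splits the text box content on ';' so that several prompts can be queried in one run. The original keeps each piece exactly as typed, including surrounding spaces. The sketch below shows a slightly stricter variant that also trims whitespace and drops blank entries; the `PromptParser` class is hypothetical and the trimming is an addition, not something the original code does.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper for illustration; not part of the original project.
public static class PromptParser
{
    // Splits "a;b; c" style input into individual prompts, trimming
    // whitespace and discarding entries that end up empty.
    public static List<string> Parse(string input)
    {
        return input
            .Split(new[] { ';' }, StringSplitOptions.RemoveEmptyEntries)
            .Select(s => s.Trim())
            .Where(s => s.Length > 0)
            .ToList();
    }
}
```

With this variant, the input "a photo of a cat; a photo of a dog;;" produces the two prompts "a photo of a cat" and "a photo of a dog".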

OWLVIT.cs 

using Microsoft.ML.OnnxRuntime;
using Microsoft.ML.OnnxRuntime.Tensors;
using OpenCvSharp;
using OpenCvSharp.Dnn;
using System;
using System.Collections.Generic;
using System.Linq;

namespace Onnx_Demo
{
    // TokenizerBase, TokenizerClip and BoxInfo are used below but are not listed in this article.
    public class OWLVIT
    {
        float bbox_threshold = 0.02f;

        int inpWidth = 768;
        int inpHeight = 768;

        // Per-channel normalization statistics (RGB order).
        float[] mean = new float[] { 0.48145466f, 0.4578275f, 0.40821073f };
        float[] std = new float[] { 0.26862954f, 0.26130258f, 0.27577711f };

        Net net;
        float[] image_features_input;

        SessionOptions options;
        InferenceSession onnx_session;

        List<NamedOnnxValue> input_container;
        IDisposableReadOnlyCollection<DisposableNamedOnnxValue> result_infer;
        DisposableNamedOnnxValue[] results_onnxvalue;
        Tensor<float> result_tensors;

        TokenizerBase tokenizer;

        SessionOptions options_transformer;
        InferenceSession onnx_session_transformer;

        float[] image_features;
        List<long[]> input_ids = new List<long[]>();
        List<float[]> text_features = new List<float[]>();
        long[] attention_mask;

        int len_image_feature = 24 * 24 * 768;
        int cnt_pred_boxes = 576;
        int len_text_token = 16;
        int context_length = 52;
        int len_text_feature = 512;
        int[] image_features_shape = { 1, 24, 24, 768 };
        int[] text_features_shape = { 1, 512 };

        public int imgnum = 0;
        public List<string> imglist = new List<string>();

        List<Rect2f> pred_boxes = new List<Rect2f>();

        public OWLVIT(string image_modelpath, string text_modelpath, string decoder_model_path, string vocab_path)
        {
            // The image encoder runs through OpenCV DNN; the other two models use ONNX Runtime.
            net = CvDnn.ReadNetFromOnnx(image_modelpath);

            input_container = new List<NamedOnnxValue>();

            options = new SessionOptions();
            options.LogSeverityLevel = OrtLoggingLevel.ORT_LOGGING_LEVEL_INFO;
            options.AppendExecutionProvider_CPU(0);
            onnx_session = new InferenceSession(text_modelpath, options);

            options_transformer = new SessionOptions();
            options_transformer.LogSeverityLevel = OrtLoggingLevel.ORT_LOGGING_LEVEL_INFO;
            options_transformer.AppendExecutionProvider_CPU(0);
            onnx_session_transformer = new InferenceSession(decoder_model_path, options_transformer);

            load_tokenizer(vocab_path);
        }

        void load_tokenizer(string vocab_path)
        {
            tokenizer = new TokenizerClip();
            tokenizer.load_tokenize(vocab_path);
        }

        // Converts BGR to RGB and applies (x / 255 - mean) / std per channel, in place.
        Mat normalize_(Mat src)
        {
            Cv2.CvtColor(src, src, ColorConversionCodes.BGR2RGB);
            Mat[] channels = src.Split();
            for (int i = 0; i < channels.Length; ++i)
            {
                channels[i].ConvertTo(channels[i], MatType.CV_32FC1, 1.0 / (255.0 * std[i]), (0.0 - mean[i]) / std[i]);
            }
            Cv2.Merge(channels, src);
            foreach (Mat channel in channels)
            {
                channel.Dispose();
            }
            return src;
        }

        float sigmoid(float x)
        {
            return (float)(1.0f / (1.0f + Math.Exp(-x)));
        }

        // Runs the image encoder and caches the image features and candidate boxes.
        public unsafe void encode_image(Mat srcimg)
        {
            pred_boxes.Clear();

            Mat temp_image = new Mat();
            Cv2.Resize(srcimg, temp_image, new Size(inpWidth, inpHeight));
            Mat normalized_mat = normalize_(temp_image);
            Mat blob = CvDnn.BlobFromImage(normalized_mat);
            net.SetInput(blob);

            // Run inference and read the outputs.
            Mat[] outs = new Mat[2] { new Mat(), new Mat() };
            string[] outBlobNames = net.GetUnconnectedOutLayersNames().ToArray();
            net.Forward(outs, outBlobNames);

            float* ptr_feat = (float*)outs[0].Data;
            image_features = new float[len_image_feature];
            for (int i = 0; i < len_image_feature; i++)
            {
                image_features[i] = ptr_feat[i];
            }

            // Boxes are (cx, cy, w, h), normalized to [0, 1]; store them as
            // top-left rectangles in the 768x768 input space.
            float* ptr_box = (float*)outs[1].Data;
            Rect2f temp;
            for (int i = 0; i < cnt_pred_boxes; i++)
            {
                float xc = ptr_box[i * 4 + 0] * inpWidth;
                float yc = ptr_box[i * 4 + 1] * inpHeight;
                temp = new Rect2f();
                temp.Width = ptr_box[i * 4 + 2] * inpWidth;
                temp.Height = ptr_box[i * 4 + 3] * inpHeight;
                temp.X = (float)(xc - temp.Width * 0.5);
                temp.Y = (float)(yc - temp.Height * 0.5);
                pred_boxes.Add(temp);
            }
        }

        // Tokenizes each prompt, pads it to 16 tokens and runs the text encoder.
        public unsafe void encode_texts(List<string> texts)
        {
            List<List<int>> text_token = new List<List<int>>(texts.Count);
            for (int i = 0; i < texts.Count; i++)
            {
                text_token.Add(new List<int>());
            }

            text_features.Clear();
            input_ids.Clear();

            for (int i = 0; i < texts.Count; i++)
            {
                tokenizer.encode_text(texts[i], text_token[i]);
                int len_ids = text_token[i].Count;

                long[] temp_ids = new long[len_text_token];
                attention_mask = new long[len_text_token];
                for (int j = 0; j < len_text_token; j++)
                {
                    if (j < len_ids)
                    {
                        temp_ids[j] = text_token[i][j];
                        attention_mask[j] = 1;
                    }
                    else
                    {
                        temp_ids[j] = 0;
                        attention_mask[j] = 0;
                    }
                }
                input_ids.Add(temp_ids);
                input_container.Clear();

                Tensor<long> input_tensor = new DenseTensor<long>(input_ids[i], new[] { 1, len_text_token });
                Tensor<long> input_tensor_mask = new DenseTensor<long>(attention_mask, new[] { 1, attention_mask.Length });

                input_container.Add(NamedOnnxValue.CreateFromTensor("input_ids", input_tensor));
                // attention_mask must receive the mask tensor, not the token ids.
                input_container.Add(NamedOnnxValue.CreateFromTensor("attention_mask", input_tensor_mask));

                result_infer = onnx_session.Run(input_container);
                results_onnxvalue = result_infer.ToArray();
                result_tensors = results_onnxvalue[0].AsTensor<float>();

                float[] temp_text_features = results_onnxvalue[0].AsTensor<float>().ToArray();
                text_features.Add(temp_text_features);
            }
        }

        // Runs the post-processing model for one text query and returns one logit per box.
        List<float> decode(float[] input_image_feature, float[] input_text_feature, long[] input_id)
        {
            input_container.Clear();

            Tensor<float> input_tensor_image_embeds = new DenseTensor<float>(input_image_feature, image_features_shape);
            Tensor<float> input_tensor_Div_output_0 = new DenseTensor<float>(input_text_feature, text_features_shape);
            Tensor<long> input_tensor_ids = new DenseTensor<long>(input_id, new[] { 1, 16 });

            /*
                name:image_embeds
                tensor:Float[1, 24, 24, 768]
                name:/owlvit/Div_output_0
                tensor:Float[1, 512]
                name:input_ids
                tensor:Int64[1, 16]
            */
            input_container.Add(NamedOnnxValue.CreateFromTensor("image_embeds", input_tensor_image_embeds));
            input_container.Add(NamedOnnxValue.CreateFromTensor("/owlvit/Div_output_0", input_tensor_Div_output_0));
            input_container.Add(NamedOnnxValue.CreateFromTensor("input_ids", input_tensor_ids));

            result_infer = onnx_session_transformer.Run(input_container);
            results_onnxvalue = result_infer.ToArray();
            result_tensors = results_onnxvalue[0].AsTensor<float>();

            return results_onnxvalue[0].AsTensor<float>().ToList();
        }

        // Scores every cached box against every encoded prompt, then applies NMS.
        public List<BoxInfo> detect(Mat srcimg, List<string> texts)
        {
            float ratioh = 1.0f * srcimg.Rows / inpHeight;
            float ratiow = 1.0f * srcimg.Cols / inpWidth;

            List<float> confidences = new List<float>();
            List<Rect> boxes = new List<Rect>();
            List<string> className = new List<string>();

            for (int i = 0; i < input_ids.Count; i++)
            {
                List<float> logits = decode(image_features, text_features[i], input_ids[i]);

                for (int j = 0; j < logits.Count; j++)
                {
                    float score = sigmoid(logits[j]);
                    if (score >= bbox_threshold)
                    {
                        // Map back to the original image.
                        int xmin = (int)(pred_boxes[j].X * ratiow);
                        int ymin = (int)(pred_boxes[j].Y * ratioh);
                        int xmax = (int)((pred_boxes[j].X + pred_boxes[j].Width) * ratiow);
                        int ymax = (int)((pred_boxes[j].Y + pred_boxes[j].Height) * ratioh);

                        // Clamp to the image bounds.
                        xmin = Math.Max(Math.Min(xmin, srcimg.Cols - 1), 0);
                        ymin = Math.Max(Math.Min(ymin, srcimg.Rows - 1), 0);
                        xmax = Math.Max(Math.Min(xmax, srcimg.Cols - 1), 0);
                        ymax = Math.Max(Math.Min(ymax, srcimg.Rows - 1), 0);

                        boxes.Add(new Rect(xmin, ymin, xmax - xmin, ymax - ymin));
                        confidences.Add(score);
                        className.Add(texts[i]);
                    }
                }
            }

            float nmsThreshold = 0.5f;
            int[] indices;
            CvDnn.NMSBoxes(boxes, confidences, bbox_threshold, nmsThreshold, out indices);

            List<BoxInfo> objects = new List<BoxInfo>();
            for (int i = 0; i < indices.Length; ++i)
            {
                // NMSBoxes returns positions in the input lists, so look up by indices[i].
                int idx = indices[i];
                BoxInfo temp = new BoxInfo();
                temp.text = className[idx];
                temp.prob = confidences[idx];
                temp.box = boxes[idx];
                objects.Add(temp);
            }

            return objects;
        }
    }
}
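The detect method ends with non-maximum suppression, which returns the positions of the surviving boxes in the input lists rather than the boxes themselves. The sketch below is a plain greedy NMS written without OpenCvSharp. It is a stand-in used here for illustration, not the implementation behind CvDnn.NMSBoxes, and the `SimpleNms` class and its `Box` struct are hypothetical. It shows why results have to be looked up through the returned indices.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper for illustration; not part of the original project.
public static class SimpleNms
{
    public readonly struct Box
    {
        public readonly int X, Y, W, H;
        public Box(int x, int y, int w, int h) { X = x; Y = y; W = w; H = h; }
    }

    // Intersection over union of two axis-aligned boxes.
    static float IoU(Box a, Box b)
    {
        int iw = Math.Max(0, Math.Min(a.X + a.W, b.X + b.W) - Math.Max(a.X, b.X));
        int ih = Math.Max(0, Math.Min(a.Y + a.H, b.Y + b.H) - Math.Max(a.Y, b.Y));
        float inter = iw * ih;
        float union = a.W * a.H + b.W * b.H - inter;
        return union <= 0 ? 0f : inter / union;
    }

    // Greedy NMS: visit boxes by descending score and keep a box only if it
    // does not overlap an already kept box by more than iouThreshold.
    // Returns indices into the input lists, highest score first.
    public static List<int> Run(IReadOnlyList<Box> boxes, IReadOnlyList<float> scores, float iouThreshold)
    {
        var order = Enumerable.Range(0, boxes.Count).OrderByDescending(i => scores[i]);
        var kept = new List<int>();
        foreach (int i in order)
        {
            bool suppressed = kept.Any(k => IoU(boxes[i], boxes[k]) > iouThreshold);
            if (!suppressed)
            {
                kept.Add(i);
            }
        }
        return kept;
    }
}
```

If three boxes are passed in and the first is suppressed by the second, the result is the index list [1, 2]. Reading entries 0 and 1 of the input lists instead would report the suppressed box and skip a surviving one.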

Download

Source code download




http://www.chinasem.cn/article/771420
