基于模因框架的包装过滤特征选择算法

2024-05-09 00:58

本文主要是介绍基于模因框架的包装过滤特征选择算法,希望对大家解决编程问题提供一定的参考价值,需要的开发者们随着小编来一起学习吧!

#引用

##LaTex

@ARTICLE{4067093,
author={Z. Zhu and Y. S. Ong and M. Dash},
journal={IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics)},
title={Wrapper ndash;Filter Feature Selection Algorithm Using a Memetic Framework},
year={2007},
volume={37},
number={1},
pages={70-76},
keywords={biology computing;genetic algorithms;learning (artificial intelligence);pattern classification;search problems;classification problem;feature selection algorithm;genetic algorithm;local search;memetic framework;microarray data set;wrapper filter;Acceleration;Classification algorithms;Computational efficiency;Filters;Genetic algorithms;Machine learning;Machine learning algorithms;Pattern recognition;Pervasive computing;Spatial databases;Chi-square;feature selection;filter;gain ratio;genetic algorithm (GA);hybrid GA (HGA);memetic algorithm (MA);relief;wrapper;Algorithms;Artificial Intelligence;Biomimetics;Computer Simulation;Models, Theoretical;Pattern Recognition, Automated;Software;Systems Theory},
doi={10.1109/TSMCB.2006.883267},
ISSN={1083-4419},
month={Feb},}

##Normal

Z. Zhu, Y. S. Ong and M. Dash, “Wrapper–Filter Feature Selection Algorithm Using a Memetic Framework,” in IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics), vol. 37, no. 1, pp. 70-76, Feb. 2007.
doi: 10.1109/TSMCB.2006.883267
keywords: {biology computing;genetic algorithms;learning (artificial intelligence);pattern classification;search problems;classification problem;feature selection algorithm;genetic algorithm;local search;memetic framework;microarray data set;wrapper filter;Acceleration;Classification algorithms;Computational efficiency;Filters;Genetic algorithms;Machine learning;Machine learning algorithms;Pattern recognition;Pervasive computing;Spatial databases;Chi-square;feature selection;filter;gain ratio;genetic algorithm (GA);hybrid GA (HGA);memetic algorithm (MA);relief;wrapper;Algorithms;Artificial Intelligence;Biomimetics;Computer Simulation;Models, Theoretical;Pattern Recognition, Automated;Software;Systems Theory},
URL: http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=4067093&isnumber=4067063


#摘要

a novel hybrid wrapper and filter feature selection algorithm for a classification problem using a memetic framework

a filter ranking method
genetic algorithm
univariate feature ranking information

the University of California, Irvine repository and microarray data sets

classification accuracy, number of selected features, and computational efficiency.

memetic algorithm (MA) — balance between local search and genetic search
to maximize search quality and efficiency


#主要内容

  1. filter methods
  2. wrapper methods

##wrapper–filter feature selection algorithm (WFFSA) using a memetic framework

这里写图片描述

WFFSA

Lamarckian learning

local improvement
Genetic operators


###A 编码表示与初始化

这里写图片描述

a chromosome is a binary string of length equal to the total number of features

randomly initialized


###B 目标函数

the classification accuracy

这里写图片描述

S c S_c Sc — the corresponding selected feature subset encoded in chromosome c c c
J ( S c ) J \left( S_c \right) J(Sc) — criterion function


###C LS改进过程

domain knowledge and heuristics

filter ranking methods as memes or LS heuristics

three different filter ranking methods, namely:

  1. ReliefF;
  2. gain ratio;
  3. chi-square.

based on different criteria:

  1. Euclidean distance,
  2. information entropy,
  3. chi-square statistics

basic LS operators:

  1. “Add”: select a feature from Y using the linear ranking selection and move it to X.
  2. “Del”: select a feature from X using the linear ranking selection and move it to Y .

这里写图片描述

The intensity of LS — the LS length l l l and interval w w w
LS length l l l — the maximum number of Del and Add operations in each LS — l 2 l^2 l2 possible combinations of Add and Del operations
interval w w w — the w w w elite chromosomes in the population

until a local optimum or an improvement is reached

  1. Improvement First Strategy: a random choice from the l 2 l^2 l2 combinations. stops once an improvement is obtained either in terms of classification accuracy or a reduction in the number of selected features without deterioration in accuracy greater than ε ε ε.
    这里写图片描述
  2. Greedy Strategy: carries out all possible l 2 l^2 l2 combinations — the best improved solution
    这里写图片描述
  3. Sequential Strategy: the Add operation searches for the most significant feature y y y in Y Y Y in a sequential manner; the Del operation searches for the least significant feature x from X in a sequential manner
  4. Evolutionary Operators:
    这里写图片描述
    这里写图片描述
  5. Computational Complexity:
    The ranking of features based on the filter methods — linear time complexity — a one-time offline cost — negligible
    the computational cost of a single fitness evaluation — the basic unit of computational cost
    GA — O ( p g ) O(pg) O(pg): p p p — the size of population, g g g — the number of search generations
    +improvement first strategy — O ( p g + l 2 w g / 2 ) O (pg + l^2wg/2) O(pg+l2wg/2)
    +the greedy strategy ( l 2 w / p l^2w/p l2w/p) — O ( p g + l 2 w g ) O (pg + l^2wg) O(pg+l2wg)
    +the sequential strategy ( ( 2 ∣ Y ∣ − l ) l / 2 (2|Y | − l)l/2 (2Yl)l/2 and ( 2 ∣ X ∣ + l ) l / 2 2|X| + l)l/2 2X+l)l/2 — Add and Del operations — n l w nlw nlw) — O ( p g + n l w g ) O(pg + nlwg) O(pg+nlwg)
    KaTeX parse error: Unexpected character: '' at position 8: n \gg ̲ lKaTeX parse error: Unexpected character: '' at position 8: nlw \gg̲ l^2w > l^2w/2 — sequential LS strategy requires significantly more computations

##试验

UCI data sets
ALL/AML, Colon, NCI60, and SRBCT

population size — 30 30 30 and 50 50 50 or 100 100 100 (microarray data sets)
fitness function calls — 6000 6000 6000 and 10000 10000 10000 or 20000 20 000 20000 (microarray data sets)

这里写图片描述

the one nearest neighbor (1NN) classifier
the leave-one-out cross validation (LOOCV)

这里写图片描述
这里写图片描述
这里写图片描述
这里写图片描述

这篇关于基于模因框架的包装过滤特征选择算法的文章就介绍到这儿,希望我们推荐的文章对编程师们有所帮助!



http://www.chinasem.cn/article/971941

相关文章

Java中的雪花算法Snowflake解析与实践技巧

《Java中的雪花算法Snowflake解析与实践技巧》本文解析了雪花算法的原理、Java实现及生产实践,涵盖ID结构、位运算技巧、时钟回拨处理、WorkerId分配等关键点,并探讨了百度UidGen... 目录一、雪花算法核心原理1.1 算法起源1.2 ID结构详解1.3 核心特性二、Java实现解析2.

Spring 框架之Springfox使用详解

《Spring框架之Springfox使用详解》Springfox是Spring框架的API文档工具,集成Swagger规范,自动生成文档并支持多语言/版本,模块化设计便于扩展,但存在版本兼容性、性... 目录核心功能工作原理模块化设计使用示例注意事项优缺点优点缺点总结适用场景建议总结Springfox 是

Python的端到端测试框架SeleniumBase使用解读

《Python的端到端测试框架SeleniumBase使用解读》:本文主要介绍Python的端到端测试框架SeleniumBase使用,具有很好的参考价值,希望对大家有所帮助,如有错误或未考虑完全... 目录SeleniumBase详细介绍及用法指南什么是 SeleniumBase?SeleniumBase

C++ HTTP框架推荐(特点及优势)

《C++HTTP框架推荐(特点及优势)》:本文主要介绍C++HTTP框架推荐的相关资料,本文通过实例代码给大家介绍的非常详细,对大家的学习或工作具有一定的参考借鉴价值,需要的朋友参考下吧... 目录1. Crow2. Drogon3. Pistache4. cpp-httplib5. Beast (Boos

SpringBoot基础框架详解

《SpringBoot基础框架详解》SpringBoot开发目的是为了简化Spring应用的创建、运行、调试和部署等,使用SpringBoot可以不用或者只需要很少的Spring配置就可以让企业项目快... 目录SpringBoot基础 – 框架介绍1.SpringBoot介绍1.1 概述1.2 核心功能2

使用雪花算法产生id导致前端精度缺失问题解决方案

《使用雪花算法产生id导致前端精度缺失问题解决方案》雪花算法由Twitter提出,设计目的是生成唯一的、递增的ID,下面:本文主要介绍使用雪花算法产生id导致前端精度缺失问题的解决方案,文中通过代... 目录一、问题根源二、解决方案1. 全局配置Jackson序列化规则2. 实体类必须使用Long封装类3.

Springboot实现推荐系统的协同过滤算法

《Springboot实现推荐系统的协同过滤算法》协同过滤算法是一种在推荐系统中广泛使用的算法,用于预测用户对物品(如商品、电影、音乐等)的偏好,从而实现个性化推荐,下面给大家介绍Springboot... 目录前言基本原理 算法分类 计算方法应用场景 代码实现 前言协同过滤算法(Collaborativ

Spring框架中@Lazy延迟加载原理和使用详解

《Spring框架中@Lazy延迟加载原理和使用详解》:本文主要介绍Spring框架中@Lazy延迟加载原理和使用方式,具有很好的参考价值,希望对大家有所帮助,如有错误或未考虑完全的地方,望不吝赐... 目录一、@Lazy延迟加载原理1.延迟加载原理1.1 @Lazy三种配置方法1.2 @Component

openCV中KNN算法的实现

《openCV中KNN算法的实现》KNN算法是一种简单且常用的分类算法,本文主要介绍了openCV中KNN算法的实现,文中通过示例代码介绍的非常详细,对大家的学习或者工作具有一定的参考学习价值,需要的... 目录KNN算法流程使用OpenCV实现KNNOpenCV 是一个开源的跨平台计算机视觉库,它提供了各

springboot+dubbo实现时间轮算法

《springboot+dubbo实现时间轮算法》时间轮是一种高效利用线程资源进行批量化调度的算法,本文主要介绍了springboot+dubbo实现时间轮算法,文中通过示例代码介绍的非常详细,对大家... 目录前言一、参数说明二、具体实现1、HashedwheelTimer2、createWheel3、n