Text Mining (Kernel)

December 1, 2014

Analysis of Text Patterns Using Kernel Methods

The first step of the kernel approach is to embed the data items (e.g., documents) into a Euclidean space where the patterns can be represented by linear relations.
This step reduces many complex problems to a class of linear problems, for which the solution algorithms are efficient and well understood.
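
As a rough illustration of the embedding step, the sketch below maps each document to a vector of term counts (a bag-of-words embedding) and uses the ordinary inner product as the kernel. The bag-of-words choice, the toy documents, and the helper names `bag_of_words` and `linear_kernel` are assumptions made for this example, not something the notes prescribe.

```python
from collections import Counter


def bag_of_words(text):
    """Embed a document as term counts: a sparse vector indexed by words."""
    return Counter(text.lower().split())


def linear_kernel(doc_a, doc_b):
    """Inner product of the two term-count vectors."""
    a, b = bag_of_words(doc_a), bag_of_words(doc_b)
    # A Counter returns 0 for terms it has not seen, so only shared terms contribute.
    return sum(count * b[term] for term, count in a.items())


# Toy corpus (hypothetical data, for illustration only).
docs = [
    "the cat sat on the mat",
    "the dog sat on the log",
    "kernel methods embed documents",
]

# Gram matrix: pairwise inner products between all embedded documents.
gram = [[linear_kernel(p, q) for q in docs] for p in docs]

for row in gram:
    print(row)
```

Once the documents live in this space, the Gram matrix is all that a kernel-based pattern analysis algorithm needs as input; the first two documents share several terms and get a large inner product, while the third shares none with them.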

The second step is to detect relations within the embedded data set, using a robust and efficient pattern analysis algorithm.

The main idea is quite similar to that of an SVM (Support Vector Machine).
The first step of an SVM is to apply a kernel function, which transforms a non-linear problem into a linear one.
The classification or regression that follows is then linear in either case.
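
The same two-step pattern can be sketched with a kernel perceptron, used here as a simpler stand-in for an SVM (it is not an SVM and does not maximize the margin). The XOR-style data set, the quadratic kernel, and all function names below are assumptions chosen for illustration.

```python
def poly_kernel(x, z):
    """Quadratic kernel k(x, z) = (1 + <x, z>)^2."""
    return (1 + sum(a * b for a, b in zip(x, z))) ** 2


def linear_kernel(x, z):
    """Plain inner product, i.e. no change of feature space."""
    return sum(a * b for a, b in zip(x, z))


def decision(alpha, X, y, kernel, x):
    """Value of the learned function at x, a linear function in feature space."""
    return sum(a * t * kernel(p, x) for a, t, p in zip(alpha, y, X))


def train_kernel_perceptron(X, y, kernel, max_epochs=100):
    """Return (alpha, converged); alpha[i] counts the mistakes made on X[i]."""
    alpha = [0] * len(X)
    for _ in range(max_epochs):
        mistakes = 0
        for i, (x_i, y_i) in enumerate(zip(X, y)):
            if y_i * decision(alpha, X, y, kernel, x_i) <= 0:
                alpha[i] += 1
                mistakes += 1
        if mistakes == 0:
            return alpha, True
    return alpha, False


# XOR-style labels: +1 when both coordinates have the same sign.
X = [(1, 1), (-1, -1), (1, -1), (-1, 1)]
y = [1, 1, -1, -1]

alpha, converged = train_kernel_perceptron(X, y, poly_kernel)
predictions = [1 if decision(alpha, X, y, poly_kernel, x) > 0 else -1 for x in X]

# The same algorithm with the linear kernel cannot separate this data.
_, linear_converged = train_kernel_perceptron(X, y, linear_kernel)

print(converged, predictions, linear_converged)
```

The learning rule is identical in both runs; only the kernel changes. With the quadratic kernel the data becomes linearly separable in the implicit feature space, whereas with the plain inner product no separating function through the origin exists.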

Linear algorithms are preferred because they are efficient and well understood, from both a statistical and a computational perspective.

Using efficient kernels, we can look for linear relations in very high dimensional spaces at a very low computational cost.
If it is necessary to consider a non-linear map, we still have an efficient way to discover non-linear relations in the data: run a linear algorithm in a different space.
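
To give a feel for the cost argument, the sketch below evaluates a polynomial kernel in time proportional to the input dimension and compares that with the size of the feature space it implicitly works in. The kernel (1 + <x, z>)^d, the values n = 100 and d = 3, and the helper names are assumptions picked for this example.

```python
from math import comb


def poly_kernel(x, z, degree):
    """k(x, z) = (1 + <x, z>)^degree, computed with one pass over the inputs."""
    return (1 + sum(a * b for a, b in zip(x, z))) ** degree


def feature_space_dim(n, degree):
    """Number of monomials of total degree <= `degree` in n variables."""
    return comb(n + degree, degree)


n, degree = 100, 3
x = [1.0] * n
z = [0.01] * n

value = poly_kernel(x, z, degree)     # roughly n multiply-adds
dim = feature_space_dim(n, degree)    # coordinates an explicit embedding would need

print(f"kernel value: {value:.4f}")
print(f"implicit feature space dimension: {dim}")
```

An explicit embedding would have to store and multiply vectors with `dim` coordinates, while the kernel evaluation only ever touches the `n` original ones.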

It is important to highlight that the feature space is not uniquely determined by the kernel function.
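
A small check of this point, assuming two-dimensional inputs and the kernel k(x, z) = <x, z>^2: the two feature maps below have different dimensions (3 and 4), yet both reproduce the same kernel values. The maps `phi_a` and `phi_b` are standard textbook choices introduced here for illustration.

```python
import math


def kernel(x, z):
    """k(x, z) = <x, z>^2 for two-dimensional inputs."""
    return (x[0] * z[0] + x[1] * z[1]) ** 2


def phi_a(x):
    """Three-dimensional feature map."""
    return (x[0] ** 2, x[1] ** 2, math.sqrt(2) * x[0] * x[1])


def phi_b(x):
    """Four-dimensional feature map: the cross term appears twice."""
    return (x[0] ** 2, x[1] ** 2, x[0] * x[1], x[1] * x[0])


def dot(u, v):
    """Ordinary inner product in the feature space."""
    return sum(a * b for a, b in zip(u, v))


x, z = (1.0, 2.0), (3.0, -1.0)

k = kernel(x, z)
via_a = dot(phi_a(x), phi_a(z))
via_b = dot(phi_b(x), phi_b(z))

print(k, via_a, via_b)
```

Because both embeddings yield the same inner products, an algorithm that only uses kernel values cannot tell them apart.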